* Re: generic plans and "initial" pruning
@ 2022-02-10 08:13 Amit Langote <[email protected]>
From: Amit Langote @ 2022-02-10 08:13 UTC (permalink / raw)
To: Robert Haas <[email protected]>; +Cc: pgsql-hackers
On Thu, Jan 13, 2022 at 3:20 AM Robert Haas <[email protected]> wrote:
> On Wed, Jan 12, 2022 at 9:32 AM Amit Langote <[email protected]> wrote:
> > Or, maybe this won't be a concern if performing ExecutorStart() is
> > made a part of CheckCachedPlan() somehow, which would then take locks
> > on the relation as the PlanState tree is built capturing any plan
> > invalidations, instead of AcquireExecutorLocks(). That does sound like
> > an ambitious undertaking though.
>
> On the surface that would seem to involve abstraction violations, but
> maybe that could be finessed somehow. The plancache shouldn't know too
> much about what the executor is going to do with the plan, but it
> could ask the executor to perform a step that has been designed for
> use by the plancache. I guess the core problem here is how to pass
> around information that is node-specific before we've stood up the
> executor state tree. Maybe the executor could have a function that
> does the pruning and returns some kind of array of results that can be
> used both to decide what to lock and also what to consider as pruned
> at the start of execution. (I'm hand-waving about the details because
> I don't know.)
The attached patch implements this idea. Sorry for the delay in
getting this out and thanks to Robert for the off-list discussions on
this.
So the new executor "step" you mention is the function ExecutorPrep in
the patch, which calls a recursive function ExecPrepNode on the plan
tree's top node, much as ExecutorStart calls (via InitPlan)
ExecInitNode to construct a PlanState tree for actual execution
paralleling the plan tree.
For now, ExecutorPrep() / ExecPrepNode() mainly does two things as it
walks the plan tree: 1) extracts the RT indexes of RTE_RELATION
entries and adds them to a bitmapset in the result struct, 2) if the
node contains a PartitionPruneInfo, performs its "initial pruning
steps" and stores the result of doing so in a per-plan-node node called
PlanPrepOutput. The bitmapset and the array containing per-plan-node
PlanPrepOutput nodes are returned in a node called ExecPrepOutput,
which is the result of ExecutorPrep, to its calling module (say,
plancache.c), which, after it's done using that information, must pass
it forward to subsequent execution steps. That is done by passing it,
via the module's callers, to CreateQueryDesc() which remembers the
ExecPrepOutput in QueryDesc that is eventually passed to
ExecutorStart().
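
To make the shape of that walk concrete, here is a minimal, self-contained sketch. It is not code from the patch: PlanNode, NodePrepOutput, PrepOutput, prep_node(), executor_prep() and demo_prep() are hypothetical stand-ins, a 64-bit mask stands in for a Bitmapset, and "initial pruning" is reduced to comparing one bound parameter against a per-child key. It also ignores the partitioned parent relation itself, which the real code would need to account for.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these are PostgreSQL's real structs. */
typedef enum { NODE_SCAN, NODE_APPEND } NodeKind;

typedef struct PlanNode
{
    NodeKind    kind;
    int         plan_node_id;   /* slot in the per-node output array */
    int         scanrelid;      /* NODE_SCAN: RT index of the scanned relation */
    int         nchildren;      /* NODE_APPEND: number of subplans */
    struct PlanNode **children; /* NODE_APPEND: the subplans */
    const int  *child_keys;     /* NODE_APPEND: key value each subplan accepts */
} PlanNode;

typedef struct
{
    uint64_t    valid_subplans; /* bit i set => subplan i survived pruning */
} NodePrepOutput;

typedef struct
{
    uint64_t    relation_rtis;  /* bit n set => RT index n needs a lock */
    int         num_nodes;
    NodePrepOutput **node_outputs; /* indexed by plan_node_id; NULL if none */
} PrepOutput;

/* Recursive walk: collect RT indexes, prune Append children up front. */
static void
prep_node(const PlanNode *node, int param, PrepOutput *out)
{
    assert(node->plan_node_id >= 0 && node->plan_node_id < out->num_nodes);

    if (node->kind == NODE_SCAN)
    {
        assert(node->scanrelid > 0 && node->scanrelid < 64);
        out->relation_rtis |= UINT64_C(1) << node->scanrelid;
        return;                 /* scans have nothing else to report */
    }

    /* NODE_APPEND: do the "initial pruning" and remember the survivors. */
    assert(node->nchildren <= 64);
    NodePrepOutput *np = calloc(1, sizeof(*np));
    if (np == NULL)
        abort();
    for (int i = 0; i < node->nchildren; i++)
    {
        if (node->child_keys[i] != param)
            continue;           /* pruned: do not descend, so no RT index added */
        np->valid_subplans |= UINT64_C(1) << i;
        prep_node(node->children[i], param, out);
    }
    out->node_outputs[node->plan_node_id] = np;
}

PrepOutput *
executor_prep(const PlanNode *root, int num_nodes, int param)
{
    PrepOutput *out = calloc(1, sizeof(*out));
    if (out == NULL)
        abort();
    out->num_nodes = num_nodes;
    out->node_outputs = calloc((size_t) num_nodes, sizeof(*out->node_outputs));
    if (out->node_outputs == NULL)
        abort();
    prep_node(root, param, out);
    return out;
}

/* Tiny example plan: Append (id 0) over three scans (ids 1..3, RTIs 2..4). */
PrepOutput *
demo_prep(int param)
{
    static PlanNode scans[3] = {
        {NODE_SCAN, 1, 2, 0, NULL, NULL},
        {NODE_SCAN, 2, 3, 0, NULL, NULL},
        {NODE_SCAN, 3, 4, 0, NULL, NULL},
    };
    static PlanNode *children[3] = {&scans[0], &scans[1], &scans[2]};
    static const int keys[3] = {10, 20, 30};
    static PlanNode append = {NODE_APPEND, 0, 0, 3, children, keys};

    return executor_prep(&append, 4, param);
}
```

In this toy model, calling demo_prep(20) leaves only the second subplan marked valid in the Append's slot of the array and only RT index 3 in the relation set, which is the information a caller such as the plancache would then use for locking and later hand on to execution.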
A bunch of other details are mentioned in the patch's commit message,
which I'm pasting below for anyone reading to spot any obvious flaws
(no-go's) of this approach:
Invent a new executor "prep" phase
The new phase, implemented by execMain.c:ExecutorPrep() and its
recursive underling execProcnode.c:ExecPrepNode(), takes a query's
PlannedStmt and processes the plan tree contained in it to produce
an ExecPrepOutput node as the result.
As the plan tree is walked, each node must add the RT index(es) of
any relation(s) that it directly manipulates to a bitmapset member of
ExecPrepOutput (for example, an IndexScan node must add the Scan's
scanrelid). Also, each node may want to make a PlanPrepOutput node
containing additional information that may be of interest to the
calling module or to the later execution phases, if the node can
provide one (for example, an Append node may perform initial pruning
and add a set of "initially valid subplans" to the PlanPrepOutput).
The PlanPrepOutput nodes of all the plan nodes are added to an array
in the ExecPrepOutput, which is indexed using the individual nodes'
plan_node_id; a NULL is stored in the array slots of nodes that
don't have anything interesting to add to the PlanPrepOutput.
The ExecPrepOutput thus produced is passed to CreateQueryDesc()
and subsequently to ExecutorStart() via QueryDesc, which then makes
it available to the executor routines via the query's EState.
The main goal of adding this new phase is, for now, to allow cached
generic plans containing scans of partitioned tables using
Append/MergeAppend to be executed more efficiently by the prep phase
doing any initial pruning, instead of deferring that to
ExecutorStart(). That may allow AcquireExecutorLocks() on the plan
to lock only the minimal set of relations/partitions, that is
those whose subplans survive the initial pruning.
Implementation notes:
* To allow initial pruning to be done as part of the pre-execution
prep phase as opposed to as part of ExecutorStart(), this refactors
ExecCreatePartitionPruneState() and ExecFindInitialMatchingSubPlans()
to pass the information needed to do initial pruning directly as
parameters instead of getting that from the EState and the PlanState
of the parent Append/MergeAppend, both of which would not be
available in ExecutorPrep(). Another refactoring, not essential to
this goal, moves the partition pruning initialization stanzas in
ExecInitAppend() and ExecInitMergeAppend(), both of which contain
the same code, into their own function ExecInitPartitionPruning().
* To pass the ExecPrepOutput(s) created by the plancache module's
invocation of ExecutorPrep() to the callers of the module, which in
turn would pass them down to ExecutorStart(), CachedPlan gets a new
List field that stores those ExecPrepOutputs, containing one element
for each PlannedStmt also contained in the CachedPlan. The new list
is stored in a child context of the context containing the
PlannedStmts, though unlike the latter, it is reset on every
invocation of CheckCachedPlan(), which in turn calls ExecutorPrep()
with a new set of bound Params.
* AcquireExecutorLocks() is now made to loop over a bitmapset of RT
indexes, those of relations returned in ExecPrepOutput, instead of
over the whole range table. With initial pruning that is also done
as part of ExecutorPrep(), only relations from non-pruned nodes of
the plan tree would get locked as a result of this new arrangement.
* PlannedStmt gets a new field usesPreExecPruning that indicates
whether any of the nodes of the plan tree contain "initial" (or
"pre-execution") pruning steps, which saves ExecutorPrep() the
trouble of walking the plan tree only to find out whether that's
the case.
* PartitionPruneInfo nodes now explicitly store whether the steps
contained in any of the individual PartitionedRelPruneInfos embedded
in it contain initial pruning steps (those that can be performed
during ExecutorPrep) and execution pruning steps (those that can only
be performed during ExecutorRun), as flags contains_initial_steps and
contains_exec_steps, respectively. In fact, the aforementioned
PlannedStmt field's value is a logical OR of the values of the former
across all PartitionPruneInfo nodes embedded in the plan tree.
* PlannedStmt also gets a bitmapset field to store the RT indexes of
all relation RTEs referenced in the query that is populated when
constructing the flat range table in setrefs.c, which effectively
contains all the relations that the planner must have locked. In the
case of a cached plan, AcquireExecutorLocks() must lock all of those
relations, except those whose subnodes get pruned as a result of
ExecutorPrep().
* PlannedStmt gets yet another field numPlanNodes that records the
highest plan_node_id assigned to any of the nodes contained in the
tree, which serves as the size to use when allocating the
PlanPrepOutput array.
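
As a rough illustration of the AcquireExecutorLocks() note above, the following sketch loops over a set of RT indexes rather than over the whole range table. It is only a model under stated assumptions: RelOid, next_member() and acquire_locks() are hypothetical names, a 64-bit mask again stands in for a Bitmapset, and "locking" is simulated by copying the relation's OID into an output array.

```c
#include <assert.h>
#include <stdint.h>

typedef unsigned int RelOid;    /* hypothetical stand-in for a relation OID */

/* Return the smallest member of 'set' greater than 'prev', or -1 if none. */
static int
next_member(uint64_t set, int prev)
{
    for (int i = prev + 1; i < 64; i++)
    {
        if (set & (UINT64_C(1) << i))
            return i;
    }
    return -1;
}

/*
 * "Lock" only the relations whose RT indexes are in relation_rtis.
 * RT indexes are 1-based, so RT index n maps to rtable[n - 1].
 * Returns the number of relations recorded in 'locked'.
 */
int
acquire_locks(const RelOid *rtable, int rtable_len,
              uint64_t relation_rtis, RelOid *locked)
{
    int         nlocked = 0;
    int         rti = -1;

    while ((rti = next_member(relation_rtis, rti)) >= 0)
    {
        assert(rti >= 1 && rti <= rtable_len);
        locked[nlocked++] = rtable[rti - 1];    /* stand-in for taking a lock */
    }
    return nlocked;
}
```

With a four-entry range table and a set containing only RT indexes 1 and 3, this visits two entries and never touches the other two, which is the saving the prep phase is meant to enable for pruned partitions.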
Maybe this should be more than one patch? Say:
0001 to add ExecutorPrep and the boilerplate,
0002 to teach plancache.c to use the new facility
Thoughts?
--
Amit Langote
EDB: http://www.enterprisedb.com
Attachments:
[application/octet-stream] v4-0001-Invent-a-new-executor-prep-phase.patch (169.2K, 2-v4-0001-Invent-a-new-executor-prep-phase.patch)
From 7d29fea0fcf8e6aec2877804555dd0239fdaf1be Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Wed, 22 Dec 2021 16:55:17 +0900
Subject: [PATCH v4] Invent a new executor "prep" phase
The new phase, implemented by execMain.c:ExecutorPrep() and its
recursive underling execProcnode.c:ExecPrepNode(), takes a query's
PlannedStmt and processes the plan tree contained in it to produce
an ExecPrepOutput node as the result.
As the plan tree is walked, each node must add the RT index(es) of
any relation(s) that it directly manipulates to a bitmapset member of
ExecPrepOutput (for example, an IndexScan node must add the Scan's
scanrelid). Also, each node may want to make a PlanPrepOutput node
containing additional information that may be of interest to the
calling module or to the later execution phases, if the node can
provide one (for example, an Append node may perform initial pruning
and add a set of "initially valid subplans" to the PlanPrepOutput).
The PlanPrepOutput nodes of all the plan nodes are added to an array
in the ExecPrepOutput, which is indexed using the individual nodes'
plan_node_id; a NULL is stored in the array slots of nodes that
don't have anything interesting to add to the PlanPrepOutput.
The ExecPrepOutput thus produced is passed to CreateQueryDesc()
and subsequently to ExecutorStart() via QueryDesc, which then makes
it available to the executor routines via the query's EState.
The main goal of adding this new phase is, for now, to allow cached
generic plans containing scans of partitioned tables using
Append/MergeAppend to be executed more efficiently by the prep phase
doing any initial pruning, instead of deferring that to
ExecutorStart(). That may allow AcquireExecutorLocks() on the plan
to lock only the minimal set of relations/partitions, that is
those whose subplans survive the initial pruning.
Implementation notes:
* To allow initial pruning to be done as part of the pre-execution
prep phase as opposed to as part of ExecutorStart(), this refactors
ExecCreatePartitionPruneState() and ExecFindInitialMatchingSubPlans()
to pass the information needed to do initial pruning directly as
parameters instead of getting that from the EState and the PlanState
of the parent Append/MergeAppend, both of which would not be
available in ExecutorPrep(). Another refactoring, not essential to
this goal, moves the partition pruning initialization stanzas in
ExecInitAppend() and ExecInitMergeAppend(), both of which contain
the same code, into their own function ExecInitPartitionPruning().
* To pass the ExecPrepOutput(s) created by the plancache module's
invocation of ExecutorPrep() to the callers of the module, which in
turn would pass them down to ExecutorStart(), CachedPlan gets a new
List field that stores those ExecPrepOutputs, containing one element
for each PlannedStmt also contained in the CachedPlan. The new list
is stored in a child context of the context containing the
PlannedStmts, though unlike the latter, it is reset on every
invocation of CheckCachedPlan(), which in turn calls ExecutorPrep()
with a new set of bound Params.
* AcquireExecutorLocks() is now made to loop over a bitmapset of RT
indexes, those of relations returned in ExecPrepOutput, instead of
over the whole range table. With initial pruning that is also done
as part of ExecutorPrep(), only relations from non-pruned nodes of
the plan tree would get locked as a result of this new arrangement.
* PlannedStmt gets a new field usesPreExecPruning that indicates
whether any of the nodes of the plan tree contain "initial" (or
"pre-execution") pruning steps, which saves ExecutorPrep() the
trouble of walking the plan tree only to find out whether that's
the case.
* PartitionPruneInfo nodes now explicitly store whether the steps
contained in any of the individual PartitionedRelPruneInfos embedded
in it contain initial pruning steps (those that can be performed
during ExecutorPrep) and execution pruning steps (those that can only
be performed during ExecutorRun), as flags contains_initial_steps and
contains_exec_steps, respectively. In fact, the aforementioned
PlannedStmt field's value is a logical OR of the values of the former
across all PartitionPruneInfo nodes embedded in the plan tree.
* PlannedStmt also gets a bitmapset field to store the RT indexes of
all relation RTEs referenced in the query that is populated when
constructing the flat range table in setrefs.c, which effectively
contains all the relations that the planner must have locked. In the
case of a cached plan, AcquireExecutorLocks() must lock all of those
relations, except those whose subnodes get pruned as a result of
ExecutorPrep().
* PlannedStmt gets yet another field numPlanNodes that records the
highest plan_node_id assigned to any of the nodes contained in the
tree, which serves as the size to use when allocating the
PlanPrepOutput array.
---
src/backend/commands/copyto.c | 2 +-
src/backend/commands/createas.c | 2 +-
src/backend/commands/explain.c | 7 +-
src/backend/commands/extension.c | 13 +-
src/backend/commands/matview.c | 2 +-
src/backend/commands/portalcmds.c | 1 +
src/backend/commands/prepare.c | 17 +-
src/backend/executor/README | 18 +
src/backend/executor/execMain.c | 48 ++
src/backend/executor/execParallel.c | 4 +-
src/backend/executor/execPartition.c | 538 +++++++++++++-----
src/backend/executor/execProcnode.c | 206 +++++++
src/backend/executor/execUtils.c | 8 +
src/backend/executor/functions.c | 2 +-
src/backend/executor/nodeAgg.c | 13 +
src/backend/executor/nodeAppend.c | 91 ++-
src/backend/executor/nodeBitmapAnd.c | 18 +
src/backend/executor/nodeBitmapHeapscan.c | 14 +
src/backend/executor/nodeBitmapIndexscan.c | 14 +
src/backend/executor/nodeBitmapOr.c | 18 +
src/backend/executor/nodeCtescan.c | 12 +
src/backend/executor/nodeCustom.c | 18 +
src/backend/executor/nodeForeignscan.c | 12 +
src/backend/executor/nodeFunctionscan.c | 13 +
src/backend/executor/nodeGather.c | 13 +
src/backend/executor/nodeGatherMerge.c | 13 +
src/backend/executor/nodeGroup.c | 13 +
src/backend/executor/nodeHash.c | 13 +
src/backend/executor/nodeHashjoin.c | 14 +
src/backend/executor/nodeIncrementalSort.c | 14 +
src/backend/executor/nodeIndexonlyscan.c | 14 +
src/backend/executor/nodeIndexscan.c | 14 +
src/backend/executor/nodeLimit.c | 13 +
src/backend/executor/nodeLockRows.c | 13 +
src/backend/executor/nodeMaterial.c | 13 +
src/backend/executor/nodeMemoize.c | 13 +
src/backend/executor/nodeMergeAppend.c | 90 ++-
src/backend/executor/nodeMergejoin.c | 14 +
src/backend/executor/nodeModifyTable.c | 26 +
.../executor/nodeNamedtuplestorescan.c | 13 +
src/backend/executor/nodeNestloop.c | 14 +
src/backend/executor/nodeProjectSet.c | 13 +
src/backend/executor/nodeRecursiveunion.c | 14 +
src/backend/executor/nodeResult.c | 13 +
src/backend/executor/nodeSamplescan.c | 14 +
src/backend/executor/nodeSeqscan.c | 13 +
src/backend/executor/nodeSetOp.c | 13 +
src/backend/executor/nodeSort.c | 13 +
src/backend/executor/nodeSubplan.c | 12 +
src/backend/executor/nodeSubqueryscan.c | 14 +
src/backend/executor/nodeTableFuncscan.c | 13 +
src/backend/executor/nodeTidrangescan.c | 14 +
src/backend/executor/nodeTidscan.c | 15 +-
src/backend/executor/nodeUnique.c | 13 +
src/backend/executor/nodeValuesscan.c | 13 +
src/backend/executor/nodeWindowAgg.c | 13 +
src/backend/executor/nodeWorktablescan.c | 12 +
src/backend/executor/spi.c | 14 +-
src/backend/nodes/copyfuncs.c | 49 ++
src/backend/nodes/outfuncs.c | 6 +
src/backend/nodes/readfuncs.c | 5 +
src/backend/optimizer/plan/planner.c | 3 +
src/backend/optimizer/plan/setrefs.c | 10 +
src/backend/partitioning/partprune.c | 57 +-
src/backend/tcop/postgres.c | 15 +-
src/backend/tcop/pquery.c | 21 +-
src/backend/utils/cache/plancache.c | 155 +++--
src/backend/utils/mmgr/portalmem.c | 2 +
src/include/commands/explain.h | 3 +-
src/include/executor/execPartition.h | 19 +-
src/include/executor/execdesc.h | 2 +
src/include/executor/executor.h | 3 +
src/include/executor/nodeAgg.h | 1 +
src/include/executor/nodeAppend.h | 1 +
src/include/executor/nodeBitmapAnd.h | 1 +
src/include/executor/nodeBitmapHeapscan.h | 1 +
src/include/executor/nodeBitmapIndexscan.h | 1 +
src/include/executor/nodeBitmapOr.h | 1 +
src/include/executor/nodeCtescan.h | 1 +
src/include/executor/nodeCustom.h | 1 +
src/include/executor/nodeForeignscan.h | 1 +
src/include/executor/nodeFunctionscan.h | 1 +
src/include/executor/nodeGather.h | 1 +
src/include/executor/nodeGatherMerge.h | 1 +
src/include/executor/nodeGroup.h | 1 +
src/include/executor/nodeHash.h | 1 +
src/include/executor/nodeHashjoin.h | 1 +
src/include/executor/nodeIncrementalSort.h | 1 +
src/include/executor/nodeIndexonlyscan.h | 1 +
src/include/executor/nodeIndexscan.h | 1 +
src/include/executor/nodeLimit.h | 1 +
src/include/executor/nodeLockRows.h | 1 +
src/include/executor/nodeMaterial.h | 1 +
src/include/executor/nodeMemoize.h | 1 +
src/include/executor/nodeMergeAppend.h | 1 +
src/include/executor/nodeMergejoin.h | 1 +
src/include/executor/nodeModifyTable.h | 1 +
.../executor/nodeNamedtuplestorescan.h | 1 +
src/include/executor/nodeNestloop.h | 1 +
src/include/executor/nodeProjectSet.h | 1 +
src/include/executor/nodeRecursiveunion.h | 1 +
src/include/executor/nodeResult.h | 2 +
src/include/executor/nodeSamplescan.h | 1 +
src/include/executor/nodeSeqscan.h | 1 +
src/include/executor/nodeSetOp.h | 1 +
src/include/executor/nodeSort.h | 1 +
src/include/executor/nodeSubplan.h | 1 +
src/include/executor/nodeSubqueryscan.h | 1 +
src/include/executor/nodeTableFuncscan.h | 1 +
src/include/executor/nodeTidrangescan.h | 1 +
src/include/executor/nodeTidscan.h | 1 +
src/include/executor/nodeUnique.h | 1 +
src/include/executor/nodeValuesscan.h | 1 +
src/include/executor/nodeWindowAgg.h | 1 +
src/include/executor/nodeWorktablescan.h | 1 +
src/include/nodes/execnodes.h | 78 +++
src/include/nodes/nodeFuncs.h | 3 +
src/include/nodes/nodes.h | 5 +
src/include/nodes/pathnodes.h | 6 +
src/include/nodes/plannodes.h | 17 +
src/include/partitioning/partprune.h | 2 +
src/include/tcop/tcopprot.h | 2 +-
src/include/utils/plancache.h | 5 +
src/include/utils/portal.h | 5 +
124 files changed, 1866 insertions(+), 285 deletions(-)
diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index 3283ef50d0..bb7d5e65ea 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -542,7 +542,7 @@ BeginCopyTo(ParseState *pstate,
((DR_copy *) dest)->cstate = cstate;
/* Create a QueryDesc requesting no output */
- cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ cstate->queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
GetActiveSnapshot(),
InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 9abbb6b555..f6607f2454 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -325,7 +325,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index b970997c34..9ee82824a1 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -407,7 +407,7 @@ ExplainOneQuery(Query *query, int cursorOptions,
}
/* run it (if needed) and produce output */
- ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+ ExplainOnePlan(plan, NULL, into, es, queryString, params, queryEnv,
&planduration, (es->buffers ? &bufusage : NULL));
}
}
@@ -515,7 +515,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
* to call it.
*/
void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, ExecPrepOutput *execprep,
+ IntoClause *into, ExplainState *es,
const char *queryString, ParamListInfo params,
QueryEnvironment *queryEnv, const instr_time *planduration,
const BufferUsage *bufusage)
@@ -563,7 +564,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
dest = None_Receiver;
/* Create a QueryDesc for the query */
- queryDesc = CreateQueryDesc(plannedstmt, queryString,
+ queryDesc = CreateQueryDesc(plannedstmt, execprep, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, instrument_option);
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index a2e77c418a..214a345aa2 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -741,8 +741,10 @@ execute_sql_string(const char *sql)
RawStmt *parsetree = lfirst_node(RawStmt, lc1);
MemoryContext per_parsetree_context,
oldcontext;
- List *stmt_list;
- ListCell *lc2;
+ List *stmt_list,
+ *stmt_execprep_list;
+ ListCell *lc2,
+ *lc3;
/*
* We do the work for each parsetree in a short-lived context, to
@@ -762,11 +764,13 @@ execute_sql_string(const char *sql)
NULL,
0,
NULL);
- stmt_list = pg_plan_queries(stmt_list, sql, CURSOR_OPT_PARALLEL_OK, NULL);
+ stmt_list = pg_plan_queries(stmt_list, sql, CURSOR_OPT_PARALLEL_OK, NULL,
+ &stmt_execprep_list);
- foreach(lc2, stmt_list)
+ forboth(lc2, stmt_list, lc3, stmt_execprep_list)
{
PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+ ExecPrepOutput *execprep = lfirst_node(ExecPrepOutput, lc3);
CommandCounterIncrement();
@@ -777,6 +781,7 @@ execute_sql_string(const char *sql)
QueryDesc *qdesc;
qdesc = CreateQueryDesc(stmt,
+ execprep,
sql,
GetActiveSnapshot(), NULL,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 05e7b60059..4ef44aaf23 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -416,7 +416,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, queryString,
+ queryDesc = CreateQueryDesc(plan, NULL, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/portalcmds.c b/src/backend/commands/portalcmds.c
index 9902c5c566..0bea2dd18f 100644
--- a/src/backend/commands/portalcmds.c
+++ b/src/backend/commands/portalcmds.c
@@ -107,6 +107,7 @@ PerformCursorOpen(ParseState *pstate, DeclareCursorStmt *cstmt, ParamListInfo pa
queryString,
CMDTAG_SELECT, /* cursor's query is always a SELECT */
list_make1(plan),
+ list_make1(NULL), /* no ExecPrepOutput to pass */
NULL);
/*----------
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 206d2bbbf9..ac188a7347 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -189,6 +189,7 @@ ExecuteQuery(ParseState *pstate,
PreparedStatement *entry;
CachedPlan *cplan;
List *plan_list;
+ List *plan_execprep_list;
ParamListInfo paramLI = NULL;
EState *estate = NULL;
Portal portal;
@@ -229,6 +230,7 @@ ExecuteQuery(ParseState *pstate,
/* Replan if needed, and increment plan refcount for portal */
cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL);
plan_list = cplan->stmt_list;
+ plan_execprep_list = cplan->stmt_execprep_list;
/*
* DO NOT add any logic that could possibly throw an error between
@@ -238,7 +240,7 @@ ExecuteQuery(ParseState *pstate,
NULL,
query_string,
entry->plansource->commandTag,
- plan_list,
+ plan_list, plan_execprep_list,
cplan);
/*
@@ -610,7 +612,9 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
const char *query_string;
CachedPlan *cplan;
List *plan_list;
- ListCell *p;
+ List *plan_execprep_list;
+ ListCell *p,
+ *pe;
ParamListInfo paramLI = NULL;
EState *estate = NULL;
instr_time planstart;
@@ -666,15 +670,18 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
}
plan_list = cplan->stmt_list;
+ plan_execprep_list = cplan->stmt_execprep_list;
/* Explain each query */
- foreach(p, plan_list)
+ forboth(p, plan_list, pe, plan_execprep_list)
{
PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
+ ExecPrepOutput *execprep = lfirst_node(ExecPrepOutput, pe);
if (pstmt->commandType != CMD_UTILITY)
- ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv,
- &planduration, (es->buffers ? &bufusage : NULL));
+ ExplainOnePlan(pstmt, execprep, into, es, query_string, paramLI,
+ queryEnv, &planduration,
+ (es->buffers ? &bufusage : NULL));
else
ExplainOneUtility(pstmt->utilityStmt, into, es, query_string,
paramLI, queryEnv);
diff --git a/src/backend/executor/README b/src/backend/executor/README
index bf5e70860d..c25db66ff0 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -65,6 +65,21 @@ found there. This currently only occurs for Append and MergeAppend nodes. In
this case the non-required subplans are ignored and the executor state's
subnode array will become out of sequence to the plan's subplan list.
+A plan tree may also be made to go through ExecutorPrep() to collect some
+information about the individual plan nodes that may help optimize the
+actual execution of the plan. Such information about each plan node is put
+into a PlanPrepOutput node if the plan node type supports producing one and
+stored in an array in ExecPrepOutput that in turn represents the output of
+an ExecutorPrep() run. The PlanPrepOutput array is indexed with plan_node_id
+of the individual plan nodes. An example of what such information may look
+like is in the "prep" routine of the Append node (ExecPrepAppend), which does
+partition pruning using "initial steps", that is, pruning with expressions
+that can be evaluated even before the actual execution has started. That produces
+a set of "initially valid subplans" that is put into the PlanPrepOutput
+belonging to Append that can be used as-is by the initializer routine of the
+Append node (nodeAppend.c: ExecInitAppend) to only initialize the plan state
+trees of those subplans.
+
Each Plan node may have expression trees associated with it, to represent
its target list, qualification conditions, etc. These trees are also
read-only to the executor, but the executor state for expression evaluation
@@ -247,6 +262,9 @@ Query Processing Control Flow
This is a sketch of control flow for full query processing:
+ [ ExecutorPrep ] --- an optional step to walk over the plan tree to produce
+ an ExecPrepOutput to be passed to CreateQueryDesc
+
CreateQueryDesc
ExecutorStart
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 549d9eb696..e38966295e 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -103,6 +103,52 @@ static void EvalPlanQualStart(EPQState *epqstate, Plan *planTree);
/* end of local decls */
+/* ----------------------------------------------------------------
+ * ExecutorPrep
+ *
+ * This optional executor routine must be called if the PlannedStmt
+ * indicates that some nodes in the planTree can perform preparatory
+ * actions, such as pre-execution/initial pruning
+ *
+ * Returned information includes the set of RT indexes of relations referenced
+ * in the plan, and a PlanPrepOutput node for each node in the planTree if the
+ * node type supports producing one.
+ *
+ * This may lock relations whose information may be used to produce the
+ * PlanPrepOutput nodes. For example, a partitioned table before perusing its
+ * PartitionPruneInfo contained in an Append node to do the pruning the result
+ * of which is used to populate the Append node's PlanPrepOutput.
+ */
+ExecPrepOutput *
+ExecutorPrep(ExecPrepContext *context)
+{
+ ExecPrepOutput *result = makeNode(ExecPrepOutput);
+
+ result->numPlanNodes = context->stmt->numPlanNodes;
+ result->planPrepResults = palloc0(sizeof(PlanPrepOutput *) *
+ result->numPlanNodes);
+ if (!context->stmt->usesPreExecPruning)
+ {
+ /* Shortcut */
+ result->relationRTIs = bms_copy(context->stmt->relationRTIs);
+ }
+ else
+ {
+ /* Go find the nodes that need any "prep" work done. */
+ ListCell *lc;
+
+ foreach(lc, context->stmt->subplans)
+ {
+ Plan *subplan = lfirst(lc);
+
+ ExecPrepNode(subplan, context, result);
+ }
+
+ ExecPrepNode(context->stmt->planTree, context, result);
+ }
+
+ return result;
+}
/* ----------------------------------------------------------------
* ExecutorStart
@@ -804,6 +850,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
{
CmdType operation = queryDesc->operation;
PlannedStmt *plannedstmt = queryDesc->plannedstmt;
+ ExecPrepOutput *execprep = queryDesc->execprep;
Plan *plan = plannedstmt->planTree;
List *rangeTable = plannedstmt->rtable;
EState *estate = queryDesc->estate;
@@ -823,6 +870,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
ExecInitRangeTable(estate, rangeTable);
estate->es_plannedstmt = plannedstmt;
+ estate->es_execprep = execprep;
/*
* Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index 5dd8ab7db2..0567534358 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -182,8 +182,10 @@ ExecSerializePlan(Plan *plan, EState *estate)
pstmt->transientPlan = false;
pstmt->dependsOnRole = false;
pstmt->parallelModeNeeded = false;
+ pstmt->usesPreExecPruning = false;
pstmt->planTree = plan;
pstmt->rtable = estate->es_range_table;
+ pstmt->relationRTIs = NULL;
pstmt->resultRelations = NIL;
pstmt->appendRelations = NIL;
@@ -1248,7 +1250,7 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
paramLI = RestoreParamList(&paramspace);
/* Create a QueryDesc for the query. */
- return CreateQueryDesc(pstmt,
+ return CreateQueryDesc(pstmt, NULL, /* XXX pass ExecPrepOutput too? */
queryString,
GetActiveSnapshot(), InvalidSnapshot,
receiver, paramLI, NULL, instrument_options);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 90ed1485d1..75292fbd21 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -24,6 +24,7 @@
#include "mb/pg_wchar.h"
#include "miscadmin.h"
#include "nodes/makefuncs.h"
+#include "parser/parsetree.h"
#include "partitioning/partbounds.h"
#include "partitioning/partdesc.h"
#include "partitioning/partprune.h"
@@ -186,7 +187,11 @@ static void ExecInitPruningContext(PartitionPruneContext *context,
List *pruning_steps,
PartitionDesc partdesc,
PartitionKey partkey,
- PlanState *planstate);
+ PlanState *planstate,
+ ExprContext *econtext);
+static void ExecPartitionPruneFixSubPlanIndexes(PartitionPruneState *prunestate,
+ Bitmapset *initially_valid_subplans,
+ int n_total_subplans);
static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
@@ -1476,8 +1481,9 @@ adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri)
* considered to be a stable expression, it can change value from one plan
* node scan to the next during query execution. Stable comparison
* expressions that don't involve such Params allow partition pruning to be
- * done once during executor startup. Expressions that do involve such Params
- * require us to prune separately for each scan of the parent plan node.
+ * done once during executor startup or even before during ExecutorPrep().
+ * Expressions that do involve such Params require us to prune separately for
+ * each scan of the parent plan node.
*
* Note that pruning away unneeded subplans during executor startup has the
* added benefit of not having to initialize the unneeded subplans at all.
@@ -1485,10 +1491,28 @@ adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri)
*
* Functions:
*
+ * ExecInitPartitionPruning:
+ * This determines the initially valid subplans by doing pruning with
+ * only pre-execution pruning expressions, that is, expressions in the
+ * query that were matched to the partition key(s), whose values are
+ * known at executor startup (excluding expressions containing
+ * PARAM_EXEC Params); see ExecFindInitialMatchingSubPlans(). The
+ * PartitionPruneState thus created, which stores the details about
+ * mapping the partition indexes returned by the partition pruning code
+ * into subplan indexes, is also returned for use during subsequent
+ * pruning. Pruned subplans must be removed from the parent plan's list
+ * of subplans to be executed, so this also remaps the partition indexes
+ * in the PartitionPruneState to the new indexes of surviving subplans.
+ *
+ * ExecPrepDoInitialPruning:
+ * Do ExecFindInitialMatchingSubPlans as part of ExecPrepNode() on the
+ * parent plan node
+ *
* ExecCreatePartitionPruneState:
* Creates the PartitionPruneState required by each of the two pruning
* functions. Details stored include how to map the partition index
* returned by the partition pruning code into subplan indexes.
+ * (Note: Use ExecInitPartitionPruning() rather than calling this directly.)
*
* ExecFindInitialMatchingSubPlans:
* Returns indexes of matching subplans. Partition pruning is attempted
@@ -1500,6 +1524,7 @@ adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri)
* remap of the partition index to subplan index map and the newly
* created map provides indexes only for subplans which remain after
* calling this function.
+ * (Note: Use ExecInitPartitionPruning() rather than calling this directly.)
*
* ExecFindMatchingSubPlans:
* Returns indexes of matching subplans after evaluating all available
@@ -1514,7 +1539,9 @@ adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri)
* Build the data structure required for calling
* ExecFindInitialMatchingSubPlans and ExecFindMatchingSubPlans.
*
- * 'planstate' is the parent plan node's execution state.
+ * 'planstate', if not NULL, is the parent plan node's execution state. It
+ * can be NULL if being called before ExecutorStart(), in which case,
+ * 'rtable', 'econtext', and 'partdir' must be provided.
*
* 'partitionpruneinfo' is a PartitionPruneInfo as generated by
* make_partition_pruneinfo. Here we build a PartitionPruneState containing a
@@ -1529,18 +1556,20 @@ adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri)
*/
PartitionPruneState *
ExecCreatePartitionPruneState(PlanState *planstate,
- PartitionPruneInfo *partitionpruneinfo)
+ PartitionPruneInfo *partitionpruneinfo,
+ bool consider_initial_steps,
+ bool consider_exec_steps,
+ List *rtable, ExprContext *econtext,
+ PartitionDirectory partdir)
{
- EState *estate = planstate->state;
+ EState *estate = planstate ? planstate->state : NULL;
PartitionPruneState *prunestate;
int n_part_hierarchies;
ListCell *lc;
int i;
- /* For data reading, executor always omits detached partitions */
- if (estate->es_partition_directory == NULL)
- estate->es_partition_directory =
- CreatePartitionDirectory(estate->es_query_cxt, false);
+ Assert(partdir != NULL && econtext != NULL &&
+ (estate != NULL || rtable != NIL));
n_part_hierarchies = list_length(partitionpruneinfo->prune_infos);
Assert(n_part_hierarchies > 0);
@@ -1591,19 +1620,34 @@ ExecCreatePartitionPruneState(PlanState *planstate,
PartitionedRelPruneInfo *pinfo = lfirst_node(PartitionedRelPruneInfo, lc2);
PartitionedRelPruningData *pprune = &prunedata->partrelprunedata[j];
Relation partrel;
+ bool close_partrel = false;
PartitionDesc partdesc;
PartitionKey partkey;
/*
- * We can rely on the copies of the partitioned table's partition
- * key and partition descriptor appearing in its relcache entry,
- * because that entry will be held open and locked for the
- * duration of this executor run.
+ * We can rely on the copy of the partitioned table's partition
+ * key from its relcache entry, because it can't change (or
+ * get destroyed) as long as the relation is locked. The partition
+ * descriptor is taken from the PartitionDirectory associated with
+ * the table that is held open long enough for the descriptor to
+ * remain valid while it's used to perform the pruning steps.
*/
- partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
+ if (estate == NULL)
+ {
+ RangeTblEntry *rte = rt_fetch(pinfo->rtindex, rtable);
+
+ partrel = table_open(rte->relid, rte->rellockmode);
+ close_partrel = true;
+ }
+ else
+ partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
+
partkey = RelationGetPartitionKey(partrel);
- partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
- partrel);
+ partdesc = PartitionDirectoryLookup(partdir, partrel);
+
+ /* Safe to close partrel, if necessary, keeping the lock taken. */
+ if (close_partrel)
+ table_close(partrel, NoLock);
/*
* Initialize the subplan_map and subpart_map.
@@ -1705,30 +1749,32 @@ ExecCreatePartitionPruneState(PlanState *planstate,
* Initialize pruning contexts as needed.
*/
pprune->initial_pruning_steps = pinfo->initial_pruning_steps;
- if (pinfo->initial_pruning_steps)
+ if (consider_initial_steps && pinfo->initial_pruning_steps)
{
ExecInitPruningContext(&pprune->initial_context,
pinfo->initial_pruning_steps,
- partdesc, partkey, planstate);
+ partdesc, partkey, planstate,
+ econtext);
/* Record whether initial pruning is needed at any level */
prunestate->do_initial_prune = true;
}
pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
- if (pinfo->exec_pruning_steps)
+ if (consider_exec_steps && pinfo->exec_pruning_steps)
{
ExecInitPruningContext(&pprune->exec_context,
pinfo->exec_pruning_steps,
- partdesc, partkey, planstate);
+ partdesc, partkey, planstate,
+ econtext);
/* Record whether exec pruning is needed at any level */
prunestate->do_exec_prune = true;
- }
- /*
- * Accumulate the IDs of all PARAM_EXEC Params affecting the
- * partitioning decisions at this plan node.
- */
- prunestate->execparamids = bms_add_members(prunestate->execparamids,
- pinfo->execparamids);
+ /*
+ * Accumulate the IDs of all PARAM_EXEC Params affecting the
+ * partitioning decisions at this plan node.
+ */
+ prunestate->execparamids = bms_add_members(prunestate->execparamids,
+ pinfo->execparamids);
+ }
j++;
}
@@ -1740,13 +1786,18 @@ ExecCreatePartitionPruneState(PlanState *planstate,
/*
* Initialize a PartitionPruneContext for the given list of pruning steps.
+ *
+ * At least one of 'planstate' or 'econtext' must be passed to be able to
+ * successfully evaluate any non-Const expressions contained in the
+ * steps.
*/
static void
ExecInitPruningContext(PartitionPruneContext *context,
List *pruning_steps,
PartitionDesc partdesc,
PartitionKey partkey,
- PlanState *planstate)
+ PlanState *planstate,
+ ExprContext *econtext)
{
int n_steps;
int partnatts;
@@ -1767,6 +1818,7 @@ ExecInitPruningContext(PartitionPruneContext *context,
context->ppccontext = CurrentMemoryContext;
context->planstate = planstate;
+ context->exprcontext = econtext;
/* Initialize expression state for each expression we need */
context->exprstates = (ExprState **)
@@ -1795,14 +1847,269 @@ ExecInitPruningContext(PartitionPruneContext *context,
step->step.step_id,
keyno);
- context->exprstates[stateidx] =
- ExecInitExpr(expr, context->planstate);
+ if (planstate == NULL)
+ context->exprstates[stateidx] =
+ ExecInitExprWithParams(expr,
+ econtext->ecxt_param_list_info);
+ else
+ context->exprstates[stateidx] =
+ ExecInitExpr(expr, context->planstate);
}
keyno++;
}
}
}
+Bitmapset *
+ExecInitPartitionPruning(PlanState *planstate, int n_total_subplans,
+ PartitionPruneInfo *pruneinfo,
+ PartitionPruneState **prunestate)
+{
+ Bitmapset *validsubplans;
+ Plan *plan = planstate->plan;
+ EState *estate = planstate->state;
+ PlanPrepOutput *planPrepResult = NULL;
+ bool do_pruning = (pruneinfo->contains_init_steps ||
+ pruneinfo->contains_exec_steps);
+
+ *prunestate = NULL;
+ if (estate->es_execprep)
+ {
+ planPrepResult = ExecPrepFetchPlanPrepOutput(estate->es_execprep,
+ plan);
+
+ Assert(planPrepResult != NULL);
+ /* No need to do initial pruning again, only exec pruning. */
+ do_pruning = pruneinfo->contains_exec_steps;
+ }
+
+ if (do_pruning)
+ {
+ /* We may need an expression context to evaluate partition exprs */
+ ExecAssignExprContext(estate, planstate);
+
+ /* For data reading, executor always omits detached partitions */
+ if (estate->es_partition_directory == NULL)
+ estate->es_partition_directory =
+ CreatePartitionDirectory(estate->es_query_cxt, false);
+
+ /*
+ * Create the working data structure for pruning. No need to consider
+ * initial pruning steps if we have a PlanPrepOutput.
+ */
+ *prunestate = ExecCreatePartitionPruneState(planstate, pruneinfo,
+ planPrepResult == NULL, true,
+ NIL, planstate->ps_ExprContext,
+ estate->es_partition_directory);
+ }
+
+ /*
+ * Perform an initial partition prune, if required.
+ */
+ if (planPrepResult)
+ {
+ /* ExecutorPrep() already did it for us! */
+ validsubplans = planPrepResult->initially_valid_subnodes;
+ }
+ else if (*prunestate && (*prunestate)->do_initial_prune)
+ {
+ /* Determine which subplans survive initial pruning */
+ validsubplans = ExecFindInitialMatchingSubPlans(*prunestate, pruneinfo,
+ NULL);
+ }
+ else
+ {
+ /* We'll need to initialize all subplans */
+ Assert(n_total_subplans > 0);
+ validsubplans = bms_add_range(NULL, 0, n_total_subplans - 1);
+ }
+
+ /*
+ * If exec-time pruning is required and subplans are pruned by initial
+ * pruning, then we must re-sequence the subplan indexes so that
+ * ExecFindMatchingSubPlans properly returns the indexes from the
+ * subplans which will remain after initial pruning.
+ *
+ * We can safely skip this when !do_exec_prune, even though that leaves
+ * invalid data in prunestate, because that data won't be consulted again
+ * (cf initial Assert in ExecFindMatchingSubPlans).
+ */
+ if (*prunestate && (*prunestate)->do_exec_prune &&
+ bms_num_members(validsubplans) < n_total_subplans)
+ ExecPartitionPruneFixSubPlanIndexes(*prunestate, validsubplans,
+ n_total_subplans);
+
+ return validsubplans;
+}
+
+/*
+ * ExecPrepDoInitialPruning
+ * Perform initial pruning as part of doing ExecPrepNode() on the parent
+ * plan node
+ */
+Bitmapset *
+ExecPrepDoInitialPruning(PartitionPruneInfo *pruneinfo,
+ List *rtable, ParamListInfo params,
+ Bitmapset **parentrelids)
+{
+ ExprContext *econtext;
+ PartitionDirectory pdir;
+ MemoryContext oldcontext,
+ tmpcontext;
+ PartitionPruneState *prunestate;
+ Bitmapset *validsubplans;
+
+ /*
+ * A temporary context to allocate stuff needed to run
+ * the pruning steps.
+ */
+ tmpcontext = AllocSetContextCreate(CurrentMemoryContext,
+ "initial pruning working data",
+ ALLOCSET_DEFAULT_SIZES);
+ oldcontext = MemoryContextSwitchTo(tmpcontext);
+
+ /* An ExprContext to evaluate expressions. */
+ econtext = CreateStandaloneExprContext();
+ econtext->ecxt_param_list_info = params;
+
+ /*
+ * PartitionDirectory, to look up partition descriptors.
+ * Omits detached partitions, just like in the executor
+ * proper.
+ */
+ pdir = CreatePartitionDirectory(CurrentMemoryContext, false);
+ prunestate = ExecCreatePartitionPruneState(NULL, pruneinfo,
+ true, false,
+ rtable, econtext,
+ pdir);
+ MemoryContextSwitchTo(oldcontext);
+
+ /* Do the "initial" pruning. */
+ validsubplans =
+ ExecFindInitialMatchingSubPlans(prunestate,
+ pruneinfo,
+ parentrelids);
+
+ FreeExprContext(econtext, true);
+ DestroyPartitionDirectory(pdir);
+ MemoryContextDelete(tmpcontext);
+
+ return validsubplans;
+}
+
+/*
+ * ExecPartitionPruneFixSubPlanIndexes
+ * Fix mapping of partition indexes to subplan indexes contained in
+ * prunestate by considering the new list of subplans that survived
+ * initial pruning
+ *
+ * Subplans were previously indexed 0..(n_total_subplans - 1), but must now
+ * be renumbered to the range 0..(num(initially_valid_subplans) - 1).
+ */
+static void
+ExecPartitionPruneFixSubPlanIndexes(PartitionPruneState *prunestate,
+ Bitmapset *initially_valid_subplans,
+ int n_total_subplans)
+{
+ int *new_subplan_indexes;
+ Bitmapset *new_other_subplans;
+ int i;
+ int newidx;
+
+ /*
+ * First we must build a temporary array which maps old subplan
+ * indexes to new ones. For convenience of initialization, we use
+ * 1-based indexes in this array and leave pruned items as 0.
+ */
+ new_subplan_indexes = (int *) palloc0(sizeof(int) * n_total_subplans);
+ newidx = 1;
+ i = -1;
+ while ((i = bms_next_member(initially_valid_subplans, i)) >= 0)
+ {
+ Assert(i < n_total_subplans);
+ new_subplan_indexes[i] = newidx++;
+ }
+
+ /*
+ * Now we can update each PartitionedRelPruneInfo's subplan_map with
+ * new subplan indexes. We must also recompute its present_parts
+ * bitmap.
+ */
+ for (i = 0; i < prunestate->num_partprunedata; i++)
+ {
+ PartitionPruningData *prunedata = prunestate->partprunedata[i];
+ int j;
+
+ /*
+ * Within each hierarchy, we perform this loop in back-to-front
+ * order so that we determine present_parts for the lowest-level
+ * partitioned tables first. This way we can tell whether a
+ * sub-partitioned table's partitions were entirely pruned so we
+ * can exclude it from the current level's present_parts.
+ */
+ for (j = prunedata->num_partrelprunedata - 1; j >= 0; j--)
+ {
+ PartitionedRelPruningData *pprune = &prunedata->partrelprunedata[j];
+ int nparts = pprune->nparts;
+ int k;
+
+ /* We just rebuild present_parts from scratch */
+ bms_free(pprune->present_parts);
+ pprune->present_parts = NULL;
+
+ for (k = 0; k < nparts; k++)
+ {
+ int oldidx = pprune->subplan_map[k];
+ int subidx;
+
+ /*
+ * If this partition existed as a subplan then change the
+ * old subplan index to the new subplan index. The new
+ * index may become -1 if the partition was pruned above,
+ * or it may just come earlier in the subplan list due to
+ * some subplans being removed earlier in the list. If
+ * it's a subpartition, add it to present_parts unless
+ * it's entirely pruned.
+ */
+ if (oldidx >= 0)
+ {
+ Assert(oldidx < n_total_subplans);
+ pprune->subplan_map[k] = new_subplan_indexes[oldidx] - 1;
+
+ if (new_subplan_indexes[oldidx] > 0)
+ pprune->present_parts =
+ bms_add_member(pprune->present_parts, k);
+ }
+ else if ((subidx = pprune->subpart_map[k]) >= 0)
+ {
+ PartitionedRelPruningData *subprune;
+
+ subprune = &prunedata->partrelprunedata[subidx];
+
+ if (!bms_is_empty(subprune->present_parts))
+ pprune->present_parts =
+ bms_add_member(pprune->present_parts, k);
+ }
+ }
+ }
+ }
+
+ /*
+ * We must also recompute the other_subplans set, since indexes in it
+ * may change.
+ */
+ new_other_subplans = NULL;
+ i = -1;
+ while ((i = bms_next_member(prunestate->other_subplans, i)) >= 0)
+ new_other_subplans = bms_add_member(new_other_subplans,
+ new_subplan_indexes[i] - 1);
+
+ bms_free(prunestate->other_subplans);
+ prunestate->other_subplans = new_other_subplans;
+
+ pfree(new_subplan_indexes);
+}
+
/*
* ExecFindInitialMatchingSubPlans
* Identify the set of subplans that cannot be eliminated by initial
@@ -1817,10 +2124,14 @@ ExecInitPruningContext(PartitionPruneContext *context,
* Must only be called once per 'prunestate', and only if initial pruning
* is required.
*
- * 'nsubplans' must be passed as the total number of unpruned subplans.
+ * The RT indexes of unpruned parents are returned in *parentrelids if asked
+ * for by the caller, in which case 'pruneinfo' must also be passed because
+ * that is where the RT indexes are to be found.
*/
Bitmapset *
-ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate, int nsubplans)
+ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate,
+ PartitionPruneInfo *pruneinfo,
+ Bitmapset **parentrelids)
{
Bitmapset *result = NULL;
MemoryContext oldcontext;
@@ -1830,11 +2141,14 @@ ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate, int nsubplans)
Assert(prunestate->do_initial_prune);
/*
- * Switch to a temp context to avoid leaking memory in the executor's
- * query-lifespan memory context.
+ * Switch to a temp context to avoid leaking memory in the longer-term
+ * memory context.
*/
oldcontext = MemoryContextSwitchTo(prunestate->prune_context);
+ if (parentrelids)
+ *parentrelids = NULL;
+
/*
* For each hierarchy, do the pruning tests, and add nondeletable
* subplans' indexes to "result".
@@ -1845,14 +2159,42 @@ ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate, int nsubplans)
PartitionedRelPruningData *pprune;
prunedata = prunestate->partprunedata[i];
+
+ /*
+ * We pass the 1st item belonging to the root table of the hierarchy
+ * and find_matching_subplans_recurse() takes care of recursing to
+ * other (lower-level) parents as needed.
+ */
pprune = &prunedata->partrelprunedata[0];
/* Perform pruning without using PARAM_EXEC Params */
find_matching_subplans_recurse(prunedata, pprune, true, &result);
- /* Expression eval may have used space in node's ps_ExprContext too */
+ /*
+ * Collect the RT indexes of surviving parents if the caller asked
+ * to see them.
+ */
+ if (parentrelids)
+ {
+ int j;
+ List *partrelpruneinfos = list_nth_node(List,
+ pruneinfo->prune_infos,
+ i);
+
+ for (j = 0; j < prunedata->num_partrelprunedata; j++)
+ {
+ PartitionedRelPruneInfo *pinfo = list_nth_node(PartitionedRelPruneInfo,
+ partrelpruneinfos, j);
+
+ pprune = &prunedata->partrelprunedata[j];
+ if (!bms_is_empty(pprune->present_parts))
+ *parentrelids = bms_add_member(*parentrelids, pinfo->rtindex);
+ }
+ }
+
+ /* Expression eval may have used space in ExprContext too */
if (pprune->initial_pruning_steps)
- ResetExprContext(pprune->initial_context.planstate->ps_ExprContext);
+ ResetExprContext(pprune->initial_context.exprcontext);
}
/* Add in any subplans that partition pruning didn't account for */
@@ -1862,120 +2204,11 @@ ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate, int nsubplans)
/* Copy result out of the temp context before we reset it */
result = bms_copy(result);
+ if (parentrelids)
+ *parentrelids = bms_copy(*parentrelids);
MemoryContextReset(prunestate->prune_context);
- /*
- * If exec-time pruning is required and we pruned subplans above, then we
- * must re-sequence the subplan indexes so that ExecFindMatchingSubPlans
- * properly returns the indexes from the subplans which will remain after
- * execution of this function.
- *
- * We can safely skip this when !do_exec_prune, even though that leaves
- * invalid data in prunestate, because that data won't be consulted again
- * (cf initial Assert in ExecFindMatchingSubPlans).
- */
- if (prunestate->do_exec_prune && bms_num_members(result) < nsubplans)
- {
- int *new_subplan_indexes;
- Bitmapset *new_other_subplans;
- int i;
- int newidx;
-
- /*
- * First we must build a temporary array which maps old subplan
- * indexes to new ones. For convenience of initialization, we use
- * 1-based indexes in this array and leave pruned items as 0.
- */
- new_subplan_indexes = (int *) palloc0(sizeof(int) * nsubplans);
- newidx = 1;
- i = -1;
- while ((i = bms_next_member(result, i)) >= 0)
- {
- Assert(i < nsubplans);
- new_subplan_indexes[i] = newidx++;
- }
-
- /*
- * Now we can update each PartitionedRelPruneInfo's subplan_map with
- * new subplan indexes. We must also recompute its present_parts
- * bitmap.
- */
- for (i = 0; i < prunestate->num_partprunedata; i++)
- {
- PartitionPruningData *prunedata = prunestate->partprunedata[i];
- int j;
-
- /*
- * Within each hierarchy, we perform this loop in back-to-front
- * order so that we determine present_parts for the lowest-level
- * partitioned tables first. This way we can tell whether a
- * sub-partitioned table's partitions were entirely pruned so we
- * can exclude it from the current level's present_parts.
- */
- for (j = prunedata->num_partrelprunedata - 1; j >= 0; j--)
- {
- PartitionedRelPruningData *pprune = &prunedata->partrelprunedata[j];
- int nparts = pprune->nparts;
- int k;
-
- /* We just rebuild present_parts from scratch */
- bms_free(pprune->present_parts);
- pprune->present_parts = NULL;
-
- for (k = 0; k < nparts; k++)
- {
- int oldidx = pprune->subplan_map[k];
- int subidx;
-
- /*
- * If this partition existed as a subplan then change the
- * old subplan index to the new subplan index. The new
- * index may become -1 if the partition was pruned above,
- * or it may just come earlier in the subplan list due to
- * some subplans being removed earlier in the list. If
- * it's a subpartition, add it to present_parts unless
- * it's entirely pruned.
- */
- if (oldidx >= 0)
- {
- Assert(oldidx < nsubplans);
- pprune->subplan_map[k] = new_subplan_indexes[oldidx] - 1;
-
- if (new_subplan_indexes[oldidx] > 0)
- pprune->present_parts =
- bms_add_member(pprune->present_parts, k);
- }
- else if ((subidx = pprune->subpart_map[k]) >= 0)
- {
- PartitionedRelPruningData *subprune;
-
- subprune = &prunedata->partrelprunedata[subidx];
-
- if (!bms_is_empty(subprune->present_parts))
- pprune->present_parts =
- bms_add_member(pprune->present_parts, k);
- }
- }
- }
- }
-
- /*
- * We must also recompute the other_subplans set, since indexes in it
- * may change.
- */
- new_other_subplans = NULL;
- i = -1;
- while ((i = bms_next_member(prunestate->other_subplans, i)) >= 0)
- new_other_subplans = bms_add_member(new_other_subplans,
- new_subplan_indexes[i] - 1);
-
- bms_free(prunestate->other_subplans);
- prunestate->other_subplans = new_other_subplans;
-
- pfree(new_subplan_indexes);
- }
-
return result;
}
@@ -2018,11 +2251,16 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate)
prunedata = prunestate->partprunedata[i];
pprune = &prunedata->partrelprunedata[0];
+ /*
+ * We pass the 1st item belonging to the root table of the hierarchy
+ * and find_matching_subplans_recurse() takes care of recursing to
+ * other (lower-level) parents as needed.
+ */
find_matching_subplans_recurse(prunedata, pprune, false, &result);
- /* Expression eval may have used space in node's ps_ExprContext too */
+ /* Expression eval may have used space in ExprContext too */
if (pprune->exec_pruning_steps)
- ResetExprContext(pprune->exec_context.planstate->ps_ExprContext);
+ ResetExprContext(pprune->exec_context.exprcontext);
}
/* Add in any subplans that partition pruning didn't account for */
diff --git a/src/backend/executor/execProcnode.c b/src/backend/executor/execProcnode.c
index b5667e53e5..d5e10756ac 100644
--- a/src/backend/executor/execProcnode.c
+++ b/src/backend/executor/execProcnode.c
@@ -123,6 +123,209 @@ static TupleTableSlot *ExecProcNodeFirst(PlanState *node);
static TupleTableSlot *ExecProcNodeInstr(PlanState *node);
+/* ------------------------------------------------------------------------
+ * ExecPrepNode
+ * Recursively "prep" all the nodes in the plan tree rooted
+ * at 'node'.
+ *
+ * 'node' is the current node of the plan produced by the query planner
+ * 'context' is the information that may be necessary to do the prep
+ * work, (such as any EXTERN parameters in the query to do partition
+ * pruning with)
+ * 'result' is the output variable to add the result into
+ *
+ * NOTE: ExecPrepNode subroutine for a given node must add the RT indexes of
+ * any relations that it manipulates to result->relationRTIs. Optionally, it
+ * can produce a PlanPrepOutput node containing the information that may be of
+ * interest to later execution steps or to any intervening modules that have
+ * access to the ExecPrepOutput and put that in
+ * result->planPrepResults[plan->plan_node_id]. For example, nodes that
+ * support partition pruning can perform the "initial" pruning steps to
+ * produce the set of "initially valid" subnodes that can be used as-is by the
+ * node's ExecInit* routine to only initialize those subnodes.
+ * ------------------------------------------------------------------------
+ */
+void
+ExecPrepNode(Plan *node, ExecPrepContext *context, ExecPrepOutput *result)
+{
+ ListCell *l;
+
+ /* Do nothing when we get to the end of a leaf on tree. */
+ if (node == NULL)
+ return;
+
+ /* Make sure there's enough stack available. */
+ check_stack_depth();
+
+ /*
+ * Write NULL for the node's PlanPrepOutput, which the node's Prep routine
+ * might write over.
+ */
+ ExecPrepStorePlanPrepOutput(result, NULL, node);
+
+ switch (nodeTag(node))
+ {
+ /*
+ * control nodes
+ */
+ case T_Result:
+ ExecPrepResult((Result *) node, context, result);
+ break;
+ case T_ProjectSet:
+ ExecPrepProjectSet((ProjectSet *) node, context, result);
+ break;
+ case T_RecursiveUnion:
+ ExecPrepRecursiveUnion((RecursiveUnion *) node, context, result);
+ break;
+ case T_BitmapAnd:
+ ExecPrepBitmapAnd((BitmapAnd *) node, context, result);
+ break;
+ case T_BitmapOr:
+ ExecPrepBitmapOr((BitmapOr *) node, context, result);
+ break;
+ case T_ModifyTable:
+ ExecPrepModifyTable((ModifyTable *) node, context, result);
+ break;
+ case T_Append:
+ ExecPrepAppend((Append *) node, context, result);
+ break;
+ case T_MergeAppend:
+ ExecPrepMergeAppend((MergeAppend *) node, context, result);
+ break;
+
+ /*
+ * scan nodes
+ */
+ case T_SeqScan:
+ ExecPrepSeqScan((SeqScan *) node, context, result);
+ break;
+ case T_SampleScan:
+ ExecPrepSampleScan((SampleScan *) node, context, result);
+ break;
+ case T_IndexScan:
+ ExecPrepIndexScan((IndexScan *) node, context, result);
+ break;
+ case T_IndexOnlyScan:
+ ExecPrepIndexOnlyScan((IndexOnlyScan *) node, context, result);
+ break;
+ case T_BitmapIndexScan:
+ ExecPrepBitmapIndexScan((BitmapIndexScan *) node, context, result);
+ break;
+ case T_BitmapHeapScan:
+ ExecPrepBitmapHeapScan((BitmapHeapScan *) node, context, result);
+ break;
+ case T_TidScan:
+ ExecPrepTidScan((TidScan *) node, context, result);
+ break;
+ case T_TidRangeScan:
+ ExecPrepTidRangeScan((TidRangeScan *) node, context, result);
+ break;
+ case T_SubqueryScan:
+ ExecPrepSubqueryScan((SubqueryScan *) node, context, result);
+ break;
+ case T_FunctionScan:
+ ExecPrepFunctionScan((FunctionScan *) node, context, result);
+ break;
+ case T_TableFuncScan:
+ ExecPrepTableFuncScan((TableFuncScan *) node, context, result);
+ break;
+ case T_ValuesScan:
+ ExecPrepValuesScan((ValuesScan *) node, context, result);
+ break;
+ case T_CteScan:
+ ExecPrepCteScan((CteScan *) node, context, result);
+ break;
+ case T_NamedTuplestoreScan:
+ ExecPrepNamedTuplestoreScan((NamedTuplestoreScan *) node, context, result);
+ break;
+ case T_WorkTableScan:
+ ExecPrepWorkTableScan((WorkTableScan *) node, context, result);
+ break;
+ case T_ForeignScan:
+ ExecPrepForeignScan((ForeignScan *) node, context, result);
+ break;
+ case T_CustomScan:
+ ExecPrepCustomScan((CustomScan *) node, context, result);
+ break;
+
+ /*
+ * join nodes: subnodes handled below
+ */
+ case T_NestLoop:
+ ExecPrepNestLoop((NestLoop *) node, context, result);
+ break;
+ case T_MergeJoin:
+ ExecPrepMergeJoin((MergeJoin *) node, context, result);
+ break;
+ case T_HashJoin:
+ ExecPrepHashJoin((HashJoin *) node, context, result);
+ break;
+
+ /*
+ * materialization nodes: subnodes handled below
+ */
+ case T_Material:
+ ExecPrepMaterial((Material *) node, context, result);
+ break;
+ case T_Sort:
+ ExecPrepSort((Sort *) node, context, result);
+ break;
+ case T_IncrementalSort:
+ ExecPrepIncrementalSort((IncrementalSort *) node, context, result);
+ break;
+ case T_Memoize:
+ ExecPrepMemoize((Memoize *) node, context, result);
+ break;
+ case T_Group:
+ ExecPrepGroup((Group *) node, context, result);
+ break;
+ case T_Agg:
+ ExecPrepAgg((Agg *) node, context, result);
+ break;
+ case T_WindowAgg:
+ ExecPrepWindowAgg((WindowAgg *) node, context, result);
+ break;
+ case T_Unique:
+ ExecPrepUnique((Unique *) node, context, result);
+ break;
+ case T_Gather:
+ ExecPrepGather((Gather *) node, context, result);
+ break;
+ case T_GatherMerge:
+ ExecPrepGatherMerge((GatherMerge *) node, context, result);
+ break;
+ case T_Hash:
+ ExecPrepHash((Hash *) node, context, result);
+ break;
+ case T_SetOp:
+ ExecPrepSetOp((SetOp *) node, context, result);
+ break;
+ case T_LockRows:
+ ExecPrepLockRows((LockRows *) node, context, result);
+ break;
+ case T_Limit:
+ ExecPrepLimit((Limit *) node, context, result);
+ break;
+
+ default:
+ elog(ERROR, "unrecognized node type: %d", (int) nodeTag(node));
+ result = NULL; /* keep compiler quiet */
+ break;
+ }
+
+ /*
+ * Prep any initPlans present in this node. The planner put them in
+ * a separate list for us.
+ */
+ foreach(l, node->initPlan)
+ {
+ SubPlan *subplan = (SubPlan *) lfirst(l);
+
+ Assert(IsA(subplan, SubPlan));
+ ExecPrepSubPlan(subplan, context, result);
+ }
+}
+
/* ------------------------------------------------------------------------
* ExecInitNode
*
@@ -157,6 +360,9 @@ ExecInitNode(Plan *node, EState *estate, int eflags)
*/
check_stack_depth();
+ /* Check that the PlanPrepOutput for the node looks sane if any. */
+ EXEC_PREP_OUTPUT_SANITY(node, estate);
+
switch (nodeTag(node))
{
/*
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 9df1f81ea8..5c85148b37 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -119,6 +119,7 @@ CreateExecutorState(void)
estate->es_relations = NULL;
estate->es_rowmarks = NULL;
estate->es_plannedstmt = NULL;
+ estate->es_execprep = NULL;
estate->es_junkFilter = NULL;
@@ -785,6 +786,13 @@ ExecGetRangeTableRelation(EState *estate, Index rti)
Assert(rti > 0 && rti <= estate->es_range_table_size);
+ /*
+ * A cross-check that AcquireExecutorLocks() hasn't missed any relations
+ * that it should have locked.
+ */
+ Assert(estate->es_execprep == NULL ||
+ bms_is_member(rti, estate->es_execprep->relationRTIs));
+
rel = estate->es_relations[rti - 1];
if (rel == NULL)
{
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index 29a68879ee..5f0ff2df2a 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -844,7 +844,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
else
dest = None_Receiver;
- es->qd = CreateQueryDesc(es->stmt,
+ es->qd = CreateQueryDesc(es->stmt, NULL,
fcache->src,
GetActiveSnapshot(),
InvalidSnapshot,
diff --git a/src/backend/executor/nodeAgg.c b/src/backend/executor/nodeAgg.c
index 08cf569d8f..f3b0ec75d3 100644
--- a/src/backend/executor/nodeAgg.c
+++ b/src/backend/executor/nodeAgg.c
@@ -3142,6 +3142,19 @@ hashagg_reset_spill_state(AggState *aggstate)
}
}
+/* ----------------------------------------------------------------
+ * ExecPrepAgg
+ *
+ * This "preps" the Agg node and the node's subplan.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepAgg(Agg *node, ExecPrepContext *context, ExecPrepOutput *result)
+{
+ /* Nothing to do besides recursing to the subplan. */
+ ExecPrepNode(outerPlan(node), context, result);
+}
+
/* -----------------
* ExecInitAgg
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 7937f1c88f..a44c8079bd 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -62,6 +62,7 @@
#include "executor/execPartition.h"
#include "executor/nodeAppend.h"
#include "miscadmin.h"
+#include "partitioning/partdesc.h"
#include "pgstat.h"
#include "storage/latch.h"
@@ -94,6 +95,62 @@ static bool ExecAppendAsyncRequest(AppendState *node, TupleTableSlot **result);
static void ExecAppendAsyncEventWait(AppendState *node);
static void classify_matching_subplans(AppendState *node);
+/* ----------------------------------------------------------------
+ * ExecPrepAppend
+ *
+ * Prep an append node
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepAppend(Append *node, ExecPrepContext *context,
+ ExecPrepOutput *result)
+{
+ PartitionPruneInfo *pruneinfo = node->part_prune_info;
+
+ if (pruneinfo && pruneinfo->contains_init_steps)
+ {
+ List *rtable = context->stmt->rtable;
+ List *subplans = node->appendplans;
+ ParamListInfo params = context->params;
+ Bitmapset *parentrelids;
+ int i;
+ PlanPrepOutput *planPrepResult = makeNode(PlanPrepOutput);
+
+ planPrepResult->plan_node_id = node->plan.plan_node_id;
+ planPrepResult->initially_valid_subnodes =
+ ExecPrepDoInitialPruning(pruneinfo, rtable, params, &parentrelids);
+ /* Replace the NULL that ExecPrepNode() would've written. */
+ ExecPrepStorePlanPrepOutput(result, planPrepResult, &node->plan);
+
+ /* All relevant parents must be reported too. */
+ Assert(bms_num_members(parentrelids) > 0);
+ result->relationRTIs = bms_add_members(result->relationRTIs,
+ parentrelids);
+
+ /* And all leaf partitions that will be scanned. */
+ i = -1;
+ while ((i = bms_next_member(planPrepResult->initially_valid_subnodes, i)) >= 0)
+ {
+ Plan *subplan = list_nth(subplans, i);
+
+ ExecPrepNode(subplan, context, result);
+ }
+ }
+ else
+ {
+ List *subplans = node->appendplans;
+ ListCell *lc;
+
+ /* Recurse to prep *all* of the node's child subplans. */
+ foreach(lc, subplans)
+ {
+ Plan *subplan = (Plan *) lfirst(lc);
+
+ ExecPrepNode(subplan, context, result);
+ }
+ }
+}
+
/* ----------------------------------------------------------------
* ExecInitAppend
*
@@ -136,39 +193,19 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
/* If run-time partition pruning is enabled, then set that up now */
if (node->part_prune_info != NULL)
{
- PartitionPruneState *prunestate;
-
- /* We may need an expression context to evaluate partition exprs */
- ExecAssignExprContext(estate, &appendstate->ps);
-
- /* Create the working data structure for pruning. */
- prunestate = ExecCreatePartitionPruneState(&appendstate->ps,
- node->part_prune_info);
- appendstate->as_prune_state = prunestate;
-
- /* Perform an initial partition prune, if required. */
- if (prunestate->do_initial_prune)
- {
- /* Determine which subplans survive initial pruning */
- validsubplans = ExecFindInitialMatchingSubPlans(prunestate,
- list_length(node->appendplans));
-
- nplans = bms_num_members(validsubplans);
- }
- else
- {
- /* We'll need to initialize all subplans */
- nplans = list_length(node->appendplans);
- Assert(nplans > 0);
- validsubplans = bms_add_range(NULL, 0, nplans - 1);
- }
+ validsubplans = ExecInitPartitionPruning(&appendstate->ps,
+ list_length(node->appendplans),
+ node->part_prune_info,
+ &appendstate->as_prune_state);
+ nplans = bms_num_members(validsubplans);
/*
* When no run-time pruning is required and there's at least one
* subplan, we can fill as_valid_subplans immediately, preventing
* later calls to ExecFindMatchingSubPlans.
*/
- if (!prunestate->do_exec_prune && nplans > 0)
+ if (appendstate->as_prune_state == NULL ||
+ (!appendstate->as_prune_state->do_exec_prune && nplans > 0))
appendstate->as_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
}
else
diff --git a/src/backend/executor/nodeBitmapAnd.c b/src/backend/executor/nodeBitmapAnd.c
index b54c79f853..4ad3e5ff81 100644
--- a/src/backend/executor/nodeBitmapAnd.c
+++ b/src/backend/executor/nodeBitmapAnd.c
@@ -45,6 +45,24 @@ ExecBitmapAnd(PlanState *pstate)
return NULL;
}
+/* ----------------------------------------------------------------
+ * ExecPrepBitmapAnd
+ *
+ * This "preps" the BitmapAnd node and the subplans.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepBitmapAnd(BitmapAnd *node, ExecPrepContext *context,
+ ExecPrepOutput *result)
+{
+ ListCell *lc;
+
+ foreach(lc, node->bitmapplans)
+ {
+ ExecPrepNode((Plan *) lfirst(lc), context, result);
+ }
+}
+
/* ----------------------------------------------------------------
* ExecInitBitmapAnd
*
diff --git a/src/backend/executor/nodeBitmapHeapscan.c b/src/backend/executor/nodeBitmapHeapscan.c
index f6fe07ad70..aaf215a4cc 100644
--- a/src/backend/executor/nodeBitmapHeapscan.c
+++ b/src/backend/executor/nodeBitmapHeapscan.c
@@ -696,6 +696,20 @@ ExecEndBitmapHeapScan(BitmapHeapScanState *node)
table_endscan(scanDesc);
}
+/* ----------------------------------------------------------------
+ * ExecPrepBitmapHeapScan
+ *
+ * This "preps" the BitmapHeapScan node.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepBitmapHeapScan(BitmapHeapScan *node, ExecPrepContext *context,
+ ExecPrepOutput *result)
+{
+ result->relationRTIs = bms_add_member(result->relationRTIs,
+ node->scan.scanrelid);
+}
+
/* ----------------------------------------------------------------
* ExecInitBitmapHeapScan
*
diff --git a/src/backend/executor/nodeBitmapIndexscan.c b/src/backend/executor/nodeBitmapIndexscan.c
index 551e47630d..bb766f71a2 100644
--- a/src/backend/executor/nodeBitmapIndexscan.c
+++ b/src/backend/executor/nodeBitmapIndexscan.c
@@ -201,6 +201,20 @@ ExecEndBitmapIndexScan(BitmapIndexScanState *node)
index_close(indexRelationDesc, NoLock);
}
+/* ----------------------------------------------------------------
+ * ExecPrepBitmapIndexScan
+ *
+ * This "preps" the BitmapIndexScan node.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepBitmapIndexScan(BitmapIndexScan *node, ExecPrepContext *context,
+ ExecPrepOutput *result)
+{
+ result->relationRTIs = bms_add_member(result->relationRTIs,
+ node->scan.scanrelid);
+}
+
/* ----------------------------------------------------------------
* ExecInitBitmapIndexScan
*
diff --git a/src/backend/executor/nodeBitmapOr.c b/src/backend/executor/nodeBitmapOr.c
index 2d57f11fe7..feb3e4a8d6 100644
--- a/src/backend/executor/nodeBitmapOr.c
+++ b/src/backend/executor/nodeBitmapOr.c
@@ -46,6 +46,24 @@ ExecBitmapOr(PlanState *pstate)
return NULL;
}
+/* ----------------------------------------------------------------
+ * ExecPrepBitmapOr
+ *
+ * This "preps" the BitmapOr node and the subplans.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepBitmapOr(BitmapOr *node, ExecPrepContext *context,
+ ExecPrepOutput *result)
+{
+ ListCell *lc;
+
+ foreach(lc, node->bitmapplans)
+ {
+ ExecPrepNode((Plan *) lfirst(lc), context, result);
+ }
+}
+
/* ----------------------------------------------------------------
* ExecInitBitmapOr
*
diff --git a/src/backend/executor/nodeCtescan.c b/src/backend/executor/nodeCtescan.c
index b9d7dec8a2..533cfb7874 100644
--- a/src/backend/executor/nodeCtescan.c
+++ b/src/backend/executor/nodeCtescan.c
@@ -166,6 +166,18 @@ ExecCteScan(PlanState *pstate)
(ExecScanRecheckMtd) CteScanRecheck);
}
+/* ----------------------------------------------------------------
+ * ExecPrepCteScan
+ *
+ * This "preps" the CteScan node.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepCteScan(CteScan *node, ExecPrepContext *context,
+ ExecPrepOutput *result)
+{
+ /* nothing to do */
+}
/* ----------------------------------------------------------------
* ExecInitCteScan
diff --git a/src/backend/executor/nodeCustom.c b/src/backend/executor/nodeCustom.c
index 8f56bd8a23..0bf1636326 100644
--- a/src/backend/executor/nodeCustom.c
+++ b/src/backend/executor/nodeCustom.c
@@ -24,6 +24,24 @@
static TupleTableSlot *ExecCustomScan(PlanState *pstate);
+/* ----------------------------------------------------------------
+ * ExecPrepCustomScan
+ *
+ * This "preps" the CustomScan node.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepCustomScan(CustomScan *node, ExecPrepContext *context, ExecPrepOutput *result)
+{
+ ListCell *lc;
+
+ result->relationRTIs = bms_add_members(result->relationRTIs,
+ node->custom_relids);
+ foreach(lc, node->custom_plans)
+ {
+ ExecPrepNode((Plan *) lfirst(lc), context, result);
+ }
+}
CustomScanState *
ExecInitCustomScan(CustomScan *cscan, EState *estate, int eflags)
diff --git a/src/backend/executor/nodeForeignscan.c b/src/backend/executor/nodeForeignscan.c
index 5b9737c2ab..ffe17ec6d5 100644
--- a/src/backend/executor/nodeForeignscan.c
+++ b/src/backend/executor/nodeForeignscan.c
@@ -134,6 +134,18 @@ ExecForeignScan(PlanState *pstate)
(ExecScanRecheckMtd) ForeignRecheck);
}
+/* ----------------------------------------------------------------
+ * ExecPrepForeignScan
+ *
+ * This "preps" the ForeignScan node.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepForeignScan(ForeignScan *node, ExecPrepContext *context, ExecPrepOutput *result)
+{
+ result->relationRTIs = bms_add_members(result->relationRTIs,
+ node->fs_relids);
+}
/* ----------------------------------------------------------------
* ExecInitForeignScan
diff --git a/src/backend/executor/nodeFunctionscan.c b/src/backend/executor/nodeFunctionscan.c
index 434379a5aa..df055ce01f 100644
--- a/src/backend/executor/nodeFunctionscan.c
+++ b/src/backend/executor/nodeFunctionscan.c
@@ -272,6 +272,19 @@ ExecFunctionScan(PlanState *pstate)
(ExecScanRecheckMtd) FunctionRecheck);
}
+/* ----------------------------------------------------------------
+ * ExecPrepFunctionScan
+ *
+ * This "preps" the FunctionScan node.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepFunctionScan(FunctionScan *node, ExecPrepContext *context,
+ ExecPrepOutput *result)
+{
+ /* nothing to do */
+}
+
/* ----------------------------------------------------------------
* ExecInitFunctionScan
* ----------------------------------------------------------------
diff --git a/src/backend/executor/nodeGather.c b/src/backend/executor/nodeGather.c
index 4f8a17df7d..0edb0ae13a 100644
--- a/src/backend/executor/nodeGather.c
+++ b/src/backend/executor/nodeGather.c
@@ -49,6 +49,19 @@ static TupleTableSlot *gather_getnext(GatherState *gatherstate);
static MinimalTuple gather_readnext(GatherState *gatherstate);
static void ExecShutdownGatherWorkers(GatherState *node);
+/* ----------------------------------------------------------------
+ * ExecPrepGather
+ *
+ * This "preps" the Gather node and the node's subplan.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepGather(Gather *node, ExecPrepContext *context, ExecPrepOutput *result)
+{
+ /* Nothing to do besides recursing to the subplan. */
+ ExecPrepNode(outerPlan(node), context, result);
+}
+
/* ----------------------------------------------------------------
* ExecInitGather
diff --git a/src/backend/executor/nodeGatherMerge.c b/src/backend/executor/nodeGatherMerge.c
index a488cc6d8b..c564d4ac25 100644
--- a/src/backend/executor/nodeGatherMerge.c
+++ b/src/backend/executor/nodeGatherMerge.c
@@ -64,6 +64,19 @@ static bool gather_merge_readnext(GatherMergeState *gm_state, int reader,
bool nowait);
static void load_tuple_array(GatherMergeState *gm_state, int reader);
+/* ----------------------------------------------------------------
+ * ExecPrepGatherMerge
+ *
+ * This "preps" the GatherMerge node and the node's subplan.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepGatherMerge(GatherMerge *node, ExecPrepContext *context, ExecPrepOutput *result)
+{
+ /* Nothing to do besides recursing to the subplan. */
+ ExecPrepNode(outerPlan(node), context, result);
+}
+
/* ----------------------------------------------------------------
* ExecInitGather
* ----------------------------------------------------------------
diff --git a/src/backend/executor/nodeGroup.c b/src/backend/executor/nodeGroup.c
index 666d02b58f..0e5bcf89bf 100644
--- a/src/backend/executor/nodeGroup.c
+++ b/src/backend/executor/nodeGroup.c
@@ -151,6 +151,19 @@ ExecGroup(PlanState *pstate)
}
}
+/* ----------------------------------------------------------------
+ * ExecPrepGroup
+ *
+ * This "preps" the Group node and the node's subplan.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepGroup(Group *node, ExecPrepContext *context, ExecPrepOutput *result)
+{
+ /* Nothing to do besides recursing to the subplan. */
+ ExecPrepNode(outerPlan(node), context, result);
+}
+
/* -----------------
* ExecInitGroup
*
diff --git a/src/backend/executor/nodeHash.c b/src/backend/executor/nodeHash.c
index 4d68a8b97b..d20e14c7fc 100644
--- a/src/backend/executor/nodeHash.c
+++ b/src/backend/executor/nodeHash.c
@@ -344,6 +344,19 @@ MultiExecParallelHash(HashState *node)
BarrierPhase(build_barrier) == PHJ_BUILD_DONE);
}
+/* ----------------------------------------------------------------
+ * ExecPrepHash
+ *
+ * This "preps" the Hash node and the node's subplan.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepHash(Hash *node, ExecPrepContext *context, ExecPrepOutput *result)
+{
+ /* Nothing to do besides recursing to the subplan. */
+ ExecPrepNode(outerPlan(node), context, result);
+}
+
/* ----------------------------------------------------------------
* ExecInitHash
*
diff --git a/src/backend/executor/nodeHashjoin.c b/src/backend/executor/nodeHashjoin.c
index 88b870655e..5665c31873 100644
--- a/src/backend/executor/nodeHashjoin.c
+++ b/src/backend/executor/nodeHashjoin.c
@@ -607,6 +607,20 @@ ExecParallelHashJoin(PlanState *pstate)
return ExecHashJoinImpl(pstate, true);
}
+/* ----------------------------------------------------------------
+ * ExecPrepHashJoin
+ *
+ * This "preps" the HashJoin node and the node's children.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepHashJoin(HashJoin *node, ExecPrepContext *context, ExecPrepOutput *result)
+{
+ /* Nothing to do besides recursing to the children. */
+ ExecPrepNode(outerPlan(node), context, result);
+ ExecPrepNode(innerPlan(node), context, result);
+}
+
/* ----------------------------------------------------------------
* ExecInitHashJoin
*
diff --git a/src/backend/executor/nodeIncrementalSort.c b/src/backend/executor/nodeIncrementalSort.c
index d6fb56dec7..c1c8fe2af6 100644
--- a/src/backend/executor/nodeIncrementalSort.c
+++ b/src/backend/executor/nodeIncrementalSort.c
@@ -964,6 +964,20 @@ ExecIncrementalSort(PlanState *pstate)
return slot;
}
+/* ----------------------------------------------------------------
+ * ExecPrepIncrementalSort
+ *
+ * This "preps" the IncrementalSort node and the node's subplan.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepIncrementalSort(IncrementalSort *node, ExecPrepContext *context,
+ ExecPrepOutput *result)
+{
+ /* Nothing to do besides recursing to the subplan. */
+ ExecPrepNode(outerPlan(node), context, result);
+}
+
/* ----------------------------------------------------------------
* ExecInitIncrementalSort
*
diff --git a/src/backend/executor/nodeIndexonlyscan.c b/src/backend/executor/nodeIndexonlyscan.c
index eb3ddd2943..ccc60c38f5 100644
--- a/src/backend/executor/nodeIndexonlyscan.c
+++ b/src/backend/executor/nodeIndexonlyscan.c
@@ -476,6 +476,20 @@ ExecIndexOnlyRestrPos(IndexOnlyScanState *node)
index_restrpos(node->ioss_ScanDesc);
}
+/* ----------------------------------------------------------------
+ * ExecPrepIndexOnlyScan
+ *
+ * This "preps" the IndexOnlyScan node.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepIndexOnlyScan(IndexOnlyScan *node, ExecPrepContext *context,
+ ExecPrepOutput *result)
+{
+ result->relationRTIs = bms_add_member(result->relationRTIs,
+ node->scan.scanrelid);
+}
+
/* ----------------------------------------------------------------
* ExecInitIndexOnlyScan
*
diff --git a/src/backend/executor/nodeIndexscan.c b/src/backend/executor/nodeIndexscan.c
index a91f135be7..5080abdd9d 100644
--- a/src/backend/executor/nodeIndexscan.c
+++ b/src/backend/executor/nodeIndexscan.c
@@ -885,6 +885,20 @@ ExecIndexRestrPos(IndexScanState *node)
index_restrpos(node->iss_ScanDesc);
}
+/* ----------------------------------------------------------------
+ * ExecPrepIndexScan
+ *
+ * This "preps" the IndexScan node.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepIndexScan(IndexScan *node, ExecPrepContext *context,
+ ExecPrepOutput *result)
+{
+ result->relationRTIs = bms_add_member(result->relationRTIs,
+ node->scan.scanrelid);
+}
+
/* ----------------------------------------------------------------
* ExecInitIndexScan
*
diff --git a/src/backend/executor/nodeLimit.c b/src/backend/executor/nodeLimit.c
index 1b91b123fa..00aa5dd577 100644
--- a/src/backend/executor/nodeLimit.c
+++ b/src/backend/executor/nodeLimit.c
@@ -437,6 +437,19 @@ compute_tuples_needed(LimitState *node)
return node->count + node->offset;
}
+/* ----------------------------------------------------------------
+ * ExecPrepLimit
+ *
+ * This "preps" the Limit node and the node's subplan.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepLimit(Limit *node, ExecPrepContext *context, ExecPrepOutput *result)
+{
+ /* Nothing to do besides recursing to the subplan. */
+ ExecPrepNode(outerPlan(node), context, result);
+}
+
/* ----------------------------------------------------------------
* ExecInitLimit
*
diff --git a/src/backend/executor/nodeLockRows.c b/src/backend/executor/nodeLockRows.c
index 1a9dab25dd..9a3d2c5583 100644
--- a/src/backend/executor/nodeLockRows.c
+++ b/src/backend/executor/nodeLockRows.c
@@ -281,6 +281,19 @@ lnext:
return slot;
}
+/* ----------------------------------------------------------------
+ * ExecPrepLockRows
+ *
+ * This "preps" the LockRows node and the node's subplan.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepLockRows(LockRows *node, ExecPrepContext *context, ExecPrepOutput *result)
+{
+ /* Nothing to do besides recursing to the subplan. */
+ ExecPrepNode(outerPlan(node), context, result);
+}
+
/* ----------------------------------------------------------------
* ExecInitLockRows
*
diff --git a/src/backend/executor/nodeMaterial.c b/src/backend/executor/nodeMaterial.c
index 2cb27e0e9a..802bf37ff1 100644
--- a/src/backend/executor/nodeMaterial.c
+++ b/src/backend/executor/nodeMaterial.c
@@ -156,6 +156,19 @@ ExecMaterial(PlanState *pstate)
return ExecClearTuple(slot);
}
+/* ----------------------------------------------------------------
+ * ExecPrepMaterial
+ *
+ * This "preps" the Material node and the node's subplan.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepMaterial(Material *node, ExecPrepContext *context, ExecPrepOutput *result)
+{
+ /* Nothing to do besides recursing to the subplan. */
+ ExecPrepNode(outerPlan(node), context, result);
+}
+
/* ----------------------------------------------------------------
* ExecInitMaterial
* ----------------------------------------------------------------
diff --git a/src/backend/executor/nodeMemoize.c b/src/backend/executor/nodeMemoize.c
index 55cdd5c4d9..eacfd5f3cb 100644
--- a/src/backend/executor/nodeMemoize.c
+++ b/src/backend/executor/nodeMemoize.c
@@ -902,6 +902,19 @@ ExecMemoize(PlanState *pstate)
} /* switch */
}
+/* ----------------------------------------------------------------
+ * ExecPrepMemoize
+ *
+ * This "preps" the Memoize node and the node's subplan.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepMemoize(Memoize *node, ExecPrepContext *context, ExecPrepOutput *result)
+{
+ /* Nothing to do besides recursing to the subplan. */
+ ExecPrepNode(outerPlan(node), context, result);
+}
+
MemoizeState *
ExecInitMemoize(Memoize *node, EState *estate, int eflags)
{
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index 418f89dea8..50f6429533 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -43,6 +43,7 @@
#include "executor/nodeMergeAppend.h"
#include "lib/binaryheap.h"
#include "miscadmin.h"
+#include "partitioning/partdesc.h"
/*
* We have one slot for each item in the heap array. We use SlotNumber
@@ -54,6 +55,62 @@ typedef int32 SlotNumber;
static TupleTableSlot *ExecMergeAppend(PlanState *pstate);
static int heap_compare_slots(Datum a, Datum b, void *arg);
+/* ----------------------------------------------------------------
+ * ExecPrepMergeAppend
+ *
+ * Prep a MergeAppend node
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepMergeAppend(MergeAppend *node, ExecPrepContext *context,
+ ExecPrepOutput *result)
+{
+ PartitionPruneInfo *pruneinfo = node->part_prune_info;
+
+ if (pruneinfo && pruneinfo->contains_init_steps)
+ {
+ List *rtable = context->stmt->rtable;
+ List *subplans = node->mergeplans;
+ ParamListInfo params = context->params;
+ Bitmapset *parentrelids;
+ int i;
+ PlanPrepOutput *planPrepResult = makeNode(PlanPrepOutput);
+
+ planPrepResult->plan_node_id = node->plan.plan_node_id;
+ planPrepResult->initially_valid_subnodes =
+ ExecPrepDoInitialPruning(pruneinfo, rtable, params, &parentrelids);
+ /* Replace the NULL that ExecPrepNode() would've written. */
+ ExecPrepStorePlanPrepOutput(result, planPrepResult, &node->plan);
+
+ /* All relevant parents must be reported too. */
+ Assert(bms_num_members(parentrelids) > 0);
+ result->relationRTIs = bms_add_members(result->relationRTIs,
+ parentrelids);
+
+ /* And all leaf partitions that will be scanned. */
+ i = -1;
+ while ((i = bms_next_member(planPrepResult->initially_valid_subnodes, i)) >= 0)
+ {
+ Plan *subplan = list_nth(subplans, i);
+
+ ExecPrepNode(subplan, context, result);
+ }
+ }
+ else
+ {
+ List *subplans = node->mergeplans;
+ ListCell *lc;
+
+ /* Recurse to prep *all* of the node's child subplans. */
+ foreach(lc, subplans)
+ {
+ Plan *subplan = (Plan *) lfirst(lc);
+
+ ExecPrepNode(subplan, context, result);
+ }
+ }
+}
+
/* ----------------------------------------------------------------
* ExecInitMergeAppend
@@ -84,38 +141,19 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
/* If run-time partition pruning is enabled, then set that up now */
if (node->part_prune_info != NULL)
{
- PartitionPruneState *prunestate;
-
- /* We may need an expression context to evaluate partition exprs */
- ExecAssignExprContext(estate, &mergestate->ps);
-
- prunestate = ExecCreatePartitionPruneState(&mergestate->ps,
- node->part_prune_info);
- mergestate->ms_prune_state = prunestate;
-
- /* Perform an initial partition prune, if required. */
- if (prunestate->do_initial_prune)
- {
- /* Determine which subplans survive initial pruning */
- validsubplans = ExecFindInitialMatchingSubPlans(prunestate,
- list_length(node->mergeplans));
-
- nplans = bms_num_members(validsubplans);
- }
- else
- {
- /* We'll need to initialize all subplans */
- nplans = list_length(node->mergeplans);
- Assert(nplans > 0);
- validsubplans = bms_add_range(NULL, 0, nplans - 1);
- }
+ validsubplans = ExecInitPartitionPruning(&mergestate->ps,
+ list_length(node->mergeplans),
+ node->part_prune_info,
+ &mergestate->ms_prune_state);
+ nplans = bms_num_members(validsubplans);
/*
* When no run-time pruning is required and there's at least one
* subplan, we can fill as_valid_subplans immediately, preventing
* later calls to ExecFindMatchingSubPlans.
*/
- if (!prunestate->do_exec_prune && nplans > 0)
+ if (mergestate->ms_prune_state == NULL ||
+ (!mergestate->ms_prune_state->do_exec_prune && nplans > 0))
mergestate->ms_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
}
else
diff --git a/src/backend/executor/nodeMergejoin.c b/src/backend/executor/nodeMergejoin.c
index a049bc4ae0..12b1790c8a 100644
--- a/src/backend/executor/nodeMergejoin.c
+++ b/src/backend/executor/nodeMergejoin.c
@@ -1428,6 +1428,20 @@ ExecMergeJoin(PlanState *pstate)
}
}
+/* ----------------------------------------------------------------
+ * ExecPrepMergeJoin
+ *
+ * This "preps" the MergeJoin node and the node's children.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepMergeJoin(MergeJoin *node, ExecPrepContext *context, ExecPrepOutput *result)
+{
+ /* Nothing to do besides recursing to the children. */
+ ExecPrepNode(outerPlan(node), context, result);
+ ExecPrepNode(innerPlan(node), context, result);
+}
+
/* ----------------------------------------------------------------
* ExecInitMergeJoin
* ----------------------------------------------------------------
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 5ec699a9bd..93a6ac062f 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -2700,6 +2700,32 @@ ExecLookupResultRelByOid(ModifyTableState *node, Oid resultoid,
return NULL;
}
+/* ----------------------------------------------------------------
+ * ExecPrepModifyTable
+ *
+ * This "preps" the ModifyTable node and the subplan.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepModifyTable(ModifyTable *node, ExecPrepContext *context,
+ ExecPrepOutput *result)
+{
+ ListCell *lc;
+
+ if (node->rootRelation > 0)
+ result->relationRTIs = bms_add_member(result->relationRTIs,
+ node->rootRelation);
+ result->relationRTIs = bms_add_member(result->relationRTIs,
+ node->nominalRelation);
+ foreach(lc, node->resultRelations)
+ {
+ result->relationRTIs = bms_add_member(result->relationRTIs,
+ lfirst_int(lc));
+ }
+
+ ExecPrepNode(outerPlan(node), context, result);
+}
+
/* ----------------------------------------------------------------
* ExecInitModifyTable
* ----------------------------------------------------------------
diff --git a/src/backend/executor/nodeNamedtuplestorescan.c b/src/backend/executor/nodeNamedtuplestorescan.c
index ca637b1b0e..5db23af93c 100644
--- a/src/backend/executor/nodeNamedtuplestorescan.c
+++ b/src/backend/executor/nodeNamedtuplestorescan.c
@@ -74,6 +74,19 @@ ExecNamedTuplestoreScan(PlanState *pstate)
(ExecScanRecheckMtd) NamedTuplestoreScanRecheck);
}
+/* ----------------------------------------------------------------
+ * ExecPrepNamedTuplestoreScan
+ *
+ * This "preps" the NamedTuplestoreScan node.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepNamedTuplestoreScan(NamedTuplestoreScan *node,
+ ExecPrepContext *context,
+ ExecPrepOutput *result)
+{
+ /* nothing to do */
+}
/* ----------------------------------------------------------------
* ExecInitNamedTuplestoreScan
diff --git a/src/backend/executor/nodeNestloop.c b/src/backend/executor/nodeNestloop.c
index 06767c3133..ffb3a94f07 100644
--- a/src/backend/executor/nodeNestloop.c
+++ b/src/backend/executor/nodeNestloop.c
@@ -255,6 +255,20 @@ ExecNestLoop(PlanState *pstate)
}
}
+/* ----------------------------------------------------------------
+ * ExecPrepNestLoop
+ *
+ * This "preps" the NestLoop node and the node's children.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepNestLoop(NestLoop *node, ExecPrepContext *context, ExecPrepOutput *result)
+{
+ /* Nothing to do besides recursing to the children. */
+ ExecPrepNode(outerPlan(node), context, result);
+ ExecPrepNode(innerPlan(node), context, result);
+}
+
/* ----------------------------------------------------------------
* ExecInitNestLoop
* ----------------------------------------------------------------
diff --git a/src/backend/executor/nodeProjectSet.c b/src/backend/executor/nodeProjectSet.c
index ea40d61b0b..1d6085a3b4 100644
--- a/src/backend/executor/nodeProjectSet.c
+++ b/src/backend/executor/nodeProjectSet.c
@@ -208,6 +208,19 @@ ExecProjectSRF(ProjectSetState *node, bool continuing)
return NULL;
}
+/* ----------------------------------------------------------------
+ * ExecPrepProjectSet
+ *
+ * This "preps" the ProjectSet node and the subplan.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepProjectSet(ProjectSet *node, ExecPrepContext *context,
+ ExecPrepOutput *result)
+{
+ ExecPrepNode(outerPlan(node), context, result);
+}
+
/* ----------------------------------------------------------------
* ExecInitProjectSet
*
diff --git a/src/backend/executor/nodeRecursiveunion.c b/src/backend/executor/nodeRecursiveunion.c
index 2d01ed7711..806c653c56 100644
--- a/src/backend/executor/nodeRecursiveunion.c
+++ b/src/backend/executor/nodeRecursiveunion.c
@@ -159,6 +159,20 @@ ExecRecursiveUnion(PlanState *pstate)
return NULL;
}
+/* ----------------------------------------------------------------
+ * ExecPrepRecursiveUnion
+ *
+ * This "preps" the RecursiveUnion node and the children.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepRecursiveUnion(RecursiveUnion *node, ExecPrepContext *context,
+ ExecPrepOutput *result)
+{
+ ExecPrepNode(outerPlan(node), context, result);
+ ExecPrepNode(innerPlan(node), context, result);
+}
+
/* ----------------------------------------------------------------
* ExecInitRecursiveUnion
* ----------------------------------------------------------------
diff --git a/src/backend/executor/nodeResult.c b/src/backend/executor/nodeResult.c
index d0413e05de..14883b6764 100644
--- a/src/backend/executor/nodeResult.c
+++ b/src/backend/executor/nodeResult.c
@@ -169,6 +169,19 @@ ExecResultRestrPos(ResultState *node)
elog(ERROR, "Result nodes do not support mark/restore");
}
+/* ----------------------------------------------------------------
+ * ExecPrepResult
+ *
+ * This "preps" the Result node and the subplan.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepResult(Result *node, ExecPrepContext *context,
+ ExecPrepOutput *result)
+{
+ ExecPrepNode(outerPlan(node), context, result);
+}
+
/* ----------------------------------------------------------------
* ExecInitResult
*
diff --git a/src/backend/executor/nodeSamplescan.c b/src/backend/executor/nodeSamplescan.c
index a03ae120f8..ef4c0775f7 100644
--- a/src/backend/executor/nodeSamplescan.c
+++ b/src/backend/executor/nodeSamplescan.c
@@ -89,6 +89,20 @@ ExecSampleScan(PlanState *pstate)
(ExecScanRecheckMtd) SampleRecheck);
}
+/* ----------------------------------------------------------------
+ * ExecPrepSampleScan
+ *
+ * This "preps" the SampleScan node.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepSampleScan(SampleScan *node, ExecPrepContext *context,
+ ExecPrepOutput *result)
+{
+ result->relationRTIs = bms_add_member(result->relationRTIs,
+ node->scan.scanrelid);
+}
+
/* ----------------------------------------------------------------
* ExecInitSampleScan
* ----------------------------------------------------------------
diff --git a/src/backend/executor/nodeSeqscan.c b/src/backend/executor/nodeSeqscan.c
index 7b58cd9162..8964c1e9b2 100644
--- a/src/backend/executor/nodeSeqscan.c
+++ b/src/backend/executor/nodeSeqscan.c
@@ -114,6 +114,19 @@ ExecSeqScan(PlanState *pstate)
(ExecScanRecheckMtd) SeqRecheck);
}
+/* ----------------------------------------------------------------
+ * ExecPrepSeqScan
+ *
+ * This "preps" the SeqScan node.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepSeqScan(SeqScan *node, ExecPrepContext *context,
+ ExecPrepOutput *result)
+{
+ result->relationRTIs = bms_add_member(result->relationRTIs,
+ node->scan.scanrelid);
+}
/* ----------------------------------------------------------------
* ExecInitSeqScan
diff --git a/src/backend/executor/nodeSetOp.c b/src/backend/executor/nodeSetOp.c
index 4b428cfa39..312aa8511f 100644
--- a/src/backend/executor/nodeSetOp.c
+++ b/src/backend/executor/nodeSetOp.c
@@ -470,6 +470,19 @@ setop_retrieve_hash_table(SetOpState *setopstate)
return NULL;
}
+/* ----------------------------------------------------------------
+ * ExecPrepSetOp
+ *
+ * This "preps" the SetOp node and the node's subplan.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepSetOp(SetOp *node, ExecPrepContext *context, ExecPrepOutput *result)
+{
+ /* Nothing to do besides recursing to the subplan. */
+ ExecPrepNode(outerPlan(node), context, result);
+}
+
/* ----------------------------------------------------------------
* ExecInitSetOp
*
diff --git a/src/backend/executor/nodeSort.c b/src/backend/executor/nodeSort.c
index 9481a622bf..c31f2634e8 100644
--- a/src/backend/executor/nodeSort.c
+++ b/src/backend/executor/nodeSort.c
@@ -203,6 +203,19 @@ ExecSort(PlanState *pstate)
return slot;
}
+/* ----------------------------------------------------------------
+ * ExecPrepSort
+ *
+ * This "preps" the Sort node and the node's subplan.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepSort(Sort *node, ExecPrepContext *context, ExecPrepOutput *result)
+{
+ /* Nothing to do besides recursing to the subplan. */
+ ExecPrepNode(outerPlan(node), context, result);
+}
+
/* ----------------------------------------------------------------
* ExecInitSort
*
diff --git a/src/backend/executor/nodeSubplan.c b/src/backend/executor/nodeSubplan.c
index 60d2290030..b95084ddb2 100644
--- a/src/backend/executor/nodeSubplan.c
+++ b/src/backend/executor/nodeSubplan.c
@@ -775,6 +775,18 @@ slotNoNulls(TupleTableSlot *slot)
return true;
}
+/* ----------------------------------------------------------------
+ * ExecPrepSubPlan
+ *
+ * This "preps" the SubPlan node; there is currently nothing to do.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepSubPlan(SubPlan *node, ExecPrepContext *context, ExecPrepOutput *result)
+{
+ /* nothing to do */
+}
+
/* ----------------------------------------------------------------
* ExecInitSubPlan
*
diff --git a/src/backend/executor/nodeSubqueryscan.c b/src/backend/executor/nodeSubqueryscan.c
index 242c9cd4b9..cc0d62ca85 100644
--- a/src/backend/executor/nodeSubqueryscan.c
+++ b/src/backend/executor/nodeSubqueryscan.c
@@ -89,6 +89,20 @@ ExecSubqueryScan(PlanState *pstate)
(ExecScanRecheckMtd) SubqueryRecheck);
}
+/* ----------------------------------------------------------------
+ * ExecPrepSubqueryScan
+ *
+ * This "preps" the SubqueryScan node and the subplan.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepSubqueryScan(SubqueryScan *node, ExecPrepContext *context,
+ ExecPrepOutput *result)
+{
+ /* Nothing to do besides recursing to the subplan. */
+ ExecPrepNode((Plan *) node->subplan, context, result);
+}
+
/* ----------------------------------------------------------------
* ExecInitSubqueryScan
* ----------------------------------------------------------------
diff --git a/src/backend/executor/nodeTableFuncscan.c b/src/backend/executor/nodeTableFuncscan.c
index 0db4ed0c2f..dccecb3916 100644
--- a/src/backend/executor/nodeTableFuncscan.c
+++ b/src/backend/executor/nodeTableFuncscan.c
@@ -83,6 +83,19 @@ TableFuncRecheck(TableFuncScanState *node, TupleTableSlot *slot)
return true;
}
+/* ----------------------------------------------------------------
+ * ExecPrepTableFuncScan
+ *
+ * This "preps" the TableFuncScan node.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepTableFuncScan(TableFuncScan *node, ExecPrepContext *context,
+ ExecPrepOutput *result)
+{
+ /* nothing to do */
+}
+
/* ----------------------------------------------------------------
* ExecTableFuncScan(node)
*
diff --git a/src/backend/executor/nodeTidrangescan.c b/src/backend/executor/nodeTidrangescan.c
index d5bf1be787..1c05ce8035 100644
--- a/src/backend/executor/nodeTidrangescan.c
+++ b/src/backend/executor/nodeTidrangescan.c
@@ -340,6 +340,20 @@ ExecEndTidRangeScan(TidRangeScanState *node)
ExecClearTuple(node->ss.ss_ScanTupleSlot);
}
+/* ----------------------------------------------------------------
+ * ExecPrepTidRangeScan
+ *
+ * This "preps" the TidRangeScan node.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepTidRangeScan(TidRangeScan *node, ExecPrepContext *context,
+ ExecPrepOutput *result)
+{
+ result->relationRTIs = bms_add_member(result->relationRTIs,
+ node->scan.scanrelid);
+}
+
/* ----------------------------------------------------------------
* ExecInitTidRangeScan
*
diff --git a/src/backend/executor/nodeTidscan.c b/src/backend/executor/nodeTidscan.c
index 4116d1f3b5..6031ab52b6 100644
--- a/src/backend/executor/nodeTidscan.c
+++ b/src/backend/executor/nodeTidscan.c
@@ -408,7 +408,6 @@ TidRecheck(TidScanState *node, TupleTableSlot *slot)
return true;
}
-
/* ----------------------------------------------------------------
* ExecTidScan(node)
*
@@ -483,6 +482,20 @@ ExecEndTidScan(TidScanState *node)
ExecClearTuple(node->ss.ss_ScanTupleSlot);
}
+/* ----------------------------------------------------------------
+ * ExecPrepTidScan
+ *
+ * This "preps" the TidScan node.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepTidScan(TidScan *node, ExecPrepContext *context,
+ ExecPrepOutput *result)
+{
+ result->relationRTIs = bms_add_member(result->relationRTIs,
+ node->scan.scanrelid);
+}
+
/* ----------------------------------------------------------------
* ExecInitTidScan
*
diff --git a/src/backend/executor/nodeUnique.c b/src/backend/executor/nodeUnique.c
index 6c99d13a39..87c1b53515 100644
--- a/src/backend/executor/nodeUnique.c
+++ b/src/backend/executor/nodeUnique.c
@@ -104,6 +104,19 @@ ExecUnique(PlanState *pstate)
return ExecCopySlot(resultTupleSlot, slot);
}
+/* ----------------------------------------------------------------
+ * ExecPrepUnique
+ *
+ * This "preps" the unique node and the node's subplan.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepUnique(Unique *node, ExecPrepContext *context, ExecPrepOutput *result)
+{
+ /* Nothing to do besides recursing to the subplan. */
+ ExecPrepNode(outerPlan(node), context, result);
+}
+
/* ----------------------------------------------------------------
* ExecInitUnique
*
diff --git a/src/backend/executor/nodeValuesscan.c b/src/backend/executor/nodeValuesscan.c
index dda1c59b23..6cf7fd77d6 100644
--- a/src/backend/executor/nodeValuesscan.c
+++ b/src/backend/executor/nodeValuesscan.c
@@ -203,6 +203,19 @@ ExecValuesScan(PlanState *pstate)
(ExecScanRecheckMtd) ValuesRecheck);
}
+/* ----------------------------------------------------------------
+ * ExecPrepValuesScan
+ *
+ * This "preps" the ValuesScan node.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepValuesScan(ValuesScan *node, ExecPrepContext *context,
+ ExecPrepOutput *result)
+{
+ /* nothing to do */
+}
+
/* ----------------------------------------------------------------
* ExecInitValuesScan
* ----------------------------------------------------------------
diff --git a/src/backend/executor/nodeWindowAgg.c b/src/backend/executor/nodeWindowAgg.c
index 08ce05ca5a..90b7494bee 100644
--- a/src/backend/executor/nodeWindowAgg.c
+++ b/src/backend/executor/nodeWindowAgg.c
@@ -2238,6 +2238,19 @@ ExecWindowAgg(PlanState *pstate)
return ExecProject(winstate->ss.ps.ps_ProjInfo);
}
+/* ----------------------------------------------------------------
+ * ExecPrepWindowAgg
+ *
+ * This "preps" the WindowAgg node and the node's subplan.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepWindowAgg(WindowAgg *node, ExecPrepContext *context, ExecPrepOutput *result)
+{
+ /* Nothing to do besides recursing to the subplan. */
+ ExecPrepNode(outerPlan(node), context, result);
+}
+
/* -----------------
* ExecInitWindowAgg
*
diff --git a/src/backend/executor/nodeWorktablescan.c b/src/backend/executor/nodeWorktablescan.c
index 15fd71fb32..71a2ac7e40 100644
--- a/src/backend/executor/nodeWorktablescan.c
+++ b/src/backend/executor/nodeWorktablescan.c
@@ -121,6 +121,18 @@ ExecWorkTableScan(PlanState *pstate)
(ExecScanRecheckMtd) WorkTableScanRecheck);
}
+/* ----------------------------------------------------------------
+ * ExecPrepWorkTableScan
+ *
+ * This "preps" the WorkTableScan node.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepWorkTableScan(WorkTableScan *node, ExecPrepContext *context,
+ ExecPrepOutput *result)
+{
+ /* nothing to do */
+}
/* ----------------------------------------------------------------
* ExecInitWorkTableScan
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index c93f90de9b..84c1b22ccb 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1485,6 +1485,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
CachedPlanSource *plansource;
CachedPlan *cplan;
List *stmt_list;
+ List *stmt_execprep_list;
char *query_string;
Snapshot snapshot;
MemoryContext oldcontext;
@@ -1566,6 +1567,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
/* Replan if needed, and increment plan refcount for portal */
cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv);
stmt_list = cplan->stmt_list;
+ stmt_execprep_list = cplan->stmt_execprep_list;
if (!plan->saved)
{
@@ -1577,6 +1579,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
*/
oldcontext = MemoryContextSwitchTo(portal->portalContext);
stmt_list = copyObject(stmt_list);
+ stmt_execprep_list = copyObject(stmt_execprep_list);
MemoryContextSwitchTo(oldcontext);
ReleaseCachedPlan(cplan, NULL);
cplan = NULL; /* portal shouldn't depend on cplan */
@@ -1590,6 +1593,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
query_string,
plansource->commandTag,
stmt_list,
+ stmt_execprep_list,
cplan);
/*
@@ -2380,7 +2384,9 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
{
CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
List *stmt_list;
- ListCell *lc2;
+ List *stmt_execprep_list;
+ ListCell *lc2,
+ *lc3;
spicallbackarg.query = plansource->query_string;
@@ -2459,6 +2465,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
plan_owner, _SPI_current->queryEnv);
stmt_list = cplan->stmt_list;
+ stmt_execprep_list = cplan->stmt_execprep_list;
/*
* If we weren't given a specific snapshot to use, and the statement
@@ -2496,9 +2503,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
}
}
- foreach(lc2, stmt_list)
+ forboth(lc2, stmt_list, lc3, stmt_execprep_list)
{
PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+ ExecPrepOutput *execprep = lfirst_node(ExecPrepOutput, lc3);
bool canSetTag = stmt->canSetTag;
DestReceiver *dest;
@@ -2570,7 +2578,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
else
snap = InvalidSnapshot;
- qdesc = CreateQueryDesc(stmt,
+ qdesc = CreateQueryDesc(stmt, execprep,
plansource->query_string,
snap, crosscheck_snapshot,
dest,
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 6bd95bbce2..89101256cf 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -68,6 +68,18 @@
} \
} while (0)
+/* Copy a field that is an array with numElem of Node objects */
+#define COPY_NODE_ARRAY(fldname, numElem) \
+ do { \
+ int i; \
+ newnode->fldname = numElem > 0 ? \
+ palloc(numElem * sizeof(from->fldname[0])) : NULL; \
+ for (i = 0; i < numElem; i++) \
+ { \
+ newnode->fldname[i] = copyObject(from->fldname[i]); \
+ } \
+ } while (0)
+
/* Copy a parse location field (for Copy, this is same as scalar case) */
#define COPY_LOCATION_FIELD(fldname) \
(newnode->fldname = from->fldname)
@@ -94,9 +106,12 @@ _copyPlannedStmt(const PlannedStmt *from)
COPY_SCALAR_FIELD(transientPlan);
COPY_SCALAR_FIELD(dependsOnRole);
COPY_SCALAR_FIELD(parallelModeNeeded);
+ COPY_SCALAR_FIELD(usesPreExecPruning);
COPY_SCALAR_FIELD(jitFlags);
COPY_NODE_FIELD(planTree);
+ COPY_SCALAR_FIELD(numPlanNodes);
COPY_NODE_FIELD(rtable);
+ COPY_BITMAPSET_FIELD(relationRTIs);
COPY_NODE_FIELD(resultRelations);
COPY_NODE_FIELD(appendRelations);
COPY_NODE_FIELD(subplans);
@@ -1278,6 +1293,8 @@ _copyPartitionPruneInfo(const PartitionPruneInfo *from)
PartitionPruneInfo *newnode = makeNode(PartitionPruneInfo);
COPY_NODE_FIELD(prune_infos);
+ COPY_SCALAR_FIELD(contains_init_steps);
+ COPY_SCALAR_FIELD(contains_exec_steps);
COPY_BITMAPSET_FIELD(other_subplans);
return newnode;
@@ -4984,6 +5001,28 @@ _copyBitString(const BitString *from)
return newnode;
}
+static ExecPrepOutput *
+_copyExecPrepOutput(const ExecPrepOutput *from)
+{
+ ExecPrepOutput *newnode = makeNode(ExecPrepOutput);
+
+ COPY_BITMAPSET_FIELD(relationRTIs);
+ COPY_SCALAR_FIELD(numPlanNodes);
+ COPY_NODE_ARRAY(planPrepResults, from->numPlanNodes);
+
+ return newnode;
+}
+
+static PlanPrepOutput *
+_copyPlanPrepOutput(const PlanPrepOutput *from)
+{
+ PlanPrepOutput *newnode = makeNode(PlanPrepOutput);
+
+ COPY_SCALAR_FIELD(plan_node_id);
+ COPY_BITMAPSET_FIELD(initially_valid_subnodes);
+
+ return newnode;
+}
static ForeignKeyCacheInfo *
_copyForeignKeyCacheInfo(const ForeignKeyCacheInfo *from)
@@ -5930,6 +5969,16 @@ copyObjectImpl(const void *from)
retval = _copyPublicationTable(from);
break;
+ /*
+ * EXECUTION NODES
+ */
+ case T_ExecPrepOutput:
+ retval = _copyExecPrepOutput(from);
+ break;
+ case T_PlanPrepOutput:
+ retval = _copyPlanPrepOutput(from);
+ break;
+
/*
* MISCELLANEOUS NODES
*/
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 6bdad462c7..9fe247d505 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -312,9 +312,12 @@ _outPlannedStmt(StringInfo str, const PlannedStmt *node)
WRITE_BOOL_FIELD(transientPlan);
WRITE_BOOL_FIELD(dependsOnRole);
WRITE_BOOL_FIELD(parallelModeNeeded);
+ WRITE_BOOL_FIELD(usesPreExecPruning);
WRITE_INT_FIELD(jitFlags);
WRITE_NODE_FIELD(planTree);
+ WRITE_INT_FIELD(numPlanNodes);
WRITE_NODE_FIELD(rtable);
+ WRITE_BITMAPSET_FIELD(relationRTIs);
WRITE_NODE_FIELD(resultRelations);
WRITE_NODE_FIELD(appendRelations);
WRITE_NODE_FIELD(subplans);
@@ -1004,6 +1007,8 @@ _outPartitionPruneInfo(StringInfo str, const PartitionPruneInfo *node)
WRITE_NODE_TYPE("PARTITIONPRUNEINFO");
WRITE_NODE_FIELD(prune_infos);
+ WRITE_BOOL_FIELD(contains_init_steps);
+ WRITE_BOOL_FIELD(contains_exec_steps);
WRITE_BITMAPSET_FIELD(other_subplans);
}
@@ -2274,6 +2279,7 @@ _outPlannerGlobal(StringInfo str, const PlannerGlobal *node)
WRITE_NODE_FIELD(subplans);
WRITE_BITMAPSET_FIELD(rewindPlanIDs);
WRITE_NODE_FIELD(finalrtable);
+ WRITE_BITMAPSET_FIELD(relationRTIs);
WRITE_NODE_FIELD(finalrowmarks);
WRITE_NODE_FIELD(resultRelations);
WRITE_NODE_FIELD(appendRelations);
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 3f68f7c18d..7ecb9ad73c 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -1585,9 +1585,12 @@ _readPlannedStmt(void)
READ_BOOL_FIELD(transientPlan);
READ_BOOL_FIELD(dependsOnRole);
READ_BOOL_FIELD(parallelModeNeeded);
+ READ_BOOL_FIELD(usesPreExecPruning);
READ_INT_FIELD(jitFlags);
READ_NODE_FIELD(planTree);
+ READ_INT_FIELD(numPlanNodes);
READ_NODE_FIELD(rtable);
+ READ_BITMAPSET_FIELD(relationRTIs);
READ_NODE_FIELD(resultRelations);
READ_NODE_FIELD(appendRelations);
READ_NODE_FIELD(subplans);
@@ -2534,6 +2537,8 @@ _readPartitionPruneInfo(void)
READ_LOCALS(PartitionPruneInfo);
READ_NODE_FIELD(prune_infos);
+ READ_BOOL_FIELD(contains_init_steps);
+ READ_BOOL_FIELD(contains_exec_steps);
READ_BITMAPSET_FIELD(other_subplans);
READ_DONE();
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index bd09f85aea..70c5b9d88b 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -517,8 +517,11 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
result->transientPlan = glob->transientPlan;
result->dependsOnRole = glob->dependsOnRole;
result->parallelModeNeeded = glob->parallelModeNeeded;
+ result->usesPreExecPruning = glob->usesPreExecPruning;
result->planTree = top_plan;
+ result->numPlanNodes = glob->lastPlanNodeId;
result->rtable = glob->finalrtable;
+ result->relationRTIs = glob->relationRTIs;
result->resultRelations = glob->resultRelations;
result->appendRelations = glob->appendRelations;
result->subplans = glob->subplans;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index a7b11b7f03..c1b1cf503d 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -483,6 +483,7 @@ static void
add_rte_to_flat_rtable(PlannerGlobal *glob, RangeTblEntry *rte)
{
RangeTblEntry *newrte;
+ Index rti = list_length(glob->finalrtable) + 1;
/* flat copy to duplicate all the scalar fields */
newrte = (RangeTblEntry *) palloc(sizeof(RangeTblEntry));
@@ -517,7 +518,10 @@ add_rte_to_flat_rtable(PlannerGlobal *glob, RangeTblEntry *rte)
* but it would probably cost more cycles than it would save.
*/
if (newrte->rtekind == RTE_RELATION)
+ {
+ glob->relationRTIs = bms_add_member(glob->relationRTIs, rti);
glob->relationOids = lappend_oid(glob->relationOids, newrte->relid);
+ }
}
/*
@@ -1548,6 +1552,9 @@ set_append_references(PlannerInfo *root,
pinfo->rtindex += rtoffset;
}
}
+
+ if (aplan->part_prune_info->contains_init_steps)
+ root->glob->usesPreExecPruning = true;
}
/* We don't need to recurse to lefttree or righttree ... */
@@ -1620,6 +1627,9 @@ set_mergeappend_references(PlannerInfo *root,
pinfo->rtindex += rtoffset;
}
}
+
+ if (mplan->part_prune_info->contains_init_steps)
+ root->glob->usesPreExecPruning = true;
}
/* We don't need to recurse to lefttree or righttree ... */
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 1bc00826c1..390d4e4c06 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -144,7 +144,9 @@ static List *make_partitionedrel_pruneinfo(PlannerInfo *root,
List *prunequal,
Bitmapset *partrelids,
int *relid_subplan_map,
- Bitmapset **matchedsubplans);
+ Bitmapset **matchedsubplans,
+ bool *contains_init_steps,
+ bool *contains_exec_steps);
static void gen_partprune_steps(RelOptInfo *rel, List *clauses,
PartClauseTarget target,
GeneratePruningStepsContext *context);
@@ -230,6 +232,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int *relid_subplan_map;
ListCell *lc;
int i;
+ bool contains_init_steps = false;
+ bool contains_exec_steps = false;
/*
* Scan the subpaths to see which ones are scans of partition child
@@ -309,12 +313,16 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
Bitmapset *partrelids = (Bitmapset *) lfirst(lc);
List *pinfolist;
Bitmapset *matchedsubplans = NULL;
+ bool partrel_contains_init_steps;
+ bool partrel_contains_exec_steps;
pinfolist = make_partitionedrel_pruneinfo(root, parentrel,
prunequal,
partrelids,
relid_subplan_map,
- &matchedsubplans);
+ &matchedsubplans,
+ &partrel_contains_init_steps,
+ &partrel_contains_exec_steps);
/* When pruning is possible, record the matched subplans */
if (pinfolist != NIL)
@@ -323,6 +331,10 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
allmatchedsubplans = bms_join(matchedsubplans,
allmatchedsubplans);
}
+ if (!contains_init_steps)
+ contains_init_steps = partrel_contains_init_steps;
+ if (!contains_exec_steps)
+ contains_exec_steps = partrel_contains_exec_steps;
}
pfree(relid_subplan_map);
@@ -337,6 +349,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
/* Else build the result data structure */
pruneinfo = makeNode(PartitionPruneInfo);
pruneinfo->prune_infos = prunerelinfos;
+ pruneinfo->contains_init_steps = contains_init_steps;
+ pruneinfo->contains_exec_steps = contains_exec_steps;
/*
* Some subplans may not belong to any of the identified partitioned rels.
@@ -435,13 +449,18 @@ add_part_relids(List *allpartrelids, Bitmapset *partrelids)
* If we cannot find any useful run-time pruning steps, return NIL.
* However, on success, each rel identified in partrelids will have
* an element in the result list, even if some of them are useless.
+ * *contains_init_steps and *contains_exec_steps are set to indicate whether
+ * the returned PartitionedRelPruneInfos contain pruning steps that can be
+ * performed before and after execution begins, respectively.
*/
static List *
make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
List *prunequal,
Bitmapset *partrelids,
int *relid_subplan_map,
- Bitmapset **matchedsubplans)
+ Bitmapset **matchedsubplans,
+ bool *contains_init_steps,
+ bool *contains_exec_steps)
{
RelOptInfo *targetpart = NULL;
List *pinfolist = NIL;
@@ -452,6 +471,10 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int rti;
int i;
+ /* Will find out below. */
+ *contains_init_steps = false;
+ *contains_exec_steps = false;
+
/*
* Examine each partitioned rel, constructing a temporary array to map
* from planner relids to index of the partitioned rel, and building a
@@ -539,6 +562,9 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
* executor per-scan pruning steps. This first pass creates startup
* pruning steps and detects whether there's any possibly-useful quals
* that would require per-scan pruning.
+ *
+ * In the first pass, we note whether the 2nd pass is necessary by
+ * noting the presence of EXEC parameters.
*/
gen_partprune_steps(subpart, partprunequal, PARTTARGET_INITIAL,
&context);
@@ -613,6 +639,11 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
pinfo->execparamids = execparamids;
/* Remaining fields will be filled in the next loop */
+ if (!*contains_init_steps)
+ *contains_init_steps = (initial_pruning_steps != NIL);
+ if (!*contains_exec_steps)
+ *contains_exec_steps = (exec_pruning_steps != NIL);
+
pinfolist = lappend(pinfolist, pinfo);
}
@@ -798,6 +829,7 @@ prune_append_rel_partitions(RelOptInfo *rel)
/* These are not valid when being called from the planner */
context.planstate = NULL;
+ context.exprcontext = NULL;
context.exprstates = NULL;
/* Actual pruning happens here. */
@@ -808,8 +840,8 @@ prune_append_rel_partitions(RelOptInfo *rel)
* get_matching_partitions
* Determine partitions that survive partition pruning
*
- * Note: context->planstate must be set to a valid PlanState when the
- * pruning_steps were generated with a target other than PARTTARGET_PLANNER.
+ * Note: context->exprcontext must be valid when the pruning_steps were
+ * generated with a target other than PARTTARGET_PLANNER.
*
* Returns a Bitmapset of the RelOptInfo->part_rels indexes of the surviving
* partitions.
@@ -3654,7 +3686,7 @@ match_boolean_partition_clause(Oid partopfamily, Expr *clause, Expr *partkey,
* exprstate array.
*
* Note that the evaluated result may be in the per-tuple memory context of
- * context->planstate->ps_ExprContext, and we may have leaked other memory
+ * context->exprcontext, and we may have leaked other memory
* there too. This memory must be recovered by resetting that ExprContext
* after we're done with the pruning operation (see execPartition.c).
*/
@@ -3677,13 +3709,18 @@ partkey_datum_from_expr(PartitionPruneContext *context,
ExprContext *ectx;
/*
- * We should never see a non-Const in a step unless we're running in
- * the executor.
+ * We should never see a non-Const in a step unless the caller has
+ * passed a valid ExprContext.
+ *
+ * When context->planstate is valid, context->exprcontext is the same
+ * as context->planstate->ps_ExprContext.
*/
- Assert(context->planstate != NULL);
+ Assert(context->planstate != NULL || context->exprcontext != NULL);
+ Assert(context->planstate == NULL ||
+ (context->exprcontext == context->planstate->ps_ExprContext));
exprstate = context->exprstates[stateidx];
- ectx = context->planstate->ps_ExprContext;
+ ectx = context->exprcontext;
*value = ExecEvalExprSwitchContext(exprstate, ectx, isnull);
}
}
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index fda2e9360e..5d8f3fc3cb 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -910,15 +910,17 @@ pg_plan_query(Query *querytree, const char *query_string, int cursorOptions,
* For normal optimizable statements, invoke the planner. For utility
* statements, just make a wrapper PlannedStmt node.
*
- * The result is a list of PlannedStmt nodes.
+ * The result is a list of PlannedStmt nodes. Also, a NULL is appended to
+ * *stmt_execprep_list for each PlannedStmt added to the returned list.
*/
List *
pg_plan_queries(List *querytrees, const char *query_string, int cursorOptions,
- ParamListInfo boundParams)
+ ParamListInfo boundParams, List **stmt_execprep_list)
{
List *stmt_list = NIL;
ListCell *query_list;
+ *stmt_execprep_list = NIL;
foreach(query_list, querytrees)
{
Query *query = lfirst_node(Query, query_list);
@@ -942,6 +944,7 @@ pg_plan_queries(List *querytrees, const char *query_string, int cursorOptions,
}
stmt_list = lappend(stmt_list, stmt);
+ *stmt_execprep_list = lappend(*stmt_execprep_list, NULL);
}
return stmt_list;
@@ -1045,7 +1048,8 @@ exec_simple_query(const char *query_string)
QueryCompletion qc;
MemoryContext per_parsetree_context = NULL;
List *querytree_list,
- *plantree_list;
+ *plantree_list,
+ *plantree_execprep_list;
Portal portal;
DestReceiver *receiver;
int16 format;
@@ -1132,7 +1136,8 @@ exec_simple_query(const char *query_string)
NULL, 0, NULL);
plantree_list = pg_plan_queries(querytree_list, query_string,
- CURSOR_OPT_PARALLEL_OK, NULL);
+ CURSOR_OPT_PARALLEL_OK, NULL,
+ &plantree_execprep_list);
/*
* Done with the snapshot used for parsing/planning.
@@ -1168,6 +1173,7 @@ exec_simple_query(const char *query_string)
query_string,
commandTag,
plantree_list,
+ plantree_execprep_list,
NULL);
/*
@@ -1978,6 +1984,7 @@ exec_bind_message(StringInfo input_message)
query_string,
psrc->commandTag,
cplan->stmt_list,
+ cplan->stmt_execprep_list,
cplan);
/* Done with the snapshot used for parameter I/O and parsing/planning */
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 5f907831a3..b76aa3ef3b 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -35,7 +35,7 @@
Portal ActivePortal = NULL;
-static void ProcessQuery(PlannedStmt *plan,
+static void ProcessQuery(PlannedStmt *plan, ExecPrepOutput *execprep,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -65,6 +65,7 @@ static void DoPortalRewind(Portal portal);
*/
QueryDesc *
CreateQueryDesc(PlannedStmt *plannedstmt,
+ ExecPrepOutput *execprep,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
@@ -77,6 +78,7 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
qd->operation = plannedstmt->commandType; /* operation */
qd->plannedstmt = plannedstmt; /* plan */
+ qd->execprep = execprep; /* ExecutorPrep() output for plan */
qd->sourceText = sourceText; /* query text */
qd->snapshot = RegisterSnapshot(snapshot); /* snapshot */
/* RI check snapshot */
@@ -122,6 +124,7 @@ FreeQueryDesc(QueryDesc *qdesc)
* PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
*
* plan: the plan tree for the query
+ * execprep: ExecutorPrep() output for the plan tree
* sourceText: the source text of the query
* params: any parameters needed
* dest: where to send results
@@ -134,6 +137,7 @@ FreeQueryDesc(QueryDesc *qdesc)
*/
static void
ProcessQuery(PlannedStmt *plan,
+ ExecPrepOutput *execprep,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -145,7 +149,7 @@ ProcessQuery(PlannedStmt *plan,
/*
* Create the QueryDesc object
*/
- queryDesc = CreateQueryDesc(plan, sourceText,
+ queryDesc = CreateQueryDesc(plan, execprep, sourceText,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
@@ -490,6 +494,7 @@ PortalStart(Portal portal, ParamListInfo params,
* the destination to DestNone.
*/
queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+ linitial_node(ExecPrepOutput, portal->stmt_execpreps),
portal->sourceText,
GetActiveSnapshot(),
InvalidSnapshot,
@@ -1190,7 +1195,8 @@ PortalRunMulti(Portal portal,
QueryCompletion *qc)
{
bool active_snapshot_set = false;
- ListCell *stmtlist_item;
+ ListCell *stmtlist_item,
+ *execpreplist_item;
/*
* If the destination is DestRemoteExecute, change to DestNone. The
@@ -1211,9 +1217,12 @@ PortalRunMulti(Portal portal,
* Loop to handle the individual queries generated from a single parsetree
* by analysis and rewrite.
*/
- foreach(stmtlist_item, portal->stmts)
+ forboth(stmtlist_item, portal->stmts,
+ execpreplist_item, portal->stmt_execpreps)
{
PlannedStmt *pstmt = lfirst_node(PlannedStmt, stmtlist_item);
+ ExecPrepOutput *execprep = lfirst_node(ExecPrepOutput,
+ execpreplist_item);
/*
* If we got a cancel signal in prior command, quit
@@ -1271,7 +1280,7 @@ PortalRunMulti(Portal portal,
if (pstmt->canSetTag)
{
/* statement can set tag string */
- ProcessQuery(pstmt,
+ ProcessQuery(pstmt, execprep,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
@@ -1280,7 +1289,7 @@ PortalRunMulti(Portal portal,
else
{
/* stmt added by rewrite cannot set tag */
- ProcessQuery(pstmt,
+ ProcessQuery(pstmt, execprep,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 4a9055e6bb..221738dddc 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -58,12 +58,14 @@
#include "access/transam.h"
#include "catalog/namespace.h"
+#include "executor/execPartition.h"
#include "executor/executor.h"
#include "miscadmin.h"
#include "nodes/nodeFuncs.h"
#include "optimizer/optimizer.h"
#include "parser/analyze.h"
#include "parser/parsetree.h"
+#include "partitioning/partdesc.h"
#include "storage/lmgr.h"
#include "tcop/pquery.h"
#include "tcop/utility.h"
@@ -99,14 +101,15 @@ static dlist_head cached_expression_list = DLIST_STATIC_INIT(cached_expression_l
static void ReleaseGenericPlan(CachedPlanSource *plansource);
static List *RevalidateCachedQuery(CachedPlanSource *plansource,
QueryEnvironment *queryEnv);
-static bool CheckCachedPlan(CachedPlanSource *plansource);
+static bool CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams);
static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
ParamListInfo boundParams, QueryEnvironment *queryEnv);
static bool choose_custom_plan(CachedPlanSource *plansource,
ParamListInfo boundParams);
static double cached_plan_cost(CachedPlan *plan, bool include_planner);
static Query *QueryListGetPrimaryStmt(List *stmts);
-static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static List *AcquireExecutorLocks(List *stmt_list, bool acquire,
+ ParamListInfo boundParams);
static void AcquirePlannerLocks(List *stmt_list, bool acquire);
static void ScanQueryForLocks(Query *parsetree, bool acquire);
static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -782,6 +785,47 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
return tlist;
}
+/*
+ * CachedPlanSaveExecPrepOutputs
+ * Save the list containing ExecPrepOutput nodes in the given CachedPlan
+ *
+ * The provided list is copied into a dedicated context that is a child of
+ * plan->context.
+ */
+static void
+CachedPlanSaveExecPrepOutputs(CachedPlan *plan, List *execprep_list)
+{
+ MemoryContext execprep_context = plan->execprep_context,
+ oldcontext = CurrentMemoryContext;
+ List *execprep_list_copy;
+
+ /*
+ * Set up the dedicated context if not already done, saving it as a child
+ * of the CachedPlan's context.
+ */
+ if (execprep_context == NULL)
+ {
+ execprep_context = AllocSetContextCreate(CurrentMemoryContext,
+ "CachedPlan execprep list",
+ ALLOCSET_START_SMALL_SIZES);
+ MemoryContextSetParent(execprep_context, plan->context);
+ MemoryContextSetIdentifier(execprep_context, plan->context->ident);
+ plan->execprep_context = execprep_context;
+ }
+ else
+ {
+ /* Just clear existing contents by resetting the context. */
+ Assert(MemoryContextIsValid(execprep_context));
+ MemoryContextReset(execprep_context);
+ }
+
+ MemoryContextSwitchTo(execprep_context);
+ execprep_list_copy = copyObject(execprep_list);
+ MemoryContextSwitchTo(oldcontext);
+
+ plan->stmt_execprep_list = execprep_list_copy;
+}
+
/*
* CheckCachedPlan: see if the CachedPlanSource's generic plan is valid.
*
@@ -790,9 +834,16 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
*
* On a "true" return, we have acquired the locks needed to run the plan.
* (We must do this for the "true" result to be race-condition-free.)
+ *
+ * If the CachedPlan is valid, this prepares the PlannedStmts contained in it
+ * for execution by invoking ExecutorPrep() on each. Resulting ExecPrepOutput
+ * nodes, allocated in a child context of the context containing the plan
+ * itself, are added into plan->stmt_execprep_list. ExecPrepOutput nodes that
+ * may be present in the list from the last invocation of CheckCachedPlan() on
+ * the same CachedPlan are deleted.
*/
static bool
-CheckCachedPlan(CachedPlanSource *plansource)
+CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams)
{
CachedPlan *plan = plansource->gplan;
@@ -820,13 +871,22 @@ CheckCachedPlan(CachedPlanSource *plansource)
*/
if (plan->is_valid)
{
+ List *execprep_list;
+
/*
* Plan must have positive refcount because it is referenced by
* plansource; so no need to fear it disappears under us here.
*/
Assert(plan->refcount > 0);
- AcquireExecutorLocks(plan->stmt_list, true);
+ /*
+ * Take executor locks on the plan tree and perform other
+ * preparatory actions on it by invoking ExecutorPrep(). A list of
+ * ExecPrepOutput nodes is generated as a result, which is saved in the
+ * CachedPlan.
+ */
+ execprep_list = AcquireExecutorLocks(plan->stmt_list, true, boundParams);
+ CachedPlanSaveExecPrepOutputs(plan, execprep_list);
/*
* If plan was transient, check to see if TransactionXmin has
@@ -848,7 +908,7 @@ CheckCachedPlan(CachedPlanSource *plansource)
}
/* Oops, the race case happened. Release useless locks. */
- AcquireExecutorLocks(plan->stmt_list, false);
+ (void) AcquireExecutorLocks(plan->stmt_list, false, boundParams);
}
/*
@@ -880,7 +940,8 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
ParamListInfo boundParams, QueryEnvironment *queryEnv)
{
CachedPlan *plan;
- List *plist;
+ List *plist,
+ *execprep_list;
bool snapshot_set;
bool is_transient;
MemoryContext plan_context;
@@ -933,7 +994,8 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
* Generate the plan.
*/
plist = pg_plan_queries(qlist, plansource->query_string,
- plansource->cursor_options, boundParams);
+ plansource->cursor_options, boundParams,
+ &execprep_list);
/* Release snapshot if we got one */
if (snapshot_set)
@@ -1002,6 +1064,11 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
plan->is_saved = false;
plan->is_valid = true;
+ /* Save the dummy ExecPrepOutput list. */
+ plan->execprep_context = NULL;
+ CachedPlanSaveExecPrepOutputs(plan, execprep_list);
+ Assert(MemoryContextIsValid(plan->execprep_context));
+
/* assign generation number to new plan */
plan->generation = ++(plansource->generation);
@@ -1160,7 +1227,7 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
if (!customplan)
{
- if (CheckCachedPlan(plansource))
+ if (CheckCachedPlan(plansource, boundParams))
{
/* We want a generic plan, and we already have a valid one */
plan = plansource->gplan;
@@ -1366,7 +1433,6 @@ CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
foreach(lc, plan->stmt_list)
{
PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc);
- ListCell *lc2;
if (plannedstmt->commandType == CMD_UTILITY)
return false;
@@ -1375,13 +1441,8 @@ CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
* We have to grovel through the rtable because it's likely to contain
* an RTE_RESULT relation, rather than being totally empty.
*/
- foreach(lc2, plannedstmt->rtable)
- {
- RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
-
- if (rte->rtekind == RTE_RELATION)
- return false;
- }
+ if (!bms_is_empty(plannedstmt->relationRTIs))
+ return false;
}
/*
@@ -1738,16 +1799,22 @@ QueryListGetPrimaryStmt(List *stmts)
/*
* AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
* or release them if acquire is false.
+ *
+ * Returns a list of ExecPrepOutput nodes containing one element for each
+ * PlannedStmt in stmt_list; the element is NULL for a utility statement.
*/
-static void
-AcquireExecutorLocks(List *stmt_list, bool acquire)
+static List *
+AcquireExecutorLocks(List *stmt_list, bool acquire, ParamListInfo boundParams)
{
ListCell *lc1;
+ List *stmt_execprep_list = NIL;
foreach(lc1, stmt_list)
{
PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
- ListCell *lc2;
+ ExecPrepContext *context;
+ ExecPrepOutput *execprep = NULL;
+ int rti;
if (plannedstmt->commandType == CMD_UTILITY)
{
@@ -1762,28 +1829,46 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
if (query)
ScanQueryForLocks(query, acquire);
- continue;
}
-
- foreach(lc2, plannedstmt->rtable)
+ else
{
- RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
-
- if (rte->rtekind != RTE_RELATION)
- continue;
-
/*
- * Acquire the appropriate type of lock on each relation OID. Note
- * that we don't actually try to open the rel, and hence will not
- * fail if it's been dropped entirely --- we'll just transiently
- * acquire a non-conflicting lock.
+ * Prep the plan tree for execution.
*/
- if (acquire)
- LockRelationOid(rte->relid, rte->rellockmode);
- else
- UnlockRelationOid(rte->relid, rte->rellockmode);
+ context = makeNode(ExecPrepContext);
+ context->stmt = plannedstmt;
+ context->params = boundParams;
+ execprep = ExecutorPrep(context);
+
+ rti = -1;
+ while ((rti = bms_next_member(execprep->relationRTIs, rti)) >= 0)
+ {
+ RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
+
+ if (rte->rtekind != RTE_RELATION)
+ continue;
+
+ /*
+ * Acquire the appropriate type of lock on each relation OID.
+ * Note that we don't actually try to open the rel, and hence
+ * will not fail if it's been dropped entirely --- we'll just
+ * transiently acquire a non-conflicting lock.
+ */
+ if (acquire)
+ LockRelationOid(rte->relid, rte->rellockmode);
+ else
+ UnlockRelationOid(rte->relid, rte->rellockmode);
+ }
}
+
+ /*
+ * Keep the invariant that stmt_execprep_list is the same length as
+ * stmt_list.
+ */
+ stmt_execprep_list = lappend(stmt_execprep_list, execprep);
}
+
+ return stmt_execprep_list;
}
/*
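[Editorial aside, not part of the patch.] The reworked AcquireExecutorLocks() above walks the set of range table indexes reported by ExecutorPrep() instead of the whole range table, and locks only the entries that are plain relations. The following is a minimal, self-contained sketch of that loop shape, assuming a 64-bit mask in place of a Bitmapset and a per-relation counter in place of the lock manager. ToyRte, toy_next_member() and toy_lock_survivors() are hypothetical names made up for illustration; none of them exist in PostgreSQL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins for illustration only; these are not PostgreSQL types. */
typedef enum { TOY_RTE_RELATION, TOY_RTE_RESULT } ToyRteKind;

typedef struct ToyRte
{
    ToyRteKind  kind;
    int         relid;      /* index into the caller's lock_counts array */
} ToyRte;

/*
 * Return the smallest member of 'set' greater than 'prev', or -1 if there
 * is none.  Passing prev = -1 starts the scan, mirroring how the patch
 * drives its loop with bms_next_member().
 */
static int
toy_next_member(uint64_t set, int prev)
{
    for (int i = prev + 1; i < 64; i++)
    {
        if (set & (UINT64_C(1) << i))
            return i;
    }
    return -1;
}

/*
 * Visit only the range table entries whose 1-based index is in 'survivors'.
 * Entries that are not plain relations are skipped.  A lock is modelled as
 * a counter: incremented when 'acquire' is true, decremented otherwise.
 * Returns the number of relations touched.
 */
static int
toy_lock_survivors(const ToyRte *rtable, int nrte, uint64_t survivors,
                   bool acquire, int *lock_counts)
{
    int         touched = 0;
    int         rti = -1;

    while ((rti = toy_next_member(survivors, rti)) >= 0)
    {
        const ToyRte *rte;

        if (rti < 1 || rti > nrte)
            continue;           /* ignore indexes outside the range table */

        rte = &rtable[rti - 1]; /* range table indexes are 1-based */
        if (rte->kind != TOY_RTE_RELATION)
            continue;

        lock_counts[rte->relid] += acquire ? 1 : -1;
        touched++;
    }

    return touched;
}
```

Calling toy_lock_survivors() twice with the same mask, first with acquire = true and then with acquire = false, leaves every counter where it started. The release path in CheckCachedPlan() leans on the same property: it must visit the same set of relations that the acquire path visited.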
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index 236f450a2b..5cf1339ffd 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -284,6 +284,7 @@ PortalDefineQuery(Portal portal,
const char *sourceText,
CommandTag commandTag,
List *stmts,
+ List *stmt_execpreps,
CachedPlan *cplan)
{
AssertArg(PortalIsValid(portal));
@@ -298,6 +299,7 @@ PortalDefineQuery(Portal portal,
portal->qc.nprocessed = 0;
portal->commandTag = commandTag;
portal->stmts = stmts;
+ portal->stmt_execpreps = stmt_execpreps;
portal->cplan = cplan;
portal->status = PORTAL_DEFINED;
}
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 666977fb1f..f553649a5d 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -87,7 +87,8 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
ExplainState *es, const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv);
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt, ExecPrepOutput *execprep,
+ IntoClause *into,
ExplainState *es, const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv,
const instr_time *planduration,
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 603d8becc4..785a09f15f 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -119,10 +119,21 @@ extern ResultRelInfo *ExecFindPartition(ModifyTableState *mtstate,
EState *estate);
extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
PartitionTupleRouting *proute);
+
extern PartitionPruneState *ExecCreatePartitionPruneState(PlanState *planstate,
- PartitionPruneInfo *partitionpruneinfo);
-extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate);
+ PartitionPruneInfo *partitionpruneinfo,
+ bool consider_initial_steps,
+ bool consider_exec_steps,
+ List *rtable, ExprContext *econtext,
+ PartitionDirectory partdir);
extern Bitmapset *ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate,
- int nsubplans);
-
+ PartitionPruneInfo *pruneinfo,
+ Bitmapset **parentrelids);
+extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate);
+extern Bitmapset *ExecInitPartitionPruning(PlanState *planstate, int n_total_subplans,
+ PartitionPruneInfo *pruneinfo,
+ PartitionPruneState **prunestate);
+extern Bitmapset *ExecPrepDoInitialPruning(PartitionPruneInfo *pruneinfo,
+ List *rtable, ParamListInfo params,
+ Bitmapset **parentrelids);
#endif /* EXECPARTITION_H */
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index e79e2c001f..491ceef401 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -35,6 +35,7 @@ typedef struct QueryDesc
/* These fields are provided by CreateQueryDesc */
CmdType operation; /* CMD_SELECT, CMD_UPDATE, etc. */
PlannedStmt *plannedstmt; /* planner's output (could be utility, too) */
+ ExecPrepOutput *execprep; /* ExecutorPrep()'s output given plannedstmt */
const char *sourceText; /* source text of the query */
Snapshot snapshot; /* snapshot to use for query */
Snapshot crosscheck_snapshot; /* crosscheck for RI update/delete */
@@ -57,6 +58,7 @@ typedef struct QueryDesc
/* in pquery.c */
extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
+ ExecPrepOutput *execprep,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index 344399f6a8..627cb19a4c 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -185,6 +185,7 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
/*
* prototypes from functions in execMain.c
*/
+extern ExecPrepOutput *ExecutorPrep(ExecPrepContext *context);
extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
extern void ExecutorRun(QueryDesc *queryDesc,
@@ -233,6 +234,8 @@ extern void EvalPlanQualEnd(EPQState *epqstate);
/*
* functions in execProcnode.c
*/
+extern void ExecPrepNode(Plan *node, ExecPrepContext *context,
+ ExecPrepOutput *result);
extern PlanState *ExecInitNode(Plan *node, EState *estate, int eflags);
extern void ExecSetExecProcNode(PlanState *node, ExecProcNodeMtd function);
extern Node *MultiExecProcNode(PlanState *node);
diff --git a/src/include/executor/nodeAgg.h b/src/include/executor/nodeAgg.h
index 4d1bd92999..2dd7570067 100644
--- a/src/include/executor/nodeAgg.h
+++ b/src/include/executor/nodeAgg.h
@@ -314,6 +314,7 @@ typedef struct AggStatePerHashData
} AggStatePerHashData;
+extern void ExecPrepAgg(Agg *node, ExecPrepContext *context, ExecPrepOutput *result);
extern AggState *ExecInitAgg(Agg *node, EState *estate, int eflags);
extern void ExecEndAgg(AggState *node);
extern void ExecReScanAgg(AggState *node);
diff --git a/src/include/executor/nodeAppend.h b/src/include/executor/nodeAppend.h
index 4cb78ee5b6..85bc9d30a6 100644
--- a/src/include/executor/nodeAppend.h
+++ b/src/include/executor/nodeAppend.h
@@ -17,6 +17,7 @@
#include "access/parallel.h"
#include "nodes/execnodes.h"
+extern void ExecPrepAppend(Append *node, ExecPrepContext *context, ExecPrepOutput *execprep);
extern AppendState *ExecInitAppend(Append *node, EState *estate, int eflags);
extern void ExecEndAppend(AppendState *node);
extern void ExecReScanAppend(AppendState *node);
diff --git a/src/include/executor/nodeBitmapAnd.h b/src/include/executor/nodeBitmapAnd.h
index bae6a83826..aafb10a2aa 100644
--- a/src/include/executor/nodeBitmapAnd.h
+++ b/src/include/executor/nodeBitmapAnd.h
@@ -16,6 +16,7 @@
#include "nodes/execnodes.h"
+extern void ExecPrepBitmapAnd(BitmapAnd *node, ExecPrepContext *context, ExecPrepOutput *result);
extern BitmapAndState *ExecInitBitmapAnd(BitmapAnd *node, EState *estate, int eflags);
extern Node *MultiExecBitmapAnd(BitmapAndState *node);
extern void ExecEndBitmapAnd(BitmapAndState *node);
diff --git a/src/include/executor/nodeBitmapHeapscan.h b/src/include/executor/nodeBitmapHeapscan.h
index 789522cb8d..7240d9fa93 100644
--- a/src/include/executor/nodeBitmapHeapscan.h
+++ b/src/include/executor/nodeBitmapHeapscan.h
@@ -17,6 +17,7 @@
#include "access/parallel.h"
#include "nodes/execnodes.h"
+extern void ExecPrepBitmapHeapScan(BitmapHeapScan *node, ExecPrepContext *context, ExecPrepOutput *result);
extern BitmapHeapScanState *ExecInitBitmapHeapScan(BitmapHeapScan *node, EState *estate, int eflags);
extern void ExecEndBitmapHeapScan(BitmapHeapScanState *node);
extern void ExecReScanBitmapHeapScan(BitmapHeapScanState *node);
diff --git a/src/include/executor/nodeBitmapIndexscan.h b/src/include/executor/nodeBitmapIndexscan.h
index 01fb6ef536..6759724c2e 100644
--- a/src/include/executor/nodeBitmapIndexscan.h
+++ b/src/include/executor/nodeBitmapIndexscan.h
@@ -16,6 +16,7 @@
#include "nodes/execnodes.h"
+extern void ExecPrepBitmapIndexScan(BitmapIndexScan *node, ExecPrepContext *context, ExecPrepOutput *result);
extern BitmapIndexScanState *ExecInitBitmapIndexScan(BitmapIndexScan *node, EState *estate, int eflags);
extern Node *MultiExecBitmapIndexScan(BitmapIndexScanState *node);
extern void ExecEndBitmapIndexScan(BitmapIndexScanState *node);
diff --git a/src/include/executor/nodeBitmapOr.h b/src/include/executor/nodeBitmapOr.h
index ad90812cc1..66ddc18f63 100644
--- a/src/include/executor/nodeBitmapOr.h
+++ b/src/include/executor/nodeBitmapOr.h
@@ -16,6 +16,7 @@
#include "nodes/execnodes.h"
+extern void ExecPrepBitmapOr(BitmapOr *node, ExecPrepContext *context, ExecPrepOutput *result);
extern BitmapOrState *ExecInitBitmapOr(BitmapOr *node, EState *estate, int eflags);
extern Node *MultiExecBitmapOr(BitmapOrState *node);
extern void ExecEndBitmapOr(BitmapOrState *node);
diff --git a/src/include/executor/nodeCtescan.h b/src/include/executor/nodeCtescan.h
index 317d142b16..7908ae51df 100644
--- a/src/include/executor/nodeCtescan.h
+++ b/src/include/executor/nodeCtescan.h
@@ -16,6 +16,7 @@
#include "nodes/execnodes.h"
+extern void ExecPrepCteScan(CteScan *node, ExecPrepContext *context, ExecPrepOutput *result);
extern CteScanState *ExecInitCteScan(CteScan *node, EState *estate, int eflags);
extern void ExecEndCteScan(CteScanState *node);
extern void ExecReScanCteScan(CteScanState *node);
diff --git a/src/include/executor/nodeCustom.h b/src/include/executor/nodeCustom.h
index 5ef890144f..8c1d05f64b 100644
--- a/src/include/executor/nodeCustom.h
+++ b/src/include/executor/nodeCustom.h
@@ -18,6 +18,7 @@
/*
* General executor code
*/
+extern void ExecPrepCustomScan(CustomScan *node, ExecPrepContext *context, ExecPrepOutput *result);
extern CustomScanState *ExecInitCustomScan(CustomScan *cscan,
EState *estate, int eflags);
extern void ExecEndCustomScan(CustomScanState *node);
diff --git a/src/include/executor/nodeForeignscan.h b/src/include/executor/nodeForeignscan.h
index c9fbaed79c..a2d6667011 100644
--- a/src/include/executor/nodeForeignscan.h
+++ b/src/include/executor/nodeForeignscan.h
@@ -17,6 +17,7 @@
#include "access/parallel.h"
#include "nodes/execnodes.h"
+extern void ExecPrepForeignScan(ForeignScan *node, ExecPrepContext *context, ExecPrepOutput *result);
extern ForeignScanState *ExecInitForeignScan(ForeignScan *node, EState *estate, int eflags);
extern void ExecEndForeignScan(ForeignScanState *node);
extern void ExecReScanForeignScan(ForeignScanState *node);
diff --git a/src/include/executor/nodeFunctionscan.h b/src/include/executor/nodeFunctionscan.h
index 7a598a1d46..8686bb5c09 100644
--- a/src/include/executor/nodeFunctionscan.h
+++ b/src/include/executor/nodeFunctionscan.h
@@ -16,6 +16,7 @@
#include "nodes/execnodes.h"
+extern void ExecPrepFunctionScan(FunctionScan *node, ExecPrepContext *context, ExecPrepOutput *result);
extern FunctionScanState *ExecInitFunctionScan(FunctionScan *node, EState *estate, int eflags);
extern void ExecEndFunctionScan(FunctionScanState *node);
extern void ExecReScanFunctionScan(FunctionScanState *node);
diff --git a/src/include/executor/nodeGather.h b/src/include/executor/nodeGather.h
index 29829ffe9a..206185ffbc 100644
--- a/src/include/executor/nodeGather.h
+++ b/src/include/executor/nodeGather.h
@@ -16,6 +16,7 @@
#include "nodes/execnodes.h"
+extern void ExecPrepGather(Gather *node, ExecPrepContext *context, ExecPrepOutput *result);
extern GatherState *ExecInitGather(Gather *node, EState *estate, int eflags);
extern void ExecEndGather(GatherState *node);
extern void ExecShutdownGather(GatherState *node);
diff --git a/src/include/executor/nodeGatherMerge.h b/src/include/executor/nodeGatherMerge.h
index d724d5fea4..b124a3fe99 100644
--- a/src/include/executor/nodeGatherMerge.h
+++ b/src/include/executor/nodeGatherMerge.h
@@ -16,6 +16,7 @@
#include "nodes/execnodes.h"
+extern void ExecPrepGatherMerge(GatherMerge *node, ExecPrepContext *context, ExecPrepOutput *result);
extern GatherMergeState *ExecInitGatherMerge(GatherMerge *node,
EState *estate,
int eflags);
diff --git a/src/include/executor/nodeGroup.h b/src/include/executor/nodeGroup.h
index 816ed2c099..7e86abab01 100644
--- a/src/include/executor/nodeGroup.h
+++ b/src/include/executor/nodeGroup.h
@@ -16,6 +16,7 @@
#include "nodes/execnodes.h"
+extern void ExecPrepGroup(Group *node, ExecPrepContext *context, ExecPrepOutput *result);
extern GroupState *ExecInitGroup(Group *node, EState *estate, int eflags);
extern void ExecEndGroup(GroupState *node);
extern void ExecReScanGroup(GroupState *node);
diff --git a/src/include/executor/nodeHash.h b/src/include/executor/nodeHash.h
index e1e0dec24b..1426a6e9a1 100644
--- a/src/include/executor/nodeHash.h
+++ b/src/include/executor/nodeHash.h
@@ -19,6 +19,7 @@
struct SharedHashJoinBatch;
+extern void ExecPrepHash(Hash *node, ExecPrepContext *context, ExecPrepOutput *result);
extern HashState *ExecInitHash(Hash *node, EState *estate, int eflags);
extern Node *MultiExecHash(HashState *node);
extern void ExecEndHash(HashState *node);
diff --git a/src/include/executor/nodeHashjoin.h b/src/include/executor/nodeHashjoin.h
index b3b5a2c3f2..6dc88282d4 100644
--- a/src/include/executor/nodeHashjoin.h
+++ b/src/include/executor/nodeHashjoin.h
@@ -18,6 +18,7 @@
#include "nodes/execnodes.h"
#include "storage/buffile.h"
+extern void ExecPrepHashJoin(HashJoin *node, ExecPrepContext *context, ExecPrepOutput *result);
extern HashJoinState *ExecInitHashJoin(HashJoin *node, EState *estate, int eflags);
extern void ExecEndHashJoin(HashJoinState *node);
extern void ExecReScanHashJoin(HashJoinState *node);
diff --git a/src/include/executor/nodeIncrementalSort.h b/src/include/executor/nodeIncrementalSort.h
index 84cfd96b13..e909cb784b 100644
--- a/src/include/executor/nodeIncrementalSort.h
+++ b/src/include/executor/nodeIncrementalSort.h
@@ -15,6 +15,7 @@
#include "access/parallel.h"
#include "nodes/execnodes.h"
+extern void ExecPrepIncrementalSort(IncrementalSort *node, ExecPrepContext *context, ExecPrepOutput *result);
extern IncrementalSortState *ExecInitIncrementalSort(IncrementalSort *node, EState *estate, int eflags);
extern void ExecEndIncrementalSort(IncrementalSortState *node);
extern void ExecReScanIncrementalSort(IncrementalSortState *node);
diff --git a/src/include/executor/nodeIndexonlyscan.h b/src/include/executor/nodeIndexonlyscan.h
index 47b03950ea..d0aca7a303 100644
--- a/src/include/executor/nodeIndexonlyscan.h
+++ b/src/include/executor/nodeIndexonlyscan.h
@@ -17,6 +17,7 @@
#include "access/parallel.h"
#include "nodes/execnodes.h"
+extern void ExecPrepIndexOnlyScan(IndexOnlyScan *node, ExecPrepContext *context, ExecPrepOutput *result);
extern IndexOnlyScanState *ExecInitIndexOnlyScan(IndexOnlyScan *node, EState *estate, int eflags);
extern void ExecEndIndexOnlyScan(IndexOnlyScanState *node);
extern void ExecIndexOnlyMarkPos(IndexOnlyScanState *node);
diff --git a/src/include/executor/nodeIndexscan.h b/src/include/executor/nodeIndexscan.h
index 0a075f9aea..d57c370466 100644
--- a/src/include/executor/nodeIndexscan.h
+++ b/src/include/executor/nodeIndexscan.h
@@ -18,6 +18,7 @@
#include "access/parallel.h"
#include "nodes/execnodes.h"
+extern void ExecPrepIndexScan(IndexScan *node, ExecPrepContext *context, ExecPrepOutput *result);
extern IndexScanState *ExecInitIndexScan(IndexScan *node, EState *estate, int eflags);
extern void ExecEndIndexScan(IndexScanState *node);
extern void ExecIndexMarkPos(IndexScanState *node);
diff --git a/src/include/executor/nodeLimit.h b/src/include/executor/nodeLimit.h
index 6da0c4026c..05d7e4797b 100644
--- a/src/include/executor/nodeLimit.h
+++ b/src/include/executor/nodeLimit.h
@@ -16,6 +16,7 @@
#include "nodes/execnodes.h"
+extern void ExecPrepLimit(Limit *node, ExecPrepContext *context, ExecPrepOutput *result);
extern LimitState *ExecInitLimit(Limit *node, EState *estate, int eflags);
extern void ExecEndLimit(LimitState *node);
extern void ExecReScanLimit(LimitState *node);
diff --git a/src/include/executor/nodeLockRows.h b/src/include/executor/nodeLockRows.h
index 125a32b608..157d4a7f0e 100644
--- a/src/include/executor/nodeLockRows.h
+++ b/src/include/executor/nodeLockRows.h
@@ -16,6 +16,7 @@
#include "nodes/execnodes.h"
+extern void ExecPrepLockRows(LockRows *node, ExecPrepContext *context, ExecPrepOutput *result);
extern LockRowsState *ExecInitLockRows(LockRows *node, EState *estate, int eflags);
extern void ExecEndLockRows(LockRowsState *node);
extern void ExecReScanLockRows(LockRowsState *node);
diff --git a/src/include/executor/nodeMaterial.h b/src/include/executor/nodeMaterial.h
index 21a6860a1a..9b70d6e97b 100644
--- a/src/include/executor/nodeMaterial.h
+++ b/src/include/executor/nodeMaterial.h
@@ -16,6 +16,7 @@
#include "nodes/execnodes.h"
+extern void ExecPrepMaterial(Material *node, ExecPrepContext *context, ExecPrepOutput *result);
extern MaterialState *ExecInitMaterial(Material *node, EState *estate, int eflags);
extern void ExecEndMaterial(MaterialState *node);
extern void ExecMaterialMarkPos(MaterialState *node);
diff --git a/src/include/executor/nodeMemoize.h b/src/include/executor/nodeMemoize.h
index 4643163dc7..53a784f012 100644
--- a/src/include/executor/nodeMemoize.h
+++ b/src/include/executor/nodeMemoize.h
@@ -17,6 +17,7 @@
#include "access/parallel.h"
#include "nodes/execnodes.h"
+extern void ExecPrepMemoize(Memoize *node, ExecPrepContext *context, ExecPrepOutput *result);
extern MemoizeState *ExecInitMemoize(Memoize *node, EState *estate, int eflags);
extern void ExecEndMemoize(MemoizeState *node);
extern void ExecReScanMemoize(MemoizeState *node);
diff --git a/src/include/executor/nodeMergeAppend.h b/src/include/executor/nodeMergeAppend.h
index 97fe3b0665..60a9136de6 100644
--- a/src/include/executor/nodeMergeAppend.h
+++ b/src/include/executor/nodeMergeAppend.h
@@ -16,6 +16,7 @@
#include "nodes/execnodes.h"
+extern void ExecPrepMergeAppend(MergeAppend *node, ExecPrepContext *context, ExecPrepOutput *result);
extern MergeAppendState *ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags);
extern void ExecEndMergeAppend(MergeAppendState *node);
extern void ExecReScanMergeAppend(MergeAppendState *node);
diff --git a/src/include/executor/nodeMergejoin.h b/src/include/executor/nodeMergejoin.h
index 26ab517508..29553d5dd0 100644
--- a/src/include/executor/nodeMergejoin.h
+++ b/src/include/executor/nodeMergejoin.h
@@ -16,6 +16,7 @@
#include "nodes/execnodes.h"
+extern void ExecPrepMergeJoin(MergeJoin *node, ExecPrepContext *context, ExecPrepOutput *result);
extern MergeJoinState *ExecInitMergeJoin(MergeJoin *node, EState *estate, int eflags);
extern void ExecEndMergeJoin(MergeJoinState *node);
extern void ExecReScanMergeJoin(MergeJoinState *node);
diff --git a/src/include/executor/nodeModifyTable.h b/src/include/executor/nodeModifyTable.h
index 1d225bc88d..4b1846f8ff 100644
--- a/src/include/executor/nodeModifyTable.h
+++ b/src/include/executor/nodeModifyTable.h
@@ -19,6 +19,7 @@ extern void ExecComputeStoredGenerated(ResultRelInfo *resultRelInfo,
EState *estate, TupleTableSlot *slot,
CmdType cmdtype);
+extern void ExecPrepModifyTable(ModifyTable *node, ExecPrepContext *context, ExecPrepOutput *result);
extern ModifyTableState *ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags);
extern void ExecEndModifyTable(ModifyTableState *node);
extern void ExecReScanModifyTable(ModifyTableState *node);
diff --git a/src/include/executor/nodeNamedtuplestorescan.h b/src/include/executor/nodeNamedtuplestorescan.h
index d595124e54..964afcd816 100644
--- a/src/include/executor/nodeNamedtuplestorescan.h
+++ b/src/include/executor/nodeNamedtuplestorescan.h
@@ -16,6 +16,7 @@
#include "nodes/execnodes.h"
+extern void ExecPrepNamedTuplestoreScan(NamedTuplestoreScan *node, ExecPrepContext *context, ExecPrepOutput *result);
extern NamedTuplestoreScanState *ExecInitNamedTuplestoreScan(NamedTuplestoreScan *node, EState *estate, int eflags);
extern void ExecEndNamedTuplestoreScan(NamedTuplestoreScanState *node);
extern void ExecReScanNamedTuplestoreScan(NamedTuplestoreScanState *node);
diff --git a/src/include/executor/nodeNestloop.h b/src/include/executor/nodeNestloop.h
index b1411faf57..13ea4cc870 100644
--- a/src/include/executor/nodeNestloop.h
+++ b/src/include/executor/nodeNestloop.h
@@ -16,6 +16,7 @@
#include "nodes/execnodes.h"
+extern void ExecPrepNestLoop(NestLoop *node, ExecPrepContext *context, ExecPrepOutput *result);
extern NestLoopState *ExecInitNestLoop(NestLoop *node, EState *estate, int eflags);
extern void ExecEndNestLoop(NestLoopState *node);
extern void ExecReScanNestLoop(NestLoopState *node);
diff --git a/src/include/executor/nodeProjectSet.h b/src/include/executor/nodeProjectSet.h
index 2c2b58282c..c9b44356ba 100644
--- a/src/include/executor/nodeProjectSet.h
+++ b/src/include/executor/nodeProjectSet.h
@@ -16,6 +16,7 @@
#include "nodes/execnodes.h"
+extern void ExecPrepProjectSet(ProjectSet *node, ExecPrepContext *context, ExecPrepOutput *result);
extern ProjectSetState *ExecInitProjectSet(ProjectSet *node, EState *estate, int eflags);
extern void ExecEndProjectSet(ProjectSetState *node);
extern void ExecReScanProjectSet(ProjectSetState *node);
diff --git a/src/include/executor/nodeRecursiveunion.h b/src/include/executor/nodeRecursiveunion.h
index 2d20470da2..7b7585d594 100644
--- a/src/include/executor/nodeRecursiveunion.h
+++ b/src/include/executor/nodeRecursiveunion.h
@@ -16,6 +16,7 @@
#include "nodes/execnodes.h"
+extern void ExecPrepRecursiveUnion(RecursiveUnion *node, ExecPrepContext *context, ExecPrepOutput *result);
extern RecursiveUnionState *ExecInitRecursiveUnion(RecursiveUnion *node, EState *estate, int eflags);
extern void ExecEndRecursiveUnion(RecursiveUnionState *node);
extern void ExecReScanRecursiveUnion(RecursiveUnionState *node);
diff --git a/src/include/executor/nodeResult.h b/src/include/executor/nodeResult.h
index ebb131d265..998a50ae27 100644
--- a/src/include/executor/nodeResult.h
+++ b/src/include/executor/nodeResult.h
@@ -16,6 +16,7 @@
#include "nodes/execnodes.h"
+extern void ExecPrepResult(Result *node, ExecPrepContext *context, ExecPrepOutput *result);
extern ResultState *ExecInitResult(Result *node, EState *estate, int eflags);
extern void ExecEndResult(ResultState *node);
extern void ExecResultMarkPos(ResultState *node);
diff --git a/src/include/executor/nodeSamplescan.h b/src/include/executor/nodeSamplescan.h
index 340b41a427..c0dd45b8bc 100644
--- a/src/include/executor/nodeSamplescan.h
+++ b/src/include/executor/nodeSamplescan.h
@@ -16,6 +16,7 @@
#include "nodes/execnodes.h"
+extern void ExecPrepSampleScan(SampleScan *node, ExecPrepContext *context, ExecPrepOutput *result);
extern SampleScanState *ExecInitSampleScan(SampleScan *node, EState *estate, int eflags);
extern void ExecEndSampleScan(SampleScanState *node);
extern void ExecReScanSampleScan(SampleScanState *node);
diff --git a/src/include/executor/nodeSeqscan.h b/src/include/executor/nodeSeqscan.h
index c225ba6e04..5452742622 100644
--- a/src/include/executor/nodeSeqscan.h
+++ b/src/include/executor/nodeSeqscan.h
@@ -17,6 +17,7 @@
#include "access/parallel.h"
#include "nodes/execnodes.h"
+extern void ExecPrepSeqScan(SeqScan *node, ExecPrepContext *context, ExecPrepOutput *result);
extern SeqScanState *ExecInitSeqScan(SeqScan *node, EState *estate, int eflags);
extern void ExecEndSeqScan(SeqScanState *node);
extern void ExecReScanSeqScan(SeqScanState *node);
diff --git a/src/include/executor/nodeSetOp.h b/src/include/executor/nodeSetOp.h
index a504cf8613..bc80011513 100644
--- a/src/include/executor/nodeSetOp.h
+++ b/src/include/executor/nodeSetOp.h
@@ -16,6 +16,7 @@
#include "nodes/execnodes.h"
+extern void ExecPrepSetOp(SetOp *node, ExecPrepContext *context, ExecPrepOutput *result);
extern SetOpState *ExecInitSetOp(SetOp *node, EState *estate, int eflags);
extern void ExecEndSetOp(SetOpState *node);
extern void ExecReScanSetOp(SetOpState *node);
diff --git a/src/include/executor/nodeSort.h b/src/include/executor/nodeSort.h
index 008e6a6bc6..def930a8bc 100644
--- a/src/include/executor/nodeSort.h
+++ b/src/include/executor/nodeSort.h
@@ -17,6 +17,7 @@
#include "access/parallel.h"
#include "nodes/execnodes.h"
+extern void ExecPrepSort(Sort *node, ExecPrepContext *context, ExecPrepOutput *result);
extern SortState *ExecInitSort(Sort *node, EState *estate, int eflags);
extern void ExecEndSort(SortState *node);
extern void ExecSortMarkPos(SortState *node);
diff --git a/src/include/executor/nodeSubplan.h b/src/include/executor/nodeSubplan.h
index 75cc6d5104..f6e21007fa 100644
--- a/src/include/executor/nodeSubplan.h
+++ b/src/include/executor/nodeSubplan.h
@@ -16,6 +16,7 @@
#include "nodes/execnodes.h"
+extern void ExecPrepSubPlan(SubPlan *node, ExecPrepContext *context, ExecPrepOutput *result);
extern SubPlanState *ExecInitSubPlan(SubPlan *subplan, PlanState *parent);
extern Datum ExecSubPlan(SubPlanState *node, ExprContext *econtext, bool *isNull);
diff --git a/src/include/executor/nodeSubqueryscan.h b/src/include/executor/nodeSubqueryscan.h
index a09e2be423..3fbf053e04 100644
--- a/src/include/executor/nodeSubqueryscan.h
+++ b/src/include/executor/nodeSubqueryscan.h
@@ -16,6 +16,7 @@
#include "nodes/execnodes.h"
+extern void ExecPrepSubqueryScan(SubqueryScan *node, ExecPrepContext *context, ExecPrepOutput *result);
extern SubqueryScanState *ExecInitSubqueryScan(SubqueryScan *node, EState *estate, int eflags);
extern void ExecEndSubqueryScan(SubqueryScanState *node);
extern void ExecReScanSubqueryScan(SubqueryScanState *node);
diff --git a/src/include/executor/nodeTableFuncscan.h b/src/include/executor/nodeTableFuncscan.h
index 2b82e7d7ed..ba2e7774f1 100644
--- a/src/include/executor/nodeTableFuncscan.h
+++ b/src/include/executor/nodeTableFuncscan.h
@@ -16,6 +16,7 @@
#include "nodes/execnodes.h"
+extern void ExecPrepTableFuncScan(TableFuncScan *node, ExecPrepContext *context, ExecPrepOutput *result);
extern TableFuncScanState *ExecInitTableFuncScan(TableFuncScan *node, EState *estate, int eflags);
extern void ExecEndTableFuncScan(TableFuncScanState *node);
extern void ExecReScanTableFuncScan(TableFuncScanState *node);
diff --git a/src/include/executor/nodeTidrangescan.h b/src/include/executor/nodeTidrangescan.h
index f122e09583..333cfbb5c6 100644
--- a/src/include/executor/nodeTidrangescan.h
+++ b/src/include/executor/nodeTidrangescan.h
@@ -16,6 +16,7 @@
#include "nodes/execnodes.h"
+extern void ExecPrepTidRangeScan(TidRangeScan *node, ExecPrepContext *context, ExecPrepOutput *result);
extern TidRangeScanState *ExecInitTidRangeScan(TidRangeScan *node,
EState *estate, int eflags);
extern void ExecEndTidRangeScan(TidRangeScanState *node);
diff --git a/src/include/executor/nodeTidscan.h b/src/include/executor/nodeTidscan.h
index 91a5f89f42..188f3f3f97 100644
--- a/src/include/executor/nodeTidscan.h
+++ b/src/include/executor/nodeTidscan.h
@@ -16,6 +16,7 @@
#include "nodes/execnodes.h"
+extern void ExecPrepTidScan(TidScan *node, ExecPrepContext *context, ExecPrepOutput *result);
extern TidScanState *ExecInitTidScan(TidScan *node, EState *estate, int eflags);
extern void ExecEndTidScan(TidScanState *node);
extern void ExecReScanTidScan(TidScanState *node);
diff --git a/src/include/executor/nodeUnique.h b/src/include/executor/nodeUnique.h
index 61f09d9853..970e894681 100644
--- a/src/include/executor/nodeUnique.h
+++ b/src/include/executor/nodeUnique.h
@@ -16,6 +16,7 @@
#include "nodes/execnodes.h"
+extern void ExecPrepUnique(Unique *node, ExecPrepContext *context, ExecPrepOutput *result);
extern UniqueState *ExecInitUnique(Unique *node, EState *estate, int eflags);
extern void ExecEndUnique(UniqueState *node);
extern void ExecReScanUnique(UniqueState *node);
diff --git a/src/include/executor/nodeValuesscan.h b/src/include/executor/nodeValuesscan.h
index 07c13ef123..f08bb080eb 100644
--- a/src/include/executor/nodeValuesscan.h
+++ b/src/include/executor/nodeValuesscan.h
@@ -16,6 +16,7 @@
#include "nodes/execnodes.h"
+extern void ExecPrepValuesScan(ValuesScan *node, ExecPrepContext *context, ExecPrepOutput *result);
extern ValuesScanState *ExecInitValuesScan(ValuesScan *node, EState *estate, int eflags);
extern void ExecEndValuesScan(ValuesScanState *node);
extern void ExecReScanValuesScan(ValuesScanState *node);
diff --git a/src/include/executor/nodeWindowAgg.h b/src/include/executor/nodeWindowAgg.h
index 4e62c8936d..a4d8487aba 100644
--- a/src/include/executor/nodeWindowAgg.h
+++ b/src/include/executor/nodeWindowAgg.h
@@ -16,6 +16,7 @@
#include "nodes/execnodes.h"
+extern void ExecPrepWindowAgg(WindowAgg *node, ExecPrepContext *context, ExecPrepOutput *result);
extern WindowAggState *ExecInitWindowAgg(WindowAgg *node, EState *estate, int eflags);
extern void ExecEndWindowAgg(WindowAggState *node);
extern void ExecReScanWindowAgg(WindowAggState *node);
diff --git a/src/include/executor/nodeWorktablescan.h b/src/include/executor/nodeWorktablescan.h
index 17842de576..5f7f76ec85 100644
--- a/src/include/executor/nodeWorktablescan.h
+++ b/src/include/executor/nodeWorktablescan.h
@@ -16,6 +16,7 @@
#include "nodes/execnodes.h"
+extern void ExecPrepWorkTableScan(WorkTableScan *node, ExecPrepContext *context, ExecPrepOutput *result);
extern WorkTableScanState *ExecInitWorkTableScan(WorkTableScan *node, EState *estate, int eflags);
extern void ExecEndWorkTableScan(WorkTableScanState *node);
extern void ExecReScanWorkTableScan(WorkTableScanState *node);
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index dd95dc40c7..7b03f46966 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -570,6 +570,8 @@ typedef struct EState
struct ExecRowMark **es_rowmarks; /* Array of per-range-table-entry
* ExecRowMarks, or NULL if none */
PlannedStmt *es_plannedstmt; /* link to top of plan tree */
+ struct ExecPrepOutput *es_execprep; /* link to ExecPrepOutput, if one was
+ * passed to ExecutorStart() */
const char *es_sourceText; /* Source text from QueryDesc */
JunkFilter *es_junkFilter; /* top-level junk filter, if any */
@@ -958,6 +960,82 @@ typedef struct DomainConstraintState
*/
typedef TupleTableSlot *(*ExecProcNodeMtd) (struct PlanState *pstate);
+/*----------------
+ * ExecPrepContext
+ *
+ * Context information for performing ExecutorPrep() on a given plan
+ */
+typedef struct ExecPrepContext
+{
+ NodeTag type;
+
+ PlannedStmt *stmt; /* target plan */
+ ParamListInfo params; /* EXTERN parameters to prune with */
+} ExecPrepContext;
+
+/*----------------
+ * ExecPrepOutput
+ *
+ * Result of performing ExecutorPrep() for a given PlannedStmt
+ */
+typedef struct ExecPrepOutput
+{
+ NodeTag type;
+
+ Bitmapset *relationRTIs; /* RT indexes of RTE_RELATIONs */
+ int numPlanNodes; /* PlannedStmt.numPlanNodes */
+
+ /*
+ * Array of 'numPlanNodes' elements containing PlanPrepOutput nodes
+ * for each node in the plan tree, indexed using the node's plan_node_id.
+ * A NULL value means that the corresponding plan node does not have a
+ * PlanPrepOutput associated with it.
+ */
+ struct PlanPrepOutput **planPrepResults;
+} ExecPrepOutput;
+
+#define ExecPrepStorePlanPrepOutput(execprep, planPrepResult, plannode) \
+ (execprep)->planPrepResults[(plannode)->plan_node_id] = (planPrepResult)
+
+#define ExecPrepFetchPlanPrepOutput(execprep, plannode) \
+ ((execprep) != NULL ? \
+ (execprep)->planPrepResults[(plannode)->plan_node_id] : NULL)
+
+#ifdef USE_ASSERT_CHECKING
+#define EXEC_PREP_OUTPUT_SANITY(plannode, estate) \
+ do { \
+ PlanPrepOutput *planPrepOutput = \
+ ExecPrepFetchPlanPrepOutput((estate)->es_execprep, (plannode)); \
+ Assert(planPrepOutput == NULL || \
+ (IsA(planPrepOutput, PlanPrepOutput) && \
+ planPrepOutput->plan_node_id == (plannode)->plan_node_id)); \
+ } while (0)
+#else
+#define EXEC_PREP_OUTPUT_SANITY(plannode, estate)
+#endif
+
+/* ---------------
+ * PlanPrepOutput
+ *
+ * ExecutorPrep() creates a node of this type for every node in the Plan tree
+ * that does some "prep" work.
+ */
+typedef struct PlanPrepOutput
+{
+ NodeTag type;
+
+ int plan_node_id; /* associated Plan node */
+
+ /* Information collected by ExecPrepNode subroutine for the node */
+
+ /*
+ * For nodes that contain a list of prunable subnodes, the following
+ * contains offsets into that list, of the subnodes that survive initial
+ * partition pruning.
+ */
+ Bitmapset *initially_valid_subnodes;
+} PlanPrepOutput;
+
/* ----------------
* PlanState node
*
diff --git a/src/include/nodes/nodeFuncs.h b/src/include/nodes/nodeFuncs.h
index 93c60bde66..fca107ad65 100644
--- a/src/include/nodes/nodeFuncs.h
+++ b/src/include/nodes/nodeFuncs.h
@@ -158,5 +158,8 @@ extern bool raw_expression_tree_walker(Node *node, bool (*walker) (),
struct PlanState;
extern bool planstate_tree_walker(struct PlanState *planstate, bool (*walker) (),
void *context);
+struct Plan;
+extern bool plan_tree_walker(struct Plan *plan, bool (*walker) (),
+ void *context);
#endif /* NODEFUNCS_H */
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index da35f2c272..8db017a138 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -96,6 +96,11 @@ typedef enum NodeTag
T_PartitionPruneStepCombine,
T_PlanInvalItem,
+ /* TAGS FOR EXECUTOR PREP NODES (execnodes.h) */
+ T_ExecPrepContext,
+ T_ExecPrepOutput,
+ T_PlanPrepOutput,
+
/*
* TAGS FOR PLAN STATE NODES (execnodes.h)
*
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 1f3845b3fe..ffde93ef13 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -101,6 +101,9 @@ typedef struct PlannerGlobal
List *finalrtable; /* "flat" rangetable for executor */
+ Bitmapset *relationRTIs; /* Indexes of RTE_RELATION entries in range
+ * table */
+
List *finalrowmarks; /* "flat" list of PlanRowMarks */
List *resultRelations; /* "flat" list of integer RT indexes */
@@ -129,6 +132,9 @@ typedef struct PlannerGlobal
char maxParallelHazard; /* worst PROPARALLEL hazard level */
+ bool usesPreExecPruning; /* Do some Plan nodes use pre-execution
+ * partition pruning */
+
PartitionDirectory partition_directory; /* partition descriptors */
} PlannerGlobal;
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 0b518ce6b2..69bc5f918c 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -59,12 +59,20 @@ typedef struct PlannedStmt
bool parallelModeNeeded; /* parallel mode required to execute? */
+ bool usesPreExecPruning; /* Do some Plan nodes use pre-execution
+ * partition pruning */
+
int jitFlags; /* which forms of JIT should be performed */
struct Plan *planTree; /* tree of Plan nodes */
+ int numPlanNodes; /* number of nodes in planTree */
+
List *rtable; /* list of RangeTblEntry nodes */
+ Bitmapset *relationRTIs; /* Indexes of RTE_RELATION entries in range
+ * table */
+
/* rtable indexes of target relations for INSERT/UPDATE/DELETE */
List *resultRelations; /* integer list of RT indexes, or NIL */
@@ -1172,6 +1180,13 @@ typedef struct PlanRowMark
* prune_infos List of Lists containing PartitionedRelPruneInfo nodes,
* one sublist per run-time-prunable partition hierarchy
* appearing in the parent plan node's subplans.
+ *
+ * contains_init_steps Does any of the PartitionedRelPruneInfos in
+ * prune_infos have its initial_pruning_steps set?
+ *
+ * contains_exec_steps Does any of the PartitionedRelPruneInfos in
+ * prune_infos have its exec_pruning_steps set?
+ *
* other_subplans Indexes of any subplans that are not accounted for
* by any of the PartitionedRelPruneInfo nodes in
* "prune_infos". These subplans must not be pruned.
@@ -1180,6 +1195,8 @@ typedef struct PartitionPruneInfo
{
NodeTag type;
List *prune_infos;
+ bool contains_init_steps;
+ bool contains_exec_steps;
Bitmapset *other_subplans;
} PartitionPruneInfo;
diff --git a/src/include/partitioning/partprune.h b/src/include/partitioning/partprune.h
index ee11b6feae..90684efa25 100644
--- a/src/include/partitioning/partprune.h
+++ b/src/include/partitioning/partprune.h
@@ -41,6 +41,7 @@ struct RelOptInfo;
* subsidiary data, such as the FmgrInfos.
* planstate Points to the parent plan node's PlanState when called
* during execution; NULL when called from the planner.
+ * exprcontext ExprContext to use when evaluating pruning expressions
* exprstates Array of ExprStates, indexed as per PruneCxtStateIdx; one
* for each partition key in each pruning step. Allocated if
* planstate is non-NULL, otherwise NULL.
@@ -56,6 +57,7 @@ typedef struct PartitionPruneContext
FmgrInfo *stepcmpfuncs;
MemoryContext ppccontext;
PlanState *planstate;
+ ExprContext *exprcontext;
ExprState **exprstates;
} PartitionPruneContext;
diff --git a/src/include/tcop/tcopprot.h b/src/include/tcop/tcopprot.h
index 15a11bc3ff..02124af4ed 100644
--- a/src/include/tcop/tcopprot.h
+++ b/src/include/tcop/tcopprot.h
@@ -59,7 +59,7 @@ extern PlannedStmt *pg_plan_query(Query *querytree, const char *query_string,
ParamListInfo boundParams);
extern List *pg_plan_queries(List *querytrees, const char *query_string,
int cursorOptions,
- ParamListInfo boundParams);
+ ParamListInfo boundParams, List **stmt_execprep_list);
extern bool check_max_stack_depth(int *newval, void **extra, GucSource source);
extern void assign_max_stack_depth(int newval, void *extra);
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 95b99e3d25..14794972a0 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -148,6 +148,9 @@ typedef struct CachedPlan
{
int magic; /* should equal CACHEDPLAN_MAGIC */
List *stmt_list; /* list of PlannedStmts */
+ List *stmt_execprep_list; /* list of ExecPrepOutput nodes, one
+ * for each element of stmt_list; NIL
+ * if not a generic plan */
bool is_oneshot; /* is it a "oneshot" plan? */
bool is_saved; /* is CachedPlan in a long-lived context? */
bool is_valid; /* is the stmt_list currently valid? */
@@ -158,6 +161,8 @@ typedef struct CachedPlan
int generation; /* parent's generation number for this plan */
int refcount; /* count of live references to this struct */
MemoryContext context; /* context containing this CachedPlan */
+ MemoryContext execprep_context; /* context containing stmt_execprep_list,
+ * a child of the above context */
} CachedPlan;
/*
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index aeddbdafe5..03c39ff97a 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -137,6 +137,10 @@ typedef struct PortalData
CommandTag commandTag; /* command tag for original query */
QueryCompletion qc; /* command completion data for executed query */
List *stmts; /* list of PlannedStmts */
+ List *stmt_execpreps; /* list of ExecPrepOutput nodes, one element
+ * for each of 'stmts'; same as
+ * cplan->stmt_execprep_list if cplan is
+ * not NULL */
CachedPlan *cplan; /* CachedPlan, if stmts are from one */
ParamListInfo portalParams; /* params to pass to query */
@@ -241,6 +245,7 @@ extern void PortalDefineQuery(Portal portal,
const char *sourceText,
CommandTag commandTag,
List *stmts,
+ List *stmt_execpreps,
CachedPlan *cplan);
extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
extern void PortalCreateHoldStore(Portal portal);
--
2.24.1
* Re: generic plans and "initial" pruning
@ 2022-02-10 22:01 Robert Haas <[email protected]>
parent: Amit Langote <[email protected]>
0 siblings, 1 reply; 71+ messages in thread
From: Robert Haas @ 2022-02-10 22:01 UTC (permalink / raw)
To: Amit Langote <[email protected]>; +Cc: pgsql-hackers; David Rowley *EXTERN* <[email protected]>
On Thu, Feb 10, 2022 at 3:14 AM Amit Langote <[email protected]> wrote:
> Maybe this should be more than one patch? Say:
>
> 0001 to add ExecutorPrep and the boilerplate,
> 0002 to teach plancache.c to use the new facility
Could be, not sure. I agree that if it's possible to split this in a
meaningful way, it would facilitate review. I notice that there is
some straight code movement e.g. the creation of
ExecPartitionPruneFixSubPlanIndexes. It would be best, I think, to do
pure code movement in a preparatory patch so that the main patch is
just adding the new stuff we need and not moving stuff around.
David Rowley recently proposed a patch for some parallel-safety
debugging cross checks which added a plan tree walker. I'm not sure
whether he's going to press that patch forward to commit, but I think
we should get something like that into the tree and start using it,
rather than adding more bespoke code. Maybe you/we should steal that
part of his patch and commit it separately. What I'm imagining is that
plan_tree_walker() would know which nodes have subnodes and how to
recurse over the tree structure, and you'd have a walker function to
use with it that would know which executor nodes have ExecPrep
functions and call them, and just do nothing for the others. That
would spare you adding stub functions for nodes that don't need to do
anything, or don't need to do anything other than recurse. Admittedly
it would look a bit different from the existing executor phases, but
I'd argue that it's a better coding model.
Actually, you might've had this in the patch at some point, because
you have a declaration for plan_tree_walker but no implementation. I
guess one thing that's a bit awkward about this idea is that in some
cases you want to recurse to some subnodes but not other subnodes. But
maybe it would work to put the recursion in the walker function in
that case, and then just return true; but if you want to walk all
children, return false.
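The return-value convention described above can be sketched in standalone C. This is only an illustration under stated assumptions: ToyPlan, ToyContext, toy_visit() and toy_walk() are hypothetical names, not PostgreSQL's Plan node or the plan_tree_walker() from the patch, and the real code may structure the recursion differently. A visit function returns true once it has recursed into just the children it cares about, and false to let the driver walk all children.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a plan node; not PostgreSQL's Plan struct. */
typedef struct ToyPlan
{
    int              id;        /* label used only for this demo */
    bool             prunable;  /* does this node pick which children to visit? */
    unsigned         keep_mask; /* bit i set => child i survives pruning */
    int              nchildren;
    struct ToyPlan **children;
} ToyPlan;

/* Records the order in which nodes were visited. */
typedef struct ToyContext
{
    int visited[16];
    int nvisited;
} ToyContext;

static void toy_walk(ToyPlan *node, ToyContext *cxt);

/*
 * Visit one node.  Returns true when the node has already recursed into the
 * children it wants, false when the driver should walk all children.
 */
static bool
toy_visit(ToyPlan *node, ToyContext *cxt)
{
    if (cxt->nvisited < 16)
        cxt->visited[cxt->nvisited++] = node->id;

    if (!node->prunable)
        return false;           /* nothing special: driver walks all children */

    /* Recurse only into the children that survive pruning. */
    for (int i = 0; i < node->nchildren; i++)
    {
        if (node->keep_mask & (1u << i))
            toy_walk(node->children[i], cxt);
    }
    return true;                /* tell the driver not to recurse again */
}

/* Generic driver: knows the tree structure, nothing about pruning. */
static void
toy_walk(ToyPlan *node, ToyContext *cxt)
{
    if (node == NULL)
        return;
    if (toy_visit(node, cxt))
        return;
    for (int i = 0; i < node->nchildren; i++)
        toy_walk(node->children[i], cxt);
}
```

With a prunable node whose keep_mask is 0x5 and three children, toy_walk() reaches only the first and third child. Node types with no special behavior need no code of their own, which is the saving in stub functions that the suggestion is after.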
+ bool contains_init_steps;
+ bool contains_exec_steps;
s/steps/pruning/? maybe with contains -> needs or performs or requires as well?
+ * Returned information includes the set of RT indexes of relations referenced
+ * in the plan, and a PlanPrepOutput node for each node in the planTree if the
+ * node type supports producing one.
Aren't all RT indexes referenced in the plan?
+ * This may lock relations whose information may be used to produce the
+ * PlanPrepOutput nodes. For example, a partitioned table before perusing its
+ * PartitionPruneInfo contained in an Append node to do the pruning the result
+ * of which is used to populate the Append node's PlanPrepOutput.
"may lock" feels awfully fuzzy to me. How am I supposed to rely on
something that "may" happen? And don't we need to have tight logic
around locking, with specific guarantees about what is locked at which
points in the code and what is not?
+ * At least one of 'planstate' or 'econtext' must be passed to be able to
+ * successfully evaluate any non-Const expressions contained in the
+ * steps.
This also seems fuzzy. If I'm thinking of calling this function, I
don't know how I'd know whether this criterion is met.
I don't love PlanPrepOutput the way you have it. I think one of the
basic design issues for this patch is: should we think of the prep
phase as specifically pruning, or is it general prep and pruning is
the first thing for which we're going to use it? If it's really a
pre-pruning phase, we could name it that way instead of calling it
"prep". If it's really a general prep phase, then why does
PlanPrepOutput contain initially_valid_subnodes as a field? One could
imagine letting each prep function decide what kind of prep node it
would like to return, with partition pruning being just one of the
options. But is that a useful generalization of the basic concept, or
just pretending that a special-purpose mechanism is more general than
it really is?
+ return CreateQueryDesc(pstmt, NULL, /* XXX pass ExecPrepOutput too? */
It seems to me that we should do what the XXX suggests. It doesn't
seem nice if the parallel workers could theoretically decide to prune
a different set of nodes than the leader.
+ * known at executor startup (excludeing expressions containing
Extra e.
+ * into subplan indexes, is also returned for use during subsquent
Missing e.
Somewhere, we're going to need to document the idea that this may
permit us to execute a plan that isn't actually fully valid, but that
we expect to survive because we'll never do anything with the parts of
it that aren't. Maybe that should be added to the executor README, or
maybe there's some better place, but I don't think that should remain
something that's just implicit.
This is not a full review, just some initial thoughts looking through this.
--
Robert Haas
EDB: http://www.enterprisedb.com
* Re: generic plans and "initial" pruning
@ 2022-03-07 14:18 Amit Langote <[email protected]>
parent: Robert Haas <[email protected]>
0 siblings, 1 reply; 71+ messages in thread
From: Amit Langote @ 2022-03-07 14:18 UTC (permalink / raw)
To: Robert Haas <[email protected]>; +Cc: pgsql-hackers; David Rowley *EXTERN* <[email protected]>
On Fri, Feb 11, 2022 at 7:02 AM Robert Haas <[email protected]> wrote:
> On Thu, Feb 10, 2022 at 3:14 AM Amit Langote <[email protected]> wrote:
> > Maybe this should be more than one patch? Say:
> >
> > 0001 to add ExecutorPrep and the boilerplate,
> > 0002 to teach plancache.c to use the new facility
Thanks for taking a look and sorry about the delay.
> Could be, not sure. I agree that if it's possible to split this in a
> meaningful way, it would facilitate review. I notice that there is
> some straight code movement e.g. the creation of
> ExecPartitionPruneFixSubPlanIndexes. It would be best, I think, to do
> pure code movement in a preparatory patch so that the main patch is
> just adding the new stuff we need and not moving stuff around.
Okay, created 0001 for moving around the execution pruning code.
> David Rowley recently proposed a patch for some parallel-safety
> debugging cross checks which added a plan tree walker. I'm not sure
> whether he's going to press that patch forward to commit, but I think
> we should get something like that into the tree and start using it,
> rather than adding more bespoke code. Maybe you/we should steal that
> part of his patch and commit it separately.
I looked at the thread you mentioned (I guess [1]), though it seems
David is proposing a path_tree_walker(), which I guess is only useful
within the planner and not here.
> What I'm imagining is that
> plan_tree_walker() would know which nodes have subnodes and how to
> recurse over the tree structure, and you'd have a walker function to
> use with it that would know which executor nodes have ExecPrep
> functions and call them, and just do nothing for the others. That
> would spare you adding stub functions for nodes that don't need to do
> anything, or don't need to do anything other than recurse. Admittedly
> it would look a bit different from the existing executor phases, but
> I'd argue that it's a better coding model.
>
> Actually, you might've had this in the patch at some point, because
> you have a declaration for plan_tree_walker but no implementation.
Right, a previous version of the patch did use a plan_tree_walker() for
this, and I think it worked the way you describe.
I do agree that plan_tree_walker() allows for a better implementation
of the idea of this patch and may also be generally useful, so I've
created a separate patch that adds it to nodeFuncs.c.
> I guess one thing that's a bit awkward about this idea is that in some
> cases you want to recurse to some subnodes but not other subnodes. But
> maybe it would work to put the recursion in the walker function in
> that case, and then just return true; but if you want to walk all
> children, return false.
Right, that's how I've made ExecPrepAppend() etc. do it.
> + bool contains_init_steps;
> + bool contains_exec_steps;
>
> s/steps/pruning/? maybe with contains -> needs or performs or requires as well?
Went with: needs_{init|exec}_pruning
> + * Returned information includes the set of RT indexes of relations referenced
> + * in the plan, and a PlanPrepOutput node for each node in the planTree if the
> + * node type supports producing one.
>
> Aren't all RT indexes referenced in the plan?
Ah yes. How about:
* Returned information includes the set of RT indexes of relations that must
* be locked to safely execute the plan,
> + * This may lock relations whose information may be used to produce the
> + * PlanPrepOutput nodes. For example, a partitioned table before perusing its
> + * PartitionPruneInfo contained in an Append node to do the pruning the result
> + * of which is used to populate the Append node's PlanPrepOutput.
>
> "may lock" feels awfully fuzzy to me. How am I supposed to rely on
> something that "may" happen? And don't we need to have tight logic
> around locking, with specific guarantees about what is locked at which
> points in the code and what is not?
Agree the wording was fuzzy. I've rewritten it as:
* This locks relations whose information is needed to produce the
* PlanPrepOutput nodes. For example, a partitioned table before perusing its
* PartitionedRelPruneInfo contained in an Append node to do the pruning, the
* result of which is used to populate the Append node's PlanPrepOutput.
BTW, I've added an Assert in ExecGetRangeTableRelation():
/*
* Cross-check that AcquireExecutorLocks() hasn't missed any relation
* that it should have locked.
*/
Assert(estate->es_execprep == NULL ||
bms_is_member(rti, estate->es_execprep->relationRTIs));
which IOW ensures that the actual execution of a plan only sees
relations that ExecutorPrep() would've told AcquireExecutorLocks() to
take a lock on.
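The shape of that cross-check can be sketched in standalone C. This is a toy under stated assumptions: ToyRelids is a hypothetical 64-bit mask standing in for a Bitmapset of RT indexes, and toy_add_member(), toy_is_member() and toy_open_is_consistent() are illustrative names, not the bms_* functions or anything in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a Bitmapset of range-table indexes (1..63). */
typedef uint64_t ToyRelids;

/* Return the set with 'rti' added. */
static ToyRelids
toy_add_member(ToyRelids set, int rti)
{
    return set | (UINT64_C(1) << rti);
}

/* Is 'rti' in the set? */
static bool
toy_is_member(int rti, ToyRelids set)
{
    return (set & (UINT64_C(1) << rti)) != 0;
}

/*
 * Mirrors the condition in the Assert: opening a relation is consistent if
 * no prep output was passed at all (have_prep == false), or if the relation
 * is in the set that was reported for locking.
 */
static bool
toy_open_is_consistent(bool have_prep, ToyRelids locked, int rti)
{
    return !have_prep || toy_is_member(rti, locked);
}
```

If the reported set holds RT indexes 1 and 3, opening relation 2 during execution fails the check, which is the situation the Assert is meant to catch.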
> + * At least one of 'planstate' or 'econtext' must be passed to be able to
> + * successfully evaluate any non-Const expressions contained in the
> + * steps.
>
> This also seems fuzzy. If I'm thinking of calling this function, I
> don't know how I'd know whether this criterion is met.
OK, I have removed this comment (which was on top of a static local
function) in favor of adding some commentary on this in places where
it belongs. For example, in ExecPrepDoInitialPruning():
/*
* We don't yet have a PlanState for the parent plan node, so must create
* a standalone ExprContext to evaluate pruning expressions, equipped with
* the information about the EXTERN parameters that the caller passed us.
* Note that that's okay because the initial pruning steps do not
* involve anything that requires the execution to have started.
*/
econtext = CreateStandaloneExprContext();
econtext->ecxt_param_list_info = params;
prunestate = ExecCreatePartitionPruneState(NULL, pruneinfo,
true, false,
rtable, econtext,
pdir, parentrelids);
> I don't love PlanPrepOutput the way you have it. I think one of the
> basic design issues for this patch is: should we think of the prep
> phase as specifically pruning, or is it general prep and pruning is
> the first thing for which we're going to use it? If it's really a
> pre-pruning phase, we could name it that way instead of calling it
> "prep". If it's really a general prep phase, then why does
> PlanPrepOutput contain initially_valid_subnodes as a field? One could
> imagine letting each prep function decide what kind of prep node it
> would like to return, with partition pruning being just one of the
> options. But is that a useful generalization of the basic concept, or
> just pretending that a special-purpose mechanism is more general than
> it really is?
While it can feel like the latter TBH, I'm inclined to keep
ExecutorPrep generalized. What bothers me about the
alternative of calling the new phase something less generalized like
ExecutorDoInitPruning() is that that makes the somewhat elaborate API
changes needed for the phase's output to be put into QueryDesc, through
which it ultimately reaches the main executor, seem less worthwhile.
I agree that PlanPrepOutput design needs to be likewise generalized,
maybe like you suggest -- using PlanInitPruningOutput, a child class
of PlanPrepOutput, to return the prep output for plan nodes that
support pruning.
Thoughts?
> + return CreateQueryDesc(pstmt, NULL, /* XXX pass ExecPrepOutput too? */
>
> It seems to me that we should do what the XXX suggests. It doesn't
> seem nice if the parallel workers could theoretically decide to prune
> a different set of nodes than the leader.
OK, will fix.
> + * known at executor startup (excludeing expressions containing
>
> Extra e.
>
> + * into subplan indexes, is also returned for use during subsquent
>
> Missing e.
Will fix.
> Somewhere, we're going to need to document the idea that this may
> permit us to execute a plan that isn't actually fully valid, but that
> we expect to survive because we'll never do anything with the parts of
> it that aren't. Maybe that should be added to the executor README, or
> maybe there's some better place, but I don't think that should remain
> something that's just implicit.
Agreed. I'd added a description of the new prep phase to executor
README, though the text didn't mention this particular bit. Will fix
to mention it.
> This is not a full review, just some initial thoughts looking through this.
Thanks again. Will post a new version soon after a bit more polishing.
--
Amit Langote
EDB: http://www.enterprisedb.com
[1] https://www.postgresql.org/message-id/flat/b59605fecb20ba9ea94e70ab60098c237c870628.camel%40postgres...
* Re: generic plans and "initial" pruning
@ 2022-03-11 14:35 Amit Langote <[email protected]>
parent: Amit Langote <[email protected]>
0 siblings, 2 replies; 71+ messages in thread
From: Amit Langote @ 2022-03-11 14:35 UTC (permalink / raw)
To: Robert Haas <[email protected]>; +Cc: pgsql-hackers; David Rowley *EXTERN* <[email protected]>
On Mon, Mar 7, 2022 at 11:18 PM Amit Langote <[email protected]> wrote:
> On Fri, Feb 11, 2022 at 7:02 AM Robert Haas <[email protected]> wrote:
> > I don't love PlanPrepOutput the way you have it. I think one of the
> > basic design issues for this patch is: should we think of the prep
> > phase as specifically pruning, or is it general prep and pruning is
> > the first thing for which we're going to use it? If it's really a
> > pre-pruning phase, we could name it that way instead of calling it
> > "prep". If it's really a general prep phase, then why does
> > PlanPrepOutput contain initially_valid_subnodes as a field? One could
> > imagine letting each prep function decide what kind of prep node it
> > would like to return, with partition pruning being just one of the
> > options. But is that a useful generalization of the basic concept, or
> > just pretending that a special-purpose mechanism is more general than
> > it really is?
>
> While it can feel like the latter TBH, I'm inclined to keep
> ExecutorPrep generalized. What bothers me about the
> alternative of calling the new phase something less generalized like
> ExecutorDoInitPruning() is that that makes the somewhat elaborate API
> changes needed for the phase's output to be put into QueryDesc, through
> which it ultimately reaches the main executor, seem less worthwhile.
>
> I agree that PlanPrepOutput design needs to be likewise generalized,
> maybe like you suggest -- using PlanInitPruningOutput, a child class
> of PlanPrepOutput, to return the prep output for plan nodes that
> support pruning.
>
> Thoughts?
So I decided to agree with you after all about limiting the scope of
this new executor interface, or IOW call it what it is.
I have named it ExecutorGetLockRels() to go with the only use case we
know for it -- get the set of relations for AcquireExecutorLocks() to
lock to validate a plan tree. Its result is returned in a node named
ExecLockRelsInfo, which contains the set of relations scanned in the
plan tree (lockrels) and a list of PlanInitPruningOutput nodes for all
nodes that undergo pruning.
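As a rough illustration of how pruning narrows the set of relations to lock, here is a standalone C sketch. It rests on assumptions not taken from the patch: ToyAppend and toy_get_lock_rels() are hypothetical, each subplan is assumed to scan exactly one relation, and sets are plain 64-bit masks rather than Bitmapsets. The parent partitioned table is always included, since it has to be locked before its pruning information can be examined.

```c
#include <stdint.h>

/* Hypothetical Append-like node over the partitions of one parent table. */
typedef struct ToyAppend
{
    int        parent_rti;      /* partitioned table; always locked */
    int        nsubplans;
    const int *subplan_rtis;    /* RT index scanned by each subplan */
} ToyAppend;

/*
 * Compute the set of RT indexes to lock, given a mask of the subplans that
 * survive initial pruning (bit i set => subplan i is still valid).
 */
static uint64_t
toy_get_lock_rels(const ToyAppend *append, uint64_t valid_subplans)
{
    uint64_t lockrels = UINT64_C(1) << append->parent_rti;

    for (int i = 0; i < append->nsubplans; i++)
    {
        /* Relations under pruned subplans are left out, hence not locked. */
        if (valid_subplans & (UINT64_C(1) << i))
            lockrels |= UINT64_C(1) << append->subplan_rtis[i];
    }
    return lockrels;
}
```

For a parent at RT index 1 with partitions at 2, 3 and 4, a surviving-subplan mask of 0x2 yields a lock set containing only indexes 1 and 3.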
> > + return CreateQueryDesc(pstmt, NULL, /* XXX pass ExecPrepOutput too? */
> >
> > It seems to me that we should do what the XXX suggests. It doesn't
> > seem nice if the parallel workers could theoretically decide to prune
> > a different set of nodes than the leader.
>
> OK, will fix.
Done. This required adding nodeToString() and stringToNode() support
for the nodes produced by the new executor function that wasn't there
before.
> > Somewhere, we're going to need to document the idea that this may
> > permit us to execute a plan that isn't actually fully valid, but that
> > we expect to survive because we'll never do anything with the parts of
> > it that aren't. Maybe that should be added to the executor README, or
> > maybe there's some better place, but I don't think that should remain
> > something that's just implicit.
>
> Agreed. I'd added a description of the new prep phase to executor
> README, though the text didn't mention this particular bit. Will fix
> to mention it.
Rewrote the comments above ExecutorGetLockRels() (previously
ExecutorPrep()) and the executor README text to be explicit about the
fact that not locking some relations effectively invalidates pruned
parts of the plan tree.
> > This is not a full review, just some initial thoughts looking through this.
>
> Thanks again. Will post a new version soon after a bit more polishing.
Attached is v5, now broken into 3 patches:
0001: Some refactoring of runtime pruning code
0002: Add a plan_tree_walker
0003: Teach AcquireExecutorLocks to skip locking pruned relations
--
Amit Langote
EDB: http://www.enterprisedb.com
Attachments:
[application/octet-stream] v5-0002-Add-a-plan_tree_walker.patch (3.9K, 2-v5-0002-Add-a-plan_tree_walker.patch)
From 22ff31c7b052eabb32f4a529c48fe48180332156 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Thu, 3 Mar 2022 16:04:13 +0900
Subject: [PATCH v5 2/3] Add a plan_tree_walker()
Like planstate_tree_walker() but for uninitialized plan trees.
---
src/backend/nodes/nodeFuncs.c | 116 ++++++++++++++++++++++++++++++++++
src/include/nodes/nodeFuncs.h | 3 +
2 files changed, 119 insertions(+)
diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c
index 47d0564fa2..cdf937f127 100644
--- a/src/backend/nodes/nodeFuncs.c
+++ b/src/backend/nodes/nodeFuncs.c
@@ -31,6 +31,10 @@ static bool planstate_walk_subplans(List *plans, bool (*walker) (),
void *context);
static bool planstate_walk_members(PlanState **planstates, int nplans,
bool (*walker) (), void *context);
+static bool plan_walk_subplans(List *plans,
+ bool (*walker) (),
+ void *context);
+static bool plan_walk_members(List *plans, bool (*walker) (), void *context);
/*
@@ -4148,3 +4152,115 @@ planstate_walk_members(PlanState **planstates, int nplans,
return false;
}
+
+/*
+ * plan_tree_walker --- walk plantrees
+ *
+ * The walker has already visited the current node, and so we need only
+ * recurse into any sub-nodes it has.
+ */
+bool
+plan_tree_walker(Plan *plan,
+ bool (*walker) (),
+ void *context)
+{
+ /* Guard against stack overflow due to overly complex plan trees */
+ check_stack_depth();
+
+ /* initPlan-s */
+ if (plan_walk_subplans(plan->initPlan, walker, context))
+ return true;
+
+ /* lefttree */
+ if (outerPlan(plan))
+ {
+ if (walker(outerPlan(plan), context))
+ return true;
+ }
+
+ /* righttree */
+ if (innerPlan(plan))
+ {
+ if (walker(innerPlan(plan), context))
+ return true;
+ }
+
+ /* special child plans */
+ switch (nodeTag(plan))
+ {
+ case T_Append:
+ if (plan_walk_members(((Append *) plan)->appendplans,
+ walker, context))
+ return true;
+ break;
+ case T_MergeAppend:
+ if (plan_walk_members(((MergeAppend *) plan)->mergeplans,
+ walker, context))
+ return true;
+ break;
+ case T_BitmapAnd:
+ if (plan_walk_members(((BitmapAnd *) plan)->bitmapplans,
+ walker, context))
+ return true;
+ break;
+ case T_BitmapOr:
+ if (plan_walk_members(((BitmapOr *) plan)->bitmapplans,
+ walker, context))
+ return true;
+ break;
+ case T_CustomScan:
+ if (plan_walk_members(((CustomScan *) plan)->custom_plans,
+ walker, context))
+ return true;
+ break;
+ case T_SubqueryScan:
+ if (walker(((SubqueryScan *) plan)->subplan, context))
+ return true;
+ break;
+ default:
+ break;
+ }
+
+ return false;
+}
+
+/*
+ * Walk a list of SubPlans (or initPlans, which also use SubPlan nodes).
+ */
+static bool
+plan_walk_subplans(List *plans,
+ bool (*walker) (),
+ void *context)
+{
+ ListCell *lc;
+ PlannedStmt *plannedstmt = (PlannedStmt *) context;
+
+ foreach(lc, plans)
+ {
+ SubPlan *sp = lfirst_node(SubPlan, lc);
+ Plan *p = list_nth(plannedstmt->subplans, sp->plan_id - 1);
+
+ if (walker(p, context))
+ return true;
+ }
+
+ return false;
+}
+
+/*
+ * Walk the constituent plans of an Append, MergeAppend, BitmapAnd,
+ * BitmapOr, or CustomScan node.
+ */
+static bool
+plan_walk_members(List *plans, bool (*walker) (), void *context)
+{
+ ListCell *lc;
+
+ foreach(lc, plans)
+ {
+ if (walker(lfirst(lc), context))
+ return true;
+ }
+
+ return false;
+}
diff --git a/src/include/nodes/nodeFuncs.h b/src/include/nodes/nodeFuncs.h
index 93c60bde66..fca107ad65 100644
--- a/src/include/nodes/nodeFuncs.h
+++ b/src/include/nodes/nodeFuncs.h
@@ -158,5 +158,8 @@ extern bool raw_expression_tree_walker(Node *node, bool (*walker) (),
struct PlanState;
extern bool planstate_tree_walker(struct PlanState *planstate, bool (*walker) (),
void *context);
+struct Plan;
+extern bool plan_tree_walker(struct Plan *plan, bool (*walker) (),
+ void *context);
#endif /* NODEFUNCS_H */
--
2.24.1
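
The walker convention used by plan_tree_walker() above (the callback handles the current node, then hands control back to the tree walker to recurse into children, and a true return aborts the walk) can be sketched in isolation. This is only an illustration under simplifying assumptions: ToyPlan, toy_plan_tree_walker, count_scans_walker, and count_scans are hypothetical names, not PostgreSQL's, and the toy node has only left and right children.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a Plan node; not the real PostgreSQL struct. */
typedef struct ToyPlan
{
	int			relid;		/* 0 means "scans no relation" */
	struct ToyPlan *lefttree;
	struct ToyPlan *righttree;
} ToyPlan;

typedef bool (*toy_walker) (ToyPlan *node, void *context);

/*
 * Recurse into the children only; the caller's walker has already looked at
 * 'node' itself.  A true result from the walker stops the walk early.
 */
static bool
toy_plan_tree_walker(ToyPlan *node, toy_walker walker, void *context)
{
	assert(node != NULL && walker != NULL);

	if (node->lefttree && walker(node->lefttree, context))
		return true;
	if (node->righttree && walker(node->righttree, context))
		return true;
	return false;
}

/* Example walker: count the nodes that scan a relation. */
static bool
count_scans_walker(ToyPlan *node, void *context)
{
	int		   *count = (int *) context;

	if (node == NULL)
		return false;
	if (node->relid != 0)
		(*count)++;

	/* Returning false here means "keep walking". */
	return toy_plan_tree_walker(node, count_scans_walker, context);
}

/* Convenience entry point that starts the walk at the root. */
static int
count_scans(ToyPlan *root)
{
	int			count = 0;

	(void) count_scans_walker(root, &count);
	return count;
}
```

Calling count_scans() on a join node whose two children each scan one relation visits both children, because the example walker returns false (continue) after handling each node.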
[application/octet-stream] v5-0003-Teach-AcquireExecutorLocks-to-skip-locking-pruned.patch (93.2K, 3-v5-0003-Teach-AcquireExecutorLocks-to-skip-locking-pruned.patch)
From 62fd8ca887f62dcd89010bf4475529eb16f07d52 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Wed, 22 Dec 2021 16:55:17 +0900
Subject: [PATCH v5 3/3] Teach AcquireExecutorLocks() to skip locking pruned
partitions
Instead of locking all relations listed in the range table, this
asks the new executor function ExecutorGetLockRels() to return the set
of relations (their RT indexes) to lock, or simply uses the set
given by PlannedStmt.lockrels. To wit, ExecutorGetLockRels() must be
called if some nodes in the plan tree contain initial pruning steps
(pruning steps containing expressions that can be computed before
the executor proper has started). In that case, the lockrels set is
computed such that any subplans that are pruned as a result of
doing initial pruning do not contribute any relations to the set.
That can result in a much smaller lockrels set when the plan contains
thousands of child subplans, of which only a small number remain
after pruning.
The result of doing the initial pruning during ExecutorGetLockRels()
is preserved for later use during actual execution by creating a
new node called PlanInitPruningOutput for each plan node that
undergoes pruning. The set of those for the whole plan tree is
put into another new node, ExecLockRelsInfo, which represents the output
of a given ExecutorGetLockRels() invocation. ExecLockRelsInfos are
passed down the executor alongside the PlannedStmts. This
arrangement ensures that the set of plan tree nodes that
AcquireExecutorLocks() has acquired locks to protect and the one
that the executor will initialize and execute are one and the same.
---
src/backend/commands/copyto.c | 2 +-
src/backend/commands/createas.c | 2 +-
src/backend/commands/explain.c | 7 +-
src/backend/commands/extension.c | 13 +-
src/backend/commands/matview.c | 2 +-
src/backend/commands/portalcmds.c | 1 +
src/backend/commands/prepare.c | 17 +-
src/backend/executor/README | 22 ++-
src/backend/executor/execMain.c | 181 +++++++++++++++++++
src/backend/executor/execParallel.c | 27 ++-
src/backend/executor/execPartition.c | 233 +++++++++++++++++++++----
src/backend/executor/execUtils.c | 8 +
src/backend/executor/functions.c | 2 +-
src/backend/executor/nodeAppend.c | 42 ++++-
src/backend/executor/nodeMergeAppend.c | 42 ++++-
src/backend/executor/nodeModifyTable.c | 24 +++
src/backend/executor/spi.c | 14 +-
src/backend/nodes/copyfuncs.c | 50 +++++-
src/backend/nodes/outfuncs.c | 41 +++++
src/backend/nodes/readfuncs.c | 38 ++++
src/backend/optimizer/plan/planner.c | 3 +
src/backend/optimizer/plan/setrefs.c | 10 ++
src/backend/partitioning/partprune.c | 37 +++-
src/backend/tcop/postgres.c | 15 +-
src/backend/tcop/pquery.c | 21 ++-
src/backend/utils/cache/plancache.c | 220 +++++++++++++++++++----
src/backend/utils/mmgr/portalmem.c | 2 +
src/include/commands/explain.h | 3 +-
src/include/executor/execPartition.h | 2 +
src/include/executor/execdesc.h | 2 +
src/include/executor/executor.h | 2 +
src/include/executor/nodeAppend.h | 1 +
src/include/executor/nodeMergeAppend.h | 1 +
src/include/executor/nodeModifyTable.h | 1 +
src/include/nodes/execnodes.h | 87 +++++++++
src/include/nodes/nodes.h | 5 +
src/include/nodes/pathnodes.h | 7 +
src/include/nodes/plannodes.h | 18 ++
src/include/tcop/tcopprot.h | 2 +-
src/include/utils/plancache.h | 5 +
src/include/utils/portal.h | 5 +
41 files changed, 1108 insertions(+), 109 deletions(-)
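
The idea described in the commit message (run initial pruning first, then lock only the parent and the relations scanned by surviving subplans, while remembering which subplans survived) can be sketched as follows. This is a toy model, not the patch's code: ToyRelids is a 64-bit mask standing in for a Bitmapset of RT indexes, and ToyPartition, ToyLockRelsResult, toy_add_member, and toy_get_lock_rels are hypothetical names. The range bounds and the single integer key are invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a Bitmapset; holds members 0..63 only. */
typedef uint64_t ToyRelids;

/* One child subplan of a toy Append: scans 'rti', holds keys in [lo, hi). */
typedef struct ToyPartition
{
	int			rti;
	int			lo;
	int			hi;
} ToyPartition;

/* Loosely mirrors the two outputs: relations to lock, surviving subplans. */
typedef struct ToyLockRelsResult
{
	ToyRelids	lockrels;
	ToyRelids	valid_subplans;
} ToyLockRelsResult;

static ToyRelids
toy_add_member(ToyRelids set, int member)
{
	assert(member >= 0 && member < 64);
	return set | (UINT64_C(1) << member);
}

/*
 * "Initial pruning" here is just a range check against a known parameter
 * value.  Pruned children contribute nothing to lockrels.
 */
static ToyLockRelsResult
toy_get_lock_rels(int parent_rti, const ToyPartition *parts, int nparts,
				  int key)
{
	ToyLockRelsResult result = {0, 0};
	int			i;

	/* The partitioned parent is always a locking target. */
	result.lockrels = toy_add_member(result.lockrels, parent_rti);

	for (i = 0; i < nparts; i++)
	{
		if (key >= parts[i].lo && key < parts[i].hi)
		{
			result.valid_subplans = toy_add_member(result.valid_subplans, i);
			result.lockrels = toy_add_member(result.lockrels, parts[i].rti);
		}
	}

	return result;
}
```

In this model, the valid_subplans member plays the role that the commit message assigns to PlanInitPruningOutput: the later initialization step would consult it rather than repeat the pruning, so the locked set and the initialized set cannot diverge.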
diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index 55c38b04c4..d403eb2309 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -542,7 +542,7 @@ BeginCopyTo(ParseState *pstate,
((DR_copy *) dest)->cstate = cstate;
/* Create a QueryDesc requesting no output */
- cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ cstate->queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
GetActiveSnapshot(),
InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 9abbb6b555..f6607f2454 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -325,7 +325,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index de81379da3..a9dc6d1755 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -407,7 +407,7 @@ ExplainOneQuery(Query *query, int cursorOptions,
}
/* run it (if needed) and produce output */
- ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+ ExplainOnePlan(plan, NULL, into, es, queryString, params, queryEnv,
&planduration, (es->buffers ? &bufusage : NULL));
}
}
@@ -515,7 +515,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
* to call it.
*/
void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, ExecLockRelsInfo *execlockrelsinfo,
+ IntoClause *into, ExplainState *es,
const char *queryString, ParamListInfo params,
QueryEnvironment *queryEnv, const instr_time *planduration,
const BufferUsage *bufusage)
@@ -563,7 +564,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
dest = None_Receiver;
/* Create a QueryDesc for the query */
- queryDesc = CreateQueryDesc(plannedstmt, queryString,
+ queryDesc = CreateQueryDesc(plannedstmt, execlockrelsinfo, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, instrument_option);
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index 1013790dbb..008b8ce0e9 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -741,8 +741,10 @@ execute_sql_string(const char *sql)
RawStmt *parsetree = lfirst_node(RawStmt, lc1);
MemoryContext per_parsetree_context,
oldcontext;
- List *stmt_list;
- ListCell *lc2;
+ List *stmt_list,
+ *execlockrelsinfo_list;
+ ListCell *lc2,
+ *lc3;
/*
* We do the work for each parsetree in a short-lived context, to
@@ -762,11 +764,13 @@ execute_sql_string(const char *sql)
NULL,
0,
NULL);
- stmt_list = pg_plan_queries(stmt_list, sql, CURSOR_OPT_PARALLEL_OK, NULL);
+ stmt_list = pg_plan_queries(stmt_list, sql, CURSOR_OPT_PARALLEL_OK, NULL,
+ &execlockrelsinfo_list);
- foreach(lc2, stmt_list)
+ forboth(lc2, stmt_list, lc3, execlockrelsinfo_list)
{
PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+ ExecLockRelsInfo *execlockrelsinfo = lfirst_node(ExecLockRelsInfo, lc3);
CommandCounterIncrement();
@@ -777,6 +781,7 @@ execute_sql_string(const char *sql)
QueryDesc *qdesc;
qdesc = CreateQueryDesc(stmt,
+ execlockrelsinfo,
sql,
GetActiveSnapshot(), NULL,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 05e7b60059..4ef44aaf23 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -416,7 +416,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, queryString,
+ queryDesc = CreateQueryDesc(plan, NULL, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/portalcmds.c b/src/backend/commands/portalcmds.c
index 9902c5c566..85e73ddded 100644
--- a/src/backend/commands/portalcmds.c
+++ b/src/backend/commands/portalcmds.c
@@ -107,6 +107,7 @@ PerformCursorOpen(ParseState *pstate, DeclareCursorStmt *cstmt, ParamListInfo pa
queryString,
CMDTAG_SELECT, /* cursor's query is always a SELECT */
list_make1(plan),
+ list_make1(NULL), /* no ExecLockRelsInfo to pass */
NULL);
/*----------
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 80738547ed..bbbf8bbcbd 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -155,6 +155,7 @@ ExecuteQuery(ParseState *pstate,
PreparedStatement *entry;
CachedPlan *cplan;
List *plan_list;
+ List *plan_execlockrelsinfo_list;
ParamListInfo paramLI = NULL;
EState *estate = NULL;
Portal portal;
@@ -195,6 +196,7 @@ ExecuteQuery(ParseState *pstate,
/* Replan if needed, and increment plan refcount for portal */
cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL);
plan_list = cplan->stmt_list;
+ plan_execlockrelsinfo_list = cplan->execlockrelsinfo_list;
/*
* DO NOT add any logic that could possibly throw an error between
@@ -204,7 +206,7 @@ ExecuteQuery(ParseState *pstate,
NULL,
query_string,
entry->plansource->commandTag,
- plan_list,
+ plan_list, plan_execlockrelsinfo_list,
cplan);
/*
@@ -576,7 +578,9 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
const char *query_string;
CachedPlan *cplan;
List *plan_list;
- ListCell *p;
+ List *plan_execlockrelsinfo_list;
+ ListCell *p,
+ *pe;
ParamListInfo paramLI = NULL;
EState *estate = NULL;
instr_time planstart;
@@ -632,15 +636,18 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
}
plan_list = cplan->stmt_list;
+ plan_execlockrelsinfo_list = cplan->execlockrelsinfo_list;
/* Explain each query */
- foreach(p, plan_list)
+ forboth(p, plan_list, pe, plan_execlockrelsinfo_list)
{
PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
+ ExecLockRelsInfo *execlockrelsinfo = lfirst_node(ExecLockRelsInfo, pe);
if (pstmt->commandType != CMD_UTILITY)
- ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv,
- &planduration, (es->buffers ? &bufusage : NULL));
+ ExplainOnePlan(pstmt, execlockrelsinfo, into, es, query_string,
+ paramLI, queryEnv, &planduration,
+ (es->buffers ? &bufusage : NULL));
else
ExplainOneUtility(pstmt->utilityStmt, into, es, query_string,
paramLI, queryEnv);
diff --git a/src/backend/executor/README b/src/backend/executor/README
index bf5e70860d..27341a2818 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -59,11 +59,20 @@ state tree. Read-only plan trees make life much simpler for plan caching and
reuse.
A corresponding executor state node may not be created during executor startup
-if the executor determines that an entire subplan is not required due to
-execution time partition pruning determining that no matching records will be
-found there. This currently only occurs for Append and MergeAppend nodes. In
-this case the non-required subplans are ignored and the executor state's
-subnode array will become out of sequence to the plan's subplan list.
+if ExecutorGetLockRels() determines that an entire subplan is not required
+due to initial partition pruning determining that no matching records will be
+found there, while also skipping the locking of relation(s) that would be
+scanned by the subplan were it not pruned. This currently only occurs for
+Append and MergeAppend nodes (see ExecGet[Merge]AppendLockRels()). In this
+case, the non-required subplans are ignored and the executor state's subnode
+array will become out of sequence to the plan's subplan list.
+ExecutorGetLockRels() typically runs before the execution starts, for example,
+as part of checking if a cached generic plan is still valid, though the
+result it produces (ExecLockRelsInfo) is made available to ExecutorStart() via
+the QueryDesc. ExecInitNode() on the plan nodes whose child subplans may have
+been pruned as part of ExecutorGetLockRels() must look up the surviving set of
+subplans to initialize in the ExecLockRelsInfo, instead of reiterating the
+initial pruning computation.
Each Plan node may have expression trees associated with it, to represent
its target list, qualification conditions, etc. These trees are also
@@ -247,6 +256,9 @@ Query Processing Control Flow
This is a sketch of control flow for full query processing:
+ [ ExecutorGetLockRels ] --- an optional step to walk over the plan tree
+ to produce an ExecLockRelsInfo to be passed to CreateQueryDesc
+
CreateQueryDesc
ExecutorStart
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 549d9eb696..3b1f588321 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -48,11 +48,15 @@
#include "commands/matview.h"
#include "commands/trigger.h"
#include "executor/execdebug.h"
+#include "executor/nodeAppend.h"
+#include "executor/nodeMergeAppend.h"
+#include "executor/nodeModifyTable.h"
#include "executor/nodeSubplan.h"
#include "foreign/fdwapi.h"
#include "jit/jit.h"
#include "mb/pg_wchar.h"
#include "miscadmin.h"
+#include "nodes/nodeFuncs.h"
#include "parser/parsetree.h"
#include "storage/bufmgr.h"
#include "storage/lmgr.h"
@@ -100,9 +104,184 @@ static char *ExecBuildSlotValueDescription(Oid reloid,
Bitmapset *modifiedCols,
int maxfieldlen);
static void EvalPlanQualStart(EPQState *epqstate, Plan *planTree);
+static bool ExecGetScanLockRels(Scan *scan, ExecGetLockRelsContext *context);
/* end of local decls */
+/* ----------------------------------------------------------------
+ * ExecutorGetLockRels
+ *
+ * Figure out the set of relations to lock to be able to execute a given
+ * plan, after taking into account the result of performing any initial
+ * pruning steps present in the plan. Performing those pruning steps
+ * would effectively invalidate the pruned subplans (that is, will not
+ * be looked at during the actual execution of the parent plan), so the
+ * relations that those subplans scan need not be locked.
+ *
+ * Along with the set of RT indexes of relations that must be locked, the
+ * returned struct also contains the information needed to look up the
+ * PlanInitPruningOutput node, containing the result of initial pruning
+ * (surviving partition subnodes), for each plan node that undergoes pruning.
+ *
+ * The caller must arrange to pass on the returned struct down to the
+ * executor, so that the latter can reuse the result of initial pruning to
+ * initialize the same set of surviving subplans, instead of doing the pruning
+ * again by itself.
+ *
+ * This locks relations whose information is perused to do the pruning; for
+ * example, a partitioned table is locked before ExecGetAppendLockRels()
+ * peruses its PartitionedRelPruneInfo contained in an Append node.
+ */
+ExecLockRelsInfo *
+ExecutorGetLockRels(PlannedStmt *plannedstmt, ParamListInfo params)
+{
+ int numPlanNodes = plannedstmt->numPlanNodes;
+ ExecGetLockRelsContext context;
+ ExecLockRelsInfo *result;
+ ListCell *lc;
+
+ /* Only get here if there is any pruning to do. */
+ Assert(plannedstmt->containsInitialPruning);
+
+ context.stmt = plannedstmt;
+ context.params = params;
+
+ /* Go do init pruning and fill lockrels. */
+ context.lockrels = NULL;
+ context.initPruningOutputs = NIL;
+ context.ipoIndexes = palloc0(sizeof(int) * numPlanNodes);
+ foreach(lc, plannedstmt->subplans)
+ {
+ Plan *subplan = lfirst(lc);
+
+ (void) ExecGetLockRels(subplan, &context);
+ }
+
+ (void) ExecGetLockRels(plannedstmt->planTree, &context);
+
+ result = makeNode(ExecLockRelsInfo);
+ result->lockrels = context.lockrels;
+ result->numPlanNodes = numPlanNodes;
+ result->initPruningOutputs = context.initPruningOutputs;
+ result->ipoIndexes = context.ipoIndexes;
+
+ return result;
+}
+
+/* ------------------------------------------------------------------------
+ * ExecGetLockRels
+ * Recursively find relations to lock in the plan tree rooted at 'node',
+ * performing initial pruning if the node contains the information to
+ * do so
+ *
+ * 'node' is the current node of the plan produced by the query planner
+ * 'context' contains the PlannedStmt and the information about EXTERN
+ * parameters to use for partition pruning and also where to add the
+ * result -- lockrels and PlanInitPruningOutput nodes
+ *
+ * NOTE: ExecGetLockRels subroutine for a given node must add the RT indexes of
+ * any relations that it manipulates to context->lockrels. If the node needs
+ * initial pruning, it must add the resulting PlanInitPruningOutput node to
+ * context using the ExecStorePlanInitPruningOutput() macro.
+ * ------------------------------------------------------------------------
+ */
+bool
+ExecGetLockRels(Plan *node, ExecGetLockRelsContext *context)
+{
+ /* Do nothing when we get past a leaf of the plan tree. */
+ if (node == NULL)
+ return true;
+
+ /* Make sure there's enough stack available. */
+ check_stack_depth();
+
+ switch (nodeTag(node))
+ {
+ case T_Append:
+ if (ExecGetAppendLockRels((Append *) node, context))
+ return true;
+ break;
+ case T_MergeAppend:
+ if (ExecGetMergeAppendLockRels((MergeAppend *) node, context))
+ return true;
+ break;
+
+ case T_SeqScan:
+ case T_SampleScan:
+ case T_IndexScan:
+ case T_IndexOnlyScan:
+ case T_BitmapIndexScan:
+ case T_BitmapHeapScan:
+ case T_TidScan:
+ case T_TidRangeScan:
+ case T_ForeignScan:
+ case T_SubqueryScan:
+ case T_CustomScan:
+ if (ExecGetScanLockRels((Scan *) node, context))
+ return true;
+ break;
+
+ case T_ModifyTable:
+ if (ExecGetModifyTableLockRels((ModifyTable *) node, context))
+ return true;
+ /* plan_tree_walker() will visit the subplan (outerNode) */
+ break;
+
+ default:
+ break;
+ }
+
+ return plan_tree_walker(node, ExecGetLockRels, (void *) context);
+}
+
+/*
+ * ExecGetScanLockRels
+ * Do ExecGetLockRels()'s work for a Scan plan
+ */
+static bool
+ExecGetScanLockRels(Scan *scan, ExecGetLockRelsContext *context)
+{
+ switch (nodeTag(scan))
+ {
+ case T_ForeignScan:
+ {
+ ForeignScan *fscan = (ForeignScan *) scan;
+
+ context->lockrels = bms_add_members(context->lockrels,
+ fscan->fs_relids);
+ }
+ break;
+
+ case T_SubqueryScan:
+ {
+ SubqueryScan *sscan = (SubqueryScan *) scan;
+
+ (void) ExecGetLockRels((Plan *) sscan->subplan, context);
+ }
+ break;
+
+ case T_CustomScan:
+ {
+ CustomScan *cscan = (CustomScan *) scan;
+ ListCell *lc;
+
+ context->lockrels = bms_add_members(context->lockrels,
+ cscan->custom_relids);
+ foreach(lc, cscan->custom_plans)
+ {
+ (void) ExecGetLockRels((Plan *) lfirst(lc), context);
+ }
+ }
+ break;
+
+ default:
+ context->lockrels = bms_add_member(context->lockrels,
+ scan->scanrelid);
+ break;
+ }
+
+ return true;
+}
/* ----------------------------------------------------------------
* ExecutorStart
@@ -804,6 +983,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
{
CmdType operation = queryDesc->operation;
PlannedStmt *plannedstmt = queryDesc->plannedstmt;
+ ExecLockRelsInfo *execlockrelsinfo = queryDesc->execlockrelsinfo;
Plan *plan = plannedstmt->planTree;
List *rangeTable = plannedstmt->rtable;
EState *estate = queryDesc->estate;
@@ -823,6 +1003,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
ExecInitRangeTable(estate, rangeTable);
estate->es_plannedstmt = plannedstmt;
+ estate->es_execlockrelsinfo = execlockrelsinfo;
/*
* Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index 5dd8ab7db2..f27f85ab4f 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -66,6 +66,7 @@
#define PARALLEL_KEY_QUERY_TEXT UINT64CONST(0xE000000000000008)
#define PARALLEL_KEY_JIT_INSTRUMENTATION UINT64CONST(0xE000000000000009)
#define PARALLEL_KEY_WAL_USAGE UINT64CONST(0xE00000000000000A)
+#define PARALLEL_KEY_EXECLOCKRELSINFO UINT64CONST(0xE00000000000000B)
#define PARALLEL_TUPLE_QUEUE_SIZE 65536
@@ -182,8 +183,10 @@ ExecSerializePlan(Plan *plan, EState *estate)
pstmt->transientPlan = false;
pstmt->dependsOnRole = false;
pstmt->parallelModeNeeded = false;
+ pstmt->containsInitialPruning = false;
pstmt->planTree = plan;
pstmt->rtable = estate->es_range_table;
+ pstmt->lockrels = NULL;
pstmt->resultRelations = NIL;
pstmt->appendRelations = NIL;
@@ -596,12 +599,15 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
FixedParallelExecutorState *fpes;
char *pstmt_data;
char *pstmt_space;
+ char *execlockrelsinfo_data;
+ char *execlockrelsinfo_space;
char *paramlistinfo_space;
BufferUsage *bufusage_space;
WalUsage *walusage_space;
SharedExecutorInstrumentation *instrumentation = NULL;
SharedJitInstrumentation *jit_instrumentation = NULL;
int pstmt_len;
+ int execlockrelsinfo_len;
int paramlistinfo_len;
int instrumentation_len = 0;
int jit_instrumentation_len = 0;
@@ -630,6 +636,7 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
/* Fix up and serialize plan to be sent to workers. */
pstmt_data = ExecSerializePlan(planstate->plan, estate);
+ execlockrelsinfo_data = nodeToString(estate->es_execlockrelsinfo);
/* Create a parallel context. */
pcxt = CreateParallelContext("postgres", "ParallelQueryMain", nworkers);
@@ -656,6 +663,11 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
shm_toc_estimate_chunk(&pcxt->estimator, pstmt_len);
shm_toc_estimate_keys(&pcxt->estimator, 1);
+ /* Estimate space for serialized ExecLockRelsInfo. */
+ execlockrelsinfo_len = strlen(execlockrelsinfo_data) + 1;
+ shm_toc_estimate_chunk(&pcxt->estimator, execlockrelsinfo_len);
+ shm_toc_estimate_keys(&pcxt->estimator, 1);
+
/* Estimate space for serialized ParamListInfo. */
paramlistinfo_len = EstimateParamListSpace(estate->es_param_list_info);
shm_toc_estimate_chunk(&pcxt->estimator, paramlistinfo_len);
@@ -750,6 +762,12 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
memcpy(pstmt_space, pstmt_data, pstmt_len);
shm_toc_insert(pcxt->toc, PARALLEL_KEY_PLANNEDSTMT, pstmt_space);
+ /* Store serialized ExecLockRelsInfo */
+ execlockrelsinfo_space = shm_toc_allocate(pcxt->toc, execlockrelsinfo_len);
+ memcpy(execlockrelsinfo_space, execlockrelsinfo_data, execlockrelsinfo_len);
+ shm_toc_insert(pcxt->toc, PARALLEL_KEY_EXECLOCKRELSINFO,
+ execlockrelsinfo_space);
+
/* Store serialized ParamListInfo. */
paramlistinfo_space = shm_toc_allocate(pcxt->toc, paramlistinfo_len);
shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARAMLISTINFO, paramlistinfo_space);
@@ -1231,8 +1249,10 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
int instrument_options)
{
char *pstmtspace;
+ char *execlockrelsinfospace;
char *paramspace;
PlannedStmt *pstmt;
+ ExecLockRelsInfo *execlockrelsinfo;
ParamListInfo paramLI;
char *queryString;
@@ -1243,12 +1263,17 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
pstmtspace = shm_toc_lookup(toc, PARALLEL_KEY_PLANNEDSTMT, false);
pstmt = (PlannedStmt *) stringToNode(pstmtspace);
+ /* Reconstruct leader-supplied ExecLockRelsInfo. */
+ execlockrelsinfospace = shm_toc_lookup(toc, PARALLEL_KEY_EXECLOCKRELSINFO,
+ false);
+ execlockrelsinfo = (ExecLockRelsInfo *) stringToNode(execlockrelsinfospace);
+
/* Reconstruct ParamListInfo. */
paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
paramLI = RestoreParamList(&paramspace);
/* Create a QueryDesc for the query. */
- return CreateQueryDesc(pstmt,
+ return CreateQueryDesc(pstmt, execlockrelsinfo,
queryString,
GetActiveSnapshot(), InvalidSnapshot,
receiver, paramLI, NULL, instrument_options);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 21953f253b..db8c4cd719 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -24,6 +24,7 @@
#include "mb/pg_wchar.h"
#include "miscadmin.h"
#include "nodes/makefuncs.h"
+#include "parser/parsetree.h"
#include "partitioning/partbounds.h"
#include "partitioning/partdesc.h"
#include "partitioning/partprune.h"
@@ -183,8 +184,14 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
int maxfieldlen);
static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
static PartitionPruneState *ExecCreatePartitionPruneState(PlanState *planstate,
- PartitionPruneInfo *partitionpruneinfo);
-static Bitmapset *ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate);
+ PartitionPruneInfo *partitionpruneinfo,
+ bool consider_initial_steps,
+ bool consider_exec_steps,
+ List *rtable, ExprContext *econtext,
+ PartitionDirectory partdir,
+ Bitmapset **parentrelids);
+static Bitmapset *ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate,
+ PartitionPruneInfo *pruneinfo);
static void ExecInitPruningContext(PartitionPruneContext *context,
List *pruning_steps,
PartitionDesc partdesc,
@@ -1483,8 +1490,9 @@ adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri)
* considered to be a stable expression, it can change value from one plan
* node scan to the next during query execution. Stable comparison
* expressions that don't involve such Params allow partition pruning to be
- * done once during executor startup. Expressions that do involve such Params
- * require us to prune separately for each scan of the parent plan node.
+ * done once during executor startup, or even earlier in ExecutorGetLockRels().
+ * Expressions that do involve such Params require us to prune separately for
+ * each scan of the parent plan node.
*
* Note that pruning away unneeded subplans during executor startup has the
* added benefit of not having to initialize the unneeded subplans at all.
@@ -1503,6 +1511,10 @@ adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri)
* updated to account for initial pruning having eliminated some of the
* subplans, if any.
*
+ * ExecGetLockRelsDoInitialPruning:
+ * Do initial pruning as part of ExecGetLockRels() on the parent plan
+ * node
+ *
* ExecFindMatchingSubPlans:
* Returns indexes of matching subplans after evaluating all available
* expressions, that is, using execution pruning steps. This function can
@@ -1531,22 +1543,57 @@ ExecInitPartitionPruning(PlanState *planstate,
{
PartitionPruneState *prunestate;
EState *estate = planstate->state;
+ Plan *plan = planstate->plan;
+ PlanInitPruningOutput *initPruningOutput = NULL;
+ bool do_pruning = (pruneinfo->needs_init_pruning ||
+ pruneinfo->needs_exec_pruning);
- /* We may need an expression context to evaluate partition exprs */
- ExecAssignExprContext(estate, planstate);
+ if (estate->es_execlockrelsinfo)
+ {
+ initPruningOutput = (PlanInitPruningOutput *)
+ ExecFetchPlanInitPruningOutput(estate->es_execlockrelsinfo, plan);
- /*
- * Create the working data structure for pruning.
- */
- prunestate = ExecCreatePartitionPruneState(planstate, pruneinfo);
+ Assert(initPruningOutput != NULL &&
+ IsA(initPruningOutput, PlanInitPruningOutput));
+ /* No need to do initial pruning again, only exec pruning. */
+ do_pruning = pruneinfo->needs_exec_pruning;
+ }
+
+ prunestate = NULL;
+ if (do_pruning)
+ {
+ /* We may need an expression context to evaluate partition exprs */
+ ExecAssignExprContext(estate, planstate);
+
+ /* For data reading, executor always omits detached partitions */
+ if (estate->es_partition_directory == NULL)
+ estate->es_partition_directory =
+ CreatePartitionDirectory(estate->es_query_cxt, false);
+
+ /*
+ * Create the working data structure for pruning. No need to consider
+ * initial pruning steps if we have a PlanInitPruningOutput.
+ */
+ prunestate = ExecCreatePartitionPruneState(planstate, pruneinfo,
+ initPruningOutput == NULL, true,
+ NIL, planstate->ps_ExprContext,
+ estate->es_partition_directory,
+ NULL);
+ }
/*
* Perform an initial partition prune, if required.
*/
- if (prunestate->do_initial_prune)
+ if (initPruningOutput)
+ {
+ /* ExecutorGetLockRels() already did it for us! */
+ *initially_valid_subplans = initPruningOutput->initially_valid_subplans;
+ }
+ else if (prunestate && prunestate->do_initial_prune)
{
/* Determine which subplans survive initial pruning */
- *initially_valid_subplans = ExecFindInitialMatchingSubPlans(prunestate);
+ *initially_valid_subplans = ExecFindInitialMatchingSubPlans(prunestate,
+ pruneinfo);
}
else
{
@@ -1564,7 +1611,7 @@ ExecInitPartitionPruning(PlanState *planstate,
* invalid data in prunestate, because that data won't be consulted again
* (cf initial Assert in ExecFindMatchingSubPlans).
*/
- if (prunestate->do_exec_prune &&
+ if (prunestate && prunestate->do_exec_prune &&
bms_num_members(*initially_valid_subplans) < n_total_subplans)
ExecPartitionPruneFixSubPlanIndexes(prunestate,
*initially_valid_subplans,
@@ -1573,12 +1620,83 @@ ExecInitPartitionPruning(PlanState *planstate,
return prunestate;
}
+/*
+ * ExecGetLockRelsDoInitialPruning
+ * Perform initial pruning as part of doing ExecGetLockRels() on the parent
+ * plan node
+ */
+Bitmapset *
+ExecGetLockRelsDoInitialPruning(Plan *plan, ExecGetLockRelsContext *context,
+ PartitionPruneInfo *pruneinfo)
+{
+ List *rtable = context->stmt->rtable;
+ ParamListInfo params = context->params;
+ ExprContext *econtext;
+ PartitionDirectory pdir;
+ MemoryContext oldcontext,
+ tmpcontext;
+ Bitmapset *parentrelids;
+ PartitionPruneState *prunestate;
+ PlanInitPruningOutput *initPruningOutput;
+
+ /*
+ * A temporary context to allocate stuff needed to run the pruning steps.
+ */
+ tmpcontext = AllocSetContextCreate(CurrentMemoryContext,
+ "initial pruning working data",
+ ALLOCSET_DEFAULT_SIZES);
+ oldcontext = MemoryContextSwitchTo(tmpcontext);
+
+ /*
+ * PartitionDirectory to look up partition descriptors, which omits
+ * detached partitions, just like in the executor proper.
+ */
+ pdir = CreatePartitionDirectory(CurrentMemoryContext, false);
+
+ /*
+ * We don't yet have a PlanState for the parent plan node, so must create
+ * a standalone ExprContext to evaluate pruning expressions, equipped with
+ * the information about the EXTERN parameters that the caller passed us.
+ * Note that that's okay because the initial pruning steps do not contain
+ * anything that requires the execution to have started.
+ */
+ econtext = CreateStandaloneExprContext();
+ econtext->ecxt_param_list_info = params;
+ prunestate = ExecCreatePartitionPruneState(NULL, pruneinfo,
+ true, false,
+ rtable, econtext,
+ pdir, &parentrelids);
+ MemoryContextSwitchTo(oldcontext);
+
+ /* Do the pruning and populate a PlanInitPruningOutput for this node. */
+ initPruningOutput = makeNode(PlanInitPruningOutput);
+ initPruningOutput->initially_valid_subplans =
+ ExecFindInitialMatchingSubPlans(prunestate, pruneinfo);
+ ExecStorePlanInitPruningOutput(context, initPruningOutput, plan);
+
+ /*
+ * Report parent partitioned tables as locking targets, though they
+ * would already be locked by ExecCreatePartitionPruneState().
+ */
+ Assert(bms_num_members(parentrelids) > 0);
+ context->lockrels = bms_add_members(context->lockrels, parentrelids);
+
+ FreeExprContext(econtext, true);
+ DestroyPartitionDirectory(pdir);
+ MemoryContextDelete(tmpcontext);
+
+ return initPruningOutput->initially_valid_subplans;
+}
+
/*
* ExecCreatePartitionPruneState
* Build the data structure required for calling
* ExecFindInitialMatchingSubPlans and ExecFindMatchingSubPlans.
*
- * 'planstate' is the parent plan node's execution state.
+ * 'planstate', if not NULL, is the parent plan node's execution state. It
+ * can be NULL if being called before ExecutorStart(), in which case,
+ * 'rtable' (range table), 'econtext', and 'partdir' must be explicitly
+ * provided.
*
* 'partitionpruneinfo' is a PartitionPruneInfo as generated by
* make_partition_pruneinfo. Here we build a PartitionPruneState containing a
@@ -1590,26 +1708,35 @@ ExecInitPartitionPruning(PlanState *planstate,
* as children. The data stored in each PartitionedRelPruningData can be
* re-used each time we re-evaluate which partitions match the pruning steps
* provided in each PartitionedRelPruneInfo.
+ *
+ * The RT indexes of the parent partitioned tables that are locked here to
+ * peruse their PartitionedRelPruneInfo are returned in *parentrelids if
+ * asked for by the caller.
*/
static PartitionPruneState *
ExecCreatePartitionPruneState(PlanState *planstate,
- PartitionPruneInfo *partitionpruneinfo)
+ PartitionPruneInfo *partitionpruneinfo,
+ bool consider_initial_steps,
+ bool consider_exec_steps,
+ List *rtable, ExprContext *econtext,
+ PartitionDirectory partdir,
+ Bitmapset **parentrelids)
{
- EState *estate = planstate->state;
+ EState *estate = planstate ? planstate->state : NULL;
PartitionPruneState *prunestate;
int n_part_hierarchies;
ListCell *lc;
int i;
- ExprContext *econtext = planstate->ps_ExprContext;
- /* For data reading, executor always omits detached partitions */
- if (estate->es_partition_directory == NULL)
- estate->es_partition_directory =
- CreatePartitionDirectory(estate->es_query_cxt, false);
+ Assert((estate != NULL) ||
+ (partdir != NULL && econtext != NULL && rtable != NIL));
n_part_hierarchies = list_length(partitionpruneinfo->prune_infos);
Assert(n_part_hierarchies > 0);
+ if (parentrelids)
+ *parentrelids = NULL;
+
/*
* Allocate the data structure
*/
@@ -1656,19 +1783,58 @@ ExecCreatePartitionPruneState(PlanState *planstate,
PartitionedRelPruneInfo *pinfo = lfirst_node(PartitionedRelPruneInfo, lc2);
PartitionedRelPruningData *pprune = &prunedata->partrelprunedata[j];
Relation partrel;
+ bool close_partrel = false;
PartitionDesc partdesc;
PartitionKey partkey;
/*
- * We can rely on the copies of the partitioned table's partition
- * key and partition descriptor appearing in its relcache entry,
- * because that entry will be held open and locked for the
- * duration of this executor run.
+ * Must open the relation ourselves when called before execution
+ * has started, such as when called during
+ * ExecutorGetLockRels() on a cached plan. In that case,
+ * sub-partitions must be locked, because AcquirePlannerLocks()
+ * would not have seen them. (1st relation in a partrelpruneinfos
+ * list is always the root partitioned table appearing in the
+ * query, which AcquirePlannerLocks() would have locked; the
+ * Assert in relation_open() guards that assumption.)
+ */
+ if (estate == NULL)
+ {
+ RangeTblEntry *rte = rt_fetch(pinfo->rtindex, rtable);
+ int lockmode = (j == 0) ? NoLock : rte->rellockmode;
+
+ partrel = table_open(rte->relid, lockmode);
+ close_partrel = true;
+
+ /*
+ * Also report the partitioned table as having been locked.
+ * XXX - actually, *parentrelids set is later merged by the
+ * caller into the set of relations "to-be locked" by
+ * AcquireExecutorLocks(), thus causing the lock on this
+ * table to be requested again.
+ */
+ Assert(parentrelids != NULL);
+ *parentrelids = bms_add_member(*parentrelids, pinfo->rtindex);
+ }
+ else
+ partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
+
+ /*
+ * We can rely on the copy of the partitioned table's partition
+ * key in its relcache entry, because it can't change (or get
+ * destroyed) as long as the relation is locked. The partition
+ * descriptor is taken from the PartitionDirectory, which holds
+ * the table open long enough for the descriptor to remain valid
+ * while it's used to perform the pruning steps.
*/
- partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
partkey = RelationGetPartitionKey(partrel);
- partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
- partrel);
+ partdesc = PartitionDirectoryLookup(partdir, partrel);
+
+ /*
+ * Must close partrel, keeping the lock taken, if we're not using
+ * EState's entry.
+ */
+ if (close_partrel)
+ table_close(partrel, NoLock);
/*
* Initialize the subplan_map and subpart_map.
@@ -1770,7 +1936,7 @@ ExecCreatePartitionPruneState(PlanState *planstate,
* Initialize pruning contexts as needed.
*/
pprune->initial_pruning_steps = pinfo->initial_pruning_steps;
- if (pinfo->initial_pruning_steps)
+ if (consider_initial_steps && pinfo->initial_pruning_steps)
{
ExecInitPruningContext(&pprune->initial_context,
pinfo->initial_pruning_steps,
@@ -1780,7 +1946,7 @@ ExecCreatePartitionPruneState(PlanState *planstate,
prunestate->do_initial_prune = true;
}
pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
- if (pinfo->exec_pruning_steps)
+ if (consider_exec_steps && pinfo->exec_pruning_steps)
{
ExecInitPruningContext(&pprune->exec_context,
pinfo->exec_pruning_steps,
@@ -1899,7 +2065,8 @@ ExecInitPruningContext(PartitionPruneContext *context,
* is required.
*/
static Bitmapset *
-ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate)
+ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate,
+ PartitionPruneInfo *pruneinfo)
{
Bitmapset *result = NULL;
MemoryContext oldcontext;
@@ -1909,8 +2076,8 @@ ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate)
Assert(prunestate->do_initial_prune);
/*
- * Switch to a temp context to avoid leaking memory in the executor's
- * query-lifespan memory context.
+ * Switch to a temp context to avoid leaking memory in the longer-term
+ * memory context.
*/
oldcontext = MemoryContextSwitchTo(prunestate->prune_context);
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 9df1f81ea8..7246f9175f 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -119,6 +119,7 @@ CreateExecutorState(void)
estate->es_relations = NULL;
estate->es_rowmarks = NULL;
estate->es_plannedstmt = NULL;
+ estate->es_execlockrelsinfo = NULL;
estate->es_junkFilter = NULL;
@@ -785,6 +786,13 @@ ExecGetRangeTableRelation(EState *estate, Index rti)
Assert(rti > 0 && rti <= estate->es_range_table_size);
+ /*
+ * Cross-check that AcquireExecutorLocks() hasn't missed any relation
+ * that it should have locked.
+ */
+ Assert(estate->es_execlockrelsinfo == NULL ||
+ bms_is_member(rti, estate->es_execlockrelsinfo->lockrels));
+
rel = estate->es_relations[rti - 1];
if (rel == NULL)
{
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index f9460ae506..a2182a6b1f 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -844,7 +844,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
else
dest = None_Receiver;
- es->qd = CreateQueryDesc(es->stmt,
+ es->qd = CreateQueryDesc(es->stmt, NULL,
fcache->src,
GetActiveSnapshot(),
InvalidSnapshot,
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 5b6d3eb23b..966615f670 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -94,6 +94,45 @@ static bool ExecAppendAsyncRequest(AppendState *node, TupleTableSlot **result);
static void ExecAppendAsyncEventWait(AppendState *node);
static void classify_matching_subplans(AppendState *node);
+/* ----------------------------------------------------------------
+ * ExecGetAppendLockRels
+ * Do ExecGetLockRels()'s work for an Append plan
+ * ----------------------------------------------------------------
+ */
+bool
+ExecGetAppendLockRels(Append *node, ExecGetLockRelsContext *context)
+{
+ PartitionPruneInfo *pruneinfo = node->part_prune_info;
+
+ if (pruneinfo && pruneinfo->needs_init_pruning)
+ {
+ List *subplans = node->appendplans;
+ Bitmapset *validsubplans;
+ int i;
+
+ validsubplans = ExecGetLockRelsDoInitialPruning((Plan *) node,
+ context, pruneinfo);
+
+ /* Collect lock relations from the surviving subplans. */
+ i = -1;
+ while ((i = bms_next_member(validsubplans, i)) >= 0)
+ {
+ Plan *subplan = list_nth(subplans, i);
+
+ (void) ExecGetLockRels(subplan, context);
+ }
+
+ /* done with this node */
+ return true;
+ }
+
+ /*
+ * Look at all subplans, which the caller would do by calling
+ * plan_tree_walker() on the node.
+ */
+ return false;
+}
+
/* ----------------------------------------------------------------
* ExecInitAppend
*
@@ -155,7 +194,8 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
* subplan, we can fill as_valid_subplans immediately, preventing
* later calls to ExecFindMatchingSubPlans.
*/
- if (!prunestate->do_exec_prune && nplans > 0)
+ if (appendstate->as_prune_state == NULL ||
+ (!appendstate->as_prune_state->do_exec_prune && nplans > 0))
appendstate->as_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
}
else
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index 9a9f29e845..869b836a14 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -54,6 +54,45 @@ typedef int32 SlotNumber;
static TupleTableSlot *ExecMergeAppend(PlanState *pstate);
static int heap_compare_slots(Datum a, Datum b, void *arg);
+/* ----------------------------------------------------------------
+ * ExecGetMergeAppendLockRels
+ * Do ExecGetLockRels()'s work for a MergeAppend plan
+ * ----------------------------------------------------------------
+ */
+bool
+ExecGetMergeAppendLockRels(MergeAppend *node, ExecGetLockRelsContext *context)
+{
+ PartitionPruneInfo *pruneinfo = node->part_prune_info;
+
+ if (pruneinfo && pruneinfo->needs_init_pruning)
+ {
+ List *subplans = node->mergeplans;
+ Bitmapset *validsubplans;
+ int i;
+
+ validsubplans = ExecGetLockRelsDoInitialPruning((Plan *) node,
+ context, pruneinfo);
+
+ /* Collect lock relations from the surviving subplans. */
+ i = -1;
+ while ((i = bms_next_member(validsubplans, i)) >= 0)
+ {
+ Plan *subplan = list_nth(subplans, i);
+
+ (void) ExecGetLockRels(subplan, context);
+ }
+
+ /* done with this node */
+ return true;
+ }
+
+ /*
+ * Look at all subplans, which the caller would do by calling
+ * plan_tree_walker() on the node.
+ */
+ return false;
+}
+
/* ----------------------------------------------------------------
* ExecInitMergeAppend
@@ -103,7 +142,8 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
* subplan, we can fill as_valid_subplans immediately, preventing
* later calls to ExecFindMatchingSubPlans.
*/
- if (!prunestate->do_exec_prune && nplans > 0)
+ if (mergestate->ms_prune_state == NULL ||
+ (!mergestate->ms_prune_state->do_exec_prune && nplans > 0))
mergestate->ms_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
}
else
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 5ec699a9bd..c860045fcb 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -2700,6 +2700,30 @@ ExecLookupResultRelByOid(ModifyTableState *node, Oid resultoid,
return NULL;
}
+/*
+ * ExecGetModifyTableLockRels
+ * Do ExecGetLockRels()'s work for a ModifyTable plan
+ */
+bool
+ExecGetModifyTableLockRels(ModifyTable *plan, ExecGetLockRelsContext *context)
+{
+ ListCell *lc;
+
+ if (plan->rootRelation > 0)
+ context->lockrels = bms_add_member(context->lockrels,
+ plan->rootRelation);
+ context->lockrels = bms_add_member(context->lockrels,
+ plan->nominalRelation);
+ foreach(lc, plan->resultRelations)
+ {
+ context->lockrels = bms_add_member(context->lockrels,
+ lfirst_int(lc));
+ }
+
+ /* caller will look at the source subplan */
+ return false;
+}
+
/* ----------------------------------------------------------------
* ExecInitModifyTable
* ----------------------------------------------------------------
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index a82e986667..2107009591 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1578,6 +1578,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
CachedPlanSource *plansource;
CachedPlan *cplan;
List *stmt_list;
+ List *execlockrelsinfo_list;
char *query_string;
Snapshot snapshot;
MemoryContext oldcontext;
@@ -1659,6 +1660,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
/* Replan if needed, and increment plan refcount for portal */
cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv);
stmt_list = cplan->stmt_list;
+ execlockrelsinfo_list = cplan->execlockrelsinfo_list;
if (!plan->saved)
{
@@ -1670,6 +1672,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
*/
oldcontext = MemoryContextSwitchTo(portal->portalContext);
stmt_list = copyObject(stmt_list);
+ execlockrelsinfo_list = copyObject(execlockrelsinfo_list);
MemoryContextSwitchTo(oldcontext);
ReleaseCachedPlan(cplan, NULL);
cplan = NULL; /* portal shouldn't depend on cplan */
@@ -1683,6 +1686,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
query_string,
plansource->commandTag,
stmt_list,
+ execlockrelsinfo_list,
cplan);
/*
@@ -2473,7 +2477,9 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
{
CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
List *stmt_list;
- ListCell *lc2;
+ List *execlockrelsinfo_list;
+ ListCell *lc2,
+ *lc3;
spicallbackarg.query = plansource->query_string;
@@ -2552,6 +2558,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
plan_owner, _SPI_current->queryEnv);
stmt_list = cplan->stmt_list;
+ execlockrelsinfo_list = cplan->execlockrelsinfo_list;
/*
* If we weren't given a specific snapshot to use, and the statement
@@ -2589,9 +2596,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
}
}
- foreach(lc2, stmt_list)
+ forboth(lc2, stmt_list, lc3, execlockrelsinfo_list)
{
PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+ ExecLockRelsInfo *execlockrelsinfo = lfirst_node(ExecLockRelsInfo, lc3);
bool canSetTag = stmt->canSetTag;
DestReceiver *dest;
@@ -2663,7 +2671,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
else
snap = InvalidSnapshot;
- qdesc = CreateQueryDesc(stmt,
+ qdesc = CreateQueryDesc(stmt, execlockrelsinfo,
plansource->query_string,
snap, crosscheck_snapshot,
dest,
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index d4f8455a2b..68c664070c 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -68,6 +68,13 @@
} \
} while (0)
+/* Copy a field that is an array with numElem ints */
+#define COPY_INT_ARRAY(fldname, numElem) \
+ do { \
+ newnode->fldname = (numElem) > 0 ? palloc((numElem) * sizeof(int)) : NULL; \
+ memcpy(newnode->fldname, from->fldname, sizeof(int) * (numElem)); \
+ } while (0)
+
/* Copy a parse location field (for Copy, this is same as scalar case) */
#define COPY_LOCATION_FIELD(fldname) \
(newnode->fldname = from->fldname)
@@ -94,9 +101,12 @@ _copyPlannedStmt(const PlannedStmt *from)
COPY_SCALAR_FIELD(transientPlan);
COPY_SCALAR_FIELD(dependsOnRole);
COPY_SCALAR_FIELD(parallelModeNeeded);
+ COPY_SCALAR_FIELD(containsInitialPruning);
COPY_SCALAR_FIELD(jitFlags);
COPY_NODE_FIELD(planTree);
+ COPY_SCALAR_FIELD(numPlanNodes);
COPY_NODE_FIELD(rtable);
+ COPY_BITMAPSET_FIELD(lockrels);
COPY_NODE_FIELD(resultRelations);
COPY_NODE_FIELD(appendRelations);
COPY_NODE_FIELD(subplans);
@@ -1278,6 +1288,8 @@ _copyPartitionPruneInfo(const PartitionPruneInfo *from)
PartitionPruneInfo *newnode = makeNode(PartitionPruneInfo);
COPY_NODE_FIELD(prune_infos);
+ COPY_SCALAR_FIELD(needs_init_pruning);
+ COPY_SCALAR_FIELD(needs_exec_pruning);
COPY_BITMAPSET_FIELD(other_subplans);
return newnode;
@@ -4941,6 +4953,33 @@ _copyExtensibleNode(const ExtensibleNode *from)
return newnode;
}
+/* ****************************************************************
+ * execnodes.h copy functions
+ * ****************************************************************
+ */
+static ExecLockRelsInfo *
+_copyExecLockRelsInfo(const ExecLockRelsInfo *from)
+{
+ ExecLockRelsInfo *newnode = makeNode(ExecLockRelsInfo);
+
+ COPY_BITMAPSET_FIELD(lockrels);
+ COPY_SCALAR_FIELD(numPlanNodes);
+ COPY_NODE_FIELD(initPruningOutputs);
+ COPY_INT_ARRAY(ipoIndexes, from->numPlanNodes);
+
+ return newnode;
+}
+
+static PlanInitPruningOutput *
+_copyPlanInitPruningOutput(const PlanInitPruningOutput *from)
+{
+ PlanInitPruningOutput *newnode = makeNode(PlanInitPruningOutput);
+
+ COPY_BITMAPSET_FIELD(initially_valid_subplans);
+
+ return newnode;
+}
+
/* ****************************************************************
* value.h copy functions
* ****************************************************************
@@ -4995,7 +5034,6 @@ _copyBitString(const BitString *from)
return newnode;
}
-
static ForeignKeyCacheInfo *
_copyForeignKeyCacheInfo(const ForeignKeyCacheInfo *from)
{
@@ -5944,6 +5982,16 @@ copyObjectImpl(const void *from)
retval = _copyPublicationTable(from);
break;
+ /*
+ * EXECUTION NODES
+ */
+ case T_ExecLockRelsInfo:
+ retval = _copyExecLockRelsInfo(from);
+ break;
+ case T_PlanInitPruningOutput:
+ retval = _copyPlanInitPruningOutput(from);
+ break;
+
/*
* MISCELLANEOUS NODES
*/
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 6bdad462c7..e0e09d7abd 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -312,9 +312,12 @@ _outPlannedStmt(StringInfo str, const PlannedStmt *node)
WRITE_BOOL_FIELD(transientPlan);
WRITE_BOOL_FIELD(dependsOnRole);
WRITE_BOOL_FIELD(parallelModeNeeded);
+ WRITE_BOOL_FIELD(containsInitialPruning);
WRITE_INT_FIELD(jitFlags);
WRITE_NODE_FIELD(planTree);
+ WRITE_INT_FIELD(numPlanNodes);
WRITE_NODE_FIELD(rtable);
+ WRITE_BITMAPSET_FIELD(lockrels);
WRITE_NODE_FIELD(resultRelations);
WRITE_NODE_FIELD(appendRelations);
WRITE_NODE_FIELD(subplans);
@@ -1004,6 +1007,8 @@ _outPartitionPruneInfo(StringInfo str, const PartitionPruneInfo *node)
WRITE_NODE_TYPE("PARTITIONPRUNEINFO");
WRITE_NODE_FIELD(prune_infos);
+ WRITE_BOOL_FIELD(needs_init_pruning);
+ WRITE_BOOL_FIELD(needs_exec_pruning);
WRITE_BITMAPSET_FIELD(other_subplans);
}
@@ -2274,6 +2279,7 @@ _outPlannerGlobal(StringInfo str, const PlannerGlobal *node)
WRITE_NODE_FIELD(subplans);
WRITE_BITMAPSET_FIELD(rewindPlanIDs);
WRITE_NODE_FIELD(finalrtable);
+ WRITE_BITMAPSET_FIELD(lockrels);
WRITE_NODE_FIELD(finalrowmarks);
WRITE_NODE_FIELD(resultRelations);
WRITE_NODE_FIELD(appendRelations);
@@ -2697,6 +2703,31 @@ _outExtensibleNode(StringInfo str, const ExtensibleNode *node)
methods->nodeOut(str, node);
}
+/*****************************************************************************
+ *
+ * Stuff from execnodes.h
+ *
+ *****************************************************************************/
+
+static void
+_outExecLockRelsInfo(StringInfo str, const ExecLockRelsInfo *node)
+{
+ WRITE_NODE_TYPE("EXECLOCKRELSINFO");
+
+ WRITE_BITMAPSET_FIELD(lockrels);
+ WRITE_INT_FIELD(numPlanNodes);
+ WRITE_NODE_FIELD(initPruningOutputs);
+ WRITE_INT_ARRAY(ipoIndexes, node->numPlanNodes);
+}
+
+static void
+_outPlanInitPruningOutput(StringInfo str, const PlanInitPruningOutput *node)
+{
+ WRITE_NODE_TYPE("PARTITIONINITPRUNINGOUTPUT");
+
+ WRITE_BITMAPSET_FIELD(initially_valid_subplans);
+}
+
/*****************************************************************************
*
* Stuff from parsenodes.h.
@@ -4538,6 +4569,16 @@ outNode(StringInfo str, const void *obj)
_outPartitionRangeDatum(str, obj);
break;
+ /*
+ * EXECUTION NODES
+ */
+ case T_ExecLockRelsInfo:
+ _outExecLockRelsInfo(str, obj);
+ break;
+ case T_PlanInitPruningOutput:
+ _outPlanInitPruningOutput(str, obj);
+ break;
+
default:
/*
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 3f68f7c18d..41ded72c4c 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -1585,9 +1585,12 @@ _readPlannedStmt(void)
READ_BOOL_FIELD(transientPlan);
READ_BOOL_FIELD(dependsOnRole);
READ_BOOL_FIELD(parallelModeNeeded);
+ READ_BOOL_FIELD(containsInitialPruning);
READ_INT_FIELD(jitFlags);
READ_NODE_FIELD(planTree);
+ READ_INT_FIELD(numPlanNodes);
READ_NODE_FIELD(rtable);
+ READ_BITMAPSET_FIELD(lockrels);
READ_NODE_FIELD(resultRelations);
READ_NODE_FIELD(appendRelations);
READ_NODE_FIELD(subplans);
@@ -2534,6 +2537,8 @@ _readPartitionPruneInfo(void)
READ_LOCALS(PartitionPruneInfo);
READ_NODE_FIELD(prune_infos);
+ READ_BOOL_FIELD(needs_init_pruning);
+ READ_BOOL_FIELD(needs_exec_pruning);
READ_BITMAPSET_FIELD(other_subplans);
READ_DONE();
@@ -2703,6 +2708,35 @@ _readPartitionRangeDatum(void)
READ_DONE();
}
+/*
+ * _readExecLockRelsInfo
+ */
+static ExecLockRelsInfo *
+_readExecLockRelsInfo(void)
+{
+ READ_LOCALS(ExecLockRelsInfo);
+
+ READ_BITMAPSET_FIELD(lockrels);
+ READ_INT_FIELD(numPlanNodes);
+ READ_NODE_FIELD(initPruningOutputs);
+ READ_INT_ARRAY(ipoIndexes, local_node->numPlanNodes);
+
+ READ_DONE();
+}
+
+/*
+ * _readPlanInitPruningOutput
+ */
+static PlanInitPruningOutput *
+_readPlanInitPruningOutput(void)
+{
+ READ_LOCALS(PlanInitPruningOutput);
+
+ READ_BITMAPSET_FIELD(initially_valid_subplans);
+
+ READ_DONE();
+}
+
/*
* parseNodeString
*
@@ -2974,6 +3008,10 @@ parseNodeString(void)
return_value = _readPartitionBoundSpec();
else if (MATCH("PARTITIONRANGEDATUM", 19))
return_value = _readPartitionRangeDatum();
+ else if (MATCH("EXECLOCKRELSINFO", 16))
+ return_value = _readExecLockRelsInfo();
+ else if (MATCH("PARTITIONINITPRUNINGOUTPUT", 26))
+ return_value = _readPlanInitPruningOutput();
else
{
elog(ERROR, "badly formatted node string \"%.32s\"...", token);
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index bd09f85aea..9e41bbd228 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -517,8 +517,11 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
result->transientPlan = glob->transientPlan;
result->dependsOnRole = glob->dependsOnRole;
result->parallelModeNeeded = glob->parallelModeNeeded;
+ result->containsInitialPruning = glob->containsInitialPruning;
result->planTree = top_plan;
+ result->numPlanNodes = glob->lastPlanNodeId;
result->rtable = glob->finalrtable;
+ result->lockrels = glob->lockrels;
result->resultRelations = glob->resultRelations;
result->appendRelations = glob->appendRelations;
result->subplans = glob->subplans;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index a7b11b7f03..cee8c570fd 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -483,6 +483,7 @@ static void
add_rte_to_flat_rtable(PlannerGlobal *glob, RangeTblEntry *rte)
{
RangeTblEntry *newrte;
+ Index rti = list_length(glob->finalrtable) + 1;
/* flat copy to duplicate all the scalar fields */
newrte = (RangeTblEntry *) palloc(sizeof(RangeTblEntry));
@@ -517,7 +518,10 @@ add_rte_to_flat_rtable(PlannerGlobal *glob, RangeTblEntry *rte)
* but it would probably cost more cycles than it would save.
*/
if (newrte->rtekind == RTE_RELATION)
+ {
+ glob->lockrels = bms_add_member(glob->lockrels, rti);
glob->relationOids = lappend_oid(glob->relationOids, newrte->relid);
+ }
}
/*
@@ -1548,6 +1552,9 @@ set_append_references(PlannerInfo *root,
pinfo->rtindex += rtoffset;
}
}
+
+ if (aplan->part_prune_info->needs_init_pruning)
+ root->glob->containsInitialPruning = true;
}
/* We don't need to recurse to lefttree or righttree ... */
@@ -1620,6 +1627,9 @@ set_mergeappend_references(PlannerInfo *root,
pinfo->rtindex += rtoffset;
}
}
+
+ if (mplan->part_prune_info->needs_init_pruning)
+ root->glob->containsInitialPruning = true;
}
/* We don't need to recurse to lefttree or righttree ... */
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 7080cb25d9..3322dc79f2 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -144,7 +144,9 @@ static List *make_partitionedrel_pruneinfo(PlannerInfo *root,
List *prunequal,
Bitmapset *partrelids,
int *relid_subplan_map,
- Bitmapset **matchedsubplans);
+ Bitmapset **matchedsubplans,
+ bool *needs_init_pruning,
+ bool *needs_exec_pruning);
static void gen_partprune_steps(RelOptInfo *rel, List *clauses,
PartClauseTarget target,
GeneratePruningStepsContext *context);
@@ -230,6 +232,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int *relid_subplan_map;
ListCell *lc;
int i;
+ bool needs_init_pruning = false;
+ bool needs_exec_pruning = false;
/*
* Scan the subpaths to see which ones are scans of partition child
@@ -309,12 +313,16 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
Bitmapset *partrelids = (Bitmapset *) lfirst(lc);
List *pinfolist;
Bitmapset *matchedsubplans = NULL;
+ bool partrel_needs_init_pruning;
+ bool partrel_needs_exec_pruning;
pinfolist = make_partitionedrel_pruneinfo(root, parentrel,
prunequal,
partrelids,
relid_subplan_map,
- &matchedsubplans);
+ &matchedsubplans,
+ &partrel_needs_init_pruning,
+ &partrel_needs_exec_pruning);
/* When pruning is possible, record the matched subplans */
if (pinfolist != NIL)
@@ -323,6 +331,10 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
allmatchedsubplans = bms_join(matchedsubplans,
allmatchedsubplans);
}
+ if (!needs_init_pruning)
+ needs_init_pruning = partrel_needs_init_pruning;
+ if (!needs_exec_pruning)
+ needs_exec_pruning = partrel_needs_exec_pruning;
}
pfree(relid_subplan_map);
@@ -337,6 +349,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
/* Else build the result data structure */
pruneinfo = makeNode(PartitionPruneInfo);
pruneinfo->prune_infos = prunerelinfos;
+ pruneinfo->needs_init_pruning = needs_init_pruning;
+ pruneinfo->needs_exec_pruning = needs_exec_pruning;
/*
* Some subplans may not belong to any of the identified partitioned rels.
@@ -435,13 +449,18 @@ add_part_relids(List *allpartrelids, Bitmapset *partrelids)
* If we cannot find any useful run-time pruning steps, return NIL.
* However, on success, each rel identified in partrelids will have
* an element in the result list, even if some of them are useless.
+ * *needs_init_pruning and *needs_exec_pruning are set to indicate whether
+ * the returned PartitionedRelPruneInfos contain pruning steps that can be
+ * performed before and after execution begins, respectively.
*/
static List *
make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
List *prunequal,
Bitmapset *partrelids,
int *relid_subplan_map,
- Bitmapset **matchedsubplans)
+ Bitmapset **matchedsubplans,
+ bool *needs_init_pruning,
+ bool *needs_exec_pruning)
{
RelOptInfo *targetpart = NULL;
List *pinfolist = NIL;
@@ -452,6 +471,10 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int rti;
int i;
+ /* Will find out below. */
+ *needs_init_pruning = false;
+ *needs_exec_pruning = false;
+
/*
* Examine each partitioned rel, constructing a temporary array to map
* from planner relids to index of the partitioned rel, and building a
@@ -539,6 +562,9 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
* executor per-scan pruning steps. This first pass creates startup
* pruning steps and detects whether there's any possibly-useful quals
* that would require per-scan pruning.
+ *
+ * In the first pass, we determine whether the 2nd pass is necessary
+ * by noting the presence of EXEC parameters.
*/
gen_partprune_steps(subpart, partprunequal, PARTTARGET_INITIAL,
&context);
@@ -613,6 +639,11 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
pinfo->execparamids = execparamids;
/* Remaining fields will be filled in the next loop */
+ if (!*needs_init_pruning)
+ *needs_init_pruning = (initial_pruning_steps != NIL);
+ if (!*needs_exec_pruning)
+ *needs_exec_pruning = (exec_pruning_steps != NIL);
+
pinfolist = lappend(pinfolist, pinfo);
}
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index ba2fcfeb4a..085eb3f209 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -945,15 +945,17 @@ pg_plan_query(Query *querytree, const char *query_string, int cursorOptions,
* For normal optimizable statements, invoke the planner. For utility
* statements, just make a wrapper PlannedStmt node.
*
- * The result is a list of PlannedStmt nodes.
+ * The result is a list of PlannedStmt nodes. Also, a NULL is appended to
+ * *execlockrelsinfo_list for each PlannedStmt added to the returned list.
*/
List *
pg_plan_queries(List *querytrees, const char *query_string, int cursorOptions,
- ParamListInfo boundParams)
+ ParamListInfo boundParams, List **execlockrelsinfo_list)
{
List *stmt_list = NIL;
ListCell *query_list;
+ *execlockrelsinfo_list = NIL;
foreach(query_list, querytrees)
{
Query *query = lfirst_node(Query, query_list);
@@ -977,6 +979,7 @@ pg_plan_queries(List *querytrees, const char *query_string, int cursorOptions,
}
stmt_list = lappend(stmt_list, stmt);
+ *execlockrelsinfo_list = lappend(*execlockrelsinfo_list, NULL);
}
return stmt_list;
@@ -1080,7 +1083,8 @@ exec_simple_query(const char *query_string)
QueryCompletion qc;
MemoryContext per_parsetree_context = NULL;
List *querytree_list,
- *plantree_list;
+ *plantree_list,
+ *plantree_execlockrelsinfo_list;
Portal portal;
DestReceiver *receiver;
int16 format;
@@ -1167,7 +1171,8 @@ exec_simple_query(const char *query_string)
NULL, 0, NULL);
plantree_list = pg_plan_queries(querytree_list, query_string,
- CURSOR_OPT_PARALLEL_OK, NULL);
+ CURSOR_OPT_PARALLEL_OK, NULL,
+ &plantree_execlockrelsinfo_list);
/*
* Done with the snapshot used for parsing/planning.
@@ -1203,6 +1208,7 @@ exec_simple_query(const char *query_string)
query_string,
commandTag,
plantree_list,
+ plantree_execlockrelsinfo_list,
NULL);
/*
@@ -1991,6 +1997,7 @@ exec_bind_message(StringInfo input_message)
query_string,
psrc->commandTag,
cplan->stmt_list,
+ cplan->execlockrelsinfo_list,
cplan);
/* Done with the snapshot used for parameter I/O and parsing/planning */
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 5f907831a3..972ddc014e 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -35,7 +35,7 @@
Portal ActivePortal = NULL;
-static void ProcessQuery(PlannedStmt *plan,
+static void ProcessQuery(PlannedStmt *plan, ExecLockRelsInfo *execlockrelsinfo,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -65,6 +65,7 @@ static void DoPortalRewind(Portal portal);
*/
QueryDesc *
CreateQueryDesc(PlannedStmt *plannedstmt,
+ ExecLockRelsInfo *execlockrelsinfo,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
@@ -77,6 +78,7 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
qd->operation = plannedstmt->commandType; /* operation */
qd->plannedstmt = plannedstmt; /* plan */
+ qd->execlockrelsinfo = execlockrelsinfo; /* ExecutorGetLockRels() output for plan */
qd->sourceText = sourceText; /* query text */
qd->snapshot = RegisterSnapshot(snapshot); /* snapshot */
/* RI check snapshot */
@@ -122,6 +124,7 @@ FreeQueryDesc(QueryDesc *qdesc)
* PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
*
* plan: the plan tree for the query
+ * execlockrelsinfo: ExecutorGetLockRels() output for the plan tree
* sourceText: the source text of the query
* params: any parameters needed
* dest: where to send results
@@ -134,6 +137,7 @@ FreeQueryDesc(QueryDesc *qdesc)
*/
static void
ProcessQuery(PlannedStmt *plan,
+ ExecLockRelsInfo *execlockrelsinfo,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -145,7 +149,7 @@ ProcessQuery(PlannedStmt *plan,
/*
* Create the QueryDesc object
*/
- queryDesc = CreateQueryDesc(plan, sourceText,
+ queryDesc = CreateQueryDesc(plan, execlockrelsinfo, sourceText,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
@@ -490,6 +494,7 @@ PortalStart(Portal portal, ParamListInfo params,
* the destination to DestNone.
*/
queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+ linitial_node(ExecLockRelsInfo, portal->execlockrelsinfos),
portal->sourceText,
GetActiveSnapshot(),
InvalidSnapshot,
@@ -1190,7 +1195,8 @@ PortalRunMulti(Portal portal,
QueryCompletion *qc)
{
bool active_snapshot_set = false;
- ListCell *stmtlist_item;
+ ListCell *stmtlist_item,
+ *execlockrelsinfolist_item;
/*
* If the destination is DestRemoteExecute, change to DestNone. The
@@ -1211,9 +1217,12 @@ PortalRunMulti(Portal portal,
* Loop to handle the individual queries generated from a single parsetree
* by analysis and rewrite.
*/
- foreach(stmtlist_item, portal->stmts)
+ forboth(stmtlist_item, portal->stmts,
+ execlockrelsinfolist_item, portal->execlockrelsinfos)
{
PlannedStmt *pstmt = lfirst_node(PlannedStmt, stmtlist_item);
+ ExecLockRelsInfo *execlockrelsinfo = lfirst_node(ExecLockRelsInfo,
+ execlockrelsinfolist_item);
/*
* If we got a cancel signal in prior command, quit
@@ -1271,7 +1280,7 @@ PortalRunMulti(Portal portal,
if (pstmt->canSetTag)
{
/* statement can set tag string */
- ProcessQuery(pstmt,
+ ProcessQuery(pstmt, execlockrelsinfo,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
@@ -1280,7 +1289,7 @@ PortalRunMulti(Portal portal,
else
{
/* stmt added by rewrite cannot set tag */
- ProcessQuery(pstmt,
+ ProcessQuery(pstmt, execlockrelsinfo,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 4cf6db504f..c40a6f19d6 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -99,14 +99,15 @@ static dlist_head cached_expression_list = DLIST_STATIC_INIT(cached_expression_l
static void ReleaseGenericPlan(CachedPlanSource *plansource);
static List *RevalidateCachedQuery(CachedPlanSource *plansource,
QueryEnvironment *queryEnv);
-static bool CheckCachedPlan(CachedPlanSource *plansource);
+static bool CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams);
static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
ParamListInfo boundParams, QueryEnvironment *queryEnv);
static bool choose_custom_plan(CachedPlanSource *plansource,
ParamListInfo boundParams);
static double cached_plan_cost(CachedPlan *plan, bool include_planner);
static Query *QueryListGetPrimaryStmt(List *stmts);
-static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static List *AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams);
+static void ReleaseExecutorLocks(List *stmt_list, List *execlockrelsinfo_list);
static void AcquirePlannerLocks(List *stmt_list, bool acquire);
static void ScanQueryForLocks(Query *parsetree, bool acquire);
static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -782,6 +783,47 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
return tlist;
}
+/*
+ * CachedPlanSaveExecLockRelsInfos
+ * Save the list containing ExecLockRelsInfo nodes in the given CachedPlan
+ *
+ * The provided list is copied into a dedicated context that is a child of
+ * plan->context.
+ */
+static void
+CachedPlanSaveExecLockRelsInfos(CachedPlan *plan, List *execlockrelsinfo_list)
+{
+ MemoryContext execlockrelsinfo_context = plan->execlockrelsinfo_context,
+ oldcontext = CurrentMemoryContext;
+ List *execlockrelsinfo_list_copy;
+
+ /*
+ * Set up the dedicated context if not already done, saving it as a child
+ * of the CachedPlan's context.
+ */
+ if (execlockrelsinfo_context == NULL)
+ {
+ execlockrelsinfo_context = AllocSetContextCreate(CurrentMemoryContext,
+ "CachedPlan execlockrelsinfo list",
+ ALLOCSET_START_SMALL_SIZES);
+ MemoryContextSetParent(execlockrelsinfo_context, plan->context);
+ MemoryContextSetIdentifier(execlockrelsinfo_context, plan->context->ident);
+ plan->execlockrelsinfo_context = execlockrelsinfo_context;
+ }
+ else
+ {
+ /* Just clear existing contents by resetting the context. */
+ Assert(MemoryContextIsValid(execlockrelsinfo_context));
+ MemoryContextReset(execlockrelsinfo_context);
+ }
+
+ MemoryContextSwitchTo(execlockrelsinfo_context);
+ execlockrelsinfo_list_copy = copyObject(execlockrelsinfo_list);
+ MemoryContextSwitchTo(oldcontext);
+
+ plan->execlockrelsinfo_list = execlockrelsinfo_list_copy;
+}
+
/*
* CheckCachedPlan: see if the CachedPlanSource's generic plan is valid.
*
@@ -790,9 +832,17 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
*
* On a "true" return, we have acquired the locks needed to run the plan.
* (We must do this for the "true" result to be race-condition-free.)
+ *
+ * If the CachedPlan is valid, this calls ExecutorGetLockRels on each
+ * PlannedStmt contained in it to determine the set of relations to lock by
+ * AcquireExecutorLocks(). Resulting ExecLockRelsInfo nodes, allocated in a
+ * child context of the context containing the plan itself, are added into
+ * plan->execlockrelsinfo_list. ExecLockRelsInfo nodes that may be present
+ * in the list from the last invocation of CheckCachedPlan() on the same
+ * CachedPlan are deleted.
*/
static bool
-CheckCachedPlan(CachedPlanSource *plansource)
+CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams)
{
CachedPlan *plan = plansource->gplan;
@@ -820,13 +870,22 @@ CheckCachedPlan(CachedPlanSource *plansource)
*/
if (plan->is_valid)
{
+ List *execlockrelsinfo_list;
+
/*
* Plan must have positive refcount because it is referenced by
* plansource; so no need to fear it disappears under us here.
*/
Assert(plan->refcount > 0);
- AcquireExecutorLocks(plan->stmt_list, true);
+ /*
+ * Lock relations scanned by the plan. This also invokes
+ * ExecutorGetLockRels() to do initial partition pruning on the plan
+ * tree iff some nodes in it are marked as needing it. Relations whose
+ * scan nodes are pruned as a result of that are not locked here.
+ */
+ execlockrelsinfo_list = AcquireExecutorLocks(plan->stmt_list,
+ boundParams);
/*
* If plan was transient, check to see if TransactionXmin has
@@ -844,11 +903,14 @@ CheckCachedPlan(CachedPlanSource *plansource)
if (plan->is_valid)
{
/* Successfully revalidated and locked the query. */
+
+ /* Remember ExecLockRelsInfos in the CachedPlan. */
+ CachedPlanSaveExecLockRelsInfos(plan, execlockrelsinfo_list);
return true;
}
/* Oops, the race case happened. Release useless locks. */
- AcquireExecutorLocks(plan->stmt_list, false);
+ ReleaseExecutorLocks(plan->stmt_list, execlockrelsinfo_list);
}
/*
@@ -880,7 +942,8 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
ParamListInfo boundParams, QueryEnvironment *queryEnv)
{
CachedPlan *plan;
- List *plist;
+ List *plist,
+ *execlockrelsinfo_list;
bool snapshot_set;
bool is_transient;
MemoryContext plan_context;
@@ -933,7 +996,8 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
* Generate the plan.
*/
plist = pg_plan_queries(qlist, plansource->query_string,
- plansource->cursor_options, boundParams);
+ plansource->cursor_options, boundParams,
+ &execlockrelsinfo_list);
/* Release snapshot if we got one */
if (snapshot_set)
@@ -1002,6 +1066,11 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
plan->is_saved = false;
plan->is_valid = true;
+ /* Save the dummy ExecLockRelsInfo list. */
+ plan->execlockrelsinfo_context = NULL;
+ CachedPlanSaveExecLockRelsInfos(plan, execlockrelsinfo_list);
+ Assert(MemoryContextIsValid(plan->execlockrelsinfo_context));
+
/* assign generation number to new plan */
plan->generation = ++(plansource->generation);
@@ -1160,7 +1229,7 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
if (!customplan)
{
- if (CheckCachedPlan(plansource))
+ if (CheckCachedPlan(plansource, boundParams))
{
/* We want a generic plan, and we already have a valid one */
plan = plansource->gplan;
@@ -1366,7 +1435,6 @@ CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
foreach(lc, plan->stmt_list)
{
PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc);
- ListCell *lc2;
if (plannedstmt->commandType == CMD_UTILITY)
return false;
@@ -1375,13 +1443,8 @@ CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
* We have to grovel through the rtable because it's likely to contain
* an RTE_RESULT relation, rather than being totally empty.
*/
- foreach(lc2, plannedstmt->rtable)
- {
- RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
-
- if (rte->rtekind == RTE_RELATION)
- return false;
- }
+ if (!bms_is_empty(plannedstmt->lockrels))
+ return false;
}
/*
@@ -1737,17 +1800,22 @@ QueryListGetPrimaryStmt(List *stmts)
/*
* AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
- * or release them if acquire is false.
+ *
+ * Returns a list of ExecLockRelsInfo nodes containing one element for each
+ * PlannedStmt in stmt_list; an element is NULL when its PlannedStmt is a
+ * utility statement or has containsInitialPruning set to false.
*/
-static void
-AcquireExecutorLocks(List *stmt_list, bool acquire)
+static List *
+AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams)
{
ListCell *lc1;
+ List *execlockrelsinfo_list = NIL;
foreach(lc1, stmt_list)
{
PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
- ListCell *lc2;
+ ExecLockRelsInfo *execlockrelsinfo = NULL;
+ int rti;
if (plannedstmt->commandType == CMD_UTILITY)
{
@@ -1761,27 +1829,113 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
Query *query = UtilityContainsQuery(plannedstmt->utilityStmt);
if (query)
- ScanQueryForLocks(query, acquire);
- continue;
+ ScanQueryForLocks(query, true);
}
-
- foreach(lc2, plannedstmt->rtable)
+ else
{
- RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
-
- if (rte->rtekind != RTE_RELATION)
- continue;
+ Bitmapset *lockrels;
/*
- * Acquire the appropriate type of lock on each relation OID. Note
- * that we don't actually try to open the rel, and hence will not
- * fail if it's been dropped entirely --- we'll just transiently
- * acquire a non-conflicting lock.
+ * Figure out the set of relations that would need to be locked
+ * before executing the plan.
*/
- if (acquire)
+ if (!plannedstmt->containsInitialPruning)
+ {
+ /*
+ * If the plan contains no initial pruning steps, the executor
+ * would just need to lock whatever relations the planner would
+ * have locked to make the plan.
+ */
+ lockrels = plannedstmt->lockrels;
+ }
+ else
+ {
+ /*
+ * Ask the executor to perform initial pruning steps to skip
+ * relations that are pruned away.
+ */
+ execlockrelsinfo = ExecutorGetLockRels(plannedstmt, boundParams);
+ lockrels = execlockrelsinfo->lockrels;
+ }
+
+ rti = -1;
+ while ((rti = bms_next_member(lockrels, rti)) >= 0)
+ {
+ RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
+
+ Assert(rte->rtekind == RTE_RELATION);
+
+ /*
+ * Acquire the appropriate type of lock on each relation OID.
+ * Note that we don't actually try to open the rel, and hence
+ * will not fail if it's been dropped entirely --- we'll just
+ * transiently acquire a non-conflicting lock.
+ */
LockRelationOid(rte->relid, rte->rellockmode);
+ }
+ }
+
+ /*
+ * Remember ExecLockRelsInfo for later adding to the QueryDesc that
+ * will be passed to the executor when executing this plan. May be
+ * NULL, but must keep the list the same length as stmt_list.
+ */
+ execlockrelsinfo_list = lappend(execlockrelsinfo_list,
+ execlockrelsinfo);
+ }
+
+ return execlockrelsinfo_list;
+}
+
+/*
+ * ReleaseExecutorLocks
+ * Release locks that would've been acquired by an earlier call to
+ * AcquireExecutorLocks()
+ */
+static void
+ReleaseExecutorLocks(List *stmt_list, List *execlockrelsinfo_list)
+{
+ ListCell *lc1,
+ *lc2;
+
+ forboth(lc1, stmt_list, lc2, execlockrelsinfo_list)
+ {
+ PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
+ ExecLockRelsInfo *execlockrelsinfo = lfirst_node(ExecLockRelsInfo, lc2);
+ int rti;
+
+ if (plannedstmt->commandType == CMD_UTILITY)
+ {
+ /*
+ * Ignore utility statements, except those (such as EXPLAIN) that
+ * contain a parsed-but-not-planned query. Note: it's okay to use
+ * ScanQueryForLocks, even though the query hasn't been through
+ * rule rewriting, because rewriting doesn't change the query
+ * representation.
+ */
+ Query *query = UtilityContainsQuery(plannedstmt->utilityStmt);
+
+ if (query)
+ ScanQueryForLocks(query, false);
+ }
+ else
+ {
+ Bitmapset *lockrels;
+
+ if (execlockrelsinfo == NULL)
+ lockrels = plannedstmt->lockrels;
else
+ lockrels = execlockrelsinfo->lockrels;
+
+ rti = -1;
+ while ((rti = bms_next_member(lockrels, rti)) >= 0)
+ {
+ RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
+
+ Assert(rte->rtekind == RTE_RELATION);
+
UnlockRelationOid(rte->relid, rte->rellockmode);
+ }
}
}
}
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index d549f66d4a..896f51be08 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -285,6 +285,7 @@ PortalDefineQuery(Portal portal,
const char *sourceText,
CommandTag commandTag,
List *stmts,
+ List *execlockrelsinfos,
CachedPlan *cplan)
{
AssertArg(PortalIsValid(portal));
@@ -299,6 +300,7 @@ PortalDefineQuery(Portal portal,
portal->qc.nprocessed = 0;
portal->commandTag = commandTag;
portal->stmts = stmts;
+ portal->execlockrelsinfos = execlockrelsinfos;
portal->cplan = cplan;
portal->status = PORTAL_DEFINED;
}
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 666977fb1f..fef75ba147 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -87,7 +87,8 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
ExplainState *es, const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv);
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt, ExecLockRelsInfo *execlockrelsinfo,
+ IntoClause *into,
ExplainState *es, const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv,
const instr_time *planduration,
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index fd5735a946..ded19b8cbb 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -124,4 +124,6 @@ extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
PartitionPruneInfo *pruneinfo,
Bitmapset **initially_valid_subplans);
extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate);
+extern Bitmapset *ExecGetLockRelsDoInitialPruning(Plan *plan, ExecGetLockRelsContext *context,
+ PartitionPruneInfo *pruneinfo);
#endif /* EXECPARTITION_H */
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index e79e2c001f..4338463479 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -35,6 +35,7 @@ typedef struct QueryDesc
/* These fields are provided by CreateQueryDesc */
CmdType operation; /* CMD_SELECT, CMD_UPDATE, etc. */
PlannedStmt *plannedstmt; /* planner's output (could be utility, too) */
+ ExecLockRelsInfo *execlockrelsinfo; /* ExecutorGetLockRels()'s output given plannedstmt */
const char *sourceText; /* source text of the query */
Snapshot snapshot; /* snapshot to use for query */
Snapshot crosscheck_snapshot; /* crosscheck for RI update/delete */
@@ -57,6 +58,7 @@ typedef struct QueryDesc
/* in pquery.c */
extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
+ ExecLockRelsInfo *execlockrelsinfo,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index 344399f6a8..5959d67221 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -185,6 +185,8 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
/*
* prototypes from functions in execMain.c
*/
+extern ExecLockRelsInfo *ExecutorGetLockRels(PlannedStmt *plannedstmt, ParamListInfo params);
+extern bool ExecGetLockRels(Plan *node, ExecGetLockRelsContext *context);
extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
extern void ExecutorRun(QueryDesc *queryDesc,
diff --git a/src/include/executor/nodeAppend.h b/src/include/executor/nodeAppend.h
index 4cb78ee5b6..b53535c2a4 100644
--- a/src/include/executor/nodeAppend.h
+++ b/src/include/executor/nodeAppend.h
@@ -17,6 +17,7 @@
#include "access/parallel.h"
#include "nodes/execnodes.h"
+extern bool ExecGetAppendLockRels(Append *node, ExecGetLockRelsContext *context);
extern AppendState *ExecInitAppend(Append *node, EState *estate, int eflags);
extern void ExecEndAppend(AppendState *node);
extern void ExecReScanAppend(AppendState *node);
diff --git a/src/include/executor/nodeMergeAppend.h b/src/include/executor/nodeMergeAppend.h
index 97fe3b0665..8eb4e9df93 100644
--- a/src/include/executor/nodeMergeAppend.h
+++ b/src/include/executor/nodeMergeAppend.h
@@ -16,6 +16,7 @@
#include "nodes/execnodes.h"
+extern bool ExecGetMergeAppendLockRels(MergeAppend *node, ExecGetLockRelsContext *context);
extern MergeAppendState *ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags);
extern void ExecEndMergeAppend(MergeAppendState *node);
extern void ExecReScanMergeAppend(MergeAppendState *node);
diff --git a/src/include/executor/nodeModifyTable.h b/src/include/executor/nodeModifyTable.h
index 1d225bc88d..5006499088 100644
--- a/src/include/executor/nodeModifyTable.h
+++ b/src/include/executor/nodeModifyTable.h
@@ -19,6 +19,7 @@ extern void ExecComputeStoredGenerated(ResultRelInfo *resultRelInfo,
EState *estate, TupleTableSlot *slot,
CmdType cmdtype);
+extern bool ExecGetModifyTableLockRels(ModifyTable *plan, ExecGetLockRelsContext *context);
extern ModifyTableState *ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags);
extern void ExecEndModifyTable(ModifyTableState *node);
extern void ExecReScanModifyTable(ModifyTableState *node);
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index dd95dc40c7..718603d400 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -570,6 +570,7 @@ typedef struct EState
struct ExecRowMark **es_rowmarks; /* Array of per-range-table-entry
* ExecRowMarks, or NULL if none */
PlannedStmt *es_plannedstmt; /* link to top of plan tree */
+ struct ExecLockRelsInfo *es_execlockrelsinfo; /* QueryDesc.execlockrelsinfo */
const char *es_sourceText; /* Source text from QueryDesc */
JunkFilter *es_junkFilter; /* top-level junk filter, if any */
@@ -958,6 +959,92 @@ typedef struct DomainConstraintState
*/
typedef TupleTableSlot *(*ExecProcNodeMtd) (struct PlanState *pstate);
+/*----------------
+ * ExecLockRelsInfo
+ *
+ * Result of performing ExecutorGetLockRels() for a given PlannedStmt
+ */
+typedef struct ExecLockRelsInfo
+{
+ NodeTag type;
+
+ /*
+ * Relations that must be locked to execute the plan tree contained in
+ * the PlannedStmt.
+ */
+ Bitmapset *lockrels;
+
+ /* PlannedStmt.numPlanNodes */
+ int numPlanNodes;
+
+ /*
+ * List of PlanInitPruningOutput, each representing the output of
+ * performing initial pruning on a given plan node, for all nodes in the
+ * plan tree that have been marked as needing initial pruning.
+ *
+ * 'ipoIndexes' is an array of 'numPlanNodes' elements, indexed with
+ * plan_node_id of the individual nodes in the plan tree, each a 1-based
+ * index into 'initPruningOutputs' list for a given plan node. 0 means
+ * that a given plan node has no entry in the list because of not needing
+ * any initial pruning done on it.
+ */
+ List *initPruningOutputs;
+ int *ipoIndexes;
+} ExecLockRelsInfo;
+
+/*----------------
+ * ExecGetLockRelsContext
+ *
+ * Context information for performing ExecutorGetLockRels() on a given plan
+ */
+typedef struct ExecGetLockRelsContext
+{
+ NodeTag type;
+
+ PlannedStmt *stmt; /* target plan */
+ ParamListInfo params; /* EXTERN parameters to prune with */
+
+ /* Output parameters for ExecGetLockRels and its subroutines. */
+ Bitmapset *lockrels;
+
+ /* See above comment. */
+ List *initPruningOutputs;
+ int *ipoIndexes;
+} ExecGetLockRelsContext;
+
+#define ExecStorePlanInitPruningOutput(prepcxt, initPruningOutput, plannode) \
+ do { \
+ (prepcxt)->initPruningOutputs = lappend((prepcxt)->initPruningOutputs, initPruningOutput); \
+ (prepcxt)->ipoIndexes[(plannode)->plan_node_id] = list_length((prepcxt)->initPruningOutputs); \
+ } while (0)
+
+#define ExecFetchPlanInitPruningOutput(prepres, plannode) \
+ (((prepres) != NULL && (prepres)->initPruningOutputs != NIL) ? \
+ list_nth((prepres)->initPruningOutputs, \
+ (prepres)->ipoIndexes[(plannode)->plan_node_id] - 1) : NULL)
+
+/* ---------------
+ * PlanInitPruningOutput
+ *
+ * Node to remember the result of performing initial partition pruning steps
+ * during ExecutorGetLockRels() on nodes that support pruning.
+ *
+ * ExecGetLockRelsDoInitialPruning(), which runs during ExecutorGetLockRels(),
+ * creates it and stores it in the corresponding ExecLockRelsInfo.
+ *
+ * ExecInitPartitionPruning(), which runs during ExecutorStart(), fetches it
+ * from the EState's ExecLockRelsInfo (if any) and uses the value of
+ * initially_valid_subplans contained in it as-is to select the subplans to be
+ * initialized for execution, instead of re-evaluating that by performing
+ * initial pruning again.
+ */
+typedef struct PlanInitPruningOutput
+{
+ NodeTag type;
+
+ Bitmapset *initially_valid_subplans;
+} PlanInitPruningOutput;
+
/* ----------------
* PlanState node
*
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 5d075f0c34..d365fc4402 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -96,6 +96,11 @@ typedef enum NodeTag
T_PartitionPruneStepCombine,
T_PlanInvalItem,
+ /* TAGS FOR EXECUTOR PREP NODES (execnodes.h) */
+ T_ExecGetLockRelsContext,
+ T_ExecLockRelsInfo,
+ T_PlanInitPruningOutput,
+
/*
* TAGS FOR PLAN STATE NODES (execnodes.h)
*
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 1f3845b3fe..96c652ebaf 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -101,6 +101,9 @@ typedef struct PlannerGlobal
List *finalrtable; /* "flat" rangetable for executor */
+ Bitmapset *lockrels; /* Indexes of RTE_RELATION entries in range
+ * table */
+
List *finalrowmarks; /* "flat" list of PlanRowMarks */
List *resultRelations; /* "flat" list of integer RT indexes */
@@ -129,6 +132,10 @@ typedef struct PlannerGlobal
char maxParallelHazard; /* worst PROPARALLEL hazard level */
+ bool containsInitialPruning; /* Do some Plan nodes in the tree
+ * have initial (pre-exec) pruning
+ * steps? */
+
PartitionDirectory partition_directory; /* partition descriptors */
} PlannerGlobal;
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 0b518ce6b2..5a8c34bdf6 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -59,12 +59,21 @@ typedef struct PlannedStmt
bool parallelModeNeeded; /* parallel mode required to execute? */
+ bool containsInitialPruning; /* Do some Plan nodes in the tree
+ * have initial (pre-exec) pruning
+ * steps? */
+
int jitFlags; /* which forms of JIT should be performed */
struct Plan *planTree; /* tree of Plan nodes */
+ int numPlanNodes; /* number of nodes in planTree */
+
List *rtable; /* list of RangeTblEntry nodes */
+ Bitmapset *lockrels; /* Indexes of RTE_RELATION entries in range
+ * table */
+
/* rtable indexes of target relations for INSERT/UPDATE/DELETE */
List *resultRelations; /* integer list of RT indexes, or NIL */
@@ -1172,6 +1181,13 @@ typedef struct PlanRowMark
* prune_infos List of Lists containing PartitionedRelPruneInfo nodes,
* one sublist per run-time-prunable partition hierarchy
* appearing in the parent plan node's subplans.
+ *
+ * needs_init_pruning Does any of the PartitionedRelPruneInfos in
+ * prune_infos have its initial_pruning_steps set?
+ *
+ * needs_exec_pruning Does any of the PartitionedRelPruneInfos in
+ * prune_infos have its exec_pruning_steps set?
+ *
* other_subplans Indexes of any subplans that are not accounted for
* by any of the PartitionedRelPruneInfo nodes in
* "prune_infos". These subplans must not be pruned.
@@ -1180,6 +1196,8 @@ typedef struct PartitionPruneInfo
{
NodeTag type;
List *prune_infos;
+ bool needs_init_pruning;
+ bool needs_exec_pruning;
Bitmapset *other_subplans;
} PartitionPruneInfo;
diff --git a/src/include/tcop/tcopprot.h b/src/include/tcop/tcopprot.h
index 92291a750d..bf80c53bed 100644
--- a/src/include/tcop/tcopprot.h
+++ b/src/include/tcop/tcopprot.h
@@ -64,7 +64,7 @@ extern PlannedStmt *pg_plan_query(Query *querytree, const char *query_string,
ParamListInfo boundParams);
extern List *pg_plan_queries(List *querytrees, const char *query_string,
int cursorOptions,
- ParamListInfo boundParams);
+ ParamListInfo boundParams, List **execlockrelsinfo_list);
extern bool check_max_stack_depth(int *newval, void **extra, GucSource source);
extern void assign_max_stack_depth(int newval, void *extra);
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 95b99e3d25..2a847f54da 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -148,6 +148,9 @@ typedef struct CachedPlan
{
int magic; /* should equal CACHEDPLAN_MAGIC */
List *stmt_list; /* list of PlannedStmts */
+ List *execlockrelsinfo_list; /* list of ExecLockRelsInfo nodes with one
+ * element for each of stmt_list; NIL
+ * if not a generic plan */
bool is_oneshot; /* is it a "oneshot" plan? */
bool is_saved; /* is CachedPlan in a long-lived context? */
bool is_valid; /* is the stmt_list currently valid? */
@@ -158,6 +161,8 @@ typedef struct CachedPlan
int generation; /* parent's generation number for this plan */
int refcount; /* count of live references to this struct */
MemoryContext context; /* context containing this CachedPlan */
+ MemoryContext execlockrelsinfo_context; /* context containing execlockrelsinfo_list,
+ * a child of the above context */
} CachedPlan;
/*
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index aeddbdafe5..9abace6734 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -137,6 +137,10 @@ typedef struct PortalData
CommandTag commandTag; /* command tag for original query */
QueryCompletion qc; /* command completion data for executed query */
List *stmts; /* list of PlannedStmts */
+ List *execlockrelsinfos; /* list of ExecLockRelsInfo nodes with one element
+ * for each of 'stmts'; same as
+ * cplan->execlockrelsinfo_list if cplan is
+ * not NULL */
CachedPlan *cplan; /* CachedPlan, if stmts are from one */
ParamListInfo portalParams; /* params to pass to query */
@@ -241,6 +245,7 @@ extern void PortalDefineQuery(Portal portal,
const char *sourceText,
CommandTag commandTag,
List *stmts,
+ List *execlockrelsinfos,
CachedPlan *cplan);
extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
extern void PortalCreateHoldStore(Portal portal);
--
2.24.1
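The 1-based ipoIndexes scheme described in the ExecLockRelsInfo comment above can be illustrated with a small standalone sketch. It is not code from the patch: ToyLockRelsInfo, toy_store_output and toy_fetch_output are hypothetical names, and a fixed-size array stands in for the initPruningOutputs List. Unlike the ExecFetchPlanInitPruningOutput macro as posted, the sketch checks explicitly for an index of 0 (no entry for the node) before subtracting 1.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_OUTPUTS 8

/* Hypothetical stand-in for ExecLockRelsInfo; not the patch's struct. */
typedef struct ToyLockRelsInfo
{
	int			num_plan_nodes;	/* plays the role of numPlanNodes */
	int			num_outputs;	/* number of entries used in outputs[] */
	const void *outputs[TOY_MAX_OUTPUTS];	/* like initPruningOutputs */
	int		   *ipo_indexes;	/* like ipoIndexes; 0 means "no entry" */
} ToyLockRelsInfo;

/* Append an output and record its 1-based position under plan_node_id. */
static void
toy_store_output(ToyLockRelsInfo *info, int plan_node_id, const void *output)
{
	assert(plan_node_id >= 0 && plan_node_id < info->num_plan_nodes);
	assert(info->num_outputs < TOY_MAX_OUTPUTS);

	info->outputs[info->num_outputs++] = output;
	info->ipo_indexes[plan_node_id] = info->num_outputs;
}

/* Return the output stored for a node, or NULL if it has no entry. */
static const void *
toy_fetch_output(const ToyLockRelsInfo *info, int plan_node_id)
{
	int			idx;

	if (info == NULL || info->num_outputs == 0)
		return NULL;

	idx = info->ipo_indexes[plan_node_id];
	return (idx == 0) ? NULL : info->outputs[idx - 1];
}
```

Reserving 0 for "no entry" means a zero-filled array of numPlanNodes integers is already a valid empty mapping, which is presumably why the indexes are 1-based.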
[application/octet-stream] v5-0001-Some-refactoring-of-runtime-pruning-code.patch (26.4K, 4-v5-0001-Some-refactoring-of-runtime-pruning-code.patch)
From 1164015d8561151d1fb5d861b236961e237102ff Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Wed, 2 Mar 2022 15:17:55 +0900
Subject: [PATCH v5 1/3] Some refactoring of runtime pruning code
This does two things mainly:
* Move the execution pruning initialization steps that are common
to ExecInitAppend() and ExecInitMergeAppend() into a new
function ExecInitPartitionPruning() defined in execPartition.c.
Thus, ExecFindInitialMatchingSubPlans() need not be exported.
* Add an ExprContext field to PartitionPruneContext so that the runtime
pruning code no longer implicitly assumes that the PlanState can always
supply the ExprContext needed to compute pruning expressions. A future
patch will allow runtime pruning (at least the initial pruning steps)
to be performed before the corresponding PlanState has been created,
so this will help.
---
src/backend/executor/execPartition.c | 340 ++++++++++++++++---------
src/backend/executor/nodeAppend.c | 33 +--
src/backend/executor/nodeMergeAppend.c | 32 +--
src/backend/partitioning/partprune.c | 20 +-
src/include/executor/execPartition.h | 9 +-
src/include/partitioning/partprune.h | 2 +
6 files changed, 255 insertions(+), 181 deletions(-)
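The new header comment for ExecInitPartitionPruning() in the diff below says that the subplan maps are re-sequenced once initial pruning has eliminated some subplans. As a rough illustration of that renumbering only, here is a standalone sketch. It is not the patch's ExecPartitionPruneFixSubPlanIndexes(): toy_resequence is a hypothetical name, a 32-bit mask stands in for the Bitmapset of surviving subplans, and -1 marks a partition with no subplan.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Renumber subplan_map[] (partition index -> subplan index, -1 if none)
 * after pruning.  Bit i of 'valid' is set if old subplan i survived.
 * Surviving subplans get consecutive new indexes starting at 0; entries
 * that pointed at pruned subplans become -1.  Returns the number of
 * surviving subplans.
 */
static int
toy_resequence(int *subplan_map, int nparts, uint32_t valid,
			   int n_total_subplans)
{
	int			new_index[32];
	int			next = 0;

	assert(n_total_subplans >= 0 && n_total_subplans <= 32);

	/* First pass: assign a new index to each surviving subplan. */
	for (int i = 0; i < n_total_subplans; i++)
		new_index[i] = (valid & (UINT32_C(1) << i)) ? next++ : -1;

	/* Second pass: rewrite each partition's entry through new_index[]. */
	for (int p = 0; p < nparts; p++)
	{
		int			old = subplan_map[p];

		if (old >= 0)
		{
			assert(old < n_total_subplans);
			subplan_map[p] = new_index[old];
		}
	}

	return next;
}
```

With three subplans of which only subplans 0 and 2 survive, a map of {0, 1, -1, 2} becomes {0, -1, -1, 1}, so later execution-time pruning would index into the shortened subplan list.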
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 90ed1485d1..21953f253b 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -182,11 +182,18 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
bool *isnull,
int maxfieldlen);
static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
+static PartitionPruneState *ExecCreatePartitionPruneState(PlanState *planstate,
+ PartitionPruneInfo *partitionpruneinfo);
+static Bitmapset *ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate);
static void ExecInitPruningContext(PartitionPruneContext *context,
List *pruning_steps,
PartitionDesc partdesc,
PartitionKey partkey,
- PlanState *planstate);
+ PlanState *planstate,
+ ExprContext *econtext);
+static void ExecPartitionPruneFixSubPlanIndexes(PartitionPruneState *prunestate,
+ Bitmapset *initially_valid_subplans,
+ int n_total_subplans);
static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
@@ -1485,30 +1492,87 @@ adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri)
*
* Functions:
*
- * ExecCreatePartitionPruneState:
- * Creates the PartitionPruneState required by each of the two pruning
- * functions. Details stored include how to map the partition index
- * returned by the partition pruning code into subplan indexes.
- *
- * ExecFindInitialMatchingSubPlans:
- * Returns indexes of matching subplans. Partition pruning is attempted
- * without any evaluation of expressions containing PARAM_EXEC Params.
- * This function must be called during executor startup for the parent
- * plan before the subplans themselves are initialized. Subplans which
- * are found not to match by this function must be removed from the
- * plan's list of subplans during execution, as this function performs a
- * remap of the partition index to subplan index map and the newly
- * created map provides indexes only for subplans which remain after
- * calling this function.
+ * ExecInitPartitionPruning:
+ * Sets up run-time pruning data structure (PartitionPruneState) that is
+ * needed by each of the two pruning functions. Also determines the set
+ * of initially valid subplans by performing initial pruning steps,
+ * telling the caller (such as ExecInitAppend) to initialize only those
+ * for execution. Maps in PartitionPruneState that are used to map the
+ * partition indexes returned by partprune.c functions into the indexes
+ * of partition's subplans in the parent node (such as Append) are
+ * updated to account for initial pruning having eliminated some of the
+ * subplans, if any.
*
* ExecFindMatchingSubPlans:
* Returns indexes of matching subplans after evaluating all available
- * expressions. This function can only be called during execution and
- * must be called again each time the value of a Param listed in
- * PartitionPruneState's 'execparamids' changes.
+ * expressions, that is, using execution pruning steps. This function can
+ * only be called during execution and must be called again each time
+ * the value of a Param listed in PartitionPruneState's 'execparamids'
+ * changes.
*-------------------------------------------------------------------------
*/
+/*
+ * ExecInitPartitionPruning
+ * Initialize data structure needed for run-time partition pruning
+ *
+ * Initial pruning can be done immediately, so it is done here if needed and
+ * the set of surviving partition subplans' indexes are added to the output
+ * parameter *initially_valid_subplans. If subplans are indeed pruned,
+ * subplan_map arrays contained in the returned PartitionPruneState are
+ * re-sequenced to not count those, though only if the maps will be needed
+ * for subsequent execution pruning passes.
+ */
+PartitionPruneState *
+ExecInitPartitionPruning(PlanState *planstate,
+ int n_total_subplans,
+ PartitionPruneInfo *pruneinfo,
+ Bitmapset **initially_valid_subplans)
+{
+ PartitionPruneState *prunestate;
+ EState *estate = planstate->state;
+
+ /* We may need an expression context to evaluate partition exprs */
+ ExecAssignExprContext(estate, planstate);
+
+ /*
+ * Create the working data structure for pruning.
+ */
+ prunestate = ExecCreatePartitionPruneState(planstate, pruneinfo);
+
+ /*
+ * Perform an initial partition prune, if required.
+ */
+ if (prunestate->do_initial_prune)
+ {
+ /* Determine which subplans survive initial pruning */
+ *initially_valid_subplans = ExecFindInitialMatchingSubPlans(prunestate);
+ }
+ else
+ {
+ /* We'll need to initialize all subplans */
+ Assert(n_total_subplans > 0);
+ *initially_valid_subplans = bms_add_range(NULL, 0,
+ n_total_subplans - 1);
+ }
+
+ /*
+ * Re-sequence subplan indexes contained in prunestate to account for any
+ * that were removed above due to initial pruning.
+ *
+ * We can safely skip this when !do_exec_prune, even though that leaves
+ * invalid data in prunestate, because that data won't be consulted again
+ * (cf initial Assert in ExecFindMatchingSubPlans).
+ */
+ if (prunestate->do_exec_prune &&
+ bms_num_members(*initially_valid_subplans) < n_total_subplans)
+ ExecPartitionPruneFixSubPlanIndexes(prunestate,
+ *initially_valid_subplans,
+ n_total_subplans);
+
+ return prunestate;
+}
+
/*
* ExecCreatePartitionPruneState
* Build the data structure required for calling
@@ -1527,7 +1591,7 @@ adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri)
* re-used each time we re-evaluate which partitions match the pruning steps
* provided in each PartitionedRelPruneInfo.
*/
-PartitionPruneState *
+static PartitionPruneState *
ExecCreatePartitionPruneState(PlanState *planstate,
PartitionPruneInfo *partitionpruneinfo)
{
@@ -1536,6 +1600,7 @@ ExecCreatePartitionPruneState(PlanState *planstate,
int n_part_hierarchies;
ListCell *lc;
int i;
+ ExprContext *econtext = planstate->ps_ExprContext;
/* For data reading, executor always omits detached partitions */
if (estate->es_partition_directory == NULL)
@@ -1709,7 +1774,8 @@ ExecCreatePartitionPruneState(PlanState *planstate,
{
ExecInitPruningContext(&pprune->initial_context,
pinfo->initial_pruning_steps,
- partdesc, partkey, planstate);
+ partdesc, partkey, planstate,
+ econtext);
/* Record whether initial pruning is needed at any level */
prunestate->do_initial_prune = true;
}
@@ -1718,7 +1784,8 @@ ExecCreatePartitionPruneState(PlanState *planstate,
{
ExecInitPruningContext(&pprune->exec_context,
pinfo->exec_pruning_steps,
- partdesc, partkey, planstate);
+ partdesc, partkey, planstate,
+ econtext);
/* Record whether exec pruning is needed at any level */
prunestate->do_exec_prune = true;
}
@@ -1746,7 +1813,8 @@ ExecInitPruningContext(PartitionPruneContext *context,
List *pruning_steps,
PartitionDesc partdesc,
PartitionKey partkey,
- PlanState *planstate)
+ PlanState *planstate,
+ ExprContext *econtext)
{
int n_steps;
int partnatts;
@@ -1767,6 +1835,7 @@ ExecInitPruningContext(PartitionPruneContext *context,
context->ppccontext = CurrentMemoryContext;
context->planstate = planstate;
+ context->exprcontext = econtext;
/* Initialize expression state for each expression we need */
context->exprstates = (ExprState **)
@@ -1795,8 +1864,20 @@ ExecInitPruningContext(PartitionPruneContext *context,
step->step.step_id,
keyno);
- context->exprstates[stateidx] =
- ExecInitExpr(expr, context->planstate);
+ /*
+ * When planstate is NULL, pruning_steps is known not to
+ * contain any expressions that depend on the parent plan.
+ * Information of any available EXTERN parameters must be
+ * passed explicitly in that case, which the caller must
+ * have made available via econtext.
+ */
+ if (planstate == NULL)
+ context->exprstates[stateidx] =
+ ExecInitExprWithParams(expr,
+ econtext->ecxt_param_list_info);
+ else
+ context->exprstates[stateidx] =
+ ExecInitExpr(expr, context->planstate);
}
keyno++;
}
@@ -1816,11 +1897,9 @@ ExecInitPruningContext(PartitionPruneContext *context,
*
* Must only be called once per 'prunestate', and only if initial pruning
* is required.
- *
- * 'nsubplans' must be passed as the total number of unpruned subplans.
*/
-Bitmapset *
-ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate, int nsubplans)
+static Bitmapset *
+ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate)
{
Bitmapset *result = NULL;
MemoryContext oldcontext;
@@ -1845,14 +1924,20 @@ ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate, int nsubplans)
PartitionedRelPruningData *pprune;
prunedata = prunestate->partprunedata[i];
+
+ /*
+ * We pass the 1st item belonging to the root table of the hierarchy
+ * and find_matching_subplans_recurse() takes care of recursing to
+ * other (lower-level) parents as needed.
+ */
pprune = &prunedata->partrelprunedata[0];
/* Perform pruning without using PARAM_EXEC Params */
find_matching_subplans_recurse(prunedata, pprune, true, &result);
- /* Expression eval may have used space in node's ps_ExprContext too */
+ /* Expression eval may have used space in ExprContext too */
if (pprune->initial_pruning_steps)
- ResetExprContext(pprune->initial_context.planstate->ps_ExprContext);
+ ResetExprContext(pprune->initial_context.exprcontext);
}
/* Add in any subplans that partition pruning didn't account for */
@@ -1865,118 +1950,120 @@ ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate, int nsubplans)
MemoryContextReset(prunestate->prune_context);
+ return result;
+}
+
+/*
+ * ExecPartitionPruneFixSubPlanIndexes
+ * Fix mapping of partition indexes to subplan indexes contained in
+ * prunestate by considering the new list of subplans that survived
+ * initial pruning
+ *
+ * Subplans were previously indexed 0..(n_total_subplans - 1), but must now
+ * be indexed 0..(num(initially_valid_subplans) - 1).
+ */
+static void
+ExecPartitionPruneFixSubPlanIndexes(PartitionPruneState *prunestate,
+ Bitmapset *initially_valid_subplans,
+ int n_total_subplans)
+{
+ int *new_subplan_indexes;
+ Bitmapset *new_other_subplans;
+ int i;
+ int newidx;
+
/*
- * If exec-time pruning is required and we pruned subplans above, then we
- * must re-sequence the subplan indexes so that ExecFindMatchingSubPlans
- * properly returns the indexes from the subplans which will remain after
- * execution of this function.
- *
- * We can safely skip this when !do_exec_prune, even though that leaves
- * invalid data in prunestate, because that data won't be consulted again
- * (cf initial Assert in ExecFindMatchingSubPlans).
+ * First we must build a temporary array which maps old subplan
+ * indexes to new ones. For convenience of initialization, we use
+ * 1-based indexes in this array and leave pruned items as 0.
*/
- if (prunestate->do_exec_prune && bms_num_members(result) < nsubplans)
+ new_subplan_indexes = (int *) palloc0(sizeof(int) * n_total_subplans);
+ newidx = 1;
+ i = -1;
+ while ((i = bms_next_member(initially_valid_subplans, i)) >= 0)
{
- int *new_subplan_indexes;
- Bitmapset *new_other_subplans;
- int i;
- int newidx;
+ Assert(i < n_total_subplans);
+ new_subplan_indexes[i] = newidx++;
+ }
- /*
- * First we must build a temporary array which maps old subplan
- * indexes to new ones. For convenience of initialization, we use
- * 1-based indexes in this array and leave pruned items as 0.
- */
- new_subplan_indexes = (int *) palloc0(sizeof(int) * nsubplans);
- newidx = 1;
- i = -1;
- while ((i = bms_next_member(result, i)) >= 0)
- {
- Assert(i < nsubplans);
- new_subplan_indexes[i] = newidx++;
- }
+ /*
+ * Now we can update each PartitionedRelPruneInfo's subplan_map with
+ * new subplan indexes. We must also recompute its present_parts
+ * bitmap.
+ */
+ for (i = 0; i < prunestate->num_partprunedata; i++)
+ {
+ PartitionPruningData *prunedata = prunestate->partprunedata[i];
+ int j;
/*
- * Now we can update each PartitionedRelPruneInfo's subplan_map with
- * new subplan indexes. We must also recompute its present_parts
- * bitmap.
+ * Within each hierarchy, we perform this loop in back-to-front
+ * order so that we determine present_parts for the lowest-level
+ * partitioned tables first. This way we can tell whether a
+ * sub-partitioned table's partitions were entirely pruned so we
+ * can exclude it from the current level's present_parts.
*/
- for (i = 0; i < prunestate->num_partprunedata; i++)
+ for (j = prunedata->num_partrelprunedata - 1; j >= 0; j--)
{
- PartitionPruningData *prunedata = prunestate->partprunedata[i];
- int j;
+ PartitionedRelPruningData *pprune = &prunedata->partrelprunedata[j];
+ int nparts = pprune->nparts;
+ int k;
- /*
- * Within each hierarchy, we perform this loop in back-to-front
- * order so that we determine present_parts for the lowest-level
- * partitioned tables first. This way we can tell whether a
- * sub-partitioned table's partitions were entirely pruned so we
- * can exclude it from the current level's present_parts.
- */
- for (j = prunedata->num_partrelprunedata - 1; j >= 0; j--)
- {
- PartitionedRelPruningData *pprune = &prunedata->partrelprunedata[j];
- int nparts = pprune->nparts;
- int k;
+ /* We just rebuild present_parts from scratch */
+ bms_free(pprune->present_parts);
+ pprune->present_parts = NULL;
- /* We just rebuild present_parts from scratch */
- bms_free(pprune->present_parts);
- pprune->present_parts = NULL;
+ for (k = 0; k < nparts; k++)
+ {
+ int oldidx = pprune->subplan_map[k];
+ int subidx;
- for (k = 0; k < nparts; k++)
+ /*
+ * If this partition existed as a subplan then change the
+ * old subplan index to the new subplan index. The new
+ * index may become -1 if the partition was pruned above,
+ * or it may just come earlier in the subplan list due to
+ * some subplans being removed earlier in the list. If
+ * it's a subpartition, add it to present_parts unless
+ * it's entirely pruned.
+ */
+ if (oldidx >= 0)
{
- int oldidx = pprune->subplan_map[k];
- int subidx;
-
- /*
- * If this partition existed as a subplan then change the
- * old subplan index to the new subplan index. The new
- * index may become -1 if the partition was pruned above,
- * or it may just come earlier in the subplan list due to
- * some subplans being removed earlier in the list. If
- * it's a subpartition, add it to present_parts unless
- * it's entirely pruned.
- */
- if (oldidx >= 0)
- {
- Assert(oldidx < nsubplans);
- pprune->subplan_map[k] = new_subplan_indexes[oldidx] - 1;
+ Assert(oldidx < n_total_subplans);
+ pprune->subplan_map[k] = new_subplan_indexes[oldidx] - 1;
- if (new_subplan_indexes[oldidx] > 0)
- pprune->present_parts =
- bms_add_member(pprune->present_parts, k);
- }
- else if ((subidx = pprune->subpart_map[k]) >= 0)
- {
- PartitionedRelPruningData *subprune;
+ if (new_subplan_indexes[oldidx] > 0)
+ pprune->present_parts =
+ bms_add_member(pprune->present_parts, k);
+ }
+ else if ((subidx = pprune->subpart_map[k]) >= 0)
+ {
+ PartitionedRelPruningData *subprune;
- subprune = &prunedata->partrelprunedata[subidx];
+ subprune = &prunedata->partrelprunedata[subidx];
- if (!bms_is_empty(subprune->present_parts))
- pprune->present_parts =
- bms_add_member(pprune->present_parts, k);
- }
+ if (!bms_is_empty(subprune->present_parts))
+ pprune->present_parts =
+ bms_add_member(pprune->present_parts, k);
}
}
}
+ }
- /*
- * We must also recompute the other_subplans set, since indexes in it
- * may change.
- */
- new_other_subplans = NULL;
- i = -1;
- while ((i = bms_next_member(prunestate->other_subplans, i)) >= 0)
- new_other_subplans = bms_add_member(new_other_subplans,
- new_subplan_indexes[i] - 1);
-
- bms_free(prunestate->other_subplans);
- prunestate->other_subplans = new_other_subplans;
+ /*
+ * We must also recompute the other_subplans set, since indexes in it
+ * may change.
+ */
+ new_other_subplans = NULL;
+ i = -1;
+ while ((i = bms_next_member(prunestate->other_subplans, i)) >= 0)
+ new_other_subplans = bms_add_member(new_other_subplans,
+ new_subplan_indexes[i] - 1);
- pfree(new_subplan_indexes);
- }
+ bms_free(prunestate->other_subplans);
+ prunestate->other_subplans = new_other_subplans;
- return result;
+ pfree(new_subplan_indexes);
}
/*
@@ -2018,11 +2105,16 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate)
prunedata = prunestate->partprunedata[i];
pprune = &prunedata->partrelprunedata[0];
+ /*
+ * We pass the 1st item belonging to the root table of the hierarchy
+ * and find_matching_subplans_recurse() takes care of recursing to
+ * other (lower-level) parents as needed.
+ */
find_matching_subplans_recurse(prunedata, pprune, false, &result);
- /* Expression eval may have used space in node's ps_ExprContext too */
+ /* Expression eval may have used space in ExprContext too */
if (pprune->exec_pruning_steps)
- ResetExprContext(pprune->exec_context.planstate->ps_ExprContext);
+ ResetExprContext(pprune->exec_context.exprcontext);
}
/* Add in any subplans that partition pruning didn't account for */
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 7937f1c88f..5b6d3eb23b 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -138,30 +138,17 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
{
PartitionPruneState *prunestate;
- /* We may need an expression context to evaluate partition exprs */
- ExecAssignExprContext(estate, &appendstate->ps);
-
- /* Create the working data structure for pruning. */
- prunestate = ExecCreatePartitionPruneState(&appendstate->ps,
- node->part_prune_info);
+ /*
+ * Set up pruning data structure. Initial pruning steps, if any, are
+ * performed as part of the setup, adding the set of indexes of
+ * surviving subplans to 'validsubplans'.
+ */
+ prunestate = ExecInitPartitionPruning(&appendstate->ps,
+ list_length(node->appendplans),
+ node->part_prune_info,
+ &validsubplans);
appendstate->as_prune_state = prunestate;
-
- /* Perform an initial partition prune, if required. */
- if (prunestate->do_initial_prune)
- {
- /* Determine which subplans survive initial pruning */
- validsubplans = ExecFindInitialMatchingSubPlans(prunestate,
- list_length(node->appendplans));
-
- nplans = bms_num_members(validsubplans);
- }
- else
- {
- /* We'll need to initialize all subplans */
- nplans = list_length(node->appendplans);
- Assert(nplans > 0);
- validsubplans = bms_add_range(NULL, 0, nplans - 1);
- }
+ nplans = bms_num_members(validsubplans);
/*
* When no run-time pruning is required and there's at least one
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index 418f89dea8..9a9f29e845 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -86,29 +86,17 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
{
PartitionPruneState *prunestate;
- /* We may need an expression context to evaluate partition exprs */
- ExecAssignExprContext(estate, &mergestate->ps);
-
- prunestate = ExecCreatePartitionPruneState(&mergestate->ps,
- node->part_prune_info);
+ /*
+ * Set up pruning data structure. Initial pruning steps, if any, are
+ * performed as part of the setup, adding the set of indexes of
+ * surviving subplans to 'validsubplans'.
+ */
+ prunestate = ExecInitPartitionPruning(&mergestate->ps,
+ list_length(node->mergeplans),
+ node->part_prune_info,
+ &validsubplans);
mergestate->ms_prune_state = prunestate;
-
- /* Perform an initial partition prune, if required. */
- if (prunestate->do_initial_prune)
- {
- /* Determine which subplans survive initial pruning */
- validsubplans = ExecFindInitialMatchingSubPlans(prunestate,
- list_length(node->mergeplans));
-
- nplans = bms_num_members(validsubplans);
- }
- else
- {
- /* We'll need to initialize all subplans */
- nplans = list_length(node->mergeplans);
- Assert(nplans > 0);
- validsubplans = bms_add_range(NULL, 0, nplans - 1);
- }
+ nplans = bms_num_members(validsubplans);
/*
* When no run-time pruning is required and there's at least one
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 1bc00826c1..7080cb25d9 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -798,6 +798,7 @@ prune_append_rel_partitions(RelOptInfo *rel)
/* These are not valid when being called from the planner */
context.planstate = NULL;
+ context.exprcontext = NULL;
context.exprstates = NULL;
/* Actual pruning happens here. */
@@ -808,8 +809,8 @@ prune_append_rel_partitions(RelOptInfo *rel)
* get_matching_partitions
* Determine partitions that survive partition pruning
*
- * Note: context->planstate must be set to a valid PlanState when the
- * pruning_steps were generated with a target other than PARTTARGET_PLANNER.
+ * Note: context->exprcontext must be valid when the pruning_steps were
+ * generated with a target other than PARTTARGET_PLANNER.
*
* Returns a Bitmapset of the RelOptInfo->part_rels indexes of the surviving
* partitions.
@@ -3654,7 +3655,7 @@ match_boolean_partition_clause(Oid partopfamily, Expr *clause, Expr *partkey,
* exprstate array.
*
* Note that the evaluated result may be in the per-tuple memory context of
- * context->planstate->ps_ExprContext, and we may have leaked other memory
+ * context->exprcontext, and we may have leaked other memory
* there too. This memory must be recovered by resetting that ExprContext
* after we're done with the pruning operation (see execPartition.c).
*/
@@ -3677,13 +3678,18 @@ partkey_datum_from_expr(PartitionPruneContext *context,
ExprContext *ectx;
/*
- * We should never see a non-Const in a step unless we're running in
- * the executor.
+ * We should never see a non-Const in a step unless the caller has
+ * passed a valid ExprContext.
+ *
+ * When context->planstate is valid, context->exprcontext is the same
+ * as context->planstate->ps_ExprContext.
*/
- Assert(context->planstate != NULL);
+ Assert(context->planstate != NULL || context->exprcontext != NULL);
+ Assert(context->planstate == NULL ||
+ (context->exprcontext == context->planstate->ps_ExprContext));
exprstate = context->exprstates[stateidx];
- ectx = context->planstate->ps_ExprContext;
+ ectx = context->exprcontext;
*value = ExecEvalExprSwitchContext(exprstate, ectx, isnull);
}
}
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 603d8becc4..fd5735a946 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -119,10 +119,9 @@ extern ResultRelInfo *ExecFindPartition(ModifyTableState *mtstate,
EState *estate);
extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
PartitionTupleRouting *proute);
-extern PartitionPruneState *ExecCreatePartitionPruneState(PlanState *planstate,
- PartitionPruneInfo *partitionpruneinfo);
+extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
+ int n_total_subplans,
+ PartitionPruneInfo *pruneinfo,
+ Bitmapset **initially_valid_subplans);
extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate);
-extern Bitmapset *ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate,
- int nsubplans);
-
#endif /* EXECPARTITION_H */
diff --git a/src/include/partitioning/partprune.h b/src/include/partitioning/partprune.h
index ee11b6feae..90684efa25 100644
--- a/src/include/partitioning/partprune.h
+++ b/src/include/partitioning/partprune.h
@@ -41,6 +41,7 @@ struct RelOptInfo;
* subsidiary data, such as the FmgrInfos.
* planstate Points to the parent plan node's PlanState when called
* during execution; NULL when called from the planner.
+ * exprcontext ExprContext to use when evaluating pruning expressions
* exprstates Array of ExprStates, indexed as per PruneCxtStateIdx; one
* for each partition key in each pruning step. Allocated if
* planstate is non-NULL, otherwise NULL.
@@ -56,6 +57,7 @@ typedef struct PartitionPruneContext
FmgrInfo *stepcmpfuncs;
MemoryContext ppccontext;
PlanState *planstate;
+ ExprContext *exprcontext;
ExprState **exprstates;
} PartitionPruneContext;
--
2.24.1
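The re-sequencing step in ExecPartitionPruneFixSubPlanIndexes can be illustrated outside the executor. What follows is only a minimal sketch, not the patch's code: it assumes a plain bool array in place of the Bitmapset, and the function name resequence_subplan_map is hypothetical. It shows the trick the comment describes, namely a zero-filled temporary array holding 1-based new indexes, so that subtracting 1 yields -1 for pruned subplans.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Illustration only (hypothetical name): remap the old subplan indexes in
 * subplan_map[] to the indexes the subplans have once pruned ones are
 * dropped.  valid[i] says whether old subplan i survived initial pruning;
 * it stands in for the Bitmapset used by the real code.
 */
void
resequence_subplan_map(int *subplan_map, int nparts,
                       const bool *valid, int n_total_subplans)
{
    /* 1-based temporary map; calloc's zero fill marks pruned entries as 0 */
    int *new_idx = calloc((size_t) n_total_subplans, sizeof(int));
    int  next = 1;

    if (new_idx == NULL)
        abort();

    for (int i = 0; i < n_total_subplans; i++)
    {
        if (valid[i])
            new_idx[i] = next++;
    }

    for (int k = 0; k < nparts; k++)
    {
        int old = subplan_map[k];

        /* entries that were already -1 had no subplan and stay -1 */
        if (old >= 0)
        {
            assert(old < n_total_subplans);
            subplan_map[k] = new_idx[old] - 1;  /* pruned becomes -1 */
        }
    }

    free(new_idx);
}
```

With four subplans of which only the second and fourth survive, a map of {0, 1, 2, 3} becomes {-1, 0, -1, 1}. The real function additionally rebuilds present_parts and other_subplans, which this sketch leaves out.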
* Re: generic plans and "initial" pruning
@ 2022-03-11 15:06 Amit Langote <[email protected]>
parent: Amit Langote <[email protected]>
1 sibling, 0 replies; 71+ messages in thread
From: Amit Langote @ 2022-03-11 15:06 UTC (permalink / raw)
To: Robert Haas <[email protected]>; +Cc: pgsql-hackers; David Rowley *EXTERN* <[email protected]>
On Fri, Mar 11, 2022 at 11:35 PM Amit Langote <[email protected]> wrote:
> Attached is v5, now broken into 3 patches:
>
> 0001: Some refactoring of runtime pruning code
> 0002: Add a plan_tree_walker
> 0003: Teach AcquireExecutorLocks to skip locking pruned relations
Repeated the performance tests described in the 1st email of this thread:
HEAD: (copied from the 1st email)
32 tps = 20561.776403 (without initial connection time)
64 tps = 12553.131423 (without initial connection time)
128 tps = 13330.365696 (without initial connection time)
256 tps = 8605.723120 (without initial connection time)
512 tps = 4435.951139 (without initial connection time)
1024 tps = 2346.902973 (without initial connection time)
2048 tps = 1334.680971 (without initial connection time)
Patched v1: (copied from the 1st email)
32 tps = 27554.156077 (without initial connection time)
64 tps = 27531.161310 (without initial connection time)
128 tps = 27138.305677 (without initial connection time)
256 tps = 25825.467724 (without initial connection time)
512 tps = 19864.386305 (without initial connection time)
1024 tps = 18742.668944 (without initial connection time)
2048 tps = 16312.412704 (without initial connection time)
Patched v5:
32 tps = 28204.197738 (without initial connection time)
64 tps = 26795.385318 (without initial connection time)
128 tps = 26387.920550 (without initial connection time)
256 tps = 25601.141556 (without initial connection time)
512 tps = 19911.947502 (without initial connection time)
1024 tps = 20158.692952 (without initial connection time)
2048 tps = 16180.195463 (without initial connection time)
Good to see that these rewrites haven't really hurt the numbers much,
which makes sense because the rewrites have really been about putting
the code in the right place.
BTW, these are the numbers for the same benchmark repeated with
plan_cache_mode = auto, which causes a custom plan to be chosen for
every execution, so they are unaffected by this patch.
32 tps = 13359.225082 (without initial connection time)
64 tps = 15760.533280 (without initial connection time)
128 tps = 15825.734482 (without initial connection time)
256 tps = 15017.693905 (without initial connection time)
512 tps = 13479.973395 (without initial connection time)
1024 tps = 13200.444397 (without initial connection time)
2048 tps = 12884.645475 (without initial connection time)
Comparing them to numbers when using force_generic_plan shows that
making the generic plans faster is indeed worthwhile.
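
As a rough worked check of the numbers above, the ratios can be computed directly. This is only a sketch: the tps values are copied from the tables in this message, the first column is assumed to be the partition count, and the names tps_ratio and print_ratios are made up for the example.

```c
#include <assert.h>
#include <stdio.h>

#define N_RUNS 7

/* First column of the tables above; assumed to be the partition count. */
static const int sizes[N_RUNS] = {32, 64, 128, 256, 512, 1024, 2048};

/* tps values copied from this message */
static const double head_tps[N_RUNS] = {
    20561.776403, 12553.131423, 13330.365696, 8605.723120,
    4435.951139, 2346.902973, 1334.680971
};
static const double v5_tps[N_RUNS] = {
    28204.197738, 26795.385318, 26387.920550, 25601.141556,
    19911.947502, 20158.692952, 16180.195463
};
static const double auto_tps[N_RUNS] = {
    13359.225082, 15760.533280, 15825.734482, 15017.693905,
    13479.973395, 13200.444397, 12884.645475
};

/* Ratio of two throughput figures; values above 1 mean 'a' is faster. */
double
tps_ratio(double a, double b)
{
    assert(b > 0.0);
    return a / b;
}

/* Print v5 relative to HEAD and relative to the plan_cache_mode = auto run. */
void
print_ratios(void)
{
    for (int i = 0; i < N_RUNS; i++)
        printf("%5d  v5/HEAD = %.2f  v5/auto = %.2f\n",
               sizes[i],
               tps_ratio(v5_tps[i], head_tps[i]),
               tps_ratio(v5_tps[i], auto_tps[i]));
}
```

By this arithmetic v5 is roughly 1.4 times HEAD at 32 and roughly 12 times HEAD at 2048, and it stays ahead of the plan_cache_mode = auto run at every size listed.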
--
Amit Langote
EDB: http://www.enterprisedb.com
* Re: generic plans and "initial" pruning
@ 2022-03-14 18:42 Robert Haas <[email protected]>
parent: Amit Langote <[email protected]>
1 sibling, 1 reply; 71+ messages in thread
From: Robert Haas @ 2022-03-14 18:42 UTC (permalink / raw)
To: Amit Langote <[email protected]>; +Cc: pgsql-hackers; David Rowley *EXTERN* <[email protected]>; Tom Lane <[email protected]>
On Fri, Mar 11, 2022 at 9:35 AM Amit Langote <[email protected]> wrote:
> Attached is v5, now broken into 3 patches:
>
> 0001: Some refactoring of runtime pruning code
> 0002: Add a plan_tree_walker
> 0003: Teach AcquireExecutorLocks to skip locking pruned relations
So is any other committer planning to look at this? Tom, perhaps?
David? This strikes me as important work, and I don't mind going
through and trying to do some detailed review, but (A) I am not the
person most familiar with the code being modified here and (B) there
are some important theoretical questions about the approach that we
might want to try to cover before we get down into the details.
In my opinion, the most important theoretical issue here is around
reuse of plans that are no longer entirely valid, but the parts that
are no longer valid are certain to be pruned. If, because we know that
some parameter has some particular value, we skip locking a bunch of
partitions, then when we're executing the plan, those partitions need
not exist any more -- or they could have different indexes, be
detached from the partitioning hierarchy and subsequently altered,
whatever. That seems fine to me provided that all of our code (and any
third-party code) is careful not to rely on the portion of the plan
that we've pruned away, and doesn't assume that (for example) we can
still fetch the name of an index whose OID appears in there someplace.
I cannot think of a hazard where the fact that part of a plan is
no longer valid because some DDL has been executed "infects" the
remainder of the plan. As long as we lock the partitioned tables named
in the plan and their descendants down to the level just above the one
at which something is pruned, and are careful, I think we should be
OK. It would be nice to know if someone has a fundamentally different
view of the hazards here, though.
Just to state my position here clearly, I would be more than happy if
somebody else plans to pick this up and try to get some or all of it
committed, and will cheerfully defer to such person in the event that
they have that plan. If, however, no such person exists, I may try my
hand at that myself.
Thanks,
--
Robert Haas
EDB: http://www.enterprisedb.com
* Re: generic plans and "initial" pruning
@ 2022-03-14 19:38 Tom Lane <[email protected]>
parent: Robert Haas <[email protected]>
0 siblings, 1 reply; 71+ messages in thread
From: Tom Lane @ 2022-03-14 19:38 UTC (permalink / raw)
To: Robert Haas <[email protected]>; +Cc: Amit Langote <[email protected]>; pgsql-hackers; David Rowley *EXTERN* <[email protected]>
Robert Haas <[email protected]> writes:
> In my opinion, the most important theoretical issue here is around
> reuse of plans that are no longer entirely valid, but the parts that
> are no longer valid are certain to be pruned. If, because we know that
> some parameter has some particular value, we skip locking a bunch of
> partitions, then when we're executing the plan, those partitions need
> not exist any more -- or they could have different indexes, be
> detached from the partitioning hierarchy and subsequently altered,
> whatever.
Check.
> That seems fine to me provided that all of our code (and any
> third-party code) is careful not to rely on the portion of the plan
> that we've pruned away, and doesn't assume that (for example) we can
> still fetch the name of an index whose OID appears in there someplace.
... like EXPLAIN, for example?
If "pruning" means physical removal from the plan tree, then it's
probably all right. However, it looks to me like that doesn't
actually happen, or at least doesn't happen till much later, so
there's room for worry about a disconnect between what plancache.c
has verified and what executor startup will try to touch. As you
say, in the absence of any bugs, that's not a problem ... but if
there are such bugs, tracking them down would be really hard.
What I am skeptical about is that this work actually accomplishes
anything under real-world conditions. That's because if pruning would
save enough to make skipping the lock-acquisition phase worth the
trouble, the plan cache is almost certainly going to decide it should
be using a custom plan not a generic plan. Now if we had a better
cost model (or, indeed, any model at all) for run-time pruning effects
then maybe that situation could be improved. I think we'd be better
served to worry about that end of it before we spend more time making
the executor even less predictable.
Also, while I've not spent much time at all reading this patch,
it seems rather desperately undercommented, and a lot of the
new names are unintelligible. In particular, I suspect that the
patch is significantly redesigning when/where run-time pruning
happens (unless it's just letting that be run twice); but I don't
see any documentation or name changes suggesting where that
responsibility is now.
regards, tom lane
* Re: generic plans and "initial" pruning
@ 2022-03-14 20:06 Robert Haas <[email protected]>
parent: Tom Lane <[email protected]>
0 siblings, 1 reply; 71+ messages in thread
From: Robert Haas @ 2022-03-14 20:06 UTC (permalink / raw)
To: Tom Lane <[email protected]>; +Cc: Amit Langote <[email protected]>; pgsql-hackers; David Rowley *EXTERN* <[email protected]>
On Mon, Mar 14, 2022 at 3:38 PM Tom Lane <[email protected]> wrote:
> ... like EXPLAIN, for example?
Exactly! I think that's the foremost example, but extension modules
like auto_explain or even third-party extensions are also a risk. I
think there was some discussion of this previously.
> If "pruning" means physical removal from the plan tree, then it's
> probably all right. However, it looks to me like that doesn't
> actually happen, or at least doesn't happen till much later, so
> there's room for worry about a disconnect between what plancache.c
> has verified and what executor startup will try to touch. As you
> say, in the absence of any bugs, that's not a problem ... but if
> there are such bugs, tracking them down would be really hard.
Surgery on the plan would violate the general principle that plans are
read only once constructed. I think the idea ought to be to pass a
secondary data structure around with the plan that defines which parts
you must ignore. Any code that fails to use that other data structure
in the appropriate manner gets defined to be buggy and has to be fixed
by making it follow the new rules.
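
To make the "secondary data structure" idea concrete, here is a minimal sketch. Everything in it is hypothetical (SketchPlan, SketchPruneResult and collect_usable_relids are not PostgreSQL types or functions); it only shows the shape of the rule: the plan stays read-only, and any consumer that walks it must consult the side structure before touching a subplan.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a read-only plan: one relation OID per subplan. */
typedef struct SketchPlan
{
    int             nsubplans;
    const unsigned *subplan_relids;
} SketchPlan;

/* Hypothetical side structure passed around next to the plan. */
typedef struct SketchPruneResult
{
    const bool *valid;          /* valid[i] == false: subplan i must be ignored */
} SketchPruneResult;

/*
 * Collect the relation OIDs that a consumer (locking, EXPLAIN, an
 * extension) is allowed to touch.  The plan itself is never modified.
 * A NULL 'pruned' means no pre-pruning happened, so everything is usable.
 * Returns the number of OIDs written to out[].
 */
int
collect_usable_relids(const SketchPlan *plan,
                      const SketchPruneResult *pruned,
                      unsigned *out)
{
    int n = 0;

    assert(plan != NULL && out != NULL);

    for (int i = 0; i < plan->nsubplans; i++)
    {
        if (pruned != NULL && !pruned->valid[i])
            continue;           /* pruned: do not look at this subplan */
        out[n++] = plan->subplan_relids[i];
    }
    return n;
}
```

Under this rule, code that iterates over subplan_relids directly, without going through a check like the one above, is the code that would be "defined to be buggy".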
> What I am skeptical about is that this work actually accomplishes
> anything under real-world conditions. That's because if pruning would
> save enough to make skipping the lock-acquisition phase worth the
> trouble, the plan cache is almost certainly going to decide it should
> be using a custom plan not a generic plan. Now if we had a better
> cost model (or, indeed, any model at all) for run-time pruning effects
> then maybe that situation could be improved. I think we'd be better
> served to worry about that end of it before we spend more time making
> the executor even less predictable.
I don't agree with that analysis, because setting plan_cache_mode is
not uncommon. Even if that GUC didn't exist, I'm pretty sure there are
cases where the planner naturally falls into a generic plan anyway,
even though pruning is happening. But as it is, the GUC does exist,
and people use it. Consequently, while I'd love to see something done
about the costing side of things, I do not accept that all other
improvements should wait for that to happen.
> Also, while I've not spent much time at all reading this patch,
> it seems rather desperately undercommented, and a lot of the
> new names are unintelligible. In particular, I suspect that the
> patch is significantly redesigning when/where run-time pruning
> happens (unless it's just letting that be run twice); but I don't
> see any documentation or name changes suggesting where that
> responsibility is now.
I am sympathetic to that concern. I spent a while staring at a
baffling comment in 0001 only to discover it had just been moved from
elsewhere. I really don't feel that things in this are as clear as
they could be -- although I hasten to add that I respect the people
who have done work in this area previously and am grateful for what
they did. It's been a huge benefit to the project in spite of the
bumps in the road. Moreover, this isn't the only code in PostgreSQL
that needs improvement, or the worst. That said, I do think there are
problems. I don't yet have a position on whether this patch is making
that better or worse.
That said, I believe that the core idea of the patch is to optionally
perform pruning before we acquire locks or spin up the main executor
and then remember the decisions we made. If once the main executor is
spun up we already made those decisions, then we must stick with what
we decided. If not, we make those pruning decisions at the same point
we do currently - more or less on demand, at the point when we'd need
to know whether to descend that branch of the plan tree or not. I
think this scheme comes about because there are a couple of different
interfaces to the parameterized query stuff, and in some code paths we
have the values early enough to use them for pre-pruning, and in
others we don't.
--
Robert Haas
EDB: http://www.enterprisedb.com
* Re: generic plans and "initial" pruning
@ 2022-03-15 06:19 Amit Langote <[email protected]>
parent: Robert Haas <[email protected]>
From: Amit Langote @ 2022-03-15 06:19 UTC (permalink / raw)
To: Robert Haas <[email protected]>; +Cc: Tom Lane <[email protected]>; pgsql-hackers; David Rowley *EXTERN* <[email protected]>
On Tue, Mar 15, 2022 at 5:06 AM Robert Haas <[email protected]> wrote:
> On Mon, Mar 14, 2022 at 3:38 PM Tom Lane <[email protected]> wrote:
> > What I am skeptical about is that this work actually accomplishes
> > anything under real-world conditions. That's because if pruning would
> > save enough to make skipping the lock-acquisition phase worth the
> > trouble, the plan cache is almost certainly going to decide it should
> > be using a custom plan not a generic plan. Now if we had a better
> > cost model (or, indeed, any model at all) for run-time pruning effects
> > then maybe that situation could be improved. I think we'd be better
> > served to worry about that end of it before we spend more time making
> > the executor even less predictable.
>
> I don't agree with that analysis, because setting plan_cache_mode is
> not uncommon. Even if that GUC didn't exist, I'm pretty sure there are
> cases where the planner naturally falls into a generic plan anyway,
> even though pruning is happening. But as it is, the GUC does exist,
> and people use it. Consequently, while I'd love to see something done
> about the costing side of things, I do not accept that all other
> improvements should wait for that to happen.
I agree that making generic plans execute faster has merit even before
we make the costing changes to allow plancache.c to prefer generic plans
over custom ones in these cases. As the numbers in my previous email
show, simply executing a generic plan with the proposed improvements
applied is significantly cheaper than having the planner do the
pruning on every execution:
nparts  auto/custom  generic
======  ===========  =======
    32        13359    28204
    64        15760    26795
   128        15825    26387
   256        15017    25601
   512        13479    19911
  1024        13200    20158
  2048        12884    16180
> > Also, while I've not spent much time at all reading this patch,
> > it seems rather desperately undercommented, and a lot of the
> > new names are unintelligible. In particular, I suspect that the
> > patch is significantly redesigning when/where run-time pruning
> > happens (unless it's just letting that be run twice); but I don't
> > see any documentation or name changes suggesting where that
> > responsibility is now.
>
> I am sympathetic to that concern. I spent a while staring at a
> baffling comment in 0001 only to discover it had just been moved from
> elsewhere. I really don't feel that things in this are as clear as
> they could be -- although I hasten to add that I respect the people
> who have done work in this area previously and am grateful for what
> they did. It's been a huge benefit to the project in spite of the
> bumps in the road. Moreover, this isn't the only code in PostgreSQL
> that needs improvement, or the worst. That said, I do think there are
> problems. I don't yet have a position on whether this patch is making
> that better or worse.
Okay, I'd like to post a new version with the comments edited to make
them a bit more intelligible. I understand that the comments around
the new invocation mode(s) of runtime pruning are not as clear as they
should be, especially as the changes that this patch wants to make to
how things work are not very localized.
> That said, I believe that the core idea of the patch is to optionally
> perform pruning before we acquire locks or spin up the main executor
> and then remember the decisions we made. If once the main executor is
> spun up we already made those decisions, then we must stick with what
> we decided. If not, we make those pruning decisions at the same point
> we do currently
Right. The "initial" pruning, that this patch wants to make occur at
an earlier point (plancache.c), is currently performed in
ExecInit[Merge]Append().
If it does occur early due to the plan being a cached one,
ExecInit[Merge]Append() simply refers to its result that would be made
available via a new data structure that plancache.c has been made to
pass down to the executor alongside the plan tree.
If it does not, ExecInit[Merge]Append() does the pruning in the same
way it does now. Such cases include initial pruning that uses only
STABLE expressions and no EXTERN parameters; the planner does not
evaluate STABLE expressions itself because the resulting plan may be
cached.
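
The two cases can be sketched as follows. This is a toy model, not the patch's code: ToyEarlyPruneResult, toy_do_initial_pruning, and toy_init_append are hypothetical names, and the pruning rule (partition i holds keys 10*i through 10*i+9) is invented purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_PARTS 8

/* Hypothetical result of pruning that was already done before executor startup. */
typedef struct ToyEarlyPruneResult
{
    bool        keep[TOY_MAX_PARTS];
} ToyEarlyPruneResult;

/* Toy pruning rule: partition i holds key values in [10*i, 10*i + 9]. */
static void
toy_do_initial_pruning(int nparts, int key, bool *keep)
{
    for (int i = 0; i < nparts; i++)
        keep[i] = (key >= 10 * i && key <= 10 * i + 9);
}

/*
 * Stand-in for the node-initialization step.  If an early result was passed
 * down alongside the plan, reuse it verbatim; otherwise prune right here, as
 * is done today.  Returns the number of surviving partitions and reports in
 * *pruned_here whether pruning was actually performed in this call.
 */
int
toy_init_append(int nparts, int key,
                const ToyEarlyPruneResult *early,
                bool *keep, bool *pruned_here)
{
    int         nlive = 0;

    assert(nparts <= TOY_MAX_PARTS);
    if (early != NULL)
    {
        /* Stick with the decisions that were made earlier. */
        for (int i = 0; i < nparts; i++)
            keep[i] = early->keep[i];
        *pruned_here = false;
    }
    else
    {
        toy_do_initial_pruning(nparts, key, keep);
        *pruned_here = true;
    }

    for (int i = 0; i < nparts; i++)
    {
        if (keep[i])
            nlive++;
    }
    return nlive;
}
```

The point of the first branch is that the executor does not re-derive the answer: whatever set was used to decide which relations to lock is the set that gets initialized.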
--
Amit Langote
EDB: http://www.enterprisedb.com
* Re: generic plans and "initial" pruning
@ 2022-03-22 12:44 Amit Langote <[email protected]>
parent: Amit Langote <[email protected]>
From: Amit Langote @ 2022-03-22 12:44 UTC (permalink / raw)
To: Robert Haas <[email protected]>; +Cc: Tom Lane <[email protected]>; pgsql-hackers; David Rowley *EXTERN* <[email protected]>
On Tue, Mar 15, 2022 at 3:19 PM Amit Langote <[email protected]> wrote:
> On Tue, Mar 15, 2022 at 5:06 AM Robert Haas <[email protected]> wrote:
> > On Mon, Mar 14, 2022 at 3:38 PM Tom Lane <[email protected]> wrote:
> > > Also, while I've not spent much time at all reading this patch,
> > > it seems rather desperately undercommented, and a lot of the
> > > new names are unintelligible. In particular, I suspect that the
> > > patch is significantly redesigning when/where run-time pruning
> > > happens (unless it's just letting that be run twice); but I don't
> > > see any documentation or name changes suggesting where that
> > > responsibility is now.
> >
> > I am sympathetic to that concern. I spent a while staring at a
> > baffling comment in 0001 only to discover it had just been moved from
> > elsewhere. I really don't feel that things in this are as clear as
> > they could be -- although I hasten to add that I respect the people
> > who have done work in this area previously and am grateful for what
> > they did. It's been a huge benefit to the project in spite of the
> > bumps in the road. Moreover, this isn't the only code in PostgreSQL
> > that needs improvement, or the worst. That said, I do think there are
> > problems. I don't yet have a position on whether this patch is making
> > that better or worse.
>
> Okay, I'd like to post a new version with the comments edited to make
> them a bit more intelligible. I understand that the comments around
> the new invocation mode(s) of runtime pruning are not as clear as they
> should be, especially as the changes that this patch wants to make to
> how things work are not very localized.
Actually, another area where the comments may not be as clear as they
should have been is the changes that the patch makes to the
AcquireExecutorLocks() logic that decides which relations are locked
to safeguard the plan tree for execution, which are those given by
RTE_RELATION entries in the range table.
Without the patch, they are found by actually scanning the range table.
With the patch, it's the same set of RTEs if the plan doesn't contain
any pruning nodes, though instead of the range table, what is scanned
is a bitmapset of their RT indexes that is made available by the
planner in the form of PlannedStmt.lockrels. When the plan does
contain a pruning node (PlannedStmt.containsInitialPruning), the
bitmapset is constructed by calling ExecutorGetLockRels() on the plan
tree, which walks it to add RT indexes of relations mentioned in the
Scan nodes, while skipping any nodes that are pruned after performing
initial pruning steps that may be present in their containing parent
node's PartitionPruneInfo. The RT indexes of partitioned tables
that are present in the PartitionPruneInfo itself are also added to
the set.
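
As a rough illustration of that walk, here is a self-contained toy version in C. It is only a sketch under simplifying assumptions: ToyPlanNode and toy_get_lock_rels are hypothetical names, the lock set is a 64-bit mask rather than a Bitmapset, and the per-child pruning outcome is supplied as a ready-made array instead of being computed from a PartitionPruneInfo.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { TOY_SCAN, TOY_APPEND } ToyNodeKind;

/* Hypothetical, heavily simplified plan node. */
typedef struct ToyPlanNode
{
    ToyNodeKind kind;
    int         scanrelid;      /* TOY_SCAN: RT index, 1..63 */
    int         nchildren;      /* TOY_APPEND: number of child subplans */
    struct ToyPlanNode **children;
    const bool *pruned;         /* TOY_APPEND: initial pruning outcome per
                                 * child, or NULL if the node has none */
    uint64_t    parted_relids;  /* TOY_APPEND: partitioned tables named in
                                 * the node's pruning info, as a bit mask */
} ToyPlanNode;

/*
 * Return the set of RT indexes to lock, as a bit mask.  Children that
 * initial pruning has eliminated are skipped entirely, so the relations
 * below them never enter the set.
 */
uint64_t
toy_get_lock_rels(const ToyPlanNode *node)
{
    uint64_t    result = 0;

    if (node == NULL)
        return 0;

    if (node->kind == TOY_SCAN)
    {
        assert(node->scanrelid >= 1 && node->scanrelid < 64);
        return UINT64_C(1) << node->scanrelid;
    }

    /* Partitioned tables mentioned by the pruning info are always locked. */
    result |= node->parted_relids;

    for (int i = 0; i < node->nchildren; i++)
    {
        if (node->pruned != NULL && node->pruned[i])
            continue;           /* pruned: do not descend, do not lock */
        result |= toy_get_lock_rels(node->children[i]);
    }
    return result;
}
```

With three leaf scans under one Append and two of them pruned, only the surviving leaf and the partitioned parent end up in the set; with no pruning information, all leaves are included.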
While expanding comments added by the patch to make this clear, I
realized that there are two problems, one of them quite glaring:
* Planner's constructing this bitmapset and its copying along with the
PlannedStmt is pure overhead in the cases that this patch has nothing
to do with, which is the kind of thing that Andres cautioned against
upthread.
* Not all partitioned tables that would have been locked without the
patch to come up with an Append/MergeAppend plan may be returned by
ExecutorGetLockRels(). For example, if none of the query's
runtime-prunable quals were found to match the partition key of an
intermediate partitioned table, that partitioned table is not
included in the PartitionPruneInfo. Or, if an Append/MergeAppend
covering a partition tree doesn't contain any PartitionPruneInfo to
begin with, only the leaf partitions and none of the partitioned
parents would be accounted for by the ExecutorGetLockRels() logic.
The 1st one seems easy to fix by not inventing PlannedStmt.lockrels
and just doing what's being done now: scan the range table if
(!PlannedStmt.containsInitialPruning).
The only way perhaps to fix the second one is to reconsider the
decision we made in the following commit:
commit 52ed730d511b7b1147f2851a7295ef1fb5273776
Author: Tom Lane <[email protected]>
Date: Sun Oct 7 14:33:17 2018 -0400
Remove some unnecessary fields from Plan trees.
In the wake of commit f2343653f, we no longer need some fields that
were used before to control executor lock acquisitions:
* PlannedStmt.nonleafResultRelations can go away entirely.
* partitioned_rels can go away from Append, MergeAppend, and ModifyTable.
However, ModifyTable still needs to know the RT index of the partition
root table if any, which was formerly kept in the first entry of that
list. Add a new field "rootRelation" to remember that. rootRelation is
partly redundant with nominalRelation, in that if it's set it will have
the same value as nominalRelation. However, the latter field has a
different purpose so it seems best to keep them distinct.
That is, add back the partitioned_rels field, at least to Append and
MergeAppend, to store the RT indexes of partitioned tables whose
children's paths are present in Append/MergeAppend.subpaths.
Thoughts?
--
Amit Langote
EDB: http://www.enterprisedb.com
* Re: generic plans and "initial" pruning
@ 2022-03-28 07:17 Amit Langote <[email protected]>
parent: Amit Langote <[email protected]>
From: Amit Langote @ 2022-03-28 07:17 UTC (permalink / raw)
To: Robert Haas <[email protected]>; +Cc: Tom Lane <[email protected]>; pgsql-hackers; David Rowley *EXTERN* <[email protected]>
On Tue, Mar 22, 2022 at 9:44 PM Amit Langote <[email protected]> wrote:
> On Tue, Mar 15, 2022 at 3:19 PM Amit Langote <[email protected]> wrote:
> > On Tue, Mar 15, 2022 at 5:06 AM Robert Haas <[email protected]> wrote:
> > > On Mon, Mar 14, 2022 at 3:38 PM Tom Lane <[email protected]> wrote:
> > > > Also, while I've not spent much time at all reading this patch,
> > > > it seems rather desperately undercommented, and a lot of the
> > > > new names are unintelligible. In particular, I suspect that the
> > > > patch is significantly redesigning when/where run-time pruning
> > > > happens (unless it's just letting that be run twice); but I don't
> > > > see any documentation or name changes suggesting where that
> > > > responsibility is now.
> > >
> > > I am sympathetic to that concern. I spent a while staring at a
> > > baffling comment in 0001 only to discover it had just been moved from
> > > elsewhere. I really don't feel that things in this are as clear as
> > > they could be -- although I hasten to add that I respect the people
> > > who have done work in this area previously and am grateful for what
> > > they did. It's been a huge benefit to the project in spite of the
> > > bumps in the road. Moreover, this isn't the only code in PostgreSQL
> > > that needs improvement, or the worst. That said, I do think there are
> > > problems. I don't yet have a position on whether this patch is making
> > > that better or worse.
> >
> > Okay, I'd like to post a new version with the comments edited to make
> > them a bit more intelligible. I understand that the comments around
> > the new invocation mode(s) of runtime pruning are not as clear as they
> > should be, especially as the changes that this patch wants to make to
> > how things work are not very localized.
>
> Actually, another area where the comments may not be as clear as they
> should have been is the changes that the patch makes to the
> AcquireExecutorLocks() logic that decides which relations are locked
> to safeguard the plan tree for execution, which are those given by
> RTE_RELATION entries in the range table.
>
> Without the patch, they are found by actually scanning the range table.
>
> With the patch, it's the same set of RTEs if the plan doesn't contain
> any pruning nodes, though instead of the range table, what is scanned
> is a bitmapset of their RT indexes that is made available by the
> planner in the form of PlannedStmt.lockrels. When the plan does
> contain a pruning node (PlannedStmt.containsInitialPruning), the
> bitmapset is constructed by calling ExecutorGetLockRels() on the plan
> tree, which walks it to add RT indexes of relations mentioned in the
> Scan nodes, while skipping any nodes that are pruned after performing
> initial pruning steps that may be present in their containing parent
> node's PartitionPruneInfo. The RT indexes of partitioned tables
> that are present in the PartitionPruneInfo itself are also added to
> the set.
>
> While expanding comments added by the patch to make this clear, I
> realized that there are two problems, one of them quite glaring:
>
> * Planner's constructing this bitmapset and its copying along with the
> PlannedStmt is pure overhead in the cases that this patch has nothing
> to do with, which is the kind of thing that Andres cautioned against
> upthread.
>
> * Not all partitioned tables that would have been locked without the
> patch to come up with an Append/MergeAppend plan may be returned by
> ExecutorGetLockRels(). For example, if none of the query's
> runtime-prunable quals were found to match the partition key of an
> intermediate partitioned table, that partitioned table is not
> included in the PartitionPruneInfo. Or, if an Append/MergeAppend
> covering a partition tree doesn't contain any PartitionPruneInfo to
> begin with, only the leaf partitions and none of the partitioned
> parents would be accounted for by the ExecutorGetLockRels() logic.
>
> The 1st one seems easy to fix by not inventing PlannedStmt.lockrels
> and just doing what's being done now: scan the range table if
> (!PlannedStmt.containsInitialPruning).
The attached updated patch does it like this.
> The only way perhaps to fix the second one is to reconsider the
> decision we made in the following commit:
>
> commit 52ed730d511b7b1147f2851a7295ef1fb5273776
> Author: Tom Lane <[email protected]>
> Date: Sun Oct 7 14:33:17 2018 -0400
>
> Remove some unnecessary fields from Plan trees.
>
> In the wake of commit f2343653f, we no longer need some fields that
> were used before to control executor lock acquisitions:
>
> * PlannedStmt.nonleafResultRelations can go away entirely.
>
> * partitioned_rels can go away from Append, MergeAppend, and ModifyTable.
> However, ModifyTable still needs to know the RT index of the partition
> root table if any, which was formerly kept in the first entry of that
> list. Add a new field "rootRelation" to remember that. rootRelation is
> partly redundant with nominalRelation, in that if it's set it will have
> the same value as nominalRelation. However, the latter field has a
> different purpose so it seems best to keep them distinct.
>
> That is, add back the partitioned_rels field, at least to Append and
> MergeAppend, to store the RT indexes of partitioned tables whose
> children's paths are present in Append/MergeAppend.subpaths.
And implemented this in the attached 0002 that reintroduces
partitioned_rels in Append/MergeAppend nodes as a bitmapset of RT
indexes. The set contains the RT indexes of partitioned ancestors
whose expansion produced the leaf partitions that a given
Append/MergeAppend node scans. This project needs a way of
knowing the partitioned tables involved in producing an
Append/MergeAppend node, because we'd like to give plancache.c the
ability to glean the set of relations to be locked by scanning the
plan tree, rather than the range table, to make the tree ready for
execution, and the only relations missing from the tree right now
are partitioned tables.
One fly-in-the-ointment situation I faced when doing that is the fact
that setrefs.c in most situations removes the Append/MergeAppend from
the final plan if it contains only one child subplan. I got around it
by inventing a PlannerGlobal/PlannedStmt.elidedAppendPartedRels set
which is a union of partitioned_rels of all the Append/MergeAppend
nodes in the plan tree that were removed as described.
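
The bookkeeping for elided nodes can be sketched as below. This is an illustration only: ToyPlan, ToyGlobal, and toy_maybe_elide_append are hypothetical names, the sets are plain 64-bit masks instead of Bitmapsets, and the parallel_aware check that the real setrefs.c code also performs is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified plan node. */
typedef struct ToyPlan
{
    bool        is_append;
    int         nchildren;
    struct ToyPlan **children;
    uint64_t    partitioned_rels;   /* RT indexes of partitioned ancestors */
} ToyPlan;

/* Hypothetical stand-in for planner-global state. */
typedef struct ToyGlobal
{
    uint64_t    elided_parted_rels; /* union over all removed Appends */
} ToyGlobal;

/*
 * If the Append has exactly one child, drop the Append from the plan and
 * return the child instead.  Before doing so, save the Append's
 * partitioned_rels in the global set so that the partitioned tables remain
 * discoverable after the node that carried them is gone.
 */
ToyPlan *
toy_maybe_elide_append(ToyGlobal *glob, ToyPlan *plan)
{
    if (plan->is_append && plan->nchildren == 1)
    {
        glob->elided_parted_rels |= plan->partitioned_rels;
        return plan->children[0];
    }
    return plan;
}
```

An Append with two or more children is returned unchanged and contributes nothing to the global set, since its own partitioned_rels field is still reachable from the plan tree.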
Other than the changes mentioned above, the updated patch now contains
a bit more commentary than earlier versions, mostly around
AcquireExecutorLocks()'s new way of determining the set of relations
to lock and the significantly redesigned working of the "initial"
execution pruning.
--
Amit Langote
EDB: http://www.enterprisedb.com
Attachments:
[application/x-patch] v6-0003-Add-a-plan_tree_walker.patch (3.9K, 2-v6-0003-Add-a-plan_tree_walker.patch)
From 47a00a6b8cf695e5890fc6555e2df2980eb2115b Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Thu, 3 Mar 2022 16:04:13 +0900
Subject: [PATCH v6 3/4] Add a plan_tree_walker()
Like planstate_tree_walker() but for uninitialized plan trees.
---
src/backend/nodes/nodeFuncs.c | 116 ++++++++++++++++++++++++++++++++++
src/include/nodes/nodeFuncs.h | 3 +
2 files changed, 119 insertions(+)
diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c
index ec25aae6e3..c16f9c6b40 100644
--- a/src/backend/nodes/nodeFuncs.c
+++ b/src/backend/nodes/nodeFuncs.c
@@ -31,6 +31,10 @@ static bool planstate_walk_subplans(List *plans, bool (*walker) (),
void *context);
static bool planstate_walk_members(PlanState **planstates, int nplans,
bool (*walker) (), void *context);
+static bool plan_walk_subplans(List *plans,
+ bool (*walker) (),
+ void *context);
+static bool plan_walk_members(List *plans, bool (*walker) (), void *context);
/*
@@ -4150,3 +4154,115 @@ planstate_walk_members(PlanState **planstates, int nplans,
return false;
}
+
+/*
+ * plan_tree_walker --- walk plantrees
+ *
+ * The walker has already visited the current node, and so we need only
+ * recurse into any sub-nodes it has.
+ */
+bool
+plan_tree_walker(Plan *plan,
+ bool (*walker) (),
+ void *context)
+{
+ /* Guard against stack overflow due to overly complex plan trees */
+ check_stack_depth();
+
+ /* initPlan-s */
+ if (plan_walk_subplans(plan->initPlan, walker, context))
+ return true;
+
+ /* lefttree */
+ if (outerPlan(plan))
+ {
+ if (walker(outerPlan(plan), context))
+ return true;
+ }
+
+ /* righttree */
+ if (innerPlan(plan))
+ {
+ if (walker(innerPlan(plan), context))
+ return true;
+ }
+
+ /* special child plans */
+ switch (nodeTag(plan))
+ {
+ case T_Append:
+ if (plan_walk_members(((Append *) plan)->appendplans,
+ walker, context))
+ return true;
+ break;
+ case T_MergeAppend:
+ if (plan_walk_members(((MergeAppend *) plan)->mergeplans,
+ walker, context))
+ return true;
+ break;
+ case T_BitmapAnd:
+ if (plan_walk_members(((BitmapAnd *) plan)->bitmapplans,
+ walker, context))
+ return true;
+ break;
+ case T_BitmapOr:
+ if (plan_walk_members(((BitmapOr *) plan)->bitmapplans,
+ walker, context))
+ return true;
+ break;
+ case T_CustomScan:
+ if (plan_walk_members(((CustomScan *) plan)->custom_plans,
+ walker, context))
+ return true;
+ break;
+ case T_SubqueryScan:
+ if (walker(((SubqueryScan *) plan)->subplan, context))
+ return true;
+ break;
+ default:
+ break;
+ }
+
+ return false;
+}
+
+/*
+ * Walk a list of SubPlans (or initPlans, which also use SubPlan nodes).
+ */
+static bool
+plan_walk_subplans(List *plans,
+ bool (*walker) (),
+ void *context)
+{
+ ListCell *lc;
+ PlannedStmt *plannedstmt = (PlannedStmt *) context;
+
+ foreach(lc, plans)
+ {
+ SubPlan *sp = lfirst_node(SubPlan, lc);
+ Plan *p = list_nth(plannedstmt->subplans, sp->plan_id - 1);
+
+ if (walker(p, context))
+ return true;
+ }
+
+ return false;
+}
+
+/*
+ * Walk the constituent plans of a ModifyTable, Append, MergeAppend,
+ * BitmapAnd, or BitmapOr node.
+ */
+static bool
+plan_walk_members(List *plans, bool (*walker) (), void *context)
+{
+ ListCell *lc;
+
+ foreach(lc, plans)
+ {
+ if (walker(lfirst(lc), context))
+ return true;
+ }
+
+ return false;
+}
diff --git a/src/include/nodes/nodeFuncs.h b/src/include/nodes/nodeFuncs.h
index 93c60bde66..fca107ad65 100644
--- a/src/include/nodes/nodeFuncs.h
+++ b/src/include/nodes/nodeFuncs.h
@@ -158,5 +158,8 @@ extern bool raw_expression_tree_walker(Node *node, bool (*walker) (),
struct PlanState;
extern bool planstate_tree_walker(struct PlanState *planstate, bool (*walker) (),
void *context);
+struct Plan;
+extern bool plan_tree_walker(struct Plan *plan, bool (*walker) (),
+ void *context);
#endif /* NODEFUNCS_H */
--
2.24.1
[application/x-patch] v6-0002-Add-Merge-Append.partitioned_rels.patch (17.4K, 3-v6-0002-Add-Merge-Append.partitioned_rels.patch)
From 8c81237402922ebf82786f3ff34972a6a3cb8c03 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Thu, 24 Mar 2022 22:47:03 +0900
Subject: [PATCH v6 2/4] Add [Merge]Append.partitioned_rels
To record the RT indexes of all partitioned ancestors leading up to
leaf partitions that are appended by the node.
If a given [Merge]Append node is left out from the plan due to there
being only one element in its list of child subplans, then its
partitioned_rels set is added to PlannerGlobal.elidedAppendPartedRels
that is passed down to the executor through PlannedStmt.
There are no users for partitioned_rels and elidedAppendPartedRels
as of this commit, though a later commit will require the ability
to extract the set of relations that must be locked to make a plan
tree safe for execution by walking the plan tree itself, so having
the partitioned tables be also present in the plan tree will be
helpful. Note that currently the executor relies on the fact that
the set of relations to be locked can be obtained by simply scanning
the range table that's made available in PlannedStmt along with the
plan tree.
---
src/backend/nodes/copyfuncs.c | 3 +++
src/backend/nodes/outfuncs.c | 5 +++++
src/backend/nodes/readfuncs.c | 3 +++
src/backend/optimizer/path/joinrels.c | 9 ++++++++
src/backend/optimizer/plan/createplan.c | 18 +++++++++++++++-
src/backend/optimizer/plan/planner.c | 8 +++++++
src/backend/optimizer/plan/setrefs.c | 28 +++++++++++++++++++++++++
src/backend/optimizer/util/inherit.c | 16 ++++++++++++++
src/backend/optimizer/util/relnode.c | 20 ++++++++++++++++++
src/include/nodes/pathnodes.h | 22 +++++++++++++++++++
src/include/nodes/plannodes.h | 17 +++++++++++++++
11 files changed, 148 insertions(+), 1 deletion(-)
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 55f720a88f..dc68a12486 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -106,6 +106,7 @@ _copyPlannedStmt(const PlannedStmt *from)
COPY_NODE_FIELD(invalItems);
COPY_NODE_FIELD(paramExecTypes);
COPY_NODE_FIELD(utilityStmt);
+ COPY_BITMAPSET_FIELD(elidedAppendPartedRels);
COPY_LOCATION_FIELD(stmt_location);
COPY_SCALAR_FIELD(stmt_len);
@@ -253,6 +254,7 @@ _copyAppend(const Append *from)
COPY_SCALAR_FIELD(nasyncplans);
COPY_SCALAR_FIELD(first_partial_plan);
COPY_NODE_FIELD(part_prune_info);
+ COPY_BITMAPSET_FIELD(partitioned_rels);
return newnode;
}
@@ -281,6 +283,7 @@ _copyMergeAppend(const MergeAppend *from)
COPY_POINTER_FIELD(collations, from->numCols * sizeof(Oid));
COPY_POINTER_FIELD(nullsFirst, from->numCols * sizeof(bool));
COPY_NODE_FIELD(part_prune_info);
+ COPY_BITMAPSET_FIELD(partitioned_rels);
return newnode;
}
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 6bdad462c7..bc178d53bf 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -324,6 +324,7 @@ _outPlannedStmt(StringInfo str, const PlannedStmt *node)
WRITE_NODE_FIELD(invalItems);
WRITE_NODE_FIELD(paramExecTypes);
WRITE_NODE_FIELD(utilityStmt);
+ WRITE_BITMAPSET_FIELD(elidedAppendPartedRels);
WRITE_LOCATION_FIELD(stmt_location);
WRITE_INT_FIELD(stmt_len);
}
@@ -443,6 +444,7 @@ _outAppend(StringInfo str, const Append *node)
WRITE_INT_FIELD(nasyncplans);
WRITE_INT_FIELD(first_partial_plan);
WRITE_NODE_FIELD(part_prune_info);
+ WRITE_BITMAPSET_FIELD(partitioned_rels);
}
static void
@@ -460,6 +462,7 @@ _outMergeAppend(StringInfo str, const MergeAppend *node)
WRITE_OID_ARRAY(collations, node->numCols);
WRITE_BOOL_ARRAY(nullsFirst, node->numCols);
WRITE_NODE_FIELD(part_prune_info);
+ WRITE_BITMAPSET_FIELD(partitioned_rels);
}
static void
@@ -2288,6 +2291,7 @@ _outPlannerGlobal(StringInfo str, const PlannerGlobal *node)
WRITE_BOOL_FIELD(parallelModeOK);
WRITE_BOOL_FIELD(parallelModeNeeded);
WRITE_CHAR_FIELD(maxParallelHazard);
+ WRITE_BITMAPSET_FIELD(elidedAppendPartedRels);
}
static void
@@ -2399,6 +2403,7 @@ _outRelOptInfo(StringInfo str, const RelOptInfo *node)
WRITE_BOOL_FIELD(partbounds_merged);
WRITE_BITMAPSET_FIELD(live_parts);
WRITE_BITMAPSET_FIELD(all_partrels);
+ WRITE_BITMAPSET_FIELD(partitioned_rels);
}
static void
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 3f68f7c18d..3c673c42d5 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -1597,6 +1597,7 @@ _readPlannedStmt(void)
READ_NODE_FIELD(invalItems);
READ_NODE_FIELD(paramExecTypes);
READ_NODE_FIELD(utilityStmt);
+ READ_BITMAPSET_FIELD(elidedAppendPartedRels);
READ_LOCATION_FIELD(stmt_location);
READ_INT_FIELD(stmt_len);
@@ -1719,6 +1720,7 @@ _readAppend(void)
READ_INT_FIELD(nasyncplans);
READ_INT_FIELD(first_partial_plan);
READ_NODE_FIELD(part_prune_info);
+ READ_BITMAPSET_FIELD(partitioned_rels);
READ_DONE();
}
@@ -1741,6 +1743,7 @@ _readMergeAppend(void)
READ_OID_ARRAY(collations, local_node->numCols);
READ_BOOL_ARRAY(nullsFirst, local_node->numCols);
READ_NODE_FIELD(part_prune_info);
+ READ_BITMAPSET_FIELD(partitioned_rels);
READ_DONE();
}
diff --git a/src/backend/optimizer/path/joinrels.c b/src/backend/optimizer/path/joinrels.c
index 9da3ff2f9a..e74d40fee3 100644
--- a/src/backend/optimizer/path/joinrels.c
+++ b/src/backend/optimizer/path/joinrels.c
@@ -1549,6 +1549,15 @@ try_partitionwise_join(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
populate_joinrel_with_paths(root, child_rel1, child_rel2,
child_joinrel, child_sjinfo,
child_restrictlist);
+
+ /*
+ * A parent relation's partitioned_rels must be a superset of the sets
+ * of all its children, direct or indirect, so bubble up the child
+ * joinrel's set.
+ */
+ joinrel->partitioned_rels =
+ bms_add_members(joinrel->partitioned_rels,
+ child_joinrel->partitioned_rels);
}
}
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index fa069a217c..0026086591 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -26,10 +26,12 @@
#include "nodes/extensible.h"
#include "nodes/makefuncs.h"
#include "nodes/nodeFuncs.h"
+#include "optimizer/appendinfo.h"
#include "optimizer/clauses.h"
#include "optimizer/cost.h"
#include "optimizer/optimizer.h"
#include "optimizer/paramassign.h"
+#include "optimizer/pathnode.h"
#include "optimizer/paths.h"
#include "optimizer/placeholder.h"
#include "optimizer/plancat.h"
@@ -1331,11 +1333,11 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
best_path->subpaths,
prunequal);
}
-
plan->appendplans = subplans;
plan->nasyncplans = nasyncplans;
plan->first_partial_plan = best_path->first_partial_path;
plan->part_prune_info = partpruneinfo;
+ plan->partitioned_rels = bms_copy(rel->partitioned_rels);
copy_generic_path_info(&plan->plan, (Path *) best_path);
@@ -1499,6 +1501,20 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
node->mergeplans = subplans;
node->part_prune_info = partpruneinfo;
+ /*
+ * We need to explicitly add to the plan node the RT indexes of any
+ * partitioned tables whose partitions will be scanned by the nodes in
+ * 'subplans'. There can be multiple RT indexes in the set due to the
+ * partition tree being multi-level and/or this being a plan for UNION ALL
+ * over multiple partition trees. Along with scanrelids of leaf-level Scan
+ * nodes, this allows the executor to lock the full set of relations being
+ * scanned by this node.
+ *
+ * Note that 'apprelids' only contains the top-level base relation(s), so
+ * is not sufficient for the purpose.
+ */
+ node->partitioned_rels = bms_copy(rel->partitioned_rels);
+
/*
* If prepare_sort_from_pathkeys added sort columns, but we were told to
* produce either the exact tlist or a narrow tlist, we should get rid of
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index bd09f85aea..374a9d9753 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -529,6 +529,7 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
result->paramExecTypes = glob->paramExecTypes;
/* utilityStmt should be null, but we might as well copy it */
result->utilityStmt = parse->utilityStmt;
+ result->elidedAppendPartedRels = glob->elidedAppendPartedRels;
result->stmt_location = parse->stmt_location;
result->stmt_len = parse->stmt_len;
@@ -7365,6 +7366,13 @@ create_partitionwise_grouping_paths(PlannerInfo *root,
add_paths_to_append_rel(root, grouped_rel, grouped_live_children);
}
+
+ /*
+ * Input rel might be a partitioned appendrel, though grouped_rel has at
+ * this point taken its role as the appendrel owning the former's
+ * children, so copy the former's partitioned_rels set into the latter.
+ */
+ grouped_rel->partitioned_rels = bms_copy(input_rel->partitioned_rels);
}
/*
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index a7b11b7f03..dbdeb8ec9d 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -1512,6 +1512,10 @@ set_append_references(PlannerInfo *root,
lfirst(l) = set_plan_refs(root, (Plan *) lfirst(l), rtoffset);
}
+ /* Fix up partitioned_rels before possibly removing the Append below. */
+ aplan->partitioned_rels = offset_relid_set(aplan->partitioned_rels,
+ rtoffset);
+
/*
* See if it's safe to get rid of the Append entirely. For this to be
* safe, there must be only one child plan and that child plan's parallel
@@ -1522,8 +1526,17 @@ set_append_references(PlannerInfo *root,
*/
if (list_length(aplan->appendplans) == 1 &&
((Plan *) linitial(aplan->appendplans))->parallel_aware == aplan->plan.parallel_aware)
+ {
+ /*
+ * Partitioned tables involved, if any, must be made known to the
+ * executor.
+ */
+ root->glob->elidedAppendPartedRels =
+ bms_add_members(root->glob->elidedAppendPartedRels,
+ aplan->partitioned_rels);
return clean_up_removed_plan_level((Plan *) aplan,
(Plan *) linitial(aplan->appendplans));
+ }
/*
* Otherwise, clean up the Append as needed. It's okay to do this after
@@ -1584,6 +1597,12 @@ set_mergeappend_references(PlannerInfo *root,
lfirst(l) = set_plan_refs(root, (Plan *) lfirst(l), rtoffset);
}
+ /*
+ * Fix up partitioned_rels before possibly removing the MergeAppend below.
+ */
+ mplan->partitioned_rels = offset_relid_set(mplan->partitioned_rels,
+ rtoffset);
+
/*
* See if it's safe to get rid of the MergeAppend entirely. For this to
* be safe, there must be only one child plan and that child plan's
@@ -1594,8 +1613,17 @@ set_mergeappend_references(PlannerInfo *root,
*/
if (list_length(mplan->mergeplans) == 1 &&
((Plan *) linitial(mplan->mergeplans))->parallel_aware == mplan->plan.parallel_aware)
+ {
+ /*
+ * Partitioned tables involved, if any, must be made known to the
+ * executor.
+ */
+ root->glob->elidedAppendPartedRels =
+ bms_add_members(root->glob->elidedAppendPartedRels,
+ mplan->partitioned_rels);
return clean_up_removed_plan_level((Plan *) mplan,
(Plan *) linitial(mplan->mergeplans));
+ }
/*
* Otherwise, clean up the MergeAppend as needed. It's okay to do this
diff --git a/src/backend/optimizer/util/inherit.c b/src/backend/optimizer/util/inherit.c
index 7e134822f3..56912e4101 100644
--- a/src/backend/optimizer/util/inherit.c
+++ b/src/backend/optimizer/util/inherit.c
@@ -406,6 +406,14 @@ expand_partitioned_rtentry(PlannerInfo *root, RelOptInfo *relinfo,
childrte, childRTindex,
childrel, top_parentrc, lockmode);
+ /*
+ * A parent relation's partitioned_rels must be a superset of the sets
+ * of all its children, direct or indirect, so bubble up the child
+ * rel's set.
+ */
+ relinfo->partitioned_rels = bms_add_members(relinfo->partitioned_rels,
+ childrelinfo->partitioned_rels);
+
/* Close child relation, but keep locks */
table_close(childrel, NoLock);
}
@@ -737,6 +745,14 @@ expand_appendrel_subquery(PlannerInfo *root, RelOptInfo *rel,
/* Child may itself be an inherited rel, either table or subquery. */
if (childrte->inh)
expand_inherited_rtentry(root, childrel, childrte, childRTindex);
+
+ /*
+ * A parent relation's partitioned_rels must be a superset of the sets
+ * of all its children, direct or indirect, so bubble up the child
+ * rel's set.
+ */
+ rel->partitioned_rels = bms_add_members(rel->partitioned_rels,
+ childrel->partitioned_rels);
}
}
diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c
index 520409f4ba..1d082a8fdd 100644
--- a/src/backend/optimizer/util/relnode.c
+++ b/src/backend/optimizer/util/relnode.c
@@ -361,6 +361,10 @@ build_simple_rel(PlannerInfo *root, int relid, RelOptInfo *parent)
}
}
+ /* A partitioned appendrel. */
+ if (rel->part_scheme != NULL)
+ rel->partitioned_rels = bms_copy(rel->relids);
+
/* Save the finished struct in the query's simple_rel_array */
root->simple_rel_array[relid] = rel;
@@ -729,6 +733,14 @@ build_join_rel(PlannerInfo *root,
set_joinrel_size_estimates(root, joinrel, outer_rel, inner_rel,
sjinfo, restrictlist);
+ /*
+ * The joinrel may get processed as an appendrel via partitionwise join
+ * if both outer and inner rels are partitioned, so set partitioned_rels
+ * appropriately.
+ */
+ joinrel->partitioned_rels = bms_union(outer_rel->partitioned_rels,
+ inner_rel->partitioned_rels);
+
/*
* Set the consider_parallel flag if this joinrel could potentially be
* scanned within a parallel worker. If this flag is false for either
@@ -897,6 +909,14 @@ build_child_join_rel(PlannerInfo *root, RelOptInfo *outer_rel,
set_joinrel_size_estimates(root, joinrel, outer_rel, inner_rel,
sjinfo, restrictlist);
+ /*
+ * The joinrel may get processed as an appendrel via partitionwise join
+ * if both outer and inner rels are partitioned, so set partitioned_rels
+ * appropriately.
+ */
+ joinrel->partitioned_rels = bms_union(outer_rel->partitioned_rels,
+ inner_rel->partitioned_rels);
+
/* We build the join only once. */
Assert(!find_join_rel(root, joinrel->relids));
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 1f3845b3fe..5327d9ba8b 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -130,6 +130,11 @@ typedef struct PlannerGlobal
char maxParallelHazard; /* worst PROPARALLEL hazard level */
PartitionDirectory partition_directory; /* partition descriptors */
+
+ Bitmapset *elidedAppendPartedRels; /* Combined partitioned_rels of all
+ * single-subplan [Merge]Append nodes
+ * that have been removed from the
+ * various plan trees. */
} PlannerGlobal;
/* macro for fetching the Plan associated with a SubPlan node */
@@ -773,6 +778,23 @@ typedef struct RelOptInfo
Relids all_partrels; /* Relids set of all partition relids */
List **partexprs; /* Non-nullable partition key expressions */
List **nullable_partexprs; /* Nullable partition key expressions */
+
+ /*
+ * For an appendrel parent relation (base, join, or upper) that is
+ * partitioned, this stores the RT indexes of all the partitioned ancestors
+ * including itself that lead up to the individual leaf partitions that
+ * will be scanned to produce this relation's output rows. The relid set
+ * is copied into the resulting Append or MergeAppend plan node for
+ * allowing the executor to take appropriate locks on those relations,
+ * unless the node is deemed useless in setrefs.c due to having a single
+ * leaf subplan and thus elided from the final plan, in which case, the set
+ * is added into PlannerGlobal.elidedAppendPartedRels.
+ *
+ * Note that 'apprelids' of those nodes only contains the top-level base
+ * relation(s), so is not sufficient for said purpose.
+ */
+
+ Bitmapset *partitioned_rels;
} RelOptInfo;
/*
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 0b518ce6b2..bd87c35d6c 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -85,6 +85,11 @@ typedef struct PlannedStmt
Node *utilityStmt; /* non-null if this is utility stmt */
+ Bitmapset *elidedAppendPartedRels; /* Combined partitioned_rels of all
+ * single-subplan [Merge]Append nodes
+ * that have been removed from the
+ * various plan trees. */
+
/* statement location in source string (copied from Query) */
int stmt_location; /* start location, or -1 if unknown */
int stmt_len; /* length in bytes; 0 means "rest of string" */
@@ -261,6 +266,12 @@ typedef struct Append
/* Info for run-time subplan pruning; NULL if we're not doing that */
struct PartitionPruneInfo *part_prune_info;
+
+ /*
+ * RT indexes of all partitioned parents whose partitions' plans are
+ * present in appendplans.
+ */
+ Bitmapset *partitioned_rels;
} Append;
/* ----------------
@@ -281,6 +292,12 @@ typedef struct MergeAppend
bool *nullsFirst; /* NULLS FIRST/LAST directions */
/* Info for run-time subplan pruning; NULL if we're not doing that */
struct PartitionPruneInfo *part_prune_info;
+
+ /*
+ * RT indexes of all partitioned parents whose partitions' plans are
+ * present in mergeplans.
+ */
+ Bitmapset *partitioned_rels;
} MergeAppend;
/* ----------------
--
2.24.1
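The planner-side bookkeeping in the patch above can be modeled in isolation. What follows is only a rough sketch under stated assumptions, not the patch's code: a 64-bit mask stands in for Bitmapset, and ToyRel and the toy_* functions are hypothetical names invented for illustration. It shows a partitioned rel seeding its own set, children's sets bubbling up into the parent, and the set of a single-subplan Append being offset and saved in a global "elided" set before the node is dropped. The parallel_aware check done by setrefs.c is deliberately ignored here.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for Bitmapset: bit i set means RT index i is a member. */
typedef uint64_t RelSet;

/* Hypothetical, heavily reduced model of a RelOptInfo. */
typedef struct ToyRel
{
    int         relid;
    RelSet      partitioned_rels;
} ToyRel;

/* Models the build_simple_rel() hunk: a partitioned rel starts with itself. */
static void
toy_build_simple_rel(ToyRel *rel, int relid, int is_partitioned)
{
    assert(relid > 0 && relid < 64);
    rel->relid = relid;
    rel->partitioned_rels = is_partitioned ? ((RelSet) 1 << relid) : 0;
}

/* Models the inherit.c hunks: a parent's set is a superset of its children's. */
static void
toy_bubble_up(ToyRel *parent, const ToyRel *child)
{
    parent->partitioned_rels |= child->partitioned_rels;
}

/*
 * Models the setrefs.c hunks: offset the set by rtoffset and, if the Append
 * has a single subplan and is therefore removed, remember the set globally.
 * Returns 1 if the Append was elided, 0 otherwise.
 */
static int
toy_set_append_references(const ToyRel *rel, int nsubplans, int rtoffset,
                          RelSet *elided)
{
    RelSet      shifted;

    assert(rtoffset >= 0 && rtoffset < 64);
    shifted = rel->partitioned_rels << rtoffset;

    if (nsubplans == 1)
    {
        *elided |= shifted;
        return 1;
    }
    return 0;
}
```

In this model, a two-level partition tree with RT indexes 1 (root) and 2 (sub-partitioned child) leaves the root with the set {1, 2} after bubbling up, which is the set the executor would need in order to lock both partitioned tables even when the Append itself is gone from the final plan.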
[application/x-patch] v6-0004-Optimize-AcquireExecutorLocks-to-skip-pruned-part.patch (94.2K, 4-v6-0004-Optimize-AcquireExecutorLocks-to-skip-pruned-part.patch)
From 5e076f58274f6cd05afc8533af130e165c9b862e Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Wed, 22 Dec 2021 16:55:17 +0900
Subject: [PATCH v6 4/4] Optimize AcquireExecutorLocks() to skip pruned
partitions
When the PlannedStmt indicates that some nodes in the plan tree can
do partition pruning without depending on execution having started
(so-called "initial" pruning), AcquireExecutorLocks() no longer
locks all relations listed in the range table. Instead, it calls
the new executor function ExecutorGetLockRels(), which returns the
set of relations (their RT indexes) to be locked, not including
those scanned by the subplans that were pruned.
The result of pruning done this way must be remembered and reused
during actual execution of the plan. This is done by creating a
PlanInitPruningOutput node for each plan node that undergoes
pruning; the set of those for the whole plan tree is added to an
ExecLockRelsInfo, which also stores the bitmapset of RT indexes of
the relations that are actually locked by AcquireExecutorLocks().
ExecLockRelsInfos are passed down the executor alongside the
PlannedStmts. This arrangement ensures that the executor doesn't
accidentally try to process plan tree subnodes that have been
deemed pruned by AcquireExecutorLocks().
---
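Editorial note: the lock-set computation described in the commit message can be modeled in isolation. This is a minimal sketch under stated assumptions, not the patch's implementation: a 64-bit mask stands in for Bitmapset, the pruning result is assumed to be already available as a per-node mask of surviving subplan indexes (indexed by plan_node_id), and ToyPlan, TOY_ALL_SUBPLANS and the toy_* functions are hypothetical names. It shows that the partitioned parents of an Append are always included, that relations under pruned subplans are skipped, and that the rels of elided Append nodes are added at the end.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for Bitmapset: bit i set means RT index i is a member. */
typedef uint64_t RelSet;

/* Hypothetical marker: no initial pruning was done, all subplans survive. */
#define TOY_ALL_SUBPLANS UINT64_MAX

typedef enum ToyTag { TOY_SCAN, TOY_APPEND } ToyTag;

/* Hypothetical, heavily reduced model of a Plan node. */
typedef struct ToyPlan
{
    ToyTag      tag;
    int         plan_node_id;       /* index into valid_subplans[] */
    int         scanrelid;          /* TOY_SCAN: RT index being scanned */
    RelSet      partitioned_rels;   /* TOY_APPEND: partitioned parents */
    int         nchildren;          /* TOY_APPEND: number of subplans */
    struct ToyPlan **children;      /* TOY_APPEND: the subplans */
} ToyPlan;

/* Collect the RT indexes to lock under 'node', skipping pruned subplans. */
static RelSet
toy_get_lock_rels(const ToyPlan *node, const uint64_t *valid_subplans)
{
    RelSet      lockrels;

    if (node == NULL)
        return 0;

    if (node->tag == TOY_SCAN)
    {
        assert(node->scanrelid > 0 && node->scanrelid < 64);
        return (RelSet) 1 << node->scanrelid;
    }

    /* Partitioned parents are locked whether or not children are pruned. */
    lockrels = node->partitioned_rels;
    assert(node->nchildren <= 64);
    for (int i = 0; i < node->nchildren; i++)
    {
        /* Descend only into subplans that survived initial pruning. */
        if (valid_subplans[node->plan_node_id] & ((uint64_t) 1 << i))
            lockrels |= toy_get_lock_rels(node->children[i], valid_subplans);
    }
    return lockrels;
}

/* Top level: also add partitioned rels of Append nodes elided by the planner. */
static RelSet
toy_executor_get_lock_rels(const ToyPlan *plan_tree,
                           const uint64_t *valid_subplans,
                           RelSet elided_append_parted_rels)
{
    return toy_get_lock_rels(plan_tree, valid_subplans) |
        elided_append_parted_rels;
}
```

In the real patch the pruning is performed during the walk itself and its result is saved for the executor to reuse; the model takes the result as an input only to keep the example short.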
src/backend/commands/copyto.c | 2 +-
src/backend/commands/createas.c | 2 +-
src/backend/commands/explain.c | 7 +-
src/backend/commands/extension.c | 13 +-
src/backend/commands/matview.c | 2 +-
src/backend/commands/portalcmds.c | 1 +
src/backend/commands/prepare.c | 17 +-
src/backend/executor/README | 24 +++
src/backend/executor/execMain.c | 202 ++++++++++++++++++++
src/backend/executor/execParallel.c | 26 ++-
src/backend/executor/execPartition.c | 224 ++++++++++++++++++----
src/backend/executor/execUtils.c | 8 +
src/backend/executor/functions.c | 2 +-
src/backend/executor/nodeAppend.c | 52 ++++-
src/backend/executor/nodeMergeAppend.c | 52 ++++-
src/backend/executor/nodeModifyTable.c | 25 +++
src/backend/executor/spi.c | 14 +-
src/backend/nodes/copyfuncs.c | 49 ++++-
src/backend/nodes/outfuncs.c | 39 ++++
src/backend/nodes/readfuncs.c | 37 ++++
src/backend/optimizer/plan/planner.c | 2 +
src/backend/optimizer/plan/setrefs.c | 6 +
src/backend/partitioning/partprune.c | 37 +++-
src/backend/tcop/postgres.c | 15 +-
src/backend/tcop/pquery.c | 21 ++-
src/backend/utils/cache/plancache.c | 252 ++++++++++++++++++++++---
src/backend/utils/mmgr/portalmem.c | 2 +
src/include/commands/explain.h | 3 +-
src/include/executor/execPartition.h | 2 +
src/include/executor/execdesc.h | 2 +
src/include/executor/executor.h | 2 +
src/include/executor/nodeAppend.h | 1 +
src/include/executor/nodeMergeAppend.h | 1 +
src/include/executor/nodeModifyTable.h | 1 +
src/include/nodes/execnodes.h | 96 ++++++++++
src/include/nodes/nodes.h | 5 +
src/include/nodes/pathnodes.h | 4 +
src/include/nodes/plannodes.h | 15 ++
src/include/tcop/tcopprot.h | 2 +-
src/include/utils/plancache.h | 6 +
src/include/utils/portal.h | 5 +
41 files changed, 1174 insertions(+), 104 deletions(-)
diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index 55c38b04c4..d403eb2309 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -542,7 +542,7 @@ BeginCopyTo(ParseState *pstate,
((DR_copy *) dest)->cstate = cstate;
/* Create a QueryDesc requesting no output */
- cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ cstate->queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
GetActiveSnapshot(),
InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 9abbb6b555..f6607f2454 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -325,7 +325,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 9f632285b6..1f1a44b9bb 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -407,7 +407,7 @@ ExplainOneQuery(Query *query, int cursorOptions,
}
/* run it (if needed) and produce output */
- ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+ ExplainOnePlan(plan, NULL, into, es, queryString, params, queryEnv,
&planduration, (es->buffers ? &bufusage : NULL));
}
}
@@ -515,7 +515,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
* to call it.
*/
void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, ExecLockRelsInfo *execlockrelsinfo,
+ IntoClause *into, ExplainState *es,
const char *queryString, ParamListInfo params,
QueryEnvironment *queryEnv, const instr_time *planduration,
const BufferUsage *bufusage)
@@ -563,7 +564,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
dest = None_Receiver;
/* Create a QueryDesc for the query */
- queryDesc = CreateQueryDesc(plannedstmt, queryString,
+ queryDesc = CreateQueryDesc(plannedstmt, execlockrelsinfo, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, instrument_option);
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index 1013790dbb..008b8ce0e9 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -741,8 +741,10 @@ execute_sql_string(const char *sql)
RawStmt *parsetree = lfirst_node(RawStmt, lc1);
MemoryContext per_parsetree_context,
oldcontext;
- List *stmt_list;
- ListCell *lc2;
+ List *stmt_list,
+ *execlockrelsinfo_list;
+ ListCell *lc2,
+ *lc3;
/*
* We do the work for each parsetree in a short-lived context, to
@@ -762,11 +764,13 @@ execute_sql_string(const char *sql)
NULL,
0,
NULL);
- stmt_list = pg_plan_queries(stmt_list, sql, CURSOR_OPT_PARALLEL_OK, NULL);
+ stmt_list = pg_plan_queries(stmt_list, sql, CURSOR_OPT_PARALLEL_OK, NULL,
+ &execlockrelsinfo_list);
- foreach(lc2, stmt_list)
+ forboth(lc2, stmt_list, lc3, execlockrelsinfo_list)
{
PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+ ExecLockRelsInfo *execlockrelsinfo = lfirst_node(ExecLockRelsInfo, lc3);
CommandCounterIncrement();
@@ -777,6 +781,7 @@ execute_sql_string(const char *sql)
QueryDesc *qdesc;
qdesc = CreateQueryDesc(stmt,
+ execlockrelsinfo,
sql,
GetActiveSnapshot(), NULL,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 05e7b60059..4ef44aaf23 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -416,7 +416,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, queryString,
+ queryDesc = CreateQueryDesc(plan, NULL, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/portalcmds.c b/src/backend/commands/portalcmds.c
index 9902c5c566..85e73ddded 100644
--- a/src/backend/commands/portalcmds.c
+++ b/src/backend/commands/portalcmds.c
@@ -107,6 +107,7 @@ PerformCursorOpen(ParseState *pstate, DeclareCursorStmt *cstmt, ParamListInfo pa
queryString,
CMDTAG_SELECT, /* cursor's query is always a SELECT */
list_make1(plan),
+ list_make1(NULL), /* no ExecLockRelsInfo to pass */
NULL);
/*----------
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 80738547ed..bbbf8bbcbd 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -155,6 +155,7 @@ ExecuteQuery(ParseState *pstate,
PreparedStatement *entry;
CachedPlan *cplan;
List *plan_list;
+ List *plan_execlockrelsinfo_list;
ParamListInfo paramLI = NULL;
EState *estate = NULL;
Portal portal;
@@ -195,6 +196,7 @@ ExecuteQuery(ParseState *pstate,
/* Replan if needed, and increment plan refcount for portal */
cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL);
plan_list = cplan->stmt_list;
+ plan_execlockrelsinfo_list = cplan->execlockrelsinfo_list;
/*
* DO NOT add any logic that could possibly throw an error between
@@ -204,7 +206,7 @@ ExecuteQuery(ParseState *pstate,
NULL,
query_string,
entry->plansource->commandTag,
- plan_list,
+ plan_list, plan_execlockrelsinfo_list,
cplan);
/*
@@ -576,7 +578,9 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
const char *query_string;
CachedPlan *cplan;
List *plan_list;
- ListCell *p;
+ List *plan_execlockrelsinfo_list;
+ ListCell *p,
+ *pe;
ParamListInfo paramLI = NULL;
EState *estate = NULL;
instr_time planstart;
@@ -632,15 +636,18 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
}
plan_list = cplan->stmt_list;
+ plan_execlockrelsinfo_list = cplan->execlockrelsinfo_list;
/* Explain each query */
- foreach(p, plan_list)
+ forboth(p, plan_list, pe, plan_execlockrelsinfo_list)
{
PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
+ ExecLockRelsInfo *execlockrelsinfo = lfirst_node(ExecLockRelsInfo, pe);
if (pstmt->commandType != CMD_UTILITY)
- ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv,
- &planduration, (es->buffers ? &bufusage : NULL));
+ ExplainOnePlan(pstmt, execlockrelsinfo, into, es, query_string,
+ paramLI, queryEnv, &planduration,
+ (es->buffers ? &bufusage : NULL));
else
ExplainOneUtility(pstmt->utilityStmt, into, es, query_string,
paramLI, queryEnv);
diff --git a/src/backend/executor/README b/src/backend/executor/README
index bf5e70860d..9720d0ac2c 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -65,6 +65,27 @@ found there. This currently only occurs for Append and MergeAppend nodes. In
this case the non-required subplans are ignored and the executor state's
subnode array will become out of sequence to the plan's subplan list.
+Actually, the so-called execution time pruning may also occur even before the
+execution has started. One case where that occurs is when a cached generic
+plan is being validated for execution by plancache.c: GetCachedPlan(), which
+proceeds by locking all the relations that will be scanned by that plan. If
+the generic plan has nodes that contain so-called initial pruning steps (a
+subset of execution pruning steps that do not depend on full-fledged execution
+having started), they are performed at this point to figure out the minimal
+set of child subplans that satisfy those pruning instructions and the result
+of performing that pruning is saved in a data structure that gets passed to
+the executor alongside the plan tree. Only the relations scanned by the
+surviving subplans are then locked; those scanned by the pruned subplans
+are not, even though the pruned subplans themselves are not removed from the
+plan tree. So, it is imperative that the executor and any third party code
+invoked by it that gets passed the plan tree look at the initial pruning result
+made available via the aforementioned data structure to determine whether or
+not a particular subplan is valid. (The data structure basically consists of
+an array of PlanInitPruningOutput nodes containing one element for each node
+of the plan tree indexable using plan_node_id of the individual plan nodes,
+where each node contains a bitmapset of indexes of unpruned child subplans of
+a given node.)
+
Each Plan node may have expression trees associated with it, to represent
its target list, qualification conditions, etc. These trees are also
read-only to the executor, but the executor state for expression evaluation
@@ -247,6 +268,9 @@ Query Processing Control Flow
This is a sketch of control flow for full query processing:
+ [ ExecutorGetLockRels ] --- an optional step to walk over the plan tree
+ to produce an ExecLockRelsInfo to be passed to CreateQueryDesc
+
CreateQueryDesc
ExecutorStart
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 473d2e00a2..1ddd1dfb83 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -49,11 +49,15 @@
#include "commands/matview.h"
#include "commands/trigger.h"
#include "executor/execdebug.h"
+#include "executor/nodeAppend.h"
+#include "executor/nodeMergeAppend.h"
+#include "executor/nodeModifyTable.h"
#include "executor/nodeSubplan.h"
#include "foreign/fdwapi.h"
#include "jit/jit.h"
#include "mb/pg_wchar.h"
#include "miscadmin.h"
+#include "nodes/nodeFuncs.h"
#include "parser/parsetree.h"
#include "storage/bufmgr.h"
#include "storage/lmgr.h"
@@ -101,9 +105,205 @@ static char *ExecBuildSlotValueDescription(Oid reloid,
Bitmapset *modifiedCols,
int maxfieldlen);
static void EvalPlanQualStart(EPQState *epqstate, Plan *planTree);
+static bool ExecGetScanLockRels(Scan *scan, ExecGetLockRelsContext *context);
/* end of local decls */
+/* ----------------------------------------------------------------
+ * ExecutorGetLockRels
+ *
+ * Figure out the minimal set of relations to lock to be able to safely
+ * execute a given plan
+ *
+ * This ignores the relations scanned by child subplans that are pruned away
+ * after performing initial pruning steps present in the plan using the
+ * provided set of EXTERN parameters.
+ *
+ * Along with the set of RT indexes of relations that must be locked, the
+ * returned struct also contains an array of PlanInitPruningOutput nodes each
+ * of which contains the result of initial pruning for a given plan node, which
+ * is basically a bitmapset of the indexes of surviving child subplans. Each
+ * plan node in the tree that undergoes pruning will have an element in the
+ * array.
+ *
+ * Note that while relations scanned by the subplans that are pruned will not
+ * be locked, the subplans themselves are left as-is in the plan tree, assuming
+ * anything that reads the plan tree during execution knows to ignore them by
+ * looking at the PlanInitPruningOutput's list of valid subplans.
+ *
+ * Partitioned tables mentioned in PartitionedRelPruneInfo nodes that drive
+ * the pruning will be locked before doing the pruning and also added to the
+ * returned set.
+ */
+ExecLockRelsInfo *
+ExecutorGetLockRels(PlannedStmt *plannedstmt, ParamListInfo params)
+{
+ int numPlanNodes = plannedstmt->numPlanNodes;
+ ExecGetLockRelsContext context;
+ ExecLockRelsInfo *result;
+ ListCell *lc;
+
+ /* Only get here if there is any pruning to do. */
+ Assert(plannedstmt->containsInitialPruning);
+
+ context.stmt = plannedstmt;
+ context.params = params;
+
+ /*
+ * Go walk all the plan tree(s) present in the PlannedStmt, filling
+ * context.lockrels with only the relations from plan nodes that
+ * survive initial pruning and also the tables mentioned in
+ * partitioned_rels sets found in the plan.
+ */
+ context.lockrels = NULL;
+ context.initPruningOutputs = NIL;
+ context.ipoIndexes = palloc0(sizeof(int) * numPlanNodes);
+
+ /* All the subplans. */
+ foreach(lc, plannedstmt->subplans)
+ {
+ Plan *subplan = lfirst(lc);
+
+ (void) ExecGetLockRels(subplan, &context);
+ }
+
+ /* And the main tree. */
+ (void) ExecGetLockRels(plannedstmt->planTree, &context);
+
+ /*
+ * Also be sure to lock partitioned relations from any [Merge]Append nodes
+ * that were originally present but were ultimately left out from the plan
+ * due to being deemed no-op nodes.
+ */
+ context.lockrels = bms_add_members(context.lockrels,
+ plannedstmt->elidedAppendPartedRels);
+
+ result = makeNode(ExecLockRelsInfo);
+ result->lockrels = context.lockrels;
+ result->numPlanNodes = numPlanNodes;
+ result->initPruningOutputs = context.initPruningOutputs;
+ result->ipoIndexes = context.ipoIndexes;
+
+ return result;
+}
+
+/* ------------------------------------------------------------------------
+ * ExecGetLockRels
+ * Adds all the relations that will be scanned by 'node' and its child
+ * plans to context->lockrels after taking into account the effect
+ * of performing initial pruning if any
+ *
+ * context->stmt gives the PlannedStmt being inspected to access the plan's
+ * range table if needed and context->params the set of EXTERN parameters
+ * available to evaluate pruning parameters.
+ *
+ * If initial pruning is done, a PlanInitPruningOutput node containing the
+ * result of pruning will be stored in context->initPruningOutputs that will
+ * be made available to the executor to reuse.
+ * ------------------------------------------------------------------------
+ */
+bool
+ExecGetLockRels(Plan *node, ExecGetLockRelsContext *context)
+{
+ /* Do nothing when we get to the end of a leaf on tree. */
+ if (node == NULL)
+ return true;
+
+ /* Make sure there's enough stack available. */
+ check_stack_depth();
+
+ switch (nodeTag(node))
+ {
+ /* Currently, only these two nodes have prunable child subplans. */
+ case T_Append:
+ if (ExecGetAppendLockRels((Append *) node, context))
+ return true;
+ break;
+ case T_MergeAppend:
+ if (ExecGetMergeAppendLockRels((MergeAppend *) node,
+ context))
+ return true;
+ break;
+
+ /*
+ * And these manipulate relations that must be added to context->lockrels.
+ */
+ case T_SeqScan:
+ case T_SampleScan:
+ case T_IndexScan:
+ case T_IndexOnlyScan:
+ case T_BitmapIndexScan:
+ case T_BitmapHeapScan:
+ case T_TidScan:
+ case T_TidRangeScan:
+ case T_ForeignScan:
+ case T_SubqueryScan:
+ case T_CustomScan:
+ if (ExecGetScanLockRels((Scan *) node, context))
+ return true;
+ break;
+ case T_ModifyTable:
+ if (ExecGetModifyTableLockRels((ModifyTable *) node, context))
+ return true;
+ /* plan_tree_walker() will visit the subplan (outerNode) */
+ break;
+
+ default:
+ break;
+ }
+
+ /* Recurse to subnodes. */
+ return plan_tree_walker(node, ExecGetLockRels, (void *) context);
+}
+
+/*
+ * ExecGetScanLockRels
+ * Do ExecGetLockRels()'s work for a leaf Scan node
+ */
+static bool
+ExecGetScanLockRels(Scan *scan, ExecGetLockRelsContext *context)
+{
+ switch (nodeTag(scan))
+ {
+ case T_ForeignScan:
+ {
+ ForeignScan *fscan = (ForeignScan *) scan;
+
+ context->lockrels = bms_add_members(context->lockrels,
+ fscan->fs_relids);
+ }
+ break;
+
+ case T_SubqueryScan:
+ {
+ SubqueryScan *sscan = (SubqueryScan *) scan;
+
+ (void) ExecGetLockRels((Plan *) sscan->subplan, context);
+ }
+ break;
+
+ case T_CustomScan:
+ {
+ CustomScan *cscan = (CustomScan *) scan;
+ ListCell *lc;
+
+ context->lockrels = bms_add_members(context->lockrels,
+ cscan->custom_relids);
+ foreach(lc, cscan->custom_plans)
+ {
+ (void) ExecGetLockRels((Plan *) lfirst(lc), context);
+ }
+ }
+ break;
+
+ default:
+ context->lockrels = bms_add_member(context->lockrels,
+ scan->scanrelid);
+ break;
+ }
+
+ return true;
+}
/* ----------------------------------------------------------------
* ExecutorStart
@@ -805,6 +1005,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
{
CmdType operation = queryDesc->operation;
PlannedStmt *plannedstmt = queryDesc->plannedstmt;
+ ExecLockRelsInfo *execlockrelsinfo = queryDesc->execlockrelsinfo;
Plan *plan = plannedstmt->planTree;
List *rangeTable = plannedstmt->rtable;
EState *estate = queryDesc->estate;
@@ -824,6 +1025,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
ExecInitRangeTable(estate, rangeTable);
estate->es_plannedstmt = plannedstmt;
+ estate->es_execlockrelsinfo = execlockrelsinfo;
/*
* Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index 5dd8ab7db2..02f2c27fdf 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -66,6 +66,7 @@
#define PARALLEL_KEY_QUERY_TEXT UINT64CONST(0xE000000000000008)
#define PARALLEL_KEY_JIT_INSTRUMENTATION UINT64CONST(0xE000000000000009)
#define PARALLEL_KEY_WAL_USAGE UINT64CONST(0xE00000000000000A)
+#define PARALLEL_KEY_EXECLOCKRELSINFO UINT64CONST(0xE00000000000000B)
#define PARALLEL_TUPLE_QUEUE_SIZE 65536
@@ -182,6 +183,7 @@ ExecSerializePlan(Plan *plan, EState *estate)
pstmt->transientPlan = false;
pstmt->dependsOnRole = false;
pstmt->parallelModeNeeded = false;
+ pstmt->containsInitialPruning = false;
pstmt->planTree = plan;
pstmt->rtable = estate->es_range_table;
pstmt->resultRelations = NIL;
@@ -596,12 +598,15 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
FixedParallelExecutorState *fpes;
char *pstmt_data;
char *pstmt_space;
+ char *execlockrelsinfo_data;
+ char *execlockrelsinfo_space;
char *paramlistinfo_space;
BufferUsage *bufusage_space;
WalUsage *walusage_space;
SharedExecutorInstrumentation *instrumentation = NULL;
SharedJitInstrumentation *jit_instrumentation = NULL;
int pstmt_len;
+ int execlockrelsinfo_len;
int paramlistinfo_len;
int instrumentation_len = 0;
int jit_instrumentation_len = 0;
@@ -630,6 +635,7 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
/* Fix up and serialize plan to be sent to workers. */
pstmt_data = ExecSerializePlan(planstate->plan, estate);
+ execlockrelsinfo_data = nodeToString(estate->es_execlockrelsinfo);
/* Create a parallel context. */
pcxt = CreateParallelContext("postgres", "ParallelQueryMain", nworkers);
@@ -656,6 +662,11 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
shm_toc_estimate_chunk(&pcxt->estimator, pstmt_len);
shm_toc_estimate_keys(&pcxt->estimator, 1);
+ /* Estimate space for serialized ExecLockRelsInfo. */
+ execlockrelsinfo_len = strlen(execlockrelsinfo_data) + 1;
+ shm_toc_estimate_chunk(&pcxt->estimator, execlockrelsinfo_len);
+ shm_toc_estimate_keys(&pcxt->estimator, 1);
+
/* Estimate space for serialized ParamListInfo. */
paramlistinfo_len = EstimateParamListSpace(estate->es_param_list_info);
shm_toc_estimate_chunk(&pcxt->estimator, paramlistinfo_len);
@@ -750,6 +761,12 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
memcpy(pstmt_space, pstmt_data, pstmt_len);
shm_toc_insert(pcxt->toc, PARALLEL_KEY_PLANNEDSTMT, pstmt_space);
+ /* Store serialized ExecLockRelsInfo */
+ execlockrelsinfo_space = shm_toc_allocate(pcxt->toc, execlockrelsinfo_len);
+ memcpy(execlockrelsinfo_space, execlockrelsinfo_data, execlockrelsinfo_len);
+ shm_toc_insert(pcxt->toc, PARALLEL_KEY_EXECLOCKRELSINFO,
+ execlockrelsinfo_space);
+
/* Store serialized ParamListInfo. */
paramlistinfo_space = shm_toc_allocate(pcxt->toc, paramlistinfo_len);
shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARAMLISTINFO, paramlistinfo_space);
@@ -1231,8 +1248,10 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
int instrument_options)
{
char *pstmtspace;
+ char *execlockrelsinfospace;
char *paramspace;
PlannedStmt *pstmt;
+ ExecLockRelsInfo *execlockrelsinfo;
ParamListInfo paramLI;
char *queryString;
@@ -1243,12 +1262,17 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
pstmtspace = shm_toc_lookup(toc, PARALLEL_KEY_PLANNEDSTMT, false);
pstmt = (PlannedStmt *) stringToNode(pstmtspace);
+ /* Reconstruct leader-supplied ExecLockRelsInfo. */
+ execlockrelsinfospace = shm_toc_lookup(toc, PARALLEL_KEY_EXECLOCKRELSINFO,
+ false);
+ execlockrelsinfo = (ExecLockRelsInfo *) stringToNode(execlockrelsinfospace);
+
/* Reconstruct ParamListInfo. */
paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
paramLI = RestoreParamList(&paramspace);
/* Create a QueryDesc for the query. */
- return CreateQueryDesc(pstmt,
+ return CreateQueryDesc(pstmt, execlockrelsinfo,
queryString,
GetActiveSnapshot(), InvalidSnapshot,
receiver, paramLI, NULL, instrument_options);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 7ff5a95f05..fddc97280e 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -24,6 +24,7 @@
#include "mb/pg_wchar.h"
#include "miscadmin.h"
#include "nodes/makefuncs.h"
+#include "parser/parsetree.h"
#include "partitioning/partbounds.h"
#include "partitioning/partdesc.h"
#include "partitioning/partprune.h"
@@ -183,8 +184,13 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
int maxfieldlen);
static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
static PartitionPruneState *ExecCreatePartitionPruneState(PlanState *planstate,
- PartitionPruneInfo *partitionpruneinfo);
-static Bitmapset *ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate);
+ PartitionPruneInfo *partitionpruneinfo,
+ bool consider_initial_steps,
+ bool consider_exec_steps,
+ List *rtable, ExprContext *econtext,
+ PartitionDirectory partdir);
+static Bitmapset *ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate,
+ PartitionPruneInfo *pruneinfo);
static void ExecInitPruningContext(PartitionPruneContext *context,
List *pruning_steps,
PartitionDesc partdesc,
@@ -1483,8 +1489,9 @@ adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri)
* considered to be a stable expression, it can change value from one plan
* node scan to the next during query execution. Stable comparison
* expressions that don't involve such Params allow partition pruning to be
- * done once during executor startup. Expressions that do involve such Params
- * require us to prune separately for each scan of the parent plan node.
+ * done once during executor startup, or even earlier, during
+ * ExecutorGetLockRels(). Expressions that do involve such Params require us
+ * to prune separately for each scan of the parent plan node.
*
* Note that pruning away unneeded subplans during executor startup has the
* added benefit of not having to initialize the unneeded subplans at all.
@@ -1496,10 +1503,17 @@ adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri)
* Creates the PartitionPruneState required by each of the two pruning
* functions. Details stored include how to map the partition index
* returned by the partition pruning code into subplan indexes. Also
- * determines the set of initially valid subplans by performing initial
- * pruning steps, only which need be initialized by the caller such as
- * ExecInitAppend. Maps in PartitionPruneState are updated to account
- * for initial pruning having eliminated some of the subplans, if any.
+ * determines the set of initially valid subplans, either by looking it
+ * up in the plan node's PlanInitPruningOutput, if one is found in
+ * EState.es_execlockrelsinfo, or by performing the initial pruning steps.
+ * Only the subplans in that set need to be initialized by the caller,
+ * such as ExecInitAppend. Maps in PartitionPruneState are updated to
+ * account for initial pruning having eliminated some of the subplans,
+ * if any.
+ *
+ * ExecGetLockRelsDoInitialPruning:
+ * Do initial pruning as part of ExecGetLockRels() on the parent plan
+ * node
*
* ExecFindMatchingSubPlans:
* Returns indexes of matching subplans after evaluating all available
@@ -1514,9 +1528,10 @@ adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri)
* ExecInitPartitionPruning
* Initialize data structure needed for run-time partition pruning
*
- * Initial pruning can be done immediately, so it is done here if needed and
- * the set of surviving partition subplans' indexes are added to the output
- * parameter *initially_valid_subplans.
+ * Initial pruning can be done immediately, so it is done here unless it has
+ * already been done by ExecGetLockRelsDoInitialPruning(), and the set of
+ * surviving partition subplans' indexes is added to the output parameter
+ * *initially_valid_subplans.
*
* If subplans are indeed pruned, subplan_map arrays contained in the returned
* PartitionPruneState are re-sequenced to not count those, though only if the
@@ -1530,22 +1545,57 @@ ExecInitPartitionPruning(PlanState *planstate,
{
PartitionPruneState *prunestate;
EState *estate = planstate->state;
+ Plan *plan = planstate->plan;
+ PlanInitPruningOutput *initPruningOutput = NULL;
+ bool do_pruning = (pruneinfo->needs_init_pruning ||
+ pruneinfo->needs_exec_pruning);
- /* We may need an expression context to evaluate partition exprs */
- ExecAssignExprContext(estate, planstate);
+ /* Retrieve the parent plan's PlanInitPruningOutput, if any. */
+ if (estate->es_execlockrelsinfo)
+ {
+ initPruningOutput = (PlanInitPruningOutput *)
+ ExecFetchPlanInitPruningOutput(estate->es_execlockrelsinfo, plan);
- /*
- * Create the working data structure for pruning.
- */
- prunestate = ExecCreatePartitionPruneState(planstate, pruneinfo);
+ Assert(initPruningOutput != NULL &&
+ IsA(initPruningOutput, PlanInitPruningOutput));
+ /* No need to do initial pruning again, only exec pruning. */
+ do_pruning = pruneinfo->needs_exec_pruning;
+ }
+
+ prunestate = NULL;
+ if (do_pruning)
+ {
+ /* We may need an expression context to evaluate partition exprs */
+ ExecAssignExprContext(estate, planstate);
+
+ /* For data reading, executor always omits detached partitions */
+ if (estate->es_partition_directory == NULL)
+ estate->es_partition_directory =
+ CreatePartitionDirectory(estate->es_query_cxt, false);
+
+ /*
+ * Create the working data structure for pruning. No need to consider
+ * initial pruning steps if we have a PlanInitPruningOutput.
+ */
+ prunestate = ExecCreatePartitionPruneState(planstate, pruneinfo,
+ initPruningOutput == NULL, true,
+ NIL, planstate->ps_ExprContext,
+ estate->es_partition_directory);
+ }
/*
* Perform an initial partition prune, if required.
*/
- if (prunestate->do_initial_prune)
+ if (initPruningOutput)
+ {
+ /* ExecGetLockRelsDoInitialPruning() already did it for us! */
+ *initially_valid_subplans = initPruningOutput->initially_valid_subplans;
+ }
+ else if (prunestate && prunestate->do_initial_prune)
{
/* Determine which subplans survive initial pruning */
- *initially_valid_subplans = ExecFindInitialMatchingSubPlans(prunestate);
+ *initially_valid_subplans = ExecFindInitialMatchingSubPlans(prunestate,
+ pruneinfo);
}
else
{
@@ -1563,7 +1613,7 @@ ExecInitPartitionPruning(PlanState *planstate,
* invalid data in prunestate, because that data won't be consulted again
* (cf initial Assert in ExecFindMatchingSubPlans).
*/
- if (prunestate->do_exec_prune &&
+ if (prunestate && prunestate->do_exec_prune &&
bms_num_members(*initially_valid_subplans) < n_total_subplans)
PartitionPruneStateFixSubPlanMap(prunestate,
*initially_valid_subplans,
@@ -1572,12 +1622,75 @@ ExecInitPartitionPruning(PlanState *planstate,
return prunestate;
}
+/*
+ * ExecGetLockRelsDoInitialPruning
+ * Perform initial pruning as part of doing ExecGetLockRels() on the parent
+ * plan node
+ */
+Bitmapset *
+ExecGetLockRelsDoInitialPruning(Plan *plan, ExecGetLockRelsContext *context,
+ PartitionPruneInfo *pruneinfo)
+{
+ List *rtable = context->stmt->rtable;
+ ParamListInfo params = context->params;
+ ExprContext *econtext;
+ PartitionDirectory pdir;
+ MemoryContext oldcontext,
+ tmpcontext;
+ PartitionPruneState *prunestate;
+ PlanInitPruningOutput *initPruningOutput;
+
+ /*
+ * A temporary context to allocate stuff needed to run the pruning steps.
+ */
+ tmpcontext = AllocSetContextCreate(CurrentMemoryContext,
+ "initial pruning working data",
+ ALLOCSET_DEFAULT_SIZES);
+ oldcontext = MemoryContextSwitchTo(tmpcontext);
+
+ /*
+ * PartitionDirectory to look up partition descriptors, which omits
+ * detached partitions, just like in the executor proper.
+ */
+ pdir = CreatePartitionDirectory(CurrentMemoryContext, false);
+
+ /*
+ * We don't yet have a PlanState for the parent plan node, so must create
+ * a standalone ExprContext to evaluate pruning expressions, equipped with
+ * the information about the EXTERN parameters that the caller passed us.
+ * Note that that's okay because the initial pruning steps do not contain
+ * anything that requires the execution to have started.
+ */
+ econtext = CreateStandaloneExprContext();
+ econtext->ecxt_param_list_info = params;
+ prunestate = ExecCreatePartitionPruneState(NULL, pruneinfo,
+ true, false,
+ rtable, econtext,
+ pdir);
+ MemoryContextSwitchTo(oldcontext);
+
+ /* Do the pruning and populate a PlanInitPruningOutput for this node. */
+ initPruningOutput = makeNode(PlanInitPruningOutput);
+ initPruningOutput->initially_valid_subplans =
+ ExecFindInitialMatchingSubPlans(prunestate, pruneinfo);
+ ExecStorePlanInitPruningOutput(context, initPruningOutput, plan);
+
+ FreeExprContext(econtext, true);
+ DestroyPartitionDirectory(pdir);
+ MemoryContextDelete(tmpcontext);
+
+ return initPruningOutput->initially_valid_subplans;
+}
+
/*
* ExecCreatePartitionPruneState
* Build the data structure required for calling
* ExecFindInitialMatchingSubPlans and ExecFindMatchingSubPlans.
*
- * 'planstate' is the parent plan node's execution state.
+ * 'planstate', if not NULL, is the parent plan node's execution state. It
+ * can be NULL if being called before ExecutorStart(), in which case,
+ * 'rtable' (range table), 'econtext', and 'partdir' must be explicitly
+ * provided.
*
* 'partitionpruneinfo' is a PartitionPruneInfo as generated by
* make_partition_pruneinfo. Here we build a PartitionPruneState containing a
@@ -1592,19 +1705,20 @@ ExecInitPartitionPruning(PlanState *planstate,
*/
static PartitionPruneState *
ExecCreatePartitionPruneState(PlanState *planstate,
- PartitionPruneInfo *partitionpruneinfo)
+ PartitionPruneInfo *partitionpruneinfo,
+ bool consider_initial_steps,
+ bool consider_exec_steps,
+ List *rtable, ExprContext *econtext,
+ PartitionDirectory partdir)
{
- EState *estate = planstate->state;
+ EState *estate = planstate ? planstate->state : NULL;
PartitionPruneState *prunestate;
int n_part_hierarchies;
ListCell *lc;
int i;
- ExprContext *econtext = planstate->ps_ExprContext;
- /* For data reading, executor always omits detached partitions */
- if (estate->es_partition_directory == NULL)
- estate->es_partition_directory =
- CreatePartitionDirectory(estate->es_query_cxt, false);
+ Assert((estate != NULL) ||
+ (partdir != NULL && econtext != NULL && rtable != NIL));
n_part_hierarchies = list_length(partitionpruneinfo->prune_infos);
Assert(n_part_hierarchies > 0);
@@ -1655,19 +1769,48 @@ ExecCreatePartitionPruneState(PlanState *planstate,
PartitionedRelPruneInfo *pinfo = lfirst_node(PartitionedRelPruneInfo, lc2);
PartitionedRelPruningData *pprune = &prunedata->partrelprunedata[j];
Relation partrel;
+ bool close_partrel = false;
PartitionDesc partdesc;
PartitionKey partkey;
/*
- * We can rely on the copies of the partitioned table's partition
- * key and partition descriptor appearing in its relcache entry,
- * because that entry will be held open and locked for the
- * duration of this executor run.
+ * We must open the relation ourselves when called before
+ * execution has started, such as during
+ * ExecutorGetLockRels() on a cached plan. In that case,
+ * sub-partitions must be locked, because AcquirePlannerLocks()
+ * would not have seen them. (The first relation in a partrelpruneinfos
+ * list is always the root partitioned table appearing in the
+ * query, which AcquirePlannerLocks() would have locked; the
+ * Assert in relation_open() guards that assumption.)
+ */
+ if (estate == NULL)
+ {
+ RangeTblEntry *rte = rt_fetch(pinfo->rtindex, rtable);
+ int lockmode = (j == 0) ? NoLock : rte->rellockmode;
+
+ partrel = table_open(rte->relid, lockmode);
+ close_partrel = true;
+ }
+ else
+ partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
+
+ /*
+ * We can rely on the copy of the partitioned table's partition
+ * key in its relcache entry, because it can't change (or get
+ * destroyed) as long as the relation is locked. The partition
+ * descriptor is taken from the PartitionDirectory, which is
+ * held open long enough for the descriptor to remain valid
+ * while it's used to perform the pruning steps.
*/
- partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
partkey = RelationGetPartitionKey(partrel);
- partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
- partrel);
+ partdesc = PartitionDirectoryLookup(partdir, partrel);
+
+ /*
+ * Must close partrel, keeping the lock taken, if we're not using
+ * EState's entry.
+ */
+ if (close_partrel)
+ table_close(partrel, NoLock);
/*
* Initialize the subplan_map and subpart_map.
@@ -1769,7 +1912,7 @@ ExecCreatePartitionPruneState(PlanState *planstate,
* Initialize pruning contexts as needed.
*/
pprune->initial_pruning_steps = pinfo->initial_pruning_steps;
- if (pinfo->initial_pruning_steps)
+ if (consider_initial_steps && pinfo->initial_pruning_steps)
{
ExecInitPruningContext(&pprune->initial_context,
pinfo->initial_pruning_steps,
@@ -1779,7 +1922,7 @@ ExecCreatePartitionPruneState(PlanState *planstate,
prunestate->do_initial_prune = true;
}
pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
- if (pinfo->exec_pruning_steps)
+ if (consider_exec_steps && pinfo->exec_pruning_steps)
{
ExecInitPruningContext(&pprune->exec_context,
pinfo->exec_pruning_steps,
@@ -1893,7 +2036,8 @@ ExecInitPruningContext(PartitionPruneContext *context,
* is required.
*/
static Bitmapset *
-ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate)
+ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate,
+ PartitionPruneInfo *pruneinfo)
{
Bitmapset *result = NULL;
MemoryContext oldcontext;
@@ -1903,8 +2047,8 @@ ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate)
Assert(prunestate->do_initial_prune);
/*
- * Switch to a temp context to avoid leaking memory in the executor's
- * query-lifespan memory context.
+ * Switch to a temp context to avoid leaking memory in the longer-term
+ * memory context.
*/
oldcontext = MemoryContextSwitchTo(prunestate->prune_context);
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 9df1f81ea8..7246f9175f 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -119,6 +119,7 @@ CreateExecutorState(void)
estate->es_relations = NULL;
estate->es_rowmarks = NULL;
estate->es_plannedstmt = NULL;
+ estate->es_execlockrelsinfo = NULL;
estate->es_junkFilter = NULL;
@@ -785,6 +786,13 @@ ExecGetRangeTableRelation(EState *estate, Index rti)
Assert(rti > 0 && rti <= estate->es_range_table_size);
+ /*
+ * Cross-check that AcquireExecutorLocks() hasn't missed any relation
+ * that it should have locked.
+ */
+ Assert(estate->es_execlockrelsinfo == NULL ||
+ bms_is_member(rti, estate->es_execlockrelsinfo->lockrels));
+
rel = estate->es_relations[rti - 1];
if (rel == NULL)
{
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index f9460ae506..a2182a6b1f 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -844,7 +844,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
else
dest = None_Receiver;
- es->qd = CreateQueryDesc(es->stmt,
+ es->qd = CreateQueryDesc(es->stmt, NULL,
fcache->src,
GetActiveSnapshot(),
InvalidSnapshot,
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 5b6d3eb23b..9c6f907687 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -94,6 +94,55 @@ static bool ExecAppendAsyncRequest(AppendState *node, TupleTableSlot **result);
static void ExecAppendAsyncEventWait(AppendState *node);
static void classify_matching_subplans(AppendState *node);
+/* ----------------------------------------------------------------
+ * ExecGetAppendLockRels
+ * Do ExecGetLockRels()'s work for an Append plan
+ * ----------------------------------------------------------------
+ */
+bool
+ExecGetAppendLockRels(Append *node, ExecGetLockRelsContext *context)
+{
+ PartitionPruneInfo *pruneinfo = node->part_prune_info;
+
+ /*
+ * Must always lock all the partitioned tables whose direct and indirect
+ * partitions will be scanned by this Append.
+ */
+ context->lockrels = bms_add_members(context->lockrels,
+ node->partitioned_rels);
+
+ /*
+ * Now recurse to subplans to add relations scanned therein.
+ *
+ * If initial pruning can be done, do that now and only recurse to the
+ * surviving subplans.
+ */
+ if (pruneinfo && pruneinfo->needs_init_pruning)
+ {
+ List *subplans = node->appendplans;
+ Bitmapset *validsubplans;
+ int i;
+
+ validsubplans = ExecGetLockRelsDoInitialPruning((Plan *) node,
+ context, pruneinfo);
+
+ /* Recurse to surviving subplans. */
+ i = -1;
+ while ((i = bms_next_member(validsubplans, i)) >= 0)
+ {
+ Plan *subplan = list_nth(subplans, i);
+
+ (void) ExecGetLockRels(subplan, context);
+ }
+
+ /* done with this node */
+ return true;
+ }
+
+ /* Tell the caller to recurse to *all* the subplans. */
+ return false;
+}
+
/* ----------------------------------------------------------------
* ExecInitAppend
*
@@ -155,7 +204,8 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
* subplan, we can fill as_valid_subplans immediately, preventing
* later calls to ExecFindMatchingSubPlans.
*/
- if (!prunestate->do_exec_prune && nplans > 0)
+ if (appendstate->as_prune_state == NULL ||
+ (!appendstate->as_prune_state->do_exec_prune && nplans > 0))
appendstate->as_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
}
else
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index 9a9f29e845..4b04fcdbc2 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -54,6 +54,55 @@ typedef int32 SlotNumber;
static TupleTableSlot *ExecMergeAppend(PlanState *pstate);
static int heap_compare_slots(Datum a, Datum b, void *arg);
+/* ----------------------------------------------------------------
+ * ExecGetMergeAppendLockRels
+ * Do ExecGetLockRels()'s work for a MergeAppend plan
+ * ----------------------------------------------------------------
+ */
+bool
+ExecGetMergeAppendLockRels(MergeAppend *node, ExecGetLockRelsContext *context)
+{
+ PartitionPruneInfo *pruneinfo = node->part_prune_info;
+
+ /*
+ * Must always lock all the partitioned tables whose direct and indirect
+ * partitions will be scanned by this MergeAppend.
+ */
+ context->lockrels = bms_add_members(context->lockrels,
+ node->partitioned_rels);
+
+ /*
+ * Now recurse to subplans to add relations scanned therein.
+ *
+ * If initial pruning can be done, do that now and only recurse to the
+ * surviving subplans.
+ */
+ if (pruneinfo && pruneinfo->needs_init_pruning)
+ {
+ List *subplans = node->mergeplans;
+ Bitmapset *validsubplans;
+ int i;
+
+ validsubplans = ExecGetLockRelsDoInitialPruning((Plan *) node,
+ context, pruneinfo);
+
+ /* Recurse to surviving subplans. */
+ i = -1;
+ while ((i = bms_next_member(validsubplans, i)) >= 0)
+ {
+ Plan *subplan = list_nth(subplans, i);
+
+ (void) ExecGetLockRels(subplan, context);
+ }
+
+ /* done with this node */
+ return true;
+ }
+
+ /* Tell the caller to recurse to *all* the subplans. */
+ return false;
+}
+
/* ----------------------------------------------------------------
* ExecInitMergeAppend
@@ -103,7 +152,8 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
* subplan, we can fill as_valid_subplans immediately, preventing
* later calls to ExecFindMatchingSubPlans.
*/
- if (!prunestate->do_exec_prune && nplans > 0)
+ if (mergestate->ms_prune_state == NULL ||
+ (!mergestate->ms_prune_state->do_exec_prune && nplans > 0))
mergestate->ms_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
}
else
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 701fe05296..23df3efef0 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -3008,6 +3008,31 @@ ExecLookupResultRelByOid(ModifyTableState *node, Oid resultoid,
return NULL;
}
+/*
+ * ExecGetModifyTableLockRels
+ * Do ExecGetLockRels()'s work for a ModifyTable plan
+ */
+bool
+ExecGetModifyTableLockRels(ModifyTable *plan, ExecGetLockRelsContext *context)
+{
+ ListCell *lc;
+
+ /* First add the result relation RTIs mentioned in the node. */
+ if (plan->rootRelation > 0)
+ context->lockrels = bms_add_member(context->lockrels,
+ plan->rootRelation);
+ context->lockrels = bms_add_member(context->lockrels,
+ plan->nominalRelation);
+ foreach(lc, plan->resultRelations)
+ {
+ context->lockrels = bms_add_member(context->lockrels,
+ lfirst_int(lc));
+ }
+
+ /* Tell the caller to recurse to the subplan (outerPlan(plan)). */
+ return false;
+}
+
/* ----------------------------------------------------------------
* ExecInitModifyTable
* ----------------------------------------------------------------
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index a82e986667..2107009591 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1578,6 +1578,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
CachedPlanSource *plansource;
CachedPlan *cplan;
List *stmt_list;
+ List *execlockrelsinfo_list;
char *query_string;
Snapshot snapshot;
MemoryContext oldcontext;
@@ -1659,6 +1660,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
/* Replan if needed, and increment plan refcount for portal */
cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv);
stmt_list = cplan->stmt_list;
+ execlockrelsinfo_list = cplan->execlockrelsinfo_list;
if (!plan->saved)
{
@@ -1670,6 +1672,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
*/
oldcontext = MemoryContextSwitchTo(portal->portalContext);
stmt_list = copyObject(stmt_list);
+ execlockrelsinfo_list = copyObject(execlockrelsinfo_list);
MemoryContextSwitchTo(oldcontext);
ReleaseCachedPlan(cplan, NULL);
cplan = NULL; /* portal shouldn't depend on cplan */
@@ -1683,6 +1686,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
query_string,
plansource->commandTag,
stmt_list,
+ execlockrelsinfo_list,
cplan);
/*
@@ -2473,7 +2477,9 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
{
CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
List *stmt_list;
- ListCell *lc2;
+ List *execlockrelsinfo_list;
+ ListCell *lc2,
+ *lc3;
spicallbackarg.query = plansource->query_string;
@@ -2552,6 +2558,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
plan_owner, _SPI_current->queryEnv);
stmt_list = cplan->stmt_list;
+ execlockrelsinfo_list = cplan->execlockrelsinfo_list;
/*
* If we weren't given a specific snapshot to use, and the statement
@@ -2589,9 +2596,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
}
}
- foreach(lc2, stmt_list)
+ forboth(lc2, stmt_list, lc3, execlockrelsinfo_list)
{
PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+ ExecLockRelsInfo *execlockrelsinfo = lfirst_node(ExecLockRelsInfo, lc3);
bool canSetTag = stmt->canSetTag;
DestReceiver *dest;
@@ -2663,7 +2671,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
else
snap = InvalidSnapshot;
- qdesc = CreateQueryDesc(stmt,
+ qdesc = CreateQueryDesc(stmt, execlockrelsinfo,
plansource->query_string,
snap, crosscheck_snapshot,
dest,
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index dc68a12486..1b94d7c881 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -68,6 +68,13 @@
} \
} while (0)
+/* Copy a field that is an array with numElem ints */
+#define COPY_INT_ARRAY(fldname, numElem) \
+ do { \
+ newnode->fldname = (numElem) > 0 ? palloc((numElem) * sizeof(int)) : NULL; \
+ memcpy(newnode->fldname, from->fldname, sizeof(int) * (numElem)); \
+ } while (0)
+
/* Copy a parse location field (for Copy, this is same as scalar case) */
#define COPY_LOCATION_FIELD(fldname) \
(newnode->fldname = from->fldname)
@@ -94,8 +101,10 @@ _copyPlannedStmt(const PlannedStmt *from)
COPY_SCALAR_FIELD(transientPlan);
COPY_SCALAR_FIELD(dependsOnRole);
COPY_SCALAR_FIELD(parallelModeNeeded);
+ COPY_SCALAR_FIELD(containsInitialPruning);
COPY_SCALAR_FIELD(jitFlags);
COPY_NODE_FIELD(planTree);
+ COPY_SCALAR_FIELD(numPlanNodes);
COPY_NODE_FIELD(rtable);
COPY_NODE_FIELD(resultRelations);
COPY_NODE_FIELD(appendRelations);
@@ -1281,6 +1290,8 @@ _copyPartitionPruneInfo(const PartitionPruneInfo *from)
PartitionPruneInfo *newnode = makeNode(PartitionPruneInfo);
COPY_NODE_FIELD(prune_infos);
+ COPY_SCALAR_FIELD(needs_init_pruning);
+ COPY_SCALAR_FIELD(needs_exec_pruning);
COPY_BITMAPSET_FIELD(other_subplans);
return newnode;
@@ -4944,6 +4955,33 @@ _copyExtensibleNode(const ExtensibleNode *from)
return newnode;
}
+/* ****************************************************************
+ * execnodes.h copy functions
+ * ****************************************************************
+ */
+static ExecLockRelsInfo *
+_copyExecLockRelsInfo(const ExecLockRelsInfo *from)
+{
+ ExecLockRelsInfo *newnode = makeNode(ExecLockRelsInfo);
+
+ COPY_BITMAPSET_FIELD(lockrels);
+ COPY_SCALAR_FIELD(numPlanNodes);
+ COPY_NODE_FIELD(initPruningOutputs);
+ COPY_INT_ARRAY(ipoIndexes, from->numPlanNodes);
+
+ return newnode;
+}
+
+static PlanInitPruningOutput *
+_copyPlanInitPruningOutput(const PlanInitPruningOutput *from)
+{
+ PlanInitPruningOutput *newnode = makeNode(PlanInitPruningOutput);
+
+ COPY_BITMAPSET_FIELD(initially_valid_subplans);
+
+ return newnode;
+}
+
/* ****************************************************************
* value.h copy functions
* ****************************************************************
@@ -4998,7 +5036,6 @@ _copyBitString(const BitString *from)
return newnode;
}
-
static ForeignKeyCacheInfo *
_copyForeignKeyCacheInfo(const ForeignKeyCacheInfo *from)
{
@@ -5947,6 +5984,16 @@ copyObjectImpl(const void *from)
retval = _copyPublicationTable(from);
break;
+ /*
+ * EXECUTION NODES
+ */
+ case T_ExecLockRelsInfo:
+ retval = _copyExecLockRelsInfo(from);
+ break;
+ case T_PlanInitPruningOutput:
+ retval = _copyPlanInitPruningOutput(from);
+ break;
+
/*
* MISCELLANEOUS NODES
*/
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index bc178d53bf..6c404c8664 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -312,8 +312,10 @@ _outPlannedStmt(StringInfo str, const PlannedStmt *node)
WRITE_BOOL_FIELD(transientPlan);
WRITE_BOOL_FIELD(dependsOnRole);
WRITE_BOOL_FIELD(parallelModeNeeded);
+ WRITE_BOOL_FIELD(containsInitialPruning);
WRITE_INT_FIELD(jitFlags);
WRITE_NODE_FIELD(planTree);
+ WRITE_INT_FIELD(numPlanNodes);
WRITE_NODE_FIELD(rtable);
WRITE_NODE_FIELD(resultRelations);
WRITE_NODE_FIELD(appendRelations);
@@ -1007,6 +1009,8 @@ _outPartitionPruneInfo(StringInfo str, const PartitionPruneInfo *node)
WRITE_NODE_TYPE("PARTITIONPRUNEINFO");
WRITE_NODE_FIELD(prune_infos);
+ WRITE_BOOL_FIELD(needs_init_pruning);
+ WRITE_BOOL_FIELD(needs_exec_pruning);
WRITE_BITMAPSET_FIELD(other_subplans);
}
@@ -2702,6 +2706,31 @@ _outExtensibleNode(StringInfo str, const ExtensibleNode *node)
methods->nodeOut(str, node);
}
+/*****************************************************************************
+ *
+ * Stuff from execnodes.h
+ *
+ *****************************************************************************/
+
+static void
+_outExecLockRelsInfo(StringInfo str, const ExecLockRelsInfo *node)
+{
+ WRITE_NODE_TYPE("EXECLOCKRELSINFO");
+
+ WRITE_BITMAPSET_FIELD(lockrels);
+ WRITE_INT_FIELD(numPlanNodes);
+ WRITE_NODE_FIELD(initPruningOutputs);
+ WRITE_INT_ARRAY(ipoIndexes, node->numPlanNodes);
+}
+
+static void
+_outPlanInitPruningOutput(StringInfo str, const PlanInitPruningOutput *node)
+{
+ WRITE_NODE_TYPE("PARTITIONINITPRUNINGOUTPUT");
+
+ WRITE_BITMAPSET_FIELD(initially_valid_subplans);
+}
+
/*****************************************************************************
*
* Stuff from parsenodes.h.
@@ -4543,6 +4572,16 @@ outNode(StringInfo str, const void *obj)
_outPartitionRangeDatum(str, obj);
break;
+ /*
+ * EXECUTION NODES
+ */
+ case T_ExecLockRelsInfo:
+ _outExecLockRelsInfo(str, obj);
+ break;
+ case T_PlanInitPruningOutput:
+ _outPlanInitPruningOutput(str, obj);
+ break;
+
default:
/*
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 3c673c42d5..863f082729 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -1585,8 +1585,10 @@ _readPlannedStmt(void)
READ_BOOL_FIELD(transientPlan);
READ_BOOL_FIELD(dependsOnRole);
READ_BOOL_FIELD(parallelModeNeeded);
+ READ_BOOL_FIELD(containsInitialPruning);
READ_INT_FIELD(jitFlags);
READ_NODE_FIELD(planTree);
+ READ_INT_FIELD(numPlanNodes);
READ_NODE_FIELD(rtable);
READ_NODE_FIELD(resultRelations);
READ_NODE_FIELD(appendRelations);
@@ -2537,6 +2539,8 @@ _readPartitionPruneInfo(void)
READ_LOCALS(PartitionPruneInfo);
READ_NODE_FIELD(prune_infos);
+ READ_BOOL_FIELD(needs_init_pruning);
+ READ_BOOL_FIELD(needs_exec_pruning);
READ_BITMAPSET_FIELD(other_subplans);
READ_DONE();
@@ -2706,6 +2710,35 @@ _readPartitionRangeDatum(void)
READ_DONE();
}
+/*
+ * _readExecLockRelsInfo
+ */
+static ExecLockRelsInfo *
+_readExecLockRelsInfo(void)
+{
+ READ_LOCALS(ExecLockRelsInfo);
+
+ READ_BITMAPSET_FIELD(lockrels);
+ READ_INT_FIELD(numPlanNodes);
+ READ_NODE_FIELD(initPruningOutputs);
+ READ_INT_ARRAY(ipoIndexes, local_node->numPlanNodes);
+
+ READ_DONE();
+}
+
+/*
+ * _readPlanInitPruningOutput
+ */
+static PlanInitPruningOutput *
+_readPlanInitPruningOutput(void)
+{
+ READ_LOCALS(PlanInitPruningOutput);
+
+ READ_BITMAPSET_FIELD(initially_valid_subplans);
+
+ READ_DONE();
+}
+
/*
* parseNodeString
*
@@ -2977,6 +3010,10 @@ parseNodeString(void)
return_value = _readPartitionBoundSpec();
else if (MATCH("PARTITIONRANGEDATUM", 19))
return_value = _readPartitionRangeDatum();
+ else if (MATCH("EXECLOCKRELSINFO", 16))
+ return_value = _readExecLockRelsInfo();
+ else if (MATCH("PARTITIONINITPRUNINGOUTPUT", 26))
+ return_value = _readPlanInitPruningOutput();
else
{
elog(ERROR, "badly formatted node string \"%.32s\"...", token);
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 374a9d9753..329fb9d6e7 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -517,7 +517,9 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
result->transientPlan = glob->transientPlan;
result->dependsOnRole = glob->dependsOnRole;
result->parallelModeNeeded = glob->parallelModeNeeded;
+ result->containsInitialPruning = glob->containsInitialPruning;
result->planTree = top_plan;
+ result->numPlanNodes = glob->lastPlanNodeId;
result->rtable = glob->finalrtable;
result->resultRelations = glob->resultRelations;
result->appendRelations = glob->appendRelations;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index dbdeb8ec9d..ac795ae9d9 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -1561,6 +1561,9 @@ set_append_references(PlannerInfo *root,
pinfo->rtindex += rtoffset;
}
}
+
+ if (aplan->part_prune_info->needs_init_pruning)
+ root->glob->containsInitialPruning = true;
}
/* We don't need to recurse to lefttree or righttree ... */
@@ -1648,6 +1651,9 @@ set_mergeappend_references(PlannerInfo *root,
pinfo->rtindex += rtoffset;
}
}
+
+ if (mplan->part_prune_info->needs_init_pruning)
+ root->glob->containsInitialPruning = true;
}
/* We don't need to recurse to lefttree or righttree ... */
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 7080cb25d9..3322dc79f2 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -144,7 +144,9 @@ static List *make_partitionedrel_pruneinfo(PlannerInfo *root,
List *prunequal,
Bitmapset *partrelids,
int *relid_subplan_map,
- Bitmapset **matchedsubplans);
+ Bitmapset **matchedsubplans,
+ bool *needs_init_pruning,
+ bool *needs_exec_pruning);
static void gen_partprune_steps(RelOptInfo *rel, List *clauses,
PartClauseTarget target,
GeneratePruningStepsContext *context);
@@ -230,6 +232,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int *relid_subplan_map;
ListCell *lc;
int i;
+ bool needs_init_pruning = false;
+ bool needs_exec_pruning = false;
/*
* Scan the subpaths to see which ones are scans of partition child
@@ -309,12 +313,16 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
Bitmapset *partrelids = (Bitmapset *) lfirst(lc);
List *pinfolist;
Bitmapset *matchedsubplans = NULL;
+ bool partrel_needs_init_pruning;
+ bool partrel_needs_exec_pruning;
pinfolist = make_partitionedrel_pruneinfo(root, parentrel,
prunequal,
partrelids,
relid_subplan_map,
- &matchedsubplans);
+ &matchedsubplans,
+ &partrel_needs_init_pruning,
+ &partrel_needs_exec_pruning);
/* When pruning is possible, record the matched subplans */
if (pinfolist != NIL)
@@ -323,6 +331,10 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
allmatchedsubplans = bms_join(matchedsubplans,
allmatchedsubplans);
}
+ if (!needs_init_pruning)
+ needs_init_pruning = partrel_needs_init_pruning;
+ if (!needs_exec_pruning)
+ needs_exec_pruning = partrel_needs_exec_pruning;
}
pfree(relid_subplan_map);
@@ -337,6 +349,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
/* Else build the result data structure */
pruneinfo = makeNode(PartitionPruneInfo);
pruneinfo->prune_infos = prunerelinfos;
+ pruneinfo->needs_init_pruning = needs_init_pruning;
+ pruneinfo->needs_exec_pruning = needs_exec_pruning;
/*
* Some subplans may not belong to any of the identified partitioned rels.
@@ -435,13 +449,18 @@ add_part_relids(List *allpartrelids, Bitmapset *partrelids)
* If we cannot find any useful run-time pruning steps, return NIL.
* However, on success, each rel identified in partrelids will have
* an element in the result list, even if some of them are useless.
+ * *needs_init_pruning and *needs_exec_pruning are set to indicate that the
+ * returned PartitionedRelPruneInfos contain pruning steps that can be
+ * performed before and after execution begins, respectively.
*/
static List *
make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
List *prunequal,
Bitmapset *partrelids,
int *relid_subplan_map,
- Bitmapset **matchedsubplans)
+ Bitmapset **matchedsubplans,
+ bool *needs_init_pruning,
+ bool *needs_exec_pruning)
{
RelOptInfo *targetpart = NULL;
List *pinfolist = NIL;
@@ -452,6 +471,10 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int rti;
int i;
+ /* Will find out below. */
+ *needs_init_pruning = false;
+ *needs_exec_pruning = false;
+
/*
* Examine each partitioned rel, constructing a temporary array to map
* from planner relids to index of the partitioned rel, and building a
@@ -539,6 +562,9 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
* executor per-scan pruning steps. This first pass creates startup
* pruning steps and detects whether there's any possibly-useful quals
* that would require per-scan pruning.
+ *
+ * In the first pass, we note whether the 2nd pass is necessary by
+ * noting the presence of EXEC parameters.
*/
gen_partprune_steps(subpart, partprunequal, PARTTARGET_INITIAL,
&context);
@@ -613,6 +639,11 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
pinfo->execparamids = execparamids;
/* Remaining fields will be filled in the next loop */
+ if (!*needs_init_pruning)
+ *needs_init_pruning = (initial_pruning_steps != NIL);
+ if (!*needs_exec_pruning)
+ *needs_exec_pruning = (exec_pruning_steps != NIL);
+
pinfolist = lappend(pinfolist, pinfo);
}
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index ba2fcfeb4a..085eb3f209 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -945,15 +945,17 @@ pg_plan_query(Query *querytree, const char *query_string, int cursorOptions,
* For normal optimizable statements, invoke the planner. For utility
* statements, just make a wrapper PlannedStmt node.
*
- * The result is a list of PlannedStmt nodes.
+ * The result is a list of PlannedStmt nodes. Also, a NULL is appended to
+ * *execlockrelsinfo_list for each PlannedStmt added to the returned list.
*/
List *
pg_plan_queries(List *querytrees, const char *query_string, int cursorOptions,
- ParamListInfo boundParams)
+ ParamListInfo boundParams, List **execlockrelsinfo_list)
{
List *stmt_list = NIL;
ListCell *query_list;
+ *execlockrelsinfo_list = NIL;
foreach(query_list, querytrees)
{
Query *query = lfirst_node(Query, query_list);
@@ -977,6 +979,7 @@ pg_plan_queries(List *querytrees, const char *query_string, int cursorOptions,
}
stmt_list = lappend(stmt_list, stmt);
+ *execlockrelsinfo_list = lappend(*execlockrelsinfo_list, NULL);
}
return stmt_list;
@@ -1080,7 +1083,8 @@ exec_simple_query(const char *query_string)
QueryCompletion qc;
MemoryContext per_parsetree_context = NULL;
List *querytree_list,
- *plantree_list;
+ *plantree_list,
+ *plantree_execlockrelsinfo_list;
Portal portal;
DestReceiver *receiver;
int16 format;
@@ -1167,7 +1171,8 @@ exec_simple_query(const char *query_string)
NULL, 0, NULL);
plantree_list = pg_plan_queries(querytree_list, query_string,
- CURSOR_OPT_PARALLEL_OK, NULL);
+ CURSOR_OPT_PARALLEL_OK, NULL,
+ &plantree_execlockrelsinfo_list);
/*
* Done with the snapshot used for parsing/planning.
@@ -1203,6 +1208,7 @@ exec_simple_query(const char *query_string)
query_string,
commandTag,
plantree_list,
+ plantree_execlockrelsinfo_list,
NULL);
/*
@@ -1991,6 +1997,7 @@ exec_bind_message(StringInfo input_message)
query_string,
psrc->commandTag,
cplan->stmt_list,
+ cplan->execlockrelsinfo_list,
cplan);
/* Done with the snapshot used for parameter I/O and parsing/planning */
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 5f907831a3..972ddc014e 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -35,7 +35,7 @@
Portal ActivePortal = NULL;
-static void ProcessQuery(PlannedStmt *plan,
+static void ProcessQuery(PlannedStmt *plan, ExecLockRelsInfo *execlockrelsinfo,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -65,6 +65,7 @@ static void DoPortalRewind(Portal portal);
*/
QueryDesc *
CreateQueryDesc(PlannedStmt *plannedstmt,
+ ExecLockRelsInfo *execlockrelsinfo,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
@@ -77,6 +78,7 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
qd->operation = plannedstmt->commandType; /* operation */
qd->plannedstmt = plannedstmt; /* plan */
+ qd->execlockrelsinfo = execlockrelsinfo; /* ExecutorGetLockRels() output for plan */
qd->sourceText = sourceText; /* query text */
qd->snapshot = RegisterSnapshot(snapshot); /* snapshot */
/* RI check snapshot */
@@ -122,6 +124,7 @@ FreeQueryDesc(QueryDesc *qdesc)
* PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
*
* plan: the plan tree for the query
+ * execlockrelsinfo: ExecutorGetLockRels() output for the plan tree
* sourceText: the source text of the query
* params: any parameters needed
* dest: where to send results
@@ -134,6 +137,7 @@ FreeQueryDesc(QueryDesc *qdesc)
*/
static void
ProcessQuery(PlannedStmt *plan,
+ ExecLockRelsInfo *execlockrelsinfo,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -145,7 +149,7 @@ ProcessQuery(PlannedStmt *plan,
/*
* Create the QueryDesc object
*/
- queryDesc = CreateQueryDesc(plan, sourceText,
+ queryDesc = CreateQueryDesc(plan, execlockrelsinfo, sourceText,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
@@ -490,6 +494,7 @@ PortalStart(Portal portal, ParamListInfo params,
* the destination to DestNone.
*/
queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+ linitial_node(ExecLockRelsInfo, portal->execlockrelsinfos),
portal->sourceText,
GetActiveSnapshot(),
InvalidSnapshot,
@@ -1190,7 +1195,8 @@ PortalRunMulti(Portal portal,
QueryCompletion *qc)
{
bool active_snapshot_set = false;
- ListCell *stmtlist_item;
+ ListCell *stmtlist_item,
+ *execlockrelsinfolist_item;
/*
* If the destination is DestRemoteExecute, change to DestNone. The
@@ -1211,9 +1217,12 @@ PortalRunMulti(Portal portal,
* Loop to handle the individual queries generated from a single parsetree
* by analysis and rewrite.
*/
- foreach(stmtlist_item, portal->stmts)
+ forboth(stmtlist_item, portal->stmts,
+ execlockrelsinfolist_item, portal->execlockrelsinfos)
{
PlannedStmt *pstmt = lfirst_node(PlannedStmt, stmtlist_item);
+ ExecLockRelsInfo *execlockrelsinfo = lfirst_node(ExecLockRelsInfo,
+ execlockrelsinfolist_item);
/*
* If we got a cancel signal in prior command, quit
@@ -1271,7 +1280,7 @@ PortalRunMulti(Portal portal,
if (pstmt->canSetTag)
{
/* statement can set tag string */
- ProcessQuery(pstmt,
+ ProcessQuery(pstmt, execlockrelsinfo,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
@@ -1280,7 +1289,7 @@ PortalRunMulti(Portal portal,
else
{
/* stmt added by rewrite cannot set tag */
- ProcessQuery(pstmt,
+ ProcessQuery(pstmt, execlockrelsinfo,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 4cf6db504f..9f5a40a0a6 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -99,14 +99,16 @@ static dlist_head cached_expression_list = DLIST_STATIC_INIT(cached_expression_l
static void ReleaseGenericPlan(CachedPlanSource *plansource);
static List *RevalidateCachedQuery(CachedPlanSource *plansource,
QueryEnvironment *queryEnv);
-static bool CheckCachedPlan(CachedPlanSource *plansource);
+static bool CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams);
static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
ParamListInfo boundParams, QueryEnvironment *queryEnv);
+static void CachedPlanSaveExecLockRelsInfos(CachedPlan *plan, List *execlockrelsinfo_list);
static bool choose_custom_plan(CachedPlanSource *plansource,
ParamListInfo boundParams);
static double cached_plan_cost(CachedPlan *plan, bool include_planner);
static Query *QueryListGetPrimaryStmt(List *stmts);
-static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static List *AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams);
+static void ReleaseExecutorLocks(List *stmt_list, List *execlockrelsinfo_list);
static void AcquirePlannerLocks(List *stmt_list, bool acquire);
static void ScanQueryForLocks(Query *parsetree, bool acquire);
static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -790,9 +792,21 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
*
* On a "true" return, we have acquired the locks needed to run the plan.
* (We must do this for the "true" result to be race-condition-free.)
+ *
+ * If the CachedPlan is valid, AcquireExecutorLocks() may call
+ * ExecutorGetLockRels() on each PlannedStmt contained in it to determine the
+ * set of relations to lock, instead of just scanning its range table. That
+ * is done so that relations scanned only by nodes pruned away by initial
+ * partition pruning need not be locked. The resulting ExecLockRelsInfo
+ * nodes, which contain the result of such pruning, are saved in
+ * plan->execlockrelsinfo_list, allocated in a child context of the context
+ * containing the plan itself. The previous contents of the list from the
+ * last invocation on the same CachedPlan are deleted, because they would no
+ * longer be valid given the fresh set of parameter values which may be used
+ * as pruning parameters.
*/
static bool
-CheckCachedPlan(CachedPlanSource *plansource)
+CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams)
{
CachedPlan *plan = plansource->gplan;
@@ -820,13 +834,25 @@ CheckCachedPlan(CachedPlanSource *plansource)
*/
if (plan->is_valid)
{
+ List *execlockrelsinfo_list;
+
/*
* Plan must have positive refcount because it is referenced by
* plansource; so no need to fear it disappears under us here.
*/
Assert(plan->refcount > 0);
- AcquireExecutorLocks(plan->stmt_list, true);
+ /*
+ * Lock relations scanned by the plan. If ExecutorGetLockRels() asked
+ * to omit some relations because the plan nodes that scan them were
+ * found to be pruned, the executor must be told about the pruned nodes
+ * too, so that it doesn't accidentally try to execute them. That is
+ * done via the ExecLockRelsInfo nodes collected in the returned list,
+ * which is passed to the executor along with the list of
+ * PlannedStmts.
+ */
+ execlockrelsinfo_list = AcquireExecutorLocks(plan->stmt_list,
+ boundParams);
/*
* If plan was transient, check to see if TransactionXmin has
@@ -844,11 +870,14 @@ CheckCachedPlan(CachedPlanSource *plansource)
if (plan->is_valid)
{
/* Successfully revalidated and locked the query. */
+
+ /* Remember ExecLockRelsInfos in the CachedPlan. */
+ CachedPlanSaveExecLockRelsInfos(plan, execlockrelsinfo_list);
return true;
}
/* Oops, the race case happened. Release useless locks. */
- AcquireExecutorLocks(plan->stmt_list, false);
+ ReleaseExecutorLocks(plan->stmt_list, execlockrelsinfo_list);
}
/*
@@ -880,7 +909,8 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
ParamListInfo boundParams, QueryEnvironment *queryEnv)
{
CachedPlan *plan;
- List *plist;
+ List *plist,
+ *execlockrelsinfo_list;
bool snapshot_set;
bool is_transient;
MemoryContext plan_context;
@@ -933,7 +963,8 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
* Generate the plan.
*/
plist = pg_plan_queries(qlist, plansource->query_string,
- plansource->cursor_options, boundParams);
+ plansource->cursor_options, boundParams,
+ &execlockrelsinfo_list);
/* Release snapshot if we got one */
if (snapshot_set)
@@ -1002,6 +1033,16 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
plan->is_saved = false;
plan->is_valid = true;
+ /*
+ * Save the dummy ExecLockRelsInfo list, that is, a list containing NULLs
+ * as elements. We must do this because users of the CachedPlan expect
+ * one to go with the list of PlannedStmts.
+ * XXX maybe get rid of that contract.
+ */
+ plan->execlockrelsinfo_context = NULL;
+ CachedPlanSaveExecLockRelsInfos(plan, execlockrelsinfo_list);
+ Assert(MemoryContextIsValid(plan->execlockrelsinfo_context));
+
/* assign generation number to new plan */
plan->generation = ++(plansource->generation);
@@ -1160,7 +1201,7 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
if (!customplan)
{
- if (CheckCachedPlan(plansource))
+ if (CheckCachedPlan(plansource, boundParams))
{
/* We want a generic plan, and we already have a valid one */
plan = plansource->gplan;
@@ -1586,6 +1627,49 @@ CopyCachedPlan(CachedPlanSource *plansource)
return newsource;
}
+/*
+ * CachedPlanSaveExecLockRelsInfos
+ * Save the list containing ExecLockRelsInfo nodes into the given
+ * CachedPlan
+ *
+ * The provided list is copied into a dedicated context that is a child of
+ * plan->context. If the child context already exists, it is emptied, because
+ * any ExecLockRelsInfo contained therein would no longer be useful.
+ */
+static void
+CachedPlanSaveExecLockRelsInfos(CachedPlan *plan, List *execlockrelsinfo_list)
+{
+ MemoryContext execlockrelsinfo_context = plan->execlockrelsinfo_context,
+ oldcontext = CurrentMemoryContext;
+ List *execlockrelsinfo_list_copy;
+
+ /*
+ * Set up the dedicated context if not already done, saving it as a child
+ * of the CachedPlan's context.
+ */
+ if (execlockrelsinfo_context == NULL)
+ {
+ execlockrelsinfo_context = AllocSetContextCreate(CurrentMemoryContext,
+ "CachedPlan execlockrelsinfo list",
+ ALLOCSET_START_SMALL_SIZES);
+ MemoryContextSetParent(execlockrelsinfo_context, plan->context);
+ MemoryContextSetIdentifier(execlockrelsinfo_context, plan->context->ident);
+ plan->execlockrelsinfo_context = execlockrelsinfo_context;
+ }
+ else
+ {
+ /* Just clear existing contents by resetting the context. */
+ Assert(MemoryContextIsValid(execlockrelsinfo_context));
+ MemoryContextReset(execlockrelsinfo_context);
+ }
+
+ MemoryContextSwitchTo(execlockrelsinfo_context);
+ execlockrelsinfo_list_copy = copyObject(execlockrelsinfo_list);
+ MemoryContextSwitchTo(oldcontext);
+
+ plan->execlockrelsinfo_list = execlockrelsinfo_list_copy;
+}
+
/*
* CachedPlanIsValid: test whether the rewritten querytree within a
* CachedPlanSource is currently valid (that is, not marked as being in need
@@ -1737,17 +1821,21 @@ QueryListGetPrimaryStmt(List *stmts)
/*
* AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
- * or release them if acquire is false.
+ *
+ * Returns a list containing one element for each PlannedStmt in stmt_list:
+ * an ExecLockRelsInfo node, or NULL if the PlannedStmt is a utility statement
+ * or its containsInitialPruning is false.
*/
-static void
-AcquireExecutorLocks(List *stmt_list, bool acquire)
+static List *
+AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams)
{
ListCell *lc1;
+ List *execlockrelsinfo_list = NIL;
foreach(lc1, stmt_list)
{
PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
- ListCell *lc2;
+ ExecLockRelsInfo *execlockrelsinfo = NULL;
if (plannedstmt->commandType == CMD_UTILITY)
{
@@ -1761,27 +1849,139 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
Query *query = UtilityContainsQuery(plannedstmt->utilityStmt);
if (query)
- ScanQueryForLocks(query, acquire);
- continue;
+ ScanQueryForLocks(query, true);
}
-
- foreach(lc2, plannedstmt->rtable)
+ else
{
- RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
+ /*
+ * Figure out the set of relations that would need to be locked
+ * before executing the plan.
+ */
+ if (!plannedstmt->containsInitialPruning)
+ {
+ /*
+ * If the plan contains no initial pruning steps, just lock
+ * all the relations found in the range table.
+ */
+ ListCell *lc;
- if (rte->rtekind != RTE_RELATION)
- continue;
+ foreach(lc, plannedstmt->rtable)
+ {
+ RangeTblEntry *rte = lfirst(lc);
+
+ if (rte->rtekind != RTE_RELATION)
+ continue;
+
+ /*
+ * Acquire the appropriate type of lock on each relation
+ * OID. Note that we don't actually try to open the rel,
+ * and hence will not fail if it's been dropped entirely
+ * --- we'll just transiently acquire a non-conflicting
+ * lock.
+ */
+ LockRelationOid(rte->relid, rte->rellockmode);
+ }
+ }
+ else
+ {
+ int rti;
+ Bitmapset *lockrels;
+
+ /*
+ * Walk the plan tree to find only the minimal set of
+ * relations to be locked, considering the effect of performing
+ * initial partition pruning.
+ */
+ execlockrelsinfo = ExecutorGetLockRels(plannedstmt, boundParams);
+ lockrels = execlockrelsinfo->lockrels;
+
+ rti = -1;
+ while ((rti = bms_next_member(lockrels, rti)) >= 0)
+ {
+ RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
+ Assert(rte->rtekind == RTE_RELATION);
+
+ /* See the comment above. */
+ LockRelationOid(rte->relid, rte->rellockmode);
+ }
+ }
+ }
+
+ /*
+ * Remember ExecLockRelsInfo for later adding to the QueryDesc that
+ * will be passed to the executor when executing this plan. May be
+ * NULL, but must keep the list the same length as stmt_list.
+ */
+ execlockrelsinfo_list = lappend(execlockrelsinfo_list,
+ execlockrelsinfo);
+ }
+
+ return execlockrelsinfo_list;
+}
+
+/*
+ * ReleaseExecutorLocks
+ * Release locks that would've been acquired by an earlier call to
+ * AcquireExecutorLocks()
+ */
+static void
+ReleaseExecutorLocks(List *stmt_list, List *execlockrelsinfo_list)
+{
+ ListCell *lc1,
+ *lc2;
+
+ forboth(lc1, stmt_list, lc2, execlockrelsinfo_list)
+ {
+ PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
+ ExecLockRelsInfo *execlockrelsinfo = lfirst_node(ExecLockRelsInfo, lc2);
+
+ if (plannedstmt->commandType == CMD_UTILITY)
+ {
/*
- * Acquire the appropriate type of lock on each relation OID. Note
- * that we don't actually try to open the rel, and hence will not
- * fail if it's been dropped entirely --- we'll just transiently
- * acquire a non-conflicting lock.
+ * Ignore utility statements, except those (such as EXPLAIN) that
+ * contain a parsed-but-not-planned query. Note: it's okay to use
+ * ScanQueryForLocks, even though the query hasn't been through
+ * rule rewriting, because rewriting doesn't change the query
+ * representation.
*/
- if (acquire)
- LockRelationOid(rte->relid, rte->rellockmode);
+ Query *query = UtilityContainsQuery(plannedstmt->utilityStmt);
+
+ if (query)
+ ScanQueryForLocks(query, false);
+ }
+ else
+ {
+ if (execlockrelsinfo == NULL)
+ {
+ ListCell *lc;
+
+ foreach(lc, plannedstmt->rtable)
+ {
+ RangeTblEntry *rte = lfirst(lc);
+
+ if (rte->rtekind != RTE_RELATION)
+ continue;
+
+ UnlockRelationOid(rte->relid, rte->rellockmode);
+ }
+ }
else
- UnlockRelationOid(rte->relid, rte->rellockmode);
+ {
+ int rti;
+ Bitmapset *lockrels;
+
+ lockrels = execlockrelsinfo->lockrels;
+ rti = -1;
+ while ((rti = bms_next_member(lockrels, rti)) >= 0)
+ {
+ RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
+
+ Assert(rte->rtekind == RTE_RELATION);
+
+ UnlockRelationOid(rte->relid, rte->rellockmode);
+ }
+ }
}
}
}
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index d549f66d4a..896f51be08 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -285,6 +285,7 @@ PortalDefineQuery(Portal portal,
const char *sourceText,
CommandTag commandTag,
List *stmts,
+ List *execlockrelsinfos,
CachedPlan *cplan)
{
AssertArg(PortalIsValid(portal));
@@ -299,6 +300,7 @@ PortalDefineQuery(Portal portal,
portal->qc.nprocessed = 0;
portal->commandTag = commandTag;
portal->stmts = stmts;
+ portal->execlockrelsinfos = execlockrelsinfos;
portal->cplan = cplan;
portal->status = PORTAL_DEFINED;
}
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 666977fb1f..fef75ba147 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -87,7 +87,8 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
ExplainState *es, const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv);
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt, ExecLockRelsInfo *execlockrelsinfo,
+ IntoClause *into,
ExplainState *es, const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv,
const instr_time *planduration,
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index fd5735a946..ded19b8cbb 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -124,4 +124,6 @@ extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
PartitionPruneInfo *pruneinfo,
Bitmapset **initially_valid_subplans);
extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate);
+extern Bitmapset *ExecGetLockRelsDoInitialPruning(Plan *plan, ExecGetLockRelsContext *context,
+ PartitionPruneInfo *pruneinfo);
#endif /* EXECPARTITION_H */
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index e79e2c001f..4338463479 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -35,6 +35,7 @@ typedef struct QueryDesc
/* These fields are provided by CreateQueryDesc */
CmdType operation; /* CMD_SELECT, CMD_UPDATE, etc. */
PlannedStmt *plannedstmt; /* planner's output (could be utility, too) */
+ ExecLockRelsInfo *execlockrelsinfo; /* ExecutorGetLockRels()'s output given plannedstmt */
const char *sourceText; /* source text of the query */
Snapshot snapshot; /* snapshot to use for query */
Snapshot crosscheck_snapshot; /* crosscheck for RI update/delete */
@@ -57,6 +58,7 @@ typedef struct QueryDesc
/* in pquery.c */
extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
+ ExecLockRelsInfo *execlockrelsinfo,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index 82925b4b63..5cf414cc11 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -185,6 +185,8 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
/*
* prototypes from functions in execMain.c
*/
+extern ExecLockRelsInfo *ExecutorGetLockRels(PlannedStmt *plannedstmt, ParamListInfo params);
+extern bool ExecGetLockRels(Plan *node, ExecGetLockRelsContext *context);
extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
extern void ExecutorRun(QueryDesc *queryDesc,
diff --git a/src/include/executor/nodeAppend.h b/src/include/executor/nodeAppend.h
index 4cb78ee5b6..b53535c2a4 100644
--- a/src/include/executor/nodeAppend.h
+++ b/src/include/executor/nodeAppend.h
@@ -17,6 +17,7 @@
#include "access/parallel.h"
#include "nodes/execnodes.h"
+extern bool ExecGetAppendLockRels(Append *node, ExecGetLockRelsContext *context);
extern AppendState *ExecInitAppend(Append *node, EState *estate, int eflags);
extern void ExecEndAppend(AppendState *node);
extern void ExecReScanAppend(AppendState *node);
diff --git a/src/include/executor/nodeMergeAppend.h b/src/include/executor/nodeMergeAppend.h
index 97fe3b0665..8eb4e9df93 100644
--- a/src/include/executor/nodeMergeAppend.h
+++ b/src/include/executor/nodeMergeAppend.h
@@ -16,6 +16,7 @@
#include "nodes/execnodes.h"
+extern bool ExecGetMergeAppendLockRels(MergeAppend *node, ExecGetLockRelsContext *context);
extern MergeAppendState *ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags);
extern void ExecEndMergeAppend(MergeAppendState *node);
extern void ExecReScanMergeAppend(MergeAppendState *node);
diff --git a/src/include/executor/nodeModifyTable.h b/src/include/executor/nodeModifyTable.h
index 1d225bc88d..5006499088 100644
--- a/src/include/executor/nodeModifyTable.h
+++ b/src/include/executor/nodeModifyTable.h
@@ -19,6 +19,7 @@ extern void ExecComputeStoredGenerated(ResultRelInfo *resultRelInfo,
EState *estate, TupleTableSlot *slot,
CmdType cmdtype);
+extern bool ExecGetModifyTableLockRels(ModifyTable *plan, ExecGetLockRelsContext *context);
extern ModifyTableState *ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags);
extern void ExecEndModifyTable(ModifyTableState *node);
extern void ExecReScanModifyTable(ModifyTableState *node);
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 44dd73fc80..1253fdb0ed 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -576,6 +576,7 @@ typedef struct EState
struct ExecRowMark **es_rowmarks; /* Array of per-range-table-entry
* ExecRowMarks, or NULL if none */
PlannedStmt *es_plannedstmt; /* link to top of plan tree */
+ struct ExecLockRelsInfo *es_execlockrelsinfo; /* QueryDesc.execlockrelsinfo */
const char *es_sourceText; /* Source text from QueryDesc */
JunkFilter *es_junkFilter; /* top-level junk filter, if any */
@@ -964,6 +965,101 @@ typedef struct DomainConstraintState
*/
typedef TupleTableSlot *(*ExecProcNodeMtd) (struct PlanState *pstate);
+/*----------------
+ * ExecLockRelsInfo
+ *
+ * Result of performing ExecutorGetLockRels() for a given PlannedStmt
+ */
+typedef struct ExecLockRelsInfo
+{
+ NodeTag type;
+
+ /*
+ * Relations that must be locked to execute the plan tree contained in
+ * the PlannedStmt.
+ */
+ Bitmapset *lockrels;
+
+ /* PlannedStmt.numPlanNodes */
+ int numPlanNodes;
+
+ /*
+ * List of PlanInitPruningOutput, each representing the output of
+ * performing initial pruning on a given plan node, for all nodes in the
+ * plan tree that have been marked as needing initial pruning.
+ *
+ * 'ipoIndexes' is an array of 'numPlanNodes' elements, indexed with
+ * plan_node_id of the individual nodes in the plan tree, each a 1-based
+ * index into 'initPruningOutputs' list for a given plan node. 0 means
+ * that a given plan node has no entry in the list because of not needing
+ * any initial pruning done on it.
+ */
+ List *initPruningOutputs;
+ int *ipoIndexes;
+} ExecLockRelsInfo;
+
+/*----------------
+ * ExecGetLockRelsContext
+ *
+ * Information pertaining to ExecutorGetLockRels() invocation for a given
+ * plan.
+ */
+typedef struct ExecGetLockRelsContext
+{
+ NodeTag type;
+
+ PlannedStmt *stmt; /* target plan */
+ ParamListInfo params; /* EXTERN parameters available for pruning */
+
+ /* Output parameters for ExecGetLockRels and its subroutines. */
+ Bitmapset *lockrels;
+
+ /* See the comment in the definition of ExecLockRelsInfo struct. */
+ List *initPruningOutputs;
+ int *ipoIndexes;
+} ExecGetLockRelsContext;
+
+/*
+ * Appends the provided PlanInitPruningOutput to
+ * ExecGetLockRelsContext.initPruningOutputs
+ */
+#define ExecStorePlanInitPruningOutput(cxt, initPruningOutput, plannode) \
+ do { \
+ (cxt)->initPruningOutputs = lappend((cxt)->initPruningOutputs, initPruningOutput); \
+ (cxt)->ipoIndexes[(plannode)->plan_node_id] = list_length((cxt)->initPruningOutputs); \
+ } while (0)
+
+/*
+ * Finds the PlanInitPruningOutput for a given Plan node in
+ * ExecLockRelsInfo.initPruningOutputs.
+ */
+#define ExecFetchPlanInitPruningOutput(execlockrelsinfo, plannode) \
+ (((execlockrelsinfo) != NULL && (execlockrelsinfo)->initPruningOutputs != NIL) ? \
+ list_nth((execlockrelsinfo)->initPruningOutputs, \
+ (execlockrelsinfo)->ipoIndexes[(plannode)->plan_node_id] - 1) : NULL)
+
+/* ---------------
+ * PlanInitPruningOutput
+ *
+ * Node to remember the result of performing initial partition pruning steps
+ * during ExecutorGetLockRels() on nodes that support pruning.
+ *
+ * ExecGetLockRelsDoInitialPruning(), which runs during ExecutorGetLockRels(),
+ * creates it and stores it in the corresponding ExecLockRelsInfo.
+ *
+ * ExecInitPartitionPruning(), which runs during ExecutorStart(), fetches it
+ * from the EState's ExecLockRelsInfo (if any) and uses the value of
+ * initially_valid_subplans contained in it as-is to select the subplans to be
+ * initialized for execution, instead of re-evaluating that by performing
+ * initial pruning again.
+ */
+typedef struct PlanInitPruningOutput
+{
+ NodeTag type;
+
+ Bitmapset *initially_valid_subplans;
+} PlanInitPruningOutput;
+
/* ----------------
* PlanState node
*
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 5d075f0c34..d365fc4402 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -96,6 +96,11 @@ typedef enum NodeTag
T_PartitionPruneStepCombine,
T_PlanInvalItem,
+ /* TAGS FOR EXECUTOR PREP NODES (execnodes.h) */
+ T_ExecGetLockRelsContext,
+ T_ExecLockRelsInfo,
+ T_PlanInitPruningOutput,
+
/*
* TAGS FOR PLAN STATE NODES (execnodes.h)
*
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 5327d9ba8b..019719c1a4 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -129,6 +129,10 @@ typedef struct PlannerGlobal
char maxParallelHazard; /* worst PROPARALLEL hazard level */
+ bool containsInitialPruning; /* Do some Plan nodes in the tree
+ * have initial (pre-exec) pruning
+ * steps? */
+
PartitionDirectory partition_directory; /* partition descriptors */
Bitmapset *elidedAppendPartedRels; /* Combined partitioned_rels of all
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index bd87c35d6c..bfdb5bbf28 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -59,10 +59,16 @@ typedef struct PlannedStmt
bool parallelModeNeeded; /* parallel mode required to execute? */
+ bool containsInitialPruning; /* Do some Plan nodes in the tree
+ * have initial (pre-exec) pruning
+ * steps? */
+
int jitFlags; /* which forms of JIT should be performed */
struct Plan *planTree; /* tree of Plan nodes */
+ int numPlanNodes; /* number of nodes in planTree */
+
List *rtable; /* list of RangeTblEntry nodes */
/* rtable indexes of target relations for INSERT/UPDATE/DELETE */
@@ -1189,6 +1195,13 @@ typedef struct PlanRowMark
* prune_infos List of Lists containing PartitionedRelPruneInfo nodes,
* one sublist per run-time-prunable partition hierarchy
* appearing in the parent plan node's subplans.
+ *
+ * needs_init_pruning Does any of the PartitionedRelPruneInfos in
+ * prune_infos have its initial_pruning_steps set?
+ *
+ * needs_exec_pruning Does any of the PartitionedRelPruneInfos in
+ * prune_infos have its exec_pruning_steps set?
+ *
* other_subplans Indexes of any subplans that are not accounted for
* by any of the PartitionedRelPruneInfo nodes in
* "prune_infos". These subplans must not be pruned.
@@ -1197,6 +1210,8 @@ typedef struct PartitionPruneInfo
{
NodeTag type;
List *prune_infos;
+ bool needs_init_pruning;
+ bool needs_exec_pruning;
Bitmapset *other_subplans;
} PartitionPruneInfo;
diff --git a/src/include/tcop/tcopprot.h b/src/include/tcop/tcopprot.h
index 92291a750d..bf80c53bed 100644
--- a/src/include/tcop/tcopprot.h
+++ b/src/include/tcop/tcopprot.h
@@ -64,7 +64,7 @@ extern PlannedStmt *pg_plan_query(Query *querytree, const char *query_string,
ParamListInfo boundParams);
extern List *pg_plan_queries(List *querytrees, const char *query_string,
int cursorOptions,
- ParamListInfo boundParams);
+ ParamListInfo boundParams, List **execlockrelsinfo_list);
extern bool check_max_stack_depth(int *newval, void **extra, GucSource source);
extern void assign_max_stack_depth(int newval, void *extra);
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 95b99e3d25..56b0dcc6bd 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -148,6 +148,9 @@ typedef struct CachedPlan
{
int magic; /* should equal CACHEDPLAN_MAGIC */
List *stmt_list; /* list of PlannedStmts */
+ List *execlockrelsinfo_list; /* list of ExecutorGetLockRelsResult with one
+ * element for each of stmt_list; NIL
+ * if not a generic plan */
bool is_oneshot; /* is it a "oneshot" plan? */
bool is_saved; /* is CachedPlan in a long-lived context? */
bool is_valid; /* is the stmt_list currently valid? */
@@ -158,6 +161,9 @@ typedef struct CachedPlan
int generation; /* parent's generation number for this plan */
int refcount; /* count of live references to this struct */
MemoryContext context; /* context containing this CachedPlan */
+ MemoryContext execlockrelsinfo_context; /* context containing
+ * execlockrelsinfo_list,
+ * a child of the above context */
} CachedPlan;
/*
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index aeddbdafe5..9abace6734 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -137,6 +137,10 @@ typedef struct PortalData
CommandTag commandTag; /* command tag for original query */
QueryCompletion qc; /* command completion data for executed query */
List *stmts; /* list of PlannedStmts */
+ List *execlockrelsinfos; /* list of ExecutorGetLockRelsResults with one element
+ * for each of 'stmts'; same as
+ * cplan->execlockrelsinfo_list if cplan is
+ * not NULL */
CachedPlan *cplan; /* CachedPlan, if stmts are from one */
ParamListInfo portalParams; /* params to pass to query */
@@ -241,6 +245,7 @@ extern void PortalDefineQuery(Portal portal,
const char *sourceText,
CommandTag commandTag,
List *stmts,
+ List *execlockrelsinfos,
CachedPlan *cplan);
extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
extern void PortalCreateHoldStore(Portal portal);
--
2.24.1
[application/x-patch] v6-0001-Some-refactoring-of-runtime-pruning-code.patch (26.5K, 5-v6-0001-Some-refactoring-of-runtime-pruning-code.patch)
From df8186c0e4a76f31c1f803a953f2c98ac88f9dc8 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Wed, 2 Mar 2022 15:17:55 +0900
Subject: [PATCH v6 1/4] Some refactoring of runtime pruning code
This does two things mainly:
* Move the execution pruning initialization steps that are common
between both ExecInitAppend() and ExecInitMergeAppend() into a new
function ExecInitPartitionPruning() defined in execPartition.c.
Thus, ExecCreatePartitionPruneState() and
ExecFindInitialMatchingSubPlans() need not be exported.
* Add an ExprContext field to PartitionPruneContext. This removes the
runtime pruning code's implicit assumption that the ExprContext needed
to evaluate pruning expressions is always provided by the PlanState.
A future patch will allow runtime pruning (at least the initial
pruning steps) to be performed before the corresponding PlanState has
been created, so this will help.
---
src/backend/executor/execPartition.c | 340 ++++++++++++++++---------
src/backend/executor/nodeAppend.c | 33 +--
src/backend/executor/nodeMergeAppend.c | 32 +--
src/backend/partitioning/partprune.c | 20 +-
src/include/executor/execPartition.h | 9 +-
src/include/partitioning/partprune.h | 2 +
6 files changed, 252 insertions(+), 184 deletions(-)
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 90ed1485d1..7ff5a95f05 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -182,11 +182,18 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
bool *isnull,
int maxfieldlen);
static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
+static PartitionPruneState *ExecCreatePartitionPruneState(PlanState *planstate,
+ PartitionPruneInfo *partitionpruneinfo);
+static Bitmapset *ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate);
static void ExecInitPruningContext(PartitionPruneContext *context,
List *pruning_steps,
PartitionDesc partdesc,
PartitionKey partkey,
- PlanState *planstate);
+ PlanState *planstate,
+ ExprContext *econtext);
+static void PartitionPruneStateFixSubPlanMap(PartitionPruneState *prunestate,
+ Bitmapset *initially_valid_subplans,
+ int n_total_subplans);
static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
@@ -1485,30 +1492,86 @@ adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri)
*
* Functions:
*
- * ExecCreatePartitionPruneState:
+ * ExecInitPartitionPruning:
* Creates the PartitionPruneState required by each of the two pruning
* functions. Details stored include how to map the partition index
- * returned by the partition pruning code into subplan indexes.
- *
- * ExecFindInitialMatchingSubPlans:
- * Returns indexes of matching subplans. Partition pruning is attempted
- * without any evaluation of expressions containing PARAM_EXEC Params.
- * This function must be called during executor startup for the parent
- * plan before the subplans themselves are initialized. Subplans which
- * are found not to match by this function must be removed from the
- * plan's list of subplans during execution, as this function performs a
- * remap of the partition index to subplan index map and the newly
- * created map provides indexes only for subplans which remain after
- * calling this function.
+ * returned by the partition pruning code into subplan indexes. Also
+ * determines the set of initially valid subplans by performing the
+ * initial pruning steps; a caller such as ExecInitAppend need only
+ * initialize those subplans. Maps in PartitionPruneState are updated
+ * to account for any subplans that initial pruning eliminated.
*
* ExecFindMatchingSubPlans:
* Returns indexes of matching subplans after evaluating all available
- * expressions. This function can only be called during execution and
- * must be called again each time the value of a Param listed in
- * PartitionPruneState's 'execparamids' changes.
+ * expressions, that is, using execution pruning steps. This function
+ * can only be called during execution and must be called again each time
+ * the value of a Param listed in PartitionPruneState's 'execparamids'
+ * changes.
*-------------------------------------------------------------------------
*/
+/*
+ * ExecInitPartitionPruning
+ * Initialize data structure needed for run-time partition pruning
+ *
+ * Initial pruning can be done immediately, so it is done here if needed, and
+ * the set of indexes of the surviving subplans is returned in the output
+ * parameter *initially_valid_subplans.
+ *
+ * If subplans are indeed pruned, subplan_map arrays contained in the returned
+ * PartitionPruneState are re-sequenced to not count those, though only if the
+ * maps will be needed for subsequent execution pruning passes.
+ */
+PartitionPruneState *
+ExecInitPartitionPruning(PlanState *planstate,
+ int n_total_subplans,
+ PartitionPruneInfo *pruneinfo,
+ Bitmapset **initially_valid_subplans)
+{
+ PartitionPruneState *prunestate;
+ EState *estate = planstate->state;
+
+ /* We may need an expression context to evaluate partition exprs */
+ ExecAssignExprContext(estate, planstate);
+
+ /*
+ * Create the working data structure for pruning.
+ */
+ prunestate = ExecCreatePartitionPruneState(planstate, pruneinfo);
+
+ /*
+ * Perform an initial partition prune, if required.
+ */
+ if (prunestate->do_initial_prune)
+ {
+ /* Determine which subplans survive initial pruning */
+ *initially_valid_subplans = ExecFindInitialMatchingSubPlans(prunestate);
+ }
+ else
+ {
+ /* We'll need to initialize all subplans */
+ Assert(n_total_subplans > 0);
+ *initially_valid_subplans = bms_add_range(NULL, 0,
+ n_total_subplans - 1);
+ }
+
+ /*
+ * Re-sequence subplan indexes contained in prunestate to account for any
+ * that were removed above due to initial pruning.
+ *
+ * We can safely skip this when !do_exec_prune, even though that leaves
+ * invalid data in prunestate, because that data won't be consulted again
+ * (cf initial Assert in ExecFindMatchingSubPlans).
+ */
+ if (prunestate->do_exec_prune &&
+ bms_num_members(*initially_valid_subplans) < n_total_subplans)
+ PartitionPruneStateFixSubPlanMap(prunestate,
+ *initially_valid_subplans,
+ n_total_subplans);
+
+ return prunestate;
+}
+
/*
* ExecCreatePartitionPruneState
* Build the data structure required for calling
@@ -1527,7 +1590,7 @@ adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri)
* re-used each time we re-evaluate which partitions match the pruning steps
* provided in each PartitionedRelPruneInfo.
*/
-PartitionPruneState *
+static PartitionPruneState *
ExecCreatePartitionPruneState(PlanState *planstate,
PartitionPruneInfo *partitionpruneinfo)
{
@@ -1536,6 +1599,7 @@ ExecCreatePartitionPruneState(PlanState *planstate,
int n_part_hierarchies;
ListCell *lc;
int i;
+ ExprContext *econtext = planstate->ps_ExprContext;
/* For data reading, executor always omits detached partitions */
if (estate->es_partition_directory == NULL)
@@ -1709,7 +1773,8 @@ ExecCreatePartitionPruneState(PlanState *planstate,
{
ExecInitPruningContext(&pprune->initial_context,
pinfo->initial_pruning_steps,
- partdesc, partkey, planstate);
+ partdesc, partkey, planstate,
+ econtext);
/* Record whether initial pruning is needed at any level */
prunestate->do_initial_prune = true;
}
@@ -1718,7 +1783,8 @@ ExecCreatePartitionPruneState(PlanState *planstate,
{
ExecInitPruningContext(&pprune->exec_context,
pinfo->exec_pruning_steps,
- partdesc, partkey, planstate);
+ partdesc, partkey, planstate,
+ econtext);
/* Record whether exec pruning is needed at any level */
prunestate->do_exec_prune = true;
}
@@ -1746,7 +1812,8 @@ ExecInitPruningContext(PartitionPruneContext *context,
List *pruning_steps,
PartitionDesc partdesc,
PartitionKey partkey,
- PlanState *planstate)
+ PlanState *planstate,
+ ExprContext *econtext)
{
int n_steps;
int partnatts;
@@ -1767,6 +1834,7 @@ ExecInitPruningContext(PartitionPruneContext *context,
context->ppccontext = CurrentMemoryContext;
context->planstate = planstate;
+ context->exprcontext = econtext;
/* Initialize expression state for each expression we need */
context->exprstates = (ExprState **)
@@ -1795,8 +1863,20 @@ ExecInitPruningContext(PartitionPruneContext *context,
step->step.step_id,
keyno);
- context->exprstates[stateidx] =
- ExecInitExpr(expr, context->planstate);
+ /*
+ * When planstate is NULL, pruning_steps is known not to
+ * contain any expressions that depend on the parent plan.
+ * In that case, information about any available EXTERN
+ * parameters must be passed explicitly, and the caller must
+ * have made it available via econtext.
+ */
+ if (planstate == NULL)
+ context->exprstates[stateidx] =
+ ExecInitExprWithParams(expr,
+ econtext->ecxt_param_list_info);
+ else
+ context->exprstates[stateidx] =
+ ExecInitExpr(expr, context->planstate);
}
keyno++;
}
@@ -1809,18 +1889,11 @@ ExecInitPruningContext(PartitionPruneContext *context,
* pruning, disregarding any pruning constraints involving PARAM_EXEC
* Params.
*
- * If additional pruning passes will be required (because of PARAM_EXEC
- * Params), we must also update the translation data that allows conversion
- * of partition indexes into subplan indexes to account for the unneeded
- * subplans having been removed.
- *
* Must only be called once per 'prunestate', and only if initial pruning
* is required.
- *
- * 'nsubplans' must be passed as the total number of unpruned subplans.
*/
-Bitmapset *
-ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate, int nsubplans)
+static Bitmapset *
+ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate)
{
Bitmapset *result = NULL;
MemoryContext oldcontext;
@@ -1845,14 +1918,20 @@ ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate, int nsubplans)
PartitionedRelPruningData *pprune;
prunedata = prunestate->partprunedata[i];
+
+ /*
+ * We pass the 1st item belonging to the root table of the hierarchy
+ * and find_matching_subplans_recurse() takes care of recursing to
+ * other (lower-level) parents as needed.
+ */
pprune = &prunedata->partrelprunedata[0];
/* Perform pruning without using PARAM_EXEC Params */
find_matching_subplans_recurse(prunedata, pprune, true, &result);
- /* Expression eval may have used space in node's ps_ExprContext too */
+ /* Expression eval may have used space in ExprContext too */
if (pprune->initial_pruning_steps)
- ResetExprContext(pprune->initial_context.planstate->ps_ExprContext);
+ ResetExprContext(pprune->initial_context.exprcontext);
}
/* Add in any subplans that partition pruning didn't account for */
@@ -1865,118 +1944,120 @@ ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate, int nsubplans)
MemoryContextReset(prunestate->prune_context);
+ return result;
+}
+
+/*
+ * PartitionPruneStateFixSubPlanMap
+ * Fix mapping of partition indexes to subplan indexes contained in
+ * prunestate by considering the new list of subplans that survived
+ * initial pruning
+ *
+ * Subplans that were previously indexed 0..(n_total_subplans - 1) are
+ * re-indexed to the range 0..(bms_num_members(initially_valid_subplans) - 1).
+ */
+static void
+PartitionPruneStateFixSubPlanMap(PartitionPruneState *prunestate,
+ Bitmapset *initially_valid_subplans,
+ int n_total_subplans)
+{
+ int *new_subplan_indexes;
+ Bitmapset *new_other_subplans;
+ int i;
+ int newidx;
+
/*
- * If exec-time pruning is required and we pruned subplans above, then we
- * must re-sequence the subplan indexes so that ExecFindMatchingSubPlans
- * properly returns the indexes from the subplans which will remain after
- * execution of this function.
- *
- * We can safely skip this when !do_exec_prune, even though that leaves
- * invalid data in prunestate, because that data won't be consulted again
- * (cf initial Assert in ExecFindMatchingSubPlans).
+ * First we must build a temporary array which maps old subplan
+ * indexes to new ones. For convenience of initialization, we use
+ * 1-based indexes in this array and leave pruned items as 0.
*/
- if (prunestate->do_exec_prune && bms_num_members(result) < nsubplans)
+ new_subplan_indexes = (int *) palloc0(sizeof(int) * n_total_subplans);
+ newidx = 1;
+ i = -1;
+ while ((i = bms_next_member(initially_valid_subplans, i)) >= 0)
{
- int *new_subplan_indexes;
- Bitmapset *new_other_subplans;
- int i;
- int newidx;
+ Assert(i < n_total_subplans);
+ new_subplan_indexes[i] = newidx++;
+ }
- /*
- * First we must build a temporary array which maps old subplan
- * indexes to new ones. For convenience of initialization, we use
- * 1-based indexes in this array and leave pruned items as 0.
- */
- new_subplan_indexes = (int *) palloc0(sizeof(int) * nsubplans);
- newidx = 1;
- i = -1;
- while ((i = bms_next_member(result, i)) >= 0)
- {
- Assert(i < nsubplans);
- new_subplan_indexes[i] = newidx++;
- }
+ /*
+ * Now we can update each PartitionedRelPruneInfo's subplan_map with
+ * new subplan indexes. We must also recompute its present_parts
+ * bitmap.
+ */
+ for (i = 0; i < prunestate->num_partprunedata; i++)
+ {
+ PartitionPruningData *prunedata = prunestate->partprunedata[i];
+ int j;
/*
- * Now we can update each PartitionedRelPruneInfo's subplan_map with
- * new subplan indexes. We must also recompute its present_parts
- * bitmap.
+ * Within each hierarchy, we perform this loop in back-to-front
+ * order so that we determine present_parts for the lowest-level
+ * partitioned tables first. This way we can tell whether a
+ * sub-partitioned table's partitions were entirely pruned so we
+ * can exclude it from the current level's present_parts.
*/
- for (i = 0; i < prunestate->num_partprunedata; i++)
+ for (j = prunedata->num_partrelprunedata - 1; j >= 0; j--)
{
- PartitionPruningData *prunedata = prunestate->partprunedata[i];
- int j;
+ PartitionedRelPruningData *pprune = &prunedata->partrelprunedata[j];
+ int nparts = pprune->nparts;
+ int k;
- /*
- * Within each hierarchy, we perform this loop in back-to-front
- * order so that we determine present_parts for the lowest-level
- * partitioned tables first. This way we can tell whether a
- * sub-partitioned table's partitions were entirely pruned so we
- * can exclude it from the current level's present_parts.
- */
- for (j = prunedata->num_partrelprunedata - 1; j >= 0; j--)
- {
- PartitionedRelPruningData *pprune = &prunedata->partrelprunedata[j];
- int nparts = pprune->nparts;
- int k;
+ /* We just rebuild present_parts from scratch */
+ bms_free(pprune->present_parts);
+ pprune->present_parts = NULL;
- /* We just rebuild present_parts from scratch */
- bms_free(pprune->present_parts);
- pprune->present_parts = NULL;
+ for (k = 0; k < nparts; k++)
+ {
+ int oldidx = pprune->subplan_map[k];
+ int subidx;
- for (k = 0; k < nparts; k++)
+ /*
+ * If this partition existed as a subplan then change the
+ * old subplan index to the new subplan index. The new
+ * index may become -1 if the partition was pruned above,
+ * or it may just come earlier in the subplan list due to
+ * some subplans being removed earlier in the list. If
+ * it's a subpartition, add it to present_parts unless
+ * it's entirely pruned.
+ */
+ if (oldidx >= 0)
{
- int oldidx = pprune->subplan_map[k];
- int subidx;
-
- /*
- * If this partition existed as a subplan then change the
- * old subplan index to the new subplan index. The new
- * index may become -1 if the partition was pruned above,
- * or it may just come earlier in the subplan list due to
- * some subplans being removed earlier in the list. If
- * it's a subpartition, add it to present_parts unless
- * it's entirely pruned.
- */
- if (oldidx >= 0)
- {
- Assert(oldidx < nsubplans);
- pprune->subplan_map[k] = new_subplan_indexes[oldidx] - 1;
+ Assert(oldidx < n_total_subplans);
+ pprune->subplan_map[k] = new_subplan_indexes[oldidx] - 1;
- if (new_subplan_indexes[oldidx] > 0)
- pprune->present_parts =
- bms_add_member(pprune->present_parts, k);
- }
- else if ((subidx = pprune->subpart_map[k]) >= 0)
- {
- PartitionedRelPruningData *subprune;
+ if (new_subplan_indexes[oldidx] > 0)
+ pprune->present_parts =
+ bms_add_member(pprune->present_parts, k);
+ }
+ else if ((subidx = pprune->subpart_map[k]) >= 0)
+ {
+ PartitionedRelPruningData *subprune;
- subprune = &prunedata->partrelprunedata[subidx];
+ subprune = &prunedata->partrelprunedata[subidx];
- if (!bms_is_empty(subprune->present_parts))
- pprune->present_parts =
- bms_add_member(pprune->present_parts, k);
- }
+ if (!bms_is_empty(subprune->present_parts))
+ pprune->present_parts =
+ bms_add_member(pprune->present_parts, k);
}
}
}
+ }
- /*
- * We must also recompute the other_subplans set, since indexes in it
- * may change.
- */
- new_other_subplans = NULL;
- i = -1;
- while ((i = bms_next_member(prunestate->other_subplans, i)) >= 0)
- new_other_subplans = bms_add_member(new_other_subplans,
- new_subplan_indexes[i] - 1);
-
- bms_free(prunestate->other_subplans);
- prunestate->other_subplans = new_other_subplans;
+ /*
+ * We must also recompute the other_subplans set, since indexes in it
+ * may change.
+ */
+ new_other_subplans = NULL;
+ i = -1;
+ while ((i = bms_next_member(prunestate->other_subplans, i)) >= 0)
+ new_other_subplans = bms_add_member(new_other_subplans,
+ new_subplan_indexes[i] - 1);
- pfree(new_subplan_indexes);
- }
+ bms_free(prunestate->other_subplans);
+ prunestate->other_subplans = new_other_subplans;
- return result;
+ pfree(new_subplan_indexes);
}
/*
@@ -2018,11 +2099,16 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate)
prunedata = prunestate->partprunedata[i];
pprune = &prunedata->partrelprunedata[0];
+ /*
+ * We pass the 1st item belonging to the root table of the hierarchy
+ * and find_matching_subplans_recurse() takes care of recursing to
+ * other (lower-level) parents as needed.
+ */
find_matching_subplans_recurse(prunedata, pprune, false, &result);
- /* Expression eval may have used space in node's ps_ExprContext too */
+ /* Expression eval may have used space in ExprContext too */
if (pprune->exec_pruning_steps)
- ResetExprContext(pprune->exec_context.planstate->ps_ExprContext);
+ ResetExprContext(pprune->exec_context.exprcontext);
}
/* Add in any subplans that partition pruning didn't account for */
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 7937f1c88f..5b6d3eb23b 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -138,30 +138,17 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
{
PartitionPruneState *prunestate;
- /* We may need an expression context to evaluate partition exprs */
- ExecAssignExprContext(estate, &appendstate->ps);
-
- /* Create the working data structure for pruning. */
- prunestate = ExecCreatePartitionPruneState(&appendstate->ps,
- node->part_prune_info);
+ /*
+ * Set up pruning data structure. Initial pruning steps, if any, are
+ * performed as part of the setup, adding the set of indexes of
+ * surviving subplans to 'validsubplans'.
+ */
+ prunestate = ExecInitPartitionPruning(&appendstate->ps,
+ list_length(node->appendplans),
+ node->part_prune_info,
+ &validsubplans);
appendstate->as_prune_state = prunestate;
-
- /* Perform an initial partition prune, if required. */
- if (prunestate->do_initial_prune)
- {
- /* Determine which subplans survive initial pruning */
- validsubplans = ExecFindInitialMatchingSubPlans(prunestate,
- list_length(node->appendplans));
-
- nplans = bms_num_members(validsubplans);
- }
- else
- {
- /* We'll need to initialize all subplans */
- nplans = list_length(node->appendplans);
- Assert(nplans > 0);
- validsubplans = bms_add_range(NULL, 0, nplans - 1);
- }
+ nplans = bms_num_members(validsubplans);
/*
* When no run-time pruning is required and there's at least one
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index 418f89dea8..9a9f29e845 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -86,29 +86,17 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
{
PartitionPruneState *prunestate;
- /* We may need an expression context to evaluate partition exprs */
- ExecAssignExprContext(estate, &mergestate->ps);
-
- prunestate = ExecCreatePartitionPruneState(&mergestate->ps,
- node->part_prune_info);
+ /*
+ * Set up pruning data structure. Initial pruning steps, if any, are
+ * performed as part of the setup, adding the set of indexes of
+ * surviving subplans to 'validsubplans'.
+ */
+ prunestate = ExecInitPartitionPruning(&mergestate->ps,
+ list_length(node->mergeplans),
+ node->part_prune_info,
+ &validsubplans);
mergestate->ms_prune_state = prunestate;
-
- /* Perform an initial partition prune, if required. */
- if (prunestate->do_initial_prune)
- {
- /* Determine which subplans survive initial pruning */
- validsubplans = ExecFindInitialMatchingSubPlans(prunestate,
- list_length(node->mergeplans));
-
- nplans = bms_num_members(validsubplans);
- }
- else
- {
- /* We'll need to initialize all subplans */
- nplans = list_length(node->mergeplans);
- Assert(nplans > 0);
- validsubplans = bms_add_range(NULL, 0, nplans - 1);
- }
+ nplans = bms_num_members(validsubplans);
/*
* When no run-time pruning is required and there's at least one
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 1bc00826c1..7080cb25d9 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -798,6 +798,7 @@ prune_append_rel_partitions(RelOptInfo *rel)
/* These are not valid when being called from the planner */
context.planstate = NULL;
+ context.exprcontext = NULL;
context.exprstates = NULL;
/* Actual pruning happens here. */
@@ -808,8 +809,8 @@ prune_append_rel_partitions(RelOptInfo *rel)
* get_matching_partitions
* Determine partitions that survive partition pruning
*
- * Note: context->planstate must be set to a valid PlanState when the
- * pruning_steps were generated with a target other than PARTTARGET_PLANNER.
+ * Note: context->exprcontext must be valid when the pruning_steps were
+ * generated with a target other than PARTTARGET_PLANNER.
*
* Returns a Bitmapset of the RelOptInfo->part_rels indexes of the surviving
* partitions.
@@ -3654,7 +3655,7 @@ match_boolean_partition_clause(Oid partopfamily, Expr *clause, Expr *partkey,
* exprstate array.
*
* Note that the evaluated result may be in the per-tuple memory context of
- * context->planstate->ps_ExprContext, and we may have leaked other memory
+ * context->exprcontext, and we may have leaked other memory
* there too. This memory must be recovered by resetting that ExprContext
* after we're done with the pruning operation (see execPartition.c).
*/
@@ -3677,13 +3678,18 @@ partkey_datum_from_expr(PartitionPruneContext *context,
ExprContext *ectx;
/*
- * We should never see a non-Const in a step unless we're running in
- * the executor.
+ * We should never see a non-Const in a step unless the caller has
+ * passed a valid ExprContext.
+ *
+ * When context->planstate is valid, context->exprcontext is the same
+ * as context->planstate->ps_ExprContext.
*/
- Assert(context->planstate != NULL);
+ Assert(context->planstate != NULL || context->exprcontext != NULL);
+ Assert(context->planstate == NULL ||
+ (context->exprcontext == context->planstate->ps_ExprContext));
exprstate = context->exprstates[stateidx];
- ectx = context->planstate->ps_ExprContext;
+ ectx = context->exprcontext;
*value = ExecEvalExprSwitchContext(exprstate, ectx, isnull);
}
}
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 603d8becc4..fd5735a946 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -119,10 +119,9 @@ extern ResultRelInfo *ExecFindPartition(ModifyTableState *mtstate,
EState *estate);
extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
PartitionTupleRouting *proute);
-extern PartitionPruneState *ExecCreatePartitionPruneState(PlanState *planstate,
- PartitionPruneInfo *partitionpruneinfo);
+extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
+ int n_total_subplans,
+ PartitionPruneInfo *pruneinfo,
+ Bitmapset **initially_valid_subplans);
extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate);
-extern Bitmapset *ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate,
- int nsubplans);
-
#endif /* EXECPARTITION_H */
diff --git a/src/include/partitioning/partprune.h b/src/include/partitioning/partprune.h
index ee11b6feae..90684efa25 100644
--- a/src/include/partitioning/partprune.h
+++ b/src/include/partitioning/partprune.h
@@ -41,6 +41,7 @@ struct RelOptInfo;
* subsidiary data, such as the FmgrInfos.
* planstate Points to the parent plan node's PlanState when called
* during execution; NULL when called from the planner.
+ * exprcontext ExprContext to use when evaluating pruning expressions
* exprstates Array of ExprStates, indexed as per PruneCxtStateIdx; one
* for each partition key in each pruning step. Allocated if
* planstate is non-NULL, otherwise NULL.
@@ -56,6 +57,7 @@ typedef struct PartitionPruneContext
FmgrInfo *stepcmpfuncs;
MemoryContext ppccontext;
PlanState *planstate;
+ ExprContext *exprcontext;
ExprState **exprstates;
} PartitionPruneContext;
--
2.24.1
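
For readers following along, the re-sequencing done by
PartitionPruneStateFixSubPlanMap() in the patch above can be illustrated
outside the backend. This is only a minimal sketch under some assumptions:
it uses plain int arrays instead of Bitmapset, calloc/free instead of
palloc0/pfree, and the function name resequence_subplan_map is hypothetical
(it does not exist in PostgreSQL). It keeps the patch's trick of a temporary
1-based array in which 0 means "pruned", but it does not rebuild
present_parts or other_subplans.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical stand-alone illustration, not PostgreSQL code.
 *
 * valid[i] is nonzero if old subplan i survived initial pruning.
 * subplan_map[k] holds the old subplan index of partition k, or -1 if
 * the partition has no subplan.  On return, subplan_map[k] holds the
 * new subplan index, or -1 if the subplan was pruned.
 *
 * Returns the number of surviving subplans, or -1 if allocation fails.
 */
static int
resequence_subplan_map(const int *valid, int n_total_subplans,
                       int *subplan_map, int nparts)
{
    int *new_subplan_indexes;
    int  newidx = 1;
    int  i;
    int  k;

    if (n_total_subplans <= 0)
        return 0;

    /* Temporary old-to-new map; 1-based so that 0 can mean "pruned". */
    new_subplan_indexes = calloc((size_t) n_total_subplans, sizeof(int));
    if (new_subplan_indexes == NULL)
        return -1;

    for (i = 0; i < n_total_subplans; i++)
    {
        if (valid[i])
            new_subplan_indexes[i] = newidx++;
    }

    /* Rewrite each partition's subplan index using the temporary map. */
    for (k = 0; k < nparts; k++)
    {
        int oldidx = subplan_map[k];

        if (oldidx >= 0)
        {
            assert(oldidx < n_total_subplans);
            /* A pruned subplan has 0 in the map, so this yields -1. */
            subplan_map[k] = new_subplan_indexes[oldidx] - 1;
        }
    }

    free(new_subplan_indexes);
    return newidx - 1;
}
```

With five subplans of which 0, 2 and 3 survive, old indexes 2 and 3 become
1 and 2, while partitions whose subplans were pruned end up mapped to -1.
That is why the patch only bothers with this step when do_exec_prune is
set: the maps are consulted again only by later execution-time pruning.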
* Re: generic plans and "initial" pruning
@ 2022-03-28 07:28 Amit Langote <[email protected]>
parent: Amit Langote <[email protected]>
0 siblings, 1 reply; 71+ messages in thread
From: Amit Langote @ 2022-03-28 07:28 UTC (permalink / raw)
To: Robert Haas <[email protected]>; +Cc: Tom Lane <[email protected]>; pgsql-hackers; David Rowley *EXTERN* <[email protected]>
On Mon, Mar 28, 2022 at 4:17 PM Amit Langote <[email protected]> wrote:
> Other than the changes mentioned above, the updated patch now contains
> a bit more commentary than earlier versions, mostly around
> AcquireExecutorLocks()'s new way of determining the set of relations
> to lock and the significantly redesigned working of the "initial"
> execution pruning.
I forgot to rebase over the latest HEAD, so here's v7. I also fixed the
_out and _read functions for PlanInitPruningOutput, which were using an
obsolete node label.
--
Amit Langote
EDB: http://www.enterprisedb.com
Attachments:
[application/octet-stream] v7-0002-Add-Merge-Append.partitioned_rels.patch (17.4K, 2-v7-0002-Add-Merge-Append.partitioned_rels.patch)
From b43aac217ba51854c5a22636f94f14e81bae3991 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Thu, 24 Mar 2022 22:47:03 +0900
Subject: [PATCH v7 2/4] Add [Merge]Append.partitioned_rels
This records the RT indexes of all partitioned ancestors leading up to
the leaf partitions that are appended by the node.
If a given [Merge]Append node is left out of the plan because its list
of child subplans has only one element, then its partitioned_rels set
is added to PlannerGlobal.elidedAppendPartedRels, which is passed down
to the executor through PlannedStmt.
There are no users of partitioned_rels and elidedAppendPartedRels
as of this commit. A later commit, however, will need to extract the
set of relations that must be locked to make a plan tree safe for
execution by walking the plan tree itself, so it will help to have
the partitioned tables present in the plan tree as well. Note that
the executor currently relies on the fact that the set of relations
to be locked can be obtained by simply scanning the range table that
is made available in PlannedStmt along with the plan tree.
---
src/backend/nodes/copyfuncs.c | 3 +++
src/backend/nodes/outfuncs.c | 5 +++++
src/backend/nodes/readfuncs.c | 3 +++
src/backend/optimizer/path/joinrels.c | 9 ++++++++
src/backend/optimizer/plan/createplan.c | 18 +++++++++++++++-
src/backend/optimizer/plan/planner.c | 8 +++++++
src/backend/optimizer/plan/setrefs.c | 28 +++++++++++++++++++++++++
src/backend/optimizer/util/inherit.c | 16 ++++++++++++++
src/backend/optimizer/util/relnode.c | 20 ++++++++++++++++++
src/include/nodes/pathnodes.h | 22 +++++++++++++++++++
src/include/nodes/plannodes.h | 17 +++++++++++++++
11 files changed, 148 insertions(+), 1 deletion(-)
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 2cbd8aa0df..d4b5cc7e59 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -106,6 +106,7 @@ _copyPlannedStmt(const PlannedStmt *from)
COPY_NODE_FIELD(invalItems);
COPY_NODE_FIELD(paramExecTypes);
COPY_NODE_FIELD(utilityStmt);
+ COPY_BITMAPSET_FIELD(elidedAppendPartedRels);
COPY_LOCATION_FIELD(stmt_location);
COPY_SCALAR_FIELD(stmt_len);
@@ -253,6 +254,7 @@ _copyAppend(const Append *from)
COPY_SCALAR_FIELD(nasyncplans);
COPY_SCALAR_FIELD(first_partial_plan);
COPY_NODE_FIELD(part_prune_info);
+ COPY_BITMAPSET_FIELD(partitioned_rels);
return newnode;
}
@@ -281,6 +283,7 @@ _copyMergeAppend(const MergeAppend *from)
COPY_POINTER_FIELD(collations, from->numCols * sizeof(Oid));
COPY_POINTER_FIELD(nullsFirst, from->numCols * sizeof(bool));
COPY_NODE_FIELD(part_prune_info);
+ COPY_BITMAPSET_FIELD(partitioned_rels);
return newnode;
}
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index c25f0bd684..99056272f3 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -324,6 +324,7 @@ _outPlannedStmt(StringInfo str, const PlannedStmt *node)
WRITE_NODE_FIELD(invalItems);
WRITE_NODE_FIELD(paramExecTypes);
WRITE_NODE_FIELD(utilityStmt);
+ WRITE_BITMAPSET_FIELD(elidedAppendPartedRels);
WRITE_LOCATION_FIELD(stmt_location);
WRITE_INT_FIELD(stmt_len);
}
@@ -443,6 +444,7 @@ _outAppend(StringInfo str, const Append *node)
WRITE_INT_FIELD(nasyncplans);
WRITE_INT_FIELD(first_partial_plan);
WRITE_NODE_FIELD(part_prune_info);
+ WRITE_BITMAPSET_FIELD(partitioned_rels);
}
static void
@@ -460,6 +462,7 @@ _outMergeAppend(StringInfo str, const MergeAppend *node)
WRITE_OID_ARRAY(collations, node->numCols);
WRITE_BOOL_ARRAY(nullsFirst, node->numCols);
WRITE_NODE_FIELD(part_prune_info);
+ WRITE_BITMAPSET_FIELD(partitioned_rels);
}
static void
@@ -2333,6 +2336,7 @@ _outPlannerGlobal(StringInfo str, const PlannerGlobal *node)
WRITE_BOOL_FIELD(parallelModeOK);
WRITE_BOOL_FIELD(parallelModeNeeded);
WRITE_CHAR_FIELD(maxParallelHazard);
+ WRITE_BITMAPSET_FIELD(elidedAppendPartedRels);
}
static void
@@ -2444,6 +2448,7 @@ _outRelOptInfo(StringInfo str, const RelOptInfo *node)
WRITE_BOOL_FIELD(partbounds_merged);
WRITE_BITMAPSET_FIELD(live_parts);
WRITE_BITMAPSET_FIELD(all_partrels);
+ WRITE_BITMAPSET_FIELD(partitioned_rels);
}
static void
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index e0b3ad1ed2..7536f216bd 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -1662,6 +1662,7 @@ _readPlannedStmt(void)
READ_NODE_FIELD(invalItems);
READ_NODE_FIELD(paramExecTypes);
READ_NODE_FIELD(utilityStmt);
+ READ_BITMAPSET_FIELD(elidedAppendPartedRels);
READ_LOCATION_FIELD(stmt_location);
READ_INT_FIELD(stmt_len);
@@ -1784,6 +1785,7 @@ _readAppend(void)
READ_INT_FIELD(nasyncplans);
READ_INT_FIELD(first_partial_plan);
READ_NODE_FIELD(part_prune_info);
+ READ_BITMAPSET_FIELD(partitioned_rels);
READ_DONE();
}
@@ -1806,6 +1808,7 @@ _readMergeAppend(void)
READ_OID_ARRAY(collations, local_node->numCols);
READ_BOOL_ARRAY(nullsFirst, local_node->numCols);
READ_NODE_FIELD(part_prune_info);
+ READ_BITMAPSET_FIELD(partitioned_rels);
READ_DONE();
}
diff --git a/src/backend/optimizer/path/joinrels.c b/src/backend/optimizer/path/joinrels.c
index 9da3ff2f9a..e74d40fee3 100644
--- a/src/backend/optimizer/path/joinrels.c
+++ b/src/backend/optimizer/path/joinrels.c
@@ -1549,6 +1549,15 @@ try_partitionwise_join(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
populate_joinrel_with_paths(root, child_rel1, child_rel2,
child_joinrel, child_sjinfo,
child_restrictlist);
+
+ /*
+ * A parent relation's partitioned_rels must be a superset of the sets
+ * of all its children, direct or indirect, so bubble up the child
+ * joinrel's set.
+ */
+ joinrel->partitioned_rels =
+ bms_add_members(joinrel->partitioned_rels,
+ child_joinrel->partitioned_rels);
}
}
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index fa069a217c..0026086591 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -26,10 +26,12 @@
#include "nodes/extensible.h"
#include "nodes/makefuncs.h"
#include "nodes/nodeFuncs.h"
+#include "optimizer/appendinfo.h"
#include "optimizer/clauses.h"
#include "optimizer/cost.h"
#include "optimizer/optimizer.h"
#include "optimizer/paramassign.h"
+#include "optimizer/pathnode.h"
#include "optimizer/paths.h"
#include "optimizer/placeholder.h"
#include "optimizer/plancat.h"
@@ -1331,11 +1333,11 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
best_path->subpaths,
prunequal);
}
-
plan->appendplans = subplans;
plan->nasyncplans = nasyncplans;
plan->first_partial_plan = best_path->first_partial_path;
plan->part_prune_info = partpruneinfo;
+ plan->partitioned_rels = bms_copy(rel->partitioned_rels);
copy_generic_path_info(&plan->plan, (Path *) best_path);
@@ -1499,6 +1501,20 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
node->mergeplans = subplans;
node->part_prune_info = partpruneinfo;
+ /*
+ * We need to explicitly add to the plan node the RT indexes of any
+ * partitioned tables whose partitions will be scanned by the nodes in
+ * 'subplans'. There can be multiple RT indexes in the set due to the
+ * partition tree being multi-level and/or this being a plan for UNION ALL
+ * over multiple partition trees. Along with scanrelids of leaf-level Scan
+ * nodes, this allows the executor to lock the full set of relations being
+ * scanned by this node.
+ *
+ * Note that 'apprelids' only contains the top-level base relation(s), so
+ * is not sufficient for the purpose.
+ */
+ node->partitioned_rels = bms_copy(rel->partitioned_rels);
+
/*
* If prepare_sort_from_pathkeys added sort columns, but we were told to
* produce either the exact tlist or a narrow tlist, we should get rid of
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index bd09f85aea..374a9d9753 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -529,6 +529,7 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
result->paramExecTypes = glob->paramExecTypes;
/* utilityStmt should be null, but we might as well copy it */
result->utilityStmt = parse->utilityStmt;
+ result->elidedAppendPartedRels = glob->elidedAppendPartedRels;
result->stmt_location = parse->stmt_location;
result->stmt_len = parse->stmt_len;
@@ -7365,6 +7366,13 @@ create_partitionwise_grouping_paths(PlannerInfo *root,
add_paths_to_append_rel(root, grouped_rel, grouped_live_children);
}
+
+ /*
+ * Input rel might be a partitioned appendrel, though grouped_rel has at
+ * this point taken over its role as the appendrel owning the former's
+ * children, so copy the former's partitioned_rels set into the latter.
+ */
+ grouped_rel->partitioned_rels = bms_copy(input_rel->partitioned_rels);
}
/*
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index a7b11b7f03..dbdeb8ec9d 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -1512,6 +1512,10 @@ set_append_references(PlannerInfo *root,
lfirst(l) = set_plan_refs(root, (Plan *) lfirst(l), rtoffset);
}
+ /* Fix up partitioned_rels before possibly removing the Append below. */
+ aplan->partitioned_rels = offset_relid_set(aplan->partitioned_rels,
+ rtoffset);
+
/*
* See if it's safe to get rid of the Append entirely. For this to be
* safe, there must be only one child plan and that child plan's parallel
@@ -1522,8 +1526,17 @@ set_append_references(PlannerInfo *root,
*/
if (list_length(aplan->appendplans) == 1 &&
((Plan *) linitial(aplan->appendplans))->parallel_aware == aplan->plan.parallel_aware)
+ {
+ /*
+ * Partitioned tables involved, if any, must be made known to the
+ * executor.
+ */
+ root->glob->elidedAppendPartedRels =
+ bms_add_members(root->glob->elidedAppendPartedRels,
+ aplan->partitioned_rels);
return clean_up_removed_plan_level((Plan *) aplan,
(Plan *) linitial(aplan->appendplans));
+ }
/*
* Otherwise, clean up the Append as needed. It's okay to do this after
@@ -1584,6 +1597,12 @@ set_mergeappend_references(PlannerInfo *root,
lfirst(l) = set_plan_refs(root, (Plan *) lfirst(l), rtoffset);
}
+ /*
+ * Fix up partitioned_rels before possibly removing the MergeAppend below.
+ */
+ mplan->partitioned_rels = offset_relid_set(mplan->partitioned_rels,
+ rtoffset);
+
/*
* See if it's safe to get rid of the MergeAppend entirely. For this to
* be safe, there must be only one child plan and that child plan's
@@ -1594,8 +1613,17 @@ set_mergeappend_references(PlannerInfo *root,
*/
if (list_length(mplan->mergeplans) == 1 &&
((Plan *) linitial(mplan->mergeplans))->parallel_aware == mplan->plan.parallel_aware)
+ {
+ /*
+ * Partitioned tables involved, if any, must be made known to the
+ * executor.
+ */
+ root->glob->elidedAppendPartedRels =
+ bms_add_members(root->glob->elidedAppendPartedRels,
+ mplan->partitioned_rels);
return clean_up_removed_plan_level((Plan *) mplan,
(Plan *) linitial(mplan->mergeplans));
+ }
/*
* Otherwise, clean up the MergeAppend as needed. It's okay to do this
diff --git a/src/backend/optimizer/util/inherit.c b/src/backend/optimizer/util/inherit.c
index 7e134822f3..56912e4101 100644
--- a/src/backend/optimizer/util/inherit.c
+++ b/src/backend/optimizer/util/inherit.c
@@ -406,6 +406,14 @@ expand_partitioned_rtentry(PlannerInfo *root, RelOptInfo *relinfo,
childrte, childRTindex,
childrel, top_parentrc, lockmode);
+ /*
+ * A parent relation's partitioned_rels must be a superset of the sets
+ * of all its children, direct or indirect, so bubble up the child
+ * rel's set.
+ */
+ relinfo->partitioned_rels = bms_add_members(relinfo->partitioned_rels,
+ childrelinfo->partitioned_rels);
+
/* Close child relation, but keep locks */
table_close(childrel, NoLock);
}
@@ -737,6 +745,14 @@ expand_appendrel_subquery(PlannerInfo *root, RelOptInfo *rel,
/* Child may itself be an inherited rel, either table or subquery. */
if (childrte->inh)
expand_inherited_rtentry(root, childrel, childrte, childRTindex);
+
+ /*
+ * A parent relation's partitioned_rels must be a superset of the sets
+ * of all its children, direct or indirect, so bubble up the child
+ * rel's set.
+ */
+ rel->partitioned_rels = bms_add_members(rel->partitioned_rels,
+ childrel->partitioned_rels);
}
}
diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c
index 520409f4ba..1d082a8fdd 100644
--- a/src/backend/optimizer/util/relnode.c
+++ b/src/backend/optimizer/util/relnode.c
@@ -361,6 +361,10 @@ build_simple_rel(PlannerInfo *root, int relid, RelOptInfo *parent)
}
}
+ /* A partitioned appendrel. */
+ if (rel->part_scheme != NULL)
+ rel->partitioned_rels = bms_copy(rel->relids);
+
/* Save the finished struct in the query's simple_rel_array */
root->simple_rel_array[relid] = rel;
@@ -729,6 +733,14 @@ build_join_rel(PlannerInfo *root,
set_joinrel_size_estimates(root, joinrel, outer_rel, inner_rel,
sjinfo, restrictlist);
+ /*
+ * The joinrel may get processed as an appendrel via partitionwise join
+ * if both outer and inner rels are partitioned, so set partitioned_rels
+ * appropriately.
+ */
+ joinrel->partitioned_rels = bms_union(outer_rel->partitioned_rels,
+ inner_rel->partitioned_rels);
+
/*
* Set the consider_parallel flag if this joinrel could potentially be
* scanned within a parallel worker. If this flag is false for either
@@ -897,6 +909,14 @@ build_child_join_rel(PlannerInfo *root, RelOptInfo *outer_rel,
set_joinrel_size_estimates(root, joinrel, outer_rel, inner_rel,
sjinfo, restrictlist);
+ /*
+ * The joinrel may get processed as an appendrel via partitionwise join
+ * if both outer and inner rels are partitioned, so set partitioned_rels
+ * appropriately.
+ */
+ joinrel->partitioned_rels = bms_union(outer_rel->partitioned_rels,
+ inner_rel->partitioned_rels);
+
/* We build the join only once. */
Assert(!find_join_rel(root, joinrel->relids));
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 1f3845b3fe..5327d9ba8b 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -130,6 +130,11 @@ typedef struct PlannerGlobal
char maxParallelHazard; /* worst PROPARALLEL hazard level */
PartitionDirectory partition_directory; /* partition descriptors */
+
+ Bitmapset *elidedAppendPartedRels; /* Combined partitioned_rels of all
+ * single-subplan [Merge]Append nodes
+ * that have been removed from the
+ * various plan trees. */
} PlannerGlobal;
/* macro for fetching the Plan associated with a SubPlan node */
@@ -773,6 +778,23 @@ typedef struct RelOptInfo
Relids all_partrels; /* Relids set of all partition relids */
List **partexprs; /* Non-nullable partition key expressions */
List **nullable_partexprs; /* Nullable partition key expressions */
+
+ /*
+ * For an appendrel parent relation (base, join, or upper) that is
+ * partitioned, this stores the RT indexes of all the partitioned ancestors
+ * including itself that lead up to the individual leaf partitions that
+ * will be scanned to produce this relation's output rows. The relid set
+ * is copied into the resulting Append or MergeAppend plan node for
+ * allowing the executor to take appropriate locks on those relations,
+ * unless the node is deemed useless in setrefs.c due to having a single
+ * leaf subplan and thus elided from the final plan, in which case, the set
+ * is added into PlannerGlobal.elidedAppendPartedRels.
+ *
+ * Note that 'apprelids' of those nodes only contains the top-level base
+ * relation(s), so is not sufficient for said purpose.
+ */
+
+ Bitmapset *partitioned_rels;
} RelOptInfo;
/*
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 0b518ce6b2..bd87c35d6c 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -85,6 +85,11 @@ typedef struct PlannedStmt
Node *utilityStmt; /* non-null if this is utility stmt */
+ Bitmapset *elidedAppendPartedRels; /* Combined partitioned_rels of all
+ * single-subplan [Merge]Append nodes
+ * that have been removed from the
+ * various plan trees. */
+
/* statement location in source string (copied from Query) */
int stmt_location; /* start location, or -1 if unknown */
int stmt_len; /* length in bytes; 0 means "rest of string" */
@@ -261,6 +266,12 @@ typedef struct Append
/* Info for run-time subplan pruning; NULL if we're not doing that */
struct PartitionPruneInfo *part_prune_info;
+
+ /*
+ * RT indexes of all partitioned parents whose partitions' plans are
+ * present in appendplans.
+ */
+ Bitmapset *partitioned_rels;
} Append;
/* ----------------
@@ -281,6 +292,12 @@ typedef struct MergeAppend
bool *nullsFirst; /* NULLS FIRST/LAST directions */
/* Info for run-time subplan pruning; NULL if we're not doing that */
struct PartitionPruneInfo *part_prune_info;
+
+ /*
+ * RT indexes of all partitioned parents whose partitions' plans are
+ * present in mergeplans.
+ */
+ Bitmapset *partitioned_rels;
} MergeAppend;
/* ----------------
--
2.24.1
[application/octet-stream] v7-0003-Add-a-plan_tree_walker.patch (3.9K, 3-v7-0003-Add-a-plan_tree_walker.patch)
From 761e6c2583b37eb9d45d64de954d65d953277040 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Thu, 3 Mar 2022 16:04:13 +0900
Subject: [PATCH v7 3/4] Add a plan_tree_walker()
Like planstate_tree_walker() but for uninitialized plan trees.
---
src/backend/nodes/nodeFuncs.c | 116 ++++++++++++++++++++++++++++++++++
src/include/nodes/nodeFuncs.h | 3 +
2 files changed, 119 insertions(+)
diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c
index 25cf282aab..5e5158ea0e 100644
--- a/src/backend/nodes/nodeFuncs.c
+++ b/src/backend/nodes/nodeFuncs.c
@@ -31,6 +31,10 @@ static bool planstate_walk_subplans(List *plans, bool (*walker) (),
void *context);
static bool planstate_walk_members(PlanState **planstates, int nplans,
bool (*walker) (), void *context);
+static bool plan_walk_subplans(List *plans,
+ bool (*walker) (),
+ void *context);
+static bool plan_walk_members(List *plans, bool (*walker) (), void *context);
/*
@@ -4368,3 +4372,115 @@ planstate_walk_members(PlanState **planstates, int nplans,
return false;
}
+
+/*
+ * plan_tree_walker --- walk plantrees
+ *
+ * The walker has already visited the current node, and so we need only
+ * recurse into any sub-nodes it has.
+ */
+bool
+plan_tree_walker(Plan *plan,
+ bool (*walker) (),
+ void *context)
+{
+ /* Guard against stack overflow due to overly complex plan trees */
+ check_stack_depth();
+
+ /* initPlan-s */
+ if (plan_walk_subplans(plan->initPlan, walker, context))
+ return true;
+
+ /* lefttree */
+ if (outerPlan(plan))
+ {
+ if (walker(outerPlan(plan), context))
+ return true;
+ }
+
+ /* righttree */
+ if (innerPlan(plan))
+ {
+ if (walker(innerPlan(plan), context))
+ return true;
+ }
+
+ /* special child plans */
+ switch (nodeTag(plan))
+ {
+ case T_Append:
+ if (plan_walk_members(((Append *) plan)->appendplans,
+ walker, context))
+ return true;
+ break;
+ case T_MergeAppend:
+ if (plan_walk_members(((MergeAppend *) plan)->mergeplans,
+ walker, context))
+ return true;
+ break;
+ case T_BitmapAnd:
+ if (plan_walk_members(((BitmapAnd *) plan)->bitmapplans,
+ walker, context))
+ return true;
+ break;
+ case T_BitmapOr:
+ if (plan_walk_members(((BitmapOr *) plan)->bitmapplans,
+ walker, context))
+ return true;
+ break;
+ case T_CustomScan:
+ if (plan_walk_members(((CustomScan *) plan)->custom_plans,
+ walker, context))
+ return true;
+ break;
+ case T_SubqueryScan:
+ if (walker(((SubqueryScan *) plan)->subplan, context))
+ return true;
+ break;
+ default:
+ break;
+ }
+
+ return false;
+}
+
+/*
+ * Walk a list of SubPlans (or initPlans, which also use SubPlan nodes).
+ */
+static bool
+plan_walk_subplans(List *plans,
+ bool (*walker) (),
+ void *context)
+{
+ ListCell *lc;
+ PlannedStmt *plannedstmt = (PlannedStmt *) context;
+
+ foreach(lc, plans)
+ {
+ SubPlan *sp = lfirst_node(SubPlan, lc);
+ Plan *p = list_nth(plannedstmt->subplans, sp->plan_id - 1);
+
+ if (walker(p, context))
+ return true;
+ }
+
+ return false;
+}
+
+/*
+ * Walk the constituent plans of an Append, MergeAppend, BitmapAnd,
+ * BitmapOr, or CustomScan node.
+ */
+static bool
+plan_walk_members(List *plans, bool (*walker) (), void *context)
+{
+ ListCell *lc;
+
+ foreach(lc, plans)
+ {
+ if (walker(lfirst(lc), context))
+ return true;
+ }
+
+ return false;
+}
diff --git a/src/include/nodes/nodeFuncs.h b/src/include/nodes/nodeFuncs.h
index 93c60bde66..fca107ad65 100644
--- a/src/include/nodes/nodeFuncs.h
+++ b/src/include/nodes/nodeFuncs.h
@@ -158,5 +158,8 @@ extern bool raw_expression_tree_walker(Node *node, bool (*walker) (),
struct PlanState;
extern bool planstate_tree_walker(struct PlanState *planstate, bool (*walker) (),
void *context);
+struct Plan;
+extern bool plan_tree_walker(struct Plan *plan, bool (*walker) (),
+ void *context);
#endif /* NODEFUNCS_H */
--
2.24.1
[application/octet-stream] v7-0001-Some-refactoring-of-runtime-pruning-code.patch (26.5K, 4-v7-0001-Some-refactoring-of-runtime-pruning-code.patch)
From 60ec0ebb911a2c7c8cc13ea9f96e1fb2038842a0 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Wed, 2 Mar 2022 15:17:55 +0900
Subject: [PATCH v7 1/4] Some refactoring of runtime pruning code
This does two things mainly:
* Move the execution pruning initialization steps that are common
between both ExecInitAppend() and ExecInitMergeAppend() into a new
function ExecInitPartitionPruning() defined in execPartition.c.
Thus, ExecCreatePartitionPruneState() and
ExecFindInitialMatchingSubPlans() need not be exported.
* Add an ExprContext field to PartitionPruneContext to remove the
implicit assumption in the runtime pruning code that the ExprContext
to use to compute pruning expressions that need one can always rely
on the PlanState providing it. A future patch will allow runtime
pruning (at least the initial pruning steps) to be performed without
the corresponding PlanState yet having been created, so this will
help.
---
src/backend/executor/execPartition.c | 340 ++++++++++++++++---------
src/backend/executor/nodeAppend.c | 33 +--
src/backend/executor/nodeMergeAppend.c | 32 +--
src/backend/partitioning/partprune.c | 20 +-
src/include/executor/execPartition.h | 9 +-
src/include/partitioning/partprune.h | 2 +
6 files changed, 252 insertions(+), 184 deletions(-)
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 90ed1485d1..7ff5a95f05 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -182,11 +182,18 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
bool *isnull,
int maxfieldlen);
static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
+static PartitionPruneState *ExecCreatePartitionPruneState(PlanState *planstate,
+ PartitionPruneInfo *partitionpruneinfo);
+static Bitmapset *ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate);
static void ExecInitPruningContext(PartitionPruneContext *context,
List *pruning_steps,
PartitionDesc partdesc,
PartitionKey partkey,
- PlanState *planstate);
+ PlanState *planstate,
+ ExprContext *econtext);
+static void PartitionPruneStateFixSubPlanMap(PartitionPruneState *prunestate,
+ Bitmapset *initially_valid_subplans,
+ int n_total_subplans);
static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
@@ -1485,30 +1492,86 @@ adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri)
*
* Functions:
*
- * ExecCreatePartitionPruneState:
+ * ExecInitPartitionPruning:
* Creates the PartitionPruneState required by each of the two pruning
* functions. Details stored include how to map the partition index
- * returned by the partition pruning code into subplan indexes.
- *
- * ExecFindInitialMatchingSubPlans:
- * Returns indexes of matching subplans. Partition pruning is attempted
- * without any evaluation of expressions containing PARAM_EXEC Params.
- * This function must be called during executor startup for the parent
- * plan before the subplans themselves are initialized. Subplans which
- * are found not to match by this function must be removed from the
- * plan's list of subplans during execution, as this function performs a
- * remap of the partition index to subplan index map and the newly
- * created map provides indexes only for subplans which remain after
- * calling this function.
+ * returned by the partition pruning code into subplan indexes. Also
+ * determines the set of initially valid subplans by performing the initial
+ * pruning steps; only those subplans need to be initialized by the caller
+ * (e.g. ExecInitAppend). Maps in PartitionPruneState are updated to
+ * account for initial pruning having eliminated some of the subplans, if any.
*
* ExecFindMatchingSubPlans:
* Returns indexes of matching subplans after evaluating all available
- * expressions. This function can only be called during execution and
- * must be called again each time the value of a Param listed in
- * PartitionPruneState's 'execparamids' changes.
+ * expressions, that is, using execution pruning steps. This function can
+ * only be called during execution and must be called again each time
+ * the value of a Param listed in PartitionPruneState's 'execparamids'
+ * changes.
*-------------------------------------------------------------------------
*/
+/*
+ * ExecInitPartitionPruning
+ * Initialize the data structure needed for run-time partition pruning
+ *
+ * Initial pruning can be done immediately, so it is done here if needed and
+ * the set of indexes of the surviving partition subplans is returned in the
+ * output parameter *initially_valid_subplans.
+ *
+ * If subplans are indeed pruned, subplan_map arrays contained in the returned
+ * PartitionPruneState are re-sequenced to not count those, though only if the
+ * maps will be needed for subsequent execution pruning passes.
+ */
+PartitionPruneState *
+ExecInitPartitionPruning(PlanState *planstate,
+ int n_total_subplans,
+ PartitionPruneInfo *pruneinfo,
+ Bitmapset **initially_valid_subplans)
+{
+ PartitionPruneState *prunestate;
+ EState *estate = planstate->state;
+
+ /* We may need an expression context to evaluate partition exprs */
+ ExecAssignExprContext(estate, planstate);
+
+ /*
+ * Create the working data structure for pruning.
+ */
+ prunestate = ExecCreatePartitionPruneState(planstate, pruneinfo);
+
+ /*
+ * Perform an initial partition prune, if required.
+ */
+ if (prunestate->do_initial_prune)
+ {
+ /* Determine which subplans survive initial pruning */
+ *initially_valid_subplans = ExecFindInitialMatchingSubPlans(prunestate);
+ }
+ else
+ {
+ /* We'll need to initialize all subplans */
+ Assert(n_total_subplans > 0);
+ *initially_valid_subplans = bms_add_range(NULL, 0,
+ n_total_subplans - 1);
+ }
+
+ /*
+ * Re-sequence subplan indexes contained in prunestate to account for any
+ * that were removed above due to initial pruning.
+ *
+ * We can safely skip this when !do_exec_prune, even though that leaves
+ * invalid data in prunestate, because that data won't be consulted again
+ * (cf initial Assert in ExecFindMatchingSubPlans).
+ */
+ if (prunestate->do_exec_prune &&
+ bms_num_members(*initially_valid_subplans) < n_total_subplans)
+ PartitionPruneStateFixSubPlanMap(prunestate,
+ *initially_valid_subplans,
+ n_total_subplans);
+
+ return prunestate;
+}
+
/*
* ExecCreatePartitionPruneState
* Build the data structure required for calling
@@ -1527,7 +1590,7 @@ adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri)
* re-used each time we re-evaluate which partitions match the pruning steps
* provided in each PartitionedRelPruneInfo.
*/
-PartitionPruneState *
+static PartitionPruneState *
ExecCreatePartitionPruneState(PlanState *planstate,
PartitionPruneInfo *partitionpruneinfo)
{
@@ -1536,6 +1599,7 @@ ExecCreatePartitionPruneState(PlanState *planstate,
int n_part_hierarchies;
ListCell *lc;
int i;
+ ExprContext *econtext = planstate->ps_ExprContext;
/* For data reading, executor always omits detached partitions */
if (estate->es_partition_directory == NULL)
@@ -1709,7 +1773,8 @@ ExecCreatePartitionPruneState(PlanState *planstate,
{
ExecInitPruningContext(&pprune->initial_context,
pinfo->initial_pruning_steps,
- partdesc, partkey, planstate);
+ partdesc, partkey, planstate,
+ econtext);
/* Record whether initial pruning is needed at any level */
prunestate->do_initial_prune = true;
}
@@ -1718,7 +1783,8 @@ ExecCreatePartitionPruneState(PlanState *planstate,
{
ExecInitPruningContext(&pprune->exec_context,
pinfo->exec_pruning_steps,
- partdesc, partkey, planstate);
+ partdesc, partkey, planstate,
+ econtext);
/* Record whether exec pruning is needed at any level */
prunestate->do_exec_prune = true;
}
@@ -1746,7 +1812,8 @@ ExecInitPruningContext(PartitionPruneContext *context,
List *pruning_steps,
PartitionDesc partdesc,
PartitionKey partkey,
- PlanState *planstate)
+ PlanState *planstate,
+ ExprContext *econtext)
{
int n_steps;
int partnatts;
@@ -1767,6 +1834,7 @@ ExecInitPruningContext(PartitionPruneContext *context,
context->ppccontext = CurrentMemoryContext;
context->planstate = planstate;
+ context->exprcontext = econtext;
/* Initialize expression state for each expression we need */
context->exprstates = (ExprState **)
@@ -1795,8 +1863,20 @@ ExecInitPruningContext(PartitionPruneContext *context,
step->step.step_id,
keyno);
- context->exprstates[stateidx] =
- ExecInitExpr(expr, context->planstate);
+ /*
+ * When planstate is NULL, pruning_steps is known not to
+ * contain any expressions that depend on the parent plan.
+ * Information of any available EXTERN parameters must be
+ * passed explicitly in that case, which the caller must
+ * have made available via econtext.
+ */
+ if (planstate == NULL)
+ context->exprstates[stateidx] =
+ ExecInitExprWithParams(expr,
+ econtext->ecxt_param_list_info);
+ else
+ context->exprstates[stateidx] =
+ ExecInitExpr(expr, context->planstate);
}
keyno++;
}
@@ -1809,18 +1889,11 @@ ExecInitPruningContext(PartitionPruneContext *context,
* pruning, disregarding any pruning constraints involving PARAM_EXEC
* Params.
*
- * If additional pruning passes will be required (because of PARAM_EXEC
- * Params), we must also update the translation data that allows conversion
- * of partition indexes into subplan indexes to account for the unneeded
- * subplans having been removed.
- *
* Must only be called once per 'prunestate', and only if initial pruning
* is required.
- *
- * 'nsubplans' must be passed as the total number of unpruned subplans.
*/
-Bitmapset *
-ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate, int nsubplans)
+static Bitmapset *
+ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate)
{
Bitmapset *result = NULL;
MemoryContext oldcontext;
@@ -1845,14 +1918,20 @@ ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate, int nsubplans)
PartitionedRelPruningData *pprune;
prunedata = prunestate->partprunedata[i];
+
+ /*
+ * We pass the 1st item belonging to the root table of the hierarchy
+ * and find_matching_subplans_recurse() takes care of recursing to
+ * other (lower-level) parents as needed.
+ */
pprune = &prunedata->partrelprunedata[0];
/* Perform pruning without using PARAM_EXEC Params */
find_matching_subplans_recurse(prunedata, pprune, true, &result);
- /* Expression eval may have used space in node's ps_ExprContext too */
+ /* Expression eval may have used space in ExprContext too */
if (pprune->initial_pruning_steps)
- ResetExprContext(pprune->initial_context.planstate->ps_ExprContext);
+ ResetExprContext(pprune->initial_context.exprcontext);
}
/* Add in any subplans that partition pruning didn't account for */
@@ -1865,118 +1944,120 @@ ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate, int nsubplans)
MemoryContextReset(prunestate->prune_context);
+ return result;
+}
+
+/*
+ * PartitionPruneStateFixSubPlanMap
+ * Fix mapping of partition indexes to subplan indexes contained in
+ * prunestate by considering the new list of subplans that survived
+ * initial pruning
+ *
+ * Subplans that were previously indexed 0..(n_total_subplans - 1) are
+ * re-indexed to the range 0..(num(initially_valid_subplans) - 1).
+ */
+static void
+PartitionPruneStateFixSubPlanMap(PartitionPruneState *prunestate,
+ Bitmapset *initially_valid_subplans,
+ int n_total_subplans)
+{
+ int *new_subplan_indexes;
+ Bitmapset *new_other_subplans;
+ int i;
+ int newidx;
+
/*
- * If exec-time pruning is required and we pruned subplans above, then we
- * must re-sequence the subplan indexes so that ExecFindMatchingSubPlans
- * properly returns the indexes from the subplans which will remain after
- * execution of this function.
- *
- * We can safely skip this when !do_exec_prune, even though that leaves
- * invalid data in prunestate, because that data won't be consulted again
- * (cf initial Assert in ExecFindMatchingSubPlans).
+ * First we must build a temporary array which maps old subplan
+ * indexes to new ones. For convenience of initialization, we use
+ * 1-based indexes in this array and leave pruned items as 0.
*/
- if (prunestate->do_exec_prune && bms_num_members(result) < nsubplans)
+ new_subplan_indexes = (int *) palloc0(sizeof(int) * n_total_subplans);
+ newidx = 1;
+ i = -1;
+ while ((i = bms_next_member(initially_valid_subplans, i)) >= 0)
{
- int *new_subplan_indexes;
- Bitmapset *new_other_subplans;
- int i;
- int newidx;
+ Assert(i < n_total_subplans);
+ new_subplan_indexes[i] = newidx++;
+ }
- /*
- * First we must build a temporary array which maps old subplan
- * indexes to new ones. For convenience of initialization, we use
- * 1-based indexes in this array and leave pruned items as 0.
- */
- new_subplan_indexes = (int *) palloc0(sizeof(int) * nsubplans);
- newidx = 1;
- i = -1;
- while ((i = bms_next_member(result, i)) >= 0)
- {
- Assert(i < nsubplans);
- new_subplan_indexes[i] = newidx++;
- }
+ /*
+ * Now we can update each PartitionedRelPruneInfo's subplan_map with
+ * new subplan indexes. We must also recompute its present_parts
+ * bitmap.
+ */
+ for (i = 0; i < prunestate->num_partprunedata; i++)
+ {
+ PartitionPruningData *prunedata = prunestate->partprunedata[i];
+ int j;
/*
- * Now we can update each PartitionedRelPruneInfo's subplan_map with
- * new subplan indexes. We must also recompute its present_parts
- * bitmap.
+ * Within each hierarchy, we perform this loop in back-to-front
+ * order so that we determine present_parts for the lowest-level
+ * partitioned tables first. This way we can tell whether a
+ * sub-partitioned table's partitions were entirely pruned so we
+ * can exclude it from the current level's present_parts.
*/
- for (i = 0; i < prunestate->num_partprunedata; i++)
+ for (j = prunedata->num_partrelprunedata - 1; j >= 0; j--)
{
- PartitionPruningData *prunedata = prunestate->partprunedata[i];
- int j;
+ PartitionedRelPruningData *pprune = &prunedata->partrelprunedata[j];
+ int nparts = pprune->nparts;
+ int k;
- /*
- * Within each hierarchy, we perform this loop in back-to-front
- * order so that we determine present_parts for the lowest-level
- * partitioned tables first. This way we can tell whether a
- * sub-partitioned table's partitions were entirely pruned so we
- * can exclude it from the current level's present_parts.
- */
- for (j = prunedata->num_partrelprunedata - 1; j >= 0; j--)
- {
- PartitionedRelPruningData *pprune = &prunedata->partrelprunedata[j];
- int nparts = pprune->nparts;
- int k;
+ /* We just rebuild present_parts from scratch */
+ bms_free(pprune->present_parts);
+ pprune->present_parts = NULL;
- /* We just rebuild present_parts from scratch */
- bms_free(pprune->present_parts);
- pprune->present_parts = NULL;
+ for (k = 0; k < nparts; k++)
+ {
+ int oldidx = pprune->subplan_map[k];
+ int subidx;
- for (k = 0; k < nparts; k++)
+ /*
+ * If this partition existed as a subplan then change the
+ * old subplan index to the new subplan index. The new
+ * index may become -1 if the partition was pruned above,
+ * or it may just come earlier in the subplan list due to
+ * some subplans being removed earlier in the list. If
+ * it's a subpartition, add it to present_parts unless
+ * it's entirely pruned.
+ */
+ if (oldidx >= 0)
{
- int oldidx = pprune->subplan_map[k];
- int subidx;
-
- /*
- * If this partition existed as a subplan then change the
- * old subplan index to the new subplan index. The new
- * index may become -1 if the partition was pruned above,
- * or it may just come earlier in the subplan list due to
- * some subplans being removed earlier in the list. If
- * it's a subpartition, add it to present_parts unless
- * it's entirely pruned.
- */
- if (oldidx >= 0)
- {
- Assert(oldidx < nsubplans);
- pprune->subplan_map[k] = new_subplan_indexes[oldidx] - 1;
+ Assert(oldidx < n_total_subplans);
+ pprune->subplan_map[k] = new_subplan_indexes[oldidx] - 1;
- if (new_subplan_indexes[oldidx] > 0)
- pprune->present_parts =
- bms_add_member(pprune->present_parts, k);
- }
- else if ((subidx = pprune->subpart_map[k]) >= 0)
- {
- PartitionedRelPruningData *subprune;
+ if (new_subplan_indexes[oldidx] > 0)
+ pprune->present_parts =
+ bms_add_member(pprune->present_parts, k);
+ }
+ else if ((subidx = pprune->subpart_map[k]) >= 0)
+ {
+ PartitionedRelPruningData *subprune;
- subprune = &prunedata->partrelprunedata[subidx];
+ subprune = &prunedata->partrelprunedata[subidx];
- if (!bms_is_empty(subprune->present_parts))
- pprune->present_parts =
- bms_add_member(pprune->present_parts, k);
- }
+ if (!bms_is_empty(subprune->present_parts))
+ pprune->present_parts =
+ bms_add_member(pprune->present_parts, k);
}
}
}
+ }
- /*
- * We must also recompute the other_subplans set, since indexes in it
- * may change.
- */
- new_other_subplans = NULL;
- i = -1;
- while ((i = bms_next_member(prunestate->other_subplans, i)) >= 0)
- new_other_subplans = bms_add_member(new_other_subplans,
- new_subplan_indexes[i] - 1);
-
- bms_free(prunestate->other_subplans);
- prunestate->other_subplans = new_other_subplans;
+ /*
+ * We must also recompute the other_subplans set, since indexes in it
+ * may change.
+ */
+ new_other_subplans = NULL;
+ i = -1;
+ while ((i = bms_next_member(prunestate->other_subplans, i)) >= 0)
+ new_other_subplans = bms_add_member(new_other_subplans,
+ new_subplan_indexes[i] - 1);
- pfree(new_subplan_indexes);
- }
+ bms_free(prunestate->other_subplans);
+ prunestate->other_subplans = new_other_subplans;
- return result;
+ pfree(new_subplan_indexes);
}
/*
@@ -2018,11 +2099,16 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate)
prunedata = prunestate->partprunedata[i];
pprune = &prunedata->partrelprunedata[0];
+ /*
+ * We pass the 1st item belonging to the root table of the hierarchy
+ * and find_matching_subplans_recurse() takes care of recursing to
+ * other (lower-level) parents as needed.
+ */
find_matching_subplans_recurse(prunedata, pprune, false, &result);
- /* Expression eval may have used space in node's ps_ExprContext too */
+ /* Expression eval may have used space in ExprContext too */
if (pprune->exec_pruning_steps)
- ResetExprContext(pprune->exec_context.planstate->ps_ExprContext);
+ ResetExprContext(pprune->exec_context.exprcontext);
}
/* Add in any subplans that partition pruning didn't account for */
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 7937f1c88f..5b6d3eb23b 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -138,30 +138,17 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
{
PartitionPruneState *prunestate;
- /* We may need an expression context to evaluate partition exprs */
- ExecAssignExprContext(estate, &appendstate->ps);
-
- /* Create the working data structure for pruning. */
- prunestate = ExecCreatePartitionPruneState(&appendstate->ps,
- node->part_prune_info);
+ /*
+ * Set up pruning data structure. Initial pruning steps, if any, are
+ * performed as part of the setup, adding the set of indexes of
+ * surviving subplans to 'validsubplans'.
+ */
+ prunestate = ExecInitPartitionPruning(&appendstate->ps,
+ list_length(node->appendplans),
+ node->part_prune_info,
+ &validsubplans);
appendstate->as_prune_state = prunestate;
-
- /* Perform an initial partition prune, if required. */
- if (prunestate->do_initial_prune)
- {
- /* Determine which subplans survive initial pruning */
- validsubplans = ExecFindInitialMatchingSubPlans(prunestate,
- list_length(node->appendplans));
-
- nplans = bms_num_members(validsubplans);
- }
- else
- {
- /* We'll need to initialize all subplans */
- nplans = list_length(node->appendplans);
- Assert(nplans > 0);
- validsubplans = bms_add_range(NULL, 0, nplans - 1);
- }
+ nplans = bms_num_members(validsubplans);
/*
* When no run-time pruning is required and there's at least one
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index 418f89dea8..9a9f29e845 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -86,29 +86,17 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
{
PartitionPruneState *prunestate;
- /* We may need an expression context to evaluate partition exprs */
- ExecAssignExprContext(estate, &mergestate->ps);
-
- prunestate = ExecCreatePartitionPruneState(&mergestate->ps,
- node->part_prune_info);
+ /*
+ * Set up pruning data structure. Initial pruning steps, if any, are
+ * performed as part of the setup, adding the set of indexes of
+ * surviving subplans to 'validsubplans'.
+ */
+ prunestate = ExecInitPartitionPruning(&mergestate->ps,
+ list_length(node->mergeplans),
+ node->part_prune_info,
+ &validsubplans);
mergestate->ms_prune_state = prunestate;
-
- /* Perform an initial partition prune, if required. */
- if (prunestate->do_initial_prune)
- {
- /* Determine which subplans survive initial pruning */
- validsubplans = ExecFindInitialMatchingSubPlans(prunestate,
- list_length(node->mergeplans));
-
- nplans = bms_num_members(validsubplans);
- }
- else
- {
- /* We'll need to initialize all subplans */
- nplans = list_length(node->mergeplans);
- Assert(nplans > 0);
- validsubplans = bms_add_range(NULL, 0, nplans - 1);
- }
+ nplans = bms_num_members(validsubplans);
/*
* When no run-time pruning is required and there's at least one
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 1bc00826c1..7080cb25d9 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -798,6 +798,7 @@ prune_append_rel_partitions(RelOptInfo *rel)
/* These are not valid when being called from the planner */
context.planstate = NULL;
+ context.exprcontext = NULL;
context.exprstates = NULL;
/* Actual pruning happens here. */
@@ -808,8 +809,8 @@ prune_append_rel_partitions(RelOptInfo *rel)
* get_matching_partitions
* Determine partitions that survive partition pruning
*
- * Note: context->planstate must be set to a valid PlanState when the
- * pruning_steps were generated with a target other than PARTTARGET_PLANNER.
+ * Note: context->exprcontext must be valid when the pruning_steps were
+ * generated with a target other than PARTTARGET_PLANNER.
*
* Returns a Bitmapset of the RelOptInfo->part_rels indexes of the surviving
* partitions.
@@ -3654,7 +3655,7 @@ match_boolean_partition_clause(Oid partopfamily, Expr *clause, Expr *partkey,
* exprstate array.
*
* Note that the evaluated result may be in the per-tuple memory context of
- * context->planstate->ps_ExprContext, and we may have leaked other memory
+ * context->exprcontext, and we may have leaked other memory
* there too. This memory must be recovered by resetting that ExprContext
* after we're done with the pruning operation (see execPartition.c).
*/
@@ -3677,13 +3678,18 @@ partkey_datum_from_expr(PartitionPruneContext *context,
ExprContext *ectx;
/*
- * We should never see a non-Const in a step unless we're running in
- * the executor.
+ * We should never see a non-Const in a step unless the caller has
+ * passed a valid ExprContext.
+ *
+ * When context->planstate is valid, context->exprcontext is the same
+ * as context->planstate->ps_ExprContext.
*/
- Assert(context->planstate != NULL);
+ Assert(context->planstate != NULL || context->exprcontext != NULL);
+ Assert(context->planstate == NULL ||
+ (context->exprcontext == context->planstate->ps_ExprContext));
exprstate = context->exprstates[stateidx];
- ectx = context->planstate->ps_ExprContext;
+ ectx = context->exprcontext;
*value = ExecEvalExprSwitchContext(exprstate, ectx, isnull);
}
}
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 603d8becc4..fd5735a946 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -119,10 +119,9 @@ extern ResultRelInfo *ExecFindPartition(ModifyTableState *mtstate,
EState *estate);
extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
PartitionTupleRouting *proute);
-extern PartitionPruneState *ExecCreatePartitionPruneState(PlanState *planstate,
- PartitionPruneInfo *partitionpruneinfo);
+extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
+ int n_total_subplans,
+ PartitionPruneInfo *pruneinfo,
+ Bitmapset **initially_valid_subplans);
extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate);
-extern Bitmapset *ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate,
- int nsubplans);
-
#endif /* EXECPARTITION_H */
diff --git a/src/include/partitioning/partprune.h b/src/include/partitioning/partprune.h
index ee11b6feae..90684efa25 100644
--- a/src/include/partitioning/partprune.h
+++ b/src/include/partitioning/partprune.h
@@ -41,6 +41,7 @@ struct RelOptInfo;
* subsidiary data, such as the FmgrInfos.
* planstate Points to the parent plan node's PlanState when called
* during execution; NULL when called from the planner.
+ * exprcontext ExprContext to use when evaluating pruning expressions
* exprstates Array of ExprStates, indexed as per PruneCxtStateIdx; one
* for each partition key in each pruning step. Allocated if
* planstate is non-NULL, otherwise NULL.
@@ -56,6 +57,7 @@ typedef struct PartitionPruneContext
FmgrInfo *stepcmpfuncs;
MemoryContext ppccontext;
PlanState *planstate;
+ ExprContext *exprcontext;
ExprState **exprstates;
} PartitionPruneContext;
--
2.24.1
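For readers following the 0003 patch above: the re-sequencing that PartitionPruneStateFixSubPlanMap() performs can be illustrated outside PostgreSQL with a small, self-contained sketch. It is only an approximation under stated assumptions: a plain bool array stands in for the Bitmapset of surviving subplans, and the names remap_subplan_indexes, valid, and subplan_map (as a bare int array) are hypothetical, not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Illustrative stand-in (not PostgreSQL code): valid[i] says whether old
 * subplan i survived initial pruning.  Each subplan_map[k] >= 0 is rewritten
 * to the subplan's new 0-based index, or to -1 if that subplan was pruned.
 * Returns the number of surviving subplans, or -1 if allocation fails.
 */
static int
remap_subplan_indexes(const bool *valid, int n_total,
                      int *subplan_map, int nparts)
{
    int *new_idx;
    int  next = 1;
    int  i;

    /* 1-based temporary map; zero-filled entries mean "pruned". */
    new_idx = calloc(n_total > 0 ? (size_t) n_total : 1, sizeof(int));
    if (new_idx == NULL)
        return -1;

    for (i = 0; i < n_total; i++)
        if (valid[i])
            new_idx[i] = next++;

    for (i = 0; i < nparts; i++)
    {
        int oldidx = subplan_map[i];

        if (oldidx >= 0)
        {
            assert(oldidx < n_total);
            /* Subtracting 1 turns "pruned" (0) into -1. */
            subplan_map[i] = new_idx[oldidx] - 1;
        }
    }

    free(new_idx);
    return next - 1;
}
```

With valid = {true, false, true, false, true} and subplan_map = {0, 1, -1, 2, 4, 3}, the map becomes {0, -1, -1, 1, 2, -1} and the function returns 3. The 1-based temporary array is the same trick the patch comment describes: zero-initialized memory doubles as the "pruned" marker.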
[application/octet-stream] v7-0004-Optimize-AcquireExecutorLocks-to-skip-pruned-part.patch (94.2K, 5-v7-0004-Optimize-AcquireExecutorLocks-to-skip-pruned-part.patch)
From 14d951ca644860eec6d72ac03e3a95b12373938b Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Wed, 22 Dec 2021 16:55:17 +0900
Subject: [PATCH v7 4/4] Optimize AcquireExecutorLocks() to skip pruned
partitions
Instead of locking all relations listed in the range table in the
cases where the PlannedStmt indicates that some nodes in the plan
tree can do partition pruning without depending on execution having
started (so-called "initial" pruning), AcquireExecutorLocks() now
calls the new executor function ExecutorGetLockRels(), which returns
the set of relations (their RT indexes) to be locked, excluding
those scanned by the subplans that were pruned.
The result of pruning done this way must be remembered and reused
during actual execution of the plan. This is done by creating a
PlanInitPruningOutput node for each plan node that undergoes
pruning; the set of those for the whole plan tree is added to
ExecLockRelsInfo, which also stores the bitmapset of RT indexes of
relations that are actually locked by AcquireExecutorLocks().
ExecLockRelsInfos are passed down the executor alongside the
PlannedStmts. This arrangement ensures that the executor doesn't
accidentally try to process plan tree subnodes that have been
deemed pruned by AcquireExecutorLocks().
---
src/backend/commands/copyto.c | 2 +-
src/backend/commands/createas.c | 2 +-
src/backend/commands/explain.c | 7 +-
src/backend/commands/extension.c | 13 +-
src/backend/commands/matview.c | 2 +-
src/backend/commands/portalcmds.c | 1 +
src/backend/commands/prepare.c | 17 +-
src/backend/executor/README | 24 +++
src/backend/executor/execMain.c | 202 ++++++++++++++++++++
src/backend/executor/execParallel.c | 26 ++-
src/backend/executor/execPartition.c | 224 ++++++++++++++++++----
src/backend/executor/execUtils.c | 8 +
src/backend/executor/functions.c | 2 +-
src/backend/executor/nodeAppend.c | 52 ++++-
src/backend/executor/nodeMergeAppend.c | 52 ++++-
src/backend/executor/nodeModifyTable.c | 25 +++
src/backend/executor/spi.c | 14 +-
src/backend/nodes/copyfuncs.c | 49 ++++-
src/backend/nodes/outfuncs.c | 39 ++++
src/backend/nodes/readfuncs.c | 37 ++++
src/backend/optimizer/plan/planner.c | 2 +
src/backend/optimizer/plan/setrefs.c | 6 +
src/backend/partitioning/partprune.c | 37 +++-
src/backend/tcop/postgres.c | 15 +-
src/backend/tcop/pquery.c | 21 ++-
src/backend/utils/cache/plancache.c | 252 ++++++++++++++++++++++---
src/backend/utils/mmgr/portalmem.c | 2 +
src/include/commands/explain.h | 3 +-
src/include/executor/execPartition.h | 2 +
src/include/executor/execdesc.h | 2 +
src/include/executor/executor.h | 2 +
src/include/executor/nodeAppend.h | 1 +
src/include/executor/nodeMergeAppend.h | 1 +
src/include/executor/nodeModifyTable.h | 1 +
src/include/nodes/execnodes.h | 96 ++++++++++
src/include/nodes/nodes.h | 5 +
src/include/nodes/pathnodes.h | 4 +
src/include/nodes/plannodes.h | 15 ++
src/include/tcop/tcopprot.h | 2 +-
src/include/utils/plancache.h | 6 +
src/include/utils/portal.h | 5 +
41 files changed, 1174 insertions(+), 104 deletions(-)
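The idea described in the commit message, collecting only the relations reachable through subplans that survive initial pruning, can be sketched as a toy model. This is not the patch's ExecGetLockRels(): the PlanNode struct, the NodeKind enum, collect_lock_rels, and the use of a 64-bit mask in place of a Bitmapset are all assumptions made for illustration, and relation indexes are assumed to be below 64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified plan node; not a PostgreSQL structure. */
typedef enum { NODE_SCAN, NODE_APPEND } NodeKind;

typedef struct PlanNode
{
    NodeKind          kind;
    int               relid;          /* NODE_SCAN: RT index, assumed < 64 */
    struct PlanNode **children;       /* NODE_APPEND: child subplans */
    int               nchildren;
    uint64_t          valid_children; /* bit i set: child i survived pruning */
} PlanNode;

/*
 * Return 'lockrels' with a bit added for every relation scanned under 'node',
 * skipping children of an append node that initial pruning ruled out.
 */
static uint64_t
collect_lock_rels(const PlanNode *node, uint64_t lockrels)
{
    int i;

    if (node == NULL)
        return lockrels;

    if (node->kind == NODE_SCAN)
    {
        assert(node->relid >= 0 && node->relid < 64);
        return lockrels | (UINT64_C(1) << node->relid);
    }

    /* NODE_APPEND: descend only into surviving children. */
    for (i = 0; i < node->nchildren && i < 64; i++)
    {
        if (node->valid_children & (UINT64_C(1) << i))
            lockrels = collect_lock_rels(node->children[i], lockrels);
    }

    return lockrels;
}
```

For an append node over scans of relations 1, 2, and 3 with valid_children = 0x5, only bits 1 and 3 end up in the result, so relation 2 would never be locked. The pruned child stays in the tree, which is why the real patch has to hand the pruning result to the executor as well.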
diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index 55c38b04c4..d403eb2309 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -542,7 +542,7 @@ BeginCopyTo(ParseState *pstate,
((DR_copy *) dest)->cstate = cstate;
/* Create a QueryDesc requesting no output */
- cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ cstate->queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
GetActiveSnapshot(),
InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 9abbb6b555..f6607f2454 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -325,7 +325,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 9f632285b6..1f1a44b9bb 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -407,7 +407,7 @@ ExplainOneQuery(Query *query, int cursorOptions,
}
/* run it (if needed) and produce output */
- ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+ ExplainOnePlan(plan, NULL, into, es, queryString, params, queryEnv,
&planduration, (es->buffers ? &bufusage : NULL));
}
}
@@ -515,7 +515,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
* to call it.
*/
void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, ExecLockRelsInfo *execlockrelsinfo,
+ IntoClause *into, ExplainState *es,
const char *queryString, ParamListInfo params,
QueryEnvironment *queryEnv, const instr_time *planduration,
const BufferUsage *bufusage)
@@ -563,7 +564,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
dest = None_Receiver;
/* Create a QueryDesc for the query */
- queryDesc = CreateQueryDesc(plannedstmt, queryString,
+ queryDesc = CreateQueryDesc(plannedstmt, execlockrelsinfo, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, instrument_option);
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index 1013790dbb..008b8ce0e9 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -741,8 +741,10 @@ execute_sql_string(const char *sql)
RawStmt *parsetree = lfirst_node(RawStmt, lc1);
MemoryContext per_parsetree_context,
oldcontext;
- List *stmt_list;
- ListCell *lc2;
+ List *stmt_list,
+ *execlockrelsinfo_list;
+ ListCell *lc2,
+ *lc3;
/*
* We do the work for each parsetree in a short-lived context, to
@@ -762,11 +764,13 @@ execute_sql_string(const char *sql)
NULL,
0,
NULL);
- stmt_list = pg_plan_queries(stmt_list, sql, CURSOR_OPT_PARALLEL_OK, NULL);
+ stmt_list = pg_plan_queries(stmt_list, sql, CURSOR_OPT_PARALLEL_OK, NULL,
+ &execlockrelsinfo_list);
- foreach(lc2, stmt_list)
+ forboth(lc2, stmt_list, lc3, execlockrelsinfo_list)
{
PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+ ExecLockRelsInfo *execlockrelsinfo = lfirst_node(ExecLockRelsInfo, lc3);
CommandCounterIncrement();
@@ -777,6 +781,7 @@ execute_sql_string(const char *sql)
QueryDesc *qdesc;
qdesc = CreateQueryDesc(stmt,
+ execlockrelsinfo,
sql,
GetActiveSnapshot(), NULL,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 05e7b60059..4ef44aaf23 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -416,7 +416,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, queryString,
+ queryDesc = CreateQueryDesc(plan, NULL, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/portalcmds.c b/src/backend/commands/portalcmds.c
index 9902c5c566..85e73ddded 100644
--- a/src/backend/commands/portalcmds.c
+++ b/src/backend/commands/portalcmds.c
@@ -107,6 +107,7 @@ PerformCursorOpen(ParseState *pstate, DeclareCursorStmt *cstmt, ParamListInfo pa
queryString,
CMDTAG_SELECT, /* cursor's query is always a SELECT */
list_make1(plan),
+ list_make1(NULL), /* no ExecLockRelsInfo to pass */
NULL);
/*----------
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 80738547ed..bbbf8bbcbd 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -155,6 +155,7 @@ ExecuteQuery(ParseState *pstate,
PreparedStatement *entry;
CachedPlan *cplan;
List *plan_list;
+ List *plan_execlockrelsinfo_list;
ParamListInfo paramLI = NULL;
EState *estate = NULL;
Portal portal;
@@ -195,6 +196,7 @@ ExecuteQuery(ParseState *pstate,
/* Replan if needed, and increment plan refcount for portal */
cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL);
plan_list = cplan->stmt_list;
+ plan_execlockrelsinfo_list = cplan->execlockrelsinfo_list;
/*
* DO NOT add any logic that could possibly throw an error between
@@ -204,7 +206,7 @@ ExecuteQuery(ParseState *pstate,
NULL,
query_string,
entry->plansource->commandTag,
- plan_list,
+ plan_list, plan_execlockrelsinfo_list,
cplan);
/*
@@ -576,7 +578,9 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
const char *query_string;
CachedPlan *cplan;
List *plan_list;
- ListCell *p;
+ List *plan_execlockrelsinfo_list;
+ ListCell *p,
+ *pe;
ParamListInfo paramLI = NULL;
EState *estate = NULL;
instr_time planstart;
@@ -632,15 +636,18 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
}
plan_list = cplan->stmt_list;
+ plan_execlockrelsinfo_list = cplan->execlockrelsinfo_list;
/* Explain each query */
- foreach(p, plan_list)
+ forboth(p, plan_list, pe, plan_execlockrelsinfo_list)
{
PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
+ ExecLockRelsInfo *execlockrelsinfo = lfirst_node(ExecLockRelsInfo, pe);
if (pstmt->commandType != CMD_UTILITY)
- ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv,
- &planduration, (es->buffers ? &bufusage : NULL));
+ ExplainOnePlan(pstmt, execlockrelsinfo, into, es, query_string,
+ paramLI, queryEnv, &planduration,
+ (es->buffers ? &bufusage : NULL));
else
ExplainOneUtility(pstmt->utilityStmt, into, es, query_string,
paramLI, queryEnv);
diff --git a/src/backend/executor/README b/src/backend/executor/README
index bf5e70860d..9720d0ac2c 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -65,6 +65,27 @@ found there. This currently only occurs for Append and MergeAppend nodes. In
this case the non-required subplans are ignored and the executor state's
subnode array will become out of sequence to the plan's subplan list.
+Actually, the so-called execution time pruning may also occur even before the
+execution has started. One case where that occurs is when a cached generic
+plan is being validated for execution by plancache.c: GetCachedPlan(), which
+proceeds by locking all the relations that will be scanned by that plan. If
+the generic plan has nodes that contain so-called initial pruning steps (a
+subset of execution pruning steps that do not depend on full-fledged execution
+having started), they are performed at this point to figure out the minimal
+set of child subplans that satisfy those pruning instructions and the result
+of performing that pruning is saved in a data structure that gets passed to
+the executor alongside the plan tree. Relations scanned by only those
+surviving subplans are then locked while those scanned by the pruned subplans
+are not, even though the pruned subplans themselves are not removed from the
+plan tree. So, it is imperative that the executor and any third-party code
+invoked by it that gets passed the plan tree look at the initial pruning result
+made available via the aforementioned data structure to determine whether or
+not a particular subplan is valid. (The data structure basically consists of
+an array of PlanInitPruningOutput nodes containing one element for each node
+of the plan tree indexable using plan_node_id of the individual plan nodes,
+where each node contains a bitmapset of indexes of unpruned child subplans of
+a given node.)
+
Each Plan node may have expression trees associated with it, to represent
its target list, qualification conditions, etc. These trees are also
read-only to the executor, but the executor state for expression evaluation
@@ -247,6 +268,9 @@ Query Processing Control Flow
This is a sketch of control flow for full query processing:
+ [ ExecutorGetLockRels ] --- an optional step to walk over the plan tree
+ to produce an ExecLockRelsInfo to be passed to CreateQueryDesc
+
CreateQueryDesc
ExecutorStart
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 473d2e00a2..1ddd1dfb83 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -49,11 +49,15 @@
#include "commands/matview.h"
#include "commands/trigger.h"
#include "executor/execdebug.h"
+#include "executor/nodeAppend.h"
+#include "executor/nodeMergeAppend.h"
+#include "executor/nodeModifyTable.h"
#include "executor/nodeSubplan.h"
#include "foreign/fdwapi.h"
#include "jit/jit.h"
#include "mb/pg_wchar.h"
#include "miscadmin.h"
+#include "nodes/nodeFuncs.h"
#include "parser/parsetree.h"
#include "storage/bufmgr.h"
#include "storage/lmgr.h"
@@ -101,9 +105,205 @@ static char *ExecBuildSlotValueDescription(Oid reloid,
Bitmapset *modifiedCols,
int maxfieldlen);
static void EvalPlanQualStart(EPQState *epqstate, Plan *planTree);
+static bool ExecGetScanLockRels(Scan *scan, ExecGetLockRelsContext *context);
/* end of local decls */
+/* ----------------------------------------------------------------
+ * ExecutorGetLockRels
+ *
+ * Figure out the minimal set of relations to lock to be able to safely
+ * execute a given plan
+ *
+ * This ignores the relations scanned by child subplans that are pruned away
+ * after performing initial pruning steps present in the plan using the
+ * provided set of EXTERN parameters.
+ *
+ * Along with the set of RT indexes of relations that must be locked, the
+ * returned struct also contains an array of PlanInitPruningOutput nodes each
+ * of which contains the result of initial pruning for a given plan node, which
+ * is basically a bitmapset of the indexes of surviving child subplans. Each
+ * plan node in the tree that undergoes pruning will have an element in the
+ * array.
+ *
+ * Note that while relations scanned by the subplans that are pruned will not
+ * be locked, the subplans themselves are left as-is in the plan tree, assuming
+ * anything that reads the plan tree during execution knows to ignore them by
+ * looking at the PlanInitPruningOutput's list of valid subplans.
+ *
+ * Partitioned tables mentioned in PartitionedRelPruneInfo nodes that drive
+ * the pruning will be locked before doing the pruning and also added to the
+ * returned set.
+ */
+ExecLockRelsInfo *
+ExecutorGetLockRels(PlannedStmt *plannedstmt, ParamListInfo params)
+{
+ int numPlanNodes = plannedstmt->numPlanNodes;
+ ExecGetLockRelsContext context;
+ ExecLockRelsInfo *result;
+ ListCell *lc;
+
+ /* Only get here if there is any pruning to do. */
+ Assert(plannedstmt->containsInitialPruning);
+
+ context.stmt = plannedstmt;
+ context.params = params;
+
+ /*
+ * Go walk all the plan tree(s) present in the PlannedStmt, filling
+ * context.lockrels with only the relations from plan nodes that
+ * survive initial pruning and also the tables mentioned in
+ * partitioned_rels sets found in the plan.
+ */
+ context.lockrels = NULL;
+ context.initPruningOutputs = NIL;
+ context.ipoIndexes = palloc0(sizeof(int) * numPlanNodes);
+
+ /* All the subplans. */
+ foreach(lc, plannedstmt->subplans)
+ {
+ Plan *subplan = lfirst(lc);
+
+ (void) ExecGetLockRels(subplan, &context);
+ }
+
+ /* And the main tree. */
+ (void) ExecGetLockRels(plannedstmt->planTree, &context);
+
+ /*
+ * Also be sure to lock partitioned relations from any [Merge]Append nodes
+ * that were originally present but were ultimately left out from the plan
+ * due to being deemed no-op nodes.
+ */
+ context.lockrels = bms_add_members(context.lockrels,
+ plannedstmt->elidedAppendPartedRels);
+
+ result = makeNode(ExecLockRelsInfo);
+ result->lockrels = context.lockrels;
+ result->numPlanNodes = numPlanNodes;
+ result->initPruningOutputs = context.initPruningOutputs;
+ result->ipoIndexes = context.ipoIndexes;
+
+ return result;
+}
+
+/* ------------------------------------------------------------------------
+ * ExecGetLockRels
+ * Adds all the relations that will be scanned by 'node' and its child
+ * plans to context->lockrels after taking into account the effect
+ * of performing initial pruning, if any
+ *
+ * context->stmt gives the PlannedStmt being inspected to access the plan's
+ * range table if needed and context->params the set of EXTERN parameters
+ * available to evaluate pruning parameters.
+ *
+ * If initial pruning is done, a PlanInitPruningOutput node containing the
+ * result of pruning will be stored in context->initPruningOutputs that will
+ * be made available to the executor to reuse.
+ * ------------------------------------------------------------------------
+ */
+bool
+ExecGetLockRels(Plan *node, ExecGetLockRelsContext *context)
+{
+ /* Do nothing when we get to the end of a leaf of the tree. */
+ if (node == NULL)
+ return true;
+
+ /* Make sure there's enough stack available. */
+ check_stack_depth();
+
+ switch (nodeTag(node))
+ {
+ /* Currently, only these two nodes have prunable child subplans. */
+ case T_Append:
+ if (ExecGetAppendLockRels((Append *) node, context))
+ return true;
+ break;
+ case T_MergeAppend:
+ if (ExecGetMergeAppendLockRels((MergeAppend *) node,
+ context))
+ return true;
+ break;
+
+ /*
+ * And these manipulate relations that must be added to context->lockrels.
+ */
+ case T_SeqScan:
+ case T_SampleScan:
+ case T_IndexScan:
+ case T_IndexOnlyScan:
+ case T_BitmapIndexScan:
+ case T_BitmapHeapScan:
+ case T_TidScan:
+ case T_TidRangeScan:
+ case T_ForeignScan:
+ case T_SubqueryScan:
+ case T_CustomScan:
+ if (ExecGetScanLockRels((Scan *) node, context))
+ return true;
+ break;
+ case T_ModifyTable:
+ if (ExecGetModifyTableLockRels((ModifyTable *) node, context))
+ return true;
+ /* plan_tree_walker() will visit the subplan (outerNode) */
+ break;
+
+ default:
+ break;
+ }
+
+ /* Recurse to subnodes. */
+ return plan_tree_walker(node, ExecGetLockRels, (void *) context);
+}
+
+/*
+ * ExecGetScanLockRels
+ * Do ExecGetLockRels()'s work for a leaf Scan node
+ */
+static bool
+ExecGetScanLockRels(Scan *scan, ExecGetLockRelsContext *context)
+{
+ switch (nodeTag(scan))
+ {
+ case T_ForeignScan:
+ {
+ ForeignScan *fscan = (ForeignScan *) scan;
+
+ context->lockrels = bms_add_members(context->lockrels,
+ fscan->fs_relids);
+ }
+ break;
+
+ case T_SubqueryScan:
+ {
+ SubqueryScan *sscan = (SubqueryScan *) scan;
+
+ (void) ExecGetLockRels((Plan *) sscan->subplan, context);
+ }
+ break;
+
+ case T_CustomScan:
+ {
+ CustomScan *cscan = (CustomScan *) scan;
+ ListCell *lc;
+
+ context->lockrels = bms_add_members(context->lockrels,
+ cscan->custom_relids);
+ foreach(lc, cscan->custom_plans)
+ {
+ (void) ExecGetLockRels((Plan *) lfirst(lc), context);
+ }
+ }
+ break;
+
+ default:
+ context->lockrels = bms_add_member(context->lockrels,
+ scan->scanrelid);
+ break;
+ }
+
+ return true;
+}
/* ----------------------------------------------------------------
* ExecutorStart
@@ -805,6 +1005,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
{
CmdType operation = queryDesc->operation;
PlannedStmt *plannedstmt = queryDesc->plannedstmt;
+ ExecLockRelsInfo *execlockrelsinfo = queryDesc->execlockrelsinfo;
Plan *plan = plannedstmt->planTree;
List *rangeTable = plannedstmt->rtable;
EState *estate = queryDesc->estate;
@@ -824,6 +1025,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
ExecInitRangeTable(estate, rangeTable);
estate->es_plannedstmt = plannedstmt;
+ estate->es_execlockrelsinfo = execlockrelsinfo;
/*
* Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index 9a0d5d59ef..fb6dbd298a 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -66,6 +66,7 @@
#define PARALLEL_KEY_QUERY_TEXT UINT64CONST(0xE000000000000008)
#define PARALLEL_KEY_JIT_INSTRUMENTATION UINT64CONST(0xE000000000000009)
#define PARALLEL_KEY_WAL_USAGE UINT64CONST(0xE00000000000000A)
+#define PARALLEL_KEY_EXECLOCKRELSINFO UINT64CONST(0xE00000000000000B)
#define PARALLEL_TUPLE_QUEUE_SIZE 65536
@@ -182,6 +183,7 @@ ExecSerializePlan(Plan *plan, EState *estate)
pstmt->transientPlan = false;
pstmt->dependsOnRole = false;
pstmt->parallelModeNeeded = false;
+ pstmt->containsInitialPruning = false;
pstmt->planTree = plan;
pstmt->rtable = estate->es_range_table;
pstmt->resultRelations = NIL;
@@ -596,12 +598,15 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
FixedParallelExecutorState *fpes;
char *pstmt_data;
char *pstmt_space;
+ char *execlockrelsinfo_data;
+ char *execlockrelsinfo_space;
char *paramlistinfo_space;
BufferUsage *bufusage_space;
WalUsage *walusage_space;
SharedExecutorInstrumentation *instrumentation = NULL;
SharedJitInstrumentation *jit_instrumentation = NULL;
int pstmt_len;
+ int execlockrelsinfo_len;
int paramlistinfo_len;
int instrumentation_len = 0;
int jit_instrumentation_len = 0;
@@ -630,6 +635,7 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
/* Fix up and serialize plan to be sent to workers. */
pstmt_data = ExecSerializePlan(planstate->plan, estate);
+ execlockrelsinfo_data = nodeToString(estate->es_execlockrelsinfo);
/* Create a parallel context. */
pcxt = CreateParallelContext("postgres", "ParallelQueryMain", nworkers);
@@ -656,6 +662,11 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
shm_toc_estimate_chunk(&pcxt->estimator, pstmt_len);
shm_toc_estimate_keys(&pcxt->estimator, 1);
+ /* Estimate space for serialized ExecLockRelsInfo. */
+ execlockrelsinfo_len = strlen(execlockrelsinfo_data) + 1;
+ shm_toc_estimate_chunk(&pcxt->estimator, execlockrelsinfo_len);
+ shm_toc_estimate_keys(&pcxt->estimator, 1);
+
/* Estimate space for serialized ParamListInfo. */
paramlistinfo_len = EstimateParamListSpace(estate->es_param_list_info);
shm_toc_estimate_chunk(&pcxt->estimator, paramlistinfo_len);
@@ -750,6 +761,12 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
memcpy(pstmt_space, pstmt_data, pstmt_len);
shm_toc_insert(pcxt->toc, PARALLEL_KEY_PLANNEDSTMT, pstmt_space);
+ /* Store serialized ExecLockRelsInfo */
+ execlockrelsinfo_space = shm_toc_allocate(pcxt->toc, execlockrelsinfo_len);
+ memcpy(execlockrelsinfo_space, execlockrelsinfo_data, execlockrelsinfo_len);
+ shm_toc_insert(pcxt->toc, PARALLEL_KEY_EXECLOCKRELSINFO,
+ execlockrelsinfo_space);
+
/* Store serialized ParamListInfo. */
paramlistinfo_space = shm_toc_allocate(pcxt->toc, paramlistinfo_len);
shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARAMLISTINFO, paramlistinfo_space);
@@ -1231,8 +1248,10 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
int instrument_options)
{
char *pstmtspace;
+ char *execlockrelsinfospace;
char *paramspace;
PlannedStmt *pstmt;
+ ExecLockRelsInfo *execlockrelsinfo;
ParamListInfo paramLI;
char *queryString;
@@ -1243,12 +1262,17 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
pstmtspace = shm_toc_lookup(toc, PARALLEL_KEY_PLANNEDSTMT, false);
pstmt = (PlannedStmt *) stringToNode(pstmtspace);
+ /* Reconstruct leader-supplied ExecLockRelsInfo. */
+ execlockrelsinfospace = shm_toc_lookup(toc, PARALLEL_KEY_EXECLOCKRELSINFO,
+ false);
+ execlockrelsinfo = (ExecLockRelsInfo *) stringToNode(execlockrelsinfospace);
+
/* Reconstruct ParamListInfo. */
paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
paramLI = RestoreParamList(&paramspace);
/* Create a QueryDesc for the query. */
- return CreateQueryDesc(pstmt,
+ return CreateQueryDesc(pstmt, execlockrelsinfo,
queryString,
GetActiveSnapshot(), InvalidSnapshot,
receiver, paramLI, NULL, instrument_options);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 7ff5a95f05..fddc97280e 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -24,6 +24,7 @@
#include "mb/pg_wchar.h"
#include "miscadmin.h"
#include "nodes/makefuncs.h"
+#include "parser/parsetree.h"
#include "partitioning/partbounds.h"
#include "partitioning/partdesc.h"
#include "partitioning/partprune.h"
@@ -183,8 +184,13 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
int maxfieldlen);
static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
static PartitionPruneState *ExecCreatePartitionPruneState(PlanState *planstate,
- PartitionPruneInfo *partitionpruneinfo);
-static Bitmapset *ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate);
+ PartitionPruneInfo *partitionpruneinfo,
+ bool consider_initial_steps,
+ bool consider_exec_steps,
+ List *rtable, ExprContext *econtext,
+ PartitionDirectory partdir);
+static Bitmapset *ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate,
+ PartitionPruneInfo *pruneinfo);
static void ExecInitPruningContext(PartitionPruneContext *context,
List *pruning_steps,
PartitionDesc partdesc,
@@ -1483,8 +1489,9 @@ adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri)
* considered to be a stable expression, it can change value from one plan
* node scan to the next during query execution. Stable comparison
* expressions that don't involve such Params allow partition pruning to be
- * done once during executor startup. Expressions that do involve such Params
- * require us to prune separately for each scan of the parent plan node.
+ * done once during executor startup, or even earlier during ExecutorGetLockRels().
+ * Expressions that do involve such Params require us to prune separately for
+ * each scan of the parent plan node.
*
* Note that pruning away unneeded subplans during executor startup has the
* added benefit of not having to initialize the unneeded subplans at all.
@@ -1496,10 +1503,17 @@ adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri)
* Creates the PartitionPruneState required by each of the two pruning
* functions. Details stored include how to map the partition index
* returned by the partition pruning code into subplan indexes. Also
- * determines the set of initially valid subplans by performing initial
- * pruning steps, only which need be initialized by the caller such as
- * ExecInitAppend. Maps in PartitionPruneState are updated to account
- * for initial pruning having eliminated some of the subplans, if any.
+ * determines the set of initially valid subplans, either by looking it
+ * up in the plan node's PlanInitPruningOutput, if one is found in
+ * EState.es_execlockrelsinfo, or by performing initial pruning steps.
+ * Only the subplans in that set need be initialized by the caller,
+ * such as ExecInitAppend. Maps in PartitionPruneState are updated to
+ * account for initial pruning having eliminated some of the subplans,
+ * if any.
+ *
+ * ExecGetLockRelsDoInitialPruning:
+ * Do initial pruning as part of ExecGetLockRels() on the parent plan
+ * node
*
* ExecFindMatchingSubPlans:
* Returns indexes of matching subplans after evaluating all available
@@ -1514,9 +1528,10 @@ adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri)
* ExecInitPartitionPruning
* Initialize data structure needed for run-time partition pruning
*
- * Initial pruning can be done immediately, so it is done here if needed and
- * the set of surviving partition subplans' indexes are added to the output
- * parameter *initially_valid_subplans.
+ * Initial pruning can be done immediately, so it is done here unless it has
+ * already been done by ExecGetLockRelsDoInitialPruning(), and the set of
+ * surviving partition subplans' indexes are added to the output parameter
+ * *initially_valid_subplans.
*
* If subplans are indeed pruned, subplan_map arrays contained in the returned
* PartitionPruneState are re-sequenced to not count those, though only if the
@@ -1530,22 +1545,57 @@ ExecInitPartitionPruning(PlanState *planstate,
{
PartitionPruneState *prunestate;
EState *estate = planstate->state;
+ Plan *plan = planstate->plan;
+ PlanInitPruningOutput *initPruningOutput = NULL;
+ bool do_pruning = (pruneinfo->needs_init_pruning ||
+ pruneinfo->needs_exec_pruning);
- /* We may need an expression context to evaluate partition exprs */
- ExecAssignExprContext(estate, planstate);
+ /* Retrieve the parent plan's PlanInitPruningOutput, if any. */
+ if (estate->es_execlockrelsinfo)
+ {
+ initPruningOutput = (PlanInitPruningOutput *)
+ ExecFetchPlanInitPruningOutput(estate->es_execlockrelsinfo, plan);
- /*
- * Create the working data structure for pruning.
- */
- prunestate = ExecCreatePartitionPruneState(planstate, pruneinfo);
+ Assert(initPruningOutput != NULL &&
+ IsA(initPruningOutput, PlanInitPruningOutput));
+ /* No need to do initial pruning again, only exec pruning. */
+ do_pruning = pruneinfo->needs_exec_pruning;
+ }
+
+ prunestate = NULL;
+ if (do_pruning)
+ {
+ /* We may need an expression context to evaluate partition exprs */
+ ExecAssignExprContext(estate, planstate);
+
+ /* For data reading, executor always omits detached partitions */
+ if (estate->es_partition_directory == NULL)
+ estate->es_partition_directory =
+ CreatePartitionDirectory(estate->es_query_cxt, false);
+
+ /*
+ * Create the working data structure for pruning. No need to consider
+ * initial pruning steps if we have a PlanInitPruningOutput.
+ */
+ prunestate = ExecCreatePartitionPruneState(planstate, pruneinfo,
+ initPruningOutput == NULL, true,
+ NIL, planstate->ps_ExprContext,
+ estate->es_partition_directory);
+ }
/*
* Perform an initial partition prune, if required.
*/
- if (prunestate->do_initial_prune)
+ if (initPruningOutput)
+ {
+ /* ExecGetLockRelsDoInitialPruning() already did it for us! */
+ *initially_valid_subplans = initPruningOutput->initially_valid_subplans;
+ }
+ else if (prunestate && prunestate->do_initial_prune)
{
/* Determine which subplans survive initial pruning */
- *initially_valid_subplans = ExecFindInitialMatchingSubPlans(prunestate);
+ *initially_valid_subplans = ExecFindInitialMatchingSubPlans(prunestate,
+ pruneinfo);
}
else
{
@@ -1563,7 +1613,7 @@ ExecInitPartitionPruning(PlanState *planstate,
* invalid data in prunestate, because that data won't be consulted again
* (cf initial Assert in ExecFindMatchingSubPlans).
*/
- if (prunestate->do_exec_prune &&
+ if (prunestate && prunestate->do_exec_prune &&
bms_num_members(*initially_valid_subplans) < n_total_subplans)
PartitionPruneStateFixSubPlanMap(prunestate,
*initially_valid_subplans,
@@ -1572,12 +1622,75 @@ ExecInitPartitionPruning(PlanState *planstate,
return prunestate;
}
+/*
+ * ExecGetLockRelsDoInitialPruning
+ * Perform initial pruning as part of doing ExecGetLockRels() on the parent
+ * plan node
+ */
+Bitmapset *
+ExecGetLockRelsDoInitialPruning(Plan *plan, ExecGetLockRelsContext *context,
+ PartitionPruneInfo *pruneinfo)
+{
+ List *rtable = context->stmt->rtable;
+ ParamListInfo params = context->params;
+ ExprContext *econtext;
+ PartitionDirectory pdir;
+ MemoryContext oldcontext,
+ tmpcontext;
+ PartitionPruneState *prunestate;
+ PlanInitPruningOutput *initPruningOutput;
+
+ /*
+ * A temporary context to allocate stuff needed to run the pruning steps.
+ */
+ tmpcontext = AllocSetContextCreate(CurrentMemoryContext,
+ "initial pruning working data",
+ ALLOCSET_DEFAULT_SIZES);
+ oldcontext = MemoryContextSwitchTo(tmpcontext);
+
+ /*
+ * PartitionDirectory to look up partition descriptors, which omits
+ * detached partitions, just like in the executor proper.
+ */
+ pdir = CreatePartitionDirectory(CurrentMemoryContext, false);
+
+ /*
+ * We don't yet have a PlanState for the parent plan node, so must create
+ * a standalone ExprContext to evaluate pruning expressions, equipped with
+ * the information about the EXTERN parameters that the caller passed us.
+ * Note that that's okay because the initial pruning steps do not contain
+ * anything that requires the execution to have started.
+ */
+ econtext = CreateStandaloneExprContext();
+ econtext->ecxt_param_list_info = params;
+ prunestate = ExecCreatePartitionPruneState(NULL, pruneinfo,
+ true, false,
+ rtable, econtext,
+ pdir);
+ MemoryContextSwitchTo(oldcontext);
+
+ /* Do the pruning and populate a PlanInitPruningOutput for this node. */
+ initPruningOutput = makeNode(PlanInitPruningOutput);
+ initPruningOutput->initially_valid_subplans =
+ ExecFindInitialMatchingSubPlans(prunestate, pruneinfo);
+ ExecStorePlanInitPruningOutput(context, initPruningOutput, plan);
+
+ FreeExprContext(econtext, true);
+ DestroyPartitionDirectory(pdir);
+ MemoryContextDelete(tmpcontext);
+
+ return initPruningOutput->initially_valid_subplans;
+}
+
/*
* ExecCreatePartitionPruneState
* Build the data structure required for calling
* ExecFindInitialMatchingSubPlans and ExecFindMatchingSubPlans.
*
- * 'planstate' is the parent plan node's execution state.
+ * 'planstate', if not NULL, is the parent plan node's execution state. It
+ * can be NULL if being called before ExecutorStart(), in which case,
+ * 'rtable' (range table), 'econtext', and 'partdir' must be explicitly
+ * provided.
*
* 'partitionpruneinfo' is a PartitionPruneInfo as generated by
* make_partition_pruneinfo. Here we build a PartitionPruneState containing a
@@ -1592,19 +1705,20 @@ ExecInitPartitionPruning(PlanState *planstate,
*/
static PartitionPruneState *
ExecCreatePartitionPruneState(PlanState *planstate,
- PartitionPruneInfo *partitionpruneinfo)
+ PartitionPruneInfo *partitionpruneinfo,
+ bool consider_initial_steps,
+ bool consider_exec_steps,
+ List *rtable, ExprContext *econtext,
+ PartitionDirectory partdir)
{
- EState *estate = planstate->state;
+ EState *estate = planstate ? planstate->state : NULL;
PartitionPruneState *prunestate;
int n_part_hierarchies;
ListCell *lc;
int i;
- ExprContext *econtext = planstate->ps_ExprContext;
- /* For data reading, executor always omits detached partitions */
- if (estate->es_partition_directory == NULL)
- estate->es_partition_directory =
- CreatePartitionDirectory(estate->es_query_cxt, false);
+ Assert((estate != NULL) ||
+ (partdir != NULL && econtext != NULL && rtable != NIL));
n_part_hierarchies = list_length(partitionpruneinfo->prune_infos);
Assert(n_part_hierarchies > 0);
@@ -1655,19 +1769,48 @@ ExecCreatePartitionPruneState(PlanState *planstate,
PartitionedRelPruneInfo *pinfo = lfirst_node(PartitionedRelPruneInfo, lc2);
PartitionedRelPruningData *pprune = &prunedata->partrelprunedata[j];
Relation partrel;
+ bool close_partrel = false;
PartitionDesc partdesc;
PartitionKey partkey;
/*
- * We can rely on the copies of the partitioned table's partition
- * key and partition descriptor appearing in its relcache entry,
- * because that entry will be held open and locked for the
- * duration of this executor run.
+ * Must open the relation ourselves when called before execution
+ * has started, such as when called during
+ * ExecutorGetLockRels() on a cached plan. In that case,
+ * sub-partitions must be locked, because AcquirePlannerLocks()
+ * would not have seen them. (1st relation in a partrelpruneinfos
+ * list is always the root partitioned table appearing in the
+ * query, which AcquirePlannerLocks() would have locked; the
+ * Assert in relation_open() guards that assumption.)
+ */
+ if (estate == NULL)
+ {
+ RangeTblEntry *rte = rt_fetch(pinfo->rtindex, rtable);
+ int lockmode = (j == 0) ? NoLock : rte->rellockmode;
+
+ partrel = table_open(rte->relid, lockmode);
+ close_partrel = true;
+ }
+ else
+ partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
+
+ /*
+ * We can rely on the copy of the partitioned table's partition
+ * key in its relcache entry, because it can't change (or
+ * get destroyed) as long as the relation is locked. Partition
+ * descriptor is taken from the PartitionDirectory associated with
+ * the table that is held open long enough for the descriptor to
+ * remain valid while it's used to perform the pruning steps.
*/
- partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
partkey = RelationGetPartitionKey(partrel);
- partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
- partrel);
+ partdesc = PartitionDirectoryLookup(partdir, partrel);
+
+ /*
+ * Must close partrel, keeping the lock taken, if we're not using
+ * EState's entry.
+ */
+ if (close_partrel)
+ table_close(partrel, NoLock);
/*
* Initialize the subplan_map and subpart_map.
@@ -1769,7 +1912,7 @@ ExecCreatePartitionPruneState(PlanState *planstate,
* Initialize pruning contexts as needed.
*/
pprune->initial_pruning_steps = pinfo->initial_pruning_steps;
- if (pinfo->initial_pruning_steps)
+ if (consider_initial_steps && pinfo->initial_pruning_steps)
{
ExecInitPruningContext(&pprune->initial_context,
pinfo->initial_pruning_steps,
@@ -1779,7 +1922,7 @@ ExecCreatePartitionPruneState(PlanState *planstate,
prunestate->do_initial_prune = true;
}
pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
- if (pinfo->exec_pruning_steps)
+ if (consider_exec_steps && pinfo->exec_pruning_steps)
{
ExecInitPruningContext(&pprune->exec_context,
pinfo->exec_pruning_steps,
@@ -1893,7 +2036,8 @@ ExecInitPruningContext(PartitionPruneContext *context,
* is required.
*/
static Bitmapset *
-ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate)
+ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate,
+ PartitionPruneInfo *pruneinfo)
{
Bitmapset *result = NULL;
MemoryContext oldcontext;
@@ -1903,8 +2047,8 @@ ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate)
Assert(prunestate->do_initial_prune);
/*
- * Switch to a temp context to avoid leaking memory in the executor's
- * query-lifespan memory context.
+ * Switch to a temp context to avoid leaking memory in the longer-term
+ * memory context.
*/
oldcontext = MemoryContextSwitchTo(prunestate->prune_context);
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 9df1f81ea8..7246f9175f 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -119,6 +119,7 @@ CreateExecutorState(void)
estate->es_relations = NULL;
estate->es_rowmarks = NULL;
estate->es_plannedstmt = NULL;
+ estate->es_execlockrelsinfo = NULL;
estate->es_junkFilter = NULL;
@@ -785,6 +786,13 @@ ExecGetRangeTableRelation(EState *estate, Index rti)
Assert(rti > 0 && rti <= estate->es_range_table_size);
+ /*
+ * Cross-check that AcquireExecutorLocks() has not missed any relation
+ * that it should have locked.
+ */
+ Assert(estate->es_execlockrelsinfo == NULL ||
+ bms_is_member(rti, estate->es_execlockrelsinfo->lockrels));
+
rel = estate->es_relations[rti - 1];
if (rel == NULL)
{
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index f9460ae506..a2182a6b1f 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -844,7 +844,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
else
dest = None_Receiver;
- es->qd = CreateQueryDesc(es->stmt,
+ es->qd = CreateQueryDesc(es->stmt, NULL,
fcache->src,
GetActiveSnapshot(),
InvalidSnapshot,
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 5b6d3eb23b..9c6f907687 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -94,6 +94,55 @@ static bool ExecAppendAsyncRequest(AppendState *node, TupleTableSlot **result);
static void ExecAppendAsyncEventWait(AppendState *node);
static void classify_matching_subplans(AppendState *node);
+/* ----------------------------------------------------------------
+ * ExecGetAppendLockRels
+ * Do ExecGetLockRels()'s work for an Append plan
+ * ----------------------------------------------------------------
+ */
+bool
+ExecGetAppendLockRels(Append *node, ExecGetLockRelsContext *context)
+{
+ PartitionPruneInfo *pruneinfo = node->part_prune_info;
+
+ /*
+ * Must always lock all the partitioned tables whose direct and indirect
+ * partitions will be scanned by this Append.
+ */
+ context->lockrels = bms_add_members(context->lockrels,
+ node->partitioned_rels);
+
+ /*
+ * Now recurse to subplans to add relations scanned therein.
+ *
+ * If initial pruning can be done, do that now and only recurse to the
+ * surviving subplans.
+ */
+ if (pruneinfo && pruneinfo->needs_init_pruning)
+ {
+ List *subplans = node->appendplans;
+ Bitmapset *validsubplans;
+ int i;
+
+ validsubplans = ExecGetLockRelsDoInitialPruning((Plan *) node,
+ context, pruneinfo);
+
+ /* Recurse to surviving subplans. */
+ i = -1;
+ while ((i = bms_next_member(validsubplans, i)) >= 0)
+ {
+ Plan *subplan = list_nth(subplans, i);
+
+ (void) ExecGetLockRels(subplan, context);
+ }
+
+ /* done with this node */
+ return true;
+ }
+
+ /* Tell the caller to recurse to *all* the subplans. */
+ return false;
+}
+
/* ----------------------------------------------------------------
* ExecInitAppend
*
@@ -155,7 +204,8 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
* subplan, we can fill as_valid_subplans immediately, preventing
* later calls to ExecFindMatchingSubPlans.
*/
- if (!prunestate->do_exec_prune && nplans > 0)
+ if (appendstate->as_prune_state == NULL ||
+ (!appendstate->as_prune_state->do_exec_prune && nplans > 0))
appendstate->as_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
}
else
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index 9a9f29e845..4b04fcdbc2 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -54,6 +54,55 @@ typedef int32 SlotNumber;
static TupleTableSlot *ExecMergeAppend(PlanState *pstate);
static int heap_compare_slots(Datum a, Datum b, void *arg);
+/* ----------------------------------------------------------------
+ * ExecGetMergeAppendLockRels
+ * Do ExecGetLockRels()'s work for a MergeAppend plan
+ * ----------------------------------------------------------------
+ */
+bool
+ExecGetMergeAppendLockRels(MergeAppend *node, ExecGetLockRelsContext *context)
+{
+ PartitionPruneInfo *pruneinfo = node->part_prune_info;
+
+ /*
+ * Must always lock all the partitioned tables whose direct and indirect
+ * partitions will be scanned by this MergeAppend.
+ */
+ context->lockrels = bms_add_members(context->lockrels,
+ node->partitioned_rels);
+
+ /*
+ * Now recurse to subplans to add relations scanned therein.
+ *
+ * If initial pruning can be done, do that now and only recurse to the
+ * surviving subplans.
+ */
+ if (pruneinfo && pruneinfo->needs_init_pruning)
+ {
+ List *subplans = node->mergeplans;
+ Bitmapset *validsubplans;
+ int i;
+
+ validsubplans = ExecGetLockRelsDoInitialPruning((Plan *) node,
+ context, pruneinfo);
+
+ /* Recurse to surviving subplans. */
+ i = -1;
+ while ((i = bms_next_member(validsubplans, i)) >= 0)
+ {
+ Plan *subplan = list_nth(subplans, i);
+
+ (void) ExecGetLockRels(subplan, context);
+ }
+
+ /* done with this node */
+ return true;
+ }
+
+ /* Tell the caller to recurse to *all* the subplans. */
+ return false;
+}
+
/* ----------------------------------------------------------------
* ExecInitMergeAppend
@@ -103,7 +152,8 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
* subplan, we can fill as_valid_subplans immediately, preventing
* later calls to ExecFindMatchingSubPlans.
*/
- if (!prunestate->do_exec_prune && nplans > 0)
+ if (mergestate->ms_prune_state == NULL ||
+ (!mergestate->ms_prune_state->do_exec_prune && nplans > 0))
mergestate->ms_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
}
else
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 701fe05296..23df3efef0 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -3008,6 +3008,31 @@ ExecLookupResultRelByOid(ModifyTableState *node, Oid resultoid,
return NULL;
}
+/*
+ * ExecGetModifyTableLockRels
+ * Do ExecGetLockRels()'s work for a ModifyTable plan
+ */
+bool
+ExecGetModifyTableLockRels(ModifyTable *plan, ExecGetLockRelsContext *context)
+{
+ ListCell *lc;
+
+ /* First add the result relation RTIs mentioned in the node. */
+ if (plan->rootRelation > 0)
+ context->lockrels = bms_add_member(context->lockrels,
+ plan->rootRelation);
+ context->lockrels = bms_add_member(context->lockrels,
+ plan->nominalRelation);
+ foreach(lc, plan->resultRelations)
+ {
+ context->lockrels = bms_add_member(context->lockrels,
+ lfirst_int(lc));
+ }
+
+ /* Tell the caller to recurse to the subplan (outerPlan(plan)). */
+ return false;
+}
+
/* ----------------------------------------------------------------
* ExecInitModifyTable
* ----------------------------------------------------------------
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index a82e986667..2107009591 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1578,6 +1578,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
CachedPlanSource *plansource;
CachedPlan *cplan;
List *stmt_list;
+ List *execlockrelsinfo_list;
char *query_string;
Snapshot snapshot;
MemoryContext oldcontext;
@@ -1659,6 +1660,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
/* Replan if needed, and increment plan refcount for portal */
cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv);
stmt_list = cplan->stmt_list;
+ execlockrelsinfo_list = cplan->execlockrelsinfo_list;
if (!plan->saved)
{
@@ -1670,6 +1672,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
*/
oldcontext = MemoryContextSwitchTo(portal->portalContext);
stmt_list = copyObject(stmt_list);
+ execlockrelsinfo_list = copyObject(execlockrelsinfo_list);
MemoryContextSwitchTo(oldcontext);
ReleaseCachedPlan(cplan, NULL);
cplan = NULL; /* portal shouldn't depend on cplan */
@@ -1683,6 +1686,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
query_string,
plansource->commandTag,
stmt_list,
+ execlockrelsinfo_list,
cplan);
/*
@@ -2473,7 +2477,9 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
{
CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
List *stmt_list;
- ListCell *lc2;
+ List *execlockrelsinfo_list;
+ ListCell *lc2,
+ *lc3;
spicallbackarg.query = plansource->query_string;
@@ -2552,6 +2558,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
plan_owner, _SPI_current->queryEnv);
stmt_list = cplan->stmt_list;
+ execlockrelsinfo_list = cplan->execlockrelsinfo_list;
/*
* If we weren't given a specific snapshot to use, and the statement
@@ -2589,9 +2596,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
}
}
- foreach(lc2, stmt_list)
+ forboth(lc2, stmt_list, lc3, execlockrelsinfo_list)
{
PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+ ExecLockRelsInfo *execlockrelsinfo = lfirst_node(ExecLockRelsInfo, lc3);
bool canSetTag = stmt->canSetTag;
DestReceiver *dest;
@@ -2663,7 +2671,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
else
snap = InvalidSnapshot;
- qdesc = CreateQueryDesc(stmt,
+ qdesc = CreateQueryDesc(stmt, execlockrelsinfo,
plansource->query_string,
snap, crosscheck_snapshot,
dest,
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index d4b5cc7e59..631727d310 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -68,6 +68,13 @@
} \
} while (0)
+/* Copy a field that is an array with numElem ints */
+#define COPY_INT_ARRAY(fldname, numElem) \
+ do { \
+ newnode->fldname = (numElem) > 0 ? palloc((numElem) * sizeof(int)) : NULL; \
+ memcpy(newnode->fldname, from->fldname, sizeof(int) * (numElem)); \
+ } while (0)
+
/* Copy a parse location field (for Copy, this is same as scalar case) */
#define COPY_LOCATION_FIELD(fldname) \
(newnode->fldname = from->fldname)
@@ -94,8 +101,10 @@ _copyPlannedStmt(const PlannedStmt *from)
COPY_SCALAR_FIELD(transientPlan);
COPY_SCALAR_FIELD(dependsOnRole);
COPY_SCALAR_FIELD(parallelModeNeeded);
+ COPY_SCALAR_FIELD(containsInitialPruning);
COPY_SCALAR_FIELD(jitFlags);
COPY_NODE_FIELD(planTree);
+ COPY_SCALAR_FIELD(numPlanNodes);
COPY_NODE_FIELD(rtable);
COPY_NODE_FIELD(resultRelations);
COPY_NODE_FIELD(appendRelations);
@@ -1281,6 +1290,8 @@ _copyPartitionPruneInfo(const PartitionPruneInfo *from)
PartitionPruneInfo *newnode = makeNode(PartitionPruneInfo);
COPY_NODE_FIELD(prune_infos);
+ COPY_SCALAR_FIELD(needs_init_pruning);
+ COPY_SCALAR_FIELD(needs_exec_pruning);
COPY_BITMAPSET_FIELD(other_subplans);
return newnode;
@@ -5137,6 +5148,33 @@ _copyExtensibleNode(const ExtensibleNode *from)
return newnode;
}
+/* ****************************************************************
+ * execnodes.h copy functions
+ * ****************************************************************
+ */
+static ExecLockRelsInfo *
+_copyExecLockRelsInfo(const ExecLockRelsInfo *from)
+{
+ ExecLockRelsInfo *newnode = makeNode(ExecLockRelsInfo);
+
+ COPY_BITMAPSET_FIELD(lockrels);
+ COPY_SCALAR_FIELD(numPlanNodes);
+ COPY_NODE_FIELD(initPruningOutputs);
+ COPY_INT_ARRAY(ipoIndexes, from->numPlanNodes);
+
+ return newnode;
+}
+
+static PlanInitPruningOutput *
+_copyPlanInitPruningOutput(const PlanInitPruningOutput *from)
+{
+ PlanInitPruningOutput *newnode = makeNode(PlanInitPruningOutput);
+
+ COPY_BITMAPSET_FIELD(initially_valid_subplans);
+
+ return newnode;
+}
+
/* ****************************************************************
* value.h copy functions
* ****************************************************************
@@ -5191,7 +5229,6 @@ _copyBitString(const BitString *from)
return newnode;
}
-
static ForeignKeyCacheInfo *
_copyForeignKeyCacheInfo(const ForeignKeyCacheInfo *from)
{
@@ -6176,6 +6213,16 @@ copyObjectImpl(const void *from)
retval = _copyPublicationTable(from);
break;
+ /*
+ * EXECUTION NODES
+ */
+ case T_ExecLockRelsInfo:
+ retval = _copyExecLockRelsInfo(from);
+ break;
+ case T_PlanInitPruningOutput:
+ retval = _copyPlanInitPruningOutput(from);
+ break;
+
/*
* MISCELLANEOUS NODES
*/
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 99056272f3..f361d2e2bc 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -312,8 +312,10 @@ _outPlannedStmt(StringInfo str, const PlannedStmt *node)
WRITE_BOOL_FIELD(transientPlan);
WRITE_BOOL_FIELD(dependsOnRole);
WRITE_BOOL_FIELD(parallelModeNeeded);
+ WRITE_BOOL_FIELD(containsInitialPruning);
WRITE_INT_FIELD(jitFlags);
WRITE_NODE_FIELD(planTree);
+ WRITE_INT_FIELD(numPlanNodes);
WRITE_NODE_FIELD(rtable);
WRITE_NODE_FIELD(resultRelations);
WRITE_NODE_FIELD(appendRelations);
@@ -1007,6 +1009,8 @@ _outPartitionPruneInfo(StringInfo str, const PartitionPruneInfo *node)
WRITE_NODE_TYPE("PARTITIONPRUNEINFO");
WRITE_NODE_FIELD(prune_infos);
+ WRITE_BOOL_FIELD(needs_init_pruning);
+ WRITE_BOOL_FIELD(needs_exec_pruning);
WRITE_BITMAPSET_FIELD(other_subplans);
}
@@ -2747,6 +2751,31 @@ _outExtensibleNode(StringInfo str, const ExtensibleNode *node)
methods->nodeOut(str, node);
}
+/*****************************************************************************
+ *
+ * Stuff from execnodes.h
+ *
+ *****************************************************************************/
+
+static void
+_outExecLockRelsInfo(StringInfo str, const ExecLockRelsInfo *node)
+{
+ WRITE_NODE_TYPE("EXECLOCKRELSINFO");
+
+ WRITE_BITMAPSET_FIELD(lockrels);
+ WRITE_INT_FIELD(numPlanNodes);
+ WRITE_NODE_FIELD(initPruningOutputs);
+ WRITE_INT_ARRAY(ipoIndexes, node->numPlanNodes);
+}
+
+static void
+_outPlanInitPruningOutput(StringInfo str, const PlanInitPruningOutput *node)
+{
+ WRITE_NODE_TYPE("PLANINITPRUNINGOUTPUT");
+
+ WRITE_BITMAPSET_FIELD(initially_valid_subplans);
+}
+
/*****************************************************************************
*
* Stuff from parsenodes.h.
@@ -4600,6 +4629,16 @@ outNode(StringInfo str, const void *obj)
_outJsonConstructorExpr(str, obj);
break;
+ /*
+ * EXECUTION NODES
+ */
+ case T_ExecLockRelsInfo:
+ _outExecLockRelsInfo(str, obj);
+ break;
+ case T_PlanInitPruningOutput:
+ _outPlanInitPruningOutput(str, obj);
+ break;
+
default:
/*
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 7536f216bd..41fc710999 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -1650,8 +1650,10 @@ _readPlannedStmt(void)
READ_BOOL_FIELD(transientPlan);
READ_BOOL_FIELD(dependsOnRole);
READ_BOOL_FIELD(parallelModeNeeded);
+ READ_BOOL_FIELD(containsInitialPruning);
READ_INT_FIELD(jitFlags);
READ_NODE_FIELD(planTree);
+ READ_INT_FIELD(numPlanNodes);
READ_NODE_FIELD(rtable);
READ_NODE_FIELD(resultRelations);
READ_NODE_FIELD(appendRelations);
@@ -2602,6 +2604,8 @@ _readPartitionPruneInfo(void)
READ_LOCALS(PartitionPruneInfo);
READ_NODE_FIELD(prune_infos);
+ READ_BOOL_FIELD(needs_init_pruning);
+ READ_BOOL_FIELD(needs_exec_pruning);
READ_BITMAPSET_FIELD(other_subplans);
READ_DONE();
@@ -2771,6 +2775,35 @@ _readPartitionRangeDatum(void)
READ_DONE();
}
+/*
+ * _readExecLockRelsInfo
+ */
+static ExecLockRelsInfo *
+_readExecLockRelsInfo(void)
+{
+ READ_LOCALS(ExecLockRelsInfo);
+
+ READ_BITMAPSET_FIELD(lockrels);
+ READ_INT_FIELD(numPlanNodes);
+ READ_NODE_FIELD(initPruningOutputs);
+ READ_INT_ARRAY(ipoIndexes, local_node->numPlanNodes);
+
+ READ_DONE();
+}
+
+/*
+ * _readPlanInitPruningOutput
+ */
+static PlanInitPruningOutput *
+_readPlanInitPruningOutput(void)
+{
+ READ_LOCALS(PlanInitPruningOutput);
+
+ READ_BITMAPSET_FIELD(initially_valid_subplans);
+
+ READ_DONE();
+}
+
/*
* parseNodeString
*
@@ -3050,6 +3083,10 @@ parseNodeString(void)
return_value = _readJsonValueExpr();
else if (MATCH("JSONCTOREXPR", 12))
return_value = _readJsonConstructorExpr();
+ else if (MATCH("EXECLOCKRELSINFO", 16))
+ return_value = _readExecLockRelsInfo();
+ else if (MATCH("PLANINITPRUNINGOUTPUT", 21))
+ return_value = _readPlanInitPruningOutput();
else
{
elog(ERROR, "badly formatted node string \"%.32s\"...", token);
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 374a9d9753..329fb9d6e7 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -517,7 +517,9 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
result->transientPlan = glob->transientPlan;
result->dependsOnRole = glob->dependsOnRole;
result->parallelModeNeeded = glob->parallelModeNeeded;
+ result->containsInitialPruning = glob->containsInitialPruning;
result->planTree = top_plan;
+ result->numPlanNodes = glob->lastPlanNodeId;
result->rtable = glob->finalrtable;
result->resultRelations = glob->resultRelations;
result->appendRelations = glob->appendRelations;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index dbdeb8ec9d..ac795ae9d9 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -1561,6 +1561,9 @@ set_append_references(PlannerInfo *root,
pinfo->rtindex += rtoffset;
}
}
+
+ if (aplan->part_prune_info->needs_init_pruning)
+ root->glob->containsInitialPruning = true;
}
/* We don't need to recurse to lefttree or righttree ... */
@@ -1648,6 +1651,9 @@ set_mergeappend_references(PlannerInfo *root,
pinfo->rtindex += rtoffset;
}
}
+
+ if (mplan->part_prune_info->needs_init_pruning)
+ root->glob->containsInitialPruning = true;
}
/* We don't need to recurse to lefttree or righttree ... */
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 7080cb25d9..3322dc79f2 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -144,7 +144,9 @@ static List *make_partitionedrel_pruneinfo(PlannerInfo *root,
List *prunequal,
Bitmapset *partrelids,
int *relid_subplan_map,
- Bitmapset **matchedsubplans);
+ Bitmapset **matchedsubplans,
+ bool *needs_init_pruning,
+ bool *needs_exec_pruning);
static void gen_partprune_steps(RelOptInfo *rel, List *clauses,
PartClauseTarget target,
GeneratePruningStepsContext *context);
@@ -230,6 +232,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int *relid_subplan_map;
ListCell *lc;
int i;
+ bool needs_init_pruning = false;
+ bool needs_exec_pruning = false;
/*
* Scan the subpaths to see which ones are scans of partition child
@@ -309,12 +313,16 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
Bitmapset *partrelids = (Bitmapset *) lfirst(lc);
List *pinfolist;
Bitmapset *matchedsubplans = NULL;
+ bool partrel_needs_init_pruning;
+ bool partrel_needs_exec_pruning;
pinfolist = make_partitionedrel_pruneinfo(root, parentrel,
prunequal,
partrelids,
relid_subplan_map,
- &matchedsubplans);
+ &matchedsubplans,
+ &partrel_needs_init_pruning,
+ &partrel_needs_exec_pruning);
/* When pruning is possible, record the matched subplans */
if (pinfolist != NIL)
@@ -323,6 +331,10 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
allmatchedsubplans = bms_join(matchedsubplans,
allmatchedsubplans);
}
+ if (!needs_init_pruning)
+ needs_init_pruning = partrel_needs_init_pruning;
+ if (!needs_exec_pruning)
+ needs_exec_pruning = partrel_needs_exec_pruning;
}
pfree(relid_subplan_map);
@@ -337,6 +349,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
/* Else build the result data structure */
pruneinfo = makeNode(PartitionPruneInfo);
pruneinfo->prune_infos = prunerelinfos;
+ pruneinfo->needs_init_pruning = needs_init_pruning;
+ pruneinfo->needs_exec_pruning = needs_exec_pruning;
/*
* Some subplans may not belong to any of the identified partitioned rels.
@@ -435,13 +449,18 @@ add_part_relids(List *allpartrelids, Bitmapset *partrelids)
* If we cannot find any useful run-time pruning steps, return NIL.
* However, on success, each rel identified in partrelids will have
* an element in the result list, even if some of them are useless.
+ * *needs_init_pruning and *needs_exec_pruning are set to indicate that the
+ * returned PartitionedRelPruneInfos contain pruning steps that can be
+ * performed before and after execution begins, respectively.
*/
static List *
make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
List *prunequal,
Bitmapset *partrelids,
int *relid_subplan_map,
- Bitmapset **matchedsubplans)
+ Bitmapset **matchedsubplans,
+ bool *needs_init_pruning,
+ bool *needs_exec_pruning)
{
RelOptInfo *targetpart = NULL;
List *pinfolist = NIL;
@@ -452,6 +471,10 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int rti;
int i;
+ /* Will find out below. */
+ *needs_init_pruning = false;
+ *needs_exec_pruning = false;
+
/*
* Examine each partitioned rel, constructing a temporary array to map
* from planner relids to index of the partitioned rel, and building a
@@ -539,6 +562,9 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
* executor per-scan pruning steps. This first pass creates startup
* pruning steps and detects whether there's any possibly-useful quals
* that would require per-scan pruning.
+ *
+ * In the first pass, we also note whether the second pass is necessary
+ * by checking for the presence of EXEC parameters.
*/
gen_partprune_steps(subpart, partprunequal, PARTTARGET_INITIAL,
&context);
@@ -613,6 +639,11 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
pinfo->execparamids = execparamids;
/* Remaining fields will be filled in the next loop */
+ if (!*needs_init_pruning)
+ *needs_init_pruning = (initial_pruning_steps != NIL);
+ if (!*needs_exec_pruning)
+ *needs_exec_pruning = (exec_pruning_steps != NIL);
+
pinfolist = lappend(pinfolist, pinfo);
}
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index ba2fcfeb4a..085eb3f209 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -945,15 +945,17 @@ pg_plan_query(Query *querytree, const char *query_string, int cursorOptions,
* For normal optimizable statements, invoke the planner. For utility
* statements, just make a wrapper PlannedStmt node.
*
- * The result is a list of PlannedStmt nodes.
+ * The result is a list of PlannedStmt nodes. Also, a NULL is appended to
+ * *execlockrelsinfo_list for each PlannedStmt added to the returned list.
*/
List *
pg_plan_queries(List *querytrees, const char *query_string, int cursorOptions,
- ParamListInfo boundParams)
+ ParamListInfo boundParams, List **execlockrelsinfo_list)
{
List *stmt_list = NIL;
ListCell *query_list;
+ *execlockrelsinfo_list = NIL;
foreach(query_list, querytrees)
{
Query *query = lfirst_node(Query, query_list);
@@ -977,6 +979,7 @@ pg_plan_queries(List *querytrees, const char *query_string, int cursorOptions,
}
stmt_list = lappend(stmt_list, stmt);
+ *execlockrelsinfo_list = lappend(*execlockrelsinfo_list, NULL);
}
return stmt_list;
@@ -1080,7 +1083,8 @@ exec_simple_query(const char *query_string)
QueryCompletion qc;
MemoryContext per_parsetree_context = NULL;
List *querytree_list,
- *plantree_list;
+ *plantree_list,
+ *plantree_execlockrelsinfo_list;
Portal portal;
DestReceiver *receiver;
int16 format;
@@ -1167,7 +1171,8 @@ exec_simple_query(const char *query_string)
NULL, 0, NULL);
plantree_list = pg_plan_queries(querytree_list, query_string,
- CURSOR_OPT_PARALLEL_OK, NULL);
+ CURSOR_OPT_PARALLEL_OK, NULL,
+ &plantree_execlockrelsinfo_list);
/*
* Done with the snapshot used for parsing/planning.
@@ -1203,6 +1208,7 @@ exec_simple_query(const char *query_string)
query_string,
commandTag,
plantree_list,
+ plantree_execlockrelsinfo_list,
NULL);
/*
@@ -1991,6 +1997,7 @@ exec_bind_message(StringInfo input_message)
query_string,
psrc->commandTag,
cplan->stmt_list,
+ cplan->execlockrelsinfo_list,
cplan);
/* Done with the snapshot used for parameter I/O and parsing/planning */
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 5f907831a3..972ddc014e 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -35,7 +35,7 @@
Portal ActivePortal = NULL;
-static void ProcessQuery(PlannedStmt *plan,
+static void ProcessQuery(PlannedStmt *plan, ExecLockRelsInfo *execlockrelsinfo,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -65,6 +65,7 @@ static void DoPortalRewind(Portal portal);
*/
QueryDesc *
CreateQueryDesc(PlannedStmt *plannedstmt,
+ ExecLockRelsInfo *execlockrelsinfo,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
@@ -77,6 +78,7 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
qd->operation = plannedstmt->commandType; /* operation */
qd->plannedstmt = plannedstmt; /* plan */
+ qd->execlockrelsinfo = execlockrelsinfo; /* ExecutorGetLockRels() output for plan */
qd->sourceText = sourceText; /* query text */
qd->snapshot = RegisterSnapshot(snapshot); /* snapshot */
/* RI check snapshot */
@@ -122,6 +124,7 @@ FreeQueryDesc(QueryDesc *qdesc)
* PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
*
* plan: the plan tree for the query
+ * execlockrelsinfo: ExecutorGetLockRels() output for the plan tree
* sourceText: the source text of the query
* params: any parameters needed
* dest: where to send results
@@ -134,6 +137,7 @@ FreeQueryDesc(QueryDesc *qdesc)
*/
static void
ProcessQuery(PlannedStmt *plan,
+ ExecLockRelsInfo *execlockrelsinfo,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -145,7 +149,7 @@ ProcessQuery(PlannedStmt *plan,
/*
* Create the QueryDesc object
*/
- queryDesc = CreateQueryDesc(plan, sourceText,
+ queryDesc = CreateQueryDesc(plan, execlockrelsinfo, sourceText,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
@@ -490,6 +494,7 @@ PortalStart(Portal portal, ParamListInfo params,
* the destination to DestNone.
*/
queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+ linitial_node(ExecLockRelsInfo, portal->execlockrelsinfos),
portal->sourceText,
GetActiveSnapshot(),
InvalidSnapshot,
@@ -1190,7 +1195,8 @@ PortalRunMulti(Portal portal,
QueryCompletion *qc)
{
bool active_snapshot_set = false;
- ListCell *stmtlist_item;
+ ListCell *stmtlist_item,
+ *execlockrelsinfolist_item;
/*
* If the destination is DestRemoteExecute, change to DestNone. The
@@ -1211,9 +1217,12 @@ PortalRunMulti(Portal portal,
* Loop to handle the individual queries generated from a single parsetree
* by analysis and rewrite.
*/
- foreach(stmtlist_item, portal->stmts)
+ forboth(stmtlist_item, portal->stmts,
+ execlockrelsinfolist_item, portal->execlockrelsinfos)
{
PlannedStmt *pstmt = lfirst_node(PlannedStmt, stmtlist_item);
+ ExecLockRelsInfo *execlockrelsinfo = lfirst_node(ExecLockRelsInfo,
+ execlockrelsinfolist_item);
/*
* If we got a cancel signal in prior command, quit
@@ -1271,7 +1280,7 @@ PortalRunMulti(Portal portal,
if (pstmt->canSetTag)
{
/* statement can set tag string */
- ProcessQuery(pstmt,
+ ProcessQuery(pstmt, execlockrelsinfo,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
@@ -1280,7 +1289,7 @@ PortalRunMulti(Portal portal,
else
{
/* stmt added by rewrite cannot set tag */
- ProcessQuery(pstmt,
+ ProcessQuery(pstmt, execlockrelsinfo,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 4cf6db504f..9f5a40a0a6 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -99,14 +99,16 @@ static dlist_head cached_expression_list = DLIST_STATIC_INIT(cached_expression_l
static void ReleaseGenericPlan(CachedPlanSource *plansource);
static List *RevalidateCachedQuery(CachedPlanSource *plansource,
QueryEnvironment *queryEnv);
-static bool CheckCachedPlan(CachedPlanSource *plansource);
+static bool CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams);
static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
ParamListInfo boundParams, QueryEnvironment *queryEnv);
+static void CachedPlanSaveExecLockRelsInfos(CachedPlan *plan, List *execlockrelsinfo_list);
static bool choose_custom_plan(CachedPlanSource *plansource,
ParamListInfo boundParams);
static double cached_plan_cost(CachedPlan *plan, bool include_planner);
static Query *QueryListGetPrimaryStmt(List *stmts);
-static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static List *AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams);
+static void ReleaseExecutorLocks(List *stmt_list, List *execlockrelsinfo_list);
static void AcquirePlannerLocks(List *stmt_list, bool acquire);
static void ScanQueryForLocks(Query *parsetree, bool acquire);
static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -790,9 +792,21 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
*
* On a "true" return, we have acquired the locks needed to run the plan.
* (We must do this for the "true" result to be race-condition-free.)
+ *
+ * If the CachedPlan is valid, AcquireExecutorLocks() may call
+ * ExecutorGetLockRels() on each PlannedStmt contained in it to determine the
+ * set of relations to lock, instead of just scanning its range table. That
+ * lets us skip locking relations scanned only by plan nodes that initial
+ * partition pruning shows need not be executed. The resulting
+ * ExecLockRelsInfo nodes, which contain the result of such pruning, are
+ * saved in plan->execlockrelsinfo_list, allocated in a child context of the
+ * context containing the plan itself. The previous contents of the list
+ * from the last invocation on the same CachedPlan are deleted, because they
+ * would no longer be valid given the fresh set of parameter values, which
+ * may be used as pruning parameters.
*/
static bool
-CheckCachedPlan(CachedPlanSource *plansource)
+CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams)
{
CachedPlan *plan = plansource->gplan;
@@ -820,13 +834,25 @@ CheckCachedPlan(CachedPlanSource *plansource)
*/
if (plan->is_valid)
{
+ List *execlockrelsinfo_list;
+
/*
* Plan must have positive refcount because it is referenced by
* plansource; so no need to fear it disappears under us here.
*/
Assert(plan->refcount > 0);
- AcquireExecutorLocks(plan->stmt_list, true);
+ /*
+ * Lock relations scanned by the plan. ExecutorGetLockRels() may omit
+ * some relations because the plan nodes that scan them were pruned by
+ * initial partition pruning. In that case, the ExecLockRelsInfo nodes
+ * collected in the returned list record which plan nodes were pruned;
+ * the list is passed to the executor along with the list of
+ * PlannedStmts, so that it doesn't accidentally try to execute those
+ * nodes.
+ */
+ execlockrelsinfo_list = AcquireExecutorLocks(plan->stmt_list,
+ boundParams);
/*
* If plan was transient, check to see if TransactionXmin has
@@ -844,11 +870,14 @@ CheckCachedPlan(CachedPlanSource *plansource)
if (plan->is_valid)
{
/* Successfully revalidated and locked the query. */
+
+ /* Remember ExecLockRelsInfos in the CachedPlan. */
+ CachedPlanSaveExecLockRelsInfos(plan, execlockrelsinfo_list);
return true;
}
/* Oops, the race case happened. Release useless locks. */
- AcquireExecutorLocks(plan->stmt_list, false);
+ ReleaseExecutorLocks(plan->stmt_list, execlockrelsinfo_list);
}
/*
@@ -880,7 +909,8 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
ParamListInfo boundParams, QueryEnvironment *queryEnv)
{
CachedPlan *plan;
- List *plist;
+ List *plist,
+ *execlockrelsinfo_list;
bool snapshot_set;
bool is_transient;
MemoryContext plan_context;
@@ -933,7 +963,8 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
* Generate the plan.
*/
plist = pg_plan_queries(qlist, plansource->query_string,
- plansource->cursor_options, boundParams);
+ plansource->cursor_options, boundParams,
+ &execlockrelsinfo_list);
/* Release snapshot if we got one */
if (snapshot_set)
@@ -1002,6 +1033,16 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
plan->is_saved = false;
plan->is_valid = true;
+ /*
+ * Save the dummy ExecLockRelsInfo list, that is, a list containing NULLs
+ * as elements. We must do this because users of the CachedPlan expect
+ * one to go with the list of PlannedStmts.
+ * XXX maybe get rid of that contract.
+ */
+ plan->execlockrelsinfo_context = NULL;
+ CachedPlanSaveExecLockRelsInfos(plan, execlockrelsinfo_list);
+ Assert(MemoryContextIsValid(plan->execlockrelsinfo_context));
+
/* assign generation number to new plan */
plan->generation = ++(plansource->generation);
@@ -1160,7 +1201,7 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
if (!customplan)
{
- if (CheckCachedPlan(plansource))
+ if (CheckCachedPlan(plansource, boundParams))
{
/* We want a generic plan, and we already have a valid one */
plan = plansource->gplan;
@@ -1586,6 +1627,49 @@ CopyCachedPlan(CachedPlanSource *plansource)
return newsource;
}
+/*
+ * CachedPlanSaveExecLockRelsInfos
+ * Save the list containing ExecLockRelsInfo nodes into the given
+ * CachedPlan
+ *
+ * The provided list is copied into a dedicated context that is a child of
+ * plan->context. If the child context already exists, it is emptied, because
+ * any ExecLockRelsInfo contained therein would no longer be useful.
+ */
+static void
+CachedPlanSaveExecLockRelsInfos(CachedPlan *plan, List *execlockrelsinfo_list)
+{
+ MemoryContext execlockrelsinfo_context = plan->execlockrelsinfo_context,
+ oldcontext = CurrentMemoryContext;
+ List *execlockrelsinfo_list_copy;
+
+ /*
+ * Set up the dedicated context if not already done, saving it as a child
+ * of the CachedPlan's context.
+ */
+ if (execlockrelsinfo_context == NULL)
+ {
+ execlockrelsinfo_context = AllocSetContextCreate(CurrentMemoryContext,
+ "CachedPlan execlockrelsinfo list",
+ ALLOCSET_START_SMALL_SIZES);
+ MemoryContextSetParent(execlockrelsinfo_context, plan->context);
+ MemoryContextSetIdentifier(execlockrelsinfo_context, plan->context->ident);
+ plan->execlockrelsinfo_context = execlockrelsinfo_context;
+ }
+ else
+ {
+ /* Just clear existing contents by resetting the context. */
+ Assert(MemoryContextIsValid(execlockrelsinfo_context));
+ MemoryContextReset(execlockrelsinfo_context);
+ }
+
+ MemoryContextSwitchTo(execlockrelsinfo_context);
+ execlockrelsinfo_list_copy = copyObject(execlockrelsinfo_list);
+ MemoryContextSwitchTo(oldcontext);
+
+ plan->execlockrelsinfo_list = execlockrelsinfo_list_copy;
+}
+
/*
* CachedPlanIsValid: test whether the rewritten querytree within a
* CachedPlanSource is currently valid (that is, not marked as being in need
@@ -1737,17 +1821,21 @@ QueryListGetPrimaryStmt(List *stmts)
/*
* AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
- * or release them if acquire is false.
+ *
+ * Returns a list containing one element for each PlannedStmt in stmt_list:
+ * an ExecLockRelsInfo node, or NULL if the statement is a utility statement
+ * or its containsInitialPruning is false.
*/
-static void
-AcquireExecutorLocks(List *stmt_list, bool acquire)
+static List *
+AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams)
{
ListCell *lc1;
+ List *execlockrelsinfo_list = NIL;
foreach(lc1, stmt_list)
{
PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
- ListCell *lc2;
+ ExecLockRelsInfo *execlockrelsinfo = NULL;
if (plannedstmt->commandType == CMD_UTILITY)
{
@@ -1761,27 +1849,139 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
Query *query = UtilityContainsQuery(plannedstmt->utilityStmt);
if (query)
- ScanQueryForLocks(query, acquire);
- continue;
+ ScanQueryForLocks(query, true);
}
-
- foreach(lc2, plannedstmt->rtable)
+ else
{
- RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
+ /*
+ * Figure out the set of relations that would need to be locked
+ * before executing the plan.
+ */
+ if (!plannedstmt->containsInitialPruning)
+ {
+ /*
+ * If the plan contains no initial pruning steps, just lock
+ * all the relations found in the range table.
+ */
+ ListCell *lc;
- if (rte->rtekind != RTE_RELATION)
- continue;
+ foreach(lc, plannedstmt->rtable)
+ {
+ RangeTblEntry *rte = lfirst(lc);
+
+ if (rte->rtekind != RTE_RELATION)
+ continue;
+
+ /*
+ * Acquire the appropriate type of lock on each relation
+ * OID. Note that we don't actually try to open the rel,
+ * and hence will not fail if it's been dropped entirely
+ * --- we'll just transiently acquire a non-conflicting
+ * lock.
+ */
+ LockRelationOid(rte->relid, rte->rellockmode);
+ }
+ }
+ else
+ {
+ int rti;
+ Bitmapset *lockrels;
+
+ /*
+ * Walk the plan tree to find only the minimal set of
+ * relations to be locked, considering the effect of performing
+ * initial partition pruning.
+ */
+ execlockrelsinfo = ExecutorGetLockRels(plannedstmt, boundParams);
+ lockrels = execlockrelsinfo->lockrels;
+
+ rti = -1;
+ while ((rti = bms_next_member(lockrels, rti)) >= 0)
+ {
+ RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
+ Assert(rte->rtekind == RTE_RELATION);
+
+ /* See the comment above. */
+ LockRelationOid(rte->relid, rte->rellockmode);
+ }
+ }
+ }
+
+ /*
+ * Remember ExecLockRelsInfo for later adding to the QueryDesc that
+ * will be passed to the executor when executing this plan. May be
+ * NULL, but must keep the list the same length as stmt_list.
+ */
+ execlockrelsinfo_list = lappend(execlockrelsinfo_list,
+ execlockrelsinfo);
+ }
+
+ return execlockrelsinfo_list;
+}
+
+/*
+ * ReleaseExecutorLocks
+ * Release locks that would've been acquired by an earlier call to
+ * AcquireExecutorLocks()
+ */
+static void
+ReleaseExecutorLocks(List *stmt_list, List *execlockrelsinfo_list)
+{
+ ListCell *lc1,
+ *lc2;
+
+ forboth(lc1, stmt_list, lc2, execlockrelsinfo_list)
+ {
+ PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
+ ExecLockRelsInfo *execlockrelsinfo = lfirst_node(ExecLockRelsInfo, lc2);
+
+ if (plannedstmt->commandType == CMD_UTILITY)
+ {
/*
- * Acquire the appropriate type of lock on each relation OID. Note
- * that we don't actually try to open the rel, and hence will not
- * fail if it's been dropped entirely --- we'll just transiently
- * acquire a non-conflicting lock.
+ * Ignore utility statements, except those (such as EXPLAIN) that
+ * contain a parsed-but-not-planned query. Note: it's okay to use
+ * ScanQueryForLocks, even though the query hasn't been through
+ * rule rewriting, because rewriting doesn't change the query
+ * representation.
*/
- if (acquire)
- LockRelationOid(rte->relid, rte->rellockmode);
+ Query *query = UtilityContainsQuery(plannedstmt->utilityStmt);
+
+ if (query)
+ ScanQueryForLocks(query, false);
+ }
+ else
+ {
+ if (execlockrelsinfo == NULL)
+ {
+ ListCell *lc;
+
+ foreach(lc, plannedstmt->rtable)
+ {
+ RangeTblEntry *rte = lfirst(lc);
+
+ if (rte->rtekind != RTE_RELATION)
+ continue;
+
+ UnlockRelationOid(rte->relid, rte->rellockmode);
+ }
+ }
else
- UnlockRelationOid(rte->relid, rte->rellockmode);
+ {
+ int rti;
+ Bitmapset *lockrels;
+
+ lockrels = execlockrelsinfo->lockrels;
+ rti = -1;
+ while ((rti = bms_next_member(lockrels, rti)) >= 0)
+ {
+ RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
+
+ Assert(rte->rtekind == RTE_RELATION);
+
+ UnlockRelationOid(rte->relid, rte->rellockmode);
+ }
+ }
}
}
}
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index d549f66d4a..896f51be08 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -285,6 +285,7 @@ PortalDefineQuery(Portal portal,
const char *sourceText,
CommandTag commandTag,
List *stmts,
+ List *execlockrelsinfos,
CachedPlan *cplan)
{
AssertArg(PortalIsValid(portal));
@@ -299,6 +300,7 @@ PortalDefineQuery(Portal portal,
portal->qc.nprocessed = 0;
portal->commandTag = commandTag;
portal->stmts = stmts;
+ portal->execlockrelsinfos = execlockrelsinfos;
portal->cplan = cplan;
portal->status = PORTAL_DEFINED;
}
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 666977fb1f..fef75ba147 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -87,7 +87,8 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
ExplainState *es, const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv);
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt, ExecLockRelsInfo *execlockrelsinfo,
+ IntoClause *into,
ExplainState *es, const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv,
const instr_time *planduration,
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index fd5735a946..ded19b8cbb 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -124,4 +124,6 @@ extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
PartitionPruneInfo *pruneinfo,
Bitmapset **initially_valid_subplans);
extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate);
+extern Bitmapset *ExecGetLockRelsDoInitialPruning(Plan *plan, ExecGetLockRelsContext *context,
+ PartitionPruneInfo *pruneinfo);
#endif /* EXECPARTITION_H */
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index e79e2c001f..4338463479 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -35,6 +35,7 @@ typedef struct QueryDesc
/* These fields are provided by CreateQueryDesc */
CmdType operation; /* CMD_SELECT, CMD_UPDATE, etc. */
PlannedStmt *plannedstmt; /* planner's output (could be utility, too) */
+ ExecLockRelsInfo *execlockrelsinfo; /* ExecutorGetLockRels()'s output given plannedstmt */
const char *sourceText; /* source text of the query */
Snapshot snapshot; /* snapshot to use for query */
Snapshot crosscheck_snapshot; /* crosscheck for RI update/delete */
@@ -57,6 +58,7 @@ typedef struct QueryDesc
/* in pquery.c */
extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
+ ExecLockRelsInfo *execlockrelsinfo,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index 82925b4b63..5cf414cc11 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -185,6 +185,8 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
/*
* prototypes from functions in execMain.c
*/
+extern ExecLockRelsInfo *ExecutorGetLockRels(PlannedStmt *plannedstmt, ParamListInfo params);
+extern bool ExecGetLockRels(Plan *node, ExecGetLockRelsContext *context);
extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
extern void ExecutorRun(QueryDesc *queryDesc,
diff --git a/src/include/executor/nodeAppend.h b/src/include/executor/nodeAppend.h
index 4cb78ee5b6..b53535c2a4 100644
--- a/src/include/executor/nodeAppend.h
+++ b/src/include/executor/nodeAppend.h
@@ -17,6 +17,7 @@
#include "access/parallel.h"
#include "nodes/execnodes.h"
+extern bool ExecGetAppendLockRels(Append *node, ExecGetLockRelsContext *context);
extern AppendState *ExecInitAppend(Append *node, EState *estate, int eflags);
extern void ExecEndAppend(AppendState *node);
extern void ExecReScanAppend(AppendState *node);
diff --git a/src/include/executor/nodeMergeAppend.h b/src/include/executor/nodeMergeAppend.h
index 97fe3b0665..8eb4e9df93 100644
--- a/src/include/executor/nodeMergeAppend.h
+++ b/src/include/executor/nodeMergeAppend.h
@@ -16,6 +16,7 @@
#include "nodes/execnodes.h"
+extern bool ExecGetMergeAppendLockRels(MergeAppend *node, ExecGetLockRelsContext *context);
extern MergeAppendState *ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags);
extern void ExecEndMergeAppend(MergeAppendState *node);
extern void ExecReScanMergeAppend(MergeAppendState *node);
diff --git a/src/include/executor/nodeModifyTable.h b/src/include/executor/nodeModifyTable.h
index 1d225bc88d..5006499088 100644
--- a/src/include/executor/nodeModifyTable.h
+++ b/src/include/executor/nodeModifyTable.h
@@ -19,6 +19,7 @@ extern void ExecComputeStoredGenerated(ResultRelInfo *resultRelInfo,
EState *estate, TupleTableSlot *slot,
CmdType cmdtype);
+extern bool ExecGetModifyTableLockRels(ModifyTable *plan, ExecGetLockRelsContext *context);
extern ModifyTableState *ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags);
extern void ExecEndModifyTable(ModifyTableState *node);
extern void ExecReScanModifyTable(ModifyTableState *node);
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 44dd73fc80..1253fdb0ed 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -576,6 +576,7 @@ typedef struct EState
struct ExecRowMark **es_rowmarks; /* Array of per-range-table-entry
* ExecRowMarks, or NULL if none */
PlannedStmt *es_plannedstmt; /* link to top of plan tree */
+ struct ExecLockRelsInfo *es_execlockrelsinfo; /* QueryDesc.execlockrelsinfo */
const char *es_sourceText; /* Source text from QueryDesc */
JunkFilter *es_junkFilter; /* top-level junk filter, if any */
@@ -964,6 +965,101 @@ typedef struct DomainConstraintState
*/
typedef TupleTableSlot *(*ExecProcNodeMtd) (struct PlanState *pstate);
+/*----------------
+ * ExecLockRelsInfo
+ *
+ * Result of performing ExecutorGetLockRels() for a given PlannedStmt
+ */
+typedef struct ExecLockRelsInfo
+{
+ NodeTag type;
+
+ /*
+ * Relations that must be locked to execute the plan tree contained in
+ * the PlannedStmt.
+ */
+ Bitmapset *lockrels;
+
+ /* PlannedStmt.numPlanNodes */
+ int numPlanNodes;
+
+ /*
+ * List of PlanInitPruningOutput, each representing the output of
+ * performing initial pruning on a given plan node, for all nodes in the
+ * plan tree that have been marked as needing initial pruning.
+ *
+ * 'ipoIndexes' is an array of 'numPlanNodes' elements, indexed with
+ * plan_node_id of the individual nodes in the plan tree, each a 1-based
+ * index into 'initPruningOutputs' list for a given plan node. 0 means
+ * that a given plan node has no entry in the list because of not needing
+ * any initial pruning done on it.
+ */
+ List *initPruningOutputs;
+ int *ipoIndexes;
+} ExecLockRelsInfo;
+
+/*----------------
+ * ExecGetLockRelsContext
+ *
+ * Information pertaining to ExecutorGetLockRels() invocation for a given
+ * plan.
+ */
+typedef struct ExecGetLockRelsContext
+{
+ NodeTag type;
+
+ PlannedStmt *stmt; /* target plan */
+ ParamListInfo params; /* EXTERN parameters available for pruning */
+
+ /* Output parameters for ExecGetLockRels and its subroutines. */
+ Bitmapset *lockrels;
+
+ /* See the comment in the definition of ExecLockRelsInfo struct. */
+ List *initPruningOutputs;
+ int *ipoIndexes;
+} ExecGetLockRelsContext;
+
+/*
+ * Appends the provided PlanInitPruningOutput to
+ * ExecGetLockRelsContext.initPruningOutputs
+ */
+#define ExecStorePlanInitPruningOutput(cxt, initPruningOutput, plannode) \
+ do { \
+ (cxt)->initPruningOutputs = lappend((cxt)->initPruningOutputs, initPruningOutput); \
+ (cxt)->ipoIndexes[(plannode)->plan_node_id] = list_length((cxt)->initPruningOutputs); \
+ } while (0)
+
+/*
+ * Finds the PlanInitPruningOutput for a given Plan node in
+ * ExecLockRelsInfo.initPruningOutputs.
+ */
+#define ExecFetchPlanInitPruningOutput(execlockrelsinfo, plannode) \
+ (((execlockrelsinfo) != NULL && (execlockrelsinfo)->initPruningOutputs != NIL) ? \
+ list_nth((execlockrelsinfo)->initPruningOutputs, \
+ (execlockrelsinfo)->ipoIndexes[(plannode)->plan_node_id] - 1) : NULL)
+
+/* ---------------
+ * PlanInitPruningOutput
+ *
+ * Node to remember the result of performing initial partition pruning steps
+ * during ExecutorGetLockRels() on nodes that support pruning.
+ *
+ * ExecLockRelsDoInitPruning(), which runs during ExecutorGetLockRels(),
+ * creates it and stores it in the corresponding ExecLockRelsInfo.
+ *
+ * ExecInitPartitionPruning(), which runs during ExecutorStart(), fetches it
+ * from the EState's ExecLockRelsInfo (if any) and uses the value of
+ * initially_valid_subplans contained in it as-is to select the subplans to be
+ * initialized for execution, instead of re-evaluating that by performing
+ * initial pruning again.
+ */
+typedef struct PlanInitPruningOutput
+{
+ NodeTag type;
+
+ Bitmapset *initially_valid_subplans;
+} PlanInitPruningOutput;
+
/* ----------------
* PlanState node
*
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 05f0b79e82..00c4d8293e 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -96,6 +96,11 @@ typedef enum NodeTag
T_PartitionPruneStepCombine,
T_PlanInvalItem,
+ /* TAGS FOR EXECUTOR PREP NODES (execnodes.h) */
+ T_ExecGetLockRelsContext,
+ T_ExecLockRelsInfo,
+ T_PlanInitPruningOutput,
+
/*
* TAGS FOR PLAN STATE NODES (execnodes.h)
*
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 5327d9ba8b..019719c1a4 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -129,6 +129,10 @@ typedef struct PlannerGlobal
char maxParallelHazard; /* worst PROPARALLEL hazard level */
+ bool containsInitialPruning; /* Do some Plan nodes in the tree
+ * have initial (pre-exec) pruning
+ * steps? */
+
PartitionDirectory partition_directory; /* partition descriptors */
Bitmapset *elidedAppendPartedRels; /* Combined partitioned_rels of all
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index bd87c35d6c..bfdb5bbf28 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -59,10 +59,16 @@ typedef struct PlannedStmt
bool parallelModeNeeded; /* parallel mode required to execute? */
+ bool containsInitialPruning; /* Do some Plan nodes in the tree
+ * have initial (pre-exec) pruning
+ * steps? */
+
int jitFlags; /* which forms of JIT should be performed */
struct Plan *planTree; /* tree of Plan nodes */
+ int numPlanNodes; /* number of nodes in planTree */
+
List *rtable; /* list of RangeTblEntry nodes */
/* rtable indexes of target relations for INSERT/UPDATE/DELETE */
@@ -1189,6 +1195,13 @@ typedef struct PlanRowMark
* prune_infos List of Lists containing PartitionedRelPruneInfo nodes,
* one sublist per run-time-prunable partition hierarchy
* appearing in the parent plan node's subplans.
+ *
+ * needs_init_pruning Does any of the PartitionedRelPruneInfos in
+ * prune_infos have its initial_pruning_steps set?
+ *
+ * needs_exec_pruning Does any of the PartitionedRelPruneInfos in
+ * prune_infos have its exec_pruning_steps set?
+ *
* other_subplans Indexes of any subplans that are not accounted for
* by any of the PartitionedRelPruneInfo nodes in
* "prune_infos". These subplans must not be pruned.
@@ -1197,6 +1210,8 @@ typedef struct PartitionPruneInfo
{
NodeTag type;
List *prune_infos;
+ bool needs_init_pruning;
+ bool needs_exec_pruning;
Bitmapset *other_subplans;
} PartitionPruneInfo;
diff --git a/src/include/tcop/tcopprot.h b/src/include/tcop/tcopprot.h
index 92291a750d..bf80c53bed 100644
--- a/src/include/tcop/tcopprot.h
+++ b/src/include/tcop/tcopprot.h
@@ -64,7 +64,7 @@ extern PlannedStmt *pg_plan_query(Query *querytree, const char *query_string,
ParamListInfo boundParams);
extern List *pg_plan_queries(List *querytrees, const char *query_string,
int cursorOptions,
- ParamListInfo boundParams);
+ ParamListInfo boundParams, List **execlockrelsinfo_list);
extern bool check_max_stack_depth(int *newval, void **extra, GucSource source);
extern void assign_max_stack_depth(int newval, void *extra);
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 95b99e3d25..56b0dcc6bd 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -148,6 +148,9 @@ typedef struct CachedPlan
{
int magic; /* should equal CACHEDPLAN_MAGIC */
List *stmt_list; /* list of PlannedStmts */
+ List *execlockrelsinfo_list; /* list of ExecLockRelsInfo with one
+ * element for each of stmt_list; NIL
+ * if not a generic plan */
bool is_oneshot; /* is it a "oneshot" plan? */
bool is_saved; /* is CachedPlan in a long-lived context? */
bool is_valid; /* is the stmt_list currently valid? */
@@ -158,6 +161,9 @@ typedef struct CachedPlan
int generation; /* parent's generation number for this plan */
int refcount; /* count of live references to this struct */
MemoryContext context; /* context containing this CachedPlan */
+ MemoryContext execlockrelsinfo_context; /* context containing
+ * execlockrelsinfo_list,
+ * a child of the above context */
} CachedPlan;
/*
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index aeddbdafe5..9abace6734 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -137,6 +137,10 @@ typedef struct PortalData
CommandTag commandTag; /* command tag for original query */
QueryCompletion qc; /* command completion data for executed query */
List *stmts; /* list of PlannedStmts */
+ List *execlockrelsinfos; /* list of ExecLockRelsInfos with one element
+ * for each of 'stmts'; same as
+ * cplan->execlockrelsinfo_list if cplan is
+ * not NULL */
CachedPlan *cplan; /* CachedPlan, if stmts are from one */
ParamListInfo portalParams; /* params to pass to query */
@@ -241,6 +245,7 @@ extern void PortalDefineQuery(Portal portal,
const char *sourceText,
CommandTag commandTag,
List *stmts,
+ List *execlockrelsinfos,
CachedPlan *cplan);
extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
extern void PortalCreateHoldStore(Portal portal);
--
2.24.1
* Re: generic plans and "initial" pruning
@ 2022-03-31 03:25 Amit Langote <[email protected]>
parent: Amit Langote <[email protected]>
0 siblings, 2 replies; 71+ messages in thread
From: Amit Langote @ 2022-03-31 03:25 UTC (permalink / raw)
To: Robert Haas <[email protected]>; +Cc: Tom Lane <[email protected]>; pgsql-hackers; David Rowley *EXTERN* <[email protected]>
On Mon, Mar 28, 2022 at 4:28 PM Amit Langote <[email protected]> wrote:
> On Mon, Mar 28, 2022 at 4:17 PM Amit Langote <[email protected]> wrote:
> > Other than the changes mentioned above, the updated patch now contains
> > a bit more commentary than earlier versions, mostly around
> > AcquireExecutorLocks()'s new way of determining the set of relations
> > to lock and the significantly redesigned working of the "initial"
> > execution pruning.
>
> Forgot to rebase over the latest HEAD, so here's v7. Also fixed that
> _out and _read functions for PlanInitPruningOutput were using an
> obsolete node label.
Rebased.
--
Amit Langote
EDB: http://www.enterprisedb.com
Attachments:
[application/octet-stream] v8-0004-Optimize-AcquireExecutorLocks-to-skip-pruned-part.patch (94.3K, 2-v8-0004-Optimize-AcquireExecutorLocks-to-skip-pruned-part.patch)
From 9e0ae8887a9f3d75feb4df969dde504a21d3700d Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Wed, 22 Dec 2021 16:55:17 +0900
Subject: [PATCH v8 4/4] Optimize AcquireExecutorLocks() to skip pruned
partitions
Instead of locking all relations listed in the range table in the
cases where the PlannedStmt indicates that some nodes in the plan
tree can do partition pruning without depending on execution having
started (so called "initial" pruning), AcquireExecutorLocks() now
calls the new executor function ExecutorGetLockRels() which returns
a set of relations (their RT indexes) to be locked, not including
those scanned by the subplans that were pruned.
The result of pruning done this way must be remembered and reused
during actual execution of the plan. This is done by creating a
PlanInitPruningOutput node for each plan node that undergoes
pruning; the set of those for the whole plan tree is added to
ExecLockRelsInfo, which also stores the bitmapset of RT indexes of
relations that are actually locked by AcquireExecutorLocks().
ExecLockRelsInfos are passed down the executor alongside the
PlannedStmts. This arrangement ensures that the executor doesn't
accidentally try to process plan tree subnodes that have been
deemed pruned by AcquireExecutorLocks().
---
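Note for readers (not part of the patch): the commit message above says
that one PlanInitPruningOutput is kept per pruned plan node and looked up
again during execution. The standalone sketch below illustrates the
1-based index scheme that the execnodes.h hunk describes for
'ipoIndexes', where 0 means "no pruning output for this node". It does
not use PostgreSQL headers; PruneOutput, LockRelsInfo,
lockrels_info_create(), store_output() and fetch_output() are
hypothetical stand-ins for the patch's structs and macros, and a plain
array replaces the List. Unlike the fetch macro in the patch, the sketch
tests for a 0 index explicitly before subtracting 1; that is a choice
made for this illustration only.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for PlanInitPruningOutput (the real node holds a Bitmapset). */
typedef struct PruneOutput
{
    unsigned long valid_subplans;   /* bit i set => child subplan i survived pruning */
} PruneOutput;

/* Hypothetical stand-in for ExecLockRelsInfo, reduced to the lookup fields. */
typedef struct LockRelsInfo
{
    int           numPlanNodes;     /* number of nodes in the plan tree */
    int           numOutputs;       /* entries used in 'outputs' */
    PruneOutput **outputs;          /* plays the role of initPruningOutputs */
    int          *ipoIndexes;       /* plan_node_id -> 1-based index; 0 = none */
} LockRelsInfo;

static LockRelsInfo *
lockrels_info_create(int numPlanNodes)
{
    LockRelsInfo *info;

    assert(numPlanNodes > 0);
    info = calloc(1, sizeof(LockRelsInfo));
    if (info == NULL)
        abort();
    info->numPlanNodes = numPlanNodes;
    info->outputs = calloc((size_t) numPlanNodes, sizeof(PruneOutput *));
    /* calloc zero-fills, so every node starts out as "no pruning output" */
    info->ipoIndexes = calloc((size_t) numPlanNodes, sizeof(int));
    if (info->outputs == NULL || info->ipoIndexes == NULL)
        abort();
    return info;
}

/* Append 'out' and remember its 1-based position under plan_node_id. */
static void
store_output(LockRelsInfo *info, int plan_node_id, PruneOutput *out)
{
    assert(plan_node_id >= 0 && plan_node_id < info->numPlanNodes);
    assert(info->numOutputs < info->numPlanNodes);
    info->outputs[info->numOutputs++] = out;
    info->ipoIndexes[plan_node_id] = info->numOutputs;
}

/* Return the stored output for plan_node_id, or NULL if there is none. */
static PruneOutput *
fetch_output(const LockRelsInfo *info, int plan_node_id)
{
    int idx;

    if (info == NULL)
        return NULL;
    assert(plan_node_id >= 0 && plan_node_id < info->numPlanNodes);
    idx = info->ipoIndexes[plan_node_id];
    return (idx > 0) ? info->outputs[idx - 1] : NULL;
}
```

Storing the position plus one lets a zero-filled array double as the
"not pruned" marker, so no separate sentinel initialization is needed.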
src/backend/commands/copyto.c | 2 +-
src/backend/commands/createas.c | 2 +-
src/backend/commands/explain.c | 7 +-
src/backend/commands/extension.c | 13 +-
src/backend/commands/matview.c | 2 +-
src/backend/commands/portalcmds.c | 1 +
src/backend/commands/prepare.c | 17 +-
src/backend/executor/README | 24 +++
src/backend/executor/execMain.c | 202 ++++++++++++++++++++
src/backend/executor/execParallel.c | 26 ++-
src/backend/executor/execPartition.c | 224 ++++++++++++++++++----
src/backend/executor/execUtils.c | 8 +
src/backend/executor/functions.c | 2 +-
src/backend/executor/nodeAppend.c | 52 ++++-
src/backend/executor/nodeMergeAppend.c | 52 ++++-
src/backend/executor/nodeModifyTable.c | 25 +++
src/backend/executor/spi.c | 14 +-
src/backend/nodes/copyfuncs.c | 49 ++++-
src/backend/nodes/outfuncs.c | 39 ++++
src/backend/nodes/readfuncs.c | 37 ++++
src/backend/optimizer/plan/planner.c | 2 +
src/backend/optimizer/plan/setrefs.c | 6 +
src/backend/partitioning/partprune.c | 37 +++-
src/backend/tcop/postgres.c | 15 +-
src/backend/tcop/pquery.c | 21 ++-
src/backend/utils/cache/plancache.c | 252 ++++++++++++++++++++++---
src/backend/utils/mmgr/portalmem.c | 2 +
src/include/commands/explain.h | 3 +-
src/include/executor/execPartition.h | 2 +
src/include/executor/execdesc.h | 2 +
src/include/executor/executor.h | 2 +
src/include/executor/nodeAppend.h | 1 +
src/include/executor/nodeMergeAppend.h | 1 +
src/include/executor/nodeModifyTable.h | 1 +
src/include/nodes/execnodes.h | 96 ++++++++++
src/include/nodes/nodes.h | 5 +
src/include/nodes/pathnodes.h | 4 +
src/include/nodes/plannodes.h | 15 ++
src/include/tcop/tcopprot.h | 2 +-
src/include/utils/plancache.h | 6 +
src/include/utils/portal.h | 5 +
41 files changed, 1174 insertions(+), 104 deletions(-)
diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index 55c38b04c4..d403eb2309 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -542,7 +542,7 @@ BeginCopyTo(ParseState *pstate,
((DR_copy *) dest)->cstate = cstate;
/* Create a QueryDesc requesting no output */
- cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ cstate->queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
GetActiveSnapshot(),
InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 9abbb6b555..f6607f2454 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -325,7 +325,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index cb13227db1..e5dff2bc25 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -407,7 +407,7 @@ ExplainOneQuery(Query *query, int cursorOptions,
}
/* run it (if needed) and produce output */
- ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+ ExplainOnePlan(plan, NULL, into, es, queryString, params, queryEnv,
&planduration, (es->buffers ? &bufusage : NULL));
}
}
@@ -515,7 +515,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
* to call it.
*/
void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, ExecLockRelsInfo *execlockrelsinfo,
+ IntoClause *into, ExplainState *es,
const char *queryString, ParamListInfo params,
QueryEnvironment *queryEnv, const instr_time *planduration,
const BufferUsage *bufusage)
@@ -563,7 +564,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
dest = None_Receiver;
/* Create a QueryDesc for the query */
- queryDesc = CreateQueryDesc(plannedstmt, queryString,
+ queryDesc = CreateQueryDesc(plannedstmt, execlockrelsinfo, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, instrument_option);
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index 1013790dbb..008b8ce0e9 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -741,8 +741,10 @@ execute_sql_string(const char *sql)
RawStmt *parsetree = lfirst_node(RawStmt, lc1);
MemoryContext per_parsetree_context,
oldcontext;
- List *stmt_list;
- ListCell *lc2;
+ List *stmt_list,
+ *execlockrelsinfo_list;
+ ListCell *lc2,
+ *lc3;
/*
* We do the work for each parsetree in a short-lived context, to
@@ -762,11 +764,13 @@ execute_sql_string(const char *sql)
NULL,
0,
NULL);
- stmt_list = pg_plan_queries(stmt_list, sql, CURSOR_OPT_PARALLEL_OK, NULL);
+ stmt_list = pg_plan_queries(stmt_list, sql, CURSOR_OPT_PARALLEL_OK, NULL,
+ &execlockrelsinfo_list);
- foreach(lc2, stmt_list)
+ forboth(lc2, stmt_list, lc3, execlockrelsinfo_list)
{
PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+ ExecLockRelsInfo *execlockrelsinfo = lfirst_node(ExecLockRelsInfo, lc3);
CommandCounterIncrement();
@@ -777,6 +781,7 @@ execute_sql_string(const char *sql)
QueryDesc *qdesc;
qdesc = CreateQueryDesc(stmt,
+ execlockrelsinfo,
sql,
GetActiveSnapshot(), NULL,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 05e7b60059..4ef44aaf23 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -416,7 +416,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, queryString,
+ queryDesc = CreateQueryDesc(plan, NULL, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/portalcmds.c b/src/backend/commands/portalcmds.c
index 9902c5c566..85e73ddded 100644
--- a/src/backend/commands/portalcmds.c
+++ b/src/backend/commands/portalcmds.c
@@ -107,6 +107,7 @@ PerformCursorOpen(ParseState *pstate, DeclareCursorStmt *cstmt, ParamListInfo pa
queryString,
CMDTAG_SELECT, /* cursor's query is always a SELECT */
list_make1(plan),
+ list_make1(NULL), /* no ExecLockRelsInfo to pass */
NULL);
/*----------
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 80738547ed..bbbf8bbcbd 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -155,6 +155,7 @@ ExecuteQuery(ParseState *pstate,
PreparedStatement *entry;
CachedPlan *cplan;
List *plan_list;
+ List *plan_execlockrelsinfo_list;
ParamListInfo paramLI = NULL;
EState *estate = NULL;
Portal portal;
@@ -195,6 +196,7 @@ ExecuteQuery(ParseState *pstate,
/* Replan if needed, and increment plan refcount for portal */
cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL);
plan_list = cplan->stmt_list;
+ plan_execlockrelsinfo_list = cplan->execlockrelsinfo_list;
/*
* DO NOT add any logic that could possibly throw an error between
@@ -204,7 +206,7 @@ ExecuteQuery(ParseState *pstate,
NULL,
query_string,
entry->plansource->commandTag,
- plan_list,
+ plan_list, plan_execlockrelsinfo_list,
cplan);
/*
@@ -576,7 +578,9 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
const char *query_string;
CachedPlan *cplan;
List *plan_list;
- ListCell *p;
+ List *plan_execlockrelsinfo_list;
+ ListCell *p,
+ *pe;
ParamListInfo paramLI = NULL;
EState *estate = NULL;
instr_time planstart;
@@ -632,15 +636,18 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
}
plan_list = cplan->stmt_list;
+ plan_execlockrelsinfo_list = cplan->execlockrelsinfo_list;
/* Explain each query */
- foreach(p, plan_list)
+ forboth(p, plan_list, pe, plan_execlockrelsinfo_list)
{
PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
+ ExecLockRelsInfo *execlockrelsinfo = lfirst_node(ExecLockRelsInfo, pe);
if (pstmt->commandType != CMD_UTILITY)
- ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv,
- &planduration, (es->buffers ? &bufusage : NULL));
+ ExplainOnePlan(pstmt, execlockrelsinfo, into, es, query_string,
+ paramLI, queryEnv, &planduration,
+ (es->buffers ? &bufusage : NULL));
else
ExplainOneUtility(pstmt->utilityStmt, into, es, query_string,
paramLI, queryEnv);
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 0b5183fc4a..b45ca508a8 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -65,6 +65,27 @@ found there. This currently only occurs for Append and MergeAppend nodes. In
this case the non-required subplans are ignored and the executor state's
subnode array will become out of sequence to the plan's subplan list.
+Actually, the so-called execution time pruning may also occur even before the
+execution has started. One case where that occurs is when a cached generic
+plan is being validated for execution by plancache.c: GetCachedPlan(), which
+proceeds by locking all the relations that will be scanned by that plan. If
+the generic plan has nodes that contain so-called initial pruning steps (a
+subset of execution pruning steps that do not depend on full-fledged execution
+having started), they are performed at this point to figure out the minimal
+set of child subplans that satisfy those pruning instructions and the result
+of performing that pruning is saved in a data structure that gets passed to
+the executor alongside the plan tree. Relations scanned by only those
+surviving subplans are then locked while those scanned by the pruned subplans
+are not, even though the pruned subplans themselves are not removed from the
+plan tree. So, it is imperative that the executor and any third party code
+invoked by it that gets passed the plan tree look at the initial pruning result
+made available via the aforementioned data structure to determine whether or
+not a particular subplan is valid. (The data structure basically consists of
+an array of PlanInitPruningOutput nodes containing one element for each node
+of the plan tree indexable using plan_node_id of the individual plan nodes,
+where each node contains a bitmapset of indexes of unpruned child subplans of
+a given node.)
+
Each Plan node may have expression trees associated with it, to represent
its target list, qualification conditions, etc. These trees are also
read-only to the executor, but the executor state for expression evaluation
@@ -286,6 +307,9 @@ Query Processing Control Flow
This is a sketch of control flow for full query processing:
+ [ ExecutorGetLockRels ] --- an optional step to walk over the plan tree
+ to produce an ExecLockRelsInfo to be passed to CreateQueryDesc
+
CreateQueryDesc
ExecutorStart
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index ef2fd46092..56946c12dd 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -49,11 +49,15 @@
#include "commands/matview.h"
#include "commands/trigger.h"
#include "executor/execdebug.h"
+#include "executor/nodeAppend.h"
+#include "executor/nodeMergeAppend.h"
+#include "executor/nodeModifyTable.h"
#include "executor/nodeSubplan.h"
#include "foreign/fdwapi.h"
#include "jit/jit.h"
#include "mb/pg_wchar.h"
#include "miscadmin.h"
+#include "nodes/nodeFuncs.h"
#include "parser/parsetree.h"
#include "storage/bufmgr.h"
#include "storage/lmgr.h"
@@ -101,9 +105,205 @@ static char *ExecBuildSlotValueDescription(Oid reloid,
Bitmapset *modifiedCols,
int maxfieldlen);
static void EvalPlanQualStart(EPQState *epqstate, Plan *planTree);
+static bool ExecGetScanLockRels(Scan *scan, ExecGetLockRelsContext *context);
/* end of local decls */
+/* ----------------------------------------------------------------
+ * ExecutorGetLockRels
+ *
+ * Figure out the minimal set of relations to lock to be able to safely
+ * execute a given plan
+ *
+ * This ignores the relations scanned by child subplans that are pruned away
+ * after performing initial pruning steps present in the plan using the
+ * provided set of EXTERN parameters.
+ *
+ * Along with the set of RT indexes of relations that must be locked, the
+ * returned struct also contains an array of PlanInitPruningOutput nodes each
+ * of which contains the result of initial pruning for a given plan node, which
+ * is basically a bitmapset of the indexes of surviving child subplans. Each
+ * plan node in the tree that undergoes pruning will have an element in the
+ * array.
+ *
+ * Note that while relations scanned by the subplans that are pruned will not
+ * be locked, the subplans themselves are left as-is in the plan tree, assuming
+ * anything that reads the plan tree during execution knows to ignore them by
+ * looking at the PlanInitPruningOutput's list of valid subplans.
+ *
+ * Partitioned tables mentioned in PartitionedRelPruneInfo nodes that drive
+ * the pruning will be locked before doing the pruning and also added to
+ * the returned set.
+ */
+ExecLockRelsInfo *
+ExecutorGetLockRels(PlannedStmt *plannedstmt, ParamListInfo params)
+{
+ int numPlanNodes = plannedstmt->numPlanNodes;
+ ExecGetLockRelsContext context;
+ ExecLockRelsInfo *result;
+ ListCell *lc;
+
+ /* Only get here if there is any pruning to do. */
+ Assert(plannedstmt->containsInitialPruning);
+
+ context.stmt = plannedstmt;
+ context.params = params;
+
+ /*
+ * Go walk all the plan tree(s) present in the PlannedStmt, filling
+ * context.lockrels with only the relations from plan nodes that
+ * survive initial pruning and also the tables mentioned in
+ * partitioned_rels sets found in the plan.
+ */
+ context.lockrels = NULL;
+ context.initPruningOutputs = NIL;
+ context.ipoIndexes = palloc0(sizeof(int) * numPlanNodes);
+
+ /* All the subplans. */
+ foreach(lc, plannedstmt->subplans)
+ {
+ Plan *subplan = lfirst(lc);
+
+ (void) ExecGetLockRels(subplan, &context);
+ }
+
+ /* And the main tree. */
+ (void) ExecGetLockRels(plannedstmt->planTree, &context);
+
+ /*
+ * Also be sure to lock partitioned relations from any [Merge]Append nodes
+ * that were originally present but were ultimately left out from the plan
+ * due to being deemed no-op nodes.
+ */
+ context.lockrels = bms_add_members(context.lockrels,
+ plannedstmt->elidedAppendPartedRels);
+
+ result = makeNode(ExecLockRelsInfo);
+ result->lockrels = context.lockrels;
+ result->numPlanNodes = numPlanNodes;
+ result->initPruningOutputs = context.initPruningOutputs;
+ result->ipoIndexes = context.ipoIndexes;
+
+ return result;
+}
+
+/* ------------------------------------------------------------------------
+ * ExecGetLockRels
+ * Adds all the relations that will be scanned by 'node' and its child
+ * plans to context->lockrels after taking into account the effect
+ * of performing initial pruning if any
+ *
+ * context->stmt gives the PlannedStmt being inspected to access the plan's
+ * range table if needed and context->params the set of EXTERN parameters
+ * available to evaluate pruning parameters.
+ *
+ * If initial pruning is done, a PlanInitPruningOutput node containing the
+ * result of pruning will be stored in context->initPruningOutputs that will
+ * be made available to the executor to reuse.
+ * ------------------------------------------------------------------------
+ */
+bool
+ExecGetLockRels(Plan *node, ExecGetLockRelsContext *context)
+{
+ /* Do nothing when we get past a leaf of the tree. */
+ if (node == NULL)
+ return true;
+
+ /* Make sure there's enough stack available. */
+ check_stack_depth();
+
+ switch (nodeTag(node))
+ {
+ /* Currently, only these two nodes have prunable child subplans. */
+ case T_Append:
+ if (ExecGetAppendLockRels((Append *) node, context))
+ return true;
+ break;
+ case T_MergeAppend:
+ if (ExecGetMergeAppendLockRels((MergeAppend *) node,
+ context))
+ return true;
+ break;
+
+ /*
+ * And these manipulate relations that must be added to context->lockrels.
+ */
+ case T_SeqScan:
+ case T_SampleScan:
+ case T_IndexScan:
+ case T_IndexOnlyScan:
+ case T_BitmapIndexScan:
+ case T_BitmapHeapScan:
+ case T_TidScan:
+ case T_TidRangeScan:
+ case T_ForeignScan:
+ case T_SubqueryScan:
+ case T_CustomScan:
+ if (ExecGetScanLockRels((Scan *) node, context))
+ return true;
+ break;
+ case T_ModifyTable:
+ if (ExecGetModifyTableLockRels((ModifyTable *) node, context))
+ return true;
+ /* plan_tree_walker() will visit the subplan (outerNode) */
+ break;
+
+ default:
+ break;
+ }
+
+ /* Recurse to subnodes. */
+ return plan_tree_walker(node, ExecGetLockRels, (void *) context);
+}
+
+/*
+ * ExecGetScanLockRels
+ * Do ExecGetLockRels()'s work for a leaf Scan node
+ */
+static bool
+ExecGetScanLockRels(Scan *scan, ExecGetLockRelsContext *context)
+{
+ switch (nodeTag(scan))
+ {
+ case T_ForeignScan:
+ {
+ ForeignScan *fscan = (ForeignScan *) scan;
+
+ context->lockrels = bms_add_members(context->lockrels,
+ fscan->fs_relids);
+ }
+ break;
+
+ case T_SubqueryScan:
+ {
+ SubqueryScan *sscan = (SubqueryScan *) scan;
+
+ (void) ExecGetLockRels((Plan *) sscan->subplan, context);
+ }
+ break;
+
+ case T_CustomScan:
+ {
+ CustomScan *cscan = (CustomScan *) scan;
+ ListCell *lc;
+
+ context->lockrels = bms_add_members(context->lockrels,
+ cscan->custom_relids);
+ foreach(lc, cscan->custom_plans)
+ {
+ (void) ExecGetLockRels((Plan *) lfirst(lc), context);
+ }
+ }
+ break;
+
+ default:
+ context->lockrels = bms_add_member(context->lockrels,
+ scan->scanrelid);
+ break;
+ }
+
+ return true;
+}
/* ----------------------------------------------------------------
* ExecutorStart
@@ -806,6 +1006,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
{
CmdType operation = queryDesc->operation;
PlannedStmt *plannedstmt = queryDesc->plannedstmt;
+ ExecLockRelsInfo *execlockrelsinfo = queryDesc->execlockrelsinfo;
Plan *plan = plannedstmt->planTree;
List *rangeTable = plannedstmt->rtable;
EState *estate = queryDesc->estate;
@@ -825,6 +1026,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
ExecInitRangeTable(estate, rangeTable);
estate->es_plannedstmt = plannedstmt;
+ estate->es_execlockrelsinfo = execlockrelsinfo;
/*
* Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index 9a0d5d59ef..fb6dbd298a 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -66,6 +66,7 @@
#define PARALLEL_KEY_QUERY_TEXT UINT64CONST(0xE000000000000008)
#define PARALLEL_KEY_JIT_INSTRUMENTATION UINT64CONST(0xE000000000000009)
#define PARALLEL_KEY_WAL_USAGE UINT64CONST(0xE00000000000000A)
+#define PARALLEL_KEY_EXECLOCKRELSINFO UINT64CONST(0xE00000000000000B)
#define PARALLEL_TUPLE_QUEUE_SIZE 65536
@@ -182,6 +183,7 @@ ExecSerializePlan(Plan *plan, EState *estate)
pstmt->transientPlan = false;
pstmt->dependsOnRole = false;
pstmt->parallelModeNeeded = false;
+ pstmt->containsInitialPruning = false;
pstmt->planTree = plan;
pstmt->rtable = estate->es_range_table;
pstmt->resultRelations = NIL;
@@ -596,12 +598,15 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
FixedParallelExecutorState *fpes;
char *pstmt_data;
char *pstmt_space;
+ char *execlockrelsinfo_data;
+ char *execlockrelsinfo_space;
char *paramlistinfo_space;
BufferUsage *bufusage_space;
WalUsage *walusage_space;
SharedExecutorInstrumentation *instrumentation = NULL;
SharedJitInstrumentation *jit_instrumentation = NULL;
int pstmt_len;
+ int execlockrelsinfo_len;
int paramlistinfo_len;
int instrumentation_len = 0;
int jit_instrumentation_len = 0;
@@ -630,6 +635,7 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
/* Fix up and serialize plan to be sent to workers. */
pstmt_data = ExecSerializePlan(planstate->plan, estate);
+ execlockrelsinfo_data = nodeToString(estate->es_execlockrelsinfo);
/* Create a parallel context. */
pcxt = CreateParallelContext("postgres", "ParallelQueryMain", nworkers);
@@ -656,6 +662,11 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
shm_toc_estimate_chunk(&pcxt->estimator, pstmt_len);
shm_toc_estimate_keys(&pcxt->estimator, 1);
+ /* Estimate space for serialized ExecLockRelsInfo. */
+ execlockrelsinfo_len = strlen(execlockrelsinfo_data) + 1;
+ shm_toc_estimate_chunk(&pcxt->estimator, execlockrelsinfo_len);
+ shm_toc_estimate_keys(&pcxt->estimator, 1);
+
/* Estimate space for serialized ParamListInfo. */
paramlistinfo_len = EstimateParamListSpace(estate->es_param_list_info);
shm_toc_estimate_chunk(&pcxt->estimator, paramlistinfo_len);
@@ -750,6 +761,12 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
memcpy(pstmt_space, pstmt_data, pstmt_len);
shm_toc_insert(pcxt->toc, PARALLEL_KEY_PLANNEDSTMT, pstmt_space);
+ /* Store serialized ExecLockRelsInfo */
+ execlockrelsinfo_space = shm_toc_allocate(pcxt->toc, execlockrelsinfo_len);
+ memcpy(execlockrelsinfo_space, execlockrelsinfo_data, execlockrelsinfo_len);
+ shm_toc_insert(pcxt->toc, PARALLEL_KEY_EXECLOCKRELSINFO,
+ execlockrelsinfo_space);
+
/* Store serialized ParamListInfo. */
paramlistinfo_space = shm_toc_allocate(pcxt->toc, paramlistinfo_len);
shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARAMLISTINFO, paramlistinfo_space);
@@ -1231,8 +1248,10 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
int instrument_options)
{
char *pstmtspace;
+ char *execlockrelsinfospace;
char *paramspace;
PlannedStmt *pstmt;
+ ExecLockRelsInfo *execlockrelsinfo;
ParamListInfo paramLI;
char *queryString;
@@ -1243,12 +1262,17 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
pstmtspace = shm_toc_lookup(toc, PARALLEL_KEY_PLANNEDSTMT, false);
pstmt = (PlannedStmt *) stringToNode(pstmtspace);
+ /* Reconstruct leader-supplied ExecLockRelsInfo. */
+ execlockrelsinfospace = shm_toc_lookup(toc, PARALLEL_KEY_EXECLOCKRELSINFO,
+ false);
+ execlockrelsinfo = (ExecLockRelsInfo *) stringToNode(execlockrelsinfospace);
+
/* Reconstruct ParamListInfo. */
paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
paramLI = RestoreParamList(&paramspace);
/* Create a QueryDesc for the query. */
- return CreateQueryDesc(pstmt,
+ return CreateQueryDesc(pstmt, execlockrelsinfo,
queryString,
GetActiveSnapshot(), InvalidSnapshot,
receiver, paramLI, NULL, instrument_options);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 84b4e4b3d6..e79ada16f0 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -25,6 +25,7 @@
#include "mb/pg_wchar.h"
#include "miscadmin.h"
#include "nodes/makefuncs.h"
+#include "parser/parsetree.h"
#include "partitioning/partbounds.h"
#include "partitioning/partdesc.h"
#include "partitioning/partprune.h"
@@ -185,8 +186,13 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
static PartitionPruneState *ExecCreatePartitionPruneState(PlanState *planstate,
- PartitionPruneInfo *partitionpruneinfo);
-static Bitmapset *ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate);
+ PartitionPruneInfo *partitionpruneinfo,
+ bool consider_initial_steps,
+ bool consider_exec_steps,
+ List *rtable, ExprContext *econtext,
+ PartitionDirectory partdir);
+static Bitmapset *ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate,
+ PartitionPruneInfo *pruneinfo);
static void ExecInitPruningContext(PartitionPruneContext *context,
List *pruning_steps,
PartitionDesc partdesc,
@@ -1588,8 +1594,9 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
* considered to be a stable expression, it can change value from one plan
* node scan to the next during query execution. Stable comparison
* expressions that don't involve such Params allow partition pruning to be
- * done once during executor startup. Expressions that do involve such Params
- * require us to prune separately for each scan of the parent plan node.
+ * done once during executor startup or even earlier, during ExecutorGetLockRels().
+ * Expressions that do involve such Params require us to prune separately for
+ * each scan of the parent plan node.
*
* Note that pruning away unneeded subplans during executor startup has the
* added benefit of not having to initialize the unneeded subplans at all.
@@ -1601,10 +1608,17 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
* Creates the PartitionPruneState required by each of the two pruning
* functions. Details stored include how to map the partition index
* returned by the partition pruning code into subplan indexes. Also
- * determines the set of initially valid subplans by performing initial
- * pruning steps, only which need be initialized by the caller such as
- * ExecInitAppend. Maps in PartitionPruneState are updated to account
- * for initial pruning having eliminated some of the subplans, if any.
+ * determines the set of initially valid subplans, either by looking it
+ * up in the plan node's PlanInitPruningOutput, if one is found in
+ * EState.es_execlockrelsinfo, or by performing initial pruning steps.
+ * Only the subplans in that set need be initialized by the caller,
+ * such as ExecInitAppend. Maps in PartitionPruneState are updated to
+ * account for initial pruning having eliminated some of the subplans,
+ * if any.
+ *
+ * ExecGetLockRelsDoInitialPruning:
+ * Do initial pruning as part of ExecGetLockRels() on the parent plan
+ * node
*
* ExecFindMatchingSubPlans:
* Returns indexes of matching subplans after evaluating all available
@@ -1619,9 +1633,10 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
* ExecInitPartitionPruning
* Initialize data structure needed for run-time partition pruning
*
- * Initial pruning can be done immediately, so it is done here if needed and
- * the set of surviving partition subplans' indexes are added to the output
- * parameter *initially_valid_subplans.
+ * Initial pruning can be done immediately, so it is done here unless it has
+ * already been done by ExecGetLockRelsDoInitialPruning(), and the set of
+ * surviving partition subplans' indexes are added to the output parameter
+ * *initially_valid_subplans.
*
* If subplans are indeed pruned, subplan_map arrays contained in the returned
* PartitionPruneState are re-sequenced to not count those, though only if the
@@ -1635,22 +1650,57 @@ ExecInitPartitionPruning(PlanState *planstate,
{
PartitionPruneState *prunestate;
EState *estate = planstate->state;
+ Plan *plan = planstate->plan;
+ PlanInitPruningOutput *initPruningOutput = NULL;
+ bool do_pruning = (pruneinfo->needs_init_pruning ||
+ pruneinfo->needs_exec_pruning);
- /* We may need an expression context to evaluate partition exprs */
- ExecAssignExprContext(estate, planstate);
+ /* Retrieve the parent plan's PlanInitPruningOutput, if any. */
+ if (estate->es_execlockrelsinfo)
+ {
+ initPruningOutput = (PlanInitPruningOutput *)
+ ExecFetchPlanInitPruningOutput(estate->es_execlockrelsinfo, plan);
- /*
- * Create the working data structure for pruning.
- */
- prunestate = ExecCreatePartitionPruneState(planstate, pruneinfo);
+ Assert(initPruningOutput != NULL &&
+ IsA(initPruningOutput, PlanInitPruningOutput));
+ /* No need to do initial pruning again, only exec pruning. */
+ do_pruning = pruneinfo->needs_exec_pruning;
+ }
+
+ prunestate = NULL;
+ if (do_pruning)
+ {
+ /* We may need an expression context to evaluate partition exprs */
+ ExecAssignExprContext(estate, planstate);
+
+ /* For data reading, executor always omits detached partitions */
+ if (estate->es_partition_directory == NULL)
+ estate->es_partition_directory =
+ CreatePartitionDirectory(estate->es_query_cxt, false);
+
+ /*
+ * Create the working data structure for pruning. No need to consider
+ * initial pruning steps if we have a PlanInitPruningOutput.
+ */
+ prunestate = ExecCreatePartitionPruneState(planstate, pruneinfo,
+ initPruningOutput == NULL, true,
+ NIL, planstate->ps_ExprContext,
+ estate->es_partition_directory);
+ }
/*
* Perform an initial partition prune, if required.
*/
- if (prunestate->do_initial_prune)
+ if (initPruningOutput)
+ {
+ /* ExecGetLockRelsDoInitialPruning() already did it for us! */
+ *initially_valid_subplans = initPruningOutput->initially_valid_subplans;
+ }
+ else if (prunestate && prunestate->do_initial_prune)
{
/* Determine which subplans survive initial pruning */
- *initially_valid_subplans = ExecFindInitialMatchingSubPlans(prunestate);
+ *initially_valid_subplans = ExecFindInitialMatchingSubPlans(prunestate,
+ pruneinfo);
}
else
{
@@ -1668,7 +1718,7 @@ ExecInitPartitionPruning(PlanState *planstate,
* invalid data in prunestate, because that data won't be consulted again
* (cf initial Assert in ExecFindMatchingSubPlans).
*/
- if (prunestate->do_exec_prune &&
+ if (prunestate && prunestate->do_exec_prune &&
bms_num_members(*initially_valid_subplans) < n_total_subplans)
PartitionPruneStateFixSubPlanMap(prunestate,
*initially_valid_subplans,
@@ -1677,12 +1727,75 @@ ExecInitPartitionPruning(PlanState *planstate,
return prunestate;
}
+/*
+ * ExecGetLockRelsDoInitialPruning
+ * Perform initial pruning as part of doing ExecGetLockRels() on the parent
+ * plan node
+ */
+Bitmapset *
+ExecGetLockRelsDoInitialPruning(Plan *plan, ExecGetLockRelsContext *context,
+ PartitionPruneInfo *pruneinfo)
+{
+ List *rtable = context->stmt->rtable;
+ ParamListInfo params = context->params;
+ ExprContext *econtext;
+ PartitionDirectory pdir;
+ MemoryContext oldcontext,
+ tmpcontext;
+ PartitionPruneState *prunestate;
+ PlanInitPruningOutput *initPruningOutput;
+
+ /*
+ * A temporary context to allocate stuff needed to run the pruning steps.
+ */
+ tmpcontext = AllocSetContextCreate(CurrentMemoryContext,
+ "initial pruning working data",
+ ALLOCSET_DEFAULT_SIZES);
+ oldcontext = MemoryContextSwitchTo(tmpcontext);
+
+ /*
+ * PartitionDirectory to look up partition descriptors, which omits
+ * detached partitions, just like in the executor proper.
+ */
+ pdir = CreatePartitionDirectory(CurrentMemoryContext, false);
+
+ /*
+ * We don't yet have a PlanState for the parent plan node, so must create
+ * a standalone ExprContext to evaluate pruning expressions, equipped with
+ * the information about the EXTERN parameters that the caller passed us.
+ * Note that that's okay because the initial pruning steps do not contain
+ * anything that requires the execution to have started.
+ */
+ econtext = CreateStandaloneExprContext();
+ econtext->ecxt_param_list_info = params;
+ prunestate = ExecCreatePartitionPruneState(NULL, pruneinfo,
+ true, false,
+ rtable, econtext,
+ pdir);
+ MemoryContextSwitchTo(oldcontext);
+
+ /* Do the pruning and populate a PlanInitPruningOutput for this node. */
+ initPruningOutput = makeNode(PlanInitPruningOutput);
+ initPruningOutput->initially_valid_subplans =
+ ExecFindInitialMatchingSubPlans(prunestate, pruneinfo);
+ ExecStorePlanInitPruningOutput(context, initPruningOutput, plan);
+
+ FreeExprContext(econtext, true);
+ DestroyPartitionDirectory(pdir);
+ MemoryContextDelete(tmpcontext);
+
+ return initPruningOutput->initially_valid_subplans;
+}
+
/*
* ExecCreatePartitionPruneState
* Build the data structure required for calling
* ExecFindInitialMatchingSubPlans and ExecFindMatchingSubPlans.
*
- * 'planstate' is the parent plan node's execution state.
+ * 'planstate', if not NULL, is the parent plan node's execution state. It
+ * can be NULL if being called before ExecutorStart(), in which case,
+ * 'rtable' (range table), 'econtext', and 'partdir' must be explicitly
+ * provided.
*
* 'partitionpruneinfo' is a PartitionPruneInfo as generated by
* make_partition_pruneinfo. Here we build a PartitionPruneState containing a
@@ -1697,19 +1810,20 @@ ExecInitPartitionPruning(PlanState *planstate,
*/
static PartitionPruneState *
ExecCreatePartitionPruneState(PlanState *planstate,
- PartitionPruneInfo *partitionpruneinfo)
+ PartitionPruneInfo *partitionpruneinfo,
+ bool consider_initial_steps,
+ bool consider_exec_steps,
+ List *rtable, ExprContext *econtext,
+ PartitionDirectory partdir)
{
- EState *estate = planstate->state;
+ EState *estate = planstate ? planstate->state : NULL;
PartitionPruneState *prunestate;
int n_part_hierarchies;
ListCell *lc;
int i;
- ExprContext *econtext = planstate->ps_ExprContext;
- /* For data reading, executor always omits detached partitions */
- if (estate->es_partition_directory == NULL)
- estate->es_partition_directory =
- CreatePartitionDirectory(estate->es_query_cxt, false);
+ Assert((estate != NULL) ||
+ (partdir != NULL && econtext != NULL && rtable != NIL));
n_part_hierarchies = list_length(partitionpruneinfo->prune_infos);
Assert(n_part_hierarchies > 0);
@@ -1760,19 +1874,48 @@ ExecCreatePartitionPruneState(PlanState *planstate,
PartitionedRelPruneInfo *pinfo = lfirst_node(PartitionedRelPruneInfo, lc2);
PartitionedRelPruningData *pprune = &prunedata->partrelprunedata[j];
Relation partrel;
+ bool close_partrel = false;
PartitionDesc partdesc;
PartitionKey partkey;
/*
- * We can rely on the copies of the partitioned table's partition
- * key and partition descriptor appearing in its relcache entry,
- * because that entry will be held open and locked for the
- * duration of this executor run.
+ * We must open the relation ourselves when called before
+ * execution has started, such as when called during
+ * ExecutorGetLockRels() on a cached plan. In that case,
+ * sub-partitions must be locked, because AcquirePlannerLocks()
+ * would not have seen them. (The first relation in a partrelpruneinfos
+ * list is always the root partitioned table appearing in the
+ * query, which AcquirePlannerLocks() would have locked; the
+ * Assert in relation_open() guards that assumption.)
+ */
+ if (estate == NULL)
+ {
+ RangeTblEntry *rte = rt_fetch(pinfo->rtindex, rtable);
+ int lockmode = (j == 0) ? NoLock : rte->rellockmode;
+
+ partrel = table_open(rte->relid, lockmode);
+ close_partrel = true;
+ }
+ else
+ partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
+
+ /*
+ * We can rely on the copy of the partitioned table's partition
+ * key in its relcache entry, because it can't change (or
+ * get destroyed) as long as the relation is locked. The partition
+ * descriptor is taken from the PartitionDirectory associated with
+ * the table, which is held open long enough for the descriptor to
+ * remain valid while it's used to perform the pruning steps.
*/
- partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
partkey = RelationGetPartitionKey(partrel);
- partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
- partrel);
+ partdesc = PartitionDirectoryLookup(partdir, partrel);
+
+ /*
+ * Must close partrel, keeping the lock taken, if we're not using
+ * EState's entry.
+ */
+ if (close_partrel)
+ table_close(partrel, NoLock);
/*
* Initialize the subplan_map and subpart_map.
@@ -1874,7 +2017,7 @@ ExecCreatePartitionPruneState(PlanState *planstate,
* Initialize pruning contexts as needed.
*/
pprune->initial_pruning_steps = pinfo->initial_pruning_steps;
- if (pinfo->initial_pruning_steps)
+ if (consider_initial_steps && pinfo->initial_pruning_steps)
{
ExecInitPruningContext(&pprune->initial_context,
pinfo->initial_pruning_steps,
@@ -1884,7 +2027,7 @@ ExecCreatePartitionPruneState(PlanState *planstate,
prunestate->do_initial_prune = true;
}
pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
- if (pinfo->exec_pruning_steps)
+ if (consider_exec_steps && pinfo->exec_pruning_steps)
{
ExecInitPruningContext(&pprune->exec_context,
pinfo->exec_pruning_steps,
@@ -1998,7 +2141,8 @@ ExecInitPruningContext(PartitionPruneContext *context,
* is required.
*/
static Bitmapset *
-ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate)
+ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate,
+ PartitionPruneInfo *pruneinfo)
{
Bitmapset *result = NULL;
MemoryContext oldcontext;
@@ -2008,8 +2152,8 @@ ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate)
Assert(prunestate->do_initial_prune);
/*
- * Switch to a temp context to avoid leaking memory in the executor's
- * query-lifespan memory context.
+ * Switch to a temp context to avoid leaking memory in the longer-term
+ * memory context.
*/
oldcontext = MemoryContextSwitchTo(prunestate->prune_context);
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 9df1f81ea8..7246f9175f 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -119,6 +119,7 @@ CreateExecutorState(void)
estate->es_relations = NULL;
estate->es_rowmarks = NULL;
estate->es_plannedstmt = NULL;
+ estate->es_execlockrelsinfo = NULL;
estate->es_junkFilter = NULL;
@@ -785,6 +786,13 @@ ExecGetRangeTableRelation(EState *estate, Index rti)
Assert(rti > 0 && rti <= estate->es_range_table_size);
+ /*
+ * A cross-check that AcquireExecutorLocks() hasn't missed any relation
+ * that it should have locked.
+ */
+ Assert(estate->es_execlockrelsinfo == NULL ||
+ bms_is_member(rti, estate->es_execlockrelsinfo->lockrels));
+
rel = estate->es_relations[rti - 1];
if (rel == NULL)
{
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index f9460ae506..a2182a6b1f 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -844,7 +844,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
else
dest = None_Receiver;
- es->qd = CreateQueryDesc(es->stmt,
+ es->qd = CreateQueryDesc(es->stmt, NULL,
fcache->src,
GetActiveSnapshot(),
InvalidSnapshot,
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 5b6d3eb23b..9c6f907687 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -94,6 +94,55 @@ static bool ExecAppendAsyncRequest(AppendState *node, TupleTableSlot **result);
static void ExecAppendAsyncEventWait(AppendState *node);
static void classify_matching_subplans(AppendState *node);
+/* ----------------------------------------------------------------
+ * ExecGetAppendLockRels
+ * Do ExecGetLockRels()'s work for an Append plan
+ * ----------------------------------------------------------------
+ */
+bool
+ExecGetAppendLockRels(Append *node, ExecGetLockRelsContext *context)
+{
+ PartitionPruneInfo *pruneinfo = node->part_prune_info;
+
+ /*
+ * Must always lock all the partitioned tables whose direct and indirect
+ * partitions will be scanned by this Append.
+ */
+ context->lockrels = bms_add_members(context->lockrels,
+ node->partitioned_rels);
+
+ /*
+ * Now recurse to subplans to add relations scanned therein.
+ *
+ * If initial pruning can be done, do that now and only recurse to the
+ * surviving subplans.
+ */
+ if (pruneinfo && pruneinfo->needs_init_pruning)
+ {
+ List *subplans = node->appendplans;
+ Bitmapset *validsubplans;
+ int i;
+
+ validsubplans = ExecGetLockRelsDoInitialPruning((Plan *) node,
+ context, pruneinfo);
+
+ /* Recurse to surviving subplans. */
+ i = -1;
+ while ((i = bms_next_member(validsubplans, i)) >= 0)
+ {
+ Plan *subplan = list_nth(subplans, i);
+
+ (void) ExecGetLockRels(subplan, context);
+ }
+
+ /* done with this node */
+ return true;
+ }
+
+ /* Tell the caller to recurse to *all* the subplans. */
+ return false;
+}
+
/* ----------------------------------------------------------------
* ExecInitAppend
*
@@ -155,7 +204,8 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
* subplan, we can fill as_valid_subplans immediately, preventing
* later calls to ExecFindMatchingSubPlans.
*/
- if (!prunestate->do_exec_prune && nplans > 0)
+ if (appendstate->as_prune_state == NULL ||
+ (!appendstate->as_prune_state->do_exec_prune && nplans > 0))
appendstate->as_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
}
else
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index 9a9f29e845..4b04fcdbc2 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -54,6 +54,55 @@ typedef int32 SlotNumber;
static TupleTableSlot *ExecMergeAppend(PlanState *pstate);
static int heap_compare_slots(Datum a, Datum b, void *arg);
+/* ----------------------------------------------------------------
+ * ExecGetMergeAppendLockRels
+ * Do ExecGetLockRels()'s work for a MergeAppend plan
+ * ----------------------------------------------------------------
+ */
+bool
+ExecGetMergeAppendLockRels(MergeAppend *node, ExecGetLockRelsContext *context)
+{
+ PartitionPruneInfo *pruneinfo = node->part_prune_info;
+
+ /*
+ * Must always lock all the partitioned tables whose direct and indirect
+ * partitions will be scanned by this MergeAppend.
+ */
+ context->lockrels = bms_add_members(context->lockrels,
+ node->partitioned_rels);
+
+ /*
+ * Now recurse to subplans to add relations scanned therein.
+ *
+ * If initial pruning can be done, do that now and only recurse to the
+ * surviving subplans.
+ */
+ if (pruneinfo && pruneinfo->needs_init_pruning)
+ {
+ List *subplans = node->mergeplans;
+ Bitmapset *validsubplans;
+ int i;
+
+ validsubplans = ExecGetLockRelsDoInitialPruning((Plan *) node,
+ context, pruneinfo);
+
+ /* Recurse to surviving subplans. */
+ i = -1;
+ while ((i = bms_next_member(validsubplans, i)) >= 0)
+ {
+ Plan *subplan = list_nth(subplans, i);
+
+ (void) ExecGetLockRels(subplan, context);
+ }
+
+ /* done with this node */
+ return true;
+ }
+
+ /* Tell the caller to recurse to *all* the subplans. */
+ return false;
+}
+
/* ----------------------------------------------------------------
* ExecInitMergeAppend
@@ -103,7 +152,8 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
* subplan, we can fill as_valid_subplans immediately, preventing
* later calls to ExecFindMatchingSubPlans.
*/
- if (!prunestate->do_exec_prune && nplans > 0)
+ if (mergestate->ms_prune_state == NULL ||
+ (!mergestate->ms_prune_state->do_exec_prune && nplans > 0))
mergestate->ms_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
}
else
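As an aside for readers of the patch: the way ExecGetAppendLockRels() and
ExecGetMergeAppendLockRels() restrict the lock set to subplans that survive
initial pruning can be illustrated with a toy model. The sketch below is
standalone C and does not use PostgreSQL's headers. ToyPlan,
toy_get_lock_rels(), and the 64-bit mask standing in for a Bitmapset are all
hypothetical names invented for illustration, and the pruning result is
supplied as an input rather than computed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a plan node; NOT PostgreSQL's Plan struct. */
typedef struct ToyPlan
{
    uint64_t    scanned_rels;       /* bit i set => range table index i is used here */
    bool        needs_init_pruning; /* plays the role of pruneinfo->needs_init_pruning */
    uint64_t    valid_subplans;     /* assumed pruning result: bit i => child i survives */
    size_t      nchildren;
    const struct ToyPlan *const *children;
} ToyPlan;

/*
 * Add this node's relations to the lock set, then recurse into the children.
 * When the node can do initial pruning, only surviving children are visited,
 * so relations scanned only by pruned children never enter the lock set.
 */
static uint64_t
toy_get_lock_rels(const ToyPlan *plan, uint64_t lockrels)
{
    assert(plan->nchildren <= 64);  /* the toy mask has only 64 bits */

    /* Relations named by the node itself are always locked. */
    lockrels |= plan->scanned_rels;

    for (size_t i = 0; i < plan->nchildren; i++)
    {
        if (plan->needs_init_pruning &&
            (plan->valid_subplans & (UINT64_C(1) << i)) == 0)
            continue;           /* pruned: skip this subplan entirely */
        lockrels = toy_get_lock_rels(plan->children[i], lockrels);
    }
    return lockrels;
}
```

With three children and only the second one surviving, the returned mask
contains the parent's relation and that one child's relation; with
needs_init_pruning false, all children contribute, which corresponds to the
"recurse to *all* the subplans" path in the patch.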
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 171575cd73..f17bede367 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -3853,6 +3853,31 @@ ExecLookupResultRelByOid(ModifyTableState *node, Oid resultoid,
return NULL;
}
+/*
+ * ExecGetModifyTableLockRels
+ * Do ExecGetLockRels()'s work for a ModifyTable plan
+ */
+bool
+ExecGetModifyTableLockRels(ModifyTable *plan, ExecGetLockRelsContext *context)
+{
+ ListCell *lc;
+
+ /* First add the result relation RTIs mentioned in the node. */
+ if (plan->rootRelation > 0)
+ context->lockrels = bms_add_member(context->lockrels,
+ plan->rootRelation);
+ context->lockrels = bms_add_member(context->lockrels,
+ plan->nominalRelation);
+ foreach(lc, plan->resultRelations)
+ {
+ context->lockrels = bms_add_member(context->lockrels,
+ lfirst_int(lc));
+ }
+
+ /* Tell the caller to recurse to the subplan (outerPlan(plan)). */
+ return false;
+}
+
/* ----------------------------------------------------------------
* ExecInitModifyTable
* ----------------------------------------------------------------
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 042a5f8b0a..64ebbfb31e 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1578,6 +1578,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
CachedPlanSource *plansource;
CachedPlan *cplan;
List *stmt_list;
+ List *execlockrelsinfo_list;
char *query_string;
Snapshot snapshot;
MemoryContext oldcontext;
@@ -1659,6 +1660,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
/* Replan if needed, and increment plan refcount for portal */
cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv);
stmt_list = cplan->stmt_list;
+ execlockrelsinfo_list = cplan->execlockrelsinfo_list;
if (!plan->saved)
{
@@ -1670,6 +1672,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
*/
oldcontext = MemoryContextSwitchTo(portal->portalContext);
stmt_list = copyObject(stmt_list);
+ execlockrelsinfo_list = copyObject(execlockrelsinfo_list);
MemoryContextSwitchTo(oldcontext);
ReleaseCachedPlan(cplan, NULL);
cplan = NULL; /* portal shouldn't depend on cplan */
@@ -1683,6 +1686,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
query_string,
plansource->commandTag,
stmt_list,
+ execlockrelsinfo_list,
cplan);
/*
@@ -2473,7 +2477,9 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
{
CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
List *stmt_list;
- ListCell *lc2;
+ List *execlockrelsinfo_list;
+ ListCell *lc2,
+ *lc3;
spicallbackarg.query = plansource->query_string;
@@ -2552,6 +2558,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
plan_owner, _SPI_current->queryEnv);
stmt_list = cplan->stmt_list;
+ execlockrelsinfo_list = cplan->execlockrelsinfo_list;
/*
* If we weren't given a specific snapshot to use, and the statement
@@ -2589,9 +2596,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
}
}
- foreach(lc2, stmt_list)
+ forboth(lc2, stmt_list, lc3, execlockrelsinfo_list)
{
PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+ ExecLockRelsInfo *execlockrelsinfo = lfirst_node(ExecLockRelsInfo, lc3);
bool canSetTag = stmt->canSetTag;
DestReceiver *dest;
@@ -2663,7 +2671,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
else
snap = InvalidSnapshot;
- qdesc = CreateQueryDesc(stmt,
+ qdesc = CreateQueryDesc(stmt, execlockrelsinfo,
plansource->query_string,
snap, crosscheck_snapshot,
dest,
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 29c515d7db..afffabbea0 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -68,6 +68,13 @@
} \
} while (0)
+/* Copy a field that is an array with numElem ints */
+#define COPY_INT_ARRAY(fldname, numElem) \
+ do { \
+ newnode->fldname = (numElem) > 0 ? palloc((numElem) * sizeof(int)) : NULL; \
+ memcpy(newnode->fldname, from->fldname, sizeof(int) * (numElem)); \
+ } while (0)
+
/* Copy a parse location field (for Copy, this is same as scalar case) */
#define COPY_LOCATION_FIELD(fldname) \
(newnode->fldname = from->fldname)
@@ -94,8 +101,10 @@ _copyPlannedStmt(const PlannedStmt *from)
COPY_SCALAR_FIELD(transientPlan);
COPY_SCALAR_FIELD(dependsOnRole);
COPY_SCALAR_FIELD(parallelModeNeeded);
+ COPY_SCALAR_FIELD(containsInitialPruning);
COPY_SCALAR_FIELD(jitFlags);
COPY_NODE_FIELD(planTree);
+ COPY_SCALAR_FIELD(numPlanNodes);
COPY_NODE_FIELD(rtable);
COPY_NODE_FIELD(resultRelations);
COPY_NODE_FIELD(appendRelations);
@@ -1282,6 +1291,8 @@ _copyPartitionPruneInfo(const PartitionPruneInfo *from)
PartitionPruneInfo *newnode = makeNode(PartitionPruneInfo);
COPY_NODE_FIELD(prune_infos);
+ COPY_SCALAR_FIELD(needs_init_pruning);
+ COPY_SCALAR_FIELD(needs_exec_pruning);
COPY_BITMAPSET_FIELD(other_subplans);
return newnode;
@@ -5373,6 +5384,33 @@ _copyExtensibleNode(const ExtensibleNode *from)
return newnode;
}
+/* ****************************************************************
+ * execnodes.h copy functions
+ * ****************************************************************
+ */
+static ExecLockRelsInfo *
+_copyExecLockRelsInfo(const ExecLockRelsInfo *from)
+{
+ ExecLockRelsInfo *newnode = makeNode(ExecLockRelsInfo);
+
+ COPY_BITMAPSET_FIELD(lockrels);
+ COPY_SCALAR_FIELD(numPlanNodes);
+ COPY_NODE_FIELD(initPruningOutputs);
+ COPY_INT_ARRAY(ipoIndexes, from->numPlanNodes);
+
+ return newnode;
+}
+
+static PlanInitPruningOutput *
+_copyPlanInitPruningOutput(const PlanInitPruningOutput *from)
+{
+ PlanInitPruningOutput *newnode = makeNode(PlanInitPruningOutput);
+
+ COPY_BITMAPSET_FIELD(initially_valid_subplans);
+
+ return newnode;
+}
+
/* ****************************************************************
* value.h copy functions
* ****************************************************************
@@ -5427,7 +5465,6 @@ _copyBitString(const BitString *from)
return newnode;
}
-
static ForeignKeyCacheInfo *
_copyForeignKeyCacheInfo(const ForeignKeyCacheInfo *from)
{
@@ -6454,6 +6491,16 @@ copyObjectImpl(const void *from)
retval = _copyPublicationTable(from);
break;
+ /*
+ * EXECUTION NODES
+ */
+ case T_ExecLockRelsInfo:
+ retval = _copyExecLockRelsInfo(from);
+ break;
+ case T_PlanInitPruningOutput:
+ retval = _copyPlanInitPruningOutput(from);
+ break;
+
/*
* MISCELLANEOUS NODES
*/
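The COPY_INT_ARRAY pattern in this file's diff (allocate numElem ints, then
memcpy) can be tried outside the backend with the standalone sketch below. It
uses malloc() in place of palloc(), and copy_int_array() is a hypothetical
helper name. Unlike the macro, it skips the memcpy() when the length is zero,
so that memcpy() is never handed a NULL pointer; that guard is this sketch's
own choice, not something the patch does.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return a freshly allocated copy of the first num_elem ints of src, or
 * NULL when there is nothing to copy (or when allocation fails).
 * The caller owns the result and releases it with free().
 */
static int *
copy_int_array(const int *src, int num_elem)
{
    int        *dst;

    /* Nothing to copy: mirror the macro's "NULL for empty arrays" result. */
    if (num_elem <= 0 || src == NULL)
        return NULL;

    dst = malloc((size_t) num_elem * sizeof(int));
    if (dst != NULL)
        memcpy(dst, src, (size_t) num_elem * sizeof(int));
    return dst;
}
```

In the backend, palloc() reports out-of-memory by raising an error instead of
returning NULL, so the allocation check here is only needed because the
sketch uses malloc().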
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 108ede9af9..e2d7e6bcac 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -312,8 +312,10 @@ _outPlannedStmt(StringInfo str, const PlannedStmt *node)
WRITE_BOOL_FIELD(transientPlan);
WRITE_BOOL_FIELD(dependsOnRole);
WRITE_BOOL_FIELD(parallelModeNeeded);
+ WRITE_BOOL_FIELD(containsInitialPruning);
WRITE_INT_FIELD(jitFlags);
WRITE_NODE_FIELD(planTree);
+ WRITE_INT_FIELD(numPlanNodes);
WRITE_NODE_FIELD(rtable);
WRITE_NODE_FIELD(resultRelations);
WRITE_NODE_FIELD(appendRelations);
@@ -1008,6 +1010,8 @@ _outPartitionPruneInfo(StringInfo str, const PartitionPruneInfo *node)
WRITE_NODE_TYPE("PARTITIONPRUNEINFO");
WRITE_NODE_FIELD(prune_infos);
+ WRITE_BOOL_FIELD(needs_init_pruning);
+ WRITE_BOOL_FIELD(needs_exec_pruning);
WRITE_BITMAPSET_FIELD(other_subplans);
}
@@ -2818,6 +2822,31 @@ _outExtensibleNode(StringInfo str, const ExtensibleNode *node)
methods->nodeOut(str, node);
}
+/*****************************************************************************
+ *
+ * Stuff from execnodes.h
+ *
+ *****************************************************************************/
+
+static void
+_outExecLockRelsInfo(StringInfo str, const ExecLockRelsInfo *node)
+{
+ WRITE_NODE_TYPE("EXECLOCKRELSINFO");
+
+ WRITE_BITMAPSET_FIELD(lockrels);
+ WRITE_INT_FIELD(numPlanNodes);
+ WRITE_NODE_FIELD(initPruningOutputs);
+ WRITE_INT_ARRAY(ipoIndexes, node->numPlanNodes);
+}
+
+static void
+_outPlanInitPruningOutput(StringInfo str, const PlanInitPruningOutput *node)
+{
+ WRITE_NODE_TYPE("PLANINITPRUNINGOUTPUT");
+
+ WRITE_BITMAPSET_FIELD(initially_valid_subplans);
+}
+
/*****************************************************************************
*
* Stuff from parsenodes.h.
@@ -4720,6 +4749,16 @@ outNode(StringInfo str, const void *obj)
_outJsonItemCoercions(str, obj);
break;
+ /*
+ * EXECUTION NODES
+ */
+ case T_ExecLockRelsInfo:
+ _outExecLockRelsInfo(str, obj);
+ break;
+ case T_PlanInitPruningOutput:
+ _outPlanInitPruningOutput(str, obj);
+ break;
+
default:
/*
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index ce146dd45e..88173f70a1 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -1782,8 +1782,10 @@ _readPlannedStmt(void)
READ_BOOL_FIELD(transientPlan);
READ_BOOL_FIELD(dependsOnRole);
READ_BOOL_FIELD(parallelModeNeeded);
+ READ_BOOL_FIELD(containsInitialPruning);
READ_INT_FIELD(jitFlags);
READ_NODE_FIELD(planTree);
+ READ_INT_FIELD(numPlanNodes);
READ_NODE_FIELD(rtable);
READ_NODE_FIELD(resultRelations);
READ_NODE_FIELD(appendRelations);
@@ -2735,6 +2737,8 @@ _readPartitionPruneInfo(void)
READ_LOCALS(PartitionPruneInfo);
READ_NODE_FIELD(prune_infos);
+ READ_BOOL_FIELD(needs_init_pruning);
+ READ_BOOL_FIELD(needs_exec_pruning);
READ_BITMAPSET_FIELD(other_subplans);
READ_DONE();
@@ -2904,6 +2908,35 @@ _readPartitionRangeDatum(void)
READ_DONE();
}
+/*
+ * _readExecLockRelsInfo
+ */
+static ExecLockRelsInfo *
+_readExecLockRelsInfo(void)
+{
+ READ_LOCALS(ExecLockRelsInfo);
+
+ READ_BITMAPSET_FIELD(lockrels);
+ READ_INT_FIELD(numPlanNodes);
+ READ_NODE_FIELD(initPruningOutputs);
+ READ_INT_ARRAY(ipoIndexes, local_node->numPlanNodes);
+
+ READ_DONE();
+}
+
+/*
+ * _readPlanInitPruningOutput
+ */
+static PlanInitPruningOutput *
+_readPlanInitPruningOutput(void)
+{
+ READ_LOCALS(PlanInitPruningOutput);
+
+ READ_BITMAPSET_FIELD(initially_valid_subplans);
+
+ READ_DONE();
+}
+
/*
* parseNodeString
*
@@ -3197,6 +3230,10 @@ parseNodeString(void)
return_value = _readJsonCoercion();
else if (MATCH("JSONITEMCOERCIONS", 17))
return_value = _readJsonItemCoercions();
+ else if (MATCH("EXECLOCKRELSINFO", 16))
+ return_value = _readExecLockRelsInfo();
+ else if (MATCH("PLANINITPRUNINGOUTPUT", 21))
+ return_value = _readPlanInitPruningOutput();
else
{
elog(ERROR, "badly formatted node string \"%.32s\"...", token);
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index c769b4b4b9..4c586ac1ec 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -517,7 +517,9 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
result->transientPlan = glob->transientPlan;
result->dependsOnRole = glob->dependsOnRole;
result->parallelModeNeeded = glob->parallelModeNeeded;
+ result->containsInitialPruning = glob->containsInitialPruning;
result->planTree = top_plan;
+ result->numPlanNodes = glob->lastPlanNodeId;
result->rtable = glob->finalrtable;
result->resultRelations = glob->resultRelations;
result->appendRelations = glob->appendRelations;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 8214edec54..a1c6c3caa2 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -1623,6 +1623,9 @@ set_append_references(PlannerInfo *root,
pinfo->rtindex += rtoffset;
}
}
+
+ if (aplan->part_prune_info->needs_init_pruning)
+ root->glob->containsInitialPruning = true;
}
/* We don't need to recurse to lefttree or righttree ... */
@@ -1710,6 +1713,9 @@ set_mergeappend_references(PlannerInfo *root,
pinfo->rtindex += rtoffset;
}
}
+
+ if (mplan->part_prune_info->needs_init_pruning)
+ root->glob->containsInitialPruning = true;
}
/* We don't need to recurse to lefttree or righttree ... */
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 7080cb25d9..3322dc79f2 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -144,7 +144,9 @@ static List *make_partitionedrel_pruneinfo(PlannerInfo *root,
List *prunequal,
Bitmapset *partrelids,
int *relid_subplan_map,
- Bitmapset **matchedsubplans);
+ Bitmapset **matchedsubplans,
+ bool *needs_init_pruning,
+ bool *needs_exec_pruning);
static void gen_partprune_steps(RelOptInfo *rel, List *clauses,
PartClauseTarget target,
GeneratePruningStepsContext *context);
@@ -230,6 +232,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int *relid_subplan_map;
ListCell *lc;
int i;
+ bool needs_init_pruning = false;
+ bool needs_exec_pruning = false;
/*
* Scan the subpaths to see which ones are scans of partition child
@@ -309,12 +313,16 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
Bitmapset *partrelids = (Bitmapset *) lfirst(lc);
List *pinfolist;
Bitmapset *matchedsubplans = NULL;
+ bool partrel_needs_init_pruning;
+ bool partrel_needs_exec_pruning;
pinfolist = make_partitionedrel_pruneinfo(root, parentrel,
prunequal,
partrelids,
relid_subplan_map,
- &matchedsubplans);
+ &matchedsubplans,
+ &partrel_needs_init_pruning,
+ &partrel_needs_exec_pruning);
/* When pruning is possible, record the matched subplans */
if (pinfolist != NIL)
@@ -323,6 +331,10 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
allmatchedsubplans = bms_join(matchedsubplans,
allmatchedsubplans);
}
+ if (!needs_init_pruning)
+ needs_init_pruning = partrel_needs_init_pruning;
+ if (!needs_exec_pruning)
+ needs_exec_pruning = partrel_needs_exec_pruning;
}
pfree(relid_subplan_map);
@@ -337,6 +349,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
/* Else build the result data structure */
pruneinfo = makeNode(PartitionPruneInfo);
pruneinfo->prune_infos = prunerelinfos;
+ pruneinfo->needs_init_pruning = needs_init_pruning;
+ pruneinfo->needs_exec_pruning = needs_exec_pruning;
/*
* Some subplans may not belong to any of the identified partitioned rels.
@@ -435,13 +449,18 @@ add_part_relids(List *allpartrelids, Bitmapset *partrelids)
* If we cannot find any useful run-time pruning steps, return NIL.
* However, on success, each rel identified in partrelids will have
* an element in the result list, even if some of them are useless.
+ * *needs_init_pruning and *needs_exec_pruning are set to indicate that the
+ * returned PartitionedRelPruneInfos contain pruning steps that can be
+ * performed before and after execution begins, respectively.
*/
static List *
make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
List *prunequal,
Bitmapset *partrelids,
int *relid_subplan_map,
- Bitmapset **matchedsubplans)
+ Bitmapset **matchedsubplans,
+ bool *needs_init_pruning,
+ bool *needs_exec_pruning)
{
RelOptInfo *targetpart = NULL;
List *pinfolist = NIL;
@@ -452,6 +471,10 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int rti;
int i;
+ /* Will find out below. */
+ *needs_init_pruning = false;
+ *needs_exec_pruning = false;
+
/*
* Examine each partitioned rel, constructing a temporary array to map
* from planner relids to index of the partitioned rel, and building a
@@ -539,6 +562,9 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
* executor per-scan pruning steps. This first pass creates startup
* pruning steps and detects whether there's any possibly-useful quals
* that would require per-scan pruning.
+ *
+ * In the first pass, we also note whether the second pass is necessary
+ * by checking for the presence of EXEC parameters.
*/
gen_partprune_steps(subpart, partprunequal, PARTTARGET_INITIAL,
&context);
@@ -613,6 +639,11 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
pinfo->execparamids = execparamids;
/* Remaining fields will be filled in the next loop */
+ if (!*needs_init_pruning)
+ *needs_init_pruning = (initial_pruning_steps != NIL);
+ if (!*needs_exec_pruning)
+ *needs_exec_pruning = (exec_pruning_steps != NIL);
+
pinfolist = lappend(pinfolist, pinfo);
}
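The way the two flags are accumulated in the partprune.c hunks above can be sketched in isolation. This is only an illustration under stated assumptions: RelPruneSteps and summarize_pruning_needs are hypothetical stand-ins that do not exist in the patch or in PostgreSQL, and the step lists are reduced to plain counts. Once a flag becomes true it stays true, so the result says whether any partitioned rel has steps of that kind.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for PartitionedRelPruneInfo.  The counts play the
 * role of "initial_pruning_steps != NIL" and "exec_pruning_steps != NIL".
 */
typedef struct RelPruneSteps
{
    int n_initial_steps;
    int n_exec_steps;
} RelPruneSteps;

/*
 * Set each output flag to true if at least one rel has steps of the
 * corresponding kind.  A flag that is already true is never reset.
 */
static void
summarize_pruning_needs(const RelPruneSteps *rels, size_t nrels,
                        bool *needs_init_pruning, bool *needs_exec_pruning)
{
    /* Will find out below. */
    *needs_init_pruning = false;
    *needs_exec_pruning = false;

    for (size_t i = 0; i < nrels; i++)
    {
        if (!*needs_init_pruning)
            *needs_init_pruning = (rels[i].n_initial_steps > 0);
        if (!*needs_exec_pruning)
            *needs_exec_pruning = (rels[i].n_exec_steps > 0);
    }
}
```

In the patch the same pattern is applied twice, once per PartitionedRelPruneInfo and once per partition hierarchy, before the result lands in PartitionPruneInfo.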
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index ba2fcfeb4a..085eb3f209 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -945,15 +945,17 @@ pg_plan_query(Query *querytree, const char *query_string, int cursorOptions,
* For normal optimizable statements, invoke the planner. For utility
* statements, just make a wrapper PlannedStmt node.
*
- * The result is a list of PlannedStmt nodes.
+ * The result is a list of PlannedStmt nodes. Also, a NULL is appended to
+ * *execlockrelsinfo_list for each PlannedStmt added to the returned list.
*/
List *
pg_plan_queries(List *querytrees, const char *query_string, int cursorOptions,
- ParamListInfo boundParams)
+ ParamListInfo boundParams, List **execlockrelsinfo_list)
{
List *stmt_list = NIL;
ListCell *query_list;
+ *execlockrelsinfo_list = NIL;
foreach(query_list, querytrees)
{
Query *query = lfirst_node(Query, query_list);
@@ -977,6 +979,7 @@ pg_plan_queries(List *querytrees, const char *query_string, int cursorOptions,
}
stmt_list = lappend(stmt_list, stmt);
+ *execlockrelsinfo_list = lappend(*execlockrelsinfo_list, NULL);
}
return stmt_list;
@@ -1080,7 +1083,8 @@ exec_simple_query(const char *query_string)
QueryCompletion qc;
MemoryContext per_parsetree_context = NULL;
List *querytree_list,
- *plantree_list;
+ *plantree_list,
+ *plantree_execlockrelsinfo_list;
Portal portal;
DestReceiver *receiver;
int16 format;
@@ -1167,7 +1171,8 @@ exec_simple_query(const char *query_string)
NULL, 0, NULL);
plantree_list = pg_plan_queries(querytree_list, query_string,
- CURSOR_OPT_PARALLEL_OK, NULL);
+ CURSOR_OPT_PARALLEL_OK, NULL,
+ &plantree_execlockrelsinfo_list);
/*
* Done with the snapshot used for parsing/planning.
@@ -1203,6 +1208,7 @@ exec_simple_query(const char *query_string)
query_string,
commandTag,
plantree_list,
+ plantree_execlockrelsinfo_list,
NULL);
/*
@@ -1991,6 +1997,7 @@ exec_bind_message(StringInfo input_message)
query_string,
psrc->commandTag,
cplan->stmt_list,
+ cplan->execlockrelsinfo_list,
cplan);
/* Done with the snapshot used for parameter I/O and parsing/planning */
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 5aa5a350f3..0fd8c65de7 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -35,7 +35,7 @@
Portal ActivePortal = NULL;
-static void ProcessQuery(PlannedStmt *plan,
+static void ProcessQuery(PlannedStmt *plan, ExecLockRelsInfo *execlockrelsinfo,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -65,6 +65,7 @@ static void DoPortalRewind(Portal portal);
*/
QueryDesc *
CreateQueryDesc(PlannedStmt *plannedstmt,
+ ExecLockRelsInfo *execlockrelsinfo,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
@@ -77,6 +78,7 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
qd->operation = plannedstmt->commandType; /* operation */
qd->plannedstmt = plannedstmt; /* plan */
+ qd->execlockrelsinfo = execlockrelsinfo; /* ExecutorGetLockRels() output for plan */
qd->sourceText = sourceText; /* query text */
qd->snapshot = RegisterSnapshot(snapshot); /* snapshot */
/* RI check snapshot */
@@ -122,6 +124,7 @@ FreeQueryDesc(QueryDesc *qdesc)
* PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
*
* plan: the plan tree for the query
+ * execlockrelsinfo: ExecutorGetLockRels() output for the plan tree
* sourceText: the source text of the query
* params: any parameters needed
* dest: where to send results
@@ -134,6 +137,7 @@ FreeQueryDesc(QueryDesc *qdesc)
*/
static void
ProcessQuery(PlannedStmt *plan,
+ ExecLockRelsInfo *execlockrelsinfo,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -145,7 +149,7 @@ ProcessQuery(PlannedStmt *plan,
/*
* Create the QueryDesc object
*/
- queryDesc = CreateQueryDesc(plan, sourceText,
+ queryDesc = CreateQueryDesc(plan, execlockrelsinfo, sourceText,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
@@ -493,6 +497,7 @@ PortalStart(Portal portal, ParamListInfo params,
* the destination to DestNone.
*/
queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+ linitial_node(ExecLockRelsInfo, portal->execlockrelsinfos),
portal->sourceText,
GetActiveSnapshot(),
InvalidSnapshot,
@@ -1193,7 +1198,8 @@ PortalRunMulti(Portal portal,
QueryCompletion *qc)
{
bool active_snapshot_set = false;
- ListCell *stmtlist_item;
+ ListCell *stmtlist_item,
+ *execlockrelsinfolist_item;
/*
* If the destination is DestRemoteExecute, change to DestNone. The
@@ -1214,9 +1220,12 @@ PortalRunMulti(Portal portal,
* Loop to handle the individual queries generated from a single parsetree
* by analysis and rewrite.
*/
- foreach(stmtlist_item, portal->stmts)
+ forboth(stmtlist_item, portal->stmts,
+ execlockrelsinfolist_item, portal->execlockrelsinfos)
{
PlannedStmt *pstmt = lfirst_node(PlannedStmt, stmtlist_item);
+ ExecLockRelsInfo *execlockrelsinfo = lfirst_node(ExecLockRelsInfo,
+ execlockrelsinfolist_item);
/*
* If we got a cancel signal in prior command, quit
@@ -1274,7 +1283,7 @@ PortalRunMulti(Portal portal,
if (pstmt->canSetTag)
{
/* statement can set tag string */
- ProcessQuery(pstmt,
+ ProcessQuery(pstmt, execlockrelsinfo,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
@@ -1283,7 +1292,7 @@ PortalRunMulti(Portal portal,
else
{
/* stmt added by rewrite cannot set tag */
- ProcessQuery(pstmt,
+ ProcessQuery(pstmt, execlockrelsinfo,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 4cf6db504f..9f5a40a0a6 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -99,14 +99,16 @@ static dlist_head cached_expression_list = DLIST_STATIC_INIT(cached_expression_l
static void ReleaseGenericPlan(CachedPlanSource *plansource);
static List *RevalidateCachedQuery(CachedPlanSource *plansource,
QueryEnvironment *queryEnv);
-static bool CheckCachedPlan(CachedPlanSource *plansource);
+static bool CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams);
static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
ParamListInfo boundParams, QueryEnvironment *queryEnv);
+static void CachedPlanSaveExecLockRelsInfos(CachedPlan *plan, List *execlockrelsinfo_list);
static bool choose_custom_plan(CachedPlanSource *plansource,
ParamListInfo boundParams);
static double cached_plan_cost(CachedPlan *plan, bool include_planner);
static Query *QueryListGetPrimaryStmt(List *stmts);
-static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static List *AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams);
+static void ReleaseExecutorLocks(List *stmt_list, List *execlockrelsinfo_list);
static void AcquirePlannerLocks(List *stmt_list, bool acquire);
static void ScanQueryForLocks(Query *parsetree, bool acquire);
static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -790,9 +792,21 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
*
* On a "true" return, we have acquired the locks needed to run the plan.
* (We must do this for the "true" result to be race-condition-free.)
+ *
+ * If the CachedPlan is valid, this may in some cases call ExecutorGetLockRels
+ * on each PlannedStmt contained in it to determine the set of relations to be
+ * locked by AcquireExecutorLocks(), instead of just scanning its range table.
+ * Doing so avoids locking relations scanned only by plan nodes that initial
+ * partition pruning has found need not be executed. The resulting
+ * ExecLockRelsInfo nodes containing the result of such pruning, allocated in
+ * a child context of the context containing the plan itself, are added into
+ * plan->execlockrelsinfo_list. The previous contents of the list from the
+ * last invocation on the same CachedPlan are deleted, because they would no
+ * longer be valid given the fresh set of parameter values which may be used
+ * as pruning parameters.
*/
static bool
-CheckCachedPlan(CachedPlanSource *plansource)
+CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams)
{
CachedPlan *plan = plansource->gplan;
@@ -820,13 +834,25 @@ CheckCachedPlan(CachedPlanSource *plansource)
*/
if (plan->is_valid)
{
+ List *execlockrelsinfo_list;
+
/*
* Plan must have positive refcount because it is referenced by
* plansource; so no need to fear it disappears under us here.
*/
Assert(plan->refcount > 0);
- AcquireExecutorLocks(plan->stmt_list, true);
+ /*
+ * Lock relations scanned by the plan. If ExecutorGetLockRels() asked
+ * to omit some relations because the plan nodes that scan them were
+ * found to be pruned, the executor must also be told which plan nodes
+ * were pruned, so that it doesn't accidentally try to execute them.
+ * That is done via the ExecLockRelsInfo nodes collected in the returned
+ * list, which is passed to the executor along with the list of
+ * PlannedStmts.
+ */
+ execlockrelsinfo_list = AcquireExecutorLocks(plan->stmt_list,
+ boundParams);
/*
* If plan was transient, check to see if TransactionXmin has
@@ -844,11 +870,14 @@ CheckCachedPlan(CachedPlanSource *plansource)
if (plan->is_valid)
{
/* Successfully revalidated and locked the query. */
+
+ /* Remember ExecLockRelsInfos in the CachedPlan. */
+ CachedPlanSaveExecLockRelsInfos(plan, execlockrelsinfo_list);
return true;
}
/* Oops, the race case happened. Release useless locks. */
- AcquireExecutorLocks(plan->stmt_list, false);
+ ReleaseExecutorLocks(plan->stmt_list, execlockrelsinfo_list);
}
/*
@@ -880,7 +909,8 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
ParamListInfo boundParams, QueryEnvironment *queryEnv)
{
CachedPlan *plan;
- List *plist;
+ List *plist,
+ *execlockrelsinfo_list;
bool snapshot_set;
bool is_transient;
MemoryContext plan_context;
@@ -933,7 +963,8 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
* Generate the plan.
*/
plist = pg_plan_queries(qlist, plansource->query_string,
- plansource->cursor_options, boundParams);
+ plansource->cursor_options, boundParams,
+ &execlockrelsinfo_list);
/* Release snapshot if we got one */
if (snapshot_set)
@@ -1002,6 +1033,16 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
plan->is_saved = false;
plan->is_valid = true;
+ /*
+ * Save the dummy ExecLockRelsInfo list, that is a list containing NULLs
+ * as elements. We must do this, because users of the CachedPlan expect
+ * one to go with the list of PlannedStmts.
+ * XXX maybe get rid of that contract.
+ */
+ plan->execlockrelsinfo_context = NULL;
+ CachedPlanSaveExecLockRelsInfos(plan, execlockrelsinfo_list);
+ Assert(MemoryContextIsValid(plan->execlockrelsinfo_context));
+
/* assign generation number to new plan */
plan->generation = ++(plansource->generation);
@@ -1160,7 +1201,7 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
if (!customplan)
{
- if (CheckCachedPlan(plansource))
+ if (CheckCachedPlan(plansource, boundParams))
{
/* We want a generic plan, and we already have a valid one */
plan = plansource->gplan;
@@ -1586,6 +1627,49 @@ CopyCachedPlan(CachedPlanSource *plansource)
return newsource;
}
+/*
+ * CachedPlanSaveExecLockRelsInfos
+ * Save the list containing ExecLockRelsInfo nodes into the given
+ * CachedPlan
+ *
+ * The provided list is copied into a dedicated context that is a child of
+ * plan->context. If the child context already exists, it is emptied, because
+ * any ExecLockRelsInfo contained therein would no longer be useful.
+ */
+static void
+CachedPlanSaveExecLockRelsInfos(CachedPlan *plan, List *execlockrelsinfo_list)
+{
+ MemoryContext execlockrelsinfo_context = plan->execlockrelsinfo_context,
+ oldcontext = CurrentMemoryContext;
+ List *execlockrelsinfo_list_copy;
+
+ /*
+ * Set up the dedicated context if not already done, saving it as a child
+ * of the CachedPlan's context.
+ */
+ if (execlockrelsinfo_context == NULL)
+ {
+ execlockrelsinfo_context = AllocSetContextCreate(CurrentMemoryContext,
+ "CachedPlan execlockrelsinfo list",
+ ALLOCSET_START_SMALL_SIZES);
+ MemoryContextSetParent(execlockrelsinfo_context, plan->context);
+ MemoryContextSetIdentifier(execlockrelsinfo_context, plan->context->ident);
+ plan->execlockrelsinfo_context = execlockrelsinfo_context;
+ }
+ else
+ {
+ /* Just clear existing contents by resetting the context. */
+ Assert(MemoryContextIsValid(execlockrelsinfo_context));
+ MemoryContextReset(execlockrelsinfo_context);
+ }
+
+ MemoryContextSwitchTo(execlockrelsinfo_context);
+ execlockrelsinfo_list_copy = copyObject(execlockrelsinfo_list);
+ MemoryContextSwitchTo(oldcontext);
+
+ plan->execlockrelsinfo_list = execlockrelsinfo_list_copy;
+}
+
/*
* CachedPlanIsValid: test whether the rewritten querytree within a
* CachedPlanSource is currently valid (that is, not marked as being in need
@@ -1737,17 +1821,21 @@ QueryListGetPrimaryStmt(List *stmts)
/*
* AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
- * or release them if acquire is false.
+ *
+ * Returns a list of ExecLockRelsInfo nodes containing one element for each
+ * PlannedStmt in stmt_list or NULL if the latter is a utility statement or its
+ * containsInitialPruning is false.
*/
-static void
-AcquireExecutorLocks(List *stmt_list, bool acquire)
+static List *
+AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams)
{
ListCell *lc1;
+ List *execlockrelsinfo_list = NIL;
foreach(lc1, stmt_list)
{
PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
- ListCell *lc2;
+ ExecLockRelsInfo *execlockrelsinfo = NULL;
if (plannedstmt->commandType == CMD_UTILITY)
{
@@ -1761,27 +1849,139 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
Query *query = UtilityContainsQuery(plannedstmt->utilityStmt);
if (query)
- ScanQueryForLocks(query, acquire);
- continue;
+ ScanQueryForLocks(query, true);
}
-
- foreach(lc2, plannedstmt->rtable)
+ else
{
- RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
+ /*
+ * Figure out the set of relations that would need to be locked
+ * before executing the plan.
+ */
+ if (!plannedstmt->containsInitialPruning)
+ {
+ /*
+ * If the plan contains no initial pruning steps, just lock
+ * all the relations found in the range table.
+ */
+ ListCell *lc;
- if (rte->rtekind != RTE_RELATION)
- continue;
+ foreach(lc, plannedstmt->rtable)
+ {
+ RangeTblEntry *rte = lfirst(lc);
+
+ if (rte->rtekind != RTE_RELATION)
+ continue;
+
+ /*
+ * Acquire the appropriate type of lock on each relation
+ * OID. Note that we don't actually try to open the rel,
+ * and hence will not fail if it's been dropped entirely
+ * --- we'll just transiently acquire a non-conflicting
+ * lock.
+ */
+ LockRelationOid(rte->relid, rte->rellockmode);
+ }
+ }
+ else
+ {
+ int rti;
+ Bitmapset *lockrels;
+
+ /*
+ * Walk the plan tree to find only the minimal set of
+ * relations to be locked, considering the effect of performing
+ * initial partition pruning.
+ */
+ execlockrelsinfo = ExecutorGetLockRels(plannedstmt, boundParams);
+ lockrels = execlockrelsinfo->lockrels;
+
+ rti = -1;
+ while ((rti = bms_next_member(lockrels, rti)) >= 0)
+ {
+ RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
+ Assert(rte->rtekind == RTE_RELATION);
+
+ /* See the comment above. */
+ LockRelationOid(rte->relid, rte->rellockmode);
+ }
+ }
+ }
+
+ /*
+ * Remember ExecLockRelsInfo for later adding to the QueryDesc that
+ * will be passed to the executor when executing this plan. May be
+ * NULL, but must keep the list the same length as stmt_list.
+ */
+ execlockrelsinfo_list = lappend(execlockrelsinfo_list,
+ execlockrelsinfo);
+ }
+
+ return execlockrelsinfo_list;
+}
+
+/*
+ * ReleaseExecutorLocks
+ * Release locks that would've been acquired by an earlier call to
+ * AcquireExecutorLocks()
+ */
+static void
+ReleaseExecutorLocks(List *stmt_list, List *execlockrelsinfo_list)
+{
+ ListCell *lc1,
+ *lc2;
+
+ forboth(lc1, stmt_list, lc2, execlockrelsinfo_list)
+ {
+ PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
+ ExecLockRelsInfo *execlockrelsinfo = lfirst_node(ExecLockRelsInfo, lc2);
+
+ if (plannedstmt->commandType == CMD_UTILITY)
+ {
/*
- * Acquire the appropriate type of lock on each relation OID. Note
- * that we don't actually try to open the rel, and hence will not
- * fail if it's been dropped entirely --- we'll just transiently
- * acquire a non-conflicting lock.
+ * Ignore utility statements, except those (such as EXPLAIN) that
+ * contain a parsed-but-not-planned query. Note: it's okay to use
+ * ScanQueryForLocks, even though the query hasn't been through
+ * rule rewriting, because rewriting doesn't change the query
+ * representation.
*/
- if (acquire)
- LockRelationOid(rte->relid, rte->rellockmode);
+ Query *query = UtilityContainsQuery(plannedstmt->utilityStmt);
+
+ if (query)
+ ScanQueryForLocks(query, false);
+ }
+ else
+ {
+ if (execlockrelsinfo == NULL)
+ {
+ ListCell *lc;
+
+ foreach(lc, plannedstmt->rtable)
+ {
+ RangeTblEntry *rte = lfirst(lc);
+
+ if (rte->rtekind != RTE_RELATION)
+ continue;
+
+ UnlockRelationOid(rte->relid, rte->rellockmode);
+ }
+ }
else
- UnlockRelationOid(rte->relid, rte->rellockmode);
+ {
+ int rti;
+ Bitmapset *lockrels;
+
+ lockrels = execlockrelsinfo->lockrels;
+ rti = -1;
+ while ((rti = bms_next_member(lockrels, rti)) >= 0)
+ {
+ RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
+
+ Assert(rte->rtekind == RTE_RELATION);
+
+ UnlockRelationOid(rte->relid, rte->rellockmode);
+ }
+ }
}
}
}
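Since acquiring and releasing are now separate functions, the release path has to visit exactly the relations that the acquire path locked: the whole range table when there is no ExecLockRelsInfo, or only the members of lockrels when there is one. A toy model of that symmetry, as a sketch only: ToyRte, the uint32_t bitmask standing in for a Bitmapset, and the lock_counts array standing in for the lock manager are all assumptions made for illustration, not PostgreSQL APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_MAX_RTES 31         /* bit positions 1..31 of a uint32_t */

/* Hypothetical stand-in for RangeTblEntry. */
typedef struct ToyRte
{
    bool is_relation;           /* plays the role of rtekind == RTE_RELATION */
} ToyRte;

/* Toy lock table, indexed by 1-based range table index. */
static int lock_counts[TOY_MAX_RTES + 1];

/*
 * Visit the relations to lock or unlock and adjust their count by delta.
 * With have_lockrels false every relation RTE is visited; otherwise only
 * those whose bit is set in lockrels.
 */
static void
toy_visit(const ToyRte *rtable, int nrtes, bool have_lockrels,
          uint32_t lockrels, int delta)
{
    assert(nrtes <= TOY_MAX_RTES);

    for (int rti = 1; rti <= nrtes; rti++)
    {
        if (!rtable[rti - 1].is_relation)
            continue;
        if (have_lockrels && (lockrels & (UINT32_C(1) << rti)) == 0)
            continue;
        lock_counts[rti] += delta;
    }
}

static void
toy_acquire(const ToyRte *rtable, int nrtes, bool have_lockrels,
            uint32_t lockrels)
{
    toy_visit(rtable, nrtes, have_lockrels, lockrels, +1);
}

static void
toy_release(const ToyRte *rtable, int nrtes, bool have_lockrels,
            uint32_t lockrels)
{
    toy_visit(rtable, nrtes, have_lockrels, lockrels, -1);
}

/* Total number of toy locks currently held. */
static int
toy_locks_held(void)
{
    int total = 0;

    for (int rti = 1; rti <= TOY_MAX_RTES; rti++)
        total += lock_counts[rti];
    return total;
}
```

Sharing one visiting routine between the two directions is a design choice of this sketch, not of the patch, which keeps two separate loops. The property to preserve is the same either way: after a release that follows an acquire with the same arguments, no locks remain held.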
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index d549f66d4a..896f51be08 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -285,6 +285,7 @@ PortalDefineQuery(Portal portal,
const char *sourceText,
CommandTag commandTag,
List *stmts,
+ List *execlockrelsinfos,
CachedPlan *cplan)
{
AssertArg(PortalIsValid(portal));
@@ -299,6 +300,7 @@ PortalDefineQuery(Portal portal,
portal->qc.nprocessed = 0;
portal->commandTag = commandTag;
portal->stmts = stmts;
+ portal->execlockrelsinfos = execlockrelsinfos;
portal->cplan = cplan;
portal->status = PORTAL_DEFINED;
}
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 666977fb1f..fef75ba147 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -87,7 +87,8 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
ExplainState *es, const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv);
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt, ExecLockRelsInfo *execlockrelsinfo,
+ IntoClause *into,
ExplainState *es, const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv,
const instr_time *planduration,
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index fd5735a946..ded19b8cbb 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -124,4 +124,6 @@ extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
PartitionPruneInfo *pruneinfo,
Bitmapset **initially_valid_subplans);
extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate);
+extern Bitmapset *ExecGetLockRelsDoInitialPruning(Plan *plan, ExecGetLockRelsContext *context,
+ PartitionPruneInfo *pruneinfo);
#endif /* EXECPARTITION_H */
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index e79e2c001f..4338463479 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -35,6 +35,7 @@ typedef struct QueryDesc
/* These fields are provided by CreateQueryDesc */
CmdType operation; /* CMD_SELECT, CMD_UPDATE, etc. */
PlannedStmt *plannedstmt; /* planner's output (could be utility, too) */
+ ExecLockRelsInfo *execlockrelsinfo; /* ExecutorGetLockRels()'s output given plannedstmt */
const char *sourceText; /* source text of the query */
Snapshot snapshot; /* snapshot to use for query */
Snapshot crosscheck_snapshot; /* crosscheck for RI update/delete */
@@ -57,6 +58,7 @@ typedef struct QueryDesc
/* in pquery.c */
extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
+ ExecLockRelsInfo *execlockrelsinfo,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index 873772f188..d03bd5a026 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -185,6 +185,8 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
/*
* prototypes from functions in execMain.c
*/
+extern ExecLockRelsInfo *ExecutorGetLockRels(PlannedStmt *plannedstmt, ParamListInfo params);
+extern bool ExecGetLockRels(Plan *node, ExecGetLockRelsContext *context);
extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
extern void ExecutorRun(QueryDesc *queryDesc,
diff --git a/src/include/executor/nodeAppend.h b/src/include/executor/nodeAppend.h
index 4cb78ee5b6..b53535c2a4 100644
--- a/src/include/executor/nodeAppend.h
+++ b/src/include/executor/nodeAppend.h
@@ -17,6 +17,7 @@
#include "access/parallel.h"
#include "nodes/execnodes.h"
+extern bool ExecGetAppendLockRels(Append *node, ExecGetLockRelsContext *context);
extern AppendState *ExecInitAppend(Append *node, EState *estate, int eflags);
extern void ExecEndAppend(AppendState *node);
extern void ExecReScanAppend(AppendState *node);
diff --git a/src/include/executor/nodeMergeAppend.h b/src/include/executor/nodeMergeAppend.h
index 97fe3b0665..8eb4e9df93 100644
--- a/src/include/executor/nodeMergeAppend.h
+++ b/src/include/executor/nodeMergeAppend.h
@@ -16,6 +16,7 @@
#include "nodes/execnodes.h"
+extern bool ExecGetMergeAppendLockRels(MergeAppend *node, ExecGetLockRelsContext *context);
extern MergeAppendState *ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags);
extern void ExecEndMergeAppend(MergeAppendState *node);
extern void ExecReScanMergeAppend(MergeAppendState *node);
diff --git a/src/include/executor/nodeModifyTable.h b/src/include/executor/nodeModifyTable.h
index c318681b9a..287baf6257 100644
--- a/src/include/executor/nodeModifyTable.h
+++ b/src/include/executor/nodeModifyTable.h
@@ -19,6 +19,7 @@ extern void ExecComputeStoredGenerated(ResultRelInfo *resultRelInfo,
EState *estate, TupleTableSlot *slot,
CmdType cmdtype);
+extern bool ExecGetModifyTableLockRels(ModifyTable *plan, ExecGetLockRelsContext *context);
extern ModifyTableState *ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags);
extern void ExecEndModifyTable(ModifyTableState *node);
extern void ExecReScanModifyTable(ModifyTableState *node);
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index cbbcff81d2..ee0c73e9a4 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -596,6 +596,7 @@ typedef struct EState
struct ExecRowMark **es_rowmarks; /* Array of per-range-table-entry
* ExecRowMarks, or NULL if none */
PlannedStmt *es_plannedstmt; /* link to top of plan tree */
+ struct ExecLockRelsInfo *es_execlockrelsinfo; /* QueryDesc.execlockrelsinfo */
const char *es_sourceText; /* Source text from QueryDesc */
JunkFilter *es_junkFilter; /* top-level junk filter, if any */
@@ -984,6 +985,101 @@ typedef struct DomainConstraintState
*/
typedef TupleTableSlot *(*ExecProcNodeMtd) (struct PlanState *pstate);
+/*----------------
+ * ExecLockRelsInfo
+ *
+ * Result of performing ExecutorGetLockRels() for a given PlannedStmt
+ */
+typedef struct ExecLockRelsInfo
+{
+ NodeTag type;
+
+ /*
+ * Relations that must be locked to execute the plan tree contained in
+ * the PlannedStmt.
+ */
+ Bitmapset *lockrels;
+
+ /* PlannedStmt.numPlanNodes */
+ int numPlanNodes;
+
+ /*
+ * List of PlanInitPruningOutput, each representing the output of
+ * performing initial pruning on a given plan node, for all nodes in the
+ * plan tree that have been marked as needing initial pruning.
+ *
+ * 'ipoIndexes' is an array of 'numPlanNodes' elements, indexed with
+ * plan_node_id of the individual nodes in the plan tree, each a 1-based
+ * index into 'initPruningOutputs' list for a given plan node. 0 means
+ * that a given plan node has no entry in the list because of not needing
+ * any initial pruning done on it.
+ */
+ List *initPruningOutputs;
+ int *ipoIndexes;
+} ExecLockRelsInfo;
+
+/*----------------
+ * ExecGetLockRelsContext
+ *
+ * Information pertaining to ExecutorGetLockRels() invocation for a given
+ * plan.
+ */
+typedef struct ExecGetLockRelsContext
+{
+ NodeTag type;
+
+ PlannedStmt *stmt; /* target plan */
+ ParamListInfo params; /* EXTERN parameters available for pruning */
+
+ /* Output parameters for ExecGetLockRels and its subroutines. */
+ Bitmapset *lockrels;
+
+ /* See the comment in the definition of ExecLockRelsInfo struct. */
+ List *initPruningOutputs;
+ int *ipoIndexes;
+} ExecGetLockRelsContext;
+
+/*
+ * Appends the provided PlanInitPruningOutput to
+ * ExecGetLockRelsContext.initPruningOutputs
+ */
+#define ExecStorePlanInitPruningOutput(cxt, initPruningOutput, plannode) \
+ do { \
+ (cxt)->initPruningOutputs = lappend((cxt)->initPruningOutputs, initPruningOutput); \
+ (cxt)->ipoIndexes[(plannode)->plan_node_id] = list_length((cxt)->initPruningOutputs); \
+ } while (0)
+
+/*
+ * Finds the PlanInitPruningOutput for a given Plan node in
+ * ExecLockRelsInfo.initPruningOutputs.
+ */
+#define ExecFetchPlanInitPruningOutput(execlockrelsinfo, plannode) \
+ (((execlockrelsinfo) != NULL && (execlockrelsinfo)->initPruningOutputs != NIL) ? \
+ list_nth((execlockrelsinfo)->initPruningOutputs, \
+ (execlockrelsinfo)->ipoIndexes[(plannode)->plan_node_id] - 1) : NULL)
+
+/* ---------------
+ * PlanInitPruningOutput
+ *
+ * Node to remember the result of performing initial partition pruning steps
+ * during ExecutorGetLockRels() on nodes that support pruning.
+ *
+ * ExecGetLockRelsDoInitialPruning(), which runs during ExecutorGetLockRels(),
+ * creates it and stores it in the corresponding ExecLockRelsInfo.
+ *
+ * ExecInitPartitionPruning(), which runs during ExecutorStart(), fetches it
+ * from the EState's ExecLockRelsInfo (if any) and uses the value of
+ * initially_valid_subplans contained in it as-is to select the subplans to be
+ * initialized for execution, instead of re-evaluating that by performing
+ * initial pruning again.
+ */
+typedef struct PlanInitPruningOutput
+{
+ NodeTag type;
+
+ Bitmapset *initially_valid_subplans;
+} PlanInitPruningOutput;
+
/* ----------------
* PlanState node
*
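The ipoIndexes scheme described in the ExecLockRelsInfo comment above (an array indexed by plan_node_id, holding a 1-based position in initPruningOutputs, with 0 meaning "no entry") can be sketched without any PostgreSQL types. Everything below is hypothetical: ToyPruningTable, toy_store_output and toy_fetch_output are illustration-only names, fixed-size arrays replace the List, and strings replace PlanInitPruningOutput nodes. As an assumption of this sketch, the fetch side checks for 0 explicitly before subtracting 1.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_NODES   16
#define TOY_MAX_OUTPUTS 8

/* Hypothetical stand-in for the relevant fields of ExecLockRelsInfo. */
typedef struct ToyPruningTable
{
    int         num_plan_nodes;                 /* like numPlanNodes */
    int         ipo_indexes[TOY_MAX_NODES];     /* 0 = none, else 1-based */
    const char *outputs[TOY_MAX_OUTPUTS];       /* like initPruningOutputs */
    int         num_outputs;
} ToyPruningTable;

/* Append an output and remember its 1-based position for the plan node. */
static void
toy_store_output(ToyPruningTable *t, int plan_node_id, const char *output)
{
    assert(t->num_plan_nodes <= TOY_MAX_NODES);
    assert(plan_node_id >= 0 && plan_node_id < t->num_plan_nodes);
    assert(t->num_outputs < TOY_MAX_OUTPUTS);

    t->outputs[t->num_outputs++] = output;
    t->ipo_indexes[plan_node_id] = t->num_outputs;
}

/* Return the output stored for the plan node, or NULL if there is none. */
static const char *
toy_fetch_output(const ToyPruningTable *t, int plan_node_id)
{
    int idx;

    if (t == NULL || t->num_outputs == 0)
        return NULL;

    assert(plan_node_id >= 0 && plan_node_id < t->num_plan_nodes);
    idx = t->ipo_indexes[plan_node_id];

    /* 0 means no initial pruning was done for this node. */
    return (idx == 0) ? NULL : t->outputs[idx - 1];
}
```

Using 0 as the "none" marker lets the index array be zero-initialized, at the cost of the off-by-one adjustment on every lookup.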
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 53f6b05a3f..928a30c7c6 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -97,6 +97,11 @@ typedef enum NodeTag
T_PartitionPruneStepCombine,
T_PlanInvalItem,
+ /* TAGS FOR EXECUTOR PREP NODES (execnodes.h) */
+ T_ExecGetLockRelsContext,
+ T_ExecLockRelsInfo,
+ T_PlanInitPruningOutput,
+
/*
* TAGS FOR PLAN STATE NODES (execnodes.h)
*
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index ef9b54739a..0ed171d3f5 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -129,6 +129,10 @@ typedef struct PlannerGlobal
char maxParallelHazard; /* worst PROPARALLEL hazard level */
+ bool containsInitialPruning; /* Do some Plan nodes in the tree
+ * have initial (pre-exec) pruning
+ * steps? */
+
PartitionDirectory partition_directory; /* partition descriptors */
Bitmapset *elidedAppendPartedRels; /* Combined partitioned_rels of all
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index a823c7c20d..4fcba0e55c 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -60,10 +60,16 @@ typedef struct PlannedStmt
bool parallelModeNeeded; /* parallel mode required to execute? */
+ bool containsInitialPruning; /* Do some Plan nodes in the tree
+ * have initial (pre-exec) pruning
+ * steps? */
+
int jitFlags; /* which forms of JIT should be performed */
struct Plan *planTree; /* tree of Plan nodes */
+ int numPlanNodes; /* number of nodes in planTree */
+
List *rtable; /* list of RangeTblEntry nodes */
/* rtable indexes of target relations for INSERT/UPDATE/DELETE */
@@ -1192,6 +1198,13 @@ typedef struct PlanRowMark
* prune_infos List of Lists containing PartitionedRelPruneInfo nodes,
* one sublist per run-time-prunable partition hierarchy
* appearing in the parent plan node's subplans.
+ *
+ * needs_init_pruning Does any of the PartitionedRelPruneInfos in
+ * prune_infos have its initial_pruning_steps set?
+ *
+ * needs_exec_pruning Does any of the PartitionedRelPruneInfos in
+ * prune_infos have its exec_pruning_steps set?
+ *
* other_subplans Indexes of any subplans that are not accounted for
* by any of the PartitionedRelPruneInfo nodes in
* "prune_infos". These subplans must not be pruned.
@@ -1200,6 +1213,8 @@ typedef struct PartitionPruneInfo
{
NodeTag type;
List *prune_infos;
+ bool needs_init_pruning;
+ bool needs_exec_pruning;
Bitmapset *other_subplans;
} PartitionPruneInfo;
diff --git a/src/include/tcop/tcopprot.h b/src/include/tcop/tcopprot.h
index 92291a750d..bf80c53bed 100644
--- a/src/include/tcop/tcopprot.h
+++ b/src/include/tcop/tcopprot.h
@@ -64,7 +64,7 @@ extern PlannedStmt *pg_plan_query(Query *querytree, const char *query_string,
ParamListInfo boundParams);
extern List *pg_plan_queries(List *querytrees, const char *query_string,
int cursorOptions,
- ParamListInfo boundParams);
+ ParamListInfo boundParams, List **execlockrelsinfo_list);
extern bool check_max_stack_depth(int *newval, void **extra, GucSource source);
extern void assign_max_stack_depth(int newval, void *extra);
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 95b99e3d25..56b0dcc6bd 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -148,6 +148,9 @@ typedef struct CachedPlan
{
int magic; /* should equal CACHEDPLAN_MAGIC */
List *stmt_list; /* list of PlannedStmts */
+ List *execlockrelsinfo_list; /* list of ExecLockRelsInfo nodes with one
+ * element for each of stmt_list; NIL
+ * if not a generic plan */
bool is_oneshot; /* is it a "oneshot" plan? */
bool is_saved; /* is CachedPlan in a long-lived context? */
bool is_valid; /* is the stmt_list currently valid? */
@@ -158,6 +161,9 @@ typedef struct CachedPlan
int generation; /* parent's generation number for this plan */
int refcount; /* count of live references to this struct */
MemoryContext context; /* context containing this CachedPlan */
+ MemoryContext execlockrelsinfo_context; /* context containing
+ * execlockrelsinfo_list,
+ * a child of the above context */
} CachedPlan;
/*
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index aeddbdafe5..9abace6734 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -137,6 +137,10 @@ typedef struct PortalData
CommandTag commandTag; /* command tag for original query */
QueryCompletion qc; /* command completion data for executed query */
List *stmts; /* list of PlannedStmts */
+ List *execlockrelsinfos; /* list of ExecLockRelsInfo nodes with one element
+ * for each of 'stmts'; same as
+ * cplan->execlockrelsinfo_list if cplan is
+ * not NULL */
CachedPlan *cplan; /* CachedPlan, if stmts are from one */
ParamListInfo portalParams; /* params to pass to query */
@@ -241,6 +245,7 @@ extern void PortalDefineQuery(Portal portal,
const char *sourceText,
CommandTag commandTag,
List *stmts,
+ List *execlockrelsinfos,
CachedPlan *cplan);
extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
extern void PortalCreateHoldStore(Portal portal);
--
2.24.1
[application/octet-stream] v8-0001-Some-refactoring-of-runtime-pruning-code.patch (26.5K, 3-v8-0001-Some-refactoring-of-runtime-pruning-code.patch)
From ce2041b254a7fee3097012f11685b635d58fb9b2 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Wed, 2 Mar 2022 15:17:55 +0900
Subject: [PATCH v8 1/4] Some refactoring of runtime pruning code
This does two things mainly:
* Move the execution pruning initialization steps that are common
between both ExecInitAppend() and ExecInitMergeAppend() into a new
function ExecInitPartitionPruning() defined in execPartition.c.
Thus, ExecCreatePartitionPruneState() and
ExecFindInitialMatchingSubPlans() need not be exported.
* Add an ExprContext field to PartitionPruneContext to remove the
implicit assumption in the runtime pruning code that the ExprContext
needed to evaluate pruning expressions is always available from the
PlanState. A future patch will allow runtime pruning (at least the
initial pruning steps) to be performed before the corresponding
PlanState has been created, so this will help.
---
src/backend/executor/execPartition.c | 340 ++++++++++++++++---------
src/backend/executor/nodeAppend.c | 33 +--
src/backend/executor/nodeMergeAppend.c | 32 +--
src/backend/partitioning/partprune.c | 20 +-
src/include/executor/execPartition.h | 9 +-
src/include/partitioning/partprune.h | 2 +
6 files changed, 252 insertions(+), 184 deletions(-)
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index aca42ca5b8..84b4e4b3d6 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -184,11 +184,18 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
int maxfieldlen);
static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
+static PartitionPruneState *ExecCreatePartitionPruneState(PlanState *planstate,
+ PartitionPruneInfo *partitionpruneinfo);
+static Bitmapset *ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate);
static void ExecInitPruningContext(PartitionPruneContext *context,
List *pruning_steps,
PartitionDesc partdesc,
PartitionKey partkey,
- PlanState *planstate);
+ PlanState *planstate,
+ ExprContext *econtext);
+static void PartitionPruneStateFixSubPlanMap(PartitionPruneState *prunestate,
+ Bitmapset *initially_valid_subplans,
+ int n_total_subplans);
static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
@@ -1590,30 +1597,86 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
*
* Functions:
*
- * ExecCreatePartitionPruneState:
+ * ExecInitPartitionPruning:
* Creates the PartitionPruneState required by each of the two pruning
* functions. Details stored include how to map the partition index
- * returned by the partition pruning code into subplan indexes.
- *
- * ExecFindInitialMatchingSubPlans:
- * Returns indexes of matching subplans. Partition pruning is attempted
- * without any evaluation of expressions containing PARAM_EXEC Params.
- * This function must be called during executor startup for the parent
- * plan before the subplans themselves are initialized. Subplans which
- * are found not to match by this function must be removed from the
- * plan's list of subplans during execution, as this function performs a
- * remap of the partition index to subplan index map and the newly
- * created map provides indexes only for subplans which remain after
- * calling this function.
+ * returned by the partition pruning code into subplan indexes. Also
+ * determines the set of initially valid subplans by performing initial
+ * pruning steps; the caller, such as ExecInitAppend, need initialize
+ * only those. Maps in PartitionPruneState are updated to account for
+ * initial pruning having eliminated some of the subplans, if any.
*
* ExecFindMatchingSubPlans:
* Returns indexes of matching subplans after evaluating all available
- * expressions. This function can only be called during execution and
- * must be called again each time the value of a Param listed in
- * PartitionPruneState's 'execparamids' changes.
+ * expressions, that is, using execution pruning steps. This function
+ * can only be called during execution and must be called again each time
+ * the value of a Param listed in PartitionPruneState's 'execparamids'
+ * changes.
*-------------------------------------------------------------------------
*/
+/*
+ * ExecInitPartitionPruning
+ * Initialize data structure needed for run-time partition pruning
+ *
+ * Initial pruning can be done immediately, so it is done here if needed and
+ * the set of indexes of the surviving subplans is returned in the output
+ * parameter *initially_valid_subplans.
+ *
+ * If subplans are indeed pruned, subplan_map arrays contained in the returned
+ * PartitionPruneState are re-sequenced to not count those, though only if the
+ * maps will be needed for subsequent execution pruning passes.
+ */
+PartitionPruneState *
+ExecInitPartitionPruning(PlanState *planstate,
+ int n_total_subplans,
+ PartitionPruneInfo *pruneinfo,
+ Bitmapset **initially_valid_subplans)
+{
+ PartitionPruneState *prunestate;
+ EState *estate = planstate->state;
+
+ /* We may need an expression context to evaluate partition exprs */
+ ExecAssignExprContext(estate, planstate);
+
+ /*
+ * Create the working data structure for pruning.
+ */
+ prunestate = ExecCreatePartitionPruneState(planstate, pruneinfo);
+
+ /*
+ * Perform an initial partition prune, if required.
+ */
+ if (prunestate->do_initial_prune)
+ {
+ /* Determine which subplans survive initial pruning */
+ *initially_valid_subplans = ExecFindInitialMatchingSubPlans(prunestate);
+ }
+ else
+ {
+ /* We'll need to initialize all subplans */
+ Assert(n_total_subplans > 0);
+ *initially_valid_subplans = bms_add_range(NULL, 0,
+ n_total_subplans - 1);
+ }
+
+ /*
+ * Re-sequence subplan indexes contained in prunestate to account for any
+ * that were removed above due to initial pruning.
+ *
+ * We can safely skip this when !do_exec_prune, even though that leaves
+ * invalid data in prunestate, because that data won't be consulted again
+ * (cf initial Assert in ExecFindMatchingSubPlans).
+ */
+ if (prunestate->do_exec_prune &&
+ bms_num_members(*initially_valid_subplans) < n_total_subplans)
+ PartitionPruneStateFixSubPlanMap(prunestate,
+ *initially_valid_subplans,
+ n_total_subplans);
+
+ return prunestate;
+}
+
/*
* ExecCreatePartitionPruneState
* Build the data structure required for calling
@@ -1632,7 +1695,7 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
* re-used each time we re-evaluate which partitions match the pruning steps
* provided in each PartitionedRelPruneInfo.
*/
-PartitionPruneState *
+static PartitionPruneState *
ExecCreatePartitionPruneState(PlanState *planstate,
PartitionPruneInfo *partitionpruneinfo)
{
@@ -1641,6 +1704,7 @@ ExecCreatePartitionPruneState(PlanState *planstate,
int n_part_hierarchies;
ListCell *lc;
int i;
+ ExprContext *econtext = planstate->ps_ExprContext;
/* For data reading, executor always omits detached partitions */
if (estate->es_partition_directory == NULL)
@@ -1814,7 +1878,8 @@ ExecCreatePartitionPruneState(PlanState *planstate,
{
ExecInitPruningContext(&pprune->initial_context,
pinfo->initial_pruning_steps,
- partdesc, partkey, planstate);
+ partdesc, partkey, planstate,
+ econtext);
/* Record whether initial pruning is needed at any level */
prunestate->do_initial_prune = true;
}
@@ -1823,7 +1888,8 @@ ExecCreatePartitionPruneState(PlanState *planstate,
{
ExecInitPruningContext(&pprune->exec_context,
pinfo->exec_pruning_steps,
- partdesc, partkey, planstate);
+ partdesc, partkey, planstate,
+ econtext);
/* Record whether exec pruning is needed at any level */
prunestate->do_exec_prune = true;
}
@@ -1851,7 +1917,8 @@ ExecInitPruningContext(PartitionPruneContext *context,
List *pruning_steps,
PartitionDesc partdesc,
PartitionKey partkey,
- PlanState *planstate)
+ PlanState *planstate,
+ ExprContext *econtext)
{
int n_steps;
int partnatts;
@@ -1872,6 +1939,7 @@ ExecInitPruningContext(PartitionPruneContext *context,
context->ppccontext = CurrentMemoryContext;
context->planstate = planstate;
+ context->exprcontext = econtext;
/* Initialize expression state for each expression we need */
context->exprstates = (ExprState **)
@@ -1900,8 +1968,20 @@ ExecInitPruningContext(PartitionPruneContext *context,
step->step.step_id,
keyno);
- context->exprstates[stateidx] =
- ExecInitExpr(expr, context->planstate);
+ /*
+ * When planstate is NULL, pruning_steps is known not to
+ * contain any expressions that depend on the parent plan.
+ * Information about any available EXTERN parameters must be
+ * passed explicitly in that case; the caller must have made
+ * it available via econtext.
+ */
+ if (planstate == NULL)
+ context->exprstates[stateidx] =
+ ExecInitExprWithParams(expr,
+ econtext->ecxt_param_list_info);
+ else
+ context->exprstates[stateidx] =
+ ExecInitExpr(expr, context->planstate);
}
keyno++;
}
@@ -1914,18 +1994,11 @@ ExecInitPruningContext(PartitionPruneContext *context,
* pruning, disregarding any pruning constraints involving PARAM_EXEC
* Params.
*
- * If additional pruning passes will be required (because of PARAM_EXEC
- * Params), we must also update the translation data that allows conversion
- * of partition indexes into subplan indexes to account for the unneeded
- * subplans having been removed.
- *
* Must only be called once per 'prunestate', and only if initial pruning
* is required.
- *
- * 'nsubplans' must be passed as the total number of unpruned subplans.
*/
-Bitmapset *
-ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate, int nsubplans)
+static Bitmapset *
+ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate)
{
Bitmapset *result = NULL;
MemoryContext oldcontext;
@@ -1950,14 +2023,20 @@ ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate, int nsubplans)
PartitionedRelPruningData *pprune;
prunedata = prunestate->partprunedata[i];
+
+ /*
+ * We pass the 1st item belonging to the root table of the hierarchy
+ * and find_matching_subplans_recurse() takes care of recursing to
+ * other (lower-level) parents as needed.
+ */
pprune = &prunedata->partrelprunedata[0];
/* Perform pruning without using PARAM_EXEC Params */
find_matching_subplans_recurse(prunedata, pprune, true, &result);
- /* Expression eval may have used space in node's ps_ExprContext too */
+ /* Expression eval may have used space in ExprContext too */
if (pprune->initial_pruning_steps)
- ResetExprContext(pprune->initial_context.planstate->ps_ExprContext);
+ ResetExprContext(pprune->initial_context.exprcontext);
}
/* Add in any subplans that partition pruning didn't account for */
@@ -1970,118 +2049,120 @@ ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate, int nsubplans)
MemoryContextReset(prunestate->prune_context);
+ return result;
+}
+
+/*
+ * PartitionPruneStateFixSubPlanMap
+ * Fix mapping of partition indexes to subplan indexes contained in
+ * prunestate by considering the new list of subplans that survived
+ * initial pruning
+ *
+ * Subplans that were previously indexed 0..(n_total_subplans - 1) are
+ * re-indexed to the range 0..(num(initially_valid_subplans) - 1).
+ */
+static void
+PartitionPruneStateFixSubPlanMap(PartitionPruneState *prunestate,
+ Bitmapset *initially_valid_subplans,
+ int n_total_subplans)
+{
+ int *new_subplan_indexes;
+ Bitmapset *new_other_subplans;
+ int i;
+ int newidx;
+
/*
- * If exec-time pruning is required and we pruned subplans above, then we
- * must re-sequence the subplan indexes so that ExecFindMatchingSubPlans
- * properly returns the indexes from the subplans which will remain after
- * execution of this function.
- *
- * We can safely skip this when !do_exec_prune, even though that leaves
- * invalid data in prunestate, because that data won't be consulted again
- * (cf initial Assert in ExecFindMatchingSubPlans).
+ * First we must build a temporary array which maps old subplan
+ * indexes to new ones. For convenience of initialization, we use
+ * 1-based indexes in this array and leave pruned items as 0.
*/
- if (prunestate->do_exec_prune && bms_num_members(result) < nsubplans)
+ new_subplan_indexes = (int *) palloc0(sizeof(int) * n_total_subplans);
+ newidx = 1;
+ i = -1;
+ while ((i = bms_next_member(initially_valid_subplans, i)) >= 0)
{
- int *new_subplan_indexes;
- Bitmapset *new_other_subplans;
- int i;
- int newidx;
+ Assert(i < n_total_subplans);
+ new_subplan_indexes[i] = newidx++;
+ }
- /*
- * First we must build a temporary array which maps old subplan
- * indexes to new ones. For convenience of initialization, we use
- * 1-based indexes in this array and leave pruned items as 0.
- */
- new_subplan_indexes = (int *) palloc0(sizeof(int) * nsubplans);
- newidx = 1;
- i = -1;
- while ((i = bms_next_member(result, i)) >= 0)
- {
- Assert(i < nsubplans);
- new_subplan_indexes[i] = newidx++;
- }
+ /*
+ * Now we can update each PartitionedRelPruneInfo's subplan_map with
+ * new subplan indexes. We must also recompute its present_parts
+ * bitmap.
+ */
+ for (i = 0; i < prunestate->num_partprunedata; i++)
+ {
+ PartitionPruningData *prunedata = prunestate->partprunedata[i];
+ int j;
/*
- * Now we can update each PartitionedRelPruneInfo's subplan_map with
- * new subplan indexes. We must also recompute its present_parts
- * bitmap.
+ * Within each hierarchy, we perform this loop in back-to-front
+ * order so that we determine present_parts for the lowest-level
+ * partitioned tables first. This way we can tell whether a
+ * sub-partitioned table's partitions were entirely pruned so we
+ * can exclude it from the current level's present_parts.
*/
- for (i = 0; i < prunestate->num_partprunedata; i++)
+ for (j = prunedata->num_partrelprunedata - 1; j >= 0; j--)
{
- PartitionPruningData *prunedata = prunestate->partprunedata[i];
- int j;
+ PartitionedRelPruningData *pprune = &prunedata->partrelprunedata[j];
+ int nparts = pprune->nparts;
+ int k;
- /*
- * Within each hierarchy, we perform this loop in back-to-front
- * order so that we determine present_parts for the lowest-level
- * partitioned tables first. This way we can tell whether a
- * sub-partitioned table's partitions were entirely pruned so we
- * can exclude it from the current level's present_parts.
- */
- for (j = prunedata->num_partrelprunedata - 1; j >= 0; j--)
- {
- PartitionedRelPruningData *pprune = &prunedata->partrelprunedata[j];
- int nparts = pprune->nparts;
- int k;
+ /* We just rebuild present_parts from scratch */
+ bms_free(pprune->present_parts);
+ pprune->present_parts = NULL;
- /* We just rebuild present_parts from scratch */
- bms_free(pprune->present_parts);
- pprune->present_parts = NULL;
+ for (k = 0; k < nparts; k++)
+ {
+ int oldidx = pprune->subplan_map[k];
+ int subidx;
- for (k = 0; k < nparts; k++)
+ /*
+ * If this partition existed as a subplan then change the
+ * old subplan index to the new subplan index. The new
+ * index may become -1 if the partition was pruned above,
+ * or it may just come earlier in the subplan list due to
+ * some subplans being removed earlier in the list. If
+ * it's a subpartition, add it to present_parts unless
+ * it's entirely pruned.
+ */
+ if (oldidx >= 0)
{
- int oldidx = pprune->subplan_map[k];
- int subidx;
-
- /*
- * If this partition existed as a subplan then change the
- * old subplan index to the new subplan index. The new
- * index may become -1 if the partition was pruned above,
- * or it may just come earlier in the subplan list due to
- * some subplans being removed earlier in the list. If
- * it's a subpartition, add it to present_parts unless
- * it's entirely pruned.
- */
- if (oldidx >= 0)
- {
- Assert(oldidx < nsubplans);
- pprune->subplan_map[k] = new_subplan_indexes[oldidx] - 1;
+ Assert(oldidx < n_total_subplans);
+ pprune->subplan_map[k] = new_subplan_indexes[oldidx] - 1;
- if (new_subplan_indexes[oldidx] > 0)
- pprune->present_parts =
- bms_add_member(pprune->present_parts, k);
- }
- else if ((subidx = pprune->subpart_map[k]) >= 0)
- {
- PartitionedRelPruningData *subprune;
+ if (new_subplan_indexes[oldidx] > 0)
+ pprune->present_parts =
+ bms_add_member(pprune->present_parts, k);
+ }
+ else if ((subidx = pprune->subpart_map[k]) >= 0)
+ {
+ PartitionedRelPruningData *subprune;
- subprune = &prunedata->partrelprunedata[subidx];
+ subprune = &prunedata->partrelprunedata[subidx];
- if (!bms_is_empty(subprune->present_parts))
- pprune->present_parts =
- bms_add_member(pprune->present_parts, k);
- }
+ if (!bms_is_empty(subprune->present_parts))
+ pprune->present_parts =
+ bms_add_member(pprune->present_parts, k);
}
}
}
+ }
- /*
- * We must also recompute the other_subplans set, since indexes in it
- * may change.
- */
- new_other_subplans = NULL;
- i = -1;
- while ((i = bms_next_member(prunestate->other_subplans, i)) >= 0)
- new_other_subplans = bms_add_member(new_other_subplans,
- new_subplan_indexes[i] - 1);
-
- bms_free(prunestate->other_subplans);
- prunestate->other_subplans = new_other_subplans;
+ /*
+ * We must also recompute the other_subplans set, since indexes in it
+ * may change.
+ */
+ new_other_subplans = NULL;
+ i = -1;
+ while ((i = bms_next_member(prunestate->other_subplans, i)) >= 0)
+ new_other_subplans = bms_add_member(new_other_subplans,
+ new_subplan_indexes[i] - 1);
- pfree(new_subplan_indexes);
- }
+ bms_free(prunestate->other_subplans);
+ prunestate->other_subplans = new_other_subplans;
- return result;
+ pfree(new_subplan_indexes);
}
/*
@@ -2123,11 +2204,16 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate)
prunedata = prunestate->partprunedata[i];
pprune = &prunedata->partrelprunedata[0];
+ /*
+ * We pass the 1st item belonging to the root table of the hierarchy
+ * and find_matching_subplans_recurse() takes care of recursing to
+ * other (lower-level) parents as needed.
+ */
find_matching_subplans_recurse(prunedata, pprune, false, &result);
- /* Expression eval may have used space in node's ps_ExprContext too */
+ /* Expression eval may have used space in ExprContext too */
if (pprune->exec_pruning_steps)
- ResetExprContext(pprune->exec_context.planstate->ps_ExprContext);
+ ResetExprContext(pprune->exec_context.exprcontext);
}
/* Add in any subplans that partition pruning didn't account for */
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 7937f1c88f..5b6d3eb23b 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -138,30 +138,17 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
{
PartitionPruneState *prunestate;
- /* We may need an expression context to evaluate partition exprs */
- ExecAssignExprContext(estate, &appendstate->ps);
-
- /* Create the working data structure for pruning. */
- prunestate = ExecCreatePartitionPruneState(&appendstate->ps,
- node->part_prune_info);
+ /*
+ * Set up pruning data structure. Initial pruning steps, if any, are
+ * performed as part of the setup, adding the set of indexes of
+ * surviving subplans to 'validsubplans'.
+ */
+ prunestate = ExecInitPartitionPruning(&appendstate->ps,
+ list_length(node->appendplans),
+ node->part_prune_info,
+ &validsubplans);
appendstate->as_prune_state = prunestate;
-
- /* Perform an initial partition prune, if required. */
- if (prunestate->do_initial_prune)
- {
- /* Determine which subplans survive initial pruning */
- validsubplans = ExecFindInitialMatchingSubPlans(prunestate,
- list_length(node->appendplans));
-
- nplans = bms_num_members(validsubplans);
- }
- else
- {
- /* We'll need to initialize all subplans */
- nplans = list_length(node->appendplans);
- Assert(nplans > 0);
- validsubplans = bms_add_range(NULL, 0, nplans - 1);
- }
+ nplans = bms_num_members(validsubplans);
/*
* When no run-time pruning is required and there's at least one
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index 418f89dea8..9a9f29e845 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -86,29 +86,17 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
{
PartitionPruneState *prunestate;
- /* We may need an expression context to evaluate partition exprs */
- ExecAssignExprContext(estate, &mergestate->ps);
-
- prunestate = ExecCreatePartitionPruneState(&mergestate->ps,
- node->part_prune_info);
+ /*
+ * Set up pruning data structure. Initial pruning steps, if any, are
+ * performed as part of the setup, adding the set of indexes of
+ * surviving subplans to 'validsubplans'.
+ */
+ prunestate = ExecInitPartitionPruning(&mergestate->ps,
+ list_length(node->mergeplans),
+ node->part_prune_info,
+ &validsubplans);
mergestate->ms_prune_state = prunestate;
-
- /* Perform an initial partition prune, if required. */
- if (prunestate->do_initial_prune)
- {
- /* Determine which subplans survive initial pruning */
- validsubplans = ExecFindInitialMatchingSubPlans(prunestate,
- list_length(node->mergeplans));
-
- nplans = bms_num_members(validsubplans);
- }
- else
- {
- /* We'll need to initialize all subplans */
- nplans = list_length(node->mergeplans);
- Assert(nplans > 0);
- validsubplans = bms_add_range(NULL, 0, nplans - 1);
- }
+ nplans = bms_num_members(validsubplans);
/*
* When no run-time pruning is required and there's at least one
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 1bc00826c1..7080cb25d9 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -798,6 +798,7 @@ prune_append_rel_partitions(RelOptInfo *rel)
/* These are not valid when being called from the planner */
context.planstate = NULL;
+ context.exprcontext = NULL;
context.exprstates = NULL;
/* Actual pruning happens here. */
@@ -808,8 +809,8 @@ prune_append_rel_partitions(RelOptInfo *rel)
* get_matching_partitions
* Determine partitions that survive partition pruning
*
- * Note: context->planstate must be set to a valid PlanState when the
- * pruning_steps were generated with a target other than PARTTARGET_PLANNER.
+ * Note: context->exprcontext must be valid when the pruning_steps were
+ * generated with a target other than PARTTARGET_PLANNER.
*
* Returns a Bitmapset of the RelOptInfo->part_rels indexes of the surviving
* partitions.
@@ -3654,7 +3655,7 @@ match_boolean_partition_clause(Oid partopfamily, Expr *clause, Expr *partkey,
* exprstate array.
*
* Note that the evaluated result may be in the per-tuple memory context of
- * context->planstate->ps_ExprContext, and we may have leaked other memory
+ * context->exprcontext, and we may have leaked other memory
* there too. This memory must be recovered by resetting that ExprContext
* after we're done with the pruning operation (see execPartition.c).
*/
@@ -3677,13 +3678,18 @@ partkey_datum_from_expr(PartitionPruneContext *context,
ExprContext *ectx;
/*
- * We should never see a non-Const in a step unless we're running in
- * the executor.
+ * We should never see a non-Const in a step unless the caller has
+ * passed a valid ExprContext.
+ *
+ * When context->planstate is valid, context->exprcontext is the same
+ * as context->planstate->ps_ExprContext.
*/
- Assert(context->planstate != NULL);
+ Assert(context->planstate != NULL || context->exprcontext != NULL);
+ Assert(context->planstate == NULL ||
+ (context->exprcontext == context->planstate->ps_ExprContext));
exprstate = context->exprstates[stateidx];
- ectx = context->planstate->ps_ExprContext;
+ ectx = context->exprcontext;
*value = ExecEvalExprSwitchContext(exprstate, ectx, isnull);
}
}
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 603d8becc4..fd5735a946 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -119,10 +119,9 @@ extern ResultRelInfo *ExecFindPartition(ModifyTableState *mtstate,
EState *estate);
extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
PartitionTupleRouting *proute);
-extern PartitionPruneState *ExecCreatePartitionPruneState(PlanState *planstate,
- PartitionPruneInfo *partitionpruneinfo);
+extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
+ int n_total_subplans,
+ PartitionPruneInfo *pruneinfo,
+ Bitmapset **initially_valid_subplans);
extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate);
-extern Bitmapset *ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate,
- int nsubplans);
-
#endif /* EXECPARTITION_H */
diff --git a/src/include/partitioning/partprune.h b/src/include/partitioning/partprune.h
index ee11b6feae..90684efa25 100644
--- a/src/include/partitioning/partprune.h
+++ b/src/include/partitioning/partprune.h
@@ -41,6 +41,7 @@ struct RelOptInfo;
* subsidiary data, such as the FmgrInfos.
* planstate Points to the parent plan node's PlanState when called
* during execution; NULL when called from the planner.
+ * exprcontext ExprContext to use when evaluating pruning expressions
* exprstates Array of ExprStates, indexed as per PruneCxtStateIdx; one
* for each partition key in each pruning step. Allocated if
* planstate is non-NULL, otherwise NULL.
@@ -56,6 +57,7 @@ typedef struct PartitionPruneContext
FmgrInfo *stepcmpfuncs;
MemoryContext ppccontext;
PlanState *planstate;
+ ExprContext *exprcontext;
ExprState **exprstates;
} PartitionPruneContext;
--
2.24.1
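
For readers following along, here is a rough standalone sketch of the index re-sequencing that PartitionPruneStateFixSubPlanMap performs in the patch above. It is not PostgreSQL code: the function name resequence_subplan_map, the plain bool array standing in for the Bitmapset, and the flat subplan_map are all simplifications assumed for illustration, and it leaves out present_parts and other_subplans entirely. It only shows the trick of using a temporary 1-based array so that pruned entries fall out as -1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Toy model, not PostgreSQL code.
 *
 * valid[i] says whether old subplan i survived initial pruning.
 * subplan_map[k] holds the old subplan index of partition k, or -1 if
 * the partition has no subplan.
 *
 * Rewrites subplan_map in place so that it refers to the compacted
 * subplan list, and returns the number of surviving subplans, or -1 if
 * the temporary array could not be allocated.
 */
int
resequence_subplan_map(const bool *valid, int n_total_subplans,
                       int *subplan_map, int nparts)
{
    int *new_idx;
    int  next = 1;      /* 1-based so that 0 can mean "pruned" */
    int  i;

    assert(n_total_subplans > 0);

    /* calloc zero-fills, so pruned subplans stay at 0 */
    new_idx = calloc((size_t) n_total_subplans, sizeof(int));
    if (new_idx == NULL)
        return -1;

    for (i = 0; i < n_total_subplans; i++)
    {
        if (valid[i])
            new_idx[i] = next++;
    }

    for (i = 0; i < nparts; i++)
    {
        int oldidx = subplan_map[i];

        if (oldidx >= 0)
        {
            assert(oldidx < n_total_subplans);
            /* pruned: 0 - 1 = -1; surviving: back to 0-based */
            subplan_map[i] = new_idx[oldidx] - 1;
        }
    }

    free(new_idx);
    return next - 1;
}
```

With five subplans of which 0, 2 and 4 survive, a map of {0, 1, 2, -1, 4, 3} would come out as {0, -1, 1, -1, 2, -1}, which is the kind of compaction the patch relies on before any exec-time pruning pass.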
[application/octet-stream] v8-0003-Add-a-plan_tree_walker.patch (3.9K, 4-v8-0003-Add-a-plan_tree_walker.patch)
From 3f3bfe578401c43e578196f46f2bad7d3071411a Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Thu, 3 Mar 2022 16:04:13 +0900
Subject: [PATCH v8 3/4] Add a plan_tree_walker()
Like planstate_tree_walker() but for uninitialized plan trees.
---
src/backend/nodes/nodeFuncs.c | 116 ++++++++++++++++++++++++++++++++++
src/include/nodes/nodeFuncs.h | 3 +
2 files changed, 119 insertions(+)
diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c
index 4789ba6911..51cac40a3e 100644
--- a/src/backend/nodes/nodeFuncs.c
+++ b/src/backend/nodes/nodeFuncs.c
@@ -31,6 +31,10 @@ static bool planstate_walk_subplans(List *plans, bool (*walker) (),
void *context);
static bool planstate_walk_members(PlanState **planstates, int nplans,
bool (*walker) (), void *context);
+static bool plan_walk_subplans(List *plans,
+ bool (*walker) (),
+ void *context);
+static bool plan_walk_members(List *plans, bool (*walker) (), void *context);
/*
@@ -4645,3 +4649,115 @@ planstate_walk_members(PlanState **planstates, int nplans,
return false;
}
+
+/*
+ * plan_tree_walker --- walk plantrees
+ *
+ * The walker has already visited the current node, and so we need only
+ * recurse into any sub-nodes it has.
+ */
+bool
+plan_tree_walker(Plan *plan,
+ bool (*walker) (),
+ void *context)
+{
+ /* Guard against stack overflow due to overly complex plan trees */
+ check_stack_depth();
+
+ /* initPlan-s */
+ if (plan_walk_subplans(plan->initPlan, walker, context))
+ return true;
+
+ /* lefttree */
+ if (outerPlan(plan))
+ {
+ if (walker(outerPlan(plan), context))
+ return true;
+ }
+
+ /* righttree */
+ if (innerPlan(plan))
+ {
+ if (walker(innerPlan(plan), context))
+ return true;
+ }
+
+ /* special child plans */
+ switch (nodeTag(plan))
+ {
+ case T_Append:
+ if (plan_walk_members(((Append *) plan)->appendplans,
+ walker, context))
+ return true;
+ break;
+ case T_MergeAppend:
+ if (plan_walk_members(((MergeAppend *) plan)->mergeplans,
+ walker, context))
+ return true;
+ break;
+ case T_BitmapAnd:
+ if (plan_walk_members(((BitmapAnd *) plan)->bitmapplans,
+ walker, context))
+ return true;
+ break;
+ case T_BitmapOr:
+ if (plan_walk_members(((BitmapOr *) plan)->bitmapplans,
+ walker, context))
+ return true;
+ break;
+ case T_CustomScan:
+ if (plan_walk_members(((CustomScan *) plan)->custom_plans,
+ walker, context))
+ return true;
+ break;
+ case T_SubqueryScan:
+ if (walker(((SubqueryScan *) plan)->subplan, context))
+ return true;
+ break;
+ default:
+ break;
+ }
+
+ return false;
+}
+
+/*
+ * Walk a list of SubPlans (or initPlans, which also use SubPlan nodes).
+ */
+static bool
+plan_walk_subplans(List *plans,
+ bool (*walker) (),
+ void *context)
+{
+ ListCell *lc;
+ PlannedStmt *plannedstmt = (PlannedStmt *) context;
+
+ foreach(lc, plans)
+ {
+ SubPlan *sp = lfirst_node(SubPlan, lc);
+ Plan *p = list_nth(plannedstmt->subplans, sp->plan_id - 1);
+
+ if (walker(p, context))
+ return true;
+ }
+
+ return false;
+}
+
+/*
+ * Walk the constituent plans of an Append, MergeAppend, BitmapAnd,
+ * BitmapOr, or CustomScan node.
+ */
+static bool
+plan_walk_members(List *plans, bool (*walker) (), void *context)
+{
+ ListCell *lc;
+
+ foreach(lc, plans)
+ {
+ if (walker(lfirst(lc), context))
+ return true;
+ }
+
+ return false;
+}
diff --git a/src/include/nodes/nodeFuncs.h b/src/include/nodes/nodeFuncs.h
index 93c60bde66..fca107ad65 100644
--- a/src/include/nodes/nodeFuncs.h
+++ b/src/include/nodes/nodeFuncs.h
@@ -158,5 +158,8 @@ extern bool raw_expression_tree_walker(Node *node, bool (*walker) (),
struct PlanState;
extern bool planstate_tree_walker(struct PlanState *planstate, bool (*walker) (),
void *context);
+struct Plan;
+extern bool plan_tree_walker(struct Plan *plan, bool (*walker) (),
+ void *context);
#endif /* NODEFUNCS_H */
--
2.24.1
[application/octet-stream] v8-0002-Add-Merge-Append.partitioned_rels.patch (17.4K, 5-v8-0002-Add-Merge-Append.partitioned_rels.patch)
From 8b99146c9b8c4826e1434d3f006597681c24cd45 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Thu, 24 Mar 2022 22:47:03 +0900
Subject: [PATCH v8 2/4] Add [Merge]Append.partitioned_rels
To record the RT indexes of all partitioned ancestors leading up to
leaf partitions that are appended by the node.
If a given [Merge]Append node is left out from the plan due to there
being only one element in its list of child subplans, then its
partitioned_rels set is added to PlannerGlobal.elidedAppendPartedRels
that is passed down to the executor through PlannedStmt.
There are no users for partitioned_rels and elidedAppendPartedRels
as of this commit, though a later commit will require the ability
to extract the set of relations that must be locked to make a plan
tree safe for execution by walking the plan tree itself, so having
the partitioned tables be also present in the plan tree will be
helpful. Note that currently the executor relies on the fact that
the set of relations to be locked can be obtained by simply scanning
the range table that's made available in PlannedStmt along with the
plan tree.
---
src/backend/nodes/copyfuncs.c | 3 +++
src/backend/nodes/outfuncs.c | 5 +++++
src/backend/nodes/readfuncs.c | 3 +++
src/backend/optimizer/path/joinrels.c | 9 ++++++++
src/backend/optimizer/plan/createplan.c | 18 +++++++++++++++-
src/backend/optimizer/plan/planner.c | 8 +++++++
src/backend/optimizer/plan/setrefs.c | 28 +++++++++++++++++++++++++
src/backend/optimizer/util/inherit.c | 16 ++++++++++++++
src/backend/optimizer/util/relnode.c | 20 ++++++++++++++++++
src/include/nodes/pathnodes.h | 22 +++++++++++++++++++
src/include/nodes/plannodes.h | 17 +++++++++++++++
11 files changed, 148 insertions(+), 1 deletion(-)
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 56505557bf..29c515d7db 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -106,6 +106,7 @@ _copyPlannedStmt(const PlannedStmt *from)
COPY_NODE_FIELD(invalItems);
COPY_NODE_FIELD(paramExecTypes);
COPY_NODE_FIELD(utilityStmt);
+ COPY_BITMAPSET_FIELD(elidedAppendPartedRels);
COPY_LOCATION_FIELD(stmt_location);
COPY_SCALAR_FIELD(stmt_len);
@@ -254,6 +255,7 @@ _copyAppend(const Append *from)
COPY_SCALAR_FIELD(nasyncplans);
COPY_SCALAR_FIELD(first_partial_plan);
COPY_NODE_FIELD(part_prune_info);
+ COPY_BITMAPSET_FIELD(partitioned_rels);
return newnode;
}
@@ -282,6 +284,7 @@ _copyMergeAppend(const MergeAppend *from)
COPY_POINTER_FIELD(collations, from->numCols * sizeof(Oid));
COPY_POINTER_FIELD(nullsFirst, from->numCols * sizeof(bool));
COPY_NODE_FIELD(part_prune_info);
+ COPY_BITMAPSET_FIELD(partitioned_rels);
return newnode;
}
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 6e39590730..108ede9af9 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -324,6 +324,7 @@ _outPlannedStmt(StringInfo str, const PlannedStmt *node)
WRITE_NODE_FIELD(invalItems);
WRITE_NODE_FIELD(paramExecTypes);
WRITE_NODE_FIELD(utilityStmt);
+ WRITE_BITMAPSET_FIELD(elidedAppendPartedRels);
WRITE_LOCATION_FIELD(stmt_location);
WRITE_INT_FIELD(stmt_len);
}
@@ -444,6 +445,7 @@ _outAppend(StringInfo str, const Append *node)
WRITE_INT_FIELD(nasyncplans);
WRITE_INT_FIELD(first_partial_plan);
WRITE_NODE_FIELD(part_prune_info);
+ WRITE_BITMAPSET_FIELD(partitioned_rels);
}
static void
@@ -461,6 +463,7 @@ _outMergeAppend(StringInfo str, const MergeAppend *node)
WRITE_OID_ARRAY(collations, node->numCols);
WRITE_BOOL_ARRAY(nullsFirst, node->numCols);
WRITE_NODE_FIELD(part_prune_info);
+ WRITE_BITMAPSET_FIELD(partitioned_rels);
}
static void
@@ -2404,6 +2407,7 @@ _outPlannerGlobal(StringInfo str, const PlannerGlobal *node)
WRITE_BOOL_FIELD(parallelModeOK);
WRITE_BOOL_FIELD(parallelModeNeeded);
WRITE_CHAR_FIELD(maxParallelHazard);
+ WRITE_BITMAPSET_FIELD(elidedAppendPartedRels);
}
static void
@@ -2515,6 +2519,7 @@ _outRelOptInfo(StringInfo str, const RelOptInfo *node)
WRITE_BOOL_FIELD(partbounds_merged);
WRITE_BITMAPSET_FIELD(live_parts);
WRITE_BITMAPSET_FIELD(all_partrels);
+ WRITE_BITMAPSET_FIELD(partitioned_rels);
}
static void
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index c94b2561f0..ce146dd45e 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -1794,6 +1794,7 @@ _readPlannedStmt(void)
READ_NODE_FIELD(invalItems);
READ_NODE_FIELD(paramExecTypes);
READ_NODE_FIELD(utilityStmt);
+ READ_BITMAPSET_FIELD(elidedAppendPartedRels);
READ_LOCATION_FIELD(stmt_location);
READ_INT_FIELD(stmt_len);
@@ -1917,6 +1918,7 @@ _readAppend(void)
READ_INT_FIELD(nasyncplans);
READ_INT_FIELD(first_partial_plan);
READ_NODE_FIELD(part_prune_info);
+ READ_BITMAPSET_FIELD(partitioned_rels);
READ_DONE();
}
@@ -1939,6 +1941,7 @@ _readMergeAppend(void)
READ_OID_ARRAY(collations, local_node->numCols);
READ_BOOL_ARRAY(nullsFirst, local_node->numCols);
READ_NODE_FIELD(part_prune_info);
+ READ_BITMAPSET_FIELD(partitioned_rels);
READ_DONE();
}
diff --git a/src/backend/optimizer/path/joinrels.c b/src/backend/optimizer/path/joinrels.c
index 9da3ff2f9a..e74d40fee3 100644
--- a/src/backend/optimizer/path/joinrels.c
+++ b/src/backend/optimizer/path/joinrels.c
@@ -1549,6 +1549,15 @@ try_partitionwise_join(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
populate_joinrel_with_paths(root, child_rel1, child_rel2,
child_joinrel, child_sjinfo,
child_restrictlist);
+
+ /*
+ * A parent relation's partitioned_rels must be a superset of the sets
+ * of all its children, direct or indirect, so bubble up the child
+ * joinrel's set.
+ */
+ joinrel->partitioned_rels =
+ bms_add_members(joinrel->partitioned_rels,
+ child_joinrel->partitioned_rels);
}
}
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index 179c87c671..99868a1a79 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -26,10 +26,12 @@
#include "nodes/extensible.h"
#include "nodes/makefuncs.h"
#include "nodes/nodeFuncs.h"
+#include "optimizer/appendinfo.h"
#include "optimizer/clauses.h"
#include "optimizer/cost.h"
#include "optimizer/optimizer.h"
#include "optimizer/paramassign.h"
+#include "optimizer/pathnode.h"
#include "optimizer/paths.h"
#include "optimizer/placeholder.h"
#include "optimizer/plancat.h"
@@ -1332,11 +1334,11 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
best_path->subpaths,
prunequal);
}
-
plan->appendplans = subplans;
plan->nasyncplans = nasyncplans;
plan->first_partial_plan = best_path->first_partial_path;
plan->part_prune_info = partpruneinfo;
+ plan->partitioned_rels = bms_copy(rel->partitioned_rels);
copy_generic_path_info(&plan->plan, (Path *) best_path);
@@ -1500,6 +1502,20 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
node->mergeplans = subplans;
node->part_prune_info = partpruneinfo;
+ /*
+ * We need to explicitly add to the plan node the RT indexes of any
+ * partitioned tables whose partitions will be scanned by the nodes in
+ * 'subplans'. There can be multiple RT indexes in the set due to the
+ * partition tree being multi-level and/or this being a plan for UNION ALL
+ * over multiple partition trees. Along with scanrelids of leaf-level Scan
+ * nodes, this allows the executor to lock the full set of relations being
+ * scanned by this node.
+ *
+ * Note that 'apprelids' only contains the top-level base relation(s), so
+ * is not sufficient for the purpose.
+ */
+ node->partitioned_rels = bms_copy(rel->partitioned_rels);
+
/*
* If prepare_sort_from_pathkeys added sort columns, but we were told to
* produce either the exact tlist or a narrow tlist, we should get rid of
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index b2569c5d0c..c769b4b4b9 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -529,6 +529,7 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
result->paramExecTypes = glob->paramExecTypes;
/* utilityStmt should be null, but we might as well copy it */
result->utilityStmt = parse->utilityStmt;
+ result->elidedAppendPartedRels = glob->elidedAppendPartedRels;
result->stmt_location = parse->stmt_location;
result->stmt_len = parse->stmt_len;
@@ -7534,6 +7535,13 @@ create_partitionwise_grouping_paths(PlannerInfo *root,
add_paths_to_append_rel(root, grouped_rel, grouped_live_children);
}
+
+ /*
+ * Input rel might be a partitioned appendrel, though grouped_rel has at
+ * this point taken over its role as the appendrel owning the former's
+ * children, so copy the former's partitioned_rels set into the latter.
+ */
+ grouped_rel->partitioned_rels = bms_copy(input_rel->partitioned_rels);
}
/*
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index bf4c722c02..8214edec54 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -1574,6 +1574,10 @@ set_append_references(PlannerInfo *root,
lfirst(l) = set_plan_refs(root, (Plan *) lfirst(l), rtoffset);
}
+ /* Fix up partitioned_rels before possibly removing the Append below. */
+ aplan->partitioned_rels = offset_relid_set(aplan->partitioned_rels,
+ rtoffset);
+
/*
* See if it's safe to get rid of the Append entirely. For this to be
* safe, there must be only one child plan and that child plan's parallel
@@ -1584,8 +1588,17 @@ set_append_references(PlannerInfo *root,
*/
if (list_length(aplan->appendplans) == 1 &&
((Plan *) linitial(aplan->appendplans))->parallel_aware == aplan->plan.parallel_aware)
+ {
+ /*
+ * Partitioned tables involved, if any, must be made known to the
+ * executor.
+ */
+ root->glob->elidedAppendPartedRels =
+ bms_add_members(root->glob->elidedAppendPartedRels,
+ aplan->partitioned_rels);
return clean_up_removed_plan_level((Plan *) aplan,
(Plan *) linitial(aplan->appendplans));
+ }
/*
* Otherwise, clean up the Append as needed. It's okay to do this after
@@ -1646,6 +1659,12 @@ set_mergeappend_references(PlannerInfo *root,
lfirst(l) = set_plan_refs(root, (Plan *) lfirst(l), rtoffset);
}
+ /*
+ * Fix up partitioned_rels before possibly removing the MergeAppend below.
+ */
+ mplan->partitioned_rels = offset_relid_set(mplan->partitioned_rels,
+ rtoffset);
+
/*
* See if it's safe to get rid of the MergeAppend entirely. For this to
* be safe, there must be only one child plan and that child plan's
@@ -1656,8 +1675,17 @@ set_mergeappend_references(PlannerInfo *root,
*/
if (list_length(mplan->mergeplans) == 1 &&
((Plan *) linitial(mplan->mergeplans))->parallel_aware == mplan->plan.parallel_aware)
+ {
+ /*
+ * Partitioned tables involved, if any, must be made known to the
+ * executor.
+ */
+ root->glob->elidedAppendPartedRels =
+ bms_add_members(root->glob->elidedAppendPartedRels,
+ mplan->partitioned_rels);
return clean_up_removed_plan_level((Plan *) mplan,
(Plan *) linitial(mplan->mergeplans));
+ }
/*
* Otherwise, clean up the MergeAppend as needed. It's okay to do this
diff --git a/src/backend/optimizer/util/inherit.c b/src/backend/optimizer/util/inherit.c
index 7e134822f3..56912e4101 100644
--- a/src/backend/optimizer/util/inherit.c
+++ b/src/backend/optimizer/util/inherit.c
@@ -406,6 +406,14 @@ expand_partitioned_rtentry(PlannerInfo *root, RelOptInfo *relinfo,
childrte, childRTindex,
childrel, top_parentrc, lockmode);
+ /*
+ * A parent relation's partitioned_rels must be a superset of the sets
+ * of all its children, direct or indirect, so bubble up the child
+ * rel's set.
+ */
+ relinfo->partitioned_rels = bms_add_members(relinfo->partitioned_rels,
+ childrelinfo->partitioned_rels);
+
/* Close child relation, but keep locks */
table_close(childrel, NoLock);
}
@@ -737,6 +745,14 @@ expand_appendrel_subquery(PlannerInfo *root, RelOptInfo *rel,
/* Child may itself be an inherited rel, either table or subquery. */
if (childrte->inh)
expand_inherited_rtentry(root, childrel, childrte, childRTindex);
+
+ /*
+ * A parent relation's partitioned_rels must be a superset of the sets
+ * of all its children, direct or indirect, so bubble up the child
+ * rel's set.
+ */
+ rel->partitioned_rels = bms_add_members(rel->partitioned_rels,
+ childrel->partitioned_rels);
}
}
diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c
index 520409f4ba..1d082a8fdd 100644
--- a/src/backend/optimizer/util/relnode.c
+++ b/src/backend/optimizer/util/relnode.c
@@ -361,6 +361,10 @@ build_simple_rel(PlannerInfo *root, int relid, RelOptInfo *parent)
}
}
+ /* A partitioned appendrel. */
+ if (rel->part_scheme != NULL)
+ rel->partitioned_rels = bms_copy(rel->relids);
+
/* Save the finished struct in the query's simple_rel_array */
root->simple_rel_array[relid] = rel;
@@ -729,6 +733,14 @@ build_join_rel(PlannerInfo *root,
set_joinrel_size_estimates(root, joinrel, outer_rel, inner_rel,
sjinfo, restrictlist);
+ /*
+ * The joinrel may get processed as an appendrel via partitionwise join
+ * if both outer and inner rels are partitioned, so set partitioned_rels
+ * appropriately.
+ */
+ joinrel->partitioned_rels = bms_union(outer_rel->partitioned_rels,
+ inner_rel->partitioned_rels);
+
/*
* Set the consider_parallel flag if this joinrel could potentially be
* scanned within a parallel worker. If this flag is false for either
@@ -897,6 +909,14 @@ build_child_join_rel(PlannerInfo *root, RelOptInfo *outer_rel,
set_joinrel_size_estimates(root, joinrel, outer_rel, inner_rel,
sjinfo, restrictlist);
+ /*
+ * The joinrel may get processed as an appendrel via partitionwise join
+ * if both outer and inner rels are partitioned, so set partitioned_rels
+ * appropriately.
+ */
+ joinrel->partitioned_rels = bms_union(outer_rel->partitioned_rels,
+ inner_rel->partitioned_rels);
+
/* We build the join only once. */
Assert(!find_join_rel(root, joinrel->relids));
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 6cbcb67bdf..ef9b54739a 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -130,6 +130,11 @@ typedef struct PlannerGlobal
char maxParallelHazard; /* worst PROPARALLEL hazard level */
PartitionDirectory partition_directory; /* partition descriptors */
+
+ Bitmapset *elidedAppendPartedRels; /* Combined partitioned_rels of all
+ * single-subplan [Merge]Append nodes
+ * that have been removed from the
+ * various plan trees. */
} PlannerGlobal;
/* macro for fetching the Plan associated with a SubPlan node */
@@ -773,6 +778,23 @@ typedef struct RelOptInfo
Relids all_partrels; /* Relids set of all partition relids */
List **partexprs; /* Non-nullable partition key expressions */
List **nullable_partexprs; /* Nullable partition key expressions */
+
+ /*
+ * For an appendrel parent relation (base, join, or upper) that is
+ * partitioned, this stores the RT indexes of all the partitioned ancestors
+ * including itself that lead up to the individual leaf partitions that
+ * will be scanned to produce this relation's output rows. The relid set
+ * is copied into the resulting Append or MergeAppend plan node for
+ * allowing the executor to take appropriate locks on those relations,
+ * unless the node is deemed useless in setrefs.c due to having a single
+ * leaf subplan and thus elided from the final plan, in which case, the set
+ * is added into PlannerGlobal.elidedAppendPartedRels.
+ *
+ * Note that 'apprelids' of those nodes only contains the top-level base
+ * relation(s), so is not sufficient for said purpose.
+ */
+
+ Bitmapset *partitioned_rels;
} RelOptInfo;
/*
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 50ef3dda05..a823c7c20d 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -86,6 +86,11 @@ typedef struct PlannedStmt
Node *utilityStmt; /* non-null if this is utility stmt */
+ Bitmapset *elidedAppendPartedRels; /* Combined partitioned_rels of all
+ * single-subplan [Merge]Append nodes
+ * that have been removed from the
+ * various plan trees. */
+
/* statement location in source string (copied from Query) */
int stmt_location; /* start location, or -1 if unknown */
int stmt_len; /* length in bytes; 0 means "rest of string" */
@@ -264,6 +269,12 @@ typedef struct Append
/* Info for run-time subplan pruning; NULL if we're not doing that */
struct PartitionPruneInfo *part_prune_info;
+
+ /*
+ * RT indexes of all partitioned parents whose partitions' plans are
+ * present in appendplans.
+ */
+ Bitmapset *partitioned_rels;
} Append;
/* ----------------
@@ -284,6 +295,12 @@ typedef struct MergeAppend
bool *nullsFirst; /* NULLS FIRST/LAST directions */
/* Info for run-time subplan pruning; NULL if we're not doing that */
struct PartitionPruneInfo *part_prune_info;
+
+ /*
+ * RT indexes of all partitioned parents whose partitions' plans are
+ * present in mergeplans.
+ */
+ Bitmapset *partitioned_rels;
} MergeAppend;
/* ----------------
--
2.24.1
* Re: generic plans and "initial" pruning
@ 2022-03-31 09:56 Alvaro Herrera <[email protected]>
parent: Amit Langote <[email protected]>
1 sibling, 1 reply; 71+ messages in thread
From: Alvaro Herrera @ 2022-03-31 09:56 UTC (permalink / raw)
To: Amit Langote <[email protected]>; +Cc: Robert Haas <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers; David Rowley <[email protected]>
I'm looking at 0001 here with intention to commit later. I see that
there is some resistance to 0004, but I think a final verdict on that
one doesn't materially affect 0001.
--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
"El destino baraja y nosotros jugamos" (A. Schopenhauer)
* Re: generic plans and "initial" pruning
@ 2022-03-31 11:11 Amit Langote <[email protected]>
parent: Alvaro Herrera <[email protected]>
0 siblings, 0 replies; 71+ messages in thread
From: Amit Langote @ 2022-03-31 11:11 UTC (permalink / raw)
To: Alvaro Herrera <[email protected]>; +Cc: Robert Haas <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers; David Rowley <[email protected]>
On Thu, Mar 31, 2022 at 6:55 PM Alvaro Herrera <[email protected]> wrote:
> I'm looking at 0001 here with intention to commit later. I see that
> there is some resistance to 0004, but I think a final verdict on that
> one doesn't materially affect 0001.
Thanks.
While the main goal of the refactoring patch is to make it easier to
review the more complex changes that 0004 makes to execPartition.c, I
agree it has merit on its own. Although, one may say that the bit
about providing a PlanState-independent ExprContext is more closely
tied with 0004's requirements...
--
Amit Langote
EDB: http://www.enterprisedb.com
* Re: generic plans and "initial" pruning
@ 2022-04-01 01:31 David Rowley <[email protected]>
parent: Amit Langote <[email protected]>
1 sibling, 1 reply; 71+ messages in thread
From: David Rowley @ 2022-04-01 01:31 UTC (permalink / raw)
To: Amit Langote <[email protected]>; +Cc: Robert Haas <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers
On Thu, 31 Mar 2022 at 16:25, Amit Langote <[email protected]> wrote:
> Rebased.
I've been looking over the v8 patch and I'd like to propose semi-baked
ideas to improve things. I'd need to go and write them myself to
fully know if they'd actually work ok.
1. You've changed the signature of various functions by adding
ExecLockRelsInfo *execlockrelsinfo. I'm wondering why you didn't just
put the ExecLockRelsInfo as a new field in PlannedStmt?
I think the above gets around messing with the signatures of
CreateQueryDesc(), ExplainOnePlan(), pg_plan_queries(),
PortalDefineQuery(), and ProcessQuery(). It would get rid of your change of
foreach to forboth in execute_sql_string() / PortalRunMulti() and get
rid of a number of places where you're carrying around a variable named
execlockrelsinfo_list. It would also make the patch significantly
easier to review as you'd be touching far fewer files.
2. I don't really like the way you've gone about most of the patch...
The way I imagine this working is that during create_plan() we visit
all nodes that have run-time pruning, then inside create_append_plan()
and create_merge_append_plan() we'd tag those onto a new field in
PlannerGlobal. That way you can store the PartitionPruneInfos in the
new PlannedStmt field in standard_planner() after the
makeNode(PlannedStmt).
Instead of storing the PartitionPruneInfo in the Append / MergeAppend
struct, you'd just add a new index field to those structs. The index
would start with 0 for the 0th PartitionPruneInfo. You'd basically
just know the index by assigning
list_length(root->glob->partitionpruneinfos).
You'd then assign the root->glob->partitionpruneinfos to
PlannedStmt.partitionpruneinfos and anytime you needed to do run-time
pruning during execution, you'd need to use the Append / MergeAppend's
partition_prune_info_idx to lookup the PartitionPruneInfo in some new
field you add to EState to store those. You'd leave that index as -1
if there's no PartitionPruneInfo for the Append / MergeAppend node.
When you do AcquireExecutorLocks(), you'd iterate over the
PlannedStmt's PartitionPruneInfo to figure out which subplans to
prune. You'd then have an array sized
list_length(plannedstmt->runtimepruneinfos) where you'd store the
result. When the Append/MergeAppend node starts up you just check if
the part_prune_info_idx >= 0 and if there's a non-NULL result stored
then use that result. That's how you'd ensure you always got the same
run-time prune result between locking and plan startup.
3. Also, looking at ExecGetLockRels(), shouldn't it be the planner's
job to determine the minimum set of relations which must be locked? I
think the plan tree traversal during execution is not great. It seems the
whole point of this patch is to reduce overhead during execution. A
full additional plan traversal aside from the 3 that we already do for
start/run/end of execution seems not great.
I think this means that during AcquireExecutorLocks() you'd start with
the minimum set of RTEs that need to be locked as determined during
create_plan() and stored in some Bitmapset field in PlannedStmt. This
minimal set would exclude only those RTIs that might be used solely
because of a PartitionPruneInfo with initial pruning steps, i.e. it would
include RTIs from PartitionPruneInfos with no init pruning steps (you
can't skip any locks for those). All you need to do to determine the
RTEs to lock is to take the minimal set and execute each
PartitionPruneInfo in the PlannedStmt that has init steps.
4. It's a bit disappointing to see RelOptInfo.partitioned_rels getting
revived here. Why don't you just add a partitioned_relids to
PartitionPruneInfo and just have make_partitionedrel_pruneinfo build
you a Relids of them? PartitionedRelPruneInfo already has an rtindex
field, so you just need to bms_add_member whatever that rtindex is.
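To make point 2 above a bit more concrete, here is a rough, self-contained
sketch of the index-based lookup. It is not code from the patch: every
type and function name below (PruneResult, AppendSketch, EStateSketch,
lookup_prune_result, subplans_to_init) is hypothetical, standing in for
Append/MergeAppend, EState, and whatever structure would end up holding a
pruning result.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these names exist in PostgreSQL. */
#define NO_PRUNE_INFO (-1)

typedef struct PruneResult
{
	int			nsurviving;		/* subplans that survived initial pruning */
	const int  *surviving;		/* their indexes in the node's subplan list */
} PruneResult;

typedef struct AppendSketch
{
	int			nsubplans;		/* total subplans in the plan node */
	int			part_prune_info_idx;	/* index into the per-statement list,
										 * or NO_PRUNE_INFO */
} AppendSketch;

typedef struct EStateSketch
{
	int			nresults;		/* one slot per PartitionPruneInfo */
	const PruneResult **prune_results;	/* NULL slot = nothing cached */
} EStateSketch;

/*
 * Return the pruning result cached at lock-acquisition time, or NULL if
 * the node has no PartitionPruneInfo or no result was stored for it.
 */
const PruneResult *
lookup_prune_result(const EStateSketch *estate, const AppendSketch *node)
{
	if (node->part_prune_info_idx == NO_PRUNE_INFO)
		return NULL;
	assert(node->part_prune_info_idx >= 0 &&
		   node->part_prune_info_idx < estate->nresults);
	return estate->prune_results[node->part_prune_info_idx];
}

/*
 * Number of subplans the node would initialize: the cached survivors if
 * a result exists, otherwise all of them.
 */
int
subplans_to_init(const EStateSketch *estate, const AppendSketch *node)
{
	const PruneResult *res = lookup_prune_result(estate, node);

	return res ? res->nsurviving : node->nsubplans;
}
```

Because the node only ever reads the cached slot, locking and plan startup
would see the same pruning outcome, which is the property point 2 is after.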
It's a fairly high-level review at this stage. I can look in more
detail if the above points get looked at. You may find or know of
some reason why it can't be done like I mention above.
David
* Re: generic plans and "initial" pruning
@ 2022-04-01 03:09 Amit Langote <[email protected]>
parent: David Rowley <[email protected]>
0 siblings, 2 replies; 71+ messages in thread
From: Amit Langote @ 2022-04-01 03:09 UTC (permalink / raw)
To: David Rowley <[email protected]>; +Cc: Robert Haas <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers
Thanks a lot for looking into this.
On Fri, Apr 1, 2022 at 10:32 AM David Rowley <[email protected]> wrote:
> I've been looking over the v8 patch and I'd like to propose semi-baked
> ideas to improve things. I'd need to go and write them myself to
> fully know if they'd actually work ok.
>
> 1. You've changed the signature of various functions by adding
> ExecLockRelsInfo *execlockrelsinfo. I'm wondering why you didn't just
> put the ExecLockRelsInfo as a new field in PlannedStmt?
>
> I think the above gets around messing with the signatures of
> CreateQueryDesc(), ExplainOnePlan(), pg_plan_queries(),
> PortalDefineQuery(), and ProcessQuery(). It would get rid of your change of
> foreach to forboth in execute_sql_string() / PortalRunMulti() and get
> rid of a number of places where you're carrying around a variable named
> execlockrelsinfo_list. It would also make the patch significantly
> easier to review as you'd be touching far fewer files.
I'm worried about that churn myself and did consider this idea, though
I couldn't shake the feeling that it's maybe wrong to put something in
PlannedStmt that the planner itself doesn't produce. I mean the
definition of PlannedStmt says this:
/* ----------------
* PlannedStmt node
*
* The output of the planner
With the ideas that you've outlined below, perhaps we can frame most
of the things that the patch wants to do as the planner and the
plancache changes. If we twist the above definition a bit to say what
the plancache does in this regard is part of planning, maybe it makes
sense to add the initial pruning related fields (nodes, outputs) into
PlannedStmt.
> 2. I don't really like the way you've gone about most of the patch...
>
> The way I imagine this working is that during create_plan() we visit
> all nodes that have run-time pruning, then inside create_append_plan()
> and create_merge_append_plan() we'd tag those onto a new field in
> PlannerGlobal. That way you can store the PartitionPruneInfos in the
> new PlannedStmt field in standard_planner() after the
> makeNode(PlannedStmt).
>
> Instead of storing the PartitionPruneInfo in the Append / MergeAppend
> struct, you'd just add a new index field to those structs. The index
> would start with 0 for the 0th PartitionPruneInfo. You'd basically
> just know the index by assigning
> list_length(root->glob->partitionpruneinfos).
>
> You'd then assign the root->glob->partitionpruneinfos to
> PlannedStmt.partitionpruneinfos and anytime you needed to do run-time
> pruning during execution, you'd need to use the Append / MergeAppend's
> partition_prune_info_idx to lookup the PartitionPruneInfo in some new
> field you add to EState to store those. You'd leave that index as -1
> if there's no PartitionPruneInfo for the Append / MergeAppend node.
>
> When you do AcquireExecutorLocks(), you'd iterate over the
> PlannedStmt's PartitionPruneInfo to figure out which subplans to
> prune. You'd then have an array sized
> list_length(plannedstmt->runtimepruneinfos) where you'd store the
> result. When the Append/MergeAppend node starts up you just check if
> the part_prune_info_idx >= 0 and if there's a non-NULL result stored
> then use that result. That's how you'd ensure you always got the same
> run-time prune result between locking and plan startup.
Actually, Robert too suggested such an idea to me off-list and I think
it's worth trying. I was not sure about the implementation, because
then we'd be passing around lists of initial pruning nodes/results
across many function/module boundaries that you mentioned in your
comment 1, but if we agree that PlannedStmt is an acceptable place for
those things to be stored, then I agree it's an attractive idea.
> 3. Also, looking at ExecGetLockRels(), shouldn't it be the planner's
> job to determine the minimum set of relations which must be locked? I
> think the plan tree traversal during execution is not great. It seems the
> whole point of this patch is to reduce overhead during execution. A
> full additional plan traversal aside from the 3 that we already do for
> start/run/end of execution seems not great.
>
> I think this means that during AcquireExecutorLocks() you'd start with
> the minimum set of RTEs that need to be locked as determined during
> create_plan() and stored in some Bitmapset field in PlannedStmt.
The patch did have a PlannedStmt.lockrels till v6, though it wasn't
the same thing as what you are describing...
> This
> minimal set would exclude only those RTIs that might be used solely
> because of a PartitionPruneInfo with initial pruning steps, i.e. it would
> include RTIs from PartitionPruneInfos with no init pruning steps (you
> can't skip any locks for those). All you need to do to determine the
> RTEs to lock is to take the minimal set and execute each
> PartitionPruneInfo in the PlannedStmt that has init steps.
So just thinking about an Append/MergeAppend, the minimum set must
include the RT indexes of all the partitioned tables whose direct and
indirect children's plans will be in 'subplans' and also of the
children if the PartitionPruneInfo doesn't contain initial steps or if
there is no PartitionPruneInfo to begin with.
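As a rough illustration of that computation (a sketch under stated
assumptions, not the patch's code): start from the planner-provided minimum
set and add only the leaf partitions that survive initial pruning. A 64-bit
mask stands in for a Bitmapset here, and RelidSet, PruneInfoSketch,
survive_matching, and compute_lock_set are all hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Toy replacement for Bitmapset: bit i set means RT index i is a member. */
typedef uint64_t RelidSet;

/* Hypothetical, simplified view of a PartitionPruneInfo with init steps. */
typedef struct PruneInfoSketch
{
	int			nparts;			/* number of leaf partition subplans */
	const int  *part_rtis;		/* RT index of each leaf partition */
	/* returns nonzero if partition part_index survives initial pruning */
	int			(*survives) (int part_index, const void *params);
} PruneInfoSketch;

RelidSet
relid_add(RelidSet set, int rti)
{
	assert(rti > 0 && rti < 64);
	return set | ((RelidSet) 1 << rti);
}

int
relid_is_member(RelidSet set, int rti)
{
	assert(rti > 0 && rti < 64);
	return (set & ((RelidSet) 1 << rti)) != 0;
}

/* Example predicate: only the partition whose index equals *params survives. */
int
survive_matching(int part_index, const void *params)
{
	return part_index == *(const int *) params;
}

/*
 * Relations to lock = minimum set (partitioned parents plus anything not
 * subject to initial pruning) + leaf partitions that survive pruning.
 */
RelidSet
compute_lock_set(RelidSet minimum, const PruneInfoSketch *infos, int ninfos,
				 const void *params)
{
	RelidSet	result = minimum;

	for (int i = 0; i < ninfos; i++)
	{
		for (int p = 0; p < infos[i].nparts; p++)
		{
			if (infos[i].survives(p, params))
				result = relid_add(result, infos[i].part_rtis[p]);
		}
	}
	return result;
}
```

In this sketch the cost of building the minimum set is paid once at plan
time, which is exactly the overhead the question below is about.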
One question is whether the planner should always pay the overhead of
initializing this bitmapset? I mean it's only worthwhile if
AcquireExecutorLocks() is going to be involved, that is, the plan will
be cached and reused.
> 4. It's a bit disappointing to see RelOptInfo.partitioned_rels getting
> revived here. Why don't you just add a partitioned_relids to
> PartitionPruneInfo and just have make_partitionedrel_pruneinfo build
> you a Relids of them? PartitionedRelPruneInfo already has an rtindex
> field, so you just need to bms_add_member whatever that rtindex is.
Hmm, not all Append/MergeAppend nodes in the plan tree may have
make_partition_pruneinfo() called on them though.
Without the proposed RelOptInfo.partitioned_rels, which is populated in
the early planning stages, the only reliable way to get all the
partitioned tables involved in Appends/MergeAppends at the create_plan()
stage seems to be to make a function out of the stanza at the top of
make_partition_pruneinfo() that collects them by scanning the leaf
paths and tracing each path's relation's parents up to the root
partitioned parent, and to call it from create_{merge_}append_plan() if
make_partition_pruneinfo() was not called. I did try to implement that
and found it a bit complex and expensive (the part that scans the leaf
paths).
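For illustration only, the "trace each leaf's parents up to the root" step
could look roughly like the following. This is a simplified sketch, not the
stanza from make_partition_pruneinfo(): it assumes a hypothetical parent_of
array mapping each RT index to its parent's RT index (0 for none), and
RelidSet and collect_partitioned_ancestors are made-up names.

```c
#include <assert.h>
#include <stdint.h>

/* Toy replacement for Bitmapset: bit i set means RT index i is a member. */
typedef uint64_t RelidSet;

/*
 * Collect the RT indexes of all partitioned ancestors of the given leaf
 * relations.  parent_of[rti] is the parent's RT index, or 0 if rti has no
 * parent; entry 0 is unused because RT indexes start at 1.
 */
RelidSet
collect_partitioned_ancestors(const int *parent_of, int nrels,
							  const int *leaf_rtis, int nleaves)
{
	RelidSet	result = 0;

	for (int i = 0; i < nleaves; i++)
	{
		int			rti;

		assert(leaf_rtis[i] > 0 && leaf_rtis[i] < nrels);
		rti = parent_of[leaf_rtis[i]];

		while (rti != 0)
		{
			RelidSet	bit;

			assert(rti > 0 && rti < nrels && rti < 64);
			bit = (RelidSet) 1 << rti;

			/*
			 * If this ancestor is already recorded, so is everything above
			 * it, because each earlier walk continued up to the root.
			 */
			if (result & bit)
				break;
			result |= bit;
			rti = parent_of[rti];
		}
	}
	return result;
}
```

Even with the early exit, every leaf path has to be visited once, which is
the expensive part mentioned above.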
> It's a fairly high-level review at this stage. I can look in more
> detail if the above points get looked at. You may find or know of
> some reason why it can't be done like I mention above.
I'll try to write a version with the above points addressed, while
keeping RelOptInfo.partitioned_rels around for now.
--
Amit Langote
EDB: http://www.enterprisedb.com
[1] https://www.postgresql.org/message-id/CA%2BHiwqH9-fAvpG-w9qYCcDWzK3vGPCMyw4f9nHzqkxXVuD1pxw%40mail.g...
* Re: generic plans and "initial" pruning
@ 2022-04-01 03:45 Tom Lane <[email protected]>
parent: Amit Langote <[email protected]>
1 sibling, 1 reply; 71+ messages in thread
From: Tom Lane @ 2022-04-01 03:45 UTC (permalink / raw)
To: Amit Langote <[email protected]>; +Cc: David Rowley <[email protected]>; Robert Haas <[email protected]>; pgsql-hackers
Amit Langote <[email protected]> writes:
> On Fri, Apr 1, 2022 at 10:32 AM David Rowley <[email protected]> wrote:
>> 1. You've changed the signature of various functions by adding
>> ExecLockRelsInfo *execlockrelsinfo. I'm wondering why you didn't just
>> put the ExecLockRelsInfo as a new field in PlannedStmt?
> I'm worried about that churn myself and did consider this idea, though
> I couldn't shake the feeling that it's maybe wrong to put something in
> PlannedStmt that the planner itself doesn't produce.
PlannedStmt is part of the plan tree, which MUST be read-only to
the executor. This is not negotiable. However, there's other
places that this data could be put, such as QueryDesc.
Or for that matter, couldn't the data structure be created by
the planner? (It looks like David is proposing exactly that
further down.)
regards, tom lane
* Re: generic plans and "initial" pruning
@ 2022-04-01 04:08 David Rowley <[email protected]>
parent: Amit Langote <[email protected]>
1 sibling, 1 reply; 71+ messages in thread
From: David Rowley @ 2022-04-01 04:08 UTC (permalink / raw)
To: Amit Langote <[email protected]>; +Cc: Robert Haas <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers
On Fri, 1 Apr 2022 at 16:09, Amit Langote <[email protected]> wrote:
> definition of PlannedStmt says this:
>
> /* ----------------
> * PlannedStmt node
> *
> * The output of the planner
>
> With the ideas that you've outlined below, perhaps we can frame most
> of the things that the patch wants to do as the planner and the
> plancache changes. If we twist the above definition a bit to say what
> the plancache does in this regard is part of planning, maybe it makes
> sense to add the initial pruning related fields (nodes, outputs) into
> PlannedStmt.
How about the PartitionPruneInfos go into PlannedStmt as a List
indexed in the way I mentioned and the cache of the results of pruning
in EState?
I think that leaves you adding List *partpruneinfos, Bitmapset
*minimumlockrtis to PlannedStmt and the thing you have to cache the
pruning results into EState. I'm not very clear on where you should
stash the results of run-time pruning in the meantime before you can
put them in EState. You might need to invent some intermediate struct
that gets passed around that you can scribble down some details you're
going to need during execution.
> One question is whether the planner should always pay the overhead of
> initializing this bitmapset? I mean it's only worthwhile if
> AcquireExecutorLocks() is going to be involved, that is, the plan will
> be cached and reused.
Maybe the Bitmapset for the minimal locks needs to be built with
bms_add_range(NULL, 0, list_length(rtable)); then do
bms_del_members() on the relevant RTIs you find in the listed
PartitionPruneInfos. That way it's very simple and cheap to do when
there are no PartitionPruneInfos.
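To make the idea above concrete, here is a minimal standalone sketch. It deliberately does not use PostgreSQL's Bitmapset API: a uint64_t mask stands in for the bitmapset, and the names RtiMask, mask_add_range, mask_del_members and build_min_lock_rtis are hypothetical helpers invented for illustration. It also assumes RT indexes are 1-based and fewer than 64, which is only a simplification for the sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a Bitmapset: bit i set means RT index i is a member. */
typedef uint64_t RtiMask;

/* Add members lower..upper (inclusive) to the set. */
static RtiMask
mask_add_range(RtiMask m, int lower, int upper)
{
    int i;

    assert(lower >= 0 && upper < 64);
    for (i = lower; i <= upper; i++)
        m |= UINT64_C(1) << i;
    return m;
}

/* Remove every member of 'del' from 'm'. */
static RtiMask
mask_del_members(RtiMask m, RtiMask del)
{
    return m & ~del;
}

/*
 * Start with every RT index in the range table, then delete the RT indexes
 * that initial pruning might remove.  'prunable' holds one mask per
 * (hypothetical) pruning-info entry.  With no entries, the loop does not
 * run and the cost is just the initial range fill.
 */
static RtiMask
build_min_lock_rtis(int rtable_len, const RtiMask *prunable, int nprunable)
{
    RtiMask result = mask_add_range(0, 1, rtable_len);
    int i;

    for (i = 0; i < nprunable; i++)
        result = mask_del_members(result, prunable[i]);
    return result;
}
```

For example, with a five-entry range table where RT indexes 3 and 4 belong to prunable leaf partitions, the resulting set would hold RT indexes 1, 2 and 5.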
> > 4. It's a bit disappointing to see RelOptInfo.partitioned_rels getting
> > revived here. Why don't you just add a partitioned_relids to
> > PartitionPruneInfo and just have make_partitionedrel_pruneinfo build
> > you a Relids of them. PartitionedRelPruneInfo already has an rtindex
> > field, so you just need to bms_add_member whatever that rtindex is.
>
> Hmm, not all Append/MergeAppend nodes in the plan tree may have
> make_partition_pruneinfo() called on them though.
For Append/MergeAppends without run-time pruning you'll want to add
the RTIs to the minimal locking set of RTIs to go into PlannedStmt.
The only things you want to leave out of that are RTIs for the RTEs
that you might run-time prune away during AcquireExecutorLocks().
David
* Re: generic plans and "initial" pruning
@ 2022-04-01 06:58 Amit Langote <[email protected]>
parent: David Rowley <[email protected]>
0 siblings, 1 reply; 71+ messages in thread
From: Amit Langote @ 2022-04-01 06:58 UTC (permalink / raw)
To: David Rowley <[email protected]>; +Cc: Robert Haas <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers
On Fri, Apr 1, 2022 at 1:08 PM David Rowley <[email protected]> wrote:
> On Fri, 1 Apr 2022 at 16:09, Amit Langote <[email protected]> wrote:
> > definition of PlannedStmt says this:
> >
> > /* ----------------
> > * PlannedStmt node
> > *
> > * The output of the planner
> >
> > With the ideas that you've outlined below, perhaps we can frame most
> > of the things that the patch wants to do as the planner and the
> > plancache changes. If we twist the above definition a bit to say what
> > the plancache does in this regard is part of planning, maybe it makes
> > sense to add the initial pruning related fields (nodes, outputs) into
> > PlannedStmt.
>
> How about the PartitionPruneInfos go into PlannedStmt as a List
> indexed in the way I mentioned and the cache of the results of pruning
> in EState?
>
> I think that leaves you adding List *partpruneinfos, Bitmapset
> *minimumlockrtis to PlannedStmt and the thing you have to cache the
> pruning results into EState. I'm not very clear on where you should
> stash the results of run-time pruning in the meantime before you can
> put them in EState. You might need to invent some intermediate struct
> that gets passed around that you can scribble down some details you're
> going to need during execution.
Yes, the ExecLockRelsInfo node in the current patch, that first gets
added to the QueryDesc and subsequently to the EState of the query,
serves as that stashing place. Not sure if you've looked at
ExecLockRelsInfo in detail in your review of the patch so far, but it
carries the initial pruning result in what are called
PlanInitPruningOutput nodes, which are stored in a list in
ExecLockRelsInfo and their offsets in the list are in turn stored in
an adjacent array that contains an element for every plan node in the
tree. If we go with a PlannedStmt.partpruneinfos list, then maybe we
don't need to have that array, because the Append/MergeAppend nodes
would be carrying those offsets by themselves.
Maybe a different name for ExecLockRelsInfo would be better?
Also, given Tom's apparent dislike for carrying that in PlannedStmt,
maybe the way I have it now is fine?
> > One question is whether the planner should always pay the overhead of
> > initializing this bitmapset? I mean it's only worthwhile if
> > AcquireExecutorLocks() is going to be involved, that is, the plan will
> > be cached and reused.
>
> Maybe the Bitmapset for the minimal locks needs to be built with
> bms_add_range(NULL, 0, list_length(rtable)); then do
> bms_del_members() on the relevant RTIs you find in the listed
> PartitionPruneInfos. That way it's very simple and cheap to do when
> there are no PartitionPruneInfos.
Ah, okay. Looking at make_partition_pruneinfo(), I think I see a way
to delete the RTIs of prunable relations -- construct an
all_matched_leaf_part_relids in parallel to allmatchedsubplans and
delete those from the initial set.
> > > 4. It's a bit disappointing to see RelOptInfo.partitioned_rels getting
> > > revived here. Why don't you just add a partitioned_relids to
> > > PartitionPruneInfo and just have make_partitionedrel_pruneinfo build
> > > you a Relids of them. PartitionedRelPruneInfo already has an rtindex
> > > field, so you just need to bms_add_member whatever that rtindex is.
> >
> > Hmm, not all Append/MergeAppend nodes in the plan tree may have
> > make_partition_pruneinfo() called on them though.
>
> For Append/MergeAppends without run-time pruning you'll want to add
> the RTIs to the minimal locking set of RTIs to go into PlannedStmt.
> The only things you want to leave out of that are RTIs for the RTEs
> that you might run-time prune away during AcquireExecutorLocks().
Yeah, I see it now.
Thanks.
--
Amit Langote
EDB: http://www.enterprisedb.com
* Re: generic plans and "initial" pruning
@ 2022-04-01 07:01 Amit Langote <[email protected]>
parent: Tom Lane <[email protected]>
0 siblings, 0 replies; 71+ messages in thread
From: Amit Langote @ 2022-04-01 07:01 UTC (permalink / raw)
To: Tom Lane <[email protected]>; +Cc: David Rowley <[email protected]>; Robert Haas <[email protected]>; pgsql-hackers
On Fri, Apr 1, 2022 at 12:45 PM Tom Lane <[email protected]> wrote:
> Amit Langote <[email protected]> writes:
> > On Fri, Apr 1, 2022 at 10:32 AM David Rowley <[email protected]> wrote:
> >> 1. You've changed the signature of various functions by adding
> >> ExecLockRelsInfo *execlockrelsinfo. I'm wondering why you didn't just
> >> put the ExecLockRelsInfo as a new field in PlannedStmt?
>
> > I'm worried about that churn myself and did consider this idea, though
> > I couldn't shake the feeling that it's maybe wrong to put something in
> > PlannedStmt that the planner itself doesn't produce.
>
> PlannedStmt is part of the plan tree, which MUST be read-only to
> the executor. This is not negotiable. However, there's other
> places that this data could be put, such as QueryDesc.
> Or for that matter, couldn't the data structure be created by
> the planner? (It looks like David is proposing exactly that
> further down.)
The data structure in question is for storing the results of
performing initial partition pruning on a generic plan, which the
patch proposes to do in plancache.c -- inside the body of
AcquireExecutorLocks()'s loop over PlannedStmts -- so, it's hard to
see it as a product of the planner. :-(
--
Amit Langote
EDB: http://www.enterprisedb.com
* Re: generic plans and "initial" pruning
@ 2022-04-01 08:19 David Rowley <[email protected]>
parent: Amit Langote <[email protected]>
0 siblings, 1 reply; 71+ messages in thread
From: David Rowley @ 2022-04-01 08:19 UTC (permalink / raw)
To: Amit Langote <[email protected]>; +Cc: Robert Haas <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers
On Fri, 1 Apr 2022 at 19:58, Amit Langote <[email protected]> wrote:
> Yes, the ExecLockRelsInfo node in the current patch, that first gets
> added to the QueryDesc and subsequently to the EState of the query,
> serves as that stashing place. Not sure if you've looked at
> ExecLockRelsInfo in detail in your review of the patch so far, but it
> carries the initial pruning result in what are called
> PlanInitPruningOutput nodes, which are stored in a list in
> ExecLockRelsInfo and their offsets in the list are in turn stored in
> an adjacent array that contains an element for every plan node in the
> tree. If we go with a PlannedStmt.partpruneinfos list, then maybe we
> don't need to have that array, because the Append/MergeAppend nodes
> would be carrying those offsets by themselves.
I saw it, just not in great detail. I saw that you had an array that
was indexed by the plan node's ID. I thought that wouldn't be so good
with large complex plans that we often get with partitioning
workloads. That's why I mentioned using another index that you store
in Append/MergeAppend that starts at 0 and increments by 1 for each
node that has a PartitionPruneInfo made for it during create_plan.
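A rough standalone sketch of that indexing scheme, to show why it stays compact: the struct and function below are hypothetical simplifications (SketchPlanNode is not PostgreSQL's Plan node, and assign_part_prune_indexes is an invented name). Only nodes that get pruning info receive an index, so any per-index array or list needs one slot per such node rather than one per plan node.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified plan node; not PostgreSQL's Plan struct. */
typedef struct SketchPlanNode
{
    int plan_node_id;      /* unique among all nodes of the plan */
    int has_pruneinfo;     /* nonzero if pruning info was made for it */
    int part_prune_index;  /* dense index, or -1 if there is none */
} SketchPlanNode;

/*
 * Walk the nodes in creation order and hand out indexes 0, 1, 2, ... only
 * to the nodes that have pruning info.  Returns how many were assigned,
 * which is the length a per-index result list would need.
 */
static int
assign_part_prune_indexes(SketchPlanNode *nodes, size_t nnodes)
{
    int next_index = 0;
    size_t i;

    assert(nodes != NULL || nnodes == 0);
    for (i = 0; i < nnodes; i++)
    {
        if (nodes[i].has_pruneinfo)
            nodes[i].part_prune_index = next_index++;
        else
            nodes[i].part_prune_index = -1;
    }
    return next_index;
}
```

In a plan with five nodes of which two carry pruning info, this yields two indexes (0 and 1), whereas an array indexed by plan node ID would need five elements.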
> Maybe a different name for ExecLockRelsInfo would be better?
>
> Also, given Tom's apparent dislike for carrying that in PlannedStmt,
> maybe the way I have it now is fine?
I think if you change how it's indexed and the other stuff then we can
have another look. I think the patch will be much easier to review
once the PartitionPruneInfos are moved into PlannedStmt.
David
* Re: generic plans and "initial" pruning
@ 2022-04-01 08:36 Amit Langote <[email protected]>
parent: David Rowley <[email protected]>
0 siblings, 1 reply; 71+ messages in thread
From: Amit Langote @ 2022-04-01 08:36 UTC (permalink / raw)
To: David Rowley <[email protected]>; +Cc: Robert Haas <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers
On Fri, Apr 1, 2022 at 5:20 PM David Rowley <[email protected]> wrote:
> On Fri, 1 Apr 2022 at 19:58, Amit Langote <[email protected]> wrote:
> > Yes, the ExecLockRelsInfo node in the current patch, that first gets
> > added to the QueryDesc and subsequently to the EState of the query,
> > serves as that stashing place. Not sure if you've looked at
> > ExecLockRelsInfo in detail in your review of the patch so far, but it
> > carries the initial pruning result in what are called
> > PlanInitPruningOutput nodes, which are stored in a list in
> > ExecLockRelsInfo and their offsets in the list are in turn stored in
> > an adjacent array that contains an element for every plan node in the
> > tree. If we go with a PlannedStmt.partpruneinfos list, then maybe we
> > don't need to have that array, because the Append/MergeAppend nodes
> > would be carrying those offsets by themselves.
>
> I saw it, just not in great detail. I saw that you had an array that
> was indexed by the plan node's ID. I thought that wouldn't be so good
> with large complex plans that we often get with partitioning
> workloads. That's why I mentioned using another index that you store
> in Append/MergeAppend that starts at 0 and increments by 1 for each
> node that has a PartitionPruneInfo made for it during create_plan.
>
> > Maybe a different name for ExecLockRelsInfo would be better?
> >
> > Also, given Tom's apparent dislike for carrying that in PlannedStmt,
> > maybe the way I have it now is fine?
>
> I think if you change how it's indexed and the other stuff then we can
> have another look. I think the patch will be much easier to review
> once the PartitionPruneInfos are moved into PlannedStmt.
Will do, thanks.
--
Amit Langote
EDB: http://www.enterprisedb.com
* Re: generic plans and "initial" pruning
@ 2022-04-06 07:20 Amit Langote <[email protected]>
parent: Amit Langote <[email protected]>
0 siblings, 1 reply; 71+ messages in thread
From: Amit Langote @ 2022-04-06 07:20 UTC (permalink / raw)
To: David Rowley <[email protected]>; +Cc: Robert Haas <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers
On Fri, Apr 1, 2022 at 5:36 PM Amit Langote <[email protected]> wrote:
> On Fri, Apr 1, 2022 at 5:20 PM David Rowley <[email protected]> wrote:
> > On Fri, 1 Apr 2022 at 19:58, Amit Langote <[email protected]> wrote:
> > > Yes, the ExecLockRelsInfo node in the current patch, that first gets
> > > added to the QueryDesc and subsequently to the EState of the query,
> > > serves as that stashing place. Not sure if you've looked at
> > > ExecLockRelsInfo in detail in your review of the patch so far, but it
> > > carries the initial pruning result in what are called
> > > PlanInitPruningOutput nodes, which are stored in a list in
> > > ExecLockRelsInfo and their offsets in the list are in turn stored in
> > > an adjacent array that contains an element for every plan node in the
> > > tree. If we go with a PlannedStmt.partpruneinfos list, then maybe we
> > > don't need to have that array, because the Append/MergeAppend nodes
> > > would be carrying those offsets by themselves.
> >
> > I saw it, just not in great detail. I saw that you had an array that
> > was indexed by the plan node's ID. I thought that wouldn't be so good
> > with large complex plans that we often get with partitioning
> > workloads. That's why I mentioned using another index that you store
> > in Append/MergeAppend that starts at 0 and increments by 1 for each
> > node that has a PartitionPruneInfo made for it during create_plan.
> >
> > > Maybe a different name for ExecLockRelsInfo would be better?
> > >
> > > Also, given Tom's apparent dislike for carrying that in PlannedStmt,
> > > maybe the way I have it now is fine?
> >
> > I think if you change how it's indexed and the other stuff then we can
> > have another look. I think the patch will be much easier to review
> > once the PartitionPruneInfos are moved into PlannedStmt.
>
> Will do, thanks.
And here is a version along those lines that passes make check-world.
It is probably still a WIP, as I think the comments could use more editing.
Here's how the new implementation works:
AcquireExecutorLocks() calls ExecutorDoInitialPruning(), which in turn
iterates over a list of PartitionPruneInfos in a given PlannedStmt
coming from a CachedPlan. For each PartitionPruneInfo,
ExecPartitionDoInitialPruning() is called, which sets up
PartitionPruneState and performs initial pruning steps present in the
PartitionPruneInfo. The resulting bitmapsets of valid subplans, one
for each PartitionPruneInfo, are collected in a list and added to a
result node called PartitionPruneResult. It represents the result of
performing initial pruning on all PartitionPruneInfos found in a plan.
A list of PartitionPruneResults is passed along with the PlannedStmt
to the executor, where it is referenced when initializing
Append/MergeAppend nodes.
PlannedStmt.minLockRelids defined by the planner contains the RT
indexes of all the entries in the range table minus those of the leaf
partitions whose subplans are subject to removal due to initial
pruning. AcquireExecutorLocks() adds back the RT indexes of only those
leaf partitions whose subplans survive ExecutorDoInitialPruning(). To
get the leaf partition RT indexes from the PartitionPruneInfo, a new
rti_map array is added to PartitionedRelPruneInfo.
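The locking rule described above can be sketched in a few lines of standalone C. This is an illustration under stated assumptions, not code from the patch: a uint64_t mask stands in for a Bitmapset, the names RtiMask and add_surviving_leaf_rtis are hypothetical, and the rti_map here maps a subplan offset directly to the RT index of its leaf partition, which simplifies how the patch's rti_map in PartitionedRelPruneInfo is organized.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a Bitmapset: bit i set means member i is present. */
typedef uint64_t RtiMask;

/*
 * Compute the set of RT indexes to lock: everything in the planner-provided
 * minimal set, plus the leaf partitions whose subplans survived initial
 * pruning.
 *
 * valid_subplans: bit i set means subplan i survived pruning.
 * rti_map[i]:     RT index of the leaf partition scanned by subplan i
 *                 (an assumption of this sketch).
 */
static RtiMask
add_surviving_leaf_rtis(RtiMask min_lock_rtis, RtiMask valid_subplans,
                        const int *rti_map, int nsubplans)
{
    RtiMask result = min_lock_rtis;
    int i;

    assert(nsubplans >= 0 && nsubplans <= 64);
    for (i = 0; i < nsubplans; i++)
    {
        if (valid_subplans & (UINT64_C(1) << i))
        {
            assert(rti_map[i] > 0 && rti_map[i] < 64);
            result |= UINT64_C(1) << rti_map[i];
        }
    }
    return result;
}
```

With a minimal set of RT indexes 1 and 2, three leaf partitions at RT indexes 3, 4 and 5, and only the first and third subplans surviving, the set to lock becomes 1, 2, 3 and 5; the pruned partition at RT index 4 is never locked.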
There's only one patch this time. Patches that added partitioned_rels
and plan_tree_walker() are no longer necessary.
--
Amit Langote
EDB: http://www.enterprisedb.com
Attachments:
[application/octet-stream] v11-0001-Optimize-AcquireExecutorLocks-to-skip-pruned-par.patch (97.8K, 2-v11-0001-Optimize-AcquireExecutorLocks-to-skip-pruned-par.patch)
From b0c8f18835ea2f455ea503a7c1702195be989df8 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Wed, 22 Dec 2021 16:55:17 +0900
Subject: [PATCH v11] Optimize AcquireExecutorLocks() to skip pruned partitions
---
src/backend/commands/copyto.c | 2 +-
src/backend/commands/createas.c | 2 +-
src/backend/commands/explain.c | 7 +-
src/backend/commands/extension.c | 13 +-
src/backend/commands/matview.c | 2 +-
src/backend/commands/portalcmds.c | 1 +
src/backend/commands/prepare.c | 17 +-
src/backend/executor/README | 28 +++
src/backend/executor/execMain.c | 46 +++++
src/backend/executor/execParallel.c | 28 ++-
src/backend/executor/execPartition.c | 238 ++++++++++++++++++++----
src/backend/executor/execUtils.c | 1 +
src/backend/executor/functions.c | 2 +-
src/backend/executor/nodeAppend.c | 16 +-
src/backend/executor/nodeMergeAppend.c | 9 +-
src/backend/executor/spi.c | 14 +-
src/backend/nodes/copyfuncs.c | 33 +++-
src/backend/nodes/outfuncs.c | 36 +++-
src/backend/nodes/readfuncs.c | 56 +++++-
src/backend/optimizer/plan/createplan.c | 20 +-
src/backend/optimizer/plan/planner.c | 3 +
src/backend/optimizer/plan/setrefs.c | 104 ++++++++---
src/backend/partitioning/partprune.c | 41 +++-
src/backend/tcop/postgres.c | 15 +-
src/backend/tcop/pquery.c | 22 ++-
src/backend/utils/cache/plancache.c | 232 ++++++++++++++++++++---
src/backend/utils/mmgr/portalmem.c | 2 +
src/include/commands/explain.h | 3 +-
src/include/executor/execPartition.h | 12 +-
src/include/executor/execdesc.h | 3 +
src/include/executor/executor.h | 2 +
src/include/nodes/execnodes.h | 15 ++
src/include/nodes/nodes.h | 4 +
src/include/nodes/pathnodes.h | 15 ++
src/include/nodes/plannodes.h | 39 +++-
src/include/tcop/tcopprot.h | 2 +-
src/include/utils/plancache.h | 7 +
src/include/utils/portal.h | 5 +
38 files changed, 942 insertions(+), 155 deletions(-)
diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index 55c38b04c4..d403eb2309 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -542,7 +542,7 @@ BeginCopyTo(ParseState *pstate,
((DR_copy *) dest)->cstate = cstate;
/* Create a QueryDesc requesting no output */
- cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ cstate->queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
GetActiveSnapshot(),
InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 9abbb6b555..f6607f2454 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -325,7 +325,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 1e5701b8eb..7ba9852e51 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -407,7 +407,7 @@ ExplainOneQuery(Query *query, int cursorOptions,
}
/* run it (if needed) and produce output */
- ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+ ExplainOnePlan(plan, NULL, into, es, queryString, params, queryEnv,
&planduration, (es->buffers ? &bufusage : NULL));
}
}
@@ -515,7 +515,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
* to call it.
*/
void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, PartitionPruneResult *part_prune_result,
+ IntoClause *into, ExplainState *es,
const char *queryString, ParamListInfo params,
QueryEnvironment *queryEnv, const instr_time *planduration,
const BufferUsage *bufusage)
@@ -563,7 +564,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
dest = None_Receiver;
/* Create a QueryDesc for the query */
- queryDesc = CreateQueryDesc(plannedstmt, queryString,
+ queryDesc = CreateQueryDesc(plannedstmt, part_prune_result, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, instrument_option);
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index 1013790dbb..1151d95e1f 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -741,8 +741,10 @@ execute_sql_string(const char *sql)
RawStmt *parsetree = lfirst_node(RawStmt, lc1);
MemoryContext per_parsetree_context,
oldcontext;
- List *stmt_list;
- ListCell *lc2;
+ List *stmt_list,
+ *part_prune_result_list;
+ ListCell *lc2,
+ *lc3;
/*
* We do the work for each parsetree in a short-lived context, to
@@ -762,11 +764,13 @@ execute_sql_string(const char *sql)
NULL,
0,
NULL);
- stmt_list = pg_plan_queries(stmt_list, sql, CURSOR_OPT_PARALLEL_OK, NULL);
+ stmt_list = pg_plan_queries(stmt_list, sql, CURSOR_OPT_PARALLEL_OK, NULL,
+ &part_prune_result_list);
- foreach(lc2, stmt_list)
+ forboth(lc2, stmt_list, lc3, part_prune_result_list)
{
PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+ PartitionPruneResult *part_prune_result = lfirst_node(PartitionPruneResult, lc3);
CommandCounterIncrement();
@@ -777,6 +781,7 @@ execute_sql_string(const char *sql)
QueryDesc *qdesc;
qdesc = CreateQueryDesc(stmt,
+ part_prune_result,
sql,
GetActiveSnapshot(), NULL,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 05e7b60059..4ef44aaf23 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -416,7 +416,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, queryString,
+ queryDesc = CreateQueryDesc(plan, NULL, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/portalcmds.c b/src/backend/commands/portalcmds.c
index 9902c5c566..cac653f535 100644
--- a/src/backend/commands/portalcmds.c
+++ b/src/backend/commands/portalcmds.c
@@ -107,6 +107,7 @@ PerformCursorOpen(ParseState *pstate, DeclareCursorStmt *cstmt, ParamListInfo pa
queryString,
CMDTAG_SELECT, /* cursor's query is always a SELECT */
list_make1(plan),
+ list_make1(NULL), /* no PartitionPruneResult to pass */
NULL);
/*----------
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 80738547ed..8b15159374 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -155,6 +155,7 @@ ExecuteQuery(ParseState *pstate,
PreparedStatement *entry;
CachedPlan *cplan;
List *plan_list;
+ List *plan_part_prune_result_list;
ParamListInfo paramLI = NULL;
EState *estate = NULL;
Portal portal;
@@ -195,6 +196,7 @@ ExecuteQuery(ParseState *pstate,
/* Replan if needed, and increment plan refcount for portal */
cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL);
plan_list = cplan->stmt_list;
+ plan_part_prune_result_list = cplan->part_prune_result_list;
/*
* DO NOT add any logic that could possibly throw an error between
@@ -204,7 +206,7 @@ ExecuteQuery(ParseState *pstate,
NULL,
query_string,
entry->plansource->commandTag,
- plan_list,
+ plan_list, plan_part_prune_result_list,
cplan);
/*
@@ -576,7 +578,9 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
const char *query_string;
CachedPlan *cplan;
List *plan_list;
- ListCell *p;
+ List *plan_part_prune_result_list;
+ ListCell *p,
+ *pp;
ParamListInfo paramLI = NULL;
EState *estate = NULL;
instr_time planstart;
@@ -632,15 +636,18 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
}
plan_list = cplan->stmt_list;
+ plan_part_prune_result_list = cplan->part_prune_result_list;
/* Explain each query */
- foreach(p, plan_list)
+ forboth(p, plan_list, pp, plan_part_prune_result_list)
{
PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
+ PartitionPruneResult *part_prune_result = lfirst_node(PartitionPruneResult, pp);
if (pstmt->commandType != CMD_UTILITY)
- ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv,
- &planduration, (es->buffers ? &bufusage : NULL));
+ ExplainOnePlan(pstmt, part_prune_result, into, es, query_string,
+ paramLI, queryEnv, &planduration,
+ (es->buffers ? &bufusage : NULL));
else
ExplainOneUtility(pstmt->utilityStmt, into, es, query_string,
paramLI, queryEnv);
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 0b5183fc4a..8418e758da 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -65,6 +65,30 @@ found there. This currently only occurs for Append and MergeAppend nodes. In
this case the non-required subplans are ignored and the executor state's
subnode array will become out of sequence to the plan's subplan list.
+Actually, the so-called execution time pruning may also occur even before the
+execution has started. One case where that occurs is when a cached generic
+plan is being validated for execution by plancache.c: GetCachedPlan(), which
+proceeds by locking all the relations that will be scanned by that plan. If
+the generic plan has nodes that contain so-called initial pruning steps (a
+subset of execution pruning steps that do not depend on full-fledged execution
+having started), they are performed at this point to figure out the minimal
+set of child subplans that satisfy those pruning instructions and the result
+of performing that pruning is saved in a data structure that gets passed to
+the executor alongside the plan tree. Relations scanned by only those
+surviving subplans are then locked while those scanned by the pruned subplans
+are not, even though the pruned subplans themselves are not removed from the
+plan tree. So, it is imperative that the executor and any third party code
+invoked by it that gets passed the plan tree look at the initial pruning result
+made available via the aforementioned data structure to determine whether or
+not a particular subplan is valid. The data structure basically consists of
+a PartitionPruneResult node passed through the QueryDesc (subsequently added
+to EState) containing a list of bitmapsets with one element for every
+PartitionPruneInfo found in PlannedStmt.partPruneInfos. The list is indexed
+with part_prune_index of the individual PartitionPruneInfos that's stored in
+the parent plan nodes to which a given PartitionPruneInfo belongs. Each
+bitmapset contains the indexes of the child subplans of the given parent plan
+node that survive initial partition pruning.
+
Each Plan node may have expression trees associated with it, to represent
its target list, qualification conditions, etc. These trees are also
read-only to the executor, but the executor state for expression evaluation
@@ -286,6 +310,10 @@ Query Processing Control Flow
This is a sketch of control flow for full query processing:
+ [ ExecutorDoInitialPruning ] --- an optional step to perform initial
+ partition pruning on the plan tree the result of which is passed
+ to the executor via QueryDesc
+
CreateQueryDesc
ExecutorStart
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index ef2fd46092..05cc99df8f 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -49,11 +49,13 @@
#include "commands/matview.h"
#include "commands/trigger.h"
#include "executor/execdebug.h"
+#include "executor/execPartition.h"
#include "executor/nodeSubplan.h"
#include "foreign/fdwapi.h"
#include "jit/jit.h"
#include "mb/pg_wchar.h"
#include "miscadmin.h"
+#include "nodes/nodeFuncs.h"
#include "parser/parsetree.h"
#include "storage/bufmgr.h"
#include "storage/lmgr.h"
@@ -104,6 +106,47 @@ static void EvalPlanQualStart(EPQState *epqstate, Plan *planTree);
/* end of local decls */
+/* ----------------------------------------------------------------
+ * ExecutorDoInitialPruning
+ *
+ * Performs initial partition pruning to figure out the minimal set of
+ * subplans to be executed and the set of RT indexes of the corresponding
+ * leaf partitions
+ *
+ * Returned PartitionPruneResult must be subsequently passed to the executor
+ * so that it can reuse the result of pruning. It's important that the
+ * executor has the same view of which partitions are initially pruned (by not doing
+ * the pruning again itself) or otherwise it risks initializing subplans whose
+ * partitions would not have been locked.
+ *
+ * Note: Partitioned tables mentioned in PartitionedRelPruneInfo nodes that
+ * drive the pruning will be locked before doing the pruning.
+ */
+PartitionPruneResult *
+ExecutorDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params)
+{
+ PartitionPruneResult *result;
+ ListCell *lc;
+
+ /* Only get here if there is any pruning to do. */
+ Assert(plannedstmt->containsInitialPruning);
+
+ result = makeNode(PartitionPruneResult);
+ foreach(lc, plannedstmt->partPruneInfos)
+ {
+ PartitionPruneInfo *pruneinfo = lfirst(lc);
+ Bitmapset *valid_subplan_offs;
+
+ valid_subplan_offs =
+ ExecPartitionDoInitialPruning(plannedstmt, params, pruneinfo,
+ &result->scan_leafpart_rtis);
+ result->valid_subplan_offs_list =
+ lappend(result->valid_subplan_offs_list,
+ valid_subplan_offs);
+ }
+
+ return result;
+}
/* ----------------------------------------------------------------
* ExecutorStart
@@ -806,6 +849,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
{
CmdType operation = queryDesc->operation;
PlannedStmt *plannedstmt = queryDesc->plannedstmt;
+ PartitionPruneResult *part_prune_result = queryDesc->part_prune_result;
Plan *plan = plannedstmt->planTree;
List *rangeTable = plannedstmt->rtable;
EState *estate = queryDesc->estate;
@@ -825,6 +869,8 @@ InitPlan(QueryDesc *queryDesc, int eflags)
ExecInitRangeTable(estate, rangeTable);
estate->es_plannedstmt = plannedstmt;
+ estate->es_part_prune_infos = plannedstmt->partPruneInfos;
+ estate->es_part_prune_result = part_prune_result;
/*
* Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index 9a0d5d59ef..805f86c503 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -66,6 +66,7 @@
#define PARALLEL_KEY_QUERY_TEXT UINT64CONST(0xE000000000000008)
#define PARALLEL_KEY_JIT_INSTRUMENTATION UINT64CONST(0xE000000000000009)
#define PARALLEL_KEY_WAL_USAGE UINT64CONST(0xE00000000000000A)
+#define PARALLEL_KEY_PARTITIONPRUNERESULT UINT64CONST(0xE00000000000000B)
#define PARALLEL_TUPLE_QUEUE_SIZE 65536
@@ -182,7 +183,9 @@ ExecSerializePlan(Plan *plan, EState *estate)
pstmt->transientPlan = false;
pstmt->dependsOnRole = false;
pstmt->parallelModeNeeded = false;
+ pstmt->containsInitialPruning = false;
pstmt->planTree = plan;
+ pstmt->partPruneInfos = estate->es_part_prune_infos;
pstmt->rtable = estate->es_range_table;
pstmt->resultRelations = NIL;
pstmt->appendRelations = NIL;
@@ -596,12 +599,15 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
FixedParallelExecutorState *fpes;
char *pstmt_data;
char *pstmt_space;
+ char *part_prune_result_data;
+ char *part_prune_result_space;
char *paramlistinfo_space;
BufferUsage *bufusage_space;
WalUsage *walusage_space;
SharedExecutorInstrumentation *instrumentation = NULL;
SharedJitInstrumentation *jit_instrumentation = NULL;
int pstmt_len;
+ int part_prune_result_len;
int paramlistinfo_len;
int instrumentation_len = 0;
int jit_instrumentation_len = 0;
@@ -630,6 +636,7 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
/* Fix up and serialize plan to be sent to workers. */
pstmt_data = ExecSerializePlan(planstate->plan, estate);
+ part_prune_result_data = nodeToString(estate->es_part_prune_result);
/* Create a parallel context. */
pcxt = CreateParallelContext("postgres", "ParallelQueryMain", nworkers);
@@ -656,6 +663,11 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
shm_toc_estimate_chunk(&pcxt->estimator, pstmt_len);
shm_toc_estimate_keys(&pcxt->estimator, 1);
+ /* Estimate space for serialized PartitionPruneResult. */
+ part_prune_result_len = strlen(part_prune_result_data) + 1;
+ shm_toc_estimate_chunk(&pcxt->estimator, part_prune_result_len);
+ shm_toc_estimate_keys(&pcxt->estimator, 1);
+
/* Estimate space for serialized ParamListInfo. */
paramlistinfo_len = EstimateParamListSpace(estate->es_param_list_info);
shm_toc_estimate_chunk(&pcxt->estimator, paramlistinfo_len);
@@ -750,6 +762,12 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
memcpy(pstmt_space, pstmt_data, pstmt_len);
shm_toc_insert(pcxt->toc, PARALLEL_KEY_PLANNEDSTMT, pstmt_space);
+ /* Store serialized PartitionPruneResult */
+ part_prune_result_space = shm_toc_allocate(pcxt->toc, part_prune_result_len);
+ memcpy(part_prune_result_space, part_prune_result_data, part_prune_result_len);
+ shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARTITIONPRUNERESULT,
+ part_prune_result_space);
+
/* Store serialized ParamListInfo. */
paramlistinfo_space = shm_toc_allocate(pcxt->toc, paramlistinfo_len);
shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARAMLISTINFO, paramlistinfo_space);
@@ -1231,8 +1249,10 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
int instrument_options)
{
char *pstmtspace;
+ char *part_prune_result_space;
char *paramspace;
PlannedStmt *pstmt;
+ PartitionPruneResult *part_prune_result;
ParamListInfo paramLI;
char *queryString;
@@ -1243,12 +1263,18 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
pstmtspace = shm_toc_lookup(toc, PARALLEL_KEY_PLANNEDSTMT, false);
pstmt = (PlannedStmt *) stringToNode(pstmtspace);
+ /* Reconstruct leader-supplied PartitionPruneResult. */
+ part_prune_result_space =
+ shm_toc_lookup(toc, PARALLEL_KEY_PARTITIONPRUNERESULT, false);
+ part_prune_result = (PartitionPruneResult *)
+ stringToNode(part_prune_result_space);
+
/* Reconstruct ParamListInfo. */
paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
paramLI = RestoreParamList(&paramspace);
/* Create a QueryDesc for the query. */
- return CreateQueryDesc(pstmt,
+ return CreateQueryDesc(pstmt, part_prune_result,
queryString,
GetActiveSnapshot(), InvalidSnapshot,
receiver, paramLI, NULL, instrument_options);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 615bd80973..3037742b8d 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -25,6 +25,7 @@
#include "mb/pg_wchar.h"
#include "miscadmin.h"
#include "nodes/makefuncs.h"
+#include "parser/parsetree.h"
#include "partitioning/partbounds.h"
#include "partitioning/partdesc.h"
#include "partitioning/partprune.h"
@@ -185,7 +186,11 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
static PartitionPruneState *CreatePartitionPruneState(PlanState *planstate,
- PartitionPruneInfo *pruneinfo);
+ PartitionPruneInfo *pruneinfo,
+ bool consider_initial_steps,
+ bool consider_exec_steps,
+ List *rtable, ExprContext *econtext,
+ PartitionDirectory partdir);
static void InitPartitionPruneContext(PartitionPruneContext *context,
List *pruning_steps,
PartitionDesc partdesc,
@@ -198,7 +203,8 @@ static void PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
- Bitmapset **validsubplans);
+ Bitmapset **validsubplans,
+ Bitmapset **scan_leafpart_rtis);
/*
@@ -1587,8 +1593,10 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
* considered to be a stable expression, it can change value from one plan
* node scan to the next during query execution. Stable comparison
* expressions that don't involve such Params allow partition pruning to be
- * done once during executor startup. Expressions that do involve such Params
- * require us to prune separately for each scan of the parent plan node.
+ * done once during executor startup or during ExecutorDoInitialPruning(), which
+ * runs as part of performing AcquireExecutorLocks() on a given plan tree.
+ * Expressions that do involve such Params require us to prune separately for
+ * each scan of the parent plan node.
*
* Note that pruning away unneeded subplans during executor startup has the
* added benefit of not having to initialize the unneeded subplans at all.
@@ -1605,6 +1613,13 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
* account for initial pruning possibly having eliminated some of the
* subplans.
*
+ * ExecPartitionDoInitialPruning:
+ * Do initial pruning with the information contained in a given
+ * PartitionPruneInfo to determine the minimal set of child subplans
+ * that need to be executed for the parent plan node to which the
+ * PartitionPruneInfo belongs, and also the set of RT indexes of the leaf
+ * partitions that will be scanned with those subplans.
+ *
* ExecFindMatchingSubPlans:
* Returns indexes of matching subplans after evaluating the expressions
* that are safe to evaluate at a given point. This function is first
@@ -1622,8 +1637,9 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
*
* On return, *initially_valid_subplans is assigned the set of indexes of
* child subplans that must be initialized along with the parent plan node.
- * Initial pruning is performed here if needed and in that case only the
- * surviving subplans' indexes are added.
+ * Initial pruning is performed here if needed (unless it has already been done
+ * by ExecutorDoInitialPruning()), and in that case only the surviving
+ * subplans' indexes are added.
*
* If subplans are indeed pruned, subplan_map arrays contained in the returned
* PartitionPruneState are re-sequenced to not count those, though only if the
@@ -1632,23 +1648,59 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
PartitionPruneState *
ExecInitPartitionPruning(PlanState *planstate,
int n_total_subplans,
- PartitionPruneInfo *pruneinfo,
+ int part_prune_index,
Bitmapset **initially_valid_subplans)
{
- PartitionPruneState *prunestate;
EState *estate = planstate->state;
+ PartitionPruneInfo *pruneinfo = list_nth(estate->es_part_prune_infos,
+ part_prune_index);
+ PartitionPruneResult *pruneresult = estate->es_part_prune_result;
+ PartitionPruneState *prunestate;
+ bool do_pruning = (pruneinfo->needs_init_pruning ||
+ pruneinfo->needs_exec_pruning);
- /* We may need an expression context to evaluate partition exprs */
- ExecAssignExprContext(estate, planstate);
+ /*
+ * No need to do initial pruning if it was done already by
+ * ExecutorDoInitialPruning(), which it would be if es_part_prune_result
+ * has been set.
+ */
+ if (pruneresult)
+ do_pruning = pruneinfo->needs_exec_pruning;
+
+ prunestate = NULL;
+ if (do_pruning)
+ {
+ /* We may need an expression context to evaluate partition exprs */
+ ExecAssignExprContext(estate, planstate);
- /* Create the working data structure for pruning */
- prunestate = CreatePartitionPruneState(planstate, pruneinfo);
+ /* For data reading, executor always omits detached partitions */
+ if (estate->es_partition_directory == NULL)
+ estate->es_partition_directory =
+ CreatePartitionDirectory(estate->es_query_cxt, false);
+
+ /*
+ * Create the working data structure for pruning. No need to consider
+ * initial pruning steps if we have a PartitionPruneResult.
+ */
+ prunestate = CreatePartitionPruneState(planstate, pruneinfo,
+ pruneresult == NULL, true,
+ NIL, planstate->ps_ExprContext,
+ estate->es_partition_directory);
+ }
/*
* Perform an initial partition prune pass, if required.
*/
- if (prunestate->do_initial_prune)
- *initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true);
+ if (pruneresult)
+ {
+ *initially_valid_subplans =
+ list_nth(pruneresult->valid_subplan_offs_list, part_prune_index);
+ }
+ else if (prunestate && prunestate->do_initial_prune)
+ {
+ *initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true,
+ NULL);
+ }
else
{
/* No pruning, so we'll need to initialize all subplans */
@@ -1669,7 +1721,7 @@ ExecInitPartitionPruning(PlanState *planstate,
* leaves invalid data in prunestate, because that data won't be
* consulted again (cf initial Assert in ExecFindMatchingSubPlans).
*/
- if (prunestate->do_exec_prune)
+ if (prunestate && prunestate->do_exec_prune)
PartitionPruneFixSubPlanMap(prunestate,
*initially_valid_subplans,
n_total_subplans);
@@ -1678,11 +1730,72 @@ ExecInitPartitionPruning(PlanState *planstate,
return prunestate;
}
+/*
+ * ExecPartitionDoInitialPruning
+ * Perform initial pruning using the given PartitionPruneInfo to determine
+ * the minimal set of child subplans that need to be executed for the
+ * parent plan node to which the PartitionPruneInfo belongs, and also the
+ * set of RT indexes of leaf partitions that will be scanned with those
+ * subplans.
+ */
+Bitmapset *
+ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
+ PartitionPruneInfo *pruneinfo,
+ Bitmapset **scan_leafpart_rtis)
+{
+ List *rtable = plannedstmt->rtable;
+ ExprContext *econtext;
+ PartitionDirectory pdir;
+ MemoryContext oldcontext,
+ tmpcontext;
+ PartitionPruneState *prunestate;
+ Bitmapset *valid_subplan_offs;
+
+ /*
+ * A temporary context to allocate stuff needed to run the pruning steps.
+ */
+ tmpcontext = AllocSetContextCreate(CurrentMemoryContext,
+ "initial pruning working data",
+ ALLOCSET_DEFAULT_SIZES);
+ oldcontext = MemoryContextSwitchTo(tmpcontext);
+
+ /*
+ * PartitionDirectory to look up partition descriptors, which omits
+ * detached partitions, just like in the executor proper.
+ */
+ pdir = CreatePartitionDirectory(CurrentMemoryContext, false);
+
+ /*
+ * We don't yet have a PlanState for the parent plan node, so must create
+ * a standalone ExprContext to evaluate pruning expressions, equipped with
+ * the information about the EXTERN parameters that the caller passed us.
+ * Note that that's okay because the initial pruning steps do not contain
+ * anything that requires the execution to have started.
+ */
+ econtext = CreateStandaloneExprContext();
+ econtext->ecxt_param_list_info = params;
+ prunestate = CreatePartitionPruneState(NULL, pruneinfo, true, false,
+ rtable, econtext, pdir);
+ MemoryContextSwitchTo(oldcontext);
+
+ /* Do the initial pruning. */
+ valid_subplan_offs = ExecFindMatchingSubPlans(prunestate, true,
+ scan_leafpart_rtis);
+
+ FreeExprContext(econtext, true);
+ DestroyPartitionDirectory(pdir);
+ MemoryContextDelete(tmpcontext);
+
+ return valid_subplan_offs;
+}
+
/*
* CreatePartitionPruneState
* Build the data structure required for calling ExecFindMatchingSubPlans
*
- * 'planstate' is the parent plan node's execution state.
+ * 'planstate', if not NULL, is the parent plan node's execution state. It
+ * can be NULL if being called before ExecutorStart(), in which case,
+ * 'rtable' (range table), 'econtext', and 'partdir' must be explicitly
+ * provided.
*
* 'pruneinfo' is a PartitionPruneInfo as generated by
* make_partition_pruneinfo. Here we build a PartitionPruneState containing a
@@ -1696,19 +1809,21 @@ ExecInitPartitionPruning(PlanState *planstate,
* PartitionedRelPruneInfo.
*/
static PartitionPruneState *
-CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
+CreatePartitionPruneState(PlanState *planstate,
+ PartitionPruneInfo *pruneinfo,
+ bool consider_initial_steps,
+ bool consider_exec_steps,
+ List *rtable, ExprContext *econtext,
+ PartitionDirectory partdir)
{
- EState *estate = planstate->state;
+ EState *estate = planstate ? planstate->state : NULL;
PartitionPruneState *prunestate;
int n_part_hierarchies;
ListCell *lc;
int i;
- ExprContext *econtext = planstate->ps_ExprContext;
- /* For data reading, executor always omits detached partitions */
- if (estate->es_partition_directory == NULL)
- estate->es_partition_directory =
- CreatePartitionDirectory(estate->es_query_cxt, false);
+ Assert((estate != NULL) ||
+ (partdir != NULL && econtext != NULL && rtable != NIL));
n_part_hierarchies = list_length(pruneinfo->prune_infos);
Assert(n_part_hierarchies > 0);
@@ -1759,19 +1874,48 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
PartitionedRelPruneInfo *pinfo = lfirst_node(PartitionedRelPruneInfo, lc2);
PartitionedRelPruningData *pprune = &prunedata->partrelprunedata[j];
Relation partrel;
+ bool close_partrel = false;
PartitionDesc partdesc;
PartitionKey partkey;
/*
- * We can rely on the copies of the partitioned table's partition
- * key and partition descriptor appearing in its relcache entry,
- * because that entry will be held open and locked for the
- * duration of this executor run.
+ * Must open the relation ourselves when called before execution
+ * has started, such as when called during
+ * ExecutorDoInitialPruning() on a cached plan. In that case,
+ * sub-partitions must be locked, because AcquirePlannerLocks()
+ * would not have seen them. (1st relation in a partrelpruneinfos
+ * list is always the root partitioned table appearing in the
+ * query, which AcquirePlannerLocks() would have locked; the
+ * Assert in relation_open() guards that assumption.)
+ */
+ if (estate == NULL)
+ {
+ RangeTblEntry *rte = rt_fetch(pinfo->rtindex, rtable);
+ int lockmode = (j == 0) ? NoLock : rte->rellockmode;
+
+ partrel = table_open(rte->relid, lockmode);
+ close_partrel = true;
+ }
+ else
+ partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
+
+ /*
+ * We can rely on the copy of the partitioned table's partition
+ * key in its relcache entry, because it can't change (or get
+ * destroyed) as long as the relation is locked. The partition
+ * descriptor is taken from the PartitionDirectory associated with
+ * the table, which is held open long enough for the descriptor to
+ * remain valid while it's used to perform the pruning steps.
*/
- partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
partkey = RelationGetPartitionKey(partrel);
- partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
- partrel);
+ partdesc = PartitionDirectoryLookup(partdir, partrel);
+
+ /*
+ * Must close partrel, keeping the lock taken, if we're not using
+ * EState's entry.
+ */
+ if (close_partrel)
+ table_close(partrel, NoLock);
/*
* Initialize the subplan_map and subpart_map.
@@ -1785,6 +1929,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
Assert(partdesc->nparts >= pinfo->nparts);
pprune->nparts = partdesc->nparts;
pprune->subplan_map = palloc(sizeof(int) * partdesc->nparts);
+ pprune->rti_map = palloc(sizeof(Index) * partdesc->nparts);
if (partdesc->nparts == pinfo->nparts)
{
/*
@@ -1795,6 +1940,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
pprune->subpart_map = pinfo->subpart_map;
memcpy(pprune->subplan_map, pinfo->subplan_map,
sizeof(int) * pinfo->nparts);
+ memcpy(pprune->rti_map, pinfo->rti_map,
+ sizeof(Index) * pinfo->nparts);
/*
* Double-check that the list of unpruned relations has not
@@ -1845,6 +1992,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
pinfo->subplan_map[pd_idx];
pprune->subpart_map[pp_idx] =
pinfo->subpart_map[pd_idx];
+ pprune->rti_map[pp_idx] =
+ pinfo->rti_map[pd_idx];
pd_idx++;
}
else
@@ -1852,6 +2001,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
/* this partdesc entry is not in the plan */
pprune->subplan_map[pp_idx] = -1;
pprune->subpart_map[pp_idx] = -1;
+ pprune->rti_map[pp_idx] = 0;
}
}
@@ -1873,7 +2023,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
* Initialize pruning contexts as needed.
*/
pprune->initial_pruning_steps = pinfo->initial_pruning_steps;
- if (pinfo->initial_pruning_steps)
+ if (consider_initial_steps && pinfo->initial_pruning_steps)
{
InitPartitionPruneContext(&pprune->initial_context,
pinfo->initial_pruning_steps,
@@ -1883,7 +2033,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
prunestate->do_initial_prune = true;
}
pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
- if (pinfo->exec_pruning_steps)
+ if (consider_exec_steps && pinfo->exec_pruning_steps)
{
InitPartitionPruneContext(&pprune->exec_context,
pinfo->exec_pruning_steps,
@@ -2111,10 +2261,14 @@ PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
* Pass initial_prune if PARAM_EXEC Params cannot yet be evaluated. This
* differentiates the initial executor-time pruning step from later
* runtime pruning.
+ *
+ * RT indexes of leaf partitions scanned by the chosen subplans are added to
+ * *scan_leafpart_rtis if the pointer is non-NULL.
*/
Bitmapset *
ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
- bool initial_prune)
+ bool initial_prune,
+ Bitmapset **scan_leafpart_rtis)
{
Bitmapset *result = NULL;
MemoryContext oldcontext;
@@ -2149,7 +2303,7 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
*/
pprune = &prunedata->partrelprunedata[0];
find_matching_subplans_recurse(prunedata, pprune, initial_prune,
- &result);
+ &result, scan_leafpart_rtis);
/* Expression eval may have used space in ExprContext too */
if (pprune->exec_pruning_steps)
@@ -2163,6 +2317,8 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
/* Copy result out of the temp context before we reset it */
result = bms_copy(result);
+ if (scan_leafpart_rtis)
+ *scan_leafpart_rtis = bms_copy(*scan_leafpart_rtis);
MemoryContextReset(prunestate->prune_context);
@@ -2173,13 +2329,15 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
* find_matching_subplans_recurse
* Recursive worker function for ExecFindMatchingSubPlans
*
- * Adds valid (non-prunable) subplan IDs to *validsubplans
+ * Adds valid (non-prunable) subplan IDs to *validsubplans and RT indexes of
+ * the corresponding leaf partitions to *scan_leafpart_rtis (if asked for).
*/
static void
find_matching_subplans_recurse(PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
- Bitmapset **validsubplans)
+ Bitmapset **validsubplans,
+ Bitmapset **scan_leafpart_rtis)
{
Bitmapset *partset;
int i;
@@ -2206,8 +2364,13 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
while ((i = bms_next_member(partset, i)) >= 0)
{
if (pprune->subplan_map[i] >= 0)
+ {
*validsubplans = bms_add_member(*validsubplans,
pprune->subplan_map[i]);
+ if (scan_leafpart_rtis && pprune->rti_map[i] > 0)
+ *scan_leafpart_rtis = bms_add_member(*scan_leafpart_rtis,
+ pprune->rti_map[i]);
+ }
else
{
int partidx = pprune->subpart_map[i];
@@ -2215,7 +2378,8 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
if (partidx >= 0)
find_matching_subplans_recurse(prunedata,
&prunedata->partrelprunedata[partidx],
- initial_prune, validsubplans);
+ initial_prune, validsubplans,
+ scan_leafpart_rtis);
else
{
/*
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 9df1f81ea8..639145abe9 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -119,6 +119,7 @@ CreateExecutorState(void)
estate->es_relations = NULL;
estate->es_rowmarks = NULL;
estate->es_plannedstmt = NULL;
+ estate->es_part_prune_result = NULL;
estate->es_junkFilter = NULL;
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index f9460ae506..a2182a6b1f 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -844,7 +844,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
else
dest = None_Receiver;
- es->qd = CreateQueryDesc(es->stmt,
+ es->qd = CreateQueryDesc(es->stmt, NULL,
fcache->src,
GetActiveSnapshot(),
InvalidSnapshot,
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 357e10a1d7..09f26658e2 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -94,6 +94,7 @@ static bool ExecAppendAsyncRequest(AppendState *node, TupleTableSlot **result);
static void ExecAppendAsyncEventWait(AppendState *node);
static void classify_matching_subplans(AppendState *node);
+
/* ----------------------------------------------------------------
* ExecInitAppend
*
@@ -134,7 +135,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
appendstate->as_begun = false;
/* If run-time partition pruning is enabled, then set that up now */
- if (node->part_prune_info != NULL)
+ if (node->part_prune_index >= 0)
{
PartitionPruneState *prunestate;
@@ -145,7 +146,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
*/
prunestate = ExecInitPartitionPruning(&appendstate->ps,
list_length(node->appendplans),
- node->part_prune_info,
+ node->part_prune_index,
&validsubplans);
appendstate->as_prune_state = prunestate;
nplans = bms_num_members(validsubplans);
@@ -155,7 +156,8 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
* subplan, we can fill as_valid_subplans immediately, preventing
* later calls to ExecFindMatchingSubPlans.
*/
- if (!prunestate->do_exec_prune && nplans > 0)
+ if (appendstate->as_prune_state == NULL ||
+ (!appendstate->as_prune_state->do_exec_prune && nplans > 0))
appendstate->as_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
}
else
@@ -577,7 +579,7 @@ choose_next_subplan_locally(AppendState *node)
}
else if (node->as_valid_subplans == NULL)
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
whichplan = -1;
}
@@ -642,7 +644,7 @@ choose_next_subplan_for_leader(AppendState *node)
if (node->as_valid_subplans == NULL)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
/*
* Mark each invalid plan as finished to allow the loop below to
@@ -717,7 +719,7 @@ choose_next_subplan_for_worker(AppendState *node)
else if (node->as_valid_subplans == NULL)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
mark_invalid_subplans_as_finished(node);
}
@@ -868,7 +870,7 @@ ExecAppendAsyncBegin(AppendState *node)
if (node->as_valid_subplans == NULL)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
classify_matching_subplans(node);
}
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index ecf9052e03..7708cfffda 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -82,7 +82,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
mergestate->ps.ExecProcNode = ExecMergeAppend;
/* If run-time partition pruning is enabled, then set that up now */
- if (node->part_prune_info != NULL)
+ if (node->part_prune_index >= 0)
{
PartitionPruneState *prunestate;
@@ -93,7 +93,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
*/
prunestate = ExecInitPartitionPruning(&mergestate->ps,
list_length(node->mergeplans),
- node->part_prune_info,
+ node->part_prune_index,
&validsubplans);
mergestate->ms_prune_state = prunestate;
nplans = bms_num_members(validsubplans);
@@ -103,7 +103,8 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
* subplan, we can fill as_valid_subplans immediately, preventing
* later calls to ExecFindMatchingSubPlans.
*/
- if (!prunestate->do_exec_prune && nplans > 0)
+ if (mergestate->ms_prune_state == NULL ||
+ (!mergestate->ms_prune_state->do_exec_prune && nplans > 0))
mergestate->ms_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
}
else
@@ -218,7 +219,7 @@ ExecMergeAppend(PlanState *pstate)
*/
if (node->ms_valid_subplans == NULL)
node->ms_valid_subplans =
- ExecFindMatchingSubPlans(node->ms_prune_state, false);
+ ExecFindMatchingSubPlans(node->ms_prune_state, false, NULL);
/*
* First time through: pull the first tuple from each valid subplan,
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 042a5f8b0a..d2ea2a8914 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1578,6 +1578,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
CachedPlanSource *plansource;
CachedPlan *cplan;
List *stmt_list;
+ List *part_prune_result_list;
char *query_string;
Snapshot snapshot;
MemoryContext oldcontext;
@@ -1659,6 +1660,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
/* Replan if needed, and increment plan refcount for portal */
cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv);
stmt_list = cplan->stmt_list;
+ part_prune_result_list = cplan->part_prune_result_list;
if (!plan->saved)
{
@@ -1670,6 +1672,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
*/
oldcontext = MemoryContextSwitchTo(portal->portalContext);
stmt_list = copyObject(stmt_list);
+ part_prune_result_list = copyObject(part_prune_result_list);
MemoryContextSwitchTo(oldcontext);
ReleaseCachedPlan(cplan, NULL);
cplan = NULL; /* portal shouldn't depend on cplan */
@@ -1683,6 +1686,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
query_string,
plansource->commandTag,
stmt_list,
+ part_prune_result_list,
cplan);
/*
@@ -2473,7 +2477,9 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
{
CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
List *stmt_list;
- ListCell *lc2;
+ List *part_prune_result_list;
+ ListCell *lc2,
+ *lc3;
spicallbackarg.query = plansource->query_string;
@@ -2552,6 +2558,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
plan_owner, _SPI_current->queryEnv);
stmt_list = cplan->stmt_list;
+ part_prune_result_list = cplan->part_prune_result_list;
/*
* If we weren't given a specific snapshot to use, and the statement
@@ -2589,9 +2596,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
}
}
- foreach(lc2, stmt_list)
+ forboth(lc2, stmt_list, lc3, part_prune_result_list)
{
PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+ PartitionPruneResult *part_prune_result = lfirst_node(PartitionPruneResult, lc3);
bool canSetTag = stmt->canSetTag;
DestReceiver *dest;
@@ -2663,7 +2671,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
else
snap = InvalidSnapshot;
- qdesc = CreateQueryDesc(stmt,
+ qdesc = CreateQueryDesc(stmt, part_prune_result,
plansource->query_string,
snap, crosscheck_snapshot,
dest,
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index d5760b1006..d2d86c9841 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -96,7 +96,10 @@ _copyPlannedStmt(const PlannedStmt *from)
COPY_SCALAR_FIELD(parallelModeNeeded);
COPY_SCALAR_FIELD(jitFlags);
COPY_NODE_FIELD(planTree);
+ COPY_NODE_FIELD(partPruneInfos);
+ COPY_SCALAR_FIELD(containsInitialPruning);
COPY_NODE_FIELD(rtable);
+ COPY_BITMAPSET_FIELD(minLockRelids);
COPY_NODE_FIELD(resultRelations);
COPY_NODE_FIELD(appendRelations);
COPY_NODE_FIELD(subplans);
@@ -253,7 +256,7 @@ _copyAppend(const Append *from)
COPY_NODE_FIELD(appendplans);
COPY_SCALAR_FIELD(nasyncplans);
COPY_SCALAR_FIELD(first_partial_plan);
- COPY_NODE_FIELD(part_prune_info);
+ COPY_SCALAR_FIELD(part_prune_index);
return newnode;
}
@@ -281,7 +284,7 @@ _copyMergeAppend(const MergeAppend *from)
COPY_POINTER_FIELD(sortOperators, from->numCols * sizeof(Oid));
COPY_POINTER_FIELD(collations, from->numCols * sizeof(Oid));
COPY_POINTER_FIELD(nullsFirst, from->numCols * sizeof(bool));
- COPY_NODE_FIELD(part_prune_info);
+ COPY_SCALAR_FIELD(part_prune_index);
return newnode;
}
@@ -1279,6 +1282,8 @@ _copyPartitionPruneInfo(const PartitionPruneInfo *from)
PartitionPruneInfo *newnode = makeNode(PartitionPruneInfo);
COPY_NODE_FIELD(prune_infos);
+ COPY_SCALAR_FIELD(needs_init_pruning);
+ COPY_SCALAR_FIELD(needs_exec_pruning);
COPY_BITMAPSET_FIELD(other_subplans);
return newnode;
@@ -1295,6 +1300,7 @@ _copyPartitionedRelPruneInfo(const PartitionedRelPruneInfo *from)
COPY_POINTER_FIELD(subplan_map, from->nparts * sizeof(int));
COPY_POINTER_FIELD(subpart_map, from->nparts * sizeof(int));
COPY_POINTER_FIELD(relid_map, from->nparts * sizeof(Oid));
+ COPY_POINTER_FIELD(rti_map, from->nparts * sizeof(Index));
COPY_NODE_FIELD(initial_pruning_steps);
COPY_NODE_FIELD(exec_pruning_steps);
COPY_BITMAPSET_FIELD(execparamids);
@@ -5468,6 +5474,21 @@ _copyExtensibleNode(const ExtensibleNode *from)
return newnode;
}
+/* ****************************************************************
+ * execnodes.h copy functions
+ * ****************************************************************
+ */
+static PartitionPruneResult *
+_copyPartitionPruneResult(const PartitionPruneResult *from)
+{
+ PartitionPruneResult *newnode = makeNode(PartitionPruneResult);
+
+ COPY_BITMAPSET_FIELD(scan_leafpart_rtis);
+ COPY_NODE_FIELD(valid_subplan_offs_list);
+
+ return newnode;
+}
+
/* ****************************************************************
* value.h copy functions
* ****************************************************************
@@ -5522,7 +5543,6 @@ _copyBitString(const BitString *from)
return newnode;
}
-
static ForeignKeyCacheInfo *
_copyForeignKeyCacheInfo(const ForeignKeyCacheInfo *from)
{
@@ -6564,6 +6584,13 @@ copyObjectImpl(const void *from)
retval = _copyPublicationTable(from);
break;
+ /*
+ * EXECUTION NODES
+ */
+ case T_PartitionPruneResult:
+ retval = _copyPartitionPruneResult(from);
+ break;
+
/*
* MISCELLANEOUS NODES
*/
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index abb1f787ef..96d305102d 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -314,7 +314,10 @@ _outPlannedStmt(StringInfo str, const PlannedStmt *node)
WRITE_BOOL_FIELD(parallelModeNeeded);
WRITE_INT_FIELD(jitFlags);
WRITE_NODE_FIELD(planTree);
+ WRITE_NODE_FIELD(partPruneInfos);
+ WRITE_BOOL_FIELD(containsInitialPruning);
WRITE_NODE_FIELD(rtable);
+ WRITE_BITMAPSET_FIELD(minLockRelids);
WRITE_NODE_FIELD(resultRelations);
WRITE_NODE_FIELD(appendRelations);
WRITE_NODE_FIELD(subplans);
@@ -443,7 +446,7 @@ _outAppend(StringInfo str, const Append *node)
WRITE_NODE_FIELD(appendplans);
WRITE_INT_FIELD(nasyncplans);
WRITE_INT_FIELD(first_partial_plan);
- WRITE_NODE_FIELD(part_prune_info);
+ WRITE_INT_FIELD(part_prune_index);
}
static void
@@ -460,7 +463,7 @@ _outMergeAppend(StringInfo str, const MergeAppend *node)
WRITE_OID_ARRAY(sortOperators, node->numCols);
WRITE_OID_ARRAY(collations, node->numCols);
WRITE_BOOL_ARRAY(nullsFirst, node->numCols);
- WRITE_NODE_FIELD(part_prune_info);
+ WRITE_INT_FIELD(part_prune_index);
}
static void
@@ -1005,6 +1008,8 @@ _outPartitionPruneInfo(StringInfo str, const PartitionPruneInfo *node)
WRITE_NODE_TYPE("PARTITIONPRUNEINFO");
WRITE_NODE_FIELD(prune_infos);
+ WRITE_BOOL_FIELD(needs_init_pruning);
+ WRITE_BOOL_FIELD(needs_exec_pruning);
WRITE_BITMAPSET_FIELD(other_subplans);
}
@@ -1019,6 +1024,7 @@ _outPartitionedRelPruneInfo(StringInfo str, const PartitionedRelPruneInfo *node)
WRITE_INT_ARRAY(subplan_map, node->nparts);
WRITE_INT_ARRAY(subpart_map, node->nparts);
WRITE_OID_ARRAY(relid_map, node->nparts);
+ WRITE_INDEX_ARRAY(rti_map, node->nparts);
WRITE_NODE_FIELD(initial_pruning_steps);
WRITE_NODE_FIELD(exec_pruning_steps);
WRITE_BITMAPSET_FIELD(execparamids);
@@ -2419,6 +2425,9 @@ _outPlannerGlobal(StringInfo str, const PlannerGlobal *node)
WRITE_NODE_FIELD(finalrowmarks);
WRITE_NODE_FIELD(resultRelations);
WRITE_NODE_FIELD(appendRelations);
+ WRITE_NODE_FIELD(partPruneInfos);
+ WRITE_BOOL_FIELD(containsInitialPruning);
+ WRITE_BITMAPSET_FIELD(minLockRelids);
WRITE_NODE_FIELD(relationOids);
WRITE_NODE_FIELD(invalItems);
WRITE_NODE_FIELD(paramExecTypes);
@@ -2486,6 +2495,7 @@ _outPlannerInfo(StringInfo str, const PlannerInfo *node)
WRITE_BITMAPSET_FIELD(curOuterRels);
WRITE_NODE_FIELD(curOuterParams);
WRITE_BOOL_FIELD(partColsUpdated);
+ WRITE_NODE_FIELD(partPruneInfos);
}
static void
@@ -2839,6 +2849,21 @@ _outExtensibleNode(StringInfo str, const ExtensibleNode *node)
methods->nodeOut(str, node);
}
+/*****************************************************************************
+ *
+ * Stuff from execnodes.h
+ *
+ *****************************************************************************/
+
+static void
+_outPartitionPruneResult(StringInfo str, const PartitionPruneResult *node)
+{
+ WRITE_NODE_TYPE("PARTITIONPRUNERESULT");
+
+ WRITE_BITMAPSET_FIELD(scan_leafpart_rtis);
+ WRITE_NODE_FIELD(valid_subplan_offs_list);
+}
+
/*****************************************************************************
*
* Stuff from parsenodes.h.
@@ -4747,6 +4772,13 @@ outNode(StringInfo str, const void *obj)
_outJsonTableSibling(str, obj);
break;
+ /*
+ * EXECUTION NODES
+ */
+ case T_PartitionPruneResult:
+ _outPartitionPruneResult(str, obj);
+ break;
+
default:
/*
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index e7d008b2c5..677ec055d6 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -164,6 +164,11 @@
token = pg_strtok(&length); /* skip :fldname */ \
local_node->fldname = readIntCols(len)
+/* Read an Index array */
+#define READ_INDEX_ARRAY(fldname, len) \
+ token = pg_strtok(&length); /* skip :fldname */ \
+ local_node->fldname = readIndexCols(len)
+
/* Read a bool array */
#define READ_BOOL_ARRAY(fldname, len) \
token = pg_strtok(&length); /* skip :fldname */ \
@@ -1814,7 +1819,10 @@ _readPlannedStmt(void)
READ_BOOL_FIELD(parallelModeNeeded);
READ_INT_FIELD(jitFlags);
READ_NODE_FIELD(planTree);
+ READ_NODE_FIELD(partPruneInfos);
+ READ_BOOL_FIELD(containsInitialPruning);
READ_NODE_FIELD(rtable);
+ READ_BITMAPSET_FIELD(minLockRelids);
READ_NODE_FIELD(resultRelations);
READ_NODE_FIELD(appendRelations);
READ_NODE_FIELD(subplans);
@@ -1946,7 +1954,7 @@ _readAppend(void)
READ_NODE_FIELD(appendplans);
READ_INT_FIELD(nasyncplans);
READ_INT_FIELD(first_partial_plan);
- READ_NODE_FIELD(part_prune_info);
+ READ_INT_FIELD(part_prune_index);
READ_DONE();
}
@@ -1968,7 +1976,7 @@ _readMergeAppend(void)
READ_OID_ARRAY(sortOperators, local_node->numCols);
READ_OID_ARRAY(collations, local_node->numCols);
READ_BOOL_ARRAY(nullsFirst, local_node->numCols);
- READ_NODE_FIELD(part_prune_info);
+ READ_INT_FIELD(part_prune_index);
READ_DONE();
}
@@ -2762,6 +2770,8 @@ _readPartitionPruneInfo(void)
READ_LOCALS(PartitionPruneInfo);
READ_NODE_FIELD(prune_infos);
+ READ_BOOL_FIELD(needs_init_pruning);
+ READ_BOOL_FIELD(needs_exec_pruning);
READ_BITMAPSET_FIELD(other_subplans);
READ_DONE();
@@ -2778,6 +2788,7 @@ _readPartitionedRelPruneInfo(void)
READ_INT_ARRAY(subplan_map, local_node->nparts);
READ_INT_ARRAY(subpart_map, local_node->nparts);
READ_OID_ARRAY(relid_map, local_node->nparts);
+ READ_INDEX_ARRAY(rti_map, local_node->nparts);
READ_NODE_FIELD(initial_pruning_steps);
READ_NODE_FIELD(exec_pruning_steps);
READ_BITMAPSET_FIELD(execparamids);
@@ -2931,6 +2942,21 @@ _readPartitionRangeDatum(void)
READ_DONE();
}
+
+/*
+ * _readPartitionPruneResult
+ */
+static PartitionPruneResult *
+_readPartitionPruneResult(void)
+{
+ READ_LOCALS(PartitionPruneResult);
+
+ READ_BITMAPSET_FIELD(scan_leafpart_rtis);
+ READ_NODE_FIELD(valid_subplan_offs_list);
+
+ READ_DONE();
+}
+
/*
* parseNodeString
*
@@ -3228,6 +3254,8 @@ parseNodeString(void)
return_value = _readJsonTableParent();
else if (MATCH("JSONTABSNODE", 12))
return_value = _readJsonTableSibling();
+ else if (MATCH("PARTITIONPRUNERESULT", 20))
+ return_value = _readPartitionPruneResult();
else
{
elog(ERROR, "badly formatted node string \"%.32s\"...", token);
@@ -3371,6 +3399,30 @@ readIntCols(int numCols)
return int_vals;
}
+/*
+ * readIndexCols
+ */
+Index *
+readIndexCols(int numCols)
+{
+ int tokenLength,
+ i;
+ const char *token;
+ Index *index_vals;
+
+ if (numCols <= 0)
+ return NULL;
+
+ index_vals = (Index *) palloc(numCols * sizeof(Index));
+ for (i = 0; i < numCols; i++)
+ {
+ token = pg_strtok(&tokenLength);
+ index_vals[i] = atoui(token);
+ }
+
+ return index_vals;
+}
+
/*
* readBoolCols
*/
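Note on the READ_INDEX_ARRAY / readIndexCols() addition above: it mirrors readIntCols(), but produces an array of unsigned Index values for rti_map. The following is only a minimal standalone sketch of that parsing step, not the patch's code. The name read_index_cols() is hypothetical, and strtoul() and malloc() stand in for the pg_strtok()/atoui() and palloc() calls used in readfuncs.c.

```c
#include <assert.h>
#include <stdlib.h>

typedef unsigned int Index;		/* same underlying type as PostgreSQL's Index */

/*
 * Hypothetical stand-in for readIndexCols(): parse numCols unsigned
 * integers from a whitespace-separated string.  Returns NULL when
 * numCols <= 0, as the function in the patch does.
 */
static Index *
read_index_cols(const char *input, int numCols)
{
	Index	   *vals;
	const char *p = input;
	int			i;

	if (numCols <= 0)
		return NULL;

	vals = malloc((size_t) numCols * sizeof(Index));
	assert(vals != NULL);

	for (i = 0; i < numCols; i++)
	{
		char	   *end;

		/* strtoul() skips leading whitespace before each number */
		vals[i] = (Index) strtoul(p, &end, 10);
		assert(end != p);		/* sketch only: every column must be present */
		p = end;
	}

	return vals;
}
```

A zero entry is meaningful here: the setrefs.c part of the patch only offsets rti_map entries that are greater than zero, so the reader must preserve zeros as-is.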
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index 179c87c671..2f9260abed 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -1336,7 +1336,15 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
plan->appendplans = subplans;
plan->nasyncplans = nasyncplans;
plan->first_partial_plan = best_path->first_partial_path;
- plan->part_prune_info = partpruneinfo;
+
+ if (partpruneinfo)
+ {
+ root->partPruneInfos = lappend(root->partPruneInfos, partpruneinfo);
+ /* Will be updated later in set_plan_references(). */
+ plan->part_prune_index = list_length(root->partPruneInfos) - 1;
+ }
+ else
+ plan->part_prune_index = -1;
copy_generic_path_info(&plan->plan, (Path *) best_path);
@@ -1498,7 +1506,15 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
}
node->mergeplans = subplans;
- node->part_prune_info = partpruneinfo;
+
+ if (partpruneinfo)
+ {
+ root->partPruneInfos = lappend(root->partPruneInfos, partpruneinfo);
+ /* Will be updated later in set_plan_references(). */
+ node->part_prune_index = list_length(root->partPruneInfos) - 1;
+ }
+ else
+ node->part_prune_index = -1;
/*
* If prepare_sort_from_pathkeys added sort columns, but we were told to
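Note on part_prune_index as set above: the value stored here is an index into the current PlannerInfo's partPruneInfos list, and it only becomes an index into the global list once set_plan_references() adds the number of entries already present in PlannerGlobal. A minimal sketch of that two-step indexing follows, under the assumption that each subquery's entries are appended to the global list after its plan nodes have been adjusted. InfoList, list_append(), globalize_index() and list_concat_into() are hypothetical names, not PostgreSQL APIs.

```c
#include <assert.h>

#define MAX_INFOS 16

/* Toy fixed-size list standing in for a List of PartitionPruneInfos. */
typedef struct InfoList
{
	const char *items[MAX_INFOS];
	int			len;
} InfoList;

/* Append an item and return its zero-based index in the list. */
static int
list_append(InfoList *list, const char *item)
{
	assert(list->len < MAX_INFOS);
	list->items[list->len] = item;
	return list->len++;
}

/*
 * Turn a subquery-local index into a global one by adding the number of
 * entries already in the global list.  -1 ("no pruning info") stays -1.
 */
static int
globalize_index(int local_index, const InfoList *global)
{
	return (local_index >= 0) ? local_index + global->len : -1;
}

/* Append all of a subquery's entries to the end of the global list. */
static void
list_concat_into(InfoList *global, const InfoList *local)
{
	for (int i = 0; i < local->len; i++)
		list_append(global, local->items[i]);
}
```

The order matters in this sketch: indexes are globalized before the subquery's entries are concatenated, so the offset equals the number of entries contributed by previously processed subqueries.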
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index b2569c5d0c..2aa051d862 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -518,7 +518,10 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
result->dependsOnRole = glob->dependsOnRole;
result->parallelModeNeeded = glob->parallelModeNeeded;
result->planTree = top_plan;
+ result->partPruneInfos = glob->partPruneInfos;
+ result->containsInitialPruning = glob->containsInitialPruning;
result->rtable = glob->finalrtable;
+ result->minLockRelids = glob->minLockRelids;
result->resultRelations = glob->resultRelations;
result->appendRelations = glob->appendRelations;
result->subplans = glob->subplans;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index bf4c722c02..8d9ab2c74d 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -252,7 +252,7 @@ set_plan_references(PlannerInfo *root, Plan *plan)
Plan *result;
PlannerGlobal *glob = root->glob;
int rtoffset = list_length(glob->finalrtable);
- ListCell *lc;
+ ListCell *lc;
/*
* Add all the query's RTEs to the flattened rangetable. The live ones
@@ -261,6 +261,16 @@ set_plan_references(PlannerInfo *root, Plan *plan)
*/
add_rtes_to_flat_rtable(root, false);
+ /*
+ * Add the query's adjusted range of RT indexes to glob->minLockRelids.
+ * The adjusted RT indexes of prunable relations will be deleted from the
+ * set below where PartitionPruneInfos are processed.
+ */
+ glob->minLockRelids =
+ bms_add_range(glob->minLockRelids,
+ rtoffset + 1,
+ rtoffset + list_length(root->parse->rtable));
+
/*
* Adjust RT indexes of PlanRowMarks and add to final rowmarks list
*/
@@ -339,6 +349,56 @@ set_plan_references(PlannerInfo *root, Plan *plan)
}
}
+ /* Also fix up the information in PartitionPruneInfos. */
+ foreach (lc, root->partPruneInfos)
+ {
+ PartitionPruneInfo *pruneinfo = lfirst(lc);
+ Bitmapset *leafpart_rtis = NULL;
+ ListCell *l;
+
+ foreach(l, pruneinfo->prune_infos)
+ {
+ List *prune_infos = lfirst(l);
+ ListCell *l2;
+
+ foreach(l2, prune_infos)
+ {
+ PartitionedRelPruneInfo *pinfo = lfirst(l2);
+ int i;
+
+ /* RT index of the partitioned table. */
+ pinfo->rtindex += rtoffset;
+
+ /* And also those of the leaf partitions. */
+ for (i = 0; i < pinfo->nparts; i++)
+ {
+ if (pinfo->rti_map[i] > 0)
+ {
+ pinfo->rti_map[i] += rtoffset;
+ leafpart_rtis = bms_add_member(leafpart_rtis,
+ pinfo->rti_map[i]);
+ }
+ }
+ }
+ }
+
+ if (pruneinfo->needs_init_pruning)
+ {
+ glob->containsInitialPruning = true;
+
+ /*
+ * Delete the leaf partition RTIs from the global set of relations
+ * to be locked before executing the plan. AcquireExecutorLocks()
+ * will find the ones to add to the set after performing initial
+ * pruning.
+ */
+ glob->minLockRelids = bms_del_members(glob->minLockRelids,
+ leafpart_rtis);
+ }
+
+ glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
+ }
+
return result;
}
@@ -1596,21 +1656,12 @@ set_append_references(PlannerInfo *root,
aplan->apprelids = offset_relid_set(aplan->apprelids, rtoffset);
- if (aplan->part_prune_info)
- {
- foreach(l, aplan->part_prune_info->prune_infos)
- {
- List *prune_infos = lfirst(l);
- ListCell *l2;
-
- foreach(l2, prune_infos)
- {
- PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
- pinfo->rtindex += rtoffset;
- }
- }
- }
+ /*
+ * PartitionPruneInfos will be added to a list in PlannerGlobal, so update
+ * the index.
+ */
+ if (aplan->part_prune_index >= 0)
+ aplan->part_prune_index += list_length(root->glob->partPruneInfos);
/* We don't need to recurse to lefttree or righttree ... */
Assert(aplan->plan.lefttree == NULL);
@@ -1668,21 +1719,12 @@ set_mergeappend_references(PlannerInfo *root,
mplan->apprelids = offset_relid_set(mplan->apprelids, rtoffset);
- if (mplan->part_prune_info)
- {
- foreach(l, mplan->part_prune_info->prune_infos)
- {
- List *prune_infos = lfirst(l);
- ListCell *l2;
-
- foreach(l2, prune_infos)
- {
- PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
- pinfo->rtindex += rtoffset;
- }
- }
- }
+ /*
+ * PartitionPruneInfos will be added to a list in PlannerGlobal, so update
+ * the index.
+ */
+ if (mplan->part_prune_index >= 0)
+ mplan->part_prune_index += list_length(root->glob->partPruneInfos);
/* We don't need to recurse to lefttree or righttree ... */
Assert(mplan->plan.lefttree == NULL);
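Note on minLockRelids as computed above: the set starts out as the full range of the query's adjusted RT indexes, and leaf partitions that are subject to initial pruning are then removed, to be added back at lock time only if they survive pruning. A minimal sketch of that set arithmetic follows, using a 64-bit mask as a toy stand-in for Bitmapset (so RT indexes 1..63 only). All names are hypothetical and the real code uses bms_add_range(), bms_del_members() and bms_union().

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t RelidSet;		/* toy Bitmapset: bit i set means RTI i is a member */

static RelidSet
set_add_member(RelidSet s, int member)
{
	assert(member >= 1 && member <= 63);
	return s | (UINT64_C(1) << member);
}

static RelidSet
set_add_range(RelidSet s, int lower, int upper)
{
	for (int i = lower; i <= upper; i++)
		s = set_add_member(s, i);
	return s;
}

/*
 * Plan time: every RT index of the query, minus the leaf partitions of any
 * PartitionPruneInfo that has initial pruning steps.
 */
static RelidSet
compute_min_lock_relids(int rtoffset, int nrtes,
						RelidSet leafpart_rtis, int needs_init_pruning)
{
	RelidSet	min_lock = set_add_range(0, rtoffset + 1, rtoffset + nrtes);

	if (needs_init_pruning)
		min_lock &= ~leafpart_rtis;
	return min_lock;
}

/*
 * Lock time: add back only the leaf partitions that survived initial
 * pruning (scan_leafpart_rtis in the patch).
 */
static RelidSet
compute_lock_relids(RelidSet min_lock, RelidSet scan_leafpart_rtis)
{
	return min_lock | scan_leafpart_rtis;
}
```

For example, with a partitioned parent at RTI 1 and leaf partitions at RTIs 2..4, the minimum set is {1}. If pruning keeps only partition 3, the relations to lock are {1, 3}.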
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 9d3c05aed3..0eaff15ed0 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -144,7 +144,9 @@ static List *make_partitionedrel_pruneinfo(PlannerInfo *root,
List *prunequal,
Bitmapset *partrelids,
int *relid_subplan_map,
- Bitmapset **matchedsubplans);
+ Bitmapset **matchedsubplans,
+ bool *needs_init_pruning,
+ bool *needs_exec_pruning);
static void gen_partprune_steps(RelOptInfo *rel, List *clauses,
PartClauseTarget target,
GeneratePruningStepsContext *context);
@@ -230,6 +232,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int *relid_subplan_map;
ListCell *lc;
int i;
+ bool needs_init_pruning = false;
+ bool needs_exec_pruning = false;
/*
* Scan the subpaths to see which ones are scans of partition child
@@ -309,12 +313,16 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
Bitmapset *partrelids = (Bitmapset *) lfirst(lc);
List *pinfolist;
Bitmapset *matchedsubplans = NULL;
+ bool partrel_needs_init_pruning;
+ bool partrel_needs_exec_pruning;
pinfolist = make_partitionedrel_pruneinfo(root, parentrel,
prunequal,
partrelids,
relid_subplan_map,
- &matchedsubplans);
+ &matchedsubplans,
+ &partrel_needs_init_pruning,
+ &partrel_needs_exec_pruning);
/* When pruning is possible, record the matched subplans */
if (pinfolist != NIL)
@@ -323,6 +331,10 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
allmatchedsubplans = bms_join(matchedsubplans,
allmatchedsubplans);
}
+ if (!needs_init_pruning)
+ needs_init_pruning = partrel_needs_init_pruning;
+ if (!needs_exec_pruning)
+ needs_exec_pruning = partrel_needs_exec_pruning;
}
pfree(relid_subplan_map);
@@ -337,6 +349,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
/* Else build the result data structure */
pruneinfo = makeNode(PartitionPruneInfo);
pruneinfo->prune_infos = prunerelinfos;
+ pruneinfo->needs_init_pruning = needs_init_pruning;
+ pruneinfo->needs_exec_pruning = needs_exec_pruning;
/*
* Some subplans may not belong to any of the identified partitioned rels.
@@ -435,13 +449,18 @@ add_part_relids(List *allpartrelids, Bitmapset *partrelids)
* If we cannot find any useful run-time pruning steps, return NIL.
* However, on success, each rel identified in partrelids will have
* an element in the result list, even if some of them are useless.
+ * *needs_init_pruning and *needs_exec_pruning are set to indicate whether the
+ * returned PartitionedRelPruneInfos contain pruning steps that can be
+ * performed before and after execution begins, respectively.
*/
static List *
make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
List *prunequal,
Bitmapset *partrelids,
int *relid_subplan_map,
- Bitmapset **matchedsubplans)
+ Bitmapset **matchedsubplans,
+ bool *needs_init_pruning,
+ bool *needs_exec_pruning)
{
RelOptInfo *targetpart = NULL;
List *pinfolist = NIL;
@@ -452,6 +471,10 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int rti;
int i;
+ /* Will find out below. */
+ *needs_init_pruning = false;
+ *needs_exec_pruning = false;
+
/*
* Examine each partitioned rel, constructing a temporary array to map
* from planner relids to index of the partitioned rel, and building a
@@ -539,6 +562,9 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
* executor per-scan pruning steps. This first pass creates startup
* pruning steps and detects whether there's any possibly-useful quals
* that would require per-scan pruning.
+ *
+ * In the first pass, we note whether the 2nd pass is necessary by
+ * checking for the presence of EXEC parameters.
*/
gen_partprune_steps(subpart, partprunequal, PARTTARGET_INITIAL,
&context);
@@ -613,6 +639,11 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
pinfo->execparamids = execparamids;
/* Remaining fields will be filled in the next loop */
+ if (!*needs_init_pruning)
+ *needs_init_pruning = (initial_pruning_steps != NIL);
+ if (!*needs_exec_pruning)
+ *needs_exec_pruning = (exec_pruning_steps != NIL);
+
pinfolist = lappend(pinfolist, pinfo);
}
@@ -640,6 +671,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int *subplan_map;
int *subpart_map;
Oid *relid_map;
+ Index *rti_map;
/*
* Construct the subplan and subpart maps for this partitioning level.
@@ -652,6 +684,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
subpart_map = (int *) palloc(nparts * sizeof(int));
memset(subpart_map, -1, nparts * sizeof(int));
relid_map = (Oid *) palloc0(nparts * sizeof(Oid));
+ rti_map = (Index *) palloc0(nparts * sizeof(Index));
present_parts = NULL;
i = -1;
@@ -666,6 +699,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
subplan_map[i] = subplanidx = relid_subplan_map[partrel->relid] - 1;
subpart_map[i] = subpartidx = relid_subpart_map[partrel->relid] - 1;
relid_map[i] = planner_rt_fetch(partrel->relid, root)->relid;
+ rti_map[i] = partrel->relid;
if (subplanidx >= 0)
{
present_parts = bms_add_member(present_parts, i);
@@ -690,6 +724,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
pinfo->subplan_map = subplan_map;
pinfo->subpart_map = subpart_map;
pinfo->relid_map = relid_map;
+ pinfo->rti_map = rti_map;
}
pfree(relid_subpart_map);
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index ba2fcfeb4a..fecffdba65 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -945,15 +945,17 @@ pg_plan_query(Query *querytree, const char *query_string, int cursorOptions,
* For normal optimizable statements, invoke the planner. For utility
* statements, just make a wrapper PlannedStmt node.
*
- * The result is a list of PlannedStmt nodes.
+ * The result is a list of PlannedStmt nodes. Also, a NULL is appended to
+ * *part_prune_result_list for each PlannedStmt added to the returned list.
*/
List *
pg_plan_queries(List *querytrees, const char *query_string, int cursorOptions,
- ParamListInfo boundParams)
+ ParamListInfo boundParams, List **part_prune_result_list)
{
List *stmt_list = NIL;
ListCell *query_list;
+ *part_prune_result_list = NIL;
foreach(query_list, querytrees)
{
Query *query = lfirst_node(Query, query_list);
@@ -977,6 +979,7 @@ pg_plan_queries(List *querytrees, const char *query_string, int cursorOptions,
}
stmt_list = lappend(stmt_list, stmt);
+ *part_prune_result_list = lappend(*part_prune_result_list, NULL);
}
return stmt_list;
@@ -1080,7 +1083,8 @@ exec_simple_query(const char *query_string)
QueryCompletion qc;
MemoryContext per_parsetree_context = NULL;
List *querytree_list,
- *plantree_list;
+ *plantree_list,
+ *plantree_part_prune_result_list;
Portal portal;
DestReceiver *receiver;
int16 format;
@@ -1167,7 +1171,8 @@ exec_simple_query(const char *query_string)
NULL, 0, NULL);
plantree_list = pg_plan_queries(querytree_list, query_string,
- CURSOR_OPT_PARALLEL_OK, NULL);
+ CURSOR_OPT_PARALLEL_OK, NULL,
+ &plantree_part_prune_result_list);
/*
* Done with the snapshot used for parsing/planning.
@@ -1203,6 +1208,7 @@ exec_simple_query(const char *query_string)
query_string,
commandTag,
plantree_list,
+ plantree_part_prune_result_list,
NULL);
/*
@@ -1991,6 +1997,7 @@ exec_bind_message(StringInfo input_message)
query_string,
psrc->commandTag,
cplan->stmt_list,
+ cplan->part_prune_result_list,
cplan);
/* Done with the snapshot used for parameter I/O and parsing/planning */
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 5aa5a350f3..fcba303b53 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -35,7 +35,7 @@
Portal ActivePortal = NULL;
-static void ProcessQuery(PlannedStmt *plan,
+static void ProcessQuery(PlannedStmt *plan, PartitionPruneResult *part_prune_result,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -65,6 +65,7 @@ static void DoPortalRewind(Portal portal);
*/
QueryDesc *
CreateQueryDesc(PlannedStmt *plannedstmt,
+ PartitionPruneResult *part_prune_result,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
@@ -77,6 +78,8 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
qd->operation = plannedstmt->commandType; /* operation */
qd->plannedstmt = plannedstmt; /* plan */
+ qd->part_prune_result = part_prune_result; /* ExecutorDoInitialPruning()
+ * output for plan */
qd->sourceText = sourceText; /* query text */
qd->snapshot = RegisterSnapshot(snapshot); /* snapshot */
/* RI check snapshot */
@@ -122,6 +125,7 @@ FreeQueryDesc(QueryDesc *qdesc)
* PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
*
* plan: the plan tree for the query
+ * part_prune_result: ExecutorDoInitialPruning() output for the plan tree
* sourceText: the source text of the query
* params: any parameters needed
* dest: where to send results
@@ -134,6 +138,7 @@ FreeQueryDesc(QueryDesc *qdesc)
*/
static void
ProcessQuery(PlannedStmt *plan,
+ PartitionPruneResult *part_prune_result,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -145,7 +150,7 @@ ProcessQuery(PlannedStmt *plan,
/*
* Create the QueryDesc object
*/
- queryDesc = CreateQueryDesc(plan, sourceText,
+ queryDesc = CreateQueryDesc(plan, part_prune_result, sourceText,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
@@ -493,6 +498,7 @@ PortalStart(Portal portal, ParamListInfo params,
* the destination to DestNone.
*/
queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+ linitial_node(PartitionPruneResult, portal->part_prune_results),
portal->sourceText,
GetActiveSnapshot(),
InvalidSnapshot,
@@ -1193,7 +1199,8 @@ PortalRunMulti(Portal portal,
QueryCompletion *qc)
{
bool active_snapshot_set = false;
- ListCell *stmtlist_item;
+ ListCell *stmtlist_item,
+ *part_prune_results_item;
/*
* If the destination is DestRemoteExecute, change to DestNone. The
@@ -1214,9 +1221,12 @@ PortalRunMulti(Portal portal,
* Loop to handle the individual queries generated from a single parsetree
* by analysis and rewrite.
*/
- foreach(stmtlist_item, portal->stmts)
+ forboth(stmtlist_item, portal->stmts,
+ part_prune_results_item, portal->part_prune_results)
{
PlannedStmt *pstmt = lfirst_node(PlannedStmt, stmtlist_item);
+ PartitionPruneResult *part_prune_result = lfirst_node(PartitionPruneResult,
+ part_prune_results_item);
/*
* If we got a cancel signal in prior command, quit
@@ -1274,7 +1284,7 @@ PortalRunMulti(Portal portal,
if (pstmt->canSetTag)
{
/* statement can set tag string */
- ProcessQuery(pstmt,
+ ProcessQuery(pstmt, part_prune_result,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
@@ -1283,7 +1293,7 @@ PortalRunMulti(Portal portal,
else
{
/* stmt added by rewrite cannot set tag */
- ProcessQuery(pstmt,
+ ProcessQuery(pstmt, part_prune_result,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 4cf6db504f..80564dd874 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -99,14 +99,16 @@ static dlist_head cached_expression_list = DLIST_STATIC_INIT(cached_expression_l
static void ReleaseGenericPlan(CachedPlanSource *plansource);
static List *RevalidateCachedQuery(CachedPlanSource *plansource,
QueryEnvironment *queryEnv);
-static bool CheckCachedPlan(CachedPlanSource *plansource);
+static bool CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams);
static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
ParamListInfo boundParams, QueryEnvironment *queryEnv);
+static void CachedPlanSavePartitionPruneResults(CachedPlan *plan, List *part_prune_result_list);
static bool choose_custom_plan(CachedPlanSource *plansource,
ParamListInfo boundParams);
static double cached_plan_cost(CachedPlan *plan, bool include_planner);
static Query *QueryListGetPrimaryStmt(List *stmts);
-static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static List *AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams);
+static void ReleaseExecutorLocks(List *stmt_list, List *part_prune_result_list);
static void AcquirePlannerLocks(List *stmt_list, bool acquire);
static void ScanQueryForLocks(Query *parsetree, bool acquire);
static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -790,9 +792,21 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
*
* On a "true" return, we have acquired the locks needed to run the plan.
* (We must do this for the "true" result to be race-condition-free.)
+ *
+ * If the CachedPlan is valid, this may in some cases call
+ * ExecutorDoInitialPruning() on each PlannedStmt contained in it to determine
+ * the set of relations to be locked by AcquireExecutorLocks(), instead of just
+ * scanning its range table. Initial partition pruning identifies the nodes in
+ * the tree that need not be executed, so the relations they scan are skipped.
+ * The result of pruning, a list with one PartitionPruneResult (or NULL) per
+ * PlannedStmt, is copied into a child context of the context containing the
+ * plan itself and saved in plan->part_prune_result_list. The previous contents
+ * of the list from the last invocation on the same CachedPlan are deleted,
+ * because they would no longer be valid given the fresh set of parameter
+ * values, which may be used as pruning parameters.
*/
static bool
-CheckCachedPlan(CachedPlanSource *plansource)
+CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams)
{
CachedPlan *plan = plansource->gplan;
@@ -820,13 +834,24 @@ CheckCachedPlan(CachedPlanSource *plansource)
*/
if (plan->is_valid)
{
+ List *part_prune_result_list;
+
/*
* Plan must have positive refcount because it is referenced by
* plansource; so no need to fear it disappears under us here.
*/
Assert(plan->refcount > 0);
- AcquireExecutorLocks(plan->stmt_list, true);
+ /*
+ * Lock relations scanned by the plan. If ExecutorDoInitialPruning()
+ * asked to omit some relations because the plan nodes that scan them
+ * were found to be pruned, the executor will be informed of the
+ * omission of the plan nodes themselves via part_prune_result_list
+ * that is passed to it along with the list of PlannedStmts, so that
+ * it doesn't accidentally try to execute those nodes.
+ */
+ part_prune_result_list = AcquireExecutorLocks(plan->stmt_list,
+ boundParams);
/*
* If plan was transient, check to see if TransactionXmin has
@@ -844,11 +869,14 @@ CheckCachedPlan(CachedPlanSource *plansource)
if (plan->is_valid)
{
/* Successfully revalidated and locked the query. */
+
+ /* Remember pruning results in the CachedPlan. */
+ CachedPlanSavePartitionPruneResults(plan, part_prune_result_list);
return true;
}
/* Oops, the race case happened. Release useless locks. */
- AcquireExecutorLocks(plan->stmt_list, false);
+ ReleaseExecutorLocks(plan->stmt_list, part_prune_result_list);
}
/*
@@ -880,7 +908,8 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
ParamListInfo boundParams, QueryEnvironment *queryEnv)
{
CachedPlan *plan;
- List *plist;
+ List *plist,
+ *part_prune_result_list;
bool snapshot_set;
bool is_transient;
MemoryContext plan_context;
@@ -933,7 +962,8 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
* Generate the plan.
*/
plist = pg_plan_queries(qlist, plansource->query_string,
- plansource->cursor_options, boundParams);
+ plansource->cursor_options, boundParams,
+ &part_prune_result_list);
/* Release snapshot if we got one */
if (snapshot_set)
@@ -1002,6 +1032,16 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
plan->is_saved = false;
plan->is_valid = true;
+ /*
+ * Save a dummy part_prune_result_list, that is, a list containing only
+ * NULLs. We must do this because users of the CachedPlan expect one to
+ * go with the list of PlannedStmts.
+ * XXX maybe get rid of that contract.
+ */
+ plan->part_prune_result_list_context = NULL;
+ CachedPlanSavePartitionPruneResults(plan, part_prune_result_list);
+ Assert(MemoryContextIsValid(plan->part_prune_result_list_context));
+
/* assign generation number to new plan */
plan->generation = ++(plansource->generation);
@@ -1160,7 +1200,7 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
if (!customplan)
{
- if (CheckCachedPlan(plansource))
+ if (CheckCachedPlan(plansource, boundParams))
{
/* We want a generic plan, and we already have a valid one */
plan = plansource->gplan;
@@ -1586,6 +1626,49 @@ CopyCachedPlan(CachedPlanSource *plansource)
return newsource;
}
+/*
+ * CachedPlanSavePartitionPruneResults
+ * Save the list containing PartitionPruneResult nodes into the given
+ * CachedPlan
+ *
+ * The provided list is copied into a dedicated context that is a child of
+ * plan->context. If the child context already exists, it is emptied, because
+ * any PartitionPruneResult contained therein would no longer be useful.
+ */
+static void
+CachedPlanSavePartitionPruneResults(CachedPlan *plan, List *part_prune_result_list)
+{
+ MemoryContext part_prune_result_list_context = plan->part_prune_result_list_context,
+ oldcontext = CurrentMemoryContext;
+ List *part_prune_result_list_copy;
+
+ /*
+ * Set up the dedicated context if not already done, saving it as a child
+ * of the CachedPlan's context.
+ */
+ if (part_prune_result_list_context == NULL)
+ {
+ part_prune_result_list_context = AllocSetContextCreate(CurrentMemoryContext,
+ "CachedPlan part_prune_results list",
+ ALLOCSET_START_SMALL_SIZES);
+ MemoryContextSetParent(part_prune_result_list_context, plan->context);
+ MemoryContextSetIdentifier(part_prune_result_list_context, plan->context->ident);
+ plan->part_prune_result_list_context = part_prune_result_list_context;
+ }
+ else
+ {
+ /* Just clear existing contents by resetting the context. */
+ Assert(MemoryContextIsValid(part_prune_result_list_context));
+ MemoryContextReset(part_prune_result_list_context);
+ }
+
+ MemoryContextSwitchTo(part_prune_result_list_context);
+ part_prune_result_list_copy = copyObject(part_prune_result_list);
+ MemoryContextSwitchTo(oldcontext);
+
+ plan->part_prune_result_list = part_prune_result_list_copy;
+}
+
/*
* CachedPlanIsValid: test whether the rewritten querytree within a
* CachedPlanSource is currently valid (that is, not marked as being in need
@@ -1737,17 +1820,21 @@ QueryListGetPrimaryStmt(List *stmts)
/*
* AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
- * or release them if acquire is false.
+ *
+ * Returns a list containing one element for each PlannedStmt in stmt_list:
+ * a PartitionPruneResult node, or NULL if the statement is a utility statement
+ * or its containsInitialPruning is false.
*/
-static void
-AcquireExecutorLocks(List *stmt_list, bool acquire)
+static List *
+AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams)
{
ListCell *lc1;
+ List *part_prune_result_list = NIL;
foreach(lc1, stmt_list)
{
PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
- ListCell *lc2;
+ PartitionPruneResult *part_prune_result = NULL;
if (plannedstmt->commandType == CMD_UTILITY)
{
@@ -1761,27 +1848,122 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
Query *query = UtilityContainsQuery(plannedstmt->utilityStmt);
if (query)
- ScanQueryForLocks(query, acquire);
- continue;
+ ScanQueryForLocks(query, true);
}
-
- foreach(lc2, plannedstmt->rtable)
+ else
{
- RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
-
- if (rte->rtekind != RTE_RELATION)
- continue;
+ Bitmapset *lockRelids;
+ int rti;
/*
- * Acquire the appropriate type of lock on each relation OID. Note
- * that we don't actually try to open the rel, and hence will not
- * fail if it's been dropped entirely --- we'll just transiently
- * acquire a non-conflicting lock.
+ * Figure out the set of relations that would need to be locked
+ * before executing the plan.
*/
- if (acquire)
+ if (plannedstmt->containsInitialPruning)
+ {
+ /*
+ * Obtain the set of partitions to be locked from the
+ * PartitionPruneInfos by considering the result of performing
+ * initial partition pruning.
+ */
+ part_prune_result =
+ ExecutorDoInitialPruning(plannedstmt, boundParams);
+
+ lockRelids = bms_union(plannedstmt->minLockRelids,
+ part_prune_result->scan_leafpart_rtis);
+ }
+ else
+ lockRelids = plannedstmt->minLockRelids;
+
+ rti = -1;
+ while ((rti = bms_next_member(lockRelids, rti)) >= 0)
+ {
+ RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
+
+ if (rte->rtekind != RTE_RELATION)
+ continue;
+
+ /*
+ * Acquire the appropriate type of lock on each relation OID.
+ * Note that we don't actually try to open the rel, and hence
+ * will not fail if it's been dropped entirely --- we'll just
+ * transiently acquire a non-conflicting lock.
+ */
LockRelationOid(rte->relid, rte->rellockmode);
+ }
+ }
+
+ /*
+ * Remember PartitionPruneResult for later adding to the QueryDesc that
+ * will be passed to the executor when executing this plan. May be
+ * NULL, but must keep the list the same length as stmt_list.
+ */
+ part_prune_result_list = lappend(part_prune_result_list,
+ part_prune_result);
+ }
+
+ return part_prune_result_list;
+}
+
+/*
+ * ReleaseExecutorLocks
+ * Release locks that would've been acquired by an earlier call to
+ * AcquireExecutorLocks()
+ */
+static void
+ReleaseExecutorLocks(List *stmt_list, List *part_prune_result_list)
+{
+ ListCell *lc1,
+ *lc2;
+
+ forboth(lc1, stmt_list, lc2, part_prune_result_list)
+ {
+ PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
+ PartitionPruneResult *part_prune_result = lfirst_node(PartitionPruneResult, lc2);
+
+ if (plannedstmt->commandType == CMD_UTILITY)
+ {
+ /*
+ * Ignore utility statements, except those (such as EXPLAIN) that
+ * contain a parsed-but-not-planned query. Note: it's okay to use
+ * ScanQueryForLocks, even though the query hasn't been through
+ * rule rewriting, because rewriting doesn't change the query
+ * representation.
+ */
+ Query *query = UtilityContainsQuery(plannedstmt->utilityStmt);
+
+ if (query)
+ ScanQueryForLocks(query, false);
+ }
+ else
+ {
+ Bitmapset *lockRelids;
+ int rti;
+
+ if (part_prune_result == NULL)
+ {
+ Assert(!plannedstmt->containsInitialPruning);
+ lockRelids = plannedstmt->minLockRelids;
+ }
else
+ {
+ Assert(plannedstmt->containsInitialPruning);
+ lockRelids = bms_union(plannedstmt->minLockRelids,
+ part_prune_result->scan_leafpart_rtis);
+ }
+
+ rti = -1;
+ while ((rti = bms_next_member(lockRelids, rti)) >= 0)
+ {
+ RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
+
+ if (rte->rtekind != RTE_RELATION)
+ continue;
+
+ /* See the comment in AcquireExecutorLocks(). */
UnlockRelationOid(rte->relid, rte->rellockmode);
+ }
+
}
}
}
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index d549f66d4a..4705dc4097 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -285,6 +285,7 @@ PortalDefineQuery(Portal portal,
const char *sourceText,
CommandTag commandTag,
List *stmts,
+ List *part_prune_results,
CachedPlan *cplan)
{
AssertArg(PortalIsValid(portal));
@@ -299,6 +300,7 @@ PortalDefineQuery(Portal portal,
portal->qc.nprocessed = 0;
portal->commandTag = commandTag;
portal->stmts = stmts;
+ portal->part_prune_results = part_prune_results;
portal->cplan = cplan;
portal->status = PORTAL_DEFINED;
}
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 666977fb1f..34975c69ee 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -87,7 +87,8 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
ExplainState *es, const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv);
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt, PartitionPruneResult *part_prune_result,
+ IntoClause *into,
ExplainState *es, const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv,
const instr_time *planduration,
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 708435e952..bd8776402e 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -45,6 +45,7 @@ extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
* nparts Length of subplan_map[] and subpart_map[].
* subplan_map Subplan index by partition index, or -1.
* subpart_map Subpart index by partition index, or -1.
+ * rti_map Range table index by partition index, or 0.
* present_parts A Bitmapset of the partition indexes that we
* have subplans or subparts for.
* initial_pruning_steps List of PartitionPruneSteps used to
@@ -61,6 +62,7 @@ typedef struct PartitionedRelPruningData
int nparts;
int *subplan_map;
int *subpart_map;
+ Index *rti_map;
Bitmapset *present_parts;
List *initial_pruning_steps;
List *exec_pruning_steps;
@@ -123,9 +125,13 @@ typedef struct PartitionPruneState
extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
int n_total_subplans,
- PartitionPruneInfo *pruneinfo,
+ int part_prune_index,
Bitmapset **initially_valid_subplans);
extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
- bool initial_prune);
-
+ bool initial_prune,
+ Bitmapset **scan_leafpart_rtis);
+extern Bitmapset *ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt,
+ ParamListInfo params,
+ PartitionPruneInfo *pruneinfo,
+ Bitmapset **scan_leafpart_rtis);
#endif /* EXECPARTITION_H */
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index e79e2c001f..60d5644908 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -35,6 +35,8 @@ typedef struct QueryDesc
/* These fields are provided by CreateQueryDesc */
CmdType operation; /* CMD_SELECT, CMD_UPDATE, etc. */
PlannedStmt *plannedstmt; /* planner's output (could be utility, too) */
+ PartitionPruneResult *part_prune_result; /* ExecutorDoInitialPruning()'s
+ * output for plannedstmt */
const char *sourceText; /* source text of the query */
Snapshot snapshot; /* snapshot to use for query */
Snapshot crosscheck_snapshot; /* crosscheck for RI update/delete */
@@ -57,6 +59,7 @@ typedef struct QueryDesc
/* in pquery.c */
extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
+ PartitionPruneResult *part_prune_result,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index 873772f188..57dc0e8077 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -185,6 +185,8 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
/*
* prototypes from functions in execMain.c
*/
+extern PartitionPruneResult *ExecutorDoInitialPruning(PlannedStmt *plannedstmt,
+ ParamListInfo params);
extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
extern void ExecutorRun(QueryDesc *queryDesc,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index cbbcff81d2..b5a7fd7e16 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -596,6 +596,8 @@ typedef struct EState
struct ExecRowMark **es_rowmarks; /* Array of per-range-table-entry
* ExecRowMarks, or NULL if none */
PlannedStmt *es_plannedstmt; /* link to top of plan tree */
+ List *es_part_prune_infos; /* PlannedStmt.partPruneInfos */
+ struct PartitionPruneResult *es_part_prune_result; /* QueryDesc.part_prune_result */
const char *es_sourceText; /* Source text from QueryDesc */
JunkFilter *es_junkFilter; /* top-level junk filter, if any */
@@ -984,6 +986,19 @@ typedef struct DomainConstraintState
*/
typedef TupleTableSlot *(*ExecProcNodeMtd) (struct PlanState *pstate);
+/*----------------
+ * PartitionPruneResult
+ *
+ * Result of ExecutorDoInitialPruning() invocation on a given plan.
+ */
+typedef struct PartitionPruneResult
+{
+ NodeTag type;
+
+ Bitmapset *scan_leafpart_rtis;
+ List *valid_subplan_offs_list;
+} PartitionPruneResult;
+
/* ----------------
* PlanState node
*
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 300824258e..de312b9215 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -97,6 +97,9 @@ typedef enum NodeTag
T_PartitionPruneStepCombine,
T_PlanInvalItem,
+ /* TAGS FOR EXECUTOR PREP NODES (execnodes.h) */
+ T_PartitionPruneResult,
+
/*
* TAGS FOR PLAN STATE NODES (execnodes.h)
*
@@ -673,6 +676,7 @@ extern struct Bitmapset *readBitmapset(void);
extern uintptr_t readDatum(bool typbyval);
extern bool *readBoolCols(int numCols);
extern int *readIntCols(int numCols);
+extern Index *readIndexCols(int numCols);
extern Oid *readOidCols(int numCols);
extern int16 *readAttrNumberCols(int numCols);
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 6cbcb67bdf..f2039071c9 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -107,6 +107,18 @@ typedef struct PlannerGlobal
List *appendRelations; /* "flat" list of AppendRelInfos */
+ List *partPruneInfos; /* List of PartitionPruneInfo contained in
+ * the plan */
+
+ bool containsInitialPruning; /* Do any of those PartitionPruneInfos
+ * have initial (pre-exec) pruning
+ * steps in them? */
+
+ Bitmapset *minLockRelids; /* RT indexes of RTE_RELATION entries that
+ * must always be locked to execute the plan;
+ * those scanned by initial-prunable plan
+ * nodes are not included */
+
List *relationOids; /* OIDs of relations the plan depends on */
List *invalItems; /* other dependencies, as PlanInvalItems */
@@ -377,6 +389,9 @@ struct PlannerInfo
/* Does this query modify any partition key columns? */
bool partColsUpdated;
+
+ /* PartitionPruneInfos added in this query's plan. */
+ List *partPruneInfos;
};
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 50ef3dda05..0a144a1e92 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -64,8 +64,19 @@ typedef struct PlannedStmt
struct Plan *planTree; /* tree of Plan nodes */
+ List *partPruneInfos; /* List of PartitionPruneInfo contained in
+ * the plan */
+
+ bool containsInitialPruning; /* Do any of those PartitionPruneInfos
+ * have initial (pre-exec) pruning
+ * steps in them? */
+
List *rtable; /* list of RangeTblEntry nodes */
+ Bitmapset *minLockRelids; /* RT indexes of RTE_RELATION entries that
+ * must be locked, except those scanned by
+ * initial-prunable plan nodes */
+
/* rtable indexes of target relations for INSERT/UPDATE/DELETE */
List *resultRelations; /* integer list of RT indexes, or NIL */
@@ -262,8 +273,12 @@ typedef struct Append
*/
int first_partial_plan;
- /* Info for run-time subplan pruning; NULL if we're not doing that */
- struct PartitionPruneInfo *part_prune_info;
+ /*
+ * Index of this plan's PartitionPruneInfo in PlannedStmt.partPruneInfos
+ * to be used for run-time subplan pruning; -1 if run-time pruning is
+ * not needed.
+ */
+ int part_prune_index;
} Append;
/* ----------------
@@ -282,8 +297,13 @@ typedef struct MergeAppend
Oid *sortOperators; /* OIDs of operators to sort them by */
Oid *collations; /* OIDs of collations */
bool *nullsFirst; /* NULLS FIRST/LAST directions */
- /* Info for run-time subplan pruning; NULL if we're not doing that */
- struct PartitionPruneInfo *part_prune_info;
+
+ /*
+ * Index of this plan's PartitionPruneInfo in PlannedStmt.partPruneInfos
+ * to be used for run-time subplan pruning; -1 if run-time pruning is
+ * not needed.
+ */
+ int part_prune_index;
} MergeAppend;
/* ----------------
@@ -1175,6 +1195,13 @@ typedef struct PlanRowMark
* prune_infos List of Lists containing PartitionedRelPruneInfo nodes,
* one sublist per run-time-prunable partition hierarchy
* appearing in the parent plan node's subplans.
+ *
+ * needs_init_pruning Does any of the PartitionedRelPruneInfos in
+ * prune_infos have its initial_pruning_steps set?
+ *
+ * needs_exec_pruning Does any of the PartitionedRelPruneInfos in
+ * prune_infos have its exec_pruning_steps set?
+ *
* other_subplans Indexes of any subplans that are not accounted for
* by any of the PartitionedRelPruneInfo nodes in
* "prune_infos". These subplans must not be pruned.
@@ -1183,6 +1210,9 @@ typedef struct PartitionPruneInfo
{
NodeTag type;
List *prune_infos;
+ Bitmapset *leafpart_rtis;
+ bool needs_init_pruning;
+ bool needs_exec_pruning;
Bitmapset *other_subplans;
} PartitionPruneInfo;
@@ -1213,6 +1243,7 @@ typedef struct PartitionedRelPruneInfo
int *subplan_map; /* subplan index by partition index, or -1 */
int *subpart_map; /* subpart index by partition index, or -1 */
Oid *relid_map; /* relation OID by partition index, or 0 */
+ Index *rti_map; /* Range table index by partition index, or 0 */
/*
* initial_pruning_steps shows how to prune during executor startup (i.e.,
diff --git a/src/include/tcop/tcopprot.h b/src/include/tcop/tcopprot.h
index 92291a750d..119d4a1d10 100644
--- a/src/include/tcop/tcopprot.h
+++ b/src/include/tcop/tcopprot.h
@@ -64,7 +64,7 @@ extern PlannedStmt *pg_plan_query(Query *querytree, const char *query_string,
ParamListInfo boundParams);
extern List *pg_plan_queries(List *querytrees, const char *query_string,
int cursorOptions,
- ParamListInfo boundParams);
+ ParamListInfo boundParams, List **part_prune_result_list);
extern bool check_max_stack_depth(int *newval, void **extra, GucSource source);
extern void assign_max_stack_depth(int newval, void *extra);
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 95b99e3d25..f591b9df9c 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -148,6 +148,9 @@ typedef struct CachedPlan
{
int magic; /* should equal CACHEDPLAN_MAGIC */
List *stmt_list; /* list of PlannedStmts */
+ List *part_prune_result_list; /* list of PartitionPruneResult with
+ * one element for each of stmt_list; NIL
+ * if not a generic plan */
bool is_oneshot; /* is it a "oneshot" plan? */
bool is_saved; /* is CachedPlan in a long-lived context? */
bool is_valid; /* is the stmt_list currently valid? */
@@ -158,6 +161,10 @@ typedef struct CachedPlan
int generation; /* parent's generation number for this plan */
int refcount; /* count of live references to this struct */
MemoryContext context; /* context containing this CachedPlan */
+ MemoryContext part_prune_result_list_context; /* context containing
+ * part_prune_result_list,
+ * a child of the above
+ * context */
} CachedPlan;
/*
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index aeddbdafe5..c1e304f9d7 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -137,6 +137,10 @@ typedef struct PortalData
CommandTag commandTag; /* command tag for original query */
QueryCompletion qc; /* command completion data for executed query */
List *stmts; /* list of PlannedStmts */
+ List *part_prune_results; /* list of PartitionPruneResults with one element
+ * for each of 'stmts'; same as
+ * cplan->part_prune_result_list if cplan is
+ * not NULL */
CachedPlan *cplan; /* CachedPlan, if stmts are from one */
ParamListInfo portalParams; /* params to pass to query */
@@ -241,6 +245,7 @@ extern void PortalDefineQuery(Portal portal,
const char *sourceText,
CommandTag commandTag,
List *stmts,
+ List *part_prune_results,
CachedPlan *cplan);
extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
extern void PortalCreateHoldStore(Portal portal);
--
2.24.1
^ permalink raw reply [nested|flat] 71+ messages in thread
* Re: generic plans and "initial" pruning
@ 2022-04-07 08:27 Amit Langote <[email protected]>
parent: Amit Langote <[email protected]>
0 siblings, 1 reply; 71+ messages in thread
From: Amit Langote @ 2022-04-07 08:27 UTC (permalink / raw)
To: David Rowley <[email protected]>; +Cc: Robert Haas <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers
On Wed, Apr 6, 2022 at 4:20 PM Amit Langote <[email protected]> wrote:
> And here is a version like that that passes make check-world. Maybe
> still a WIP as I think comments could use more editing.
>
> Here's how the new implementation works:
>
> AcquireExecutorLocks() calls ExecutorDoInitialPruning(), which in turn
> iterates over a list of PartitionPruneInfos in a given PlannedStmt
> coming from a CachedPlan. For each PartitionPruneInfo,
> ExecPartitionDoInitialPruning() is called, which sets up
> PartitionPruneState and performs initial pruning steps present in the
> PartitionPruneInfo. The resulting bitmapsets of valid subplans, one
> for each PartitionPruneInfo, are collected in a list and added to a
> result node called PartitionPruneResult. It represents the result of
> performing initial pruning on all PartitionPruneInfos found in a plan.
> A list of PartitionPruneResults is passed along with the PlannedStmt
> to the executor, which is referenced when initializing
> Append/MergeAppend nodes.
>
> PlannedStmt.minLockRelids defined by the planner contains the RT
> indexes of all the entries in the range table minus those of the leaf
> partitions whose subplans are subject to removal due to initial
> pruning. AcquireExecutorLocks() adds back the RT indexes of only those
> leaf partitions whose subplans survive ExecutorDoInitialPruning(). To
> get the leaf partition RT indexes from the PartitionPruneInfo, a new
> rti_map array is added to PartitionedRelPruneInfo.
>
> There's only one patch this time. Patches that added partitioned_rels
> and plan_tree_walker() are no longer necessary.
Here's an updated version. In particular, I removed the
part_prune_results list from PortalData; anything that needs to look
at the list can instead get it from the CachedPlan
(PortalData.cplan). This makes things better in 2 ways:
* All the changes that were needed to produce the list to be passed to
PortalDefineQuery() are now unnecessary (especially ugly ones were
those made to pg_plan_queries()'s interface)
* The cases in which the PartitionPruneResult being added to a
QueryDesc can be assumed to be valid are more clearly defined now; it's
the cases where the portal's CachedPlan is also valid, that is, if the
accompanying PlannedStmt is a cached one.
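For anyone skimming the thread, the lock-set computation described in the
quoted summary can be sketched outside of PostgreSQL roughly as below. This
is only an illustration, not code from the patch: RtiSet (a plain 64-bit mask
standing in for Bitmapset), SketchPruneInfo, and the sketch_* functions are
made-up names, and "pruning" is reduced to an equality check against a single
parameter.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for Bitmapset (assumption): bit N set means RT index N is a member. */
typedef uint64_t RtiSet;

#define SKETCH_MAX_SUBPLANS 8

/* Hypothetical, heavily simplified counterpart of PartitionPruneInfo. */
typedef struct SketchPruneInfo
{
    int nsubplans;                      /* number of child subplans */
    int rti_map[SKETCH_MAX_SUBPLANS];   /* RT index of the leaf partition
                                         * scanned by each subplan */
    int part_key[SKETCH_MAX_SUBPLANS];  /* the one key value each partition
                                         * accepts (toy pruning rule) */
} SketchPruneInfo;

/*
 * Toy "initial pruning": a subplan survives only if its partition accepts
 * param.  Returns the set of surviving subplan offsets and adds the RT
 * indexes of the surviving leaf partitions to *scan_leafpart_rtis.
 */
static uint64_t
sketch_do_initial_pruning(const SketchPruneInfo *info, int param,
                          RtiSet *scan_leafpart_rtis)
{
    uint64_t valid_subplan_offs = 0;

    assert(info->nsubplans <= SKETCH_MAX_SUBPLANS);
    for (int i = 0; i < info->nsubplans; i++)
    {
        int rti = info->rti_map[i];

        assert(rti > 0 && rti < 64);
        if (info->part_key[i] == param)
        {
            valid_subplan_offs |= UINT64_C(1) << i;
            *scan_leafpart_rtis |= UINT64_C(1) << rti;
        }
    }
    return valid_subplan_offs;
}

/*
 * Relations to lock: those that must always be locked, plus the leaf
 * partitions whose subplans survived initial pruning.
 */
static RtiSet
sketch_lock_relids(RtiSet min_lock_relids, RtiSet scan_leafpart_rtis)
{
    return min_lock_relids | scan_leafpart_rtis;
}

/* Walk the lock set in RT index order; returns how many would be locked. */
static int
sketch_lock_all(RtiSet lock_relids)
{
    int nlocked = 0;

    for (int rti = 1; rti < 64; rti++)
    {
        if (lock_relids & (UINT64_C(1) << rti))
        {
            printf("would lock RT index %d\n", rti);
            nlocked++;
        }
    }
    return nlocked;
}
```

With a parent at RT index 1 (always locked) and leaf partitions at RT indexes
2, 3 and 4 accepting keys 10, 20 and 30, calling sketch_do_initial_pruning()
with a parameter of 20 leaves only subplan offset 1, and sketch_lock_relids()
then yields RT indexes 1 and 3, so the other two partitions are never locked.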
--
Amit Langote
EDB: http://www.enterprisedb.com
Attachments:
[application/octet-stream] v12-0001-Optimize-AcquireExecutorLocks-to-skip-pruned-par.patch (90.2K, 2-v12-0001-Optimize-AcquireExecutorLocks-to-skip-pruned-par.patch)
download | inline diff:
From f55a622383c90c3f300dede0d04247f7cf2d9e77 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Wed, 22 Dec 2021 16:55:17 +0900
Subject: [PATCH v12] Optimize AcquireExecutorLocks() to skip pruned partitions
---
src/backend/commands/copyto.c | 2 +-
src/backend/commands/createas.c | 2 +-
src/backend/commands/explain.c | 7 +-
src/backend/commands/extension.c | 2 +-
src/backend/commands/matview.c | 2 +-
src/backend/commands/prepare.c | 13 +-
src/backend/executor/README | 28 +++
src/backend/executor/execMain.c | 46 +++++
src/backend/executor/execParallel.c | 28 ++-
src/backend/executor/execPartition.c | 238 ++++++++++++++++++++----
src/backend/executor/execUtils.c | 1 +
src/backend/executor/functions.c | 2 +-
src/backend/executor/nodeAppend.c | 16 +-
src/backend/executor/nodeMergeAppend.c | 9 +-
src/backend/executor/spi.c | 10 +-
src/backend/nodes/copyfuncs.c | 33 +++-
src/backend/nodes/outfuncs.c | 36 +++-
src/backend/nodes/readfuncs.c | 56 +++++-
src/backend/optimizer/plan/createplan.c | 20 +-
src/backend/optimizer/plan/planner.c | 3 +
src/backend/optimizer/plan/setrefs.c | 104 ++++++++---
src/backend/partitioning/partprune.c | 41 +++-
src/backend/tcop/pquery.c | 28 ++-
src/backend/utils/cache/plancache.c | 236 ++++++++++++++++++++---
src/include/commands/explain.h | 3 +-
src/include/executor/execPartition.h | 12 +-
src/include/executor/execdesc.h | 3 +
src/include/executor/executor.h | 2 +
src/include/nodes/execnodes.h | 15 ++
src/include/nodes/nodes.h | 4 +
src/include/nodes/pathnodes.h | 15 ++
src/include/nodes/plannodes.h | 39 +++-
src/include/utils/plancache.h | 7 +
33 files changed, 919 insertions(+), 144 deletions(-)
diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index 55c38b04c4..d403eb2309 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -542,7 +542,7 @@ BeginCopyTo(ParseState *pstate,
((DR_copy *) dest)->cstate = cstate;
/* Create a QueryDesc requesting no output */
- cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ cstate->queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
GetActiveSnapshot(),
InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 9abbb6b555..f6607f2454 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -325,7 +325,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 1e5701b8eb..7ba9852e51 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -407,7 +407,7 @@ ExplainOneQuery(Query *query, int cursorOptions,
}
/* run it (if needed) and produce output */
- ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+ ExplainOnePlan(plan, NULL, into, es, queryString, params, queryEnv,
&planduration, (es->buffers ? &bufusage : NULL));
}
}
@@ -515,7 +515,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
* to call it.
*/
void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, PartitionPruneResult *part_prune_result,
+ IntoClause *into, ExplainState *es,
const char *queryString, ParamListInfo params,
QueryEnvironment *queryEnv, const instr_time *planduration,
const BufferUsage *bufusage)
@@ -563,7 +564,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
dest = None_Receiver;
/* Create a QueryDesc for the query */
- queryDesc = CreateQueryDesc(plannedstmt, queryString,
+ queryDesc = CreateQueryDesc(plannedstmt, part_prune_result, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, instrument_option);
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index 1013790dbb..54734a3a93 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -776,7 +776,7 @@ execute_sql_string(const char *sql)
{
QueryDesc *qdesc;
- qdesc = CreateQueryDesc(stmt,
+ qdesc = CreateQueryDesc(stmt, NULL,
sql,
GetActiveSnapshot(), NULL,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 9ab248d25e..2be1782bc4 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -416,7 +416,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, queryString,
+ queryDesc = CreateQueryDesc(plan, NULL, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 80738547ed..45039e64be 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -576,7 +576,9 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
const char *query_string;
CachedPlan *cplan;
List *plan_list;
- ListCell *p;
+ List *plan_part_prune_result_list;
+ ListCell *p,
+ *pp;
ParamListInfo paramLI = NULL;
EState *estate = NULL;
instr_time planstart;
@@ -632,15 +634,18 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
}
plan_list = cplan->stmt_list;
+ plan_part_prune_result_list = cplan->part_prune_result_list;
/* Explain each query */
- foreach(p, plan_list)
+ forboth(p, plan_list, pp, plan_part_prune_result_list)
{
PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
+ PartitionPruneResult *part_prune_result = lfirst_node(PartitionPruneResult, pp);
if (pstmt->commandType != CMD_UTILITY)
- ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv,
- &planduration, (es->buffers ? &bufusage : NULL));
+ ExplainOnePlan(pstmt, part_prune_result, into, es, query_string,
+ paramLI, queryEnv, &planduration,
+ (es->buffers ? &bufusage : NULL));
else
ExplainOneUtility(pstmt->utilityStmt, into, es, query_string,
paramLI, queryEnv);
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 0b5183fc4a..8418e758da 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -65,6 +65,30 @@ found there. This currently only occurs for Append and MergeAppend nodes. In
this case the non-required subplans are ignored and the executor state's
subnode array will become out of sequence to the plan's subplan list.
+Actually, the so-called execution time pruning may also occur even before the
+execution has started. One case where that occurs is when a cached generic
+plan is being validated for execution by plancache.c: GetCachedPlan(), which
+proceeds by locking all the relations that will be scanned by that plan. If
+the generic plan has nodes that contain so-called initial pruning steps (a
+subset of execution pruning steps that do not depend on full-fledged execution
+having started), they are performed at this point to figure out the minimal
+set of child subplans that satisfy those pruning instructions and the result
+of performing that pruning is saved in a data structure that gets passed to
+the executor alongside the plan tree. Relations scanned by only those
+surviving subplans are then locked while those scanned by the pruned subplans
+are not, even though the pruned subplans themselves are not removed from the
+plan tree. So, it is imperative that the executor and any third party code
+invoked by it that gets passed the plan tree look at the initial pruning result
+made available via the aforementioned data structure to determine whether or
+not a particular subplan is valid. The data structure basically consists of
+a PartitionPruneResult node passed through the QueryDesc (subsequently added
+to EState) containing a list of bitmapsets with one element for every
+PartitionPruneInfo found in PlannedStmt.partPruneInfos. The list is indexed
+with part_prune_index of the individual PartitionPruneInfos that's stored in
+the parent plan nodes to which a given PartitionPruneInfo belongs. Each
+bitmapset contains the indexes of the child subplans of the given parent plan
+node that survive initial partition pruning.
+
Each Plan node may have expression trees associated with it, to represent
its target list, qualification conditions, etc. These trees are also
read-only to the executor, but the executor state for expression evaluation
@@ -286,6 +310,10 @@ Query Processing Control Flow
This is a sketch of control flow for full query processing:
+ [ ExecutorDoInitialPruning ] --- an optional step to perform initial
+ partition pruning on the plan tree the result of which is passed
+ to the executor via QueryDesc
+
CreateQueryDesc
ExecutorStart
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index ef2fd46092..05cc99df8f 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -49,11 +49,13 @@
#include "commands/matview.h"
#include "commands/trigger.h"
#include "executor/execdebug.h"
+#include "executor/execPartition.h"
#include "executor/nodeSubplan.h"
#include "foreign/fdwapi.h"
#include "jit/jit.h"
#include "mb/pg_wchar.h"
#include "miscadmin.h"
+#include "nodes/nodeFuncs.h"
#include "parser/parsetree.h"
#include "storage/bufmgr.h"
#include "storage/lmgr.h"
@@ -104,6 +106,47 @@ static void EvalPlanQualStart(EPQState *epqstate, Plan *planTree);
/* end of local decls */
+/* ----------------------------------------------------------------
+ * ExecutorDoInitialPruning
+ *
+ * Performs initial partition pruning to figure out the minimal set of
+ * subplans to be executed and the set of RT indexes of the corresponding
+ * leaf partitions
+ *
+ * Returned PartitionPruneResult must be subsequently passed to the executor
+ * so that it can reuse the result of pruning. It's important that the
+ * executor has the same view of which partitions are initially pruned (by
+ * not doing the pruning again itself), or otherwise it risks initializing
+ * subplans whose partitions would not have been locked.
+ *
+ * Note: Partitioned tables mentioned in PartitionedRelPruneInfo nodes that
+ * drive the pruning will be locked before doing the pruning.
+ */
+PartitionPruneResult *
+ExecutorDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params)
+{
+ PartitionPruneResult *result;
+ ListCell *lc;
+
+ /* Only get here if there is any pruning to do. */
+ Assert(plannedstmt->containsInitialPruning);
+
+ result = makeNode(PartitionPruneResult);
+ foreach(lc, plannedstmt->partPruneInfos)
+ {
+ PartitionPruneInfo *pruneinfo = lfirst(lc);
+ Bitmapset *valid_subplan_offs;
+
+ valid_subplan_offs =
+ ExecPartitionDoInitialPruning(plannedstmt, params, pruneinfo,
+ &result->scan_leafpart_rtis);
+ result->valid_subplan_offs_list =
+ lappend(result->valid_subplan_offs_list,
+ valid_subplan_offs);
+ }
+
+ return result;
+}
/* ----------------------------------------------------------------
* ExecutorStart
@@ -806,6 +849,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
{
CmdType operation = queryDesc->operation;
PlannedStmt *plannedstmt = queryDesc->plannedstmt;
+ PartitionPruneResult *part_prune_result = queryDesc->part_prune_result;
Plan *plan = plannedstmt->planTree;
List *rangeTable = plannedstmt->rtable;
EState *estate = queryDesc->estate;
@@ -825,6 +869,8 @@ InitPlan(QueryDesc *queryDesc, int eflags)
ExecInitRangeTable(estate, rangeTable);
estate->es_plannedstmt = plannedstmt;
+ estate->es_part_prune_infos = plannedstmt->partPruneInfos;
+ estate->es_part_prune_result = part_prune_result;
/*
* Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index 9a0d5d59ef..805f86c503 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -66,6 +66,7 @@
#define PARALLEL_KEY_QUERY_TEXT UINT64CONST(0xE000000000000008)
#define PARALLEL_KEY_JIT_INSTRUMENTATION UINT64CONST(0xE000000000000009)
#define PARALLEL_KEY_WAL_USAGE UINT64CONST(0xE00000000000000A)
+#define PARALLEL_KEY_PARTITIONPRUNERESULT UINT64CONST(0xE00000000000000B)
#define PARALLEL_TUPLE_QUEUE_SIZE 65536
@@ -182,7 +183,9 @@ ExecSerializePlan(Plan *plan, EState *estate)
pstmt->transientPlan = false;
pstmt->dependsOnRole = false;
pstmt->parallelModeNeeded = false;
+ pstmt->containsInitialPruning = false;
pstmt->planTree = plan;
+ pstmt->partPruneInfos = estate->es_part_prune_infos;
pstmt->rtable = estate->es_range_table;
pstmt->resultRelations = NIL;
pstmt->appendRelations = NIL;
@@ -596,12 +599,15 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
FixedParallelExecutorState *fpes;
char *pstmt_data;
char *pstmt_space;
+ char *part_prune_result_data;
+ char *part_prune_result_space;
char *paramlistinfo_space;
BufferUsage *bufusage_space;
WalUsage *walusage_space;
SharedExecutorInstrumentation *instrumentation = NULL;
SharedJitInstrumentation *jit_instrumentation = NULL;
int pstmt_len;
+ int part_prune_result_len;
int paramlistinfo_len;
int instrumentation_len = 0;
int jit_instrumentation_len = 0;
@@ -630,6 +636,7 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
/* Fix up and serialize plan to be sent to workers. */
pstmt_data = ExecSerializePlan(planstate->plan, estate);
+ part_prune_result_data = nodeToString(estate->es_part_prune_result);
/* Create a parallel context. */
pcxt = CreateParallelContext("postgres", "ParallelQueryMain", nworkers);
@@ -656,6 +663,11 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
shm_toc_estimate_chunk(&pcxt->estimator, pstmt_len);
shm_toc_estimate_keys(&pcxt->estimator, 1);
+ /* Estimate space for serialized PartitionPruneResult. */
+ part_prune_result_len = strlen(part_prune_result_data) + 1;
+ shm_toc_estimate_chunk(&pcxt->estimator, part_prune_result_len);
+ shm_toc_estimate_keys(&pcxt->estimator, 1);
+
/* Estimate space for serialized ParamListInfo. */
paramlistinfo_len = EstimateParamListSpace(estate->es_param_list_info);
shm_toc_estimate_chunk(&pcxt->estimator, paramlistinfo_len);
@@ -750,6 +762,12 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
memcpy(pstmt_space, pstmt_data, pstmt_len);
shm_toc_insert(pcxt->toc, PARALLEL_KEY_PLANNEDSTMT, pstmt_space);
+ /* Store serialized PartitionPruneResult */
+ part_prune_result_space = shm_toc_allocate(pcxt->toc, part_prune_result_len);
+ memcpy(part_prune_result_space, part_prune_result_data, part_prune_result_len);
+ shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARTITIONPRUNERESULT,
+ part_prune_result_space);
+
/* Store serialized ParamListInfo. */
paramlistinfo_space = shm_toc_allocate(pcxt->toc, paramlistinfo_len);
shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARAMLISTINFO, paramlistinfo_space);
@@ -1231,8 +1249,10 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
int instrument_options)
{
char *pstmtspace;
+ char *part_prune_result_space;
char *paramspace;
PlannedStmt *pstmt;
+ PartitionPruneResult *part_prune_result;
ParamListInfo paramLI;
char *queryString;
@@ -1243,12 +1263,18 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
pstmtspace = shm_toc_lookup(toc, PARALLEL_KEY_PLANNEDSTMT, false);
pstmt = (PlannedStmt *) stringToNode(pstmtspace);
+ /* Reconstruct leader-supplied PartitionPruneResult. */
+ part_prune_result_space =
+ shm_toc_lookup(toc, PARALLEL_KEY_PARTITIONPRUNERESULT, false);
+ part_prune_result = (PartitionPruneResult *)
+ stringToNode(part_prune_result_space);
+
/* Reconstruct ParamListInfo. */
paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
paramLI = RestoreParamList(&paramspace);
/* Create a QueryDesc for the query. */
- return CreateQueryDesc(pstmt,
+ return CreateQueryDesc(pstmt, part_prune_result,
queryString,
GetActiveSnapshot(), InvalidSnapshot,
receiver, paramLI, NULL, instrument_options);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 615bd80973..3037742b8d 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -25,6 +25,7 @@
#include "mb/pg_wchar.h"
#include "miscadmin.h"
#include "nodes/makefuncs.h"
+#include "parser/parsetree.h"
#include "partitioning/partbounds.h"
#include "partitioning/partdesc.h"
#include "partitioning/partprune.h"
@@ -185,7 +186,11 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
static PartitionPruneState *CreatePartitionPruneState(PlanState *planstate,
- PartitionPruneInfo *pruneinfo);
+ PartitionPruneInfo *pruneinfo,
+ bool consider_initial_steps,
+ bool consider_exec_steps,
+ List *rtable, ExprContext *econtext,
+ PartitionDirectory partdir);
static void InitPartitionPruneContext(PartitionPruneContext *context,
List *pruning_steps,
PartitionDesc partdesc,
@@ -198,7 +203,8 @@ static void PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
- Bitmapset **validsubplans);
+ Bitmapset **validsubplans,
+ Bitmapset **scan_leafpart_rtis);
/*
@@ -1587,8 +1593,10 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
* considered to be a stable expression, it can change value from one plan
* node scan to the next during query execution. Stable comparison
* expressions that don't involve such Params allow partition pruning to be
- * done once during executor startup. Expressions that do involve such Params
- * require us to prune separately for each scan of the parent plan node.
+ * done once, either during executor startup or in ExecutorDoInitialPruning(),
+ * which runs as part of AcquireExecutorLocks() on a given plan tree.
+ * Expressions that do involve such Params require us to prune separately for
+ * each scan of the parent plan node.
*
* Note that pruning away unneeded subplans during executor startup has the
* added benefit of not having to initialize the unneeded subplans at all.
@@ -1605,6 +1613,13 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
* account for initial pruning possibly having eliminated some of the
* subplans.
*
+ * ExecPartitionDoInitialPruning:
+ * Do initial pruning with the information contained in a given
+ * PartitionPruneInfo to determine the minimal set of child subplans
+ * of the owning parent plan node that need to be executed, and also
+ * the set of RT indexes of the leaf partitions that those subplans
+ * will scan.
+ *
* ExecFindMatchingSubPlans:
* Returns indexes of matching subplans after evaluating the expressions
* that are safe to evaluate at a given point. This function is first
@@ -1622,8 +1637,9 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
*
* On return, *initially_valid_subplans is assigned the set of indexes of
* child subplans that must be initialized along with the parent plan node.
- * Initial pruning is performed here if needed and in that case only the
- * surviving subplans' indexes are added.
+ * Initial pruning is performed here if needed (unless it has already been done
+ * by ExecutorDoInitialPruning()), and in that case only the surviving
+ * subplans' indexes are added.
*
* If subplans are indeed pruned, subplan_map arrays contained in the returned
* PartitionPruneState are re-sequenced to not count those, though only if the
@@ -1632,23 +1648,59 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
PartitionPruneState *
ExecInitPartitionPruning(PlanState *planstate,
int n_total_subplans,
- PartitionPruneInfo *pruneinfo,
+ int part_prune_index,
Bitmapset **initially_valid_subplans)
{
- PartitionPruneState *prunestate;
EState *estate = planstate->state;
+ PartitionPruneInfo *pruneinfo = list_nth(estate->es_part_prune_infos,
+ part_prune_index);
+ PartitionPruneResult *pruneresult = estate->es_part_prune_result;
+ PartitionPruneState *prunestate;
+ bool do_pruning = (pruneinfo->needs_init_pruning ||
+ pruneinfo->needs_exec_pruning);
- /* We may need an expression context to evaluate partition exprs */
- ExecAssignExprContext(estate, planstate);
+ /*
+ * No need to do initial pruning if it was already done by
+ * ExecutorDoInitialPruning(), which is the case if es_part_prune_result
+ * has been set.
+ */
+ if (pruneresult)
+ do_pruning = pruneinfo->needs_exec_pruning;
+
+ prunestate = NULL;
+ if (do_pruning)
+ {
+ /* We may need an expression context to evaluate partition exprs */
+ ExecAssignExprContext(estate, planstate);
- /* Create the working data structure for pruning */
- prunestate = CreatePartitionPruneState(planstate, pruneinfo);
+ /* For data reading, executor always omits detached partitions */
+ if (estate->es_partition_directory == NULL)
+ estate->es_partition_directory =
+ CreatePartitionDirectory(estate->es_query_cxt, false);
+
+ /*
+ * Create the working data structure for pruning. No need to consider
+ * initial pruning steps if we have a PartitionPruneResult.
+ */
+ prunestate = CreatePartitionPruneState(planstate, pruneinfo,
+ pruneresult == NULL, true,
+ NIL, planstate->ps_ExprContext,
+ estate->es_partition_directory);
+ }
/*
* Perform an initial partition prune pass, if required.
*/
- if (prunestate->do_initial_prune)
- *initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true);
+ if (pruneresult)
+ {
+ *initially_valid_subplans =
+ list_nth(pruneresult->valid_subplan_offs_list, part_prune_index);
+ }
+ else if (prunestate && prunestate->do_initial_prune)
+ {
+ *initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true,
+ NULL);
+ }
else
{
/* No pruning, so we'll need to initialize all subplans */
@@ -1669,7 +1721,7 @@ ExecInitPartitionPruning(PlanState *planstate,
* leaves invalid data in prunestate, because that data won't be
* consulted again (cf initial Assert in ExecFindMatchingSubPlans).
*/
- if (prunestate->do_exec_prune)
+ if (prunestate && prunestate->do_exec_prune)
PartitionPruneFixSubPlanMap(prunestate,
*initially_valid_subplans,
n_total_subplans);
@@ -1678,11 +1730,72 @@ ExecInitPartitionPruning(PlanState *planstate,
return prunestate;
}
+/*
+ * ExecPartitionDoInitialPruning
+ * Perform initial pruning using the given PartitionPruneInfo to determine
+ * the minimal set of child subplans of the owning parent plan node that
+ * need to be executed, and also the set of RT indexes of the leaf
+ * partitions that will be scanned by those subplans.
+ */
+Bitmapset *
+ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
+ PartitionPruneInfo *pruneinfo,
+ Bitmapset **scan_leafpart_rtis)
+{
+ List *rtable = plannedstmt->rtable;
+ ExprContext *econtext;
+ PartitionDirectory pdir;
+ MemoryContext oldcontext,
+ tmpcontext;
+ PartitionPruneState *prunestate;
+ Bitmapset *valid_subplan_offs;
+
+ /*
+ * A temporary context to allocate stuff needed to run the pruning steps.
+ */
+ tmpcontext = AllocSetContextCreate(CurrentMemoryContext,
+ "initial pruning working data",
+ ALLOCSET_DEFAULT_SIZES);
+ oldcontext = MemoryContextSwitchTo(tmpcontext);
+
+ /*
+ * PartitionDirectory to look up partition descriptors, which omits
+ * detached partitions, just like in the executor proper.
+ */
+ pdir = CreatePartitionDirectory(CurrentMemoryContext, false);
+
+ /*
+ * We don't yet have a PlanState for the parent plan node, so must create
+ * a standalone ExprContext to evaluate pruning expressions, equipped with
+ * the information about the EXTERN parameters that the caller passed us.
+ * Note that that's okay because the initial pruning steps do not contain
+ * anything that requires the execution to have started.
+ */
+ econtext = CreateStandaloneExprContext();
+ econtext->ecxt_param_list_info = params;
+ prunestate = CreatePartitionPruneState(NULL, pruneinfo, true, false,
+ rtable, econtext, pdir);
+ MemoryContextSwitchTo(oldcontext);
+
+ /* Do the initial pruning. */
+ valid_subplan_offs = ExecFindMatchingSubPlans(prunestate, true,
+ scan_leafpart_rtis);
+
+ FreeExprContext(econtext, true);
+ DestroyPartitionDirectory(pdir);
+ MemoryContextDelete(tmpcontext);
+
+ return valid_subplan_offs;
+}
+
/*
* CreatePartitionPruneState
* Build the data structure required for calling ExecFindMatchingSubPlans
*
- * 'planstate' is the parent plan node's execution state.
+ * 'planstate', if not NULL, is the parent plan node's execution state. It
+ * can be NULL if being called before ExecutorStart(), in which case,
+ * 'rtable' (range table), 'econtext', and 'partdir' must be explicitly
+ * provided.
*
* 'pruneinfo' is a PartitionPruneInfo as generated by
* make_partition_pruneinfo. Here we build a PartitionPruneState containing a
@@ -1696,19 +1809,21 @@ ExecInitPartitionPruning(PlanState *planstate,
* PartitionedRelPruneInfo.
*/
static PartitionPruneState *
-CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
+CreatePartitionPruneState(PlanState *planstate,
+ PartitionPruneInfo *pruneinfo,
+ bool consider_initial_steps,
+ bool consider_exec_steps,
+ List *rtable, ExprContext *econtext,
+ PartitionDirectory partdir)
{
- EState *estate = planstate->state;
+ EState *estate = planstate ? planstate->state : NULL;
PartitionPruneState *prunestate;
int n_part_hierarchies;
ListCell *lc;
int i;
- ExprContext *econtext = planstate->ps_ExprContext;
- /* For data reading, executor always omits detached partitions */
- if (estate->es_partition_directory == NULL)
- estate->es_partition_directory =
- CreatePartitionDirectory(estate->es_query_cxt, false);
+ Assert((estate != NULL) ||
+ (partdir != NULL && econtext != NULL && rtable != NIL));
n_part_hierarchies = list_length(pruneinfo->prune_infos);
Assert(n_part_hierarchies > 0);
@@ -1759,19 +1874,48 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
PartitionedRelPruneInfo *pinfo = lfirst_node(PartitionedRelPruneInfo, lc2);
PartitionedRelPruningData *pprune = &prunedata->partrelprunedata[j];
Relation partrel;
+ bool close_partrel = false;
PartitionDesc partdesc;
PartitionKey partkey;
/*
- * We can rely on the copies of the partitioned table's partition
- * key and partition descriptor appearing in its relcache entry,
- * because that entry will be held open and locked for the
- * duration of this executor run.
+ * Must open the relation ourselves when called before execution
+ * has started, such as during ExecutorDoInitialPruning() on a
+ * cached plan. In that case, sub-partitions must be locked,
+ * because AcquirePlannerLocks() would not have seen them. (The
+ * 1st relation in a partrelpruneinfos list is always the root
+ * partitioned table appearing in the query, which
+ * AcquirePlannerLocks() would have locked; the Assert in
+ * relation_open() guards that assumption.)
+ */
+ if (estate == NULL)
+ {
+ RangeTblEntry *rte = rt_fetch(pinfo->rtindex, rtable);
+ int lockmode = (j == 0) ? NoLock : rte->rellockmode;
+
+ partrel = table_open(rte->relid, lockmode);
+ close_partrel = true;
+ }
+ else
+ partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
+
+ /*
+ * We can rely on the copy of the partitioned table's partition
+ * key in its relcache entry, because it can't change (or get
+ * destroyed) as long as the relation is locked. The partition
+ * descriptor is taken from the PartitionDirectory, which is held
+ * open long enough for the descriptor to remain valid while it's
+ * used to perform the pruning steps.
*/
- partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
partkey = RelationGetPartitionKey(partrel);
- partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
- partrel);
+ partdesc = PartitionDirectoryLookup(partdir, partrel);
+
+ /*
+ * Must close partrel, keeping the lock taken, if we're not using
+ * EState's entry.
+ */
+ if (close_partrel)
+ table_close(partrel, NoLock);
/*
* Initialize the subplan_map and subpart_map.
@@ -1785,6 +1929,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
Assert(partdesc->nparts >= pinfo->nparts);
pprune->nparts = partdesc->nparts;
pprune->subplan_map = palloc(sizeof(int) * partdesc->nparts);
+ pprune->rti_map = palloc(sizeof(Index) * partdesc->nparts);
if (partdesc->nparts == pinfo->nparts)
{
/*
@@ -1795,6 +1940,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
pprune->subpart_map = pinfo->subpart_map;
memcpy(pprune->subplan_map, pinfo->subplan_map,
sizeof(int) * pinfo->nparts);
+ memcpy(pprune->rti_map, pinfo->rti_map,
+ sizeof(Index) * pinfo->nparts);
/*
* Double-check that the list of unpruned relations has not
@@ -1845,6 +1992,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
pinfo->subplan_map[pd_idx];
pprune->subpart_map[pp_idx] =
pinfo->subpart_map[pd_idx];
+ pprune->rti_map[pp_idx] =
+ pinfo->rti_map[pd_idx];
pd_idx++;
}
else
@@ -1852,6 +2001,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
/* this partdesc entry is not in the plan */
pprune->subplan_map[pp_idx] = -1;
pprune->subpart_map[pp_idx] = -1;
+ pprune->rti_map[pp_idx] = 0;
}
}
@@ -1873,7 +2023,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
* Initialize pruning contexts as needed.
*/
pprune->initial_pruning_steps = pinfo->initial_pruning_steps;
- if (pinfo->initial_pruning_steps)
+ if (consider_initial_steps && pinfo->initial_pruning_steps)
{
InitPartitionPruneContext(&pprune->initial_context,
pinfo->initial_pruning_steps,
@@ -1883,7 +2033,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
prunestate->do_initial_prune = true;
}
pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
- if (pinfo->exec_pruning_steps)
+ if (consider_exec_steps && pinfo->exec_pruning_steps)
{
InitPartitionPruneContext(&pprune->exec_context,
pinfo->exec_pruning_steps,
@@ -2111,10 +2261,14 @@ PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
* Pass initial_prune if PARAM_EXEC Params cannot yet be evaluated. This
* differentiates the initial executor-time pruning step from later
* runtime pruning.
+ *
+ * RT indexes of leaf partitions scanned by the chosen subplans are added to
+ * *scan_leafpart_rtis if the pointer is non-NULL.
*/
Bitmapset *
ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
- bool initial_prune)
+ bool initial_prune,
+ Bitmapset **scan_leafpart_rtis)
{
Bitmapset *result = NULL;
MemoryContext oldcontext;
@@ -2149,7 +2303,7 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
*/
pprune = &prunedata->partrelprunedata[0];
find_matching_subplans_recurse(prunedata, pprune, initial_prune,
- &result);
+ &result, scan_leafpart_rtis);
/* Expression eval may have used space in ExprContext too */
if (pprune->exec_pruning_steps)
@@ -2163,6 +2317,8 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
/* Copy result out of the temp context before we reset it */
result = bms_copy(result);
+ if (scan_leafpart_rtis)
+ *scan_leafpart_rtis = bms_copy(*scan_leafpart_rtis);
MemoryContextReset(prunestate->prune_context);
@@ -2173,13 +2329,15 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
* find_matching_subplans_recurse
* Recursive worker function for ExecFindMatchingSubPlans
*
- * Adds valid (non-prunable) subplan IDs to *validsubplans
+ * Adds valid (non-prunable) subplan IDs to *validsubplans and RT indexes of
+ * the corresponding leaf partitions to *scan_leafpart_rtis (if asked for).
*/
static void
find_matching_subplans_recurse(PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
- Bitmapset **validsubplans)
+ Bitmapset **validsubplans,
+ Bitmapset **scan_leafpart_rtis)
{
Bitmapset *partset;
int i;
@@ -2206,8 +2364,13 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
while ((i = bms_next_member(partset, i)) >= 0)
{
if (pprune->subplan_map[i] >= 0)
+ {
*validsubplans = bms_add_member(*validsubplans,
pprune->subplan_map[i]);
+ if (scan_leafpart_rtis && pprune->rti_map[i] > 0)
+ *scan_leafpart_rtis = bms_add_member(*scan_leafpart_rtis,
+ pprune->rti_map[i]);
+ }
else
{
int partidx = pprune->subpart_map[i];
@@ -2215,7 +2378,8 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
if (partidx >= 0)
find_matching_subplans_recurse(prunedata,
&prunedata->partrelprunedata[partidx],
- initial_prune, validsubplans);
+ initial_prune, validsubplans,
+ scan_leafpart_rtis);
else
{
/*
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 9df1f81ea8..639145abe9 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -119,6 +119,7 @@ CreateExecutorState(void)
estate->es_relations = NULL;
estate->es_rowmarks = NULL;
estate->es_plannedstmt = NULL;
+ estate->es_part_prune_result = NULL;
estate->es_junkFilter = NULL;
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index f9460ae506..a2182a6b1f 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -844,7 +844,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
else
dest = None_Receiver;
- es->qd = CreateQueryDesc(es->stmt,
+ es->qd = CreateQueryDesc(es->stmt, NULL,
fcache->src,
GetActiveSnapshot(),
InvalidSnapshot,
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 357e10a1d7..09f26658e2 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -94,6 +94,7 @@ static bool ExecAppendAsyncRequest(AppendState *node, TupleTableSlot **result);
static void ExecAppendAsyncEventWait(AppendState *node);
static void classify_matching_subplans(AppendState *node);
+
/* ----------------------------------------------------------------
* ExecInitAppend
*
@@ -134,7 +135,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
appendstate->as_begun = false;
/* If run-time partition pruning is enabled, then set that up now */
- if (node->part_prune_info != NULL)
+ if (node->part_prune_index >= 0)
{
PartitionPruneState *prunestate;
@@ -145,7 +146,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
*/
prunestate = ExecInitPartitionPruning(&appendstate->ps,
list_length(node->appendplans),
- node->part_prune_info,
+ node->part_prune_index,
&validsubplans);
appendstate->as_prune_state = prunestate;
nplans = bms_num_members(validsubplans);
@@ -155,7 +156,8 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
* subplan, we can fill as_valid_subplans immediately, preventing
* later calls to ExecFindMatchingSubPlans.
*/
- if (!prunestate->do_exec_prune && nplans > 0)
+ if (appendstate->as_prune_state == NULL ||
+ (!appendstate->as_prune_state->do_exec_prune && nplans > 0))
appendstate->as_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
}
else
@@ -577,7 +579,7 @@ choose_next_subplan_locally(AppendState *node)
}
else if (node->as_valid_subplans == NULL)
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
whichplan = -1;
}
@@ -642,7 +644,7 @@ choose_next_subplan_for_leader(AppendState *node)
if (node->as_valid_subplans == NULL)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
/*
* Mark each invalid plan as finished to allow the loop below to
@@ -717,7 +719,7 @@ choose_next_subplan_for_worker(AppendState *node)
else if (node->as_valid_subplans == NULL)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
mark_invalid_subplans_as_finished(node);
}
@@ -868,7 +870,7 @@ ExecAppendAsyncBegin(AppendState *node)
if (node->as_valid_subplans == NULL)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
classify_matching_subplans(node);
}
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index ecf9052e03..7708cfffda 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -82,7 +82,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
mergestate->ps.ExecProcNode = ExecMergeAppend;
/* If run-time partition pruning is enabled, then set that up now */
- if (node->part_prune_info != NULL)
+ if (node->part_prune_index >= 0)
{
PartitionPruneState *prunestate;
@@ -93,7 +93,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
*/
prunestate = ExecInitPartitionPruning(&mergestate->ps,
list_length(node->mergeplans),
- node->part_prune_info,
+ node->part_prune_index,
&validsubplans);
mergestate->ms_prune_state = prunestate;
nplans = bms_num_members(validsubplans);
@@ -103,7 +103,8 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
* subplan, we can fill as_valid_subplans immediately, preventing
* later calls to ExecFindMatchingSubPlans.
*/
- if (!prunestate->do_exec_prune && nplans > 0)
+ if (mergestate->ms_prune_state == NULL ||
+ (!mergestate->ms_prune_state->do_exec_prune && nplans > 0))
mergestate->ms_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
}
else
@@ -218,7 +219,7 @@ ExecMergeAppend(PlanState *pstate)
*/
if (node->ms_valid_subplans == NULL)
node->ms_valid_subplans =
- ExecFindMatchingSubPlans(node->ms_prune_state, false);
+ ExecFindMatchingSubPlans(node->ms_prune_state, false, NULL);
/*
* First time through: pull the first tuple from each valid subplan,
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 042a5f8b0a..05db2e9de1 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -2473,7 +2473,9 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
{
CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
List *stmt_list;
- ListCell *lc2;
+ List *part_prune_result_list;
+ ListCell *lc2,
+ *lc3;
spicallbackarg.query = plansource->query_string;
@@ -2552,6 +2554,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
plan_owner, _SPI_current->queryEnv);
stmt_list = cplan->stmt_list;
+ part_prune_result_list = cplan->part_prune_result_list;
/*
* If we weren't given a specific snapshot to use, and the statement
@@ -2589,9 +2592,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
}
}
- foreach(lc2, stmt_list)
+ forboth(lc2, stmt_list, lc3, part_prune_result_list)
{
PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+ PartitionPruneResult *part_prune_result = lfirst_node(PartitionPruneResult, lc3);
bool canSetTag = stmt->canSetTag;
DestReceiver *dest;
@@ -2663,7 +2667,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
else
snap = InvalidSnapshot;
- qdesc = CreateQueryDesc(stmt,
+ qdesc = CreateQueryDesc(stmt, part_prune_result,
plansource->query_string,
snap, crosscheck_snapshot,
dest,
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 46a1943d97..c5c70593de 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -96,7 +96,10 @@ _copyPlannedStmt(const PlannedStmt *from)
COPY_SCALAR_FIELD(parallelModeNeeded);
COPY_SCALAR_FIELD(jitFlags);
COPY_NODE_FIELD(planTree);
+ COPY_NODE_FIELD(partPruneInfos);
+ COPY_SCALAR_FIELD(containsInitialPruning);
COPY_NODE_FIELD(rtable);
+ COPY_BITMAPSET_FIELD(minLockRelids);
COPY_NODE_FIELD(resultRelations);
COPY_NODE_FIELD(appendRelations);
COPY_NODE_FIELD(subplans);
@@ -253,7 +256,7 @@ _copyAppend(const Append *from)
COPY_NODE_FIELD(appendplans);
COPY_SCALAR_FIELD(nasyncplans);
COPY_SCALAR_FIELD(first_partial_plan);
- COPY_NODE_FIELD(part_prune_info);
+ COPY_SCALAR_FIELD(part_prune_index);
return newnode;
}
@@ -281,7 +284,7 @@ _copyMergeAppend(const MergeAppend *from)
COPY_POINTER_FIELD(sortOperators, from->numCols * sizeof(Oid));
COPY_POINTER_FIELD(collations, from->numCols * sizeof(Oid));
COPY_POINTER_FIELD(nullsFirst, from->numCols * sizeof(bool));
- COPY_NODE_FIELD(part_prune_info);
+ COPY_SCALAR_FIELD(part_prune_index);
return newnode;
}
@@ -1280,6 +1283,8 @@ _copyPartitionPruneInfo(const PartitionPruneInfo *from)
PartitionPruneInfo *newnode = makeNode(PartitionPruneInfo);
COPY_NODE_FIELD(prune_infos);
+ COPY_SCALAR_FIELD(needs_init_pruning);
+ COPY_SCALAR_FIELD(needs_exec_pruning);
COPY_BITMAPSET_FIELD(other_subplans);
return newnode;
@@ -1296,6 +1301,7 @@ _copyPartitionedRelPruneInfo(const PartitionedRelPruneInfo *from)
COPY_POINTER_FIELD(subplan_map, from->nparts * sizeof(int));
COPY_POINTER_FIELD(subpart_map, from->nparts * sizeof(int));
COPY_POINTER_FIELD(relid_map, from->nparts * sizeof(Oid));
+ COPY_POINTER_FIELD(rti_map, from->nparts * sizeof(Index));
COPY_NODE_FIELD(initial_pruning_steps);
COPY_NODE_FIELD(exec_pruning_steps);
COPY_BITMAPSET_FIELD(execparamids);
@@ -5469,6 +5475,21 @@ _copyExtensibleNode(const ExtensibleNode *from)
return newnode;
}
+/* ****************************************************************
+ * execnodes.h copy functions
+ * ****************************************************************
+ */
+static PartitionPruneResult *
+_copyPartitionPruneResult(const PartitionPruneResult *from)
+{
+ PartitionPruneResult *newnode = makeNode(PartitionPruneResult);
+
+ COPY_BITMAPSET_FIELD(scan_leafpart_rtis);
+ COPY_NODE_FIELD(valid_subplan_offs_list);
+
+ return newnode;
+}
+
/* ****************************************************************
* value.h copy functions
* ****************************************************************
@@ -5523,7 +5544,6 @@ _copyBitString(const BitString *from)
return newnode;
}
-
static ForeignKeyCacheInfo *
_copyForeignKeyCacheInfo(const ForeignKeyCacheInfo *from)
{
@@ -6565,6 +6585,13 @@ copyObjectImpl(const void *from)
retval = _copyPublicationTable(from);
break;
+ /*
+ * EXECUTION NODES
+ */
+ case T_PartitionPruneResult:
+ retval = _copyPartitionPruneResult(from);
+ break;
+
/*
* MISCELLANEOUS NODES
*/
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 13e1643530..ca54022fee 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -314,7 +314,10 @@ _outPlannedStmt(StringInfo str, const PlannedStmt *node)
WRITE_BOOL_FIELD(parallelModeNeeded);
WRITE_INT_FIELD(jitFlags);
WRITE_NODE_FIELD(planTree);
+ WRITE_NODE_FIELD(partPruneInfos);
+ WRITE_BOOL_FIELD(containsInitialPruning);
WRITE_NODE_FIELD(rtable);
+ WRITE_BITMAPSET_FIELD(minLockRelids);
WRITE_NODE_FIELD(resultRelations);
WRITE_NODE_FIELD(appendRelations);
WRITE_NODE_FIELD(subplans);
@@ -443,7 +446,7 @@ _outAppend(StringInfo str, const Append *node)
WRITE_NODE_FIELD(appendplans);
WRITE_INT_FIELD(nasyncplans);
WRITE_INT_FIELD(first_partial_plan);
- WRITE_NODE_FIELD(part_prune_info);
+ WRITE_INT_FIELD(part_prune_index);
}
static void
@@ -460,7 +463,7 @@ _outMergeAppend(StringInfo str, const MergeAppend *node)
WRITE_OID_ARRAY(sortOperators, node->numCols);
WRITE_OID_ARRAY(collations, node->numCols);
WRITE_BOOL_ARRAY(nullsFirst, node->numCols);
- WRITE_NODE_FIELD(part_prune_info);
+ WRITE_INT_FIELD(part_prune_index);
}
static void
@@ -1006,6 +1009,8 @@ _outPartitionPruneInfo(StringInfo str, const PartitionPruneInfo *node)
WRITE_NODE_TYPE("PARTITIONPRUNEINFO");
WRITE_NODE_FIELD(prune_infos);
+ WRITE_BOOL_FIELD(needs_init_pruning);
+ WRITE_BOOL_FIELD(needs_exec_pruning);
WRITE_BITMAPSET_FIELD(other_subplans);
}
@@ -1020,6 +1025,7 @@ _outPartitionedRelPruneInfo(StringInfo str, const PartitionedRelPruneInfo *node)
WRITE_INT_ARRAY(subplan_map, node->nparts);
WRITE_INT_ARRAY(subpart_map, node->nparts);
WRITE_OID_ARRAY(relid_map, node->nparts);
+ WRITE_INDEX_ARRAY(rti_map, node->nparts);
WRITE_NODE_FIELD(initial_pruning_steps);
WRITE_NODE_FIELD(exec_pruning_steps);
WRITE_BITMAPSET_FIELD(execparamids);
@@ -2420,6 +2426,9 @@ _outPlannerGlobal(StringInfo str, const PlannerGlobal *node)
WRITE_NODE_FIELD(finalrowmarks);
WRITE_NODE_FIELD(resultRelations);
WRITE_NODE_FIELD(appendRelations);
+ WRITE_NODE_FIELD(partPruneInfos);
+ WRITE_BOOL_FIELD(containsInitialPruning);
+ WRITE_BITMAPSET_FIELD(minLockRelids);
WRITE_NODE_FIELD(relationOids);
WRITE_NODE_FIELD(invalItems);
WRITE_NODE_FIELD(paramExecTypes);
@@ -2487,6 +2496,7 @@ _outPlannerInfo(StringInfo str, const PlannerInfo *node)
WRITE_BITMAPSET_FIELD(curOuterRels);
WRITE_NODE_FIELD(curOuterParams);
WRITE_BOOL_FIELD(partColsUpdated);
+ WRITE_NODE_FIELD(partPruneInfos);
}
static void
@@ -2840,6 +2850,21 @@ _outExtensibleNode(StringInfo str, const ExtensibleNode *node)
methods->nodeOut(str, node);
}
+/*****************************************************************************
+ *
+ * Stuff from execnodes.h
+ *
+ *****************************************************************************/
+
+static void
+_outPartitionPruneResult(StringInfo str, const PartitionPruneResult *node)
+{
+ WRITE_NODE_TYPE("PARTITIONPRUNERESULT");
+
+ WRITE_BITMAPSET_FIELD(scan_leafpart_rtis);
+ WRITE_NODE_FIELD(valid_subplan_offs_list);
+}
+
/*****************************************************************************
*
* Stuff from parsenodes.h.
@@ -4748,6 +4773,13 @@ outNode(StringInfo str, const void *obj)
_outJsonTableSibling(str, obj);
break;
+ /*
+ * EXECUTION NODES
+ */
+ case T_PartitionPruneResult:
+ _outPartitionPruneResult(str, obj);
+ break;
+
default:
/*
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 48f7216c9e..acce5e29cc 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -164,6 +164,11 @@
token = pg_strtok(&length); /* skip :fldname */ \
local_node->fldname = readIntCols(len)
+/* Read an Index array */
+#define READ_INDEX_ARRAY(fldname, len) \
+ token = pg_strtok(&length); /* skip :fldname */ \
+ local_node->fldname = readIndexCols(len)
+
/* Read a bool array */
#define READ_BOOL_ARRAY(fldname, len) \
token = pg_strtok(&length); /* skip :fldname */ \
@@ -1814,7 +1819,10 @@ _readPlannedStmt(void)
READ_BOOL_FIELD(parallelModeNeeded);
READ_INT_FIELD(jitFlags);
READ_NODE_FIELD(planTree);
+ READ_NODE_FIELD(partPruneInfos);
+ READ_BOOL_FIELD(containsInitialPruning);
READ_NODE_FIELD(rtable);
+ READ_BITMAPSET_FIELD(minLockRelids);
READ_NODE_FIELD(resultRelations);
READ_NODE_FIELD(appendRelations);
READ_NODE_FIELD(subplans);
@@ -1946,7 +1954,7 @@ _readAppend(void)
READ_NODE_FIELD(appendplans);
READ_INT_FIELD(nasyncplans);
READ_INT_FIELD(first_partial_plan);
- READ_NODE_FIELD(part_prune_info);
+ READ_INT_FIELD(part_prune_index);
READ_DONE();
}
@@ -1968,7 +1976,7 @@ _readMergeAppend(void)
READ_OID_ARRAY(sortOperators, local_node->numCols);
READ_OID_ARRAY(collations, local_node->numCols);
READ_BOOL_ARRAY(nullsFirst, local_node->numCols);
- READ_NODE_FIELD(part_prune_info);
+ READ_INT_FIELD(part_prune_index);
READ_DONE();
}
@@ -2763,6 +2771,8 @@ _readPartitionPruneInfo(void)
READ_LOCALS(PartitionPruneInfo);
READ_NODE_FIELD(prune_infos);
+ READ_BOOL_FIELD(needs_init_pruning);
+ READ_BOOL_FIELD(needs_exec_pruning);
READ_BITMAPSET_FIELD(other_subplans);
READ_DONE();
@@ -2779,6 +2789,7 @@ _readPartitionedRelPruneInfo(void)
READ_INT_ARRAY(subplan_map, local_node->nparts);
READ_INT_ARRAY(subpart_map, local_node->nparts);
READ_OID_ARRAY(relid_map, local_node->nparts);
+ READ_INDEX_ARRAY(rti_map, local_node->nparts);
READ_NODE_FIELD(initial_pruning_steps);
READ_NODE_FIELD(exec_pruning_steps);
READ_BITMAPSET_FIELD(execparamids);
@@ -2932,6 +2943,21 @@ _readPartitionRangeDatum(void)
READ_DONE();
}
+
+/*
+ * _readPartitionPruneResult
+ */
+static PartitionPruneResult *
+_readPartitionPruneResult(void)
+{
+ READ_LOCALS(PartitionPruneResult);
+
+ READ_BITMAPSET_FIELD(scan_leafpart_rtis);
+ READ_NODE_FIELD(valid_subplan_offs_list);
+
+ READ_DONE();
+}
+
/*
* parseNodeString
*
@@ -3229,6 +3255,8 @@ parseNodeString(void)
return_value = _readJsonTableParent();
else if (MATCH("JSONTABSNODE", 12))
return_value = _readJsonTableSibling();
+ else if (MATCH("PARTITIONPRUNERESULT", 20))
+ return_value = _readPartitionPruneResult();
else
{
elog(ERROR, "badly formatted node string \"%.32s\"...", token);
@@ -3372,6 +3400,30 @@ readIntCols(int numCols)
return int_vals;
}
+/*
+ * readIndexCols
+ */
+Index *
+readIndexCols(int numCols)
+{
+ int tokenLength,
+ i;
+ const char *token;
+ Index *index_vals;
+
+ if (numCols <= 0)
+ return NULL;
+
+ index_vals = (Index *) palloc(numCols * sizeof(Index));
+ for (i = 0; i < numCols; i++)
+ {
+ token = pg_strtok(&tokenLength);
+ index_vals[i] = atoui(token);
+ }
+
+ return index_vals;
+}
+
/*
* readBoolCols
*/
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index 51591bb812..453f720759 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -1366,7 +1366,15 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
plan->appendplans = subplans;
plan->nasyncplans = nasyncplans;
plan->first_partial_plan = best_path->first_partial_path;
- plan->part_prune_info = partpruneinfo;
+
+ if (partpruneinfo)
+ {
+ root->partPruneInfos = lappend(root->partPruneInfos, partpruneinfo);
+ /* Will be updated later in set_plan_references(). */
+ plan->part_prune_index = list_length(root->partPruneInfos) - 1;
+ }
+ else
+ plan->part_prune_index = -1;
copy_generic_path_info(&plan->plan, (Path *) best_path);
@@ -1528,7 +1536,15 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
}
node->mergeplans = subplans;
- node->part_prune_info = partpruneinfo;
+
+ if (partpruneinfo)
+ {
+ root->partPruneInfos = lappend(root->partPruneInfos, partpruneinfo);
+ /* Will be updated later in set_plan_references(). */
+ node->part_prune_index = list_length(root->partPruneInfos) - 1;
+ }
+ else
+ node->part_prune_index = -1;
/*
* If prepare_sort_from_pathkeys added sort columns, but we were told to
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index b2569c5d0c..2aa051d862 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -518,7 +518,10 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
result->dependsOnRole = glob->dependsOnRole;
result->parallelModeNeeded = glob->parallelModeNeeded;
result->planTree = top_plan;
+ result->partPruneInfos = glob->partPruneInfos;
+ result->containsInitialPruning = glob->containsInitialPruning;
result->rtable = glob->finalrtable;
+ result->minLockRelids = glob->minLockRelids;
result->resultRelations = glob->resultRelations;
result->appendRelations = glob->appendRelations;
result->subplans = glob->subplans;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 7519723081..fc66986e1c 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -251,7 +251,7 @@ set_plan_references(PlannerInfo *root, Plan *plan)
Plan *result;
PlannerGlobal *glob = root->glob;
int rtoffset = list_length(glob->finalrtable);
- ListCell *lc;
+ ListCell *lc;
/*
* Add all the query's RTEs to the flattened rangetable. The live ones
@@ -260,6 +260,16 @@ set_plan_references(PlannerInfo *root, Plan *plan)
*/
add_rtes_to_flat_rtable(root, false);
+ /*
+ * Add the query's adjusted range of RT indexes to glob->minLockRelids.
+ * The adjusted RT indexes of prunable relations will be deleted from the
+ * set below where PartitionPruneInfos are processed.
+ */
+ glob->minLockRelids =
+ bms_add_range(glob->minLockRelids,
+ rtoffset + 1,
+ rtoffset + list_length(root->parse->rtable));
+
/*
* Adjust RT indexes of PlanRowMarks and add to final rowmarks list
*/
@@ -338,6 +348,56 @@ set_plan_references(PlannerInfo *root, Plan *plan)
}
}
+ /* Also fix up the information in PartitionPruneInfos. */
+ foreach(lc, root->partPruneInfos)
+ {
+ PartitionPruneInfo *pruneinfo = lfirst(lc);
+ Bitmapset *leafpart_rtis = NULL;
+ ListCell *l;
+
+ foreach(l, pruneinfo->prune_infos)
+ {
+ List *prune_infos = lfirst(l);
+ ListCell *l2;
+
+ foreach(l2, prune_infos)
+ {
+ PartitionedRelPruneInfo *pinfo = lfirst(l2);
+ int i;
+
+ /* RT index of the partitioned table. */
+ pinfo->rtindex += rtoffset;
+
+ /* And also those of the leaf partitions. */
+ for (i = 0; i < pinfo->nparts; i++)
+ {
+ if (pinfo->rti_map[i] > 0)
+ {
+ pinfo->rti_map[i] += rtoffset;
+ leafpart_rtis = bms_add_member(leafpart_rtis,
+ pinfo->rti_map[i]);
+ }
+ }
+ }
+ }
+
+ if (pruneinfo->needs_init_pruning)
+ {
+ glob->containsInitialPruning = true;
+
+ /*
+ * Delete the leaf partition RTIs from the global set of relations
+ * to be locked before executing the plan. AcquireExecutorLocks()
+ * will find the ones to add to the set after performing initial
+ * pruning.
+ */
+ glob->minLockRelids = bms_del_members(glob->minLockRelids,
+ leafpart_rtis);
+ }
+
+ glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
+ }
+
return result;
}
@@ -1610,21 +1670,12 @@ set_append_references(PlannerInfo *root,
aplan->apprelids = offset_relid_set(aplan->apprelids, rtoffset);
- if (aplan->part_prune_info)
- {
- foreach(l, aplan->part_prune_info->prune_infos)
- {
- List *prune_infos = lfirst(l);
- ListCell *l2;
-
- foreach(l2, prune_infos)
- {
- PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
- pinfo->rtindex += rtoffset;
- }
- }
- }
+ /*
+ * PartitionPruneInfos will be added to a list in PlannerGlobal, so update
+ * the index.
+ */
+ if (aplan->part_prune_index >= 0)
+ aplan->part_prune_index += list_length(root->glob->partPruneInfos);
/* We don't need to recurse to lefttree or righttree ... */
Assert(aplan->plan.lefttree == NULL);
@@ -1682,21 +1733,12 @@ set_mergeappend_references(PlannerInfo *root,
mplan->apprelids = offset_relid_set(mplan->apprelids, rtoffset);
- if (mplan->part_prune_info)
- {
- foreach(l, mplan->part_prune_info->prune_infos)
- {
- List *prune_infos = lfirst(l);
- ListCell *l2;
-
- foreach(l2, prune_infos)
- {
- PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
- pinfo->rtindex += rtoffset;
- }
- }
- }
+ /*
+ * PartitionPruneInfos will be added to a list in PlannerGlobal, so update
+ * the index.
+ */
+ if (mplan->part_prune_index >= 0)
+ mplan->part_prune_index += list_length(root->glob->partPruneInfos);
/* We don't need to recurse to lefttree or righttree ... */
Assert(mplan->plan.lefttree == NULL);
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 9d3c05aed3..0eaff15ed0 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -144,7 +144,9 @@ static List *make_partitionedrel_pruneinfo(PlannerInfo *root,
List *prunequal,
Bitmapset *partrelids,
int *relid_subplan_map,
- Bitmapset **matchedsubplans);
+ Bitmapset **matchedsubplans,
+ bool *needs_init_pruning,
+ bool *needs_exec_pruning);
static void gen_partprune_steps(RelOptInfo *rel, List *clauses,
PartClauseTarget target,
GeneratePruningStepsContext *context);
@@ -230,6 +232,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int *relid_subplan_map;
ListCell *lc;
int i;
+ bool needs_init_pruning = false;
+ bool needs_exec_pruning = false;
/*
* Scan the subpaths to see which ones are scans of partition child
@@ -309,12 +313,16 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
Bitmapset *partrelids = (Bitmapset *) lfirst(lc);
List *pinfolist;
Bitmapset *matchedsubplans = NULL;
+ bool partrel_needs_init_pruning;
+ bool partrel_needs_exec_pruning;
pinfolist = make_partitionedrel_pruneinfo(root, parentrel,
prunequal,
partrelids,
relid_subplan_map,
- &matchedsubplans);
+ &matchedsubplans,
+ &partrel_needs_init_pruning,
+ &partrel_needs_exec_pruning);
/* When pruning is possible, record the matched subplans */
if (pinfolist != NIL)
@@ -323,6 +331,10 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
allmatchedsubplans = bms_join(matchedsubplans,
allmatchedsubplans);
}
+ if (!needs_init_pruning)
+ needs_init_pruning = partrel_needs_init_pruning;
+ if (!needs_exec_pruning)
+ needs_exec_pruning = partrel_needs_exec_pruning;
}
pfree(relid_subplan_map);
@@ -337,6 +349,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
/* Else build the result data structure */
pruneinfo = makeNode(PartitionPruneInfo);
pruneinfo->prune_infos = prunerelinfos;
+ pruneinfo->needs_init_pruning = needs_init_pruning;
+ pruneinfo->needs_exec_pruning = needs_exec_pruning;
/*
* Some subplans may not belong to any of the identified partitioned rels.
@@ -435,13 +449,18 @@ add_part_relids(List *allpartrelids, Bitmapset *partrelids)
* If we cannot find any useful run-time pruning steps, return NIL.
* However, on success, each rel identified in partrelids will have
* an element in the result list, even if some of them are useless.
+ * *needs_init_pruning and *needs_exec_pruning are set to indicate whether the
+ * returned PartitionedRelPruneInfos contain pruning steps that can be
+ * performed before and after execution begins, respectively.
*/
static List *
make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
List *prunequal,
Bitmapset *partrelids,
int *relid_subplan_map,
- Bitmapset **matchedsubplans)
+ Bitmapset **matchedsubplans,
+ bool *needs_init_pruning,
+ bool *needs_exec_pruning)
{
RelOptInfo *targetpart = NULL;
List *pinfolist = NIL;
@@ -452,6 +471,10 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int rti;
int i;
+ /* Will find out below. */
+ *needs_init_pruning = false;
+ *needs_exec_pruning = false;
+
/*
* Examine each partitioned rel, constructing a temporary array to map
* from planner relids to index of the partitioned rel, and building a
@@ -539,6 +562,9 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
* executor per-scan pruning steps. This first pass creates startup
* pruning steps and detects whether there's any possibly-useful quals
* that would require per-scan pruning.
+ *
+ * In the first pass, we note whether the 2nd pass is necessary by
+ * noting the presence of EXEC parameters.
*/
gen_partprune_steps(subpart, partprunequal, PARTTARGET_INITIAL,
&context);
@@ -613,6 +639,11 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
pinfo->execparamids = execparamids;
/* Remaining fields will be filled in the next loop */
+ if (!*needs_init_pruning)
+ *needs_init_pruning = (initial_pruning_steps != NIL);
+ if (!*needs_exec_pruning)
+ *needs_exec_pruning = (exec_pruning_steps != NIL);
+
pinfolist = lappend(pinfolist, pinfo);
}
@@ -640,6 +671,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int *subplan_map;
int *subpart_map;
Oid *relid_map;
+ Index *rti_map;
/*
* Construct the subplan and subpart maps for this partitioning level.
@@ -652,6 +684,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
subpart_map = (int *) palloc(nparts * sizeof(int));
memset(subpart_map, -1, nparts * sizeof(int));
relid_map = (Oid *) palloc0(nparts * sizeof(Oid));
+ rti_map = (Index *) palloc0(nparts * sizeof(Index));
present_parts = NULL;
i = -1;
@@ -666,6 +699,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
subplan_map[i] = subplanidx = relid_subplan_map[partrel->relid] - 1;
subpart_map[i] = subpartidx = relid_subpart_map[partrel->relid] - 1;
relid_map[i] = planner_rt_fetch(partrel->relid, root)->relid;
+ rti_map[i] = partrel->relid;
if (subplanidx >= 0)
{
present_parts = bms_add_member(present_parts, i);
@@ -690,6 +724,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
pinfo->subplan_map = subplan_map;
pinfo->subpart_map = subpart_map;
pinfo->relid_map = relid_map;
+ pinfo->rti_map = rti_map;
}
pfree(relid_subpart_map);
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 5aa5a350f3..163ba956c4 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -35,7 +35,7 @@
Portal ActivePortal = NULL;
-static void ProcessQuery(PlannedStmt *plan,
+static void ProcessQuery(PlannedStmt *plan, PartitionPruneResult *part_prune_result,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -65,6 +65,7 @@ static void DoPortalRewind(Portal portal);
*/
QueryDesc *
CreateQueryDesc(PlannedStmt *plannedstmt,
+ PartitionPruneResult *part_prune_result,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
@@ -77,6 +78,8 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
qd->operation = plannedstmt->commandType; /* operation */
qd->plannedstmt = plannedstmt; /* plan */
+ qd->part_prune_result = part_prune_result; /* ExecutorDoInitialPruning()
+ * output for plan */
qd->sourceText = sourceText; /* query text */
qd->snapshot = RegisterSnapshot(snapshot); /* snapshot */
/* RI check snapshot */
@@ -122,6 +125,7 @@ FreeQueryDesc(QueryDesc *qdesc)
* PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
*
* plan: the plan tree for the query
+ * part_prune_result: ExecutorDoInitialPruning() output for the plan tree
* sourceText: the source text of the query
* params: any parameters needed
* dest: where to send results
@@ -134,6 +138,7 @@ FreeQueryDesc(QueryDesc *qdesc)
*/
static void
ProcessQuery(PlannedStmt *plan,
+ PartitionPruneResult *part_prune_result,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -145,7 +150,7 @@ ProcessQuery(PlannedStmt *plan,
/*
* Create the QueryDesc object
*/
- queryDesc = CreateQueryDesc(plan, sourceText,
+ queryDesc = CreateQueryDesc(plan, part_prune_result, sourceText,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
@@ -491,8 +496,14 @@ PortalStart(Portal portal, ParamListInfo params,
/*
* Create QueryDesc in portal's context; for the moment, set
* the destination to DestNone.
+ *
+ * There is no PartitionPruneResult unless the PlannedStmt is
+ * from a CachedPlan.
*/
queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+ portal->cplan == NULL ? NULL :
+ linitial_node(PartitionPruneResult,
+ portal->cplan->part_prune_result_list),
portal->sourceText,
GetActiveSnapshot(),
InvalidSnapshot,
@@ -1194,6 +1205,9 @@ PortalRunMulti(Portal portal,
{
bool active_snapshot_set = false;
ListCell *stmtlist_item;
+ int i;
+ List *part_prune_results = portal->cplan == NULL ? NIL :
+ portal->cplan->part_prune_result_list;
/*
* If the destination is DestRemoteExecute, change to DestNone. The
@@ -1214,9 +1228,15 @@ PortalRunMulti(Portal portal,
* Loop to handle the individual queries generated from a single parsetree
* by analysis and rewrite.
*/
+ i = 0;
foreach(stmtlist_item, portal->stmts)
{
PlannedStmt *pstmt = lfirst_node(PlannedStmt, stmtlist_item);
+ PartitionPruneResult *part_prune_result = part_prune_results ?
+ list_nth(part_prune_results, i) :
+ NULL;
+
+ i++;
/*
* If we got a cancel signal in prior command, quit
@@ -1274,7 +1294,7 @@ PortalRunMulti(Portal portal,
if (pstmt->canSetTag)
{
/* statement can set tag string */
- ProcessQuery(pstmt,
+ ProcessQuery(pstmt, part_prune_result,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
@@ -1283,7 +1303,7 @@ PortalRunMulti(Portal portal,
else
{
/* stmt added by rewrite cannot set tag */
- ProcessQuery(pstmt,
+ ProcessQuery(pstmt, part_prune_result,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 4cf6db504f..216401bcfb 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -99,14 +99,16 @@ static dlist_head cached_expression_list = DLIST_STATIC_INIT(cached_expression_l
static void ReleaseGenericPlan(CachedPlanSource *plansource);
static List *RevalidateCachedQuery(CachedPlanSource *plansource,
QueryEnvironment *queryEnv);
-static bool CheckCachedPlan(CachedPlanSource *plansource);
+static bool CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams);
static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
ParamListInfo boundParams, QueryEnvironment *queryEnv);
+static void CachedPlanSavePartitionPruneResults(CachedPlan *plan, List *part_prune_result_list);
static bool choose_custom_plan(CachedPlanSource *plansource,
ParamListInfo boundParams);
static double cached_plan_cost(CachedPlan *plan, bool include_planner);
static Query *QueryListGetPrimaryStmt(List *stmts);
-static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static List *AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams);
+static void ReleaseExecutorLocks(List *stmt_list, List *part_prune_result_list);
static void AcquirePlannerLocks(List *stmt_list, bool acquire);
static void ScanQueryForLocks(Query *parsetree, bool acquire);
static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -790,9 +792,21 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
*
* On a "true" return, we have acquired the locks needed to run the plan.
* (We must do this for the "true" result to be race-condition-free.)
+ *
+ * If the CachedPlan is valid, this may in some cases call
+ * ExecutorDoInitialPruning() on each PlannedStmt contained in it to determine
+ * the set of relations to be locked by AcquireExecutorLocks(), instead of just
+ * scanning its range table. That way, relations scanned only by plan nodes
+ * that initial partition pruning removes need not be locked. The results of
+ * pruning, one PartitionPruneResult per PlannedStmt, are saved in
+ * plan->part_prune_result_list, allocated in a child context of the context
+ * containing the plan itself. The previous contents of the list from the
+ * last invocation on the same CachedPlan are deleted, because they would no
+ * longer be valid given the fresh set of parameter values, which may be used
+ * as pruning parameters.
*/
static bool
-CheckCachedPlan(CachedPlanSource *plansource)
+CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams)
{
CachedPlan *plan = plansource->gplan;
@@ -820,13 +834,24 @@ CheckCachedPlan(CachedPlanSource *plansource)
*/
if (plan->is_valid)
{
+ List *part_prune_result_list;
+
/*
* Plan must have positive refcount because it is referenced by
* plansource; so no need to fear it disappears under us here.
*/
Assert(plan->refcount > 0);
- AcquireExecutorLocks(plan->stmt_list, true);
+ /*
+ * Lock relations scanned by the plan. If ExecutorDoInitialPruning()
+ * asked to omit some relations because the plan nodes that scan them
+ * were found to be pruned, the executor will be informed of the
+ * omission of the plan nodes themselves via part_prune_result_list
+ * that is passed to it along with the list of PlannedStmts, so that
+ * it doesn't accidentally try to execute those nodes.
+ */
+ part_prune_result_list = AcquireExecutorLocks(plan->stmt_list,
+ boundParams);
/*
* If plan was transient, check to see if TransactionXmin has
@@ -844,11 +869,14 @@ CheckCachedPlan(CachedPlanSource *plansource)
if (plan->is_valid)
{
/* Successfully revalidated and locked the query. */
+
+ /* Remember pruning results in the CachedPlan. */
+ CachedPlanSavePartitionPruneResults(plan, part_prune_result_list);
return true;
}
/* Oops, the race case happened. Release useless locks. */
- AcquireExecutorLocks(plan->stmt_list, false);
+ ReleaseExecutorLocks(plan->stmt_list, part_prune_result_list);
}
/*
@@ -880,10 +908,12 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
ParamListInfo boundParams, QueryEnvironment *queryEnv)
{
CachedPlan *plan;
- List *plist;
+ List *plist,
+ *dummy_part_prune_result_list;
bool snapshot_set;
bool is_transient;
- MemoryContext plan_context;
+ MemoryContext plan_context,
+ part_prune_result_context;
MemoryContext oldcxt = CurrentMemoryContext;
ListCell *lc;
@@ -962,6 +992,16 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
else
plan_context = CurrentMemoryContext;
+ /*
+ * Also create a dedicated context for part_prune_result_list, making it
+ * a child of plan_context.
+ */
+ part_prune_result_context = AllocSetContextCreate(CurrentMemoryContext,
+ "CachedPlan part_prune_results list",
+ ALLOCSET_START_SMALL_SIZES);
+ MemoryContextSetParent(part_prune_result_context, plan_context);
+ MemoryContextSetIdentifier(part_prune_result_context, plan_context->ident);
+
/*
* Create and fill the CachedPlan struct within the new context.
*/
@@ -977,10 +1017,20 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
plan->planRoleId = GetUserId();
plan->dependsOnRole = plansource->dependsOnRLS;
is_transient = false;
+ dummy_part_prune_result_list = NIL;
foreach(lc, plist)
{
PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc);
+ /*
+ * Real values will be added during subsequent CheckCachedPlan() calls
+ * on this plan, but we must add "something" for now, because users of
+ * CachedPlan expect stmt_list and part_prune_result_list to have
+ * the same number of elements.
+ */
+ dummy_part_prune_result_list = lappend(dummy_part_prune_result_list,
+ NULL);
+
if (plannedstmt->commandType == CMD_UTILITY)
continue; /* Ignore utility statements */
@@ -1002,6 +1052,13 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
plan->is_saved = false;
plan->is_valid = true;
+ /*
+ * While still dummy, save the list so that it is discarded on next use of
+ * the CachedPlan.
+ */
+ plan->part_prune_result_context = part_prune_result_context;
+ CachedPlanSavePartitionPruneResults(plan, dummy_part_prune_result_list);
+
/* assign generation number to new plan */
plan->generation = ++(plansource->generation);
@@ -1160,7 +1217,7 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
if (!customplan)
{
- if (CheckCachedPlan(plansource))
+ if (CheckCachedPlan(plansource, boundParams))
{
/* We want a generic plan, and we already have a valid one */
plan = plansource->gplan;
@@ -1586,6 +1643,36 @@ CopyCachedPlan(CachedPlanSource *plansource)
return newsource;
}
+/*
+ * CachedPlanSavePartitionPruneResults
+ * Save the list containing PartitionPruneResult nodes into the given
+ * CachedPlan
+ *
+ * They must be held on to for the duration of a given execution of the
+ * CachedPlan. The provided list is copied into a dedicated context that is
+ * a child of plan->context after dropping the existing contents of the list,
+ * because any PartitionPruneResult contained therein would no longer be
+ * valid for the current execution.
+ */
+static void
+CachedPlanSavePartitionPruneResults(CachedPlan *plan,
+ List *part_prune_result_list)
+{
+ MemoryContext part_prune_result_context = plan->part_prune_result_context,
+ oldcontext = CurrentMemoryContext;
+ List *part_prune_result_list_copy;
+
+ /* First clear the existing contents of the list. */
+ Assert(MemoryContextIsValid(part_prune_result_context));
+ MemoryContextReset(part_prune_result_context);
+
+ MemoryContextSwitchTo(part_prune_result_context);
+ part_prune_result_list_copy = copyObject(part_prune_result_list);
+ MemoryContextSwitchTo(oldcontext);
+
+ plan->part_prune_result_list = part_prune_result_list_copy;
+}
+
/*
* CachedPlanIsValid: test whether the rewritten querytree within a
* CachedPlanSource is currently valid (that is, not marked as being in need
@@ -1737,17 +1824,21 @@ QueryListGetPrimaryStmt(List *stmts)
/*
* AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
- * or release them if acquire is false.
+ *
+ * Returns a list with one element for each PlannedStmt in stmt_list: either a
+ * PartitionPruneResult node, or NULL if the statement is a utility statement
+ * or its containsInitialPruning is false.
*/
-static void
-AcquireExecutorLocks(List *stmt_list, bool acquire)
+static List *
+AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams)
{
ListCell *lc1;
+ List *part_prune_result_list = NIL;
foreach(lc1, stmt_list)
{
PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
- ListCell *lc2;
+ PartitionPruneResult *part_prune_result = NULL;
if (plannedstmt->commandType == CMD_UTILITY)
{
@@ -1761,27 +1852,122 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
Query *query = UtilityContainsQuery(plannedstmt->utilityStmt);
if (query)
- ScanQueryForLocks(query, acquire);
- continue;
+ ScanQueryForLocks(query, true);
}
-
- foreach(lc2, plannedstmt->rtable)
+ else
{
- RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
-
- if (rte->rtekind != RTE_RELATION)
- continue;
+ Bitmapset *lockRelids;
+ int rti;
/*
- * Acquire the appropriate type of lock on each relation OID. Note
- * that we don't actually try to open the rel, and hence will not
- * fail if it's been dropped entirely --- we'll just transiently
- * acquire a non-conflicting lock.
+ * Figure out the set of relations that would need to be locked
+ * before executing the plan.
*/
- if (acquire)
+ if (plannedstmt->containsInitialPruning)
+ {
+ /*
+ * Obtain the set of partitions to be locked from the
+ * PartitionPruneInfos by considering the result of performing
+ * initial partition pruning.
+ */
+ part_prune_result =
+ ExecutorDoInitialPruning(plannedstmt, boundParams);
+
+ lockRelids = bms_union(plannedstmt->minLockRelids,
+ part_prune_result->scan_leafpart_rtis);
+ }
+ else
+ lockRelids = plannedstmt->minLockRelids;
+
+ rti = -1;
+ while ((rti = bms_next_member(lockRelids, rti)) >= 0)
+ {
+ RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
+
+ if (rte->rtekind != RTE_RELATION)
+ continue;
+
+ /*
+ * Acquire the appropriate type of lock on each relation OID.
+ * Note that we don't actually try to open the rel, and hence
+ * will not fail if it's been dropped entirely --- we'll just
+ * transiently acquire a non-conflicting lock.
+ */
LockRelationOid(rte->relid, rte->rellockmode);
+ }
+ }
+
+ /*
+ * Remember PartitionPruneResult for later adding to the QueryDesc that
+ * will be passed to the executor when executing this plan. May be
+ * NULL, but must keep the list the same length as stmt_list.
+ */
+ part_prune_result_list = lappend(part_prune_result_list,
+ part_prune_result);
+ }
+
+ return part_prune_result_list;
+}
+
+/*
+ * ReleaseExecutorLocks
+ * Release locks that would've been acquired by an earlier call to
+ * AcquireExecutorLocks()
+ */
+static void
+ReleaseExecutorLocks(List *stmt_list, List *part_prune_result_list)
+{
+ ListCell *lc1,
+ *lc2;
+
+ forboth(lc1, stmt_list, lc2, part_prune_result_list)
+ {
+ PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
+ PartitionPruneResult *part_prune_result = lfirst_node(PartitionPruneResult, lc2);
+
+ if (plannedstmt->commandType == CMD_UTILITY)
+ {
+ /*
+ * Ignore utility statements, except those (such as EXPLAIN) that
+ * contain a parsed-but-not-planned query. Note: it's okay to use
+ * ScanQueryForLocks, even though the query hasn't been through
+ * rule rewriting, because rewriting doesn't change the query
+ * representation.
+ */
+ Query *query = UtilityContainsQuery(plannedstmt->utilityStmt);
+
+ if (query)
+ ScanQueryForLocks(query, false);
+ }
+ else
+ {
+ Bitmapset *lockRelids;
+ int rti;
+
+ if (part_prune_result == NULL)
+ {
+ Assert(!plannedstmt->containsInitialPruning);
+ lockRelids = plannedstmt->minLockRelids;
+ }
else
+ {
+ Assert(plannedstmt->containsInitialPruning);
+ lockRelids = bms_union(plannedstmt->minLockRelids,
+ part_prune_result->scan_leafpart_rtis);
+ }
+
+ rti = -1;
+ while ((rti = bms_next_member(lockRelids, rti)) >= 0)
+ {
+ RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
+
+ if (rte->rtekind != RTE_RELATION)
+ continue;
+
+ /* See the comment in AcquireExecutorLocks(). */
UnlockRelationOid(rte->relid, rte->rellockmode);
+ }
+
}
}
}
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 666977fb1f..34975c69ee 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -87,7 +87,8 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
ExplainState *es, const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv);
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt, PartitionPruneResult *part_prune_result,
+ IntoClause *into,
ExplainState *es, const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv,
const instr_time *planduration,
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 708435e952..bd8776402e 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -45,6 +45,7 @@ extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
* nparts Length of subplan_map[] and subpart_map[].
* subplan_map Subplan index by partition index, or -1.
* subpart_map Subpart index by partition index, or -1.
+ * rti_map Range table index by partition index, or 0.
* present_parts A Bitmapset of the partition indexes that we
* have subplans or subparts for.
* initial_pruning_steps List of PartitionPruneSteps used to
@@ -61,6 +62,7 @@ typedef struct PartitionedRelPruningData
int nparts;
int *subplan_map;
int *subpart_map;
+ Index *rti_map;
Bitmapset *present_parts;
List *initial_pruning_steps;
List *exec_pruning_steps;
@@ -123,9 +125,13 @@ typedef struct PartitionPruneState
extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
int n_total_subplans,
- PartitionPruneInfo *pruneinfo,
+ int part_prune_index,
Bitmapset **initially_valid_subplans);
extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
- bool initial_prune);
-
+ bool initial_prune,
+ Bitmapset **scan_leafpart_rtis);
+extern Bitmapset *ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt,
+ ParamListInfo params,
+ PartitionPruneInfo *pruneinfo,
+ Bitmapset **scan_leafpart_rtis);
#endif /* EXECPARTITION_H */
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index e79e2c001f..60d5644908 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -35,6 +35,8 @@ typedef struct QueryDesc
/* These fields are provided by CreateQueryDesc */
CmdType operation; /* CMD_SELECT, CMD_UPDATE, etc. */
PlannedStmt *plannedstmt; /* planner's output (could be utility, too) */
+ PartitionPruneResult *part_prune_result; /* ExecutorDoInitialPruning()'s
+ * output for plannedstmt */
const char *sourceText; /* source text of the query */
Snapshot snapshot; /* snapshot to use for query */
Snapshot crosscheck_snapshot; /* crosscheck for RI update/delete */
@@ -57,6 +59,7 @@ typedef struct QueryDesc
/* in pquery.c */
extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
+ PartitionPruneResult *part_prune_result,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index 873772f188..57dc0e8077 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -185,6 +185,8 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
/*
* prototypes from functions in execMain.c
*/
+extern PartitionPruneResult *ExecutorDoInitialPruning(PlannedStmt *plannedstmt,
+ ParamListInfo params);
extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
extern void ExecutorRun(QueryDesc *queryDesc,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index cbbcff81d2..b5a7fd7e16 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -596,6 +596,8 @@ typedef struct EState
struct ExecRowMark **es_rowmarks; /* Array of per-range-table-entry
* ExecRowMarks, or NULL if none */
PlannedStmt *es_plannedstmt; /* link to top of plan tree */
+ List *es_part_prune_infos; /* PlannedStmt.partPruneInfos */
+ struct PartitionPruneResult *es_part_prune_result; /* QueryDesc.part_prune_result */
const char *es_sourceText; /* Source text from QueryDesc */
JunkFilter *es_junkFilter; /* top-level junk filter, if any */
@@ -984,6 +986,19 @@ typedef struct DomainConstraintState
*/
typedef TupleTableSlot *(*ExecProcNodeMtd) (struct PlanState *pstate);
+/*----------------
+ * PartitionPruneResult
+ *
+ * Result of ExecutorDoInitialPruning() invocation on a given plan.
+ */
+typedef struct PartitionPruneResult
+{
+ NodeTag type;
+
+ Bitmapset *scan_leafpart_rtis;
+ List *valid_subplan_offs_list;
+} PartitionPruneResult;
+
/* ----------------
* PlanState node
*
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 300824258e..de312b9215 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -97,6 +97,9 @@ typedef enum NodeTag
T_PartitionPruneStepCombine,
T_PlanInvalItem,
+ /* TAGS FOR EXECUTOR PREP NODES (execnodes.h) */
+ T_PartitionPruneResult,
+
/*
* TAGS FOR PLAN STATE NODES (execnodes.h)
*
@@ -673,6 +676,7 @@ extern struct Bitmapset *readBitmapset(void);
extern uintptr_t readDatum(bool typbyval);
extern bool *readBoolCols(int numCols);
extern int *readIntCols(int numCols);
+extern Index *readIndexCols(int numCols);
extern Oid *readOidCols(int numCols);
extern int16 *readAttrNumberCols(int numCols);
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 6cbcb67bdf..f2039071c9 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -107,6 +107,18 @@ typedef struct PlannerGlobal
List *appendRelations; /* "flat" list of AppendRelInfos */
+ List *partPruneInfos; /* List of PartitionPruneInfo contained in
+ * the plan */
+
+ bool containsInitialPruning; /* Do any of those PartitionPruneInfos
+ * have initial (pre-exec) pruning
+ * steps in them? */
+
+ Bitmapset *minLockRelids; /* RT indexes of RTE_RELATION entries that
+ * must always be locked to execute the plan;
+ * those scanned by initial-prunable plan
+ * nodes are not included */
+
List *relationOids; /* OIDs of relations the plan depends on */
List *invalItems; /* other dependencies, as PlanInvalItems */
@@ -377,6 +389,9 @@ struct PlannerInfo
/* Does this query modify any partition key columns? */
bool partColsUpdated;
+
+ /* PartitionPruneInfos added in this query's plan. */
+ List *partPruneInfos;
};
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 10dd35f011..ecdc950fde 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -64,8 +64,19 @@ typedef struct PlannedStmt
struct Plan *planTree; /* tree of Plan nodes */
+ List *partPruneInfos; /* List of PartitionPruneInfo contained in
+ * the plan */
+
+ bool containsInitialPruning; /* Do any of those PartitionPruneInfos
+ * have initial (pre-exec) pruning
+ * steps in them? */
+
List *rtable; /* list of RangeTblEntry nodes */
+ Bitmapset *minLockRelids; /* RT indexes of RTE_RELATION entries that
+ * must be locked, except those scanned by
+ * initial-prunable plan nodes */
+
/* rtable indexes of target relations for INSERT/UPDATE/DELETE */
List *resultRelations; /* integer list of RT indexes, or NIL */
@@ -262,8 +273,12 @@ typedef struct Append
*/
int first_partial_plan;
- /* Info for run-time subplan pruning; NULL if we're not doing that */
- struct PartitionPruneInfo *part_prune_info;
+ /*
+ * Index of this plan's PartitionPruneInfo in PlannedStmt.partPruneInfos
+ * to be used for run-time subplan pruning; -1 if run-time pruning is
+ * not needed.
+ */
+ int part_prune_index;
} Append;
/* ----------------
@@ -282,8 +297,13 @@ typedef struct MergeAppend
Oid *sortOperators; /* OIDs of operators to sort them by */
Oid *collations; /* OIDs of collations */
bool *nullsFirst; /* NULLS FIRST/LAST directions */
- /* Info for run-time subplan pruning; NULL if we're not doing that */
- struct PartitionPruneInfo *part_prune_info;
+
+ /*
+ * Index of this plan's PartitionPruneInfo in PlannedStmt.partPruneInfos
+ * to be used for run-time subplan pruning; -1 if run-time pruning is
+ * not needed.
+ */
+ int part_prune_index;
} MergeAppend;
/* ----------------
@@ -1187,6 +1207,13 @@ typedef struct PlanRowMark
* prune_infos List of Lists containing PartitionedRelPruneInfo nodes,
* one sublist per run-time-prunable partition hierarchy
* appearing in the parent plan node's subplans.
+ *
+ * needs_init_pruning Does any of the PartitionedRelPruneInfos in
+ * prune_infos have its initial_pruning_steps set?
+ *
+ * needs_exec_pruning Does any of the PartitionedRelPruneInfos in
+ * prune_infos have its exec_pruning_steps set?
+ *
* other_subplans Indexes of any subplans that are not accounted for
* by any of the PartitionedRelPruneInfo nodes in
* "prune_infos". These subplans must not be pruned.
@@ -1195,6 +1222,9 @@ typedef struct PartitionPruneInfo
{
NodeTag type;
List *prune_infos;
+ Bitmapset *leafpart_rtis;
+ bool needs_init_pruning;
+ bool needs_exec_pruning;
Bitmapset *other_subplans;
} PartitionPruneInfo;
@@ -1225,6 +1255,7 @@ typedef struct PartitionedRelPruneInfo
int *subplan_map; /* subplan index by partition index, or -1 */
int *subpart_map; /* subpart index by partition index, or -1 */
Oid *relid_map; /* relation OID by partition index, or 0 */
+ Index *rti_map; /* Range table index by partition index, or 0 */
/*
* initial_pruning_steps shows how to prune during executor startup (i.e.,
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 95b99e3d25..fd7f129aea 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -148,6 +148,9 @@ typedef struct CachedPlan
{
int magic; /* should equal CACHEDPLAN_MAGIC */
List *stmt_list; /* list of PlannedStmts */
+ List *part_prune_result_list; /* list of PartitionPruneResult with
+ * one element for each of stmt_list;
+ * NIL if not a generic plan */
bool is_oneshot; /* is it a "oneshot" plan? */
bool is_saved; /* is CachedPlan in a long-lived context? */
bool is_valid; /* is the stmt_list currently valid? */
@@ -158,6 +161,10 @@ typedef struct CachedPlan
int generation; /* parent's generation number for this plan */
int refcount; /* count of live references to this struct */
MemoryContext context; /* context containing this CachedPlan */
+ MemoryContext part_prune_result_context; /* context containing
+ * part_prune_result_list,
+ * a child of the above
+ * context */
} CachedPlan;
/*
--
2.24.1
* Re: generic plans and "initial" pruning
@ 2022-04-07 12:41 David Rowley <[email protected]>
parent: Amit Langote <[email protected]>
0 siblings, 1 reply; 71+ messages in thread
From: David Rowley @ 2022-04-07 12:41 UTC (permalink / raw)
To: Amit Langote <[email protected]>; +Cc: Robert Haas <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers
On Thu, 7 Apr 2022 at 20:28, Amit Langote <[email protected]> wrote:
> Here's an updated version. In particular, I removed the
> part_prune_results list from PortalData; anything that
> needs to look at the list can instead get it from the CachedPlan
> (PortalData.cplan). This makes things better in 2 ways:
Thanks for making those changes.
I'm not overly familiar with the data structures we use for passing
plans around between the planner and executor, but storing the pruning
results in CachedPlan seems pretty bad. I see you've stashed it in
there and invented a new memory context to stop leaks into the cache
memory.
Since I'm not overly familiar with these structures, I'm trying to
imagine why you made that choice and the best I can come up with was
that it was the most convenient thing you had to hand inside
CheckCachedPlan().
I don't really have any great ideas right now on how to make this
better. I wonder if GetCachedPlan() should be changed to return some
struct that wraps up the CachedPlan with some sort of executor prep
info struct that we can stash the list of PartitionPruneResults in,
and perhaps something else one day.
Some less important stuff that I think could be done better:
* Are you also able to put meaningful comments on the
PartitionPruneResult struct in execnodes.h?
* In create_append_plan() and create_merge_append_plan() you have the
same code to set the part_prune_index. Why not just move all that code
into make_partition_pruneinfo() and have make_partition_pruneinfo()
return the index and append to the PlannerInfo.partPruneInfos List?
* Why not forboth() here?
i = 0;
foreach(stmtlist_item, portal->stmts)
{
PlannedStmt *pstmt = lfirst_node(PlannedStmt, stmtlist_item);
PartitionPruneResult *part_prune_result = part_prune_results ?
list_nth(part_prune_results, i) :
NULL;
i++;
* It would be good if ReleaseExecutorLocks() already knew the RTIs
that were locked. Maybe the executor prep info struct I mentioned
above could also store the RTIs that have been locked already and
allow ReleaseExecutorLocks() to just iterate over those to release the
locks.
David
* Re: generic plans and "initial" pruning
@ 2022-04-08 05:49 Amit Langote <[email protected]>
parent: David Rowley <[email protected]>
0 siblings, 1 reply; 71+ messages in thread
From: Amit Langote @ 2022-04-08 05:49 UTC (permalink / raw)
To: David Rowley <[email protected]>; +Cc: Robert Haas <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers
On Thu, Apr 7, 2022 at 9:41 PM David Rowley <[email protected]> wrote:
> On Thu, 7 Apr 2022 at 20:28, Amit Langote <[email protected]> wrote:
> > Here's an updated version. In particular, I removed the
> > part_prune_results list from PortalData; anything that
> > needs to look at the list can instead get it from the CachedPlan
> > (PortalData.cplan). This makes things better in 2 ways:
>
> Thanks for making those changes.
>
> I'm not overly familiar with the data structures we use for passing
> plans around between the planner and executor, but storing the pruning
> results in CachedPlan seems pretty bad. I see you've stashed it in
> there and invented a new memory context to stop leaks into the cache
> memory.
>
> Since I'm not overly familiar with these structures, I'm trying to
> imagine why you made that choice and the best I can come up with was
> that it was the most convenient thing you had to hand inside
> CheckCachedPlan().
Yeah, it's that way because it felt convenient, though I have wondered
if a simpler scheme that doesn't require any changes to the CachedPlan
data structure might be better after all. Your pointing it out has
made me think a bit harder on that.
> I don't really have any great ideas right now on how to make this
> better. I wonder if GetCachedPlan() should be changed to return some
> struct that wraps up the CachedPlan with some sort of executor prep
> info struct that we can stash the list of PartitionPruneResults in,
> and perhaps something else one day.
I think what might be better to do now is just add an output List
parameter to GetCachedPlan(), to which the PartitionPruneResult nodes
are added, instead of stashing them in CachedPlan as now. IMHO, we should
leave inventing a new generic struct to the next project that will
make it necessary to return more information from GetCachedPlan() to
its users. I find it hard to convincingly describe what the new
generic struct really is if we invent it *now*, when it's going to
carry a single list whose purpose is pretty narrow.
So, I've implemented this by making the callers of GetCachedPlan()
pass a list to add the PartitionPruneResults that may be produced.
Most callers can put that into the Portal for passing that to other
modules, so I have reinstated PortalData.part_prune_results. As for
its memory management, the list and the PartitionPruneResults therein
will be allocated in a context that holds the Portal itself.
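
A minimal sketch of that calling convention, using simplified stand-in
types: ToyPlan, ToyPruneResult and toy_get_cached_plan() are hypothetical
names for illustration only, not PostgreSQL APIs. The point is just that
the cached object is left untouched apart from its refcount, while the
per-execution pruning results come back through an output parameter that
the caller owns.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; not PostgreSQL types. */
typedef struct ToyPruneResult
{
    unsigned    valid_subplans; /* bit i set => subplan i survived pruning */
} ToyPruneResult;

typedef struct ToyPlan
{
    int         nstmts;         /* number of statements in the cached plan */
    int         refcount;       /* live references to this cached plan */
} ToyPlan;

/*
 * Returns the cached plan and, through *results, a freshly allocated array
 * with one ToyPruneResult per statement.  Nothing is written into the
 * ToyPlan other than the refcount, so the cached object stays shareable.
 * The caller owns *results and must free() it.  Returns NULL if the array
 * cannot be allocated.
 */
static ToyPlan *
toy_get_cached_plan(ToyPlan *cached, unsigned param, ToyPruneResult **results)
{
    ToyPruneResult *res;

    assert(results != NULL);
    res = calloc((size_t) cached->nstmts, sizeof(ToyPruneResult));
    if (res == NULL)
        return NULL;

    /* Pretend "initial pruning" keeps only the subplan selected by param. */
    for (int i = 0; i < cached->nstmts; i++)
        res[i].valid_subplans = 1u << (param % 8);

    cached->refcount++;
    *results = res;
    return cached;
}
```

In this toy version the caller frees the array itself; in the patch the
corresponding list is allocated in the Portal's context instead.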
> Some less important stuff that I think could be done better:
>
> * Are you also able to put meaningful comments on the
> PartitionPruneResult struct in execnodes.h?
>
> * In create_append_plan() and create_merge_append_plan() you have the
> same code to set the part_prune_index. Why not just move all that code
> into make_partition_pruneinfo() and have make_partition_pruneinfo()
> return the index and append to the PlannerInfo.partPruneInfos List?
That sounds better, so done.
> * Why not forboth() here?
>
> i = 0;
> foreach(stmtlist_item, portal->stmts)
> {
> PlannedStmt *pstmt = lfirst_node(PlannedStmt, stmtlist_item);
> PartitionPruneResult *part_prune_result = part_prune_results ?
> list_nth(part_prune_results, i) :
> NULL;
>
> i++;
Because the PartitionPruneResult list may not always be available. To
wit, it's only available when it is GetCachedPlan() that gave the
portal its plan. I know this is a bit ugly, but it seems better than
fixing all users of Portal to build a dummy list, not that it is
totally avoidable even in the current implementation.
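
To illustrate the shape of that loop, here is a sketch with arrays
standing in for the two lists (ToyStmt, ToyResult and toy_process_stmts()
are hypothetical names, not PostgreSQL code). When the results are absent
the loop must still visit every statement, so each result is looked up by
index and defaults to NULL rather than being walked in lockstep.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for PlannedStmt and PartitionPruneResult. */
typedef struct ToyStmt { int id; } ToyStmt;
typedef struct ToyResult { int nsurviving; } ToyResult;

/*
 * Visit every statement, pairing it with its pruning result if the caller
 * has one.  'results' is either NULL (the plan did not come from the plan
 * cache) or an array with exactly nstmts entries.  Returns how many
 * statements were processed with a non-NULL result.
 */
static int
toy_process_stmts(const ToyStmt *stmts, int nstmts, const ToyResult *results)
{
    int         with_result = 0;

    for (int i = 0; i < nstmts; i++)
    {
        const ToyResult *res = results ? &results[i] : NULL;

        (void) stmts[i];        /* statement i would be executed here */
        if (res != NULL)
            with_result++;
    }
    return with_result;
}
```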
> * It would be good if ReleaseExecutorLocks() already knew the RTIs
> that were locked. Maybe the executor prep info struct I mentioned
> above could also store the RTIs that have been locked already and
> allow ReleaseExecutorLocks() to just iterate over those to release the
> locks.
Rewrote this such that ReleaseExecutorLocks() just receives a list of
per-PlannedStmt bitmapsets containing the RT indexes of only the
locked entries in that plan.
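
Roughly, the reworked interface can be pictured as below. This is only a
sketch under simplifying assumptions: a uint64_t mask stands in for a
Bitmapset, toy_lock()/toy_unlock() stand in for the real lock manager
calls, and none of these names are PostgreSQL functions.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MAX_RTI 64

/* Hypothetical lock table: toy_lock_count[rti - 1] > 0 means "held". */
static int  toy_lock_count[TOY_MAX_RTI];

static void
toy_lock(int rti)
{
    assert(rti >= 1 && rti <= TOY_MAX_RTI);
    toy_lock_count[rti - 1]++;
}

static void
toy_unlock(int rti)
{
    assert(rti >= 1 && rti <= TOY_MAX_RTI);
    assert(toy_lock_count[rti - 1] > 0);
    toy_lock_count[rti - 1]--;
}

/*
 * Release exactly the locks recorded at acquisition time.  locked[i] has
 * bit (rti - 1) set for every range table index that was locked for
 * statement i, so pruned partitions, which were never locked, are never
 * touched.  Returns the number of locks released.
 */
static int
toy_release_executor_locks(const uint64_t *locked, int nstmts)
{
    int         released = 0;

    for (int i = 0; i < nstmts; i++)
    {
        for (int rti = 1; rti <= TOY_MAX_RTI; rti++)
        {
            if (locked[i] & (UINT64_C(1) << (rti - 1)))
            {
                toy_unlock(rti);
                released++;
            }
        }
    }
    return released;
}
```

The release path then needs no knowledge of pruning at all; it only
replays what the acquire path recorded.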
Attached updated patch with these changes.
--
Amit Langote
EDB: http://www.enterprisedb.com
Attachments:
[application/octet-stream] v13-0001-Optimize-AcquireExecutorLocks-to-skip-pruned-par.patch (99.1K, 2-v13-0001-Optimize-AcquireExecutorLocks-to-skip-pruned-par.patch)
From 3c0c7f9f5f8bdf89c6afd06e26ba6d5490af9221 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Wed, 22 Dec 2021 16:55:17 +0900
Subject: [PATCH v13] Optimize AcquireExecutorLocks() to skip pruned partitions
---
src/backend/commands/copyto.c | 2 +-
src/backend/commands/createas.c | 2 +-
src/backend/commands/explain.c | 7 +-
src/backend/commands/extension.c | 2 +-
src/backend/commands/matview.c | 2 +-
src/backend/commands/prepare.c | 26 ++-
src/backend/executor/README | 27 +++
src/backend/executor/execMain.c | 46 +++++
src/backend/executor/execParallel.c | 28 ++-
src/backend/executor/execPartition.c | 238 ++++++++++++++++++++----
src/backend/executor/execUtils.c | 1 +
src/backend/executor/functions.c | 2 +-
src/backend/executor/nodeAppend.c | 16 +-
src/backend/executor/nodeMergeAppend.c | 9 +-
src/backend/executor/spi.c | 27 ++-
src/backend/nodes/copyfuncs.c | 33 +++-
src/backend/nodes/outfuncs.c | 36 +++-
src/backend/nodes/readfuncs.c | 56 +++++-
src/backend/optimizer/plan/createplan.c | 25 +--
src/backend/optimizer/plan/planner.c | 3 +
src/backend/optimizer/plan/setrefs.c | 104 ++++++++---
src/backend/partitioning/partprune.c | 59 +++++-
src/backend/tcop/postgres.c | 8 +-
src/backend/tcop/pquery.c | 25 ++-
src/backend/utils/cache/plancache.c | 184 +++++++++++++++---
src/backend/utils/mmgr/portalmem.c | 19 ++
src/include/commands/explain.h | 3 +-
src/include/executor/execPartition.h | 12 +-
src/include/executor/execdesc.h | 3 +
src/include/executor/executor.h | 2 +
src/include/nodes/execnodes.h | 30 +++
src/include/nodes/nodes.h | 4 +
src/include/nodes/pathnodes.h | 15 ++
src/include/nodes/plannodes.h | 39 +++-
src/include/partitioning/partprune.h | 8 +-
src/include/utils/plancache.h | 3 +-
src/include/utils/portal.h | 3 +
37 files changed, 942 insertions(+), 167 deletions(-)
diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index 55c38b04c4..d403eb2309 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -542,7 +542,7 @@ BeginCopyTo(ParseState *pstate,
((DR_copy *) dest)->cstate = cstate;
/* Create a QueryDesc requesting no output */
- cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ cstate->queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
GetActiveSnapshot(),
InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 9abbb6b555..f6607f2454 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -325,7 +325,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 1e5701b8eb..7ba9852e51 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -407,7 +407,7 @@ ExplainOneQuery(Query *query, int cursorOptions,
}
/* run it (if needed) and produce output */
- ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+ ExplainOnePlan(plan, NULL, into, es, queryString, params, queryEnv,
&planduration, (es->buffers ? &bufusage : NULL));
}
}
@@ -515,7 +515,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
* to call it.
*/
void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, PartitionPruneResult *part_prune_result,
+ IntoClause *into, ExplainState *es,
const char *queryString, ParamListInfo params,
QueryEnvironment *queryEnv, const instr_time *planduration,
const BufferUsage *bufusage)
@@ -563,7 +564,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
dest = None_Receiver;
/* Create a QueryDesc for the query */
- queryDesc = CreateQueryDesc(plannedstmt, queryString,
+ queryDesc = CreateQueryDesc(plannedstmt, part_prune_result, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, instrument_option);
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index 1013790dbb..54734a3a93 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -776,7 +776,7 @@ execute_sql_string(const char *sql)
{
QueryDesc *qdesc;
- qdesc = CreateQueryDesc(stmt,
+ qdesc = CreateQueryDesc(stmt, NULL,
sql,
GetActiveSnapshot(), NULL,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 9ab248d25e..2be1782bc4 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -416,7 +416,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, queryString,
+ queryDesc = CreateQueryDesc(plan, NULL, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 80738547ed..c7360712b1 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -155,6 +155,7 @@ ExecuteQuery(ParseState *pstate,
PreparedStatement *entry;
CachedPlan *cplan;
List *plan_list;
+ List *part_prune_result_list;
ParamListInfo paramLI = NULL;
EState *estate = NULL;
Portal portal;
@@ -193,7 +194,10 @@ ExecuteQuery(ParseState *pstate,
entry->plansource->query_string);
/* Replan if needed, and increment plan refcount for portal */
- cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL);
+ cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL,
+ &part_prune_result_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_result_list));
plan_list = cplan->stmt_list;
/*
@@ -207,6 +211,9 @@ ExecuteQuery(ParseState *pstate,
plan_list,
cplan);
+ /* Copy PartitionPruneResults into the portal's context. */
+ PortalStorePartitionPruneResults(portal, part_prune_result_list);
+
/*
* For CREATE TABLE ... AS EXECUTE, we must verify that the prepared
* statement is one that produces tuples. Currently we insist that it be
@@ -576,7 +583,9 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
const char *query_string;
CachedPlan *cplan;
List *plan_list;
- ListCell *p;
+ List *part_prune_result_list;
+ ListCell *p,
+ *pp;
ParamListInfo paramLI = NULL;
EState *estate = NULL;
instr_time planstart;
@@ -619,7 +628,10 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
/* Replan if needed, and acquire a transient refcount */
cplan = GetCachedPlan(entry->plansource, paramLI,
- CurrentResourceOwner, queryEnv);
+ CurrentResourceOwner, queryEnv,
+ &part_prune_result_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_result_list));
INSTR_TIME_SET_CURRENT(planduration);
INSTR_TIME_SUBTRACT(planduration, planstart);
@@ -634,13 +646,15 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
plan_list = cplan->stmt_list;
/* Explain each query */
- foreach(p, plan_list)
+ forboth(p, plan_list, pp, part_prune_result_list)
{
PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
+ PartitionPruneResult *part_prune_result = lfirst_node(PartitionPruneResult, pp);
if (pstmt->commandType != CMD_UTILITY)
- ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv,
- &planduration, (es->buffers ? &bufusage : NULL));
+ ExplainOnePlan(pstmt, part_prune_result, into, es, query_string,
+ paramLI, queryEnv, &planduration,
+ (es->buffers ? &bufusage : NULL));
else
ExplainOneUtility(pstmt->utilityStmt, into, es, query_string,
paramLI, queryEnv);
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 0b5183fc4a..e0802be723 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -65,6 +65,29 @@ found there. This currently only occurs for Append and MergeAppend nodes. In
this case the non-required subplans are ignored and the executor state's
subnode array will become out of sequence to the plan's subplan list.
+Actually, the so-called execution time pruning may also occur even before the
+execution has started. One case where that occurs is when a cached generic
+plan is being validated for execution by plancache.c: GetCachedPlan(), which
+proceeds by locking all the relations that will be scanned by that plan. If
+the generic plan contains nodes that can perform execution time partition
+pruning (that is, contain a PartitionPruneInfo), a subset of pruning steps
+contained in the PartitionPruneInfos that do not depend on execution actually
+having started (called "initial" pruning steps) are performed at this point
+to figure out the minimal set of child subplans that satisfy those pruning
+instructions. AcquireExecutorLocks() looking at a particular plan will then
+lock only the relations scanned by those surviving subplans (along with those
+present in PlannedStmt.minLockRelids), and ignore those scanned by the pruned
+subplans, even though the pruned subplans themselves are not removed from the
+plan tree. The result of pruning (that is, the set of indexes of surviving
+subplans in their parent's list of child subplans) is saved as a list of
+bitmapsets, with one element for every PartitionPruneInfo referenced in the
+plan (PlannedStmt.partPruneInfos). The list is packaged into a
+PartitionPruneResult node, which is passed along with the PlannedStmt to the
+executor via the QueryDesc. It is imperative that the executor and any third
+party code invoked by it that gets passed the plan tree look at the plan's
+PartitionPruneResult to determine whether a particular child subplan of a
+parent node that supports pruning is valid for a given execution.
+
Each Plan node may have expression trees associated with it, to represent
its target list, qualification conditions, etc. These trees are also
read-only to the executor, but the executor state for expression evaluation
@@ -286,6 +309,10 @@ Query Processing Control Flow
This is a sketch of control flow for full query processing:
+ [ ExecutorDoInitialPruning ] --- an optional step to perform initial
+ partition pruning on the plan tree the result of which is passed
+ to the executor via QueryDesc
+
CreateQueryDesc
ExecutorStart
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index ef2fd46092..05cc99df8f 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -49,11 +49,13 @@
#include "commands/matview.h"
#include "commands/trigger.h"
#include "executor/execdebug.h"
+#include "executor/execPartition.h"
#include "executor/nodeSubplan.h"
#include "foreign/fdwapi.h"
#include "jit/jit.h"
#include "mb/pg_wchar.h"
#include "miscadmin.h"
+#include "nodes/nodeFuncs.h"
#include "parser/parsetree.h"
#include "storage/bufmgr.h"
#include "storage/lmgr.h"
@@ -104,6 +106,47 @@ static void EvalPlanQualStart(EPQState *epqstate, Plan *planTree);
/* end of local decls */
+/* ----------------------------------------------------------------
+ * ExecutorDoInitialPruning
+ *
+ * Performs initial partition pruning to figure out the minimal set of
+ * subplans to be executed and the set of RT indexes of the corresponding
+ * leaf partitions
+ *
+ * Returned PartitionPruneResult must be subsequently passed to the executor
+ * so that it can reuse the result of pruning. It's important that the
+ * has the same view of which partitions are initially pruned (by not doing
+ * the pruning again itself) or otherwise it risks initializing subplans whose
+ * partitions would not have been locked.
+ *
+ * Note: Partitioned tables mentioned in PartitionedRelPruneInfo nodes that
+ * drive the pruning will be locked before doing the pruning.
+ */
+PartitionPruneResult *
+ExecutorDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params)
+{
+ PartitionPruneResult *result;
+ ListCell *lc;
+
+ /* Only get here if there is any pruning to do. */
+ Assert(plannedstmt->containsInitialPruning);
+
+ result = makeNode(PartitionPruneResult);
+ foreach(lc, plannedstmt->partPruneInfos)
+ {
+ PartitionPruneInfo *pruneinfo = lfirst(lc);
+ Bitmapset *valid_subplan_offs;
+
+ valid_subplan_offs =
+ ExecPartitionDoInitialPruning(plannedstmt, params, pruneinfo,
+ &result->scan_leafpart_rtis);
+ result->valid_subplan_offs_list =
+ lappend(result->valid_subplan_offs_list,
+ valid_subplan_offs);
+ }
+
+ return result;
+}
/* ----------------------------------------------------------------
* ExecutorStart
@@ -806,6 +849,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
{
CmdType operation = queryDesc->operation;
PlannedStmt *plannedstmt = queryDesc->plannedstmt;
+ PartitionPruneResult *part_prune_result = queryDesc->part_prune_result;
Plan *plan = plannedstmt->planTree;
List *rangeTable = plannedstmt->rtable;
EState *estate = queryDesc->estate;
@@ -825,6 +869,8 @@ InitPlan(QueryDesc *queryDesc, int eflags)
ExecInitRangeTable(estate, rangeTable);
estate->es_plannedstmt = plannedstmt;
+ estate->es_part_prune_infos = plannedstmt->partPruneInfos;
+ estate->es_part_prune_result = part_prune_result;
/*
* Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index 9a0d5d59ef..805f86c503 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -66,6 +66,7 @@
#define PARALLEL_KEY_QUERY_TEXT UINT64CONST(0xE000000000000008)
#define PARALLEL_KEY_JIT_INSTRUMENTATION UINT64CONST(0xE000000000000009)
#define PARALLEL_KEY_WAL_USAGE UINT64CONST(0xE00000000000000A)
+#define PARALLEL_KEY_PARTITIONPRUNERESULT UINT64CONST(0xE00000000000000B)
#define PARALLEL_TUPLE_QUEUE_SIZE 65536
@@ -182,7 +183,9 @@ ExecSerializePlan(Plan *plan, EState *estate)
pstmt->transientPlan = false;
pstmt->dependsOnRole = false;
pstmt->parallelModeNeeded = false;
+ pstmt->containsInitialPruning = false;
pstmt->planTree = plan;
+ pstmt->partPruneInfos = estate->es_part_prune_infos;
pstmt->rtable = estate->es_range_table;
pstmt->resultRelations = NIL;
pstmt->appendRelations = NIL;
@@ -596,12 +599,15 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
FixedParallelExecutorState *fpes;
char *pstmt_data;
char *pstmt_space;
+ char *part_prune_result_data;
+ char *part_prune_result_space;
char *paramlistinfo_space;
BufferUsage *bufusage_space;
WalUsage *walusage_space;
SharedExecutorInstrumentation *instrumentation = NULL;
SharedJitInstrumentation *jit_instrumentation = NULL;
int pstmt_len;
+ int part_prune_result_len;
int paramlistinfo_len;
int instrumentation_len = 0;
int jit_instrumentation_len = 0;
@@ -630,6 +636,7 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
/* Fix up and serialize plan to be sent to workers. */
pstmt_data = ExecSerializePlan(planstate->plan, estate);
+ part_prune_result_data = nodeToString(estate->es_part_prune_result);
/* Create a parallel context. */
pcxt = CreateParallelContext("postgres", "ParallelQueryMain", nworkers);
@@ -656,6 +663,11 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
shm_toc_estimate_chunk(&pcxt->estimator, pstmt_len);
shm_toc_estimate_keys(&pcxt->estimator, 1);
+ /* Estimate space for serialized PartitionPruneResult. */
+ part_prune_result_len = strlen(part_prune_result_data) + 1;
+ shm_toc_estimate_chunk(&pcxt->estimator, part_prune_result_len);
+ shm_toc_estimate_keys(&pcxt->estimator, 1);
+
/* Estimate space for serialized ParamListInfo. */
paramlistinfo_len = EstimateParamListSpace(estate->es_param_list_info);
shm_toc_estimate_chunk(&pcxt->estimator, paramlistinfo_len);
@@ -750,6 +762,12 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
memcpy(pstmt_space, pstmt_data, pstmt_len);
shm_toc_insert(pcxt->toc, PARALLEL_KEY_PLANNEDSTMT, pstmt_space);
+ /* Store serialized PartitionPruneResult */
+ part_prune_result_space = shm_toc_allocate(pcxt->toc, part_prune_result_len);
+ memcpy(part_prune_result_space, part_prune_result_data, part_prune_result_len);
+ shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARTITIONPRUNERESULT,
+ part_prune_result_space);
+
/* Store serialized ParamListInfo. */
paramlistinfo_space = shm_toc_allocate(pcxt->toc, paramlistinfo_len);
shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARAMLISTINFO, paramlistinfo_space);
@@ -1231,8 +1249,10 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
int instrument_options)
{
char *pstmtspace;
+ char *part_prune_result_space;
char *paramspace;
PlannedStmt *pstmt;
+ PartitionPruneResult *part_prune_result;
ParamListInfo paramLI;
char *queryString;
@@ -1243,12 +1263,18 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
pstmtspace = shm_toc_lookup(toc, PARALLEL_KEY_PLANNEDSTMT, false);
pstmt = (PlannedStmt *) stringToNode(pstmtspace);
+ /* Reconstruct leader-supplied PartitionPruneResult. */
+ part_prune_result_space =
+ shm_toc_lookup(toc, PARALLEL_KEY_PARTITIONPRUNERESULT, false);
+ part_prune_result = (PartitionPruneResult *)
+ stringToNode(part_prune_result_space);
+
/* Reconstruct ParamListInfo. */
paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
paramLI = RestoreParamList(&paramspace);
/* Create a QueryDesc for the query. */
- return CreateQueryDesc(pstmt,
+ return CreateQueryDesc(pstmt, part_prune_result,
queryString,
GetActiveSnapshot(), InvalidSnapshot,
receiver, paramLI, NULL, instrument_options);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 615bd80973..3037742b8d 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -25,6 +25,7 @@
#include "mb/pg_wchar.h"
#include "miscadmin.h"
#include "nodes/makefuncs.h"
+#include "parser/parsetree.h"
#include "partitioning/partbounds.h"
#include "partitioning/partdesc.h"
#include "partitioning/partprune.h"
@@ -185,7 +186,11 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
static PartitionPruneState *CreatePartitionPruneState(PlanState *planstate,
- PartitionPruneInfo *pruneinfo);
+ PartitionPruneInfo *pruneinfo,
+ bool consider_initial_steps,
+ bool consider_exec_steps,
+ List *rtable, ExprContext *econtext,
+ PartitionDirectory partdir);
static void InitPartitionPruneContext(PartitionPruneContext *context,
List *pruning_steps,
PartitionDesc partdesc,
@@ -198,7 +203,8 @@ static void PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
- Bitmapset **validsubplans);
+ Bitmapset **validsubplans,
+ Bitmapset **scan_leafpart_rtis);
/*
@@ -1587,8 +1593,10 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
* considered to be a stable expression, it can change value from one plan
* node scan to the next during query execution. Stable comparison
* expressions that don't involve such Params allow partition pruning to be
- * done once during executor startup. Expressions that do involve such Params
- * require us to prune separately for each scan of the parent plan node.
+ * done once during executor startup or during ExecutorDoInitialPruning(),
+ * which runs as part of performing AcquireExecutorLocks() on a given plan
+ * tree. Expressions that do involve such Params require us to prune
+ * separately for each scan of the parent plan node.
*
* Note that pruning away unneeded subplans during executor startup has the
* added benefit of not having to initialize the unneeded subplans at all.
@@ -1605,6 +1613,13 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
* account for initial pruning possibly having eliminated some of the
* subplans.
*
+ * ExecPartitionDoInitialPruning:
+ * Do initial pruning with the information contained in a given
+ * PartitionPruneInfo to determine the minimal set of child subplans
+ * of the parent plan node (the one to which the PartitionPruneInfo
+ * belongs) that need to be executed, and also the set of RT indexes of
+ * the leaf partitions that will be scanned with those subplans.
+ *
* ExecFindMatchingSubPlans:
* Returns indexes of matching subplans after evaluating the expressions
* that are safe to evaluate at a given point. This function is first
@@ -1622,8 +1637,9 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
*
* On return, *initially_valid_subplans is assigned the set of indexes of
* child subplans that must be initialized along with the parent plan node.
- * Initial pruning is performed here if needed and in that case only the
- * surviving subplans' indexes are added.
+ * Initial pruning is performed here if needed (unless it has already been
+ * done by ExecutorDoInitialPruning()), and in that case only the surviving
+ * subplans' indexes are added.
*
* If subplans are indeed pruned, subplan_map arrays contained in the returned
* PartitionPruneState are re-sequenced to not count those, though only if the
@@ -1632,23 +1648,59 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
PartitionPruneState *
ExecInitPartitionPruning(PlanState *planstate,
int n_total_subplans,
- PartitionPruneInfo *pruneinfo,
+ int part_prune_index,
Bitmapset **initially_valid_subplans)
{
- PartitionPruneState *prunestate;
EState *estate = planstate->state;
+ PartitionPruneInfo *pruneinfo = list_nth(estate->es_part_prune_infos,
+ part_prune_index);
+ PartitionPruneResult *pruneresult = estate->es_part_prune_result;
+ PartitionPruneState *prunestate;
+ bool do_pruning = (pruneinfo->needs_init_pruning ||
+ pruneinfo->needs_exec_pruning);
- /* We may need an expression context to evaluate partition exprs */
- ExecAssignExprContext(estate, planstate);
+ /*
+ * No need to do initial pruning if it was already done by
+ * ExecutorDoInitialPruning(), which is the case if
+ * es_part_prune_result has been set.
+ */
+ if (pruneresult)
+ do_pruning = pruneinfo->needs_exec_pruning;
+
+ prunestate = NULL;
+ if (do_pruning)
+ {
+ /* We may need an expression context to evaluate partition exprs */
+ ExecAssignExprContext(estate, planstate);
- /* Create the working data structure for pruning */
- prunestate = CreatePartitionPruneState(planstate, pruneinfo);
+ /* For data reading, executor always omits detached partitions */
+ if (estate->es_partition_directory == NULL)
+ estate->es_partition_directory =
+ CreatePartitionDirectory(estate->es_query_cxt, false);
+
+ /*
+ * Create the working data structure for pruning. No need to consider
+ * initial pruning steps if we have a PartitionPruneResult.
+ */
+ prunestate = CreatePartitionPruneState(planstate, pruneinfo,
+ pruneresult == NULL, true,
+ NIL, planstate->ps_ExprContext,
+ estate->es_partition_directory);
+ }
/*
* Perform an initial partition prune pass, if required.
*/
- if (prunestate->do_initial_prune)
- *initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true);
+ if (pruneresult)
+ {
+ *initially_valid_subplans =
+ list_nth(pruneresult->valid_subplan_offs_list, part_prune_index);
+ }
+ else if (prunestate && prunestate->do_initial_prune)
+ {
+ *initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true,
+ NULL);
+ }
else
{
/* No pruning, so we'll need to initialize all subplans */
@@ -1669,7 +1721,7 @@ ExecInitPartitionPruning(PlanState *planstate,
* leaves invalid data in prunestate, because that data won't be
* consulted again (cf initial Assert in ExecFindMatchingSubPlans).
*/
- if (prunestate->do_exec_prune)
+ if (prunestate && prunestate->do_exec_prune)
PartitionPruneFixSubPlanMap(prunestate,
*initially_valid_subplans,
n_total_subplans);
@@ -1678,11 +1730,72 @@ ExecInitPartitionPruning(PlanState *planstate,
return prunestate;
}
+/*
+ * ExecPartitionDoInitialPruning
+ * Perform initial pruning using the given PartitionPruneInfo to determine
+ * the minimal set of child subplans of the parent plan node (the one to
+ * which the PartitionPruneInfo belongs) that need to be executed, and also
+ * the set of RT indexes of leaf partitions that will be scanned with them.
+ */
+Bitmapset *
+ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
+ PartitionPruneInfo *pruneinfo,
+ Bitmapset **scan_leafpart_rtis)
+{
+ List *rtable = plannedstmt->rtable;
+ ExprContext *econtext;
+ PartitionDirectory pdir;
+ MemoryContext oldcontext,
+ tmpcontext;
+ PartitionPruneState *prunestate;
+ Bitmapset *valid_subplan_offs;
+
+ /*
+ * A temporary context to allocate stuff needed to run the pruning steps.
+ */
+ tmpcontext = AllocSetContextCreate(CurrentMemoryContext,
+ "initial pruning working data",
+ ALLOCSET_DEFAULT_SIZES);
+ oldcontext = MemoryContextSwitchTo(tmpcontext);
+
+ /*
+ * PartitionDirectory to look up partition descriptors, which omits
+ * detached partitions, just like in the executor proper.
+ */
+ pdir = CreatePartitionDirectory(CurrentMemoryContext, false);
+
+ /*
+ * We don't yet have a PlanState for the parent plan node, so must create
+ * a standalone ExprContext to evaluate pruning expressions, equipped with
+ * the information about the EXTERN parameters that the caller passed us.
+ * Note that that's okay because the initial pruning steps do not contain
+ * anything that requires the execution to have started.
+ */
+ econtext = CreateStandaloneExprContext();
+ econtext->ecxt_param_list_info = params;
+ prunestate = CreatePartitionPruneState(NULL, pruneinfo, true, false,
+ rtable, econtext, pdir);
+ MemoryContextSwitchTo(oldcontext);
+
+ /* Do the initial pruning. */
+ valid_subplan_offs = ExecFindMatchingSubPlans(prunestate, true,
+ scan_leafpart_rtis);
+
+ FreeExprContext(econtext, true);
+ DestroyPartitionDirectory(pdir);
+ MemoryContextDelete(tmpcontext);
+
+ return valid_subplan_offs;
+}
+
/*
* CreatePartitionPruneState
* Build the data structure required for calling ExecFindMatchingSubPlans
*
- * 'planstate' is the parent plan node's execution state.
+ * 'planstate', if not NULL, is the parent plan node's execution state. It
+ * can be NULL if being called before ExecutorStart(), in which case,
+ * 'rtable' (range table), 'econtext', and 'partdir' must be explicitly
+ * provided.
*
* 'pruneinfo' is a PartitionPruneInfo as generated by
* make_partition_pruneinfo. Here we build a PartitionPruneState containing a
@@ -1696,19 +1809,21 @@ ExecInitPartitionPruning(PlanState *planstate,
* PartitionedRelPruneInfo.
*/
static PartitionPruneState *
-CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
+CreatePartitionPruneState(PlanState *planstate,
+ PartitionPruneInfo *pruneinfo,
+ bool consider_initial_steps,
+ bool consider_exec_steps,
+ List *rtable, ExprContext *econtext,
+ PartitionDirectory partdir)
{
- EState *estate = planstate->state;
+ EState *estate = planstate ? planstate->state : NULL;
PartitionPruneState *prunestate;
int n_part_hierarchies;
ListCell *lc;
int i;
- ExprContext *econtext = planstate->ps_ExprContext;
- /* For data reading, executor always omits detached partitions */
- if (estate->es_partition_directory == NULL)
- estate->es_partition_directory =
- CreatePartitionDirectory(estate->es_query_cxt, false);
+ Assert((estate != NULL) ||
+ (partdir != NULL && econtext != NULL && rtable != NIL));
n_part_hierarchies = list_length(pruneinfo->prune_infos);
Assert(n_part_hierarchies > 0);
@@ -1759,19 +1874,48 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
PartitionedRelPruneInfo *pinfo = lfirst_node(PartitionedRelPruneInfo, lc2);
PartitionedRelPruningData *pprune = &prunedata->partrelprunedata[j];
Relation partrel;
+ bool close_partrel = false;
PartitionDesc partdesc;
PartitionKey partkey;
/*
- * We can rely on the copies of the partitioned table's partition
- * key and partition descriptor appearing in its relcache entry,
- * because that entry will be held open and locked for the
- * duration of this executor run.
+ * We must open the relation ourselves when called before
+ * execution has started, such as when called during
+ * ExecutorDoInitialPruning() on a cached plan. In that case,
+ * sub-partitions must be locked, because AcquirePlannerLocks()
+ * would not have seen them. (The 1st relation in a
+ * partrelpruneinfos list is always the root partitioned table
+ * appearing in the query, which AcquirePlannerLocks() would have
+ * locked; the Assert in relation_open() guards that assumption.)
+ */
+ if (estate == NULL)
+ {
+ RangeTblEntry *rte = rt_fetch(pinfo->rtindex, rtable);
+ int lockmode = (j == 0) ? NoLock : rte->rellockmode;
+
+ partrel = table_open(rte->relid, lockmode);
+ close_partrel = true;
+ }
+ else
+ partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
+
+ /*
+ * We can rely on the copy of the partitioned table's partition
+ * key in its relcache entry, because it can't change (or get
+ * destroyed) as long as the relation is locked. The partition
+ * descriptor is taken from the PartitionDirectory associated with
+ * the table, which is held open long enough for the descriptor to
+ * remain valid while it's used to perform the pruning steps.
*/
- partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
partkey = RelationGetPartitionKey(partrel);
- partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
- partrel);
+ partdesc = PartitionDirectoryLookup(partdir, partrel);
+
+ /*
+ * Must close partrel, keeping the lock taken, if we're not using
+ * EState's entry.
+ */
+ if (close_partrel)
+ table_close(partrel, NoLock);
/*
* Initialize the subplan_map and subpart_map.
@@ -1785,6 +1929,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
Assert(partdesc->nparts >= pinfo->nparts);
pprune->nparts = partdesc->nparts;
pprune->subplan_map = palloc(sizeof(int) * partdesc->nparts);
+ pprune->rti_map = palloc(sizeof(Index) * partdesc->nparts);
if (partdesc->nparts == pinfo->nparts)
{
/*
@@ -1795,6 +1940,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
pprune->subpart_map = pinfo->subpart_map;
memcpy(pprune->subplan_map, pinfo->subplan_map,
sizeof(int) * pinfo->nparts);
+ memcpy(pprune->rti_map, pinfo->rti_map,
+ sizeof(Index) * pinfo->nparts);
/*
* Double-check that the list of unpruned relations has not
@@ -1845,6 +1992,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
pinfo->subplan_map[pd_idx];
pprune->subpart_map[pp_idx] =
pinfo->subpart_map[pd_idx];
+ pprune->rti_map[pp_idx] =
+ pinfo->rti_map[pd_idx];
pd_idx++;
}
else
@@ -1852,6 +2001,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
/* this partdesc entry is not in the plan */
pprune->subplan_map[pp_idx] = -1;
pprune->subpart_map[pp_idx] = -1;
+ pprune->rti_map[pp_idx] = 0;
}
}
@@ -1873,7 +2023,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
* Initialize pruning contexts as needed.
*/
pprune->initial_pruning_steps = pinfo->initial_pruning_steps;
- if (pinfo->initial_pruning_steps)
+ if (consider_initial_steps && pinfo->initial_pruning_steps)
{
InitPartitionPruneContext(&pprune->initial_context,
pinfo->initial_pruning_steps,
@@ -1883,7 +2033,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
prunestate->do_initial_prune = true;
}
pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
- if (pinfo->exec_pruning_steps)
+ if (consider_exec_steps && pinfo->exec_pruning_steps)
{
InitPartitionPruneContext(&pprune->exec_context,
pinfo->exec_pruning_steps,
@@ -2111,10 +2261,14 @@ PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
* Pass initial_prune if PARAM_EXEC Params cannot yet be evaluated. This
* differentiates the initial executor-time pruning step from later
* runtime pruning.
+ *
+ * RT indexes of leaf partitions scanned by the chosen subplans are added to
+ * *scan_leafpart_rtis if the pointer is non-NULL.
*/
Bitmapset *
ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
- bool initial_prune)
+ bool initial_prune,
+ Bitmapset **scan_leafpart_rtis)
{
Bitmapset *result = NULL;
MemoryContext oldcontext;
@@ -2149,7 +2303,7 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
*/
pprune = &prunedata->partrelprunedata[0];
find_matching_subplans_recurse(prunedata, pprune, initial_prune,
- &result);
+ &result, scan_leafpart_rtis);
/* Expression eval may have used space in ExprContext too */
if (pprune->exec_pruning_steps)
@@ -2163,6 +2317,8 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
/* Copy result out of the temp context before we reset it */
result = bms_copy(result);
+ if (scan_leafpart_rtis)
+ *scan_leafpart_rtis = bms_copy(*scan_leafpart_rtis);
MemoryContextReset(prunestate->prune_context);
@@ -2173,13 +2329,15 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
* find_matching_subplans_recurse
* Recursive worker function for ExecFindMatchingSubPlans
*
- * Adds valid (non-prunable) subplan IDs to *validsubplans
+ * Adds valid (non-prunable) subplan IDs to *validsubplans and RT indexes of
+ * the corresponding leaf partitions to *scan_leafpart_rtis (if asked for).
*/
static void
find_matching_subplans_recurse(PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
- Bitmapset **validsubplans)
+ Bitmapset **validsubplans,
+ Bitmapset **scan_leafpart_rtis)
{
Bitmapset *partset;
int i;
@@ -2206,8 +2364,13 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
while ((i = bms_next_member(partset, i)) >= 0)
{
if (pprune->subplan_map[i] >= 0)
+ {
*validsubplans = bms_add_member(*validsubplans,
pprune->subplan_map[i]);
+ if (scan_leafpart_rtis && pprune->rti_map[i] > 0)
+ *scan_leafpart_rtis = bms_add_member(*scan_leafpart_rtis,
+ pprune->rti_map[i]);
+ }
else
{
int partidx = pprune->subpart_map[i];
@@ -2215,7 +2378,8 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
if (partidx >= 0)
find_matching_subplans_recurse(prunedata,
&prunedata->partrelprunedata[partidx],
- initial_prune, validsubplans);
+ initial_prune, validsubplans,
+ scan_leafpart_rtis);
else
{
/*
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 9df1f81ea8..639145abe9 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -119,6 +119,7 @@ CreateExecutorState(void)
estate->es_relations = NULL;
estate->es_rowmarks = NULL;
estate->es_plannedstmt = NULL;
+ estate->es_part_prune_result = NULL;
estate->es_junkFilter = NULL;
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index f9460ae506..a2182a6b1f 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -844,7 +844,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
else
dest = None_Receiver;
- es->qd = CreateQueryDesc(es->stmt,
+ es->qd = CreateQueryDesc(es->stmt, NULL,
fcache->src,
GetActiveSnapshot(),
InvalidSnapshot,
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 357e10a1d7..09f26658e2 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -94,6 +94,7 @@ static bool ExecAppendAsyncRequest(AppendState *node, TupleTableSlot **result);
static void ExecAppendAsyncEventWait(AppendState *node);
static void classify_matching_subplans(AppendState *node);
+
/* ----------------------------------------------------------------
* ExecInitAppend
*
@@ -134,7 +135,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
appendstate->as_begun = false;
/* If run-time partition pruning is enabled, then set that up now */
- if (node->part_prune_info != NULL)
+ if (node->part_prune_index >= 0)
{
PartitionPruneState *prunestate;
@@ -145,7 +146,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
*/
prunestate = ExecInitPartitionPruning(&appendstate->ps,
list_length(node->appendplans),
- node->part_prune_info,
+ node->part_prune_index,
&validsubplans);
appendstate->as_prune_state = prunestate;
nplans = bms_num_members(validsubplans);
@@ -155,7 +156,8 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
* subplan, we can fill as_valid_subplans immediately, preventing
* later calls to ExecFindMatchingSubPlans.
*/
- if (!prunestate->do_exec_prune && nplans > 0)
+ if (appendstate->as_prune_state == NULL ||
+ (!appendstate->as_prune_state->do_exec_prune && nplans > 0))
appendstate->as_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
}
else
@@ -577,7 +579,7 @@ choose_next_subplan_locally(AppendState *node)
}
else if (node->as_valid_subplans == NULL)
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
whichplan = -1;
}
@@ -642,7 +644,7 @@ choose_next_subplan_for_leader(AppendState *node)
if (node->as_valid_subplans == NULL)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
/*
* Mark each invalid plan as finished to allow the loop below to
@@ -717,7 +719,7 @@ choose_next_subplan_for_worker(AppendState *node)
else if (node->as_valid_subplans == NULL)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
mark_invalid_subplans_as_finished(node);
}
@@ -868,7 +870,7 @@ ExecAppendAsyncBegin(AppendState *node)
if (node->as_valid_subplans == NULL)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
classify_matching_subplans(node);
}
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index ecf9052e03..7708cfffda 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -82,7 +82,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
mergestate->ps.ExecProcNode = ExecMergeAppend;
/* If run-time partition pruning is enabled, then set that up now */
- if (node->part_prune_info != NULL)
+ if (node->part_prune_index >= 0)
{
PartitionPruneState *prunestate;
@@ -93,7 +93,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
*/
prunestate = ExecInitPartitionPruning(&mergestate->ps,
list_length(node->mergeplans),
- node->part_prune_info,
+ node->part_prune_index,
&validsubplans);
mergestate->ms_prune_state = prunestate;
nplans = bms_num_members(validsubplans);
@@ -103,7 +103,8 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
* subplan, we can fill as_valid_subplans immediately, preventing
* later calls to ExecFindMatchingSubPlans.
*/
- if (!prunestate->do_exec_prune && nplans > 0)
+ if (mergestate->ms_prune_state == NULL ||
+ (!mergestate->ms_prune_state->do_exec_prune && nplans > 0))
mergestate->ms_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
}
else
@@ -218,7 +219,7 @@ ExecMergeAppend(PlanState *pstate)
*/
if (node->ms_valid_subplans == NULL)
node->ms_valid_subplans =
- ExecFindMatchingSubPlans(node->ms_prune_state, false);
+ ExecFindMatchingSubPlans(node->ms_prune_state, false, NULL);
/*
* First time through: pull the first tuple from each valid subplan,
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 042a5f8b0a..729e2fd7b2 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1578,6 +1578,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
CachedPlanSource *plansource;
CachedPlan *cplan;
List *stmt_list;
+ List *part_prune_result_list;
char *query_string;
Snapshot snapshot;
MemoryContext oldcontext;
@@ -1657,7 +1658,10 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
*/
/* Replan if needed, and increment plan refcount for portal */
- cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv);
+ cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv,
+ &part_prune_result_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_result_list));
stmt_list = cplan->stmt_list;
if (!plan->saved)
@@ -1685,6 +1689,9 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
stmt_list,
cplan);
+ /* Copy PartitionPruneResults into the portal's context. */
+ PortalStorePartitionPruneResults(portal, part_prune_result_list);
+
/*
* Set up options for portal. Default SCROLL type is chosen the same way
* as PerformCursorOpen does it.
@@ -2092,7 +2099,8 @@ SPI_plan_get_cached_plan(SPIPlanPtr plan)
/* Get the generic plan for the query */
cplan = GetCachedPlan(plansource, NULL,
plan->saved ? CurrentResourceOwner : NULL,
- _SPI_current->queryEnv);
+ _SPI_current->queryEnv,
+ NULL /* Not interested in PartitionPruneResults */);
Assert(cplan == plansource->gplan);
/* Pop the error context stack */
@@ -2473,7 +2481,9 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
{
CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
List *stmt_list;
- ListCell *lc2;
+ List *part_prune_result_list;
+ ListCell *lc2,
+ *lc3;
spicallbackarg.query = plansource->query_string;
@@ -2549,8 +2559,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
* plan, the refcount must be backed by the plan_owner.
*/
cplan = GetCachedPlan(plansource, options->params,
- plan_owner, _SPI_current->queryEnv);
-
+ plan_owner, _SPI_current->queryEnv,
+ &part_prune_result_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_result_list));
stmt_list = cplan->stmt_list;
/*
@@ -2589,9 +2601,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
}
}
- foreach(lc2, stmt_list)
+ forboth(lc2, stmt_list, lc3, part_prune_result_list)
{
PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+ PartitionPruneResult *part_prune_result = lfirst_node(PartitionPruneResult, lc3);
bool canSetTag = stmt->canSetTag;
DestReceiver *dest;
@@ -2663,7 +2676,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
else
snap = InvalidSnapshot;
- qdesc = CreateQueryDesc(stmt,
+ qdesc = CreateQueryDesc(stmt, part_prune_result,
plansource->query_string,
snap, crosscheck_snapshot,
dest,
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 46a1943d97..9642e74ef1 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -96,7 +96,10 @@ _copyPlannedStmt(const PlannedStmt *from)
COPY_SCALAR_FIELD(parallelModeNeeded);
COPY_SCALAR_FIELD(jitFlags);
COPY_NODE_FIELD(planTree);
+ COPY_NODE_FIELD(partPruneInfos);
+ COPY_SCALAR_FIELD(containsInitialPruning);
COPY_NODE_FIELD(rtable);
+ COPY_BITMAPSET_FIELD(minLockRelids);
COPY_NODE_FIELD(resultRelations);
COPY_NODE_FIELD(appendRelations);
COPY_NODE_FIELD(subplans);
@@ -253,7 +256,7 @@ _copyAppend(const Append *from)
COPY_NODE_FIELD(appendplans);
COPY_SCALAR_FIELD(nasyncplans);
COPY_SCALAR_FIELD(first_partial_plan);
- COPY_NODE_FIELD(part_prune_info);
+ COPY_SCALAR_FIELD(part_prune_index);
return newnode;
}
@@ -281,7 +284,7 @@ _copyMergeAppend(const MergeAppend *from)
COPY_POINTER_FIELD(sortOperators, from->numCols * sizeof(Oid));
COPY_POINTER_FIELD(collations, from->numCols * sizeof(Oid));
COPY_POINTER_FIELD(nullsFirst, from->numCols * sizeof(bool));
- COPY_NODE_FIELD(part_prune_info);
+ COPY_SCALAR_FIELD(part_prune_index);
return newnode;
}
@@ -1280,6 +1283,8 @@ _copyPartitionPruneInfo(const PartitionPruneInfo *from)
PartitionPruneInfo *newnode = makeNode(PartitionPruneInfo);
COPY_NODE_FIELD(prune_infos);
+ COPY_SCALAR_FIELD(needs_init_pruning);
+ COPY_SCALAR_FIELD(needs_exec_pruning);
COPY_BITMAPSET_FIELD(other_subplans);
return newnode;
@@ -1296,6 +1301,7 @@ _copyPartitionedRelPruneInfo(const PartitionedRelPruneInfo *from)
COPY_POINTER_FIELD(subplan_map, from->nparts * sizeof(int));
COPY_POINTER_FIELD(subpart_map, from->nparts * sizeof(int));
COPY_POINTER_FIELD(relid_map, from->nparts * sizeof(Oid));
+ COPY_POINTER_FIELD(rti_map, from->nparts * sizeof(Index));
COPY_NODE_FIELD(initial_pruning_steps);
COPY_NODE_FIELD(exec_pruning_steps);
COPY_BITMAPSET_FIELD(execparamids);
@@ -5469,6 +5475,21 @@ _copyExtensibleNode(const ExtensibleNode *from)
return newnode;
}
+/* ****************************************************************
+ * execnodes.h copy functions
+ * ****************************************************************
+ */
+static PartitionPruneResult *
+_copyPartitionPruneResult(const PartitionPruneResult *from)
+{
+ PartitionPruneResult *newnode = makeNode(PartitionPruneResult);
+
+ COPY_NODE_FIELD(valid_subplan_offs_list);
+ COPY_BITMAPSET_FIELD(scan_leafpart_rtis);
+
+ return newnode;
+}
+
/* ****************************************************************
* value.h copy functions
* ****************************************************************
@@ -5523,7 +5544,6 @@ _copyBitString(const BitString *from)
return newnode;
}
-
static ForeignKeyCacheInfo *
_copyForeignKeyCacheInfo(const ForeignKeyCacheInfo *from)
{
@@ -6565,6 +6585,13 @@ copyObjectImpl(const void *from)
retval = _copyPublicationTable(from);
break;
+ /*
+ * EXECUTION NODES
+ */
+ case T_PartitionPruneResult:
+ retval = _copyPartitionPruneResult(from);
+ break;
+
/*
* MISCELLANEOUS NODES
*/
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 13e1643530..0cbcbc8ed4 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -314,7 +314,10 @@ _outPlannedStmt(StringInfo str, const PlannedStmt *node)
WRITE_BOOL_FIELD(parallelModeNeeded);
WRITE_INT_FIELD(jitFlags);
WRITE_NODE_FIELD(planTree);
+ WRITE_NODE_FIELD(partPruneInfos);
+ WRITE_BOOL_FIELD(containsInitialPruning);
WRITE_NODE_FIELD(rtable);
+ WRITE_BITMAPSET_FIELD(minLockRelids);
WRITE_NODE_FIELD(resultRelations);
WRITE_NODE_FIELD(appendRelations);
WRITE_NODE_FIELD(subplans);
@@ -443,7 +446,7 @@ _outAppend(StringInfo str, const Append *node)
WRITE_NODE_FIELD(appendplans);
WRITE_INT_FIELD(nasyncplans);
WRITE_INT_FIELD(first_partial_plan);
- WRITE_NODE_FIELD(part_prune_info);
+ WRITE_INT_FIELD(part_prune_index);
}
static void
@@ -460,7 +463,7 @@ _outMergeAppend(StringInfo str, const MergeAppend *node)
WRITE_OID_ARRAY(sortOperators, node->numCols);
WRITE_OID_ARRAY(collations, node->numCols);
WRITE_BOOL_ARRAY(nullsFirst, node->numCols);
- WRITE_NODE_FIELD(part_prune_info);
+ WRITE_INT_FIELD(part_prune_index);
}
static void
@@ -1006,6 +1009,8 @@ _outPartitionPruneInfo(StringInfo str, const PartitionPruneInfo *node)
WRITE_NODE_TYPE("PARTITIONPRUNEINFO");
WRITE_NODE_FIELD(prune_infos);
+ WRITE_BOOL_FIELD(needs_init_pruning);
+ WRITE_BOOL_FIELD(needs_exec_pruning);
WRITE_BITMAPSET_FIELD(other_subplans);
}
@@ -1020,6 +1025,7 @@ _outPartitionedRelPruneInfo(StringInfo str, const PartitionedRelPruneInfo *node)
WRITE_INT_ARRAY(subplan_map, node->nparts);
WRITE_INT_ARRAY(subpart_map, node->nparts);
WRITE_OID_ARRAY(relid_map, node->nparts);
+ WRITE_INDEX_ARRAY(rti_map, node->nparts);
WRITE_NODE_FIELD(initial_pruning_steps);
WRITE_NODE_FIELD(exec_pruning_steps);
WRITE_BITMAPSET_FIELD(execparamids);
@@ -2420,6 +2426,9 @@ _outPlannerGlobal(StringInfo str, const PlannerGlobal *node)
WRITE_NODE_FIELD(finalrowmarks);
WRITE_NODE_FIELD(resultRelations);
WRITE_NODE_FIELD(appendRelations);
+ WRITE_NODE_FIELD(partPruneInfos);
+ WRITE_BOOL_FIELD(containsInitialPruning);
+ WRITE_BITMAPSET_FIELD(minLockRelids);
WRITE_NODE_FIELD(relationOids);
WRITE_NODE_FIELD(invalItems);
WRITE_NODE_FIELD(paramExecTypes);
@@ -2487,6 +2496,7 @@ _outPlannerInfo(StringInfo str, const PlannerInfo *node)
WRITE_BITMAPSET_FIELD(curOuterRels);
WRITE_NODE_FIELD(curOuterParams);
WRITE_BOOL_FIELD(partColsUpdated);
+ WRITE_NODE_FIELD(partPruneInfos);
}
static void
@@ -2840,6 +2850,21 @@ _outExtensibleNode(StringInfo str, const ExtensibleNode *node)
methods->nodeOut(str, node);
}
+/*****************************************************************************
+ *
+ * Stuff from execnodes.h
+ *
+ *****************************************************************************/
+
+static void
+_outPartitionPruneResult(StringInfo str, const PartitionPruneResult *node)
+{
+ WRITE_NODE_TYPE("PARTITIONPRUNERESULT");
+
+ WRITE_NODE_FIELD(valid_subplan_offs_list);
+ WRITE_BITMAPSET_FIELD(scan_leafpart_rtis);
+}
+
/*****************************************************************************
*
* Stuff from parsenodes.h.
@@ -4748,6 +4773,13 @@ outNode(StringInfo str, const void *obj)
_outJsonTableSibling(str, obj);
break;
+ /*
+ * EXECUTION NODES
+ */
+ case T_PartitionPruneResult:
+ _outPartitionPruneResult(str, obj);
+ break;
+
default:
/*
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 48f7216c9e..25e1df7068 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -164,6 +164,11 @@
token = pg_strtok(&length); /* skip :fldname */ \
local_node->fldname = readIntCols(len)
+/* Read an Index array */
+#define READ_INDEX_ARRAY(fldname, len) \
+ token = pg_strtok(&length); /* skip :fldname */ \
+ local_node->fldname = readIndexCols(len)
+
/* Read a bool array */
#define READ_BOOL_ARRAY(fldname, len) \
token = pg_strtok(&length); /* skip :fldname */ \
@@ -1814,7 +1819,10 @@ _readPlannedStmt(void)
READ_BOOL_FIELD(parallelModeNeeded);
READ_INT_FIELD(jitFlags);
READ_NODE_FIELD(planTree);
+ READ_NODE_FIELD(partPruneInfos);
+ READ_BOOL_FIELD(containsInitialPruning);
READ_NODE_FIELD(rtable);
+ READ_BITMAPSET_FIELD(minLockRelids);
READ_NODE_FIELD(resultRelations);
READ_NODE_FIELD(appendRelations);
READ_NODE_FIELD(subplans);
@@ -1946,7 +1954,7 @@ _readAppend(void)
READ_NODE_FIELD(appendplans);
READ_INT_FIELD(nasyncplans);
READ_INT_FIELD(first_partial_plan);
- READ_NODE_FIELD(part_prune_info);
+ READ_INT_FIELD(part_prune_index);
READ_DONE();
}
@@ -1968,7 +1976,7 @@ _readMergeAppend(void)
READ_OID_ARRAY(sortOperators, local_node->numCols);
READ_OID_ARRAY(collations, local_node->numCols);
READ_BOOL_ARRAY(nullsFirst, local_node->numCols);
- READ_NODE_FIELD(part_prune_info);
+ READ_INT_FIELD(part_prune_index);
READ_DONE();
}
@@ -2763,6 +2771,8 @@ _readPartitionPruneInfo(void)
READ_LOCALS(PartitionPruneInfo);
READ_NODE_FIELD(prune_infos);
+ READ_BOOL_FIELD(needs_init_pruning);
+ READ_BOOL_FIELD(needs_exec_pruning);
READ_BITMAPSET_FIELD(other_subplans);
READ_DONE();
@@ -2779,6 +2789,7 @@ _readPartitionedRelPruneInfo(void)
READ_INT_ARRAY(subplan_map, local_node->nparts);
READ_INT_ARRAY(subpart_map, local_node->nparts);
READ_OID_ARRAY(relid_map, local_node->nparts);
+ READ_INDEX_ARRAY(rti_map, local_node->nparts);
READ_NODE_FIELD(initial_pruning_steps);
READ_NODE_FIELD(exec_pruning_steps);
READ_BITMAPSET_FIELD(execparamids);
@@ -2932,6 +2943,21 @@ _readPartitionRangeDatum(void)
READ_DONE();
}
+
+/*
+ * _readPartitionPruneResult
+ */
+static PartitionPruneResult *
+_readPartitionPruneResult(void)
+{
+ READ_LOCALS(PartitionPruneResult);
+
+ READ_NODE_FIELD(valid_subplan_offs_list);
+ READ_BITMAPSET_FIELD(scan_leafpart_rtis);
+
+ READ_DONE();
+}
+
/*
* parseNodeString
*
@@ -3229,6 +3255,8 @@ parseNodeString(void)
return_value = _readJsonTableParent();
else if (MATCH("JSONTABSNODE", 12))
return_value = _readJsonTableSibling();
+ else if (MATCH("PARTITIONPRUNERESULT", 20))
+ return_value = _readPartitionPruneResult();
else
{
elog(ERROR, "badly formatted node string \"%.32s\"...", token);
@@ -3372,6 +3400,30 @@ readIntCols(int numCols)
return int_vals;
}
+/*
+ * readIndexCols
+ */
+Index *
+readIndexCols(int numCols)
+{
+ int tokenLength,
+ i;
+ const char *token;
+ Index *index_vals;
+
+ if (numCols <= 0)
+ return NULL;
+
+ index_vals = (Index *) palloc(numCols * sizeof(Index));
+ for (i = 0; i < numCols; i++)
+ {
+ token = pg_strtok(&tokenLength);
+ index_vals[i] = atoui(token);
+ }
+
+ return index_vals;
+}
+
/*
* readBoolCols
*/
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index 51591bb812..e7f977fb96 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -1183,7 +1183,7 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
ListCell *subpaths;
int nasyncplans = 0;
RelOptInfo *rel = best_path->path.parent;
- PartitionPruneInfo *partpruneinfo = NULL;
+ int part_prune_index = -1;
int nodenumsortkeys = 0;
AttrNumber *nodeSortColIdx = NULL;
Oid *nodeSortOperators = NULL;
@@ -1357,16 +1357,17 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
}
if (prunequal != NIL)
- partpruneinfo =
- make_partition_pruneinfo(root, rel,
- best_path->subpaths,
- prunequal);
+ part_prune_index = make_partition_pruneinfo(root, rel,
+ best_path->subpaths,
+ prunequal);
}
plan->appendplans = subplans;
plan->nasyncplans = nasyncplans;
plan->first_partial_plan = best_path->first_partial_path;
- plan->part_prune_info = partpruneinfo;
+
+ /* Will be updated later in set_plan_references(). */
+ plan->part_prune_index = part_prune_index;
copy_generic_path_info(&plan->plan, (Path *) best_path);
@@ -1406,7 +1407,7 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
List *subplans = NIL;
ListCell *subpaths;
RelOptInfo *rel = best_path->path.parent;
- PartitionPruneInfo *partpruneinfo = NULL;
+ int part_prune_index = -1;
/*
* We don't have the actual creation of the MergeAppend node split out
@@ -1522,13 +1523,15 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
}
if (prunequal != NIL)
- partpruneinfo = make_partition_pruneinfo(root, rel,
- best_path->subpaths,
- prunequal);
+ part_prune_index = make_partition_pruneinfo(root, rel,
+ best_path->subpaths,
+ prunequal);
}
node->mergeplans = subplans;
- node->part_prune_info = partpruneinfo;
+
+ /* Will be updated later in set_plan_references(). */
+ node->part_prune_index = part_prune_index;
/*
* If prepare_sort_from_pathkeys added sort columns, but we were told to
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index b2569c5d0c..2aa051d862 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -518,7 +518,10 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
result->dependsOnRole = glob->dependsOnRole;
result->parallelModeNeeded = glob->parallelModeNeeded;
result->planTree = top_plan;
+ result->partPruneInfos = glob->partPruneInfos;
+ result->containsInitialPruning = glob->containsInitialPruning;
result->rtable = glob->finalrtable;
+ result->minLockRelids = glob->minLockRelids;
result->resultRelations = glob->resultRelations;
result->appendRelations = glob->appendRelations;
result->subplans = glob->subplans;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 7519723081..fc66986e1c 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -251,7 +251,7 @@ set_plan_references(PlannerInfo *root, Plan *plan)
Plan *result;
PlannerGlobal *glob = root->glob;
int rtoffset = list_length(glob->finalrtable);
- ListCell *lc;
+ ListCell *lc;
/*
* Add all the query's RTEs to the flattened rangetable. The live ones
@@ -260,6 +260,16 @@ set_plan_references(PlannerInfo *root, Plan *plan)
*/
add_rtes_to_flat_rtable(root, false);
+ /*
+ * Add the query's adjusted range of RT indexes to glob->minLockRelids.
+ * The adjusted RT indexes of prunable relations will be deleted from the
+ * set below where PartitionPruneInfos are processed.
+ */
+ glob->minLockRelids =
+ bms_add_range(glob->minLockRelids,
+ rtoffset + 1,
+ rtoffset + list_length(root->parse->rtable));
+
/*
* Adjust RT indexes of PlanRowMarks and add to final rowmarks list
*/
@@ -338,6 +348,56 @@ set_plan_references(PlannerInfo *root, Plan *plan)
}
}
+ /* Also fix up the information in PartitionPruneInfos. */
+ foreach (lc, root->partPruneInfos)
+ {
+ PartitionPruneInfo *pruneinfo = lfirst(lc);
+ Bitmapset *leafpart_rtis = NULL;
+ ListCell *l;
+
+ foreach(l, pruneinfo->prune_infos)
+ {
+ List *prune_infos = lfirst(l);
+ ListCell *l2;
+
+ foreach(l2, prune_infos)
+ {
+ PartitionedRelPruneInfo *pinfo = lfirst(l2);
+ int i;
+
+ /* RT index of the partitioned table. */
+ pinfo->rtindex += rtoffset;
+
+ /* And also those of the leaf partitions. */
+ for (i = 0; i < pinfo->nparts; i++)
+ {
+ if (pinfo->rti_map[i] > 0)
+ {
+ pinfo->rti_map[i] += rtoffset;
+ leafpart_rtis = bms_add_member(leafpart_rtis,
+ pinfo->rti_map[i]);
+ }
+ }
+ }
+ }
+
+ if (pruneinfo->needs_init_pruning)
+ {
+ glob->containsInitialPruning = true;
+
+ /*
+ * Delete the leaf partition RTIs from the global set of relations
+ * to be locked before executing the plan. AcquireExecutorLocks()
+ * will find the ones to add to the set after performing initial
+ * pruning.
+ */
+ glob->minLockRelids = bms_del_members(glob->minLockRelids,
+ leafpart_rtis);
+ }
+
+ glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
+ }
+
return result;
}
@@ -1610,21 +1670,12 @@ set_append_references(PlannerInfo *root,
aplan->apprelids = offset_relid_set(aplan->apprelids, rtoffset);
- if (aplan->part_prune_info)
- {
- foreach(l, aplan->part_prune_info->prune_infos)
- {
- List *prune_infos = lfirst(l);
- ListCell *l2;
-
- foreach(l2, prune_infos)
- {
- PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
- pinfo->rtindex += rtoffset;
- }
- }
- }
+ /*
+ * PartitionPruneInfos will be added to a list in PlannerGlobal, so update
+ * the index.
+ */
+ if (aplan->part_prune_index >= 0)
+ aplan->part_prune_index += list_length(root->glob->partPruneInfos);
/* We don't need to recurse to lefttree or righttree ... */
Assert(aplan->plan.lefttree == NULL);
@@ -1682,21 +1733,12 @@ set_mergeappend_references(PlannerInfo *root,
mplan->apprelids = offset_relid_set(mplan->apprelids, rtoffset);
- if (mplan->part_prune_info)
- {
- foreach(l, mplan->part_prune_info->prune_infos)
- {
- List *prune_infos = lfirst(l);
- ListCell *l2;
-
- foreach(l2, prune_infos)
- {
- PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
- pinfo->rtindex += rtoffset;
- }
- }
- }
+ /*
+ * PartitionPruneInfos will be added to a list in PlannerGlobal, so update
+ * the index.
+ */
+ if (mplan->part_prune_index >= 0)
+ mplan->part_prune_index += list_length(root->glob->partPruneInfos);
/* We don't need to recurse to lefttree or righttree ... */
Assert(mplan->plan.lefttree == NULL);
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 9d3c05aed3..5a5f5dee46 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -144,7 +144,9 @@ static List *make_partitionedrel_pruneinfo(PlannerInfo *root,
List *prunequal,
Bitmapset *partrelids,
int *relid_subplan_map,
- Bitmapset **matchedsubplans);
+ Bitmapset **matchedsubplans,
+ bool *needs_init_pruning,
+ bool *needs_exec_pruning);
static void gen_partprune_steps(RelOptInfo *rel, List *clauses,
PartClauseTarget target,
GeneratePruningStepsContext *context);
@@ -209,16 +211,20 @@ static void partkey_datum_from_expr(PartitionPruneContext *context,
/*
* make_partition_pruneinfo
- * Builds a PartitionPruneInfo which can be used in the executor to allow
- * additional partition pruning to take place. Returns NULL when
- * partition pruning would be useless.
+ * Checks whether the given quals can be used to build pruning steps that
+ * the executor can use to prune useless subplans from the given set of
+ * child paths, and if so, builds a PartitionPruneInfo that allows the
+ * executor to do so and appends it to root->partPruneInfos.
+ *
+ * Returns the 0-based index of the added PartitionPruneInfo, or -1 if none
+ * was built.
*
* 'parentrel' is the RelOptInfo for an appendrel, and 'subpaths' is the list
* of scan paths for its child rels.
* 'prunequal' is a list of potential pruning quals (i.e., restriction
* clauses that are applicable to the appendrel).
*/
-PartitionPruneInfo *
+int
make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
List *subpaths,
List *prunequal)
@@ -230,6 +236,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int *relid_subplan_map;
ListCell *lc;
int i;
+ bool needs_init_pruning = false;
+ bool needs_exec_pruning = false;
/*
* Scan the subpaths to see which ones are scans of partition child
@@ -309,12 +317,16 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
Bitmapset *partrelids = (Bitmapset *) lfirst(lc);
List *pinfolist;
Bitmapset *matchedsubplans = NULL;
+ bool partrel_needs_init_pruning;
+ bool partrel_needs_exec_pruning;
pinfolist = make_partitionedrel_pruneinfo(root, parentrel,
prunequal,
partrelids,
relid_subplan_map,
- &matchedsubplans);
+ &matchedsubplans,
+ &partrel_needs_init_pruning,
+ &partrel_needs_exec_pruning);
/* When pruning is possible, record the matched subplans */
if (pinfolist != NIL)
@@ -323,6 +335,10 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
allmatchedsubplans = bms_join(matchedsubplans,
allmatchedsubplans);
}
+ if (!needs_init_pruning)
+ needs_init_pruning = partrel_needs_init_pruning;
+ if (!needs_exec_pruning)
+ needs_exec_pruning = partrel_needs_exec_pruning;
}
pfree(relid_subplan_map);
@@ -332,11 +348,13 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
* quals, then we can just not bother with run-time pruning.
*/
if (prunerelinfos == NIL)
- return NULL;
+ return -1;
/* Else build the result data structure */
pruneinfo = makeNode(PartitionPruneInfo);
pruneinfo->prune_infos = prunerelinfos;
+ pruneinfo->needs_init_pruning = needs_init_pruning;
+ pruneinfo->needs_exec_pruning = needs_exec_pruning;
/*
* Some subplans may not belong to any of the identified partitioned rels.
@@ -358,7 +376,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
else
pruneinfo->other_subplans = NULL;
- return pruneinfo;
+ root->partPruneInfos = lappend(root->partPruneInfos, pruneinfo);
+
+ return list_length(root->partPruneInfos) - 1;
}
/*
@@ -435,13 +455,18 @@ add_part_relids(List *allpartrelids, Bitmapset *partrelids)
* If we cannot find any useful run-time pruning steps, return NIL.
* However, on success, each rel identified in partrelids will have
* an element in the result list, even if some of them are useless.
+ * *needs_init_pruning and *needs_exec_pruning are set to indicate whether
+ * the returned PartitionedRelPruneInfos contain pruning steps that can be
+ * performed before and after execution begins, respectively.
*/
static List *
make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
List *prunequal,
Bitmapset *partrelids,
int *relid_subplan_map,
- Bitmapset **matchedsubplans)
+ Bitmapset **matchedsubplans,
+ bool *needs_init_pruning,
+ bool *needs_exec_pruning)
{
RelOptInfo *targetpart = NULL;
List *pinfolist = NIL;
@@ -452,6 +477,10 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int rti;
int i;
+ /* Will find out below. */
+ *needs_init_pruning = false;
+ *needs_exec_pruning = false;
+
/*
* Examine each partitioned rel, constructing a temporary array to map
* from planner relids to index of the partitioned rel, and building a
@@ -539,6 +568,9 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
* executor per-scan pruning steps. This first pass creates startup
* pruning steps and detects whether there's any possibly-useful quals
* that would require per-scan pruning.
+ *
+ * The first pass also determines whether the second pass is necessary
+ * by noting the presence of EXEC parameters.
*/
gen_partprune_steps(subpart, partprunequal, PARTTARGET_INITIAL,
&context);
@@ -613,6 +645,11 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
pinfo->execparamids = execparamids;
/* Remaining fields will be filled in the next loop */
+ if (!*needs_init_pruning)
+ *needs_init_pruning = (initial_pruning_steps != NIL);
+ if (!*needs_exec_pruning)
+ *needs_exec_pruning = (exec_pruning_steps != NIL);
+
pinfolist = lappend(pinfolist, pinfo);
}
@@ -640,6 +677,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int *subplan_map;
int *subpart_map;
Oid *relid_map;
+ Index *rti_map;
/*
* Construct the subplan and subpart maps for this partitioning level.
@@ -652,6 +690,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
subpart_map = (int *) palloc(nparts * sizeof(int));
memset(subpart_map, -1, nparts * sizeof(int));
relid_map = (Oid *) palloc0(nparts * sizeof(Oid));
+ rti_map = (Index *) palloc0(nparts * sizeof(Index));
present_parts = NULL;
i = -1;
@@ -666,6 +705,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
subplan_map[i] = subplanidx = relid_subplan_map[partrel->relid] - 1;
subpart_map[i] = subpartidx = relid_subpart_map[partrel->relid] - 1;
relid_map[i] = planner_rt_fetch(partrel->relid, root)->relid;
+ rti_map[i] = partrel->relid;
if (subplanidx >= 0)
{
present_parts = bms_add_member(present_parts, i);
@@ -690,6 +730,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
pinfo->subplan_map = subplan_map;
pinfo->subpart_map = subpart_map;
pinfo->relid_map = relid_map;
+ pinfo->rti_map = rti_map;
}
pfree(relid_subpart_map);
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 95dc2e2c83..8dc52a158f 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1603,6 +1603,7 @@ exec_bind_message(StringInfo input_message)
int16 *rformats = NULL;
CachedPlanSource *psrc;
CachedPlan *cplan;
+ List *part_prune_result_list;
Portal portal;
char *query_string;
char *saved_stmt_name;
@@ -1978,7 +1979,9 @@ exec_bind_message(StringInfo input_message)
* will be generated in MessageContext. The plan refcount will be
* assigned to the Portal, so it will be released at portal destruction.
*/
- cplan = GetCachedPlan(psrc, params, NULL, NULL);
+ cplan = GetCachedPlan(psrc, params, NULL, NULL, &part_prune_result_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_result_list));
/*
* Now we can define the portal.
@@ -1993,6 +1996,9 @@ exec_bind_message(StringInfo input_message)
cplan->stmt_list,
cplan);
+ /* Copy PartitionPruneResults into the portal's context. */
+ PortalStorePartitionPruneResults(portal, part_prune_result_list);
+
/* Done with the snapshot used for parameter I/O and parsing/planning */
if (snapshot_set)
PopActiveSnapshot();
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 5aa5a350f3..a627448a5a 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -35,7 +35,7 @@
Portal ActivePortal = NULL;
-static void ProcessQuery(PlannedStmt *plan,
+static void ProcessQuery(PlannedStmt *plan, PartitionPruneResult *part_prune_result,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -65,6 +65,7 @@ static void DoPortalRewind(Portal portal);
*/
QueryDesc *
CreateQueryDesc(PlannedStmt *plannedstmt,
+ PartitionPruneResult *part_prune_result,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
@@ -77,6 +78,8 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
qd->operation = plannedstmt->commandType; /* operation */
qd->plannedstmt = plannedstmt; /* plan */
+ qd->part_prune_result = part_prune_result; /* ExecutorDoInitialPruning()
+ * output for plan */
qd->sourceText = sourceText; /* query text */
qd->snapshot = RegisterSnapshot(snapshot); /* snapshot */
/* RI check snapshot */
@@ -122,6 +125,7 @@ FreeQueryDesc(QueryDesc *qdesc)
* PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
*
* plan: the plan tree for the query
+ * part_prune_result: ExecutorDoInitialPruning() output for the plan tree
* sourceText: the source text of the query
* params: any parameters needed
* dest: where to send results
@@ -134,6 +138,7 @@ FreeQueryDesc(QueryDesc *qdesc)
*/
static void
ProcessQuery(PlannedStmt *plan,
+ PartitionPruneResult *part_prune_result,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -145,7 +150,7 @@ ProcessQuery(PlannedStmt *plan,
/*
* Create the QueryDesc object
*/
- queryDesc = CreateQueryDesc(plan, sourceText,
+ queryDesc = CreateQueryDesc(plan, part_prune_result, sourceText,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
@@ -491,8 +496,13 @@ PortalStart(Portal portal, ParamListInfo params,
/*
* Create QueryDesc in portal's context; for the moment, set
* the destination to DestNone.
+ *
+ * There is no PartitionPruneResult unless the PlannedStmt is
+ * from a CachedPlan.
*/
queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+ portal->part_prune_results == NIL ? NULL :
+ linitial(portal->part_prune_results),
portal->sourceText,
GetActiveSnapshot(),
InvalidSnapshot,
@@ -1194,6 +1204,7 @@ PortalRunMulti(Portal portal,
{
bool active_snapshot_set = false;
ListCell *stmtlist_item;
+ int i;
/*
* If the destination is DestRemoteExecute, change to DestNone. The
@@ -1214,9 +1225,15 @@ PortalRunMulti(Portal portal,
* Loop to handle the individual queries generated from a single parsetree
* by analysis and rewrite.
*/
+ i = 0;
foreach(stmtlist_item, portal->stmts)
{
PlannedStmt *pstmt = lfirst_node(PlannedStmt, stmtlist_item);
+ PartitionPruneResult *part_prune_result = portal->part_prune_results ?
+ list_nth(portal->part_prune_results, i) :
+ NULL;
+
+ i++;
/*
* If we got a cancel signal in prior command, quit
@@ -1274,7 +1291,7 @@ PortalRunMulti(Portal portal,
if (pstmt->canSetTag)
{
/* statement can set tag string */
- ProcessQuery(pstmt,
+ ProcessQuery(pstmt, part_prune_result,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
@@ -1283,7 +1300,7 @@ PortalRunMulti(Portal portal,
else
{
/* stmt added by rewrite cannot set tag */
- ProcessQuery(pstmt,
+ ProcessQuery(pstmt, part_prune_result,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 4cf6db504f..6cb473f2f4 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -99,14 +99,19 @@ static dlist_head cached_expression_list = DLIST_STATIC_INIT(cached_expression_l
static void ReleaseGenericPlan(CachedPlanSource *plansource);
static List *RevalidateCachedQuery(CachedPlanSource *plansource,
QueryEnvironment *queryEnv);
-static bool CheckCachedPlan(CachedPlanSource *plansource);
+static bool CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+ List **part_prune_result_list);
static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
- ParamListInfo boundParams, QueryEnvironment *queryEnv);
+ ParamListInfo boundParams, QueryEnvironment *queryEnv,
+ List **part_prune_result_list);
static bool choose_custom_plan(CachedPlanSource *plansource,
ParamListInfo boundParams);
static double cached_plan_cost(CachedPlan *plan, bool include_planner);
static Query *QueryListGetPrimaryStmt(List *stmts);
-static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static void AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+ List **part_prune_result_list,
+ List **lockedRelids_per_stmt);
+static void ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt);
static void AcquirePlannerLocks(List *stmt_list, bool acquire);
static void ScanQueryForLocks(Query *parsetree, bool acquire);
static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -790,15 +795,20 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
*
* On a "true" return, we have acquired the locks needed to run the plan.
* (We must do this for the "true" result to be race-condition-free.)
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_result_list.
*/
static bool
-CheckCachedPlan(CachedPlanSource *plansource)
+CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+ List **part_prune_result_list)
{
CachedPlan *plan = plansource->gplan;
/* Assert that caller checked the querytree */
Assert(plansource->is_valid);
+ *part_prune_result_list = NIL;
+
/* If there's no generic plan, just say "false" */
if (!plan)
return false;
@@ -820,13 +830,21 @@ CheckCachedPlan(CachedPlanSource *plansource)
*/
if (plan->is_valid)
{
+ List *lockedRelids_per_stmt;
+
/*
* Plan must have positive refcount because it is referenced by
* plansource; so no need to fear it disappears under us here.
*/
Assert(plan->refcount > 0);
- AcquireExecutorLocks(plan->stmt_list, true);
+ /*
+ * Lock relations scanned by the plan. Initial partition pruning, if
+ * the plan needs any, is also performed here.
+ */
+ AcquireExecutorLocks(plan->stmt_list, boundParams,
+ part_prune_result_list,
+ &lockedRelids_per_stmt);
/*
* If plan was transient, check to see if TransactionXmin has
@@ -848,7 +866,14 @@ CheckCachedPlan(CachedPlanSource *plansource)
}
/* Oops, the race case happened. Release useless locks. */
- AcquireExecutorLocks(plan->stmt_list, false);
+ ReleaseExecutorLocks(plan->stmt_list, lockedRelids_per_stmt);
+
+ /*
+ * The output list and any objects therein were allocated in the
+ * caller's (hopefully short-lived) context, so they won't stay leaked
+ * for long; reset the pointer so that nobody looks at them by accident.
+ */
+ *part_prune_result_list = NIL;
}
/*
@@ -874,10 +899,15 @@ CheckCachedPlan(CachedPlanSource *plansource)
* Planning work is done in the caller's memory context. The finished plan
* is in a child memory context, which typically should get reparented
* (unless this is a one-shot plan, in which case we don't copy the plan).
+ *
+ * A list of NULLs is returned in *part_prune_result_list, meaning that no
+ * PartitionPruneResult nodes have yet been created for the plans in
+ * stmt_list.
*/
static CachedPlan *
BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
- ParamListInfo boundParams, QueryEnvironment *queryEnv)
+ ParamListInfo boundParams, QueryEnvironment *queryEnv,
+ List **part_prune_result_list)
{
CachedPlan *plan;
List *plist;
@@ -1007,6 +1037,17 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
MemoryContextSwitchTo(oldcxt);
+ /*
+ * There are no actual PartitionPruneResults to add yet, but the list
+ * must still be initialized to have the same number of elements as the
+ * list of PlannedStmts.
+ */
+ *part_prune_result_list = NIL;
+ foreach(lc, plist)
+ {
+ *part_prune_result_list = lappend(*part_prune_result_list, NULL);
+ }
+
return plan;
}
@@ -1126,6 +1167,17 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
* plan or a custom plan for the given parameters: the caller does not know
* which it will get.
*
+ * If part_prune_result_list is not NULL, an element that is either a
+ * PartitionPruneResult or NULL is added to *part_prune_result_list for every
+ * PlannedStmt in the returned CachedPlan. It is a PartitionPruneResult if
+ * the PlannedStmt comes from an existing, otherwise valid CachedPlan and
+ * contains at least one PartitionPruneInfo with "initial" pruning steps.
+ * Those steps are performed by calling ExecutorDoInitialPruning(), which
+ * prunes away subplans that don't match the pruning conditions, so that
+ * AcquireExecutorLocks() locks only the leaf partitions that remain. The
+ * PartitionPruneResult contains a list of bitmapsets of the indexes of
+ * matching subplans, one for each PartitionPruneInfo.
+ *
* On return, the plan is valid and we have sufficient locks to begin
* execution.
*
@@ -1139,11 +1191,13 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
*/
CachedPlan *
GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
- ResourceOwner owner, QueryEnvironment *queryEnv)
+ ResourceOwner owner, QueryEnvironment *queryEnv,
+ List **part_prune_result_list)
{
CachedPlan *plan = NULL;
List *qlist;
bool customplan;
+ List *my_part_prune_result_list;
/* Assert caller is doing things in a sane order */
Assert(plansource->magic == CACHEDPLANSOURCE_MAGIC);
@@ -1160,7 +1214,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
if (!customplan)
{
- if (CheckCachedPlan(plansource))
+ if (CheckCachedPlan(plansource, boundParams,
+ &my_part_prune_result_list))
{
/* We want a generic plan, and we already have a valid one */
plan = plansource->gplan;
@@ -1169,7 +1224,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
else
{
/* Build a new generic plan */
- plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv);
+ plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv,
+ &my_part_prune_result_list);
/* Just make real sure plansource->gplan is clear */
ReleaseGenericPlan(plansource);
/* Link the new generic plan into the plansource */
@@ -1214,7 +1270,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
if (customplan)
{
/* Build a custom plan */
- plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv);
+ plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv,
+ &my_part_prune_result_list);
/* Accumulate total costs of custom plans */
plansource->total_custom_cost += cached_plan_cost(plan, true);
@@ -1246,6 +1303,9 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
plan->is_saved = true;
}
+ if (part_prune_result_list)
+ *part_prune_result_list = my_part_prune_result_list;
+
return plan;
}
@@ -1737,17 +1797,29 @@ QueryListGetPrimaryStmt(List *stmts)
/*
* AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
- * or release them if acquire is false.
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_result_list.
+ *
+ * On return, *lockedRelids_per_stmt will contain a bitmapset for every
+ * PlannedStmt in stmt_list, containing the RT indexes of relation entries
+ * in its range table that were actually locked, or NULL if the PlannedStmt
+ * contains a utility statement.
*/
static void
-AcquireExecutorLocks(List *stmt_list, bool acquire)
+AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+ List **part_prune_result_list,
+ List **lockedRelids_per_stmt)
{
ListCell *lc1;
+ *part_prune_result_list = *lockedRelids_per_stmt = NIL;
foreach(lc1, stmt_list)
{
PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
- ListCell *lc2;
+ PartitionPruneResult *part_prune_result = NULL;
+ Bitmapset *allLockRelids;
+ Bitmapset *lockedRelids = NULL;
+ int rti;
if (plannedstmt->commandType == CMD_UTILITY)
{
@@ -1761,13 +1833,35 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
Query *query = UtilityContainsQuery(plannedstmt->utilityStmt);
if (query)
- ScanQueryForLocks(query, acquire);
+ ScanQueryForLocks(query, true);
+ *part_prune_result_list = lappend(*part_prune_result_list, NULL);
+ *lockedRelids_per_stmt = lappend(*lockedRelids_per_stmt, NULL);
continue;
}
- foreach(lc2, plannedstmt->rtable)
+ /*
+ * Figure out the set of relations that would need to be locked
+ * before executing the plan.
+ */
+ if (plannedstmt->containsInitialPruning)
{
- RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
+ /*
+ * Obtain the set of partitions to be locked from the
+ * PartitionPruneInfos by considering the result of performing
+ * initial partition pruning.
+ */
+ part_prune_result =
+ ExecutorDoInitialPruning(plannedstmt, boundParams);
+
+ allLockRelids = bms_union(plannedstmt->minLockRelids,
+ part_prune_result->scan_leafpart_rtis);
+ }
+ else
+ allLockRelids = plannedstmt->minLockRelids;
+
+ rti = -1;
+ while ((rti = bms_next_member(allLockRelids, rti)) >= 0)
+ {
+ RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
if (rte->rtekind != RTE_RELATION)
continue;
@@ -1778,10 +1872,58 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
* fail if it's been dropped entirely --- we'll just transiently
* acquire a non-conflicting lock.
*/
- if (acquire)
- LockRelationOid(rte->relid, rte->rellockmode);
- else
- UnlockRelationOid(rte->relid, rte->rellockmode);
+ LockRelationOid(rte->relid, rte->rellockmode);
+ lockedRelids = bms_add_member(lockedRelids, rti);
+ }
+
+ *part_prune_result_list = lappend(*part_prune_result_list,
+ part_prune_result);
+ *lockedRelids_per_stmt = lappend(*lockedRelids_per_stmt, lockedRelids);
+ }
+}
+
+/*
+ * ReleaseExecutorLocks
+ * Release locks acquired by an earlier call to
+ * AcquireExecutorLocks().
+ */
+static void
+ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt)
+{
+ ListCell *lc1,
+ *lc2;
+
+ forboth(lc1, stmt_list, lc2, lockedRelids_per_stmt)
+ {
+ PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
+ Bitmapset *lockedRelids = lfirst(lc2);
+ int rti;
+
+ if (plannedstmt->commandType == CMD_UTILITY)
+ {
+ /*
+ * Ignore utility statements, except those (such as EXPLAIN) that
+ * contain a parsed-but-not-planned query. Note: it's okay to use
+ * ScanQueryForLocks, even though the query hasn't been through
+ * rule rewriting, because rewriting doesn't change the query
+ * representation.
+ */
+ Query *query = UtilityContainsQuery(plannedstmt->utilityStmt);
+
+ if (query)
+ ScanQueryForLocks(query, false);
+ continue;
+ }
+
+ rti = -1;
+ while ((rti = bms_next_member(lockedRelids, rti)) >= 0)
+ {
+ RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
+
+ Assert(rte->rtekind == RTE_RELATION);
+
+ /* See the comment in AcquireExecutorLocks(). */
+ UnlockRelationOid(rte->relid, rte->rellockmode);
}
}
}
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index d549f66d4a..1bbe6b704b 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -303,6 +303,25 @@ PortalDefineQuery(Portal portal,
portal->status = PORTAL_DEFINED;
}
+/*
+ * PortalStorePartitionPruneResults
+ * Copy the given list of PartitionPruneResults into the portal's
+ * context
+ *
+ * This allows the caller to ensure that the list exists as long as the portal
+ * does.
+ */
+void
+PortalStorePartitionPruneResults(Portal portal, List *part_prune_results)
+{
+ MemoryContext oldcxt;
+
+ AssertArg(PortalIsValid(portal));
+ oldcxt = MemoryContextSwitchTo(portal->portalContext);
+ portal->part_prune_results = copyObject(part_prune_results);
+ MemoryContextSwitchTo(oldcxt);
+}
+
/*
* PortalReleaseCachedPlan
* Release a portal's reference to its cached plan, if any.
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 666977fb1f..34975c69ee 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -87,7 +87,8 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
ExplainState *es, const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv);
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt, PartitionPruneResult *part_prune_resul,
+ IntoClause *into,
ExplainState *es, const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv,
const instr_time *planduration,
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 708435e952..bd8776402e 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -45,6 +45,7 @@ extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
* nparts Length of subplan_map[] and subpart_map[].
* subplan_map Subplan index by partition index, or -1.
* subpart_map Subpart index by partition index, or -1.
+ * rti_map Range table index by partition index, or 0.
* present_parts A Bitmapset of the partition indexes that we
* have subplans or subparts for.
* initial_pruning_steps List of PartitionPruneSteps used to
@@ -61,6 +62,7 @@ typedef struct PartitionedRelPruningData
int nparts;
int *subplan_map;
int *subpart_map;
+ Index *rti_map;
Bitmapset *present_parts;
List *initial_pruning_steps;
List *exec_pruning_steps;
@@ -123,9 +125,13 @@ typedef struct PartitionPruneState
extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
int n_total_subplans,
- PartitionPruneInfo *pruneinfo,
+ int part_prune_index,
Bitmapset **initially_valid_subplans);
extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
- bool initial_prune);
-
+ bool initial_prune,
+ Bitmapset **scan_leafpart_rtis);
+extern Bitmapset *ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt,
+ ParamListInfo params,
+ PartitionPruneInfo *pruneinfo,
+ Bitmapset **scan_leafpart_rtis);
#endif /* EXECPARTITION_H */
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index e79e2c001f..60d5644908 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -35,6 +35,8 @@ typedef struct QueryDesc
/* These fields are provided by CreateQueryDesc */
CmdType operation; /* CMD_SELECT, CMD_UPDATE, etc. */
PlannedStmt *plannedstmt; /* planner's output (could be utility, too) */
+ PartitionPruneResult *part_prune_result; /* ExecutorDoInitialPruning()'s
+ * output for plannedstmt */
const char *sourceText; /* source text of the query */
Snapshot snapshot; /* snapshot to use for query */
Snapshot crosscheck_snapshot; /* crosscheck for RI update/delete */
@@ -57,6 +59,7 @@ typedef struct QueryDesc
/* in pquery.c */
extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
+ PartitionPruneResult *part_prune_result,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index 873772f188..57dc0e8077 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -185,6 +185,8 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
/*
* prototypes from functions in execMain.c
*/
+extern PartitionPruneResult *ExecutorDoInitialPruning(PlannedStmt *plannedstmt,
+ ParamListInfo params);
extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
extern void ExecutorRun(QueryDesc *queryDesc,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index cbbcff81d2..3de4df1b05 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -596,6 +596,8 @@ typedef struct EState
struct ExecRowMark **es_rowmarks; /* Array of per-range-table-entry
* ExecRowMarks, or NULL if none */
PlannedStmt *es_plannedstmt; /* link to top of plan tree */
+ List *es_part_prune_infos; /* PlannedStmt.partPruneInfos */
+ struct PartitionPruneResult *es_part_prune_result; /* QueryDesc.part_prune_result */
const char *es_sourceText; /* Source text from QueryDesc */
JunkFilter *es_junkFilter; /* top-level junk filter, if any */
@@ -984,6 +986,34 @@ typedef struct DomainConstraintState
*/
typedef TupleTableSlot *(*ExecProcNodeMtd) (struct PlanState *pstate);
+/*----------------
+ * PartitionPruneResult
+ *
+ * The result of performing ExecutorDoInitialPruning() invocation on a given
+ * PlannedStmt.
+ *
+ * Contains a list of Bitmapset of the indexes of the subplans remaining after
+ * performing initial pruning by calling ExecFindMatchingSubPlans() for every
+ * PartitionPruneInfos found in PlannedStmt.partPruneInfos. RT indexes of the
+ * leaf partitions scanned by those subplans across all PartitionPruneInfos
+ * are added into scan_leafpart_rtis.
+ *
+ * This is used by GetCachedPlan() to inform its callers of the pruning
+ * decisions made when performing AcquireExecutorLocks() on a given cached
+ * PlannedStmt, which the callers then pass that on to the executor. The
+ * executor refers to this node when made available when initializing the plan
+ * nodes to which those PartitionPruneInfos apply so that the same set of
+ * qualifying subplans are initialized, rather than deriving that set again by
+ * redoing initial pruning.
+ */
+typedef struct PartitionPruneResult
+{
+ NodeTag type;
+
+ List *valid_subplan_offs_list;
+ Bitmapset *scan_leafpart_rtis;
+} PartitionPruneResult;
+
/* ----------------
* PlanState node
*
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 300824258e..de312b9215 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -97,6 +97,9 @@ typedef enum NodeTag
T_PartitionPruneStepCombine,
T_PlanInvalItem,
+ /* TAGS FOR EXECUTOR PREP NODES (execnodes.h) */
+ T_PartitionPruneResult,
+
/*
* TAGS FOR PLAN STATE NODES (execnodes.h)
*
@@ -673,6 +676,7 @@ extern struct Bitmapset *readBitmapset(void);
extern uintptr_t readDatum(bool typbyval);
extern bool *readBoolCols(int numCols);
extern int *readIntCols(int numCols);
+extern Index *readIndexCols(int numCols);
extern Oid *readOidCols(int numCols);
extern int16 *readAttrNumberCols(int numCols);
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 6cbcb67bdf..d9c482e08b 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -107,6 +107,18 @@ typedef struct PlannerGlobal
List *appendRelations; /* "flat" list of AppendRelInfos */
+ List *partPruneInfos; /* List of PartitionPruneInfo contained in
+ * the plan */
+
+ bool containsInitialPruning; /* Do any of those PartitionPruneInfos
+ * have initial (pre-exec) pruning
+ * steps in them? */
+
+ Bitmapset *minLockRelids; /* Indexes of all range table entries minus
+ * indexes of range table entries of the leaf
+ * partitions scanned by prunable subplans;
+ * see AcquireExecutorLocks() */
+
List *relationOids; /* OIDs of relations the plan depends on */
List *invalItems; /* other dependencies, as PlanInvalItems */
@@ -377,6 +389,9 @@ struct PlannerInfo
/* Does this query modify any partition key columns? */
bool partColsUpdated;
+
+ /* PartitionPruneInfos added in this query's plan. */
+ List *partPruneInfos;
};
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 10dd35f011..44997d595d 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -64,8 +64,20 @@ typedef struct PlannedStmt
struct Plan *planTree; /* tree of Plan nodes */
+ List *partPruneInfos; /* List of PartitionPruneInfo contained in
+ * the plan */
+
+ bool containsInitialPruning; /* Do any of those PartitionPruneInfos
+ * have initial (pre-exec) pruning
+ * steps in them? */
+
List *rtable; /* list of RangeTblEntry nodes */
+ Bitmapset *minLockRelids; /* Indexes of all range table entries minus
+ * indexes of range table entries of the leaf
+ * partitions scanned by prunable subplans;
+ * see AcquireExecutorLocks() */
+
/* rtable indexes of target relations for INSERT/UPDATE/DELETE */
List *resultRelations; /* integer list of RT indexes, or NIL */
@@ -262,8 +274,12 @@ typedef struct Append
*/
int first_partial_plan;
- /* Info for run-time subplan pruning; NULL if we're not doing that */
- struct PartitionPruneInfo *part_prune_info;
+ /*
+ * Index of this plan's PartitionPruneInfo in PlannedStmt.partPruneInfos
+ * to be used for run-time subplan pruning; -1 if run-time pruning is
+ * not needed.
+ */
+ int part_prune_index;
} Append;
/* ----------------
@@ -282,8 +298,13 @@ typedef struct MergeAppend
Oid *sortOperators; /* OIDs of operators to sort them by */
Oid *collations; /* OIDs of collations */
bool *nullsFirst; /* NULLS FIRST/LAST directions */
- /* Info for run-time subplan pruning; NULL if we're not doing that */
- struct PartitionPruneInfo *part_prune_info;
+
+ /*
+ * Index of this plan's PartitionPruneInfo in PlannedStmt.partPruneInfos
+ * to be used for run-time subplan pruning; -1 if run-time pruning is
+ * not needed.
+ */
+ int part_prune_index;
} MergeAppend;
/* ----------------
@@ -1187,6 +1208,13 @@ typedef struct PlanRowMark
* prune_infos List of Lists containing PartitionedRelPruneInfo nodes,
* one sublist per run-time-prunable partition hierarchy
* appearing in the parent plan node's subplans.
+ *
+ * needs_init_pruning Does any of the PartitionedRelPruneInfos in
+ * prune_infos have its initial_pruning_steps set?
+ *
+ * needs_exec_pruning Does any of the PartitionedRelPruneInfos in
+ * prune_infos have its exec_pruning_steps set?
+ *
* other_subplans Indexes of any subplans that are not accounted for
* by any of the PartitionedRelPruneInfo nodes in
* "prune_infos". These subplans must not be pruned.
@@ -1195,6 +1223,8 @@ typedef struct PartitionPruneInfo
{
NodeTag type;
List *prune_infos;
+ bool needs_init_pruning;
+ bool needs_exec_pruning;
Bitmapset *other_subplans;
} PartitionPruneInfo;
@@ -1225,6 +1255,7 @@ typedef struct PartitionedRelPruneInfo
int *subplan_map; /* subplan index by partition index, or -1 */
int *subpart_map; /* subpart index by partition index, or -1 */
Oid *relid_map; /* relation OID by partition index, or 0 */
+ Index *rti_map; /* range table index by partition index, or 0 */
/*
* initial_pruning_steps shows how to prune during executor startup (i.e.,
diff --git a/src/include/partitioning/partprune.h b/src/include/partitioning/partprune.h
index 90684efa25..ebf0dcff8c 100644
--- a/src/include/partitioning/partprune.h
+++ b/src/include/partitioning/partprune.h
@@ -70,10 +70,10 @@ typedef struct PartitionPruneContext
#define PruneCxtStateIdx(partnatts, step_id, keyno) \
((partnatts) * (step_id) + (keyno))
-extern PartitionPruneInfo *make_partition_pruneinfo(struct PlannerInfo *root,
- struct RelOptInfo *parentrel,
- List *subpaths,
- List *prunequal);
+extern int make_partition_pruneinfo(struct PlannerInfo *root,
+ struct RelOptInfo *parentrel,
+ List *subpaths,
+ List *prunequal);
extern Bitmapset *prune_append_rel_partitions(struct RelOptInfo *rel);
extern Bitmapset *get_matching_partitions(PartitionPruneContext *context,
List *pruning_steps);
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 95b99e3d25..449200b949 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -220,7 +220,8 @@ extern List *CachedPlanGetTargetList(CachedPlanSource *plansource,
extern CachedPlan *GetCachedPlan(CachedPlanSource *plansource,
ParamListInfo boundParams,
ResourceOwner owner,
- QueryEnvironment *queryEnv);
+ QueryEnvironment *queryEnv,
+ List **part_prune_result_list);
extern void ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner);
extern bool CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index aeddbdafe5..9f7727a837 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -138,6 +138,7 @@ typedef struct PortalData
QueryCompletion qc; /* command completion data for executed query */
List *stmts; /* list of PlannedStmts */
CachedPlan *cplan; /* CachedPlan, if stmts are from one */
+ List *part_prune_results; /* list of PartitionPruneResults */
ParamListInfo portalParams; /* params to pass to query */
QueryEnvironment *queryEnv; /* environment for query */
@@ -242,6 +243,8 @@ extern void PortalDefineQuery(Portal portal,
CommandTag commandTag,
List *stmts,
CachedPlan *cplan);
+extern void PortalStorePartitionPruneResults(Portal portal,
+ List *part_prune_result_list);
extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
extern void PortalCreateHoldStore(Portal portal);
extern void PortalHashTableDeleteAll(void);
--
2.24.1
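
For readers following along, the lock-set computation in the AcquireExecutorLocks() hunk above can be sketched in isolation. This is only a toy model, not the patch's code: RtiSet, compute_lock_set(), next_member() and count_locked() are hypothetical names, and a single 64-bit mask stands in for a Bitmapset. The only behaviour borrowed from PostgreSQL is that bms_next_member() returns a negative value (-2) once the set is exhausted.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for a Bitmapset: bit i set means range table index i is a member. */
typedef uint64_t RtiSet;

/*
 * Hypothetical helper mirroring the hunk's logic: when the plan contains
 * initial pruning steps, lock the always-needed relations plus the leaf
 * partitions that survived pruning; otherwise lock only the minimal set.
 */
static RtiSet
compute_lock_set(RtiSet min_lock_relids, RtiSet scan_leafpart_rtis,
                 int contains_initial_pruning)
{
    if (contains_initial_pruning)
        return min_lock_relids | scan_leafpart_rtis;
    return min_lock_relids;
}

/* Smallest member greater than prev, or -2 if there is none. */
static int
next_member(RtiSet set, int prev)
{
    for (int i = prev + 1; i < 64; i++)
    {
        if (set & ((RtiSet) 1 << i))
            return i;
    }
    return -2;
}

/* Walk the set the way the locking loop does and count the relations visited. */
static int
count_locked(RtiSet set)
{
    int n = 0;
    int rti = -1;

    while ((rti = next_member(set, rti)) >= 0)
        n++;
    return n;
}
```

With min_lock_relids = {1, 2} and surviving leaf partitions {3, 4}, the sketch visits four relations when initial pruning is present and two when it is not.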
* Re: generic plans and "initial" pruning
@ 2022-04-08 11:15 David Rowley <[email protected]>
parent: Amit Langote <[email protected]>
0 siblings, 1 reply; 71+ messages in thread
From: David Rowley @ 2022-04-08 11:15 UTC (permalink / raw)
To: Amit Langote <[email protected]>; +Cc: Robert Haas <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers
On Fri, 8 Apr 2022 at 17:49, Amit Langote <[email protected]> wrote:
> Attached updated patch with these changes.
Thanks for making the changes. I started looking over this patch but
really feel like it needs quite a few more iterations of what we've
just been doing to get it into proper committable shape. There seems
to be only about 40 mins to go before the freeze, so it seems very
unrealistic that it could be made to work.
I started trying to take a serious look at it this evening, but I feel
like I just failed to get into it deep enough to make any meaningful
improvements. I'd need more time to study the problem before I could
build up a proper opinion on how exactly I think it should work.
Anyway, I've attached a small patch containing a few things I adjusted
or questioned while reading over your v13 patch. Some of these are just
me questioning your code (see the XXX comments) and some I think are
improvements. Feel free to take the hunks that you see fit and drop
anything you don't.
David
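
Two of the partprune.c hunks in this patch rewrite how the needs_init_pruning and needs_exec_pruning flags are accumulated. A minimal sketch of why the rewritten forms behave the same as the original guarded assignment is given here; the function names are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Original form: overwrite the flag only while it is still false. */
static bool
accumulate_guarded(bool flag, bool seen)
{
    if (!flag)
        flag = seen;
    return flag;
}

/* Rewritten form for the local variables: OR the new observation in. */
static bool
accumulate_or(bool flag, bool seen)
{
    flag |= seen;
    return flag;
}

/* Rewritten form for the output parameters: set the flag when something is seen. */
static void
accumulate_ptr(bool *flag, bool seen)
{
    if (seen)
        *flag = true;
}

/* Wrapper so the pointer form can be compared as a plain expression. */
static bool
accumulate_via_pointer(bool flag, bool seen)
{
    accumulate_ptr(&flag, seen);
    return flag;
}
```

All three forms leave the flag true once any contributing value has been true, so the rewrite is a readability change rather than a behavioural one.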
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 05cc99df8f..5ee978937d 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -121,6 +121,8 @@ static void EvalPlanQualStart(EPQState *epqstate, Plan *planTree);
*
* Note: Partitioned tables mentioned in PartitionedRelPruneInfo nodes that
* drive the pruning will be locked before doing the pruning.
+ *
+ * ----------------------------------------------------------------
*/
PartitionPruneResult *
ExecutorDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params)
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 3037742b8d..e9ca6bc55f 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -1707,6 +1707,7 @@ ExecInitPartitionPruning(PlanState *planstate,
Assert(n_total_subplans > 0);
*initially_valid_subplans = bms_add_range(NULL, 0,
n_total_subplans - 1);
+ return prunestate;
}
/*
@@ -1714,14 +1715,15 @@ ExecInitPartitionPruning(PlanState *planstate,
* that were removed above due to initial pruning. No need to do this if
* no steps were removed.
*/
- if (bms_num_members(*initially_valid_subplans) < n_total_subplans)
+ if (prunestate &&
+ bms_num_members(*initially_valid_subplans) < n_total_subplans)
{
/*
* We can safely skip this when !do_exec_prune, even though that
* leaves invalid data in prunestate, because that data won't be
* consulted again (cf initial Assert in ExecFindMatchingSubPlans).
*/
- if (prunestate && prunestate->do_exec_prune)
+ if (prunestate->do_exec_prune)
PartitionPruneFixSubPlanMap(prunestate,
*initially_valid_subplans,
n_total_subplans);
@@ -1751,7 +1753,8 @@ ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
Bitmapset *valid_subplan_offs;
/*
- * A temporary context to allocate stuff needded to run the pruning steps.
+ * A temporary context for memory allocations required while executing
+ * partition pruning steps.
*/
tmpcontext = AllocSetContextCreate(CurrentMemoryContext,
"initial pruning working data",
@@ -1765,11 +1768,12 @@ ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
pdir = CreatePartitionDirectory(CurrentMemoryContext, false);
/*
- * We don't yet have a PlanState for the parent plan node, so must create
- * a standalone ExprContext to evaluate pruning expressions, equipped with
- * the information about the EXTERN parameters that the caller passed us.
- * Note that that's okay because the initial pruning steps do not contain
- * anything that requires the execution to have started.
+ * We don't yet have a PlanState for the parent plan node, so we must
+ * create a standalone ExprContext to evaluate pruning expressions,
+ * equipped with the information about the EXTERN parameters that the
+ * caller passed us. Note that that's okay because the initial pruning
+ * steps do not contain anything that requires the execution to have
+ * started.
*/
econtext = CreateStandaloneExprContext();
econtext->ecxt_param_list_info = params;
@@ -1874,7 +1878,6 @@ CreatePartitionPruneState(PlanState *planstate,
PartitionedRelPruneInfo *pinfo = lfirst_node(PartitionedRelPruneInfo, lc2);
PartitionedRelPruningData *pprune = &prunedata->partrelprunedata[j];
Relation partrel;
- bool close_partrel = false;
PartitionDesc partdesc;
PartitionKey partkey;
@@ -1894,7 +1897,6 @@ CreatePartitionPruneState(PlanState *planstate,
int lockmode = (j == 0) ? NoLock : rte->rellockmode;
partrel = table_open(rte->relid, lockmode);
- close_partrel = true;
}
else
partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
@@ -1914,7 +1916,7 @@ CreatePartitionPruneState(PlanState *planstate,
* Must close partrel, keeping the lock taken, if we're not using
* EState's entry.
*/
- if (close_partrel)
+ if (estate == NULL)
table_close(partrel, NoLock);
/*
@@ -2367,6 +2369,8 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
{
*validsubplans = bms_add_member(*validsubplans,
pprune->subplan_map[i]);
+ /* XXX why would pprune->rti_map[i] ever be zero here??? */
+ Assert(pprune->rti_map[i] > 0);
if (scan_leafpart_rtis && pprune->rti_map[i] > 0)
*scan_leafpart_rtis = bms_add_member(*scan_leafpart_rtis,
pprune->rti_map[i]);
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 639145abe9..f9c7976ff2 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -119,6 +119,7 @@ CreateExecutorState(void)
estate->es_relations = NULL;
estate->es_rowmarks = NULL;
estate->es_plannedstmt = NULL;
+ estate->es_part_prune_infos = NIL;
estate->es_part_prune_result = NULL;
estate->es_junkFilter = NULL;
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 09f26658e2..96880e122a 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -94,7 +94,6 @@ static bool ExecAppendAsyncRequest(AppendState *node, TupleTableSlot **result);
static void ExecAppendAsyncEventWait(AppendState *node);
static void classify_matching_subplans(AppendState *node);
-
/* ----------------------------------------------------------------
* ExecInitAppend
*
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index ec6b1f1fc0..fe0df2f1d1 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -1184,7 +1184,6 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
ListCell *subpaths;
int nasyncplans = 0;
RelOptInfo *rel = best_path->path.parent;
- int part_prune_index = -1;
int nodenumsortkeys = 0;
AttrNumber *nodeSortColIdx = NULL;
Oid *nodeSortOperators = NULL;
@@ -1335,6 +1334,9 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
subplans = lappend(subplans, subplan);
}
+ /* Set below if we find quals that we can use to run-time prune */
+ plan->part_prune_index = -1;
+
/*
* If any quals exist, they may be useful to perform further partition
* pruning during execution. Gather information needed by the executor to
@@ -1358,18 +1360,15 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
}
if (prunequal != NIL)
- part_prune_index= make_partition_pruneinfo(root, rel,
- best_path->subpaths,
- prunequal);
+ plan->part_prune_index = make_partition_pruneinfo(root, rel,
+ best_path->subpaths,
+ prunequal);
}
plan->appendplans = subplans;
plan->nasyncplans = nasyncplans;
plan->first_partial_plan = best_path->first_partial_path;
- /* Will be updated later in set_plan_references(). */
- plan->part_prune_index = part_prune_index;
-
copy_generic_path_info(&plan->plan, (Path *) best_path);
/*
@@ -1408,7 +1407,6 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
List *subplans = NIL;
ListCell *subpaths;
RelOptInfo *rel = best_path->path.parent;
- int part_prune_index = -1;
/*
* We don't have the actual creation of the MergeAppend node split out
@@ -1501,6 +1499,9 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
subplans = lappend(subplans, subplan);
}
+ /* Set below if we find quals that we can use to run-time prune */
+ node->part_prune_index = -1;
+
/*
* If any quals exist, they may be useful to perform further partition
* pruning during execution. Gather information needed by the executor to
@@ -1524,15 +1525,13 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
}
if (prunequal != NIL)
- part_prune_index= make_partition_pruneinfo(root, rel,
- best_path->subpaths,
- prunequal);
+ node->part_prune_index = make_partition_pruneinfo(root, rel,
+ best_path->subpaths,
+ prunequal);
}
node->mergeplans = subplans;
- /* Will be updated later in set_plan_references(). */
- node->part_prune_index = part_prune_index;
/*
* If prepare_sort_from_pathkeys added sort columns, but we were told to
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index c88e5bacac..63ec8a98fc 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -408,6 +408,13 @@ set_plan_references(PlannerInfo *root, Plan *plan)
glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
}
+ /*
+ * XXX is it worth doing a bms_copy() on glob->minLockRelids if
+ * glob->containsInitialPruning is true? I'm slightly worried that the
+ * Bitmapset could have a very long empty tail resulting in excessive
+ * looping during AcquireExecutorLocks().
+ */
+
return result;
}
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 5a5f5dee46..952c5b8327 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -212,12 +212,12 @@ static void partkey_datum_from_expr(PartitionPruneContext *context,
/*
* make_partition_pruneinfo
* Checks if the given set of quals can be used to build pruning steps
- * that the executor will use to prune useless ones from given set of
- * child paths, and if so builds a PartitionPruneInfo that will allow the
- * executor to do do and append it to root->partPruneInfos.
+ * that the executor can use to prune away unneeded partitions. If
+ * suitable quals are found then a PartitionPruneInfo is built and tagged
+ * onto the PlannerInfo's partPruneInfos list.
*
- * Return value is 0-based index of the added PartitionPruneInfo or -1 if one
- * was not built after all.
+ * The return value is the 0-based index of the item added to the
+ * partPruneInfos list or -1 if nothing was added.
*
* 'parentrel' is the RelOptInfo for an appendrel, and 'subpaths' is the list
* of scan paths for its child rels.
@@ -335,10 +335,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
allmatchedsubplans = bms_join(matchedsubplans,
allmatchedsubplans);
}
- if (!needs_init_pruning)
- needs_init_pruning = partrel_needs_init_pruning;
- if (!needs_exec_pruning)
- needs_exec_pruning = partrel_needs_exec_pruning;
+
+ needs_init_pruning |= partrel_needs_init_pruning;
+ needs_exec_pruning |= partrel_needs_exec_pruning;
}
pfree(relid_subplan_map);
@@ -570,7 +569,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
* that would require per-scan pruning.
*
* In the first pass, we note whether the 2nd pass is necessary by
- * by noting the presence of EXEC parameters.
+ * noting the presence of EXEC parameters.
*/
gen_partprune_steps(subpart, partprunequal, PARTTARGET_INITIAL,
&context);
@@ -645,10 +644,11 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
pinfo->execparamids = execparamids;
/* Remaining fields will be filled in the next loop */
- if (!*needs_init_pruning)
- *needs_init_pruning = (initial_pruning_steps != NIL);
- if (!*needs_exec_pruning)
- *needs_exec_pruning = (exec_pruning_steps != NIL);
+ /* record which types of pruning steps we've seen so far */
+ if (initial_pruning_steps != NIL)
+ *needs_init_pruning = true;
+ if (exec_pruning_steps != NIL)
+ *needs_exec_pruning = true;
pinfolist = lappend(pinfolist, pinfo);
}
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index a627448a5a..8cc2e2162d 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -1204,7 +1204,6 @@ PortalRunMulti(Portal portal,
{
bool active_snapshot_set = false;
ListCell *stmtlist_item;
- int i;
/*
* If the destination is DestRemoteExecute, change to DestNone. The
@@ -1225,15 +1224,9 @@ PortalRunMulti(Portal portal,
* Loop to handle the individual queries generated from a single parsetree
* by analysis and rewrite.
*/
- i = 0;
foreach(stmtlist_item, portal->stmts)
{
PlannedStmt *pstmt = lfirst_node(PlannedStmt, stmtlist_item);
- PartitionPruneResult *part_prune_result = portal->part_prune_results ?
- list_nth(portal->part_prune_results, i) :
- NULL;
-
- i++;
/*
* If we got a cancel signal in prior command, quit
@@ -1242,6 +1235,8 @@ PortalRunMulti(Portal portal,
if (pstmt->utilityStmt == NULL)
{
+ PartitionPruneResult *part_prune_result = NULL;
+
/*
* process a plannable query.
*/
@@ -1288,6 +1283,14 @@ PortalRunMulti(Portal portal,
else
UpdateActiveSnapshotCommandId();
+ /*
+ * Determine if there's a corresponding PartitionPruneResult for
+ * this PlannedStmt.
+ */
+ if (portal->part_prune_results != NIL)
+ part_prune_result = list_nth(portal->part_prune_results,
+ foreach_current_index(stmtlist_item));
+
if (pstmt->canSetTag)
{
/* statement can set tag string */
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 34975c69ee..bbc8c42d88 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -87,7 +87,8 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
ExplainState *es, const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv);
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, PartitionPruneResult *part_prune_resul,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt,
+ PartitionPruneResult *part_prune_result,
IntoClause *into,
ExplainState *es, const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 43bd293433..a8bf908d63 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -1000,11 +1000,11 @@ typedef TupleTableSlot *(*ExecProcNodeMtd) (struct PlanState *pstate);
*
* This is used by GetCachedPlan() to inform its callers of the pruning
* decisions made when performing AcquireExecutorLocks() on a given cached
- * PlannedStmt, which the callers then pass that on to the executor. The
- * executor refers to this node when made available when initializing the plan
- * nodes to which those PartitionPruneInfos apply so that the same set of
- * qualifying subplans are initialized, rather than deriving that set again by
- * redoing initial pruning.
+ * PlannedStmt, which the callers then pass onto the executor. The executor
+ * refers to this node when made available when initializing the plan nodes to
+ * which those PartitionPruneInfos apply so that the same set of qualifying
+ * subplans are initialized, rather than deriving that set again by redoing
+ * initial pruning.
*/
typedef struct PartitionPruneResult
{
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 550308147d..f8f3971f44 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -274,11 +274,7 @@ typedef struct Append
*/
int first_partial_plan;
- /*
- * Index of this plan's PartitionPruneInfo in PlannedStmt.partPruneInfos
- * to be used for run-time subplan pruning; -1 if run-time pruning is
- * not needed.
- */
+ /* Index to PlannerInfo.partPruneInfos or -1 if no run-time pruning */
int part_prune_index;
} Append;
@@ -299,11 +295,7 @@ typedef struct MergeAppend
Oid *collations; /* OIDs of collations */
bool *nullsFirst; /* NULLS FIRST/LAST directions */
- /*
- * Index of this plan's PartitionPruneInfo in PlannedStmt.partPruneInfos
- * to be used for run-time subplan pruning; -1 if run-time pruning is
- * not needed.
- */
+ /* Index to PlannerInfo.partPruneInfos or -1 if no run-time pruning */
int part_prune_index;
} MergeAppend;
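
The XXX question in the setrefs.c hunk above, about a Bitmapset with a long empty tail, can be illustrated with a toy word-array set. This is a sketch under stated assumptions, not PostgreSQL's Bitmapset: ToySet and every toy_* function are hypothetical, and the model simply assumes that deleting a member clears its bit without shrinking the word array. Whether bms_copy() would actually shorten the real set is not something this sketch models.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_BITS_PER_WORD 64
#define TOY_MAX_WORDS     16

/* Toy word-array set; nwords is how many words an iteration has to inspect. */
typedef struct ToySet
{
    int      nwords;
    uint64_t words[TOY_MAX_WORDS];
} ToySet;

/* Add member x, growing nwords as needed. */
static void
toy_add_member(ToySet *s, int x)
{
    int word = x / TOY_BITS_PER_WORD;

    assert(x >= 0 && word < TOY_MAX_WORDS);
    if (word >= s->nwords)
        s->nwords = word + 1;
    s->words[word] |= (uint64_t) 1 << (x % TOY_BITS_PER_WORD);
}

/* Remove member x; by assumption this never shrinks nwords. */
static void
toy_del_member(ToySet *s, int x)
{
    int word = x / TOY_BITS_PER_WORD;

    if (word < s->nwords)
        s->words[word] &= ~((uint64_t) 1 << (x % TOY_BITS_PER_WORD));
}

/*
 * Build a set of range table indexes 1..nrels, then remove first_leaf..nrels,
 * imitating "all RTEs minus the leaf partitions of prunable subplans".
 */
static ToySet
toy_build(int nrels, int first_leaf)
{
    ToySet s = {0};

    for (int i = 1; i <= nrels; i++)
        toy_add_member(&s, i);
    for (int i = first_leaf; i <= nrels; i++)
        toy_del_member(&s, i);
    return s;
}

/* Number of words left once trailing all-zero words are dropped. */
static int
toy_trimmed_nwords(ToySet s)
{
    int n = s.nwords;

    while (n > 0 && s.words[n - 1] == 0)
        n--;
    return n;
}
```

In this model, a plan with 1000 range table entries of which only the first 10 remain in the set still carries 16 words, although one word would hold every remaining member.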
Attachments:
[text/plain] misc_fixes.patch.txt (15.8K, 2-misc_fixes.patch.txt)
download | inline diff:
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 05cc99df8f..5ee978937d 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -121,6 +121,8 @@ static void EvalPlanQualStart(EPQState *epqstate, Plan *planTree);
*
* Note: Partitioned tables mentioned in PartitionedRelPruneInfo nodes that
* drive the pruning will be locked before doing the pruning.
+ *
+ * ----------------------------------------------------------------
*/
PartitionPruneResult *
ExecutorDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params)
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 3037742b8d..e9ca6bc55f 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -1707,6 +1707,7 @@ ExecInitPartitionPruning(PlanState *planstate,
Assert(n_total_subplans > 0);
*initially_valid_subplans = bms_add_range(NULL, 0,
n_total_subplans - 1);
+ return prunestate;
}
/*
@@ -1714,14 +1715,15 @@ ExecInitPartitionPruning(PlanState *planstate,
* that were removed above due to initial pruning. No need to do this if
* no steps were removed.
*/
- if (bms_num_members(*initially_valid_subplans) < n_total_subplans)
+ if (prunestate &&
+ bms_num_members(*initially_valid_subplans) < n_total_subplans)
{
/*
* We can safely skip this when !do_exec_prune, even though that
* leaves invalid data in prunestate, because that data won't be
* consulted again (cf initial Assert in ExecFindMatchingSubPlans).
*/
- if (prunestate && prunestate->do_exec_prune)
+ if (prunestate->do_exec_prune)
PartitionPruneFixSubPlanMap(prunestate,
*initially_valid_subplans,
n_total_subplans);
@@ -1751,7 +1753,8 @@ ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
Bitmapset *valid_subplan_offs;
/*
- * A temporary context to allocate stuff needded to run the pruning steps.
+ * A temporary context to for memory allocations required while execution
+ * partition pruning steps.
*/
tmpcontext = AllocSetContextCreate(CurrentMemoryContext,
"initial pruning working data",
@@ -1765,11 +1768,12 @@ ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
pdir = CreatePartitionDirectory(CurrentMemoryContext, false);
/*
- * We don't yet have a PlanState for the parent plan node, so must create
- * a standalone ExprContext to evaluate pruning expressions, equipped with
- * the information about the EXTERN parameters that the caller passed us.
- * Note that that's okay because the initial pruning steps do not contain
- * anything that requires the execution to have started.
+ * We don't yet have a PlanState for the parent plan node, so we must
+ * create a standalone ExprContext to evaluate pruning expressions,
+ * equipped with the information about the EXTERN parameters that the
+ * caller passed us. Note that that's okay because the initial pruning
+ * steps do not contain anything that requires the execution to have
+ * started.
*/
econtext = CreateStandaloneExprContext();
econtext->ecxt_param_list_info = params;
@@ -1874,7 +1878,6 @@ CreatePartitionPruneState(PlanState *planstate,
PartitionedRelPruneInfo *pinfo = lfirst_node(PartitionedRelPruneInfo, lc2);
PartitionedRelPruningData *pprune = &prunedata->partrelprunedata[j];
Relation partrel;
- bool close_partrel = false;
PartitionDesc partdesc;
PartitionKey partkey;
@@ -1894,7 +1897,6 @@ CreatePartitionPruneState(PlanState *planstate,
int lockmode = (j == 0) ? NoLock : rte->rellockmode;
partrel = table_open(rte->relid, lockmode);
- close_partrel = true;
}
else
partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
@@ -1914,7 +1916,7 @@ CreatePartitionPruneState(PlanState *planstate,
* Must close partrel, keeping the lock taken, if we're not using
* EState's entry.
*/
- if (close_partrel)
+ if (estate == NULL)
table_close(partrel, NoLock);
/*
@@ -2367,6 +2369,8 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
{
*validsubplans = bms_add_member(*validsubplans,
pprune->subplan_map[i]);
+ /* XXX why would pprune->rti_map[i] ever be zero here??? */
+ Assert(pprune->rti_map[i] > 0);
if (scan_leafpart_rtis && pprune->rti_map[i] > 0)
*scan_leafpart_rtis = bms_add_member(*scan_leafpart_rtis,
pprune->rti_map[i]);
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 639145abe9..f9c7976ff2 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -119,6 +119,7 @@ CreateExecutorState(void)
estate->es_relations = NULL;
estate->es_rowmarks = NULL;
estate->es_plannedstmt = NULL;
+ estate->es_part_prune_infos = NIL;
estate->es_part_prune_result = NULL;
estate->es_junkFilter = NULL;
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 09f26658e2..96880e122a 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -94,7 +94,6 @@ static bool ExecAppendAsyncRequest(AppendState *node, TupleTableSlot **result);
static void ExecAppendAsyncEventWait(AppendState *node);
static void classify_matching_subplans(AppendState *node);
-
/* ----------------------------------------------------------------
* ExecInitAppend
*
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index ec6b1f1fc0..fe0df2f1d1 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -1184,7 +1184,6 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
ListCell *subpaths;
int nasyncplans = 0;
RelOptInfo *rel = best_path->path.parent;
- int part_prune_index = -1;
int nodenumsortkeys = 0;
AttrNumber *nodeSortColIdx = NULL;
Oid *nodeSortOperators = NULL;
@@ -1335,6 +1334,9 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
subplans = lappend(subplans, subplan);
}
+ /* Set below if we find quals that we can use to run-time prune */
+ plan->part_prune_index = -1;
+
/*
* If any quals exist, they may be useful to perform further partition
* pruning during execution. Gather information needed by the executor to
@@ -1358,18 +1360,15 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
}
if (prunequal != NIL)
- part_prune_index= make_partition_pruneinfo(root, rel,
- best_path->subpaths,
- prunequal);
+ plan->part_prune_index = make_partition_pruneinfo(root, rel,
+ best_path->subpaths,
+ prunequal);
}
plan->appendplans = subplans;
plan->nasyncplans = nasyncplans;
plan->first_partial_plan = best_path->first_partial_path;
- /* Will be updated later in set_plan_references(). */
- plan->part_prune_index = part_prune_index;
-
copy_generic_path_info(&plan->plan, (Path *) best_path);
/*
@@ -1408,7 +1407,6 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
List *subplans = NIL;
ListCell *subpaths;
RelOptInfo *rel = best_path->path.parent;
- int part_prune_index = -1;
/*
* We don't have the actual creation of the MergeAppend node split out
@@ -1501,6 +1499,9 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
subplans = lappend(subplans, subplan);
}
+ /* Set below if we find quals that we can use to run-time prune */
+ node->part_prune_index = -1;
+
/*
* If any quals exist, they may be useful to perform further partition
* pruning during execution. Gather information needed by the executor to
@@ -1524,15 +1525,13 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
}
if (prunequal != NIL)
- part_prune_index= make_partition_pruneinfo(root, rel,
- best_path->subpaths,
- prunequal);
+ node->part_prune_index = make_partition_pruneinfo(root, rel,
+ best_path->subpaths,
+ prunequal);
}
node->mergeplans = subplans;
- /* Will be updated later in set_plan_references(). */
- node->part_prune_index = part_prune_index;
/*
* If prepare_sort_from_pathkeys added sort columns, but we were told to
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index c88e5bacac..63ec8a98fc 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -408,6 +408,13 @@ set_plan_references(PlannerInfo *root, Plan *plan)
glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
}
+ /*
+ * XXX is it worth doing a bms_copy() on glob->minLockRelids if
+ * glob->containsInitialPruning is true?. I'm slighly worried that the
+ * Bitmapset could have a very long empty tail resulting in excessive
+ * looping during AcquireExecutorLocks().
+ */
+
return result;
}
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 5a5f5dee46..952c5b8327 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -212,12 +212,12 @@ static void partkey_datum_from_expr(PartitionPruneContext *context,
/*
* make_partition_pruneinfo
* Checks if the given set of quals can be used to build pruning steps
- * that the executor will use to prune useless ones from given set of
- * child paths, and if so builds a PartitionPruneInfo that will allow the
- * executor to do do and append it to root->partPruneInfos.
+ * that the executor can use to prune away unneeded partitions. If
+ * suitable quals are found then a PartitionPruneInfo is built and tagged
+ * onto the PlannerInfo's partPruneInfos list.
*
- * Return value is 0-based index of the added PartitionPruneInfo or -1 if one
- * was not built after all.
+ * The return value is the 0-based index of the item added to the
+ * partPruneInfos list or -1 if nothing was added.
*
* 'parentrel' is the RelOptInfo for an appendrel, and 'subpaths' is the list
* of scan paths for its child rels.
@@ -335,10 +335,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
allmatchedsubplans = bms_join(matchedsubplans,
allmatchedsubplans);
}
- if (!needs_init_pruning)
- needs_init_pruning = partrel_needs_init_pruning;
- if (!needs_exec_pruning)
- needs_exec_pruning = partrel_needs_exec_pruning;
+
+ needs_init_pruning |= partrel_needs_init_pruning;
+ needs_exec_pruning |= partrel_needs_exec_pruning;
}
pfree(relid_subplan_map);
@@ -570,7 +569,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
* that would require per-scan pruning.
*
* In the first pass, we note whether the 2nd pass is necessary by
- * by noting the presence of EXEC parameters.
+ * noting the presence of EXEC parameters.
*/
gen_partprune_steps(subpart, partprunequal, PARTTARGET_INITIAL,
&context);
@@ -645,10 +644,11 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
pinfo->execparamids = execparamids;
/* Remaining fields will be filled in the next loop */
- if (!*needs_init_pruning)
- *needs_init_pruning = (initial_pruning_steps != NIL);
- if (!*needs_exec_pruning)
- *needs_exec_pruning = (exec_pruning_steps != NIL);
+ /* record which types of pruning steps we've seen so far */
+ if (initial_pruning_steps != NIL)
+ *needs_init_pruning = true;
+ if (exec_pruning_steps != NIL)
+ *needs_exec_pruning = true;
pinfolist = lappend(pinfolist, pinfo);
}
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index a627448a5a..8cc2e2162d 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -1204,7 +1204,6 @@ PortalRunMulti(Portal portal,
{
bool active_snapshot_set = false;
ListCell *stmtlist_item;
- int i;
/*
* If the destination is DestRemoteExecute, change to DestNone. The
@@ -1225,15 +1224,9 @@ PortalRunMulti(Portal portal,
* Loop to handle the individual queries generated from a single parsetree
* by analysis and rewrite.
*/
- i = 0;
foreach(stmtlist_item, portal->stmts)
{
PlannedStmt *pstmt = lfirst_node(PlannedStmt, stmtlist_item);
- PartitionPruneResult *part_prune_result = portal->part_prune_results ?
- list_nth(portal->part_prune_results, i) :
- NULL;
-
- i++;
/*
* If we got a cancel signal in prior command, quit
@@ -1242,6 +1235,8 @@ PortalRunMulti(Portal portal,
if (pstmt->utilityStmt == NULL)
{
+ PartitionPruneResult *part_prune_result = NULL;
+
/*
* process a plannable query.
*/
@@ -1288,6 +1283,14 @@ PortalRunMulti(Portal portal,
else
UpdateActiveSnapshotCommandId();
+ /*
+ * Determine if there's a corresponding PartitionPruneResult for
+ * this PlannedStmt.
+ */
+ if (portal->part_prune_results != NIL)
+ part_prune_result = list_nth(portal->part_prune_results,
+ foreach_current_index(stmtlist_item));
+
if (pstmt->canSetTag)
{
/* statement can set tag string */
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 34975c69ee..bbc8c42d88 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -87,7 +87,8 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
ExplainState *es, const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv);
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, PartitionPruneResult *part_prune_resul,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt,
+ PartitionPruneResult *part_prune_result,
IntoClause *into,
ExplainState *es, const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 43bd293433..a8bf908d63 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -1000,11 +1000,11 @@ typedef TupleTableSlot *(*ExecProcNodeMtd) (struct PlanState *pstate);
*
* This is used by GetCachedPlan() to inform its callers of the pruning
* decisions made when performing AcquireExecutorLocks() on a given cached
- * PlannedStmt, which the callers then pass that on to the executor. The
- * executor refers to this node when made available when initializing the plan
- * nodes to which those PartitionPruneInfos apply so that the same set of
- * qualifying subplans are initialized, rather than deriving that set again by
- * redoing initial pruning.
+ * PlannedStmt, which the callers then pass onto the executor. The executor
+ * refers to this node when made available when initializing the plan nodes to
+ * which those PartitionPruneInfos apply so that the same set of qualifying
+ * subplans are initialized, rather than deriving that set again by redoing
+ * initial pruning.
*/
typedef struct PartitionPruneResult
{
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 550308147d..f8f3971f44 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -274,11 +274,7 @@ typedef struct Append
*/
int first_partial_plan;
- /*
- * Index of this plan's PartitionPruneInfo in PlannedStmt.partPruneInfos
- * to be used for run-time subplan pruning; -1 if run-time pruning is
- * not needed.
- */
+ /* Index to PlannerInfo.partPruneInfos or -1 if no run-time pruning */
int part_prune_index;
} Append;
@@ -299,11 +295,7 @@ typedef struct MergeAppend
Oid *collations; /* OIDs of collations */
bool *nullsFirst; /* NULLS FIRST/LAST directions */
- /*
- * Index of this plan's PartitionPruneInfo in PlannedStmt.partPruneInfos
- * to be used for run-time subplan pruning; -1 if run-time pruning is
- * not needed.
- */
+ /* Index to PlannerInfo.partPruneInfos or -1 if no run-time pruning */
int part_prune_index;
} MergeAppend;
* Re: generic plans and "initial" pruning
@ 2022-04-08 11:45 Amit Langote <[email protected]>
parent: David Rowley <[email protected]>
0 siblings, 1 reply; 71+ messages in thread
From: Amit Langote @ 2022-04-08 11:45 UTC (permalink / raw)
To: David Rowley <[email protected]>; +Cc: Robert Haas <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers
Hi David,
On Fri, Apr 8, 2022 at 8:16 PM David Rowley <[email protected]> wrote:
> On Fri, 8 Apr 2022 at 17:49, Amit Langote <[email protected]> wrote:
> > Attached updated patch with these changes.
> Thanks for making the changes. I started looking over this patch but
> really feel like it needs quite a few more iterations of what we've
> just been doing to get it into proper committable shape. There seems
> to be only about 40 mins to go before the freeze, so it seems very
> unrealistic that it could be made to work.
Yeah, totally understandable.
> I started trying to take a serious look at it this evening, but I feel
> like I just failed to get into it deep enough to make any meaningful
> improvements. I'd need more time to study the problem before I could
> build up a proper opinion on how exactly I think it should work.
>
> Anyway. I've attached a small patch that's just a few things I
> adjusted or questions while reading over your v13 patch. Some of
> these are just me questioning your code (See XXX comments) and some I
> think are improvements. Feel free to take the hunks that you see fit
> and drop anything you don't.
Thanks a lot for compiling those.
Most of the changes looked fine to me, apart from a couple of typos, so
I've adopted them into the attached new version, even though I know it's
too late to try to apply it. Regarding the XXX comments:
+ /* XXX why would pprune->rti_map[i] ever be zero here??? */
Yeah, no, it can't be; I was perhaps being overly paranoid.
+ * XXX is it worth doing a bms_copy() on glob->minLockRelids if
+ * glob->containsInitialPruning is true?. I'm slighly worried that the
+ * Bitmapset could have a very long empty tail resulting in excessive
+ * looping during AcquireExecutorLocks().
+ */
I guess I trust your instincts about bitmapset operation efficiency
and what you've written here makes sense. It's typical for leaf
partitions to have been appended toward the tail end of rtable and I'd
imagine their indexes would be in the tail words of minLockRelids. If
copying the bitmapset removes those useless words, I don't see why we
shouldn't do that. So added:
+ /*
+ * It seems worth doing a bms_copy() on glob->minLockRelids if we deleted
+ * bit from it just above to prevent empty tail bits resulting in
+ * inefficient looping during AcquireExecutorLocks().
+ */
+ if (glob->containsInitialPruning)
+ glob->minLockRelids = bms_copy(glob->minLockRelids);
I'm not 100% sure about the comment I wrote.
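To make the "empty tail" concern concrete, here is a minimal standalone sketch. It is a toy word-array set, not PostgreSQL's Bitmapset, and every name in it (ToySet, toy_add, toy_del, toy_copy_trimmed, toy_count) is hypothetical. It only illustrates that deleting high-numbered members can leave all-zero trailing words that any word-by-word loop still has to visit, and that a copy which drops those words avoids that. Whether bms_copy() itself trims trailing words is not something this sketch establishes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_WORDS 64
#define TOY_BITS_PER_WORD 64

/* Toy stand-in for a bitmapset; NOT PostgreSQL's Bitmapset. */
typedef struct ToySet
{
    int      nwords;                /* number of words currently in use */
    uint64_t words[TOY_MAX_WORDS];
} ToySet;

/* Add member x, extending the used word count as needed. */
static void
toy_add(ToySet *s, int x)
{
    int w = x / TOY_BITS_PER_WORD;

    assert(x >= 0 && w < TOY_MAX_WORDS);
    while (s->nwords <= w)
        s->words[s->nwords++] = 0;
    s->words[w] |= UINT64_C(1) << (x % TOY_BITS_PER_WORD);
}

/* Delete member x; note that nwords is deliberately left unchanged. */
static void
toy_del(ToySet *s, int x)
{
    int w = x / TOY_BITS_PER_WORD;

    if (x >= 0 && w < s->nwords)
        s->words[w] &= ~(UINT64_C(1) << (x % TOY_BITS_PER_WORD));
}

/* Copy src into dst, dropping trailing all-zero words. */
static void
toy_copy_trimmed(ToySet *dst, const ToySet *src)
{
    int n = src->nwords;

    while (n > 0 && src->words[n - 1] == 0)
        n--;
    dst->nwords = n;
    for (int i = 0; i < n; i++)
        dst->words[i] = src->words[i];
}

/* Count members; *words_visited reports how many words the loop examined. */
static int
toy_count(const ToySet *s, int *words_visited)
{
    int count = 0;

    *words_visited = s->nwords;
    for (int i = 0; i < s->nwords; i++)
        for (uint64_t w = s->words[i]; w != 0; w &= w - 1)
            count++;            /* clears the lowest set bit each round */
    return count;
}
```

With a zero-initialized ToySet, adding members 1, 2 and 1000..1009 and then deleting 1000..1009 leaves two members spread over 16 words in use; the trimmed copy holds the same two members in a single word.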
--
Amit Langote
EDB: http://www.enterprisedb.com
Attachments:
[application/octet-stream] v14-0001-Optimize-AcquireExecutorLocks-to-skip-pruned-par.patch (99.3K, 2-v14-0001-Optimize-AcquireExecutorLocks-to-skip-pruned-par.patch)
From 552da9453f0c4896bcc8748719960db52b3ccad1 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Wed, 22 Dec 2021 16:55:17 +0900
Subject: [PATCH v14] Optimize AcquireExecutorLocks() to skip pruned partitions
---
src/backend/commands/copyto.c | 2 +-
src/backend/commands/createas.c | 2 +-
src/backend/commands/explain.c | 7 +-
src/backend/commands/extension.c | 2 +-
src/backend/commands/matview.c | 2 +-
src/backend/commands/prepare.c | 26 ++-
src/backend/executor/README | 27 +++
src/backend/executor/execMain.c | 48 +++++
src/backend/executor/execParallel.c | 28 ++-
src/backend/executor/execPartition.c | 241 ++++++++++++++++++++----
src/backend/executor/execUtils.c | 2 +
src/backend/executor/functions.c | 2 +-
src/backend/executor/nodeAppend.c | 15 +-
src/backend/executor/nodeMergeAppend.c | 9 +-
src/backend/executor/spi.c | 27 ++-
src/backend/nodes/copyfuncs.c | 33 +++-
src/backend/nodes/outfuncs.c | 36 +++-
src/backend/nodes/readfuncs.c | 56 +++++-
src/backend/optimizer/plan/createplan.c | 24 +--
src/backend/optimizer/plan/planner.c | 3 +
src/backend/optimizer/plan/setrefs.c | 112 ++++++++---
src/backend/partitioning/partprune.c | 59 +++++-
src/backend/tcop/postgres.c | 8 +-
src/backend/tcop/pquery.c | 28 ++-
src/backend/utils/cache/plancache.c | 184 +++++++++++++++---
src/backend/utils/mmgr/portalmem.c | 19 ++
src/include/commands/explain.h | 4 +-
src/include/executor/execPartition.h | 12 +-
src/include/executor/execdesc.h | 3 +
src/include/executor/executor.h | 2 +
src/include/nodes/execnodes.h | 30 +++
src/include/nodes/nodes.h | 4 +
src/include/nodes/pathnodes.h | 15 ++
src/include/nodes/plannodes.h | 31 ++-
src/include/partitioning/partprune.h | 8 +-
src/include/utils/plancache.h | 3 +-
src/include/utils/portal.h | 3 +
37 files changed, 950 insertions(+), 167 deletions(-)
diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index 55c38b04c4..d403eb2309 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -542,7 +542,7 @@ BeginCopyTo(ParseState *pstate,
((DR_copy *) dest)->cstate = cstate;
/* Create a QueryDesc requesting no output */
- cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ cstate->queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
GetActiveSnapshot(),
InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 9abbb6b555..f6607f2454 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -325,7 +325,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index d2a2479822..35dd24adf8 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -407,7 +407,7 @@ ExplainOneQuery(Query *query, int cursorOptions,
}
/* run it (if needed) and produce output */
- ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+ ExplainOnePlan(plan, NULL, into, es, queryString, params, queryEnv,
&planduration, (es->buffers ? &bufusage : NULL));
}
}
@@ -515,7 +515,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
* to call it.
*/
void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, PartitionPruneResult *part_prune_result,
+ IntoClause *into, ExplainState *es,
const char *queryString, ParamListInfo params,
QueryEnvironment *queryEnv, const instr_time *planduration,
const BufferUsage *bufusage)
@@ -563,7 +564,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
dest = None_Receiver;
/* Create a QueryDesc for the query */
- queryDesc = CreateQueryDesc(plannedstmt, queryString,
+ queryDesc = CreateQueryDesc(plannedstmt, part_prune_result, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, instrument_option);
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index 1013790dbb..54734a3a93 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -776,7 +776,7 @@ execute_sql_string(const char *sql)
{
QueryDesc *qdesc;
- qdesc = CreateQueryDesc(stmt,
+ qdesc = CreateQueryDesc(stmt, NULL,
sql,
GetActiveSnapshot(), NULL,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 9ab248d25e..2be1782bc4 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -416,7 +416,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, queryString,
+ queryDesc = CreateQueryDesc(plan, NULL, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 80738547ed..c7360712b1 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -155,6 +155,7 @@ ExecuteQuery(ParseState *pstate,
PreparedStatement *entry;
CachedPlan *cplan;
List *plan_list;
+ List *part_prune_result_list;
ParamListInfo paramLI = NULL;
EState *estate = NULL;
Portal portal;
@@ -193,7 +194,10 @@ ExecuteQuery(ParseState *pstate,
entry->plansource->query_string);
/* Replan if needed, and increment plan refcount for portal */
- cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL);
+ cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL,
+ &part_prune_result_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_result_list));
plan_list = cplan->stmt_list;
/*
@@ -207,6 +211,9 @@ ExecuteQuery(ParseState *pstate,
plan_list,
cplan);
+ /* Copy PartitionPruneResults into the portal's context. */
+ PortalStorePartitionPruneResults(portal, part_prune_result_list);
+
/*
* For CREATE TABLE ... AS EXECUTE, we must verify that the prepared
* statement is one that produces tuples. Currently we insist that it be
@@ -576,7 +583,9 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
const char *query_string;
CachedPlan *cplan;
List *plan_list;
- ListCell *p;
+ List *part_prune_result_list;
+ ListCell *p,
+ *pp;
ParamListInfo paramLI = NULL;
EState *estate = NULL;
instr_time planstart;
@@ -619,7 +628,10 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
/* Replan if needed, and acquire a transient refcount */
cplan = GetCachedPlan(entry->plansource, paramLI,
- CurrentResourceOwner, queryEnv);
+ CurrentResourceOwner, queryEnv,
+ &part_prune_result_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_result_list));
INSTR_TIME_SET_CURRENT(planduration);
INSTR_TIME_SUBTRACT(planduration, planstart);
@@ -634,13 +646,15 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
plan_list = cplan->stmt_list;
/* Explain each query */
- foreach(p, plan_list)
+ forboth(p, plan_list, pp, part_prune_result_list)
{
PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
+ PartitionPruneResult *part_prune_result = lfirst_node(PartitionPruneResult, pp);
if (pstmt->commandType != CMD_UTILITY)
- ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv,
- &planduration, (es->buffers ? &bufusage : NULL));
+ ExplainOnePlan(pstmt, part_prune_result, into, es, query_string,
+ paramLI, queryEnv, &planduration,
+ (es->buffers ? &bufusage : NULL));
else
ExplainOneUtility(pstmt->utilityStmt, into, es, query_string,
paramLI, queryEnv);
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 0b5183fc4a..e0802be723 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -65,6 +65,29 @@ found there. This currently only occurs for Append and MergeAppend nodes. In
this case the non-required subplans are ignored and the executor state's
subnode array will become out of sequence to the plan's subplan list.
+Actually, the so-called execution time pruning may also occur even before the
+execution has started. One case where that occurs is when a cached generic
+plan is being validated for execution by plancache.c: GetCachedPlan(), which
+proceeds by locking all the relations that will be scanned by that plan. If
+the generic plan contains nodes that can perform execution time partition
+pruning (that is, contain a PartitionPruneInfo), a subset of pruning steps
+contained in the PartitionPruneInfos that do not depend on execution actually
+having started (called "initial" pruning steps) are performed at this point
+to figure out the minimal set of child subplans that satisfy those pruning
+instructions. AcquireExecutorLocks() looking at a particular plan will then
+lock only the relations scanned by those surviving subplans (along with those
+present in PlannedStmt.minLockRelids), and ignore those scanned by the pruned
+subplans, even though the pruned subplans themselves are not removed from the
+plan tree. The result of pruning (that is, the set of indexes of surviving
+subplans in their parent's list of child subplans) is saved as a list of
+bitmapsets, with one element for every PartitionPruneInfo referenced in the
+plan (PlannedStmt.partPruneInfos). The list is packaged into a
+PartitionPruneResult node, which is passed along with the PlannedStmt to the
+executor via the QueryDesc. It is imperative that the executor and any third
+party code invoked by it that gets passed the plan tree look at the plan's
+PartitionPruneResult to determine whether a particular child subplan of a
+parent node that supports pruning is valid for a given execution.
+
Each Plan node may have expression trees associated with it, to represent
its target list, qualification conditions, etc. These trees are also
read-only to the executor, but the executor state for expression evaluation
@@ -286,6 +309,10 @@ Query Processing Control Flow
This is a sketch of control flow for full query processing:
+ [ ExecutorDoInitialPruning ] --- an optional step to perform initial
+ partition pruning on the plan tree the result of which is passed
+ to the executor via QueryDesc
+
CreateQueryDesc
ExecutorStart
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index ef2fd46092..5ee978937d 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -49,11 +49,13 @@
#include "commands/matview.h"
#include "commands/trigger.h"
#include "executor/execdebug.h"
+#include "executor/execPartition.h"
#include "executor/nodeSubplan.h"
#include "foreign/fdwapi.h"
#include "jit/jit.h"
#include "mb/pg_wchar.h"
#include "miscadmin.h"
+#include "nodes/nodeFuncs.h"
#include "parser/parsetree.h"
#include "storage/bufmgr.h"
#include "storage/lmgr.h"
@@ -104,6 +106,49 @@ static void EvalPlanQualStart(EPQState *epqstate, Plan *planTree);
/* end of local decls */
+/* ----------------------------------------------------------------
+ * ExecutorDoInitialPruning
+ *
+ * Performs initial partition pruning to figure out the minimal set of
+ * subplans to be executed and the set of RT indexes of the corresponding
+ * leaf partitions
+ *
+ * Returned PartitionPruneResult must be subsequently passed to the executor
+ * so that it can reuse the result of pruning. It's important that the
+ * has the same view of which partitions are initially pruned (by not doing
+ * the pruning again itself) or otherwise it risks initializing subplans whose
+ * partitions would not have been locked.
+ *
+ * Note: Partitioned tables mentioned in PartitionedRelPruneInfo nodes that
+ * drive the pruning will be locked before doing the pruning.
+ *
+ * ----------------------------------------------------------------
+ */
+PartitionPruneResult *
+ExecutorDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params)
+{
+ PartitionPruneResult *result;
+ ListCell *lc;
+
+ /* Only get here if there is any pruning to do. */
+ Assert(plannedstmt->containsInitialPruning);
+
+ result = makeNode(PartitionPruneResult);
+ foreach(lc, plannedstmt->partPruneInfos)
+ {
+ PartitionPruneInfo *pruneinfo = lfirst(lc);
+ Bitmapset *valid_subplan_offs;
+
+ valid_subplan_offs =
+ ExecPartitionDoInitialPruning(plannedstmt, params, pruneinfo,
+ &result->scan_leafpart_rtis);
+ result->valid_subplan_offs_list =
+ lappend(result->valid_subplan_offs_list,
+ valid_subplan_offs);
+ }
+
+ return result;
+}
/* ----------------------------------------------------------------
* ExecutorStart
@@ -806,6 +851,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
{
CmdType operation = queryDesc->operation;
PlannedStmt *plannedstmt = queryDesc->plannedstmt;
+ PartitionPruneResult *part_prune_result = queryDesc->part_prune_result;
Plan *plan = plannedstmt->planTree;
List *rangeTable = plannedstmt->rtable;
EState *estate = queryDesc->estate;
@@ -825,6 +871,8 @@ InitPlan(QueryDesc *queryDesc, int eflags)
ExecInitRangeTable(estate, rangeTable);
estate->es_plannedstmt = plannedstmt;
+ estate->es_part_prune_infos = plannedstmt->partPruneInfos;
+ estate->es_part_prune_result = part_prune_result;
/*
* Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index 9a0d5d59ef..805f86c503 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -66,6 +66,7 @@
#define PARALLEL_KEY_QUERY_TEXT UINT64CONST(0xE000000000000008)
#define PARALLEL_KEY_JIT_INSTRUMENTATION UINT64CONST(0xE000000000000009)
#define PARALLEL_KEY_WAL_USAGE UINT64CONST(0xE00000000000000A)
+#define PARALLEL_KEY_PARTITIONPRUNERESULT UINT64CONST(0xE00000000000000B)
#define PARALLEL_TUPLE_QUEUE_SIZE 65536
@@ -182,7 +183,9 @@ ExecSerializePlan(Plan *plan, EState *estate)
pstmt->transientPlan = false;
pstmt->dependsOnRole = false;
pstmt->parallelModeNeeded = false;
+ pstmt->containsInitialPruning = false;
pstmt->planTree = plan;
+ pstmt->partPruneInfos = estate->es_part_prune_infos;
pstmt->rtable = estate->es_range_table;
pstmt->resultRelations = NIL;
pstmt->appendRelations = NIL;
@@ -596,12 +599,15 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
FixedParallelExecutorState *fpes;
char *pstmt_data;
char *pstmt_space;
+ char *part_prune_result_data;
+ char *part_prune_result_space;
char *paramlistinfo_space;
BufferUsage *bufusage_space;
WalUsage *walusage_space;
SharedExecutorInstrumentation *instrumentation = NULL;
SharedJitInstrumentation *jit_instrumentation = NULL;
int pstmt_len;
+ int part_prune_result_len;
int paramlistinfo_len;
int instrumentation_len = 0;
int jit_instrumentation_len = 0;
@@ -630,6 +636,7 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
/* Fix up and serialize plan to be sent to workers. */
pstmt_data = ExecSerializePlan(planstate->plan, estate);
+ part_prune_result_data = nodeToString(estate->es_part_prune_result);
/* Create a parallel context. */
pcxt = CreateParallelContext("postgres", "ParallelQueryMain", nworkers);
@@ -656,6 +663,11 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
shm_toc_estimate_chunk(&pcxt->estimator, pstmt_len);
shm_toc_estimate_keys(&pcxt->estimator, 1);
+ /* Estimate space for serialized PartitionPruneResult. */
+ part_prune_result_len = strlen(part_prune_result_data) + 1;
+ shm_toc_estimate_chunk(&pcxt->estimator, part_prune_result_len);
+ shm_toc_estimate_keys(&pcxt->estimator, 1);
+
/* Estimate space for serialized ParamListInfo. */
paramlistinfo_len = EstimateParamListSpace(estate->es_param_list_info);
shm_toc_estimate_chunk(&pcxt->estimator, paramlistinfo_len);
@@ -750,6 +762,12 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
memcpy(pstmt_space, pstmt_data, pstmt_len);
shm_toc_insert(pcxt->toc, PARALLEL_KEY_PLANNEDSTMT, pstmt_space);
+ /* Store serialized PartitionPruneResult */
+ part_prune_result_space = shm_toc_allocate(pcxt->toc, part_prune_result_len);
+ memcpy(part_prune_result_space, part_prune_result_data, part_prune_result_len);
+ shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARTITIONPRUNERESULT,
+ part_prune_result_space);
+
/* Store serialized ParamListInfo. */
paramlistinfo_space = shm_toc_allocate(pcxt->toc, paramlistinfo_len);
shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARAMLISTINFO, paramlistinfo_space);
@@ -1231,8 +1249,10 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
int instrument_options)
{
char *pstmtspace;
+ char *part_prune_result_space;
char *paramspace;
PlannedStmt *pstmt;
+ PartitionPruneResult *part_prune_result;
ParamListInfo paramLI;
char *queryString;
@@ -1243,12 +1263,18 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
pstmtspace = shm_toc_lookup(toc, PARALLEL_KEY_PLANNEDSTMT, false);
pstmt = (PlannedStmt *) stringToNode(pstmtspace);
+ /* Reconstruct leader-supplied PartitionPruneResult. */
+ part_prune_result_space =
+ shm_toc_lookup(toc, PARALLEL_KEY_PARTITIONPRUNERESULT, false);
+ part_prune_result = (PartitionPruneResult *)
+ stringToNode(part_prune_result_space);
+
/* Reconstruct ParamListInfo. */
paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
paramLI = RestoreParamList(&paramspace);
/* Create a QueryDesc for the query. */
- return CreateQueryDesc(pstmt,
+ return CreateQueryDesc(pstmt, part_prune_result,
queryString,
GetActiveSnapshot(), InvalidSnapshot,
receiver, paramLI, NULL, instrument_options);
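
Reviewer's sketch of the execParallel.c changes above: the leader serializes the PartitionPruneResult with nodeToString(), sizes the chunk as strlen() + 1, copies it into shared memory under PARALLEL_KEY_PARTITIONPRUNERESULT, and each worker looks the chunk up and rebuilds the node with stringToNode(). The store/lookup half of that round trip might look roughly like the standalone C below. It is only an illustration: a toy fixed-size table stands in for shm_toc, and ToyToc, toy_toc_store, toy_toc_lookup and TOY_KEY_PRUNE_RESULT are hypothetical names, not PostgreSQL APIs.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for shm_toc: a tiny fixed-size key/chunk table. */
#define TOY_TOC_MAX 8
#define TOY_KEY_PRUNE_RESULT UINT64_C(0xE00000000000000B)

typedef struct ToyToc
{
    int      nentries;
    uint64_t keys[TOY_TOC_MAX];
    char    *chunks[TOY_TOC_MAX];
} ToyToc;

/*
 * Leader side: size the chunk as strlen + 1 (so the terminator travels
 * too), copy the serialized text, and register it under the key.
 * Returns 0 on success, -1 if the table is full or allocation fails.
 * The sketch never frees its chunks.
 */
int
toy_toc_store(ToyToc *toc, uint64_t key, const char *serialized)
{
    size_t len;
    char  *space;

    assert(toc != NULL && serialized != NULL);
    if (toc->nentries >= TOY_TOC_MAX)
        return -1;

    len = strlen(serialized) + 1;
    space = malloc(len);
    if (space == NULL)
        return -1;
    memcpy(space, serialized, len);

    toc->keys[toc->nentries] = key;
    toc->chunks[toc->nentries] = space;
    toc->nentries++;
    return 0;
}

/* Worker side: return the chunk stored under key, or NULL if absent. */
const char *
toy_toc_lookup(const ToyToc *toc, uint64_t key)
{
    assert(toc != NULL);
    for (int i = 0; i < toc->nentries; i++)
    {
        if (toc->keys[i] == key)
            return toc->chunks[i];
    }
    return NULL;
}
```

In the patch itself the lookup passes noError = false, so a missing key is an error rather than a NULL return as in this toy version.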
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 615bd80973..af87b9197f 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -25,6 +25,7 @@
#include "mb/pg_wchar.h"
#include "miscadmin.h"
#include "nodes/makefuncs.h"
+#include "parser/parsetree.h"
#include "partitioning/partbounds.h"
#include "partitioning/partdesc.h"
#include "partitioning/partprune.h"
@@ -185,7 +186,11 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
static PartitionPruneState *CreatePartitionPruneState(PlanState *planstate,
- PartitionPruneInfo *pruneinfo);
+ PartitionPruneInfo *pruneinfo,
+ bool consider_initial_steps,
+ bool consider_exec_steps,
+ List *rtable, ExprContext *econtext,
+ PartitionDirectory partdir);
static void InitPartitionPruneContext(PartitionPruneContext *context,
List *pruning_steps,
PartitionDesc partdesc,
@@ -198,7 +203,8 @@ static void PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
- Bitmapset **validsubplans);
+ Bitmapset **validsubplans,
+ Bitmapset **scan_leafpart_rtis);
/*
@@ -1587,8 +1593,10 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
* considered to be a stable expression, it can change value from one plan
* node scan to the next during query execution. Stable comparison
* expressions that don't involve such Params allow partition pruning to be
- * done once during executor startup. Expressions that do involve such Params
- * require us to prune separately for each scan of the parent plan node.
+ * done once, either during executor startup or in ExecutorDoInitialPruning(),
+ * which runs as part of AcquireExecutorLocks() on a given plan tree.
+ * Expressions that do involve such Params require us to prune separately for
+ * each scan of the parent plan node.
*
* Note that pruning away unneeded subplans during executor startup has the
* added benefit of not having to initialize the unneeded subplans at all.
@@ -1605,6 +1613,13 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
* account for initial pruning possibly having eliminated some of the
* subplans.
*
+ * ExecPartitionDoInitialPruning:
+ * Do initial pruning with the information contained in a given
+ * PartitionPruneInfo to determine the minimal set of child subplans
+ * that must be executed for the parent plan node that owns the
+ * PartitionPruneInfo, and also the set of RT indexes of the leaf
+ * partitions that those subplans will scan.
+ *
* ExecFindMatchingSubPlans:
* Returns indexes of matching subplans after evaluating the expressions
* that are safe to evaluate at a given point. This function is first
@@ -1622,8 +1637,9 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
*
* On return, *initially_valid_subplans is assigned the set of indexes of
* child subplans that must be initialized along with the parent plan node.
- * Initial pruning is performed here if needed and in that case only the
- * surviving subplans' indexes are added.
+ * Initial pruning is performed here if needed (unless it has already been done
+ * by ExecutorDoInitialPruning()), and in that case only the surviving
+ * subplans' indexes are added.
*
* If subplans are indeed pruned, subplan_map arrays contained in the returned
* PartitionPruneState are re-sequenced to not count those, though only if the
@@ -1632,29 +1648,66 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
PartitionPruneState *
ExecInitPartitionPruning(PlanState *planstate,
int n_total_subplans,
- PartitionPruneInfo *pruneinfo,
+ int part_prune_index,
Bitmapset **initially_valid_subplans)
{
- PartitionPruneState *prunestate;
EState *estate = planstate->state;
+ PartitionPruneInfo *pruneinfo = list_nth(estate->es_part_prune_infos,
+ part_prune_index);
+ PartitionPruneResult *pruneresult = estate->es_part_prune_result;
+ PartitionPruneState *prunestate;
+ bool do_pruning = (pruneinfo->needs_init_pruning ||
+ pruneinfo->needs_exec_pruning);
- /* We may need an expression context to evaluate partition exprs */
- ExecAssignExprContext(estate, planstate);
+ /*
+ * No need to do initial pruning if it was done already by
+ * ExecutorDoInitialPruning(), which it would be if es_part_prune_result
+ * has been set.
+ */
+ if (pruneresult)
+ do_pruning = pruneinfo->needs_exec_pruning;
+
+ prunestate = NULL;
+ if (do_pruning)
+ {
+ /* We may need an expression context to evaluate partition exprs */
+ ExecAssignExprContext(estate, planstate);
- /* Create the working data structure for pruning */
- prunestate = CreatePartitionPruneState(planstate, pruneinfo);
+ /* For data reading, executor always omits detached partitions */
+ if (estate->es_partition_directory == NULL)
+ estate->es_partition_directory =
+ CreatePartitionDirectory(estate->es_query_cxt, false);
+
+ /*
+ * Create the working data structure for pruning. No need to consider
+ * initial pruning steps if we have a PartitionPruneResult.
+ */
+ prunestate = CreatePartitionPruneState(planstate, pruneinfo,
+ pruneresult == NULL, true,
+ NIL, planstate->ps_ExprContext,
+ estate->es_partition_directory);
+ }
/*
* Perform an initial partition prune pass, if required.
*/
- if (prunestate->do_initial_prune)
- *initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true);
+ if (pruneresult)
+ {
+ *initially_valid_subplans =
+ list_nth(pruneresult->valid_subplan_offs_list, part_prune_index);
+ }
+ else if (prunestate && prunestate->do_initial_prune)
+ {
+ *initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true,
+ NULL);
+ }
else
{
/* No pruning, so we'll need to initialize all subplans */
Assert(n_total_subplans > 0);
*initially_valid_subplans = bms_add_range(NULL, 0,
n_total_subplans - 1);
+ return prunestate;
}
/*
@@ -1662,7 +1715,8 @@ ExecInitPartitionPruning(PlanState *planstate,
* that were removed above due to initial pruning. No need to do this if
* no steps were removed.
*/
- if (bms_num_members(*initially_valid_subplans) < n_total_subplans)
+ if (prunestate &&
+ bms_num_members(*initially_valid_subplans) < n_total_subplans)
{
/*
* We can safely skip this when !do_exec_prune, even though that
@@ -1678,11 +1732,74 @@ ExecInitPartitionPruning(PlanState *planstate,
return prunestate;
}
+/*
+ * ExecPartitionDoInitialPruning
+ * Perform initial pruning using the given PartitionPruneInfo to determine
+ * the minimal set of child subplans that must be executed for the parent
+ * plan node that owns the PartitionPruneInfo, and also the set of RT
+ * indexes of the leaf partitions that those subplans will scan.
+ */
+Bitmapset *
+ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
+ PartitionPruneInfo *pruneinfo,
+ Bitmapset **scan_leafpart_rtis)
+{
+ List *rtable = plannedstmt->rtable;
+ ExprContext *econtext;
+ PartitionDirectory pdir;
+ MemoryContext oldcontext,
+ tmpcontext;
+ PartitionPruneState *prunestate;
+ Bitmapset *valid_subplan_offs;
+
+ /*
+ * A temporary context for memory allocations required while executing
+ * partition pruning steps.
+ */
+ tmpcontext = AllocSetContextCreate(CurrentMemoryContext,
+ "initial pruning working data",
+ ALLOCSET_DEFAULT_SIZES);
+ oldcontext = MemoryContextSwitchTo(tmpcontext);
+
+ /*
+ * PartitionDirectory to look up partition descriptors, which omits
+ * detached partitions, just like in the executor proper.
+ */
+ pdir = CreatePartitionDirectory(CurrentMemoryContext, false);
+
+ /*
+ * We don't yet have a PlanState for the parent plan node, so we must
+ * create a standalone ExprContext to evaluate pruning expressions,
+ * equipped with the information about the EXTERN parameters that the
+ * caller passed us. Note that that's okay because the initial pruning
+ * steps do not contain anything that requires the execution to have
+ * started.
+ */
+ econtext = CreateStandaloneExprContext();
+ econtext->ecxt_param_list_info = params;
+ prunestate = CreatePartitionPruneState(NULL, pruneinfo, true, false,
+ rtable, econtext, pdir);
+ MemoryContextSwitchTo(oldcontext);
+
+ /* Do the initial pruning. */
+ valid_subplan_offs = ExecFindMatchingSubPlans(prunestate, true,
+ scan_leafpart_rtis);
+
+ FreeExprContext(econtext, true);
+ DestroyPartitionDirectory(pdir);
+ MemoryContextDelete(tmpcontext);
+
+ return valid_subplan_offs;
+}
+
/*
* CreatePartitionPruneState
* Build the data structure required for calling ExecFindMatchingSubPlans
*
- * 'planstate' is the parent plan node's execution state.
+ * 'planstate', if not NULL, is the parent plan node's execution state. It
+ * can be NULL if being called before ExecutorStart(), in which case,
+ * 'rtable' (range table), 'econtext', and 'partdir' must be explicitly
+ * provided.
*
* 'pruneinfo' is a PartitionPruneInfo as generated by
* make_partition_pruneinfo. Here we build a PartitionPruneState containing a
@@ -1696,19 +1813,21 @@ ExecInitPartitionPruning(PlanState *planstate,
* PartitionedRelPruneInfo.
*/
static PartitionPruneState *
-CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
+CreatePartitionPruneState(PlanState *planstate,
+ PartitionPruneInfo *pruneinfo,
+ bool consider_initial_steps,
+ bool consider_exec_steps,
+ List *rtable, ExprContext *econtext,
+ PartitionDirectory partdir)
{
- EState *estate = planstate->state;
+ EState *estate = planstate ? planstate->state : NULL;
PartitionPruneState *prunestate;
int n_part_hierarchies;
ListCell *lc;
int i;
- ExprContext *econtext = planstate->ps_ExprContext;
- /* For data reading, executor always omits detached partitions */
- if (estate->es_partition_directory == NULL)
- estate->es_partition_directory =
- CreatePartitionDirectory(estate->es_query_cxt, false);
+ Assert((estate != NULL) ||
+ (partdir != NULL && econtext != NULL && rtable != NIL));
n_part_hierarchies = list_length(pruneinfo->prune_infos);
Assert(n_part_hierarchies > 0);
@@ -1763,15 +1882,42 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
PartitionKey partkey;
/*
- * We can rely on the copies of the partitioned table's partition
- * key and partition descriptor appearing in its relcache entry,
- * because that entry will be held open and locked for the
- * duration of this executor run.
+ * We must open the relation ourselves when called before
+ * execution has started, such as when called during
+ * ExecutorDoInitialPruning() on a cached plan. In that case,
+ * sub-partitions must be locked, because AcquirePlannerLocks()
+ * would not have seen them. (The first relation in a
+ * partrelpruneinfos list is always the root partitioned table
+ * appearing in the query, which AcquirePlannerLocks() would have
+ * locked; the Assert in relation_open() guards that assumption.)
+ */
+ if (estate == NULL)
+ {
+ RangeTblEntry *rte = rt_fetch(pinfo->rtindex, rtable);
+ int lockmode = (j == 0) ? NoLock : rte->rellockmode;
+
+ partrel = table_open(rte->relid, lockmode);
+ }
+ else
+ partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
+
+ /*
+ * We can rely on the copy of the partitioned table's partition
+ * key in its relcache entry, because it can't change (or get
+ * destroyed) as long as the relation is locked. The partition
+ * descriptor is taken from the PartitionDirectory, which is
+ * kept around long enough for the descriptor to remain valid
+ * while it's used to perform the pruning steps.
*/
- partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
partkey = RelationGetPartitionKey(partrel);
- partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
- partrel);
+ partdesc = PartitionDirectoryLookup(partdir, partrel);
+
+ /*
+ * Must close partrel, keeping the lock taken, if we're not using
+ * EState's entry.
+ */
+ if (estate == NULL)
+ table_close(partrel, NoLock);
/*
* Initialize the subplan_map and subpart_map.
@@ -1785,6 +1931,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
Assert(partdesc->nparts >= pinfo->nparts);
pprune->nparts = partdesc->nparts;
pprune->subplan_map = palloc(sizeof(int) * partdesc->nparts);
+ pprune->rti_map = palloc(sizeof(Index) * partdesc->nparts);
if (partdesc->nparts == pinfo->nparts)
{
/*
@@ -1795,6 +1942,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
pprune->subpart_map = pinfo->subpart_map;
memcpy(pprune->subplan_map, pinfo->subplan_map,
sizeof(int) * pinfo->nparts);
+ memcpy(pprune->rti_map, pinfo->rti_map,
+ sizeof(Index) * pinfo->nparts);
/*
* Double-check that the list of unpruned relations has not
@@ -1845,6 +1994,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
pinfo->subplan_map[pd_idx];
pprune->subpart_map[pp_idx] =
pinfo->subpart_map[pd_idx];
+ pprune->rti_map[pp_idx] =
+ pinfo->rti_map[pd_idx];
pd_idx++;
}
else
@@ -1852,6 +2003,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
/* this partdesc entry is not in the plan */
pprune->subplan_map[pp_idx] = -1;
pprune->subpart_map[pp_idx] = -1;
+ pprune->rti_map[pp_idx] = 0;
}
}
@@ -1873,7 +2025,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
* Initialize pruning contexts as needed.
*/
pprune->initial_pruning_steps = pinfo->initial_pruning_steps;
- if (pinfo->initial_pruning_steps)
+ if (consider_initial_steps && pinfo->initial_pruning_steps)
{
InitPartitionPruneContext(&pprune->initial_context,
pinfo->initial_pruning_steps,
@@ -1883,7 +2035,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
prunestate->do_initial_prune = true;
}
pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
- if (pinfo->exec_pruning_steps)
+ if (consider_exec_steps && pinfo->exec_pruning_steps)
{
InitPartitionPruneContext(&pprune->exec_context,
pinfo->exec_pruning_steps,
@@ -2111,10 +2263,14 @@ PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
* Pass initial_prune if PARAM_EXEC Params cannot yet be evaluated. This
* differentiates the initial executor-time pruning step from later
* runtime pruning.
+ *
+ * RT indexes of leaf partitions scanned by the chosen subplans are added to
+ * *scan_leafpart_rtis if the pointer is non-NULL.
*/
Bitmapset *
ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
- bool initial_prune)
+ bool initial_prune,
+ Bitmapset **scan_leafpart_rtis)
{
Bitmapset *result = NULL;
MemoryContext oldcontext;
@@ -2149,7 +2305,7 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
*/
pprune = &prunedata->partrelprunedata[0];
find_matching_subplans_recurse(prunedata, pprune, initial_prune,
- &result);
+ &result, scan_leafpart_rtis);
/* Expression eval may have used space in ExprContext too */
if (pprune->exec_pruning_steps)
@@ -2163,6 +2319,8 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
/* Copy result out of the temp context before we reset it */
result = bms_copy(result);
+ if (scan_leafpart_rtis)
+ *scan_leafpart_rtis = bms_copy(*scan_leafpart_rtis);
MemoryContextReset(prunestate->prune_context);
@@ -2173,13 +2331,15 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
* find_matching_subplans_recurse
* Recursive worker function for ExecFindMatchingSubPlans
*
- * Adds valid (non-prunable) subplan IDs to *validsubplans
+ * Adds valid (non-prunable) subplan IDs to *validsubplans and RT indexes of
+ * the corresponding leaf partitions to *scan_leafpart_rtis (if asked for).
*/
static void
find_matching_subplans_recurse(PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
- Bitmapset **validsubplans)
+ Bitmapset **validsubplans,
+ Bitmapset **scan_leafpart_rtis)
{
Bitmapset *partset;
int i;
@@ -2206,8 +2366,14 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
while ((i = bms_next_member(partset, i)) >= 0)
{
if (pprune->subplan_map[i] >= 0)
+ {
*validsubplans = bms_add_member(*validsubplans,
pprune->subplan_map[i]);
+ Assert(pprune->rti_map[i] > 0);
+ if (scan_leafpart_rtis)
+ *scan_leafpart_rtis = bms_add_member(*scan_leafpart_rtis,
+ pprune->rti_map[i]);
+ }
else
{
int partidx = pprune->subpart_map[i];
@@ -2215,7 +2381,8 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
if (partidx >= 0)
find_matching_subplans_recurse(prunedata,
&prunedata->partrelprunedata[partidx],
- initial_prune, validsubplans);
+ initial_prune, validsubplans,
+ scan_leafpart_rtis);
else
{
/*
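
Reviewer's sketch of the find_matching_subplans_recurse() change above: for every surviving partition, the function either records the leaf's subplan index (and, when the caller asked for it, the leaf's RT index from rti_map) or recurses into the sub-partitioned table. The standalone C below illustrates that walk under simplifying assumptions: a uint64_t bitmask stands in for Bitmapset (so indexes must stay below 64), the set of surviving partitions is given rather than computed from pruning steps, and ToyPruneData and toy_find_matching are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy per-partitioned-table pruning data; all names are hypothetical. */
typedef struct ToyPruneData
{
    int             nparts;
    uint64_t        surviving;    /* bit i set => partition i survived */
    const int      *subplan_map;  /* subplan index of leaf i, or -1 */
    const int      *subpart_map;  /* index of sub-partitioned table, or -1 */
    const unsigned *rti_map;      /* RT index of leaf i, or 0 */
} ToyPruneData;

/*
 * Add the subplan indexes of surviving leaves to *validsubplans and, if
 * scan_leafpart_rtis is non-NULL, their RT indexes to *scan_leafpart_rtis.
 */
void
toy_find_matching(const ToyPruneData *all, int which,
                  uint64_t *validsubplans, uint64_t *scan_leafpart_rtis)
{
    const ToyPruneData *p = &all[which];

    for (int i = 0; i < p->nparts; i++)
    {
        if ((p->surviving & (UINT64_C(1) << i)) == 0)
            continue;           /* partition i was pruned */

        if (p->subplan_map[i] >= 0)
        {
            /* Leaf partition with a subplan: record it. */
            *validsubplans |= UINT64_C(1) << p->subplan_map[i];
            assert(p->rti_map[i] > 0);
            if (scan_leafpart_rtis != NULL)
                *scan_leafpart_rtis |= UINT64_C(1) << p->rti_map[i];
        }
        else if (p->subpart_map[i] >= 0)
        {
            /* Sub-partitioned table: descend into its own pruning data. */
            toy_find_matching(all, p->subpart_map[i],
                              validsubplans, scan_leafpart_rtis);
        }
    }
}
```

Passing NULL for the last argument mirrors the executor-time callers in nodeAppend.c and nodeMergeAppend.c, which only need the subplan set.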
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 9df1f81ea8..f9c7976ff2 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -119,6 +119,8 @@ CreateExecutorState(void)
estate->es_relations = NULL;
estate->es_rowmarks = NULL;
estate->es_plannedstmt = NULL;
+ estate->es_part_prune_infos = NIL;
+ estate->es_part_prune_result = NULL;
estate->es_junkFilter = NULL;
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index f9460ae506..a2182a6b1f 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -844,7 +844,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
else
dest = None_Receiver;
- es->qd = CreateQueryDesc(es->stmt,
+ es->qd = CreateQueryDesc(es->stmt, NULL,
fcache->src,
GetActiveSnapshot(),
InvalidSnapshot,
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 357e10a1d7..96880e122a 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -134,7 +134,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
appendstate->as_begun = false;
/* If run-time partition pruning is enabled, then set that up now */
- if (node->part_prune_info != NULL)
+ if (node->part_prune_index >= 0)
{
PartitionPruneState *prunestate;
@@ -145,7 +145,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
*/
prunestate = ExecInitPartitionPruning(&appendstate->ps,
list_length(node->appendplans),
- node->part_prune_info,
+ node->part_prune_index,
&validsubplans);
appendstate->as_prune_state = prunestate;
nplans = bms_num_members(validsubplans);
@@ -155,7 +155,8 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
* subplan, we can fill as_valid_subplans immediately, preventing
* later calls to ExecFindMatchingSubPlans.
*/
- if (!prunestate->do_exec_prune && nplans > 0)
+ if (appendstate->as_prune_state == NULL ||
+ (!appendstate->as_prune_state->do_exec_prune && nplans > 0))
appendstate->as_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
}
else
@@ -577,7 +578,7 @@ choose_next_subplan_locally(AppendState *node)
}
else if (node->as_valid_subplans == NULL)
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
whichplan = -1;
}
@@ -642,7 +643,7 @@ choose_next_subplan_for_leader(AppendState *node)
if (node->as_valid_subplans == NULL)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
/*
* Mark each invalid plan as finished to allow the loop below to
@@ -717,7 +718,7 @@ choose_next_subplan_for_worker(AppendState *node)
else if (node->as_valid_subplans == NULL)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
mark_invalid_subplans_as_finished(node);
}
@@ -868,7 +869,7 @@ ExecAppendAsyncBegin(AppendState *node)
if (node->as_valid_subplans == NULL)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
classify_matching_subplans(node);
}
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index ecf9052e03..7708cfffda 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -82,7 +82,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
mergestate->ps.ExecProcNode = ExecMergeAppend;
/* If run-time partition pruning is enabled, then set that up now */
- if (node->part_prune_info != NULL)
+ if (node->part_prune_index >= 0)
{
PartitionPruneState *prunestate;
@@ -93,7 +93,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
*/
prunestate = ExecInitPartitionPruning(&mergestate->ps,
list_length(node->mergeplans),
- node->part_prune_info,
+ node->part_prune_index,
&validsubplans);
mergestate->ms_prune_state = prunestate;
nplans = bms_num_members(validsubplans);
@@ -103,7 +103,8 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
* subplan, we can fill as_valid_subplans immediately, preventing
* later calls to ExecFindMatchingSubPlans.
*/
- if (!prunestate->do_exec_prune && nplans > 0)
+ if (mergestate->ms_prune_state == NULL ||
+ (!mergestate->ms_prune_state->do_exec_prune && nplans > 0))
mergestate->ms_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
}
else
@@ -218,7 +219,7 @@ ExecMergeAppend(PlanState *pstate)
*/
if (node->ms_valid_subplans == NULL)
node->ms_valid_subplans =
- ExecFindMatchingSubPlans(node->ms_prune_state, false);
+ ExecFindMatchingSubPlans(node->ms_prune_state, false, NULL);
/*
* First time through: pull the first tuple from each valid subplan,
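
Reviewer's sketch of the control flow that the nodeAppend.c and nodeMergeAppend.c callers now rely on: ExecInitPartitionPruning() picks the initially valid subplans from one of three sources (a precomputed PartitionPruneResult, initial pruning run at executor startup, or "all subplans"), and it may return a NULL prune state, so callers must test for NULL before reading do_exec_prune. The standalone C below is an illustration only. It assumes a uint64_t bitmask in place of Bitmapset, and ToyPruneState, toy_keep_only_first, toy_initially_valid_subplans and toy_can_fill_valid_subplans are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy prune state: only the flag the callers look at. */
typedef struct ToyPruneState
{
    bool do_exec_prune;
} ToyPruneState;

/* Example pruning callback: pretend only subplan 0 survives. */
uint64_t
toy_keep_only_first(void)
{
    return UINT64_C(1);
}

/* Choose the initially valid subplans from one of three sources. */
uint64_t
toy_initially_valid_subplans(const uint64_t *precomputed,
                             bool needs_init_pruning,
                             uint64_t (*run_initial_pruning)(void),
                             int n_total_subplans)
{
    assert(n_total_subplans > 0 && n_total_subplans < 64);

    if (precomputed != NULL)
        return *precomputed;            /* reuse the earlier result */
    if (needs_init_pruning)
        return run_initial_pruning();   /* prune during executor startup */
    return (UINT64_C(1) << n_total_subplans) - 1;  /* keep every subplan */
}

/*
 * Caller-side check: the valid-subplan set can be filled immediately when
 * there is no prune state at all, or when no exec-time pruning is needed
 * and at least one subplan survived.
 */
bool
toy_can_fill_valid_subplans(const ToyPruneState *prunestate, int nplans)
{
    return prunestate == NULL ||
        (!prunestate->do_exec_prune && nplans > 0);
}
```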
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 042a5f8b0a..729e2fd7b2 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1578,6 +1578,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
CachedPlanSource *plansource;
CachedPlan *cplan;
List *stmt_list;
+ List *part_prune_result_list;
char *query_string;
Snapshot snapshot;
MemoryContext oldcontext;
@@ -1657,7 +1658,10 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
*/
/* Replan if needed, and increment plan refcount for portal */
- cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv);
+ cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv,
+ &part_prune_result_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_result_list));
stmt_list = cplan->stmt_list;
if (!plan->saved)
@@ -1685,6 +1689,9 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
stmt_list,
cplan);
+ /* Copy PartitionPruneResults into the portal's context. */
+ PortalStorePartitionPruneResults(portal, part_prune_result_list);
+
/*
* Set up options for portal. Default SCROLL type is chosen the same way
* as PerformCursorOpen does it.
@@ -2092,7 +2099,8 @@ SPI_plan_get_cached_plan(SPIPlanPtr plan)
/* Get the generic plan for the query */
cplan = GetCachedPlan(plansource, NULL,
plan->saved ? CurrentResourceOwner : NULL,
- _SPI_current->queryEnv);
+ _SPI_current->queryEnv,
+ NULL /* Not interested in PartitionPruneResults */);
Assert(cplan == plansource->gplan);
/* Pop the error context stack */
@@ -2473,7 +2481,9 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
{
CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
List *stmt_list;
- ListCell *lc2;
+ List *part_prune_result_list;
+ ListCell *lc2,
+ *lc3;
spicallbackarg.query = plansource->query_string;
@@ -2549,8 +2559,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
* plan, the refcount must be backed by the plan_owner.
*/
cplan = GetCachedPlan(plansource, options->params,
- plan_owner, _SPI_current->queryEnv);
-
+ plan_owner, _SPI_current->queryEnv,
+ &part_prune_result_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_result_list));
stmt_list = cplan->stmt_list;
/*
@@ -2589,9 +2601,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
}
}
- foreach(lc2, stmt_list)
+ forboth(lc2, stmt_list, lc3, part_prune_result_list)
{
PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+ PartitionPruneResult *part_prune_result = lfirst_node(PartitionPruneResult, lc3);
bool canSetTag = stmt->canSetTag;
DestReceiver *dest;
@@ -2663,7 +2676,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
else
snap = InvalidSnapshot;
- qdesc = CreateQueryDesc(stmt,
+ qdesc = CreateQueryDesc(stmt, part_prune_result,
plansource->query_string,
snap, crosscheck_snapshot,
dest,
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 836f427ea8..59a7054011 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -96,7 +96,10 @@ _copyPlannedStmt(const PlannedStmt *from)
COPY_SCALAR_FIELD(parallelModeNeeded);
COPY_SCALAR_FIELD(jitFlags);
COPY_NODE_FIELD(planTree);
+ COPY_NODE_FIELD(partPruneInfos);
+ COPY_SCALAR_FIELD(containsInitialPruning);
COPY_NODE_FIELD(rtable);
+ COPY_BITMAPSET_FIELD(minLockRelids);
COPY_NODE_FIELD(resultRelations);
COPY_NODE_FIELD(appendRelations);
COPY_NODE_FIELD(subplans);
@@ -253,7 +256,7 @@ _copyAppend(const Append *from)
COPY_NODE_FIELD(appendplans);
COPY_SCALAR_FIELD(nasyncplans);
COPY_SCALAR_FIELD(first_partial_plan);
- COPY_NODE_FIELD(part_prune_info);
+ COPY_SCALAR_FIELD(part_prune_index);
return newnode;
}
@@ -281,7 +284,7 @@ _copyMergeAppend(const MergeAppend *from)
COPY_POINTER_FIELD(sortOperators, from->numCols * sizeof(Oid));
COPY_POINTER_FIELD(collations, from->numCols * sizeof(Oid));
COPY_POINTER_FIELD(nullsFirst, from->numCols * sizeof(bool));
- COPY_NODE_FIELD(part_prune_info);
+ COPY_SCALAR_FIELD(part_prune_index);
return newnode;
}
@@ -1283,6 +1286,8 @@ _copyPartitionPruneInfo(const PartitionPruneInfo *from)
PartitionPruneInfo *newnode = makeNode(PartitionPruneInfo);
COPY_NODE_FIELD(prune_infos);
+ COPY_SCALAR_FIELD(needs_init_pruning);
+ COPY_SCALAR_FIELD(needs_exec_pruning);
COPY_BITMAPSET_FIELD(other_subplans);
return newnode;
@@ -1299,6 +1304,7 @@ _copyPartitionedRelPruneInfo(const PartitionedRelPruneInfo *from)
COPY_POINTER_FIELD(subplan_map, from->nparts * sizeof(int));
COPY_POINTER_FIELD(subpart_map, from->nparts * sizeof(int));
COPY_POINTER_FIELD(relid_map, from->nparts * sizeof(Oid));
+ COPY_POINTER_FIELD(rti_map, from->nparts * sizeof(Index));
COPY_NODE_FIELD(initial_pruning_steps);
COPY_NODE_FIELD(exec_pruning_steps);
COPY_BITMAPSET_FIELD(execparamids);
@@ -5473,6 +5479,21 @@ _copyExtensibleNode(const ExtensibleNode *from)
return newnode;
}
+/* ****************************************************************
+ * execnodes.h copy functions
+ * ****************************************************************
+ */
+static PartitionPruneResult *
+_copyPartitionPruneResult(const PartitionPruneResult *from)
+{
+ PartitionPruneResult *newnode = makeNode(PartitionPruneResult);
+
+ COPY_NODE_FIELD(valid_subplan_offs_list);
+ COPY_BITMAPSET_FIELD(scan_leafpart_rtis);
+
+ return newnode;
+}
+
/* ****************************************************************
* value.h copy functions
* ****************************************************************
@@ -5527,7 +5548,6 @@ _copyBitString(const BitString *from)
return newnode;
}
-
static ForeignKeyCacheInfo *
_copyForeignKeyCacheInfo(const ForeignKeyCacheInfo *from)
{
@@ -6569,6 +6589,13 @@ copyObjectImpl(const void *from)
retval = _copyPublicationTable(from);
break;
+ /*
+ * EXECUTION NODES
+ */
+ case T_PartitionPruneResult:
+ retval = _copyPartitionPruneResult(from);
+ break;
+
/*
* MISCELLANEOUS NODES
*/
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index d5f5e76c55..3dada68291 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -314,7 +314,10 @@ _outPlannedStmt(StringInfo str, const PlannedStmt *node)
WRITE_BOOL_FIELD(parallelModeNeeded);
WRITE_INT_FIELD(jitFlags);
WRITE_NODE_FIELD(planTree);
+ WRITE_NODE_FIELD(partPruneInfos);
+ WRITE_BOOL_FIELD(containsInitialPruning);
WRITE_NODE_FIELD(rtable);
+ WRITE_BITMAPSET_FIELD(minLockRelids);
WRITE_NODE_FIELD(resultRelations);
WRITE_NODE_FIELD(appendRelations);
WRITE_NODE_FIELD(subplans);
@@ -443,7 +446,7 @@ _outAppend(StringInfo str, const Append *node)
WRITE_NODE_FIELD(appendplans);
WRITE_INT_FIELD(nasyncplans);
WRITE_INT_FIELD(first_partial_plan);
- WRITE_NODE_FIELD(part_prune_info);
+ WRITE_INT_FIELD(part_prune_index);
}
static void
@@ -460,7 +463,7 @@ _outMergeAppend(StringInfo str, const MergeAppend *node)
WRITE_OID_ARRAY(sortOperators, node->numCols);
WRITE_OID_ARRAY(collations, node->numCols);
WRITE_BOOL_ARRAY(nullsFirst, node->numCols);
- WRITE_NODE_FIELD(part_prune_info);
+ WRITE_INT_FIELD(part_prune_index);
}
static void
@@ -1009,6 +1012,8 @@ _outPartitionPruneInfo(StringInfo str, const PartitionPruneInfo *node)
WRITE_NODE_TYPE("PARTITIONPRUNEINFO");
WRITE_NODE_FIELD(prune_infos);
+ WRITE_BOOL_FIELD(needs_init_pruning);
+ WRITE_BOOL_FIELD(needs_exec_pruning);
WRITE_BITMAPSET_FIELD(other_subplans);
}
@@ -1023,6 +1028,7 @@ _outPartitionedRelPruneInfo(StringInfo str, const PartitionedRelPruneInfo *node)
WRITE_INT_ARRAY(subplan_map, node->nparts);
WRITE_INT_ARRAY(subpart_map, node->nparts);
WRITE_OID_ARRAY(relid_map, node->nparts);
+ WRITE_INDEX_ARRAY(rti_map, node->nparts);
WRITE_NODE_FIELD(initial_pruning_steps);
WRITE_NODE_FIELD(exec_pruning_steps);
WRITE_BITMAPSET_FIELD(execparamids);
@@ -2425,6 +2431,9 @@ _outPlannerGlobal(StringInfo str, const PlannerGlobal *node)
WRITE_NODE_FIELD(finalrowmarks);
WRITE_NODE_FIELD(resultRelations);
WRITE_NODE_FIELD(appendRelations);
+ WRITE_NODE_FIELD(partPruneInfos);
+ WRITE_BOOL_FIELD(containsInitialPruning);
+ WRITE_BITMAPSET_FIELD(minLockRelids);
WRITE_NODE_FIELD(relationOids);
WRITE_NODE_FIELD(invalItems);
WRITE_NODE_FIELD(paramExecTypes);
@@ -2492,6 +2501,7 @@ _outPlannerInfo(StringInfo str, const PlannerInfo *node)
WRITE_BITMAPSET_FIELD(curOuterRels);
WRITE_NODE_FIELD(curOuterParams);
WRITE_BOOL_FIELD(partColsUpdated);
+ WRITE_NODE_FIELD(partPruneInfos);
}
static void
@@ -2845,6 +2855,21 @@ _outExtensibleNode(StringInfo str, const ExtensibleNode *node)
methods->nodeOut(str, node);
}
+/*****************************************************************************
+ *
+ * Stuff from execnodes.h
+ *
+ *****************************************************************************/
+
+static void
+_outPartitionPruneResult(StringInfo str, const PartitionPruneResult *node)
+{
+ WRITE_NODE_TYPE("PARTITIONPRUNERESULT");
+
+ WRITE_NODE_FIELD(valid_subplan_offs_list);
+ WRITE_BITMAPSET_FIELD(scan_leafpart_rtis);
+}
+
/*****************************************************************************
*
* Stuff from parsenodes.h.
@@ -4754,6 +4779,13 @@ outNode(StringInfo str, const void *obj)
_outJsonTableSibling(str, obj);
break;
+ /*
+ * EXECUTION NODES
+ */
+ case T_PartitionPruneResult:
+ _outPartitionPruneResult(str, obj);
+ break;
+
default:
/*
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 3d150cb25d..6a6fcec03b 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -164,6 +164,11 @@
token = pg_strtok(&length); /* skip :fldname */ \
local_node->fldname = readIntCols(len)
+/* Read an Index array */
+#define READ_INDEX_ARRAY(fldname, len) \
+ token = pg_strtok(&length); /* skip :fldname */ \
+ local_node->fldname = readIndexCols(len)
+
/* Read a bool array */
#define READ_BOOL_ARRAY(fldname, len) \
token = pg_strtok(&length); /* skip :fldname */ \
@@ -1815,7 +1820,10 @@ _readPlannedStmt(void)
READ_BOOL_FIELD(parallelModeNeeded);
READ_INT_FIELD(jitFlags);
READ_NODE_FIELD(planTree);
+ READ_NODE_FIELD(partPruneInfos);
+ READ_BOOL_FIELD(containsInitialPruning);
READ_NODE_FIELD(rtable);
+ READ_BITMAPSET_FIELD(minLockRelids);
READ_NODE_FIELD(resultRelations);
READ_NODE_FIELD(appendRelations);
READ_NODE_FIELD(subplans);
@@ -1947,7 +1955,7 @@ _readAppend(void)
READ_NODE_FIELD(appendplans);
READ_INT_FIELD(nasyncplans);
READ_INT_FIELD(first_partial_plan);
- READ_NODE_FIELD(part_prune_info);
+ READ_INT_FIELD(part_prune_index);
READ_DONE();
}
@@ -1969,7 +1977,7 @@ _readMergeAppend(void)
READ_OID_ARRAY(sortOperators, local_node->numCols);
READ_OID_ARRAY(collations, local_node->numCols);
READ_BOOL_ARRAY(nullsFirst, local_node->numCols);
- READ_NODE_FIELD(part_prune_info);
+ READ_INT_FIELD(part_prune_index);
READ_DONE();
}
@@ -2767,6 +2775,8 @@ _readPartitionPruneInfo(void)
READ_LOCALS(PartitionPruneInfo);
READ_NODE_FIELD(prune_infos);
+ READ_BOOL_FIELD(needs_init_pruning);
+ READ_BOOL_FIELD(needs_exec_pruning);
READ_BITMAPSET_FIELD(other_subplans);
READ_DONE();
@@ -2783,6 +2793,7 @@ _readPartitionedRelPruneInfo(void)
READ_INT_ARRAY(subplan_map, local_node->nparts);
READ_INT_ARRAY(subpart_map, local_node->nparts);
READ_OID_ARRAY(relid_map, local_node->nparts);
+ READ_INDEX_ARRAY(rti_map, local_node->nparts);
READ_NODE_FIELD(initial_pruning_steps);
READ_NODE_FIELD(exec_pruning_steps);
READ_BITMAPSET_FIELD(execparamids);
@@ -2936,6 +2947,21 @@ _readPartitionRangeDatum(void)
READ_DONE();
}
+
+/*
+ * _readPartitionPruneResult
+ */
+static PartitionPruneResult *
+_readPartitionPruneResult(void)
+{
+ READ_LOCALS(PartitionPruneResult);
+
+ READ_NODE_FIELD(valid_subplan_offs_list);
+ READ_BITMAPSET_FIELD(scan_leafpart_rtis);
+
+ READ_DONE();
+}
+
/*
* parseNodeString
*
@@ -3233,6 +3259,8 @@ parseNodeString(void)
return_value = _readJsonTableParent();
else if (MATCH("JSONTABSNODE", 12))
return_value = _readJsonTableSibling();
+ else if (MATCH("PARTITIONPRUNERESULT", 20))
+ return_value = _readPartitionPruneResult();
else
{
elog(ERROR, "badly formatted node string \"%.32s\"...", token);
@@ -3376,6 +3404,30 @@ readIntCols(int numCols)
return int_vals;
}
+/*
+ * readIndexCols
+ */
+Index *
+readIndexCols(int numCols)
+{
+ int tokenLength,
+ i;
+ const char *token;
+ Index *index_vals;
+
+ if (numCols <= 0)
+ return NULL;
+
+ index_vals = (Index *) palloc(numCols * sizeof(Index));
+ for (i = 0; i < numCols; i++)
+ {
+ token = pg_strtok(&tokenLength);
+ index_vals[i] = atoui(token);
+ }
+
+ return index_vals;
+}
+
/*
* readBoolCols
*/
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index 95476ada0b..fe0df2f1d1 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -1184,7 +1184,6 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
ListCell *subpaths;
int nasyncplans = 0;
RelOptInfo *rel = best_path->path.parent;
- PartitionPruneInfo *partpruneinfo = NULL;
int nodenumsortkeys = 0;
AttrNumber *nodeSortColIdx = NULL;
Oid *nodeSortOperators = NULL;
@@ -1335,6 +1334,9 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
subplans = lappend(subplans, subplan);
}
+ /* Set below if we find quals that we can use to run-time prune */
+ plan->part_prune_index = -1;
+
/*
* If any quals exist, they may be useful to perform further partition
* pruning during execution. Gather information needed by the executor to
@@ -1358,16 +1360,14 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
}
if (prunequal != NIL)
- partpruneinfo =
- make_partition_pruneinfo(root, rel,
- best_path->subpaths,
- prunequal);
+ plan->part_prune_index = make_partition_pruneinfo(root, rel,
+ best_path->subpaths,
+ prunequal);
}
plan->appendplans = subplans;
plan->nasyncplans = nasyncplans;
plan->first_partial_plan = best_path->first_partial_path;
- plan->part_prune_info = partpruneinfo;
copy_generic_path_info(&plan->plan, (Path *) best_path);
@@ -1407,7 +1407,6 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
List *subplans = NIL;
ListCell *subpaths;
RelOptInfo *rel = best_path->path.parent;
- PartitionPruneInfo *partpruneinfo = NULL;
/*
* We don't have the actual creation of the MergeAppend node split out
@@ -1500,6 +1499,9 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
subplans = lappend(subplans, subplan);
}
+ /* Set below if we find quals that we can use to run-time prune */
+ node->part_prune_index = -1;
+
/*
* If any quals exist, they may be useful to perform further partition
* pruning during execution. Gather information needed by the executor to
@@ -1523,13 +1525,13 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
}
if (prunequal != NIL)
- partpruneinfo = make_partition_pruneinfo(root, rel,
- best_path->subpaths,
- prunequal);
+ node->part_prune_index = make_partition_pruneinfo(root, rel,
+ best_path->subpaths,
+ prunequal);
}
node->mergeplans = subplans;
- node->part_prune_info = partpruneinfo;
+
/*
* If prepare_sort_from_pathkeys added sort columns, but we were told to
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index b090b087e9..f425362491 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -518,7 +518,10 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
result->dependsOnRole = glob->dependsOnRole;
result->parallelModeNeeded = glob->parallelModeNeeded;
result->planTree = top_plan;
+ result->partPruneInfos = glob->partPruneInfos;
+ result->containsInitialPruning = glob->containsInitialPruning;
result->rtable = glob->finalrtable;
+ result->minLockRelids = glob->minLockRelids;
result->resultRelations = glob->resultRelations;
result->appendRelations = glob->appendRelations;
result->subplans = glob->subplans;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 6ea3505646..94d4ff0b9d 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -261,7 +261,7 @@ set_plan_references(PlannerInfo *root, Plan *plan)
Plan *result;
PlannerGlobal *glob = root->glob;
int rtoffset = list_length(glob->finalrtable);
- ListCell *lc;
+ ListCell *lc;
/*
* Add all the query's RTEs to the flattened rangetable. The live ones
@@ -270,6 +270,16 @@ set_plan_references(PlannerInfo *root, Plan *plan)
*/
add_rtes_to_flat_rtable(root, false);
+ /*
+ * Add the query's adjusted range of RT indexes to glob->minLockRelids.
+ * The adjusted RT indexes of prunable relations will be deleted from the
+ * set below where PartitionPruneInfos are processed.
+ */
+ glob->minLockRelids =
+ bms_add_range(glob->minLockRelids,
+ rtoffset + 1,
+ rtoffset + list_length(root->parse->rtable));
+
/*
* Adjust RT indexes of PlanRowMarks and add to final rowmarks list
*/
@@ -348,6 +358,64 @@ set_plan_references(PlannerInfo *root, Plan *plan)
}
}
+ /* Also fix up the information in PartitionPruneInfos. */
+ foreach (lc, root->partPruneInfos)
+ {
+ PartitionPruneInfo *pruneinfo = lfirst(lc);
+ Bitmapset *leafpart_rtis = NULL;
+ ListCell *l;
+
+ foreach(l, pruneinfo->prune_infos)
+ {
+ List *prune_infos = lfirst(l);
+ ListCell *l2;
+
+ foreach(l2, prune_infos)
+ {
+ PartitionedRelPruneInfo *pinfo = lfirst(l2);
+ int i;
+
+ /* RT index of the partitioned table. */
+ pinfo->rtindex += rtoffset;
+
+ /* And also those of the leaf partitions. */
+ for (i = 0; i < pinfo->nparts; i++)
+ {
+ if (pinfo->rti_map[i] > 0)
+ {
+ pinfo->rti_map[i] += rtoffset;
+ leafpart_rtis = bms_add_member(leafpart_rtis,
+ pinfo->rti_map[i]);
+ }
+ }
+ }
+ }
+
+ if (pruneinfo->needs_init_pruning)
+ {
+ glob->containsInitialPruning = true;
+
+ /*
+ * Delete the leaf partition RTIs from the global set of relations
+ * to be locked before executing the plan. AcquireExecutorLocks()
+ * will find the ones to add to the set after performing initial
+ * pruning.
+ */
+ glob->minLockRelids = bms_del_members(glob->minLockRelids,
+ leafpart_rtis);
+ }
+
+ glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
+ }
+
+ /*
+ * It seems worth doing a bms_copy() on glob->minLockRelids if we deleted
+ * bits from it just above, to prevent empty tail bits from causing
+ * inefficient looping during AcquireExecutorLocks().
+ */
+ if (glob->containsInitialPruning)
+ glob->minLockRelids = bms_copy(glob->minLockRelids);
+
return result;
}
@@ -1640,21 +1708,12 @@ set_append_references(PlannerInfo *root,
aplan->apprelids = offset_relid_set(aplan->apprelids, rtoffset);
- if (aplan->part_prune_info)
- {
- foreach(l, aplan->part_prune_info->prune_infos)
- {
- List *prune_infos = lfirst(l);
- ListCell *l2;
-
- foreach(l2, prune_infos)
- {
- PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
- pinfo->rtindex += rtoffset;
- }
- }
- }
+ /*
+ * PartitionPruneInfos will be added to a list in PlannerGlobal, so update
+ * the index.
+ */
+ if (aplan->part_prune_index >= 0)
+ aplan->part_prune_index += list_length(root->glob->partPruneInfos);
/* We don't need to recurse to lefttree or righttree ... */
Assert(aplan->plan.lefttree == NULL);
@@ -1712,21 +1771,12 @@ set_mergeappend_references(PlannerInfo *root,
mplan->apprelids = offset_relid_set(mplan->apprelids, rtoffset);
- if (mplan->part_prune_info)
- {
- foreach(l, mplan->part_prune_info->prune_infos)
- {
- List *prune_infos = lfirst(l);
- ListCell *l2;
-
- foreach(l2, prune_infos)
- {
- PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
- pinfo->rtindex += rtoffset;
- }
- }
- }
+ /*
+ * PartitionPruneInfos will be added to a list in PlannerGlobal, so update
+ * the index.
+ */
+ if (mplan->part_prune_index >= 0)
+ mplan->part_prune_index += list_length(root->glob->partPruneInfos);
/* We don't need to recurse to lefttree or righttree ... */
Assert(mplan->plan.lefttree == NULL);
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 9d3c05aed3..952c5b8327 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -144,7 +144,9 @@ static List *make_partitionedrel_pruneinfo(PlannerInfo *root,
List *prunequal,
Bitmapset *partrelids,
int *relid_subplan_map,
- Bitmapset **matchedsubplans);
+ Bitmapset **matchedsubplans,
+ bool *needs_init_pruning,
+ bool *needs_exec_pruning);
static void gen_partprune_steps(RelOptInfo *rel, List *clauses,
PartClauseTarget target,
GeneratePruningStepsContext *context);
@@ -209,16 +211,20 @@ static void partkey_datum_from_expr(PartitionPruneContext *context,
/*
* make_partition_pruneinfo
- * Builds a PartitionPruneInfo which can be used in the executor to allow
- * additional partition pruning to take place. Returns NULL when
- * partition pruning would be useless.
+ * Checks if the given set of quals can be used to build pruning steps
+ * that the executor can use to prune away unneeded partitions. If
+ * suitable quals are found then a PartitionPruneInfo is built and tagged
+ * onto the PlannerInfo's partPruneInfos list.
+ *
+ * The return value is the 0-based index of the item added to the
+ * partPruneInfos list or -1 if nothing was added.
*
* 'parentrel' is the RelOptInfo for an appendrel, and 'subpaths' is the list
* of scan paths for its child rels.
* 'prunequal' is a list of potential pruning quals (i.e., restriction
* clauses that are applicable to the appendrel).
*/
-PartitionPruneInfo *
+int
make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
List *subpaths,
List *prunequal)
@@ -230,6 +236,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int *relid_subplan_map;
ListCell *lc;
int i;
+ bool needs_init_pruning = false;
+ bool needs_exec_pruning = false;
/*
* Scan the subpaths to see which ones are scans of partition child
@@ -309,12 +317,16 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
Bitmapset *partrelids = (Bitmapset *) lfirst(lc);
List *pinfolist;
Bitmapset *matchedsubplans = NULL;
+ bool partrel_needs_init_pruning;
+ bool partrel_needs_exec_pruning;
pinfolist = make_partitionedrel_pruneinfo(root, parentrel,
prunequal,
partrelids,
relid_subplan_map,
- &matchedsubplans);
+ &matchedsubplans,
+ &partrel_needs_init_pruning,
+ &partrel_needs_exec_pruning);
/* When pruning is possible, record the matched subplans */
if (pinfolist != NIL)
@@ -323,6 +335,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
allmatchedsubplans = bms_join(matchedsubplans,
allmatchedsubplans);
}
+
+ needs_init_pruning |= partrel_needs_init_pruning;
+ needs_exec_pruning |= partrel_needs_exec_pruning;
}
pfree(relid_subplan_map);
@@ -332,11 +347,13 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
* quals, then we can just not bother with run-time pruning.
*/
if (prunerelinfos == NIL)
- return NULL;
+ return -1;
/* Else build the result data structure */
pruneinfo = makeNode(PartitionPruneInfo);
pruneinfo->prune_infos = prunerelinfos;
+ pruneinfo->needs_init_pruning = needs_init_pruning;
+ pruneinfo->needs_exec_pruning = needs_exec_pruning;
/*
* Some subplans may not belong to any of the identified partitioned rels.
@@ -358,7 +375,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
else
pruneinfo->other_subplans = NULL;
- return pruneinfo;
+ root->partPruneInfos = lappend(root->partPruneInfos, pruneinfo);
+
+ return list_length(root->partPruneInfos) - 1;
}
/*
@@ -435,13 +454,18 @@ add_part_relids(List *allpartrelids, Bitmapset *partrelids)
* If we cannot find any useful run-time pruning steps, return NIL.
* However, on success, each rel identified in partrelids will have
* an element in the result list, even if some of them are useless.
+ * *needs_init_pruning and *needs_exec_pruning are set to indicate that the
+ * returned PartitionedRelPruneInfos contain pruning steps that can be
+ * performed before and after execution begins, respectively.
*/
static List *
make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
List *prunequal,
Bitmapset *partrelids,
int *relid_subplan_map,
- Bitmapset **matchedsubplans)
+ Bitmapset **matchedsubplans,
+ bool *needs_init_pruning,
+ bool *needs_exec_pruning)
{
RelOptInfo *targetpart = NULL;
List *pinfolist = NIL;
@@ -452,6 +476,10 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int rti;
int i;
+ /* Will find out below. */
+ *needs_init_pruning = false;
+ *needs_exec_pruning = false;
+
/*
* Examine each partitioned rel, constructing a temporary array to map
* from planner relids to index of the partitioned rel, and building a
@@ -539,6 +567,9 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
* executor per-scan pruning steps. This first pass creates startup
* pruning steps and detects whether there's any possibly-useful quals
* that would require per-scan pruning.
+ *
+ * In the first pass, we note whether the 2nd pass is necessary by
+ * noting the presence of EXEC parameters.
*/
gen_partprune_steps(subpart, partprunequal, PARTTARGET_INITIAL,
&context);
@@ -613,6 +644,12 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
pinfo->execparamids = execparamids;
/* Remaining fields will be filled in the next loop */
+ /* record which types of pruning steps we've seen so far */
+ if (initial_pruning_steps != NIL)
+ *needs_init_pruning = true;
+ if (exec_pruning_steps != NIL)
+ *needs_exec_pruning = true;
+
pinfolist = lappend(pinfolist, pinfo);
}
@@ -640,6 +677,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int *subplan_map;
int *subpart_map;
Oid *relid_map;
+ Index *rti_map;
/*
* Construct the subplan and subpart maps for this partitioning level.
@@ -652,6 +690,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
subpart_map = (int *) palloc(nparts * sizeof(int));
memset(subpart_map, -1, nparts * sizeof(int));
relid_map = (Oid *) palloc0(nparts * sizeof(Oid));
+ rti_map = (Index *) palloc0(nparts * sizeof(Index));
present_parts = NULL;
i = -1;
@@ -666,6 +705,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
subplan_map[i] = subplanidx = relid_subplan_map[partrel->relid] - 1;
subpart_map[i] = subpartidx = relid_subpart_map[partrel->relid] - 1;
relid_map[i] = planner_rt_fetch(partrel->relid, root)->relid;
+ rti_map[i] = partrel->relid;
if (subplanidx >= 0)
{
present_parts = bms_add_member(present_parts, i);
@@ -690,6 +730,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
pinfo->subplan_map = subplan_map;
pinfo->subpart_map = subpart_map;
pinfo->relid_map = relid_map;
+ pinfo->rti_map = rti_map;
}
pfree(relid_subpart_map);
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 95dc2e2c83..8dc52a158f 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1603,6 +1603,7 @@ exec_bind_message(StringInfo input_message)
int16 *rformats = NULL;
CachedPlanSource *psrc;
CachedPlan *cplan;
+ List *part_prune_result_list;
Portal portal;
char *query_string;
char *saved_stmt_name;
@@ -1978,7 +1979,9 @@ exec_bind_message(StringInfo input_message)
* will be generated in MessageContext. The plan refcount will be
* assigned to the Portal, so it will be released at portal destruction.
*/
- cplan = GetCachedPlan(psrc, params, NULL, NULL);
+ cplan = GetCachedPlan(psrc, params, NULL, NULL, &part_prune_result_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_result_list));
/*
* Now we can define the portal.
@@ -1993,6 +1996,9 @@ exec_bind_message(StringInfo input_message)
cplan->stmt_list,
cplan);
+ /* Copy PartitionPruneResults into the portal's context. */
+ PortalStorePartitionPruneResults(portal, part_prune_result_list);
+
/* Done with the snapshot used for parameter I/O and parsing/planning */
if (snapshot_set)
PopActiveSnapshot();
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 5aa5a350f3..8cc2e2162d 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -35,7 +35,7 @@
Portal ActivePortal = NULL;
-static void ProcessQuery(PlannedStmt *plan,
+static void ProcessQuery(PlannedStmt *plan, PartitionPruneResult *part_prune_result,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -65,6 +65,7 @@ static void DoPortalRewind(Portal portal);
*/
QueryDesc *
CreateQueryDesc(PlannedStmt *plannedstmt,
+ PartitionPruneResult *part_prune_result,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
@@ -77,6 +78,8 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
qd->operation = plannedstmt->commandType; /* operation */
qd->plannedstmt = plannedstmt; /* plan */
+ qd->part_prune_result = part_prune_result; /* ExecutorDoInitialPruning()
+ * output for plan */
qd->sourceText = sourceText; /* query text */
qd->snapshot = RegisterSnapshot(snapshot); /* snapshot */
/* RI check snapshot */
@@ -122,6 +125,7 @@ FreeQueryDesc(QueryDesc *qdesc)
* PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
*
* plan: the plan tree for the query
+ * part_prune_result: ExecutorDoInitialPruning() output for the plan tree
* sourceText: the source text of the query
* params: any parameters needed
* dest: where to send results
@@ -134,6 +138,7 @@ FreeQueryDesc(QueryDesc *qdesc)
*/
static void
ProcessQuery(PlannedStmt *plan,
+ PartitionPruneResult *part_prune_result,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -145,7 +150,7 @@ ProcessQuery(PlannedStmt *plan,
/*
* Create the QueryDesc object
*/
- queryDesc = CreateQueryDesc(plan, sourceText,
+ queryDesc = CreateQueryDesc(plan, part_prune_result, sourceText,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
@@ -491,8 +496,13 @@ PortalStart(Portal portal, ParamListInfo params,
/*
* Create QueryDesc in portal's context; for the moment, set
* the destination to DestNone.
+ *
+ * There is no PartitionPruneResult unless the PlannedStmt is
+ * from a CachedPlan.
*/
queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+ portal->part_prune_results == NIL ? NULL :
+ linitial(portal->part_prune_results),
portal->sourceText,
GetActiveSnapshot(),
InvalidSnapshot,
@@ -1225,6 +1235,8 @@ PortalRunMulti(Portal portal,
if (pstmt->utilityStmt == NULL)
{
+ PartitionPruneResult *part_prune_result = NULL;
+
/*
* process a plannable query.
*/
@@ -1271,10 +1283,18 @@ PortalRunMulti(Portal portal,
else
UpdateActiveSnapshotCommandId();
+ /*
+ * Determine if there's a corresponding PartitionPruneResult for
+ * this PlannedStmt.
+ */
+ if (portal->part_prune_results != NIL)
+ part_prune_result = list_nth(portal->part_prune_results,
+ foreach_current_index(stmtlist_item));
+
if (pstmt->canSetTag)
{
/* statement can set tag string */
- ProcessQuery(pstmt,
+ ProcessQuery(pstmt, part_prune_result,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
@@ -1283,7 +1303,7 @@ PortalRunMulti(Portal portal,
else
{
/* stmt added by rewrite cannot set tag */
- ProcessQuery(pstmt,
+ ProcessQuery(pstmt, part_prune_result,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 4cf6db504f..6cb473f2f4 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -99,14 +99,19 @@ static dlist_head cached_expression_list = DLIST_STATIC_INIT(cached_expression_l
static void ReleaseGenericPlan(CachedPlanSource *plansource);
static List *RevalidateCachedQuery(CachedPlanSource *plansource,
QueryEnvironment *queryEnv);
-static bool CheckCachedPlan(CachedPlanSource *plansource);
+static bool CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+ List **part_prune_result_list);
static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
- ParamListInfo boundParams, QueryEnvironment *queryEnv);
+ ParamListInfo boundParams, QueryEnvironment *queryEnv,
+ List **part_prune_result_list);
static bool choose_custom_plan(CachedPlanSource *plansource,
ParamListInfo boundParams);
static double cached_plan_cost(CachedPlan *plan, bool include_planner);
static Query *QueryListGetPrimaryStmt(List *stmts);
-static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static void AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+ List **part_prune_result_list,
+ List **lockedRelids_per_stmt);
+static void ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt);
static void AcquirePlannerLocks(List *stmt_list, bool acquire);
static void ScanQueryForLocks(Query *parsetree, bool acquire);
static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -790,15 +795,20 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
*
* On a "true" return, we have acquired the locks needed to run the plan.
* (We must do this for the "true" result to be race-condition-free.)
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_result_list.
*/
static bool
-CheckCachedPlan(CachedPlanSource *plansource)
+CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+ List **part_prune_result_list)
{
CachedPlan *plan = plansource->gplan;
/* Assert that caller checked the querytree */
Assert(plansource->is_valid);
+ *part_prune_result_list = NIL;
+
/* If there's no generic plan, just say "false" */
if (!plan)
return false;
@@ -820,13 +830,21 @@ CheckCachedPlan(CachedPlanSource *plansource)
*/
if (plan->is_valid)
{
+ List *lockedRelids_per_stmt;
+
/*
* Plan must have positive refcount because it is referenced by
* plansource; so no need to fear it disappears under us here.
*/
Assert(plan->refcount > 0);
- AcquireExecutorLocks(plan->stmt_list, true);
+ /*
+ * Lock relations scanned by the plan. This is where the pruning
+ * happens if needed.
+ */
+ AcquireExecutorLocks(plan->stmt_list, boundParams,
+ part_prune_result_list,
+ &lockedRelids_per_stmt);
/*
* If plan was transient, check to see if TransactionXmin has
@@ -848,7 +866,14 @@ CheckCachedPlan(CachedPlanSource *plansource)
}
/* Oops, the race case happened. Release useless locks. */
- AcquireExecutorLocks(plan->stmt_list, false);
+ ReleaseExecutorLocks(plan->stmt_list, lockedRelids_per_stmt);
+
+ /*
+ * The output list and any objects therein were allocated in the
+ * caller's (hopefully short-lived) context, so they won't stay leaked
+ * for long; still, reset the list so it isn't looked at by accident.
+ */
+ *part_prune_result_list = NIL;
}
/*
@@ -874,10 +899,15 @@ CheckCachedPlan(CachedPlanSource *plansource)
* Planning work is done in the caller's memory context. The finished plan
* is in a child memory context, which typically should get reparented
* (unless this is a one-shot plan, in which case we don't copy the plan).
+ *
+ * A list of NULLs is returned in *part_prune_result_list, meaning that no
+ * PartitionPruneResult nodes have yet been created for the plans in
+ * stmt_list.
*/
static CachedPlan *
BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
- ParamListInfo boundParams, QueryEnvironment *queryEnv)
+ ParamListInfo boundParams, QueryEnvironment *queryEnv,
+ List **part_prune_result_list)
{
CachedPlan *plan;
List *plist;
@@ -1007,6 +1037,17 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
MemoryContextSwitchTo(oldcxt);
+ /*
+ * No actual PartitionPruneResults yet to add, though must initialize
+ * the list to have the same number of elements as the list of
+ * PlannedStmts.
+ */
+ *part_prune_result_list = NIL;
+ foreach(lc, plist)
+ {
+ *part_prune_result_list = lappend(*part_prune_result_list, NULL);
+ }
+
return plan;
}
@@ -1126,6 +1167,17 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
* plan or a custom plan for the given parameters: the caller does not know
* which it will get.
*
+ * For every PlannedStmt found in the returned CachedPlan, an element that
+ * is either a PartitionPruneResult or a NULL is added to
+ * *part_prune_result_list if needed. The former if the PlannedStmt is from
+ * the existing CachedPlan that is otherwise valid and contains at least one
+ * PartitionPruneInfo that has "initial" pruning steps. Those steps are
+ * performed by calling ExecutorDoInitialPruning() to determine only those
+ * leaf partitions that need to be locked by AcquireExecutorLocks() by pruning
+ * away subplans that don't match the pruning conditions. The
+ * PartitionPruneResult contains a list of bitmapsets of the indexes of
+ * matching subplans, one for each PartitionPruneInfo.
+ *
* On return, the plan is valid and we have sufficient locks to begin
* execution.
*
@@ -1139,11 +1191,13 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
*/
CachedPlan *
GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
- ResourceOwner owner, QueryEnvironment *queryEnv)
+ ResourceOwner owner, QueryEnvironment *queryEnv,
+ List **part_prune_result_list)
{
CachedPlan *plan = NULL;
List *qlist;
bool customplan;
+ List *my_part_prune_result_list;
/* Assert caller is doing things in a sane order */
Assert(plansource->magic == CACHEDPLANSOURCE_MAGIC);
@@ -1160,7 +1214,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
if (!customplan)
{
- if (CheckCachedPlan(plansource))
+ if (CheckCachedPlan(plansource, boundParams,
+ &my_part_prune_result_list))
{
/* We want a generic plan, and we already have a valid one */
plan = plansource->gplan;
@@ -1169,7 +1224,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
else
{
/* Build a new generic plan */
- plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv);
+ plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv,
+ &my_part_prune_result_list);
/* Just make real sure plansource->gplan is clear */
ReleaseGenericPlan(plansource);
/* Link the new generic plan into the plansource */
@@ -1214,7 +1270,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
if (customplan)
{
/* Build a custom plan */
- plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv);
+ plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv,
+ &my_part_prune_result_list);
/* Accumulate total costs of custom plans */
plansource->total_custom_cost += cached_plan_cost(plan, true);
@@ -1246,6 +1303,9 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
plan->is_saved = true;
}
+ if (part_prune_result_list)
+ *part_prune_result_list = my_part_prune_result_list;
+
return plan;
}
@@ -1737,17 +1797,29 @@ QueryListGetPrimaryStmt(List *stmts)
/*
* AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
- * or release them if acquire is false.
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_result_list.
+ *
+ * On return, *lockedRelids_per_stmt will contain a bitmapset for every
+ * PlannedStmt in stmt_list, containing the RT indexes of relation entries
+ * in its range table that were actually locked, or NULL if the PlannedStmt
+ * contains a utility statement.
*/
static void
-AcquireExecutorLocks(List *stmt_list, bool acquire)
+AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+ List **part_prune_result_list,
+ List **lockedRelids_per_stmt)
{
ListCell *lc1;
+ *part_prune_result_list = *lockedRelids_per_stmt = NIL;
foreach(lc1, stmt_list)
{
PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
- ListCell *lc2;
+ PartitionPruneResult *part_prune_result = NULL;
+ Bitmapset *allLockRelids;
+ Bitmapset *lockedRelids = NULL;
+ int rti;
if (plannedstmt->commandType == CMD_UTILITY)
{
@@ -1761,13 +1833,36 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
Query *query = UtilityContainsQuery(plannedstmt->utilityStmt);
if (query)
- ScanQueryForLocks(query, acquire);
+ ScanQueryForLocks(query, true);
+ *part_prune_result_list = lappend(*part_prune_result_list, NULL);
+ *lockedRelids_per_stmt = lappend(*lockedRelids_per_stmt, NULL);
continue;
}
- foreach(lc2, plannedstmt->rtable)
+ /*
+ * Figure out the set of relations that would need to be locked
+ * before executing the plan.
+ */
+ if (plannedstmt->containsInitialPruning)
{
- RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
+ /*
+ * Obtain the set of partitions to be locked from the
+ * PartitionPruneInfos by considering the result of performing
+ * initial partition pruning.
+ */
+ part_prune_result =
+ ExecutorDoInitialPruning(plannedstmt, boundParams);
+
+ allLockRelids = bms_union(plannedstmt->minLockRelids,
+ part_prune_result->scan_leafpart_rtis);
+ }
+ else
+ allLockRelids = plannedstmt->minLockRelids;
+
+ rti = -1;
+ while ((rti = bms_next_member(allLockRelids, rti)) > 0)
+ {
+ RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
if (rte->rtekind != RTE_RELATION)
continue;
@@ -1778,10 +1872,58 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
* fail if it's been dropped entirely --- we'll just transiently
* acquire a non-conflicting lock.
*/
- if (acquire)
- LockRelationOid(rte->relid, rte->rellockmode);
- else
- UnlockRelationOid(rte->relid, rte->rellockmode);
+ LockRelationOid(rte->relid, rte->rellockmode);
+ lockedRelids = bms_add_member(lockedRelids, rti);
+ }
+
+ *part_prune_result_list = lappend(*part_prune_result_list,
+ part_prune_result);
+ *lockedRelids_per_stmt = lappend(*lockedRelids_per_stmt, lockedRelids);
+ }
+}
+
+/*
+ * ReleaseExecutorLocks
+ * Release locks that would've been acquired by an earlier call to
+ * AcquireExecutorLocks()
+ */
+static void
+ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt)
+{
+ ListCell *lc1,
+ *lc2;
+
+ forboth(lc1, stmt_list, lc2, lockedRelids_per_stmt)
+ {
+ PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
+ Bitmapset *lockedRelids = lfirst(lc2);
+ int rti;
+
+ if (plannedstmt->commandType == CMD_UTILITY)
+ {
+ /*
+ * Ignore utility statements, except those (such as EXPLAIN) that
+ * contain a parsed-but-not-planned query. Note: it's okay to use
+ * ScanQueryForLocks, even though the query hasn't been through
+ * rule rewriting, because rewriting doesn't change the query
+ * representation.
+ */
+ Query *query = UtilityContainsQuery(plannedstmt->utilityStmt);
+
+ if (query)
+ ScanQueryForLocks(query, false);
+ continue;
+ }
+
+ rti = -1;
+ while ((rti = bms_next_member(lockedRelids, rti)) >= 0)
+ {
+ RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
+
+ Assert(rte->rtekind == RTE_RELATION);
+
+ /* See the comment in AcquireExecutorLocks(). */
+ UnlockRelationOid(rte->relid, rte->rellockmode);
}
}
}
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index d549f66d4a..1bbe6b704b 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -303,6 +303,25 @@ PortalDefineQuery(Portal portal,
portal->status = PORTAL_DEFINED;
}
+/*
+ * PortalStorePartitionPruneResults
+ * Copy the given list of PartitionPruneResults into the portal's
+ * context
+ *
+ * This allows the caller to ensure that the list exists as long as the portal
+ * does.
+ */
+void
+PortalStorePartitionPruneResults(Portal portal, List *part_prune_results)
+{
+ MemoryContext oldcxt;
+
+ AssertArg(PortalIsValid(portal));
+ oldcxt = MemoryContextSwitchTo(portal->portalContext);
+ portal->part_prune_results = copyObject(part_prune_results);
+ MemoryContextSwitchTo(oldcxt);
+}
+
/*
* PortalReleaseCachedPlan
* Release a portal's reference to its cached plan, if any.
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 666977fb1f..bbc8c42d88 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -87,7 +87,9 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
ExplainState *es, const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv);
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt,
+ PartitionPruneResult *part_prune_result,
+ IntoClause *into,
ExplainState *es, const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv,
const instr_time *planduration,
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 708435e952..bd8776402e 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -45,6 +45,7 @@ extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
* nparts Length of subplan_map[] and subpart_map[].
* subplan_map Subplan index by partition index, or -1.
* subpart_map Subpart index by partition index, or -1.
+ * rti_map Range table index by partition index, or 0.
* present_parts A Bitmapset of the partition indexes that we
* have subplans or subparts for.
* initial_pruning_steps List of PartitionPruneSteps used to
@@ -61,6 +62,7 @@ typedef struct PartitionedRelPruningData
int nparts;
int *subplan_map;
int *subpart_map;
+ Index *rti_map;
Bitmapset *present_parts;
List *initial_pruning_steps;
List *exec_pruning_steps;
@@ -123,9 +125,13 @@ typedef struct PartitionPruneState
extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
int n_total_subplans,
- PartitionPruneInfo *pruneinfo,
+ int part_prune_index,
Bitmapset **initially_valid_subplans);
extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
- bool initial_prune);
-
+ bool initial_prune,
+ Bitmapset **scan_leafpart_rtis);
+extern Bitmapset *ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt,
+ ParamListInfo params,
+ PartitionPruneInfo *pruneinfo,
+ Bitmapset **scan_leafpart_rtis);
#endif /* EXECPARTITION_H */
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index e79e2c001f..60d5644908 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -35,6 +35,8 @@ typedef struct QueryDesc
/* These fields are provided by CreateQueryDesc */
CmdType operation; /* CMD_SELECT, CMD_UPDATE, etc. */
PlannedStmt *plannedstmt; /* planner's output (could be utility, too) */
+ PartitionPruneResult *part_prune_result; /* ExecutorDoInitialPruning()'s
+ * output for plannedstmt */
const char *sourceText; /* source text of the query */
Snapshot snapshot; /* snapshot to use for query */
Snapshot crosscheck_snapshot; /* crosscheck for RI update/delete */
@@ -57,6 +59,7 @@ typedef struct QueryDesc
/* in pquery.c */
extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
+ PartitionPruneResult *part_prune_result,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index 873772f188..57dc0e8077 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -185,6 +185,8 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
/*
* prototypes from functions in execMain.c
*/
+extern PartitionPruneResult *ExecutorDoInitialPruning(PlannedStmt *plannedstmt,
+ ParamListInfo params);
extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
extern void ExecutorRun(QueryDesc *queryDesc,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 94b191f8ae..a8bf908d63 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -596,6 +596,8 @@ typedef struct EState
struct ExecRowMark **es_rowmarks; /* Array of per-range-table-entry
* ExecRowMarks, or NULL if none */
PlannedStmt *es_plannedstmt; /* link to top of plan tree */
+ List *es_part_prune_infos; /* PlannedStmt.partPruneInfos */
+ struct PartitionPruneResult *es_part_prune_result; /* QueryDesc.part_prune_result */
const char *es_sourceText; /* Source text from QueryDesc */
JunkFilter *es_junkFilter; /* top-level junk filter, if any */
@@ -984,6 +986,34 @@ typedef struct DomainConstraintState
*/
typedef TupleTableSlot *(*ExecProcNodeMtd) (struct PlanState *pstate);
+/*----------------
+ * PartitionPruneResult
+ *
+ * The result of an ExecutorDoInitialPruning() invocation on a given
+ * PlannedStmt.
+ *
+ * Contains a list of Bitmapsets, one for every PartitionPruneInfo found in
+ * PlannedStmt.partPruneInfos, each holding the indexes of the subplans that
+ * remain after initial pruning via ExecFindMatchingSubPlans(). RT indexes of
+ * the leaf partitions scanned by those subplans across all PartitionPruneInfos
+ * are added into scan_leafpart_rtis.
+ *
+ * This is used by GetCachedPlan() to inform its callers of the pruning
+ * decisions made when performing AcquireExecutorLocks() on a given cached
+ * PlannedStmt, which the callers then pass on to the executor. If this node
+ * is available, the executor refers to it when initializing the plan nodes to
+ * which those PartitionPruneInfos apply, so that the same set of qualifying
+ * subplans is initialized, rather than deriving that set again by redoing
+ * initial pruning.
+ */
+typedef struct PartitionPruneResult
+{
+ NodeTag type;
+
+ List *valid_subplan_offs_list;
+ Bitmapset *scan_leafpart_rtis;
+} PartitionPruneResult;
+
/* ----------------
* PlanState node
*
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 340d28f4e1..66416bce97 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -97,6 +97,9 @@ typedef enum NodeTag
T_PartitionPruneStepCombine,
T_PlanInvalItem,
+ /* TAGS FOR EXECUTOR PREP NODES (execnodes.h) */
+ T_PartitionPruneResult,
+
/*
* TAGS FOR PLAN STATE NODES (execnodes.h)
*
@@ -674,6 +677,7 @@ extern struct Bitmapset *readBitmapset(void);
extern uintptr_t readDatum(bool typbyval);
extern bool *readBoolCols(int numCols);
extern int *readIntCols(int numCols);
+extern Index *readIndexCols(int numCols);
extern Oid *readOidCols(int numCols);
extern int16 *readAttrNumberCols(int numCols);
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index c5ab53e05c..11007cda25 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -107,6 +107,18 @@ typedef struct PlannerGlobal
List *appendRelations; /* "flat" list of AppendRelInfos */
+ List *partPruneInfos; /* List of PartitionPruneInfo contained in
+ * the plan */
+
+ bool containsInitialPruning; /* Do any of those PartitionPruneInfos
+ * have initial (pre-exec) pruning
+ * steps in them? */
+
+ Bitmapset *minLockRelids; /* Indexes of all range table entries minus
+ * indexes of range table entries of the leaf
+ * partitions scanned by prunable subplans;
+ * see AcquireExecutorLocks() */
+
List *relationOids; /* OIDs of relations the plan depends on */
List *invalItems; /* other dependencies, as PlanInvalItems */
@@ -377,6 +389,9 @@ struct PlannerInfo
/* Does this query modify any partition key columns? */
bool partColsUpdated;
+
+ /* PartitionPruneInfos added in this query's plan. */
+ List *partPruneInfos;
};
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index e43e360d9b..f8f3971f44 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -64,8 +64,20 @@ typedef struct PlannedStmt
struct Plan *planTree; /* tree of Plan nodes */
+ List *partPruneInfos; /* List of PartitionPruneInfo contained in
+ * the plan */
+
+ bool containsInitialPruning; /* Do any of those PartitionPruneInfos
+ * have initial (pre-exec) pruning
+ * steps in them? */
+
List *rtable; /* list of RangeTblEntry nodes */
+ Bitmapset *minLockRelids; /* Indexes of all range table entries minus
+ * indexes of range table entries of the leaf
+ * partitions scanned by prunable subplans;
+ * see AcquireExecutorLocks() */
+
/* rtable indexes of target relations for INSERT/UPDATE/DELETE */
List *resultRelations; /* integer list of RT indexes, or NIL */
@@ -262,8 +274,8 @@ typedef struct Append
*/
int first_partial_plan;
- /* Info for run-time subplan pruning; NULL if we're not doing that */
- struct PartitionPruneInfo *part_prune_info;
+ /* Index to PlannerInfo.partPruneInfos or -1 if no run-time pruning */
+ int part_prune_index;
} Append;
/* ----------------
@@ -282,8 +294,9 @@ typedef struct MergeAppend
Oid *sortOperators; /* OIDs of operators to sort them by */
Oid *collations; /* OIDs of collations */
bool *nullsFirst; /* NULLS FIRST/LAST directions */
- /* Info for run-time subplan pruning; NULL if we're not doing that */
- struct PartitionPruneInfo *part_prune_info;
+
+ /* Index to PlannerInfo.partPruneInfos or -1 if no run-time pruning */
+ int part_prune_index;
} MergeAppend;
/* ----------------
@@ -1191,6 +1204,13 @@ typedef struct PlanRowMark
* prune_infos List of Lists containing PartitionedRelPruneInfo nodes,
* one sublist per run-time-prunable partition hierarchy
* appearing in the parent plan node's subplans.
+ *
+ * needs_init_pruning Does any of the PartitionedRelPruneInfos in
+ * prune_infos have its initial_pruning_steps set?
+ *
+ * needs_exec_pruning Does any of the PartitionedRelPruneInfos in
+ * prune_infos have its exec_pruning_steps set?
+ *
* other_subplans Indexes of any subplans that are not accounted for
* by any of the PartitionedRelPruneInfo nodes in
* "prune_infos". These subplans must not be pruned.
@@ -1199,6 +1219,8 @@ typedef struct PartitionPruneInfo
{
NodeTag type;
List *prune_infos;
+ bool needs_init_pruning;
+ bool needs_exec_pruning;
Bitmapset *other_subplans;
} PartitionPruneInfo;
@@ -1229,6 +1251,7 @@ typedef struct PartitionedRelPruneInfo
int *subplan_map; /* subplan index by partition index, or -1 */
int *subpart_map; /* subpart index by partition index, or -1 */
Oid *relid_map; /* relation OID by partition index, or 0 */
+ Index *rti_map; /* range table index by partition index, or 0 */
/*
* initial_pruning_steps shows how to prune during executor startup (i.e.,
diff --git a/src/include/partitioning/partprune.h b/src/include/partitioning/partprune.h
index 90684efa25..ebf0dcff8c 100644
--- a/src/include/partitioning/partprune.h
+++ b/src/include/partitioning/partprune.h
@@ -70,10 +70,10 @@ typedef struct PartitionPruneContext
#define PruneCxtStateIdx(partnatts, step_id, keyno) \
((partnatts) * (step_id) + (keyno))
-extern PartitionPruneInfo *make_partition_pruneinfo(struct PlannerInfo *root,
- struct RelOptInfo *parentrel,
- List *subpaths,
- List *prunequal);
+extern int make_partition_pruneinfo(struct PlannerInfo *root,
+ struct RelOptInfo *parentrel,
+ List *subpaths,
+ List *prunequal);
extern Bitmapset *prune_append_rel_partitions(struct RelOptInfo *rel);
extern Bitmapset *get_matching_partitions(PartitionPruneContext *context,
List *pruning_steps);
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 95b99e3d25..449200b949 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -220,7 +220,8 @@ extern List *CachedPlanGetTargetList(CachedPlanSource *plansource,
extern CachedPlan *GetCachedPlan(CachedPlanSource *plansource,
ParamListInfo boundParams,
ResourceOwner owner,
- QueryEnvironment *queryEnv);
+ QueryEnvironment *queryEnv,
+ List **part_prune_result_list);
extern void ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner);
extern bool CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index aeddbdafe5..9f7727a837 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -138,6 +138,7 @@ typedef struct PortalData
QueryCompletion qc; /* command completion data for executed query */
List *stmts; /* list of PlannedStmts */
CachedPlan *cplan; /* CachedPlan, if stmts are from one */
+ List *part_prune_results; /* list of PartitionPruneResults */
ParamListInfo portalParams; /* params to pass to query */
QueryEnvironment *queryEnv; /* environment for query */
@@ -242,6 +243,8 @@ extern void PortalDefineQuery(Portal portal,
CommandTag commandTag,
List *stmts,
CachedPlan *cplan);
+extern void PortalStorePartitionPruneResults(Portal portal,
+ List *part_prune_result_list);
extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
extern void PortalCreateHoldStore(Portal portal);
extern void PortalHashTableDeleteAll(void);
--
2.24.1
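For readers skimming the plancache.c hunks above, the locking scheme can be sketched in a few lines. This is only a toy model, not code from the patch: RelidSet, lock_count, relids_to_lock(), acquire_locks() and release_locks() are hypothetical names, a single 64-bit word stands in for a Bitmapset, and a counter array stands in for the lock manager. It shows the two ideas the patch relies on: the set to lock is minLockRelids plus the leaf partitions that survive initial pruning, and the set actually locked is remembered so that the release step undoes exactly those locks.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_RTI 64

/* Toy stand-in for a Bitmapset: bit i set means RT index i is a member. */
typedef uint64_t RelidSet;

/* Toy lock table indexed by RT index (hypothetical, not the lock manager). */
static int lock_count[MAX_RTI];

/*
 * Relations to lock: the always-needed set, plus the leaf partitions that
 * survived initial pruning when the plan contains initial pruning steps.
 */
static RelidSet
relids_to_lock(RelidSet min_lock_relids, RelidSet scan_leafpart_rtis,
               int contains_initial_pruning)
{
    if (!contains_initial_pruning)
        return min_lock_relids;
    return min_lock_relids | scan_leafpart_rtis;
}

/* "Lock" every member and return the set that was actually locked. */
static RelidSet
acquire_locks(RelidSet to_lock)
{
    RelidSet locked = 0;

    for (int rti = 1; rti < MAX_RTI; rti++)
    {
        RelidSet bit = (RelidSet) 1 << rti;

        if (to_lock & bit)
        {
            lock_count[rti]++;
            locked |= bit;
        }
    }
    return locked;
}

/* Release exactly the locks recorded by an earlier acquire_locks() call. */
static void
release_locks(RelidSet locked)
{
    for (int rti = 1; rti < MAX_RTI; rti++)
    {
        if (locked & ((RelidSet) 1 << rti))
        {
            assert(lock_count[rti] > 0);
            lock_count[rti]--;
        }
    }
}
```

A caller would compute relids_to_lock() once per statement, keep the value returned by acquire_locks(), and hand that same value to release_locks() if the plan turns out to be invalid. Pruned partitions never appear in either set.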
* Re: generic plans and "initial" pruning
@ 2022-04-11 03:05 Amit Langote <[email protected]>
parent: Amit Langote <[email protected]>
0 siblings, 1 reply; 71+ messages in thread
From: Amit Langote @ 2022-04-11 03:05 UTC (permalink / raw)
To: David Rowley <[email protected]>; +Cc: Robert Haas <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers
On Fri, Apr 8, 2022 at 8:45 PM Amit Langote <[email protected]> wrote:
> Most changes looked fine to me except a couple of typos, so I've
> adopted those into the attached new version, even though I know it's
> too late to try to apply it.
>
> + * XXX is it worth doing a bms_copy() on glob->minLockRelids if
> + * glob->containsInitialPruning is true? I'm slightly worried that the
> + * Bitmapset could have a very long empty tail resulting in excessive
> + * looping during AcquireExecutorLocks().
> + */
>
> I guess I trust your instincts about bitmapset operation efficiency
> and what you've written here makes sense. It's typical for leaf
> partitions to have been appended toward the tail end of rtable and I'd
> imagine their indexes would be in the tail words of minLockRelids. If
> copying the bitmapset removes those useless words, I don't see why we
> shouldn't do that. So added:
>
> + /*
> + * It seems worth doing a bms_copy() on glob->minLockRelids if we deleted
> + * a bit from it just above, to prevent empty tail bits from resulting in
> + * inefficient looping during AcquireExecutorLocks().
> + */
> + if (glob->containsInitialPruning)
> + glob->minLockRelids = bms_copy(glob->minLockRelids)
>
> Not 100% sure about the comment I wrote.
And the quoted code change missed a semicolon in the v14 that I
hurriedly sent on Friday. (I had apparently forgotten to `git add` the
hunk that fixes that.)
Sending v15 that fixes that to keep the cfbot green for now.
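
The "long empty tail" concern quoted above can be illustrated with a toy multi-word bitmap. This is a sketch under stated assumptions, not PostgreSQL's Bitmapset: ToyBitmapset and the toy_* functions are hypothetical, and toy_trim() only models the effect the quoted comment hopes a copy would have. Whether bms_copy() itself drops trailing zero words is not established in this message. In the toy, deleting a high-numbered member clears its bit but leaves the word count unchanged, so a scan still visits every trailing empty word until the set is trimmed.

```c
#include <assert.h>
#include <stdint.h>

#define BITS_PER_WORD 64
#define MAX_WORDS 8

/* Toy multi-word bitmap (hypothetical; not PostgreSQL's Bitmapset). */
typedef struct ToyBitmapset
{
    int      nwords;            /* number of words currently in use */
    uint64_t words[MAX_WORDS];
} ToyBitmapset;

static void
toy_add_member(ToyBitmapset *s, int x)
{
    int w = x / BITS_PER_WORD;

    assert(x >= 0 && w < MAX_WORDS);
    if (w >= s->nwords)
        s->nwords = w + 1;
    s->words[w] |= (uint64_t) 1 << (x % BITS_PER_WORD);
}

/* Clears the bit but, in this toy, never shrinks nwords. */
static void
toy_del_member(ToyBitmapset *s, int x)
{
    int w = x / BITS_PER_WORD;

    assert(x >= 0);
    if (w < s->nwords)
        s->words[w] &= ~((uint64_t) 1 << (x % BITS_PER_WORD));
}

/* Counts members and reports how many words the scan had to visit. */
static int
toy_num_members(const ToyBitmapset *s, int *words_visited)
{
    int count = 0;

    *words_visited = 0;
    for (int w = 0; w < s->nwords; w++)
    {
        uint64_t word = s->words[w];

        (*words_visited)++;
        while (word != 0)
        {
            word &= word - 1;   /* clear the lowest set bit */
            count++;
        }
    }
    return count;
}

/* Drops trailing all-zero words so later scans stop earlier. */
static void
toy_trim(ToyBitmapset *s)
{
    while (s->nwords > 0 && s->words[s->nwords - 1] == 0)
        s->nwords--;
}
```

With members 1, 2 and 300 added and 300 then deleted, a scan of the toy set visits five words to find two members. After toy_trim() it visits one.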
--
Amit Langote
EDB: http://www.enterprisedb.com
Attachments:
[application/octet-stream] v15-0001-Optimize-AcquireExecutorLocks-to-skip-pruned-par.patch (99.3K, 2-v15-0001-Optimize-AcquireExecutorLocks-to-skip-pruned-par.patch)
From e974c27abda9c53744b93f2c6e0f1083ddeedbba Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Wed, 22 Dec 2021 16:55:17 +0900
Subject: [PATCH v15] Optimize AcquireExecutorLocks() to skip pruned partitions
---
src/backend/commands/copyto.c | 2 +-
src/backend/commands/createas.c | 2 +-
src/backend/commands/explain.c | 7 +-
src/backend/commands/extension.c | 2 +-
src/backend/commands/matview.c | 2 +-
src/backend/commands/prepare.c | 26 ++-
src/backend/executor/README | 27 +++
src/backend/executor/execMain.c | 48 +++++
src/backend/executor/execParallel.c | 28 ++-
src/backend/executor/execPartition.c | 241 ++++++++++++++++++++----
src/backend/executor/execUtils.c | 2 +
src/backend/executor/functions.c | 2 +-
src/backend/executor/nodeAppend.c | 15 +-
src/backend/executor/nodeMergeAppend.c | 9 +-
src/backend/executor/spi.c | 27 ++-
src/backend/nodes/copyfuncs.c | 33 +++-
src/backend/nodes/outfuncs.c | 36 +++-
src/backend/nodes/readfuncs.c | 56 +++++-
src/backend/optimizer/plan/createplan.c | 24 +--
src/backend/optimizer/plan/planner.c | 3 +
src/backend/optimizer/plan/setrefs.c | 112 ++++++++---
src/backend/partitioning/partprune.c | 59 +++++-
src/backend/tcop/postgres.c | 8 +-
src/backend/tcop/pquery.c | 28 ++-
src/backend/utils/cache/plancache.c | 184 +++++++++++++++---
src/backend/utils/mmgr/portalmem.c | 19 ++
src/include/commands/explain.h | 4 +-
src/include/executor/execPartition.h | 12 +-
src/include/executor/execdesc.h | 3 +
src/include/executor/executor.h | 2 +
src/include/nodes/execnodes.h | 30 +++
src/include/nodes/nodes.h | 4 +
src/include/nodes/pathnodes.h | 15 ++
src/include/nodes/plannodes.h | 31 ++-
src/include/partitioning/partprune.h | 8 +-
src/include/utils/plancache.h | 3 +-
src/include/utils/portal.h | 3 +
37 files changed, 950 insertions(+), 167 deletions(-)
diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index 55c38b04c4..d403eb2309 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -542,7 +542,7 @@ BeginCopyTo(ParseState *pstate,
((DR_copy *) dest)->cstate = cstate;
/* Create a QueryDesc requesting no output */
- cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ cstate->queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
GetActiveSnapshot(),
InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 9abbb6b555..f6607f2454 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -325,7 +325,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index d2a2479822..35dd24adf8 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -407,7 +407,7 @@ ExplainOneQuery(Query *query, int cursorOptions,
}
/* run it (if needed) and produce output */
- ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+ ExplainOnePlan(plan, NULL, into, es, queryString, params, queryEnv,
&planduration, (es->buffers ? &bufusage : NULL));
}
}
@@ -515,7 +515,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
* to call it.
*/
void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, PartitionPruneResult *part_prune_result,
+ IntoClause *into, ExplainState *es,
const char *queryString, ParamListInfo params,
QueryEnvironment *queryEnv, const instr_time *planduration,
const BufferUsage *bufusage)
@@ -563,7 +564,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
dest = None_Receiver;
/* Create a QueryDesc for the query */
- queryDesc = CreateQueryDesc(plannedstmt, queryString,
+ queryDesc = CreateQueryDesc(plannedstmt, part_prune_result, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, instrument_option);
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index 1013790dbb..54734a3a93 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -776,7 +776,7 @@ execute_sql_string(const char *sql)
{
QueryDesc *qdesc;
- qdesc = CreateQueryDesc(stmt,
+ qdesc = CreateQueryDesc(stmt, NULL,
sql,
GetActiveSnapshot(), NULL,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 9ab248d25e..2be1782bc4 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -416,7 +416,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, queryString,
+ queryDesc = CreateQueryDesc(plan, NULL, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 80738547ed..c7360712b1 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -155,6 +155,7 @@ ExecuteQuery(ParseState *pstate,
PreparedStatement *entry;
CachedPlan *cplan;
List *plan_list;
+ List *part_prune_result_list;
ParamListInfo paramLI = NULL;
EState *estate = NULL;
Portal portal;
@@ -193,7 +194,10 @@ ExecuteQuery(ParseState *pstate,
entry->plansource->query_string);
/* Replan if needed, and increment plan refcount for portal */
- cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL);
+ cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL,
+ &part_prune_result_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_result_list));
plan_list = cplan->stmt_list;
/*
@@ -207,6 +211,9 @@ ExecuteQuery(ParseState *pstate,
plan_list,
cplan);
+ /* Copy PartitionPruneResults into the portal's context. */
+ PortalStorePartitionPruneResults(portal, part_prune_result_list);
+
/*
* For CREATE TABLE ... AS EXECUTE, we must verify that the prepared
* statement is one that produces tuples. Currently we insist that it be
@@ -576,7 +583,9 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
const char *query_string;
CachedPlan *cplan;
List *plan_list;
- ListCell *p;
+ List *part_prune_result_list;
+ ListCell *p,
+ *pp;
ParamListInfo paramLI = NULL;
EState *estate = NULL;
instr_time planstart;
@@ -619,7 +628,10 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
/* Replan if needed, and acquire a transient refcount */
cplan = GetCachedPlan(entry->plansource, paramLI,
- CurrentResourceOwner, queryEnv);
+ CurrentResourceOwner, queryEnv,
+ &part_prune_result_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_result_list));
INSTR_TIME_SET_CURRENT(planduration);
INSTR_TIME_SUBTRACT(planduration, planstart);
@@ -634,13 +646,15 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
plan_list = cplan->stmt_list;
/* Explain each query */
- foreach(p, plan_list)
+ forboth(p, plan_list, pp, part_prune_result_list)
{
PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
+ PartitionPruneResult *part_prune_result = lfirst_node(PartitionPruneResult, pp);
if (pstmt->commandType != CMD_UTILITY)
- ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv,
- &planduration, (es->buffers ? &bufusage : NULL));
+ ExplainOnePlan(pstmt, part_prune_result, into, es, query_string,
+ paramLI, queryEnv, &planduration,
+ (es->buffers ? &bufusage : NULL));
else
ExplainOneUtility(pstmt->utilityStmt, into, es, query_string,
paramLI, queryEnv);
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 0b5183fc4a..e0802be723 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -65,6 +65,29 @@ found there. This currently only occurs for Append and MergeAppend nodes. In
this case the non-required subplans are ignored and the executor state's
subnode array will become out of sequence to the plan's subplan list.
+Actually, the so-called execution time pruning may also occur even before the
+execution has started. One case where that occurs is when a cached generic
+plan is being validated for execution by plancache.c: GetCachedPlan(), which
+proceeds by locking all the relations that will be scanned by that plan. If
+the generic plan contains nodes that can perform execution time partition
+pruning (that is, contain a PartitionPruneInfo), a subset of pruning steps
+contained in the PartitionPruneInfos that do not depend on execution actually
+having started (called "initial" pruning steps) are performed at this point
+to figure out the minimal set of child subplans that satisfy those pruning
+instructions. AcquireExecutorLocks() looking at a particular plan will then
+lock only the relations scanned by those surviving subplans (along with those
+present in PlannedStmt.minLockRelids), and ignore those scanned by the pruned
+subplans, even though the pruned subplans themselves are not removed from the
+plan tree. The result of pruning (that is, the set of indexes of surviving
+subplans in their parent's list of child subplans) is saved as a list of
+bitmapsets, with one element for every PartitionPruneInfo referenced in the
+plan (PlannedStmt.partPruneInfos). The list is packaged into a
+PartitionPruneResult node, which is passed along with the PlannedStmt to the
+executor via the QueryDesc. It is imperative that the executor and any third
+party code invoked by it that gets passed the plan tree look at the plan's
+PartitionPruneResult to determine whether a particular child subplan of a
+parent node that supports pruning is valid for a given execution.
+
Each Plan node may have expression trees associated with it, to represent
its target list, qualification conditions, etc. These trees are also
read-only to the executor, but the executor state for expression evaluation
@@ -286,6 +309,10 @@ Query Processing Control Flow
This is a sketch of control flow for full query processing:
+ [ ExecutorDoInitialPruning ] --- an optional step to perform initial
+ partition pruning on the plan tree the result of which is passed
+ to the executor via QueryDesc
+
CreateQueryDesc
ExecutorStart
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index ef2fd46092..5ee978937d 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -49,11 +49,13 @@
#include "commands/matview.h"
#include "commands/trigger.h"
#include "executor/execdebug.h"
+#include "executor/execPartition.h"
#include "executor/nodeSubplan.h"
#include "foreign/fdwapi.h"
#include "jit/jit.h"
#include "mb/pg_wchar.h"
#include "miscadmin.h"
+#include "nodes/nodeFuncs.h"
#include "parser/parsetree.h"
#include "storage/bufmgr.h"
#include "storage/lmgr.h"
@@ -104,6 +106,49 @@ static void EvalPlanQualStart(EPQState *epqstate, Plan *planTree);
/* end of local decls */
+/* ----------------------------------------------------------------
+ * ExecutorDoInitialPruning
+ *
+ * Performs initial partition pruning to figure out the minimal set of
+ * subplans to be executed and the set of RT indexes of the corresponding
+ * leaf partitions
+ *
+ * The returned PartitionPruneResult must subsequently be passed to the
+ * executor so that it can reuse the result of pruning. It's important that
+ * the executor has the same view of which partitions are initially pruned (by
+ * not doing the pruning again itself); otherwise it risks initializing
+ * subplans whose partitions would not have been locked.
+ *
+ * Note: Partitioned tables mentioned in PartitionedRelPruneInfo nodes that
+ * drive the pruning will be locked before doing the pruning.
+ *
+ * ----------------------------------------------------------------
+ */
+PartitionPruneResult *
+ExecutorDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params)
+{
+ PartitionPruneResult *result;
+ ListCell *lc;
+
+ /* Only get here if there is any pruning to do. */
+ Assert(plannedstmt->containsInitialPruning);
+
+ result = makeNode(PartitionPruneResult);
+ foreach(lc, plannedstmt->partPruneInfos)
+ {
+ PartitionPruneInfo *pruneinfo = lfirst(lc);
+ Bitmapset *valid_subplan_offs;
+
+ valid_subplan_offs =
+ ExecPartitionDoInitialPruning(plannedstmt, params, pruneinfo,
+ &result->scan_leafpart_rtis);
+ result->valid_subplan_offs_list =
+ lappend(result->valid_subplan_offs_list,
+ valid_subplan_offs);
+ }
+
+ return result;
+}
/* ----------------------------------------------------------------
* ExecutorStart
@@ -806,6 +851,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
{
CmdType operation = queryDesc->operation;
PlannedStmt *plannedstmt = queryDesc->plannedstmt;
+ PartitionPruneResult *part_prune_result = queryDesc->part_prune_result;
Plan *plan = plannedstmt->planTree;
List *rangeTable = plannedstmt->rtable;
EState *estate = queryDesc->estate;
@@ -825,6 +871,8 @@ InitPlan(QueryDesc *queryDesc, int eflags)
ExecInitRangeTable(estate, rangeTable);
estate->es_plannedstmt = plannedstmt;
+ estate->es_part_prune_infos = plannedstmt->partPruneInfos;
+ estate->es_part_prune_result = part_prune_result;
/*
* Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index 9a0d5d59ef..805f86c503 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -66,6 +66,7 @@
#define PARALLEL_KEY_QUERY_TEXT UINT64CONST(0xE000000000000008)
#define PARALLEL_KEY_JIT_INSTRUMENTATION UINT64CONST(0xE000000000000009)
#define PARALLEL_KEY_WAL_USAGE UINT64CONST(0xE00000000000000A)
+#define PARALLEL_KEY_PARTITIONPRUNERESULT UINT64CONST(0xE00000000000000B)
#define PARALLEL_TUPLE_QUEUE_SIZE 65536
@@ -182,7 +183,9 @@ ExecSerializePlan(Plan *plan, EState *estate)
pstmt->transientPlan = false;
pstmt->dependsOnRole = false;
pstmt->parallelModeNeeded = false;
+ pstmt->containsInitialPruning = false;
pstmt->planTree = plan;
+ pstmt->partPruneInfos = estate->es_part_prune_infos;
pstmt->rtable = estate->es_range_table;
pstmt->resultRelations = NIL;
pstmt->appendRelations = NIL;
@@ -596,12 +599,15 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
FixedParallelExecutorState *fpes;
char *pstmt_data;
char *pstmt_space;
+ char *part_prune_result_data;
+ char *part_prune_result_space;
char *paramlistinfo_space;
BufferUsage *bufusage_space;
WalUsage *walusage_space;
SharedExecutorInstrumentation *instrumentation = NULL;
SharedJitInstrumentation *jit_instrumentation = NULL;
int pstmt_len;
+ int part_prune_result_len;
int paramlistinfo_len;
int instrumentation_len = 0;
int jit_instrumentation_len = 0;
@@ -630,6 +636,7 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
/* Fix up and serialize plan to be sent to workers. */
pstmt_data = ExecSerializePlan(planstate->plan, estate);
+ part_prune_result_data = nodeToString(estate->es_part_prune_result);
/* Create a parallel context. */
pcxt = CreateParallelContext("postgres", "ParallelQueryMain", nworkers);
@@ -656,6 +663,11 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
shm_toc_estimate_chunk(&pcxt->estimator, pstmt_len);
shm_toc_estimate_keys(&pcxt->estimator, 1);
+ /* Estimate space for serialized PartitionPruneResult. */
+ part_prune_result_len = strlen(part_prune_result_data) + 1;
+ shm_toc_estimate_chunk(&pcxt->estimator, part_prune_result_len);
+ shm_toc_estimate_keys(&pcxt->estimator, 1);
+
/* Estimate space for serialized ParamListInfo. */
paramlistinfo_len = EstimateParamListSpace(estate->es_param_list_info);
shm_toc_estimate_chunk(&pcxt->estimator, paramlistinfo_len);
@@ -750,6 +762,12 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
memcpy(pstmt_space, pstmt_data, pstmt_len);
shm_toc_insert(pcxt->toc, PARALLEL_KEY_PLANNEDSTMT, pstmt_space);
+ /* Store serialized PartitionPruneResult */
+ part_prune_result_space = shm_toc_allocate(pcxt->toc, part_prune_result_len);
+ memcpy(part_prune_result_space, part_prune_result_data, part_prune_result_len);
+ shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARTITIONPRUNERESULT,
+ part_prune_result_space);
+
/* Store serialized ParamListInfo. */
paramlistinfo_space = shm_toc_allocate(pcxt->toc, paramlistinfo_len);
shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARAMLISTINFO, paramlistinfo_space);
@@ -1231,8 +1249,10 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
int instrument_options)
{
char *pstmtspace;
+ char *part_prune_result_space;
char *paramspace;
PlannedStmt *pstmt;
+ PartitionPruneResult *part_prune_result;
ParamListInfo paramLI;
char *queryString;
@@ -1243,12 +1263,18 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
pstmtspace = shm_toc_lookup(toc, PARALLEL_KEY_PLANNEDSTMT, false);
pstmt = (PlannedStmt *) stringToNode(pstmtspace);
+ /* Reconstruct leader-supplied PartitionPruneResult. */
+ part_prune_result_space =
+ shm_toc_lookup(toc, PARALLEL_KEY_PARTITIONPRUNERESULT, false);
+ part_prune_result = (PartitionPruneResult *)
+ stringToNode(part_prune_result_space);
+
/* Reconstruct ParamListInfo. */
paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
paramLI = RestoreParamList(&paramspace);
/* Create a QueryDesc for the query. */
- return CreateQueryDesc(pstmt,
+ return CreateQueryDesc(pstmt, part_prune_result,
queryString,
GetActiveSnapshot(), InvalidSnapshot,
receiver, paramLI, NULL, instrument_options);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 615bd80973..af87b9197f 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -25,6 +25,7 @@
#include "mb/pg_wchar.h"
#include "miscadmin.h"
#include "nodes/makefuncs.h"
+#include "parser/parsetree.h"
#include "partitioning/partbounds.h"
#include "partitioning/partdesc.h"
#include "partitioning/partprune.h"
@@ -185,7 +186,11 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
static PartitionPruneState *CreatePartitionPruneState(PlanState *planstate,
- PartitionPruneInfo *pruneinfo);
+ PartitionPruneInfo *pruneinfo,
+ bool consider_initial_steps,
+ bool consider_exec_steps,
+ List *rtable, ExprContext *econtext,
+ PartitionDirectory partdir);
static void InitPartitionPruneContext(PartitionPruneContext *context,
List *pruning_steps,
PartitionDesc partdesc,
@@ -198,7 +203,8 @@ static void PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
- Bitmapset **validsubplans);
+ Bitmapset **validsubplans,
+ Bitmapset **scan_leafpart_rtis);
/*
@@ -1587,8 +1593,10 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
* considered to be a stable expression, it can change value from one plan
* node scan to the next during query execution. Stable comparison
* expressions that don't involve such Params allow partition pruning to be
- * done once during executor startup. Expressions that do involve such Params
- * require us to prune separately for each scan of the parent plan node.
+ * done once, either during executor startup or in ExecutorDoInitialPruning(),
+ * which runs as part of AcquireExecutorLocks() on a given plan tree.
+ * Expressions that do involve such Params require us to prune separately for
+ * each scan of the parent plan node.
*
* Note that pruning away unneeded subplans during executor startup has the
* added benefit of not having to initialize the unneeded subplans at all.
@@ -1605,6 +1613,13 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
* account for initial pruning possibly having eliminated some of the
* subplans.
*
+ * ExecPartitionDoInitialPruning:
+ * Do initial pruning with the information contained in a given
+ * PartitionPruneInfo to determine the minimal set of child subplans of
+ * the parent plan node (the one to which the PartitionPruneInfo belongs)
+ * that must be executed, and also the set of RT indexes of the leaf
+ * partitions that those subplans will scan.
+ *
* ExecFindMatchingSubPlans:
* Returns indexes of matching subplans after evaluating the expressions
* that are safe to evaluate at a given point. This function is first
@@ -1622,8 +1637,9 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
*
* On return, *initially_valid_subplans is assigned the set of indexes of
* child subplans that must be initialized along with the parent plan node.
- * Initial pruning is performed here if needed and in that case only the
- * surviving subplans' indexes are added.
+ * Initial pruning is performed here if needed (unless it has already been done
+ * by ExecutorDoInitialPruning()), and in that case only the surviving
+ * subplans' indexes are added.
*
* If subplans are indeed pruned, subplan_map arrays contained in the returned
* PartitionPruneState are re-sequenced to not count those, though only if the
@@ -1632,29 +1648,66 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
PartitionPruneState *
ExecInitPartitionPruning(PlanState *planstate,
int n_total_subplans,
- PartitionPruneInfo *pruneinfo,
+ int part_prune_index,
Bitmapset **initially_valid_subplans)
{
- PartitionPruneState *prunestate;
EState *estate = planstate->state;
+ PartitionPruneInfo *pruneinfo = list_nth(estate->es_part_prune_infos,
+ part_prune_index);
+ PartitionPruneResult *pruneresult = estate->es_part_prune_result;
+ PartitionPruneState *prunestate;
+ bool do_pruning = (pruneinfo->needs_init_pruning ||
+ pruneinfo->needs_exec_pruning);
- /* We may need an expression context to evaluate partition exprs */
- ExecAssignExprContext(estate, planstate);
+ /*
+ * No need to do initial pruning if it was done already by
+ * ExecutorDoInitialPruning(), which it would be if es_part_prune_result
+ * has been set.
+ */
+ if (pruneresult)
+ do_pruning = pruneinfo->needs_exec_pruning;
+
+ prunestate = NULL;
+ if (do_pruning)
+ {
+ /* We may need an expression context to evaluate partition exprs */
+ ExecAssignExprContext(estate, planstate);
- /* Create the working data structure for pruning */
- prunestate = CreatePartitionPruneState(planstate, pruneinfo);
+ /* For data reading, executor always omits detached partitions */
+ if (estate->es_partition_directory == NULL)
+ estate->es_partition_directory =
+ CreatePartitionDirectory(estate->es_query_cxt, false);
+
+ /*
+ * Create the working data structure for pruning. No need to consider
+ * initial pruning steps if we have a PartitionPruneResult.
+ */
+ prunestate = CreatePartitionPruneState(planstate, pruneinfo,
+ pruneresult == NULL, true,
+ NIL, planstate->ps_ExprContext,
+ estate->es_partition_directory);
+ }
/*
* Perform an initial partition prune pass, if required.
*/
- if (prunestate->do_initial_prune)
- *initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true);
+ if (pruneresult)
+ {
+ *initially_valid_subplans =
+ list_nth(pruneresult->valid_subplan_offs_list, part_prune_index);
+ }
+ else if (prunestate && prunestate->do_initial_prune)
+ {
+ *initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true,
+ NULL);
+ }
else
{
/* No pruning, so we'll need to initialize all subplans */
Assert(n_total_subplans > 0);
*initially_valid_subplans = bms_add_range(NULL, 0,
n_total_subplans - 1);
+ return prunestate;
}
/*
@@ -1662,7 +1715,8 @@ ExecInitPartitionPruning(PlanState *planstate,
* that were removed above due to initial pruning. No need to do this if
* no steps were removed.
*/
- if (bms_num_members(*initially_valid_subplans) < n_total_subplans)
+ if (prunestate &&
+ bms_num_members(*initially_valid_subplans) < n_total_subplans)
{
/*
* We can safely skip this when !do_exec_prune, even though that
@@ -1678,11 +1732,74 @@ ExecInitPartitionPruning(PlanState *planstate,
return prunestate;
}
+/*
+ * ExecPartitionDoInitialPruning
+ * Perform initial pruning using the given PartitionPruneInfo to determine
+ * the minimal set of child subplans of its parent plan node that must be
+ * executed, and also the set of RT indexes of the leaf partitions that
+ * those subplans will scan.
+ */
+Bitmapset *
+ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
+ PartitionPruneInfo *pruneinfo,
+ Bitmapset **scan_leafpart_rtis)
+{
+ List *rtable = plannedstmt->rtable;
+ ExprContext *econtext;
+ PartitionDirectory pdir;
+ MemoryContext oldcontext,
+ tmpcontext;
+ PartitionPruneState *prunestate;
+ Bitmapset *valid_subplan_offs;
+
+ /*
+ * A temporary context for memory allocations required while executing
+ * partition pruning steps.
+ */
+ tmpcontext = AllocSetContextCreate(CurrentMemoryContext,
+ "initial pruning working data",
+ ALLOCSET_DEFAULT_SIZES);
+ oldcontext = MemoryContextSwitchTo(tmpcontext);
+
+ /*
+ * PartitionDirectory to look up partition descriptors, which omits
+ * detached partitions, just like in the executor proper.
+ */
+ pdir = CreatePartitionDirectory(CurrentMemoryContext, false);
+
+ /*
+ * We don't yet have a PlanState for the parent plan node, so we must
+ * create a standalone ExprContext to evaluate pruning expressions,
+ * equipped with the information about the EXTERN parameters that the
+ * caller passed us. Note that that's okay because the initial pruning
+ * steps do not contain anything that requires the execution to have
+ * started.
+ */
+ econtext = CreateStandaloneExprContext();
+ econtext->ecxt_param_list_info = params;
+ prunestate = CreatePartitionPruneState(NULL, pruneinfo, true, false,
+ rtable, econtext, pdir);
+ MemoryContextSwitchTo(oldcontext);
+
+ /* Do the initial pruning. */
+ valid_subplan_offs = ExecFindMatchingSubPlans(prunestate, true,
+ scan_leafpart_rtis);
+
+ FreeExprContext(econtext, true);
+ DestroyPartitionDirectory(pdir);
+ MemoryContextDelete(tmpcontext);
+
+ return valid_subplan_offs;
+}
+
/*
* CreatePartitionPruneState
* Build the data structure required for calling ExecFindMatchingSubPlans
*
- * 'planstate' is the parent plan node's execution state.
+ * 'planstate', if not NULL, is the parent plan node's execution state. It
+ * can be NULL if being called before ExecutorStart(), in which case,
+ * 'rtable' (range table), 'econtext', and 'partdir' must be explicitly
+ * provided.
*
* 'pruneinfo' is a PartitionPruneInfo as generated by
* make_partition_pruneinfo. Here we build a PartitionPruneState containing a
@@ -1696,19 +1813,21 @@ ExecInitPartitionPruning(PlanState *planstate,
* PartitionedRelPruneInfo.
*/
static PartitionPruneState *
-CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
+CreatePartitionPruneState(PlanState *planstate,
+ PartitionPruneInfo *pruneinfo,
+ bool consider_initial_steps,
+ bool consider_exec_steps,
+ List *rtable, ExprContext *econtext,
+ PartitionDirectory partdir)
{
- EState *estate = planstate->state;
+ EState *estate = planstate ? planstate->state : NULL;
PartitionPruneState *prunestate;
int n_part_hierarchies;
ListCell *lc;
int i;
- ExprContext *econtext = planstate->ps_ExprContext;
- /* For data reading, executor always omits detached partitions */
- if (estate->es_partition_directory == NULL)
- estate->es_partition_directory =
- CreatePartitionDirectory(estate->es_query_cxt, false);
+ Assert((estate != NULL) ||
+ (partdir != NULL && econtext != NULL && rtable != NIL));
n_part_hierarchies = list_length(pruneinfo->prune_infos);
Assert(n_part_hierarchies > 0);
@@ -1763,15 +1882,42 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
PartitionKey partkey;
/*
- * We can rely on the copies of the partitioned table's partition
- * key and partition descriptor appearing in its relcache entry,
- * because that entry will be held open and locked for the
- * duration of this executor run.
+ * Must open the relation by ourselves when called before the
+ * execution has started, such as when called during
+ * ExecutorDoInitialPruning() on a cached plan. In that case,
+ * sub-partitions must be locked, because AcquirePlannerLocks()
+ * would not have seen them. (1st relation in a partrelpruneinfos
+ * list is always the root partitioned table appearing in the
+ * query, which AcquirePlannerLocks() would have locked; the
+ * Assert in relation_open() guards that assumption.)
+ */
+ if (estate == NULL)
+ {
+ RangeTblEntry *rte = rt_fetch(pinfo->rtindex, rtable);
+ int lockmode = (j == 0) ? NoLock : rte->rellockmode;
+
+ partrel = table_open(rte->relid, lockmode);
+ }
+ else
+ partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
+
+ /*
+ * We can rely on the copy of the partitioned table's partition
+ * key in its relcache entry, because it can't change (or
+ * get destroyed) as long as the relation is locked. Partition
+ * descriptor is taken from the PartitionDirectory associated with
+ * the table that is held open long enough for the descriptor to
+ * remain valid while it's used to perform the pruning steps.
*/
- partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
partkey = RelationGetPartitionKey(partrel);
- partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
- partrel);
+ partdesc = PartitionDirectoryLookup(partdir, partrel);
+
+ /*
+ * Must close partrel, keeping the lock taken, if we're not using
+ * EState's entry.
+ */
+ if (estate == NULL)
+ table_close(partrel, NoLock);
/*
* Initialize the subplan_map and subpart_map.
@@ -1785,6 +1931,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
Assert(partdesc->nparts >= pinfo->nparts);
pprune->nparts = partdesc->nparts;
pprune->subplan_map = palloc(sizeof(int) * partdesc->nparts);
+ pprune->rti_map = palloc(sizeof(Index) * partdesc->nparts);
if (partdesc->nparts == pinfo->nparts)
{
/*
@@ -1795,6 +1942,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
pprune->subpart_map = pinfo->subpart_map;
memcpy(pprune->subplan_map, pinfo->subplan_map,
sizeof(int) * pinfo->nparts);
+ memcpy(pprune->rti_map, pinfo->rti_map,
+ sizeof(Index) * pinfo->nparts);
/*
* Double-check that the list of unpruned relations has not
@@ -1845,6 +1994,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
pinfo->subplan_map[pd_idx];
pprune->subpart_map[pp_idx] =
pinfo->subpart_map[pd_idx];
+ pprune->rti_map[pp_idx] =
+ pinfo->rti_map[pd_idx];
pd_idx++;
}
else
@@ -1852,6 +2003,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
/* this partdesc entry is not in the plan */
pprune->subplan_map[pp_idx] = -1;
pprune->subpart_map[pp_idx] = -1;
+ pprune->rti_map[pp_idx] = 0;
}
}
@@ -1873,7 +2025,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
* Initialize pruning contexts as needed.
*/
pprune->initial_pruning_steps = pinfo->initial_pruning_steps;
- if (pinfo->initial_pruning_steps)
+ if (consider_initial_steps && pinfo->initial_pruning_steps)
{
InitPartitionPruneContext(&pprune->initial_context,
pinfo->initial_pruning_steps,
@@ -1883,7 +2035,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
prunestate->do_initial_prune = true;
}
pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
- if (pinfo->exec_pruning_steps)
+ if (consider_exec_steps && pinfo->exec_pruning_steps)
{
InitPartitionPruneContext(&pprune->exec_context,
pinfo->exec_pruning_steps,
@@ -2111,10 +2263,14 @@ PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
* Pass initial_prune if PARAM_EXEC Params cannot yet be evaluated. This
* differentiates the initial executor-time pruning step from later
* runtime pruning.
+ *
+ * RT indexes of leaf partitions scanned by the chosen subplans are added to
+ * *scan_leafpart_rtis if the pointer is non-NULL.
*/
Bitmapset *
ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
- bool initial_prune)
+ bool initial_prune,
+ Bitmapset **scan_leafpart_rtis)
{
Bitmapset *result = NULL;
MemoryContext oldcontext;
@@ -2149,7 +2305,7 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
*/
pprune = &prunedata->partrelprunedata[0];
find_matching_subplans_recurse(prunedata, pprune, initial_prune,
- &result);
+ &result, scan_leafpart_rtis);
/* Expression eval may have used space in ExprContext too */
if (pprune->exec_pruning_steps)
@@ -2163,6 +2319,8 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
/* Copy result out of the temp context before we reset it */
result = bms_copy(result);
+ if (scan_leafpart_rtis)
+ *scan_leafpart_rtis = bms_copy(*scan_leafpart_rtis);
MemoryContextReset(prunestate->prune_context);
@@ -2173,13 +2331,15 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
* find_matching_subplans_recurse
* Recursive worker function for ExecFindMatchingSubPlans
*
- * Adds valid (non-prunable) subplan IDs to *validsubplans
+ * Adds valid (non-prunable) subplan IDs to *validsubplans and RT indexes of
+ * the corresponding leaf partitions to *scan_leafpart_rtis (if asked for).
*/
static void
find_matching_subplans_recurse(PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
- Bitmapset **validsubplans)
+ Bitmapset **validsubplans,
+ Bitmapset **scan_leafpart_rtis)
{
Bitmapset *partset;
int i;
@@ -2206,8 +2366,14 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
while ((i = bms_next_member(partset, i)) >= 0)
{
if (pprune->subplan_map[i] >= 0)
+ {
*validsubplans = bms_add_member(*validsubplans,
pprune->subplan_map[i]);
+ Assert(pprune->rti_map[i] > 0);
+ if (scan_leafpart_rtis)
+ *scan_leafpart_rtis = bms_add_member(*scan_leafpart_rtis,
+ pprune->rti_map[i]);
+ }
else
{
int partidx = pprune->subpart_map[i];
@@ -2215,7 +2381,8 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
if (partidx >= 0)
find_matching_subplans_recurse(prunedata,
&prunedata->partrelprunedata[partidx],
- initial_prune, validsubplans);
+ initial_prune, validsubplans,
+ scan_leafpart_rtis);
else
{
/*
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 9df1f81ea8..f9c7976ff2 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -119,6 +119,8 @@ CreateExecutorState(void)
estate->es_relations = NULL;
estate->es_rowmarks = NULL;
estate->es_plannedstmt = NULL;
+ estate->es_part_prune_infos = NIL;
+ estate->es_part_prune_result = NULL;
estate->es_junkFilter = NULL;
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index f9460ae506..a2182a6b1f 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -844,7 +844,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
else
dest = None_Receiver;
- es->qd = CreateQueryDesc(es->stmt,
+ es->qd = CreateQueryDesc(es->stmt, NULL,
fcache->src,
GetActiveSnapshot(),
InvalidSnapshot,
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 357e10a1d7..96880e122a 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -134,7 +134,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
appendstate->as_begun = false;
/* If run-time partition pruning is enabled, then set that up now */
- if (node->part_prune_info != NULL)
+ if (node->part_prune_index >= 0)
{
PartitionPruneState *prunestate;
@@ -145,7 +145,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
*/
prunestate = ExecInitPartitionPruning(&appendstate->ps,
list_length(node->appendplans),
- node->part_prune_info,
+ node->part_prune_index,
&validsubplans);
appendstate->as_prune_state = prunestate;
nplans = bms_num_members(validsubplans);
@@ -155,7 +155,8 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
* subplan, we can fill as_valid_subplans immediately, preventing
* later calls to ExecFindMatchingSubPlans.
*/
- if (!prunestate->do_exec_prune && nplans > 0)
+ if (appendstate->as_prune_state == NULL ||
+ (!appendstate->as_prune_state->do_exec_prune && nplans > 0))
appendstate->as_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
}
else
@@ -577,7 +578,7 @@ choose_next_subplan_locally(AppendState *node)
}
else if (node->as_valid_subplans == NULL)
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
whichplan = -1;
}
@@ -642,7 +643,7 @@ choose_next_subplan_for_leader(AppendState *node)
if (node->as_valid_subplans == NULL)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
/*
* Mark each invalid plan as finished to allow the loop below to
@@ -717,7 +718,7 @@ choose_next_subplan_for_worker(AppendState *node)
else if (node->as_valid_subplans == NULL)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
mark_invalid_subplans_as_finished(node);
}
@@ -868,7 +869,7 @@ ExecAppendAsyncBegin(AppendState *node)
if (node->as_valid_subplans == NULL)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
classify_matching_subplans(node);
}
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index ecf9052e03..7708cfffda 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -82,7 +82,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
mergestate->ps.ExecProcNode = ExecMergeAppend;
/* If run-time partition pruning is enabled, then set that up now */
- if (node->part_prune_info != NULL)
+ if (node->part_prune_index >= 0)
{
PartitionPruneState *prunestate;
@@ -93,7 +93,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
*/
prunestate = ExecInitPartitionPruning(&mergestate->ps,
list_length(node->mergeplans),
- node->part_prune_info,
+ node->part_prune_index,
&validsubplans);
mergestate->ms_prune_state = prunestate;
nplans = bms_num_members(validsubplans);
@@ -103,7 +103,8 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
* subplan, we can fill as_valid_subplans immediately, preventing
* later calls to ExecFindMatchingSubPlans.
*/
- if (!prunestate->do_exec_prune && nplans > 0)
+ if (mergestate->ms_prune_state == NULL ||
+ (!mergestate->ms_prune_state->do_exec_prune && nplans > 0))
mergestate->ms_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
}
else
@@ -218,7 +219,7 @@ ExecMergeAppend(PlanState *pstate)
*/
if (node->ms_valid_subplans == NULL)
node->ms_valid_subplans =
- ExecFindMatchingSubPlans(node->ms_prune_state, false);
+ ExecFindMatchingSubPlans(node->ms_prune_state, false, NULL);
/*
* First time through: pull the first tuple from each valid subplan,
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 042a5f8b0a..729e2fd7b2 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1578,6 +1578,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
CachedPlanSource *plansource;
CachedPlan *cplan;
List *stmt_list;
+ List *part_prune_result_list;
char *query_string;
Snapshot snapshot;
MemoryContext oldcontext;
@@ -1657,7 +1658,10 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
*/
/* Replan if needed, and increment plan refcount for portal */
- cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv);
+ cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv,
+ &part_prune_result_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_result_list));
stmt_list = cplan->stmt_list;
if (!plan->saved)
@@ -1685,6 +1689,9 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
stmt_list,
cplan);
+ /* Copy PartitionPruneResults into the portal's context. */
+ PortalStorePartitionPruneResults(portal, part_prune_result_list);
+
/*
* Set up options for portal. Default SCROLL type is chosen the same way
* as PerformCursorOpen does it.
@@ -2092,7 +2099,8 @@ SPI_plan_get_cached_plan(SPIPlanPtr plan)
/* Get the generic plan for the query */
cplan = GetCachedPlan(plansource, NULL,
plan->saved ? CurrentResourceOwner : NULL,
- _SPI_current->queryEnv);
+ _SPI_current->queryEnv,
+ NULL /* Not interested in PartitionPruneResults */);
Assert(cplan == plansource->gplan);
/* Pop the error context stack */
@@ -2473,7 +2481,9 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
{
CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
List *stmt_list;
- ListCell *lc2;
+ List *part_prune_result_list;
+ ListCell *lc2,
+ *lc3;
spicallbackarg.query = plansource->query_string;
@@ -2549,8 +2559,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
* plan, the refcount must be backed by the plan_owner.
*/
cplan = GetCachedPlan(plansource, options->params,
- plan_owner, _SPI_current->queryEnv);
-
+ plan_owner, _SPI_current->queryEnv,
+ &part_prune_result_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_result_list));
stmt_list = cplan->stmt_list;
/*
@@ -2589,9 +2601,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
}
}
- foreach(lc2, stmt_list)
+ forboth(lc2, stmt_list, lc3, part_prune_result_list)
{
PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+ PartitionPruneResult *part_prune_result = lfirst_node(PartitionPruneResult, lc3);
bool canSetTag = stmt->canSetTag;
DestReceiver *dest;
@@ -2663,7 +2676,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
else
snap = InvalidSnapshot;
- qdesc = CreateQueryDesc(stmt,
+ qdesc = CreateQueryDesc(stmt, part_prune_result,
plansource->query_string,
snap, crosscheck_snapshot,
dest,
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 836f427ea8..59a7054011 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -96,7 +96,10 @@ _copyPlannedStmt(const PlannedStmt *from)
COPY_SCALAR_FIELD(parallelModeNeeded);
COPY_SCALAR_FIELD(jitFlags);
COPY_NODE_FIELD(planTree);
+ COPY_NODE_FIELD(partPruneInfos);
+ COPY_SCALAR_FIELD(containsInitialPruning);
COPY_NODE_FIELD(rtable);
+ COPY_BITMAPSET_FIELD(minLockRelids);
COPY_NODE_FIELD(resultRelations);
COPY_NODE_FIELD(appendRelations);
COPY_NODE_FIELD(subplans);
@@ -253,7 +256,7 @@ _copyAppend(const Append *from)
COPY_NODE_FIELD(appendplans);
COPY_SCALAR_FIELD(nasyncplans);
COPY_SCALAR_FIELD(first_partial_plan);
- COPY_NODE_FIELD(part_prune_info);
+ COPY_SCALAR_FIELD(part_prune_index);
return newnode;
}
@@ -281,7 +284,7 @@ _copyMergeAppend(const MergeAppend *from)
COPY_POINTER_FIELD(sortOperators, from->numCols * sizeof(Oid));
COPY_POINTER_FIELD(collations, from->numCols * sizeof(Oid));
COPY_POINTER_FIELD(nullsFirst, from->numCols * sizeof(bool));
- COPY_NODE_FIELD(part_prune_info);
+ COPY_SCALAR_FIELD(part_prune_index);
return newnode;
}
@@ -1283,6 +1286,8 @@ _copyPartitionPruneInfo(const PartitionPruneInfo *from)
PartitionPruneInfo *newnode = makeNode(PartitionPruneInfo);
COPY_NODE_FIELD(prune_infos);
+ COPY_SCALAR_FIELD(needs_init_pruning);
+ COPY_SCALAR_FIELD(needs_exec_pruning);
COPY_BITMAPSET_FIELD(other_subplans);
return newnode;
@@ -1299,6 +1304,7 @@ _copyPartitionedRelPruneInfo(const PartitionedRelPruneInfo *from)
COPY_POINTER_FIELD(subplan_map, from->nparts * sizeof(int));
COPY_POINTER_FIELD(subpart_map, from->nparts * sizeof(int));
COPY_POINTER_FIELD(relid_map, from->nparts * sizeof(Oid));
+ COPY_POINTER_FIELD(rti_map, from->nparts * sizeof(Index));
COPY_NODE_FIELD(initial_pruning_steps);
COPY_NODE_FIELD(exec_pruning_steps);
COPY_BITMAPSET_FIELD(execparamids);
@@ -5473,6 +5479,21 @@ _copyExtensibleNode(const ExtensibleNode *from)
return newnode;
}
+/* ****************************************************************
+ * execnodes.h copy functions
+ * ****************************************************************
+ */
+static PartitionPruneResult *
+_copyPartitionPruneResult(const PartitionPruneResult *from)
+{
+ PartitionPruneResult *newnode = makeNode(PartitionPruneResult);
+
+ COPY_NODE_FIELD(valid_subplan_offs_list);
+ COPY_BITMAPSET_FIELD(scan_leafpart_rtis);
+
+ return newnode;
+}
+
/* ****************************************************************
* value.h copy functions
* ****************************************************************
@@ -5527,7 +5548,6 @@ _copyBitString(const BitString *from)
return newnode;
}
-
static ForeignKeyCacheInfo *
_copyForeignKeyCacheInfo(const ForeignKeyCacheInfo *from)
{
@@ -6569,6 +6589,13 @@ copyObjectImpl(const void *from)
retval = _copyPublicationTable(from);
break;
+ /*
+ * EXECUTION NODES
+ */
+ case T_PartitionPruneResult:
+ retval = _copyPartitionPruneResult(from);
+ break;
+
/*
* MISCELLANEOUS NODES
*/
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index d5f5e76c55..3dada68291 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -314,7 +314,10 @@ _outPlannedStmt(StringInfo str, const PlannedStmt *node)
WRITE_BOOL_FIELD(parallelModeNeeded);
WRITE_INT_FIELD(jitFlags);
WRITE_NODE_FIELD(planTree);
+ WRITE_NODE_FIELD(partPruneInfos);
+ WRITE_BOOL_FIELD(containsInitialPruning);
WRITE_NODE_FIELD(rtable);
+ WRITE_BITMAPSET_FIELD(minLockRelids);
WRITE_NODE_FIELD(resultRelations);
WRITE_NODE_FIELD(appendRelations);
WRITE_NODE_FIELD(subplans);
@@ -443,7 +446,7 @@ _outAppend(StringInfo str, const Append *node)
WRITE_NODE_FIELD(appendplans);
WRITE_INT_FIELD(nasyncplans);
WRITE_INT_FIELD(first_partial_plan);
- WRITE_NODE_FIELD(part_prune_info);
+ WRITE_INT_FIELD(part_prune_index);
}
static void
@@ -460,7 +463,7 @@ _outMergeAppend(StringInfo str, const MergeAppend *node)
WRITE_OID_ARRAY(sortOperators, node->numCols);
WRITE_OID_ARRAY(collations, node->numCols);
WRITE_BOOL_ARRAY(nullsFirst, node->numCols);
- WRITE_NODE_FIELD(part_prune_info);
+ WRITE_INT_FIELD(part_prune_index);
}
static void
@@ -1009,6 +1012,8 @@ _outPartitionPruneInfo(StringInfo str, const PartitionPruneInfo *node)
WRITE_NODE_TYPE("PARTITIONPRUNEINFO");
WRITE_NODE_FIELD(prune_infos);
+ WRITE_BOOL_FIELD(needs_init_pruning);
+ WRITE_BOOL_FIELD(needs_exec_pruning);
WRITE_BITMAPSET_FIELD(other_subplans);
}
@@ -1023,6 +1028,7 @@ _outPartitionedRelPruneInfo(StringInfo str, const PartitionedRelPruneInfo *node)
WRITE_INT_ARRAY(subplan_map, node->nparts);
WRITE_INT_ARRAY(subpart_map, node->nparts);
WRITE_OID_ARRAY(relid_map, node->nparts);
+ WRITE_INDEX_ARRAY(rti_map, node->nparts);
WRITE_NODE_FIELD(initial_pruning_steps);
WRITE_NODE_FIELD(exec_pruning_steps);
WRITE_BITMAPSET_FIELD(execparamids);
@@ -2425,6 +2431,9 @@ _outPlannerGlobal(StringInfo str, const PlannerGlobal *node)
WRITE_NODE_FIELD(finalrowmarks);
WRITE_NODE_FIELD(resultRelations);
WRITE_NODE_FIELD(appendRelations);
+ WRITE_NODE_FIELD(partPruneInfos);
+ WRITE_BOOL_FIELD(containsInitialPruning);
+ WRITE_BITMAPSET_FIELD(minLockRelids);
WRITE_NODE_FIELD(relationOids);
WRITE_NODE_FIELD(invalItems);
WRITE_NODE_FIELD(paramExecTypes);
@@ -2492,6 +2501,7 @@ _outPlannerInfo(StringInfo str, const PlannerInfo *node)
WRITE_BITMAPSET_FIELD(curOuterRels);
WRITE_NODE_FIELD(curOuterParams);
WRITE_BOOL_FIELD(partColsUpdated);
+ WRITE_NODE_FIELD(partPruneInfos);
}
static void
@@ -2845,6 +2855,21 @@ _outExtensibleNode(StringInfo str, const ExtensibleNode *node)
methods->nodeOut(str, node);
}
+/*****************************************************************************
+ *
+ * Stuff from execnodes.h
+ *
+ *****************************************************************************/
+
+static void
+_outPartitionPruneResult(StringInfo str, const PartitionPruneResult *node)
+{
+ WRITE_NODE_TYPE("PARTITIONPRUNERESULT");
+
+ WRITE_NODE_FIELD(valid_subplan_offs_list);
+ WRITE_BITMAPSET_FIELD(scan_leafpart_rtis);
+}
+
/*****************************************************************************
*
* Stuff from parsenodes.h.
@@ -4754,6 +4779,13 @@ outNode(StringInfo str, const void *obj)
_outJsonTableSibling(str, obj);
break;
+ /*
+ * EXECUTION NODES
+ */
+ case T_PartitionPruneResult:
+ _outPartitionPruneResult(str, obj);
+ break;
+
default:
/*
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 3d150cb25d..6a6fcec03b 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -164,6 +164,11 @@
token = pg_strtok(&length); /* skip :fldname */ \
local_node->fldname = readIntCols(len)
+/* Read an Index array */
+#define READ_INDEX_ARRAY(fldname, len) \
+ token = pg_strtok(&length); /* skip :fldname */ \
+ local_node->fldname = readIndexCols(len)
+
/* Read a bool array */
#define READ_BOOL_ARRAY(fldname, len) \
token = pg_strtok(&length); /* skip :fldname */ \
@@ -1815,7 +1820,10 @@ _readPlannedStmt(void)
READ_BOOL_FIELD(parallelModeNeeded);
READ_INT_FIELD(jitFlags);
READ_NODE_FIELD(planTree);
+ READ_NODE_FIELD(partPruneInfos);
+ READ_BOOL_FIELD(containsInitialPruning);
READ_NODE_FIELD(rtable);
+ READ_BITMAPSET_FIELD(minLockRelids);
READ_NODE_FIELD(resultRelations);
READ_NODE_FIELD(appendRelations);
READ_NODE_FIELD(subplans);
@@ -1947,7 +1955,7 @@ _readAppend(void)
READ_NODE_FIELD(appendplans);
READ_INT_FIELD(nasyncplans);
READ_INT_FIELD(first_partial_plan);
- READ_NODE_FIELD(part_prune_info);
+ READ_INT_FIELD(part_prune_index);
READ_DONE();
}
@@ -1969,7 +1977,7 @@ _readMergeAppend(void)
READ_OID_ARRAY(sortOperators, local_node->numCols);
READ_OID_ARRAY(collations, local_node->numCols);
READ_BOOL_ARRAY(nullsFirst, local_node->numCols);
- READ_NODE_FIELD(part_prune_info);
+ READ_INT_FIELD(part_prune_index);
READ_DONE();
}
@@ -2767,6 +2775,8 @@ _readPartitionPruneInfo(void)
READ_LOCALS(PartitionPruneInfo);
READ_NODE_FIELD(prune_infos);
+ READ_BOOL_FIELD(needs_init_pruning);
+ READ_BOOL_FIELD(needs_exec_pruning);
READ_BITMAPSET_FIELD(other_subplans);
READ_DONE();
@@ -2783,6 +2793,7 @@ _readPartitionedRelPruneInfo(void)
READ_INT_ARRAY(subplan_map, local_node->nparts);
READ_INT_ARRAY(subpart_map, local_node->nparts);
READ_OID_ARRAY(relid_map, local_node->nparts);
+ READ_INDEX_ARRAY(rti_map, local_node->nparts);
READ_NODE_FIELD(initial_pruning_steps);
READ_NODE_FIELD(exec_pruning_steps);
READ_BITMAPSET_FIELD(execparamids);
@@ -2936,6 +2947,21 @@ _readPartitionRangeDatum(void)
READ_DONE();
}
+
+/*
+ * _readPartitionPruneResult
+ */
+static PartitionPruneResult *
+_readPartitionPruneResult(void)
+{
+ READ_LOCALS(PartitionPruneResult);
+
+ READ_NODE_FIELD(valid_subplan_offs_list);
+ READ_BITMAPSET_FIELD(scan_leafpart_rtis);
+
+ READ_DONE();
+}
+
/*
* parseNodeString
*
@@ -3233,6 +3259,8 @@ parseNodeString(void)
return_value = _readJsonTableParent();
else if (MATCH("JSONTABSNODE", 12))
return_value = _readJsonTableSibling();
+ else if (MATCH("PARTITIONPRUNERESULT", 20))
+ return_value = _readPartitionPruneResult();
else
{
elog(ERROR, "badly formatted node string \"%.32s\"...", token);
@@ -3376,6 +3404,30 @@ readIntCols(int numCols)
return int_vals;
}
+/*
+ * readIndexCols
+ */
+Index *
+readIndexCols(int numCols)
+{
+ int tokenLength,
+ i;
+ const char *token;
+ Index *index_vals;
+
+ if (numCols <= 0)
+ return NULL;
+
+ index_vals = (Index *) palloc(numCols * sizeof(Index));
+ for (i = 0; i < numCols; i++)
+ {
+ token = pg_strtok(&tokenLength);
+ index_vals[i] = atoui(token);
+ }
+
+ return index_vals;
+}
+
/*
* readBoolCols
*/
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index 95476ada0b..fe0df2f1d1 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -1184,7 +1184,6 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
ListCell *subpaths;
int nasyncplans = 0;
RelOptInfo *rel = best_path->path.parent;
- PartitionPruneInfo *partpruneinfo = NULL;
int nodenumsortkeys = 0;
AttrNumber *nodeSortColIdx = NULL;
Oid *nodeSortOperators = NULL;
@@ -1335,6 +1334,9 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
subplans = lappend(subplans, subplan);
}
+ /* Set below if we find quals that we can use to run-time prune */
+ plan->part_prune_index = -1;
+
/*
* If any quals exist, they may be useful to perform further partition
* pruning during execution. Gather information needed by the executor to
@@ -1358,16 +1360,14 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
}
if (prunequal != NIL)
- partpruneinfo =
- make_partition_pruneinfo(root, rel,
- best_path->subpaths,
- prunequal);
+ plan->part_prune_index = make_partition_pruneinfo(root, rel,
+ best_path->subpaths,
+ prunequal);
}
plan->appendplans = subplans;
plan->nasyncplans = nasyncplans;
plan->first_partial_plan = best_path->first_partial_path;
- plan->part_prune_info = partpruneinfo;
copy_generic_path_info(&plan->plan, (Path *) best_path);
@@ -1407,7 +1407,6 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
List *subplans = NIL;
ListCell *subpaths;
RelOptInfo *rel = best_path->path.parent;
- PartitionPruneInfo *partpruneinfo = NULL;
/*
* We don't have the actual creation of the MergeAppend node split out
@@ -1500,6 +1499,9 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
subplans = lappend(subplans, subplan);
}
+ /* Set below if we find quals that we can use to run-time prune */
+ node->part_prune_index = -1;
+
/*
* If any quals exist, they may be useful to perform further partition
* pruning during execution. Gather information needed by the executor to
@@ -1523,13 +1525,13 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
}
if (prunequal != NIL)
- partpruneinfo = make_partition_pruneinfo(root, rel,
- best_path->subpaths,
- prunequal);
+ node->part_prune_index = make_partition_pruneinfo(root, rel,
+ best_path->subpaths,
+ prunequal);
}
node->mergeplans = subplans;
- node->part_prune_info = partpruneinfo;
+
/*
* If prepare_sort_from_pathkeys added sort columns, but we were told to
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index b090b087e9..f425362491 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -518,7 +518,10 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
result->dependsOnRole = glob->dependsOnRole;
result->parallelModeNeeded = glob->parallelModeNeeded;
result->planTree = top_plan;
+ result->partPruneInfos = glob->partPruneInfos;
+ result->containsInitialPruning = glob->containsInitialPruning;
result->rtable = glob->finalrtable;
+ result->minLockRelids = glob->minLockRelids;
result->resultRelations = glob->resultRelations;
result->appendRelations = glob->appendRelations;
result->subplans = glob->subplans;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 6ea3505646..c5549a19b4 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -261,7 +261,7 @@ set_plan_references(PlannerInfo *root, Plan *plan)
Plan *result;
PlannerGlobal *glob = root->glob;
int rtoffset = list_length(glob->finalrtable);
- ListCell *lc;
+ ListCell *lc;
/*
* Add all the query's RTEs to the flattened rangetable. The live ones
@@ -270,6 +270,16 @@ set_plan_references(PlannerInfo *root, Plan *plan)
*/
add_rtes_to_flat_rtable(root, false);
+ /*
+ * Add the query's adjusted range of RT indexes to glob->minLockRelids.
+ * The adjusted RT indexes of prunable relations will be deleted from the
+ * set below where PartitionPruneInfos are processed.
+ */
+ glob->minLockRelids =
+ bms_add_range(glob->minLockRelids,
+ rtoffset + 1,
+ rtoffset + list_length(root->parse->rtable));
+
/*
* Adjust RT indexes of PlanRowMarks and add to final rowmarks list
*/
@@ -348,6 +358,64 @@ set_plan_references(PlannerInfo *root, Plan *plan)
}
}
+ /* Also fix up the information in PartitionPruneInfos. */
+ foreach(lc, root->partPruneInfos)
+ {
+ PartitionPruneInfo *pruneinfo = lfirst(lc);
+ Bitmapset *leafpart_rtis = NULL;
+ ListCell *l;
+
+ foreach(l, pruneinfo->prune_infos)
+ {
+ List *prune_infos = lfirst(l);
+ ListCell *l2;
+
+ foreach(l2, prune_infos)
+ {
+ PartitionedRelPruneInfo *pinfo = lfirst(l2);
+ int i;
+
+ /* RT index of the partitioned table. */
+ pinfo->rtindex += rtoffset;
+
+ /* And also those of the leaf partitions. */
+ for (i = 0; i < pinfo->nparts; i++)
+ {
+ if (pinfo->rti_map[i] > 0)
+ {
+ pinfo->rti_map[i] += rtoffset;
+ leafpart_rtis = bms_add_member(leafpart_rtis,
+ pinfo->rti_map[i]);
+ }
+ }
+ }
+ }
+
+ if (pruneinfo->needs_init_pruning)
+ {
+ glob->containsInitialPruning = true;
+
+ /*
+ * Delete the leaf partition RTIs from the global set of relations
+ * to be locked before executing the plan. AcquireExecutorLocks()
+ * will find the ones to add to the set after performing initial
+ * pruning.
+ */
+ glob->minLockRelids = bms_del_members(glob->minLockRelids,
+ leafpart_rtis);
+ }
+
+ glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
+ }
+
+ /*
+ * It seems worth doing a bms_copy() on glob->minLockRelids if we deleted
+ * bits from it just above, to prevent empty tail bits from resulting in
+ * inefficient looping during AcquireExecutorLocks().
+ */
+ if (glob->containsInitialPruning)
+ glob->minLockRelids = bms_copy(glob->minLockRelids);
+
return result;
}
@@ -1640,21 +1708,12 @@ set_append_references(PlannerInfo *root,
aplan->apprelids = offset_relid_set(aplan->apprelids, rtoffset);
- if (aplan->part_prune_info)
- {
- foreach(l, aplan->part_prune_info->prune_infos)
- {
- List *prune_infos = lfirst(l);
- ListCell *l2;
-
- foreach(l2, prune_infos)
- {
- PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
- pinfo->rtindex += rtoffset;
- }
- }
- }
+ /*
+ * PartitionPruneInfos will be added to a list in PlannerGlobal, so update
+ * the index.
+ */
+ if (aplan->part_prune_index >= 0)
+ aplan->part_prune_index += list_length(root->glob->partPruneInfos);
/* We don't need to recurse to lefttree or righttree ... */
Assert(aplan->plan.lefttree == NULL);
@@ -1712,21 +1771,12 @@ set_mergeappend_references(PlannerInfo *root,
mplan->apprelids = offset_relid_set(mplan->apprelids, rtoffset);
- if (mplan->part_prune_info)
- {
- foreach(l, mplan->part_prune_info->prune_infos)
- {
- List *prune_infos = lfirst(l);
- ListCell *l2;
-
- foreach(l2, prune_infos)
- {
- PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
- pinfo->rtindex += rtoffset;
- }
- }
- }
+ /*
+ * PartitionPruneInfos will be added to a list in PlannerGlobal, so update
+ * the index.
+ */
+ if (mplan->part_prune_index >= 0)
+ mplan->part_prune_index += list_length(root->glob->partPruneInfos);
/* We don't need to recurse to lefttree or righttree ... */
Assert(mplan->plan.lefttree == NULL);
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 9d3c05aed3..952c5b8327 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -144,7 +144,9 @@ static List *make_partitionedrel_pruneinfo(PlannerInfo *root,
List *prunequal,
Bitmapset *partrelids,
int *relid_subplan_map,
- Bitmapset **matchedsubplans);
+ Bitmapset **matchedsubplans,
+ bool *needs_init_pruning,
+ bool *needs_exec_pruning);
static void gen_partprune_steps(RelOptInfo *rel, List *clauses,
PartClauseTarget target,
GeneratePruningStepsContext *context);
@@ -209,16 +211,20 @@ static void partkey_datum_from_expr(PartitionPruneContext *context,
/*
* make_partition_pruneinfo
- * Builds a PartitionPruneInfo which can be used in the executor to allow
- * additional partition pruning to take place. Returns NULL when
- * partition pruning would be useless.
+ * Checks if the given set of quals can be used to build pruning steps
+ * that the executor can use to prune away unneeded partitions. If
+ * suitable quals are found then a PartitionPruneInfo is built and tagged
+ * onto the PlannerInfo's partPruneInfos list.
+ *
+ * The return value is the 0-based index of the item added to the
+ * partPruneInfos list or -1 if nothing was added.
*
* 'parentrel' is the RelOptInfo for an appendrel, and 'subpaths' is the list
* of scan paths for its child rels.
* 'prunequal' is a list of potential pruning quals (i.e., restriction
* clauses that are applicable to the appendrel).
*/
-PartitionPruneInfo *
+int
make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
List *subpaths,
List *prunequal)
@@ -230,6 +236,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int *relid_subplan_map;
ListCell *lc;
int i;
+ bool needs_init_pruning = false;
+ bool needs_exec_pruning = false;
/*
* Scan the subpaths to see which ones are scans of partition child
@@ -309,12 +317,16 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
Bitmapset *partrelids = (Bitmapset *) lfirst(lc);
List *pinfolist;
Bitmapset *matchedsubplans = NULL;
+ bool partrel_needs_init_pruning;
+ bool partrel_needs_exec_pruning;
pinfolist = make_partitionedrel_pruneinfo(root, parentrel,
prunequal,
partrelids,
relid_subplan_map,
- &matchedsubplans);
+ &matchedsubplans,
+ &partrel_needs_init_pruning,
+ &partrel_needs_exec_pruning);
/* When pruning is possible, record the matched subplans */
if (pinfolist != NIL)
@@ -323,6 +335,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
allmatchedsubplans = bms_join(matchedsubplans,
allmatchedsubplans);
}
+
+ needs_init_pruning |= partrel_needs_init_pruning;
+ needs_exec_pruning |= partrel_needs_exec_pruning;
}
pfree(relid_subplan_map);
@@ -332,11 +347,13 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
* quals, then we can just not bother with run-time pruning.
*/
if (prunerelinfos == NIL)
- return NULL;
+ return -1;
/* Else build the result data structure */
pruneinfo = makeNode(PartitionPruneInfo);
pruneinfo->prune_infos = prunerelinfos;
+ pruneinfo->needs_init_pruning = needs_init_pruning;
+ pruneinfo->needs_exec_pruning = needs_exec_pruning;
/*
* Some subplans may not belong to any of the identified partitioned rels.
@@ -358,7 +375,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
else
pruneinfo->other_subplans = NULL;
- return pruneinfo;
+ root->partPruneInfos = lappend(root->partPruneInfos, pruneinfo);
+
+ return list_length(root->partPruneInfos) - 1;
}
/*
@@ -435,13 +454,18 @@ add_part_relids(List *allpartrelids, Bitmapset *partrelids)
* If we cannot find any useful run-time pruning steps, return NIL.
* However, on success, each rel identified in partrelids will have
* an element in the result list, even if some of them are useless.
+ * *needs_init_pruning and *needs_exec_pruning are set to indicate that the
+ * returned PartitionedRelPruneInfos contain pruning steps that can be
+ * performed before and after execution begins, respectively.
*/
static List *
make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
List *prunequal,
Bitmapset *partrelids,
int *relid_subplan_map,
- Bitmapset **matchedsubplans)
+ Bitmapset **matchedsubplans,
+ bool *needs_init_pruning,
+ bool *needs_exec_pruning)
{
RelOptInfo *targetpart = NULL;
List *pinfolist = NIL;
@@ -452,6 +476,10 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int rti;
int i;
+ /* Will find out below. */
+ *needs_init_pruning = false;
+ *needs_exec_pruning = false;
+
/*
* Examine each partitioned rel, constructing a temporary array to map
* from planner relids to index of the partitioned rel, and building a
@@ -539,6 +567,9 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
* executor per-scan pruning steps. This first pass creates startup
* pruning steps and detects whether there's any possibly-useful quals
* that would require per-scan pruning.
+ *
+ * The first pass also determines whether the second pass is necessary
+ * by checking for the presence of EXEC parameters.
*/
gen_partprune_steps(subpart, partprunequal, PARTTARGET_INITIAL,
&context);
@@ -613,6 +644,12 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
pinfo->execparamids = execparamids;
/* Remaining fields will be filled in the next loop */
+ /* record which types of pruning steps we've seen so far */
+ if (initial_pruning_steps != NIL)
+ *needs_init_pruning = true;
+ if (exec_pruning_steps != NIL)
+ *needs_exec_pruning = true;
+
pinfolist = lappend(pinfolist, pinfo);
}
@@ -640,6 +677,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int *subplan_map;
int *subpart_map;
Oid *relid_map;
+ Index *rti_map;
/*
* Construct the subplan and subpart maps for this partitioning level.
@@ -652,6 +690,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
subpart_map = (int *) palloc(nparts * sizeof(int));
memset(subpart_map, -1, nparts * sizeof(int));
relid_map = (Oid *) palloc0(nparts * sizeof(Oid));
+ rti_map = (Index *) palloc0(nparts * sizeof(Index));
present_parts = NULL;
i = -1;
@@ -666,6 +705,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
subplan_map[i] = subplanidx = relid_subplan_map[partrel->relid] - 1;
subpart_map[i] = subpartidx = relid_subpart_map[partrel->relid] - 1;
relid_map[i] = planner_rt_fetch(partrel->relid, root)->relid;
+ rti_map[i] = partrel->relid;
if (subplanidx >= 0)
{
present_parts = bms_add_member(present_parts, i);
@@ -690,6 +730,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
pinfo->subplan_map = subplan_map;
pinfo->subpart_map = subpart_map;
pinfo->relid_map = relid_map;
+ pinfo->rti_map = rti_map;
}
pfree(relid_subpart_map);
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 95dc2e2c83..8dc52a158f 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1603,6 +1603,7 @@ exec_bind_message(StringInfo input_message)
int16 *rformats = NULL;
CachedPlanSource *psrc;
CachedPlan *cplan;
+ List *part_prune_result_list;
Portal portal;
char *query_string;
char *saved_stmt_name;
@@ -1978,7 +1979,9 @@ exec_bind_message(StringInfo input_message)
* will be generated in MessageContext. The plan refcount will be
* assigned to the Portal, so it will be released at portal destruction.
*/
- cplan = GetCachedPlan(psrc, params, NULL, NULL);
+ cplan = GetCachedPlan(psrc, params, NULL, NULL, &part_prune_result_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_result_list));
/*
* Now we can define the portal.
@@ -1993,6 +1996,9 @@ exec_bind_message(StringInfo input_message)
cplan->stmt_list,
cplan);
+ /* Copy PartitionPruneResults into the portal's context. */
+ PortalStorePartitionPruneResults(portal, part_prune_result_list);
+
/* Done with the snapshot used for parameter I/O and parsing/planning */
if (snapshot_set)
PopActiveSnapshot();
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 5aa5a350f3..8cc2e2162d 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -35,7 +35,7 @@
Portal ActivePortal = NULL;
-static void ProcessQuery(PlannedStmt *plan,
+static void ProcessQuery(PlannedStmt *plan, PartitionPruneResult *part_prune_result,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -65,6 +65,7 @@ static void DoPortalRewind(Portal portal);
*/
QueryDesc *
CreateQueryDesc(PlannedStmt *plannedstmt,
+ PartitionPruneResult *part_prune_result,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
@@ -77,6 +78,8 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
qd->operation = plannedstmt->commandType; /* operation */
qd->plannedstmt = plannedstmt; /* plan */
+ qd->part_prune_result = part_prune_result; /* ExecutorDoInitialPruning()
+ * output for plan */
qd->sourceText = sourceText; /* query text */
qd->snapshot = RegisterSnapshot(snapshot); /* snapshot */
/* RI check snapshot */
@@ -122,6 +125,7 @@ FreeQueryDesc(QueryDesc *qdesc)
* PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
*
* plan: the plan tree for the query
+ * part_prune_result: ExecutorDoInitialPruning() output for the plan tree
* sourceText: the source text of the query
* params: any parameters needed
* dest: where to send results
@@ -134,6 +138,7 @@ FreeQueryDesc(QueryDesc *qdesc)
*/
static void
ProcessQuery(PlannedStmt *plan,
+ PartitionPruneResult *part_prune_result,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -145,7 +150,7 @@ ProcessQuery(PlannedStmt *plan,
/*
* Create the QueryDesc object
*/
- queryDesc = CreateQueryDesc(plan, sourceText,
+ queryDesc = CreateQueryDesc(plan, part_prune_result, sourceText,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
@@ -491,8 +496,13 @@ PortalStart(Portal portal, ParamListInfo params,
/*
* Create QueryDesc in portal's context; for the moment, set
* the destination to DestNone.
+ *
+ * There is no PartitionPruneResult unless the PlannedStmt is
+ * from a CachedPlan.
*/
queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+ portal->part_prune_results == NIL ? NULL :
+ linitial(portal->part_prune_results),
portal->sourceText,
GetActiveSnapshot(),
InvalidSnapshot,
@@ -1225,6 +1235,8 @@ PortalRunMulti(Portal portal,
if (pstmt->utilityStmt == NULL)
{
+ PartitionPruneResult *part_prune_result = NULL;
+
/*
* process a plannable query.
*/
@@ -1271,10 +1283,18 @@ PortalRunMulti(Portal portal,
else
UpdateActiveSnapshotCommandId();
+ /*
+ * Determine if there's a corresponding PartitionPruneResult for
+ * this PlannedStmt.
+ */
+ if (portal->part_prune_results != NIL)
+ part_prune_result = list_nth(portal->part_prune_results,
+ foreach_current_index(stmtlist_item));
+
if (pstmt->canSetTag)
{
/* statement can set tag string */
- ProcessQuery(pstmt,
+ ProcessQuery(pstmt, part_prune_result,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
@@ -1283,7 +1303,7 @@ PortalRunMulti(Portal portal,
else
{
/* stmt added by rewrite cannot set tag */
- ProcessQuery(pstmt,
+ ProcessQuery(pstmt, part_prune_result,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 4cf6db504f..6cb473f2f4 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -99,14 +99,19 @@ static dlist_head cached_expression_list = DLIST_STATIC_INIT(cached_expression_l
static void ReleaseGenericPlan(CachedPlanSource *plansource);
static List *RevalidateCachedQuery(CachedPlanSource *plansource,
QueryEnvironment *queryEnv);
-static bool CheckCachedPlan(CachedPlanSource *plansource);
+static bool CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+ List **part_prune_result_list);
static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
- ParamListInfo boundParams, QueryEnvironment *queryEnv);
+ ParamListInfo boundParams, QueryEnvironment *queryEnv,
+ List **part_prune_result_list);
static bool choose_custom_plan(CachedPlanSource *plansource,
ParamListInfo boundParams);
static double cached_plan_cost(CachedPlan *plan, bool include_planner);
static Query *QueryListGetPrimaryStmt(List *stmts);
-static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static void AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+ List **part_prune_result_list,
+ List **lockedRelids_per_stmt);
+static void ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt);
static void AcquirePlannerLocks(List *stmt_list, bool acquire);
static void ScanQueryForLocks(Query *parsetree, bool acquire);
static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -790,15 +795,20 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
*
* On a "true" return, we have acquired the locks needed to run the plan.
* (We must do this for the "true" result to be race-condition-free.)
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_result_list.
*/
static bool
-CheckCachedPlan(CachedPlanSource *plansource)
+CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+ List **part_prune_result_list)
{
CachedPlan *plan = plansource->gplan;
/* Assert that caller checked the querytree */
Assert(plansource->is_valid);
+ *part_prune_result_list = NIL;
+
/* If there's no generic plan, just say "false" */
if (!plan)
return false;
@@ -820,13 +830,21 @@ CheckCachedPlan(CachedPlanSource *plansource)
*/
if (plan->is_valid)
{
+ List *lockedRelids_per_stmt;
+
/*
* Plan must have positive refcount because it is referenced by
* plansource; so no need to fear it disappears under us here.
*/
Assert(plan->refcount > 0);
- AcquireExecutorLocks(plan->stmt_list, true);
+ /*
+ * Lock relations scanned by the plan. This is where the pruning
+ * happens if needed.
+ */
+ AcquireExecutorLocks(plan->stmt_list, boundParams,
+ part_prune_result_list,
+ &lockedRelids_per_stmt);
/*
* If plan was transient, check to see if TransactionXmin has
@@ -848,7 +866,14 @@ CheckCachedPlan(CachedPlanSource *plansource)
}
/* Oops, the race case happened. Release useless locks. */
- AcquireExecutorLocks(plan->stmt_list, false);
+ ReleaseExecutorLocks(plan->stmt_list, lockedRelids_per_stmt);
+
+ /*
+ * The output list and any objects therein were allocated in the
+ * caller's hopefully short-lived context, so they will not stay leaked
+ * for long; still, reset the list so it is not accidentally looked at.
+ */
+ *part_prune_result_list = NIL;
}
/*
@@ -874,10 +899,15 @@ CheckCachedPlan(CachedPlanSource *plansource)
* Planning work is done in the caller's memory context. The finished plan
* is in a child memory context, which typically should get reparented
* (unless this is a one-shot plan, in which case we don't copy the plan).
+ *
+ * A list of NULLs is returned in *part_prune_result_list, meaning that no
+ * PartitionPruneResult nodes have yet been created for the plans in
+ * stmt_list.
*/
static CachedPlan *
BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
- ParamListInfo boundParams, QueryEnvironment *queryEnv)
+ ParamListInfo boundParams, QueryEnvironment *queryEnv,
+ List **part_prune_result_list)
{
CachedPlan *plan;
List *plist;
@@ -1007,6 +1037,17 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
MemoryContextSwitchTo(oldcxt);
+ /*
+ * There are no actual PartitionPruneResults to add yet, but the list
+ * must still be initialized to have the same number of elements as the
+ * list of PlannedStmts.
+ */
+ *part_prune_result_list = NIL;
+ foreach(lc, plist)
+ {
+ *part_prune_result_list = lappend(*part_prune_result_list, NULL);
+ }
+
return plan;
}
@@ -1126,6 +1167,17 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
* plan or a custom plan for the given parameters: the caller does not know
* which it will get.
*
+ * For every PlannedStmt found in the returned CachedPlan, an element that
+ * is either a PartitionPruneResult or a NULL is added to
+ * *part_prune_result_list if needed. It is the former if the PlannedStmt
+ * is from the existing CachedPlan that is otherwise valid and contains at
+ * least one PartitionPruneInfo that has "initial" pruning steps. Those
+ * steps are performed by calling ExecutorDoInitialPruning(), which prunes
+ * away subplans that don't match the pruning conditions so that
+ * AcquireExecutorLocks() locks only the leaf partitions that remain. The
+ * PartitionPruneResult contains a list of bitmapsets of the indexes of
+ * matching subplans, one for each PartitionPruneInfo.
+ *
* On return, the plan is valid and we have sufficient locks to begin
* execution.
*
@@ -1139,11 +1191,13 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
*/
CachedPlan *
GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
- ResourceOwner owner, QueryEnvironment *queryEnv)
+ ResourceOwner owner, QueryEnvironment *queryEnv,
+ List **part_prune_result_list)
{
CachedPlan *plan = NULL;
List *qlist;
bool customplan;
+ List *my_part_prune_result_list;
/* Assert caller is doing things in a sane order */
Assert(plansource->magic == CACHEDPLANSOURCE_MAGIC);
@@ -1160,7 +1214,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
if (!customplan)
{
- if (CheckCachedPlan(plansource))
+ if (CheckCachedPlan(plansource, boundParams,
+ &my_part_prune_result_list))
{
/* We want a generic plan, and we already have a valid one */
plan = plansource->gplan;
@@ -1169,7 +1224,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
else
{
/* Build a new generic plan */
- plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv);
+ plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv,
+ &my_part_prune_result_list);
/* Just make real sure plansource->gplan is clear */
ReleaseGenericPlan(plansource);
/* Link the new generic plan into the plansource */
@@ -1214,7 +1270,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
if (customplan)
{
/* Build a custom plan */
- plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv);
+ plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv,
+ &my_part_prune_result_list);
/* Accumulate total costs of custom plans */
plansource->total_custom_cost += cached_plan_cost(plan, true);
@@ -1246,6 +1303,9 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
plan->is_saved = true;
}
+ if (part_prune_result_list)
+ *part_prune_result_list = my_part_prune_result_list;
+
return plan;
}
@@ -1737,17 +1797,29 @@ QueryListGetPrimaryStmt(List *stmts)
/*
* AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
- * or release them if acquire is false.
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_result_list.
+ *
+ * On return, *lockedRelids_per_stmt will contain a bitmapset for every
+ * PlannedStmt in stmt_list, containing the RT indexes of relation entries
+ * in its range table that were actually locked, or NULL if the PlannedStmt
+ * contains a utility statement.
*/
static void
-AcquireExecutorLocks(List *stmt_list, bool acquire)
+AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+ List **part_prune_result_list,
+ List **lockedRelids_per_stmt)
{
ListCell *lc1;
+ *part_prune_result_list = *lockedRelids_per_stmt = NIL;
foreach(lc1, stmt_list)
{
PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
- ListCell *lc2;
+ PartitionPruneResult *part_prune_result = NULL;
+ Bitmapset *allLockRelids;
+ Bitmapset *lockedRelids = NULL;
+ int rti;
if (plannedstmt->commandType == CMD_UTILITY)
{
@@ -1761,13 +1833,35 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
Query *query = UtilityContainsQuery(plannedstmt->utilityStmt);
if (query)
- ScanQueryForLocks(query, acquire);
+ ScanQueryForLocks(query, true);
+ *part_prune_result_list = lappend(*part_prune_result_list, NULL);
continue;
}
- foreach(lc2, plannedstmt->rtable)
+ /*
+ * Figure out the set of relations that would need to be locked
+ * before executing the plan.
+ */
+ if (plannedstmt->containsInitialPruning)
{
- RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
+ /*
+ * Obtain the set of partitions to be locked from the
+ * PartitionPruneInfos by considering the result of performing
+ * initial partition pruning.
+ */
+ part_prune_result =
+ ExecutorDoInitialPruning(plannedstmt, boundParams);
+
+ allLockRelids = bms_union(plannedstmt->minLockRelids,
+ part_prune_result->scan_leafpart_rtis);
+ }
+ else
+ allLockRelids = plannedstmt->minLockRelids;
+
+ rti = -1;
+ while ((rti = bms_next_member(allLockRelids, rti)) > 0)
+ {
+ RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
if (rte->rtekind != RTE_RELATION)
continue;
@@ -1778,10 +1872,58 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
* fail if it's been dropped entirely --- we'll just transiently
* acquire a non-conflicting lock.
*/
- if (acquire)
- LockRelationOid(rte->relid, rte->rellockmode);
- else
- UnlockRelationOid(rte->relid, rte->rellockmode);
+ LockRelationOid(rte->relid, rte->rellockmode);
+ lockedRelids = bms_add_member(lockedRelids, rti);
+ }
+
+ *part_prune_result_list = lappend(*part_prune_result_list,
+ part_prune_result);
+ *lockedRelids_per_stmt = lappend(*lockedRelids_per_stmt, lockedRelids);
+ }
+}
+
+/*
+ * ReleaseExecutorLocks
+ * Release locks that would've been acquired by an earlier call to
+ * AcquireExecutorLocks()
+ */
+static void
+ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt)
+{
+ ListCell *lc1,
+ *lc2;
+
+ forboth(lc1, stmt_list, lc2, lockedRelids_per_stmt)
+ {
+ PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
+ Bitmapset *lockedRelids = lfirst(lc2);
+ int rti;
+
+ if (plannedstmt->commandType == CMD_UTILITY)
+ {
+ /*
+ * Ignore utility statements, except those (such as EXPLAIN) that
+ * contain a parsed-but-not-planned query. Note: it's okay to use
+ * ScanQueryForLocks, even though the query hasn't been through
+ * rule rewriting, because rewriting doesn't change the query
+ * representation.
+ */
+ Query *query = UtilityContainsQuery(plannedstmt->utilityStmt);
+
+ if (query)
+ ScanQueryForLocks(query, false);
+ continue;
+ }
+
+ rti = -1;
+ while ((rti = bms_next_member(lockedRelids, rti)) >= 0)
+ {
+ RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
+
+ Assert(rte->rtekind == RTE_RELATION);
+
+ /* See the comment in AcquireExecutorLocks(). */
+ UnlockRelationOid(rte->relid, rte->rellockmode);
}
}
}
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index d549f66d4a..1bbe6b704b 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -303,6 +303,25 @@ PortalDefineQuery(Portal portal,
portal->status = PORTAL_DEFINED;
}
+/*
+ * PortalStorePartitionPruneResults
+ * Copy the given list of PartitionPruneResults into the portal's
+ * context
+ *
+ * This allows the caller to ensure that the list exists as long as the portal
+ * does.
+ */
+void
+PortalStorePartitionPruneResults(Portal portal, List *part_prune_results)
+{
+ MemoryContext oldcxt;
+
+ AssertArg(PortalIsValid(portal));
+ oldcxt = MemoryContextSwitchTo(portal->portalContext);
+ portal->part_prune_results = copyObject(part_prune_results);
+ MemoryContextSwitchTo(oldcxt);
+}
+
/*
* PortalReleaseCachedPlan
* Release a portal's reference to its cached plan, if any.
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 666977fb1f..bbc8c42d88 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -87,7 +87,9 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
ExplainState *es, const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv);
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt,
+ PartitionPruneResult *part_prune_result,
+ IntoClause *into,
ExplainState *es, const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv,
const instr_time *planduration,
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 708435e952..bd8776402e 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -45,6 +45,7 @@ extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
* nparts Length of subplan_map[] and subpart_map[].
* subplan_map Subplan index by partition index, or -1.
* subpart_map Subpart index by partition index, or -1.
+ * rti_map Range table index by partition index, or 0.
* present_parts A Bitmapset of the partition indexes that we
* have subplans or subparts for.
* initial_pruning_steps List of PartitionPruneSteps used to
@@ -61,6 +62,7 @@ typedef struct PartitionedRelPruningData
int nparts;
int *subplan_map;
int *subpart_map;
+ Index *rti_map;
Bitmapset *present_parts;
List *initial_pruning_steps;
List *exec_pruning_steps;
@@ -123,9 +125,13 @@ typedef struct PartitionPruneState
extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
int n_total_subplans,
- PartitionPruneInfo *pruneinfo,
+ int part_prune_index,
Bitmapset **initially_valid_subplans);
extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
- bool initial_prune);
-
+ bool initial_prune,
+ Bitmapset **scan_leafpart_rtis);
+extern Bitmapset *ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt,
+ ParamListInfo params,
+ PartitionPruneInfo *pruneinfo,
+ Bitmapset **scan_leafpart_rtis);
#endif /* EXECPARTITION_H */
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index e79e2c001f..60d5644908 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -35,6 +35,8 @@ typedef struct QueryDesc
/* These fields are provided by CreateQueryDesc */
CmdType operation; /* CMD_SELECT, CMD_UPDATE, etc. */
PlannedStmt *plannedstmt; /* planner's output (could be utility, too) */
+ PartitionPruneResult *part_prune_result; /* ExecutorDoInitialPruning()'s
+ * output for plannedstmt */
const char *sourceText; /* source text of the query */
Snapshot snapshot; /* snapshot to use for query */
Snapshot crosscheck_snapshot; /* crosscheck for RI update/delete */
@@ -57,6 +59,7 @@ typedef struct QueryDesc
/* in pquery.c */
extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
+ PartitionPruneResult *part_prune_result,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index 873772f188..57dc0e8077 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -185,6 +185,8 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
/*
* prototypes from functions in execMain.c
*/
+extern PartitionPruneResult *ExecutorDoInitialPruning(PlannedStmt *plannedstmt,
+ ParamListInfo params);
extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
extern void ExecutorRun(QueryDesc *queryDesc,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 94b191f8ae..a8bf908d63 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -596,6 +596,8 @@ typedef struct EState
struct ExecRowMark **es_rowmarks; /* Array of per-range-table-entry
* ExecRowMarks, or NULL if none */
PlannedStmt *es_plannedstmt; /* link to top of plan tree */
+ List *es_part_prune_infos; /* PlannedStmt.partPruneInfos */
+ struct PartitionPruneResult *es_part_prune_result; /* QueryDesc.part_prune_result */
const char *es_sourceText; /* Source text from QueryDesc */
JunkFilter *es_junkFilter; /* top-level junk filter, if any */
@@ -984,6 +986,34 @@ typedef struct DomainConstraintState
*/
typedef TupleTableSlot *(*ExecProcNodeMtd) (struct PlanState *pstate);
+/*----------------
+ * PartitionPruneResult
+ *
+ * The result of invoking ExecutorDoInitialPruning() on a given
+ * PlannedStmt.
+ *
+ * Contains a list of Bitmapsets of the indexes of the subplans remaining after
+ * performing initial pruning by calling ExecFindMatchingSubPlans() for every
+ * PartitionPruneInfo found in PlannedStmt.partPruneInfos. RT indexes of the
+ * leaf partitions scanned by those subplans across all PartitionPruneInfos
+ * are added into scan_leafpart_rtis.
+ *
+ * This is used by GetCachedPlan() to inform its callers of the pruning
+ * decisions made when performing AcquireExecutorLocks() on a given cached
+ * PlannedStmt, which the callers then pass on to the executor. If this node
+ * is available, the executor refers to it when initializing the plan nodes to
+ * which those PartitionPruneInfos apply so that the same set of qualifying
+ * subplans is initialized, rather than deriving that set again by redoing
+ * initial pruning.
+ */
+typedef struct PartitionPruneResult
+{
+ NodeTag type;
+
+ List *valid_subplan_offs_list;
+ Bitmapset *scan_leafpart_rtis;
+} PartitionPruneResult;
+
/* ----------------
* PlanState node
*
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 340d28f4e1..66416bce97 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -97,6 +97,9 @@ typedef enum NodeTag
T_PartitionPruneStepCombine,
T_PlanInvalItem,
+ /* TAGS FOR EXECUTOR PREP NODES (execnodes.h) */
+ T_PartitionPruneResult,
+
/*
* TAGS FOR PLAN STATE NODES (execnodes.h)
*
@@ -674,6 +677,7 @@ extern struct Bitmapset *readBitmapset(void);
extern uintptr_t readDatum(bool typbyval);
extern bool *readBoolCols(int numCols);
extern int *readIntCols(int numCols);
+extern Index *readIndexCols(int numCols);
extern Oid *readOidCols(int numCols);
extern int16 *readAttrNumberCols(int numCols);
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index c5ab53e05c..11007cda25 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -107,6 +107,18 @@ typedef struct PlannerGlobal
List *appendRelations; /* "flat" list of AppendRelInfos */
+ List *partPruneInfos; /* List of PartitionPruneInfo contained in
+ * the plan */
+
+ bool containsInitialPruning; /* Do any of those PartitionPruneInfos
+ * have initial (pre-exec) pruning
+ * steps in them? */
+
+ Bitmapset *minLockRelids; /* Indexes of all range table entries minus
+ * indexes of range table entries of the leaf
+ * partitions scanned by prunable subplans;
+ * see AcquireExecutorLocks() */
+
List *relationOids; /* OIDs of relations the plan depends on */
List *invalItems; /* other dependencies, as PlanInvalItems */
@@ -377,6 +389,9 @@ struct PlannerInfo
/* Does this query modify any partition key columns? */
bool partColsUpdated;
+
+ /* PartitionPruneInfos added in this query's plan. */
+ List *partPruneInfos;
};
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index e43e360d9b..f8f3971f44 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -64,8 +64,20 @@ typedef struct PlannedStmt
struct Plan *planTree; /* tree of Plan nodes */
+ List *partPruneInfos; /* List of PartitionPruneInfo contained in
+ * the plan */
+
+ bool containsInitialPruning; /* Do any of those PartitionPruneInfos
+ * have initial (pre-exec) pruning
+ * steps in them? */
+
List *rtable; /* list of RangeTblEntry nodes */
+ Bitmapset *minLockRelids; /* Indexes of all range table entries minus
+ * indexes of range table entries of the leaf
+ * partitions scanned by prunable subplans;
+ * see AcquireExecutorLocks() */
+
/* rtable indexes of target relations for INSERT/UPDATE/DELETE */
List *resultRelations; /* integer list of RT indexes, or NIL */
@@ -262,8 +274,8 @@ typedef struct Append
*/
int first_partial_plan;
- /* Info for run-time subplan pruning; NULL if we're not doing that */
- struct PartitionPruneInfo *part_prune_info;
+ /* Index to PlannerInfo.partPruneInfos or -1 if no run-time pruning */
+ int part_prune_index;
} Append;
/* ----------------
@@ -282,8 +294,9 @@ typedef struct MergeAppend
Oid *sortOperators; /* OIDs of operators to sort them by */
Oid *collations; /* OIDs of collations */
bool *nullsFirst; /* NULLS FIRST/LAST directions */
- /* Info for run-time subplan pruning; NULL if we're not doing that */
- struct PartitionPruneInfo *part_prune_info;
+
+ /* Index to PlannerInfo.partPruneInfos or -1 if no run-time pruning */
+ int part_prune_index;
} MergeAppend;
/* ----------------
@@ -1191,6 +1204,13 @@ typedef struct PlanRowMark
* prune_infos List of Lists containing PartitionedRelPruneInfo nodes,
* one sublist per run-time-prunable partition hierarchy
* appearing in the parent plan node's subplans.
+ *
+ * needs_init_pruning Does any of the PartitionedRelPruneInfos in
+ * prune_infos have its initial_pruning_steps set?
+ *
+ * needs_exec_pruning Does any of the PartitionedRelPruneInfos in
+ * prune_infos have its exec_pruning_steps set?
+ *
* other_subplans Indexes of any subplans that are not accounted for
* by any of the PartitionedRelPruneInfo nodes in
* "prune_infos". These subplans must not be pruned.
@@ -1199,6 +1219,8 @@ typedef struct PartitionPruneInfo
{
NodeTag type;
List *prune_infos;
+ bool needs_init_pruning;
+ bool needs_exec_pruning;
Bitmapset *other_subplans;
} PartitionPruneInfo;
@@ -1229,6 +1251,7 @@ typedef struct PartitionedRelPruneInfo
int *subplan_map; /* subplan index by partition index, or -1 */
int *subpart_map; /* subpart index by partition index, or -1 */
Oid *relid_map; /* relation OID by partition index, or 0 */
+ Index *rti_map; /* Range table index by partition index, or 0. */
/*
* initial_pruning_steps shows how to prune during executor startup (i.e.,
diff --git a/src/include/partitioning/partprune.h b/src/include/partitioning/partprune.h
index 90684efa25..ebf0dcff8c 100644
--- a/src/include/partitioning/partprune.h
+++ b/src/include/partitioning/partprune.h
@@ -70,10 +70,10 @@ typedef struct PartitionPruneContext
#define PruneCxtStateIdx(partnatts, step_id, keyno) \
((partnatts) * (step_id) + (keyno))
-extern PartitionPruneInfo *make_partition_pruneinfo(struct PlannerInfo *root,
- struct RelOptInfo *parentrel,
- List *subpaths,
- List *prunequal);
+extern int make_partition_pruneinfo(struct PlannerInfo *root,
+ struct RelOptInfo *parentrel,
+ List *subpaths,
+ List *prunequal);
extern Bitmapset *prune_append_rel_partitions(struct RelOptInfo *rel);
extern Bitmapset *get_matching_partitions(PartitionPruneContext *context,
List *pruning_steps);
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 95b99e3d25..449200b949 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -220,7 +220,8 @@ extern List *CachedPlanGetTargetList(CachedPlanSource *plansource,
extern CachedPlan *GetCachedPlan(CachedPlanSource *plansource,
ParamListInfo boundParams,
ResourceOwner owner,
- QueryEnvironment *queryEnv);
+ QueryEnvironment *queryEnv,
+ List **part_prune_result_list);
extern void ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner);
extern bool CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index aeddbdafe5..9f7727a837 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -138,6 +138,7 @@ typedef struct PortalData
QueryCompletion qc; /* command completion data for executed query */
List *stmts; /* list of PlannedStmts */
CachedPlan *cplan; /* CachedPlan, if stmts are from one */
+ List *part_prune_results; /* list of PartitionPruneResults */
ParamListInfo portalParams; /* params to pass to query */
QueryEnvironment *queryEnv; /* environment for query */
@@ -242,6 +243,8 @@ extern void PortalDefineQuery(Portal portal,
CommandTag commandTag,
List *stmts,
CachedPlan *cplan);
+extern void PortalStorePartitionPruneResults(Portal portal,
+ List *part_prune_result_list);
extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
extern void PortalCreateHoldStore(Portal portal);
extern void PortalHashTableDeleteAll(void);
--
2.24.1
* Re: generic plans and "initial" pruning
@ 2022-04-11 03:58 Zhihong Yu <[email protected]>
parent: Amit Langote <[email protected]>
0 siblings, 1 reply; 71+ messages in thread
From: Zhihong Yu @ 2022-04-11 03:58 UTC (permalink / raw)
To: Amit Langote <[email protected]>; +Cc: David Rowley <[email protected]>; Robert Haas <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers
On Sun, Apr 10, 2022 at 8:05 PM Amit Langote <[email protected]> wrote:
> On Fri, Apr 8, 2022 at 8:45 PM Amit Langote <[email protected]> wrote:
> > Most changes looked fine to me except a couple of typos, so I've
> > adopted those into the attached new version, even though I know it's
> > too late to try to apply it.
> >
> > + * XXX is it worth doing a bms_copy() on glob->minLockRelids if
> > + * glob->containsInitialPruning is true? I'm slightly worried that the
> > + * Bitmapset could have a very long empty tail resulting in excessive
> > + * looping during AcquireExecutorLocks().
> > + */
> >
> > I guess I trust your instincts about bitmapset operation efficiency
> > and what you've written here makes sense. It's typical for leaf
> > partitions to have been appended toward the tail end of rtable and I'd
> > imagine their indexes would be in the tail words of minLockRelids. If
> > copying the bitmapset removes those useless words, I don't see why we
> > shouldn't do that. So added:
> >
> > + /*
> > + * It seems worth doing a bms_copy() on glob->minLockRelids if we deleted
> > + * bits from it just above to prevent empty tail bits resulting in
> > + * inefficient looping during AcquireExecutorLocks().
> > + */
> > + if (glob->containsInitialPruning)
> > + glob->minLockRelids = bms_copy(glob->minLockRelids)
> >
> > Not 100% sure about the comment I wrote.
>
> And the quoted code change missed a semicolon in the v14 that I
> hurriedly sent on Friday. (Had apparently forgotten to `git add` the
> hunk to fix that).
>
> Sending v15 that fixes that to keep the cfbot green for now.
>
> --
> Amit Langote
> EDB: http://www.enterprisedb.com
Hi,
+ /* RT index of the partitione table. */
partitione -> partitioned
Cheers
* Re: generic plans and "initial" pruning
@ 2022-05-27 08:09 Amit Langote <[email protected]>
parent: Zhihong Yu <[email protected]>
0 siblings, 2 replies; 71+ messages in thread
From: Amit Langote @ 2022-05-27 08:09 UTC (permalink / raw)
To: Zhihong Yu <[email protected]>; +Cc: David Rowley <[email protected]>; Robert Haas <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers
On Mon, Apr 11, 2022 at 12:53 PM Zhihong Yu <[email protected]> wrote:
> On Sun, Apr 10, 2022 at 8:05 PM Amit Langote <[email protected]> wrote:
>> Sending v15 that fixes that to keep the cfbot green for now.
>
> Hi,
>
> + /* RT index of the partitione table. */
>
> partitione -> partitioned
Thanks, fixed.
Also, I broke this into two patches:
0001 contains the mechanical changes of moving PartitionPruneInfo out
of Append/MergeAppend into a list in PlannedStmt.
0002 is the main patch to "Optimize AcquireExecutorLocks() by locking
only unpruned partitions".
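To make the idea behind 0002 concrete, here is a rough, standalone sketch of how the set of relations to lock is formed. It is not code from the patch: ToyRelids and the toy_* functions are made-up stand-ins for Bitmapset and the bms_* helpers, and the sketch assumes RT indexes below 64 so that a single word can hold the set.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for a Bitmapset over RT indexes 1..63 (bit i = RT index i). */
typedef uint64_t ToyRelids;

/* Hypothetical helper: return the set with RT index rti added. */
ToyRelids
toy_add_member(ToyRelids set, int rti)
{
	assert(rti > 0 && rti < 64);
	return set | ((ToyRelids) 1 << rti);
}

/*
 * Relations to lock = entries that must always be locked (minLockRelids in
 * the patch) plus the leaf partitions that survived initial pruning
 * (scan_leafpart_rtis in the patch).  Pruned leaf partitions appear in
 * neither set, so they are never locked.
 */
ToyRelids
toy_lock_set(ToyRelids min_lock_relids, ToyRelids scan_leafpart_rtis)
{
	return min_lock_relids | scan_leafpart_rtis;
}

/* Count members, i.e. how many lock acquisitions the set would require. */
int
toy_num_locks(ToyRelids set)
{
	int			n = 0;

	/* Clear the lowest set bit on each iteration. */
	for (; set != 0; set &= set - 1)
		n++;
	return n;
}
```

For example, with a partitioned parent at RT index 1 and leaf partitions at 2, 3 and 4, of which only 3 survives initial pruning, toy_lock_set() yields {1, 3}: two locks instead of four.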
--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com
Attachments:
[application/octet-stream] v16-0001-Move-PartitioPruneInfo-out-of-plan-nodes-into-Pl.patch (21.2K, 2-v16-0001-Move-PartitioPruneInfo-out-of-plan-nodes-into-Pl.patch)
From 16fd07b7c8ffde7632ffa7b03e4595e1e08d7e06 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Fri, 27 May 2022 16:00:28 +0900
Subject: [PATCH v16 1/2] Move PartitionPruneInfo out of plan nodes into
PlannedStmt
The planner will now add a given PartitionPruneInfo to
PlannedStmt.partPruneInfos instead of to the Append/MergeAppend plan
node it was attached to until now, and set an index field in
the plan node that points to the list element.
A later commit will make AcquireExecutorLocks() do the initial
partition pruning to determine a minimal set of partitions to be
locked to validate a plan tree, and it will need to consult the
PartitionPruneInfos referenced therein to do so. It is better
for the PartitionPruneInfos to be accessible directly, by simply
iterating over PlannedStmt.partPruneInfos, than to have to find
them individually by walking the plan tree.
---
src/backend/executor/execMain.c | 1 +
src/backend/executor/execParallel.c | 1 +
src/backend/executor/execPartition.c | 4 +-
src/backend/executor/execUtils.c | 2 +
src/backend/executor/nodeAppend.c | 4 +-
src/backend/executor/nodeMergeAppend.c | 4 +-
src/backend/nodes/copyfuncs.c | 5 +-
src/backend/nodes/outfuncs.c | 7 ++-
src/backend/nodes/readfuncs.c | 5 +-
src/backend/optimizer/plan/createplan.c | 24 ++++-----
src/backend/optimizer/plan/planner.c | 1 +
src/backend/optimizer/plan/setrefs.c | 65 +++++++++++++------------
src/backend/partitioning/partprune.c | 18 ++++---
src/include/executor/execPartition.h | 3 +-
src/include/nodes/execnodes.h | 2 +
src/include/nodes/pathnodes.h | 6 +++
src/include/nodes/plannodes.h | 12 +++--
src/include/partitioning/partprune.h | 8 +--
18 files changed, 104 insertions(+), 68 deletions(-)
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index ef2fd46092..72fc273524 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -825,6 +825,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
ExecInitRangeTable(estate, rangeTable);
estate->es_plannedstmt = plannedstmt;
+ estate->es_part_prune_infos = plannedstmt->partPruneInfos;
/*
* Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index f1fd7f7e8b..f73b8c2607 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -183,6 +183,7 @@ ExecSerializePlan(Plan *plan, EState *estate)
pstmt->dependsOnRole = false;
pstmt->parallelModeNeeded = false;
pstmt->planTree = plan;
+ pstmt->partPruneInfos = estate->es_part_prune_infos;
pstmt->rtable = estate->es_range_table;
pstmt->resultRelations = NIL;
pstmt->appendRelations = NIL;
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index e03ea27299..b55cdd2580 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -1638,11 +1638,13 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
PartitionPruneState *
ExecInitPartitionPruning(PlanState *planstate,
int n_total_subplans,
- PartitionPruneInfo *pruneinfo,
+ int part_prune_index,
Bitmapset **initially_valid_subplans)
{
PartitionPruneState *prunestate;
EState *estate = planstate->state;
+ PartitionPruneInfo *pruneinfo = list_nth(estate->es_part_prune_infos,
+ part_prune_index);
/* We may need an expression context to evaluate partition exprs */
ExecAssignExprContext(estate, planstate);
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 9df1f81ea8..f9c7976ff2 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -119,6 +119,8 @@ CreateExecutorState(void)
estate->es_relations = NULL;
estate->es_rowmarks = NULL;
estate->es_plannedstmt = NULL;
+ estate->es_part_prune_infos = NIL;
+ estate->es_part_prune_result = NULL;
estate->es_junkFilter = NULL;
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 357e10a1d7..c6f86a6510 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -134,7 +134,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
appendstate->as_begun = false;
/* If run-time partition pruning is enabled, then set that up now */
- if (node->part_prune_info != NULL)
+ if (node->part_prune_index >= 0)
{
PartitionPruneState *prunestate;
@@ -145,7 +145,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
*/
prunestate = ExecInitPartitionPruning(&appendstate->ps,
list_length(node->appendplans),
- node->part_prune_info,
+ node->part_prune_index,
&validsubplans);
appendstate->as_prune_state = prunestate;
nplans = bms_num_members(validsubplans);
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index c5c62fa5c7..8d35860c30 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -82,7 +82,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
mergestate->ps.ExecProcNode = ExecMergeAppend;
/* If run-time partition pruning is enabled, then set that up now */
- if (node->part_prune_info != NULL)
+ if (node->part_prune_index >= 0)
{
PartitionPruneState *prunestate;
@@ -93,7 +93,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
*/
prunestate = ExecInitPartitionPruning(&mergestate->ps,
list_length(node->mergeplans),
- node->part_prune_info,
+ node->part_prune_index,
&validsubplans);
mergestate->ms_prune_state = prunestate;
nplans = bms_num_members(validsubplans);
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 51d630fa89..8fbeaa4f36 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -96,6 +96,7 @@ _copyPlannedStmt(const PlannedStmt *from)
COPY_SCALAR_FIELD(parallelModeNeeded);
COPY_SCALAR_FIELD(jitFlags);
COPY_NODE_FIELD(planTree);
+ COPY_NODE_FIELD(partPruneInfos);
COPY_NODE_FIELD(rtable);
COPY_NODE_FIELD(resultRelations);
COPY_NODE_FIELD(appendRelations);
@@ -253,7 +254,7 @@ _copyAppend(const Append *from)
COPY_NODE_FIELD(appendplans);
COPY_SCALAR_FIELD(nasyncplans);
COPY_SCALAR_FIELD(first_partial_plan);
- COPY_NODE_FIELD(part_prune_info);
+ COPY_SCALAR_FIELD(part_prune_index);
return newnode;
}
@@ -281,7 +282,7 @@ _copyMergeAppend(const MergeAppend *from)
COPY_POINTER_FIELD(sortOperators, from->numCols * sizeof(Oid));
COPY_POINTER_FIELD(collations, from->numCols * sizeof(Oid));
COPY_POINTER_FIELD(nullsFirst, from->numCols * sizeof(bool));
- COPY_NODE_FIELD(part_prune_info);
+ COPY_SCALAR_FIELD(part_prune_index);
return newnode;
}
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index ce12915592..72fcd8a6ee 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -321,6 +321,7 @@ _outPlannedStmt(StringInfo str, const PlannedStmt *node)
WRITE_BOOL_FIELD(parallelModeNeeded);
WRITE_INT_FIELD(jitFlags);
WRITE_NODE_FIELD(planTree);
+ WRITE_NODE_FIELD(partPruneInfos);
WRITE_NODE_FIELD(rtable);
WRITE_NODE_FIELD(resultRelations);
WRITE_NODE_FIELD(appendRelations);
@@ -450,7 +451,7 @@ _outAppend(StringInfo str, const Append *node)
WRITE_NODE_FIELD(appendplans);
WRITE_INT_FIELD(nasyncplans);
WRITE_INT_FIELD(first_partial_plan);
- WRITE_NODE_FIELD(part_prune_info);
+ WRITE_INT_FIELD(part_prune_index);
}
static void
@@ -467,7 +468,7 @@ _outMergeAppend(StringInfo str, const MergeAppend *node)
WRITE_OID_ARRAY(sortOperators, node->numCols);
WRITE_OID_ARRAY(collations, node->numCols);
WRITE_BOOL_ARRAY(nullsFirst, node->numCols);
- WRITE_NODE_FIELD(part_prune_info);
+ WRITE_INT_FIELD(part_prune_index);
}
static void
@@ -2434,6 +2435,7 @@ _outPlannerGlobal(StringInfo str, const PlannerGlobal *node)
WRITE_NODE_FIELD(finalrowmarks);
WRITE_NODE_FIELD(resultRelations);
WRITE_NODE_FIELD(appendRelations);
+ WRITE_NODE_FIELD(partPruneInfos);
WRITE_NODE_FIELD(relationOids);
WRITE_NODE_FIELD(invalItems);
WRITE_NODE_FIELD(paramExecTypes);
@@ -2501,6 +2503,7 @@ _outPlannerInfo(StringInfo str, const PlannerInfo *node)
WRITE_BITMAPSET_FIELD(curOuterRels);
WRITE_NODE_FIELD(curOuterParams);
WRITE_BOOL_FIELD(partColsUpdated);
+ WRITE_NODE_FIELD(partPruneInfos);
}
static void
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 6a05b69415..bf602ff93e 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -1817,6 +1817,7 @@ _readPlannedStmt(void)
READ_BOOL_FIELD(parallelModeNeeded);
READ_INT_FIELD(jitFlags);
READ_NODE_FIELD(planTree);
+ READ_NODE_FIELD(partPruneInfos);
READ_NODE_FIELD(rtable);
READ_NODE_FIELD(resultRelations);
READ_NODE_FIELD(appendRelations);
@@ -1949,7 +1950,7 @@ _readAppend(void)
READ_NODE_FIELD(appendplans);
READ_INT_FIELD(nasyncplans);
READ_INT_FIELD(first_partial_plan);
- READ_NODE_FIELD(part_prune_info);
+ READ_INT_FIELD(part_prune_index);
READ_DONE();
}
@@ -1971,7 +1972,7 @@ _readMergeAppend(void)
READ_OID_ARRAY(sortOperators, local_node->numCols);
READ_OID_ARRAY(collations, local_node->numCols);
READ_BOOL_ARRAY(nullsFirst, local_node->numCols);
- READ_NODE_FIELD(part_prune_info);
+ READ_INT_FIELD(part_prune_index);
READ_DONE();
}
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index 76606faa3e..58a05cf673 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -1203,7 +1203,6 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
ListCell *subpaths;
int nasyncplans = 0;
RelOptInfo *rel = best_path->path.parent;
- PartitionPruneInfo *partpruneinfo = NULL;
int nodenumsortkeys = 0;
AttrNumber *nodeSortColIdx = NULL;
Oid *nodeSortOperators = NULL;
@@ -1354,6 +1353,9 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
subplans = lappend(subplans, subplan);
}
+ /* Set below if we find quals that we can use to run-time prune */
+ plan->part_prune_index = -1;
+
/*
* If any quals exist, they may be useful to perform further partition
* pruning during execution. Gather information needed by the executor to
@@ -1377,16 +1379,14 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
}
if (prunequal != NIL)
- partpruneinfo =
- make_partition_pruneinfo(root, rel,
- best_path->subpaths,
- prunequal);
+ plan->part_prune_index = make_partition_pruneinfo(root, rel,
+ best_path->subpaths,
+ prunequal);
}
plan->appendplans = subplans;
plan->nasyncplans = nasyncplans;
plan->first_partial_plan = best_path->first_partial_path;
- plan->part_prune_info = partpruneinfo;
copy_generic_path_info(&plan->plan, (Path *) best_path);
@@ -1426,7 +1426,6 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
List *subplans = NIL;
ListCell *subpaths;
RelOptInfo *rel = best_path->path.parent;
- PartitionPruneInfo *partpruneinfo = NULL;
/*
* We don't have the actual creation of the MergeAppend node split out
@@ -1519,6 +1518,9 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
subplans = lappend(subplans, subplan);
}
+ /* Set below if we find quals that we can use to run-time prune */
+ node->part_prune_index = -1;
+
/*
* If any quals exist, they may be useful to perform further partition
* pruning during execution. Gather information needed by the executor to
@@ -1542,13 +1544,13 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
}
if (prunequal != NIL)
- partpruneinfo = make_partition_pruneinfo(root, rel,
- best_path->subpaths,
- prunequal);
+ node->part_prune_index = make_partition_pruneinfo(root, rel,
+ best_path->subpaths,
+ prunequal);
}
node->mergeplans = subplans;
- node->part_prune_info = partpruneinfo;
+
/*
* If prepare_sort_from_pathkeys added sort columns, but we were told to
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index a0f2390334..32e658b5d6 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -518,6 +518,7 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
result->dependsOnRole = glob->dependsOnRole;
result->parallelModeNeeded = glob->parallelModeNeeded;
result->planTree = top_plan;
+ result->partPruneInfos = glob->partPruneInfos;
result->rtable = glob->finalrtable;
result->resultRelations = glob->resultRelations;
result->appendRelations = glob->appendRelations;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index d95fd89807..aafe1c149d 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -348,6 +348,29 @@ set_plan_references(PlannerInfo *root, Plan *plan)
}
}
+ /* Also fix up the information in PartitionPruneInfos. */
+ foreach (lc, root->partPruneInfos)
+ {
+ PartitionPruneInfo *pruneinfo = lfirst(lc);
+ ListCell *l;
+
+ foreach(l, pruneinfo->prune_infos)
+ {
+ List *prune_infos = lfirst(l);
+ ListCell *l2;
+
+ foreach(l2, prune_infos)
+ {
+ PartitionedRelPruneInfo *pinfo = lfirst(l2);
+
+ /* RT index of the table to which the pinfo belongs. */
+ pinfo->rtindex += rtoffset;
+ }
+ }
+
+ glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
+ }
+
return result;
}
@@ -1640,21 +1663,12 @@ set_append_references(PlannerInfo *root,
aplan->apprelids = offset_relid_set(aplan->apprelids, rtoffset);
- if (aplan->part_prune_info)
- {
- foreach(l, aplan->part_prune_info->prune_infos)
- {
- List *prune_infos = lfirst(l);
- ListCell *l2;
-
- foreach(l2, prune_infos)
- {
- PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
- pinfo->rtindex += rtoffset;
- }
- }
- }
+ /*
+ * PartitionPruneInfos will be added to a list in PlannerGlobal, so update
+ * the index.
+ */
+ if (aplan->part_prune_index >= 0)
+ aplan->part_prune_index += list_length(root->glob->partPruneInfos);
/* We don't need to recurse to lefttree or righttree ... */
Assert(aplan->plan.lefttree == NULL);
@@ -1712,21 +1726,12 @@ set_mergeappend_references(PlannerInfo *root,
mplan->apprelids = offset_relid_set(mplan->apprelids, rtoffset);
- if (mplan->part_prune_info)
- {
- foreach(l, mplan->part_prune_info->prune_infos)
- {
- List *prune_infos = lfirst(l);
- ListCell *l2;
-
- foreach(l2, prune_infos)
- {
- PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
- pinfo->rtindex += rtoffset;
- }
- }
- }
+ /*
+ * PartitionPruneInfos will be added to a list in PlannerGlobal, so update
+ * the index.
+ */
+ if (mplan->part_prune_index >= 0)
+ mplan->part_prune_index += list_length(root->glob->partPruneInfos);
/* We don't need to recurse to lefttree or righttree ... */
Assert(mplan->plan.lefttree == NULL);
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 9d3c05aed3..d77f7d3aef 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -209,16 +209,20 @@ static void partkey_datum_from_expr(PartitionPruneContext *context,
/*
* make_partition_pruneinfo
- * Builds a PartitionPruneInfo which can be used in the executor to allow
- * additional partition pruning to take place. Returns NULL when
- * partition pruning would be useless.
+ * Checks if the given set of quals can be used to build pruning steps
+ * that the executor can use to prune away unneeded partitions. If
+ * suitable quals are found then a PartitionPruneInfo is built and appended
+ * to the PlannerInfo's partPruneInfos list.
+ *
+ * The return value is the 0-based index of the item added to the
+ * partPruneInfos list or -1 if nothing was added.
*
* 'parentrel' is the RelOptInfo for an appendrel, and 'subpaths' is the list
* of scan paths for its child rels.
* 'prunequal' is a list of potential pruning quals (i.e., restriction
* clauses that are applicable to the appendrel).
*/
-PartitionPruneInfo *
+int
make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
List *subpaths,
List *prunequal)
@@ -332,7 +336,7 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
* quals, then we can just not bother with run-time pruning.
*/
if (prunerelinfos == NIL)
- return NULL;
+ return -1;
/* Else build the result data structure */
pruneinfo = makeNode(PartitionPruneInfo);
@@ -358,7 +362,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
else
pruneinfo->other_subplans = NULL;
- return pruneinfo;
+ root->partPruneInfos = lappend(root->partPruneInfos, pruneinfo);
+
+ return list_length(root->partPruneInfos) - 1;
}
/*
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 708435e952..bf962af7af 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -123,9 +123,8 @@ typedef struct PartitionPruneState
extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
int n_total_subplans,
- PartitionPruneInfo *pruneinfo,
+ int part_prune_index,
Bitmapset **initially_valid_subplans);
extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
bool initial_prune);
-
#endif /* EXECPARTITION_H */
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 5728801379..25e0bb976e 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -596,6 +596,8 @@ typedef struct EState
struct ExecRowMark **es_rowmarks; /* Array of per-range-table-entry
* ExecRowMarks, or NULL if none */
PlannedStmt *es_plannedstmt; /* link to top of plan tree */
+ List *es_part_prune_infos; /* PlannedStmt.partPruneInfos */
+ struct PartitionPruneResult *es_part_prune_result; /* QueryDesc.part_prune_result */
const char *es_sourceText; /* Source text from QueryDesc */
JunkFilter *es_junkFilter; /* top-level junk filter, if any */
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index a6e5db4eec..6995b0ecec 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -107,6 +107,9 @@ typedef struct PlannerGlobal
List *appendRelations; /* "flat" list of AppendRelInfos */
+ List *partPruneInfos; /* List of PartitionPruneInfo contained in
+ * the plan */
+
List *relationOids; /* OIDs of relations the plan depends on */
List *invalItems; /* other dependencies, as PlanInvalItems */
@@ -378,6 +381,9 @@ struct PlannerInfo
/* Does this query modify any partition key columns? */
bool partColsUpdated;
+
+ /* PartitionPruneInfos added in this query's plan. */
+ List *partPruneInfos;
};
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 0ea9a22dfb..297cacfb5b 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -64,6 +64,9 @@ typedef struct PlannedStmt
struct Plan *planTree; /* tree of Plan nodes */
+ List *partPruneInfos; /* List of PartitionPruneInfo contained in
+ * the plan */
+
List *rtable; /* list of RangeTblEntry nodes */
/* rtable indexes of target relations for INSERT/UPDATE/DELETE */
@@ -262,8 +265,8 @@ typedef struct Append
*/
int first_partial_plan;
- /* Info for run-time subplan pruning; NULL if we're not doing that */
- struct PartitionPruneInfo *part_prune_info;
+ /* Index to PlannerInfo.partPruneInfos or -1 if no run-time pruning */
+ int part_prune_index;
} Append;
/* ----------------
@@ -282,8 +285,9 @@ typedef struct MergeAppend
Oid *sortOperators; /* OIDs of operators to sort them by */
Oid *collations; /* OIDs of collations */
bool *nullsFirst; /* NULLS FIRST/LAST directions */
- /* Info for run-time subplan pruning; NULL if we're not doing that */
- struct PartitionPruneInfo *part_prune_info;
+
+ /* Index to PlannerInfo.partPruneInfos or -1 if no run-time pruning */
+ int part_prune_index;
} MergeAppend;
/* ----------------
diff --git a/src/include/partitioning/partprune.h b/src/include/partitioning/partprune.h
index 90684efa25..ebf0dcff8c 100644
--- a/src/include/partitioning/partprune.h
+++ b/src/include/partitioning/partprune.h
@@ -70,10 +70,10 @@ typedef struct PartitionPruneContext
#define PruneCxtStateIdx(partnatts, step_id, keyno) \
((partnatts) * (step_id) + (keyno))
-extern PartitionPruneInfo *make_partition_pruneinfo(struct PlannerInfo *root,
- struct RelOptInfo *parentrel,
- List *subpaths,
- List *prunequal);
+extern int make_partition_pruneinfo(struct PlannerInfo *root,
+ struct RelOptInfo *parentrel,
+ List *subpaths,
+ List *prunequal);
extern Bitmapset *prune_append_rel_partitions(struct RelOptInfo *rel);
extern Bitmapset *get_matching_partitions(PartitionPruneContext *context,
List *pruning_steps);
--
2.35.3
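
The index bookkeeping in the patch above (make_partition_pruneinfo() returning a 0-based index, and setrefs.c shifting that index when a subquery's list is folded into the global one) can be sketched in isolation. This is only an illustration under simplifying assumptions: DemoPruneInfo, DemoPruneInfoList and the demo_* functions are hypothetical stand-ins made up for this example, not PostgreSQL types or APIs, and a fixed-size array replaces the List.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_INFOS 16

/* Hypothetical stand-in for PartitionPruneInfo; not the PostgreSQL struct. */
typedef struct DemoPruneInfo
{
	int			rtindex;	/* RT index of the partitioned table */
} DemoPruneInfo;

/* Hypothetical stand-in for a partPruneInfos list. */
typedef struct DemoPruneInfoList
{
	DemoPruneInfo items[DEMO_MAX_INFOS];
	int			nitems;
} DemoPruneInfoList;

/*
 * Append an entry and return its 0-based index, which is what a plan node
 * would remember in place of a pointer.
 */
static int
demo_add_pruneinfo(DemoPruneInfoList *list, int rtindex)
{
	assert(list->nitems < DEMO_MAX_INFOS);
	list->items[list->nitems].rtindex = rtindex;
	return list->nitems++;
}

/*
 * Fold a per-query list into the global list, shifting RT indexes by
 * rtoffset.  Returns the global list's length before the merge, which is
 * the amount by which the per-query indexes must be shifted.
 */
static int
demo_merge_into_global(DemoPruneInfoList *global,
					   const DemoPruneInfoList *local, int rtoffset)
{
	int			base = global->nitems;

	for (int i = 0; i < local->nitems; i++)
		demo_add_pruneinfo(global, local->items[i].rtindex + rtoffset);
	return base;
}

/* Shift a node's index; -1 (no run-time pruning) is left untouched. */
static int
demo_adjust_index(int part_prune_index, int base)
{
	return (part_prune_index >= 0) ? part_prune_index + base : -1;
}
```

One consequence of storing an index rather than a pointer, as this sketch assumes, is that the plan node stays valid after the list is copied or serialized, since only the position in the list matters.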
[application/octet-stream] v16-0002-Optimize-AcquireExecutorLocks-by-locking-only-un.patch (87.1K, 3-v16-0002-Optimize-AcquireExecutorLocks-by-locking-only-un.patch)
From 6654d7c2b5c54d69d3f8a0136cfaf5593a3b7aae Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Wed, 22 Dec 2021 16:55:17 +0900
Subject: [PATCH v16 2/2] Optimize AcquireExecutorLocks() by locking only
unpruned partitions
This commit teaches AcquireExecutorLocks() to perform initial
partition pruning to notionally eliminate the subnodes contained in a
generic cached plan that need not be initialized during the actual
execution of the plan, and to skip locking the partitions scanned by
those subnodes.
The result of performing initial partition pruning this way, before the
actual execution has started, is passed to the executor as a
PartitionPruneResult, which the callers of the executor that used
plancache.c to get the plan supply along with the PlannedStmt. It is NULL
when the plan is obtained by calling the planner directly or when the
plan obtained from plancache.c is not a generic one.
---
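
As a rough illustration of the idea in the commit message (lock only what the surviving subplans scan), the following sketch maps a set of valid subplan offsets to the set of leaf-partition RT indexes that would need locking. It is not the patch's code: DemoAppend and demo_rels_to_lock() are hypothetical names, plain integer bitmasks stand in for Bitmapset, and each subplan is assumed to scan exactly one leaf partition.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_MAX_SUBPLANS 32

/*
 * Hypothetical stand-in for an Append node: subplan i scans the leaf
 * partition whose range table index is leaf_rti[i].
 */
typedef struct DemoAppend
{
	int			nsubplans;
	int			leaf_rti[DEMO_MAX_SUBPLANS];
} DemoAppend;

/*
 * Given a bitmask of subplan offsets that survived initial pruning, return
 * a bitmask of the RT indexes that still need to be locked.  Partitions
 * scanned only by pruned subplans are left out.
 */
static uint64_t
demo_rels_to_lock(const DemoAppend *append, uint32_t valid_subplans)
{
	uint64_t	rtis = 0;

	assert(append->nsubplans <= DEMO_MAX_SUBPLANS);
	for (int i = 0; i < append->nsubplans; i++)
	{
		if (valid_subplans & (UINT32_C(1) << i))
		{
			/* RT indexes are 1-based; keep them within the mask width */
			assert(append->leaf_rti[i] > 0 && append->leaf_rti[i] < 64);
			rtis |= UINT64_C(1) << append->leaf_rti[i];
		}
	}
	return rtis;
}
```

The pruned subplans are not removed from the plan in this sketch either; they are simply skipped, which is why the same set of valid offsets has to be handed to whoever later initializes the node.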
src/backend/commands/copyto.c | 2 +-
src/backend/commands/createas.c | 2 +-
src/backend/commands/explain.c | 7 +-
src/backend/commands/extension.c | 2 +-
src/backend/commands/matview.c | 2 +-
src/backend/commands/prepare.c | 26 ++-
src/backend/executor/README | 27 +++
src/backend/executor/execMain.c | 53 ++++++
src/backend/executor/execParallel.c | 27 ++-
src/backend/executor/execPartition.c | 234 +++++++++++++++++++++----
src/backend/executor/functions.c | 2 +-
src/backend/executor/nodeAppend.c | 11 +-
src/backend/executor/nodeMergeAppend.c | 5 +-
src/backend/executor/spi.c | 27 ++-
src/backend/nodes/copyfuncs.c | 27 +++
src/backend/nodes/outfuncs.c | 29 +++
src/backend/nodes/readfuncs.c | 51 ++++++
src/backend/optimizer/plan/planner.c | 2 +
src/backend/optimizer/plan/setrefs.c | 45 +++++
src/backend/partitioning/partprune.c | 41 ++++-
src/backend/tcop/postgres.c | 8 +-
src/backend/tcop/pquery.c | 28 ++-
src/backend/utils/cache/plancache.c | 184 ++++++++++++++++---
src/backend/utils/mmgr/portalmem.c | 19 ++
src/include/commands/explain.h | 4 +-
src/include/executor/execPartition.h | 9 +-
src/include/executor/execdesc.h | 3 +
src/include/executor/executor.h | 2 +
src/include/nodes/execnodes.h | 28 +++
src/include/nodes/nodes.h | 4 +
src/include/nodes/pathnodes.h | 9 +
src/include/nodes/plannodes.h | 19 ++
src/include/utils/plancache.h | 3 +-
src/include/utils/portal.h | 3 +
34 files changed, 849 insertions(+), 96 deletions(-)
diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index fca29a9a10..d839517693 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -541,7 +541,7 @@ BeginCopyTo(ParseState *pstate,
((DR_copy *) dest)->cstate = cstate;
/* Create a QueryDesc requesting no output */
- cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ cstate->queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
GetActiveSnapshot(),
InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 9abbb6b555..f6607f2454 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -325,7 +325,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 5d1f7089da..111d384982 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -407,7 +407,7 @@ ExplainOneQuery(Query *query, int cursorOptions,
}
/* run it (if needed) and produce output */
- ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+ ExplainOnePlan(plan, NULL, into, es, queryString, params, queryEnv,
&planduration, (es->buffers ? &bufusage : NULL));
}
}
@@ -515,7 +515,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
* to call it.
*/
void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, PartitionPruneResult *part_prune_result,
+ IntoClause *into, ExplainState *es,
const char *queryString, ParamListInfo params,
QueryEnvironment *queryEnv, const instr_time *planduration,
const BufferUsage *bufusage)
@@ -563,7 +564,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
dest = None_Receiver;
/* Create a QueryDesc for the query */
- queryDesc = CreateQueryDesc(plannedstmt, queryString,
+ queryDesc = CreateQueryDesc(plannedstmt, part_prune_result, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, instrument_option);
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index 767d9b9619..1d55a23ded 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -776,7 +776,7 @@ execute_sql_string(const char *sql)
{
QueryDesc *qdesc;
- qdesc = CreateQueryDesc(stmt,
+ qdesc = CreateQueryDesc(stmt, NULL,
sql,
GetActiveSnapshot(), NULL,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index d1ee106465..e878209674 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -408,7 +408,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, queryString,
+ queryDesc = CreateQueryDesc(plan, NULL, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 80738547ed..c7360712b1 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -155,6 +155,7 @@ ExecuteQuery(ParseState *pstate,
PreparedStatement *entry;
CachedPlan *cplan;
List *plan_list;
+ List *part_prune_result_list;
ParamListInfo paramLI = NULL;
EState *estate = NULL;
Portal portal;
@@ -193,7 +194,10 @@ ExecuteQuery(ParseState *pstate,
entry->plansource->query_string);
/* Replan if needed, and increment plan refcount for portal */
- cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL);
+ cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL,
+ &part_prune_result_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_result_list));
plan_list = cplan->stmt_list;
/*
@@ -207,6 +211,9 @@ ExecuteQuery(ParseState *pstate,
plan_list,
cplan);
+ /* Copy PartitionPruneResults into the portal's context. */
+ PortalStorePartitionPruneResults(portal, part_prune_result_list);
+
/*
* For CREATE TABLE ... AS EXECUTE, we must verify that the prepared
* statement is one that produces tuples. Currently we insist that it be
@@ -576,7 +583,9 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
const char *query_string;
CachedPlan *cplan;
List *plan_list;
- ListCell *p;
+ List *part_prune_result_list;
+ ListCell *p,
+ *pp;
ParamListInfo paramLI = NULL;
EState *estate = NULL;
instr_time planstart;
@@ -619,7 +628,10 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
/* Replan if needed, and acquire a transient refcount */
cplan = GetCachedPlan(entry->plansource, paramLI,
- CurrentResourceOwner, queryEnv);
+ CurrentResourceOwner, queryEnv,
+ &part_prune_result_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_result_list));
INSTR_TIME_SET_CURRENT(planduration);
INSTR_TIME_SUBTRACT(planduration, planstart);
@@ -634,13 +646,15 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
plan_list = cplan->stmt_list;
/* Explain each query */
- foreach(p, plan_list)
+ forboth(p, plan_list, pp, part_prune_result_list)
{
PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
+ PartitionPruneResult *part_prune_result = lfirst_node(PartitionPruneResult, pp);
if (pstmt->commandType != CMD_UTILITY)
- ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv,
- &planduration, (es->buffers ? &bufusage : NULL));
+ ExplainOnePlan(pstmt, part_prune_result, into, es, query_string,
+ paramLI, queryEnv, &planduration,
+ (es->buffers ? &bufusage : NULL));
else
ExplainOneUtility(pstmt->utilityStmt, into, es, query_string,
paramLI, queryEnv);
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 0b5183fc4a..e0802be723 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -65,6 +65,29 @@ found there. This currently only occurs for Append and MergeAppend nodes. In
this case the non-required subplans are ignored and the executor state's
subnode array will become out of sequence to the plan's subplan list.
+Actually, so-called execution-time pruning may also occur before
+execution has started. One case where that occurs is when a cached generic
+plan is being validated for execution by plancache.c: GetCachedPlan(), which
+proceeds by locking all the relations that will be scanned by that plan. If
+the generic plan contains nodes that can perform execution time partition
+pruning (that is, contain a PartitionPruneInfo), a subset of pruning steps
+contained in the PartitionPruneInfos that do not depend on execution actually
+having started (called "initial" pruning steps) are performed at this point
+to figure out the minimal set of child subplans that satisfy those pruning
+instructions. AcquireExecutorLocks() looking at a particular plan will then
+lock only the relations scanned by those surviving subplans (along with those
+present in PlannedStmt.minLockRelids), and ignore those scanned by the pruned
+subplans, even though the pruned subplans themselves are not removed from the
+plan tree. The result of pruning (that is, the set of indexes of surviving
+subplans in their parent's list of child subplans) is saved as a list of
+bitmapsets, with one element for every PartitionPruneInfo referenced in the
+plan (PlannedStmt.partPruneInfos). The list is packaged into a
+PartitionPruneResult node, which is passed along with the PlannedStmt to the
+executor via the QueryDesc. It is imperative that the executor and any third
+party code invoked by it that gets passed the plan tree look at the plan's
+PartitionPruneResult to determine whether a particular child subplan of a
+parent node that supports pruning is valid for a given execution.
+
Each Plan node may have expression trees associated with it, to represent
its target list, qualification conditions, etc. These trees are also
read-only to the executor, but the executor state for expression evaluation
@@ -286,6 +309,10 @@ Query Processing Control Flow
This is a sketch of control flow for full query processing:
+ [ ExecutorDoInitialPruning ] --- an optional step to perform initial
+ partition pruning on the plan tree, the result of which is passed
+ to the executor via QueryDesc
+
CreateQueryDesc
ExecutorStart
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 72fc273524..45824624f8 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -49,6 +49,7 @@
#include "commands/matview.h"
#include "commands/trigger.h"
#include "executor/execdebug.h"
+#include "executor/execPartition.h"
#include "executor/nodeSubplan.h"
#include "foreign/fdwapi.h"
#include "jit/jit.h"
@@ -104,6 +105,56 @@ static void EvalPlanQualStart(EPQState *epqstate, Plan *planTree);
/* end of local decls */
+/* ----------------------------------------------------------------
+ * ExecutorDoInitialPruning
+ *
+ * For each plan tree node that has been assigned a PartitionPruneInfo,
+ * this performs initial partition pruning using the information contained
+ * therein to determine the set of child subplans that satisfy the initial
+ * pruning steps, to be returned as a bitmapset of their indexes in the
+ * node's list of child subplans (for example, an Append's appendplans).
+ *
+ * Return value is a PartitionPruneResult node that contains a list of those
+ * bitmapsets, with one element for every PartitionPruneInfo, and a bitmapset
+ * of the RT indexes of all the leaf partitions scanned by those chosen
+ * subplans. Note that the latter is shared across all PartitionPruneInfos.
+ *
+ * The executor must see exactly the same set of subplans as valid for
+ * execution when doing ExecInitNode() on the plan nodes whose
+ * PartitionPruneInfos are processed here. So, it must get the set from the
+ * aforementioned PartitionPruneResult, instead of computing it all over
+ * again by redoing the initial pruning. It's the caller's job to pass the
+ * PartitionPruneResult to the executor.
+ *
+ * Note: Partitioned tables mentioned in PartitionedRelPruneInfo nodes that
+ * drive the pruning will be locked before doing the pruning.
+ * ----------------------------------------------------------------
+ */
+PartitionPruneResult *
+ExecutorDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params)
+{
+ PartitionPruneResult *result;
+ ListCell *lc;
+
+ /* Only get here if there is any pruning to do. */
+ Assert(plannedstmt->containsInitialPruning);
+
+ result = makeNode(PartitionPruneResult);
+ foreach(lc, plannedstmt->partPruneInfos)
+ {
+ PartitionPruneInfo *pruneinfo = lfirst(lc);
+ Bitmapset *valid_subplan_offs;
+
+ valid_subplan_offs =
+ ExecPartitionDoInitialPruning(plannedstmt, params, pruneinfo,
+ &result->scan_leafpart_rtis);
+ result->valid_subplan_offs_list =
+ lappend(result->valid_subplan_offs_list,
+ valid_subplan_offs);
+ }
+
+ return result;
+}
/* ----------------------------------------------------------------
* ExecutorStart
@@ -806,6 +857,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
{
CmdType operation = queryDesc->operation;
PlannedStmt *plannedstmt = queryDesc->plannedstmt;
+ PartitionPruneResult *part_prune_result = queryDesc->part_prune_result;
Plan *plan = plannedstmt->planTree;
List *rangeTable = plannedstmt->rtable;
EState *estate = queryDesc->estate;
@@ -826,6 +878,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
estate->es_plannedstmt = plannedstmt;
estate->es_part_prune_infos = plannedstmt->partPruneInfos;
+ estate->es_part_prune_result = part_prune_result;
/*
* Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index f73b8c2607..7e6dab5623 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -66,6 +66,7 @@
#define PARALLEL_KEY_QUERY_TEXT UINT64CONST(0xE000000000000008)
#define PARALLEL_KEY_JIT_INSTRUMENTATION UINT64CONST(0xE000000000000009)
#define PARALLEL_KEY_WAL_USAGE UINT64CONST(0xE00000000000000A)
+#define PARALLEL_KEY_PARTITIONPRUNERESULT UINT64CONST(0xE00000000000000B)
#define PARALLEL_TUPLE_QUEUE_SIZE 65536
@@ -182,6 +183,7 @@ ExecSerializePlan(Plan *plan, EState *estate)
pstmt->transientPlan = false;
pstmt->dependsOnRole = false;
pstmt->parallelModeNeeded = false;
+ pstmt->containsInitialPruning = false;
pstmt->planTree = plan;
pstmt->partPruneInfos = estate->es_part_prune_infos;
pstmt->rtable = estate->es_range_table;
@@ -597,12 +599,15 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
FixedParallelExecutorState *fpes;
char *pstmt_data;
char *pstmt_space;
+ char *part_prune_result_data;
+ char *part_prune_result_space;
char *paramlistinfo_space;
BufferUsage *bufusage_space;
WalUsage *walusage_space;
SharedExecutorInstrumentation *instrumentation = NULL;
SharedJitInstrumentation *jit_instrumentation = NULL;
int pstmt_len;
+ int part_prune_result_len;
int paramlistinfo_len;
int instrumentation_len = 0;
int jit_instrumentation_len = 0;
@@ -631,6 +636,7 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
/* Fix up and serialize plan to be sent to workers. */
pstmt_data = ExecSerializePlan(planstate->plan, estate);
+ part_prune_result_data = nodeToString(estate->es_part_prune_result);
/* Create a parallel context. */
pcxt = CreateParallelContext("postgres", "ParallelQueryMain", nworkers);
@@ -657,6 +663,11 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
shm_toc_estimate_chunk(&pcxt->estimator, pstmt_len);
shm_toc_estimate_keys(&pcxt->estimator, 1);
+ /* Estimate space for serialized PartitionPruneResult. */
+ part_prune_result_len = strlen(part_prune_result_data) + 1;
+ shm_toc_estimate_chunk(&pcxt->estimator, part_prune_result_len);
+ shm_toc_estimate_keys(&pcxt->estimator, 1);
+
/* Estimate space for serialized ParamListInfo. */
paramlistinfo_len = EstimateParamListSpace(estate->es_param_list_info);
shm_toc_estimate_chunk(&pcxt->estimator, paramlistinfo_len);
@@ -751,6 +762,12 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
memcpy(pstmt_space, pstmt_data, pstmt_len);
shm_toc_insert(pcxt->toc, PARALLEL_KEY_PLANNEDSTMT, pstmt_space);
+ /* Store serialized PartitionPruneResult */
+ part_prune_result_space = shm_toc_allocate(pcxt->toc, part_prune_result_len);
+ memcpy(part_prune_result_space, part_prune_result_data, part_prune_result_len);
+ shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARTITIONPRUNERESULT,
+ part_prune_result_space);
+
/* Store serialized ParamListInfo. */
paramlistinfo_space = shm_toc_allocate(pcxt->toc, paramlistinfo_len);
shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARAMLISTINFO, paramlistinfo_space);
@@ -1232,8 +1249,10 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
int instrument_options)
{
char *pstmtspace;
+ char *part_prune_result_space;
char *paramspace;
PlannedStmt *pstmt;
+ PartitionPruneResult *part_prune_result;
ParamListInfo paramLI;
char *queryString;
@@ -1244,12 +1263,18 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
pstmtspace = shm_toc_lookup(toc, PARALLEL_KEY_PLANNEDSTMT, false);
pstmt = (PlannedStmt *) stringToNode(pstmtspace);
+ /* Reconstruct leader-supplied PartitionPruneResult. */
+ part_prune_result_space =
+ shm_toc_lookup(toc, PARALLEL_KEY_PARTITIONPRUNERESULT, false);
+ part_prune_result = (PartitionPruneResult *)
+ stringToNode(part_prune_result_space);
+
/* Reconstruct ParamListInfo. */
paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
paramLI = RestoreParamList(&paramspace);
/* Create a QueryDesc for the query. */
- return CreateQueryDesc(pstmt,
+ return CreateQueryDesc(pstmt, part_prune_result,
queryString,
GetActiveSnapshot(), InvalidSnapshot,
receiver, paramLI, NULL, instrument_options);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index b55cdd2580..86227301e9 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -25,6 +25,7 @@
#include "mb/pg_wchar.h"
#include "miscadmin.h"
#include "nodes/makefuncs.h"
+#include "parser/parsetree.h"
#include "partitioning/partbounds.h"
#include "partitioning/partdesc.h"
#include "partitioning/partprune.h"
@@ -185,7 +186,11 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
static PartitionPruneState *CreatePartitionPruneState(PlanState *planstate,
- PartitionPruneInfo *pruneinfo);
+ PartitionPruneInfo *pruneinfo,
+ bool consider_initial_steps,
+ bool consider_exec_steps,
+ List *rtable, ExprContext *econtext,
+ PartitionDirectory partdir);
static void InitPartitionPruneContext(PartitionPruneContext *context,
List *pruning_steps,
PartitionDesc partdesc,
@@ -198,7 +203,8 @@ static void PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
- Bitmapset **validsubplans);
+ Bitmapset **validsubplans,
+ Bitmapset **scan_leafpart_rtis);
/*
@@ -1593,8 +1599,10 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
* considered to be a stable expression, it can change value from one plan
* node scan to the next during query execution. Stable comparison
* expressions that don't involve such Params allow partition pruning to be
- * done once during executor startup. Expressions that do involve such Params
- * require us to prune separately for each scan of the parent plan node.
+ * done once during executor startup or during ExecutorDoInitialPruning(),
+ * which runs as part of performing AcquireExecutorLocks() on a given plan tree.
+ * Expressions that do involve such Params require us to prune separately for
+ * each scan of the parent plan node.
*
* Note that pruning away unneeded subplans during executor startup has the
* added benefit of not having to initialize the unneeded subplans at all.
@@ -1611,6 +1619,13 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
* account for initial pruning possibly having eliminated some of the
* subplans.
*
+ * ExecPartitionDoInitialPruning:
+ * Do initial pruning with the information contained in a given
+ * PartitionPruneInfo to determine the minimal set of child subplans
+ * of the parent plan node (the one to which the PartitionPruneInfo
+ * belongs) that must be executed, and also the set of RT indexes of the
+ * leaf partitions that will be scanned by those subplans.
+ *
* ExecFindMatchingSubPlans:
* Returns indexes of matching subplans after evaluating the expressions
* that are safe to evaluate at a given point. This function is first
@@ -1628,8 +1643,9 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
*
* On return, *initially_valid_subplans is assigned the set of indexes of
* child subplans that must be initialized along with the parent plan node.
- * Initial pruning is performed here if needed and in that case only the
- * surviving subplans' indexes are added.
+ * Initial pruning is performed here if needed (unless it has already been done
+ * by ExecutorDoInitialPruning()), and in that case only the surviving
+ * subplans' indexes are added.
*
* If subplans are indeed pruned, subplan_map arrays contained in the returned
* PartitionPruneState are re-sequenced to not count those, though only if the
@@ -1645,24 +1661,59 @@ ExecInitPartitionPruning(PlanState *planstate,
EState *estate = planstate->state;
PartitionPruneInfo *pruneinfo = list_nth(estate->es_part_prune_infos,
part_prune_index);
+ PartitionPruneResult *pruneresult = estate->es_part_prune_result;
+ bool do_pruning = (pruneinfo->needs_init_pruning ||
+ pruneinfo->needs_exec_pruning);
- /* We may need an expression context to evaluate partition exprs */
- ExecAssignExprContext(estate, planstate);
+ /*
+ * No need to do initial pruning if it was done already by
+ * ExecutorDoInitialPruning(), which it would be if es_part_prune_result
+ * has been set.
+ */
+ if (pruneresult)
+ do_pruning = pruneinfo->needs_exec_pruning;
- /* Create the working data structure for pruning */
- prunestate = CreatePartitionPruneState(planstate, pruneinfo);
+ prunestate = NULL;
+ if (do_pruning)
+ {
+ /* We may need an expression context to evaluate partition exprs */
+ ExecAssignExprContext(estate, planstate);
+
+ /* For data reading, executor always omits detached partitions */
+ if (estate->es_partition_directory == NULL)
+ estate->es_partition_directory =
+ CreatePartitionDirectory(estate->es_query_cxt, false);
+
+ /*
+ * Create the working data structure for pruning. No need to consider
+ * initial pruning steps if we have a PartitionPruneResult.
+ */
+ prunestate = CreatePartitionPruneState(planstate, pruneinfo,
+ pruneresult == NULL, true,
+ NIL, planstate->ps_ExprContext,
+ estate->es_partition_directory);
+ }
/*
* Perform an initial partition prune pass, if required.
*/
- if (prunestate->do_initial_prune)
- *initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true);
+ if (pruneresult)
+ {
+ *initially_valid_subplans =
+ list_nth(pruneresult->valid_subplan_offs_list, part_prune_index);
+ }
+ else if (prunestate && prunestate->do_initial_prune)
+ {
+ *initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true,
+ NULL);
+ }
else
{
/* No pruning, so we'll need to initialize all subplans */
Assert(n_total_subplans > 0);
*initially_valid_subplans = bms_add_range(NULL, 0,
n_total_subplans - 1);
+ return prunestate;
}
/*
@@ -1670,7 +1721,8 @@ ExecInitPartitionPruning(PlanState *planstate,
* that were removed above due to initial pruning. No need to do this if
* no steps were removed.
*/
- if (bms_num_members(*initially_valid_subplans) < n_total_subplans)
+ if (prunestate &&
+ bms_num_members(*initially_valid_subplans) < n_total_subplans)
{
/*
* We can safely skip this when !do_exec_prune, even though that
@@ -1686,11 +1738,73 @@ ExecInitPartitionPruning(PlanState *planstate,
return prunestate;
}
+/*
+ * ExecPartitionDoInitialPruning
+ * Perform initial pruning using given PartitionPruneInfo to determine
+ * the minimal set of child subplans that will be executed and also the
+ * set of RT indexes of the leaf partitions scanned by those subplans.
+ */
+Bitmapset *
+ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
+ PartitionPruneInfo *pruneinfo,
+ Bitmapset **scan_leafpart_rtis)
+{
+ List *rtable = plannedstmt->rtable;
+ ExprContext *econtext;
+ PartitionDirectory pdir;
+ MemoryContext oldcontext,
+ tmpcontext;
+ PartitionPruneState *prunestate;
+ Bitmapset *valid_subplan_offs;
+
+ /*
+ * A temporary context for memory allocations required while executing
+ * partition pruning steps.
+ */
+ tmpcontext = AllocSetContextCreate(CurrentMemoryContext,
+ "initial pruning working data",
+ ALLOCSET_DEFAULT_SIZES);
+ oldcontext = MemoryContextSwitchTo(tmpcontext);
+
+ /*
+ * PartitionDirectory to look up partition descriptors, which omits
+ * detached partitions, just like in the executor proper.
+ */
+ pdir = CreatePartitionDirectory(CurrentMemoryContext, false);
+
+ /*
+ * We don't yet have a PlanState for the parent plan node, so we must
+ * create a standalone ExprContext to evaluate pruning expressions,
+ * equipped with the information about the EXTERN parameters that the
+ * caller passed us. Note that that's okay because the initial pruning
+ * steps do not contain anything that requires the execution to have
+ * started.
+ */
+ econtext = CreateStandaloneExprContext();
+ econtext->ecxt_param_list_info = params;
+ prunestate = CreatePartitionPruneState(NULL, pruneinfo, true, false,
+ rtable, econtext, pdir);
+ MemoryContextSwitchTo(oldcontext);
+
+ /* Do the initial pruning. */
+ valid_subplan_offs = ExecFindMatchingSubPlans(prunestate, true,
+ scan_leafpart_rtis);
+
+ FreeExprContext(econtext, true);
+ DestroyPartitionDirectory(pdir);
+ MemoryContextDelete(tmpcontext);
+
+ return valid_subplan_offs;
+}
+
/*
* CreatePartitionPruneState
* Build the data structure required for calling ExecFindMatchingSubPlans
*
- * 'planstate' is the parent plan node's execution state.
+ * 'planstate', if not NULL, is the parent plan node's execution state. It
+ * can be NULL if being called before ExecutorStart(), in which case,
+ * 'rtable' (range table), 'econtext', and 'partdir' must be explicitly
+ * provided.
*
* 'pruneinfo' is a PartitionPruneInfo as generated by
* make_partition_pruneinfo. Here we build a PartitionPruneState containing a
@@ -1704,19 +1818,21 @@ ExecInitPartitionPruning(PlanState *planstate,
* PartitionedRelPruneInfo.
*/
static PartitionPruneState *
-CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
+CreatePartitionPruneState(PlanState *planstate,
+ PartitionPruneInfo *pruneinfo,
+ bool consider_initial_steps,
+ bool consider_exec_steps,
+ List *rtable, ExprContext *econtext,
+ PartitionDirectory partdir)
{
- EState *estate = planstate->state;
+ EState *estate = planstate ? planstate->state : NULL;
PartitionPruneState *prunestate;
int n_part_hierarchies;
ListCell *lc;
int i;
- ExprContext *econtext = planstate->ps_ExprContext;
- /* For data reading, executor always omits detached partitions */
- if (estate->es_partition_directory == NULL)
- estate->es_partition_directory =
- CreatePartitionDirectory(estate->es_query_cxt, false);
+ Assert((estate != NULL) ||
+ (partdir != NULL && econtext != NULL && rtable != NIL));
n_part_hierarchies = list_length(pruneinfo->prune_infos);
Assert(n_part_hierarchies > 0);
@@ -1771,15 +1887,42 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
PartitionKey partkey;
/*
- * We can rely on the copies of the partitioned table's partition
- * key and partition descriptor appearing in its relcache entry,
- * because that entry will be held open and locked for the
- * duration of this executor run.
+ * We must open the relation ourselves when called before
+ * execution has started, such as when called during
+ * ExecutorDoInitialPruning() on a cached plan. In that case,
+ * sub-partitions must be locked, because AcquirePlannerLocks()
+ * would not have seen them. (The first relation in a
+ * partrelpruneinfos list is always the root partitioned table
+ * appearing in the query, which AcquirePlannerLocks() would have
+ * locked; the Assert in relation_open() guards that assumption.)
+ */
+ if (estate == NULL)
+ {
+ RangeTblEntry *rte = rt_fetch(pinfo->rtindex, rtable);
+ int lockmode = (j == 0) ? NoLock : rte->rellockmode;
+
+ partrel = table_open(rte->relid, lockmode);
+ }
+ else
+ partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
+
+ /*
+ * We can rely on the copy of the partitioned table's partition
+ * key in its relcache entry, because it can't change (or get
+ * destroyed) as long as the relation is locked. The partition
+ * descriptor is taken from the PartitionDirectory associated with
+ * the table that is held open long enough for the descriptor to
+ * remain valid while it's used to perform the pruning steps.
*/
- partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
partkey = RelationGetPartitionKey(partrel);
- partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
- partrel);
+ partdesc = PartitionDirectoryLookup(partdir, partrel);
+
+ /*
+ * Must close partrel, keeping the lock taken, if we're not using
+ * EState's entry.
+ */
+ if (estate == NULL)
+ table_close(partrel, NoLock);
/*
* Initialize the subplan_map and subpart_map.
@@ -1793,6 +1936,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
Assert(partdesc->nparts >= pinfo->nparts);
pprune->nparts = partdesc->nparts;
pprune->subplan_map = palloc(sizeof(int) * partdesc->nparts);
+ pprune->rti_map = palloc(sizeof(Index) * partdesc->nparts);
if (partdesc->nparts == pinfo->nparts)
{
/*
@@ -1803,6 +1947,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
pprune->subpart_map = pinfo->subpart_map;
memcpy(pprune->subplan_map, pinfo->subplan_map,
sizeof(int) * pinfo->nparts);
+ memcpy(pprune->rti_map, pinfo->rti_map,
+ sizeof(Index) * pinfo->nparts);
/*
* Double-check that the list of unpruned relations has not
@@ -1853,6 +1999,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
pinfo->subplan_map[pd_idx];
pprune->subpart_map[pp_idx] =
pinfo->subpart_map[pd_idx];
+ pprune->rti_map[pp_idx] =
+ pinfo->rti_map[pd_idx];
pd_idx++;
}
else
@@ -1860,6 +2008,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
/* this partdesc entry is not in the plan */
pprune->subplan_map[pp_idx] = -1;
pprune->subpart_map[pp_idx] = -1;
+ pprune->rti_map[pp_idx] = 0;
}
}
@@ -1881,7 +2030,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
* Initialize pruning contexts as needed.
*/
pprune->initial_pruning_steps = pinfo->initial_pruning_steps;
- if (pinfo->initial_pruning_steps)
+ if (consider_initial_steps && pinfo->initial_pruning_steps)
{
InitPartitionPruneContext(&pprune->initial_context,
pinfo->initial_pruning_steps,
@@ -1891,7 +2040,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
prunestate->do_initial_prune = true;
}
pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
- if (pinfo->exec_pruning_steps)
+ if (consider_exec_steps && pinfo->exec_pruning_steps)
{
InitPartitionPruneContext(&pprune->exec_context,
pinfo->exec_pruning_steps,
@@ -2119,10 +2268,14 @@ PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
* Pass initial_prune if PARAM_EXEC Params cannot yet be evaluated. This
* differentiates the initial executor-time pruning step from later
* runtime pruning.
+ *
+ * RT indexes of leaf partitions scanned by the chosen subplans are added to
+ * *scan_leafpart_rtis if the pointer is non-NULL.
*/
Bitmapset *
ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
- bool initial_prune)
+ bool initial_prune,
+ Bitmapset **scan_leafpart_rtis)
{
Bitmapset *result = NULL;
MemoryContext oldcontext;
@@ -2157,7 +2310,7 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
*/
pprune = &prunedata->partrelprunedata[0];
find_matching_subplans_recurse(prunedata, pprune, initial_prune,
- &result);
+ &result, scan_leafpart_rtis);
/* Expression eval may have used space in ExprContext too */
if (pprune->exec_pruning_steps)
@@ -2171,6 +2324,8 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
/* Copy result out of the temp context before we reset it */
result = bms_copy(result);
+ if (scan_leafpart_rtis)
+ *scan_leafpart_rtis = bms_copy(*scan_leafpart_rtis);
MemoryContextReset(prunestate->prune_context);
@@ -2181,13 +2336,15 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
* find_matching_subplans_recurse
* Recursive worker function for ExecFindMatchingSubPlans
*
- * Adds valid (non-prunable) subplan IDs to *validsubplans
+ * Adds valid (non-prunable) subplan IDs to *validsubplans and RT indexes of
+ * the corresponding leaf partitions to *scan_leafpart_rtis (if asked for).
*/
static void
find_matching_subplans_recurse(PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
- Bitmapset **validsubplans)
+ Bitmapset **validsubplans,
+ Bitmapset **scan_leafpart_rtis)
{
Bitmapset *partset;
int i;
@@ -2214,8 +2371,14 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
while ((i = bms_next_member(partset, i)) >= 0)
{
if (pprune->subplan_map[i] >= 0)
+ {
*validsubplans = bms_add_member(*validsubplans,
pprune->subplan_map[i]);
+ Assert(pprune->rti_map[i] > 0);
+ if (scan_leafpart_rtis)
+ *scan_leafpart_rtis = bms_add_member(*scan_leafpart_rtis,
+ pprune->rti_map[i]);
+ }
else
{
int partidx = pprune->subpart_map[i];
@@ -2223,7 +2386,8 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
if (partidx >= 0)
find_matching_subplans_recurse(prunedata,
&prunedata->partrelprunedata[partidx],
- initial_prune, validsubplans);
+ initial_prune, validsubplans,
+ scan_leafpart_rtis);
else
{
/*
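For readers following the execPartition.c changes above, here is a minimal standalone sketch of how the recursive worker collects both the surviving subplan indexes and the RT indexes of the leaf partitions. It is not the patch's code: the ToyPruneRel struct, the toy_* function names, and the use of 64-bit masks in place of Bitmapset are all assumptions made for illustration, and the "surviving" mask stands in for the result of running the pruning steps.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical, simplified stand-in for PartitionedRelPruningData.
 * Sets are plain 64-bit masks, so all indexes must be below 64.
 */
typedef struct ToyPruneRel
{
    int             nparts;
    const int      *subplan_map;    /* subplan index of partition, or -1 */
    const int      *subpart_map;    /* index of sub-partitioned rel, or -1 */
    const unsigned *rti_map;        /* RT index of leaf partition, or 0 */
    uint64_t        surviving;      /* bit i set => partition i not pruned */
} ToyPruneRel;

/*
 * Add surviving subplan indexes to *validsubplans and, if the caller asked
 * for them, the RT indexes of the matching leaf partitions to
 * *scan_leafpart_rtis.  Sub-partitioned tables are handled by recursion.
 */
void
toy_find_matching(const ToyPruneRel *rels, int relidx,
                  uint64_t *validsubplans, uint64_t *scan_leafpart_rtis)
{
    const ToyPruneRel *rel = &rels[relidx];

    for (int i = 0; i < rel->nparts; i++)
    {
        if ((rel->surviving & (UINT64_C(1) << i)) == 0)
            continue;           /* partition i was pruned */

        if (rel->subplan_map[i] >= 0)
        {
            *validsubplans |= UINT64_C(1) << rel->subplan_map[i];
            if (scan_leafpart_rtis != NULL)
                *scan_leafpart_rtis |= UINT64_C(1) << rel->rti_map[i];
        }
        else if (rel->subpart_map[i] >= 0)
            toy_find_matching(rels, rel->subpart_map[i],
                              validsubplans, scan_leafpart_rtis);
    }
}

/* Two-level example: a root with 3 partitions, one of them sub-partitioned. */
void
toy_demo(uint64_t *validsubplans, uint64_t *scan_leafpart_rtis)
{
    static const int      root_subplan[] = {0, -1, 3};
    static const int      root_subpart[] = {-1, 1, -1};
    static const unsigned root_rti[] = {2, 0, 6};
    static const int      sub_subplan[] = {1, 2};
    static const int      sub_subpart[] = {-1, -1};
    static const unsigned sub_rti[] = {4, 5};
    const ToyPruneRel rels[] = {
        {3, root_subplan, root_subpart, root_rti, 0x3},  /* parts 0, 1 survive */
        {2, sub_subplan, sub_subpart, sub_rti, 0x2},     /* part 1 survives */
    };

    toy_find_matching(rels, 0, validsubplans, scan_leafpart_rtis);
}
```

Passing NULL for the second output mirrors the callers in nodeAppend.c and nodeMergeAppend.c, which only want the subplan set.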
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index f9460ae506..a2182a6b1f 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -844,7 +844,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
else
dest = None_Receiver;
- es->qd = CreateQueryDesc(es->stmt,
+ es->qd = CreateQueryDesc(es->stmt, NULL,
fcache->src,
GetActiveSnapshot(),
InvalidSnapshot,
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index c6f86a6510..96880e122a 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -155,7 +155,8 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
* subplan, we can fill as_valid_subplans immediately, preventing
* later calls to ExecFindMatchingSubPlans.
*/
- if (!prunestate->do_exec_prune && nplans > 0)
+ if (appendstate->as_prune_state == NULL ||
+ (!appendstate->as_prune_state->do_exec_prune && nplans > 0))
appendstate->as_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
}
else
@@ -577,7 +578,7 @@ choose_next_subplan_locally(AppendState *node)
}
else if (node->as_valid_subplans == NULL)
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
whichplan = -1;
}
@@ -642,7 +643,7 @@ choose_next_subplan_for_leader(AppendState *node)
if (node->as_valid_subplans == NULL)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
/*
* Mark each invalid plan as finished to allow the loop below to
@@ -717,7 +718,7 @@ choose_next_subplan_for_worker(AppendState *node)
else if (node->as_valid_subplans == NULL)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
mark_invalid_subplans_as_finished(node);
}
@@ -868,7 +869,7 @@ ExecAppendAsyncBegin(AppendState *node)
if (node->as_valid_subplans == NULL)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
classify_matching_subplans(node);
}
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index 8d35860c30..2312e5a633 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -103,7 +103,8 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
* subplan, we can fill ms_valid_subplans immediately, preventing
* later calls to ExecFindMatchingSubPlans.
*/
- if (!prunestate->do_exec_prune && nplans > 0)
+ if (mergestate->ms_prune_state == NULL ||
+ (!mergestate->ms_prune_state->do_exec_prune && nplans > 0))
mergestate->ms_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
}
else
@@ -218,7 +219,7 @@ ExecMergeAppend(PlanState *pstate)
*/
if (node->ms_valid_subplans == NULL)
node->ms_valid_subplans =
- ExecFindMatchingSubPlans(node->ms_prune_state, false);
+ ExecFindMatchingSubPlans(node->ms_prune_state, false, NULL);
/*
* First time through: pull the first tuple from each valid subplan,
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 29bc26669b..303a572c02 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1578,6 +1578,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
CachedPlanSource *plansource;
CachedPlan *cplan;
List *stmt_list;
+ List *part_prune_result_list;
char *query_string;
Snapshot snapshot;
MemoryContext oldcontext;
@@ -1657,7 +1658,10 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
*/
/* Replan if needed, and increment plan refcount for portal */
- cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv);
+ cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv,
+ &part_prune_result_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_result_list));
stmt_list = cplan->stmt_list;
if (!plan->saved)
@@ -1685,6 +1689,9 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
stmt_list,
cplan);
+ /* Copy PartitionPruneResults into the portal's context. */
+ PortalStorePartitionPruneResults(portal, part_prune_result_list);
+
/*
* Set up options for portal. Default SCROLL type is chosen the same way
* as PerformCursorOpen does it.
@@ -2092,7 +2099,8 @@ SPI_plan_get_cached_plan(SPIPlanPtr plan)
/* Get the generic plan for the query */
cplan = GetCachedPlan(plansource, NULL,
plan->saved ? CurrentResourceOwner : NULL,
- _SPI_current->queryEnv);
+ _SPI_current->queryEnv,
+ NULL /* Not interested in PartitionPruneResults */);
Assert(cplan == plansource->gplan);
/* Pop the error context stack */
@@ -2473,7 +2481,9 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
{
CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
List *stmt_list;
- ListCell *lc2;
+ List *part_prune_result_list;
+ ListCell *lc2,
+ *lc3;
spicallbackarg.query = plansource->query_string;
@@ -2549,8 +2559,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
* plan, the refcount must be backed by the plan_owner.
*/
cplan = GetCachedPlan(plansource, options->params,
- plan_owner, _SPI_current->queryEnv);
-
+ plan_owner, _SPI_current->queryEnv,
+ &part_prune_result_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_result_list));
stmt_list = cplan->stmt_list;
/*
@@ -2589,9 +2601,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
}
}
- foreach(lc2, stmt_list)
+ forboth(lc2, stmt_list, lc3, part_prune_result_list)
{
PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+ PartitionPruneResult *part_prune_result = lfirst_node(PartitionPruneResult, lc3);
bool canSetTag = stmt->canSetTag;
DestReceiver *dest;
@@ -2663,7 +2676,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
else
snap = InvalidSnapshot;
- qdesc = CreateQueryDesc(stmt,
+ qdesc = CreateQueryDesc(stmt, part_prune_result,
plansource->query_string,
snap, crosscheck_snapshot,
dest,
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 8fbeaa4f36..ca139797a8 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -97,7 +97,9 @@ _copyPlannedStmt(const PlannedStmt *from)
COPY_SCALAR_FIELD(jitFlags);
COPY_NODE_FIELD(planTree);
COPY_NODE_FIELD(partPruneInfos);
+ COPY_SCALAR_FIELD(containsInitialPruning);
COPY_NODE_FIELD(rtable);
+ COPY_BITMAPSET_FIELD(minLockRelids);
COPY_NODE_FIELD(resultRelations);
COPY_NODE_FIELD(appendRelations);
COPY_NODE_FIELD(subplans);
@@ -1284,6 +1286,8 @@ _copyPartitionPruneInfo(const PartitionPruneInfo *from)
PartitionPruneInfo *newnode = makeNode(PartitionPruneInfo);
COPY_NODE_FIELD(prune_infos);
+ COPY_SCALAR_FIELD(needs_init_pruning);
+ COPY_SCALAR_FIELD(needs_exec_pruning);
COPY_BITMAPSET_FIELD(other_subplans);
return newnode;
@@ -1300,6 +1304,7 @@ _copyPartitionedRelPruneInfo(const PartitionedRelPruneInfo *from)
COPY_POINTER_FIELD(subplan_map, from->nparts * sizeof(int));
COPY_POINTER_FIELD(subpart_map, from->nparts * sizeof(int));
COPY_POINTER_FIELD(relid_map, from->nparts * sizeof(Oid));
+ COPY_POINTER_FIELD(rti_map, from->nparts * sizeof(Index));
COPY_NODE_FIELD(initial_pruning_steps);
COPY_NODE_FIELD(exec_pruning_steps);
COPY_BITMAPSET_FIELD(execparamids);
@@ -5475,6 +5480,21 @@ _copyExtensibleNode(const ExtensibleNode *from)
return newnode;
}
+/* ****************************************************************
+ * execnodes.h copy functions
+ * ****************************************************************
+ */
+static PartitionPruneResult *
+_copyPartitionPruneResult(const PartitionPruneResult *from)
+{
+ PartitionPruneResult *newnode = makeNode(PartitionPruneResult);
+
+ COPY_NODE_FIELD(valid_subplan_offs_list);
+ COPY_BITMAPSET_FIELD(scan_leafpart_rtis);
+
+ return newnode;
+}
+
/* ****************************************************************
* value.h copy functions
* ****************************************************************
@@ -6571,6 +6591,13 @@ copyObjectImpl(const void *from)
retval = _copyPublicationTable(from);
break;
+ /*
+ * EXECUTION NODES
+ */
+ case T_PartitionPruneResult:
+ retval = _copyPartitionPruneResult(from);
+ break;
+
/*
* MISCELLANEOUS NODES
*/
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 72fcd8a6ee..53010bf059 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -322,7 +322,9 @@ _outPlannedStmt(StringInfo str, const PlannedStmt *node)
WRITE_INT_FIELD(jitFlags);
WRITE_NODE_FIELD(planTree);
WRITE_NODE_FIELD(partPruneInfos);
+ WRITE_BOOL_FIELD(containsInitialPruning);
WRITE_NODE_FIELD(rtable);
+ WRITE_BITMAPSET_FIELD(minLockRelids);
WRITE_NODE_FIELD(resultRelations);
WRITE_NODE_FIELD(appendRelations);
WRITE_NODE_FIELD(subplans);
@@ -1017,6 +1019,8 @@ _outPartitionPruneInfo(StringInfo str, const PartitionPruneInfo *node)
WRITE_NODE_TYPE("PARTITIONPRUNEINFO");
WRITE_NODE_FIELD(prune_infos);
+ WRITE_BOOL_FIELD(needs_init_pruning);
+ WRITE_BOOL_FIELD(needs_exec_pruning);
WRITE_BITMAPSET_FIELD(other_subplans);
}
@@ -1031,6 +1035,7 @@ _outPartitionedRelPruneInfo(StringInfo str, const PartitionedRelPruneInfo *node)
WRITE_INT_ARRAY(subplan_map, node->nparts);
WRITE_INT_ARRAY(subpart_map, node->nparts);
WRITE_OID_ARRAY(relid_map, node->nparts);
+ WRITE_INDEX_ARRAY(rti_map, node->nparts);
WRITE_NODE_FIELD(initial_pruning_steps);
WRITE_NODE_FIELD(exec_pruning_steps);
WRITE_BITMAPSET_FIELD(execparamids);
@@ -2436,6 +2441,8 @@ _outPlannerGlobal(StringInfo str, const PlannerGlobal *node)
WRITE_NODE_FIELD(resultRelations);
WRITE_NODE_FIELD(appendRelations);
WRITE_NODE_FIELD(partPruneInfos);
+ WRITE_BOOL_FIELD(containsInitialPruning);
+ WRITE_BITMAPSET_FIELD(minLockRelids);
WRITE_NODE_FIELD(relationOids);
WRITE_NODE_FIELD(invalItems);
WRITE_NODE_FIELD(paramExecTypes);
@@ -2857,6 +2864,21 @@ _outExtensibleNode(StringInfo str, const ExtensibleNode *node)
methods->nodeOut(str, node);
}
+/*****************************************************************************
+ *
+ * Stuff from execnodes.h
+ *
+ *****************************************************************************/
+
+static void
+_outPartitionPruneResult(StringInfo str, const PartitionPruneResult *node)
+{
+ WRITE_NODE_TYPE("PARTITIONPRUNERESULT");
+
+ WRITE_NODE_FIELD(valid_subplan_offs_list);
+ WRITE_BITMAPSET_FIELD(scan_leafpart_rtis);
+}
+
/*****************************************************************************
*
* Stuff from parsenodes.h.
@@ -4766,6 +4788,13 @@ outNode(StringInfo str, const void *obj)
_outJsonTableSibling(str, obj);
break;
+ /*
+ * EXECUTION NODES
+ */
+ case T_PartitionPruneResult:
+ _outPartitionPruneResult(str, obj);
+ break;
+
default:
/*
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index bf602ff93e..c1d131aa99 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -164,6 +164,11 @@
token = pg_strtok(&length); /* skip :fldname */ \
local_node->fldname = readIntCols(len)
+/* Read an Index array */
+#define READ_INDEX_ARRAY(fldname, len) \
+ token = pg_strtok(&length); /* skip :fldname */ \
+ local_node->fldname = readIndexCols(len)
+
/* Read a bool array */
#define READ_BOOL_ARRAY(fldname, len) \
token = pg_strtok(&length); /* skip :fldname */ \
@@ -1818,7 +1823,9 @@ _readPlannedStmt(void)
READ_INT_FIELD(jitFlags);
READ_NODE_FIELD(planTree);
READ_NODE_FIELD(partPruneInfos);
+ READ_BOOL_FIELD(containsInitialPruning);
READ_NODE_FIELD(rtable);
+ READ_BITMAPSET_FIELD(minLockRelids);
READ_NODE_FIELD(resultRelations);
READ_NODE_FIELD(appendRelations);
READ_NODE_FIELD(subplans);
@@ -2770,6 +2777,8 @@ _readPartitionPruneInfo(void)
READ_LOCALS(PartitionPruneInfo);
READ_NODE_FIELD(prune_infos);
+ READ_BOOL_FIELD(needs_init_pruning);
+ READ_BOOL_FIELD(needs_exec_pruning);
READ_BITMAPSET_FIELD(other_subplans);
READ_DONE();
@@ -2786,6 +2795,7 @@ _readPartitionedRelPruneInfo(void)
READ_INT_ARRAY(subplan_map, local_node->nparts);
READ_INT_ARRAY(subpart_map, local_node->nparts);
READ_OID_ARRAY(relid_map, local_node->nparts);
+ READ_INDEX_ARRAY(rti_map, local_node->nparts);
READ_NODE_FIELD(initial_pruning_steps);
READ_NODE_FIELD(exec_pruning_steps);
READ_BITMAPSET_FIELD(execparamids);
@@ -2939,6 +2949,21 @@ _readPartitionRangeDatum(void)
READ_DONE();
}
+
+/*
+ * _readPartitionPruneResult
+ */
+static PartitionPruneResult *
+_readPartitionPruneResult(void)
+{
+ READ_LOCALS(PartitionPruneResult);
+
+ READ_NODE_FIELD(valid_subplan_offs_list);
+ READ_BITMAPSET_FIELD(scan_leafpart_rtis);
+
+ READ_DONE();
+}
+
/*
* parseNodeString
*
@@ -3236,6 +3261,8 @@ parseNodeString(void)
return_value = _readJsonTableParent();
else if (MATCH("JSONTABLESIBLING", 16))
return_value = _readJsonTableSibling();
+ else if (MATCH("PARTITIONPRUNERESULT", 20))
+ return_value = _readPartitionPruneResult();
else
{
elog(ERROR, "badly formatted node string \"%.32s\"...", token);
@@ -3379,6 +3406,30 @@ readIntCols(int numCols)
return int_vals;
}
+/*
+ * readIndexCols
+ */
+Index *
+readIndexCols(int numCols)
+{
+ int tokenLength,
+ i;
+ const char *token;
+ Index *index_vals;
+
+ if (numCols <= 0)
+ return NULL;
+
+ index_vals = (Index *) palloc(numCols * sizeof(Index));
+ for (i = 0; i < numCols; i++)
+ {
+ token = pg_strtok(&tokenLength);
+ index_vals[i] = atoui(token);
+ }
+
+ return index_vals;
+}
+
/*
* readBoolCols
*/
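As a rough illustration of what readIndexCols does, the following standalone sketch parses a run of unsigned decimal tokens into an Index array. It is only an approximation under stated assumptions: toy_read_index_cols is a hypothetical name, it reads from a plain string with strtoul rather than through pg_strtok, and it uses malloc where the backend would use palloc.

```c
#include <assert.h>
#include <stdlib.h>

typedef unsigned int Index;     /* matches the backend's typedef in c.h */

/*
 * Parse numCols whitespace-separated unsigned integers from 'input'.
 * Returns a malloc'd array, or NULL if numCols <= 0 or allocation fails.
 * Hypothetical helper for illustration; not part of the patch.
 */
Index *
toy_read_index_cols(const char *input, int numCols)
{
    Index      *index_vals;
    char       *end;

    if (numCols <= 0)
        return NULL;

    index_vals = malloc((size_t) numCols * sizeof(Index));
    if (index_vals == NULL)
        return NULL;

    for (int i = 0; i < numCols; i++)
    {
        /* strtoul skips leading whitespace and stops after the digits */
        index_vals[i] = (Index) strtoul(input, &end, 10);
        input = end;
    }

    return index_vals;
}
```

Reading the values as unsigned is the point of having a separate reader: an Index array such as rti_map uses 0 to mean "no RT index", whereas the int arrays next to it use -1.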
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 32e658b5d6..edbf19716e 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -519,7 +519,9 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
result->parallelModeNeeded = glob->parallelModeNeeded;
result->planTree = top_plan;
result->partPruneInfos = glob->partPruneInfos;
+ result->containsInitialPruning = glob->containsInitialPruning;
result->rtable = glob->finalrtable;
+ result->minLockRelids = glob->minLockRelids;
result->resultRelations = glob->resultRelations;
result->appendRelations = glob->appendRelations;
result->subplans = glob->subplans;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index aafe1c149d..a32fc70785 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -270,6 +270,16 @@ set_plan_references(PlannerInfo *root, Plan *plan)
*/
add_rtes_to_flat_rtable(root, false);
+ /*
+ * Add the query's adjusted range of RT indexes to glob->minLockRelids.
+ * The adjusted RT indexes of prunable relations will be deleted from the
+ * set below where PartitionPruneInfos are processed.
+ */
+ glob->minLockRelids =
+ bms_add_range(glob->minLockRelids,
+ rtoffset + 1,
+ rtoffset + list_length(root->parse->rtable));
+
/*
* Adjust RT indexes of PlanRowMarks and add to final rowmarks list
*/
@@ -352,6 +362,7 @@ set_plan_references(PlannerInfo *root, Plan *plan)
foreach (lc, root->partPruneInfos)
{
PartitionPruneInfo *pruneinfo = lfirst(lc);
+ Bitmapset *leafpart_rtis = NULL;
ListCell *l;
foreach(l, pruneinfo->prune_infos)
@@ -362,15 +373,49 @@ set_plan_references(PlannerInfo *root, Plan *plan)
foreach(l2, prune_infos)
{
PartitionedRelPruneInfo *pinfo = lfirst(l2);
+ int i;
/* RT index of the table to which the pinfo belongs. */
pinfo->rtindex += rtoffset;
+
+ /* Also of the leaf partitions that might be scanned. */
+ for (i = 0; i < pinfo->nparts; i++)
+ {
+ if (pinfo->rti_map[i] > 0 && pinfo->subplan_map[i] >= 0)
+ {
+ pinfo->rti_map[i] += rtoffset;
+ leafpart_rtis = bms_add_member(leafpart_rtis,
+ pinfo->rti_map[i]);
+ }
+ }
}
}
+ if (pruneinfo->needs_init_pruning)
+ {
+ glob->containsInitialPruning = true;
+
+ /*
+ * Delete the leaf partition RTIs from the global set of relations
+ * to be locked before executing the plan. AcquireExecutorLocks()
+ * will find the ones to add to the set after performing initial
+ * pruning.
+ */
+ glob->minLockRelids = bms_del_members(glob->minLockRelids,
+ leafpart_rtis);
+ }
+
glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
}
+ /*
+ * It seems worth doing a bms_copy() on glob->minLockRelids if we deleted
+ * bits from it just above, to prevent empty tail bits resulting in
+ * inefficient looping during AcquireExecutorLocks().
+ */
+ if (glob->containsInitialPruning)
+ glob->minLockRelids = bms_copy(glob->minLockRelids);
+
return result;
}
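The minLockRelids bookkeeping in set_plan_references can be summarized with a small standalone sketch: start from the full range of adjusted RT indexes, then remove the leaf partitions that are subject to initial pruning, leaving them to be added back after pruning has run. The toy_* names and the 64-bit mask standing in for Bitmapset are assumptions for illustration only, so RT indexes must stay below 64 here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Set bits lower..upper (inclusive); stands in for bms_add_range(). */
uint64_t
toy_add_range(uint64_t set, int lower, int upper)
{
    for (int i = lower; i <= upper; i++)
        set |= UINT64_C(1) << i;
    return set;
}

/*
 * Compute the set of RT indexes that must be locked before initial pruning
 * has been performed.  'leafpart_rtis' holds the adjusted RT indexes of the
 * leaf partitions covered by a PartitionPruneInfo.
 */
uint64_t
toy_min_lock_relids(int rtoffset, int nrtes,
                    uint64_t leafpart_rtis, bool needs_init_pruning)
{
    uint64_t    min_lock = toy_add_range(0, rtoffset + 1, rtoffset + nrtes);

    /* Prunable leaf partitions are locked later, only if they survive. */
    if (needs_init_pruning)
        min_lock &= ~leafpart_rtis;     /* stands in for bms_del_members() */

    return min_lock;
}
```

For a range table with RT indexes 1..4 where 2, 3 and 4 are leaf partitions, only the parent (index 1) remains in the set when initial pruning applies; without initial pruning all four stay.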
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index d77f7d3aef..952c5b8327 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -144,7 +144,9 @@ static List *make_partitionedrel_pruneinfo(PlannerInfo *root,
List *prunequal,
Bitmapset *partrelids,
int *relid_subplan_map,
- Bitmapset **matchedsubplans);
+ Bitmapset **matchedsubplans,
+ bool *needs_init_pruning,
+ bool *needs_exec_pruning);
static void gen_partprune_steps(RelOptInfo *rel, List *clauses,
PartClauseTarget target,
GeneratePruningStepsContext *context);
@@ -234,6 +236,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int *relid_subplan_map;
ListCell *lc;
int i;
+ bool needs_init_pruning = false;
+ bool needs_exec_pruning = false;
/*
* Scan the subpaths to see which ones are scans of partition child
@@ -313,12 +317,16 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
Bitmapset *partrelids = (Bitmapset *) lfirst(lc);
List *pinfolist;
Bitmapset *matchedsubplans = NULL;
+ bool partrel_needs_init_pruning;
+ bool partrel_needs_exec_pruning;
pinfolist = make_partitionedrel_pruneinfo(root, parentrel,
prunequal,
partrelids,
relid_subplan_map,
- &matchedsubplans);
+ &matchedsubplans,
+ &partrel_needs_init_pruning,
+ &partrel_needs_exec_pruning);
/* When pruning is possible, record the matched subplans */
if (pinfolist != NIL)
@@ -327,6 +335,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
allmatchedsubplans = bms_join(matchedsubplans,
allmatchedsubplans);
}
+
+ needs_init_pruning |= partrel_needs_init_pruning;
+ needs_exec_pruning |= partrel_needs_exec_pruning;
}
pfree(relid_subplan_map);
@@ -341,6 +352,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
/* Else build the result data structure */
pruneinfo = makeNode(PartitionPruneInfo);
pruneinfo->prune_infos = prunerelinfos;
+ pruneinfo->needs_init_pruning = needs_init_pruning;
+ pruneinfo->needs_exec_pruning = needs_exec_pruning;
/*
* Some subplans may not belong to any of the identified partitioned rels.
@@ -441,13 +454,18 @@ add_part_relids(List *allpartrelids, Bitmapset *partrelids)
* If we cannot find any useful run-time pruning steps, return NIL.
* However, on success, each rel identified in partrelids will have
* an element in the result list, even if some of them are useless.
+ * *needs_init_pruning and *needs_exec_pruning are set to indicate that the
+ * returned PartitionedRelPruneInfos contain pruning steps that can be
+ * performed before and after execution begins, respectively.
*/
static List *
make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
List *prunequal,
Bitmapset *partrelids,
int *relid_subplan_map,
- Bitmapset **matchedsubplans)
+ Bitmapset **matchedsubplans,
+ bool *needs_init_pruning,
+ bool *needs_exec_pruning)
{
RelOptInfo *targetpart = NULL;
List *pinfolist = NIL;
@@ -458,6 +476,10 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int rti;
int i;
+ /* Will find out below. */
+ *needs_init_pruning = false;
+ *needs_exec_pruning = false;
+
/*
* Examine each partitioned rel, constructing a temporary array to map
* from planner relids to index of the partitioned rel, and building a
@@ -545,6 +567,9 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
* executor per-scan pruning steps. This first pass creates startup
* pruning steps and detects whether there's any possibly-useful quals
* that would require per-scan pruning.
+ *
+ * In the first pass, we note whether the 2nd pass is necessary by
+ * noting the presence of EXEC parameters.
*/
gen_partprune_steps(subpart, partprunequal, PARTTARGET_INITIAL,
&context);
@@ -619,6 +644,12 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
pinfo->execparamids = execparamids;
/* Remaining fields will be filled in the next loop */
+ /* record which types of pruning steps we've seen so far */
+ if (initial_pruning_steps != NIL)
+ *needs_init_pruning = true;
+ if (exec_pruning_steps != NIL)
+ *needs_exec_pruning = true;
+
pinfolist = lappend(pinfolist, pinfo);
}
@@ -646,6 +677,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int *subplan_map;
int *subpart_map;
Oid *relid_map;
+ Index *rti_map;
/*
* Construct the subplan and subpart maps for this partitioning level.
@@ -658,6 +690,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
subpart_map = (int *) palloc(nparts * sizeof(int));
memset(subpart_map, -1, nparts * sizeof(int));
relid_map = (Oid *) palloc0(nparts * sizeof(Oid));
+ rti_map = (Index *) palloc0(nparts * sizeof(Index));
present_parts = NULL;
i = -1;
@@ -672,6 +705,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
subplan_map[i] = subplanidx = relid_subplan_map[partrel->relid] - 1;
subpart_map[i] = subpartidx = relid_subpart_map[partrel->relid] - 1;
relid_map[i] = planner_rt_fetch(partrel->relid, root)->relid;
+ rti_map[i] = partrel->relid;
if (subplanidx >= 0)
{
present_parts = bms_add_member(present_parts, i);
@@ -696,6 +730,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
pinfo->subplan_map = subplan_map;
pinfo->subpart_map = subpart_map;
pinfo->relid_map = relid_map;
+ pinfo->rti_map = rti_map;
}
pfree(relid_subpart_map);
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 8b6b5bbaaa..7f0eda48a4 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1603,6 +1603,7 @@ exec_bind_message(StringInfo input_message)
int16 *rformats = NULL;
CachedPlanSource *psrc;
CachedPlan *cplan;
+ List *part_prune_result_list;
Portal portal;
char *query_string;
char *saved_stmt_name;
@@ -1978,7 +1979,9 @@ exec_bind_message(StringInfo input_message)
* will be generated in MessageContext. The plan refcount will be
* assigned to the Portal, so it will be released at portal destruction.
*/
- cplan = GetCachedPlan(psrc, params, NULL, NULL);
+ cplan = GetCachedPlan(psrc, params, NULL, NULL, &part_prune_result_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_result_list));
/*
* Now we can define the portal.
@@ -1993,6 +1996,9 @@ exec_bind_message(StringInfo input_message)
cplan->stmt_list,
cplan);
+ /* Copy PartitionPruneResults into the portal's context. */
+ PortalStorePartitionPruneResults(portal, part_prune_result_list);
+
/* Done with the snapshot used for parameter I/O and parsing/planning */
if (snapshot_set)
PopActiveSnapshot();
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 5aa5a350f3..8cc2e2162d 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -35,7 +35,7 @@
Portal ActivePortal = NULL;
-static void ProcessQuery(PlannedStmt *plan,
+static void ProcessQuery(PlannedStmt *plan, PartitionPruneResult *part_prune_result,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -65,6 +65,7 @@ static void DoPortalRewind(Portal portal);
*/
QueryDesc *
CreateQueryDesc(PlannedStmt *plannedstmt,
+ PartitionPruneResult *part_prune_result,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
@@ -77,6 +78,8 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
qd->operation = plannedstmt->commandType; /* operation */
qd->plannedstmt = plannedstmt; /* plan */
+ qd->part_prune_result = part_prune_result; /* ExecutorDoInitialPruning()
+ * output for plan */
qd->sourceText = sourceText; /* query text */
qd->snapshot = RegisterSnapshot(snapshot); /* snapshot */
/* RI check snapshot */
@@ -122,6 +125,7 @@ FreeQueryDesc(QueryDesc *qdesc)
* PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
*
* plan: the plan tree for the query
+ * part_prune_result: ExecutorDoInitialPruning() output for the plan tree
* sourceText: the source text of the query
* params: any parameters needed
* dest: where to send results
@@ -134,6 +138,7 @@ FreeQueryDesc(QueryDesc *qdesc)
*/
static void
ProcessQuery(PlannedStmt *plan,
+ PartitionPruneResult *part_prune_result,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -145,7 +150,7 @@ ProcessQuery(PlannedStmt *plan,
/*
* Create the QueryDesc object
*/
- queryDesc = CreateQueryDesc(plan, sourceText,
+ queryDesc = CreateQueryDesc(plan, part_prune_result, sourceText,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
@@ -491,8 +496,13 @@ PortalStart(Portal portal, ParamListInfo params,
/*
* Create QueryDesc in portal's context; for the moment, set
* the destination to DestNone.
+ *
+ * There is no PartitionPruneResult unless the PlannedStmt is
+ * from a CachedPlan.
*/
queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+ portal->part_prune_results == NIL ? NULL :
+ linitial(portal->part_prune_results),
portal->sourceText,
GetActiveSnapshot(),
InvalidSnapshot,
@@ -1225,6 +1235,8 @@ PortalRunMulti(Portal portal,
if (pstmt->utilityStmt == NULL)
{
+ PartitionPruneResult *part_prune_result = NULL;
+
/*
* process a plannable query.
*/
@@ -1271,10 +1283,18 @@ PortalRunMulti(Portal portal,
else
UpdateActiveSnapshotCommandId();
+ /*
+ * Determine if there's a corresponding PartitionPruneResult for
+ * this PlannedStmt.
+ */
+ if (portal->part_prune_results != NIL)
+ part_prune_result = list_nth(portal->part_prune_results,
+ foreach_current_index(stmtlist_item));
+
if (pstmt->canSetTag)
{
/* statement can set tag string */
- ProcessQuery(pstmt,
+ ProcessQuery(pstmt, part_prune_result,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
@@ -1283,7 +1303,7 @@ PortalRunMulti(Portal portal,
else
{
/* stmt added by rewrite cannot set tag */
- ProcessQuery(pstmt,
+ ProcessQuery(pstmt, part_prune_result,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 0d6a295674..8c164741f7 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -99,14 +99,19 @@ static dlist_head cached_expression_list = DLIST_STATIC_INIT(cached_expression_l
static void ReleaseGenericPlan(CachedPlanSource *plansource);
static List *RevalidateCachedQuery(CachedPlanSource *plansource,
QueryEnvironment *queryEnv);
-static bool CheckCachedPlan(CachedPlanSource *plansource);
+static bool CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+ List **part_prune_result_list);
static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
- ParamListInfo boundParams, QueryEnvironment *queryEnv);
+ ParamListInfo boundParams, QueryEnvironment *queryEnv,
+ List **part_prune_result_list);
static bool choose_custom_plan(CachedPlanSource *plansource,
ParamListInfo boundParams);
static double cached_plan_cost(CachedPlan *plan, bool include_planner);
static Query *QueryListGetPrimaryStmt(List *stmts);
-static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static void AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+ List **part_prune_result_list,
+ List **lockedRelids_per_stmt);
+static void ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt);
static void AcquirePlannerLocks(List *stmt_list, bool acquire);
static void ScanQueryForLocks(Query *parsetree, bool acquire);
static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -790,15 +795,20 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
*
* On a "true" return, we have acquired the locks needed to run the plan.
* (We must do this for the "true" result to be race-condition-free.)
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_result_list.
*/
static bool
-CheckCachedPlan(CachedPlanSource *plansource)
+CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+ List **part_prune_result_list)
{
CachedPlan *plan = plansource->gplan;
/* Assert that caller checked the querytree */
Assert(plansource->is_valid);
+ *part_prune_result_list = NIL;
+
/* If there's no generic plan, just say "false" */
if (!plan)
return false;
@@ -820,13 +830,21 @@ CheckCachedPlan(CachedPlanSource *plansource)
*/
if (plan->is_valid)
{
+ List *lockedRelids_per_stmt;
+
/*
* Plan must have positive refcount because it is referenced by
* plansource; so no need to fear it disappears under us here.
*/
Assert(plan->refcount > 0);
- AcquireExecutorLocks(plan->stmt_list, true);
+ /*
+ * Lock relations scanned by the plan. This is where the pruning
+ * happens if needed.
+ */
+ AcquireExecutorLocks(plan->stmt_list, boundParams,
+ part_prune_result_list,
+ &lockedRelids_per_stmt);
/*
* If plan was transient, check to see if TransactionXmin has
@@ -848,7 +866,14 @@ CheckCachedPlan(CachedPlanSource *plansource)
}
/* Oops, the race case happened. Release useless locks. */
- AcquireExecutorLocks(plan->stmt_list, false);
+ ReleaseExecutorLocks(plan->stmt_list, lockedRelids_per_stmt);
+
+ /*
+ * The output list and any objects therein have been allocated in the
+ * caller's hopefully short-lived context, so they will not stay leaked
+ * for long; still, reset the list so it is not accidentally looked at.
+ */
+ *part_prune_result_list = NIL;
}
/*
@@ -874,10 +899,15 @@ CheckCachedPlan(CachedPlanSource *plansource)
* Planning work is done in the caller's memory context. The finished plan
* is in a child memory context, which typically should get reparented
* (unless this is a one-shot plan, in which case we don't copy the plan).
+ *
+ * A list of NULLs is returned in *part_prune_result_list, meaning that no
+ * PartitionPruneResult nodes have yet been created for the plans in
+ * stmt_list.
*/
static CachedPlan *
BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
- ParamListInfo boundParams, QueryEnvironment *queryEnv)
+ ParamListInfo boundParams, QueryEnvironment *queryEnv,
+ List **part_prune_result_list)
{
CachedPlan *plan;
List *plist;
@@ -1007,6 +1037,17 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
MemoryContextSwitchTo(oldcxt);
+ /*
+ * No actual PartitionPruneResults yet to add, though must initialize
+ * the list to have the same number of elements as the list of
+ * PlannedStmts.
+ */
+ *part_prune_result_list = NIL;
+ foreach(lc, plist)
+ {
+ *part_prune_result_list = lappend(*part_prune_result_list, NULL);
+ }
+
return plan;
}
@@ -1126,6 +1167,17 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
* plan or a custom plan for the given parameters: the caller does not know
* which it will get.
*
+ * For every PlannedStmt in the returned CachedPlan, an element that is either
+ * a PartitionPruneResult or NULL is added to *part_prune_result_list, if the
+ * caller passed one. It is a PartitionPruneResult if the PlannedStmt comes
+ * from an existing, otherwise valid CachedPlan and contains at least one
+ * PartitionPruneInfo with "initial" pruning steps. Those steps are performed
+ * by calling ExecutorDoInitialPruning(), which prunes away subplans that don't
+ * match the pruning conditions so that AcquireExecutorLocks() locks only the
+ * leaf partitions that survive pruning. The PartitionPruneResult contains a
+ * list of bitmapsets of the indexes of matching subplans, one for each
+ * PartitionPruneInfo.
+ *
* On return, the plan is valid and we have sufficient locks to begin
* execution.
*
@@ -1139,11 +1191,13 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
*/
CachedPlan *
GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
- ResourceOwner owner, QueryEnvironment *queryEnv)
+ ResourceOwner owner, QueryEnvironment *queryEnv,
+ List **part_prune_result_list)
{
CachedPlan *plan = NULL;
List *qlist;
bool customplan;
+ List *my_part_prune_result_list;
/* Assert caller is doing things in a sane order */
Assert(plansource->magic == CACHEDPLANSOURCE_MAGIC);
@@ -1160,7 +1214,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
if (!customplan)
{
- if (CheckCachedPlan(plansource))
+ if (CheckCachedPlan(plansource, boundParams,
+ &my_part_prune_result_list))
{
/* We want a generic plan, and we already have a valid one */
plan = plansource->gplan;
@@ -1169,7 +1224,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
else
{
/* Build a new generic plan */
- plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv);
+ plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv,
+ &my_part_prune_result_list);
/* Just make real sure plansource->gplan is clear */
ReleaseGenericPlan(plansource);
/* Link the new generic plan into the plansource */
@@ -1214,7 +1270,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
if (customplan)
{
/* Build a custom plan */
- plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv);
+ plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv,
+ &my_part_prune_result_list);
/* Accumulate total costs of custom plans */
plansource->total_custom_cost += cached_plan_cost(plan, true);
@@ -1246,6 +1303,9 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
plan->is_saved = true;
}
+ if (part_prune_result_list)
+ *part_prune_result_list = my_part_prune_result_list;
+
return plan;
}
@@ -1737,17 +1797,29 @@ QueryListGetPrimaryStmt(List *stmts)
/*
* AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
- * or release them if acquire is false.
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_result_list.
+ *
+ * On return, *lockedRelids_per_stmt will contain a bitmapset for every
+ * PlannedStmt in stmt_list, containing the RT indexes of relation entries
+ * in its range table that were actually locked, or NULL if the PlannedStmt
+ * contains a utility statement.
*/
static void
-AcquireExecutorLocks(List *stmt_list, bool acquire)
+AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+ List **part_prune_result_list,
+ List **lockedRelids_per_stmt)
{
ListCell *lc1;
+ *part_prune_result_list = *lockedRelids_per_stmt = NIL;
foreach(lc1, stmt_list)
{
PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
- ListCell *lc2;
+ PartitionPruneResult *part_prune_result = NULL;
+ Bitmapset *allLockRelids;
+ Bitmapset *lockedRelids = NULL;
+ int rti;
if (plannedstmt->commandType == CMD_UTILITY)
{
@@ -1761,13 +1833,35 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
Query *query = UtilityContainsQuery(plannedstmt->utilityStmt);
if (query)
- ScanQueryForLocks(query, acquire);
+ ScanQueryForLocks(query, true);
+ *part_prune_result_list = lappend(*part_prune_result_list, NULL);
continue;
}
- foreach(lc2, plannedstmt->rtable)
+ /*
+ * Figure out the set of relations that would need to be locked
+ * before executing the plan.
+ */
+ if (plannedstmt->containsInitialPruning)
{
- RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
+ /*
+ * Obtain the set of partitions to be locked from the
+ * PartitionPruneInfos by considering the result of performing
+ * initial partition pruning.
+ */
+ part_prune_result =
+ ExecutorDoInitialPruning(plannedstmt, boundParams);
+
+ allLockRelids = bms_union(plannedstmt->minLockRelids,
+ part_prune_result->scan_leafpart_rtis);
+ }
+ else
+ allLockRelids = plannedstmt->minLockRelids;
+
+ rti = -1;
+ while ((rti = bms_next_member(allLockRelids, rti)) > 0)
+ {
+ RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
if (rte->rtekind != RTE_RELATION)
continue;
@@ -1778,10 +1872,58 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
* fail if it's been dropped entirely --- we'll just transiently
* acquire a non-conflicting lock.
*/
- if (acquire)
- LockRelationOid(rte->relid, rte->rellockmode);
- else
- UnlockRelationOid(rte->relid, rte->rellockmode);
+ LockRelationOid(rte->relid, rte->rellockmode);
+ lockedRelids = bms_add_member(lockedRelids, rti);
+ }
+
+ *part_prune_result_list = lappend(*part_prune_result_list,
+ part_prune_result);
+ *lockedRelids_per_stmt = lappend(*lockedRelids_per_stmt, lockedRelids);
+ }
+}
+
+/*
+ * ReleaseExecutorLocks
+ * Release locks that would've been acquired by an earlier call to
+ * AcquireExecutorLocks()
+ */
+static void
+ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt)
+{
+ ListCell *lc1,
+ *lc2;
+
+ forboth(lc1, stmt_list, lc2, lockedRelids_per_stmt)
+ {
+ PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
+ Bitmapset *lockedRelids = lfirst(lc2);
+ int rti;
+
+ if (plannedstmt->commandType == CMD_UTILITY)
+ {
+ /*
+ * Ignore utility statements, except those (such as EXPLAIN) that
+ * contain a parsed-but-not-planned query. Note: it's okay to use
+ * ScanQueryForLocks, even though the query hasn't been through
+ * rule rewriting, because rewriting doesn't change the query
+ * representation.
+ */
+ Query *query = UtilityContainsQuery(plannedstmt->utilityStmt);
+
+ if (query)
+ ScanQueryForLocks(query, false);
+ continue;
+ }
+
+ rti = -1;
+ while ((rti = bms_next_member(lockedRelids, rti)) >= 0)
+ {
+ RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
+
+ Assert(rte->rtekind == RTE_RELATION);
+
+ /* See the comment in AcquireExecutorLocks(). */
+ UnlockRelationOid(rte->relid, rte->rellockmode);
}
}
}
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index d549f66d4a..1bbe6b704b 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -303,6 +303,25 @@ PortalDefineQuery(Portal portal,
portal->status = PORTAL_DEFINED;
}
+/*
+ * PortalStorePartitionPruneResults
+ * Copy the given list of PartitionPruneResults into the portal's
+ * context
+ *
+ * This allows the caller to ensure that the list exists as long as the portal
+ * does.
+ */
+void
+PortalStorePartitionPruneResults(Portal portal, List *part_prune_results)
+{
+ MemoryContext oldcxt;
+
+ AssertArg(PortalIsValid(portal));
+ oldcxt = MemoryContextSwitchTo(portal->portalContext);
+ portal->part_prune_results = copyObject(part_prune_results);
+ MemoryContextSwitchTo(oldcxt);
+}
+
/*
* PortalReleaseCachedPlan
* Release a portal's reference to its cached plan, if any.
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 666977fb1f..bbc8c42d88 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -87,7 +87,9 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
ExplainState *es, const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv);
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt,
+ PartitionPruneResult *part_prune_result,
+ IntoClause *into,
ExplainState *es, const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv,
const instr_time *planduration,
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index bf962af7af..bd8776402e 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -45,6 +45,7 @@ extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
* nparts Length of subplan_map[] and subpart_map[].
* subplan_map Subplan index by partition index, or -1.
* subpart_map Subpart index by partition index, or -1.
+ * rti_map Range table index by partition index, or 0.
* present_parts A Bitmapset of the partition indexes that we
* have subplans or subparts for.
* initial_pruning_steps List of PartitionPruneSteps used to
@@ -61,6 +62,7 @@ typedef struct PartitionedRelPruningData
int nparts;
int *subplan_map;
int *subpart_map;
+ Index *rti_map;
Bitmapset *present_parts;
List *initial_pruning_steps;
List *exec_pruning_steps;
@@ -126,5 +128,10 @@ extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
int part_prune_index,
Bitmapset **initially_valid_subplans);
extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
- bool initial_prune);
+ bool initial_prune,
+ Bitmapset **scan_leafpart_rtis);
+extern Bitmapset *ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt,
+ ParamListInfo params,
+ PartitionPruneInfo *pruneinfo,
+ Bitmapset **scan_leafpart_rtis);
#endif /* EXECPARTITION_H */
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index e79e2c001f..60d5644908 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -35,6 +35,8 @@ typedef struct QueryDesc
/* These fields are provided by CreateQueryDesc */
CmdType operation; /* CMD_SELECT, CMD_UPDATE, etc. */
PlannedStmt *plannedstmt; /* planner's output (could be utility, too) */
+ PartitionPruneResult *part_prune_result; /* ExecutorDoInitialPruning()'s
+ * output for plannedstmt */
const char *sourceText; /* source text of the query */
Snapshot snapshot; /* snapshot to use for query */
Snapshot crosscheck_snapshot; /* crosscheck for RI update/delete */
@@ -57,6 +59,7 @@ typedef struct QueryDesc
/* in pquery.c */
extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
+ PartitionPruneResult *part_prune_result,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index d68a6b9d28..5c4a282be0 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -185,6 +185,8 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
/*
* prototypes from functions in execMain.c
*/
+extern PartitionPruneResult *ExecutorDoInitialPruning(PlannedStmt *plannedstmt,
+ ParamListInfo params);
extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
extern void ExecutorRun(QueryDesc *queryDesc,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 25e0bb976e..d3ae0fa52d 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -986,6 +986,34 @@ typedef struct DomainConstraintState
*/
typedef TupleTableSlot *(*ExecProcNodeMtd) (struct PlanState *pstate);
+/*----------------
+ * PartitionPruneResult
+ *
+ * The result of invoking ExecutorDoInitialPruning() on a given
+ * PlannedStmt.
+ *
+ * Contains a list of Bitmapsets of the indexes of the subplans remaining after
+ * performing initial pruning by calling ExecFindMatchingSubPlans() for every
+ * PartitionPruneInfo found in PlannedStmt.partPruneInfos. RT indexes of the
+ * leaf partitions scanned by those subplans across all PartitionPruneInfos
+ * are added into scan_leafpart_rtis.
+ *
+ * This is used by GetCachedPlan() to inform its callers of the pruning
+ * decisions made when performing AcquireExecutorLocks() on a given cached
+ * PlannedStmt, which the callers then pass on to the executor. The executor
+ * refers to this node, when available, while initializing the plan nodes to
+ * which those PartitionPruneInfos apply so that the same set of qualifying
+ * subplans is initialized, rather than deriving that set again by redoing
+ * initial pruning.
+ */
+typedef struct PartitionPruneResult
+{
+ NodeTag type;
+
+ List *valid_subplan_offs_list;
+ Bitmapset *scan_leafpart_rtis;
+} PartitionPruneResult;
+
/* ----------------
* PlanState node
*
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index b3b407579b..84d67d5dcf 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -97,6 +97,9 @@ typedef enum NodeTag
T_PartitionPruneStepCombine,
T_PlanInvalItem,
+ /* TAGS FOR EXECUTOR PREP NODES (execnodes.h) */
+ T_PartitionPruneResult,
+
/*
* TAGS FOR PLAN STATE NODES (execnodes.h)
*
@@ -674,6 +677,7 @@ extern struct Bitmapset *readBitmapset(void);
extern uintptr_t readDatum(bool typbyval);
extern bool *readBoolCols(int numCols);
extern int *readIntCols(int numCols);
+extern Index *readIndexCols(int numCols);
extern Oid *readOidCols(int numCols);
extern int16 *readAttrNumberCols(int numCols);
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 6995b0ecec..c47ce6c09b 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -110,6 +110,15 @@ typedef struct PlannerGlobal
List *partPruneInfos; /* List of PartitionPruneInfo contained in
* the plan */
+ bool containsInitialPruning; /* Do any of those PartitionPruneInfos
+ * have initial (pre-exec) pruning
+ * steps in them? */
+
+ Bitmapset *minLockRelids; /* Indexes of all range table entries minus
+ * indexes of range table entries of the leaf
+ * partitions scanned by prunable subplans;
+ * see AcquireExecutorLocks() */
+
List *relationOids; /* OIDs of relations the plan depends on */
List *invalItems; /* other dependencies, as PlanInvalItems */
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 297cacfb5b..ffb52e2ac2 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -67,8 +67,17 @@ typedef struct PlannedStmt
List *partPruneInfos; /* List of PartitionPruneInfo contained in
* the plan */
+ bool containsInitialPruning; /* Do any of those PartitionPruneInfos
+ * have initial (pre-exec) pruning
+ * steps in them? */
+
List *rtable; /* list of RangeTblEntry nodes */
+ Bitmapset *minLockRelids; /* Indexes of all range table entries minus
+ * indexes of range table entries of the leaf
+ * partitions scanned by prunable subplans;
+ * see AcquireExecutorLocks() */
+
/* rtable indexes of target relations for INSERT/UPDATE/DELETE */
List *resultRelations; /* integer list of RT indexes, or NIL */
@@ -1196,6 +1205,13 @@ typedef struct PlanRowMark
* prune_infos List of Lists containing PartitionedRelPruneInfo nodes,
* one sublist per run-time-prunable partition hierarchy
* appearing in the parent plan node's subplans.
+ *
+ * needs_init_pruning Does any of the PartitionedRelPruneInfos in
+ * prune_infos have its initial_pruning_steps set?
+ *
+ * needs_exec_pruning Does any of the PartitionedRelPruneInfos in
+ * prune_infos have its exec_pruning_steps set?
+ *
* other_subplans Indexes of any subplans that are not accounted for
* by any of the PartitionedRelPruneInfo nodes in
* "prune_infos". These subplans must not be pruned.
@@ -1204,6 +1220,8 @@ typedef struct PartitionPruneInfo
{
NodeTag type;
List *prune_infos;
+ bool needs_init_pruning;
+ bool needs_exec_pruning;
Bitmapset *other_subplans;
} PartitionPruneInfo;
@@ -1234,6 +1252,7 @@ typedef struct PartitionedRelPruneInfo
int *subplan_map; /* subplan index by partition index, or -1 */
int *subpart_map; /* subpart index by partition index, or -1 */
Oid *relid_map; /* relation OID by partition index, or 0 */
+ Index *rti_map; /* range table index by partition index, or 0 */
/*
* initial_pruning_steps shows how to prune during executor startup (i.e.,
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 0499635f59..1c5bb5ece1 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -220,7 +220,8 @@ extern List *CachedPlanGetTargetList(CachedPlanSource *plansource,
extern CachedPlan *GetCachedPlan(CachedPlanSource *plansource,
ParamListInfo boundParams,
ResourceOwner owner,
- QueryEnvironment *queryEnv);
+ QueryEnvironment *queryEnv,
+ List **part_prune_result_list);
extern void ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner);
extern bool CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index aeddbdafe5..9f7727a837 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -138,6 +138,7 @@ typedef struct PortalData
QueryCompletion qc; /* command completion data for executed query */
List *stmts; /* list of PlannedStmts */
CachedPlan *cplan; /* CachedPlan, if stmts are from one */
+ List *part_prune_results; /* list of PartitionPruneResults */
ParamListInfo portalParams; /* params to pass to query */
QueryEnvironment *queryEnv; /* environment for query */
@@ -242,6 +243,8 @@ extern void PortalDefineQuery(Portal portal,
CommandTag commandTag,
List *stmts,
CachedPlan *cplan);
+extern void PortalStorePartitionPruneResults(Portal portal,
+ List *part_prune_result_list);
extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
extern void PortalCreateHoldStore(Portal portal);
extern void PortalHashTableDeleteAll(void);
--
2.35.3
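
As a reading aid for the AcquireExecutorLocks() / ReleaseExecutorLocks() changes in the patch above, here is a minimal standalone sketch of the locking scheme. It is not PostgreSQL code: it assumes a range table of fewer than 64 entries so that a uint64_t can stand in for a Bitmapset, and lock_rel(), unlock_rel(), acquire_locks(), release_locks() and the lock_count array are hypothetical stand-ins for LockRelationOid(), UnlockRelationOid() and the lock manager.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_RTI 64

/* Hypothetical stand-in for the lock manager: one counter per RT index. */
static int lock_count[MAX_RTI];

static void
lock_rel(int rti)
{
    assert(rti > 0 && rti < MAX_RTI);
    lock_count[rti]++;
}

static void
unlock_rel(int rti)
{
    assert(rti > 0 && rti < MAX_RTI && lock_count[rti] > 0);
    lock_count[rti]--;
}

/*
 * Lock the union of the relations that must always be locked and the leaf
 * partitions that survived initial pruning.  Return the set that was
 * actually locked so the caller can later release exactly those locks.
 */
static uint64_t
acquire_locks(uint64_t min_lock_relids, uint64_t scan_leafpart_rtis)
{
    uint64_t all = min_lock_relids | scan_leafpart_rtis;
    uint64_t locked = 0;

    /* RT indexes start at 1, so bit 0 is never examined. */
    for (int rti = 1; rti < MAX_RTI; rti++)
    {
        if (all & (UINT64_C(1) << rti))
        {
            lock_rel(rti);
            locked |= UINT64_C(1) << rti;
        }
    }
    return locked;
}

/* Release the locks recorded by an earlier acquire_locks() call. */
static void
release_locks(uint64_t locked)
{
    for (int rti = 1; rti < MAX_RTI; rti++)
    {
        if (locked & (UINT64_C(1) << rti))
            unlock_rel(rti);
    }
}
```

With min_lock_relids = {1} (the partitioned parent) and scan_leafpart_rtis = {3} (the only leaf partition left after pruning), acquire_locks() locks RT indexes 1 and 3 and returns that set; passing the returned value to release_locks() undoes exactly those locks, which is why the set of locked relids has to be remembered per statement.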
* Re: generic plans and "initial" pruning
@ 2022-05-27 20:08 Zhihong Yu <[email protected]>
parent: Amit Langote <[email protected]>
1 sibling, 0 replies; 71+ messages in thread
From: Zhihong Yu @ 2022-05-27 20:08 UTC (permalink / raw)
To: Amit Langote <[email protected]>; +Cc: David Rowley <[email protected]>; Robert Haas <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers
On Fri, May 27, 2022 at 1:10 AM Amit Langote <[email protected]>
wrote:
> On Mon, Apr 11, 2022 at 12:53 PM Zhihong Yu <[email protected]> wrote:
> > On Sun, Apr 10, 2022 at 8:05 PM Amit Langote <[email protected]>
> wrote:
> >> Sending v15 that fixes that to keep the cfbot green for now.
> >
> > Hi,
> >
> > + /* RT index of the partitione table. */
> >
> > partitione -> partitioned
>
> Thanks, fixed.
>
> Also, I broke this into patches:
>
> 0001 contains the mechanical changes of moving PartitionPruneInfo out
> of Append/MergeAppend into a list in PlannedStmt.
>
> 0002 is the main patch to "Optimize AcquireExecutorLocks() by locking
> only unpruned partitions".
>
> --
> Thanks, Amit Langote
> EDB: http://www.enterprisedb.com
Hi,
In the description:
is made available to the actual execution via
PartitionPruneResult, made available along with the PlannedStmt by the
I think the second `made available` is redundant (can be omitted).
+ * Initial pruning is performed here if needed (unless it has already been done
+ * by ExecDoInitialPruning()), and in that case only the surviving subplans'
I wonder if there is a typo above - I don't find ExecDoInitialPruning
either in PG codebase or in the patches (except for this in the comment).
I think it should be ExecutorDoInitialPruning.
+ * bit from it just above to prevent empty tail bits resulting in
I searched the code base but didn't find any mention of `empty tail bit`.
Would you mind explaining it a bit?
Cheers
* Re: generic plans and "initial" pruning
@ 2022-07-05 17:43 Jacob Champion <[email protected]>
parent: Amit Langote <[email protected]>
1 sibling, 1 reply; 71+ messages in thread
From: Jacob Champion @ 2022-07-05 17:43 UTC (permalink / raw)
To: Amit Langote <[email protected]>; +Cc: Zhihong Yu <[email protected]>; David Rowley <[email protected]>; Robert Haas <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers
On Fri, May 27, 2022 at 1:09 AM Amit Langote <[email protected]> wrote:
> 0001 contains the mechanical changes of moving PartitionPruneInfo out
> of Append/MergeAppend into a list in PlannedStmt.
>
> 0002 is the main patch to "Optimize AcquireExecutorLocks() by locking
> only unpruned partitions".
This patchset will need to be rebased over 835d476fd21; looks like
just a cosmetic change.
--Jacob
* Re: generic plans and "initial" pruning
@ 2022-07-06 02:37 Amit Langote <[email protected]>
parent: Jacob Champion <[email protected]>
0 siblings, 2 replies; 71+ messages in thread
From: Amit Langote @ 2022-07-06 02:37 UTC (permalink / raw)
To: Jacob Champion <[email protected]>; +Cc: Zhihong Yu <[email protected]>; David Rowley <[email protected]>; Robert Haas <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers
On Wed, Jul 6, 2022 at 2:43 AM Jacob Champion <[email protected]> wrote:
> On Fri, May 27, 2022 at 1:09 AM Amit Langote <[email protected]> wrote:
> > 0001 contains the mechanical changes of moving PartitionPruneInfo out
> > of Append/MergeAppend into a list in PlannedStmt.
> >
> > 0002 is the main patch to "Optimize AcquireExecutorLocks() by locking
> > only unpruned partitions".
>
> This patchset will need to be rebased over 835d476fd21; looks like
> just a cosmetic change.
Thanks for the heads up.
Rebased and also fixed per comments given by Zhihong Yu on May 28.
--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com
Attachments:
[application/octet-stream] v17-0001-Move-PartitioPruneInfo-out-of-plan-nodes-into-Pl.patch (21.2K, 2-v17-0001-Move-PartitioPruneInfo-out-of-plan-nodes-into-Pl.patch)
From 665055be44caaec9dcc2a3251f20ceb3c678fa3d Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Fri, 27 May 2022 16:00:28 +0900
Subject: [PATCH v17 1/2] Move PartitionPruneInfo out of plan nodes into
PlannedStmt
The planner will now add a given PartitionPruneInfo to
PlannedStmt.partPruneInfos instead of directly to the
Append/MergeAppend plan node. What gets set instead in the
latter is an index field which points to the list element
of PlannedStmt.partPruneInfos containing the PartitionPruneInfo
belonging to the plan node.
A later commit will make AcquireExecutorLocks() do the initial
partition pruning to determine a minimal set of partitions to be
locked when validating a plan tree, and it will need to consult the
PartitionPruneInfos referenced therein to do so. It is better
for the PartitionPruneInfos to be directly accessible by simply
iterating over PlannedStmt.partPruneInfos than to require
a walk of the plan tree to find them.
---
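Reader's note (not part of the patch): the commit message above replaces a
pointer in each Append/MergeAppend node with an index into a statement-level
list. The sketch below illustrates that indirection in plain C, under stated
assumptions: every Demo* type and demo_* function is hypothetical, a fixed
array stands in for a PostgreSQL List, and the rebasing step only mirrors
what the setrefs.c hunks in this patch appear to do (bump a query-local
index by the current length of the global list, then append the local
entries to it).

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PRUNE_INFOS 16

/* Hypothetical stand-in for PartitionPruneInfo; only an id for the demo. */
typedef struct DemoPruneInfo
{
    int id;
} DemoPruneInfo;

/* Hypothetical stand-in for a list such as PlannedStmt.partPruneInfos. */
typedef struct DemoPruneInfoList
{
    DemoPruneInfo *items[MAX_PRUNE_INFOS];
    int            length;
} DemoPruneInfoList;

/* Hypothetical stand-in for Append/MergeAppend: an index, not a pointer. */
typedef struct DemoAppend
{
    int part_prune_index;   /* -1 means "no run-time pruning" */
} DemoAppend;

/* Append an item and return its 0-based index in the list. */
static int
demo_list_append(DemoPruneInfoList *list, DemoPruneInfo *info)
{
    assert(list->length < MAX_PRUNE_INFOS);
    list->items[list->length] = info;
    return list->length++;
}

/* Rebase a query-local index onto the global list; -1 stays -1. */
static void
demo_rebase_index(DemoAppend *node, const DemoPruneInfoList *global)
{
    if (node->part_prune_index >= 0)
        node->part_prune_index += global->length;
}

/* Resolve a node's index; NULL when the node does no run-time pruning. */
static DemoPruneInfo *
demo_lookup(const DemoPruneInfoList *list, const DemoAppend *node)
{
    if (node->part_prune_index < 0)
        return NULL;
    assert(node->part_prune_index < list->length);
    return list->items[node->part_prune_index];
}

/* Walk through the flow once and return the id that the node resolves to. */
static int
demo_run(void)
{
    DemoPruneInfo     outer = {100};
    DemoPruneInfo     inner = {200};
    DemoPruneInfoList global = {{NULL}, 0};
    DemoPruneInfoList local = {{NULL}, 0};
    DemoAppend        node = {-1};

    /* An earlier query level has already contributed one entry. */
    demo_list_append(&global, &outer);

    /* Planning this level records a query-local index (0 here). */
    node.part_prune_index = demo_list_append(&local, &inner);

    /* Rebase the index first, then move the local entries to the global list. */
    demo_rebase_index(&node, &global);
    for (int i = 0; i < local.length; i++)
        demo_list_append(&global, local.items[i]);

    return demo_lookup(&global, &node)->id;
}
```

With the list kept at statement level, code that needs every pruning
descriptor can iterate the list directly instead of walking the plan tree,
which is the motivation the commit message gives.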
src/backend/executor/execMain.c | 1 +
src/backend/executor/execParallel.c | 1 +
src/backend/executor/execPartition.c | 4 +-
src/backend/executor/execUtils.c | 2 +
src/backend/executor/nodeAppend.c | 4 +-
src/backend/executor/nodeMergeAppend.c | 4 +-
src/backend/nodes/copyfuncs.c | 5 +-
src/backend/nodes/outfuncs.c | 7 ++-
src/backend/nodes/readfuncs.c | 5 +-
src/backend/optimizer/plan/createplan.c | 24 ++++-----
src/backend/optimizer/plan/planner.c | 1 +
src/backend/optimizer/plan/setrefs.c | 65 +++++++++++++------------
src/backend/partitioning/partprune.c | 18 ++++---
src/include/executor/execPartition.h | 3 +-
src/include/nodes/execnodes.h | 2 +
src/include/nodes/pathnodes.h | 6 +++
src/include/nodes/plannodes.h | 11 +++--
src/include/partitioning/partprune.h | 8 +--
18 files changed, 103 insertions(+), 68 deletions(-)
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index ef2fd46092..72fc273524 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -825,6 +825,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
ExecInitRangeTable(estate, rangeTable);
estate->es_plannedstmt = plannedstmt;
+ estate->es_part_prune_infos = plannedstmt->partPruneInfos;
/*
* Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index f1fd7f7e8b..f73b8c2607 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -183,6 +183,7 @@ ExecSerializePlan(Plan *plan, EState *estate)
pstmt->dependsOnRole = false;
pstmt->parallelModeNeeded = false;
pstmt->planTree = plan;
+ pstmt->partPruneInfos = estate->es_part_prune_infos;
pstmt->rtable = estate->es_range_table;
pstmt->resultRelations = NIL;
pstmt->appendRelations = NIL;
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index e03ea27299..b55cdd2580 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -1638,11 +1638,13 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
PartitionPruneState *
ExecInitPartitionPruning(PlanState *planstate,
int n_total_subplans,
- PartitionPruneInfo *pruneinfo,
+ int part_prune_index,
Bitmapset **initially_valid_subplans)
{
PartitionPruneState *prunestate;
EState *estate = planstate->state;
+ PartitionPruneInfo *pruneinfo = list_nth(estate->es_part_prune_infos,
+ part_prune_index);
/* We may need an expression context to evaluate partition exprs */
ExecAssignExprContext(estate, planstate);
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 9df1f81ea8..f9c7976ff2 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -119,6 +119,8 @@ CreateExecutorState(void)
estate->es_relations = NULL;
estate->es_rowmarks = NULL;
estate->es_plannedstmt = NULL;
+ estate->es_part_prune_infos = NIL;
+ estate->es_part_prune_result = NULL;
estate->es_junkFilter = NULL;
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 357e10a1d7..c6f86a6510 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -134,7 +134,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
appendstate->as_begun = false;
/* If run-time partition pruning is enabled, then set that up now */
- if (node->part_prune_info != NULL)
+ if (node->part_prune_index >= 0)
{
PartitionPruneState *prunestate;
@@ -145,7 +145,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
*/
prunestate = ExecInitPartitionPruning(&appendstate->ps,
list_length(node->appendplans),
- node->part_prune_info,
+ node->part_prune_index,
&validsubplans);
appendstate->as_prune_state = prunestate;
nplans = bms_num_members(validsubplans);
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index c5c62fa5c7..8d35860c30 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -82,7 +82,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
mergestate->ps.ExecProcNode = ExecMergeAppend;
/* If run-time partition pruning is enabled, then set that up now */
- if (node->part_prune_info != NULL)
+ if (node->part_prune_index >= 0)
{
PartitionPruneState *prunestate;
@@ -93,7 +93,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
*/
prunestate = ExecInitPartitionPruning(&mergestate->ps,
list_length(node->mergeplans),
- node->part_prune_info,
+ node->part_prune_index,
&validsubplans);
mergestate->ms_prune_state = prunestate;
nplans = bms_num_members(validsubplans);
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 706d283a92..b02b4a641c 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -96,6 +96,7 @@ _copyPlannedStmt(const PlannedStmt *from)
COPY_SCALAR_FIELD(parallelModeNeeded);
COPY_SCALAR_FIELD(jitFlags);
COPY_NODE_FIELD(planTree);
+ COPY_NODE_FIELD(partPruneInfos);
COPY_NODE_FIELD(rtable);
COPY_NODE_FIELD(resultRelations);
COPY_NODE_FIELD(appendRelations);
@@ -253,7 +254,7 @@ _copyAppend(const Append *from)
COPY_NODE_FIELD(appendplans);
COPY_SCALAR_FIELD(nasyncplans);
COPY_SCALAR_FIELD(first_partial_plan);
- COPY_NODE_FIELD(part_prune_info);
+ COPY_SCALAR_FIELD(part_prune_index);
return newnode;
}
@@ -281,7 +282,7 @@ _copyMergeAppend(const MergeAppend *from)
COPY_POINTER_FIELD(sortOperators, from->numCols * sizeof(Oid));
COPY_POINTER_FIELD(collations, from->numCols * sizeof(Oid));
COPY_POINTER_FIELD(nullsFirst, from->numCols * sizeof(bool));
- COPY_NODE_FIELD(part_prune_info);
+ COPY_SCALAR_FIELD(part_prune_index);
return newnode;
}
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 4315c53080..7618444b4d 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -325,6 +325,7 @@ _outPlannedStmt(StringInfo str, const PlannedStmt *node)
WRITE_BOOL_FIELD(parallelModeNeeded);
WRITE_INT_FIELD(jitFlags);
WRITE_NODE_FIELD(planTree);
+ WRITE_NODE_FIELD(partPruneInfos);
WRITE_NODE_FIELD(rtable);
WRITE_NODE_FIELD(resultRelations);
WRITE_NODE_FIELD(appendRelations);
@@ -454,7 +455,7 @@ _outAppend(StringInfo str, const Append *node)
WRITE_NODE_FIELD(appendplans);
WRITE_INT_FIELD(nasyncplans);
WRITE_INT_FIELD(first_partial_plan);
- WRITE_NODE_FIELD(part_prune_info);
+ WRITE_INT_FIELD(part_prune_index);
}
static void
@@ -471,7 +472,7 @@ _outMergeAppend(StringInfo str, const MergeAppend *node)
WRITE_OID_ARRAY(sortOperators, node->numCols);
WRITE_OID_ARRAY(collations, node->numCols);
WRITE_BOOL_ARRAY(nullsFirst, node->numCols);
- WRITE_NODE_FIELD(part_prune_info);
+ WRITE_INT_FIELD(part_prune_index);
}
static void
@@ -2438,6 +2439,7 @@ _outPlannerGlobal(StringInfo str, const PlannerGlobal *node)
WRITE_NODE_FIELD(finalrowmarks);
WRITE_NODE_FIELD(resultRelations);
WRITE_NODE_FIELD(appendRelations);
+ WRITE_NODE_FIELD(partPruneInfos);
WRITE_NODE_FIELD(relationOids);
WRITE_NODE_FIELD(invalItems);
WRITE_NODE_FIELD(paramExecTypes);
@@ -2505,6 +2507,7 @@ _outPlannerInfo(StringInfo str, const PlannerInfo *node)
WRITE_BITMAPSET_FIELD(curOuterRels);
WRITE_NODE_FIELD(curOuterParams);
WRITE_BOOL_FIELD(partColsUpdated);
+ WRITE_NODE_FIELD(partPruneInfos);
}
static void
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 6a05b69415..bf602ff93e 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -1817,6 +1817,7 @@ _readPlannedStmt(void)
READ_BOOL_FIELD(parallelModeNeeded);
READ_INT_FIELD(jitFlags);
READ_NODE_FIELD(planTree);
+ READ_NODE_FIELD(partPruneInfos);
READ_NODE_FIELD(rtable);
READ_NODE_FIELD(resultRelations);
READ_NODE_FIELD(appendRelations);
@@ -1949,7 +1950,7 @@ _readAppend(void)
READ_NODE_FIELD(appendplans);
READ_INT_FIELD(nasyncplans);
READ_INT_FIELD(first_partial_plan);
- READ_NODE_FIELD(part_prune_info);
+ READ_INT_FIELD(part_prune_index);
READ_DONE();
}
@@ -1971,7 +1972,7 @@ _readMergeAppend(void)
READ_OID_ARRAY(sortOperators, local_node->numCols);
READ_OID_ARRAY(collations, local_node->numCols);
READ_BOOL_ARRAY(nullsFirst, local_node->numCols);
- READ_NODE_FIELD(part_prune_info);
+ READ_INT_FIELD(part_prune_index);
READ_DONE();
}
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index 76606faa3e..58a05cf673 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -1203,7 +1203,6 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
ListCell *subpaths;
int nasyncplans = 0;
RelOptInfo *rel = best_path->path.parent;
- PartitionPruneInfo *partpruneinfo = NULL;
int nodenumsortkeys = 0;
AttrNumber *nodeSortColIdx = NULL;
Oid *nodeSortOperators = NULL;
@@ -1354,6 +1353,9 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
subplans = lappend(subplans, subplan);
}
+ /* Set below if we find quals that we can use to run-time prune */
+ plan->part_prune_index = -1;
+
/*
* If any quals exist, they may be useful to perform further partition
* pruning during execution. Gather information needed by the executor to
@@ -1377,16 +1379,14 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
}
if (prunequal != NIL)
- partpruneinfo =
- make_partition_pruneinfo(root, rel,
- best_path->subpaths,
- prunequal);
+ plan->part_prune_index = make_partition_pruneinfo(root, rel,
+ best_path->subpaths,
+ prunequal);
}
plan->appendplans = subplans;
plan->nasyncplans = nasyncplans;
plan->first_partial_plan = best_path->first_partial_path;
- plan->part_prune_info = partpruneinfo;
copy_generic_path_info(&plan->plan, (Path *) best_path);
@@ -1426,7 +1426,6 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
List *subplans = NIL;
ListCell *subpaths;
RelOptInfo *rel = best_path->path.parent;
- PartitionPruneInfo *partpruneinfo = NULL;
/*
* We don't have the actual creation of the MergeAppend node split out
@@ -1519,6 +1518,9 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
subplans = lappend(subplans, subplan);
}
+ /* Set below if we find quals that we can use to run-time prune */
+ node->part_prune_index = -1;
+
/*
* If any quals exist, they may be useful to perform further partition
* pruning during execution. Gather information needed by the executor to
@@ -1542,13 +1544,13 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
}
if (prunequal != NIL)
- partpruneinfo = make_partition_pruneinfo(root, rel,
- best_path->subpaths,
- prunequal);
+ node->part_prune_index = make_partition_pruneinfo(root, rel,
+ best_path->subpaths,
+ prunequal);
}
node->mergeplans = subplans;
- node->part_prune_info = partpruneinfo;
+
/*
* If prepare_sort_from_pathkeys added sort columns, but we were told to
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 06ad856eac..b11249ed8f 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -518,6 +518,7 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
result->dependsOnRole = glob->dependsOnRole;
result->parallelModeNeeded = glob->parallelModeNeeded;
result->planTree = top_plan;
+ result->partPruneInfos = glob->partPruneInfos;
result->rtable = glob->finalrtable;
result->resultRelations = glob->resultRelations;
result->appendRelations = glob->appendRelations;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 9cef92cab2..b8d5610593 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -348,6 +348,29 @@ set_plan_references(PlannerInfo *root, Plan *plan)
}
}
+ /* Also fix up the information in PartitionPruneInfos. */
+ foreach (lc, root->partPruneInfos)
+ {
+ PartitionPruneInfo *pruneinfo = lfirst(lc);
+ ListCell *l;
+
+ foreach(l, pruneinfo->prune_infos)
+ {
+ List *prune_infos = lfirst(l);
+ ListCell *l2;
+
+ foreach(l2, prune_infos)
+ {
+ PartitionedRelPruneInfo *pinfo = lfirst(l2);
+
+ /* RT index of the table to which the pinfo belongs. */
+ pinfo->rtindex += rtoffset;
+ }
+ }
+
+ glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
+ }
+
return result;
}
@@ -1655,21 +1678,12 @@ set_append_references(PlannerInfo *root,
aplan->apprelids = offset_relid_set(aplan->apprelids, rtoffset);
- if (aplan->part_prune_info)
- {
- foreach(l, aplan->part_prune_info->prune_infos)
- {
- List *prune_infos = lfirst(l);
- ListCell *l2;
-
- foreach(l2, prune_infos)
- {
- PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
- pinfo->rtindex += rtoffset;
- }
- }
- }
+ /*
+ * PartitionPruneInfos will be added to a list in PlannerGlobal, so update
+ * the index.
+ */
+ if (aplan->part_prune_index >= 0)
+ aplan->part_prune_index += list_length(root->glob->partPruneInfos);
/* We don't need to recurse to lefttree or righttree ... */
Assert(aplan->plan.lefttree == NULL);
@@ -1727,21 +1741,12 @@ set_mergeappend_references(PlannerInfo *root,
mplan->apprelids = offset_relid_set(mplan->apprelids, rtoffset);
- if (mplan->part_prune_info)
- {
- foreach(l, mplan->part_prune_info->prune_infos)
- {
- List *prune_infos = lfirst(l);
- ListCell *l2;
-
- foreach(l2, prune_infos)
- {
- PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
- pinfo->rtindex += rtoffset;
- }
- }
- }
+ /*
+ * PartitionPruneInfos will be added to a list in PlannerGlobal, so update
+ * the index.
+ */
+ if (mplan->part_prune_index >= 0)
+ mplan->part_prune_index += list_length(root->glob->partPruneInfos);
/* We don't need to recurse to lefttree or righttree ... */
Assert(mplan->plan.lefttree == NULL);
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 9d3c05aed3..d77f7d3aef 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -209,16 +209,20 @@ static void partkey_datum_from_expr(PartitionPruneContext *context,
/*
* make_partition_pruneinfo
- * Builds a PartitionPruneInfo which can be used in the executor to allow
- * additional partition pruning to take place. Returns NULL when
- * partition pruning would be useless.
+ * Checks if the given set of quals can be used to build pruning steps
+ * that the executor can use to prune away unneeded partitions. If
+ * suitable quals are found then a PartitionPruneInfo is built and tagged
+ * onto the PlannerInfo's partPruneInfos list.
+ *
+ * The return value is the 0-based index of the item added to the
+ * partPruneInfos list or -1 if nothing was added.
*
* 'parentrel' is the RelOptInfo for an appendrel, and 'subpaths' is the list
* of scan paths for its child rels.
* 'prunequal' is a list of potential pruning quals (i.e., restriction
* clauses that are applicable to the appendrel).
*/
-PartitionPruneInfo *
+int
make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
List *subpaths,
List *prunequal)
@@ -332,7 +336,7 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
* quals, then we can just not bother with run-time pruning.
*/
if (prunerelinfos == NIL)
- return NULL;
+ return -1;
/* Else build the result data structure */
pruneinfo = makeNode(PartitionPruneInfo);
@@ -358,7 +362,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
else
pruneinfo->other_subplans = NULL;
- return pruneinfo;
+ root->partPruneInfos = lappend(root->partPruneInfos, pruneinfo);
+
+ return list_length(root->partPruneInfos) - 1;
}
/*
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 708435e952..bf962af7af 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -123,9 +123,8 @@ typedef struct PartitionPruneState
extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
int n_total_subplans,
- PartitionPruneInfo *pruneinfo,
+ int part_prune_index,
Bitmapset **initially_valid_subplans);
extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
bool initial_prune);
-
#endif /* EXECPARTITION_H */
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 5728801379..25e0bb976e 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -596,6 +596,8 @@ typedef struct EState
struct ExecRowMark **es_rowmarks; /* Array of per-range-table-entry
* ExecRowMarks, or NULL if none */
PlannedStmt *es_plannedstmt; /* link to top of plan tree */
+ List *es_part_prune_infos; /* PlannedStmt.partPruneInfos */
+ struct PartitionPruneResult *es_part_prune_result; /* QueryDesc.part_prune_result */
const char *es_sourceText; /* Source text from QueryDesc */
JunkFilter *es_junkFilter; /* top-level junk filter, if any */
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index b88cfb8dc0..a0f3a46334 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -107,6 +107,9 @@ typedef struct PlannerGlobal
List *appendRelations; /* "flat" list of AppendRelInfos */
+ List *partPruneInfos; /* List of PartitionPruneInfo contained in
+ * the plan */
+
List *relationOids; /* OIDs of relations the plan depends on */
List *invalItems; /* other dependencies, as PlanInvalItems */
@@ -386,6 +389,9 @@ struct PlannerInfo
/* Does this query modify any partition key columns? */
bool partColsUpdated;
+
+ /* PartitionPruneInfos added in this query's plan. */
+ List *partPruneInfos;
};
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index d5c0ebe859..c3f4a39657 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -64,6 +64,9 @@ typedef struct PlannedStmt
struct Plan *planTree; /* tree of Plan nodes */
+ List *partPruneInfos; /* List of PartitionPruneInfo contained in
+ * the plan */
+
List *rtable; /* list of RangeTblEntry nodes */
/* rtable indexes of target relations for INSERT/UPDATE/DELETE */
@@ -262,8 +265,8 @@ typedef struct Append
*/
int first_partial_plan;
- /* Info for run-time subplan pruning; NULL if we're not doing that */
- struct PartitionPruneInfo *part_prune_info;
+ /* Index to PlannerInfo.partPruneInfos or -1 if no run-time pruning */
+ int part_prune_index;
} Append;
/* ----------------
@@ -297,8 +300,8 @@ typedef struct MergeAppend
/* NULLS FIRST/LAST directions */
bool *nullsFirst;
- /* Info for run-time subplan pruning; NULL if we're not doing that */
- struct PartitionPruneInfo *part_prune_info;
+ /* Index to PlannerInfo.partPruneInfos or -1 if no run-time pruning */
+ int part_prune_index;
} MergeAppend;
/* ----------------
diff --git a/src/include/partitioning/partprune.h b/src/include/partitioning/partprune.h
index 90684efa25..ebf0dcff8c 100644
--- a/src/include/partitioning/partprune.h
+++ b/src/include/partitioning/partprune.h
@@ -70,10 +70,10 @@ typedef struct PartitionPruneContext
#define PruneCxtStateIdx(partnatts, step_id, keyno) \
((partnatts) * (step_id) + (keyno))
-extern PartitionPruneInfo *make_partition_pruneinfo(struct PlannerInfo *root,
- struct RelOptInfo *parentrel,
- List *subpaths,
- List *prunequal);
+extern int make_partition_pruneinfo(struct PlannerInfo *root,
+ struct RelOptInfo *parentrel,
+ List *subpaths,
+ List *prunequal);
extern Bitmapset *prune_append_rel_partitions(struct RelOptInfo *rel);
extern Bitmapset *get_matching_partitions(PartitionPruneContext *context,
List *pruning_steps);
--
2.35.3
[application/octet-stream] v17-0002-Optimize-AcquireExecutorLocks-by-locking-only-un.patch (87.2K, 3-v17-0002-Optimize-AcquireExecutorLocks-by-locking-only-un.patch)
From e5d0283732311fb068ad75ee4ff282ebe5306266 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Wed, 22 Dec 2021 16:55:17 +0900
Subject: [PATCH v17 2/2] Optimize AcquireExecutorLocks() by locking only
unpruned partitions
This commit teaches AcquireExecutorLocks() to perform initial
partition pruning to notionally eliminate the subnodes contained in a
generic cached plan that need not be initialized during the actual
execution of the plan and skip locking the partitions scanned by those
subnodes.
The result of performing initial partition pruning this way, before the
actual execution has started, is passed to the executor as a
PartitionPruneResult, supplied along with the PlannedStmt by the
callers of the executor that used plancache.c to get the plan. It is NULL
in the cases in which the plan is obtained by calling the planner
directly or if the plan obtained by plancache.c is not a generic one.
---
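Reader's note (not part of the patch): the commit message above describes a
contract in which the executor reuses a pruning result computed before
execution when one is supplied, and falls back to pruning on its own when
the result is NULL. The following is a minimal sketch of that contract only,
assuming a 32-bit mask in place of a Bitmapset and a toy pruning rule; all
Demo* and demo_* names are hypothetical and do not appear in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for PartitionPruneResult: one mask of surviving
 * subplan indexes per pruning descriptor.
 */
typedef struct DemoPruneResult
{
    const uint32_t *valid_subplans;   /* indexed by part_prune_index */
    int             n;
} DemoPruneResult;

/*
 * Toy "initial pruning" rule (an assumption for the demo): the parameter
 * value selects exactly one subplan.
 */
static uint32_t
demo_do_initial_pruning(int n_subplans, int param)
{
    assert(n_subplans > 0 && n_subplans <= 32);
    assert(param >= 0 && param < n_subplans);
    return (uint32_t) 1 << param;
}

/*
 * Reuse the precomputed set when one was passed in; otherwise prune now.
 * Reusing it keeps the executed subplans identical to the locked ones.
 */
static uint32_t
demo_initially_valid_subplans(const DemoPruneResult *result,
                              int part_prune_index,
                              int n_subplans, int param)
{
    if (result != NULL)
    {
        assert(part_prune_index >= 0 && part_prune_index < result->n);
        return result->valid_subplans[part_prune_index];
    }
    return demo_do_initial_pruning(n_subplans, param);
}

/* Number of subplans (and so relations to lock) in a mask. */
static int
demo_count_members(uint32_t mask)
{
    int count = 0;

    while (mask != 0)
    {
        count += (int) (mask & 1u);
        mask >>= 1;
    }
    return count;
}
```

The point of the first branch is consistency rather than speed: if pruning
were redone at executor startup it could select a subplan whose relation was
never locked, which is the hazard the README text in this patch describes.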
src/backend/commands/copyto.c | 2 +-
src/backend/commands/createas.c | 2 +-
src/backend/commands/explain.c | 7 +-
src/backend/commands/extension.c | 2 +-
src/backend/commands/matview.c | 2 +-
src/backend/commands/prepare.c | 26 ++-
src/backend/executor/README | 32 ++++
src/backend/executor/execMain.c | 53 ++++++
src/backend/executor/execParallel.c | 27 ++-
src/backend/executor/execPartition.c | 234 +++++++++++++++++++++----
src/backend/executor/functions.c | 2 +-
src/backend/executor/nodeAppend.c | 11 +-
src/backend/executor/nodeMergeAppend.c | 5 +-
src/backend/executor/spi.c | 27 ++-
src/backend/nodes/copyfuncs.c | 27 +++
src/backend/nodes/outfuncs.c | 29 +++
src/backend/nodes/readfuncs.c | 51 ++++++
src/backend/optimizer/plan/planner.c | 2 +
src/backend/optimizer/plan/setrefs.c | 46 +++++
src/backend/partitioning/partprune.c | 41 ++++-
src/backend/tcop/postgres.c | 8 +-
src/backend/tcop/pquery.c | 28 ++-
src/backend/utils/cache/plancache.c | 184 ++++++++++++++++---
src/backend/utils/mmgr/portalmem.c | 19 ++
src/include/commands/explain.h | 4 +-
src/include/executor/execPartition.h | 9 +-
src/include/executor/execdesc.h | 3 +
src/include/executor/executor.h | 2 +
src/include/nodes/execnodes.h | 27 +++
src/include/nodes/nodes.h | 4 +
src/include/nodes/pathnodes.h | 9 +
src/include/nodes/plannodes.h | 21 +++
src/include/utils/plancache.h | 3 +-
src/include/utils/portal.h | 3 +
34 files changed, 856 insertions(+), 96 deletions(-)
diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index fca29a9a10..d839517693 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -541,7 +541,7 @@ BeginCopyTo(ParseState *pstate,
((DR_copy *) dest)->cstate = cstate;
/* Create a QueryDesc requesting no output */
- cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ cstate->queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
GetActiveSnapshot(),
InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 9abbb6b555..f6607f2454 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -325,7 +325,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index e29c2ae206..e41b13a3ea 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -407,7 +407,7 @@ ExplainOneQuery(Query *query, int cursorOptions,
}
/* run it (if needed) and produce output */
- ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+ ExplainOnePlan(plan, NULL, into, es, queryString, params, queryEnv,
&planduration, (es->buffers ? &bufusage : NULL));
}
}
@@ -515,7 +515,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
* to call it.
*/
void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, PartitionPruneResult *part_prune_result,
+ IntoClause *into, ExplainState *es,
const char *queryString, ParamListInfo params,
QueryEnvironment *queryEnv, const instr_time *planduration,
const BufferUsage *bufusage)
@@ -563,7 +564,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
dest = None_Receiver;
/* Create a QueryDesc for the query */
- queryDesc = CreateQueryDesc(plannedstmt, queryString,
+ queryDesc = CreateQueryDesc(plannedstmt, part_prune_result, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, instrument_option);
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index 3db859c3ea..631cc07217 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -776,7 +776,7 @@ execute_sql_string(const char *sql)
{
QueryDesc *qdesc;
- qdesc = CreateQueryDesc(stmt,
+ qdesc = CreateQueryDesc(stmt, NULL,
sql,
GetActiveSnapshot(), NULL,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index d1ee106465..e878209674 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -408,7 +408,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, queryString,
+ queryDesc = CreateQueryDesc(plan, NULL, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 2333aae467..83465e40f8 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -155,6 +155,7 @@ ExecuteQuery(ParseState *pstate,
PreparedStatement *entry;
CachedPlan *cplan;
List *plan_list;
+ List *part_prune_result_list;
ParamListInfo paramLI = NULL;
EState *estate = NULL;
Portal portal;
@@ -193,7 +194,10 @@ ExecuteQuery(ParseState *pstate,
entry->plansource->query_string);
/* Replan if needed, and increment plan refcount for portal */
- cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL);
+ cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL,
+ &part_prune_result_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_result_list));
plan_list = cplan->stmt_list;
/*
@@ -207,6 +211,9 @@ ExecuteQuery(ParseState *pstate,
plan_list,
cplan);
+ /* Copy PartitionPruneResults into the portal's context. */
+ PortalStorePartitionPruneResults(portal, part_prune_result_list);
+
/*
* For CREATE TABLE ... AS EXECUTE, we must verify that the prepared
* statement is one that produces tuples. Currently we insist that it be
@@ -576,7 +583,9 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
const char *query_string;
CachedPlan *cplan;
List *plan_list;
- ListCell *p;
+ List *part_prune_result_list;
+ ListCell *p,
+ *pp;
ParamListInfo paramLI = NULL;
EState *estate = NULL;
instr_time planstart;
@@ -619,7 +628,10 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
/* Replan if needed, and acquire a transient refcount */
cplan = GetCachedPlan(entry->plansource, paramLI,
- CurrentResourceOwner, queryEnv);
+ CurrentResourceOwner, queryEnv,
+ &part_prune_result_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_result_list));
INSTR_TIME_SET_CURRENT(planduration);
INSTR_TIME_SUBTRACT(planduration, planstart);
@@ -634,13 +646,15 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
plan_list = cplan->stmt_list;
/* Explain each query */
- foreach(p, plan_list)
+ forboth(p, plan_list, pp, part_prune_result_list)
{
PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
+ PartitionPruneResult *part_prune_result = lfirst_node(PartitionPruneResult, pp);
if (pstmt->commandType != CMD_UTILITY)
- ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv,
- &planduration, (es->buffers ? &bufusage : NULL));
+ ExplainOnePlan(pstmt, part_prune_result, into, es, query_string,
+ paramLI, queryEnv, &planduration,
+ (es->buffers ? &bufusage : NULL));
else
ExplainOneUtility(pstmt->utilityStmt, into, es, query_string,
paramLI, queryEnv);
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 0b5183fc4a..953a476ea5 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -65,6 +65,34 @@ found there. This currently only occurs for Append and MergeAppend nodes. In
this case the non-required subplans are ignored and the executor state's
subnode array will become out of sequence to the plan's subplan list.
+Actually, the so-called execution time pruning may also occur even before the
+execution has started. One case where that occurs is when a cached generic
+plan is being validated for execution by plancache.c: GetCachedPlan(), which
+works by locking all the relations that will be scanned by that plan. If the
+generic plan contains nodes that can perform execution time partition pruning
+(that is, contain a PartitionPruneInfo), a subset of pruning steps contained
+in a given node's PartitionPruneInfo that do not depend on the execution
+actually having started (called "initial" pruning steps) are performed at this
+point to figure out the minimal set of child subplans that satisfy those
+pruning steps. AcquireExecutorLocks() looking at a given plan tree will then
+lock only the relations scanned by the child subplans that survived such
+pruning, along with those present in PlannedStmt.minLockRelids. Note that the
+subplans are only notionally pruned in that they are not removed from the plan
+tree as such.
+
+To prevent the executor and any third party execution code that can look at
+the plan tree from trying to execute the subplans that were pruned as
+described above, the result of pruning is passed to the executor as a
+PartitionPruneResult node via the QueryDesc. It consists of the set of
+indexes of surviving subplans in their respective parent plan node's list of
+child subplans, saved as a list of bitmapsets, with one element for every
+parent plan node whose PartitionPruneInfo is present in
+PlannedStmt.partPruneInfos. In other words, the executor should not
+re-evaluate the set of initially valid subplans by redoing the initial pruning
+if it was already done by AcquireExecutorLocks(), because the re-evaluation may
+very well end up resulting in a different set of subplans, containing some
+whose relations were not locked by AcquireExecutorLocks().
+
Each Plan node may have expression trees associated with it, to represent
its target list, qualification conditions, etc. These trees are also
read-only to the executor, but the executor state for expression evaluation
@@ -286,6 +314,10 @@ Query Processing Control Flow
This is a sketch of control flow for full query processing:
+ [ ExecutorDoInitialPruning ] --- an optional step to perform initial
+ partition pruning on the plan tree, the result of which is passed
+ to the executor via QueryDesc
+
CreateQueryDesc
ExecutorStart
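As an aside for readers of the README text above, here is a minimal standalone sketch (not code from the patch) of how a precomputed pruning result might be consulted instead of redoing initial pruning. `ToyPruneResult`, `toy_all_subplans` and `toy_initially_valid_subplans` are hypothetical names, and a 64-bit mask stands in for a Bitmapset. When no result is supplied, the sketch simply treats every subplan as valid, which is a simplification of what the patched ExecInitPartitionPruning() does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for PartitionPruneResult: one mask per
 * PartitionPruneInfo, where bit i set means child subplan i survived
 * initial pruning.
 */
typedef struct ToyPruneResult
{
    int             ninfos;
    const uint64_t *valid_subplan_offs; /* indexed by part_prune_index */
} ToyPruneResult;

/* Mask with bits 0 .. n-1 set; n must be in the range 1..64. */
uint64_t
toy_all_subplans(int n)
{
    assert(n > 0 && n <= 64);
    return (n == 64) ? UINT64_MAX : ((UINT64_C(1) << n) - 1);
}

/*
 * Return the set of subplans to initialize for one parent plan node.
 * If a result was computed up front, reuse it verbatim so that the set
 * matches what was locked; otherwise (simplification) keep all subplans.
 */
uint64_t
toy_initially_valid_subplans(const ToyPruneResult *result,
                             int part_prune_index,
                             int n_total_subplans)
{
    if (result != NULL)
    {
        assert(part_prune_index >= 0 && part_prune_index < result->ninfos);
        return result->valid_subplan_offs[part_prune_index];
    }
    return toy_all_subplans(n_total_subplans);
}
```

The point of the design is that the lookup never re-evaluates pruning expressions, so it cannot produce a subplan whose relation was left unlocked.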
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 72fc273524..45824624f8 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -49,6 +49,7 @@
#include "commands/matview.h"
#include "commands/trigger.h"
#include "executor/execdebug.h"
+#include "executor/execPartition.h"
#include "executor/nodeSubplan.h"
#include "foreign/fdwapi.h"
#include "jit/jit.h"
@@ -104,6 +105,56 @@ static void EvalPlanQualStart(EPQState *epqstate, Plan *planTree);
/* end of local decls */
+/* ----------------------------------------------------------------
+ * ExecutorDoInitialPruning
+ *
+ * For each plan tree node that has been assigned a PartitionPruneInfo,
+ * this performs initial partition pruning using the information contained
+ * therein to determine the set of child subplans that satisfy the initial
+ * pruning steps, to be returned as a bitmapset of their indexes in the
+ * node's list of child subplans (for example, an Append's appendplans).
+ *
+ * Return value is a PartitionPruneResult node that contains a list of those
+ * bitmapsets, with one element for every PartitionPruneInfo, and a bitmapset
+ * of the RT indexes of all the leaf partitions scanned by those chosen
+ * subplans. Note that the latter is shared across all PartitionPruneInfos.
+ *
+ * The executor must see exactly the same set of subplans as valid for
+ * execution when doing ExecInitNode() on the plan nodes whose
+ * PartitionPruneInfos are processed here. So, it must get the set from the
+ * aforementioned PartitionPruneResult, instead of computing it all over
+ * again by redoing the initial pruning. It's the caller's job to pass the
+ * PartitionPruneResult to the executor.
+ *
+ * Note: Partitioned tables mentioned in PartitionedRelPruneInfo nodes that
+ * drive the pruning will be locked before doing the pruning.
+ * ----------------------------------------------------------------
+ */
+PartitionPruneResult *
+ExecutorDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params)
+{
+ PartitionPruneResult *result;
+ ListCell *lc;
+
+ /* Only get here if there is any pruning to do. */
+ Assert(plannedstmt->containsInitialPruning);
+
+ result = makeNode(PartitionPruneResult);
+ foreach(lc, plannedstmt->partPruneInfos)
+ {
+ PartitionPruneInfo *pruneinfo = lfirst(lc);
+ Bitmapset *valid_subplan_offs;
+
+ valid_subplan_offs =
+ ExecPartitionDoInitialPruning(plannedstmt, params, pruneinfo,
+ &result->scan_leafpart_rtis);
+ result->valid_subplan_offs_list =
+ lappend(result->valid_subplan_offs_list,
+ valid_subplan_offs);
+ }
+
+ return result;
+}
/* ----------------------------------------------------------------
* ExecutorStart
@@ -806,6 +857,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
{
CmdType operation = queryDesc->operation;
PlannedStmt *plannedstmt = queryDesc->plannedstmt;
+ PartitionPruneResult *part_prune_result = queryDesc->part_prune_result;
Plan *plan = plannedstmt->planTree;
List *rangeTable = plannedstmt->rtable;
EState *estate = queryDesc->estate;
@@ -826,6 +878,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
estate->es_plannedstmt = plannedstmt;
estate->es_part_prune_infos = plannedstmt->partPruneInfos;
+ estate->es_part_prune_result = part_prune_result;
/*
* Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index f73b8c2607..7e6dab5623 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -66,6 +66,7 @@
#define PARALLEL_KEY_QUERY_TEXT UINT64CONST(0xE000000000000008)
#define PARALLEL_KEY_JIT_INSTRUMENTATION UINT64CONST(0xE000000000000009)
#define PARALLEL_KEY_WAL_USAGE UINT64CONST(0xE00000000000000A)
+#define PARALLEL_KEY_PARTITIONPRUNERESULT UINT64CONST(0xE00000000000000B)
#define PARALLEL_TUPLE_QUEUE_SIZE 65536
@@ -182,6 +183,7 @@ ExecSerializePlan(Plan *plan, EState *estate)
pstmt->transientPlan = false;
pstmt->dependsOnRole = false;
pstmt->parallelModeNeeded = false;
+ pstmt->containsInitialPruning = false;
pstmt->planTree = plan;
pstmt->partPruneInfos = estate->es_part_prune_infos;
pstmt->rtable = estate->es_range_table;
@@ -597,12 +599,15 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
FixedParallelExecutorState *fpes;
char *pstmt_data;
char *pstmt_space;
+ char *part_prune_result_data;
+ char *part_prune_result_space;
char *paramlistinfo_space;
BufferUsage *bufusage_space;
WalUsage *walusage_space;
SharedExecutorInstrumentation *instrumentation = NULL;
SharedJitInstrumentation *jit_instrumentation = NULL;
int pstmt_len;
+ int part_prune_result_len;
int paramlistinfo_len;
int instrumentation_len = 0;
int jit_instrumentation_len = 0;
@@ -631,6 +636,7 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
/* Fix up and serialize plan to be sent to workers. */
pstmt_data = ExecSerializePlan(planstate->plan, estate);
+ part_prune_result_data = nodeToString(estate->es_part_prune_result);
/* Create a parallel context. */
pcxt = CreateParallelContext("postgres", "ParallelQueryMain", nworkers);
@@ -657,6 +663,11 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
shm_toc_estimate_chunk(&pcxt->estimator, pstmt_len);
shm_toc_estimate_keys(&pcxt->estimator, 1);
+ /* Estimate space for serialized PartitionPruneResult. */
+ part_prune_result_len = strlen(part_prune_result_data) + 1;
+ shm_toc_estimate_chunk(&pcxt->estimator, part_prune_result_len);
+ shm_toc_estimate_keys(&pcxt->estimator, 1);
+
/* Estimate space for serialized ParamListInfo. */
paramlistinfo_len = EstimateParamListSpace(estate->es_param_list_info);
shm_toc_estimate_chunk(&pcxt->estimator, paramlistinfo_len);
@@ -751,6 +762,12 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
memcpy(pstmt_space, pstmt_data, pstmt_len);
shm_toc_insert(pcxt->toc, PARALLEL_KEY_PLANNEDSTMT, pstmt_space);
+ /* Store serialized PartitionPruneResult */
+ part_prune_result_space = shm_toc_allocate(pcxt->toc, part_prune_result_len);
+ memcpy(part_prune_result_space, part_prune_result_data, part_prune_result_len);
+ shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARTITIONPRUNERESULT,
+ part_prune_result_space);
+
/* Store serialized ParamListInfo. */
paramlistinfo_space = shm_toc_allocate(pcxt->toc, paramlistinfo_len);
shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARAMLISTINFO, paramlistinfo_space);
@@ -1232,8 +1249,10 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
int instrument_options)
{
char *pstmtspace;
+ char *part_prune_result_space;
char *paramspace;
PlannedStmt *pstmt;
+ PartitionPruneResult *part_prune_result;
ParamListInfo paramLI;
char *queryString;
@@ -1244,12 +1263,18 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
pstmtspace = shm_toc_lookup(toc, PARALLEL_KEY_PLANNEDSTMT, false);
pstmt = (PlannedStmt *) stringToNode(pstmtspace);
+ /* Reconstruct leader-supplied PartitionPruneResult. */
+ part_prune_result_space =
+ shm_toc_lookup(toc, PARALLEL_KEY_PARTITIONPRUNERESULT, false);
+ part_prune_result = (PartitionPruneResult *)
+ stringToNode(part_prune_result_space);
+
/* Reconstruct ParamListInfo. */
paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
paramLI = RestoreParamList(&paramspace);
/* Create a QueryDesc for the query. */
- return CreateQueryDesc(pstmt,
+ return CreateQueryDesc(pstmt, part_prune_result,
queryString,
GetActiveSnapshot(), InvalidSnapshot,
receiver, paramLI, NULL, instrument_options);
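For readers following the execParallel.c changes above: the serialized PartitionPruneResult is sized as `strlen() + 1` so that the terminating NUL travels with the string into shared memory. Below is a small standalone sketch of that estimate-then-copy pattern. It does not use the real shm_toc API; `toy_serialized_len` and `toy_store_serialized` are hypothetical helpers operating on an ordinary buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Bytes needed to store a serialized node string, including the NUL. */
size_t
toy_serialized_len(const char *data)
{
    return strlen(data) + 1;
}

/*
 * Copy the serialized string into a preallocated chunk.
 * Returns 0 on success, -1 if the chunk is too small.
 */
int
toy_store_serialized(char *space, size_t space_len, const char *data)
{
    size_t      len = toy_serialized_len(data);

    if (len > space_len)
        return -1;
    memcpy(space, data, len);   /* copies the terminating NUL as well */
    return 0;
}
```

Because the NUL is part of the copied chunk, the reader on the other side can treat the chunk as an ordinary C string without knowing its length.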
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index b55cdd2580..24e6f6e988 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -25,6 +25,7 @@
#include "mb/pg_wchar.h"
#include "miscadmin.h"
#include "nodes/makefuncs.h"
+#include "parser/parsetree.h"
#include "partitioning/partbounds.h"
#include "partitioning/partdesc.h"
#include "partitioning/partprune.h"
@@ -185,7 +186,11 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
static PartitionPruneState *CreatePartitionPruneState(PlanState *planstate,
- PartitionPruneInfo *pruneinfo);
+ PartitionPruneInfo *pruneinfo,
+ bool consider_initial_steps,
+ bool consider_exec_steps,
+ List *rtable, ExprContext *econtext,
+ PartitionDirectory partdir);
static void InitPartitionPruneContext(PartitionPruneContext *context,
List *pruning_steps,
PartitionDesc partdesc,
@@ -198,7 +203,8 @@ static void PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
- Bitmapset **validsubplans);
+ Bitmapset **validsubplans,
+ Bitmapset **scan_leafpart_rtis);
/*
@@ -1593,8 +1599,10 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
* considered to be a stable expression, it can change value from one plan
* node scan to the next during query execution. Stable comparison
* expressions that don't involve such Params allow partition pruning to be
- * done once during executor startup. Expressions that do involve such Params
- * require us to prune separately for each scan of the parent plan node.
+ * done once during executor startup or during ExecutorDoInitialPruning() that
+ * runs as part of performing AcquireExecutorLocks() on a given plan tree.
+ * Expressions that do involve such Params require us to prune separately for
+ * each scan of the parent plan node.
*
* Note that pruning away unneeded subplans during executor startup has the
* added benefit of not having to initialize the unneeded subplans at all.
@@ -1611,6 +1619,13 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
* account for initial pruning possibly having eliminated some of the
* subplans.
*
+ * ExecPartitionDoInitialPruning:
+ * Do initial pruning with the information contained in a given
+ * PartitionPruneInfo to determine the minimal set of child subplans
+ * to be executed of the parent plan node to which the PartitionPruneInfo
+ * belongs and also the set of the RT indexes of leaf partitions that will
+ * be scanned with those subplans.
+ *
* ExecFindMatchingSubPlans:
* Returns indexes of matching subplans after evaluating the expressions
* that are safe to evaluate at a given point. This function is first
@@ -1628,8 +1643,9 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
*
* On return, *initially_valid_subplans is assigned the set of indexes of
* child subplans that must be initialized along with the parent plan node.
- * Initial pruning is performed here if needed and in that case only the
- * surviving subplans' indexes are added.
+ * Initial pruning is performed here if needed (unless it has already been done
+ * by ExecutorDoInitialPruning()), and in that case only the surviving
+ * subplans' indexes are added.
*
* If subplans are indeed pruned, subplan_map arrays contained in the returned
* PartitionPruneState are re-sequenced to not count those, though only if the
@@ -1645,24 +1661,59 @@ ExecInitPartitionPruning(PlanState *planstate,
EState *estate = planstate->state;
PartitionPruneInfo *pruneinfo = list_nth(estate->es_part_prune_infos,
part_prune_index);
+ PartitionPruneResult *pruneresult = estate->es_part_prune_result;
+ bool do_pruning = (pruneinfo->needs_init_pruning ||
+ pruneinfo->needs_exec_pruning);
- /* We may need an expression context to evaluate partition exprs */
- ExecAssignExprContext(estate, planstate);
+ /*
+ * No need to do initial pruning if it was done already by
+ * ExecutorDoInitialPruning(), which it would be if es_part_prune_result
+ * has been set.
+ */
+ if (pruneresult)
+ do_pruning = pruneinfo->needs_exec_pruning;
- /* Create the working data structure for pruning */
- prunestate = CreatePartitionPruneState(planstate, pruneinfo);
+ prunestate = NULL;
+ if (do_pruning)
+ {
+ /* We may need an expression context to evaluate partition exprs */
+ ExecAssignExprContext(estate, planstate);
+
+ /* For data reading, executor always omits detached partitions */
+ if (estate->es_partition_directory == NULL)
+ estate->es_partition_directory =
+ CreatePartitionDirectory(estate->es_query_cxt, false);
+
+ /*
+ * Create the working data structure for pruning. No need to consider
+ * initial pruning steps if we have a PartitionPruneResult.
+ */
+ prunestate = CreatePartitionPruneState(planstate, pruneinfo,
+ pruneresult == NULL, true,
+ NIL, planstate->ps_ExprContext,
+ estate->es_partition_directory);
+ }
/*
* Perform an initial partition prune pass, if required.
*/
- if (prunestate->do_initial_prune)
- *initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true);
+ if (pruneresult)
+ {
+ *initially_valid_subplans =
+ list_nth(pruneresult->valid_subplan_offs_list, part_prune_index);
+ }
+ else if (prunestate && prunestate->do_initial_prune)
+ {
+ *initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true,
+ NULL);
+ }
else
{
/* No pruning, so we'll need to initialize all subplans */
Assert(n_total_subplans > 0);
*initially_valid_subplans = bms_add_range(NULL, 0,
n_total_subplans - 1);
+ return prunestate;
}
/*
@@ -1670,7 +1721,8 @@ ExecInitPartitionPruning(PlanState *planstate,
* that were removed above due to initial pruning. No need to do this if
* no steps were removed.
*/
- if (bms_num_members(*initially_valid_subplans) < n_total_subplans)
+ if (prunestate &&
+ bms_num_members(*initially_valid_subplans) < n_total_subplans)
{
/*
* We can safely skip this when !do_exec_prune, even though that
@@ -1686,11 +1738,73 @@ ExecInitPartitionPruning(PlanState *planstate,
return prunestate;
}
+/*
+ * ExecPartitionDoInitialPruning
+ * Perform initial pruning using given PartitionPruneInfo to determine
+ * the minimal set of child subplans that will be executed and also the
+ * set of RT indexes of the leaf partitions scanned by those subplans.
+ */
+Bitmapset *
+ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
+ PartitionPruneInfo *pruneinfo,
+ Bitmapset **scan_leafpart_rtis)
+{
+ List *rtable = plannedstmt->rtable;
+ ExprContext *econtext;
+ PartitionDirectory pdir;
+ MemoryContext oldcontext,
+ tmpcontext;
+ PartitionPruneState *prunestate;
+ Bitmapset *valid_subplan_offs;
+
+ /*
+ * A temporary context for memory allocations required while executing
+ * partition pruning steps.
+ */
+ tmpcontext = AllocSetContextCreate(CurrentMemoryContext,
+ "initial pruning working data",
+ ALLOCSET_DEFAULT_SIZES);
+ oldcontext = MemoryContextSwitchTo(tmpcontext);
+
+ /*
+ * PartitionDirectory to look up partition descriptors, which omits
+ * detached partitions, just like in the executor proper.
+ */
+ pdir = CreatePartitionDirectory(CurrentMemoryContext, false);
+
+ /*
+ * We don't yet have a PlanState for the parent plan node, so we must
+ * create a standalone ExprContext to evaluate pruning expressions,
+ * equipped with the information about the EXTERN parameters that the
+ * caller passed us. That's okay because the initial pruning
+ * steps do not contain anything that requires the execution to have
+ * started.
+ */
+ econtext = CreateStandaloneExprContext();
+ econtext->ecxt_param_list_info = params;
+ prunestate = CreatePartitionPruneState(NULL, pruneinfo, true, false,
+ rtable, econtext, pdir);
+ MemoryContextSwitchTo(oldcontext);
+
+ /* Do the initial pruning. */
+ valid_subplan_offs = ExecFindMatchingSubPlans(prunestate, true,
+ scan_leafpart_rtis);
+
+ FreeExprContext(econtext, true);
+ DestroyPartitionDirectory(pdir);
+ MemoryContextDelete(tmpcontext);
+
+ return valid_subplan_offs;
+}
+
/*
* CreatePartitionPruneState
* Build the data structure required for calling ExecFindMatchingSubPlans
*
- * 'planstate' is the parent plan node's execution state.
+ * 'planstate', if not NULL, is the parent plan node's execution state. It
+ * can be NULL if being called before ExecutorStart(), in which case,
+ * 'rtable' (range table), 'econtext', and 'partdir' must be explicitly
+ * provided.
*
* 'pruneinfo' is a PartitionPruneInfo as generated by
* make_partition_pruneinfo. Here we build a PartitionPruneState containing a
@@ -1704,19 +1818,21 @@ ExecInitPartitionPruning(PlanState *planstate,
* PartitionedRelPruneInfo.
*/
static PartitionPruneState *
-CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
+CreatePartitionPruneState(PlanState *planstate,
+ PartitionPruneInfo *pruneinfo,
+ bool consider_initial_steps,
+ bool consider_exec_steps,
+ List *rtable, ExprContext *econtext,
+ PartitionDirectory partdir)
{
- EState *estate = planstate->state;
+ EState *estate = planstate ? planstate->state : NULL;
PartitionPruneState *prunestate;
int n_part_hierarchies;
ListCell *lc;
int i;
- ExprContext *econtext = planstate->ps_ExprContext;
- /* For data reading, executor always omits detached partitions */
- if (estate->es_partition_directory == NULL)
- estate->es_partition_directory =
- CreatePartitionDirectory(estate->es_query_cxt, false);
+ Assert((estate != NULL) ||
+ (partdir != NULL && econtext != NULL && rtable != NIL));
n_part_hierarchies = list_length(pruneinfo->prune_infos);
Assert(n_part_hierarchies > 0);
@@ -1771,15 +1887,42 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
PartitionKey partkey;
/*
- * We can rely on the copies of the partitioned table's partition
- * key and partition descriptor appearing in its relcache entry,
- * because that entry will be held open and locked for the
- * duration of this executor run.
+ * Must open the relation by ourselves when called before the
+ * execution has started, such as when called during
+ * ExecutorDoInitialPruning() on a cached plan. In that case,
+ * sub-partitions must be locked, because AcquirePlannerLocks()
+ * would not have seen them. (1st relation in a partrelpruneinfos
+ * list is always the root partitioned table appearing in the
+ * query, which AcquirePlannerLocks() would have locked; the
+ * Assert in relation_open() guards that assumption.)
+ */
+ if (estate == NULL)
+ {
+ RangeTblEntry *rte = rt_fetch(pinfo->rtindex, rtable);
+ int lockmode = (j == 0) ? NoLock : rte->rellockmode;
+
+ partrel = table_open(rte->relid, lockmode);
+ }
+ else
+ partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
+
+ /*
+ * We can rely on the copy of the partitioned table's partition
+ * key from its relcache entry, because it can't change (or
+ * get destroyed) as long as the relation is locked. Partition
+ * descriptor is taken from the PartitionDirectory associated with
+ * the table that is held open long enough for the descriptor to
+ * remain valid while it's used to perform the pruning steps.
*/
- partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
partkey = RelationGetPartitionKey(partrel);
- partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
- partrel);
+ partdesc = PartitionDirectoryLookup(partdir, partrel);
+
+ /*
+ * Must close partrel, keeping the lock taken, if we're not using
+ * EState's entry.
+ */
+ if (estate == NULL)
+ table_close(partrel, NoLock);
/*
* Initialize the subplan_map and subpart_map.
@@ -1793,6 +1936,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
Assert(partdesc->nparts >= pinfo->nparts);
pprune->nparts = partdesc->nparts;
pprune->subplan_map = palloc(sizeof(int) * partdesc->nparts);
+ pprune->rti_map = palloc(sizeof(Index) * partdesc->nparts);
if (partdesc->nparts == pinfo->nparts)
{
/*
@@ -1803,6 +1947,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
pprune->subpart_map = pinfo->subpart_map;
memcpy(pprune->subplan_map, pinfo->subplan_map,
sizeof(int) * pinfo->nparts);
+ memcpy(pprune->rti_map, pinfo->rti_map,
+ sizeof(Index) * pinfo->nparts);
/*
* Double-check that the list of unpruned relations has not
@@ -1853,6 +1999,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
pinfo->subplan_map[pd_idx];
pprune->subpart_map[pp_idx] =
pinfo->subpart_map[pd_idx];
+ pprune->rti_map[pp_idx] =
+ pinfo->rti_map[pd_idx];
pd_idx++;
}
else
@@ -1860,6 +2008,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
/* this partdesc entry is not in the plan */
pprune->subplan_map[pp_idx] = -1;
pprune->subpart_map[pp_idx] = -1;
+ pprune->rti_map[pp_idx] = 0;
}
}
@@ -1881,7 +2030,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
* Initialize pruning contexts as needed.
*/
pprune->initial_pruning_steps = pinfo->initial_pruning_steps;
- if (pinfo->initial_pruning_steps)
+ if (consider_initial_steps && pinfo->initial_pruning_steps)
{
InitPartitionPruneContext(&pprune->initial_context,
pinfo->initial_pruning_steps,
@@ -1891,7 +2040,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
prunestate->do_initial_prune = true;
}
pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
- if (pinfo->exec_pruning_steps)
+ if (consider_exec_steps && pinfo->exec_pruning_steps)
{
InitPartitionPruneContext(&pprune->exec_context,
pinfo->exec_pruning_steps,
@@ -2119,10 +2268,14 @@ PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
* Pass initial_prune if PARAM_EXEC Params cannot yet be evaluated. This
* differentiates the initial executor-time pruning step from later
* runtime pruning.
+ *
+ * RT indexes of leaf partitions scanned by the chosen subplans are added to
+ * *scan_leafpart_rtis if the pointer is non-NULL.
*/
Bitmapset *
ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
- bool initial_prune)
+ bool initial_prune,
+ Bitmapset **scan_leafpart_rtis)
{
Bitmapset *result = NULL;
MemoryContext oldcontext;
@@ -2157,7 +2310,7 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
*/
pprune = &prunedata->partrelprunedata[0];
find_matching_subplans_recurse(prunedata, pprune, initial_prune,
- &result);
+ &result, scan_leafpart_rtis);
/* Expression eval may have used space in ExprContext too */
if (pprune->exec_pruning_steps)
@@ -2171,6 +2324,8 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
/* Copy result out of the temp context before we reset it */
result = bms_copy(result);
+ if (scan_leafpart_rtis)
+ *scan_leafpart_rtis = bms_copy(*scan_leafpart_rtis);
MemoryContextReset(prunestate->prune_context);
@@ -2181,13 +2336,15 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
* find_matching_subplans_recurse
* Recursive worker function for ExecFindMatchingSubPlans
*
- * Adds valid (non-prunable) subplan IDs to *validsubplans
+ * Adds valid (non-prunable) subplan IDs to *validsubplans and RT indexes of
+ * the corresponding leaf partitions to *scan_leafpart_rtis (if asked for).
*/
static void
find_matching_subplans_recurse(PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
- Bitmapset **validsubplans)
+ Bitmapset **validsubplans,
+ Bitmapset **scan_leafpart_rtis)
{
Bitmapset *partset;
int i;
@@ -2214,8 +2371,14 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
while ((i = bms_next_member(partset, i)) >= 0)
{
if (pprune->subplan_map[i] >= 0)
+ {
*validsubplans = bms_add_member(*validsubplans,
pprune->subplan_map[i]);
+ Assert(pprune->rti_map[i] > 0);
+ if (scan_leafpart_rtis)
+ *scan_leafpart_rtis = bms_add_member(*scan_leafpart_rtis,
+ pprune->rti_map[i]);
+ }
else
{
int partidx = pprune->subpart_map[i];
@@ -2223,7 +2386,8 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
if (partidx >= 0)
find_matching_subplans_recurse(prunedata,
&prunedata->partrelprunedata[partidx],
- initial_prune, validsubplans);
+ initial_prune, validsubplans,
+ scan_leafpart_rtis);
else
{
/*
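To illustrate the recursion changed in the execPartition.c hunks above, here is a simplified standalone model (not the patch's code) of a walk that collects both the surviving subplan indexes and, optionally, the RT indexes of the corresponding leaf partitions. `ToyPartRel` and `toy_find_matching` are hypothetical; 64-bit masks replace Bitmapsets, so subplan indexes and RT indexes are assumed to be below 64, and the `surviving` field stands in for the outcome of running the pruning steps.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_PARTS 8

/* Hypothetical, simplified analogue of PartitionedRelPruningData. */
typedef struct ToyPartRel
{
    int         nparts;
    uint64_t    surviving;                  /* partitions left after pruning */
    int         subplan_map[TOY_MAX_PARTS]; /* subplan index, or -1 */
    int         subpart_map[TOY_MAX_PARTS]; /* sub-partitioned rel index, or -1 */
    unsigned    rti_map[TOY_MAX_PARTS];     /* RT index of leaf partition, or 0 */
} ToyPartRel;

/*
 * Add surviving subplan indexes to *validsubplans and, if the caller asked
 * for them, the leaf partitions' RT indexes to *scan_leafpart_rtis.
 */
void
toy_find_matching(const ToyPartRel *rels, int relidx,
                  uint64_t *validsubplans, uint64_t *scan_leafpart_rtis)
{
    const ToyPartRel *rel = &rels[relidx];

    for (int i = 0; i < rel->nparts; i++)
    {
        if ((rel->surviving & (UINT64_C(1) << i)) == 0)
            continue;           /* this partition was pruned */

        if (rel->subplan_map[i] >= 0)
        {
            /* Leaf partition with its own subplan. */
            *validsubplans |= UINT64_C(1) << rel->subplan_map[i];
            assert(rel->rti_map[i] > 0);
            if (scan_leafpart_rtis != NULL)
                *scan_leafpart_rtis |= UINT64_C(1) << rel->rti_map[i];
        }
        else if (rel->subpart_map[i] >= 0)
        {
            /* Sub-partitioned table: descend into its own pruning data. */
            toy_find_matching(rels, rel->subpart_map[i],
                              validsubplans, scan_leafpart_rtis);
        }
    }
}
```

Passing NULL for the second output mirrors the callers in nodeAppend.c and nodeMergeAppend.c, which only need the subplan set.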
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index f9460ae506..a2182a6b1f 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -844,7 +844,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
else
dest = None_Receiver;
- es->qd = CreateQueryDesc(es->stmt,
+ es->qd = CreateQueryDesc(es->stmt, NULL,
fcache->src,
GetActiveSnapshot(),
InvalidSnapshot,
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index c6f86a6510..96880e122a 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -155,7 +155,8 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
* subplan, we can fill as_valid_subplans immediately, preventing
* later calls to ExecFindMatchingSubPlans.
*/
- if (!prunestate->do_exec_prune && nplans > 0)
+ if (appendstate->as_prune_state == NULL ||
+ (!appendstate->as_prune_state->do_exec_prune && nplans > 0))
appendstate->as_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
}
else
@@ -577,7 +578,7 @@ choose_next_subplan_locally(AppendState *node)
}
else if (node->as_valid_subplans == NULL)
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
whichplan = -1;
}
@@ -642,7 +643,7 @@ choose_next_subplan_for_leader(AppendState *node)
if (node->as_valid_subplans == NULL)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
/*
* Mark each invalid plan as finished to allow the loop below to
@@ -717,7 +718,7 @@ choose_next_subplan_for_worker(AppendState *node)
else if (node->as_valid_subplans == NULL)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
mark_invalid_subplans_as_finished(node);
}
@@ -868,7 +869,7 @@ ExecAppendAsyncBegin(AppendState *node)
if (node->as_valid_subplans == NULL)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
classify_matching_subplans(node);
}
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index 8d35860c30..2312e5a633 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -103,7 +103,8 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
* subplan, we can fill ms_valid_subplans immediately, preventing
* later calls to ExecFindMatchingSubPlans.
*/
- if (!prunestate->do_exec_prune && nplans > 0)
+ if (mergestate->ms_prune_state == NULL ||
+ (!mergestate->ms_prune_state->do_exec_prune && nplans > 0))
mergestate->ms_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
}
else
@@ -218,7 +219,7 @@ ExecMergeAppend(PlanState *pstate)
*/
if (node->ms_valid_subplans == NULL)
node->ms_valid_subplans =
- ExecFindMatchingSubPlans(node->ms_prune_state, false);
+ ExecFindMatchingSubPlans(node->ms_prune_state, false, NULL);
/*
* First time through: pull the first tuple from each valid subplan,
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 29bc26669b..303a572c02 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1578,6 +1578,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
CachedPlanSource *plansource;
CachedPlan *cplan;
List *stmt_list;
+ List *part_prune_result_list;
char *query_string;
Snapshot snapshot;
MemoryContext oldcontext;
@@ -1657,7 +1658,10 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
*/
/* Replan if needed, and increment plan refcount for portal */
- cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv);
+ cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv,
+ &part_prune_result_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_result_list));
stmt_list = cplan->stmt_list;
if (!plan->saved)
@@ -1685,6 +1689,9 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
stmt_list,
cplan);
+ /* Copy PartitionPruneResults into the portal's context. */
+ PortalStorePartitionPruneResults(portal, part_prune_result_list);
+
/*
* Set up options for portal. Default SCROLL type is chosen the same way
* as PerformCursorOpen does it.
@@ -2092,7 +2099,8 @@ SPI_plan_get_cached_plan(SPIPlanPtr plan)
/* Get the generic plan for the query */
cplan = GetCachedPlan(plansource, NULL,
plan->saved ? CurrentResourceOwner : NULL,
- _SPI_current->queryEnv);
+ _SPI_current->queryEnv,
+ NULL /* Not interested in PartitionPruneResults */);
Assert(cplan == plansource->gplan);
/* Pop the error context stack */
@@ -2473,7 +2481,9 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
{
CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
List *stmt_list;
- ListCell *lc2;
+ List *part_prune_result_list;
+ ListCell *lc2,
+ *lc3;
spicallbackarg.query = plansource->query_string;
@@ -2549,8 +2559,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
* plan, the refcount must be backed by the plan_owner.
*/
cplan = GetCachedPlan(plansource, options->params,
- plan_owner, _SPI_current->queryEnv);
-
+ plan_owner, _SPI_current->queryEnv,
+ &part_prune_result_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_result_list));
stmt_list = cplan->stmt_list;
/*
@@ -2589,9 +2601,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
}
}
- foreach(lc2, stmt_list)
+ forboth(lc2, stmt_list, lc3, part_prune_result_list)
{
PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+ PartitionPruneResult *part_prune_result = lfirst_node(PartitionPruneResult, lc3);
bool canSetTag = stmt->canSetTag;
DestReceiver *dest;
@@ -2663,7 +2676,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
else
snap = InvalidSnapshot;
- qdesc = CreateQueryDesc(stmt,
+ qdesc = CreateQueryDesc(stmt, part_prune_result,
plansource->query_string,
snap, crosscheck_snapshot,
dest,
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index b02b4a641c..332d58381b 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -97,7 +97,9 @@ _copyPlannedStmt(const PlannedStmt *from)
COPY_SCALAR_FIELD(jitFlags);
COPY_NODE_FIELD(planTree);
COPY_NODE_FIELD(partPruneInfos);
+ COPY_SCALAR_FIELD(containsInitialPruning);
COPY_NODE_FIELD(rtable);
+ COPY_BITMAPSET_FIELD(minLockRelids);
COPY_NODE_FIELD(resultRelations);
COPY_NODE_FIELD(appendRelations);
COPY_NODE_FIELD(subplans);
@@ -1284,6 +1286,8 @@ _copyPartitionPruneInfo(const PartitionPruneInfo *from)
PartitionPruneInfo *newnode = makeNode(PartitionPruneInfo);
COPY_NODE_FIELD(prune_infos);
+ COPY_SCALAR_FIELD(needs_init_pruning);
+ COPY_SCALAR_FIELD(needs_exec_pruning);
COPY_BITMAPSET_FIELD(other_subplans);
return newnode;
@@ -1300,6 +1304,7 @@ _copyPartitionedRelPruneInfo(const PartitionedRelPruneInfo *from)
COPY_POINTER_FIELD(subplan_map, from->nparts * sizeof(int));
COPY_POINTER_FIELD(subpart_map, from->nparts * sizeof(int));
COPY_POINTER_FIELD(relid_map, from->nparts * sizeof(Oid));
+ COPY_POINTER_FIELD(rti_map, from->nparts * sizeof(Index));
COPY_NODE_FIELD(initial_pruning_steps);
COPY_NODE_FIELD(exec_pruning_steps);
COPY_BITMAPSET_FIELD(execparamids);
@@ -5476,6 +5481,21 @@ _copyExtensibleNode(const ExtensibleNode *from)
return newnode;
}
+/* ****************************************************************
+ * execnodes.h copy functions
+ * ****************************************************************
+ */
+static PartitionPruneResult *
+_copyPartitionPruneResult(const PartitionPruneResult *from)
+{
+ PartitionPruneResult *newnode = makeNode(PartitionPruneResult);
+
+ COPY_NODE_FIELD(valid_subplan_offs_list);
+ COPY_BITMAPSET_FIELD(scan_leafpart_rtis);
+
+ return newnode;
+}
+
/* ****************************************************************
* value.h copy functions
* ****************************************************************
@@ -6572,6 +6592,13 @@ copyObjectImpl(const void *from)
retval = _copyPublicationTable(from);
break;
+ /*
+ * EXECUTION NODES
+ */
+ case T_PartitionPruneResult:
+ retval = _copyPartitionPruneResult(from);
+ break;
+
/*
* MISCELLANEOUS NODES
*/
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 7618444b4d..7346820eee 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -326,7 +326,9 @@ _outPlannedStmt(StringInfo str, const PlannedStmt *node)
WRITE_INT_FIELD(jitFlags);
WRITE_NODE_FIELD(planTree);
WRITE_NODE_FIELD(partPruneInfos);
+ WRITE_BOOL_FIELD(containsInitialPruning);
WRITE_NODE_FIELD(rtable);
+ WRITE_BITMAPSET_FIELD(minLockRelids);
WRITE_NODE_FIELD(resultRelations);
WRITE_NODE_FIELD(appendRelations);
WRITE_NODE_FIELD(subplans);
@@ -1021,6 +1023,8 @@ _outPartitionPruneInfo(StringInfo str, const PartitionPruneInfo *node)
WRITE_NODE_TYPE("PARTITIONPRUNEINFO");
WRITE_NODE_FIELD(prune_infos);
+ WRITE_BOOL_FIELD(needs_init_pruning);
+ WRITE_BOOL_FIELD(needs_exec_pruning);
WRITE_BITMAPSET_FIELD(other_subplans);
}
@@ -1035,6 +1039,7 @@ _outPartitionedRelPruneInfo(StringInfo str, const PartitionedRelPruneInfo *node)
WRITE_INT_ARRAY(subplan_map, node->nparts);
WRITE_INT_ARRAY(subpart_map, node->nparts);
WRITE_OID_ARRAY(relid_map, node->nparts);
+ WRITE_INDEX_ARRAY(rti_map, node->nparts);
WRITE_NODE_FIELD(initial_pruning_steps);
WRITE_NODE_FIELD(exec_pruning_steps);
WRITE_BITMAPSET_FIELD(execparamids);
@@ -2440,6 +2445,8 @@ _outPlannerGlobal(StringInfo str, const PlannerGlobal *node)
WRITE_NODE_FIELD(resultRelations);
WRITE_NODE_FIELD(appendRelations);
WRITE_NODE_FIELD(partPruneInfos);
+ WRITE_BOOL_FIELD(containsInitialPruning);
+ WRITE_BITMAPSET_FIELD(minLockRelids);
WRITE_NODE_FIELD(relationOids);
WRITE_NODE_FIELD(invalItems);
WRITE_NODE_FIELD(paramExecTypes);
@@ -2861,6 +2868,21 @@ _outExtensibleNode(StringInfo str, const ExtensibleNode *node)
methods->nodeOut(str, node);
}
+/*****************************************************************************
+ *
+ * Stuff from execnodes.h
+ *
+ *****************************************************************************/
+
+static void
+_outPartitionPruneResult(StringInfo str, const PartitionPruneResult *node)
+{
+ WRITE_NODE_TYPE("PARTITIONPRUNERESULT");
+
+ WRITE_NODE_FIELD(valid_subplan_offs_list);
+ WRITE_BITMAPSET_FIELD(scan_leafpart_rtis);
+}
+
/*****************************************************************************
*
* Stuff from parsenodes.h.
@@ -4770,6 +4792,13 @@ outNode(StringInfo str, const void *obj)
_outJsonTableSibling(str, obj);
break;
+ /*
+ * EXECUTION NODES
+ */
+ case T_PartitionPruneResult:
+ _outPartitionPruneResult(str, obj);
+ break;
+
default:
/*
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index bf602ff93e..c1d131aa99 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -164,6 +164,11 @@
token = pg_strtok(&length); /* skip :fldname */ \
local_node->fldname = readIntCols(len)
+/* Read an Index array */
+#define READ_INDEX_ARRAY(fldname, len) \
+ token = pg_strtok(&length); /* skip :fldname */ \
+ local_node->fldname = readIndexCols(len)
+
/* Read a bool array */
#define READ_BOOL_ARRAY(fldname, len) \
token = pg_strtok(&length); /* skip :fldname */ \
@@ -1818,7 +1823,9 @@ _readPlannedStmt(void)
READ_INT_FIELD(jitFlags);
READ_NODE_FIELD(planTree);
READ_NODE_FIELD(partPruneInfos);
+ READ_BOOL_FIELD(containsInitialPruning);
READ_NODE_FIELD(rtable);
+ READ_BITMAPSET_FIELD(minLockRelids);
READ_NODE_FIELD(resultRelations);
READ_NODE_FIELD(appendRelations);
READ_NODE_FIELD(subplans);
@@ -2770,6 +2777,8 @@ _readPartitionPruneInfo(void)
READ_LOCALS(PartitionPruneInfo);
READ_NODE_FIELD(prune_infos);
+ READ_BOOL_FIELD(needs_init_pruning);
+ READ_BOOL_FIELD(needs_exec_pruning);
READ_BITMAPSET_FIELD(other_subplans);
READ_DONE();
@@ -2786,6 +2795,7 @@ _readPartitionedRelPruneInfo(void)
READ_INT_ARRAY(subplan_map, local_node->nparts);
READ_INT_ARRAY(subpart_map, local_node->nparts);
READ_OID_ARRAY(relid_map, local_node->nparts);
+ READ_INDEX_ARRAY(rti_map, local_node->nparts);
READ_NODE_FIELD(initial_pruning_steps);
READ_NODE_FIELD(exec_pruning_steps);
READ_BITMAPSET_FIELD(execparamids);
@@ -2939,6 +2949,21 @@ _readPartitionRangeDatum(void)
READ_DONE();
}
+
+/*
+ * _readPartitionPruneResult
+ */
+static PartitionPruneResult *
+_readPartitionPruneResult(void)
+{
+ READ_LOCALS(PartitionPruneResult);
+
+ READ_NODE_FIELD(valid_subplan_offs_list);
+ READ_BITMAPSET_FIELD(scan_leafpart_rtis);
+
+ READ_DONE();
+}
+
/*
* parseNodeString
*
@@ -3236,6 +3261,8 @@ parseNodeString(void)
return_value = _readJsonTableParent();
else if (MATCH("JSONTABLESIBLING", 16))
return_value = _readJsonTableSibling();
+ else if (MATCH("PARTITIONPRUNERESULT", 20))
+ return_value = _readPartitionPruneResult();
else
{
elog(ERROR, "badly formatted node string \"%.32s\"...", token);
@@ -3379,6 +3406,30 @@ readIntCols(int numCols)
return int_vals;
}
+/*
+ * readIndexCols
+ */
+Index *
+readIndexCols(int numCols)
+{
+ int tokenLength,
+ i;
+ const char *token;
+ Index *index_vals;
+
+ if (numCols <= 0)
+ return NULL;
+
+ index_vals = (Index *) palloc(numCols * sizeof(Index));
+ for (i = 0; i < numCols; i++)
+ {
+ token = pg_strtok(&tokenLength);
+ index_vals[i] = atoui(token);
+ }
+
+ return index_vals;
+}
+
/*
* readBoolCols
*/
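The readIndexCols() function added above follows the same shape as readIntCols(): return NULL for a non-positive count, otherwise allocate an array and fill it one token at a time. A minimal standalone sketch of that pattern follows. It is not PostgreSQL code: read_index_cols is a hypothetical name, malloc stands in for palloc, and strtoul's end pointer stands in for pg_strtok.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for readIndexCols(): parse num_cols
 * whitespace-separated unsigned integers from str.
 * Returns a malloc'd array, or NULL if num_cols <= 0 or allocation fails.
 */
static unsigned int *
read_index_cols(const char *str, int num_cols)
{
    unsigned int *vals;
    char       *end;

    if (num_cols <= 0)
        return NULL;

    vals = malloc(num_cols * sizeof(unsigned int));
    if (vals == NULL)
        return NULL;

    for (int i = 0; i < num_cols; i++)
    {
        /* strtoul skips leading whitespace and reports where it stopped */
        vals[i] = (unsigned int) strtoul(str, &end, 10);
        str = end;
    }

    return vals;
}
```

Calling read_index_cols("3 0 7", 3) yields a three-element array; a value of 0 is meaningful here because rti_map uses 0 for "no range table index".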
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index b11249ed8f..7141035cc4 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -519,7 +519,9 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
result->parallelModeNeeded = glob->parallelModeNeeded;
result->planTree = top_plan;
result->partPruneInfos = glob->partPruneInfos;
+ result->containsInitialPruning = glob->containsInitialPruning;
result->rtable = glob->finalrtable;
+ result->minLockRelids = glob->minLockRelids;
result->resultRelations = glob->resultRelations;
result->appendRelations = glob->appendRelations;
result->subplans = glob->subplans;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index b8d5610593..da749e331e 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -270,6 +270,16 @@ set_plan_references(PlannerInfo *root, Plan *plan)
*/
add_rtes_to_flat_rtable(root, false);
+ /*
+ * Add the query's adjusted range of RT indexes to glob->minLockRelids.
+ * The adjusted RT indexes of prunable relations will be deleted from the
+ * set below where PartitionPruneInfos are processed.
+ */
+ glob->minLockRelids =
+ bms_add_range(glob->minLockRelids,
+ rtoffset + 1,
+ rtoffset + list_length(root->parse->rtable));
+
/*
* Adjust RT indexes of PlanRowMarks and add to final rowmarks list
*/
@@ -352,6 +362,7 @@ set_plan_references(PlannerInfo *root, Plan *plan)
foreach (lc, root->partPruneInfos)
{
PartitionPruneInfo *pruneinfo = lfirst(lc);
+ Bitmapset *leafpart_rtis = NULL;
ListCell *l;
foreach(l, pruneinfo->prune_infos)
@@ -362,15 +373,50 @@ set_plan_references(PlannerInfo *root, Plan *plan)
foreach(l2, prune_infos)
{
PartitionedRelPruneInfo *pinfo = lfirst(l2);
+ int i;
/* RT index of the table to which the pinfo belongs. */
pinfo->rtindex += rtoffset;
+
+ /* Likewise for the RT indexes of leaf partitions that might be scanned. */
+ for (i = 0; i < pinfo->nparts; i++)
+ {
+ if (pinfo->rti_map[i] > 0 && pinfo->subplan_map[i] >= 0)
+ {
+ pinfo->rti_map[i] += rtoffset;
+ leafpart_rtis = bms_add_member(leafpart_rtis,
+ pinfo->rti_map[i]);
+ }
+ }
}
}
+ if (pruneinfo->needs_init_pruning)
+ {
+ glob->containsInitialPruning = true;
+
+ /*
+ * Delete the leaf partition RTIs from the global set of relations
+ * to be locked before executing the plan. AcquireExecutorLocks()
+ * will find the ones to add to the set after performing initial
+ * pruning.
+ */
+ glob->minLockRelids = bms_del_members(glob->minLockRelids,
+ leafpart_rtis);
+ }
+
glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
}
+ /*
+ * It seems worth doing a bms_copy() on glob->minLockRelids if we deleted
+ * bits from it above to get rid of any empty tail bits. It seems better
+ * for the loop over this set in AcquireExecutorLocks() to not have to go
+ * through those useless bit words.
+ */
+ if (glob->containsInitialPruning)
+ glob->minLockRelids = bms_copy(glob->minLockRelids);
+
return result;
}
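The minLockRelids bookkeeping in the set_plan_references() hunks above can be pictured with a small standalone sketch: start from the full adjusted range of RT indexes, then remove the leaf partitions that initial pruning might eliminate. This is not PostgreSQL code. A 64-bit mask stands in for Bitmapset, and add_range, del_members and min_lock_relids are hypothetical helpers named only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for Bitmapset: bit n set means RT index n is a member. */
typedef uint64_t RelidSet;

/* Add every member in [lower, upper]; assumes 0 <= lower and upper < 64. */
static RelidSet
add_range(RelidSet set, int lower, int upper)
{
    for (int i = lower; i <= upper; i++)
        set |= (UINT64_C(1) << i);
    return set;
}

/* Remove every member of "victims" from "set". */
static RelidSet
del_members(RelidSet set, RelidSet victims)
{
    return set & ~victims;
}

/*
 * Hypothetical helper mirroring the patch's logic: all RT indexes of the
 * query go into the set, and prunable leaf partitions are taken out again
 * only when the plan actually has "initial" pruning steps.
 */
static RelidSet
min_lock_relids(int rtoffset, int nrels, RelidSet leafpart_rtis,
                bool needs_init_pruning)
{
    RelidSet    result = add_range(0, rtoffset + 1, rtoffset + nrels);

    if (needs_init_pruning)
        result = del_members(result, leafpart_rtis);
    return result;
}
```

With a parent at RT index 1 and leaf partitions at 2, 3 and 4, only the parent stays in the set when initial pruning is possible; without initial pruning all four remain and are locked up front.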
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index d77f7d3aef..952c5b8327 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -144,7 +144,9 @@ static List *make_partitionedrel_pruneinfo(PlannerInfo *root,
List *prunequal,
Bitmapset *partrelids,
int *relid_subplan_map,
- Bitmapset **matchedsubplans);
+ Bitmapset **matchedsubplans,
+ bool *needs_init_pruning,
+ bool *needs_exec_pruning);
static void gen_partprune_steps(RelOptInfo *rel, List *clauses,
PartClauseTarget target,
GeneratePruningStepsContext *context);
@@ -234,6 +236,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int *relid_subplan_map;
ListCell *lc;
int i;
+ bool needs_init_pruning = false;
+ bool needs_exec_pruning = false;
/*
* Scan the subpaths to see which ones are scans of partition child
@@ -313,12 +317,16 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
Bitmapset *partrelids = (Bitmapset *) lfirst(lc);
List *pinfolist;
Bitmapset *matchedsubplans = NULL;
+ bool partrel_needs_init_pruning;
+ bool partrel_needs_exec_pruning;
pinfolist = make_partitionedrel_pruneinfo(root, parentrel,
prunequal,
partrelids,
relid_subplan_map,
- &matchedsubplans);
+ &matchedsubplans,
+ &partrel_needs_init_pruning,
+ &partrel_needs_exec_pruning);
/* When pruning is possible, record the matched subplans */
if (pinfolist != NIL)
@@ -327,6 +335,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
allmatchedsubplans = bms_join(matchedsubplans,
allmatchedsubplans);
}
+
+ needs_init_pruning |= partrel_needs_init_pruning;
+ needs_exec_pruning |= partrel_needs_exec_pruning;
}
pfree(relid_subplan_map);
@@ -341,6 +352,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
/* Else build the result data structure */
pruneinfo = makeNode(PartitionPruneInfo);
pruneinfo->prune_infos = prunerelinfos;
+ pruneinfo->needs_init_pruning = needs_init_pruning;
+ pruneinfo->needs_exec_pruning = needs_exec_pruning;
/*
* Some subplans may not belong to any of the identified partitioned rels.
@@ -441,13 +454,18 @@ add_part_relids(List *allpartrelids, Bitmapset *partrelids)
* If we cannot find any useful run-time pruning steps, return NIL.
* However, on success, each rel identified in partrelids will have
* an element in the result list, even if some of them are useless.
+ * *needs_init_pruning and *needs_exec_pruning are set to indicate whether the
+ * returned PartitionedRelPruneInfos contain pruning steps that can be
+ * performed before and after execution begins, respectively.
*/
static List *
make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
List *prunequal,
Bitmapset *partrelids,
int *relid_subplan_map,
- Bitmapset **matchedsubplans)
+ Bitmapset **matchedsubplans,
+ bool *needs_init_pruning,
+ bool *needs_exec_pruning)
{
RelOptInfo *targetpart = NULL;
List *pinfolist = NIL;
@@ -458,6 +476,10 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int rti;
int i;
+ /* Will find out below. */
+ *needs_init_pruning = false;
+ *needs_exec_pruning = false;
+
/*
* Examine each partitioned rel, constructing a temporary array to map
* from planner relids to index of the partitioned rel, and building a
@@ -545,6 +567,9 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
* executor per-scan pruning steps. This first pass creates startup
* pruning steps and detects whether there's any possibly-useful quals
* that would require per-scan pruning.
+ *
+ * The first pass also determines whether the second pass is necessary
+ * by checking for the presence of EXEC parameters.
*/
gen_partprune_steps(subpart, partprunequal, PARTTARGET_INITIAL,
&context);
@@ -619,6 +644,12 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
pinfo->execparamids = execparamids;
/* Remaining fields will be filled in the next loop */
+ /* record which types of pruning steps we've seen so far */
+ if (initial_pruning_steps != NIL)
+ *needs_init_pruning = true;
+ if (exec_pruning_steps != NIL)
+ *needs_exec_pruning = true;
+
pinfolist = lappend(pinfolist, pinfo);
}
@@ -646,6 +677,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int *subplan_map;
int *subpart_map;
Oid *relid_map;
+ Index *rti_map;
/*
* Construct the subplan and subpart maps for this partitioning level.
@@ -658,6 +690,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
subpart_map = (int *) palloc(nparts * sizeof(int));
memset(subpart_map, -1, nparts * sizeof(int));
relid_map = (Oid *) palloc0(nparts * sizeof(Oid));
+ rti_map = (Index *) palloc0(nparts * sizeof(Index));
present_parts = NULL;
i = -1;
@@ -672,6 +705,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
subplan_map[i] = subplanidx = relid_subplan_map[partrel->relid] - 1;
subpart_map[i] = subpartidx = relid_subpart_map[partrel->relid] - 1;
relid_map[i] = planner_rt_fetch(partrel->relid, root)->relid;
+ rti_map[i] = partrel->relid;
if (subplanidx >= 0)
{
present_parts = bms_add_member(present_parts, i);
@@ -696,6 +730,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
pinfo->subplan_map = subplan_map;
pinfo->subpart_map = subpart_map;
pinfo->relid_map = relid_map;
+ pinfo->rti_map = rti_map;
}
pfree(relid_subpart_map);
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 5ab91c2c58..5ae967608d 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1603,6 +1603,7 @@ exec_bind_message(StringInfo input_message)
int16 *rformats = NULL;
CachedPlanSource *psrc;
CachedPlan *cplan;
+ List *part_prune_result_list;
Portal portal;
char *query_string;
char *saved_stmt_name;
@@ -1978,7 +1979,9 @@ exec_bind_message(StringInfo input_message)
* will be generated in MessageContext. The plan refcount will be
* assigned to the Portal, so it will be released at portal destruction.
*/
- cplan = GetCachedPlan(psrc, params, NULL, NULL);
+ cplan = GetCachedPlan(psrc, params, NULL, NULL, &part_prune_result_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_result_list));
/*
* Now we can define the portal.
@@ -1993,6 +1996,9 @@ exec_bind_message(StringInfo input_message)
cplan->stmt_list,
cplan);
+ /* Copy PartitionPruneResults into the portal's context. */
+ PortalStorePartitionPruneResults(portal, part_prune_result_list);
+
/* Done with the snapshot used for parameter I/O and parsing/planning */
if (snapshot_set)
PopActiveSnapshot();
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 5aa5a350f3..8cc2e2162d 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -35,7 +35,7 @@
Portal ActivePortal = NULL;
-static void ProcessQuery(PlannedStmt *plan,
+static void ProcessQuery(PlannedStmt *plan, PartitionPruneResult *part_prune_result,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -65,6 +65,7 @@ static void DoPortalRewind(Portal portal);
*/
QueryDesc *
CreateQueryDesc(PlannedStmt *plannedstmt,
+ PartitionPruneResult *part_prune_result,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
@@ -77,6 +78,8 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
qd->operation = plannedstmt->commandType; /* operation */
qd->plannedstmt = plannedstmt; /* plan */
+ qd->part_prune_result = part_prune_result; /* ExecutorDoInitialPruning()
+ * output for plan */
qd->sourceText = sourceText; /* query text */
qd->snapshot = RegisterSnapshot(snapshot); /* snapshot */
/* RI check snapshot */
@@ -122,6 +125,7 @@ FreeQueryDesc(QueryDesc *qdesc)
* PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
*
* plan: the plan tree for the query
+ * part_prune_result: ExecutorDoInitialPruning() output for the plan tree
* sourceText: the source text of the query
* params: any parameters needed
* dest: where to send results
@@ -134,6 +138,7 @@ FreeQueryDesc(QueryDesc *qdesc)
*/
static void
ProcessQuery(PlannedStmt *plan,
+ PartitionPruneResult *part_prune_result,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -145,7 +150,7 @@ ProcessQuery(PlannedStmt *plan,
/*
* Create the QueryDesc object
*/
- queryDesc = CreateQueryDesc(plan, sourceText,
+ queryDesc = CreateQueryDesc(plan, part_prune_result, sourceText,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
@@ -491,8 +496,13 @@ PortalStart(Portal portal, ParamListInfo params,
/*
* Create QueryDesc in portal's context; for the moment, set
* the destination to DestNone.
+ *
+ * There is no PartitionPruneResult unless the PlannedStmt is
+ * from a CachedPlan.
*/
queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+ portal->part_prune_results == NIL ? NULL :
+ linitial(portal->part_prune_results),
portal->sourceText,
GetActiveSnapshot(),
InvalidSnapshot,
@@ -1225,6 +1235,8 @@ PortalRunMulti(Portal portal,
if (pstmt->utilityStmt == NULL)
{
+ PartitionPruneResult *part_prune_result = NULL;
+
/*
* process a plannable query.
*/
@@ -1271,10 +1283,18 @@ PortalRunMulti(Portal portal,
else
UpdateActiveSnapshotCommandId();
+ /*
+ * Determine if there's a corresponding PartitionPruneResult for
+ * this PlannedStmt.
+ */
+ if (portal->part_prune_results != NIL)
+ part_prune_result = list_nth(portal->part_prune_results,
+ foreach_current_index(stmtlist_item));
+
if (pstmt->canSetTag)
{
/* statement can set tag string */
- ProcessQuery(pstmt,
+ ProcessQuery(pstmt, part_prune_result,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
@@ -1283,7 +1303,7 @@ PortalRunMulti(Portal portal,
else
{
/* stmt added by rewrite cannot set tag */
- ProcessQuery(pstmt,
+ ProcessQuery(pstmt, part_prune_result,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 0d6a295674..8c164741f7 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -99,14 +99,19 @@ static dlist_head cached_expression_list = DLIST_STATIC_INIT(cached_expression_l
static void ReleaseGenericPlan(CachedPlanSource *plansource);
static List *RevalidateCachedQuery(CachedPlanSource *plansource,
QueryEnvironment *queryEnv);
-static bool CheckCachedPlan(CachedPlanSource *plansource);
+static bool CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+ List **part_prune_result_list);
static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
- ParamListInfo boundParams, QueryEnvironment *queryEnv);
+ ParamListInfo boundParams, QueryEnvironment *queryEnv,
+ List **part_prune_result_list);
static bool choose_custom_plan(CachedPlanSource *plansource,
ParamListInfo boundParams);
static double cached_plan_cost(CachedPlan *plan, bool include_planner);
static Query *QueryListGetPrimaryStmt(List *stmts);
-static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static void AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+ List **part_prune_result_list,
+ List **lockedRelids_per_stmt);
+static void ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt);
static void AcquirePlannerLocks(List *stmt_list, bool acquire);
static void ScanQueryForLocks(Query *parsetree, bool acquire);
static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -790,15 +795,20 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
*
* On a "true" return, we have acquired the locks needed to run the plan.
* (We must do this for the "true" result to be race-condition-free.)
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_result_list.
*/
static bool
-CheckCachedPlan(CachedPlanSource *plansource)
+CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+ List **part_prune_result_list)
{
CachedPlan *plan = plansource->gplan;
/* Assert that caller checked the querytree */
Assert(plansource->is_valid);
+ *part_prune_result_list = NIL;
+
/* If there's no generic plan, just say "false" */
if (!plan)
return false;
@@ -820,13 +830,21 @@ CheckCachedPlan(CachedPlanSource *plansource)
*/
if (plan->is_valid)
{
+ List *lockedRelids_per_stmt;
+
/*
* Plan must have positive refcount because it is referenced by
* plansource; so no need to fear it disappears under us here.
*/
Assert(plan->refcount > 0);
- AcquireExecutorLocks(plan->stmt_list, true);
+ /*
+ * Lock relations scanned by the plan. Initial partition pruning, if
+ * the plan needs any, is performed here as well.
+ */
+ AcquireExecutorLocks(plan->stmt_list, boundParams,
+ part_prune_result_list,
+ &lockedRelids_per_stmt);
/*
* If plan was transient, check to see if TransactionXmin has
@@ -848,7 +866,14 @@ CheckCachedPlan(CachedPlanSource *plansource)
}
/* Oops, the race case happened. Release useless locks. */
- AcquireExecutorLocks(plan->stmt_list, false);
+ ReleaseExecutorLocks(plan->stmt_list, lockedRelids_per_stmt);
+
+ /*
+ * The output list and the objects in it were allocated in the caller's
+ * (hopefully short-lived) context, so they will not stay leaked for
+ * long. Still, reset the list so that it cannot be looked at by accident.
+ */
+ *part_prune_result_list = NIL;
}
/*
@@ -874,10 +899,15 @@ CheckCachedPlan(CachedPlanSource *plansource)
* Planning work is done in the caller's memory context. The finished plan
* is in a child memory context, which typically should get reparented
* (unless this is a one-shot plan, in which case we don't copy the plan).
+ *
+ * A list of NULLs is returned in *part_prune_result_list, meaning that no
+ * PartitionPruneResult nodes have yet been created for the plans in
+ * stmt_list.
*/
static CachedPlan *
BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
- ParamListInfo boundParams, QueryEnvironment *queryEnv)
+ ParamListInfo boundParams, QueryEnvironment *queryEnv,
+ List **part_prune_result_list)
{
CachedPlan *plan;
List *plist;
@@ -1007,6 +1037,17 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
MemoryContextSwitchTo(oldcxt);
+ /*
+ * There are no PartitionPruneResults to add yet, but the list must
+ * still be initialized to have the same number of elements as the list
+ * of PlannedStmts.
+ */
+ *part_prune_result_list = NIL;
+ foreach(lc, plist)
+ {
+ *part_prune_result_list = lappend(*part_prune_result_list, NULL);
+ }
+
return plan;
}
@@ -1126,6 +1167,17 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
* plan or a custom plan for the given parameters: the caller does not know
* which it will get.
*
+ * For every PlannedStmt in the returned CachedPlan, *part_prune_result_list
+ * (if the caller passed one) receives an element that is either a
+ * PartitionPruneResult or NULL. It is a PartitionPruneResult if the
+ * PlannedStmt comes from an existing, still-valid CachedPlan and contains at
+ * least one PartitionPruneInfo with "initial" pruning steps. Those steps are
+ * performed by calling ExecutorDoInitialPruning(), which prunes away subplans
+ * that don't match the pruning conditions so that AcquireExecutorLocks() locks
+ * only the leaf partitions that will be scanned. The PartitionPruneResult
+ * contains a list of bitmapsets of the indexes of matching subplans, one for
+ * each PartitionPruneInfo.
+ *
* On return, the plan is valid and we have sufficient locks to begin
* execution.
*
@@ -1139,11 +1191,13 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
*/
CachedPlan *
GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
- ResourceOwner owner, QueryEnvironment *queryEnv)
+ ResourceOwner owner, QueryEnvironment *queryEnv,
+ List **part_prune_result_list)
{
CachedPlan *plan = NULL;
List *qlist;
bool customplan;
+ List *my_part_prune_result_list;
/* Assert caller is doing things in a sane order */
Assert(plansource->magic == CACHEDPLANSOURCE_MAGIC);
@@ -1160,7 +1214,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
if (!customplan)
{
- if (CheckCachedPlan(plansource))
+ if (CheckCachedPlan(plansource, boundParams,
+ &my_part_prune_result_list))
{
/* We want a generic plan, and we already have a valid one */
plan = plansource->gplan;
@@ -1169,7 +1224,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
else
{
/* Build a new generic plan */
- plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv);
+ plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv,
+ &my_part_prune_result_list);
/* Just make real sure plansource->gplan is clear */
ReleaseGenericPlan(plansource);
/* Link the new generic plan into the plansource */
@@ -1214,7 +1270,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
if (customplan)
{
/* Build a custom plan */
- plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv);
+ plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv,
+ &my_part_prune_result_list);
/* Accumulate total costs of custom plans */
plansource->total_custom_cost += cached_plan_cost(plan, true);
@@ -1246,6 +1303,9 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
plan->is_saved = true;
}
+ if (part_prune_result_list)
+ *part_prune_result_list = my_part_prune_result_list;
+
return plan;
}
@@ -1737,17 +1797,29 @@ QueryListGetPrimaryStmt(List *stmts)
/*
* AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
- * or release them if acquire is false.
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_result_list.
+ *
+ * On return, *lockedRelids_per_stmt will contain a bitmapset for every
+ * PlannedStmt in stmt_list, containing the RT indexes of relation entries
+ * in its range table that were actually locked, or NULL if the PlannedStmt
+ * contains a utility statement.
*/
static void
-AcquireExecutorLocks(List *stmt_list, bool acquire)
+AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+ List **part_prune_result_list,
+ List **lockedRelids_per_stmt)
{
ListCell *lc1;
+ *part_prune_result_list = *lockedRelids_per_stmt = NIL;
foreach(lc1, stmt_list)
{
PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
- ListCell *lc2;
+ PartitionPruneResult *part_prune_result = NULL;
+ Bitmapset *allLockRelids;
+ Bitmapset *lockedRelids = NULL;
+ int rti;
if (plannedstmt->commandType == CMD_UTILITY)
{
@@ -1761,13 +1833,35 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
Query *query = UtilityContainsQuery(plannedstmt->utilityStmt);
if (query)
- ScanQueryForLocks(query, acquire);
+ ScanQueryForLocks(query, true);
+ *part_prune_result_list = lappend(*part_prune_result_list, NULL);
+ *lockedRelids_per_stmt = lappend(*lockedRelids_per_stmt, NULL);
continue;
}
- foreach(lc2, plannedstmt->rtable)
+ /*
+ * Figure out the set of relations that would need to be locked
+ * before executing the plan.
+ */
+ if (plannedstmt->containsInitialPruning)
{
- RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
+ /*
+ * Obtain the set of partitions to be locked from the
+ * PartitionPruneInfos by considering the result of performing
+ * initial partition pruning.
+ */
+ part_prune_result =
+ ExecutorDoInitialPruning(plannedstmt, boundParams);
+
+ allLockRelids = bms_union(plannedstmt->minLockRelids,
+ part_prune_result->scan_leafpart_rtis);
+ }
+ else
+ allLockRelids = plannedstmt->minLockRelids;
+
+ rti = -1;
+ while ((rti = bms_next_member(allLockRelids, rti)) >= 0)
+ {
+ RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
if (rte->rtekind != RTE_RELATION)
continue;
@@ -1778,10 +1872,58 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
* fail if it's been dropped entirely --- we'll just transiently
* acquire a non-conflicting lock.
*/
- if (acquire)
- LockRelationOid(rte->relid, rte->rellockmode);
- else
- UnlockRelationOid(rte->relid, rte->rellockmode);
+ LockRelationOid(rte->relid, rte->rellockmode);
+ lockedRelids = bms_add_member(lockedRelids, rti);
+ }
+
+ *part_prune_result_list = lappend(*part_prune_result_list,
+ part_prune_result);
+ *lockedRelids_per_stmt = lappend(*lockedRelids_per_stmt, lockedRelids);
+ }
+}
+
+/*
+ * ReleaseExecutorLocks
+ * Release locks that were acquired by an earlier call to
+ * AcquireExecutorLocks()
+ */
+static void
+ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt)
+{
+ ListCell *lc1,
+ *lc2;
+
+ forboth(lc1, stmt_list, lc2, lockedRelids_per_stmt)
+ {
+ PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
+ Bitmapset *lockedRelids = lfirst(lc2);
+ int rti;
+
+ if (plannedstmt->commandType == CMD_UTILITY)
+ {
+ /*
+ * Ignore utility statements, except those (such as EXPLAIN) that
+ * contain a parsed-but-not-planned query. Note: it's okay to use
+ * ScanQueryForLocks, even though the query hasn't been through
+ * rule rewriting, because rewriting doesn't change the query
+ * representation.
+ */
+ Query *query = UtilityContainsQuery(plannedstmt->utilityStmt);
+
+ if (query)
+ ScanQueryForLocks(query, false);
+ continue;
+ }
+
+ rti = -1;
+ while ((rti = bms_next_member(lockedRelids, rti)) >= 0)
+ {
+ RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
+
+ Assert(rte->rtekind == RTE_RELATION);
+
+ /* See the comment in AcquireExecutorLocks(). */
+ UnlockRelationOid(rte->relid, rte->rellockmode);
}
}
}
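The split into AcquireExecutorLocks() and ReleaseExecutorLocks() above relies on one invariant: the per-statement record of what was locked must line up with the statement list, utility statements included, so that release undoes exactly what acquire did. A minimal standalone sketch of that invariant follows. It is not PostgreSQL code: ToyStmt, lock_count, acquire_locks and release_locks are hypothetical, and an unsigned bitmask stands in for the per-statement Bitmapset.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_RELS 8

/* Toy statement: a utility statement has no range table entries to lock. */
typedef struct ToyStmt
{
    int         is_utility;
    int         nrels;              /* number of range table entries */
    int         lockable[MAX_RELS]; /* 1 if entry survived initial pruning */
} ToyStmt;

/* Toy lock table, indexed by range table position. */
static int  lock_count[MAX_RELS];

/*
 * Lock the surviving entries of each statement and record them in
 * locked[i]. Utility statements get an empty record so that locked[]
 * stays aligned with stmts[].
 */
static void
acquire_locks(const ToyStmt *stmts, int nstmts, unsigned *locked)
{
    for (int i = 0; i < nstmts; i++)
    {
        locked[i] = 0;
        if (stmts[i].is_utility)
            continue;
        for (int r = 0; r < stmts[i].nrels; r++)
        {
            if (!stmts[i].lockable[r])
                continue;
            lock_count[r]++;
            locked[i] |= 1u << r;
        }
    }
}

/* Release exactly what was recorded; no pruning is repeated here. */
static void
release_locks(const unsigned *locked, int nstmts)
{
    for (int i = 0; i < nstmts; i++)
        for (int r = 0; r < MAX_RELS; r++)
            if (locked[i] & (1u << r))
                lock_count[r]--;
}
```

Recording the locked set at acquire time means the release path never has to redo pruning, whose result could in principle differ on a second evaluation.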
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index d549f66d4a..1bbe6b704b 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -303,6 +303,25 @@ PortalDefineQuery(Portal portal,
portal->status = PORTAL_DEFINED;
}
+/*
+ * PortalStorePartitionPruneResults
+ * Copy the given list of PartitionPruneResults into the portal's
+ * context
+ *
+ * This allows the caller to ensure that the list exists as long as the portal
+ * does.
+ */
+void
+PortalStorePartitionPruneResults(Portal portal, List *part_prune_results)
+{
+ MemoryContext oldcxt;
+
+ AssertArg(PortalIsValid(portal));
+ oldcxt = MemoryContextSwitchTo(portal->portalContext);
+ portal->part_prune_results = copyObject(part_prune_results);
+ MemoryContextSwitchTo(oldcxt);
+}
+
/*
* PortalReleaseCachedPlan
* Release a portal's reference to its cached plan, if any.
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 9ebde089ae..e57e133f0e 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -87,7 +87,9 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
ExplainState *es, const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv);
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt,
+ PartitionPruneResult *part_prune_result,
+ IntoClause *into,
ExplainState *es, const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv,
const instr_time *planduration,
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index bf962af7af..bd8776402e 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -45,6 +45,7 @@ extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
* nparts Length of subplan_map[] and subpart_map[].
* subplan_map Subplan index by partition index, or -1.
* subpart_map Subpart index by partition index, or -1.
+ * rti_map Range table index by partition index, or 0.
* present_parts A Bitmapset of the partition indexes that we
* have subplans or subparts for.
* initial_pruning_steps List of PartitionPruneSteps used to
@@ -61,6 +62,7 @@ typedef struct PartitionedRelPruningData
int nparts;
int *subplan_map;
int *subpart_map;
+ Index *rti_map;
Bitmapset *present_parts;
List *initial_pruning_steps;
List *exec_pruning_steps;
@@ -126,5 +128,10 @@ extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
int part_prune_index,
Bitmapset **initially_valid_subplans);
extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
- bool initial_prune);
+ bool initial_prune,
+ Bitmapset **scan_leafpart_rtis);
+extern Bitmapset *ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt,
+ ParamListInfo params,
+ PartitionPruneInfo *pruneinfo,
+ Bitmapset **scan_leafpart_rtis);
#endif /* EXECPARTITION_H */
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index e79e2c001f..60d5644908 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -35,6 +35,8 @@ typedef struct QueryDesc
/* These fields are provided by CreateQueryDesc */
CmdType operation; /* CMD_SELECT, CMD_UPDATE, etc. */
PlannedStmt *plannedstmt; /* planner's output (could be utility, too) */
+ PartitionPruneResult *part_prune_result; /* ExecutorDoInitialPruning()'s
+ * output for plannedstmt */
const char *sourceText; /* source text of the query */
Snapshot snapshot; /* snapshot to use for query */
Snapshot crosscheck_snapshot; /* crosscheck for RI update/delete */
@@ -57,6 +59,7 @@ typedef struct QueryDesc
/* in pquery.c */
extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
+ PartitionPruneResult *part_prune_result,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index d68a6b9d28..5c4a282be0 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -185,6 +185,8 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
/*
* prototypes from functions in execMain.c
*/
+extern PartitionPruneResult *ExecutorDoInitialPruning(PlannedStmt *plannedstmt,
+ ParamListInfo params);
extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
extern void ExecutorRun(QueryDesc *queryDesc,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 25e0bb976e..4d4bb3fc3c 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -986,6 +986,33 @@ typedef struct DomainConstraintState
*/
typedef TupleTableSlot *(*ExecProcNodeMtd) (struct PlanState *pstate);
+/*----------------
+ * PartitionPruneResult
+ *
+ * The result of invoking ExecutorDoInitialPruning() on a given
+ * PlannedStmt.
+ *
+ * Contains a list of Bitmapsets, one for every PartitionPruneInfo found in
+ * PlannedStmt.partPruneInfos, each holding the indexes of the subplans that
+ * remain after initial pruning done by calling ExecFindMatchingSubPlans().
+ * RT indexes of the leaf partitions scanned by those subplans across all
+ * PartitionPruneInfos are added into scan_leafpart_rtis.
+ *
+ * This is used by GetCachedPlan() to inform its callers of the pruning
+ * decisions made when performing AcquireExecutorLocks() on a given cached
+ * PlannedStmt, which the callers then pass on to the executor. The executor
+ * refers to this node when initializing the plan nodes which contain subplans
+ * that may have been pruned by ExecutorDoInitialPruning(), rather than
+ * redoing initial pruning.
+ */
+typedef struct PartitionPruneResult
+{
+ NodeTag type;
+
+ List *valid_subplan_offs_list;
+ Bitmapset *scan_leafpart_rtis;
+} PartitionPruneResult;
+
/* ----------------
* PlanState node
*
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 7ce1fc4deb..c7f256028e 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -97,6 +97,9 @@ typedef enum NodeTag
T_PartitionPruneStepCombine,
T_PlanInvalItem,
+ /* TAGS FOR EXECUTOR PREP NODES (execnodes.h) */
+ T_PartitionPruneResult,
+
/*
* TAGS FOR PLAN STATE NODES (execnodes.h)
*
@@ -675,6 +678,7 @@ extern struct Bitmapset *readBitmapset(void);
extern uintptr_t readDatum(bool typbyval);
extern bool *readBoolCols(int numCols);
extern int *readIntCols(int numCols);
+extern Index *readIndexCols(int numCols);
extern Oid *readOidCols(int numCols);
extern int16 *readAttrNumberCols(int numCols);
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index a0f3a46334..c2d91bb12f 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -110,6 +110,15 @@ typedef struct PlannerGlobal
List *partPruneInfos; /* List of PartitionPruneInfo contained in
* the plan */
+ bool containsInitialPruning; /* Do any of those PartitionPruneInfos
+ * have initial (pre-exec) pruning
+ * steps in them? */
+
+ Bitmapset *minLockRelids; /* Indexes of all range table entries minus
+ * indexes of range table entries of the leaf
+ * partitions scanned by prunable subplans;
+ * see AcquireExecutorLocks() */
+
List *relationOids; /* OIDs of relations the plan depends on */
List *invalItems; /* other dependencies, as PlanInvalItems */
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index c3f4a39657..869bf535bc 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -67,8 +67,17 @@ typedef struct PlannedStmt
List *partPruneInfos; /* List of PartitionPruneInfo contained in
* the plan */
+ bool containsInitialPruning; /* Do any of those PartitionPruneInfos
+ * have initial (pre-exec) pruning
+ * steps in them? */
+
List *rtable; /* list of RangeTblEntry nodes */
+ Bitmapset *minLockRelids; /* Indexes of all range table entries minus
+ * indexes of range table entries of the leaf
+ * partitions scanned by prunable subplans;
+ * see AcquireExecutorLocks() */
+
/* rtable indexes of target relations for INSERT/UPDATE/DELETE */
List *resultRelations; /* integer list of RT indexes, or NIL */
@@ -1386,6 +1395,13 @@ typedef struct PlanRowMark
* prune_infos List of Lists containing PartitionedRelPruneInfo nodes,
* one sublist per run-time-prunable partition hierarchy
* appearing in the parent plan node's subplans.
+ *
+ * needs_init_pruning Does any of the PartitionedRelPruneInfos in
+ * prune_infos have its initial_pruning_steps set?
+ *
+ * needs_exec_pruning Does any of the PartitionedRelPruneInfos in
+ * prune_infos have its exec_pruning_steps set?
+ *
* other_subplans Indexes of any subplans that are not accounted for
* by any of the PartitionedRelPruneInfo nodes in
* "prune_infos". These subplans must not be pruned.
@@ -1394,6 +1410,8 @@ typedef struct PartitionPruneInfo
{
NodeTag type;
List *prune_infos;
+ bool needs_init_pruning;
+ bool needs_exec_pruning;
Bitmapset *other_subplans;
} PartitionPruneInfo;
@@ -1436,6 +1454,9 @@ typedef struct PartitionedRelPruneInfo
/* relation OID by partition index, or 0 */
Oid *relid_map;
+ /* Range table index by partition index, or 0. */
+ Index *rti_map;
+
/*
* initial_pruning_steps shows how to prune during executor startup (i.e.,
* without use of any PARAM_EXEC Params); it is NIL if no startup pruning
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 0499635f59..1c5bb5ece1 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -220,7 +220,8 @@ extern List *CachedPlanGetTargetList(CachedPlanSource *plansource,
extern CachedPlan *GetCachedPlan(CachedPlanSource *plansource,
ParamListInfo boundParams,
ResourceOwner owner,
- QueryEnvironment *queryEnv);
+ QueryEnvironment *queryEnv,
+ List **part_prune_result_list);
extern void ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner);
extern bool CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index aeddbdafe5..9f7727a837 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -138,6 +138,7 @@ typedef struct PortalData
QueryCompletion qc; /* command completion data for executed query */
List *stmts; /* list of PlannedStmts */
CachedPlan *cplan; /* CachedPlan, if stmts are from one */
+ List *part_prune_results; /* list of PartitionPruneResults */
ParamListInfo portalParams; /* params to pass to query */
QueryEnvironment *queryEnv; /* environment for query */
@@ -242,6 +243,8 @@ extern void PortalDefineQuery(Portal portal,
CommandTag commandTag,
List *stmts,
CachedPlan *cplan);
+extern void PortalStorePartitionPruneResults(Portal portal,
+ List *part_prune_result_list);
extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
extern void PortalCreateHoldStore(Portal portal);
extern void PortalHashTableDeleteAll(void);
--
2.35.3
* Re: generic plans and "initial" pruning
@ 2022-07-13 06:40 Amit Langote <[email protected]>
parent: Amit Langote <[email protected]>
1 sibling, 1 reply; 71+ messages in thread
From: Amit Langote @ 2022-07-13 06:40 UTC (permalink / raw)
To: Jacob Champion <[email protected]>; +Cc: Zhihong Yu <[email protected]>; David Rowley <[email protected]>; Robert Haas <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers
Rebased over 964d01ae90c.
--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com
Attachments:
[application/octet-stream] v18-0002-Optimize-AcquireExecutorLocks-by-locking-only-un.patch (81.4K, 2-v18-0002-Optimize-AcquireExecutorLocks-by-locking-only-un.patch)
From 567059057ee35bcd8ca066f46d4c6b23641af090 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Wed, 22 Dec 2021 16:55:17 +0900
Subject: [PATCH v18 2/2] Optimize AcquireExecutorLocks() by locking only
unpruned partitions
This commit teaches AcquireExecutorLocks() to perform initial
partition pruning to notionally eliminate the subnodes contained in a
generic cached plan that need not be initialized during the actual
execution of the plan, and to skip locking the partitions scanned by
those subnodes.
The result of performing initial partition pruning this way, before the
actual execution has started, is passed to the executor as a
PartitionPruneResult, which the callers of the executor that used
plancache.c to get the plan supply along with the PlannedStmt. It is NULL
when the plan is obtained by calling the planner directly or when the
plan obtained from plancache.c is not a generic one.
---
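For anyone skimming the thread, here is a toy sketch of the idea described in the commit message. It is not code from the patch. It assumes a single prunable Append over range partitions and uses a 64-bit mask as a stand-in for Bitmapset; every name prefixed with toy_ or Toy is hypothetical. It shows only that the relations to lock are the always-locked set (what the patch calls minLockRelids) plus the RT indexes of the leaf partitions whose subplans survive initial pruning.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for Bitmapset; holds members 0..63 only. */
typedef uint64_t ToySet;

static ToySet
toy_add(ToySet s, int member)
{
	assert(member >= 0 && member < 64);
	return s | (UINT64_C(1) << member);
}

static int
toy_is_member(ToySet s, int member)
{
	assert(member >= 0 && member < 64);
	return (int) ((s >> member) & 1);
}

/*
 * Hypothetical prunable parent node: subplan i scans the leaf partition
 * whose RT index is rti_of_subplan[i] and whose keys lie in
 * [lower[i], upper[i]).
 */
typedef struct ToyAppend
{
	int			nsubplans;
	const int  *rti_of_subplan;
	const int  *lower;
	const int  *upper;
} ToyAppend;

/*
 * Toy "initial pruning": the key is known before execution starts (think of
 * an external parameter), so it can be evaluated up front. Returns the
 * offsets of the surviving subplans and adds the RT indexes of the leaf
 * partitions they scan to *scan_leafpart_rtis.
 */
static ToySet
toy_initial_pruning(const ToyAppend *node, int key,
					ToySet *scan_leafpart_rtis)
{
	ToySet		valid_subplan_offs = 0;

	for (int i = 0; i < node->nsubplans; i++)
	{
		if (key >= node->lower[i] && key < node->upper[i])
		{
			valid_subplan_offs = toy_add(valid_subplan_offs, i);
			*scan_leafpart_rtis = toy_add(*scan_leafpart_rtis,
										  node->rti_of_subplan[i]);
		}
	}
	return valid_subplan_offs;
}

/* Relations to lock: the always-locked set plus the surviving leaves. */
static ToySet
toy_relids_to_lock(ToySet min_lock_relids, ToySet scan_leafpart_rtis)
{
	return min_lock_relids | scan_leafpart_rtis;
}
```

With three partitions at RT indexes 2, 3 and 4 covering [0,10), [10,20) and [20,30), and the parent table at RT index 1 as the only always-locked entry, a key of 15 leaves only subplan offset 1, so the lock set would be RT indexes 1 and 3. The same surviving-offsets set is what the executor would need to reuse rather than recompute.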
src/backend/commands/copyto.c | 2 +-
src/backend/commands/createas.c | 2 +-
src/backend/commands/explain.c | 7 +-
src/backend/commands/extension.c | 2 +-
src/backend/commands/matview.c | 2 +-
src/backend/commands/prepare.c | 26 ++-
src/backend/executor/README | 32 ++++
src/backend/executor/execMain.c | 53 ++++++
src/backend/executor/execParallel.c | 27 ++-
src/backend/executor/execPartition.c | 234 +++++++++++++++++++++----
src/backend/executor/functions.c | 2 +-
src/backend/executor/nodeAppend.c | 11 +-
src/backend/executor/nodeMergeAppend.c | 5 +-
src/backend/executor/spi.c | 27 ++-
src/backend/nodes/copyfuncs.c | 1 -
src/backend/nodes/outfuncs.c | 1 -
src/backend/nodes/readfuncs.c | 29 +++
src/backend/optimizer/plan/planner.c | 2 +
src/backend/optimizer/plan/setrefs.c | 46 +++++
src/backend/partitioning/partprune.c | 41 ++++-
src/backend/tcop/postgres.c | 8 +-
src/backend/tcop/pquery.c | 28 ++-
src/backend/utils/cache/plancache.c | 187 +++++++++++++++++---
src/backend/utils/mmgr/portalmem.c | 19 ++
src/include/commands/explain.h | 4 +-
src/include/executor/execPartition.h | 9 +-
src/include/executor/execdesc.h | 3 +
src/include/executor/executor.h | 2 +
src/include/nodes/execnodes.h | 27 +++
src/include/nodes/nodes.h | 1 +
src/include/nodes/pathnodes.h | 13 ++
src/include/nodes/plannodes.h | 21 +++
src/include/utils/plancache.h | 3 +-
src/include/utils/portal.h | 3 +
34 files changed, 782 insertions(+), 98 deletions(-)
diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index fca29a9a10..d839517693 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -541,7 +541,7 @@ BeginCopyTo(ParseState *pstate,
((DR_copy *) dest)->cstate = cstate;
/* Create a QueryDesc requesting no output */
- cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ cstate->queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
GetActiveSnapshot(),
InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 9abbb6b555..f6607f2454 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -325,7 +325,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index e29c2ae206..e41b13a3ea 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -407,7 +407,7 @@ ExplainOneQuery(Query *query, int cursorOptions,
}
/* run it (if needed) and produce output */
- ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+ ExplainOnePlan(plan, NULL, into, es, queryString, params, queryEnv,
&planduration, (es->buffers ? &bufusage : NULL));
}
}
@@ -515,7 +515,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
* to call it.
*/
void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, PartitionPruneResult *part_prune_result,
+ IntoClause *into, ExplainState *es,
const char *queryString, ParamListInfo params,
QueryEnvironment *queryEnv, const instr_time *planduration,
const BufferUsage *bufusage)
@@ -563,7 +564,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
dest = None_Receiver;
/* Create a QueryDesc for the query */
- queryDesc = CreateQueryDesc(plannedstmt, queryString,
+ queryDesc = CreateQueryDesc(plannedstmt, part_prune_result, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, instrument_option);
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index 3db859c3ea..631cc07217 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -776,7 +776,7 @@ execute_sql_string(const char *sql)
{
QueryDesc *qdesc;
- qdesc = CreateQueryDesc(stmt,
+ qdesc = CreateQueryDesc(stmt, NULL,
sql,
GetActiveSnapshot(), NULL,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 9ac0383459..b0ed96e56c 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -408,7 +408,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, queryString,
+ queryDesc = CreateQueryDesc(plan, NULL, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 2333aae467..83465e40f8 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -155,6 +155,7 @@ ExecuteQuery(ParseState *pstate,
PreparedStatement *entry;
CachedPlan *cplan;
List *plan_list;
+ List *part_prune_result_list;
ParamListInfo paramLI = NULL;
EState *estate = NULL;
Portal portal;
@@ -193,7 +194,10 @@ ExecuteQuery(ParseState *pstate,
entry->plansource->query_string);
/* Replan if needed, and increment plan refcount for portal */
- cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL);
+ cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL,
+ &part_prune_result_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_result_list));
plan_list = cplan->stmt_list;
/*
@@ -207,6 +211,9 @@ ExecuteQuery(ParseState *pstate,
plan_list,
cplan);
+ /* Copy PartitionPruneResults into the portal's context. */
+ PortalStorePartitionPruneResults(portal, part_prune_result_list);
+
/*
* For CREATE TABLE ... AS EXECUTE, we must verify that the prepared
* statement is one that produces tuples. Currently we insist that it be
@@ -576,7 +583,9 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
const char *query_string;
CachedPlan *cplan;
List *plan_list;
- ListCell *p;
+ List *part_prune_result_list;
+ ListCell *p,
+ *pp;
ParamListInfo paramLI = NULL;
EState *estate = NULL;
instr_time planstart;
@@ -619,7 +628,10 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
/* Replan if needed, and acquire a transient refcount */
cplan = GetCachedPlan(entry->plansource, paramLI,
- CurrentResourceOwner, queryEnv);
+ CurrentResourceOwner, queryEnv,
+ &part_prune_result_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_result_list));
INSTR_TIME_SET_CURRENT(planduration);
INSTR_TIME_SUBTRACT(planduration, planstart);
@@ -634,13 +646,15 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
plan_list = cplan->stmt_list;
/* Explain each query */
- foreach(p, plan_list)
+ forboth(p, plan_list, pp, part_prune_result_list)
{
PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
+ PartitionPruneResult *part_prune_result = lfirst_node(PartitionPruneResult, pp);
if (pstmt->commandType != CMD_UTILITY)
- ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv,
- &planduration, (es->buffers ? &bufusage : NULL));
+ ExplainOnePlan(pstmt, part_prune_result, into, es, query_string,
+ paramLI, queryEnv, &planduration,
+ (es->buffers ? &bufusage : NULL));
else
ExplainOneUtility(pstmt->utilityStmt, into, es, query_string,
paramLI, queryEnv);
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 0b5183fc4a..953a476ea5 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -65,6 +65,34 @@ found there. This currently only occurs for Append and MergeAppend nodes. In
this case the non-required subplans are ignored and the executor state's
subnode array will become out of sequence to the plan's subplan list.
+Actually, the so-called execution time pruning may also occur even before the
+execution has started. One case where that occurs is when a cached generic
+plan is being validated for execution by plancache.c: GetCachedPlan(), which
+works by locking all the relations that will be scanned by that plan. If the
+generic plan contains nodes that can perform execution time partition pruning
+(that is, contain a PartitionPruneInfo), a subset of pruning steps contained
+in a given node's PartitionPruneInfo that do not depend on the execution
+actually having started (called "initial" pruning steps) are performed at this
+point to figure out the minimal set of child subplans that satisfy those
+pruning steps. AcquireExecutorLocks() looking at a given plan tree will then
+lock only the relations scanned by the child subplans that survived such
+pruning, along with those present in PlannedStmt.minLockRelids. Note that the
+subplans are only notionally pruned in that they are not removed from the plan
+tree as such.
+
+To prevent the executor and any third party execution code that can look at
+the plan tree from trying to execute the subplans that were pruned as
+described above, the result of pruning is passed to the executor as a
+PartitionPruneResult node via the QueryDesc. It consists of the set of
+indexes of surviving subplans in their respective parent plan node's list of
+child subplans, saved as a list of bitmapsets, with one element for every
+parent plan node whose PartitionPruneInfo is present in
+PlannedStmt.partPruneInfos. In other words, the executor should not
+re-evaluate the set of initially valid subplans by redoing the initial pruning
+if it was already done by AcquireExecutorLocks(), because the re-evaluation may
+very well end up resulting in a different set of subplans, containing some
+whose relations were not locked by AcquireExecutorLocks().
+
Each Plan node may have expression trees associated with it, to represent
its target list, qualification conditions, etc. These trees are also
read-only to the executor, but the executor state for expression evaluation
@@ -286,6 +314,10 @@ Query Processing Control Flow
This is a sketch of control flow for full query processing:
+ [ ExecutorDoInitialPruning ] --- an optional step to perform initial
+ partition pruning on the plan tree the result of which is passed
+ to the executor via QueryDesc
+
CreateQueryDesc
ExecutorStart
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 72fc273524..45824624f8 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -49,6 +49,7 @@
#include "commands/matview.h"
#include "commands/trigger.h"
#include "executor/execdebug.h"
+#include "executor/execPartition.h"
#include "executor/nodeSubplan.h"
#include "foreign/fdwapi.h"
#include "jit/jit.h"
@@ -104,6 +105,56 @@ static void EvalPlanQualStart(EPQState *epqstate, Plan *planTree);
/* end of local decls */
+/* ----------------------------------------------------------------
+ * ExecutorDoInitialPruning
+ *
+ * For each plan tree node that has been assigned a PartitionPruneInfo,
+ * this performs initial partition pruning using the information contained
+ * therein to determine the set of child subplans that satisfy the initial
+ * pruning steps, to be returned as a bitmapset of their indexes in the
+ * node's list of child subplans (for example, an Append's appendplans).
+ *
+ * Return value is a PartitionPruneResult node that contains a list of those
+ * bitmapsets, with one element for every PartitionPruneInfo, and a bitmapset
+ * of the RT indexes of all the leaf partitions scanned by those chosen
+ * subplans. Note that the latter is shared across all PartitionPruneInfos.
+ *
+ * The executor must see exactly the same set of subplans as valid for
+ * execution when doing ExecInitNode() on the plan nodes whose
+ * PartitionPruneInfos are processed here. So, it must get the set from the
+ * aforementioned PartitionPruneResult, instead of computing it all over
+ * again by redoing the initial pruning. It's the caller's job to pass the
+ * PartitionPruneResult to the executor.
+ *
+ * Note: Partitioned tables mentioned in PartitionedRelPruneInfo nodes that
+ * drive the pruning will be locked before doing the pruning.
+ * ----------------------------------------------------------------
+ */
+PartitionPruneResult *
+ExecutorDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params)
+{
+ PartitionPruneResult *result;
+ ListCell *lc;
+
+ /* Only get here if there is any pruning to do. */
+ Assert(plannedstmt->containsInitialPruning);
+
+ result = makeNode(PartitionPruneResult);
+ foreach(lc, plannedstmt->partPruneInfos)
+ {
+ PartitionPruneInfo *pruneinfo = lfirst(lc);
+ Bitmapset *valid_subplan_offs;
+
+ valid_subplan_offs =
+ ExecPartitionDoInitialPruning(plannedstmt, params, pruneinfo,
+ &result->scan_leafpart_rtis);
+ result->valid_subplan_offs_list =
+ lappend(result->valid_subplan_offs_list,
+ valid_subplan_offs);
+ }
+
+ return result;
+}
/* ----------------------------------------------------------------
* ExecutorStart
@@ -806,6 +857,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
{
CmdType operation = queryDesc->operation;
PlannedStmt *plannedstmt = queryDesc->plannedstmt;
+ PartitionPruneResult *part_prune_result = queryDesc->part_prune_result;
Plan *plan = plannedstmt->planTree;
List *rangeTable = plannedstmt->rtable;
EState *estate = queryDesc->estate;
@@ -826,6 +878,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
estate->es_plannedstmt = plannedstmt;
estate->es_part_prune_infos = plannedstmt->partPruneInfos;
+ estate->es_part_prune_result = part_prune_result;
/*
* Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index f73b8c2607..7e6dab5623 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -66,6 +66,7 @@
#define PARALLEL_KEY_QUERY_TEXT UINT64CONST(0xE000000000000008)
#define PARALLEL_KEY_JIT_INSTRUMENTATION UINT64CONST(0xE000000000000009)
#define PARALLEL_KEY_WAL_USAGE UINT64CONST(0xE00000000000000A)
+#define PARALLEL_KEY_PARTITIONPRUNERESULT UINT64CONST(0xE00000000000000B)
#define PARALLEL_TUPLE_QUEUE_SIZE 65536
@@ -182,6 +183,7 @@ ExecSerializePlan(Plan *plan, EState *estate)
pstmt->transientPlan = false;
pstmt->dependsOnRole = false;
pstmt->parallelModeNeeded = false;
+ pstmt->containsInitialPruning = false;
pstmt->planTree = plan;
pstmt->partPruneInfos = estate->es_part_prune_infos;
pstmt->rtable = estate->es_range_table;
@@ -597,12 +599,15 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
FixedParallelExecutorState *fpes;
char *pstmt_data;
char *pstmt_space;
+ char *part_prune_result_data;
+ char *part_prune_result_space;
char *paramlistinfo_space;
BufferUsage *bufusage_space;
WalUsage *walusage_space;
SharedExecutorInstrumentation *instrumentation = NULL;
SharedJitInstrumentation *jit_instrumentation = NULL;
int pstmt_len;
+ int part_prune_result_len;
int paramlistinfo_len;
int instrumentation_len = 0;
int jit_instrumentation_len = 0;
@@ -631,6 +636,7 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
/* Fix up and serialize plan to be sent to workers. */
pstmt_data = ExecSerializePlan(planstate->plan, estate);
+ part_prune_result_data = nodeToString(estate->es_part_prune_result);
/* Create a parallel context. */
pcxt = CreateParallelContext("postgres", "ParallelQueryMain", nworkers);
@@ -657,6 +663,11 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
shm_toc_estimate_chunk(&pcxt->estimator, pstmt_len);
shm_toc_estimate_keys(&pcxt->estimator, 1);
+ /* Estimate space for serialized PartitionPruneResult. */
+ part_prune_result_len = strlen(part_prune_result_data) + 1;
+ shm_toc_estimate_chunk(&pcxt->estimator, part_prune_result_len);
+ shm_toc_estimate_keys(&pcxt->estimator, 1);
+
/* Estimate space for serialized ParamListInfo. */
paramlistinfo_len = EstimateParamListSpace(estate->es_param_list_info);
shm_toc_estimate_chunk(&pcxt->estimator, paramlistinfo_len);
@@ -751,6 +762,12 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
memcpy(pstmt_space, pstmt_data, pstmt_len);
shm_toc_insert(pcxt->toc, PARALLEL_KEY_PLANNEDSTMT, pstmt_space);
+ /* Store serialized PartitionPruneResult */
+ part_prune_result_space = shm_toc_allocate(pcxt->toc, part_prune_result_len);
+ memcpy(part_prune_result_space, part_prune_result_data, part_prune_result_len);
+ shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARTITIONPRUNERESULT,
+ part_prune_result_space);
+
/* Store serialized ParamListInfo. */
paramlistinfo_space = shm_toc_allocate(pcxt->toc, paramlistinfo_len);
shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARAMLISTINFO, paramlistinfo_space);
@@ -1232,8 +1249,10 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
int instrument_options)
{
char *pstmtspace;
+ char *part_prune_result_space;
char *paramspace;
PlannedStmt *pstmt;
+ PartitionPruneResult *part_prune_result;
ParamListInfo paramLI;
char *queryString;
@@ -1244,12 +1263,18 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
pstmtspace = shm_toc_lookup(toc, PARALLEL_KEY_PLANNEDSTMT, false);
pstmt = (PlannedStmt *) stringToNode(pstmtspace);
+ /* Reconstruct leader-supplied PartitionPruneResult. */
+ part_prune_result_space =
+ shm_toc_lookup(toc, PARALLEL_KEY_PARTITIONPRUNERESULT, false);
+ part_prune_result = (PartitionPruneResult *)
+ stringToNode(part_prune_result_space);
+
/* Reconstruct ParamListInfo. */
paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
paramLI = RestoreParamList(&paramspace);
/* Create a QueryDesc for the query. */
- return CreateQueryDesc(pstmt,
+ return CreateQueryDesc(pstmt, part_prune_result,
queryString,
GetActiveSnapshot(), InvalidSnapshot,
receiver, paramLI, NULL, instrument_options);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index b55cdd2580..24e6f6e988 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -25,6 +25,7 @@
#include "mb/pg_wchar.h"
#include "miscadmin.h"
#include "nodes/makefuncs.h"
+#include "parser/parsetree.h"
#include "partitioning/partbounds.h"
#include "partitioning/partdesc.h"
#include "partitioning/partprune.h"
@@ -185,7 +186,11 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
static PartitionPruneState *CreatePartitionPruneState(PlanState *planstate,
- PartitionPruneInfo *pruneinfo);
+ PartitionPruneInfo *pruneinfo,
+ bool consider_initial_steps,
+ bool consider_exec_steps,
+ List *rtable, ExprContext *econtext,
+ PartitionDirectory partdir);
static void InitPartitionPruneContext(PartitionPruneContext *context,
List *pruning_steps,
PartitionDesc partdesc,
@@ -198,7 +203,8 @@ static void PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
- Bitmapset **validsubplans);
+ Bitmapset **validsubplans,
+ Bitmapset **scan_leafpart_rtis);
/*
@@ -1593,8 +1599,10 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
* considered to be a stable expression, it can change value from one plan
* node scan to the next during query execution. Stable comparison
* expressions that don't involve such Params allow partition pruning to be
- * done once during executor startup. Expressions that do involve such Params
- * require us to prune separately for each scan of the parent plan node.
+ * done once during executor startup or during ExecutorDoInitialPruning() that
+ * runs as part of performing AcquireExecutorLocks() on a given plan tree.
+ * Expressions that do involve such Params require us to prune separately for
+ * each scan of the parent plan node.
*
* Note that pruning away unneeded subplans during executor startup has the
* added benefit of not having to initialize the unneeded subplans at all.
@@ -1611,6 +1619,13 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
* account for initial pruning possibly having eliminated some of the
* subplans.
*
+ * ExecPartitionDoInitialPruning:
+ * Do initial pruning with the information contained in a given
+ * PartitionPruneInfo to determine the minimal set of child subplans
+ * to be executed of the parent plan node to which the PartitionPruneInfo
+ * belongs and also the set of the RT indexes of leaf partitions that will
+ * be scanned with those subplans.
+ *
* ExecFindMatchingSubPlans:
* Returns indexes of matching subplans after evaluating the expressions
* that are safe to evaluate at a given point. This function is first
@@ -1628,8 +1643,9 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
*
* On return, *initially_valid_subplans is assigned the set of indexes of
* child subplans that must be initialized along with the parent plan node.
- * Initial pruning is performed here if needed and in that case only the
- * surviving subplans' indexes are added.
+ * Initial pruning is performed here if needed (unless it has already been done
+ * by ExecutorDoInitialPruning()), and in that case only the surviving
+ * subplans' indexes are added.
*
* If subplans are indeed pruned, subplan_map arrays contained in the returned
* PartitionPruneState are re-sequenced to not count those, though only if the
@@ -1645,24 +1661,59 @@ ExecInitPartitionPruning(PlanState *planstate,
EState *estate = planstate->state;
PartitionPruneInfo *pruneinfo = list_nth(estate->es_part_prune_infos,
part_prune_index);
+ PartitionPruneResult *pruneresult = estate->es_part_prune_result;
+ bool do_pruning = (pruneinfo->needs_init_pruning ||
+ pruneinfo->needs_exec_pruning);
- /* We may need an expression context to evaluate partition exprs */
- ExecAssignExprContext(estate, planstate);
+ /*
+ * No need to do initial pruning if it was done already by
+ * ExecutorDoInitialPruning(), which it would be if es_part_prune_result
+ * has been set.
+ */
+ if (pruneresult)
+ do_pruning = pruneinfo->needs_exec_pruning;
- /* Create the working data structure for pruning */
- prunestate = CreatePartitionPruneState(planstate, pruneinfo);
+ prunestate = NULL;
+ if (do_pruning)
+ {
+ /* We may need an expression context to evaluate partition exprs */
+ ExecAssignExprContext(estate, planstate);
+
+ /* For data reading, executor always omits detached partitions */
+ if (estate->es_partition_directory == NULL)
+ estate->es_partition_directory =
+ CreatePartitionDirectory(estate->es_query_cxt, false);
+
+ /*
+ * Create the working data structure for pruning. No need to consider
+ * initial pruning steps if we have a PartitionPruneResult.
+ */
+ prunestate = CreatePartitionPruneState(planstate, pruneinfo,
+ pruneresult == NULL, true,
+ NIL, planstate->ps_ExprContext,
+ estate->es_partition_directory);
+ }
/*
* Perform an initial partition prune pass, if required.
*/
- if (prunestate->do_initial_prune)
- *initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true);
+ if (pruneresult)
+ {
+ *initially_valid_subplans =
+ list_nth(pruneresult->valid_subplan_offs_list, part_prune_index);
+ }
+ else if (prunestate && prunestate->do_initial_prune)
+ {
+ *initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true,
+ NULL);
+ }
else
{
/* No pruning, so we'll need to initialize all subplans */
Assert(n_total_subplans > 0);
*initially_valid_subplans = bms_add_range(NULL, 0,
n_total_subplans - 1);
+ return prunestate;
}
/*
@@ -1670,7 +1721,8 @@ ExecInitPartitionPruning(PlanState *planstate,
* that were removed above due to initial pruning. No need to do this if
* no steps were removed.
*/
- if (bms_num_members(*initially_valid_subplans) < n_total_subplans)
+ if (prunestate &&
+ bms_num_members(*initially_valid_subplans) < n_total_subplans)
{
/*
* We can safely skip this when !do_exec_prune, even though that
@@ -1686,11 +1738,73 @@ ExecInitPartitionPruning(PlanState *planstate,
return prunestate;
}
+/*
+ * ExecPartitionDoInitialPruning
+ * Perform initial pruning using given PartitionPruneInfo to determine
+ * the minimal set of child subplans that will be executed and also the
+ * set of RT indexes of the leaf partitions scanned by those subplans.
+ */
+Bitmapset *
+ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
+ PartitionPruneInfo *pruneinfo,
+ Bitmapset **scan_leafpart_rtis)
+{
+ List *rtable = plannedstmt->rtable;
+ ExprContext *econtext;
+ PartitionDirectory pdir;
+ MemoryContext oldcontext,
+ tmpcontext;
+ PartitionPruneState *prunestate;
+ Bitmapset *valid_subplan_offs;
+
+ /*
+ * A temporary context for memory allocations required while executing
+ * partition pruning steps.
+ */
+ tmpcontext = AllocSetContextCreate(CurrentMemoryContext,
+ "initial pruning working data",
+ ALLOCSET_DEFAULT_SIZES);
+ oldcontext = MemoryContextSwitchTo(tmpcontext);
+
+ /*
+ * PartitionDirectory to look up partition descriptors, which omits
+ * detached partitions, just like in the executor proper.
+ */
+ pdir = CreatePartitionDirectory(CurrentMemoryContext, false);
+
+ /*
+ * We don't yet have a PlanState for the parent plan node, so we must
+ * create a standalone ExprContext to evaluate pruning expressions,
+ * equipped with the information about the EXTERN parameters that the
+ * caller passed us. Note that that's okay because the initial pruning
+ * steps do not contain anything that requires the execution to have
+ * started.
+ */
+ econtext = CreateStandaloneExprContext();
+ econtext->ecxt_param_list_info = params;
+ prunestate = CreatePartitionPruneState(NULL, pruneinfo, true, false,
+ rtable, econtext, pdir);
+ MemoryContextSwitchTo(oldcontext);
+
+ /* Do the initial pruning. */
+ valid_subplan_offs = ExecFindMatchingSubPlans(prunestate, true,
+ scan_leafpart_rtis);
+
+ FreeExprContext(econtext, true);
+ DestroyPartitionDirectory(pdir);
+ MemoryContextDelete(tmpcontext);
+
+ return valid_subplan_offs;
+}
+
/*
* CreatePartitionPruneState
* Build the data structure required for calling ExecFindMatchingSubPlans
*
- * 'planstate' is the parent plan node's execution state.
+ * 'planstate', if not NULL, is the parent plan node's execution state. It
+ * can be NULL if being called before ExecutorStart(), in which case,
+ * 'rtable' (range table), 'econtext', and 'partdir' must be explicitly
+ * provided.
*
* 'pruneinfo' is a PartitionPruneInfo as generated by
* make_partition_pruneinfo. Here we build a PartitionPruneState containing a
@@ -1704,19 +1818,21 @@ ExecInitPartitionPruning(PlanState *planstate,
* PartitionedRelPruneInfo.
*/
static PartitionPruneState *
-CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
+CreatePartitionPruneState(PlanState *planstate,
+ PartitionPruneInfo *pruneinfo,
+ bool consider_initial_steps,
+ bool consider_exec_steps,
+ List *rtable, ExprContext *econtext,
+ PartitionDirectory partdir)
{
- EState *estate = planstate->state;
+ EState *estate = planstate ? planstate->state : NULL;
PartitionPruneState *prunestate;
int n_part_hierarchies;
ListCell *lc;
int i;
- ExprContext *econtext = planstate->ps_ExprContext;
- /* For data reading, executor always omits detached partitions */
- if (estate->es_partition_directory == NULL)
- estate->es_partition_directory =
- CreatePartitionDirectory(estate->es_query_cxt, false);
+ Assert((estate != NULL) ||
+ (partdir != NULL && econtext != NULL && rtable != NIL));
n_part_hierarchies = list_length(pruneinfo->prune_infos);
Assert(n_part_hierarchies > 0);
@@ -1771,15 +1887,42 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
PartitionKey partkey;
/*
- * We can rely on the copies of the partitioned table's partition
- * key and partition descriptor appearing in its relcache entry,
- * because that entry will be held open and locked for the
- * duration of this executor run.
+ * We must open the relation ourselves when called before
+ * execution has started, such as when called during
+ * ExecutorDoInitialPruning() on a cached plan. In that case,
+ * sub-partitions must be locked, because AcquirePlannerLocks()
+ * would not have seen them. (1st relation in a partrelpruneinfos
+ * list is always the root partitioned table appearing in the
+ * query, which AcquirePlannerLocks() would have locked; the
+ * Assert in relation_open() guards that assumption.)
+ */
+ if (estate == NULL)
+ {
+ RangeTblEntry *rte = rt_fetch(pinfo->rtindex, rtable);
+ int lockmode = (j == 0) ? NoLock : rte->rellockmode;
+
+ partrel = table_open(rte->relid, lockmode);
+ }
+ else
+ partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
+
+ /*
+ * We can rely on the copy of the partitioned table's partition
+ * key in its relcache entry, because it can't change (or
+ * get destroyed) as long as the relation is locked. Partition
+ * descriptor is taken from the PartitionDirectory associated with
+ * the table that is held open long enough for the descriptor to
+ * remain valid while it's used to perform the pruning steps.
*/
- partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
partkey = RelationGetPartitionKey(partrel);
- partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
- partrel);
+ partdesc = PartitionDirectoryLookup(partdir, partrel);
+
+ /*
+ * Must close partrel, keeping the lock taken, if we're not using
+ * EState's entry.
+ */
+ if (estate == NULL)
+ table_close(partrel, NoLock);
/*
* Initialize the subplan_map and subpart_map.
@@ -1793,6 +1936,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
Assert(partdesc->nparts >= pinfo->nparts);
pprune->nparts = partdesc->nparts;
pprune->subplan_map = palloc(sizeof(int) * partdesc->nparts);
+ pprune->rti_map = palloc(sizeof(Index) * partdesc->nparts);
if (partdesc->nparts == pinfo->nparts)
{
/*
@@ -1803,6 +1947,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
pprune->subpart_map = pinfo->subpart_map;
memcpy(pprune->subplan_map, pinfo->subplan_map,
sizeof(int) * pinfo->nparts);
+ memcpy(pprune->rti_map, pinfo->rti_map,
+ sizeof(Index) * pinfo->nparts);
/*
* Double-check that the list of unpruned relations has not
@@ -1853,6 +1999,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
pinfo->subplan_map[pd_idx];
pprune->subpart_map[pp_idx] =
pinfo->subpart_map[pd_idx];
+ pprune->rti_map[pp_idx] =
+ pinfo->rti_map[pd_idx];
pd_idx++;
}
else
@@ -1860,6 +2008,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
/* this partdesc entry is not in the plan */
pprune->subplan_map[pp_idx] = -1;
pprune->subpart_map[pp_idx] = -1;
+ pprune->rti_map[pp_idx] = 0;
}
}
@@ -1881,7 +2030,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
* Initialize pruning contexts as needed.
*/
pprune->initial_pruning_steps = pinfo->initial_pruning_steps;
- if (pinfo->initial_pruning_steps)
+ if (consider_initial_steps && pinfo->initial_pruning_steps)
{
InitPartitionPruneContext(&pprune->initial_context,
pinfo->initial_pruning_steps,
@@ -1891,7 +2040,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
prunestate->do_initial_prune = true;
}
pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
- if (pinfo->exec_pruning_steps)
+ if (consider_exec_steps && pinfo->exec_pruning_steps)
{
InitPartitionPruneContext(&pprune->exec_context,
pinfo->exec_pruning_steps,
@@ -2119,10 +2268,14 @@ PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
* Pass initial_prune if PARAM_EXEC Params cannot yet be evaluated. This
* differentiates the initial executor-time pruning step from later
* runtime pruning.
+ *
+ * RT indexes of leaf partitions scanned by the chosen subplans are added to
+ * *scan_leafpart_rtis if the pointer is non-NULL.
*/
Bitmapset *
ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
- bool initial_prune)
+ bool initial_prune,
+ Bitmapset **scan_leafpart_rtis)
{
Bitmapset *result = NULL;
MemoryContext oldcontext;
@@ -2157,7 +2310,7 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
*/
pprune = &prunedata->partrelprunedata[0];
find_matching_subplans_recurse(prunedata, pprune, initial_prune,
- &result);
+ &result, scan_leafpart_rtis);
/* Expression eval may have used space in ExprContext too */
if (pprune->exec_pruning_steps)
@@ -2171,6 +2324,8 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
/* Copy result out of the temp context before we reset it */
result = bms_copy(result);
+ if (scan_leafpart_rtis)
+ *scan_leafpart_rtis = bms_copy(*scan_leafpart_rtis);
MemoryContextReset(prunestate->prune_context);
@@ -2181,13 +2336,15 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
* find_matching_subplans_recurse
* Recursive worker function for ExecFindMatchingSubPlans
*
- * Adds valid (non-prunable) subplan IDs to *validsubplans
+ * Adds valid (non-prunable) subplan IDs to *validsubplans and RT indexes of
+ * the corresponding leaf partitions to *scan_leafpart_rtis (if asked for).
*/
static void
find_matching_subplans_recurse(PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
- Bitmapset **validsubplans)
+ Bitmapset **validsubplans,
+ Bitmapset **scan_leafpart_rtis)
{
Bitmapset *partset;
int i;
@@ -2214,8 +2371,14 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
while ((i = bms_next_member(partset, i)) >= 0)
{
if (pprune->subplan_map[i] >= 0)
+ {
*validsubplans = bms_add_member(*validsubplans,
pprune->subplan_map[i]);
+ Assert(pprune->rti_map[i] > 0);
+ if (scan_leafpart_rtis)
+ *scan_leafpart_rtis = bms_add_member(*scan_leafpart_rtis,
+ pprune->rti_map[i]);
+ }
else
{
int partidx = pprune->subpart_map[i];
@@ -2223,7 +2386,8 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
if (partidx >= 0)
find_matching_subplans_recurse(prunedata,
&prunedata->partrelprunedata[partidx],
- initial_prune, validsubplans);
+ initial_prune, validsubplans,
+ scan_leafpart_rtis);
else
{
/*
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index f9460ae506..a2182a6b1f 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -844,7 +844,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
else
dest = None_Receiver;
- es->qd = CreateQueryDesc(es->stmt,
+ es->qd = CreateQueryDesc(es->stmt, NULL,
fcache->src,
GetActiveSnapshot(),
InvalidSnapshot,
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index c6f86a6510..96880e122a 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -155,7 +155,8 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
* subplan, we can fill as_valid_subplans immediately, preventing
* later calls to ExecFindMatchingSubPlans.
*/
- if (!prunestate->do_exec_prune && nplans > 0)
+ if (appendstate->as_prune_state == NULL ||
+ (!appendstate->as_prune_state->do_exec_prune && nplans > 0))
appendstate->as_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
}
else
@@ -577,7 +578,7 @@ choose_next_subplan_locally(AppendState *node)
}
else if (node->as_valid_subplans == NULL)
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
whichplan = -1;
}
@@ -642,7 +643,7 @@ choose_next_subplan_for_leader(AppendState *node)
if (node->as_valid_subplans == NULL)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
/*
* Mark each invalid plan as finished to allow the loop below to
@@ -717,7 +718,7 @@ choose_next_subplan_for_worker(AppendState *node)
else if (node->as_valid_subplans == NULL)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
mark_invalid_subplans_as_finished(node);
}
@@ -868,7 +869,7 @@ ExecAppendAsyncBegin(AppendState *node)
if (node->as_valid_subplans == NULL)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
classify_matching_subplans(node);
}
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index 8d35860c30..2312e5a633 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -103,7 +103,8 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
* subplan, we can fill ms_valid_subplans immediately, preventing
* later calls to ExecFindMatchingSubPlans.
*/
- if (!prunestate->do_exec_prune && nplans > 0)
+ if (mergestate->ms_prune_state == NULL ||
+ (!mergestate->ms_prune_state->do_exec_prune && nplans > 0))
mergestate->ms_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
}
else
@@ -218,7 +219,7 @@ ExecMergeAppend(PlanState *pstate)
*/
if (node->ms_valid_subplans == NULL)
node->ms_valid_subplans =
- ExecFindMatchingSubPlans(node->ms_prune_state, false);
+ ExecFindMatchingSubPlans(node->ms_prune_state, false, NULL);
/*
* First time through: pull the first tuple from each valid subplan,
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 29bc26669b..303a572c02 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1578,6 +1578,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
CachedPlanSource *plansource;
CachedPlan *cplan;
List *stmt_list;
+ List *part_prune_result_list;
char *query_string;
Snapshot snapshot;
MemoryContext oldcontext;
@@ -1657,7 +1658,10 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
*/
/* Replan if needed, and increment plan refcount for portal */
- cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv);
+ cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv,
+ &part_prune_result_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_result_list));
stmt_list = cplan->stmt_list;
if (!plan->saved)
@@ -1685,6 +1689,9 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
stmt_list,
cplan);
+ /* Copy PartitionPruneResults into the portal's context. */
+ PortalStorePartitionPruneResults(portal, part_prune_result_list);
+
/*
* Set up options for portal. Default SCROLL type is chosen the same way
* as PerformCursorOpen does it.
@@ -2092,7 +2099,8 @@ SPI_plan_get_cached_plan(SPIPlanPtr plan)
/* Get the generic plan for the query */
cplan = GetCachedPlan(plansource, NULL,
plan->saved ? CurrentResourceOwner : NULL,
- _SPI_current->queryEnv);
+ _SPI_current->queryEnv,
+ NULL /* Not interested in PartitionPruneResults */);
Assert(cplan == plansource->gplan);
/* Pop the error context stack */
@@ -2473,7 +2481,9 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
{
CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
List *stmt_list;
- ListCell *lc2;
+ List *part_prune_result_list;
+ ListCell *lc2,
+ *lc3;
spicallbackarg.query = plansource->query_string;
@@ -2549,8 +2559,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
* plan, the refcount must be backed by the plan_owner.
*/
cplan = GetCachedPlan(plansource, options->params,
- plan_owner, _SPI_current->queryEnv);
-
+ plan_owner, _SPI_current->queryEnv,
+ &part_prune_result_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_result_list));
stmt_list = cplan->stmt_list;
/*
@@ -2589,9 +2601,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
}
}
- foreach(lc2, stmt_list)
+ forboth(lc2, stmt_list, lc3, part_prune_result_list)
{
PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+ PartitionPruneResult *part_prune_result = lfirst_node(PartitionPruneResult, lc3);
bool canSetTag = stmt->canSetTag;
DestReceiver *dest;
@@ -2663,7 +2676,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
else
snap = InvalidSnapshot;
- qdesc = CreateQueryDesc(stmt,
+ qdesc = CreateQueryDesc(stmt, part_prune_result,
plansource->query_string,
snap, crosscheck_snapshot,
dest,
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index e76fda8eba..afd0332ddd 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -160,7 +160,6 @@ _copyExtensibleNode(const ExtensibleNode *from)
return newnode;
}
-
/*
* copyObjectImpl -- implementation of copyObject(); see nodes/nodes.h
*
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 81f6a9093c..84a195adca 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -294,7 +294,6 @@ outDatum(StringInfo str, Datum value, int typlen, bool typbyval)
#include "outfuncs.funcs.c"
-
/*
* Support functions for nodes with custom_read_write attribute or
* special_read_write attribute
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 1421686938..d57478bde9 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -158,6 +158,11 @@
token = pg_strtok(&length); /* skip :fldname */ \
local_node->fldname = readIntCols(len)
+/* Read an Index array */
+#define READ_INDEX_ARRAY(fldname, len) \
+ token = pg_strtok(&length); /* skip :fldname */ \
+ local_node->fldname = readIndexCols(len)
+
/* Read a bool array */
#define READ_BOOL_ARRAY(fldname, len) \
token = pg_strtok(&length); /* skip :fldname */ \
@@ -623,6 +628,30 @@ readIntCols(int numCols)
return int_vals;
}
+/*
+ * readIndexCols
+ */
+Index *
+readIndexCols(int numCols)
+{
+ int tokenLength,
+ i;
+ const char *token;
+ Index *index_vals;
+
+ if (numCols <= 0)
+ return NULL;
+
+ index_vals = (Index *) palloc(numCols * sizeof(Index));
+ for (i = 0; i < numCols; i++)
+ {
+ token = pg_strtok(&tokenLength);
+ index_vals[i] = atoui(token);
+ }
+
+ return index_vals;
+}
+
/*
* readBoolCols
*/
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index b11249ed8f..7141035cc4 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -519,7 +519,9 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
result->parallelModeNeeded = glob->parallelModeNeeded;
result->planTree = top_plan;
result->partPruneInfos = glob->partPruneInfos;
+ result->containsInitialPruning = glob->containsInitialPruning;
result->rtable = glob->finalrtable;
+ result->minLockRelids = glob->minLockRelids;
result->resultRelations = glob->resultRelations;
result->appendRelations = glob->appendRelations;
result->subplans = glob->subplans;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index b8d5610593..da749e331e 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -270,6 +270,16 @@ set_plan_references(PlannerInfo *root, Plan *plan)
*/
add_rtes_to_flat_rtable(root, false);
+ /*
+ * Add the query's adjusted range of RT indexes to glob->minLockRelids.
+ * The adjusted RT indexes of prunable relations will be deleted from the
+ * set below where PartitionPruneInfos are processed.
+ */
+ glob->minLockRelids =
+ bms_add_range(glob->minLockRelids,
+ rtoffset + 1,
+ rtoffset + list_length(root->parse->rtable));
+
/*
* Adjust RT indexes of PlanRowMarks and add to final rowmarks list
*/
@@ -352,6 +362,7 @@ set_plan_references(PlannerInfo *root, Plan *plan)
foreach (lc, root->partPruneInfos)
{
PartitionPruneInfo *pruneinfo = lfirst(lc);
+ Bitmapset *leafpart_rtis = NULL;
ListCell *l;
foreach(l, pruneinfo->prune_infos)
@@ -362,15 +373,50 @@ set_plan_references(PlannerInfo *root, Plan *plan)
foreach(l2, prune_infos)
{
PartitionedRelPruneInfo *pinfo = lfirst(l2);
+ int i;
/* RT index of the table to which the pinfo belongs. */
pinfo->rtindex += rtoffset;
+
+ /* Also of the leaf partitions that might be scanned. */
+ for (i = 0; i < pinfo->nparts; i++)
+ {
+ if (pinfo->rti_map[i] > 0 && pinfo->subplan_map[i] >= 0)
+ {
+ pinfo->rti_map[i] += rtoffset;
+ leafpart_rtis = bms_add_member(leafpart_rtis,
+ pinfo->rti_map[i]);
+ }
+ }
}
}
+ if (pruneinfo->needs_init_pruning)
+ {
+ glob->containsInitialPruning = true;
+
+ /*
+ * Delete the leaf partition RTIs from the global set of relations
+ * to be locked before executing the plan. AcquireExecutorLocks()
+ * will find the ones to add to the set after performing initial
+ * pruning.
+ */
+ glob->minLockRelids = bms_del_members(glob->minLockRelids,
+ leafpart_rtis);
+ }
+
glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
}
+ /*
+ * It seems worth doing a bms_copy() on glob->minLockRelids if we deleted
+ * bits from it above to get rid of any empty tail bits. It seems better
+ * for the loop over this set in AcquireExecutorLocks() to not have to go
+ * through those useless bit words.
+ */
+ if (glob->containsInitialPruning)
+ glob->minLockRelids = bms_copy(glob->minLockRelids);
+
return result;
}
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index d77f7d3aef..952c5b8327 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -144,7 +144,9 @@ static List *make_partitionedrel_pruneinfo(PlannerInfo *root,
List *prunequal,
Bitmapset *partrelids,
int *relid_subplan_map,
- Bitmapset **matchedsubplans);
+ Bitmapset **matchedsubplans,
+ bool *needs_init_pruning,
+ bool *needs_exec_pruning);
static void gen_partprune_steps(RelOptInfo *rel, List *clauses,
PartClauseTarget target,
GeneratePruningStepsContext *context);
@@ -234,6 +236,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int *relid_subplan_map;
ListCell *lc;
int i;
+ bool needs_init_pruning = false;
+ bool needs_exec_pruning = false;
/*
* Scan the subpaths to see which ones are scans of partition child
@@ -313,12 +317,16 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
Bitmapset *partrelids = (Bitmapset *) lfirst(lc);
List *pinfolist;
Bitmapset *matchedsubplans = NULL;
+ bool partrel_needs_init_pruning;
+ bool partrel_needs_exec_pruning;
pinfolist = make_partitionedrel_pruneinfo(root, parentrel,
prunequal,
partrelids,
relid_subplan_map,
- &matchedsubplans);
+ &matchedsubplans,
+ &partrel_needs_init_pruning,
+ &partrel_needs_exec_pruning);
/* When pruning is possible, record the matched subplans */
if (pinfolist != NIL)
@@ -327,6 +335,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
allmatchedsubplans = bms_join(matchedsubplans,
allmatchedsubplans);
}
+
+ needs_init_pruning |= partrel_needs_init_pruning;
+ needs_exec_pruning |= partrel_needs_exec_pruning;
}
pfree(relid_subplan_map);
@@ -341,6 +352,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
/* Else build the result data structure */
pruneinfo = makeNode(PartitionPruneInfo);
pruneinfo->prune_infos = prunerelinfos;
+ pruneinfo->needs_init_pruning = needs_init_pruning;
+ pruneinfo->needs_exec_pruning = needs_exec_pruning;
/*
* Some subplans may not belong to any of the identified partitioned rels.
@@ -441,13 +454,18 @@ add_part_relids(List *allpartrelids, Bitmapset *partrelids)
* If we cannot find any useful run-time pruning steps, return NIL.
* However, on success, each rel identified in partrelids will have
* an element in the result list, even if some of them are useless.
+ * *needs_init_pruning and *needs_exec_pruning are set to indicate that the
+ * returned PartitionedRelPruneInfos contain pruning steps that can be
+ * performed before and after execution begins, respectively.
*/
static List *
make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
List *prunequal,
Bitmapset *partrelids,
int *relid_subplan_map,
- Bitmapset **matchedsubplans)
+ Bitmapset **matchedsubplans,
+ bool *needs_init_pruning,
+ bool *needs_exec_pruning)
{
RelOptInfo *targetpart = NULL;
List *pinfolist = NIL;
@@ -458,6 +476,10 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int rti;
int i;
+ /* Will find out below. */
+ *needs_init_pruning = false;
+ *needs_exec_pruning = false;
+
/*
* Examine each partitioned rel, constructing a temporary array to map
* from planner relids to index of the partitioned rel, and building a
@@ -545,6 +567,9 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
* executor per-scan pruning steps. This first pass creates startup
* pruning steps and detects whether there's any possibly-useful quals
* that would require per-scan pruning.
+ *
+ * The first pass also determines whether the second pass is
+ * necessary by noting the presence of EXEC parameters.
*/
gen_partprune_steps(subpart, partprunequal, PARTTARGET_INITIAL,
&context);
@@ -619,6 +644,12 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
pinfo->execparamids = execparamids;
/* Remaining fields will be filled in the next loop */
+ /* record which types of pruning steps we've seen so far */
+ if (initial_pruning_steps != NIL)
+ *needs_init_pruning = true;
+ if (exec_pruning_steps != NIL)
+ *needs_exec_pruning = true;
+
pinfolist = lappend(pinfolist, pinfo);
}
@@ -646,6 +677,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int *subplan_map;
int *subpart_map;
Oid *relid_map;
+ Index *rti_map;
/*
* Construct the subplan and subpart maps for this partitioning level.
@@ -658,6 +690,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
subpart_map = (int *) palloc(nparts * sizeof(int));
memset(subpart_map, -1, nparts * sizeof(int));
relid_map = (Oid *) palloc0(nparts * sizeof(Oid));
+ rti_map = (Index *) palloc0(nparts * sizeof(Index));
present_parts = NULL;
i = -1;
@@ -672,6 +705,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
subplan_map[i] = subplanidx = relid_subplan_map[partrel->relid] - 1;
subpart_map[i] = subpartidx = relid_subpart_map[partrel->relid] - 1;
relid_map[i] = planner_rt_fetch(partrel->relid, root)->relid;
+ rti_map[i] = partrel->relid;
if (subplanidx >= 0)
{
present_parts = bms_add_member(present_parts, i);
@@ -696,6 +730,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
pinfo->subplan_map = subplan_map;
pinfo->subpart_map = subpart_map;
pinfo->relid_map = relid_map;
+ pinfo->rti_map = rti_map;
}
pfree(relid_subpart_map);
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 6f18b68856..16bda42f11 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1596,6 +1596,7 @@ exec_bind_message(StringInfo input_message)
int16 *rformats = NULL;
CachedPlanSource *psrc;
CachedPlan *cplan;
+ List *part_prune_result_list;
Portal portal;
char *query_string;
char *saved_stmt_name;
@@ -1971,7 +1972,9 @@ exec_bind_message(StringInfo input_message)
* will be generated in MessageContext. The plan refcount will be
* assigned to the Portal, so it will be released at portal destruction.
*/
- cplan = GetCachedPlan(psrc, params, NULL, NULL);
+ cplan = GetCachedPlan(psrc, params, NULL, NULL, &part_prune_result_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_result_list));
/*
* Now we can define the portal.
@@ -1986,6 +1989,9 @@ exec_bind_message(StringInfo input_message)
cplan->stmt_list,
cplan);
+ /* Copy PartitionPruneResults into the portal's context. */
+ PortalStorePartitionPruneResults(portal, part_prune_result_list);
+
/* Done with the snapshot used for parameter I/O and parsing/planning */
if (snapshot_set)
PopActiveSnapshot();
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 5aa5a350f3..8cc2e2162d 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -35,7 +35,7 @@
Portal ActivePortal = NULL;
-static void ProcessQuery(PlannedStmt *plan,
+static void ProcessQuery(PlannedStmt *plan, PartitionPruneResult *part_prune_result,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -65,6 +65,7 @@ static void DoPortalRewind(Portal portal);
*/
QueryDesc *
CreateQueryDesc(PlannedStmt *plannedstmt,
+ PartitionPruneResult *part_prune_result,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
@@ -77,6 +78,8 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
qd->operation = plannedstmt->commandType; /* operation */
qd->plannedstmt = plannedstmt; /* plan */
+ qd->part_prune_result = part_prune_result; /* ExecutorDoInitialPruning()
+ * output for plan */
qd->sourceText = sourceText; /* query text */
qd->snapshot = RegisterSnapshot(snapshot); /* snapshot */
/* RI check snapshot */
@@ -122,6 +125,7 @@ FreeQueryDesc(QueryDesc *qdesc)
* PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
*
* plan: the plan tree for the query
+ * part_prune_result: ExecutorDoInitialPruning() output for the plan tree
* sourceText: the source text of the query
* params: any parameters needed
* dest: where to send results
@@ -134,6 +138,7 @@ FreeQueryDesc(QueryDesc *qdesc)
*/
static void
ProcessQuery(PlannedStmt *plan,
+ PartitionPruneResult *part_prune_result,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -145,7 +150,7 @@ ProcessQuery(PlannedStmt *plan,
/*
* Create the QueryDesc object
*/
- queryDesc = CreateQueryDesc(plan, sourceText,
+ queryDesc = CreateQueryDesc(plan, part_prune_result, sourceText,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
@@ -491,8 +496,13 @@ PortalStart(Portal portal, ParamListInfo params,
/*
* Create QueryDesc in portal's context; for the moment, set
* the destination to DestNone.
+ *
+ * There is no PartitionPruneResult unless the PlannedStmt is
+ * from a CachedPlan.
*/
queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+ portal->part_prune_results == NIL ? NULL :
+ linitial(portal->part_prune_results),
portal->sourceText,
GetActiveSnapshot(),
InvalidSnapshot,
@@ -1225,6 +1235,8 @@ PortalRunMulti(Portal portal,
if (pstmt->utilityStmt == NULL)
{
+ PartitionPruneResult *part_prune_result = NULL;
+
/*
* process a plannable query.
*/
@@ -1271,10 +1283,18 @@ PortalRunMulti(Portal portal,
else
UpdateActiveSnapshotCommandId();
+ /*
+ * Determine if there's a corresponding PartitionPruneResult for
+ * this PlannedStmt.
+ */
+ if (portal->part_prune_results != NIL)
+ part_prune_result = list_nth(portal->part_prune_results,
+ foreach_current_index(stmtlist_item));
+
if (pstmt->canSetTag)
{
/* statement can set tag string */
- ProcessQuery(pstmt,
+ ProcessQuery(pstmt, part_prune_result,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
@@ -1283,7 +1303,7 @@ PortalRunMulti(Portal portal,
else
{
/* stmt added by rewrite cannot set tag */
- ProcessQuery(pstmt,
+ ProcessQuery(pstmt, part_prune_result,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 0d6a295674..d1c9605979 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -99,14 +99,19 @@ static dlist_head cached_expression_list = DLIST_STATIC_INIT(cached_expression_l
static void ReleaseGenericPlan(CachedPlanSource *plansource);
static List *RevalidateCachedQuery(CachedPlanSource *plansource,
QueryEnvironment *queryEnv);
-static bool CheckCachedPlan(CachedPlanSource *plansource);
+static bool CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+ List **part_prune_result_list);
static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
- ParamListInfo boundParams, QueryEnvironment *queryEnv);
+ ParamListInfo boundParams, QueryEnvironment *queryEnv,
+ List **part_prune_result_list);
static bool choose_custom_plan(CachedPlanSource *plansource,
ParamListInfo boundParams);
static double cached_plan_cost(CachedPlan *plan, bool include_planner);
static Query *QueryListGetPrimaryStmt(List *stmts);
-static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static void AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+ List **part_prune_result_list,
+ List **lockedRelids_per_stmt);
+static void ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt);
static void AcquirePlannerLocks(List *stmt_list, bool acquire);
static void ScanQueryForLocks(Query *parsetree, bool acquire);
static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -790,15 +795,20 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
*
* On a "true" return, we have acquired the locks needed to run the plan.
* (We must do this for the "true" result to be race-condition-free.)
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_result_list.
*/
static bool
-CheckCachedPlan(CachedPlanSource *plansource)
+CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+ List **part_prune_result_list)
{
CachedPlan *plan = plansource->gplan;
/* Assert that caller checked the querytree */
Assert(plansource->is_valid);
+ *part_prune_result_list = NIL;
+
/* If there's no generic plan, just say "false" */
if (!plan)
return false;
@@ -820,13 +830,21 @@ CheckCachedPlan(CachedPlanSource *plansource)
*/
if (plan->is_valid)
{
+ List *lockedRelids_per_stmt;
+
/*
* Plan must have positive refcount because it is referenced by
* plansource; so no need to fear it disappears under us here.
*/
Assert(plan->refcount > 0);
- AcquireExecutorLocks(plan->stmt_list, true);
+ /*
+ * Lock relations scanned by the plan. This is also where initial
+ * partition pruning is performed, if the plan needs it.
+ */
+ AcquireExecutorLocks(plan->stmt_list, boundParams,
+ part_prune_result_list,
+ &lockedRelids_per_stmt);
/*
* If plan was transient, check to see if TransactionXmin has
@@ -848,7 +866,14 @@ CheckCachedPlan(CachedPlanSource *plansource)
}
/* Oops, the race case happened. Release useless locks. */
- AcquireExecutorLocks(plan->stmt_list, false);
+ ReleaseExecutorLocks(plan->stmt_list, lockedRelids_per_stmt);
+
+ /*
+ * The output list and any objects therein were allocated in the
+ * caller's (hopefully short-lived) context, so they will not stay
+ * leaked for long; reset the pointer so nobody looks at them by accident.
+ */
+ *part_prune_result_list = NIL;
}
/*
@@ -874,10 +899,15 @@ CheckCachedPlan(CachedPlanSource *plansource)
* Planning work is done in the caller's memory context. The finished plan
* is in a child memory context, which typically should get reparented
* (unless this is a one-shot plan, in which case we don't copy the plan).
+ *
+ * A list of NULLs is returned in *part_prune_result_list, meaning that no
+ * PartitionPruneResult nodes have yet been created for the plans in
+ * stmt_list.
*/
static CachedPlan *
BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
- ParamListInfo boundParams, QueryEnvironment *queryEnv)
+ ParamListInfo boundParams, QueryEnvironment *queryEnv,
+ List **part_prune_result_list)
{
CachedPlan *plan;
List *plist;
@@ -1007,6 +1037,17 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
MemoryContextSwitchTo(oldcxt);
+ /*
+ * There are no PartitionPruneResults to add yet, but the list must still
+ * be initialized with the same number of elements as the list of
+ * PlannedStmts.
+ */
+ *part_prune_result_list = NIL;
+ foreach(lc, plist)
+ {
+ *part_prune_result_list = lappend(*part_prune_result_list, NULL);
+ }
+
return plan;
}
@@ -1126,6 +1167,17 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
* plan or a custom plan for the given parameters: the caller does not know
* which it will get.
*
+ * For every PlannedStmt in the returned CachedPlan, one element is added to
+ * *part_prune_result_list (if the caller asked for it): either a
+ * PartitionPruneResult or NULL. It is a PartitionPruneResult if the
+ * PlannedStmt comes from an existing, otherwise valid CachedPlan and contains
+ * at least one PartitionPruneInfo with "initial" pruning steps. Those steps
+ * are performed by calling ExecutorDoInitialPruning(), which prunes away
+ * subplans that don't match the pruning conditions so that
+ * AcquireExecutorLocks() locks only the surviving leaf partitions. The
+ * PartitionPruneResult contains a list of bitmapsets of the indexes of
+ * matching subplans, one for each PartitionPruneInfo.
+ *
* On return, the plan is valid and we have sufficient locks to begin
* execution.
*
@@ -1139,11 +1191,13 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
*/
CachedPlan *
GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
- ResourceOwner owner, QueryEnvironment *queryEnv)
+ ResourceOwner owner, QueryEnvironment *queryEnv,
+ List **part_prune_result_list)
{
CachedPlan *plan = NULL;
List *qlist;
bool customplan;
+ List *my_part_prune_result_list;
/* Assert caller is doing things in a sane order */
Assert(plansource->magic == CACHEDPLANSOURCE_MAGIC);
@@ -1160,7 +1214,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
if (!customplan)
{
- if (CheckCachedPlan(plansource))
+ if (CheckCachedPlan(plansource, boundParams,
+ &my_part_prune_result_list))
{
/* We want a generic plan, and we already have a valid one */
plan = plansource->gplan;
@@ -1169,7 +1224,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
else
{
/* Build a new generic plan */
- plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv);
+ plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv,
+ &my_part_prune_result_list);
/* Just make real sure plansource->gplan is clear */
ReleaseGenericPlan(plansource);
/* Link the new generic plan into the plansource */
@@ -1214,7 +1270,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
if (customplan)
{
/* Build a custom plan */
- plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv);
+ plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv,
+ &my_part_prune_result_list);
/* Accumulate total costs of custom plans */
plansource->total_custom_cost += cached_plan_cost(plan, true);
@@ -1246,6 +1303,9 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
plan->is_saved = true;
}
+ if (part_prune_result_list)
+ *part_prune_result_list = my_part_prune_result_list;
+
return plan;
}
@@ -1737,17 +1797,29 @@ QueryListGetPrimaryStmt(List *stmts)
/*
* AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
- * or release them if acquire is false.
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_result_list.
+ *
+ * On return, *lockedRelids_per_stmt will contain a bitmapset for every
+ * PlannedStmt in stmt_list, containing the RT indexes of relation entries
+ * in its range table that were actually locked, or NULL if the PlannedStmt
+ * contains a utility statement.
*/
static void
-AcquireExecutorLocks(List *stmt_list, bool acquire)
+AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+ List **part_prune_result_list,
+ List **lockedRelids_per_stmt)
{
ListCell *lc1;
+ *part_prune_result_list = *lockedRelids_per_stmt = NIL;
foreach(lc1, stmt_list)
{
PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
- ListCell *lc2;
+ PartitionPruneResult *part_prune_result = NULL;
+ Bitmapset *allLockRelids;
+ Bitmapset *lockedRelids = NULL;
+ int rti;
if (plannedstmt->commandType == CMD_UTILITY)
{
@@ -1761,13 +1833,38 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
Query *query = UtilityContainsQuery(plannedstmt->utilityStmt);
if (query)
- ScanQueryForLocks(query, acquire);
+ ScanQueryForLocks(query, true);
+ *part_prune_result_list = lappend(*part_prune_result_list, NULL);
+ *lockedRelids_per_stmt = lappend(*lockedRelids_per_stmt, NULL);
continue;
}
- foreach(lc2, plannedstmt->rtable)
+ /*
+ * Figure out the set of relations that would need to be locked
+ * before executing the plan.
+ */
+ if (plannedstmt->containsInitialPruning)
{
- RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
+ /*
+ * Obtain the set of leaf partitions to be locked.
+ *
+ * The following does initial partition pruning using the
+ * PartitionPruneInfos found in plannedstmt->partPruneInfos and
+ * finds leaf partitions that survive that pruning across all the
+ * nodes in the plan tree.
+ */
+ part_prune_result = ExecutorDoInitialPruning(plannedstmt,
+ boundParams);
+
+ allLockRelids = bms_union(plannedstmt->minLockRelids,
+ part_prune_result->scan_leafpart_rtis);
+ }
+ else
+ allLockRelids = plannedstmt->minLockRelids;
+
+ rti = -1;
+ while ((rti = bms_next_member(allLockRelids, rti)) > 0)
+ {
+ RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
if (rte->rtekind != RTE_RELATION)
continue;
@@ -1778,10 +1875,58 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
* fail if it's been dropped entirely --- we'll just transiently
* acquire a non-conflicting lock.
*/
- if (acquire)
- LockRelationOid(rte->relid, rte->rellockmode);
- else
- UnlockRelationOid(rte->relid, rte->rellockmode);
+ LockRelationOid(rte->relid, rte->rellockmode);
+ lockedRelids = bms_add_member(lockedRelids, rti);
+ }
+
+ *part_prune_result_list = lappend(*part_prune_result_list,
+ part_prune_result);
+ *lockedRelids_per_stmt = lappend(*lockedRelids_per_stmt, lockedRelids);
+ }
+}
+
+/*
+ * ReleaseExecutorLocks
+ * Release the locks acquired by an earlier call to
+ * AcquireExecutorLocks()
+ */
+static void
+ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt)
+{
+ ListCell *lc1,
+ *lc2;
+
+ forboth(lc1, stmt_list, lc2, lockedRelids_per_stmt)
+ {
+ PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
+ Bitmapset *lockedRelids = lfirst(lc2);
+ int rti;
+
+ if (plannedstmt->commandType == CMD_UTILITY)
+ {
+ /*
+ * Ignore utility statements, except those (such as EXPLAIN) that
+ * contain a parsed-but-not-planned query. Note: it's okay to use
+ * ScanQueryForLocks, even though the query hasn't been through
+ * rule rewriting, because rewriting doesn't change the query
+ * representation.
+ */
+ Query *query = UtilityContainsQuery(plannedstmt->utilityStmt);
+
+ if (query)
+ ScanQueryForLocks(query, false);
+ continue;
+ }
+
+ rti = -1;
+ while ((rti = bms_next_member(lockedRelids, rti)) >= 0)
+ {
+ RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
+
+ Assert(rte->rtekind == RTE_RELATION);
+
+ /* See the comment in AcquireExecutorLocks(). */
+ UnlockRelationOid(rte->relid, rte->rellockmode);
}
}
}
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index d549f66d4a..1bbe6b704b 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -303,6 +303,25 @@ PortalDefineQuery(Portal portal,
portal->status = PORTAL_DEFINED;
}
+/*
+ * PortalStorePartitionPruneResults
+ * Copy the given list of PartitionPruneResults into the portal's
+ * context
+ *
+ * This allows the caller to ensure that the list exists as long as the portal
+ * does.
+ */
+void
+PortalStorePartitionPruneResults(Portal portal, List *part_prune_results)
+{
+ MemoryContext oldcxt;
+
+ AssertArg(PortalIsValid(portal));
+ oldcxt = MemoryContextSwitchTo(portal->portalContext);
+ portal->part_prune_results = copyObject(part_prune_results);
+ MemoryContextSwitchTo(oldcxt);
+}
+
/*
* PortalReleaseCachedPlan
* Release a portal's reference to its cached plan, if any.
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 9ebde089ae..e57e133f0e 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -87,7 +87,9 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
ExplainState *es, const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv);
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt,
+ PartitionPruneResult *part_prune_result,
+ IntoClause *into,
ExplainState *es, const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv,
const instr_time *planduration,
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index bf962af7af..bd8776402e 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -45,6 +45,7 @@ extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
* nparts Length of subplan_map[] and subpart_map[].
* subplan_map Subplan index by partition index, or -1.
* subpart_map Subpart index by partition index, or -1.
+ * rti_map Range table index by partition index, or 0.
* present_parts A Bitmapset of the partition indexes that we
* have subplans or subparts for.
* initial_pruning_steps List of PartitionPruneSteps used to
@@ -61,6 +62,7 @@ typedef struct PartitionedRelPruningData
int nparts;
int *subplan_map;
int *subpart_map;
+ Index *rti_map;
Bitmapset *present_parts;
List *initial_pruning_steps;
List *exec_pruning_steps;
@@ -126,5 +128,10 @@ extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
int part_prune_index,
Bitmapset **initially_valid_subplans);
extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
- bool initial_prune);
+ bool initial_prune,
+ Bitmapset **scan_leafpart_rtis);
+extern Bitmapset *ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt,
+ ParamListInfo params,
+ PartitionPruneInfo *pruneinfo,
+ Bitmapset **scan_leafpart_rtis);
#endif /* EXECPARTITION_H */
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index e79e2c001f..60d5644908 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -35,6 +35,8 @@ typedef struct QueryDesc
/* These fields are provided by CreateQueryDesc */
CmdType operation; /* CMD_SELECT, CMD_UPDATE, etc. */
PlannedStmt *plannedstmt; /* planner's output (could be utility, too) */
+ PartitionPruneResult *part_prune_result; /* ExecutorDoInitialPruning()'s
+ * output for plannedstmt */
const char *sourceText; /* source text of the query */
Snapshot snapshot; /* snapshot to use for query */
Snapshot crosscheck_snapshot; /* crosscheck for RI update/delete */
@@ -57,6 +59,7 @@ typedef struct QueryDesc
/* in pquery.c */
extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
+ PartitionPruneResult *part_prune_result,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index d68a6b9d28..5c4a282be0 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -185,6 +185,8 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
/*
* prototypes from functions in execMain.c
*/
+extern PartitionPruneResult *ExecutorDoInitialPruning(PlannedStmt *plannedstmt,
+ ParamListInfo params);
extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
extern void ExecutorRun(QueryDesc *queryDesc,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 63a89474db..12ea06c2f6 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -1001,6 +1001,33 @@ typedef struct DomainConstraintState
*/
typedef TupleTableSlot *(*ExecProcNodeMtd) (struct PlanState *pstate);
+/*----------------
+ * PartitionPruneResult
+ *
+ * The result of invoking ExecutorDoInitialPruning() on a given
+ * PlannedStmt.
+ *
+ * Contains a list of Bitmapsets holding the indexes of the subplans remaining
+ * after initial pruning, obtained by calling ExecFindMatchingSubPlans() for
+ * every PartitionPruneInfo found in PlannedStmt.partPruneInfos. RT indexes
+ * of the leaf partitions scanned by those subplans across all
+ * PartitionPruneInfos are added into scan_leafpart_rtis.
+ *
+ * This is used by GetCachedPlan() to inform its callers of the pruning
+ * decisions made when performing AcquireExecutorLocks() on a given cached
+ * PlannedStmt, which the callers then pass on to the executor. The executor
+ * refers to this node when initializing the plan nodes which contain subplans
+ * that may have been pruned by ExecutorDoInitialPruning(), rather than
+ * redoing initial pruning.
+ */
+typedef struct PartitionPruneResult
+{
+ NodeTag type;
+
+ List *valid_subplan_offs_list;
+ Bitmapset *scan_leafpart_rtis;
+} PartitionPruneResult;
+
/* ----------------
* PlanState node
*
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index cdd6debfa0..b33d9e426d 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -212,6 +212,7 @@ extern struct Bitmapset *readBitmapset(void);
extern uintptr_t readDatum(bool typbyval);
extern bool *readBoolCols(int numCols);
extern int *readIntCols(int numCols);
+extern Index *readIndexCols(int numCols);
extern Oid *readOidCols(int numCols);
extern int16 *readAttrNumberCols(int numCols);
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index d87957ff6c..7957aeb6d7 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -125,6 +125,19 @@ typedef struct PlannerGlobal
/* List of PartitionPruneInfo contained in the plan */
List *partPruneInfos;
+ /*
+ * Do any of those PartitionPruneInfos have initial (pre-exec) pruning
+ * steps in them?
+ */
+ bool containsInitialPruning;
+
+ /*
+ * Indexes of all range table entries minus indexes of range table entries
+ * of the leaf partitions scanned by prunable subplans; see
+ * AcquireExecutorLocks()
+ */
+ Bitmapset *minLockRelids;
+
/* OIDs of relations the plan depends on */
List *relationOids;
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index f2daabb3b7..1d2c0d9bdf 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -72,8 +72,17 @@ typedef struct PlannedStmt
List *partPruneInfos; /* List of PartitionPruneInfo contained in
* the plan */
+ bool containsInitialPruning; /* Do any of those PartitionPruneInfos
+ * have initial (pre-exec) pruning
+ * steps in them? */
+
List *rtable; /* list of RangeTblEntry nodes */
+ Bitmapset *minLockRelids; /* Indexes of all range table entries minus
+ * indexes of range table entries of the leaf
+ * partitions scanned by prunable subplans;
+ * see AcquireExecutorLocks() */
+
/* rtable indexes of target relations for INSERT/UPDATE/DELETE */
List *resultRelations; /* integer list of RT indexes, or NIL */
@@ -1409,6 +1418,13 @@ typedef struct PlanRowMark
* prune_infos List of Lists containing PartitionedRelPruneInfo nodes,
* one sublist per run-time-prunable partition hierarchy
* appearing in the parent plan node's subplans.
+ *
+ * needs_init_pruning Does any of the PartitionedRelPruneInfos in
+ * prune_infos have its initial_pruning_steps set?
+ *
+ * needs_exec_pruning Does any of the PartitionedRelPruneInfos in
+ * prune_infos have its exec_pruning_steps set?
+ *
* other_subplans Indexes of any subplans that are not accounted for
* by any of the PartitionedRelPruneInfo nodes in
* "prune_infos". These subplans must not be pruned.
@@ -1419,6 +1435,8 @@ typedef struct PartitionPruneInfo
NodeTag type;
List *prune_infos;
+ bool needs_init_pruning;
+ bool needs_exec_pruning;
Bitmapset *other_subplans;
} PartitionPruneInfo;
@@ -1463,6 +1481,9 @@ typedef struct PartitionedRelPruneInfo
/* relation OID by partition index, or 0 */
Oid *relid_map pg_node_attr(array_size(nparts));
+ /* Range table index by partition index, or 0. */
+ Index *rti_map pg_node_attr(array_size(nparts));
+
/*
* initial_pruning_steps shows how to prune during executor startup (i.e.,
* without use of any PARAM_EXEC Params); it is NIL if no startup pruning
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 0499635f59..1c5bb5ece1 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -220,7 +220,8 @@ extern List *CachedPlanGetTargetList(CachedPlanSource *plansource,
extern CachedPlan *GetCachedPlan(CachedPlanSource *plansource,
ParamListInfo boundParams,
ResourceOwner owner,
- QueryEnvironment *queryEnv);
+ QueryEnvironment *queryEnv,
+ List **part_prune_result_list);
extern void ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner);
extern bool CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index aeddbdafe5..9f7727a837 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -138,6 +138,7 @@ typedef struct PortalData
QueryCompletion qc; /* command completion data for executed query */
List *stmts; /* list of PlannedStmts */
CachedPlan *cplan; /* CachedPlan, if stmts are from one */
+ List *part_prune_results; /* list of PartitionPruneResults */
ParamListInfo portalParams; /* params to pass to query */
QueryEnvironment *queryEnv; /* environment for query */
@@ -242,6 +243,8 @@ extern void PortalDefineQuery(Portal portal,
CommandTag commandTag,
List *stmts,
CachedPlan *cplan);
+extern void PortalStorePartitionPruneResults(Portal portal,
+ List *part_prune_result_list);
extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
extern void PortalCreateHoldStore(Portal portal);
extern void PortalHashTableDeleteAll(void);
--
2.35.3
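The lock-set computation that the AcquireExecutorLocks() change above performs can be sketched in isolation. This is only an illustration under stated assumptions, not code from the patch: DemoRelidSet is a hypothetical 64-bit stand-in for Bitmapset, and demo_compute_lock_set() and demo_next_member() are hypothetical helpers that imitate bms_union() and a bms_next_member()-style iteration (the demo iterator returns -1 when exhausted, which is a choice made for this sketch).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for Bitmapset: bit i set means that range table
 * index i is a member of the set.  Limited to RT indexes 0..63.
 */
typedef uint64_t DemoRelidSet;

/*
 * Decide which relations to lock for one statement.  Relations in
 * min_lock_relids are always locked; leaf partitions that survived
 * initial pruning are added only if the plan has initial pruning steps.
 */
static DemoRelidSet
demo_compute_lock_set(DemoRelidSet min_lock_relids,
                      bool contains_initial_pruning,
                      DemoRelidSet scan_leafpart_rtis)
{
    if (contains_initial_pruning)
        return min_lock_relids | scan_leafpart_rtis;
    return min_lock_relids;
}

/*
 * Return the smallest member greater than prev, or -1 if there is none.
 * Start the iteration with prev = -1.
 */
static int
demo_next_member(DemoRelidSet set, int prev)
{
    assert(prev >= -1);
    for (int i = prev + 1; i < 64; i++)
    {
        if (set & (UINT64_C(1) << i))
            return i;
    }
    return -1;
}
```

With a minimal set of RT indexes {1, 2} and a single surviving leaf partition at RT index 4, the sketch yields {1, 2, 4}; leaf partitions that were pruned away never enter the set and so would not be locked.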
[application/octet-stream] v18-0001-Move-PartitioPruneInfo-out-of-plan-nodes-into-Pl.patch (17.8K, 3-v18-0001-Move-PartitioPruneInfo-out-of-plan-nodes-into-Pl.patch)
From 571424d7f1d5cb8b3ee59853649d35731b033b03 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Fri, 27 May 2022 16:00:28 +0900
Subject: [PATCH v18 1/2] Move PartitionPruneInfo out of plan nodes into
PlannedStmt
The planner will now add a given PartitionPruneInfo to
PlannedStmt.partPruneInfos instead of directly to the
Append/MergeAppend plan node. What gets set instead in the
latter is an index field which points to the list element
of PlannedStmt.partPruneInfos containing the PartitionPruneInfo
belonging to the plan node.
A later commit will make AcquireExecutorLocks() do the initial
partition pruning to determine a minimal set of partitions to be
locked when validating a plan tree, and it will need to consult the
PartitionPruneInfos referenced therein to do so. It is better for
the PartitionPruneInfos to be accessible directly, by simply
iterating over PlannedStmt.partPruneInfos, than to require a walk
of the plan tree to find them.
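The indirection described above can be illustrated with a small standalone sketch. It is not taken from the patch: every Demo* type, the demo_* fixture data, and demo_lookup_prune_info() are hypothetical simplifications, and a plain array stands in for the List that PlannedStmt.partPruneInfos actually is. The only detail carried over is the convention that a plan node stores an index, with -1 meaning that no run-time pruning applies.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for PartitionPruneInfo. */
typedef struct DemoPruneInfo
{
    int         id;
} DemoPruneInfo;

/* Hypothetical statement: owns all pruning infos in one flat array. */
typedef struct DemoPlannedStmt
{
    const DemoPruneInfo *part_prune_infos;
    int         n_part_prune_infos;
} DemoPlannedStmt;

/* Hypothetical Append-like node: stores only an index, or -1 for none. */
typedef struct DemoAppend
{
    int         part_prune_index;
} DemoAppend;

/* Resolve a node's index against the statement-level array. */
static const DemoPruneInfo *
demo_lookup_prune_info(const DemoPlannedStmt *stmt, const DemoAppend *node)
{
    if (node->part_prune_index < 0)
        return NULL;            /* node has no run-time pruning */
    assert(node->part_prune_index < stmt->n_part_prune_infos);
    return &stmt->part_prune_infos[node->part_prune_index];
}

/* Small fixture: two pruning infos, one node that uses the second. */
static const DemoPruneInfo demo_infos[] = {{10}, {20}};
static const DemoPlannedStmt demo_stmt = {demo_infos, 2};
static const DemoAppend demo_pruned_append = {1};
static const DemoAppend demo_plain_append = {-1};
```

A consumer that needs every pruning info, as the lock-acquisition code in the later commit does, can loop over the statement-level array from 0 to n_part_prune_infos - 1 without visiting any plan node.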
---
src/backend/executor/execMain.c | 1 +
src/backend/executor/execParallel.c | 1 +
src/backend/executor/execPartition.c | 4 +-
src/backend/executor/execUtils.c | 2 +
src/backend/executor/nodeAppend.c | 4 +-
src/backend/executor/nodeMergeAppend.c | 4 +-
src/backend/nodes/outfuncs.c | 1 -
src/backend/optimizer/plan/createplan.c | 24 ++++-----
src/backend/optimizer/plan/planner.c | 1 +
src/backend/optimizer/plan/setrefs.c | 65 +++++++++++++------------
src/backend/partitioning/partprune.c | 18 ++++---
src/include/executor/execPartition.h | 3 +-
src/include/nodes/execnodes.h | 2 +
src/include/nodes/pathnodes.h | 6 +++
src/include/nodes/plannodes.h | 11 +++--
src/include/partitioning/partprune.h | 8 +--
16 files changed, 92 insertions(+), 63 deletions(-)
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index ef2fd46092..72fc273524 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -825,6 +825,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
ExecInitRangeTable(estate, rangeTable);
estate->es_plannedstmt = plannedstmt;
+ estate->es_part_prune_infos = plannedstmt->partPruneInfos;
/*
* Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index f1fd7f7e8b..f73b8c2607 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -183,6 +183,7 @@ ExecSerializePlan(Plan *plan, EState *estate)
pstmt->dependsOnRole = false;
pstmt->parallelModeNeeded = false;
pstmt->planTree = plan;
+ pstmt->partPruneInfos = estate->es_part_prune_infos;
pstmt->rtable = estate->es_range_table;
pstmt->resultRelations = NIL;
pstmt->appendRelations = NIL;
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index e03ea27299..b55cdd2580 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -1638,11 +1638,13 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
PartitionPruneState *
ExecInitPartitionPruning(PlanState *planstate,
int n_total_subplans,
- PartitionPruneInfo *pruneinfo,
+ int part_prune_index,
Bitmapset **initially_valid_subplans)
{
PartitionPruneState *prunestate;
EState *estate = planstate->state;
+ PartitionPruneInfo *pruneinfo = list_nth(estate->es_part_prune_infos,
+ part_prune_index);
/* We may need an expression context to evaluate partition exprs */
ExecAssignExprContext(estate, planstate);
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 9df1f81ea8..f9c7976ff2 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -119,6 +119,8 @@ CreateExecutorState(void)
estate->es_relations = NULL;
estate->es_rowmarks = NULL;
estate->es_plannedstmt = NULL;
+ estate->es_part_prune_infos = NIL;
+ estate->es_part_prune_result = NULL;
estate->es_junkFilter = NULL;
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 357e10a1d7..c6f86a6510 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -134,7 +134,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
appendstate->as_begun = false;
/* If run-time partition pruning is enabled, then set that up now */
- if (node->part_prune_info != NULL)
+ if (node->part_prune_index >= 0)
{
PartitionPruneState *prunestate;
@@ -145,7 +145,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
*/
prunestate = ExecInitPartitionPruning(&appendstate->ps,
list_length(node->appendplans),
- node->part_prune_info,
+ node->part_prune_index,
&validsubplans);
appendstate->as_prune_state = prunestate;
nplans = bms_num_members(validsubplans);
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index c5c62fa5c7..8d35860c30 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -82,7 +82,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
mergestate->ps.ExecProcNode = ExecMergeAppend;
/* If run-time partition pruning is enabled, then set that up now */
- if (node->part_prune_info != NULL)
+ if (node->part_prune_index >= 0)
{
PartitionPruneState *prunestate;
@@ -93,7 +93,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
*/
prunestate = ExecInitPartitionPruning(&mergestate->ps,
list_length(node->mergeplans),
- node->part_prune_info,
+ node->part_prune_index,
&validsubplans);
mergestate->ms_prune_state = prunestate;
nplans = bms_num_members(validsubplans);
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 4d776e7b51..81f6a9093c 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -299,7 +299,6 @@ outDatum(StringInfo str, Datum value, int typlen, bool typbyval)
* Support functions for nodes with custom_read_write attribute or
* special_read_write attribute
*/
-
static void
_outConst(StringInfo str, const Const *node)
{
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index 76606faa3e..58a05cf673 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -1203,7 +1203,6 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
ListCell *subpaths;
int nasyncplans = 0;
RelOptInfo *rel = best_path->path.parent;
- PartitionPruneInfo *partpruneinfo = NULL;
int nodenumsortkeys = 0;
AttrNumber *nodeSortColIdx = NULL;
Oid *nodeSortOperators = NULL;
@@ -1354,6 +1353,9 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
subplans = lappend(subplans, subplan);
}
+ /* Set below if we find quals that we can use to run-time prune */
+ plan->part_prune_index = -1;
+
/*
* If any quals exist, they may be useful to perform further partition
* pruning during execution. Gather information needed by the executor to
@@ -1377,16 +1379,14 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
}
if (prunequal != NIL)
- partpruneinfo =
- make_partition_pruneinfo(root, rel,
- best_path->subpaths,
- prunequal);
+ plan->part_prune_index = make_partition_pruneinfo(root, rel,
+ best_path->subpaths,
+ prunequal);
}
plan->appendplans = subplans;
plan->nasyncplans = nasyncplans;
plan->first_partial_plan = best_path->first_partial_path;
- plan->part_prune_info = partpruneinfo;
copy_generic_path_info(&plan->plan, (Path *) best_path);
@@ -1426,7 +1426,6 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
List *subplans = NIL;
ListCell *subpaths;
RelOptInfo *rel = best_path->path.parent;
- PartitionPruneInfo *partpruneinfo = NULL;
/*
* We don't have the actual creation of the MergeAppend node split out
@@ -1519,6 +1518,9 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
subplans = lappend(subplans, subplan);
}
+ /* Set below if we find quals that we can use to run-time prune */
+ node->part_prune_index = -1;
+
/*
* If any quals exist, they may be useful to perform further partition
* pruning during execution. Gather information needed by the executor to
@@ -1542,13 +1544,13 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
}
if (prunequal != NIL)
- partpruneinfo = make_partition_pruneinfo(root, rel,
- best_path->subpaths,
- prunequal);
+ node->part_prune_index = make_partition_pruneinfo(root, rel,
+ best_path->subpaths,
+ prunequal);
}
node->mergeplans = subplans;
- node->part_prune_info = partpruneinfo;
+
/*
* If prepare_sort_from_pathkeys added sort columns, but we were told to
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 06ad856eac..b11249ed8f 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -518,6 +518,7 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
result->dependsOnRole = glob->dependsOnRole;
result->parallelModeNeeded = glob->parallelModeNeeded;
result->planTree = top_plan;
+ result->partPruneInfos = glob->partPruneInfos;
result->rtable = glob->finalrtable;
result->resultRelations = glob->resultRelations;
result->appendRelations = glob->appendRelations;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 9cef92cab2..b8d5610593 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -348,6 +348,29 @@ set_plan_references(PlannerInfo *root, Plan *plan)
}
}
+ /* Also fix up the information in PartitionPruneInfos. */
+ foreach (lc, root->partPruneInfos)
+ {
+ PartitionPruneInfo *pruneinfo = lfirst(lc);
+ ListCell *l;
+
+ foreach(l, pruneinfo->prune_infos)
+ {
+ List *prune_infos = lfirst(l);
+ ListCell *l2;
+
+ foreach(l2, prune_infos)
+ {
+ PartitionedRelPruneInfo *pinfo = lfirst(l2);
+
+ /* RT index of the table to which the pinfo belongs. */
+ pinfo->rtindex += rtoffset;
+ }
+ }
+
+ glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
+ }
+
return result;
}
@@ -1655,21 +1678,12 @@ set_append_references(PlannerInfo *root,
aplan->apprelids = offset_relid_set(aplan->apprelids, rtoffset);
- if (aplan->part_prune_info)
- {
- foreach(l, aplan->part_prune_info->prune_infos)
- {
- List *prune_infos = lfirst(l);
- ListCell *l2;
-
- foreach(l2, prune_infos)
- {
- PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
- pinfo->rtindex += rtoffset;
- }
- }
- }
+ /*
+ * PartitionPruneInfos will be added to a list in PlannerGlobal, so update
+ * the index.
+ */
+ if (aplan->part_prune_index >= 0)
+ aplan->part_prune_index += list_length(root->glob->partPruneInfos);
/* We don't need to recurse to lefttree or righttree ... */
Assert(aplan->plan.lefttree == NULL);
@@ -1727,21 +1741,12 @@ set_mergeappend_references(PlannerInfo *root,
mplan->apprelids = offset_relid_set(mplan->apprelids, rtoffset);
- if (mplan->part_prune_info)
- {
- foreach(l, mplan->part_prune_info->prune_infos)
- {
- List *prune_infos = lfirst(l);
- ListCell *l2;
-
- foreach(l2, prune_infos)
- {
- PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
- pinfo->rtindex += rtoffset;
- }
- }
- }
+ /*
+ * PartitionPruneInfos will be added to a list in PlannerGlobal, so update
+ * the index.
+ */
+ if (mplan->part_prune_index >= 0)
+ mplan->part_prune_index += list_length(root->glob->partPruneInfos);
/* We don't need to recurse to lefttree or righttree ... */
Assert(mplan->plan.lefttree == NULL);
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 9d3c05aed3..d77f7d3aef 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -209,16 +209,20 @@ static void partkey_datum_from_expr(PartitionPruneContext *context,
/*
* make_partition_pruneinfo
- * Builds a PartitionPruneInfo which can be used in the executor to allow
- * additional partition pruning to take place. Returns NULL when
- * partition pruning would be useless.
+ * Checks if the given set of quals can be used to build pruning steps
+ * that the executor can use to prune away unneeded partitions. If
+ * suitable quals are found then a PartitionPruneInfo is built and tagged
+ * onto the PlannerInfo's partPruneInfos list.
+ *
+ * The return value is the 0-based index of the item added to the
+ * partPruneInfos list or -1 if nothing was added.
*
* 'parentrel' is the RelOptInfo for an appendrel, and 'subpaths' is the list
* of scan paths for its child rels.
* 'prunequal' is a list of potential pruning quals (i.e., restriction
* clauses that are applicable to the appendrel).
*/
-PartitionPruneInfo *
+int
make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
List *subpaths,
List *prunequal)
@@ -332,7 +336,7 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
* quals, then we can just not bother with run-time pruning.
*/
if (prunerelinfos == NIL)
- return NULL;
+ return -1;
/* Else build the result data structure */
pruneinfo = makeNode(PartitionPruneInfo);
@@ -358,7 +362,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
else
pruneinfo->other_subplans = NULL;
- return pruneinfo;
+ root->partPruneInfos = lappend(root->partPruneInfos, pruneinfo);
+
+ return list_length(root->partPruneInfos) - 1;
}
/*
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 708435e952..bf962af7af 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -123,9 +123,8 @@ typedef struct PartitionPruneState
extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
int n_total_subplans,
- PartitionPruneInfo *pruneinfo,
+ int part_prune_index,
Bitmapset **initially_valid_subplans);
extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
bool initial_prune);
-
#endif /* EXECPARTITION_H */
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 01b1727fc0..63a89474db 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -611,6 +611,8 @@ typedef struct EState
struct ExecRowMark **es_rowmarks; /* Array of per-range-table-entry
* ExecRowMarks, or NULL if none */
PlannedStmt *es_plannedstmt; /* link to top of plan tree */
+ List *es_part_prune_infos; /* PlannedStmt.partPruneInfos */
+ struct PartitionPruneResult *es_part_prune_result; /* QueryDesc.part_prune_result */
const char *es_sourceText; /* Source text from QueryDesc */
JunkFilter *es_junkFilter; /* top-level junk filter, if any */
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 44ffc73f15..d87957ff6c 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -122,6 +122,9 @@ typedef struct PlannerGlobal
/* "flat" list of AppendRelInfos */
List *appendRelations;
+ /* List of PartitionPruneInfo contained in the plan */
+ List *partPruneInfos;
+
/* OIDs of relations the plan depends on */
List *relationOids;
@@ -480,6 +483,9 @@ struct PlannerInfo
/* Does this query modify any partition key columns? */
bool partColsUpdated;
+
+ /* PartitionPruneInfos added in this query's plan. */
+ List *partPruneInfos;
};
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index dca2a21e7a..f2daabb3b7 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -69,6 +69,9 @@ typedef struct PlannedStmt
struct Plan *planTree; /* tree of Plan nodes */
+ List *partPruneInfos; /* List of PartitionPruneInfo contained in
+ * the plan */
+
List *rtable; /* list of RangeTblEntry nodes */
/* rtable indexes of target relations for INSERT/UPDATE/DELETE */
@@ -269,8 +272,8 @@ typedef struct Append
*/
int first_partial_plan;
- /* Info for run-time subplan pruning; NULL if we're not doing that */
- struct PartitionPruneInfo *part_prune_info;
+ /* Index to PlannerInfo.partPruneInfos or -1 if no run-time pruning */
+ int part_prune_index;
} Append;
/* ----------------
@@ -304,8 +307,8 @@ typedef struct MergeAppend
/* NULLS FIRST/LAST directions */
bool *nullsFirst pg_node_attr(array_size(numCols));
- /* Info for run-time subplan pruning; NULL if we're not doing that */
- struct PartitionPruneInfo *part_prune_info;
+ /* Index to PlannerInfo.partPruneInfos or -1 if no run-time pruning */
+ int part_prune_index;
} MergeAppend;
/* ----------------
diff --git a/src/include/partitioning/partprune.h b/src/include/partitioning/partprune.h
index 90684efa25..ebf0dcff8c 100644
--- a/src/include/partitioning/partprune.h
+++ b/src/include/partitioning/partprune.h
@@ -70,10 +70,10 @@ typedef struct PartitionPruneContext
#define PruneCxtStateIdx(partnatts, step_id, keyno) \
((partnatts) * (step_id) + (keyno))
-extern PartitionPruneInfo *make_partition_pruneinfo(struct PlannerInfo *root,
- struct RelOptInfo *parentrel,
- List *subpaths,
- List *prunequal);
+extern int make_partition_pruneinfo(struct PlannerInfo *root,
+ struct RelOptInfo *parentrel,
+ List *subpaths,
+ List *prunequal);
extern Bitmapset *prune_append_rel_partitions(struct RelOptInfo *rel);
extern Bitmapset *get_matching_partitions(PartitionPruneContext *context,
List *pruning_steps);
--
2.35.3
* Re: generic plans and "initial" pruning
@ 2022-07-13 07:03 Amit Langote <[email protected]>
parent: Amit Langote <[email protected]>
0 siblings, 1 reply; 71+ messages in thread
From: Amit Langote @ 2022-07-13 07:03 UTC (permalink / raw)
To: Jacob Champion <[email protected]>; +Cc: Zhihong Yu <[email protected]>; David Rowley <[email protected]>; Robert Haas <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers
On Wed, Jul 13, 2022 at 3:40 PM Amit Langote <[email protected]> wrote:
> Rebased over 964d01ae90c.
Sorry, left some pointless hunks in there while rebasing. Fixed in
the attached.
--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com
Attachments:
[application/octet-stream] v19-0001-Move-PartitioPruneInfo-out-of-plan-nodes-into-Pl.patch (17.3K, 2-v19-0001-Move-PartitioPruneInfo-out-of-plan-nodes-into-Pl.patch)
From 9fa5cd5f4256b7249ab6f560edca9d3609a126ef Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Fri, 27 May 2022 16:00:28 +0900
Subject: [PATCH v19 1/2] Move PartitionPruneInfo out of plan nodes into
PlannedStmt
The planner will now add a given PartitionPruneInfo to
PlannedStmt.partPruneInfos instead of directly to the
Append/MergeAppend plan node. What gets set in the plan node
instead is an index field that points to the list element
of PlannedStmt.partPruneInfos containing the PartitionPruneInfo
belonging to that plan node.
A later commit will make AcquireExecutorLocks() do the initial
partition pruning to determine a minimal set of partitions to be
locked when validating a plan tree, and it will need to consult the
PartitionPruneInfos referenced therein to do so. Making the
PartitionPruneInfos directly accessible avoids a walk of the plan
tree to find them; the caller can simply iterate over
PlannedStmt.partPruneInfos.
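The index-instead-of-pointer arrangement described above can be sketched
roughly as follows. This is only an illustration under simplifying
assumptions: Stmt, PruneInfo, AppendNode, stmt_add_prune_info() and
node_prune_info() are hypothetical stand-ins invented for the sketch, and a
fixed-size array takes the place of a PostgreSQL List. It is not the code
in the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not PostgreSQL's structs. */
typedef struct PruneInfo
{
    int rtindex;                /* placeholder payload */
} PruneInfo;

#define MAX_PRUNE_INFOS 8

typedef struct Stmt
{
    PruneInfo *prune_infos[MAX_PRUNE_INFOS];    /* statement-level list */
    int        n_prune_infos;
} Stmt;

typedef struct AppendNode
{
    int part_prune_index;       /* index into Stmt.prune_infos, or -1 */
} AppendNode;

/*
 * Add 'info' to the statement-level list and return its 0-based index,
 * or -1 if there is nothing to add (or no room left in this toy array).
 */
static int
stmt_add_prune_info(Stmt *stmt, PruneInfo *info)
{
    if (info == NULL || stmt->n_prune_infos >= MAX_PRUNE_INFOS)
        return -1;
    stmt->prune_infos[stmt->n_prune_infos] = info;
    return stmt->n_prune_infos++;
}

/* Resolve a node's index back to its PruneInfo; NULL means no pruning. */
static PruneInfo *
node_prune_info(const Stmt *stmt, const AppendNode *node)
{
    if (node->part_prune_index < 0)
        return NULL;
    assert(node->part_prune_index < stmt->n_prune_infos);
    return stmt->prune_infos[node->part_prune_index];
}
```

With this shape, code that needs every pruning descriptor can loop over
the statement-level array without visiting plan nodes, while each node
still finds its own entry through its stored index.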
---
src/backend/executor/execMain.c | 1 +
src/backend/executor/execParallel.c | 1 +
src/backend/executor/execPartition.c | 4 +-
src/backend/executor/execUtils.c | 2 +
src/backend/executor/nodeAppend.c | 4 +-
src/backend/executor/nodeMergeAppend.c | 4 +-
src/backend/optimizer/plan/createplan.c | 24 ++++-----
src/backend/optimizer/plan/planner.c | 1 +
src/backend/optimizer/plan/setrefs.c | 65 +++++++++++++------------
src/backend/partitioning/partprune.c | 18 ++++---
src/include/executor/execPartition.h | 3 +-
src/include/nodes/execnodes.h | 2 +
src/include/nodes/pathnodes.h | 6 +++
src/include/nodes/plannodes.h | 11 +++--
src/include/partitioning/partprune.h | 8 +--
15 files changed, 92 insertions(+), 62 deletions(-)
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index ef2fd46092..72fc273524 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -825,6 +825,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
ExecInitRangeTable(estate, rangeTable);
estate->es_plannedstmt = plannedstmt;
+ estate->es_part_prune_infos = plannedstmt->partPruneInfos;
/*
* Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index f1fd7f7e8b..f73b8c2607 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -183,6 +183,7 @@ ExecSerializePlan(Plan *plan, EState *estate)
pstmt->dependsOnRole = false;
pstmt->parallelModeNeeded = false;
pstmt->planTree = plan;
+ pstmt->partPruneInfos = estate->es_part_prune_infos;
pstmt->rtable = estate->es_range_table;
pstmt->resultRelations = NIL;
pstmt->appendRelations = NIL;
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index e03ea27299..b55cdd2580 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -1638,11 +1638,13 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
PartitionPruneState *
ExecInitPartitionPruning(PlanState *planstate,
int n_total_subplans,
- PartitionPruneInfo *pruneinfo,
+ int part_prune_index,
Bitmapset **initially_valid_subplans)
{
PartitionPruneState *prunestate;
EState *estate = planstate->state;
+ PartitionPruneInfo *pruneinfo = list_nth(estate->es_part_prune_infos,
+ part_prune_index);
/* We may need an expression context to evaluate partition exprs */
ExecAssignExprContext(estate, planstate);
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 9df1f81ea8..f9c7976ff2 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -119,6 +119,8 @@ CreateExecutorState(void)
estate->es_relations = NULL;
estate->es_rowmarks = NULL;
estate->es_plannedstmt = NULL;
+ estate->es_part_prune_infos = NIL;
+ estate->es_part_prune_result = NULL;
estate->es_junkFilter = NULL;
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 357e10a1d7..c6f86a6510 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -134,7 +134,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
appendstate->as_begun = false;
/* If run-time partition pruning is enabled, then set that up now */
- if (node->part_prune_info != NULL)
+ if (node->part_prune_index >= 0)
{
PartitionPruneState *prunestate;
@@ -145,7 +145,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
*/
prunestate = ExecInitPartitionPruning(&appendstate->ps,
list_length(node->appendplans),
- node->part_prune_info,
+ node->part_prune_index,
&validsubplans);
appendstate->as_prune_state = prunestate;
nplans = bms_num_members(validsubplans);
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index c5c62fa5c7..8d35860c30 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -82,7 +82,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
mergestate->ps.ExecProcNode = ExecMergeAppend;
/* If run-time partition pruning is enabled, then set that up now */
- if (node->part_prune_info != NULL)
+ if (node->part_prune_index >= 0)
{
PartitionPruneState *prunestate;
@@ -93,7 +93,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
*/
prunestate = ExecInitPartitionPruning(&mergestate->ps,
list_length(node->mergeplans),
- node->part_prune_info,
+ node->part_prune_index,
&validsubplans);
mergestate->ms_prune_state = prunestate;
nplans = bms_num_members(validsubplans);
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index e37f2933eb..fd8ab4a167 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -1203,7 +1203,6 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
ListCell *subpaths;
int nasyncplans = 0;
RelOptInfo *rel = best_path->path.parent;
- PartitionPruneInfo *partpruneinfo = NULL;
int nodenumsortkeys = 0;
AttrNumber *nodeSortColIdx = NULL;
Oid *nodeSortOperators = NULL;
@@ -1354,6 +1353,9 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
subplans = lappend(subplans, subplan);
}
+ /* Set below if we find quals that we can use to run-time prune */
+ plan->part_prune_index = -1;
+
/*
* If any quals exist, they may be useful to perform further partition
* pruning during execution. Gather information needed by the executor to
@@ -1377,16 +1379,14 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
}
if (prunequal != NIL)
- partpruneinfo =
- make_partition_pruneinfo(root, rel,
- best_path->subpaths,
- prunequal);
+ plan->part_prune_index = make_partition_pruneinfo(root, rel,
+ best_path->subpaths,
+ prunequal);
}
plan->appendplans = subplans;
plan->nasyncplans = nasyncplans;
plan->first_partial_plan = best_path->first_partial_path;
- plan->part_prune_info = partpruneinfo;
copy_generic_path_info(&plan->plan, (Path *) best_path);
@@ -1425,7 +1425,6 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
List *subplans = NIL;
ListCell *subpaths;
RelOptInfo *rel = best_path->path.parent;
- PartitionPruneInfo *partpruneinfo = NULL;
/*
* We don't have the actual creation of the MergeAppend node split out
@@ -1518,6 +1517,9 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
subplans = lappend(subplans, subplan);
}
+ /* Set below if we find quals that we can use to run-time prune */
+ node->part_prune_index = -1;
+
/*
* If any quals exist, they may be useful to perform further partition
* pruning during execution. Gather information needed by the executor to
@@ -1541,13 +1543,13 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
}
if (prunequal != NIL)
- partpruneinfo = make_partition_pruneinfo(root, rel,
- best_path->subpaths,
- prunequal);
+ node->part_prune_index = make_partition_pruneinfo(root, rel,
+ best_path->subpaths,
+ prunequal);
}
node->mergeplans = subplans;
- node->part_prune_info = partpruneinfo;
+
/*
* If prepare_sort_from_pathkeys added sort columns, but we were told to
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 06ad856eac..b11249ed8f 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -518,6 +518,7 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
result->dependsOnRole = glob->dependsOnRole;
result->parallelModeNeeded = glob->parallelModeNeeded;
result->planTree = top_plan;
+ result->partPruneInfos = glob->partPruneInfos;
result->rtable = glob->finalrtable;
result->resultRelations = glob->resultRelations;
result->appendRelations = glob->appendRelations;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 9cef92cab2..b8d5610593 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -348,6 +348,29 @@ set_plan_references(PlannerInfo *root, Plan *plan)
}
}
+ /* Also fix up the information in PartitionPruneInfos. */
+ foreach (lc, root->partPruneInfos)
+ {
+ PartitionPruneInfo *pruneinfo = lfirst(lc);
+ ListCell *l;
+
+ foreach(l, pruneinfo->prune_infos)
+ {
+ List *prune_infos = lfirst(l);
+ ListCell *l2;
+
+ foreach(l2, prune_infos)
+ {
+ PartitionedRelPruneInfo *pinfo = lfirst(l2);
+
+ /* RT index of the table to which the pinfo belongs. */
+ pinfo->rtindex += rtoffset;
+ }
+ }
+
+ glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
+ }
+
return result;
}
@@ -1655,21 +1678,12 @@ set_append_references(PlannerInfo *root,
aplan->apprelids = offset_relid_set(aplan->apprelids, rtoffset);
- if (aplan->part_prune_info)
- {
- foreach(l, aplan->part_prune_info->prune_infos)
- {
- List *prune_infos = lfirst(l);
- ListCell *l2;
-
- foreach(l2, prune_infos)
- {
- PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
- pinfo->rtindex += rtoffset;
- }
- }
- }
+ /*
+ * PartitionPruneInfos will be added to a list in PlannerGlobal, so update
+ * the index.
+ */
+ if (aplan->part_prune_index >= 0)
+ aplan->part_prune_index += list_length(root->glob->partPruneInfos);
/* We don't need to recurse to lefttree or righttree ... */
Assert(aplan->plan.lefttree == NULL);
@@ -1727,21 +1741,12 @@ set_mergeappend_references(PlannerInfo *root,
mplan->apprelids = offset_relid_set(mplan->apprelids, rtoffset);
- if (mplan->part_prune_info)
- {
- foreach(l, mplan->part_prune_info->prune_infos)
- {
- List *prune_infos = lfirst(l);
- ListCell *l2;
-
- foreach(l2, prune_infos)
- {
- PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
- pinfo->rtindex += rtoffset;
- }
- }
- }
+ /*
+ * PartitionPruneInfos will be added to a list in PlannerGlobal, so update
+ * the index.
+ */
+ if (mplan->part_prune_index >= 0)
+ mplan->part_prune_index += list_length(root->glob->partPruneInfos);
/* We don't need to recurse to lefttree or righttree ... */
Assert(mplan->plan.lefttree == NULL);
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 9d3c05aed3..d77f7d3aef 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -209,16 +209,20 @@ static void partkey_datum_from_expr(PartitionPruneContext *context,
/*
* make_partition_pruneinfo
- * Builds a PartitionPruneInfo which can be used in the executor to allow
- * additional partition pruning to take place. Returns NULL when
- * partition pruning would be useless.
+ * Checks if the given set of quals can be used to build pruning steps
+ * that the executor can use to prune away unneeded partitions. If
+ * suitable quals are found then a PartitionPruneInfo is built and tagged
+ * onto the PlannerInfo's partPruneInfos list.
+ *
+ * The return value is the 0-based index of the item added to the
+ * partPruneInfos list or -1 if nothing was added.
*
* 'parentrel' is the RelOptInfo for an appendrel, and 'subpaths' is the list
* of scan paths for its child rels.
* 'prunequal' is a list of potential pruning quals (i.e., restriction
* clauses that are applicable to the appendrel).
*/
-PartitionPruneInfo *
+int
make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
List *subpaths,
List *prunequal)
@@ -332,7 +336,7 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
* quals, then we can just not bother with run-time pruning.
*/
if (prunerelinfos == NIL)
- return NULL;
+ return -1;
/* Else build the result data structure */
pruneinfo = makeNode(PartitionPruneInfo);
@@ -358,7 +362,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
else
pruneinfo->other_subplans = NULL;
- return pruneinfo;
+ root->partPruneInfos = lappend(root->partPruneInfos, pruneinfo);
+
+ return list_length(root->partPruneInfos) - 1;
}
/*
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 708435e952..bf962af7af 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -123,9 +123,8 @@ typedef struct PartitionPruneState
extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
int n_total_subplans,
- PartitionPruneInfo *pruneinfo,
+ int part_prune_index,
Bitmapset **initially_valid_subplans);
extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
bool initial_prune);
-
#endif /* EXECPARTITION_H */
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 01b1727fc0..63a89474db 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -611,6 +611,8 @@ typedef struct EState
struct ExecRowMark **es_rowmarks; /* Array of per-range-table-entry
* ExecRowMarks, or NULL if none */
PlannedStmt *es_plannedstmt; /* link to top of plan tree */
+ List *es_part_prune_infos; /* PlannedStmt.partPruneInfos */
+ struct PartitionPruneResult *es_part_prune_result; /* QueryDesc.part_prune_result */
const char *es_sourceText; /* Source text from QueryDesc */
JunkFilter *es_junkFilter; /* top-level junk filter, if any */
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 44ffc73f15..d87957ff6c 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -122,6 +122,9 @@ typedef struct PlannerGlobal
/* "flat" list of AppendRelInfos */
List *appendRelations;
+ /* List of PartitionPruneInfo contained in the plan */
+ List *partPruneInfos;
+
/* OIDs of relations the plan depends on */
List *relationOids;
@@ -480,6 +483,9 @@ struct PlannerInfo
/* Does this query modify any partition key columns? */
bool partColsUpdated;
+
+ /* PartitionPruneInfos added in this query's plan. */
+ List *partPruneInfos;
};
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index dca2a21e7a..f2daabb3b7 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -69,6 +69,9 @@ typedef struct PlannedStmt
struct Plan *planTree; /* tree of Plan nodes */
+ List *partPruneInfos; /* List of PartitionPruneInfo contained in
+ * the plan */
+
List *rtable; /* list of RangeTblEntry nodes */
/* rtable indexes of target relations for INSERT/UPDATE/DELETE */
@@ -269,8 +272,8 @@ typedef struct Append
*/
int first_partial_plan;
- /* Info for run-time subplan pruning; NULL if we're not doing that */
- struct PartitionPruneInfo *part_prune_info;
+ /* Index to PlannerInfo.partPruneInfos or -1 if no run-time pruning */
+ int part_prune_index;
} Append;
/* ----------------
@@ -304,8 +307,8 @@ typedef struct MergeAppend
/* NULLS FIRST/LAST directions */
bool *nullsFirst pg_node_attr(array_size(numCols));
- /* Info for run-time subplan pruning; NULL if we're not doing that */
- struct PartitionPruneInfo *part_prune_info;
+ /* Index to PlannerInfo.partPruneInfos or -1 if no run-time pruning */
+ int part_prune_index;
} MergeAppend;
/* ----------------
diff --git a/src/include/partitioning/partprune.h b/src/include/partitioning/partprune.h
index 90684efa25..ebf0dcff8c 100644
--- a/src/include/partitioning/partprune.h
+++ b/src/include/partitioning/partprune.h
@@ -70,10 +70,10 @@ typedef struct PartitionPruneContext
#define PruneCxtStateIdx(partnatts, step_id, keyno) \
((partnatts) * (step_id) + (keyno))
-extern PartitionPruneInfo *make_partition_pruneinfo(struct PlannerInfo *root,
- struct RelOptInfo *parentrel,
- List *subpaths,
- List *prunequal);
+extern int make_partition_pruneinfo(struct PlannerInfo *root,
+ struct RelOptInfo *parentrel,
+ List *subpaths,
+ List *prunequal);
extern Bitmapset *prune_append_rel_partitions(struct RelOptInfo *rel);
extern Bitmapset *get_matching_partitions(PartitionPruneContext *context,
List *pruning_steps);
--
2.35.3
[application/octet-stream] v19-0002-Optimize-AcquireExecutorLocks-by-locking-only-un.patch (80.6K, 3-v19-0002-Optimize-AcquireExecutorLocks-by-locking-only-un.patch)
From b67911f2ae182f7158501e7ce4b1799ff2e1efb4 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Wed, 22 Dec 2021 16:55:17 +0900
Subject: [PATCH v19 2/2] Optimize AcquireExecutorLocks() by locking only
unpruned partitions
This commit teaches AcquireExecutorLocks() to perform initial
partition pruning to notionally eliminate the subnodes contained in a
generic cached plan that need not be initialized during the actual
execution of the plan, and to skip locking the partitions scanned by
those subnodes.
The result of performing initial partition pruning this way, before
actual execution has started, is passed to the executor as a
PartitionPruneResult, supplied along with the PlannedStmt by the
callers of the executor that used plancache.c to get the plan. It is NULL
when the plan is obtained by calling the planner directly or when the
plan obtained by plancache.c is not a generic one.
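The prune-then-lock flow described above might be pictured with the toy
model below. Everything in it is an assumption made for illustration:
PruneResult, initial_prune(), acquire_locks() and the range-partition
bounds are hypothetical, locks are modelled as simple counters, and
treating a NULL result as "lock every partition" is a choice of this
sketch rather than something taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NPARTS 4

/* Hypothetical record of which partitions survived initial pruning. */
typedef struct PruneResult
{
    bool valid[NPARTS];
} PruneResult;

/* Toy lock table: counts how many times each partition was "locked". */
static int lock_count[NPARTS];

static void
lock_partition(int i)
{
    lock_count[i]++;
}

/*
 * "Initial" pruning in this sketch: partition i survives only if the
 * parameter value falls in its half-open range [lower[i], upper[i]).
 */
static void
initial_prune(const int *lower, const int *upper, int param,
              PruneResult *res)
{
    for (int i = 0; i < NPARTS; i++)
        res->valid[i] = (param >= lower[i] && param < upper[i]);
}

/*
 * Lock only the surviving partitions and return how many were locked.
 * A NULL result means no pruning was done ahead of execution, so the
 * sketch falls back to locking everything (an assumption of the sketch).
 */
static int
acquire_locks(const PruneResult *res)
{
    int nlocked = 0;

    for (int i = 0; i < NPARTS; i++)
    {
        if (res == NULL || res->valid[i])
        {
            lock_partition(i);
            nlocked++;
        }
    }
    return nlocked;
}
```

The same PruneResult that decided which partitions to lock would then be
handed on to execution, so that the subnodes skipped at lock time are also
skipped at initialization.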
---
src/backend/commands/copyto.c | 2 +-
src/backend/commands/createas.c | 2 +-
src/backend/commands/explain.c | 7 +-
src/backend/commands/extension.c | 2 +-
src/backend/commands/matview.c | 2 +-
src/backend/commands/prepare.c | 26 ++-
src/backend/executor/README | 32 ++++
src/backend/executor/execMain.c | 53 ++++++
src/backend/executor/execParallel.c | 27 ++-
src/backend/executor/execPartition.c | 234 +++++++++++++++++++++----
src/backend/executor/functions.c | 2 +-
src/backend/executor/nodeAppend.c | 11 +-
src/backend/executor/nodeMergeAppend.c | 5 +-
src/backend/executor/spi.c | 27 ++-
src/backend/nodes/readfuncs.c | 29 +++
src/backend/optimizer/plan/planner.c | 2 +
src/backend/optimizer/plan/setrefs.c | 46 +++++
src/backend/partitioning/partprune.c | 41 ++++-
src/backend/tcop/postgres.c | 8 +-
src/backend/tcop/pquery.c | 28 ++-
src/backend/utils/cache/plancache.c | 187 +++++++++++++++++---
src/backend/utils/mmgr/portalmem.c | 19 ++
src/include/commands/explain.h | 4 +-
src/include/executor/execPartition.h | 9 +-
src/include/executor/execdesc.h | 3 +
src/include/executor/executor.h | 2 +
src/include/nodes/execnodes.h | 27 +++
src/include/nodes/nodes.h | 1 +
src/include/nodes/pathnodes.h | 13 ++
src/include/nodes/plannodes.h | 21 +++
src/include/utils/plancache.h | 3 +-
src/include/utils/portal.h | 3 +
32 files changed, 782 insertions(+), 96 deletions(-)
diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index fca29a9a10..d839517693 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -541,7 +541,7 @@ BeginCopyTo(ParseState *pstate,
((DR_copy *) dest)->cstate = cstate;
/* Create a QueryDesc requesting no output */
- cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ cstate->queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
GetActiveSnapshot(),
InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 9abbb6b555..f6607f2454 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -325,7 +325,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index e29c2ae206..e41b13a3ea 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -407,7 +407,7 @@ ExplainOneQuery(Query *query, int cursorOptions,
}
/* run it (if needed) and produce output */
- ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+ ExplainOnePlan(plan, NULL, into, es, queryString, params, queryEnv,
&planduration, (es->buffers ? &bufusage : NULL));
}
}
@@ -515,7 +515,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
* to call it.
*/
void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, PartitionPruneResult *part_prune_result,
+ IntoClause *into, ExplainState *es,
const char *queryString, ParamListInfo params,
QueryEnvironment *queryEnv, const instr_time *planduration,
const BufferUsage *bufusage)
@@ -563,7 +564,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
dest = None_Receiver;
/* Create a QueryDesc for the query */
- queryDesc = CreateQueryDesc(plannedstmt, queryString,
+ queryDesc = CreateQueryDesc(plannedstmt, part_prune_result, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, instrument_option);
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index 3db859c3ea..631cc07217 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -776,7 +776,7 @@ execute_sql_string(const char *sql)
{
QueryDesc *qdesc;
- qdesc = CreateQueryDesc(stmt,
+ qdesc = CreateQueryDesc(stmt, NULL,
sql,
GetActiveSnapshot(), NULL,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 9ac0383459..b0ed96e56c 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -408,7 +408,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, queryString,
+ queryDesc = CreateQueryDesc(plan, NULL, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 2333aae467..83465e40f8 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -155,6 +155,7 @@ ExecuteQuery(ParseState *pstate,
PreparedStatement *entry;
CachedPlan *cplan;
List *plan_list;
+ List *part_prune_result_list;
ParamListInfo paramLI = NULL;
EState *estate = NULL;
Portal portal;
@@ -193,7 +194,10 @@ ExecuteQuery(ParseState *pstate,
entry->plansource->query_string);
/* Replan if needed, and increment plan refcount for portal */
- cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL);
+ cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL,
+ &part_prune_result_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_result_list));
plan_list = cplan->stmt_list;
/*
@@ -207,6 +211,9 @@ ExecuteQuery(ParseState *pstate,
plan_list,
cplan);
+ /* Copy PartitionPruneResults into the portal's context. */
+ PortalStorePartitionPruneResults(portal, part_prune_result_list);
+
/*
* For CREATE TABLE ... AS EXECUTE, we must verify that the prepared
* statement is one that produces tuples. Currently we insist that it be
@@ -576,7 +583,9 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
const char *query_string;
CachedPlan *cplan;
List *plan_list;
- ListCell *p;
+ List *part_prune_result_list;
+ ListCell *p,
+ *pp;
ParamListInfo paramLI = NULL;
EState *estate = NULL;
instr_time planstart;
@@ -619,7 +628,10 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
/* Replan if needed, and acquire a transient refcount */
cplan = GetCachedPlan(entry->plansource, paramLI,
- CurrentResourceOwner, queryEnv);
+ CurrentResourceOwner, queryEnv,
+ &part_prune_result_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_result_list));
INSTR_TIME_SET_CURRENT(planduration);
INSTR_TIME_SUBTRACT(planduration, planstart);
@@ -634,13 +646,15 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
plan_list = cplan->stmt_list;
/* Explain each query */
- foreach(p, plan_list)
+ forboth(p, plan_list, pp, part_prune_result_list)
{
PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
+ PartitionPruneResult *part_prune_result = lfirst_node(PartitionPruneResult, pp);
if (pstmt->commandType != CMD_UTILITY)
- ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv,
- &planduration, (es->buffers ? &bufusage : NULL));
+ ExplainOnePlan(pstmt, part_prune_result, into, es, query_string,
+ paramLI, queryEnv, &planduration,
+ (es->buffers ? &bufusage : NULL));
else
ExplainOneUtility(pstmt->utilityStmt, into, es, query_string,
paramLI, queryEnv);
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 0b5183fc4a..953a476ea5 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -65,6 +65,34 @@ found there. This currently only occurs for Append and MergeAppend nodes. In
this case the non-required subplans are ignored and the executor state's
subnode array will become out of sequence to the plan's subplan list.
+Actually, so-called execution time pruning may occur even before execution
+has started. One case where that occurs is when a cached generic
+plan is being validated for execution by plancache.c: GetCachedPlan(), which
+works by locking all the relations that will be scanned by that plan. If the
+generic plan contains nodes that can perform execution time partition pruning
+(that is, contain a PartitionPruneInfo), a subset of pruning steps contained
+in a given node's PartitionPruneInfo that do not depend on the execution
+actually having started (called "initial" pruning steps) are performed at this
+point to figure out the minimal set of child subplans that satisfy those
+pruning steps. AcquireExecutorLocks() looking at a given plan tree will then
+lock only the relations scanned by the child subplans that survived such
+pruning, along with those present in PlannedStmt.minLockRelids. Note that the
+subplans are only notionally pruned in that they are not removed from the plan
+tree as such.
+
+To prevent the executor and any third party execution code that can look at
+the plan tree from trying to execute the subplans that were pruned as
+described above, the result of pruning is passed to the executor as a
+PartitionPruneResult node via the QueryDesc. It consists of the set of
+indexes of surviving subplans in their respective parent plan node's list of
+child subplans, saved as a list of bitmapsets, with one element for every
+parent plan node whose PartitionPruneInfo is present in
+PlannedStmt.partPruneInfos. In other words, the executor should not
+re-evaluate the set of initially valid subplans by redoing the initial pruning
+if it was already done by AcquireExecutorLocks(), because the re-evaluation may
+very well end up resulting in a different set of subplans, containing some
+whose relations were not locked by AcquireExecutorLocks().
+
Each Plan node may have expression trees associated with it, to represent
its target list, qualification conditions, etc. These trees are also
read-only to the executor, but the executor state for expression evaluation
@@ -286,6 +314,10 @@ Query Processing Control Flow
This is a sketch of control flow for full query processing:
+ [ ExecutorDoInitialPruning ] --- an optional step to perform initial
+ partition pruning on the plan tree, the result of which is passed
+ to the executor via QueryDesc
+
CreateQueryDesc
ExecutorStart
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 72fc273524..45824624f8 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -49,6 +49,7 @@
#include "commands/matview.h"
#include "commands/trigger.h"
#include "executor/execdebug.h"
+#include "executor/execPartition.h"
#include "executor/nodeSubplan.h"
#include "foreign/fdwapi.h"
#include "jit/jit.h"
@@ -104,6 +105,56 @@ static void EvalPlanQualStart(EPQState *epqstate, Plan *planTree);
/* end of local decls */
+/* ----------------------------------------------------------------
+ * ExecutorDoInitialPruning
+ *
+ * For each plan tree node that has been assigned a PartitionPruneInfo,
+ * this performs initial partition pruning using the information contained
+ * therein to determine the set of child subplans that satisfy the initial
+ * pruning steps, to be returned as a bitmapset of their indexes in the
+ * node's list of child subplans (for example, an Append's appendplans).
+ *
+ * Return value is a PartitionPruneResult node that contains a list of those
+ * bitmapsets, with one element for every PartitionPruneInfo, and a bitmapset
+ * of the RT indexes of all the leaf partitions scanned by those chosen
+ * subplans. Note that the latter is shared across all PartitionPruneInfos.
+ *
+ * The executor must see exactly the same set of subplans as valid for
+ * execution when doing ExecInitNode() on the plan nodes whose
+ * PartitionPruneInfos are processed here. So, it must get the set from the
+ * aforementioned PartitionPruneResult, instead of computing it all over
+ * again by redoing the initial pruning. It's the caller's job to pass the
+ * PartitionPruneResult to the executor.
+ *
+ * Note: Partitioned tables mentioned in PartitionedRelPruneInfo nodes that
+ * drive the pruning will be locked before doing the pruning.
+ * ----------------------------------------------------------------
+ */
+PartitionPruneResult *
+ExecutorDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params)
+{
+ PartitionPruneResult *result;
+ ListCell *lc;
+
+ /* Only get here if there is any pruning to do. */
+ Assert(plannedstmt->containsInitialPruning);
+
+ result = makeNode(PartitionPruneResult);
+ foreach(lc, plannedstmt->partPruneInfos)
+ {
+ PartitionPruneInfo *pruneinfo = lfirst(lc);
+ Bitmapset *valid_subplan_offs;
+
+ valid_subplan_offs =
+ ExecPartitionDoInitialPruning(plannedstmt, params, pruneinfo,
+ &result->scan_leafpart_rtis);
+ result->valid_subplan_offs_list =
+ lappend(result->valid_subplan_offs_list,
+ valid_subplan_offs);
+ }
+
+ return result;
+}
/* ----------------------------------------------------------------
* ExecutorStart
@@ -806,6 +857,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
{
CmdType operation = queryDesc->operation;
PlannedStmt *plannedstmt = queryDesc->plannedstmt;
+ PartitionPruneResult *part_prune_result = queryDesc->part_prune_result;
Plan *plan = plannedstmt->planTree;
List *rangeTable = plannedstmt->rtable;
EState *estate = queryDesc->estate;
@@ -826,6 +878,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
estate->es_plannedstmt = plannedstmt;
estate->es_part_prune_infos = plannedstmt->partPruneInfos;
+ estate->es_part_prune_result = part_prune_result;
/*
* Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index f73b8c2607..7e6dab5623 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -66,6 +66,7 @@
#define PARALLEL_KEY_QUERY_TEXT UINT64CONST(0xE000000000000008)
#define PARALLEL_KEY_JIT_INSTRUMENTATION UINT64CONST(0xE000000000000009)
#define PARALLEL_KEY_WAL_USAGE UINT64CONST(0xE00000000000000A)
+#define PARALLEL_KEY_PARTITIONPRUNERESULT UINT64CONST(0xE00000000000000B)
#define PARALLEL_TUPLE_QUEUE_SIZE 65536
@@ -182,6 +183,7 @@ ExecSerializePlan(Plan *plan, EState *estate)
pstmt->transientPlan = false;
pstmt->dependsOnRole = false;
pstmt->parallelModeNeeded = false;
+ pstmt->containsInitialPruning = false;
pstmt->planTree = plan;
pstmt->partPruneInfos = estate->es_part_prune_infos;
pstmt->rtable = estate->es_range_table;
@@ -597,12 +599,15 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
FixedParallelExecutorState *fpes;
char *pstmt_data;
char *pstmt_space;
+ char *part_prune_result_data;
+ char *part_prune_result_space;
char *paramlistinfo_space;
BufferUsage *bufusage_space;
WalUsage *walusage_space;
SharedExecutorInstrumentation *instrumentation = NULL;
SharedJitInstrumentation *jit_instrumentation = NULL;
int pstmt_len;
+ int part_prune_result_len;
int paramlistinfo_len;
int instrumentation_len = 0;
int jit_instrumentation_len = 0;
@@ -631,6 +636,7 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
/* Fix up and serialize plan to be sent to workers. */
pstmt_data = ExecSerializePlan(planstate->plan, estate);
+ part_prune_result_data = nodeToString(estate->es_part_prune_result);
/* Create a parallel context. */
pcxt = CreateParallelContext("postgres", "ParallelQueryMain", nworkers);
@@ -657,6 +663,11 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
shm_toc_estimate_chunk(&pcxt->estimator, pstmt_len);
shm_toc_estimate_keys(&pcxt->estimator, 1);
+ /* Estimate space for serialized PartitionPruneResult. */
+ part_prune_result_len = strlen(part_prune_result_data) + 1;
+ shm_toc_estimate_chunk(&pcxt->estimator, part_prune_result_len);
+ shm_toc_estimate_keys(&pcxt->estimator, 1);
+
/* Estimate space for serialized ParamListInfo. */
paramlistinfo_len = EstimateParamListSpace(estate->es_param_list_info);
shm_toc_estimate_chunk(&pcxt->estimator, paramlistinfo_len);
@@ -751,6 +762,12 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
memcpy(pstmt_space, pstmt_data, pstmt_len);
shm_toc_insert(pcxt->toc, PARALLEL_KEY_PLANNEDSTMT, pstmt_space);
+ /* Store serialized PartitionPruneResult */
+ part_prune_result_space = shm_toc_allocate(pcxt->toc, part_prune_result_len);
+ memcpy(part_prune_result_space, part_prune_result_data, part_prune_result_len);
+ shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARTITIONPRUNERESULT,
+ part_prune_result_space);
+
/* Store serialized ParamListInfo. */
paramlistinfo_space = shm_toc_allocate(pcxt->toc, paramlistinfo_len);
shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARAMLISTINFO, paramlistinfo_space);
@@ -1232,8 +1249,10 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
int instrument_options)
{
char *pstmtspace;
+ char *part_prune_result_space;
char *paramspace;
PlannedStmt *pstmt;
+ PartitionPruneResult *part_prune_result;
ParamListInfo paramLI;
char *queryString;
@@ -1244,12 +1263,18 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
pstmtspace = shm_toc_lookup(toc, PARALLEL_KEY_PLANNEDSTMT, false);
pstmt = (PlannedStmt *) stringToNode(pstmtspace);
+ /* Reconstruct leader-supplied PartitionPruneResult. */
+ part_prune_result_space =
+ shm_toc_lookup(toc, PARALLEL_KEY_PARTITIONPRUNERESULT, false);
+ part_prune_result = (PartitionPruneResult *)
+ stringToNode(part_prune_result_space);
+
/* Reconstruct ParamListInfo. */
paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
paramLI = RestoreParamList(&paramspace);
/* Create a QueryDesc for the query. */
- return CreateQueryDesc(pstmt,
+ return CreateQueryDesc(pstmt, part_prune_result,
queryString,
GetActiveSnapshot(), InvalidSnapshot,
receiver, paramLI, NULL, instrument_options);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index b55cdd2580..24e6f6e988 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -25,6 +25,7 @@
#include "mb/pg_wchar.h"
#include "miscadmin.h"
#include "nodes/makefuncs.h"
+#include "parser/parsetree.h"
#include "partitioning/partbounds.h"
#include "partitioning/partdesc.h"
#include "partitioning/partprune.h"
@@ -185,7 +186,11 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
static PartitionPruneState *CreatePartitionPruneState(PlanState *planstate,
- PartitionPruneInfo *pruneinfo);
+ PartitionPruneInfo *pruneinfo,
+ bool consider_initial_steps,
+ bool consider_exec_steps,
+ List *rtable, ExprContext *econtext,
+ PartitionDirectory partdir);
static void InitPartitionPruneContext(PartitionPruneContext *context,
List *pruning_steps,
PartitionDesc partdesc,
@@ -198,7 +203,8 @@ static void PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
- Bitmapset **validsubplans);
+ Bitmapset **validsubplans,
+ Bitmapset **scan_leafpart_rtis);
/*
@@ -1593,8 +1599,10 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
* considered to be a stable expression, it can change value from one plan
* node scan to the next during query execution. Stable comparison
* expressions that don't involve such Params allow partition pruning to be
- * done once during executor startup. Expressions that do involve such Params
- * require us to prune separately for each scan of the parent plan node.
+ * done once during executor startup or during ExecutorDoInitialPruning() that
+ * runs as part of performing AcquireExecutorLocks() on a given plan tree.
+ * Expressions that do involve such Params require us to prune separately for
+ * each scan of the parent plan node.
*
* Note that pruning away unneeded subplans during executor startup has the
* added benefit of not having to initialize the unneeded subplans at all.
@@ -1611,6 +1619,13 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
* account for initial pruning possibly having eliminated some of the
* subplans.
*
+ * ExecPartitionDoInitialPruning:
+ * Do initial pruning with the information contained in a given
+ * PartitionPruneInfo to determine the minimal set of child subplans
+ * to be executed of the parent plan node to which the PartitionPruneInfo
+ * belongs and also the set of the RT indexes of leaf partitions that will
+ * be scanned with those subplans.
+ *
* ExecFindMatchingSubPlans:
* Returns indexes of matching subplans after evaluating the expressions
* that are safe to evaluate at a given point. This function is first
@@ -1628,8 +1643,9 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
*
* On return, *initially_valid_subplans is assigned the set of indexes of
* child subplans that must be initialized along with the parent plan node.
- * Initial pruning is performed here if needed and in that case only the
- * surviving subplans' indexes are added.
+ * Initial pruning is performed here if needed (unless it has already been done
+ * by ExecutorDoInitialPruning()), and in that case only the surviving
+ * subplans' indexes are added.
*
* If subplans are indeed pruned, subplan_map arrays contained in the returned
* PartitionPruneState are re-sequenced to not count those, though only if the
@@ -1645,24 +1661,59 @@ ExecInitPartitionPruning(PlanState *planstate,
EState *estate = planstate->state;
PartitionPruneInfo *pruneinfo = list_nth(estate->es_part_prune_infos,
part_prune_index);
+ PartitionPruneResult *pruneresult = estate->es_part_prune_result;
+ bool do_pruning = (pruneinfo->needs_init_pruning ||
+ pruneinfo->needs_exec_pruning);
- /* We may need an expression context to evaluate partition exprs */
- ExecAssignExprContext(estate, planstate);
+ /*
+ * No need to do initial pruning if it was done already by
+ * ExecutorDoInitialPruning(), which it would be if es_part_prune_result
+ * has been set.
+ */
+ if (pruneresult)
+ do_pruning = pruneinfo->needs_exec_pruning;
- /* Create the working data structure for pruning */
- prunestate = CreatePartitionPruneState(planstate, pruneinfo);
+ prunestate = NULL;
+ if (do_pruning)
+ {
+ /* We may need an expression context to evaluate partition exprs */
+ ExecAssignExprContext(estate, planstate);
+
+ /* For data reading, executor always omits detached partitions */
+ if (estate->es_partition_directory == NULL)
+ estate->es_partition_directory =
+ CreatePartitionDirectory(estate->es_query_cxt, false);
+
+ /*
+ * Create the working data structure for pruning. No need to consider
+ * initial pruning steps if we have a PartitionPruneResult.
+ */
+ prunestate = CreatePartitionPruneState(planstate, pruneinfo,
+ pruneresult == NULL, true,
+ NIL, planstate->ps_ExprContext,
+ estate->es_partition_directory);
+ }
/*
* Perform an initial partition prune pass, if required.
*/
- if (prunestate->do_initial_prune)
- *initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true);
+ if (pruneresult)
+ {
+ *initially_valid_subplans =
+ list_nth(pruneresult->valid_subplan_offs_list, part_prune_index);
+ }
+ else if (prunestate && prunestate->do_initial_prune)
+ {
+ *initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true,
+ NULL);
+ }
else
{
/* No pruning, so we'll need to initialize all subplans */
Assert(n_total_subplans > 0);
*initially_valid_subplans = bms_add_range(NULL, 0,
n_total_subplans - 1);
+ return prunestate;
}
/*
@@ -1670,7 +1721,8 @@ ExecInitPartitionPruning(PlanState *planstate,
* that were removed above due to initial pruning. No need to do this if
* no steps were removed.
*/
- if (bms_num_members(*initially_valid_subplans) < n_total_subplans)
+ if (prunestate &&
+ bms_num_members(*initially_valid_subplans) < n_total_subplans)
{
/*
* We can safely skip this when !do_exec_prune, even though that
@@ -1686,11 +1738,73 @@ ExecInitPartitionPruning(PlanState *planstate,
return prunestate;
}
+/*
+ * ExecPartitionDoInitialPruning
+ * Perform initial pruning using given PartitionPruneInfo to determine
+ * the minimal set of child subplans that will be executed and also the
+ * set of RT indexes of the leaf partitions scanned by those subplans.
+ */
+Bitmapset *
+ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
+ PartitionPruneInfo *pruneinfo,
+ Bitmapset **scan_leafpart_rtis)
+{
+ List *rtable = plannedstmt->rtable;
+ ExprContext *econtext;
+ PartitionDirectory pdir;
+ MemoryContext oldcontext,
+ tmpcontext;
+ PartitionPruneState *prunestate;
+ Bitmapset *valid_subplan_offs;
+
+ /*
+ * A temporary context for memory allocations required while executing
+ * partition pruning steps.
+ */
+ tmpcontext = AllocSetContextCreate(CurrentMemoryContext,
+ "initial pruning working data",
+ ALLOCSET_DEFAULT_SIZES);
+ oldcontext = MemoryContextSwitchTo(tmpcontext);
+
+ /*
+ * PartitionDirectory to look up partition descriptors, which omits
+ * detached partitions, just like in the executor proper.
+ */
+ pdir = CreatePartitionDirectory(CurrentMemoryContext, false);
+
+ /*
+ * We don't yet have a PlanState for the parent plan node, so we must
+ * create a standalone ExprContext to evaluate pruning expressions,
+ * equipped with the information about the EXTERN parameters that the
+ * caller passed us. Note that that's okay because the initial pruning
+ * steps do not contain anything that requires the execution to have
+ * started.
+ */
+ econtext = CreateStandaloneExprContext();
+ econtext->ecxt_param_list_info = params;
+ prunestate = CreatePartitionPruneState(NULL, pruneinfo, true, false,
+ rtable, econtext, pdir);
+ MemoryContextSwitchTo(oldcontext);
+
+ /* Do the initial pruning. */
+ valid_subplan_offs = ExecFindMatchingSubPlans(prunestate, true,
+ scan_leafpart_rtis);
+
+ FreeExprContext(econtext, true);
+ DestroyPartitionDirectory(pdir);
+ MemoryContextDelete(tmpcontext);
+
+ return valid_subplan_offs;
+}
+
/*
* CreatePartitionPruneState
* Build the data structure required for calling ExecFindMatchingSubPlans
*
- * 'planstate' is the parent plan node's execution state.
+ * 'planstate', if not NULL, is the parent plan node's execution state. It
+ * can be NULL if being called before ExecutorStart(), in which case,
+ * 'rtable' (range table), 'econtext', and 'partdir' must be explicitly
+ * provided.
*
* 'pruneinfo' is a PartitionPruneInfo as generated by
* make_partition_pruneinfo. Here we build a PartitionPruneState containing a
@@ -1704,19 +1818,21 @@ ExecInitPartitionPruning(PlanState *planstate,
* PartitionedRelPruneInfo.
*/
static PartitionPruneState *
-CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
+CreatePartitionPruneState(PlanState *planstate,
+ PartitionPruneInfo *pruneinfo,
+ bool consider_initial_steps,
+ bool consider_exec_steps,
+ List *rtable, ExprContext *econtext,
+ PartitionDirectory partdir)
{
- EState *estate = planstate->state;
+ EState *estate = planstate ? planstate->state : NULL;
PartitionPruneState *prunestate;
int n_part_hierarchies;
ListCell *lc;
int i;
- ExprContext *econtext = planstate->ps_ExprContext;
- /* For data reading, executor always omits detached partitions */
- if (estate->es_partition_directory == NULL)
- estate->es_partition_directory =
- CreatePartitionDirectory(estate->es_query_cxt, false);
+ Assert((estate != NULL) ||
+ (partdir != NULL && econtext != NULL && rtable != NIL));
n_part_hierarchies = list_length(pruneinfo->prune_infos);
Assert(n_part_hierarchies > 0);
@@ -1771,15 +1887,42 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
PartitionKey partkey;
/*
- * We can rely on the copies of the partitioned table's partition
- * key and partition descriptor appearing in its relcache entry,
- * because that entry will be held open and locked for the
- * duration of this executor run.
+ * Must open the relation ourselves when called before the
+ * execution has started, such as when called during
+ * ExecutorDoInitialPruning() on a cached plan. In that case,
+ * sub-partitions must be locked, because AcquirePlannerLocks()
+ * would not have seen them. (1st relation in a partrelpruneinfos
+ * list is always the root partitioned table appearing in the
+ * query, which AcquirePlannerLocks() would have locked; the
+ * Assert in relation_open() guards that assumption.)
+ */
+ if (estate == NULL)
+ {
+ RangeTblEntry *rte = rt_fetch(pinfo->rtindex, rtable);
+ int lockmode = (j == 0) ? NoLock : rte->rellockmode;
+
+ partrel = table_open(rte->relid, lockmode);
+ }
+ else
+ partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
+
+ /*
+ * We can rely on the copy of the partitioned table's partition
+ * key in its relcache entry, because it can't change (or
+ * get destroyed) as long as the relation is locked. Partition
+ * descriptor is taken from the PartitionDirectory associated with
+ * the table that is held open long enough for the descriptor to
+ * remain valid while it's used to perform the pruning steps.
*/
- partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
partkey = RelationGetPartitionKey(partrel);
- partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
- partrel);
+ partdesc = PartitionDirectoryLookup(partdir, partrel);
+
+ /*
+ * Must close partrel, keeping the lock taken, if we're not using
+ * EState's entry.
+ */
+ if (estate == NULL)
+ table_close(partrel, NoLock);
/*
* Initialize the subplan_map and subpart_map.
@@ -1793,6 +1936,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
Assert(partdesc->nparts >= pinfo->nparts);
pprune->nparts = partdesc->nparts;
pprune->subplan_map = palloc(sizeof(int) * partdesc->nparts);
+ pprune->rti_map = palloc(sizeof(Index) * partdesc->nparts);
if (partdesc->nparts == pinfo->nparts)
{
/*
@@ -1803,6 +1947,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
pprune->subpart_map = pinfo->subpart_map;
memcpy(pprune->subplan_map, pinfo->subplan_map,
sizeof(int) * pinfo->nparts);
+ memcpy(pprune->rti_map, pinfo->rti_map,
+ sizeof(Index) * pinfo->nparts);
/*
* Double-check that the list of unpruned relations has not
@@ -1853,6 +1999,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
pinfo->subplan_map[pd_idx];
pprune->subpart_map[pp_idx] =
pinfo->subpart_map[pd_idx];
+ pprune->rti_map[pp_idx] =
+ pinfo->rti_map[pd_idx];
pd_idx++;
}
else
@@ -1860,6 +2008,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
/* this partdesc entry is not in the plan */
pprune->subplan_map[pp_idx] = -1;
pprune->subpart_map[pp_idx] = -1;
+ pprune->rti_map[pp_idx] = 0;
}
}
@@ -1881,7 +2030,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
* Initialize pruning contexts as needed.
*/
pprune->initial_pruning_steps = pinfo->initial_pruning_steps;
- if (pinfo->initial_pruning_steps)
+ if (consider_initial_steps && pinfo->initial_pruning_steps)
{
InitPartitionPruneContext(&pprune->initial_context,
pinfo->initial_pruning_steps,
@@ -1891,7 +2040,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
prunestate->do_initial_prune = true;
}
pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
- if (pinfo->exec_pruning_steps)
+ if (consider_exec_steps && pinfo->exec_pruning_steps)
{
InitPartitionPruneContext(&pprune->exec_context,
pinfo->exec_pruning_steps,
@@ -2119,10 +2268,14 @@ PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
* Pass initial_prune if PARAM_EXEC Params cannot yet be evaluated. This
* differentiates the initial executor-time pruning step from later
* runtime pruning.
+ *
+ * RT indexes of leaf partitions scanned by the chosen subplans are added to
+ * *scan_leafpart_rtis if the pointer is non-NULL.
*/
Bitmapset *
ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
- bool initial_prune)
+ bool initial_prune,
+ Bitmapset **scan_leafpart_rtis)
{
Bitmapset *result = NULL;
MemoryContext oldcontext;
@@ -2157,7 +2310,7 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
*/
pprune = &prunedata->partrelprunedata[0];
find_matching_subplans_recurse(prunedata, pprune, initial_prune,
- &result);
+ &result, scan_leafpart_rtis);
/* Expression eval may have used space in ExprContext too */
if (pprune->exec_pruning_steps)
@@ -2171,6 +2324,8 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
/* Copy result out of the temp context before we reset it */
result = bms_copy(result);
+ if (scan_leafpart_rtis)
+ *scan_leafpart_rtis = bms_copy(*scan_leafpart_rtis);
MemoryContextReset(prunestate->prune_context);
@@ -2181,13 +2336,15 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
* find_matching_subplans_recurse
* Recursive worker function for ExecFindMatchingSubPlans
*
- * Adds valid (non-prunable) subplan IDs to *validsubplans
+ * Adds valid (non-prunable) subplan IDs to *validsubplans and RT indexes of
+ * the corresponding leaf partitions to *scan_leafpart_rtis (if asked for).
*/
static void
find_matching_subplans_recurse(PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
- Bitmapset **validsubplans)
+ Bitmapset **validsubplans,
+ Bitmapset **scan_leafpart_rtis)
{
Bitmapset *partset;
int i;
@@ -2214,8 +2371,14 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
while ((i = bms_next_member(partset, i)) >= 0)
{
if (pprune->subplan_map[i] >= 0)
+ {
*validsubplans = bms_add_member(*validsubplans,
pprune->subplan_map[i]);
+ Assert(pprune->rti_map[i] > 0);
+ if (scan_leafpart_rtis)
+ *scan_leafpart_rtis = bms_add_member(*scan_leafpart_rtis,
+ pprune->rti_map[i]);
+ }
else
{
int partidx = pprune->subpart_map[i];
@@ -2223,7 +2386,8 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
if (partidx >= 0)
find_matching_subplans_recurse(prunedata,
&prunedata->partrelprunedata[partidx],
- initial_prune, validsubplans);
+ initial_prune, validsubplans,
+ scan_leafpart_rtis);
else
{
/*
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index f9460ae506..a2182a6b1f 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -844,7 +844,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
else
dest = None_Receiver;
- es->qd = CreateQueryDesc(es->stmt,
+ es->qd = CreateQueryDesc(es->stmt, NULL,
fcache->src,
GetActiveSnapshot(),
InvalidSnapshot,
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index c6f86a6510..96880e122a 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -155,7 +155,8 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
* subplan, we can fill as_valid_subplans immediately, preventing
* later calls to ExecFindMatchingSubPlans.
*/
- if (!prunestate->do_exec_prune && nplans > 0)
+ if (appendstate->as_prune_state == NULL ||
+ (!appendstate->as_prune_state->do_exec_prune && nplans > 0))
appendstate->as_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
}
else
@@ -577,7 +578,7 @@ choose_next_subplan_locally(AppendState *node)
}
else if (node->as_valid_subplans == NULL)
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
whichplan = -1;
}
@@ -642,7 +643,7 @@ choose_next_subplan_for_leader(AppendState *node)
if (node->as_valid_subplans == NULL)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
/*
* Mark each invalid plan as finished to allow the loop below to
@@ -717,7 +718,7 @@ choose_next_subplan_for_worker(AppendState *node)
else if (node->as_valid_subplans == NULL)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
mark_invalid_subplans_as_finished(node);
}
@@ -868,7 +869,7 @@ ExecAppendAsyncBegin(AppendState *node)
if (node->as_valid_subplans == NULL)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
classify_matching_subplans(node);
}
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index 8d35860c30..2312e5a633 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -103,7 +103,8 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
* subplan, we can fill ms_valid_subplans immediately, preventing
* later calls to ExecFindMatchingSubPlans.
*/
- if (!prunestate->do_exec_prune && nplans > 0)
+ if (mergestate->ms_prune_state == NULL ||
+ (!mergestate->ms_prune_state->do_exec_prune && nplans > 0))
mergestate->ms_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
}
else
@@ -218,7 +219,7 @@ ExecMergeAppend(PlanState *pstate)
*/
if (node->ms_valid_subplans == NULL)
node->ms_valid_subplans =
- ExecFindMatchingSubPlans(node->ms_prune_state, false);
+ ExecFindMatchingSubPlans(node->ms_prune_state, false, NULL);
/*
* First time through: pull the first tuple from each valid subplan,
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 29bc26669b..303a572c02 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1578,6 +1578,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
CachedPlanSource *plansource;
CachedPlan *cplan;
List *stmt_list;
+ List *part_prune_result_list;
char *query_string;
Snapshot snapshot;
MemoryContext oldcontext;
@@ -1657,7 +1658,10 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
*/
/* Replan if needed, and increment plan refcount for portal */
- cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv);
+ cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv,
+ &part_prune_result_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_result_list));
stmt_list = cplan->stmt_list;
if (!plan->saved)
@@ -1685,6 +1689,9 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
stmt_list,
cplan);
+ /* Copy PartitionPruneResults into the portal's context. */
+ PortalStorePartitionPruneResults(portal, part_prune_result_list);
+
/*
* Set up options for portal. Default SCROLL type is chosen the same way
* as PerformCursorOpen does it.
@@ -2092,7 +2099,8 @@ SPI_plan_get_cached_plan(SPIPlanPtr plan)
/* Get the generic plan for the query */
cplan = GetCachedPlan(plansource, NULL,
plan->saved ? CurrentResourceOwner : NULL,
- _SPI_current->queryEnv);
+ _SPI_current->queryEnv,
+ NULL /* Not interested in PartitionPruneResults */);
Assert(cplan == plansource->gplan);
/* Pop the error context stack */
@@ -2473,7 +2481,9 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
{
CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
List *stmt_list;
- ListCell *lc2;
+ List *part_prune_result_list;
+ ListCell *lc2,
+ *lc3;
spicallbackarg.query = plansource->query_string;
@@ -2549,8 +2559,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
* plan, the refcount must be backed by the plan_owner.
*/
cplan = GetCachedPlan(plansource, options->params,
- plan_owner, _SPI_current->queryEnv);
-
+ plan_owner, _SPI_current->queryEnv,
+ &part_prune_result_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_result_list));
stmt_list = cplan->stmt_list;
/*
@@ -2589,9 +2601,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
}
}
- foreach(lc2, stmt_list)
+ forboth(lc2, stmt_list, lc3, part_prune_result_list)
{
PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+ PartitionPruneResult *part_prune_result = lfirst_node(PartitionPruneResult, lc3);
bool canSetTag = stmt->canSetTag;
DestReceiver *dest;
@@ -2663,7 +2676,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
else
snap = InvalidSnapshot;
- qdesc = CreateQueryDesc(stmt,
+ qdesc = CreateQueryDesc(stmt, part_prune_result,
plansource->query_string,
snap, crosscheck_snapshot,
dest,
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 1421686938..d57478bde9 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -158,6 +158,11 @@
token = pg_strtok(&length); /* skip :fldname */ \
local_node->fldname = readIntCols(len)
+/* Read an Index array */
+#define READ_INDEX_ARRAY(fldname, len) \
+ token = pg_strtok(&length); /* skip :fldname */ \
+ local_node->fldname = readIndexCols(len)
+
/* Read a bool array */
#define READ_BOOL_ARRAY(fldname, len) \
token = pg_strtok(&length); /* skip :fldname */ \
@@ -623,6 +628,30 @@ readIntCols(int numCols)
return int_vals;
}
+/*
+ * readIndexCols
+ */
+Index *
+readIndexCols(int numCols)
+{
+ int tokenLength,
+ i;
+ const char *token;
+ Index *index_vals;
+
+ if (numCols <= 0)
+ return NULL;
+
+ index_vals = (Index *) palloc(numCols * sizeof(Index));
+ for (i = 0; i < numCols; i++)
+ {
+ token = pg_strtok(&tokenLength);
+ index_vals[i] = atoui(token);
+ }
+
+ return index_vals;
+}
+
/*
* readBoolCols
*/
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index b11249ed8f..7141035cc4 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -519,7 +519,9 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
result->parallelModeNeeded = glob->parallelModeNeeded;
result->planTree = top_plan;
result->partPruneInfos = glob->partPruneInfos;
+ result->containsInitialPruning = glob->containsInitialPruning;
result->rtable = glob->finalrtable;
+ result->minLockRelids = glob->minLockRelids;
result->resultRelations = glob->resultRelations;
result->appendRelations = glob->appendRelations;
result->subplans = glob->subplans;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index b8d5610593..da749e331e 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -270,6 +270,16 @@ set_plan_references(PlannerInfo *root, Plan *plan)
*/
add_rtes_to_flat_rtable(root, false);
+ /*
+ * Add the query's adjusted range of RT indexes to glob->minLockRelids.
+ * The adjusted RT indexes of prunable relations will be deleted from the
+ * set below where PartitionPruneInfos are processed.
+ */
+ glob->minLockRelids =
+ bms_add_range(glob->minLockRelids,
+ rtoffset + 1,
+ rtoffset + list_length(root->parse->rtable));
+
/*
* Adjust RT indexes of PlanRowMarks and add to final rowmarks list
*/
@@ -352,6 +362,7 @@ set_plan_references(PlannerInfo *root, Plan *plan)
foreach (lc, root->partPruneInfos)
{
PartitionPruneInfo *pruneinfo = lfirst(lc);
+ Bitmapset *leafpart_rtis = NULL;
ListCell *l;
foreach(l, pruneinfo->prune_infos)
@@ -362,15 +373,50 @@ set_plan_references(PlannerInfo *root, Plan *plan)
foreach(l2, prune_infos)
{
PartitionedRelPruneInfo *pinfo = lfirst(l2);
+ int i;
/* RT index of the table to which the pinfo belongs. */
pinfo->rtindex += rtoffset;
+
+ /* Likewise for leaf partitions that might be scanned. */
+ for (i = 0; i < pinfo->nparts; i++)
+ {
+ if (pinfo->rti_map[i] > 0 && pinfo->subplan_map[i] >= 0)
+ {
+ pinfo->rti_map[i] += rtoffset;
+ leafpart_rtis = bms_add_member(leafpart_rtis,
+ pinfo->rti_map[i]);
+ }
+ }
}
}
+ if (pruneinfo->needs_init_pruning)
+ {
+ glob->containsInitialPruning = true;
+
+ /*
+ * Delete the leaf partition RTIs from the global set of relations
+ * to be locked before executing the plan. AcquireExecutorLocks()
+ * will find the ones to add to the set after performing initial
+ * pruning.
+ */
+ glob->minLockRelids = bms_del_members(glob->minLockRelids,
+ leafpart_rtis);
+ }
+
glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
}
+ /*
+ * It seems worth doing a bms_copy() on glob->minLockRelids if we deleted
+ * bits from it above to get rid of any empty tail bits. It seems better
+ * for the loop over this set in AcquireExecutorLocks() to not have to go
+ * through those useless bit words.
+ */
+ if (glob->containsInitialPruning)
+ glob->minLockRelids = bms_copy(glob->minLockRelids);
+
return result;
}
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index d77f7d3aef..952c5b8327 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -144,7 +144,9 @@ static List *make_partitionedrel_pruneinfo(PlannerInfo *root,
List *prunequal,
Bitmapset *partrelids,
int *relid_subplan_map,
- Bitmapset **matchedsubplans);
+ Bitmapset **matchedsubplans,
+ bool *needs_init_pruning,
+ bool *needs_exec_pruning);
static void gen_partprune_steps(RelOptInfo *rel, List *clauses,
PartClauseTarget target,
GeneratePruningStepsContext *context);
@@ -234,6 +236,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int *relid_subplan_map;
ListCell *lc;
int i;
+ bool needs_init_pruning = false;
+ bool needs_exec_pruning = false;
/*
* Scan the subpaths to see which ones are scans of partition child
@@ -313,12 +317,16 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
Bitmapset *partrelids = (Bitmapset *) lfirst(lc);
List *pinfolist;
Bitmapset *matchedsubplans = NULL;
+ bool partrel_needs_init_pruning;
+ bool partrel_needs_exec_pruning;
pinfolist = make_partitionedrel_pruneinfo(root, parentrel,
prunequal,
partrelids,
relid_subplan_map,
- &matchedsubplans);
+ &matchedsubplans,
+ &partrel_needs_init_pruning,
+ &partrel_needs_exec_pruning);
/* When pruning is possible, record the matched subplans */
if (pinfolist != NIL)
@@ -327,6 +335,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
allmatchedsubplans = bms_join(matchedsubplans,
allmatchedsubplans);
}
+
+ needs_init_pruning |= partrel_needs_init_pruning;
+ needs_exec_pruning |= partrel_needs_exec_pruning;
}
pfree(relid_subplan_map);
@@ -341,6 +352,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
/* Else build the result data structure */
pruneinfo = makeNode(PartitionPruneInfo);
pruneinfo->prune_infos = prunerelinfos;
+ pruneinfo->needs_init_pruning = needs_init_pruning;
+ pruneinfo->needs_exec_pruning = needs_exec_pruning;
/*
* Some subplans may not belong to any of the identified partitioned rels.
@@ -441,13 +454,18 @@ add_part_relids(List *allpartrelids, Bitmapset *partrelids)
* If we cannot find any useful run-time pruning steps, return NIL.
* However, on success, each rel identified in partrelids will have
* an element in the result list, even if some of them are useless.
+ * *needs_init_pruning and *needs_exec_pruning are set to indicate whether the
+ * returned PartitionedRelPruneInfos contain pruning steps that can be
+ * performed before and after execution begins, respectively.
*/
static List *
make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
List *prunequal,
Bitmapset *partrelids,
int *relid_subplan_map,
- Bitmapset **matchedsubplans)
+ Bitmapset **matchedsubplans,
+ bool *needs_init_pruning,
+ bool *needs_exec_pruning)
{
RelOptInfo *targetpart = NULL;
List *pinfolist = NIL;
@@ -458,6 +476,10 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int rti;
int i;
+ /* Will find out below. */
+ *needs_init_pruning = false;
+ *needs_exec_pruning = false;
+
/*
* Examine each partitioned rel, constructing a temporary array to map
* from planner relids to index of the partitioned rel, and building a
@@ -545,6 +567,9 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
* executor per-scan pruning steps. This first pass creates startup
* pruning steps and detects whether there's any possibly-useful quals
* that would require per-scan pruning.
+ *
+ * The first pass also determines whether the second pass is necessary, by
+ * noting the presence of EXEC parameters.
*/
gen_partprune_steps(subpart, partprunequal, PARTTARGET_INITIAL,
&context);
@@ -619,6 +644,12 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
pinfo->execparamids = execparamids;
/* Remaining fields will be filled in the next loop */
+ /* record which types of pruning steps we've seen so far */
+ if (initial_pruning_steps != NIL)
+ *needs_init_pruning = true;
+ if (exec_pruning_steps != NIL)
+ *needs_exec_pruning = true;
+
pinfolist = lappend(pinfolist, pinfo);
}
@@ -646,6 +677,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int *subplan_map;
int *subpart_map;
Oid *relid_map;
+ Index *rti_map;
/*
* Construct the subplan and subpart maps for this partitioning level.
@@ -658,6 +690,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
subpart_map = (int *) palloc(nparts * sizeof(int));
memset(subpart_map, -1, nparts * sizeof(int));
relid_map = (Oid *) palloc0(nparts * sizeof(Oid));
+ rti_map = (Index *) palloc0(nparts * sizeof(Index));
present_parts = NULL;
i = -1;
@@ -672,6 +705,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
subplan_map[i] = subplanidx = relid_subplan_map[partrel->relid] - 1;
subpart_map[i] = subpartidx = relid_subpart_map[partrel->relid] - 1;
relid_map[i] = planner_rt_fetch(partrel->relid, root)->relid;
+ rti_map[i] = partrel->relid;
if (subplanidx >= 0)
{
present_parts = bms_add_member(present_parts, i);
@@ -696,6 +730,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
pinfo->subplan_map = subplan_map;
pinfo->subpart_map = subpart_map;
pinfo->relid_map = relid_map;
+ pinfo->rti_map = rti_map;
}
pfree(relid_subpart_map);
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 6f18b68856..16bda42f11 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1596,6 +1596,7 @@ exec_bind_message(StringInfo input_message)
int16 *rformats = NULL;
CachedPlanSource *psrc;
CachedPlan *cplan;
+ List *part_prune_result_list;
Portal portal;
char *query_string;
char *saved_stmt_name;
@@ -1971,7 +1972,9 @@ exec_bind_message(StringInfo input_message)
* will be generated in MessageContext. The plan refcount will be
* assigned to the Portal, so it will be released at portal destruction.
*/
- cplan = GetCachedPlan(psrc, params, NULL, NULL);
+ cplan = GetCachedPlan(psrc, params, NULL, NULL, &part_prune_result_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_result_list));
/*
* Now we can define the portal.
@@ -1986,6 +1989,9 @@ exec_bind_message(StringInfo input_message)
cplan->stmt_list,
cplan);
+ /* Copy PartitionPruneResults into the portal's context. */
+ PortalStorePartitionPruneResults(portal, part_prune_result_list);
+
/* Done with the snapshot used for parameter I/O and parsing/planning */
if (snapshot_set)
PopActiveSnapshot();
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 5aa5a350f3..8cc2e2162d 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -35,7 +35,7 @@
Portal ActivePortal = NULL;
-static void ProcessQuery(PlannedStmt *plan,
+static void ProcessQuery(PlannedStmt *plan, PartitionPruneResult *part_prune_result,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -65,6 +65,7 @@ static void DoPortalRewind(Portal portal);
*/
QueryDesc *
CreateQueryDesc(PlannedStmt *plannedstmt,
+ PartitionPruneResult *part_prune_result,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
@@ -77,6 +78,8 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
qd->operation = plannedstmt->commandType; /* operation */
qd->plannedstmt = plannedstmt; /* plan */
+ qd->part_prune_result = part_prune_result; /* ExecutorDoInitialPruning()
+ * output for plan */
qd->sourceText = sourceText; /* query text */
qd->snapshot = RegisterSnapshot(snapshot); /* snapshot */
/* RI check snapshot */
@@ -122,6 +125,7 @@ FreeQueryDesc(QueryDesc *qdesc)
* PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
*
* plan: the plan tree for the query
+ * part_prune_result: ExecutorDoInitialPruning() output for the plan tree
* sourceText: the source text of the query
* params: any parameters needed
* dest: where to send results
@@ -134,6 +138,7 @@ FreeQueryDesc(QueryDesc *qdesc)
*/
static void
ProcessQuery(PlannedStmt *plan,
+ PartitionPruneResult *part_prune_result,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -145,7 +150,7 @@ ProcessQuery(PlannedStmt *plan,
/*
* Create the QueryDesc object
*/
- queryDesc = CreateQueryDesc(plan, sourceText,
+ queryDesc = CreateQueryDesc(plan, part_prune_result, sourceText,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
@@ -491,8 +496,13 @@ PortalStart(Portal portal, ParamListInfo params,
/*
* Create QueryDesc in portal's context; for the moment, set
* the destination to DestNone.
+ *
+ * There is no PartitionPruneResult unless the PlannedStmt is
+ * from a CachedPlan.
*/
queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+ portal->part_prune_results == NIL ? NULL :
+ linitial(portal->part_prune_results),
portal->sourceText,
GetActiveSnapshot(),
InvalidSnapshot,
@@ -1225,6 +1235,8 @@ PortalRunMulti(Portal portal,
if (pstmt->utilityStmt == NULL)
{
+ PartitionPruneResult *part_prune_result = NULL;
+
/*
* process a plannable query.
*/
@@ -1271,10 +1283,18 @@ PortalRunMulti(Portal portal,
else
UpdateActiveSnapshotCommandId();
+ /*
+ * Determine if there's a corresponding PartitionPruneResult for
+ * this PlannedStmt.
+ */
+ if (portal->part_prune_results != NIL)
+ part_prune_result = list_nth(portal->part_prune_results,
+ foreach_current_index(stmtlist_item));
+
if (pstmt->canSetTag)
{
/* statement can set tag string */
- ProcessQuery(pstmt,
+ ProcessQuery(pstmt, part_prune_result,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
@@ -1283,7 +1303,7 @@ PortalRunMulti(Portal portal,
else
{
/* stmt added by rewrite cannot set tag */
- ProcessQuery(pstmt,
+ ProcessQuery(pstmt, part_prune_result,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 0d6a295674..d1c9605979 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -99,14 +99,19 @@ static dlist_head cached_expression_list = DLIST_STATIC_INIT(cached_expression_l
static void ReleaseGenericPlan(CachedPlanSource *plansource);
static List *RevalidateCachedQuery(CachedPlanSource *plansource,
QueryEnvironment *queryEnv);
-static bool CheckCachedPlan(CachedPlanSource *plansource);
+static bool CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+ List **part_prune_result_list);
static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
- ParamListInfo boundParams, QueryEnvironment *queryEnv);
+ ParamListInfo boundParams, QueryEnvironment *queryEnv,
+ List **part_prune_result_list);
static bool choose_custom_plan(CachedPlanSource *plansource,
ParamListInfo boundParams);
static double cached_plan_cost(CachedPlan *plan, bool include_planner);
static Query *QueryListGetPrimaryStmt(List *stmts);
-static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static void AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+ List **part_prune_result_list,
+ List **lockedRelids_per_stmt);
+static void ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt);
static void AcquirePlannerLocks(List *stmt_list, bool acquire);
static void ScanQueryForLocks(Query *parsetree, bool acquire);
static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -790,15 +795,20 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
*
* On a "true" return, we have acquired the locks needed to run the plan.
* (We must do this for the "true" result to be race-condition-free.)
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_result_list.
*/
static bool
-CheckCachedPlan(CachedPlanSource *plansource)
+CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+ List **part_prune_result_list)
{
CachedPlan *plan = plansource->gplan;
/* Assert that caller checked the querytree */
Assert(plansource->is_valid);
+ *part_prune_result_list = NIL;
+
/* If there's no generic plan, just say "false" */
if (!plan)
return false;
@@ -820,13 +830,21 @@ CheckCachedPlan(CachedPlanSource *plansource)
*/
if (plan->is_valid)
{
+ List *lockedRelids_per_stmt;
+
/*
* Plan must have positive refcount because it is referenced by
* plansource; so no need to fear it disappears under us here.
*/
Assert(plan->refcount > 0);
- AcquireExecutorLocks(plan->stmt_list, true);
+ /*
+ * Lock relations scanned by the plan. This is where the pruning
+ * happens if needed.
+ */
+ AcquireExecutorLocks(plan->stmt_list, boundParams,
+ part_prune_result_list,
+ &lockedRelids_per_stmt);
/*
* If plan was transient, check to see if TransactionXmin has
@@ -848,7 +866,14 @@ CheckCachedPlan(CachedPlanSource *plansource)
}
/* Oops, the race case happened. Release useless locks. */
- AcquireExecutorLocks(plan->stmt_list, false);
+ ReleaseExecutorLocks(plan->stmt_list, lockedRelids_per_stmt);
+
+ /*
+ * The output list and any objects therein were allocated in the caller's
+ * (hopefully short-lived) context, so they will not stay leaked for long;
+ * still, reset the pointer so that the stale list is not looked at.
+ */
+ *part_prune_result_list = NIL;
}
/*
@@ -874,10 +899,15 @@ CheckCachedPlan(CachedPlanSource *plansource)
* Planning work is done in the caller's memory context. The finished plan
* is in a child memory context, which typically should get reparented
* (unless this is a one-shot plan, in which case we don't copy the plan).
+ *
+ * A list of NULLs is returned in *part_prune_result_list, meaning that no
+ * PartitionPruneResult nodes have yet been created for the plans in
+ * stmt_list.
*/
static CachedPlan *
BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
- ParamListInfo boundParams, QueryEnvironment *queryEnv)
+ ParamListInfo boundParams, QueryEnvironment *queryEnv,
+ List **part_prune_result_list)
{
CachedPlan *plan;
List *plist;
@@ -1007,6 +1037,17 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
MemoryContextSwitchTo(oldcxt);
+ /*
+ * There are no PartitionPruneResults to add yet, but the list must still
+ * be initialized to have the same number of elements as the list of
+ * PlannedStmts.
+ */
+ *part_prune_result_list = NIL;
+ foreach(lc, plist)
+ {
+ *part_prune_result_list = lappend(*part_prune_result_list, NULL);
+ }
+
return plan;
}
@@ -1126,6 +1167,17 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
* plan or a custom plan for the given parameters: the caller does not know
* which it will get.
*
+ * If part_prune_result_list is not NULL, then for every PlannedStmt in the
+ * returned CachedPlan, an element that is either a PartitionPruneResult or
+ * NULL is added to *part_prune_result_list. It is a PartitionPruneResult if
+ * the PlannedStmt comes from an existing, otherwise valid CachedPlan and
+ * contains at least one PartitionPruneInfo with "initial" pruning steps.
+ * Those steps are performed by calling ExecutorDoInitialPruning(), which
+ * prunes away subplans that don't match the pruning conditions, so that
+ * AcquireExecutorLocks() locks only the surviving leaf partitions. The
+ * PartitionPruneResult contains a list of bitmapsets of the indexes of
+ * matching subplans, one for each PartitionPruneInfo.
+ *
* On return, the plan is valid and we have sufficient locks to begin
* execution.
*
@@ -1139,11 +1191,13 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
*/
CachedPlan *
GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
- ResourceOwner owner, QueryEnvironment *queryEnv)
+ ResourceOwner owner, QueryEnvironment *queryEnv,
+ List **part_prune_result_list)
{
CachedPlan *plan = NULL;
List *qlist;
bool customplan;
+ List *my_part_prune_result_list;
/* Assert caller is doing things in a sane order */
Assert(plansource->magic == CACHEDPLANSOURCE_MAGIC);
@@ -1160,7 +1214,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
if (!customplan)
{
- if (CheckCachedPlan(plansource))
+ if (CheckCachedPlan(plansource, boundParams,
+ &my_part_prune_result_list))
{
/* We want a generic plan, and we already have a valid one */
plan = plansource->gplan;
@@ -1169,7 +1224,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
else
{
/* Build a new generic plan */
- plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv);
+ plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv,
+ &my_part_prune_result_list);
/* Just make real sure plansource->gplan is clear */
ReleaseGenericPlan(plansource);
/* Link the new generic plan into the plansource */
@@ -1214,7 +1270,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
if (customplan)
{
/* Build a custom plan */
- plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv);
+ plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv,
+ &my_part_prune_result_list);
/* Accumulate total costs of custom plans */
plansource->total_custom_cost += cached_plan_cost(plan, true);
@@ -1246,6 +1303,9 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
plan->is_saved = true;
}
+ if (part_prune_result_list)
+ *part_prune_result_list = my_part_prune_result_list;
+
return plan;
}
@@ -1737,17 +1797,29 @@ QueryListGetPrimaryStmt(List *stmts)
/*
* AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
- * or release them if acquire is false.
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_result_list.
+ *
+ * On return, *lockedRelids_per_stmt will contain a bitmapset for every
+ * PlannedStmt in stmt_list, containing the RT indexes of relation entries
+ * in its range table that were actually locked, or NULL if the PlannedStmt
+ * contains a utility statement.
*/
static void
-AcquireExecutorLocks(List *stmt_list, bool acquire)
+AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+ List **part_prune_result_list,
+ List **lockedRelids_per_stmt)
{
ListCell *lc1;
+ *part_prune_result_list = *lockedRelids_per_stmt = NIL;
foreach(lc1, stmt_list)
{
PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
- ListCell *lc2;
+ PartitionPruneResult *part_prune_result = NULL;
+ Bitmapset *allLockRelids;
+ Bitmapset *lockedRelids = NULL;
+ int rti;
if (plannedstmt->commandType == CMD_UTILITY)
{
@@ -1761,13 +1833,38 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
Query *query = UtilityContainsQuery(plannedstmt->utilityStmt);
if (query)
- ScanQueryForLocks(query, acquire);
+ ScanQueryForLocks(query, true);
+ *part_prune_result_list = lappend(*part_prune_result_list, NULL);
continue;
}
- foreach(lc2, plannedstmt->rtable)
+ /*
+ * Figure out the set of relations that would need to be locked
+ * before executing the plan.
+ */
+ if (plannedstmt->containsInitialPruning)
{
- RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
+ /*
+ * Obtain the set of leaf partitions to be locked.
+ *
+ * The following does initial partition pruning using the
+ * PartitionPruneInfos found in plannedstmt->partPruneInfos and
+ * finds leaf partitions that survive that pruning across all the
+ * nodes in the plan tree.
+ */
+ part_prune_result =
+ ExecutorDoInitialPruning(plannedstmt, boundParams);
+
+ allLockRelids = bms_union(plannedstmt->minLockRelids,
+ part_prune_result->scan_leafpart_rtis);
+ }
+ else
+ allLockRelids = plannedstmt->minLockRelids;
+
+ rti = -1;
+ while ((rti = bms_next_member(allLockRelids, rti)) >= 0)
+ {
+ RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
if (rte->rtekind != RTE_RELATION)
continue;
@@ -1778,10 +1875,58 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
* fail if it's been dropped entirely --- we'll just transiently
* acquire a non-conflicting lock.
*/
- if (acquire)
- LockRelationOid(rte->relid, rte->rellockmode);
- else
- UnlockRelationOid(rte->relid, rte->rellockmode);
+ LockRelationOid(rte->relid, rte->rellockmode);
+ lockedRelids = bms_add_member(lockedRelids, rti);
+ }
+
+ *part_prune_result_list = lappend(*part_prune_result_list,
+ part_prune_result);
+ *lockedRelids_per_stmt = lappend(*lockedRelids_per_stmt, lockedRelids);
+ }
+}
+
+/*
+ * ReleaseExecutorLocks
+ * Release the locks acquired by an earlier call to
+ * AcquireExecutorLocks().
+ */
+static void
+ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt)
+{
+ ListCell *lc1,
+ *lc2;
+
+ forboth(lc1, stmt_list, lc2, lockedRelids_per_stmt)
+ {
+ PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
+ Bitmapset *lockedRelids = lfirst(lc2);
+ int rti;
+
+ if (plannedstmt->commandType == CMD_UTILITY)
+ {
+ /*
+ * Ignore utility statements, except those (such as EXPLAIN) that
+ * contain a parsed-but-not-planned query. Note: it's okay to use
+ * ScanQueryForLocks, even though the query hasn't been through
+ * rule rewriting, because rewriting doesn't change the query
+ * representation.
+ */
+ Query *query = UtilityContainsQuery(plannedstmt->utilityStmt);
+
+ if (query)
+ ScanQueryForLocks(query, false);
+ continue;
+ }
+
+ rti = -1;
+ while ((rti = bms_next_member(lockedRelids, rti)) >= 0)
+ {
+ RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
+
+ Assert(rte->rtekind == RTE_RELATION);
+
+ /* See the comment in AcquireExecutorLocks(). */
+ UnlockRelationOid(rte->relid, rte->rellockmode);
}
}
}
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index d549f66d4a..1bbe6b704b 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -303,6 +303,25 @@ PortalDefineQuery(Portal portal,
portal->status = PORTAL_DEFINED;
}
+/*
+ * PortalStorePartitionPruneResults
+ * Copy the given list of PartitionPruneResults into the portal's
+ * context
+ *
+ * This allows the caller to ensure that the list exists as long as the portal
+ * does.
+ */
+void
+PortalStorePartitionPruneResults(Portal portal, List *part_prune_results)
+{
+ MemoryContext oldcxt;
+
+ AssertArg(PortalIsValid(portal));
+ oldcxt = MemoryContextSwitchTo(portal->portalContext);
+ portal->part_prune_results = copyObject(part_prune_results);
+ MemoryContextSwitchTo(oldcxt);
+}
+
/*
* PortalReleaseCachedPlan
* Release a portal's reference to its cached plan, if any.
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 9ebde089ae..e57e133f0e 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -87,7 +87,9 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
ExplainState *es, const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv);
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt,
+ PartitionPruneResult *part_prune_result,
+ IntoClause *into,
ExplainState *es, const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv,
const instr_time *planduration,
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index bf962af7af..bd8776402e 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -45,6 +45,7 @@ extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
* nparts Length of subplan_map[] and subpart_map[].
* subplan_map Subplan index by partition index, or -1.
* subpart_map Subpart index by partition index, or -1.
+ * rti_map Range table index by partition index, or 0.
* present_parts A Bitmapset of the partition indexes that we
* have subplans or subparts for.
* initial_pruning_steps List of PartitionPruneSteps used to
@@ -61,6 +62,7 @@ typedef struct PartitionedRelPruningData
int nparts;
int *subplan_map;
int *subpart_map;
+ Index *rti_map;
Bitmapset *present_parts;
List *initial_pruning_steps;
List *exec_pruning_steps;
@@ -126,5 +128,10 @@ extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
int part_prune_index,
Bitmapset **initially_valid_subplans);
extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
- bool initial_prune);
+ bool initial_prune,
+ Bitmapset **scan_leafpart_rtis);
+extern Bitmapset *ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt,
+ ParamListInfo params,
+ PartitionPruneInfo *pruneinfo,
+ Bitmapset **scan_leafpart_rtis);
#endif /* EXECPARTITION_H */
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index e79e2c001f..60d5644908 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -35,6 +35,8 @@ typedef struct QueryDesc
/* These fields are provided by CreateQueryDesc */
CmdType operation; /* CMD_SELECT, CMD_UPDATE, etc. */
PlannedStmt *plannedstmt; /* planner's output (could be utility, too) */
+ PartitionPruneResult *part_prune_result; /* ExecutorDoInitialPruning()'s
+ * output for plannedstmt */
const char *sourceText; /* source text of the query */
Snapshot snapshot; /* snapshot to use for query */
Snapshot crosscheck_snapshot; /* crosscheck for RI update/delete */
@@ -57,6 +59,7 @@ typedef struct QueryDesc
/* in pquery.c */
extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
+ PartitionPruneResult *part_prune_result,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index d68a6b9d28..5c4a282be0 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -185,6 +185,8 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
/*
* prototypes from functions in execMain.c
*/
+extern PartitionPruneResult *ExecutorDoInitialPruning(PlannedStmt *plannedstmt,
+ ParamListInfo params);
extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
extern void ExecutorRun(QueryDesc *queryDesc,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 63a89474db..12ea06c2f6 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -1001,6 +1001,33 @@ typedef struct DomainConstraintState
*/
typedef TupleTableSlot *(*ExecProcNodeMtd) (struct PlanState *pstate);
+/*----------------
+ * PartitionPruneResult
+ *
+ * The result of invoking ExecutorDoInitialPruning() on a given
+ * PlannedStmt.
+ *
+ * Contains a list of Bitmapsets of the indexes of the subplans remaining after
+ * performing initial pruning by calling ExecFindMatchingSubPlans() for every
+ * PartitionPruneInfo found in PlannedStmt.partPruneInfos. RT indexes of the
+ * leaf partitions scanned by those subplans across all PartitionPruneInfos
+ * are added into scan_leafpart_rtis.
+ *
+ * This is used by GetCachedPlan() to inform its callers of the pruning
+ * decisions made when performing AcquireExecutorLocks() on a given cached
+ * PlannedStmt, which the callers then pass on to the executor. The executor
+ * refers to this node when initializing the plan nodes which contain subplans
+ * that may have been pruned by ExecutorDoInitialPruning(), rather than
+ * redoing initial pruning.
+ */
+typedef struct PartitionPruneResult
+{
+ NodeTag type;
+
+ List *valid_subplan_offs_list;
+ Bitmapset *scan_leafpart_rtis;
+} PartitionPruneResult;
+
/* ----------------
* PlanState node
*
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index cdd6debfa0..b33d9e426d 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -212,6 +212,7 @@ extern struct Bitmapset *readBitmapset(void);
extern uintptr_t readDatum(bool typbyval);
extern bool *readBoolCols(int numCols);
extern int *readIntCols(int numCols);
+extern Index *readIndexCols(int numCols);
extern Oid *readOidCols(int numCols);
extern int16 *readAttrNumberCols(int numCols);
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index d87957ff6c..7957aeb6d7 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -125,6 +125,19 @@ typedef struct PlannerGlobal
/* List of PartitionPruneInfo contained in the plan */
List *partPruneInfos;
+ /*
+ * Do any of those PartitionPruneInfos have initial (pre-exec) pruning
+ * steps in them?
+ */
+ bool containsInitialPruning;
+
+ /*
+ * Indexes of all range table entries minus indexes of range table entries
+ * of the leaf partitions scanned by prunable subplans; see
+ * AcquireExecutorLocks()
+ */
+ Bitmapset *minLockRelids;
+
/* OIDs of relations the plan depends on */
List *relationOids;
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index f2daabb3b7..1d2c0d9bdf 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -72,8 +72,17 @@ typedef struct PlannedStmt
List *partPruneInfos; /* List of PartitionPruneInfo contained in
* the plan */
+ bool containsInitialPruning; /* Do any of those PartitionPruneInfos
+ * have initial (pre-exec) pruning
+ * steps in them? */
+
List *rtable; /* list of RangeTblEntry nodes */
+ Bitmapset *minLockRelids; /* Indexes of all range table entries minus
+ * indexes of range table entries of the leaf
+ * partitions scanned by prunable subplans;
+ * see AcquireExecutorLocks() */
+
/* rtable indexes of target relations for INSERT/UPDATE/DELETE */
List *resultRelations; /* integer list of RT indexes, or NIL */
@@ -1409,6 +1418,13 @@ typedef struct PlanRowMark
* prune_infos List of Lists containing PartitionedRelPruneInfo nodes,
* one sublist per run-time-prunable partition hierarchy
* appearing in the parent plan node's subplans.
+ *
+ * needs_init_pruning Does any of the PartitionedRelPruneInfos in
+ * prune_infos have its initial_pruning_steps set?
+ *
+ * needs_exec_pruning Does any of the PartitionedRelPruneInfos in
+ * prune_infos have its exec_pruning_steps set?
+ *
* other_subplans Indexes of any subplans that are not accounted for
* by any of the PartitionedRelPruneInfo nodes in
* "prune_infos". These subplans must not be pruned.
@@ -1419,6 +1435,8 @@ typedef struct PartitionPruneInfo
NodeTag type;
List *prune_infos;
+ bool needs_init_pruning;
+ bool needs_exec_pruning;
Bitmapset *other_subplans;
} PartitionPruneInfo;
@@ -1463,6 +1481,9 @@ typedef struct PartitionedRelPruneInfo
/* relation OID by partition index, or 0 */
Oid *relid_map pg_node_attr(array_size(nparts));
+ /* Range table index by partition index, or 0. */
+ Index *rti_map pg_node_attr(array_size(nparts));
+
/*
* initial_pruning_steps shows how to prune during executor startup (i.e.,
* without use of any PARAM_EXEC Params); it is NIL if no startup pruning
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 0499635f59..1c5bb5ece1 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -220,7 +220,8 @@ extern List *CachedPlanGetTargetList(CachedPlanSource *plansource,
extern CachedPlan *GetCachedPlan(CachedPlanSource *plansource,
ParamListInfo boundParams,
ResourceOwner owner,
- QueryEnvironment *queryEnv);
+ QueryEnvironment *queryEnv,
+ List **part_prune_result_list);
extern void ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner);
extern bool CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index aeddbdafe5..9f7727a837 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -138,6 +138,7 @@ typedef struct PortalData
QueryCompletion qc; /* command completion data for executed query */
List *stmts; /* list of PlannedStmts */
CachedPlan *cplan; /* CachedPlan, if stmts are from one */
+ List *part_prune_results; /* list of PartitionPruneResults */
ParamListInfo portalParams; /* params to pass to query */
QueryEnvironment *queryEnv; /* environment for query */
@@ -242,6 +243,8 @@ extern void PortalDefineQuery(Portal portal,
CommandTag commandTag,
List *stmts,
CachedPlan *cplan);
+extern void PortalStorePartitionPruneResults(Portal portal,
+ List *part_prune_result_list);
extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
extern void PortalCreateHoldStore(Portal portal);
extern void PortalHashTableDeleteAll(void);
--
2.35.3
* Re: generic plans and "initial" pruning
@ 2022-07-27 03:00 Amit Langote <[email protected]>
parent: Amit Langote <[email protected]>
0 siblings, 1 reply; 71+ messages in thread
From: Amit Langote @ 2022-07-27 03:00 UTC (permalink / raw)
To: Jacob Champion <[email protected]>; +Cc: Zhihong Yu <[email protected]>; David Rowley <[email protected]>; Robert Haas <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers
On Wed, Jul 13, 2022 at 4:03 PM Amit Langote <[email protected]> wrote:
> On Wed, Jul 13, 2022 at 3:40 PM Amit Langote <[email protected]> wrote:
> > Rebased over 964d01ae90c.
>
> Sorry, left some pointless hunks in there while rebasing. Fixed in
> the attached.
Needed to be rebased again, over 2d04277121f this time.
--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com
Attachments:
[application/octet-stream] v20-0001-Move-PartitioPruneInfo-out-of-plan-nodes-into-Pl.patch (17.3K, 2-v20-0001-Move-PartitioPruneInfo-out-of-plan-nodes-into-Pl.patch)
From 8de25528e8f388beffdab3d7c9905712e2f8eeef Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Fri, 27 May 2022 16:00:28 +0900
Subject: [PATCH v20 1/2] Move PartitionPruneInfo out of plan nodes into
PlannedStmt
The planner will now add a given PartitionPruneInfo to
PlannedStmt.partPruneInfos instead of directly to the
Append/MergeAppend plan node. What gets set instead in the
latter is an index field which points to the list element
of PlannedStmt.partPruneInfos containing the PartitionPruneInfo
belonging to the plan node.
A later commit will make AcquireExecutorLocks() do the initial
partition pruning to determine a minimal set of partitions to be
locked when validating a plan tree, and it will need to consult the
PartitionPruneInfos referenced therein to do so. It is better
for the PartitionPruneInfos to be directly accessible by simply
iterating over PlannedStmt.partPruneInfos than to require a walk of
the plan tree to find them.
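The indirection described above can be illustrated with a small standalone sketch. It is not taken from the patch: ToyPruneInfo, ToyStmt, ToyAppend and both helper functions are hypothetical stand-ins, and a plain array plays the role of the List that the patch uses. It only shows the idea of a plan node holding an index (or -1) into a statement-level list, so that the pruning infos can be visited without walking the plan tree.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the PostgreSQL structs. */
typedef struct ToyPruneInfo
{
    const char *label;
} ToyPruneInfo;

typedef struct ToyStmt
{
    const ToyPruneInfo *prune_infos;    /* plays the role of partPruneInfos */
    int                 n_prune_infos;
} ToyStmt;

typedef struct ToyAppend
{
    int part_prune_index;               /* -1 means no run-time pruning */
} ToyAppend;

/* Resolve a node's index to its pruning info, or NULL if it has none. */
static const ToyPruneInfo *
toy_lookup_prune_info(const ToyStmt *stmt, const ToyAppend *node)
{
    if (node->part_prune_index < 0)
        return NULL;
    assert(node->part_prune_index < stmt->n_prune_infos);
    return &stmt->prune_infos[node->part_prune_index];
}

/* Visit every pruning info directly, without walking any plan tree. */
static int
toy_count_prune_infos(const ToyStmt *stmt)
{
    int count = 0;

    for (int i = 0; i < stmt->n_prune_infos; i++)
        if (stmt->prune_infos[i].label != NULL)
            count++;
    return count;
}
```

In the patch itself the lookup is done with list_nth() on estate->es_part_prune_infos inside ExecInitPartitionPruning(); the sketch only mirrors that shape.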
---
src/backend/executor/execMain.c | 1 +
src/backend/executor/execParallel.c | 1 +
src/backend/executor/execPartition.c | 4 +-
src/backend/executor/execUtils.c | 2 +
src/backend/executor/nodeAppend.c | 4 +-
src/backend/executor/nodeMergeAppend.c | 4 +-
src/backend/optimizer/plan/createplan.c | 24 ++++-----
src/backend/optimizer/plan/planner.c | 1 +
src/backend/optimizer/plan/setrefs.c | 65 +++++++++++++------------
src/backend/partitioning/partprune.c | 18 ++++---
src/include/executor/execPartition.h | 3 +-
src/include/nodes/execnodes.h | 2 +
src/include/nodes/pathnodes.h | 6 +++
src/include/nodes/plannodes.h | 11 +++--
src/include/partitioning/partprune.h | 8 +--
15 files changed, 92 insertions(+), 62 deletions(-)
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index ef2fd46092..72fc273524 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -825,6 +825,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
ExecInitRangeTable(estate, rangeTable);
estate->es_plannedstmt = plannedstmt;
+ estate->es_part_prune_infos = plannedstmt->partPruneInfos;
/*
* Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index f1fd7f7e8b..f73b8c2607 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -183,6 +183,7 @@ ExecSerializePlan(Plan *plan, EState *estate)
pstmt->dependsOnRole = false;
pstmt->parallelModeNeeded = false;
pstmt->planTree = plan;
+ pstmt->partPruneInfos = estate->es_part_prune_infos;
pstmt->rtable = estate->es_range_table;
pstmt->resultRelations = NIL;
pstmt->appendRelations = NIL;
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index e03ea27299..b55cdd2580 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -1638,11 +1638,13 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
PartitionPruneState *
ExecInitPartitionPruning(PlanState *planstate,
int n_total_subplans,
- PartitionPruneInfo *pruneinfo,
+ int part_prune_index,
Bitmapset **initially_valid_subplans)
{
PartitionPruneState *prunestate;
EState *estate = planstate->state;
+ PartitionPruneInfo *pruneinfo = list_nth(estate->es_part_prune_infos,
+ part_prune_index);
/* We may need an expression context to evaluate partition exprs */
ExecAssignExprContext(estate, planstate);
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 9df1f81ea8..f9c7976ff2 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -119,6 +119,8 @@ CreateExecutorState(void)
estate->es_relations = NULL;
estate->es_rowmarks = NULL;
estate->es_plannedstmt = NULL;
+ estate->es_part_prune_infos = NIL;
+ estate->es_part_prune_result = NULL;
estate->es_junkFilter = NULL;
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 357e10a1d7..c6f86a6510 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -134,7 +134,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
appendstate->as_begun = false;
/* If run-time partition pruning is enabled, then set that up now */
- if (node->part_prune_info != NULL)
+ if (node->part_prune_index >= 0)
{
PartitionPruneState *prunestate;
@@ -145,7 +145,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
*/
prunestate = ExecInitPartitionPruning(&appendstate->ps,
list_length(node->appendplans),
- node->part_prune_info,
+ node->part_prune_index,
&validsubplans);
appendstate->as_prune_state = prunestate;
nplans = bms_num_members(validsubplans);
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index c5c62fa5c7..8d35860c30 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -82,7 +82,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
mergestate->ps.ExecProcNode = ExecMergeAppend;
/* If run-time partition pruning is enabled, then set that up now */
- if (node->part_prune_info != NULL)
+ if (node->part_prune_index >= 0)
{
PartitionPruneState *prunestate;
@@ -93,7 +93,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
*/
prunestate = ExecInitPartitionPruning(&mergestate->ps,
list_length(node->mergeplans),
- node->part_prune_info,
+ node->part_prune_index,
&validsubplans);
mergestate->ms_prune_state = prunestate;
nplans = bms_num_members(validsubplans);
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index e37f2933eb..fd8ab4a167 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -1203,7 +1203,6 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
ListCell *subpaths;
int nasyncplans = 0;
RelOptInfo *rel = best_path->path.parent;
- PartitionPruneInfo *partpruneinfo = NULL;
int nodenumsortkeys = 0;
AttrNumber *nodeSortColIdx = NULL;
Oid *nodeSortOperators = NULL;
@@ -1354,6 +1353,9 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
subplans = lappend(subplans, subplan);
}
+ /* Set below if we find quals that we can use to run-time prune */
+ plan->part_prune_index = -1;
+
/*
* If any quals exist, they may be useful to perform further partition
* pruning during execution. Gather information needed by the executor to
@@ -1377,16 +1379,14 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
}
if (prunequal != NIL)
- partpruneinfo =
- make_partition_pruneinfo(root, rel,
- best_path->subpaths,
- prunequal);
+ plan->part_prune_index = make_partition_pruneinfo(root, rel,
+ best_path->subpaths,
+ prunequal);
}
plan->appendplans = subplans;
plan->nasyncplans = nasyncplans;
plan->first_partial_plan = best_path->first_partial_path;
- plan->part_prune_info = partpruneinfo;
copy_generic_path_info(&plan->plan, (Path *) best_path);
@@ -1425,7 +1425,6 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
List *subplans = NIL;
ListCell *subpaths;
RelOptInfo *rel = best_path->path.parent;
- PartitionPruneInfo *partpruneinfo = NULL;
/*
* We don't have the actual creation of the MergeAppend node split out
@@ -1518,6 +1517,9 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
subplans = lappend(subplans, subplan);
}
+ /* Set below if we find quals that we can use to run-time prune */
+ node->part_prune_index = -1;
+
/*
* If any quals exist, they may be useful to perform further partition
* pruning during execution. Gather information needed by the executor to
@@ -1541,13 +1543,13 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
}
if (prunequal != NIL)
- partpruneinfo = make_partition_pruneinfo(root, rel,
- best_path->subpaths,
- prunequal);
+ node->part_prune_index = make_partition_pruneinfo(root, rel,
+ best_path->subpaths,
+ prunequal);
}
node->mergeplans = subplans;
- node->part_prune_info = partpruneinfo;
+
/*
* If prepare_sort_from_pathkeys added sort columns, but we were told to
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 06ad856eac..b11249ed8f 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -518,6 +518,7 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
result->dependsOnRole = glob->dependsOnRole;
result->parallelModeNeeded = glob->parallelModeNeeded;
result->planTree = top_plan;
+ result->partPruneInfos = glob->partPruneInfos;
result->rtable = glob->finalrtable;
result->resultRelations = glob->resultRelations;
result->appendRelations = glob->appendRelations;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 1cb0abdbc1..720f20f563 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -348,6 +348,29 @@ set_plan_references(PlannerInfo *root, Plan *plan)
}
}
+ /* Also fix up the information in PartitionPruneInfos. */
+ foreach (lc, root->partPruneInfos)
+ {
+ PartitionPruneInfo *pruneinfo = lfirst(lc);
+ ListCell *l;
+
+ foreach(l, pruneinfo->prune_infos)
+ {
+ List *prune_infos = lfirst(l);
+ ListCell *l2;
+
+ foreach(l2, prune_infos)
+ {
+ PartitionedRelPruneInfo *pinfo = lfirst(l2);
+
+ /* RT index of the table to which the pinfo belongs. */
+ pinfo->rtindex += rtoffset;
+ }
+ }
+
+ glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
+ }
+
return result;
}
@@ -1658,21 +1681,12 @@ set_append_references(PlannerInfo *root,
aplan->apprelids = offset_relid_set(aplan->apprelids, rtoffset);
- if (aplan->part_prune_info)
- {
- foreach(l, aplan->part_prune_info->prune_infos)
- {
- List *prune_infos = lfirst(l);
- ListCell *l2;
-
- foreach(l2, prune_infos)
- {
- PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
- pinfo->rtindex += rtoffset;
- }
- }
- }
+ /*
+ * PartitionPruneInfos will be added to a list in PlannerGlobal, so update
+ * the index.
+ */
+ if (aplan->part_prune_index >= 0)
+ aplan->part_prune_index += list_length(root->glob->partPruneInfos);
/* We don't need to recurse to lefttree or righttree ... */
Assert(aplan->plan.lefttree == NULL);
@@ -1734,21 +1748,12 @@ set_mergeappend_references(PlannerInfo *root,
mplan->apprelids = offset_relid_set(mplan->apprelids, rtoffset);
- if (mplan->part_prune_info)
- {
- foreach(l, mplan->part_prune_info->prune_infos)
- {
- List *prune_infos = lfirst(l);
- ListCell *l2;
-
- foreach(l2, prune_infos)
- {
- PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
- pinfo->rtindex += rtoffset;
- }
- }
- }
+ /*
+ * PartitionPruneInfos will be added to a list in PlannerGlobal, so update
+ * the index.
+ */
+ if (mplan->part_prune_index >= 0)
+ mplan->part_prune_index += list_length(root->glob->partPruneInfos);
/* We don't need to recurse to lefttree or righttree ... */
Assert(mplan->plan.lefttree == NULL);
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 9d3c05aed3..d77f7d3aef 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -209,16 +209,20 @@ static void partkey_datum_from_expr(PartitionPruneContext *context,
/*
* make_partition_pruneinfo
- * Builds a PartitionPruneInfo which can be used in the executor to allow
- * additional partition pruning to take place. Returns NULL when
- * partition pruning would be useless.
+ * Checks if the given set of quals can be used to build pruning steps
+ * that the executor can use to prune away unneeded partitions. If
+ * suitable quals are found then a PartitionPruneInfo is built and tagged
+ * onto the PlannerInfo's partPruneInfos list.
+ *
+ * The return value is the 0-based index of the item added to the
+ * partPruneInfos list or -1 if nothing was added.
*
* 'parentrel' is the RelOptInfo for an appendrel, and 'subpaths' is the list
* of scan paths for its child rels.
* 'prunequal' is a list of potential pruning quals (i.e., restriction
* clauses that are applicable to the appendrel).
*/
-PartitionPruneInfo *
+int
make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
List *subpaths,
List *prunequal)
@@ -332,7 +336,7 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
* quals, then we can just not bother with run-time pruning.
*/
if (prunerelinfos == NIL)
- return NULL;
+ return -1;
/* Else build the result data structure */
pruneinfo = makeNode(PartitionPruneInfo);
@@ -358,7 +362,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
else
pruneinfo->other_subplans = NULL;
- return pruneinfo;
+ root->partPruneInfos = lappend(root->partPruneInfos, pruneinfo);
+
+ return list_length(root->partPruneInfos) - 1;
}
/*
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 708435e952..bf962af7af 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -123,9 +123,8 @@ typedef struct PartitionPruneState
extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
int n_total_subplans,
- PartitionPruneInfo *pruneinfo,
+ int part_prune_index,
Bitmapset **initially_valid_subplans);
extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
bool initial_prune);
-
#endif /* EXECPARTITION_H */
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 01b1727fc0..63a89474db 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -611,6 +611,8 @@ typedef struct EState
struct ExecRowMark **es_rowmarks; /* Array of per-range-table-entry
* ExecRowMarks, or NULL if none */
PlannedStmt *es_plannedstmt; /* link to top of plan tree */
+ List *es_part_prune_infos; /* PlannedStmt.partPruneInfos */
+ struct PartitionPruneResult *es_part_prune_result; /* QueryDesc.part_prune_result */
const char *es_sourceText; /* Source text from QueryDesc */
JunkFilter *es_junkFilter; /* top-level junk filter, if any */
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index e2081db4ed..a4e6b4db92 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -122,6 +122,9 @@ typedef struct PlannerGlobal
/* "flat" list of AppendRelInfos */
List *appendRelations;
+ /* List of PartitionPruneInfo contained in the plan */
+ List *partPruneInfos;
+
/* OIDs of relations the plan depends on */
List *relationOids;
@@ -488,6 +491,9 @@ struct PlannerInfo
/* Does this query modify any partition key columns? */
bool partColsUpdated;
+
+ /* PartitionPruneInfos added in this query's plan. */
+ List *partPruneInfos;
};
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index dca2a21e7a..f2daabb3b7 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -69,6 +69,9 @@ typedef struct PlannedStmt
struct Plan *planTree; /* tree of Plan nodes */
+ List *partPruneInfos; /* List of PartitionPruneInfo contained in
+ * the plan */
+
List *rtable; /* list of RangeTblEntry nodes */
/* rtable indexes of target relations for INSERT/UPDATE/DELETE */
@@ -269,8 +272,8 @@ typedef struct Append
*/
int first_partial_plan;
- /* Info for run-time subplan pruning; NULL if we're not doing that */
- struct PartitionPruneInfo *part_prune_info;
+ /* Index to PlannerInfo.partPruneInfos or -1 if no run-time pruning */
+ int part_prune_index;
} Append;
/* ----------------
@@ -304,8 +307,8 @@ typedef struct MergeAppend
/* NULLS FIRST/LAST directions */
bool *nullsFirst pg_node_attr(array_size(numCols));
- /* Info for run-time subplan pruning; NULL if we're not doing that */
- struct PartitionPruneInfo *part_prune_info;
+ /* Index to PlannerInfo.partPruneInfos or -1 if no run-time pruning */
+ int part_prune_index;
} MergeAppend;
/* ----------------
diff --git a/src/include/partitioning/partprune.h b/src/include/partitioning/partprune.h
index 90684efa25..ebf0dcff8c 100644
--- a/src/include/partitioning/partprune.h
+++ b/src/include/partitioning/partprune.h
@@ -70,10 +70,10 @@ typedef struct PartitionPruneContext
#define PruneCxtStateIdx(partnatts, step_id, keyno) \
((partnatts) * (step_id) + (keyno))
-extern PartitionPruneInfo *make_partition_pruneinfo(struct PlannerInfo *root,
- struct RelOptInfo *parentrel,
- List *subpaths,
- List *prunequal);
+extern int make_partition_pruneinfo(struct PlannerInfo *root,
+ struct RelOptInfo *parentrel,
+ List *subpaths,
+ List *prunequal);
extern Bitmapset *prune_append_rel_partitions(struct RelOptInfo *rel);
extern Bitmapset *get_matching_partitions(PartitionPruneContext *context,
List *pruning_steps);
--
2.35.3
[application/octet-stream] v20-0002-Optimize-AcquireExecutorLocks-by-locking-only-un.patch (80.5K, 3-v20-0002-Optimize-AcquireExecutorLocks-by-locking-only-un.patch)
From 7a1454c6a1ecde5c871bec5a4d646da4e41a62c3 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Wed, 22 Dec 2021 16:55:17 +0900
Subject: [PATCH v20 2/2] Optimize AcquireExecutorLocks() by locking only
unpruned partitions
This commit teaches AcquireExecutorLocks() to perform initial
partition pruning to notionally eliminate the subnodes contained in a
generic cached plan that need not be initialized during the actual
execution of the plan, and to skip locking the partitions scanned by
those subnodes.
The result of performing initial partition pruning this way, before the
actual execution has started, is passed to the actual execution via a
PartitionPruneResult, which the callers of the executor that used
plancache.c to get the plan supply along with the PlannedStmt. It is NULL
in the cases in which the plan is obtained by calling the planner
directly or if the plan obtained by plancache.c is not a generic one.
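As a rough illustration of the locking idea, and only as a sketch under assumptions, the set of relations to lock can be pictured as the always-locked set (minLockRelids in the patch) plus the RT indexes of the leaf partitions that survive initial pruning (scan_leafpart_rtis), found through a per-partition map such as rti_map. The code below is not from the patch: RtiSet is a toy 64-bit bitmap standing in for Bitmapset, and rti_set_add(), rti_set_member() and relids_to_lock() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Toy bitmap over RT indexes 1..63; a stand-in for Bitmapset. */
typedef uint64_t RtiSet;

static RtiSet
rti_set_add(RtiSet set, unsigned rti)
{
    assert(rti > 0 && rti < 64);
    return set | ((RtiSet) 1 << rti);
}

static int
rti_set_member(RtiSet set, unsigned rti)
{
    return rti < 64 && ((set >> rti) & 1) != 0;
}

/*
 * Hypothetical helper: start from the relations that must always be
 * locked, then add the RT index of each partition whose bit is set in
 * surviving_parts.  An rti_map entry of 0 means "no RT index" and is
 * skipped, mirroring the "or 0" convention in the patch's comments.
 */
static RtiSet
relids_to_lock(RtiSet min_lock_relids, const unsigned *rti_map,
               int nparts, uint64_t surviving_parts)
{
    RtiSet result = min_lock_relids;

    assert(nparts >= 0 && nparts < 64);
    for (int i = 0; i < nparts; i++)
    {
        if (((surviving_parts >> i) & 1) != 0 && rti_map[i] != 0)
            result = rti_set_add(result, rti_map[i]);
    }
    return result;
}
```

With three partitions mapped to RT indexes 4, 5 and 6, and only the first and third surviving pruning, RT index 5 stays out of the lock set, which is the saving this commit is after.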
---
src/backend/commands/copyto.c | 2 +-
src/backend/commands/createas.c | 2 +-
src/backend/commands/explain.c | 7 +-
src/backend/commands/extension.c | 2 +-
src/backend/commands/matview.c | 2 +-
src/backend/commands/prepare.c | 26 ++-
src/backend/executor/README | 32 ++++
src/backend/executor/execMain.c | 53 ++++++
src/backend/executor/execParallel.c | 27 ++-
src/backend/executor/execPartition.c | 234 +++++++++++++++++++++----
src/backend/executor/functions.c | 2 +-
src/backend/executor/nodeAppend.c | 11 +-
src/backend/executor/nodeMergeAppend.c | 5 +-
src/backend/executor/spi.c | 27 ++-
src/backend/nodes/readfuncs.c | 8 +-
src/backend/optimizer/plan/planner.c | 2 +
src/backend/optimizer/plan/setrefs.c | 46 +++++
src/backend/partitioning/partprune.c | 41 ++++-
src/backend/tcop/postgres.c | 8 +-
src/backend/tcop/pquery.c | 28 ++-
src/backend/utils/cache/plancache.c | 187 +++++++++++++++++---
src/backend/utils/mmgr/portalmem.c | 19 ++
src/include/commands/explain.h | 4 +-
src/include/executor/execPartition.h | 9 +-
src/include/executor/execdesc.h | 3 +
src/include/executor/executor.h | 2 +
src/include/nodes/execnodes.h | 27 +++
src/include/nodes/nodes.h | 1 +
src/include/nodes/pathnodes.h | 13 ++
src/include/nodes/plannodes.h | 21 +++
src/include/utils/plancache.h | 3 +-
src/include/utils/portal.h | 3 +
32 files changed, 759 insertions(+), 98 deletions(-)
diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index fca29a9a10..d839517693 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -541,7 +541,7 @@ BeginCopyTo(ParseState *pstate,
((DR_copy *) dest)->cstate = cstate;
/* Create a QueryDesc requesting no output */
- cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ cstate->queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
GetActiveSnapshot(),
InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 9abbb6b555..f6607f2454 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -325,7 +325,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index e29c2ae206..e41b13a3ea 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -407,7 +407,7 @@ ExplainOneQuery(Query *query, int cursorOptions,
}
/* run it (if needed) and produce output */
- ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+ ExplainOnePlan(plan, NULL, into, es, queryString, params, queryEnv,
&planduration, (es->buffers ? &bufusage : NULL));
}
}
@@ -515,7 +515,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
* to call it.
*/
void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, PartitionPruneResult *part_prune_result,
+ IntoClause *into, ExplainState *es,
const char *queryString, ParamListInfo params,
QueryEnvironment *queryEnv, const instr_time *planduration,
const BufferUsage *bufusage)
@@ -563,7 +564,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
dest = None_Receiver;
/* Create a QueryDesc for the query */
- queryDesc = CreateQueryDesc(plannedstmt, queryString,
+ queryDesc = CreateQueryDesc(plannedstmt, part_prune_result, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, instrument_option);
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index 6b6720c690..374c0ff807 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -776,7 +776,7 @@ execute_sql_string(const char *sql)
{
QueryDesc *qdesc;
- qdesc = CreateQueryDesc(stmt,
+ qdesc = CreateQueryDesc(stmt, NULL,
sql,
GetActiveSnapshot(), NULL,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 9ac0383459..b0ed96e56c 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -408,7 +408,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, queryString,
+ queryDesc = CreateQueryDesc(plan, NULL, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 579825c159..b6285958bc 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -155,6 +155,7 @@ ExecuteQuery(ParseState *pstate,
PreparedStatement *entry;
CachedPlan *cplan;
List *plan_list;
+ List *part_prune_result_list;
ParamListInfo paramLI = NULL;
EState *estate = NULL;
Portal portal;
@@ -193,7 +194,10 @@ ExecuteQuery(ParseState *pstate,
entry->plansource->query_string);
/* Replan if needed, and increment plan refcount for portal */
- cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL);
+ cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL,
+ &part_prune_result_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_result_list));
plan_list = cplan->stmt_list;
/*
@@ -207,6 +211,9 @@ ExecuteQuery(ParseState *pstate,
plan_list,
cplan);
+ /* Copy PartitionPruneResults into the portal's context. */
+ PortalStorePartitionPruneResults(portal, part_prune_result_list);
+
/*
* For CREATE TABLE ... AS EXECUTE, we must verify that the prepared
* statement is one that produces tuples. Currently we insist that it be
@@ -576,7 +583,9 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
const char *query_string;
CachedPlan *cplan;
List *plan_list;
- ListCell *p;
+ List *part_prune_result_list;
+ ListCell *p,
+ *pp;
ParamListInfo paramLI = NULL;
EState *estate = NULL;
instr_time planstart;
@@ -619,7 +628,10 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
/* Replan if needed, and acquire a transient refcount */
cplan = GetCachedPlan(entry->plansource, paramLI,
- CurrentResourceOwner, queryEnv);
+ CurrentResourceOwner, queryEnv,
+ &part_prune_result_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_result_list));
INSTR_TIME_SET_CURRENT(planduration);
INSTR_TIME_SUBTRACT(planduration, planstart);
@@ -634,13 +646,15 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
plan_list = cplan->stmt_list;
/* Explain each query */
- foreach(p, plan_list)
+ forboth(p, plan_list, pp, part_prune_result_list)
{
PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
+ PartitionPruneResult *part_prune_result = lfirst_node(PartitionPruneResult, pp);
if (pstmt->commandType != CMD_UTILITY)
- ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv,
- &planduration, (es->buffers ? &bufusage : NULL));
+ ExplainOnePlan(pstmt, part_prune_result, into, es, query_string,
+ paramLI, queryEnv, &planduration,
+ (es->buffers ? &bufusage : NULL));
else
ExplainOneUtility(pstmt->utilityStmt, into, es, query_string,
paramLI, queryEnv);
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 0b5183fc4a..953a476ea5 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -65,6 +65,34 @@ found there. This currently only occurs for Append and MergeAppend nodes. In
this case the non-required subplans are ignored and the executor state's
subnode array will become out of sequence to the plan's subplan list.
+So-called execution time pruning may also occur before the execution has
+actually started. One case where that occurs is when a cached generic
+plan is being validated for execution by plancache.c: GetCachedPlan(), which
+works by locking all the relations that will be scanned by that plan. If the
+generic plan contains nodes that can perform execution time partition pruning
+(that is, contain a PartitionPruneInfo), a subset of pruning steps contained
+in a given node's PartitionPruneInfo that do not depend on the execution
+actually having started (called "initial" pruning steps) are performed at this
+point to figure out the minimal set of child subplans that satisfy those
+pruning steps. AcquireExecutorLocks() looking at a given plan tree will then
+lock only the relations scanned by the child subplans that survived such
+pruning, along with those present in PlannedStmt.minLockRelids. Note that the
+subplans are only notionally pruned in that they are not removed from the plan
+tree as such.
+
+To prevent the executor and any third party execution code that can look at
+the plan tree from trying to execute the subplans that were pruned as
+described above, the result of pruning is passed to the executor as a
+PartitionPruneResult node via the QueryDesc. It consists of the set of
+indexes of surviving subplans in their respective parent plan node's list of
+child subplans, saved as a list of bitmapsets, with one element for every
+parent plan node whose PartitionPruneInfo is present in
+PlannedStmt.partPruneInfos. In other words, the executor should not
+re-evaluate the set of initially valid subplans by redoing the initial pruning
+if it was already done by AcquireExecutorLocks(), because the re-evaluation may
+very well end up resulting in a different set of subplans, containing some
+whose relations were not locked by AcquireExecutorLocks().
+
Each Plan node may have expression trees associated with it, to represent
its target list, qualification conditions, etc. These trees are also
read-only to the executor, but the executor state for expression evaluation
@@ -286,6 +314,10 @@ Query Processing Control Flow
This is a sketch of control flow for full query processing:
+ [ ExecutorDoInitialPruning ] --- an optional step to perform initial
+ partition pruning on the plan tree, the result of which is passed
+ to the executor via QueryDesc
+
CreateQueryDesc
ExecutorStart
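
The rule stated in the README text above (reuse the pruning result computed while taking locks, rather than redoing initial pruning in the executor) can be illustrated with a small standalone sketch. Everything here is a hypothetical stand-in and not the patch's code: ToyPruneResult models PartitionPruneResult, a uint64_t bitmask models a Bitmapset of subplan offsets (so at most 64 subplans), and prune_now models running the initial pruning steps.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for a Bitmapset: bit i set means subplan i survives. */
typedef uint64_t SubplanSet;

/* Toy stand-in for PartitionPruneResult: one set per pruning-capable node. */
typedef struct ToyPruneResult
{
	size_t		nsets;
	const SubplanSet *valid_subplan_sets;
} ToyPruneResult;

/* Set containing subplans 0 .. n-1. */
SubplanSet
toy_all_subplans(unsigned n)
{
	assert(n > 0 && n <= 64);
	return (n == 64) ? UINT64_MAX : ((UINT64_C(1) << n) - 1);
}

/*
 * Pick the initially valid subplans for the node at part_prune_index:
 * 1. if a result was handed in, trust it and do not prune again;
 * 2. else, if the node has initial pruning steps, run them now;
 * 3. else, every subplan must be initialized.
 */
SubplanSet
toy_initially_valid_subplans(const ToyPruneResult *result,
							 size_t part_prune_index,
							 unsigned n_total_subplans,
							 SubplanSet (*prune_now) (void))
{
	if (result != NULL)
	{
		assert(part_prune_index < result->nsets);
		return result->valid_subplan_sets[part_prune_index];
	}
	if (prune_now != NULL)
		return prune_now();
	return toy_all_subplans(n_total_subplans);
}
```

The first branch is the important one in this model: once a result exists, recomputing could select subplans whose relations were never locked, so the stored set is used as is.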
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 72fc273524..45824624f8 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -49,6 +49,7 @@
#include "commands/matview.h"
#include "commands/trigger.h"
#include "executor/execdebug.h"
+#include "executor/execPartition.h"
#include "executor/nodeSubplan.h"
#include "foreign/fdwapi.h"
#include "jit/jit.h"
@@ -104,6 +105,56 @@ static void EvalPlanQualStart(EPQState *epqstate, Plan *planTree);
/* end of local decls */
+/* ----------------------------------------------------------------
+ * ExecutorDoInitialPruning
+ *
+ * For each plan tree node that has been assigned a PartitionPruneInfo,
+ * this performs initial partition pruning using the information contained
+ * therein to determine the set of child subplans that satisfy the initial
+ * pruning steps, to be returned as a bitmapset of their indexes in the
+ * node's list of child subplans (for example, an Append's appendplans).
+ *
+ * Return value is a PartitionPruneResult node that contains a list of those
+ * bitmapsets, with one element for every PartitionPruneInfo, and a bitmapset
+ * of the RT indexes of all the leaf partitions scanned by those chosen
+ * subplans. Note that the latter is shared across all PartitionPruneInfos.
+ *
+ * The executor must see exactly the same set of subplans as valid for
+ * execution when doing ExecInitNode() on the plan nodes whose
+ * PartitionPruneInfos are processed here. So, it must get the set from the
+ * aforementioned PartitionPruneResult, instead of computing it all over
+ * again by redoing the initial pruning. It's the caller's job to pass the
+ * PartitionPruneResult to the executor.
+ *
+ * Note: Partitioned tables mentioned in PartitionedRelPruneInfo nodes that
+ * drive the pruning will be locked before doing the pruning.
+ * ----------------------------------------------------------------
+ */
+PartitionPruneResult *
+ExecutorDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params)
+{
+ PartitionPruneResult *result;
+ ListCell *lc;
+
+ /* Only get here if there is any pruning to do. */
+ Assert(plannedstmt->containsInitialPruning);
+
+ result = makeNode(PartitionPruneResult);
+ foreach(lc, plannedstmt->partPruneInfos)
+ {
+ PartitionPruneInfo *pruneinfo = lfirst(lc);
+ Bitmapset *valid_subplan_offs;
+
+ valid_subplan_offs =
+ ExecPartitionDoInitialPruning(plannedstmt, params, pruneinfo,
+ &result->scan_leafpart_rtis);
+ result->valid_subplan_offs_list =
+ lappend(result->valid_subplan_offs_list,
+ valid_subplan_offs);
+ }
+
+ return result;
+}
/* ----------------------------------------------------------------
* ExecutorStart
@@ -806,6 +857,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
{
CmdType operation = queryDesc->operation;
PlannedStmt *plannedstmt = queryDesc->plannedstmt;
+ PartitionPruneResult *part_prune_result = queryDesc->part_prune_result;
Plan *plan = plannedstmt->planTree;
List *rangeTable = plannedstmt->rtable;
EState *estate = queryDesc->estate;
@@ -826,6 +878,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
estate->es_plannedstmt = plannedstmt;
estate->es_part_prune_infos = plannedstmt->partPruneInfos;
+ estate->es_part_prune_result = part_prune_result;
/*
* Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index f73b8c2607..7e6dab5623 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -66,6 +66,7 @@
#define PARALLEL_KEY_QUERY_TEXT UINT64CONST(0xE000000000000008)
#define PARALLEL_KEY_JIT_INSTRUMENTATION UINT64CONST(0xE000000000000009)
#define PARALLEL_KEY_WAL_USAGE UINT64CONST(0xE00000000000000A)
+#define PARALLEL_KEY_PARTITIONPRUNERESULT UINT64CONST(0xE00000000000000B)
#define PARALLEL_TUPLE_QUEUE_SIZE 65536
@@ -182,6 +183,7 @@ ExecSerializePlan(Plan *plan, EState *estate)
pstmt->transientPlan = false;
pstmt->dependsOnRole = false;
pstmt->parallelModeNeeded = false;
+ pstmt->containsInitialPruning = false;
pstmt->planTree = plan;
pstmt->partPruneInfos = estate->es_part_prune_infos;
pstmt->rtable = estate->es_range_table;
@@ -597,12 +599,15 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
FixedParallelExecutorState *fpes;
char *pstmt_data;
char *pstmt_space;
+ char *part_prune_result_data;
+ char *part_prune_result_space;
char *paramlistinfo_space;
BufferUsage *bufusage_space;
WalUsage *walusage_space;
SharedExecutorInstrumentation *instrumentation = NULL;
SharedJitInstrumentation *jit_instrumentation = NULL;
int pstmt_len;
+ int part_prune_result_len;
int paramlistinfo_len;
int instrumentation_len = 0;
int jit_instrumentation_len = 0;
@@ -631,6 +636,7 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
/* Fix up and serialize plan to be sent to workers. */
pstmt_data = ExecSerializePlan(planstate->plan, estate);
+ part_prune_result_data = nodeToString(estate->es_part_prune_result);
/* Create a parallel context. */
pcxt = CreateParallelContext("postgres", "ParallelQueryMain", nworkers);
@@ -657,6 +663,11 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
shm_toc_estimate_chunk(&pcxt->estimator, pstmt_len);
shm_toc_estimate_keys(&pcxt->estimator, 1);
+ /* Estimate space for serialized PartitionPruneResult. */
+ part_prune_result_len = strlen(part_prune_result_data) + 1;
+ shm_toc_estimate_chunk(&pcxt->estimator, part_prune_result_len);
+ shm_toc_estimate_keys(&pcxt->estimator, 1);
+
/* Estimate space for serialized ParamListInfo. */
paramlistinfo_len = EstimateParamListSpace(estate->es_param_list_info);
shm_toc_estimate_chunk(&pcxt->estimator, paramlistinfo_len);
@@ -751,6 +762,12 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
memcpy(pstmt_space, pstmt_data, pstmt_len);
shm_toc_insert(pcxt->toc, PARALLEL_KEY_PLANNEDSTMT, pstmt_space);
+ /* Store serialized PartitionPruneResult */
+ part_prune_result_space = shm_toc_allocate(pcxt->toc, part_prune_result_len);
+ memcpy(part_prune_result_space, part_prune_result_data, part_prune_result_len);
+ shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARTITIONPRUNERESULT,
+ part_prune_result_space);
+
/* Store serialized ParamListInfo. */
paramlistinfo_space = shm_toc_allocate(pcxt->toc, paramlistinfo_len);
shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARAMLISTINFO, paramlistinfo_space);
@@ -1232,8 +1249,10 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
int instrument_options)
{
char *pstmtspace;
+ char *part_prune_result_space;
char *paramspace;
PlannedStmt *pstmt;
+ PartitionPruneResult *part_prune_result;
ParamListInfo paramLI;
char *queryString;
@@ -1244,12 +1263,18 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
pstmtspace = shm_toc_lookup(toc, PARALLEL_KEY_PLANNEDSTMT, false);
pstmt = (PlannedStmt *) stringToNode(pstmtspace);
+ /* Reconstruct leader-supplied PartitionPruneResult. */
+ part_prune_result_space =
+ shm_toc_lookup(toc, PARALLEL_KEY_PARTITIONPRUNERESULT, false);
+ part_prune_result = (PartitionPruneResult *)
+ stringToNode(part_prune_result_space);
+
/* Reconstruct ParamListInfo. */
paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
paramLI = RestoreParamList(&paramspace);
/* Create a QueryDesc for the query. */
- return CreateQueryDesc(pstmt,
+ return CreateQueryDesc(pstmt, part_prune_result,
queryString,
GetActiveSnapshot(), InvalidSnapshot,
receiver, paramLI, NULL, instrument_options);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index b55cdd2580..24e6f6e988 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -25,6 +25,7 @@
#include "mb/pg_wchar.h"
#include "miscadmin.h"
#include "nodes/makefuncs.h"
+#include "parser/parsetree.h"
#include "partitioning/partbounds.h"
#include "partitioning/partdesc.h"
#include "partitioning/partprune.h"
@@ -185,7 +186,11 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
static PartitionPruneState *CreatePartitionPruneState(PlanState *planstate,
- PartitionPruneInfo *pruneinfo);
+ PartitionPruneInfo *pruneinfo,
+ bool consider_initial_steps,
+ bool consider_exec_steps,
+ List *rtable, ExprContext *econtext,
+ PartitionDirectory partdir);
static void InitPartitionPruneContext(PartitionPruneContext *context,
List *pruning_steps,
PartitionDesc partdesc,
@@ -198,7 +203,8 @@ static void PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
- Bitmapset **validsubplans);
+ Bitmapset **validsubplans,
+ Bitmapset **scan_leafpart_rtis);
/*
@@ -1593,8 +1599,10 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
* considered to be a stable expression, it can change value from one plan
* node scan to the next during query execution. Stable comparison
* expressions that don't involve such Params allow partition pruning to be
- * done once during executor startup. Expressions that do involve such Params
- * require us to prune separately for each scan of the parent plan node.
+ * done once during executor startup or during ExecutorDoInitialPruning() that
+ * runs as part of performing AcquireExecutorLocks() on a given plan tree.
+ * Expressions that do involve such Params require us to prune separately for
+ * each scan of the parent plan node.
*
* Note that pruning away unneeded subplans during executor startup has the
* added benefit of not having to initialize the unneeded subplans at all.
@@ -1611,6 +1619,13 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
* account for initial pruning possibly having eliminated some of the
* subplans.
*
+ * ExecPartitionDoInitialPruning:
+ * Do initial pruning with the information contained in a given
+ * PartitionPruneInfo to determine the minimal set of child subplans
+ * of the parent plan node (the one to which the PartitionPruneInfo
+ * belongs) that need to be executed, and also the set of RT indexes of
+ * the leaf partitions that will be scanned by those subplans.
+ *
* ExecFindMatchingSubPlans:
* Returns indexes of matching subplans after evaluating the expressions
* that are safe to evaluate at a given point. This function is first
@@ -1628,8 +1643,9 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
*
* On return, *initially_valid_subplans is assigned the set of indexes of
* child subplans that must be initialized along with the parent plan node.
- * Initial pruning is performed here if needed and in that case only the
- * surviving subplans' indexes are added.
+ * Initial pruning is performed here if needed (unless it has already been done
+ * by ExecutorDoInitialPruning()), and in that case only the surviving
+ * subplans' indexes are added.
*
* If subplans are indeed pruned, subplan_map arrays contained in the returned
* PartitionPruneState are re-sequenced to not count those, though only if the
@@ -1645,24 +1661,59 @@ ExecInitPartitionPruning(PlanState *planstate,
EState *estate = planstate->state;
PartitionPruneInfo *pruneinfo = list_nth(estate->es_part_prune_infos,
part_prune_index);
+ PartitionPruneResult *pruneresult = estate->es_part_prune_result;
+ bool do_pruning = (pruneinfo->needs_init_pruning ||
+ pruneinfo->needs_exec_pruning);
- /* We may need an expression context to evaluate partition exprs */
- ExecAssignExprContext(estate, planstate);
+ /*
+ * No need to do initial pruning if it was done already by
+ * ExecutorDoInitialPruning(), which it would be if es_part_prune_result
+ * has been set.
+ */
+ if (pruneresult)
+ do_pruning = pruneinfo->needs_exec_pruning;
- /* Create the working data structure for pruning */
- prunestate = CreatePartitionPruneState(planstate, pruneinfo);
+ prunestate = NULL;
+ if (do_pruning)
+ {
+ /* We may need an expression context to evaluate partition exprs */
+ ExecAssignExprContext(estate, planstate);
+
+ /* For data reading, executor always omits detached partitions */
+ if (estate->es_partition_directory == NULL)
+ estate->es_partition_directory =
+ CreatePartitionDirectory(estate->es_query_cxt, false);
+
+ /*
+ * Create the working data structure for pruning. No need to consider
+ * initial pruning steps if we have a PartitionPruneResult.
+ */
+ prunestate = CreatePartitionPruneState(planstate, pruneinfo,
+ pruneresult == NULL, true,
+ NIL, planstate->ps_ExprContext,
+ estate->es_partition_directory);
+ }
/*
* Perform an initial partition prune pass, if required.
*/
- if (prunestate->do_initial_prune)
- *initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true);
+ if (pruneresult)
+ {
+ *initially_valid_subplans =
+ list_nth(pruneresult->valid_subplan_offs_list, part_prune_index);
+ }
+ else if (prunestate && prunestate->do_initial_prune)
+ {
+ *initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true,
+ NULL);
+ }
else
{
/* No pruning, so we'll need to initialize all subplans */
Assert(n_total_subplans > 0);
*initially_valid_subplans = bms_add_range(NULL, 0,
n_total_subplans - 1);
+ return prunestate;
}
/*
@@ -1670,7 +1721,8 @@ ExecInitPartitionPruning(PlanState *planstate,
* that were removed above due to initial pruning. No need to do this if
* no steps were removed.
*/
- if (bms_num_members(*initially_valid_subplans) < n_total_subplans)
+ if (prunestate &&
+ bms_num_members(*initially_valid_subplans) < n_total_subplans)
{
/*
* We can safely skip this when !do_exec_prune, even though that
@@ -1686,11 +1738,73 @@ ExecInitPartitionPruning(PlanState *planstate,
return prunestate;
}
+/*
+ * ExecPartitionDoInitialPruning
+ * Perform initial pruning using given PartitionPruneInfo to determine
+ * the minimal set of child subplans that will be executed and also the
+ * set of RT indexes of the leaf partitions scanned by those subplans.
+ */
+Bitmapset *
+ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
+ PartitionPruneInfo *pruneinfo,
+ Bitmapset **scan_leafpart_rtis)
+{
+ List *rtable = plannedstmt->rtable;
+ ExprContext *econtext;
+ PartitionDirectory pdir;
+ MemoryContext oldcontext,
+ tmpcontext;
+ PartitionPruneState *prunestate;
+ Bitmapset *valid_subplan_offs;
+
+ /*
+ * A temporary context for memory allocations required while executing
+ * partition pruning steps.
+ */
+ tmpcontext = AllocSetContextCreate(CurrentMemoryContext,
+ "initial pruning working data",
+ ALLOCSET_DEFAULT_SIZES);
+ oldcontext = MemoryContextSwitchTo(tmpcontext);
+
+ /*
+ * PartitionDirectory to look up partition descriptors, which omits
+ * detached partitions, just like in the executor proper.
+ */
+ pdir = CreatePartitionDirectory(CurrentMemoryContext, false);
+
+ /*
+ * We don't yet have a PlanState for the parent plan node, so we must
+ * create a standalone ExprContext to evaluate pruning expressions,
+ * equipped with the information about the EXTERN parameters that the
+ * caller passed us. Note that that's okay because the initial pruning
+ * steps do not contain anything that requires the execution to have
+ * started.
+ */
+ econtext = CreateStandaloneExprContext();
+ econtext->ecxt_param_list_info = params;
+ prunestate = CreatePartitionPruneState(NULL, pruneinfo, true, false,
+ rtable, econtext, pdir);
+ MemoryContextSwitchTo(oldcontext);
+
+ /* Do the initial pruning. */
+ valid_subplan_offs = ExecFindMatchingSubPlans(prunestate, true,
+ scan_leafpart_rtis);
+
+ FreeExprContext(econtext, true);
+ DestroyPartitionDirectory(pdir);
+ MemoryContextDelete(tmpcontext);
+
+ return valid_subplan_offs;
+}
+
/*
* CreatePartitionPruneState
* Build the data structure required for calling ExecFindMatchingSubPlans
*
- * 'planstate' is the parent plan node's execution state.
+ * 'planstate', if not NULL, is the parent plan node's execution state. It
+ * can be NULL if being called before ExecutorStart(), in which case,
+ * 'rtable' (range table), 'econtext', and 'partdir' must be explicitly
+ * provided.
*
* 'pruneinfo' is a PartitionPruneInfo as generated by
* make_partition_pruneinfo. Here we build a PartitionPruneState containing a
@@ -1704,19 +1818,21 @@ ExecInitPartitionPruning(PlanState *planstate,
* PartitionedRelPruneInfo.
*/
static PartitionPruneState *
-CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
+CreatePartitionPruneState(PlanState *planstate,
+ PartitionPruneInfo *pruneinfo,
+ bool consider_initial_steps,
+ bool consider_exec_steps,
+ List *rtable, ExprContext *econtext,
+ PartitionDirectory partdir)
{
- EState *estate = planstate->state;
+ EState *estate = planstate ? planstate->state : NULL;
PartitionPruneState *prunestate;
int n_part_hierarchies;
ListCell *lc;
int i;
- ExprContext *econtext = planstate->ps_ExprContext;
- /* For data reading, executor always omits detached partitions */
- if (estate->es_partition_directory == NULL)
- estate->es_partition_directory =
- CreatePartitionDirectory(estate->es_query_cxt, false);
+ Assert((estate != NULL) ||
+ (partdir != NULL && econtext != NULL && rtable != NIL));
n_part_hierarchies = list_length(pruneinfo->prune_infos);
Assert(n_part_hierarchies > 0);
@@ -1771,15 +1887,42 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
PartitionKey partkey;
/*
- * We can rely on the copies of the partitioned table's partition
- * key and partition descriptor appearing in its relcache entry,
- * because that entry will be held open and locked for the
- * duration of this executor run.
+ * Must open the relation by ourselves when called before the
+ * execution has started, such as when called during
+ * ExecutorDoInitialPruning() on a cached plan. In that case,
+ * sub-partitions must be locked, because AcquirePlannerLocks()
+ * would not have seen them. (1st relation in a partrelpruneinfos
+ * list is always the root partitioned table appearing in the
+ * query, which AcquirePlannerLocks() would have locked; the
+ * Assert in relation_open() guards that assumption.)
+ */
+ if (estate == NULL)
+ {
+ RangeTblEntry *rte = rt_fetch(pinfo->rtindex, rtable);
+ int lockmode = (j == 0) ? NoLock : rte->rellockmode;
+
+ partrel = table_open(rte->relid, lockmode);
+ }
+ else
+ partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
+
+ /*
+ * We can rely on the copy of the partitioned table's partition
+ * key in its relcache entry, because it can't change (or
+ * get destroyed) as long as the relation is locked. Partition
+ * descriptor is taken from the PartitionDirectory associated with
+ * the table that is held open long enough for the descriptor to
+ * remain valid while it's used to perform the pruning steps.
*/
- partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
partkey = RelationGetPartitionKey(partrel);
- partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
- partrel);
+ partdesc = PartitionDirectoryLookup(partdir, partrel);
+
+ /*
+ * Must close partrel, keeping the lock taken, if we're not using
+ * EState's entry.
+ */
+ if (estate == NULL)
+ table_close(partrel, NoLock);
/*
* Initialize the subplan_map and subpart_map.
@@ -1793,6 +1936,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
Assert(partdesc->nparts >= pinfo->nparts);
pprune->nparts = partdesc->nparts;
pprune->subplan_map = palloc(sizeof(int) * partdesc->nparts);
+ pprune->rti_map = palloc(sizeof(Index) * partdesc->nparts);
if (partdesc->nparts == pinfo->nparts)
{
/*
@@ -1803,6 +1947,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
pprune->subpart_map = pinfo->subpart_map;
memcpy(pprune->subplan_map, pinfo->subplan_map,
sizeof(int) * pinfo->nparts);
+ memcpy(pprune->rti_map, pinfo->rti_map,
+ sizeof(Index) * pinfo->nparts);
/*
* Double-check that the list of unpruned relations has not
@@ -1853,6 +1999,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
pinfo->subplan_map[pd_idx];
pprune->subpart_map[pp_idx] =
pinfo->subpart_map[pd_idx];
+ pprune->rti_map[pp_idx] =
+ pinfo->rti_map[pd_idx];
pd_idx++;
}
else
@@ -1860,6 +2008,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
/* this partdesc entry is not in the plan */
pprune->subplan_map[pp_idx] = -1;
pprune->subpart_map[pp_idx] = -1;
+ pprune->rti_map[pp_idx] = 0;
}
}
@@ -1881,7 +2030,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
* Initialize pruning contexts as needed.
*/
pprune->initial_pruning_steps = pinfo->initial_pruning_steps;
- if (pinfo->initial_pruning_steps)
+ if (consider_initial_steps && pinfo->initial_pruning_steps)
{
InitPartitionPruneContext(&pprune->initial_context,
pinfo->initial_pruning_steps,
@@ -1891,7 +2040,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
prunestate->do_initial_prune = true;
}
pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
- if (pinfo->exec_pruning_steps)
+ if (consider_exec_steps && pinfo->exec_pruning_steps)
{
InitPartitionPruneContext(&pprune->exec_context,
pinfo->exec_pruning_steps,
@@ -2119,10 +2268,14 @@ PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
* Pass initial_prune if PARAM_EXEC Params cannot yet be evaluated. This
* differentiates the initial executor-time pruning step from later
* runtime pruning.
+ *
+ * RT indexes of leaf partitions scanned by the chosen subplans are added to
+ * *scan_leafpart_rtis if the pointer is non-NULL.
*/
Bitmapset *
ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
- bool initial_prune)
+ bool initial_prune,
+ Bitmapset **scan_leafpart_rtis)
{
Bitmapset *result = NULL;
MemoryContext oldcontext;
@@ -2157,7 +2310,7 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
*/
pprune = &prunedata->partrelprunedata[0];
find_matching_subplans_recurse(prunedata, pprune, initial_prune,
- &result);
+ &result, scan_leafpart_rtis);
/* Expression eval may have used space in ExprContext too */
if (pprune->exec_pruning_steps)
@@ -2171,6 +2324,8 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
/* Copy result out of the temp context before we reset it */
result = bms_copy(result);
+ if (scan_leafpart_rtis)
+ *scan_leafpart_rtis = bms_copy(*scan_leafpart_rtis);
MemoryContextReset(prunestate->prune_context);
@@ -2181,13 +2336,15 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
* find_matching_subplans_recurse
* Recursive worker function for ExecFindMatchingSubPlans
*
- * Adds valid (non-prunable) subplan IDs to *validsubplans
+ * Adds valid (non-prunable) subplan IDs to *validsubplans and RT indexes of
+ * the corresponding leaf partitions to *scan_leafpart_rtis (if asked for).
*/
static void
find_matching_subplans_recurse(PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
- Bitmapset **validsubplans)
+ Bitmapset **validsubplans,
+ Bitmapset **scan_leafpart_rtis)
{
Bitmapset *partset;
int i;
@@ -2214,8 +2371,14 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
while ((i = bms_next_member(partset, i)) >= 0)
{
if (pprune->subplan_map[i] >= 0)
+ {
*validsubplans = bms_add_member(*validsubplans,
pprune->subplan_map[i]);
+ Assert(pprune->rti_map[i] > 0);
+ if (scan_leafpart_rtis)
+ *scan_leafpart_rtis = bms_add_member(*scan_leafpart_rtis,
+ pprune->rti_map[i]);
+ }
else
{
int partidx = pprune->subpart_map[i];
@@ -2223,7 +2386,8 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
if (partidx >= 0)
find_matching_subplans_recurse(prunedata,
&prunedata->partrelprunedata[partidx],
- initial_prune, validsubplans);
+ initial_prune, validsubplans,
+ scan_leafpart_rtis);
else
{
/*
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index 076226868f..ed359b5153 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -842,7 +842,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
else
dest = None_Receiver;
- es->qd = CreateQueryDesc(es->stmt,
+ es->qd = CreateQueryDesc(es->stmt, NULL,
fcache->src,
GetActiveSnapshot(),
InvalidSnapshot,
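
As a rough illustration of how the execPartition.c changes collect RT indexes alongside subplan offsets, here is a standalone toy model of the recursion in find_matching_subplans_recurse. It is a sketch under stated assumptions, not the patch's code: ToyPartRel is a hypothetical stand-in for PartitionedRelPruningData, uint64_t bitmasks stand in for Bitmapsets (so all offsets and RT indexes must be below 64), and the precomputed "surviving" mask replaces actually running the pruning steps.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for one partitioned table's pruning data. */
typedef struct ToyPartRel
{
	int			nparts;
	const int  *subplan_map;	/* >= 0: subplan offset of a leaf partition */
	const int  *subpart_map;	/* >= 0: index of a sub-partitioned table */
	const unsigned *rti_map;	/* RT index of a leaf partition, else 0 */
	uint64_t	surviving;		/* bit i set: partition i survived pruning */
} ToyPartRel;

/*
 * Add surviving leaf subplans to *validsubplans and, if the caller asked
 * for them, the RT indexes of their leaf partitions to *scan_leafpart_rtis.
 * Surviving sub-partitioned tables are handled by recursing into them.
 */
void
toy_find_matching(const ToyPartRel *rels, int relidx,
				  uint64_t *validsubplans, uint64_t *scan_leafpart_rtis)
{
	const ToyPartRel *rel = &rels[relidx];

	for (int i = 0; i < rel->nparts; i++)
	{
		if ((rel->surviving & (UINT64_C(1) << i)) == 0)
			continue;			/* partition i was pruned */

		if (rel->subplan_map[i] >= 0)
		{
			*validsubplans |= UINT64_C(1) << rel->subplan_map[i];
			assert(rel->rti_map[i] > 0);
			if (scan_leafpart_rtis != NULL)
				*scan_leafpart_rtis |= UINT64_C(1) << rel->rti_map[i];
		}
		else if (rel->subpart_map[i] >= 0)
			toy_find_matching(rels, rel->subpart_map[i],
							  validsubplans, scan_leafpart_rtis);
	}
}
```

Passing NULL for the second output mirrors the executor-side callers that only need subplan offsets; only the caller that decides which relations to lock needs the RT indexes.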
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index c6f86a6510..96880e122a 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -155,7 +155,8 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
* subplan, we can fill as_valid_subplans immediately, preventing
* later calls to ExecFindMatchingSubPlans.
*/
- if (!prunestate->do_exec_prune && nplans > 0)
+ if (appendstate->as_prune_state == NULL ||
+ (!appendstate->as_prune_state->do_exec_prune && nplans > 0))
appendstate->as_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
}
else
@@ -577,7 +578,7 @@ choose_next_subplan_locally(AppendState *node)
}
else if (node->as_valid_subplans == NULL)
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
whichplan = -1;
}
@@ -642,7 +643,7 @@ choose_next_subplan_for_leader(AppendState *node)
if (node->as_valid_subplans == NULL)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
/*
* Mark each invalid plan as finished to allow the loop below to
@@ -717,7 +718,7 @@ choose_next_subplan_for_worker(AppendState *node)
else if (node->as_valid_subplans == NULL)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
mark_invalid_subplans_as_finished(node);
}
@@ -868,7 +869,7 @@ ExecAppendAsyncBegin(AppendState *node)
if (node->as_valid_subplans == NULL)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
classify_matching_subplans(node);
}
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index 8d35860c30..2312e5a633 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -103,7 +103,8 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
* subplan, we can fill ms_valid_subplans immediately, preventing
* later calls to ExecFindMatchingSubPlans.
*/
- if (!prunestate->do_exec_prune && nplans > 0)
+ if (mergestate->ms_prune_state == NULL ||
+ (!mergestate->ms_prune_state->do_exec_prune && nplans > 0))
mergestate->ms_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
}
else
@@ -218,7 +219,7 @@ ExecMergeAppend(PlanState *pstate)
*/
if (node->ms_valid_subplans == NULL)
node->ms_valid_subplans =
- ExecFindMatchingSubPlans(node->ms_prune_state, false);
+ ExecFindMatchingSubPlans(node->ms_prune_state, false, NULL);
/*
* First time through: pull the first tuple from each valid subplan,
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 29bc26669b..303a572c02 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1578,6 +1578,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
CachedPlanSource *plansource;
CachedPlan *cplan;
List *stmt_list;
+ List *part_prune_result_list;
char *query_string;
Snapshot snapshot;
MemoryContext oldcontext;
@@ -1657,7 +1658,10 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
*/
/* Replan if needed, and increment plan refcount for portal */
- cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv);
+ cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv,
+ &part_prune_result_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_result_list));
stmt_list = cplan->stmt_list;
if (!plan->saved)
@@ -1685,6 +1689,9 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
stmt_list,
cplan);
+ /* Copy PartitionPruneResults into the portal's context. */
+ PortalStorePartitionPruneResults(portal, part_prune_result_list);
+
/*
* Set up options for portal. Default SCROLL type is chosen the same way
* as PerformCursorOpen does it.
@@ -2092,7 +2099,8 @@ SPI_plan_get_cached_plan(SPIPlanPtr plan)
/* Get the generic plan for the query */
cplan = GetCachedPlan(plansource, NULL,
plan->saved ? CurrentResourceOwner : NULL,
- _SPI_current->queryEnv);
+ _SPI_current->queryEnv,
+ NULL /* Not interested in PartitionPruneResults */);
Assert(cplan == plansource->gplan);
/* Pop the error context stack */
@@ -2473,7 +2481,9 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
{
CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
List *stmt_list;
- ListCell *lc2;
+ List *part_prune_result_list;
+ ListCell *lc2,
+ *lc3;
spicallbackarg.query = plansource->query_string;
@@ -2549,8 +2559,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
* plan, the refcount must be backed by the plan_owner.
*/
cplan = GetCachedPlan(plansource, options->params,
- plan_owner, _SPI_current->queryEnv);
-
+ plan_owner, _SPI_current->queryEnv,
+ &part_prune_result_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_result_list));
stmt_list = cplan->stmt_list;
/*
@@ -2589,9 +2601,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
}
}
- foreach(lc2, stmt_list)
+ forboth(lc2, stmt_list, lc3, part_prune_result_list)
{
PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+ PartitionPruneResult *part_prune_result = lfirst_node(PartitionPruneResult, lc3);
bool canSetTag = stmt->canSetTag;
DestReceiver *dest;
@@ -2663,7 +2676,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
else
snap = InvalidSnapshot;
- qdesc = CreateQueryDesc(stmt,
+ qdesc = CreateQueryDesc(stmt, part_prune_result,
plansource->query_string,
snap, crosscheck_snapshot,
dest,
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index bee62fc15c..e7886afa35 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -158,6 +158,11 @@
token = pg_strtok(&length); /* skip :fldname */ \
local_node->fldname = readIntCols(len)
+/* Read an Index array */
+#define READ_INDEX_ARRAY(fldname, len) \
+ token = pg_strtok(&length); /* skip :fldname */ \
+ local_node->fldname = readIndexCols(len)
+
/* Read a bool array */
#define READ_BOOL_ARRAY(fldname, len) \
token = pg_strtok(&length); /* skip :fldname */ \
@@ -542,7 +547,6 @@ fnname(int numCols) \
*/
READ_SCALAR_ARRAY(readAttrNumberCols, int16, atoi)
READ_SCALAR_ARRAY(readOidCols, Oid, atooid)
-/* outfuncs.c has writeIndexCols, but we don't yet need that here */
-/* READ_SCALAR_ARRAY(readIndexCols, Index, atoui) */
+READ_SCALAR_ARRAY(readIndexCols, Index, atoui)
READ_SCALAR_ARRAY(readIntCols, int, atoi)
READ_SCALAR_ARRAY(readBoolCols, bool, strtobool)
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index b11249ed8f..7141035cc4 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -519,7 +519,9 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
result->parallelModeNeeded = glob->parallelModeNeeded;
result->planTree = top_plan;
result->partPruneInfos = glob->partPruneInfos;
+ result->containsInitialPruning = glob->containsInitialPruning;
result->rtable = glob->finalrtable;
+ result->minLockRelids = glob->minLockRelids;
result->resultRelations = glob->resultRelations;
result->appendRelations = glob->appendRelations;
result->subplans = glob->subplans;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 720f20f563..61d6934978 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -270,6 +270,16 @@ set_plan_references(PlannerInfo *root, Plan *plan)
*/
add_rtes_to_flat_rtable(root, false);
+ /*
+ * Add the query's adjusted range of RT indexes to glob->minLockRelids.
+ * The adjusted RT indexes of prunable relations will be deleted from the
+ * set below where PartitionPruneInfos are processed.
+ */
+ glob->minLockRelids =
+ bms_add_range(glob->minLockRelids,
+ rtoffset + 1,
+ rtoffset + list_length(root->parse->rtable));
+
/*
* Adjust RT indexes of PlanRowMarks and add to final rowmarks list
*/
@@ -352,6 +362,7 @@ set_plan_references(PlannerInfo *root, Plan *plan)
foreach (lc, root->partPruneInfos)
{
PartitionPruneInfo *pruneinfo = lfirst(lc);
+ Bitmapset *leafpart_rtis = NULL;
ListCell *l;
foreach(l, pruneinfo->prune_infos)
@@ -362,15 +373,50 @@ set_plan_references(PlannerInfo *root, Plan *plan)
foreach(l2, prune_infos)
{
PartitionedRelPruneInfo *pinfo = lfirst(l2);
+ int i;
/* RT index of the table to which the pinfo belongs. */
pinfo->rtindex += rtoffset;
+
+ /* Also adjust RT indexes of leaf partitions that might be scanned. */
+ for (i = 0; i < pinfo->nparts; i++)
+ {
+ if (pinfo->rti_map[i] > 0 && pinfo->subplan_map[i] >= 0)
+ {
+ pinfo->rti_map[i] += rtoffset;
+ leafpart_rtis = bms_add_member(leafpart_rtis,
+ pinfo->rti_map[i]);
+ }
+ }
}
}
+ if (pruneinfo->needs_init_pruning)
+ {
+ glob->containsInitialPruning = true;
+
+ /*
+ * Delete the leaf partition RTIs from the global set of relations
+ * to be locked before executing the plan. AcquireExecutorLocks()
+ * will find the ones to add to the set after performing initial
+ * pruning.
+ */
+ glob->minLockRelids = bms_del_members(glob->minLockRelids,
+ leafpart_rtis);
+ }
+
glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
}
+ /*
+ * It seems worth doing a bms_copy() on glob->minLockRelids if we deleted
+ * bits from it above to get rid of any empty tail bits. It seems better
+ * for the loop over this set in AcquireExecutorLocks() to not have to go
+ * through those useless bit words.
+ */
+ if (glob->containsInitialPruning)
+ glob->minLockRelids = bms_copy(glob->minLockRelids);
+
return result;
}
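To illustrate the minLockRelids bookkeeping in the set_plan_references() hunks above, here is a minimal standalone sketch. It is not PostgreSQL code: a uint64_t stands in for a Bitmapset, the names ToySet, toy_add_range, toy_del_members and toy_min_lock_relids are made up for illustration, and it assumes RT indexes fit in 1..63 and that the range table is non-empty.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for a Bitmapset: bit N set means RT index N is a member. */
typedef uint64_t ToySet;

/* Add every RT index in [lo, hi] to the set (1 <= lo <= hi <= 63 assumed). */
static ToySet
toy_add_range(ToySet s, int lo, int hi)
{
	assert(lo >= 1 && lo <= hi && hi <= 63);
	for (int i = lo; i <= hi; i++)
		s |= (UINT64_C(1) << i);
	return s;
}

/* Remove all members of "del" from "s". */
static ToySet
toy_del_members(ToySet s, ToySet del)
{
	return s & ~del;
}

/*
 * Start from all RT indexes of the query (shifted by rtoffset), then drop
 * the leaf partitions that initial pruning may eliminate.  Those are meant
 * to be locked later, and only if they survive pruning.
 */
static ToySet
toy_min_lock_relids(int rtoffset, int rtable_len, ToySet prunable_leaf_rtis)
{
	ToySet		all = toy_add_range(0, rtoffset + 1, rtoffset + rtable_len);

	return toy_del_members(all, prunable_leaf_rtis);
}
```

For example, with five range table entries of which 3..5 are prunable leaf partitions, only RT indexes 1 and 2 remain in the "always lock" set.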
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index d77f7d3aef..952c5b8327 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -144,7 +144,9 @@ static List *make_partitionedrel_pruneinfo(PlannerInfo *root,
List *prunequal,
Bitmapset *partrelids,
int *relid_subplan_map,
- Bitmapset **matchedsubplans);
+ Bitmapset **matchedsubplans,
+ bool *needs_init_pruning,
+ bool *needs_exec_pruning);
static void gen_partprune_steps(RelOptInfo *rel, List *clauses,
PartClauseTarget target,
GeneratePruningStepsContext *context);
@@ -234,6 +236,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int *relid_subplan_map;
ListCell *lc;
int i;
+ bool needs_init_pruning = false;
+ bool needs_exec_pruning = false;
/*
* Scan the subpaths to see which ones are scans of partition child
@@ -313,12 +317,16 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
Bitmapset *partrelids = (Bitmapset *) lfirst(lc);
List *pinfolist;
Bitmapset *matchedsubplans = NULL;
+ bool partrel_needs_init_pruning;
+ bool partrel_needs_exec_pruning;
pinfolist = make_partitionedrel_pruneinfo(root, parentrel,
prunequal,
partrelids,
relid_subplan_map,
- &matchedsubplans);
+ &matchedsubplans,
+ &partrel_needs_init_pruning,
+ &partrel_needs_exec_pruning);
/* When pruning is possible, record the matched subplans */
if (pinfolist != NIL)
@@ -327,6 +335,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
allmatchedsubplans = bms_join(matchedsubplans,
allmatchedsubplans);
}
+
+ needs_init_pruning |= partrel_needs_init_pruning;
+ needs_exec_pruning |= partrel_needs_exec_pruning;
}
pfree(relid_subplan_map);
@@ -341,6 +352,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
/* Else build the result data structure */
pruneinfo = makeNode(PartitionPruneInfo);
pruneinfo->prune_infos = prunerelinfos;
+ pruneinfo->needs_init_pruning = needs_init_pruning;
+ pruneinfo->needs_exec_pruning = needs_exec_pruning;
/*
* Some subplans may not belong to any of the identified partitioned rels.
@@ -441,13 +454,18 @@ add_part_relids(List *allpartrelids, Bitmapset *partrelids)
* If we cannot find any useful run-time pruning steps, return NIL.
* However, on success, each rel identified in partrelids will have
* an element in the result list, even if some of them are useless.
+ * *needs_init_pruning and *needs_exec_pruning are set to indicate whether
+ * the returned PartitionedRelPruneInfos contain pruning steps that can be
+ * performed before and after execution begins, respectively.
*/
static List *
make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
List *prunequal,
Bitmapset *partrelids,
int *relid_subplan_map,
- Bitmapset **matchedsubplans)
+ Bitmapset **matchedsubplans,
+ bool *needs_init_pruning,
+ bool *needs_exec_pruning)
{
RelOptInfo *targetpart = NULL;
List *pinfolist = NIL;
@@ -458,6 +476,10 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int rti;
int i;
+ /* Will find out below. */
+ *needs_init_pruning = false;
+ *needs_exec_pruning = false;
+
/*
* Examine each partitioned rel, constructing a temporary array to map
* from planner relids to index of the partitioned rel, and building a
@@ -545,6 +567,9 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
* executor per-scan pruning steps. This first pass creates startup
* pruning steps and detects whether there's any possibly-useful quals
* that would require per-scan pruning.
+ *
+ * The first pass also notes whether the second pass is necessary, by
+ * checking for the presence of EXEC parameters.
*/
gen_partprune_steps(subpart, partprunequal, PARTTARGET_INITIAL,
&context);
@@ -619,6 +644,12 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
pinfo->execparamids = execparamids;
/* Remaining fields will be filled in the next loop */
+ /* record which types of pruning steps we've seen so far */
+ if (initial_pruning_steps != NIL)
+ *needs_init_pruning = true;
+ if (exec_pruning_steps != NIL)
+ *needs_exec_pruning = true;
+
pinfolist = lappend(pinfolist, pinfo);
}
@@ -646,6 +677,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int *subplan_map;
int *subpart_map;
Oid *relid_map;
+ Index *rti_map;
/*
* Construct the subplan and subpart maps for this partitioning level.
@@ -658,6 +690,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
subpart_map = (int *) palloc(nparts * sizeof(int));
memset(subpart_map, -1, nparts * sizeof(int));
relid_map = (Oid *) palloc0(nparts * sizeof(Oid));
+ rti_map = (Index *) palloc0(nparts * sizeof(Index));
present_parts = NULL;
i = -1;
@@ -672,6 +705,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
subplan_map[i] = subplanidx = relid_subplan_map[partrel->relid] - 1;
subpart_map[i] = subpartidx = relid_subpart_map[partrel->relid] - 1;
relid_map[i] = planner_rt_fetch(partrel->relid, root)->relid;
+ rti_map[i] = partrel->relid;
if (subplanidx >= 0)
{
present_parts = bms_add_member(present_parts, i);
@@ -696,6 +730,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
pinfo->subplan_map = subplan_map;
pinfo->subpart_map = subpart_map;
pinfo->relid_map = relid_map;
+ pinfo->rti_map = rti_map;
}
pfree(relid_subpart_map);
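The way needs_init_pruning and needs_exec_pruning are accumulated in the partprune.c hunks above can be sketched in isolation as follows. This is only an illustration under assumed names: ToyRelSteps and toy_summarize_pruning are hypothetical, and the step lists are reduced to plain counts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-partitioned-table summary of the generated steps. */
typedef struct ToyRelSteps
{
	int			n_initial_steps;	/* steps usable before execution */
	int			n_exec_steps;		/* steps needing EXEC parameters */
} ToyRelSteps;

/*
 * Set each output flag to true if any partitioned table in the hierarchy
 * has at least one pruning step of the corresponding kind.
 */
static void
toy_summarize_pruning(const ToyRelSteps *rels, size_t nrels,
					  bool *needs_init_pruning, bool *needs_exec_pruning)
{
	assert(rels != NULL || nrels == 0);

	/* Will find out below. */
	*needs_init_pruning = false;
	*needs_exec_pruning = false;

	for (size_t i = 0; i < nrels; i++)
	{
		if (rels[i].n_initial_steps > 0)
			*needs_init_pruning = true;
		if (rels[i].n_exec_steps > 0)
			*needs_exec_pruning = true;
	}
}
```

A single table with initial steps is enough to set the first flag for the whole PartitionPruneInfo, which is what later makes the plan count as containing initial pruning.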
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 078fbdb5a0..02fc5a011b 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1603,6 +1603,7 @@ exec_bind_message(StringInfo input_message)
int16 *rformats = NULL;
CachedPlanSource *psrc;
CachedPlan *cplan;
+ List *part_prune_result_list;
Portal portal;
char *query_string;
char *saved_stmt_name;
@@ -1978,7 +1979,9 @@ exec_bind_message(StringInfo input_message)
* will be generated in MessageContext. The plan refcount will be
* assigned to the Portal, so it will be released at portal destruction.
*/
- cplan = GetCachedPlan(psrc, params, NULL, NULL);
+ cplan = GetCachedPlan(psrc, params, NULL, NULL, &part_prune_result_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_result_list));
/*
* Now we can define the portal.
@@ -1993,6 +1996,9 @@ exec_bind_message(StringInfo input_message)
cplan->stmt_list,
cplan);
+ /* Copy PartitionPruneResults into the portal's context. */
+ PortalStorePartitionPruneResults(portal, part_prune_result_list);
+
/* Done with the snapshot used for parameter I/O and parsing/planning */
if (snapshot_set)
PopActiveSnapshot();
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 5aa5a350f3..8cc2e2162d 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -35,7 +35,7 @@
Portal ActivePortal = NULL;
-static void ProcessQuery(PlannedStmt *plan,
+static void ProcessQuery(PlannedStmt *plan, PartitionPruneResult *part_prune_result,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -65,6 +65,7 @@ static void DoPortalRewind(Portal portal);
*/
QueryDesc *
CreateQueryDesc(PlannedStmt *plannedstmt,
+ PartitionPruneResult *part_prune_result,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
@@ -77,6 +78,8 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
qd->operation = plannedstmt->commandType; /* operation */
qd->plannedstmt = plannedstmt; /* plan */
+ qd->part_prune_result = part_prune_result; /* ExecutorDoInitialPruning()
+ * output for plan */
qd->sourceText = sourceText; /* query text */
qd->snapshot = RegisterSnapshot(snapshot); /* snapshot */
/* RI check snapshot */
@@ -122,6 +125,7 @@ FreeQueryDesc(QueryDesc *qdesc)
* PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
*
* plan: the plan tree for the query
+ * part_prune_result: ExecutorDoInitialPruning() output for the plan tree
* sourceText: the source text of the query
* params: any parameters needed
* dest: where to send results
@@ -134,6 +138,7 @@ FreeQueryDesc(QueryDesc *qdesc)
*/
static void
ProcessQuery(PlannedStmt *plan,
+ PartitionPruneResult *part_prune_result,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -145,7 +150,7 @@ ProcessQuery(PlannedStmt *plan,
/*
* Create the QueryDesc object
*/
- queryDesc = CreateQueryDesc(plan, sourceText,
+ queryDesc = CreateQueryDesc(plan, part_prune_result, sourceText,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
@@ -491,8 +496,13 @@ PortalStart(Portal portal, ParamListInfo params,
/*
* Create QueryDesc in portal's context; for the moment, set
* the destination to DestNone.
+ *
+ * There is no PartitionPruneResult unless the PlannedStmt is
+ * from a CachedPlan.
*/
queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+ portal->part_prune_results == NIL ? NULL :
+ linitial(portal->part_prune_results),
portal->sourceText,
GetActiveSnapshot(),
InvalidSnapshot,
@@ -1225,6 +1235,8 @@ PortalRunMulti(Portal portal,
if (pstmt->utilityStmt == NULL)
{
+ PartitionPruneResult *part_prune_result = NULL;
+
/*
* process a plannable query.
*/
@@ -1271,10 +1283,18 @@ PortalRunMulti(Portal portal,
else
UpdateActiveSnapshotCommandId();
+ /*
+ * Determine if there's a corresponding PartitionPruneResult for
+ * this PlannedStmt.
+ */
+ if (portal->part_prune_results != NIL)
+ part_prune_result = list_nth(portal->part_prune_results,
+ foreach_current_index(stmtlist_item));
+
if (pstmt->canSetTag)
{
/* statement can set tag string */
- ProcessQuery(pstmt,
+ ProcessQuery(pstmt, part_prune_result,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
@@ -1283,7 +1303,7 @@ PortalRunMulti(Portal portal,
else
{
/* stmt added by rewrite cannot set tag */
- ProcessQuery(pstmt,
+ ProcessQuery(pstmt, part_prune_result,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 0d6a295674..d1c9605979 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -99,14 +99,19 @@ static dlist_head cached_expression_list = DLIST_STATIC_INIT(cached_expression_l
static void ReleaseGenericPlan(CachedPlanSource *plansource);
static List *RevalidateCachedQuery(CachedPlanSource *plansource,
QueryEnvironment *queryEnv);
-static bool CheckCachedPlan(CachedPlanSource *plansource);
+static bool CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+ List **part_prune_result_list);
static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
- ParamListInfo boundParams, QueryEnvironment *queryEnv);
+ ParamListInfo boundParams, QueryEnvironment *queryEnv,
+ List **part_prune_result_list);
static bool choose_custom_plan(CachedPlanSource *plansource,
ParamListInfo boundParams);
static double cached_plan_cost(CachedPlan *plan, bool include_planner);
static Query *QueryListGetPrimaryStmt(List *stmts);
-static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static void AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+ List **part_prune_result_list,
+ List **lockedRelids_per_stmt);
+static void ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt);
static void AcquirePlannerLocks(List *stmt_list, bool acquire);
static void ScanQueryForLocks(Query *parsetree, bool acquire);
static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -790,15 +795,20 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
*
* On a "true" return, we have acquired the locks needed to run the plan.
* (We must do this for the "true" result to be race-condition-free.)
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_result_list.
*/
static bool
-CheckCachedPlan(CachedPlanSource *plansource)
+CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+ List **part_prune_result_list)
{
CachedPlan *plan = plansource->gplan;
/* Assert that caller checked the querytree */
Assert(plansource->is_valid);
+ *part_prune_result_list = NIL;
+
/* If there's no generic plan, just say "false" */
if (!plan)
return false;
@@ -820,13 +830,21 @@ CheckCachedPlan(CachedPlanSource *plansource)
*/
if (plan->is_valid)
{
+ List *lockedRelids_per_stmt;
+
/*
* Plan must have positive refcount because it is referenced by
* plansource; so no need to fear it disappears under us here.
*/
Assert(plan->refcount > 0);
- AcquireExecutorLocks(plan->stmt_list, true);
+ /*
+ * Lock relations scanned by the plan. This is where the pruning
+ * happens if needed.
+ */
+ AcquireExecutorLocks(plan->stmt_list, boundParams,
+ part_prune_result_list,
+ &lockedRelids_per_stmt);
/*
* If plan was transient, check to see if TransactionXmin has
@@ -848,7 +866,14 @@ CheckCachedPlan(CachedPlanSource *plansource)
}
/* Oops, the race case happened. Release useless locks. */
- AcquireExecutorLocks(plan->stmt_list, false);
+ ReleaseExecutorLocks(plan->stmt_list, lockedRelids_per_stmt);
+
+ /*
+ * The output list and the objects in it were allocated in the caller's
+ * (hopefully short-lived) context, so they will not stay leaked for
+ * long. Still, reset the pointer so that nobody looks at it by accident.
+ */
+ *part_prune_result_list = NIL;
}
/*
@@ -874,10 +899,15 @@ CheckCachedPlan(CachedPlanSource *plansource)
* Planning work is done in the caller's memory context. The finished plan
* is in a child memory context, which typically should get reparented
* (unless this is a one-shot plan, in which case we don't copy the plan).
+ *
+ * A list of NULLs is returned in *part_prune_result_list, meaning that no
+ * PartitionPruneResult nodes have yet been created for the plans in
+ * stmt_list.
*/
static CachedPlan *
BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
- ParamListInfo boundParams, QueryEnvironment *queryEnv)
+ ParamListInfo boundParams, QueryEnvironment *queryEnv,
+ List **part_prune_result_list)
{
CachedPlan *plan;
List *plist;
@@ -1007,6 +1037,17 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
MemoryContextSwitchTo(oldcxt);
+ /*
+ * There are no PartitionPruneResults to add yet, but the list must
+ * still have the same number of elements as the list of PlannedStmts,
+ * so fill it with NULLs.
+ */
+ *part_prune_result_list = NIL;
+ foreach(lc, plist)
+ {
+ *part_prune_result_list = lappend(*part_prune_result_list, NULL);
+ }
+
return plan;
}
@@ -1126,6 +1167,17 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
* plan or a custom plan for the given parameters: the caller does not know
* which it will get.
*
+ * If part_prune_result_list is not NULL, one element is added to
+ * *part_prune_result_list for every PlannedStmt in the returned CachedPlan:
+ * either a PartitionPruneResult or NULL. It is a PartitionPruneResult if
+ * the PlannedStmt comes from an existing, otherwise valid CachedPlan and
+ * contains at least one PartitionPruneInfo with "initial" pruning steps.
+ * Those steps are performed by calling ExecutorDoInitialPruning(), which
+ * prunes away subplans that don't match the pruning conditions so that
+ * AcquireExecutorLocks() locks only the surviving leaf partitions. The
+ * PartitionPruneResult contains a list of bitmapsets of the indexes of
+ * matching subplans, one for each PartitionPruneInfo.
+ *
* On return, the plan is valid and we have sufficient locks to begin
* execution.
*
@@ -1139,11 +1191,13 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
*/
CachedPlan *
GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
- ResourceOwner owner, QueryEnvironment *queryEnv)
+ ResourceOwner owner, QueryEnvironment *queryEnv,
+ List **part_prune_result_list)
{
CachedPlan *plan = NULL;
List *qlist;
bool customplan;
+ List *my_part_prune_result_list;
/* Assert caller is doing things in a sane order */
Assert(plansource->magic == CACHEDPLANSOURCE_MAGIC);
@@ -1160,7 +1214,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
if (!customplan)
{
- if (CheckCachedPlan(plansource))
+ if (CheckCachedPlan(plansource, boundParams,
+ &my_part_prune_result_list))
{
/* We want a generic plan, and we already have a valid one */
plan = plansource->gplan;
@@ -1169,7 +1224,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
else
{
/* Build a new generic plan */
- plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv);
+ plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv,
+ &my_part_prune_result_list);
/* Just make real sure plansource->gplan is clear */
ReleaseGenericPlan(plansource);
/* Link the new generic plan into the plansource */
@@ -1214,7 +1270,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
if (customplan)
{
/* Build a custom plan */
- plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv);
+ plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv,
+ &my_part_prune_result_list);
/* Accumulate total costs of custom plans */
plansource->total_custom_cost += cached_plan_cost(plan, true);
@@ -1246,6 +1303,9 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
plan->is_saved = true;
}
+ if (part_prune_result_list)
+ *part_prune_result_list = my_part_prune_result_list;
+
return plan;
}
@@ -1737,17 +1797,29 @@ QueryListGetPrimaryStmt(List *stmts)
/*
* AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
- * or release them if acquire is false.
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_result_list.
+ *
+ * On return, *lockedRelids_per_stmt will contain a bitmapset for every
+ * PlannedStmt in stmt_list, containing the RT indexes of relation entries
+ * in its range table that were actually locked, or NULL if the PlannedStmt
+ * contains a utility statement.
*/
static void
-AcquireExecutorLocks(List *stmt_list, bool acquire)
+AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+ List **part_prune_result_list,
+ List **lockedRelids_per_stmt)
{
ListCell *lc1;
+ *part_prune_result_list = *lockedRelids_per_stmt = NIL;
foreach(lc1, stmt_list)
{
PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
- ListCell *lc2;
+ PartitionPruneResult *part_prune_result = NULL;
+ Bitmapset *allLockRelids;
+ Bitmapset *lockedRelids = NULL;
+ int rti;
if (plannedstmt->commandType == CMD_UTILITY)
{
@@ -1761,13 +1833,38 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
Query *query = UtilityContainsQuery(plannedstmt->utilityStmt);
if (query)
- ScanQueryForLocks(query, acquire);
+ ScanQueryForLocks(query, true);
+ *part_prune_result_list = lappend(*part_prune_result_list, NULL);
+ *lockedRelids_per_stmt = lappend(*lockedRelids_per_stmt, NULL);
continue;
}
- foreach(lc2, plannedstmt->rtable)
+ /*
+ * Figure out the set of relations that would need to be locked
+ * before executing the plan.
+ */
+ if (plannedstmt->containsInitialPruning)
{
- RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
+ /*
+ * Obtain the set of leaf partitions to be locked.
+ *
+ * The following does initial partition pruning using the
+ * PartitionPruneInfos found in plannedstmt->partPruneInfos and
+ * finds leaf partitions that survive that pruning across all the
+ * nodes in the plan tree.
+ */
+ part_prune_result =
+ ExecutorDoInitialPruning(plannedstmt, boundParams);
+
+ allLockRelids = bms_union(plannedstmt->minLockRelids,
+ part_prune_result->scan_leafpart_rtis);
+ }
+ else
+ allLockRelids = plannedstmt->minLockRelids;
+
+ rti = -1;
+ while ((rti = bms_next_member(allLockRelids, rti)) >= 0)
+ {
+ RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
if (rte->rtekind != RTE_RELATION)
continue;
@@ -1778,10 +1875,58 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
* fail if it's been dropped entirely --- we'll just transiently
* acquire a non-conflicting lock.
*/
- if (acquire)
- LockRelationOid(rte->relid, rte->rellockmode);
- else
- UnlockRelationOid(rte->relid, rte->rellockmode);
+ LockRelationOid(rte->relid, rte->rellockmode);
+ lockedRelids = bms_add_member(lockedRelids, rti);
+ }
+
+ *part_prune_result_list = lappend(*part_prune_result_list,
+ part_prune_result);
+ *lockedRelids_per_stmt = lappend(*lockedRelids_per_stmt, lockedRelids);
+ }
+}
+
+/*
+ * ReleaseExecutorLocks
+ * Release the locks that were acquired by an earlier call to
+ * AcquireExecutorLocks()
+ */
+static void
+ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt)
+{
+ ListCell *lc1,
+ *lc2;
+
+ forboth(lc1, stmt_list, lc2, lockedRelids_per_stmt)
+ {
+ PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
+ Bitmapset *lockedRelids = lfirst(lc2);
+ int rti;
+
+ if (plannedstmt->commandType == CMD_UTILITY)
+ {
+ /*
+ * Ignore utility statements, except those (such as EXPLAIN) that
+ * contain a parsed-but-not-planned query. Note: it's okay to use
+ * ScanQueryForLocks, even though the query hasn't been through
+ * rule rewriting, because rewriting doesn't change the query
+ * representation.
+ */
+ Query *query = UtilityContainsQuery(plannedstmt->utilityStmt);
+
+ if (query)
+ ScanQueryForLocks(query, false);
+ continue;
+ }
+
+ rti = -1;
+ while ((rti = bms_next_member(lockedRelids, rti)) >= 0)
+ {
+ RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
+
+ Assert(rte->rtekind == RTE_RELATION);
+
+ /* See the comment in AcquireExecutorLocks(). */
+ UnlockRelationOid(rte->relid, rte->rellockmode);
}
}
}
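The pairing of AcquireExecutorLocks() and ReleaseExecutorLocks() above depends on the per-statement "locked" sets staying aligned with the statement list, including a placeholder for utility statements. The following standalone sketch models just that idea. Everything in it is an assumption made for illustration: ToyStmt, toy_lock_count, toy_acquire and toy_release are invented names, a uint64_t replaces the Bitmapset, and a counter array replaces the lock manager.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_MAX_RTI 63

typedef struct ToyStmt
{
	bool		is_utility;			/* utility statements lock nothing here */
	uint64_t	min_lock_relids;	/* RT indexes that are always locked */
	uint64_t	surviving_leafs;	/* leaf partitions left after pruning */
} ToyStmt;

/* Per-relation lock counts, standing in for the lock manager. */
static int	toy_lock_count[TOY_MAX_RTI + 1];

/* Lock what each statement needs and remember exactly what was locked. */
static void
toy_acquire(const ToyStmt *stmts, int nstmts, uint64_t *locked_per_stmt)
{
	for (int i = 0; i < nstmts; i++)
	{
		uint64_t	to_lock;

		/* Every statement gets an entry, so the arrays stay aligned. */
		locked_per_stmt[i] = 0;
		if (stmts[i].is_utility)
			continue;

		to_lock = stmts[i].min_lock_relids | stmts[i].surviving_leafs;
		for (int rti = 1; rti <= TOY_MAX_RTI; rti++)
		{
			if (to_lock & (UINT64_C(1) << rti))
			{
				toy_lock_count[rti]++;
				locked_per_stmt[i] |= UINT64_C(1) << rti;
			}
		}
	}
}

/* Undo toy_acquire() using only the recorded sets, without re-pruning. */
static void
toy_release(int nstmts, const uint64_t *locked_per_stmt)
{
	for (int i = 0; i < nstmts; i++)
	{
		for (int rti = 1; rti <= TOY_MAX_RTI; rti++)
		{
			if (locked_per_stmt[i] & (UINT64_C(1) << rti))
			{
				assert(toy_lock_count[rti] > 0);
				toy_lock_count[rti]--;
			}
		}
	}
}
```

Releasing from the recorded sets rather than recomputing them means the release path does not need to repeat initial pruning, and it cannot unlock a relation that was never locked.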
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index 3a161bdb88..27407a7f0f 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -303,6 +303,25 @@ PortalDefineQuery(Portal portal,
portal->status = PORTAL_DEFINED;
}
+/*
+ * PortalStorePartitionPruneResults
+ * Copy the given list of PartitionPruneResults into the portal's
+ * context
+ *
+ * This allows the caller to ensure that the list exists as long as the portal
+ * does.
+ */
+void
+PortalStorePartitionPruneResults(Portal portal, List *part_prune_results)
+{
+ MemoryContext oldcxt;
+
+ AssertArg(PortalIsValid(portal));
+ oldcxt = MemoryContextSwitchTo(portal->portalContext);
+ portal->part_prune_results = copyObject(part_prune_results);
+ MemoryContextSwitchTo(oldcxt);
+}
+
/*
* PortalReleaseCachedPlan
* Release a portal's reference to its cached plan, if any.
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 9ebde089ae..e57e133f0e 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -87,7 +87,9 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
ExplainState *es, const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv);
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt,
+ PartitionPruneResult *part_prune_result,
+ IntoClause *into,
ExplainState *es, const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv,
const instr_time *planduration,
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index bf962af7af..bd8776402e 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -45,6 +45,7 @@ extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
* nparts Length of subplan_map[] and subpart_map[].
* subplan_map Subplan index by partition index, or -1.
* subpart_map Subpart index by partition index, or -1.
+ * rti_map Range table index by partition index, or 0.
* present_parts A Bitmapset of the partition indexes that we
* have subplans or subparts for.
* initial_pruning_steps List of PartitionPruneSteps used to
@@ -61,6 +62,7 @@ typedef struct PartitionedRelPruningData
int nparts;
int *subplan_map;
int *subpart_map;
+ Index *rti_map;
Bitmapset *present_parts;
List *initial_pruning_steps;
List *exec_pruning_steps;
@@ -126,5 +128,10 @@ extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
int part_prune_index,
Bitmapset **initially_valid_subplans);
extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
- bool initial_prune);
+ bool initial_prune,
+ Bitmapset **scan_leafpart_rtis);
+extern Bitmapset *ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt,
+ ParamListInfo params,
+ PartitionPruneInfo *pruneinfo,
+ Bitmapset **scan_leafpart_rtis);
#endif /* EXECPARTITION_H */
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index e79e2c001f..60d5644908 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -35,6 +35,8 @@ typedef struct QueryDesc
/* These fields are provided by CreateQueryDesc */
CmdType operation; /* CMD_SELECT, CMD_UPDATE, etc. */
PlannedStmt *plannedstmt; /* planner's output (could be utility, too) */
+ PartitionPruneResult *part_prune_result; /* ExecutorDoInitialPruning()'s
+ * output for plannedstmt */
const char *sourceText; /* source text of the query */
Snapshot snapshot; /* snapshot to use for query */
Snapshot crosscheck_snapshot; /* crosscheck for RI update/delete */
@@ -57,6 +59,7 @@ typedef struct QueryDesc
/* in pquery.c */
extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
+ PartitionPruneResult *part_prune_result,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index d68a6b9d28..5c4a282be0 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -185,6 +185,8 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
/*
* prototypes from functions in execMain.c
*/
+extern PartitionPruneResult *ExecutorDoInitialPruning(PlannedStmt *plannedstmt,
+ ParamListInfo params);
extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
extern void ExecutorRun(QueryDesc *queryDesc,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 63a89474db..12ea06c2f6 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -1001,6 +1001,33 @@ typedef struct DomainConstraintState
*/
typedef TupleTableSlot *(*ExecProcNodeMtd) (struct PlanState *pstate);
+/*----------------
+ * PartitionPruneResult
+ *
+ * The result of performing ExecutorDoInitialPruning() on a given
+ * PlannedStmt.
+ *
+ * Contains a list of Bitmapset of the indexes of the subplans remaining after
+ * performing initial pruning by calling ExecFindMatchingSubPlans() for every
+ * PartitionPruneInfo found in PlannedStmt.partPruneInfos. RT indexes of the
+ * leaf partitions scanned by those subplans across all PartitionPruneInfos
+ * are added into scan_leafpart_rtis.
+ *
+ * This is used by GetCachedPlan() to inform its callers of the pruning
+ * decisions made when performing AcquireExecutorLocks() on a given cached
+ * PlannedStmt, which the callers then pass on to the executor. The executor
+ * refers to this node when initializing the plan nodes which contain subplans
+ * that may have been pruned by ExecutorDoInitialPruning(), rather than
+ * redoing initial pruning.
+ */
+typedef struct PartitionPruneResult
+{
+ NodeTag type;
+
+ List *valid_subplan_offs_list;
+ Bitmapset *scan_leafpart_rtis;
+} PartitionPruneResult;
+
/* ----------------
* PlanState node
*
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index cdd6debfa0..b33d9e426d 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -212,6 +212,7 @@ extern struct Bitmapset *readBitmapset(void);
extern uintptr_t readDatum(bool typbyval);
extern bool *readBoolCols(int numCols);
extern int *readIntCols(int numCols);
+extern Index *readIndexCols(int numCols);
extern Oid *readOidCols(int numCols);
extern int16 *readAttrNumberCols(int numCols);
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index a4e6b4db92..86eda6c7c3 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -125,6 +125,19 @@ typedef struct PlannerGlobal
/* List of PartitionPruneInfo contained in the plan */
List *partPruneInfos;
+ /*
+ * Do any of those PartitionPruneInfos have initial (pre-exec) pruning
+ * steps in them?
+ */
+ bool containsInitialPruning;
+
+ /*
+ * Indexes of all range table entries minus indexes of range table entries
+ * of the leaf partitions scanned by prunable subplans; see
+ * AcquireExecutorLocks()
+ */
+ Bitmapset *minLockRelids;
+
/* OIDs of relations the plan depends on */
List *relationOids;
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index f2daabb3b7..1d2c0d9bdf 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -72,8 +72,17 @@ typedef struct PlannedStmt
List *partPruneInfos; /* List of PartitionPruneInfo contained in
* the plan */
+ bool containsInitialPruning; /* Do any of those PartitionPruneInfos
+ * have initial (pre-exec) pruning
+ * steps in them? */
+
List *rtable; /* list of RangeTblEntry nodes */
+ Bitmapset *minLockRelids; /* Indexes of all range table entries minus
+ * indexes of range table entries of the leaf
+ * partitions scanned by prunable subplans;
+ * see AcquireExecutorLocks() */
+
/* rtable indexes of target relations for INSERT/UPDATE/DELETE */
List *resultRelations; /* integer list of RT indexes, or NIL */
@@ -1409,6 +1418,13 @@ typedef struct PlanRowMark
* prune_infos List of Lists containing PartitionedRelPruneInfo nodes,
* one sublist per run-time-prunable partition hierarchy
* appearing in the parent plan node's subplans.
+ *
+ * needs_init_pruning Does any of the PartitionedRelPruneInfos in
+ * prune_infos have its initial_pruning_steps set?
+ *
+ * needs_exec_pruning Does any of the PartitionedRelPruneInfos in
+ * prune_infos have its exec_pruning_steps set?
+ *
* other_subplans Indexes of any subplans that are not accounted for
* by any of the PartitionedRelPruneInfo nodes in
* "prune_infos". These subplans must not be pruned.
@@ -1419,6 +1435,8 @@ typedef struct PartitionPruneInfo
NodeTag type;
List *prune_infos;
+ bool needs_init_pruning;
+ bool needs_exec_pruning;
Bitmapset *other_subplans;
} PartitionPruneInfo;
@@ -1463,6 +1481,9 @@ typedef struct PartitionedRelPruneInfo
/* relation OID by partition index, or 0 */
Oid *relid_map pg_node_attr(array_size(nparts));
+ /* Range table index by partition index, or 0. */
+ Index *rti_map pg_node_attr(array_size(nparts));
+
/*
* initial_pruning_steps shows how to prune during executor startup (i.e.,
* without use of any PARAM_EXEC Params); it is NIL if no startup pruning
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 0499635f59..1c5bb5ece1 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -220,7 +220,8 @@ extern List *CachedPlanGetTargetList(CachedPlanSource *plansource,
extern CachedPlan *GetCachedPlan(CachedPlanSource *plansource,
ParamListInfo boundParams,
ResourceOwner owner,
- QueryEnvironment *queryEnv);
+ QueryEnvironment *queryEnv,
+ List **part_prune_result_list);
extern void ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner);
extern bool CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index aeddbdafe5..9f7727a837 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -138,6 +138,7 @@ typedef struct PortalData
QueryCompletion qc; /* command completion data for executed query */
List *stmts; /* list of PlannedStmts */
CachedPlan *cplan; /* CachedPlan, if stmts are from one */
+ List *part_prune_results; /* list of PartitionPruneResults */
ParamListInfo portalParams; /* params to pass to query */
QueryEnvironment *queryEnv; /* environment for query */
@@ -242,6 +243,8 @@ extern void PortalDefineQuery(Portal portal,
CommandTag commandTag,
List *stmts,
CachedPlan *cplan);
+extern void PortalStorePartitionPruneResults(Portal portal,
+ List *part_prune_result_list);
extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
extern void PortalCreateHoldStore(Portal portal);
extern void PortalHashTableDeleteAll(void);
--
2.35.3
* Re: generic plans and "initial" pruning
@ 2022-07-27 16:27 Robert Haas <[email protected]>
parent: Amit Langote <[email protected]>
0 siblings, 1 reply; 71+ messages in thread
From: Robert Haas @ 2022-07-27 16:27 UTC (permalink / raw)
To: Amit Langote <[email protected]>; +Cc: Jacob Champion <[email protected]>; Zhihong Yu <[email protected]>; David Rowley <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers
On Tue, Jul 26, 2022 at 11:01 PM Amit Langote <[email protected]> wrote:
> Needed to be rebased again, over 2d04277121f this time.
0001 adds es_part_prune_result but does not use it, so maybe the
introduction of that field should be deferred until it's needed for
something.
I wonder whether it's really necessary to add the PartitionPruneInfo
objects to a list in PlannerInfo first and then roll them up into
PlannerGlobal later. I know we do that for range table entries, but
I've never quite understood why we do it that way instead of creating
a flat range table in PlannerGlobal from the start. And so by
extension I wonder whether this table couldn't be flat from the start
also.
--
Robert Haas
EDB: http://www.enterprisedb.com
* Re: generic plans and "initial" pruning
@ 2022-07-29 04:20 Amit Langote <[email protected]>
parent: Robert Haas <[email protected]>
0 siblings, 1 reply; 71+ messages in thread
From: Amit Langote @ 2022-07-29 04:20 UTC (permalink / raw)
To: Robert Haas <[email protected]>; +Cc: Jacob Champion <[email protected]>; Zhihong Yu <[email protected]>; David Rowley <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers
On Thu, Jul 28, 2022 at 1:27 AM Robert Haas <[email protected]> wrote:
> On Tue, Jul 26, 2022 at 11:01 PM Amit Langote <[email protected]> wrote:
> > Needed to be rebased again, over 2d04277121f this time.
Thanks for looking.
> 0001 adds es_part_prune_result but does not use it, so maybe the
> introduction of that field should be deferred until it's needed for
> something.
Oops, looks like a mistake when breaking the patch. Will move that bit to 0002.
> I wonder whether it's really necessary to add the PartitionPruneInfo
> objects to a list in PlannerInfo first and then roll them up into
> PlannerGlobal later. I know we do that for range table entries, but
> I've never quite understood why we do it that way instead of creating
> a flat range table in PlannerGlobal from the start. And so by
> extension I wonder whether this table couldn't be flat from the start
> also.
Tom may want to correct me but my understanding of why the planner
waits till the end of planning to start populating the PlannerGlobal
range table is that it is not until then that we know which subqueries
will be scanned by the final plan tree, so also whose range table
entries will be included in the range table passed to the executor. I
can see that subquery pull-up causes a pulled-up subquery's range
table entries to be added into the parent's query's and all its nodes
changed using OffsetVarNodes() to refer to the new RT indexes. But
for subqueries that are not pulled up, their subplans' nodes (present
in PlannerGlobal.subplans) would still refer to the original RT
indexes (per range table in the corresponding PlannerGlobal.subroot),
which must be fixed and the end of planning is the time to do so. Or
maybe that could be done when build_subplan() creates a subplan and
adds it to PlannerGlobal.subplans, but for some reason it's not?
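To make the renumbering step above concrete, here is a toy sketch, not PostgreSQL code. It assumes a simplified range table holding only relation names and a simplified Var holding only varno; the names ToyRangeTable, ToyVar, toy_append_rtable and toy_offset_vars are hypothetical and exist only for illustration. It shows why any node that refers to a subquery's own range table must be adjusted once that range table is appended to a flat one.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-ins for illustration; these are NOT PostgreSQL's structures. */
#define TOY_MAX_RTES 32

typedef struct ToyRangeTable
{
	const char *relnames[TOY_MAX_RTES];	/* one name per range table entry */
	int			nentries;
} ToyRangeTable;

typedef struct ToyVar
{
	int			varno;			/* 1-based index into some range table */
} ToyVar;

/*
 * Append every entry of "sub" to "flat".  Returns the offset that must be
 * added to each varno that used to index into "sub" so that it indexes
 * the same entry in "flat".
 */
static int
toy_append_rtable(ToyRangeTable *flat, const ToyRangeTable *sub)
{
	int			offset = flat->nentries;

	assert(flat->nentries + sub->nentries <= TOY_MAX_RTES);
	for (int i = 0; i < sub->nentries; i++)
		flat->relnames[flat->nentries++] = sub->relnames[i];
	return offset;
}

/* Shift each Var so that it points into the flat range table. */
static void
toy_offset_vars(ToyVar *vars, int nvars, int offset)
{
	for (int i = 0; i < nvars; i++)
		vars[i].varno += offset;
}
```

In this sketch the adjustment is one addition per Var, but it has to visit every node that carries an index, which is the per-query cost being discussed in this thread.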
--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com
* Re: generic plans and "initial" pruning
@ 2022-07-29 04:55 Tom Lane <[email protected]>
parent: Amit Langote <[email protected]>
1 sibling, 2 replies; 71+ messages in thread
From: Tom Lane @ 2022-07-29 04:55 UTC (permalink / raw)
To: Amit Langote <[email protected]>; +Cc: Robert Haas <[email protected]>; Jacob Champion <[email protected]>; Zhihong Yu <[email protected]>; David Rowley <[email protected]>; pgsql-hackers
Amit Langote <[email protected]> writes:
> On Thu, Jul 28, 2022 at 1:27 AM Robert Haas <[email protected]> wrote:
>> I wonder whether it's really necessary to add the PartitionPruneInfo
>> objects to a list in PlannerInfo first and then roll them up into
>> PlannerGlobal later. I know we do that for range table entries, but
>> I've never quite understood why we do it that way instead of creating
>> a flat range table in PlannerGlobal from the start. And so by
>> extension I wonder whether this table couldn't be flat from the start
>> also.
> Tom may want to correct me but my understanding of why the planner
> waits till the end of planning to start populating the PlannerGlobal
> range table is that it is not until then that we know which subqueries
> will be scanned by the final plan tree, so also whose range table
> entries will be included in the range table passed to the executor.
It would not be profitable to flatten the range table before we've
done remove_useless_joins. We'd end up with useless entries from
subqueries that ultimately aren't there. We could perhaps do it
after we finish that phase, but I don't really see the point: it
wouldn't be better than what we do now, just the same work at a
different time.
regards, tom lane
* Re: generic plans and "initial" pruning
@ 2022-07-29 12:22 Robert Haas <[email protected]>
parent: Tom Lane <[email protected]>
1 sibling, 1 reply; 71+ messages in thread
From: Robert Haas @ 2022-07-29 12:22 UTC (permalink / raw)
To: Tom Lane <[email protected]>; +Cc: Amit Langote <[email protected]>; Jacob Champion <[email protected]>; Zhihong Yu <[email protected]>; David Rowley <[email protected]>; pgsql-hackers
On Fri, Jul 29, 2022 at 12:55 AM Tom Lane <[email protected]> wrote:
> It would not be profitable to flatten the range table before we've
> done remove_useless_joins. We'd end up with useless entries from
> subqueries that ultimately aren't there. We could perhaps do it
> after we finish that phase, but I don't really see the point: it
> wouldn't be better than what we do now, just the same work at a
> different time.
That's not quite my question, though. Why do we ever build a non-flat
range table in the first place? Like, instead of assigning indexes
relative to the current subquery level, why not just assign them
relative to the whole query from the start? It can't really be that
we've done it this way because of remove_useless_joins(), because
we've been building separate range tables and later flattening them
for longer than join removal has existed as a feature.
What bugs me is that it's very much not free. By building a bunch of
separate range tables and combining them later, we generate extra
work: we have to go back and adjust RT indexes after-the-fact. We pay
that overhead for every query, not just the ones that end up with some
unused entries in the range table. And why would it matter if we did
end up with some useless entries in the range table, anyway? If
there's some semantic difference, we could add a flag to mark those
entries as needing to be ignored, which seems way better than crawling
all over the whole tree adjusting RTIs everywhere.
I don't really expect that we're ever going to change this -- and
certainly not on this thread. The idea of running around and replacing
RT indexes all over the tree is deeply embedded in the system. But are
we really sure we want to add a second kind of index that we have to
run around and adjust at the same time?
If we are, so be it, I guess. It just looks really ugly and unnecessary to me.
--
Robert Haas
EDB: http://www.enterprisedb.com
* Re: generic plans and "initial" pruning
@ 2022-07-29 15:04 Tom Lane <[email protected]>
parent: Tom Lane <[email protected]>
1 sibling, 1 reply; 71+ messages in thread
From: Tom Lane @ 2022-07-29 15:04 UTC (permalink / raw)
To: Robert Haas <[email protected]>; +Cc: Amit Langote <[email protected]>; Jacob Champion <[email protected]>; Zhihong Yu <[email protected]>; David Rowley <[email protected]>; pgsql-hackers
Robert Haas <[email protected]> writes:
> That's not quite my question, though. Why do we ever build a non-flat
> range table in the first place? Like, instead of assigning indexes
> relative to the current subquery level, why not just assign them
> relative to the whole query from the start?
We could probably make that work, but I'm skeptical that it would
really be an improvement overall, for a couple of reasons.
(1) The need for merge-rangetables-and-renumber-Vars logic doesn't
go away. It just moves from setrefs.c to the rewriter, which would
have to do it when expanding views. This would be a net loss
performance-wise, I think, because setrefs.c can do it as part of a
parsetree scan that it has to perform anyway for other housekeeping
reasons; but the rewriter would need a brand new pass over the tree.
Admittedly that pass would only happen for view replacement, but
it's still not open-and-shut that there'd be a performance win.
(2) The need for varlevelsup and similar fields doesn't go away,
I think, because we need those for semantic purposes such as
discovering the query level that aggregates are associated with.
That means that subquery flattening still has to make a pass over
the tree to touch every Var's varlevelsup; so not having to adjust
varno at the same time would save little.
I'm not sure whether I think it's a net plus or net minus that
varno would become effectively independent of varlevelsup.
It'd be different from the way we think of them now, for sure,
and I think it'd take a while to flush out bugs arising from such
a redefinition.
> I don't really expect that we're ever going to change this -- and
> certainly not on this thread. The idea of running around and replacing
> RT indexes all over the tree is deeply embedded in the system. But are
> we really sure we want to add a second kind of index that we have to
> run around and adjust at the same time?
You probably want to avert your eyes from [1], then ;-). Although
I'm far from convinced that the cross-list index fields currently
proposed there are actually necessary; the cost to adjust them
during rangetable merging could outweigh any benefit.
regards, tom lane
[1] https://www.postgresql.org/message-id/flat/CA+HiwqGjJDmUhDSfv-U2qhKJjt9ST7Xh9JXC_irsAQ1TAUsJYg@mail....
* Re: generic plans and "initial" pruning
@ 2022-07-29 15:56 Robert Haas <[email protected]>
parent: Tom Lane <[email protected]>
0 siblings, 0 replies; 71+ messages in thread
From: Robert Haas @ 2022-07-29 15:56 UTC (permalink / raw)
To: Tom Lane <[email protected]>; +Cc: Amit Langote <[email protected]>; Jacob Champion <[email protected]>; Zhihong Yu <[email protected]>; David Rowley <[email protected]>; pgsql-hackers
On Fri, Jul 29, 2022 at 11:04 AM Tom Lane <[email protected]> wrote:
> We could probably make that work, but I'm skeptical that it would
> really be an improvement overall, for a couple of reasons.
>
> (1) The need for merge-rangetables-and-renumber-Vars logic doesn't
> go away. It just moves from setrefs.c to the rewriter, which would
> have to do it when expanding views. This would be a net loss
> performance-wise, I think, because setrefs.c can do it as part of a
> parsetree scan that it has to perform anyway for other housekeeping
> reasons; but the rewriter would need a brand new pass over the tree.
> Admittedly that pass would only happen for view replacement, but
> it's still not open-and-shut that there'd be a performance win.
>
> (2) The need for varlevelsup and similar fields doesn't go away,
> I think, because we need those for semantic purposes such as
> discovering the query level that aggregates are associated with.
> That means that subquery flattening still has to make a pass over
> the tree to touch every Var's varlevelsup; so not having to adjust
> varno at the same time would save little.
>
> I'm not sure whether I think it's a net plus or net minus that
> varno would become effectively independent of varlevelsup.
> It'd be different from the way we think of them now, for sure,
> and I think it'd take a while to flush out bugs arising from such
> a redefinition.
Interesting. Thanks for your thoughts. I guess it's not as clear-cut
as I thought, but I still can't help feeling like we're doing an awful
lot of expensive rearrangement at the end of query planning.
I kind of wonder whether varlevelsup is the wrong idea. Like, suppose
we instead handed out subquery identifiers serially, sort of like what
we do with SubTransactionId values. Then instead of testing whether
varlevelsup>0 you test whether varsubqueryid==mysubqueryid. If you
flatten a query into its parent, you still need to adjust every var
that refers to the dead subquery, but you don't need to adjust vars
that refer to subqueries underneath it. Their level changes, but their
identity doesn't. Maybe that doesn't really help that much, but it's
always struck me as a little unfortunate that we basically test
whether a var is equal by testing whether the varno and varlevelsup
are equal. That only works if you assume that you can never end up
comparing two vars from thoroughly unrelated parts of the tree, such
that the subquery one level up from one might be different from the
subquery one level up from the other.
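The ambiguity described above can be shown with a toy sketch. This is not PostgreSQL's Var or its equal() logic: ToyVar carries only the fields relevant here, and varsubqueryid is the hypothetical serially assigned identifier floated in this message, not an existing field.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy Var for illustration; NOT PostgreSQL's Var node. */
typedef struct ToyVar
{
	int			varno;			/* range table index within the owning query */
	int			varattno;		/* attribute number */
	int			varlevelsup;	/* relative: how many query levels up */
	int			varsubqueryid;	/* hypothetical absolute id of owning query */
} ToyVar;

/* Comparison in the relative style: position is (varno, varlevelsup). */
static bool
toy_var_equal_relative(const ToyVar *a, const ToyVar *b)
{
	return a->varno == b->varno &&
		a->varattno == b->varattno &&
		a->varlevelsup == b->varlevelsup;
}

/* Comparison using the hypothetical absolute subquery identifier. */
static bool
toy_var_equal_by_id(const ToyVar *a, const ToyVar *b)
{
	return a->varno == b->varno &&
		a->varattno == b->varattno &&
		a->varsubqueryid == b->varsubqueryid;
}
```

Two Vars taken from unrelated parts of the tree, each pointing "one level up" but into different parent queries, compare equal under the relative test and unequal under the identifier test. Under this scheme, flattening a subquery would require rewriting only the Vars that name the removed subquery's identifier.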
> > I don't really expect that we're ever going to change this -- and
> > certainly not on this thread. The idea of running around and replacing
> > RT indexes all over the tree is deeply embedded in the system. But are
> > we really sure we want to add a second kind of index that we have to
> > run around and adjust at the same time?
>
> You probably want to avert your eyes from [1], then ;-). Although
> I'm far from convinced that the cross-list index fields currently
> proposed there are actually necessary; the cost to adjust them
> during rangetable merging could outweigh any benefit.
I really like the idea of that patch overall, actually; I think
permissions checking is a good example of something that shouldn't
require walking the whole query tree but currently does. And actually,
I think the same thing is true here: we shouldn't need to walk the
whole query tree to find the pruning information, but right now we do.
I'm just uncertain whether what Amit has implemented is the
least-annoying way to go about it... any thoughts on that,
specifically as it pertains to this patch?
--
Robert Haas
EDB: http://www.enterprisedb.com
* Re: generic plans and "initial" pruning
@ 2022-07-29 16:47 Tom Lane <[email protected]>
parent: Robert Haas <[email protected]>
0 siblings, 1 reply; 71+ messages in thread
From: Tom Lane @ 2022-07-29 16:47 UTC (permalink / raw)
To: Robert Haas <[email protected]>; +Cc: Amit Langote <[email protected]>; Jacob Champion <[email protected]>; Zhihong Yu <[email protected]>; David Rowley <[email protected]>; pgsql-hackers
Robert Haas <[email protected]> writes:
> ... it's
> always struck me as a little unfortunate that we basically test
> whether a var is equal by testing whether the varno and varlevelsup
> are equal. That only works if you assume that you can never end up
> comparing two vars from thoroughly unrelated parts of the tree, such
> that the subquery one level up from one might be different from the
> subquery one level up from the other.
Yeah, that's always bothered me a little as well. I've yet to see a
case where it causes a problem in practice. But I think that if, say,
we were to try to do any sort of cross-query-level optimization, then
the ambiguity could rise up to bite us. That might be a situation
where a flat rangetable would be worth the trouble.
> I'm just uncertain whether what Amit has implemented is the
> least-annoying way to go about it... any thoughts on that,
> specifically as it pertains to this patch?
I haven't looked at this patch at all. I'll try to make some
time for it, but probably not today.
regards, tom lane
* Re: generic plans and "initial" pruning
@ 2022-07-29 16:55 Robert Haas <[email protected]>
parent: Tom Lane <[email protected]>
0 siblings, 0 replies; 71+ messages in thread
From: Robert Haas @ 2022-07-29 16:55 UTC (permalink / raw)
To: Tom Lane <[email protected]>; +Cc: Amit Langote <[email protected]>; Jacob Champion <[email protected]>; Zhihong Yu <[email protected]>; David Rowley <[email protected]>; pgsql-hackers
On Fri, Jul 29, 2022 at 12:47 PM Tom Lane <[email protected]> wrote:
> > I'm just uncertain whether what Amit has implemented is the
> > least-annoying way to go about it... any thoughts on that,
> > specifically as it pertains to this patch?
>
> I haven't looked at this patch at all. I'll try to make some
> time for it, but probably not today.
OK, thanks. The preliminary patch I'm talking about here is pretty
short, so you could probably look at that part of it, at least, in
some relatively small amount of time. And I think it's also in pretty
reasonable shape apart from this issue. But, as usual, there's the
question of how well one can evaluate a preliminary patch without
reviewing the full patch in detail.
--
Robert Haas
EDB: http://www.enterprisedb.com
* Re: generic plans and "initial" pruning
@ 2022-10-12 07:36 Amit Langote <[email protected]>
parent: Amit Langote <[email protected]>
0 siblings, 1 reply; 71+ messages in thread
From: Amit Langote @ 2022-10-12 07:36 UTC (permalink / raw)
To: Robert Haas <[email protected]>; +Cc: Jacob Champion <[email protected]>; Zhihong Yu <[email protected]>; David Rowley <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers
On Fri, Jul 29, 2022 at 1:20 PM Amit Langote <[email protected]> wrote:
> On Thu, Jul 28, 2022 at 1:27 AM Robert Haas <[email protected]> wrote:
> > 0001 adds es_part_prune_result but does not use it, so maybe the
> > introduction of that field should be deferred until it's needed for
> > something.
>
> Oops, looks like a mistake when breaking the patch. Will move that bit to 0002.
Fixed that and also noticed that I had defined PartitionPruneResult in
the wrong header (execnodes.h). That led to PartitionPruneResult
nodes not being able to be written and read, because
src/backend/nodes/gen_node_support.pl doesn't create _out* and _read*
routines for the nodes defined in execnodes.h. I moved its definition
to plannodes.h, even though it is not actually the planner that
instantiates those; no other include/nodes header sounds better.
One more thing I realized is that Bitmapsets added to the List
PartitionPruneResult.valid_subplan_offs_list are not actually
read/write-able. That's a problem that I also faced in [1], so I
proposed a patch there to make Bitmapset a read/write-able Node and
mark (only) the Bitmapsets that are added into read/write-able node
trees with the corresponding NodeTag. I'm including that patch here
as well (0002) for the main patch to work (pass
-DWRITE_READ_PARSE_PLAN_TREES build tests), though it might make sense
to discuss it in its own thread?
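As a rough illustration of what "read/write-able" means here, the toy sketch below writes a set of small integers to text and parses it back, which is the kind of round trip a -DWRITE_READ_PARSE_PLAN_TREES build exercises on whole node trees. It is not the patch's code: ToyBitmapset is a single 64-bit word rather than PostgreSQL's Bitmapset, the function names are hypothetical, and the "(b 1 3 5)" text form is an assumption modeled on how set-valued fields appear in node output.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Toy set of members 0..63 in one word; NOT PostgreSQL's Bitmapset. */
typedef uint64_t ToyBitmapset;

/* Write "(b m1 m2 ...)" into buf.  Returns 0, or -1 if buf is too small. */
static int
toy_out_bitmapset(ToyBitmapset set, char *buf, size_t buflen)
{
	size_t		used;
	int			n;

	n = snprintf(buf, buflen, "(b");
	if (n < 0 || (size_t) n >= buflen)
		return -1;
	used = (size_t) n;
	for (int member = 0; member < 64; member++)
	{
		if ((set & (UINT64_C(1) << member)) == 0)
			continue;
		n = snprintf(buf + used, buflen - used, " %d", member);
		if (n < 0 || (size_t) n >= buflen - used)
			return -1;
		used += (size_t) n;
	}
	n = snprintf(buf + used, buflen - used, ")");
	if (n < 0 || (size_t) n >= buflen - used)
		return -1;
	return 0;
}

/* Parse the text form back.  Returns 0, or -1 on malformed input. */
static int
toy_read_bitmapset(const char *text, ToyBitmapset *result)
{
	const char *p = text;
	ToyBitmapset set = 0;

	if (strncmp(p, "(b", 2) != 0)
		return -1;
	p += 2;
	for (;;)
	{
		char	   *end;
		long		member;

		while (*p == ' ')
			p++;
		if (*p == ')')
			break;
		member = strtol(p, &end, 10);
		if (end == p || member < 0 || member > 63)
			return -1;			/* not a number, or out of range */
		set |= UINT64_C(1) << member;
		p = end;
	}
	*result = set;
	return 0;
}
```

If a value stored in a node tree has no writer or no reader, the round trip fails at that value, which is the problem described above for the Bitmapsets in valid_subplan_offs_list.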
--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com
[1] https://www.postgresql.org/message-id/CA%2BHiwqH80qX1ZLx3HyHmBrOzLQeuKuGx6FzGep0F_9zw9L4PAA%40mail.g...
Attachments:
[application/octet-stream] v21-0001-Move-PartitioPruneInfo-out-of-plan-nodes-into-Pl.patch (17.2K, 2-v21-0001-Move-PartitioPruneInfo-out-of-plan-nodes-into-Pl.patch)
From 06cda14113c3572440a716a4aacb250b2ed52f52 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Fri, 27 May 2022 16:00:28 +0900
Subject: [PATCH v21 1/3] Move PartitionPruneInfo out of plan nodes into
PlannedStmt
The planner will now add a given PartitionPruneInfo to
PlannedStmt.partPruneInfos instead of directly to the
Append/MergeAppend plan node. What gets set instead in the
latter is an index field which points to the list element
of PlannedStmt.partPruneInfos containing the PartitionPruneInfo
belonging to the plan node.
A later commit will make AcquireExecutorLocks() do the initial
partition pruning to determine a minimal set of partitions to be
locked when validating a plan tree, and it will need to consult the
PartitionPruneInfos referenced therein to do so. It is better for
the PartitionPruneInfos to be accessible directly, by simply
iterating over PlannedStmt.partPruneInfos, than to require a walk
of the plan tree to find them.
---
src/backend/executor/execMain.c | 1 +
src/backend/executor/execParallel.c | 1 +
src/backend/executor/execPartition.c | 4 +-
src/backend/executor/execUtils.c | 1 +
src/backend/executor/nodeAppend.c | 4 +-
src/backend/executor/nodeMergeAppend.c | 4 +-
src/backend/optimizer/plan/createplan.c | 24 ++++-----
src/backend/optimizer/plan/planner.c | 1 +
src/backend/optimizer/plan/setrefs.c | 65 +++++++++++++------------
src/backend/partitioning/partprune.c | 18 ++++---
src/include/executor/execPartition.h | 3 +-
src/include/nodes/execnodes.h | 1 +
src/include/nodes/pathnodes.h | 6 +++
src/include/nodes/plannodes.h | 11 +++--
src/include/partitioning/partprune.h | 8 +--
15 files changed, 90 insertions(+), 62 deletions(-)
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index d78862e660..32475e33ff 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -825,6 +825,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
ExecInitRangeTable(estate, rangeTable);
estate->es_plannedstmt = plannedstmt;
+ estate->es_part_prune_infos = plannedstmt->partPruneInfos;
/*
* Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index 99512826c5..aca0c6f323 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -183,6 +183,7 @@ ExecSerializePlan(Plan *plan, EState *estate)
pstmt->dependsOnRole = false;
pstmt->parallelModeNeeded = false;
pstmt->planTree = plan;
+ pstmt->partPruneInfos = estate->es_part_prune_infos;
pstmt->rtable = estate->es_range_table;
pstmt->resultRelations = NIL;
pstmt->appendRelations = NIL;
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 40e3c07693..80197d5141 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -1791,11 +1791,13 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
PartitionPruneState *
ExecInitPartitionPruning(PlanState *planstate,
int n_total_subplans,
- PartitionPruneInfo *pruneinfo,
+ int part_prune_index,
Bitmapset **initially_valid_subplans)
{
PartitionPruneState *prunestate;
EState *estate = planstate->state;
+ PartitionPruneInfo *pruneinfo = list_nth(estate->es_part_prune_infos,
+ part_prune_index);
/* We may need an expression context to evaluate partition exprs */
ExecAssignExprContext(estate, planstate);
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 9df1f81ea8..21f4c10937 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -119,6 +119,7 @@ CreateExecutorState(void)
estate->es_relations = NULL;
estate->es_rowmarks = NULL;
estate->es_plannedstmt = NULL;
+ estate->es_part_prune_infos = NIL;
estate->es_junkFilter = NULL;
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 357e10a1d7..c6f86a6510 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -134,7 +134,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
appendstate->as_begun = false;
/* If run-time partition pruning is enabled, then set that up now */
- if (node->part_prune_info != NULL)
+ if (node->part_prune_index >= 0)
{
PartitionPruneState *prunestate;
@@ -145,7 +145,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
*/
prunestate = ExecInitPartitionPruning(&appendstate->ps,
list_length(node->appendplans),
- node->part_prune_info,
+ node->part_prune_index,
&validsubplans);
appendstate->as_prune_state = prunestate;
nplans = bms_num_members(validsubplans);
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index c5c62fa5c7..8d35860c30 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -82,7 +82,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
mergestate->ps.ExecProcNode = ExecMergeAppend;
/* If run-time partition pruning is enabled, then set that up now */
- if (node->part_prune_info != NULL)
+ if (node->part_prune_index >= 0)
{
PartitionPruneState *prunestate;
@@ -93,7 +93,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
*/
prunestate = ExecInitPartitionPruning(&mergestate->ps,
list_length(node->mergeplans),
- node->part_prune_info,
+ node->part_prune_index,
&validsubplans);
mergestate->ms_prune_state = prunestate;
nplans = bms_num_members(validsubplans);
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index ab4d8e201d..2bfb817d75 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -1203,7 +1203,6 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
ListCell *subpaths;
int nasyncplans = 0;
RelOptInfo *rel = best_path->path.parent;
- PartitionPruneInfo *partpruneinfo = NULL;
int nodenumsortkeys = 0;
AttrNumber *nodeSortColIdx = NULL;
Oid *nodeSortOperators = NULL;
@@ -1354,6 +1353,9 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
subplans = lappend(subplans, subplan);
}
+ /* Set below if we find quals that we can use to run-time prune */
+ plan->part_prune_index = -1;
+
/*
* If any quals exist, they may be useful to perform further partition
* pruning during execution. Gather information needed by the executor to
@@ -1377,16 +1379,14 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
}
if (prunequal != NIL)
- partpruneinfo =
- make_partition_pruneinfo(root, rel,
- best_path->subpaths,
- prunequal);
+ plan->part_prune_index = make_partition_pruneinfo(root, rel,
+ best_path->subpaths,
+ prunequal);
}
plan->appendplans = subplans;
plan->nasyncplans = nasyncplans;
plan->first_partial_plan = best_path->first_partial_path;
- plan->part_prune_info = partpruneinfo;
copy_generic_path_info(&plan->plan, (Path *) best_path);
@@ -1425,7 +1425,6 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
List *subplans = NIL;
ListCell *subpaths;
RelOptInfo *rel = best_path->path.parent;
- PartitionPruneInfo *partpruneinfo = NULL;
/*
* We don't have the actual creation of the MergeAppend node split out
@@ -1518,6 +1517,9 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
subplans = lappend(subplans, subplan);
}
+ /* Set below if we find quals that we can use to run-time prune */
+ node->part_prune_index = -1;
+
/*
* If any quals exist, they may be useful to perform further partition
* pruning during execution. Gather information needed by the executor to
@@ -1541,13 +1543,13 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
}
if (prunequal != NIL)
- partpruneinfo = make_partition_pruneinfo(root, rel,
- best_path->subpaths,
- prunequal);
+ node->part_prune_index = make_partition_pruneinfo(root, rel,
+ best_path->subpaths,
+ prunequal);
}
node->mergeplans = subplans;
- node->part_prune_info = partpruneinfo;
+
/*
* If prepare_sort_from_pathkeys added sort columns, but we were told to
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 5d0fd6e072..31fff597a7 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -519,6 +519,7 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
result->dependsOnRole = glob->dependsOnRole;
result->parallelModeNeeded = glob->parallelModeNeeded;
result->planTree = top_plan;
+ result->partPruneInfos = glob->partPruneInfos;
result->rtable = glob->finalrtable;
result->resultRelations = glob->resultRelations;
result->appendRelations = glob->appendRelations;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 1cb0abdbc1..720f20f563 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -348,6 +348,29 @@ set_plan_references(PlannerInfo *root, Plan *plan)
}
}
+ /* Also fix up the information in PartitionPruneInfos. */
+ foreach (lc, root->partPruneInfos)
+ {
+ PartitionPruneInfo *pruneinfo = lfirst(lc);
+ ListCell *l;
+
+ foreach(l, pruneinfo->prune_infos)
+ {
+ List *prune_infos = lfirst(l);
+ ListCell *l2;
+
+ foreach(l2, prune_infos)
+ {
+ PartitionedRelPruneInfo *pinfo = lfirst(l2);
+
+ /* RT index of the table to which the pinfo belongs. */
+ pinfo->rtindex += rtoffset;
+ }
+ }
+
+ glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
+ }
+
return result;
}
@@ -1658,21 +1681,12 @@ set_append_references(PlannerInfo *root,
aplan->apprelids = offset_relid_set(aplan->apprelids, rtoffset);
- if (aplan->part_prune_info)
- {
- foreach(l, aplan->part_prune_info->prune_infos)
- {
- List *prune_infos = lfirst(l);
- ListCell *l2;
-
- foreach(l2, prune_infos)
- {
- PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
- pinfo->rtindex += rtoffset;
- }
- }
- }
+ /*
+ * PartitionPruneInfos will be added to a list in PlannerGlobal, so update
+ * the index.
+ */
+ if (aplan->part_prune_index >= 0)
+ aplan->part_prune_index += list_length(root->glob->partPruneInfos);
/* We don't need to recurse to lefttree or righttree ... */
Assert(aplan->plan.lefttree == NULL);
@@ -1734,21 +1748,12 @@ set_mergeappend_references(PlannerInfo *root,
mplan->apprelids = offset_relid_set(mplan->apprelids, rtoffset);
- if (mplan->part_prune_info)
- {
- foreach(l, mplan->part_prune_info->prune_infos)
- {
- List *prune_infos = lfirst(l);
- ListCell *l2;
-
- foreach(l2, prune_infos)
- {
- PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
- pinfo->rtindex += rtoffset;
- }
- }
- }
+ /*
+ * PartitionPruneInfos will be added to a list in PlannerGlobal, so update
+ * the index.
+ */
+ if (mplan->part_prune_index >= 0)
+ mplan->part_prune_index += list_length(root->glob->partPruneInfos);
/* We don't need to recurse to lefttree or righttree ... */
Assert(mplan->plan.lefttree == NULL);
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 6188bf69cb..6565b6ed01 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -209,16 +209,20 @@ static void partkey_datum_from_expr(PartitionPruneContext *context,
/*
* make_partition_pruneinfo
- * Builds a PartitionPruneInfo which can be used in the executor to allow
- * additional partition pruning to take place. Returns NULL when
- * partition pruning would be useless.
+ * Checks if the given set of quals can be used to build pruning steps
+ * that the executor can use to prune away unneeded partitions. If
+ * suitable quals are found then a PartitionPruneInfo is built and tagged
+ * onto the PlannerInfo's partPruneInfos list.
+ *
+ * The return value is the 0-based index of the item added to the
+ * partPruneInfos list or -1 if nothing was added.
*
* 'parentrel' is the RelOptInfo for an appendrel, and 'subpaths' is the list
* of scan paths for its child rels.
* 'prunequal' is a list of potential pruning quals (i.e., restriction
* clauses that are applicable to the appendrel).
*/
-PartitionPruneInfo *
+int
make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
List *subpaths,
List *prunequal)
@@ -332,7 +336,7 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
* quals, then we can just not bother with run-time pruning.
*/
if (prunerelinfos == NIL)
- return NULL;
+ return -1;
/* Else build the result data structure */
pruneinfo = makeNode(PartitionPruneInfo);
@@ -358,7 +362,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
else
pruneinfo->other_subplans = NULL;
- return pruneinfo;
+ root->partPruneInfos = lappend(root->partPruneInfos, pruneinfo);
+
+ return list_length(root->partPruneInfos) - 1;
}
/*
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 708435e952..bf962af7af 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -123,9 +123,8 @@ typedef struct PartitionPruneState
extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
int n_total_subplans,
- PartitionPruneInfo *pruneinfo,
+ int part_prune_index,
Bitmapset **initially_valid_subplans);
extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
bool initial_prune);
-
#endif /* EXECPARTITION_H */
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 01b1727fc0..4a741b053f 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -611,6 +611,7 @@ typedef struct EState
struct ExecRowMark **es_rowmarks; /* Array of per-range-table-entry
* ExecRowMarks, or NULL if none */
PlannedStmt *es_plannedstmt; /* link to top of plan tree */
+ List *es_part_prune_infos; /* PlannedStmt.partPruneInfos */
const char *es_sourceText; /* Source text from QueryDesc */
JunkFilter *es_junkFilter; /* top-level junk filter, if any */
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 6bda383bea..e392fb6fc0 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -122,6 +122,9 @@ typedef struct PlannerGlobal
/* "flat" list of AppendRelInfos */
List *appendRelations;
+ /* List of PartitionPruneInfo contained in the plan */
+ List *partPruneInfos;
+
/* OIDs of relations the plan depends on */
List *relationOids;
@@ -503,6 +506,9 @@ struct PlannerInfo
/* Does this query modify any partition key columns? */
bool partColsUpdated;
+
+ /* PartitionPruneInfos added in this query's plan. */
+ List *partPruneInfos;
};
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 21e642a64c..3eb3e6e527 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -70,6 +70,9 @@ typedef struct PlannedStmt
struct Plan *planTree; /* tree of Plan nodes */
+ List *partPruneInfos; /* List of PartitionPruneInfo contained in
+ * the plan */
+
List *rtable; /* list of RangeTblEntry nodes */
/* rtable indexes of target relations for INSERT/UPDATE/DELETE */
@@ -270,8 +273,8 @@ typedef struct Append
*/
int first_partial_plan;
- /* Info for run-time subplan pruning; NULL if we're not doing that */
- struct PartitionPruneInfo *part_prune_info;
+ /* Index to PlannerInfo.partPruneInfos or -1 if no run-time pruning */
+ int part_prune_index;
} Append;
/* ----------------
@@ -305,8 +308,8 @@ typedef struct MergeAppend
/* NULLS FIRST/LAST directions */
bool *nullsFirst pg_node_attr(array_size(numCols));
- /* Info for run-time subplan pruning; NULL if we're not doing that */
- struct PartitionPruneInfo *part_prune_info;
+ /* Index to PlannerInfo.partPruneInfos or -1 if no run-time pruning */
+ int part_prune_index;
} MergeAppend;
/* ----------------
diff --git a/src/include/partitioning/partprune.h b/src/include/partitioning/partprune.h
index 90684efa25..ebf0dcff8c 100644
--- a/src/include/partitioning/partprune.h
+++ b/src/include/partitioning/partprune.h
@@ -70,10 +70,10 @@ typedef struct PartitionPruneContext
#define PruneCxtStateIdx(partnatts, step_id, keyno) \
((partnatts) * (step_id) + (keyno))
-extern PartitionPruneInfo *make_partition_pruneinfo(struct PlannerInfo *root,
- struct RelOptInfo *parentrel,
- List *subpaths,
- List *prunequal);
+extern int make_partition_pruneinfo(struct PlannerInfo *root,
+ struct RelOptInfo *parentrel,
+ List *subpaths,
+ List *prunequal);
extern Bitmapset *prune_append_rel_partitions(struct RelOptInfo *rel);
extern Bitmapset *get_matching_partitions(PartitionPruneContext *context,
List *pruning_steps);
--
2.35.3
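
To make the index bookkeeping in the patch above easier to follow, here is a minimal standalone sketch of the idea: the Append/MergeAppend node keeps an integer index instead of a pointer, and that index is rebased when a query's PartitionPruneInfos are moved into the global list. This is not the patch's code. The types and functions below (PruneInfo, InfoList, info_list_append, rebase_index, flatten_into_global, lookup) are hypothetical stand-ins for PartitionPruneInfo, List, lappend/list_length, the part_prune_index adjustment in setrefs.c, and list_nth.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_INFOS 16

/* Hypothetical stand-in for PartitionPruneInfo; only a label is kept. */
typedef struct PruneInfo
{
	int			id;
} PruneInfo;

/* Hypothetical stand-in for a List of PruneInfo pointers. */
typedef struct InfoList
{
	const PruneInfo *items[MAX_INFOS];
	int			len;
} InfoList;

/*
 * Append an item and return its 0-based index, in the spirit of what
 * make_partition_pruneinfo() returns in the patch.
 */
static int
info_list_append(InfoList *list, const PruneInfo *info)
{
	assert(list->len < MAX_INFOS);
	list->items[list->len] = info;
	return list->len++;
}

/*
 * Rebase a per-query index so that it points into the global list.  This
 * must run before the per-query entries are appended to the global list.
 * An index of -1 ("no run-time pruning") is left alone.
 */
static int
rebase_index(int local_index, const InfoList *global)
{
	return (local_index >= 0) ? local_index + global->len : -1;
}

/* Append every per-query entry to the end of the global list. */
static void
flatten_into_global(InfoList *global, const InfoList *local)
{
	for (int i = 0; i < local->len; i++)
		info_list_append(global, local->items[i]);
}

/* Fetch an entry by index, as the executor would with list_nth(). */
static const PruneInfo *
lookup(const InfoList *global, int index)
{
	assert(index >= 0 && index < global->len);
	return global->items[index];
}
```

With two entries from a first query and one from a second, the second query's local index 0 becomes global index 2 after rebasing, and lookup() on the global list returns the same entry the plan node originally referred to.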
[application/octet-stream] v21-0003-Optimize-AcquireExecutorLocks-by-locking-only-un.patch (81.7K, 3-v21-0003-Optimize-AcquireExecutorLocks-by-locking-only-un.patch)
From ce28c4cfe8bc69e313ba7f59b048fe96f73139a6 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Wed, 22 Dec 2021 16:55:17 +0900
Subject: [PATCH v21 3/3] Optimize AcquireExecutorLocks() by locking only
unpruned partitions
This commit teaches AcquireExecutorLocks() to perform initial
partition pruning to notionally eliminate the subnodes contained in a
generic cached plan that need not be initialized during the actual
execution of the plan and skip locking the partitions scanned by those
subnodes.
The result of performing initial partition pruning this way, before the
actual execution has started, is passed to the actual execution as a
PartitionPruneResult, which callers of the executor that used plancache.c
to get the plan supply along with the PlannedStmt. It is NULL when the
plan is obtained by calling the planner directly or when the plan obtained
by plancache.c is not a generic one.
---
src/backend/commands/copyto.c | 2 +-
src/backend/commands/createas.c | 2 +-
src/backend/commands/explain.c | 7 +-
src/backend/commands/extension.c | 2 +-
src/backend/commands/matview.c | 2 +-
src/backend/commands/prepare.c | 26 ++-
src/backend/executor/README | 32 ++++
src/backend/executor/execMain.c | 55 ++++++
src/backend/executor/execParallel.c | 27 ++-
src/backend/executor/execPartition.c | 238 +++++++++++++++++++++----
src/backend/executor/execUtils.c | 1 +
src/backend/executor/functions.c | 2 +-
src/backend/executor/nodeAppend.c | 11 +-
src/backend/executor/nodeMergeAppend.c | 5 +-
src/backend/executor/spi.c | 27 ++-
src/backend/nodes/readfuncs.c | 8 +-
src/backend/optimizer/plan/planner.c | 2 +
src/backend/optimizer/plan/setrefs.c | 46 +++++
src/backend/partitioning/partprune.c | 41 ++++-
src/backend/tcop/postgres.c | 8 +-
src/backend/tcop/pquery.c | 28 ++-
src/backend/utils/cache/plancache.c | 187 ++++++++++++++++---
src/backend/utils/mmgr/portalmem.c | 19 ++
src/include/commands/explain.h | 4 +-
src/include/executor/execPartition.h | 9 +-
src/include/executor/execdesc.h | 3 +
src/include/executor/executor.h | 2 +
src/include/nodes/execnodes.h | 1 +
src/include/nodes/nodes.h | 1 +
src/include/nodes/pathnodes.h | 12 ++
src/include/nodes/plannodes.h | 47 +++++
src/include/utils/plancache.h | 3 +-
src/include/utils/portal.h | 3 +
33 files changed, 763 insertions(+), 100 deletions(-)
diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index 2527e66059..df4b0dcf0e 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -558,7 +558,7 @@ BeginCopyTo(ParseState *pstate,
((DR_copy *) dest)->cstate = cstate;
/* Create a QueryDesc requesting no output */
- cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ cstate->queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
GetActiveSnapshot(),
InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 152c29b551..462651910a 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -325,7 +325,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index f86983c660..219c63fa81 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -407,7 +407,7 @@ ExplainOneQuery(Query *query, int cursorOptions,
}
/* run it (if needed) and produce output */
- ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+ ExplainOnePlan(plan, NULL, into, es, queryString, params, queryEnv,
&planduration, (es->buffers ? &bufusage : NULL));
}
}
@@ -515,7 +515,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
* to call it.
*/
void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, PartitionPruneResult *part_prune_result,
+ IntoClause *into, ExplainState *es,
const char *queryString, ParamListInfo params,
QueryEnvironment *queryEnv, const instr_time *planduration,
const BufferUsage *bufusage)
@@ -563,7 +564,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
dest = None_Receiver;
/* Create a QueryDesc for the query */
- queryDesc = CreateQueryDesc(plannedstmt, queryString,
+ queryDesc = CreateQueryDesc(plannedstmt, part_prune_result, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, instrument_option);
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index 6b6720c690..374c0ff807 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -776,7 +776,7 @@ execute_sql_string(const char *sql)
{
QueryDesc *qdesc;
- qdesc = CreateQueryDesc(stmt,
+ qdesc = CreateQueryDesc(stmt, NULL,
sql,
GetActiveSnapshot(), NULL,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 9ac0383459..b0ed96e56c 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -408,7 +408,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, queryString,
+ queryDesc = CreateQueryDesc(plan, NULL, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index c4b54d0547..69e02e0346 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -155,6 +155,7 @@ ExecuteQuery(ParseState *pstate,
PreparedStatement *entry;
CachedPlan *cplan;
List *plan_list;
+ List *part_prune_result_list;
ParamListInfo paramLI = NULL;
EState *estate = NULL;
Portal portal;
@@ -193,7 +194,10 @@ ExecuteQuery(ParseState *pstate,
entry->plansource->query_string);
/* Replan if needed, and increment plan refcount for portal */
- cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL);
+ cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL,
+ &part_prune_result_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_result_list));
plan_list = cplan->stmt_list;
/*
@@ -207,6 +211,9 @@ ExecuteQuery(ParseState *pstate,
plan_list,
cplan);
+ /* Copy PartitionPruneResults into the portal's context. */
+ PortalStorePartitionPruneResults(portal, part_prune_result_list);
+
/*
* For CREATE TABLE ... AS EXECUTE, we must verify that the prepared
* statement is one that produces tuples. Currently we insist that it be
@@ -576,7 +583,9 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
const char *query_string;
CachedPlan *cplan;
List *plan_list;
- ListCell *p;
+ List *part_prune_result_list;
+ ListCell *p,
+ *pp;
ParamListInfo paramLI = NULL;
EState *estate = NULL;
instr_time planstart;
@@ -619,7 +628,10 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
/* Replan if needed, and acquire a transient refcount */
cplan = GetCachedPlan(entry->plansource, paramLI,
- CurrentResourceOwner, queryEnv);
+ CurrentResourceOwner, queryEnv,
+ &part_prune_result_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_result_list));
INSTR_TIME_SET_CURRENT(planduration);
INSTR_TIME_SUBTRACT(planduration, planstart);
@@ -634,13 +646,15 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
plan_list = cplan->stmt_list;
/* Explain each query */
- foreach(p, plan_list)
+ forboth(p, plan_list, pp, part_prune_result_list)
{
PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
+ PartitionPruneResult *part_prune_result = lfirst_node(PartitionPruneResult, pp);
if (pstmt->commandType != CMD_UTILITY)
- ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv,
- &planduration, (es->buffers ? &bufusage : NULL));
+ ExplainOnePlan(pstmt, part_prune_result, into, es, query_string,
+ paramLI, queryEnv, &planduration,
+ (es->buffers ? &bufusage : NULL));
else
ExplainOneUtility(pstmt->utilityStmt, into, es, query_string,
paramLI, queryEnv);
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 0b5183fc4a..953a476ea5 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -65,6 +65,34 @@ found there. This currently only occurs for Append and MergeAppend nodes. In
this case the non-required subplans are ignored and the executor state's
subnode array will become out of sequence to the plan's subplan list.
+In fact, so-called execution-time pruning may occur even before
+execution has started. One case where that occurs is when a cached generic
+plan is being validated for execution by plancache.c: GetCachedPlan(), which
+works by locking all the relations that will be scanned by that plan. If the
+generic plan contains nodes that can perform execution time partition pruning
+(that is, contain a PartitionPruneInfo), a subset of pruning steps contained
+in a given node's PartitionPruneInfo that do not depend on the execution
+actually having started (called "initial" pruning steps) are performed at this
+point to figure out the minimal set of child subplans that satisfy those
+pruning steps. AcquireExecutorLocks() looking at a given plan tree will then
+lock only the relations scanned by the child subplans that survived such
+pruning, along with those present in PlannedStmt.minLockRelids. Note that the
+subplans are only notionally pruned in that they are not removed from the plan
+tree as such.
+
+To prevent the executor and any third party execution code that can look at
+the plan tree from trying to execute the subplans that were pruned as
+described above, the result of pruning is passed to the executor as a
+PartitionPruneResult node via the QueryDesc. It consists of the set of
+indexes of surviving subplans in their respective parent plan node's list of
+child subplans, saved as a list of bitmapsets, with one element for every
+parent plan node whose PartitionPruneInfo is present in
+PlannedStmt.partPruneInfos. In other words, the executor should not
+re-evaluate the set of initially valid subplans by redoing the initial pruning
+if it was already done by AcquireExecutorLocks(), because the re-evaluation may
+very well end up resulting in a different set of subplans, containing some
+whose relations were not locked by AcquireExecutorLocks().
+
Each Plan node may have expression trees associated with it, to represent
its target list, qualification conditions, etc. These trees are also
read-only to the executor, but the executor state for expression evaluation
@@ -286,6 +314,10 @@ Query Processing Control Flow
This is a sketch of control flow for full query processing:
+ [ ExecutorDoInitialPruning ] --- an optional step to perform initial
+ partition pruning on the plan tree, the result of which is passed
+ to the executor via QueryDesc
+
CreateQueryDesc
ExecutorStart
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 32475e33ff..6e2cd1596f 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -49,6 +49,7 @@
#include "commands/matview.h"
#include "commands/trigger.h"
#include "executor/execdebug.h"
+#include "executor/execPartition.h"
#include "executor/nodeSubplan.h"
#include "foreign/fdwapi.h"
#include "jit/jit.h"
@@ -104,6 +105,58 @@ static void EvalPlanQualStart(EPQState *epqstate, Plan *planTree);
/* end of local decls */
+/* ----------------------------------------------------------------
+ * ExecutorDoInitialPruning
+ *
+ * For each plan tree node that has been assigned a PartitionPruneInfo,
+ * this performs initial partition pruning using the information contained
+ * therein to determine the set of child subplans that satisfy the initial
+ * pruning steps, to be returned as a bitmapset of their indexes in the
+ * node's list of child subplans (for example, an Append's appendplans).
+ *
+ * Return value is a PartitionPruneResult node that contains a list of those
+ * bitmapsets, with one element for every PartitionPruneInfo, and a bitmapset
+ * of the RT indexes of all the leaf partitions scanned by those chosen
+ * subplans. Note that the latter is shared across all PartitionPruneInfos.
+ *
+ * The executor must see exactly the same set of subplans as valid for
+ * execution when doing ExecInitNode() on the plan nodes whose
+ * PartitionPruneInfos are processed here. So, it must get the set from the
+ * aforementioned PartitionPruneResult, instead of computing it all over
+ * again by redoing the initial pruning. It's the caller's job to pass the
+ * PartitionPruneResult to the executor.
+ *
+ * Note: Partitioned tables mentioned in PartitionedRelPruneInfo nodes that
+ * drive the pruning will be locked before doing the pruning.
+ * ----------------------------------------------------------------
+ */
+PartitionPruneResult *
+ExecutorDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params)
+{
+ PartitionPruneResult *result;
+ ListCell *lc;
+
+ /* Only get here if there is any pruning to do. */
+ Assert(plannedstmt->containsInitialPruning);
+
+ result = makeNode(PartitionPruneResult);
+ foreach(lc, plannedstmt->partPruneInfos)
+ {
+ PartitionPruneInfo *pruneinfo = lfirst(lc);
+ Bitmapset *valid_subplan_offs;
+
+ valid_subplan_offs =
+ ExecPartitionDoInitialPruning(plannedstmt, params, pruneinfo,
+ &result->scan_leafpart_rtis);
+ if (valid_subplan_offs)
+ valid_subplan_offs->type = T_Bitmapset;
+ result->valid_subplan_offs_list =
+ lappend(result->valid_subplan_offs_list,
+ valid_subplan_offs);
+ }
+
+ return result;
+}
/* ----------------------------------------------------------------
* ExecutorStart
@@ -806,6 +859,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
{
CmdType operation = queryDesc->operation;
PlannedStmt *plannedstmt = queryDesc->plannedstmt;
+ PartitionPruneResult *part_prune_result = queryDesc->part_prune_result;
Plan *plan = plannedstmt->planTree;
List *rangeTable = plannedstmt->rtable;
EState *estate = queryDesc->estate;
@@ -826,6 +880,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
estate->es_plannedstmt = plannedstmt;
estate->es_part_prune_infos = plannedstmt->partPruneInfos;
+ estate->es_part_prune_result = part_prune_result;
/*
* Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index aca0c6f323..abae5b8623 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -66,6 +66,7 @@
#define PARALLEL_KEY_QUERY_TEXT UINT64CONST(0xE000000000000008)
#define PARALLEL_KEY_JIT_INSTRUMENTATION UINT64CONST(0xE000000000000009)
#define PARALLEL_KEY_WAL_USAGE UINT64CONST(0xE00000000000000A)
+#define PARALLEL_KEY_PARTITIONPRUNERESULT UINT64CONST(0xE00000000000000B)
#define PARALLEL_TUPLE_QUEUE_SIZE 65536
@@ -182,6 +183,7 @@ ExecSerializePlan(Plan *plan, EState *estate)
pstmt->transientPlan = false;
pstmt->dependsOnRole = false;
pstmt->parallelModeNeeded = false;
+ pstmt->containsInitialPruning = false;
pstmt->planTree = plan;
pstmt->partPruneInfos = estate->es_part_prune_infos;
pstmt->rtable = estate->es_range_table;
@@ -597,12 +599,15 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
FixedParallelExecutorState *fpes;
char *pstmt_data;
char *pstmt_space;
+ char *part_prune_result_data;
+ char *part_prune_result_space;
char *paramlistinfo_space;
BufferUsage *bufusage_space;
WalUsage *walusage_space;
SharedExecutorInstrumentation *instrumentation = NULL;
SharedJitInstrumentation *jit_instrumentation = NULL;
int pstmt_len;
+ int part_prune_result_len;
int paramlistinfo_len;
int instrumentation_len = 0;
int jit_instrumentation_len = 0;
@@ -631,6 +636,7 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
/* Fix up and serialize plan to be sent to workers. */
pstmt_data = ExecSerializePlan(planstate->plan, estate);
+ part_prune_result_data = nodeToString(estate->es_part_prune_result);
/* Create a parallel context. */
pcxt = CreateParallelContext("postgres", "ParallelQueryMain", nworkers);
@@ -657,6 +663,11 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
shm_toc_estimate_chunk(&pcxt->estimator, pstmt_len);
shm_toc_estimate_keys(&pcxt->estimator, 1);
+ /* Estimate space for serialized PartitionPruneResult. */
+ part_prune_result_len = strlen(part_prune_result_data) + 1;
+ shm_toc_estimate_chunk(&pcxt->estimator, part_prune_result_len);
+ shm_toc_estimate_keys(&pcxt->estimator, 1);
+
/* Estimate space for serialized ParamListInfo. */
paramlistinfo_len = EstimateParamListSpace(estate->es_param_list_info);
shm_toc_estimate_chunk(&pcxt->estimator, paramlistinfo_len);
@@ -751,6 +762,12 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
memcpy(pstmt_space, pstmt_data, pstmt_len);
shm_toc_insert(pcxt->toc, PARALLEL_KEY_PLANNEDSTMT, pstmt_space);
+ /* Store serialized PartitionPruneResult */
+ part_prune_result_space = shm_toc_allocate(pcxt->toc, part_prune_result_len);
+ memcpy(part_prune_result_space, part_prune_result_data, part_prune_result_len);
+ shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARTITIONPRUNERESULT,
+ part_prune_result_space);
+
/* Store serialized ParamListInfo. */
paramlistinfo_space = shm_toc_allocate(pcxt->toc, paramlistinfo_len);
shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARAMLISTINFO, paramlistinfo_space);
@@ -1232,8 +1249,10 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
int instrument_options)
{
char *pstmtspace;
+ char *part_prune_result_space;
char *paramspace;
PlannedStmt *pstmt;
+ PartitionPruneResult *part_prune_result;
ParamListInfo paramLI;
char *queryString;
@@ -1244,12 +1263,18 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
pstmtspace = shm_toc_lookup(toc, PARALLEL_KEY_PLANNEDSTMT, false);
pstmt = (PlannedStmt *) stringToNode(pstmtspace);
+ /* Reconstruct leader-supplied PartitionPruneResult. */
+ part_prune_result_space =
+ shm_toc_lookup(toc, PARALLEL_KEY_PARTITIONPRUNERESULT, false);
+ part_prune_result = (PartitionPruneResult *)
+ stringToNode(part_prune_result_space);
+
/* Reconstruct ParamListInfo. */
paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
 paramLI = RestoreParamList(&paramspace);
/* Create a QueryDesc for the query. */
- return CreateQueryDesc(pstmt,
+ return CreateQueryDesc(pstmt, part_prune_result,
queryString,
GetActiveSnapshot(), InvalidSnapshot,
receiver, paramLI, NULL, instrument_options);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 80197d5141..b612c24d62 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -25,6 +25,7 @@
#include "mb/pg_wchar.h"
#include "miscadmin.h"
#include "nodes/makefuncs.h"
+#include "parser/parsetree.h"
#include "partitioning/partbounds.h"
#include "partitioning/partdesc.h"
#include "partitioning/partprune.h"
@@ -185,7 +186,11 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
static PartitionPruneState *CreatePartitionPruneState(PlanState *planstate,
- PartitionPruneInfo *pruneinfo);
+ PartitionPruneInfo *pruneinfo,
+ bool consider_initial_steps,
+ bool consider_exec_steps,
+ List *rtable, ExprContext *econtext,
+ PartitionDirectory partdir);
static void InitPartitionPruneContext(PartitionPruneContext *context,
List *pruning_steps,
PartitionDesc partdesc,
@@ -198,7 +203,8 @@ static void PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
- Bitmapset **validsubplans);
+ Bitmapset **validsubplans,
+ Bitmapset **scan_leafpart_rtis);
/*
@@ -1746,8 +1752,10 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
* considered to be a stable expression, it can change value from one plan
* node scan to the next during query execution. Stable comparison
* expressions that don't involve such Params allow partition pruning to be
- * done once during executor startup. Expressions that do involve such Params
- * require us to prune separately for each scan of the parent plan node.
+ * done once during executor startup or during ExecutorDoInitialPruning(), which
+ * runs as part of performing AcquireExecutorLocks() on a given plan tree.
+ * Expressions that do involve such Params require us to prune separately for
+ * each scan of the parent plan node.
*
* Note that pruning away unneeded subplans during executor startup has the
* added benefit of not having to initialize the unneeded subplans at all.
@@ -1764,6 +1772,13 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
* account for initial pruning possibly having eliminated some of the
* subplans.
*
+ * ExecPartitionDoInitialPruning:
+ * Do initial pruning with the information contained in a given
+ * PartitionPruneInfo to determine the minimal set of child subplans
+ * that need to be executed for the parent plan node to which the
+ * PartitionPruneInfo belongs, and also the set of RT indexes of the
+ * leaf partitions that will be scanned by those subplans.
+ *
* ExecFindMatchingSubPlans:
* Returns indexes of matching subplans after evaluating the expressions
* that are safe to evaluate at a given point. This function is first
@@ -1781,8 +1796,9 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
*
* On return, *initially_valid_subplans is assigned the set of indexes of
* child subplans that must be initialized along with the parent plan node.
- * Initial pruning is performed here if needed and in that case only the
- * surviving subplans' indexes are added.
+ * Initial pruning is performed here if needed (unless it has already been done
+ * by ExecutorDoInitialPruning()), and in that case only the surviving
+ * subplans' indexes are added.
*
* If subplans are indeed pruned, subplan_map arrays contained in the returned
* PartitionPruneState are re-sequenced to not count those, though only if the
@@ -1794,28 +1810,62 @@ ExecInitPartitionPruning(PlanState *planstate,
int part_prune_index,
Bitmapset **initially_valid_subplans)
{
- PartitionPruneState *prunestate;
+ PartitionPruneState *prunestate = NULL;
EState *estate = planstate->state;
PartitionPruneInfo *pruneinfo = list_nth(estate->es_part_prune_infos,
part_prune_index);
+ PartitionPruneResult *pruneresult = estate->es_part_prune_result;
+ bool do_pruning = (pruneinfo->needs_init_pruning ||
+ pruneinfo->needs_exec_pruning);
- /* We may need an expression context to evaluate partition exprs */
- ExecAssignExprContext(estate, planstate);
+ /*
+ * No need to do initial pruning if it was done already by
+ * ExecutorDoInitialPruning(), which it would be if es_part_prune_result
+ * has been set.
+ */
+ if (pruneresult)
+ do_pruning = pruneinfo->needs_exec_pruning;
- /* Create the working data structure for pruning */
- prunestate = CreatePartitionPruneState(planstate, pruneinfo);
+ if (do_pruning)
+ {
+ /* We may need an expression context to evaluate partition exprs */
+ ExecAssignExprContext(estate, planstate);
+
+ /* For data reading, executor always includes detached partitions */
+ if (estate->es_partition_directory == NULL)
+ estate->es_partition_directory =
+ CreatePartitionDirectory(estate->es_query_cxt, false);
+
+ /*
+ * Create the working data structure for pruning. No need to consider
+ * initial pruning steps if we have a PartitionPruneResult.
+ */
+ prunestate = CreatePartitionPruneState(planstate, pruneinfo,
+ pruneresult == NULL, true,
+ NIL, planstate->ps_ExprContext,
+ estate->es_partition_directory);
+ }
/*
* Perform an initial partition prune pass, if required.
*/
- if (prunestate->do_initial_prune)
- *initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true);
+ if (pruneresult)
+ {
+ *initially_valid_subplans =
+ list_nth(pruneresult->valid_subplan_offs_list, part_prune_index);
+ }
+ else if (prunestate && prunestate->do_initial_prune)
+ {
+ *initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true,
+ NULL);
+ }
else
{
- /* No pruning, so we'll need to initialize all subplans */
+ /* No initial pruning, so we'll need to initialize all subplans */
Assert(n_total_subplans > 0);
*initially_valid_subplans = bms_add_range(NULL, 0,
n_total_subplans - 1);
+ return prunestate;
}
/*
@@ -1823,7 +1873,8 @@ ExecInitPartitionPruning(PlanState *planstate,
* that were removed above due to initial pruning. No need to do this if
* no steps were removed.
*/
- if (bms_num_members(*initially_valid_subplans) < n_total_subplans)
+ if (prunestate &&
+ bms_num_members(*initially_valid_subplans) < n_total_subplans)
{
/*
* We can safely skip this when !do_exec_prune, even though that
@@ -1839,11 +1890,74 @@ ExecInitPartitionPruning(PlanState *planstate,
return prunestate;
}
+/*
+ * ExecPartitionDoInitialPruning
+ * Perform initial pruning using given PartitionPruneInfo to determine
+ * the minimal set of child subplans that will be executed and also the
+ * set of RT indexes of the leaf partitions scanned by those subplans.
+ */
+Bitmapset *
+ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
+ PartitionPruneInfo *pruneinfo,
+ Bitmapset **scan_leafpart_rtis)
+{
+ List *rtable = plannedstmt->rtable;
+ ExprContext *econtext;
+ PartitionDirectory pdir;
+ MemoryContext oldcontext,
+ tmpcontext;
+ PartitionPruneState *prunestate;
+ Bitmapset *valid_subplan_offs;
+
+ /*
+ * A temporary context for memory allocations required while executing
+ * partition pruning steps.
+ */
+ tmpcontext = AllocSetContextCreate(CurrentMemoryContext,
+ "initial pruning working data",
+ ALLOCSET_DEFAULT_SIZES);
+ oldcontext = MemoryContextSwitchTo(tmpcontext);
+
+ /*
+ * PartitionDirectory to look up partition descriptors.
+ * Note that we don't omit detached partitions, just like during
+ * execution proper.
+ */
+ pdir = CreatePartitionDirectory(CurrentMemoryContext, false);
+
+ /*
+ * We don't yet have a PlanState for the parent plan node, so we must
+ * create a standalone ExprContext to evaluate pruning expressions,
+ * equipped with the information about the EXTERN parameters that the
+ * caller passed us. That's okay because the initial pruning steps do
+ * not contain anything that requires execution to have started, and
+ * so they do not need the information contained in a PlanState.
+ */
+ econtext = CreateStandaloneExprContext();
+ econtext->ecxt_param_list_info = params;
+ prunestate = CreatePartitionPruneState(NULL, pruneinfo, true, false,
+ rtable, econtext, pdir);
+ MemoryContextSwitchTo(oldcontext);
+
+ /* Do the initial pruning. */
+ valid_subplan_offs = ExecFindMatchingSubPlans(prunestate, true,
+ scan_leafpart_rtis);
+
+ FreeExprContext(econtext, true);
+ DestroyPartitionDirectory(pdir);
+ MemoryContextDelete(tmpcontext);
+
+ return valid_subplan_offs;
+}
+
/*
* CreatePartitionPruneState
* Build the data structure required for calling ExecFindMatchingSubPlans
*
- * 'planstate' is the parent plan node's execution state.
+ * 'planstate', if not NULL, is the parent plan node's execution state. It
+ * can be NULL if being called before ExecutorStart(), in which case,
+ * 'rtable' (range table), 'econtext', and 'partdir' must be explicitly
+ * provided.
*
* 'pruneinfo' is a PartitionPruneInfo as generated by
* make_partition_pruneinfo. Here we build a PartitionPruneState containing a
@@ -1857,19 +1971,21 @@ ExecInitPartitionPruning(PlanState *planstate,
* PartitionedRelPruneInfo.
*/
static PartitionPruneState *
-CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
+CreatePartitionPruneState(PlanState *planstate,
+ PartitionPruneInfo *pruneinfo,
+ bool consider_initial_steps,
+ bool consider_exec_steps,
+ List *rtable, ExprContext *econtext,
+ PartitionDirectory partdir)
{
- EState *estate = planstate->state;
+ EState *estate = planstate ? planstate->state : NULL;
PartitionPruneState *prunestate;
int n_part_hierarchies;
ListCell *lc;
int i;
- ExprContext *econtext = planstate->ps_ExprContext;
- /* For data reading, executor always omits detached partitions */
- if (estate->es_partition_directory == NULL)
- estate->es_partition_directory =
- CreatePartitionDirectory(estate->es_query_cxt, false);
+ Assert((estate != NULL) ||
+ (partdir != NULL && econtext != NULL && rtable != NIL));
n_part_hierarchies = list_length(pruneinfo->prune_infos);
Assert(n_part_hierarchies > 0);
@@ -1924,15 +2040,42 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
PartitionKey partkey;
/*
- * We can rely on the copies of the partitioned table's partition
- * key and partition descriptor appearing in its relcache entry,
- * because that entry will be held open and locked for the
- * duration of this executor run.
+ * We must open the relation ourselves when called before
+ * execution has started, such as when called during
+ * ExecutorDoInitialPruning() on a cached plan. In that case,
+ * sub-partitions must be locked, because AcquirePlannerLocks()
+ * would not have seen them. (1st relation in a partrelpruneinfos
+ * list is always the root partitioned table appearing in the
+ * query, which AcquirePlannerLocks() would have locked; the
+ * Assert in relation_open() guards that assumption.)
+ */
+ if (estate == NULL)
+ {
+ RangeTblEntry *rte = rt_fetch(pinfo->rtindex, rtable);
+ int lockmode = (j == 0) ? NoLock : rte->rellockmode;
+
+ partrel = table_open(rte->relid, lockmode);
+ }
+ else
+ partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
+
+ /*
+ * We can rely on the copy of the partitioned table's partition
+ * key in its relcache entry, because it can't change (or
+ * get destroyed) as long as the relation is locked. The partition
+ * descriptor is taken from the PartitionDirectory associated with
+ * the table that is held open long enough for the descriptor to
+ * remain valid while it's used to perform the pruning steps.
*/
- partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
partkey = RelationGetPartitionKey(partrel);
- partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
- partrel);
+ partdesc = PartitionDirectoryLookup(partdir, partrel);
+
+ /*
+ * Must close partrel, keeping the lock taken, if we're not using
+ * EState's entry.
+ */
+ if (estate == NULL)
+ table_close(partrel, NoLock);
/*
* Initialize the subplan_map and subpart_map.
@@ -1946,6 +2089,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
Assert(partdesc->nparts >= pinfo->nparts);
pprune->nparts = partdesc->nparts;
pprune->subplan_map = palloc(sizeof(int) * partdesc->nparts);
+ pprune->rti_map = palloc(sizeof(Index) * partdesc->nparts);
if (partdesc->nparts == pinfo->nparts)
{
/*
@@ -1956,6 +2100,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
pprune->subpart_map = pinfo->subpart_map;
memcpy(pprune->subplan_map, pinfo->subplan_map,
sizeof(int) * pinfo->nparts);
+ memcpy(pprune->rti_map, pinfo->rti_map,
+ sizeof(Index) * pinfo->nparts);
/*
* Double-check that the list of unpruned relations has not
@@ -2006,6 +2152,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
pinfo->subplan_map[pd_idx];
pprune->subpart_map[pp_idx] =
pinfo->subpart_map[pd_idx];
+ pprune->rti_map[pp_idx] =
+ pinfo->rti_map[pd_idx];
pd_idx++;
}
else
@@ -2013,6 +2161,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
/* this partdesc entry is not in the plan */
pprune->subplan_map[pp_idx] = -1;
pprune->subpart_map[pp_idx] = -1;
+ pprune->rti_map[pp_idx] = 0;
}
}
@@ -2034,7 +2183,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
* Initialize pruning contexts as needed.
*/
pprune->initial_pruning_steps = pinfo->initial_pruning_steps;
- if (pinfo->initial_pruning_steps)
+ if (consider_initial_steps && pinfo->initial_pruning_steps)
{
InitPartitionPruneContext(&pprune->initial_context,
pinfo->initial_pruning_steps,
@@ -2044,7 +2193,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
prunestate->do_initial_prune = true;
}
pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
- if (pinfo->exec_pruning_steps)
+ if (consider_exec_steps && pinfo->exec_pruning_steps)
{
InitPartitionPruneContext(&pprune->exec_context,
pinfo->exec_pruning_steps,
@@ -2272,10 +2421,14 @@ PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
* Pass initial_prune if PARAM_EXEC Params cannot yet be evaluated. This
* differentiates the initial executor-time pruning step from later
* runtime pruning.
+ *
+ * RT indexes of leaf partitions scanned by the chosen subplans are added to
+ * *scan_leafpart_rtis if the pointer is non-NULL.
*/
Bitmapset *
ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
- bool initial_prune)
+ bool initial_prune,
+ Bitmapset **scan_leafpart_rtis)
{
Bitmapset *result = NULL;
MemoryContext oldcontext;
@@ -2310,7 +2463,7 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
*/
pprune = &prunedata->partrelprunedata[0];
find_matching_subplans_recurse(prunedata, pprune, initial_prune,
- &result);
+ &result, scan_leafpart_rtis);
/* Expression eval may have used space in ExprContext too */
if (pprune->exec_pruning_steps)
@@ -2324,6 +2477,8 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
/* Copy result out of the temp context before we reset it */
result = bms_copy(result);
+ if (scan_leafpart_rtis)
+ *scan_leafpart_rtis = bms_copy(*scan_leafpart_rtis);
MemoryContextReset(prunestate->prune_context);
@@ -2334,13 +2489,15 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
* find_matching_subplans_recurse
* Recursive worker function for ExecFindMatchingSubPlans
*
- * Adds valid (non-prunable) subplan IDs to *validsubplans
+ * Adds valid (non-prunable) subplan IDs to *validsubplans and RT indexes of
+ * the corresponding leaf partitions to *scan_leafpart_rtis (if asked for).
*/
static void
find_matching_subplans_recurse(PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
- Bitmapset **validsubplans)
+ Bitmapset **validsubplans,
+ Bitmapset **scan_leafpart_rtis)
{
Bitmapset *partset;
int i;
@@ -2367,8 +2524,14 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
while ((i = bms_next_member(partset, i)) >= 0)
{
if (pprune->subplan_map[i] >= 0)
+ {
*validsubplans = bms_add_member(*validsubplans,
pprune->subplan_map[i]);
+ Assert(pprune->rti_map[i] > 0);
+ if (scan_leafpart_rtis)
+ *scan_leafpart_rtis = bms_add_member(*scan_leafpart_rtis,
+ pprune->rti_map[i]);
+ }
else
{
int partidx = pprune->subpart_map[i];
@@ -2376,7 +2539,8 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
if (partidx >= 0)
find_matching_subplans_recurse(prunedata,
&prunedata->partrelprunedata[partidx],
- initial_prune, validsubplans);
+ initial_prune, validsubplans,
+ scan_leafpart_rtis);
else
{
/*
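As an aside to the execPartition.c changes above, the way find_matching_subplans_recurse() collects both the surviving subplan indexes and the RT indexes of their leaf partitions can be sketched in standalone C. This is only an illustration under simplifying assumptions: DemoPruneLevel, demo_find_matching() and demo_example() are hypothetical names, sets are plain 32-bit masks instead of Bitmapsets, and the "surviving" mask stands in for the result of evaluating the pruning steps.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for PartitionedRelPruningData; not the real struct. */
typedef struct DemoPruneLevel
{
    int             nparts;
    const int      *subplan_map;   /* subplan index of a leaf, or -1 */
    const int      *subpart_map;   /* index of a sub-partitioned level, or -1 */
    const unsigned *rti_map;       /* RT index of a leaf partition, or 0 */
    uint32_t        surviving;     /* bit i set if partition i survives pruning */
} DemoPruneLevel;

/* Walk one level, recursing into sub-partitioned children. */
static void
demo_find_matching(const DemoPruneLevel *levels, int level,
                   uint32_t *validsubplans, uint32_t *scan_leafpart_rtis)
{
    const DemoPruneLevel *pprune = &levels[level];

    for (int i = 0; i < pprune->nparts; i++)
    {
        if ((pprune->surviving & (UINT32_C(1) << i)) == 0)
            continue;           /* pruned */

        if (pprune->subplan_map[i] >= 0)
        {
            /* Leaf partition: remember its subplan and, if asked, its RTI. */
            *validsubplans |= UINT32_C(1) << pprune->subplan_map[i];
            assert(pprune->rti_map[i] > 0);
            if (scan_leafpart_rtis != NULL)
                *scan_leafpart_rtis |= UINT32_C(1) << pprune->rti_map[i];
        }
        else if (pprune->subpart_map[i] >= 0)
            demo_find_matching(levels, pprune->subpart_map[i],
                               validsubplans, scan_leafpart_rtis);
    }
}

/*
 * Two-level example: the root has leaf p0 (subplan 0, RTI 2), a
 * sub-partitioned p1, and leaf p2 (subplan 3, RTI 6); p1 has leaves with
 * subplans 1 and 2 (RTIs 4 and 5).  p2 and the first leaf of p1 are pruned.
 * Returns the RTI mask if want_rtis is nonzero, else the subplan mask.
 */
uint32_t
demo_example(int want_rtis)
{
    static const int      root_subplan[] = {0, -1, 3};
    static const int      root_subpart[] = {-1, 1, -1};
    static const unsigned root_rti[] = {2, 0, 6};
    static const int      sub_subplan[] = {1, 2};
    static const int      sub_subpart[] = {-1, -1};
    static const unsigned sub_rti[] = {4, 5};
    const DemoPruneLevel  levels[2] = {
        {3, root_subplan, root_subpart, root_rti, 0x3},
        {2, sub_subplan, sub_subpart, sub_rti, 0x2},
    };
    uint32_t    subplans = 0;
    uint32_t    rtis = 0;

    demo_find_matching(levels, 0, &subplans, &rtis);
    return want_rtis ? rtis : subplans;
}
```

In this toy setup only subplans 0 and 2 survive, so only RT indexes 2 and 5 would need to be locked; callers that pass NULL for the RTI set, as the Append and MergeAppend call sites in the patch do, get the subplan set alone.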
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 21f4c10937..bb7d028463 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -134,6 +134,7 @@ CreateExecutorState(void)
estate->es_param_exec_vals = NULL;
estate->es_queryEnv = NULL;
+ estate->es_part_prune_result = NULL;
estate->es_query_cxt = qcontext;
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index e134a82ff7..901768cc34 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -842,7 +842,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
else
dest = None_Receiver;
- es->qd = CreateQueryDesc(es->stmt,
+ es->qd = CreateQueryDesc(es->stmt, NULL,
fcache->src,
GetActiveSnapshot(),
InvalidSnapshot,
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index c6f86a6510..96880e122a 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -155,7 +155,8 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
* subplan, we can fill as_valid_subplans immediately, preventing
* later calls to ExecFindMatchingSubPlans.
*/
- if (!prunestate->do_exec_prune && nplans > 0)
+ if (appendstate->as_prune_state == NULL ||
+ (!appendstate->as_prune_state->do_exec_prune && nplans > 0))
appendstate->as_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
}
else
@@ -577,7 +578,7 @@ choose_next_subplan_locally(AppendState *node)
}
else if (node->as_valid_subplans == NULL)
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
whichplan = -1;
}
@@ -642,7 +643,7 @@ choose_next_subplan_for_leader(AppendState *node)
if (node->as_valid_subplans == NULL)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
/*
* Mark each invalid plan as finished to allow the loop below to
@@ -717,7 +718,7 @@ choose_next_subplan_for_worker(AppendState *node)
else if (node->as_valid_subplans == NULL)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
mark_invalid_subplans_as_finished(node);
}
@@ -868,7 +869,7 @@ ExecAppendAsyncBegin(AppendState *node)
if (node->as_valid_subplans == NULL)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
classify_matching_subplans(node);
}
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index 8d35860c30..2312e5a633 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -103,7 +103,8 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
* subplan, we can fill ms_valid_subplans immediately, preventing
* later calls to ExecFindMatchingSubPlans.
*/
- if (!prunestate->do_exec_prune && nplans > 0)
+ if (mergestate->ms_prune_state == NULL ||
+ (!mergestate->ms_prune_state->do_exec_prune && nplans > 0))
mergestate->ms_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
}
else
@@ -218,7 +219,7 @@ ExecMergeAppend(PlanState *pstate)
*/
if (node->ms_valid_subplans == NULL)
node->ms_valid_subplans =
- ExecFindMatchingSubPlans(node->ms_prune_state, false);
+ ExecFindMatchingSubPlans(node->ms_prune_state, false, NULL);
/*
* First time through: pull the first tuple from each valid subplan,
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index fd5796f1b9..b3faeae2af 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1578,6 +1578,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
CachedPlanSource *plansource;
CachedPlan *cplan;
List *stmt_list;
+ List *part_prune_result_list;
char *query_string;
Snapshot snapshot;
MemoryContext oldcontext;
@@ -1657,7 +1658,10 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
*/
/* Replan if needed, and increment plan refcount for portal */
- cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv);
+ cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv,
+ &part_prune_result_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_result_list));
stmt_list = cplan->stmt_list;
if (!plan->saved)
@@ -1685,6 +1689,9 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
stmt_list,
cplan);
+ /* Copy PartitionPruneResults into the portal's context. */
+ PortalStorePartitionPruneResults(portal, part_prune_result_list);
+
/*
* Set up options for portal. Default SCROLL type is chosen the same way
* as PerformCursorOpen does it.
@@ -2092,7 +2099,8 @@ SPI_plan_get_cached_plan(SPIPlanPtr plan)
/* Get the generic plan for the query */
cplan = GetCachedPlan(plansource, NULL,
plan->saved ? CurrentResourceOwner : NULL,
- _SPI_current->queryEnv);
+ _SPI_current->queryEnv,
+ NULL /* Not interested in PartitionPruneResults */);
Assert(cplan == plansource->gplan);
/* Pop the error context stack */
@@ -2473,7 +2481,9 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
{
CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
List *stmt_list;
- ListCell *lc2;
+ List *part_prune_result_list;
+ ListCell *lc2,
+ *lc3;
spicallbackarg.query = plansource->query_string;
@@ -2549,8 +2559,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
* plan, the refcount must be backed by the plan_owner.
*/
cplan = GetCachedPlan(plansource, options->params,
- plan_owner, _SPI_current->queryEnv);
-
+ plan_owner, _SPI_current->queryEnv,
+ &part_prune_result_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_result_list));
stmt_list = cplan->stmt_list;
/*
@@ -2589,9 +2601,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
}
}
- foreach(lc2, stmt_list)
+ forboth(lc2, stmt_list, lc3, part_prune_result_list)
{
PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+ PartitionPruneResult *part_prune_result = lfirst_node(PartitionPruneResult, lc3);
bool canSetTag = stmt->canSetTag;
DestReceiver *dest;
@@ -2663,7 +2676,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
else
snap = InvalidSnapshot;
- qdesc = CreateQueryDesc(stmt,
+ qdesc = CreateQueryDesc(stmt, part_prune_result,
plansource->query_string,
snap, crosscheck_snapshot,
dest,
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 4d6902d3ac..c34226a83b 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -158,6 +158,11 @@
token = pg_strtok(&length); /* skip :fldname */ \
local_node->fldname = readIntCols(len)
+/* Read an Index array */
+#define READ_INDEX_ARRAY(fldname, len) \
+ token = pg_strtok(&length); /* skip :fldname */ \
+ local_node->fldname = readIndexCols(len)
+
/* Read a bool array */
#define READ_BOOL_ARRAY(fldname, len) \
token = pg_strtok(&length); /* skip :fldname */ \
@@ -799,7 +804,6 @@ fnname(int numCols) \
*/
READ_SCALAR_ARRAY(readAttrNumberCols, int16, atoi)
READ_SCALAR_ARRAY(readOidCols, Oid, atooid)
-/* outfuncs.c has writeIndexCols, but we don't yet need that here */
-/* READ_SCALAR_ARRAY(readIndexCols, Index, atoui) */
+READ_SCALAR_ARRAY(readIndexCols, Index, atoui)
READ_SCALAR_ARRAY(readIntCols, int, atoi)
READ_SCALAR_ARRAY(readBoolCols, bool, strtobool)
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 31fff597a7..4097cf7164 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -520,7 +520,9 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
result->parallelModeNeeded = glob->parallelModeNeeded;
result->planTree = top_plan;
result->partPruneInfos = glob->partPruneInfos;
+ result->containsInitialPruning = glob->containsInitialPruning;
result->rtable = glob->finalrtable;
+ result->minLockRelids = glob->minLockRelids;
result->resultRelations = glob->resultRelations;
result->appendRelations = glob->appendRelations;
result->subplans = glob->subplans;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 720f20f563..61d6934978 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -270,6 +270,16 @@ set_plan_references(PlannerInfo *root, Plan *plan)
*/
add_rtes_to_flat_rtable(root, false);
+ /*
+ * Add the query's adjusted range of RT indexes to glob->minLockRelids.
+ * The adjusted RT indexes of prunable relations will be deleted from the
+ * set below where PartitionPruneInfos are processed.
+ */
+ glob->minLockRelids =
+ bms_add_range(glob->minLockRelids,
+ rtoffset + 1,
+ rtoffset + list_length(root->parse->rtable));
+
/*
* Adjust RT indexes of PlanRowMarks and add to final rowmarks list
*/
@@ -352,6 +362,7 @@ set_plan_references(PlannerInfo *root, Plan *plan)
foreach (lc, root->partPruneInfos)
{
PartitionPruneInfo *pruneinfo = lfirst(lc);
+ Bitmapset *leafpart_rtis = NULL;
ListCell *l;
foreach(l, pruneinfo->prune_infos)
@@ -362,15 +373,50 @@ set_plan_references(PlannerInfo *root, Plan *plan)
foreach(l2, prune_infos)
{
PartitionedRelPruneInfo *pinfo = lfirst(l2);
+ int i;
/* RT index of the table to which the pinfo belongs. */
pinfo->rtindex += rtoffset;
+
+ /* Also adjust RT indexes of the leaf partitions that might be scanned. */
+ for (i = 0; i < pinfo->nparts; i++)
+ {
+ if (pinfo->rti_map[i] > 0 && pinfo->subplan_map[i] >= 0)
+ {
+ pinfo->rti_map[i] += rtoffset;
+ leafpart_rtis = bms_add_member(leafpart_rtis,
+ pinfo->rti_map[i]);
+ }
+ }
}
}
+ if (pruneinfo->needs_init_pruning)
+ {
+ glob->containsInitialPruning = true;
+
+ /*
+ * Delete the leaf partition RTIs from the global set of relations
+ * to be locked before executing the plan. AcquireExecutorLocks()
+ * will find the ones to add to the set after performing initial
+ * pruning.
+ */
+ glob->minLockRelids = bms_del_members(glob->minLockRelids,
+ leafpart_rtis);
+ }
+
glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
}
+ /*
+ * It seems worth doing a bms_copy() on glob->minLockRelids if we deleted
+ * bits from it above to get rid of any empty tail bits. It seems better
+ * for the loop over this set in AcquireExecutorLocks() to not have to go
+ * through those useless bit words.
+ */
+ if (glob->containsInitialPruning)
+ glob->minLockRelids = bms_copy(glob->minLockRelids);
+
return result;
}
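As an aside to the setrefs.c changes above, the computation of minLockRelids can be sketched in standalone C: start from the full range of adjusted RT indexes, then remove the leaf partitions that are subject to initial pruning, leaving them to be added back once pruning has run. This is a rough illustration only; demo_add_range() and demo_min_lock_relids() are hypothetical helpers, and a 64-bit mask stands in for a Bitmapset (so RT indexes must be below 64 here).

```c
#include <stdint.h>

/* Hypothetical helper: set bits lo..hi (inclusive), with 0 <= lo <= hi < 64. */
uint64_t
demo_add_range(uint64_t set, int lo, int hi)
{
    for (int i = lo; i <= hi; i++)
        set |= UINT64_C(1) << i;
    return set;
}

/*
 * Hypothetical sketch of the minimal lock set: every RT index of the query
 * (after applying rtoffset), minus the leaf partition RTIs when the plan
 * contains initial pruning steps.
 */
uint64_t
demo_min_lock_relids(int rtoffset, int nrtes,
                     uint64_t leafpart_rtis, int needs_init_pruning)
{
    uint64_t    min_lock = demo_add_range(0, rtoffset + 1, rtoffset + nrtes);

    if (needs_init_pruning)
        min_lock &= ~leafpart_rtis;     /* locked later, after pruning */

    return min_lock;
}
```

For a five-entry range table whose entries 3 to 5 are prunable leaf partitions, the minimal set keeps only entries 1 and 2 when initial pruning is possible, and all five entries otherwise.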
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 6565b6ed01..37f3e6af61 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -144,7 +144,9 @@ static List *make_partitionedrel_pruneinfo(PlannerInfo *root,
List *prunequal,
Bitmapset *partrelids,
int *relid_subplan_map,
- Bitmapset **matchedsubplans);
+ Bitmapset **matchedsubplans,
+ bool *needs_init_pruning,
+ bool *needs_exec_pruning);
static void gen_partprune_steps(RelOptInfo *rel, List *clauses,
PartClauseTarget target,
GeneratePruningStepsContext *context);
@@ -234,6 +236,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int *relid_subplan_map;
ListCell *lc;
int i;
+ bool needs_init_pruning = false;
+ bool needs_exec_pruning = false;
/*
* Scan the subpaths to see which ones are scans of partition child
@@ -313,12 +317,16 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
Bitmapset *partrelids = (Bitmapset *) lfirst(lc);
List *pinfolist;
Bitmapset *matchedsubplans = NULL;
+ bool partrel_needs_init_pruning;
+ bool partrel_needs_exec_pruning;
pinfolist = make_partitionedrel_pruneinfo(root, parentrel,
prunequal,
partrelids,
relid_subplan_map,
- &matchedsubplans);
+ &matchedsubplans,
+ &partrel_needs_init_pruning,
+ &partrel_needs_exec_pruning);
/* When pruning is possible, record the matched subplans */
if (pinfolist != NIL)
@@ -327,6 +335,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
allmatchedsubplans = bms_join(matchedsubplans,
allmatchedsubplans);
}
+
+ needs_init_pruning |= partrel_needs_init_pruning;
+ needs_exec_pruning |= partrel_needs_exec_pruning;
}
pfree(relid_subplan_map);
@@ -341,6 +352,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
/* Else build the result data structure */
pruneinfo = makeNode(PartitionPruneInfo);
pruneinfo->prune_infos = prunerelinfos;
+ pruneinfo->needs_init_pruning = needs_init_pruning;
+ pruneinfo->needs_exec_pruning = needs_exec_pruning;
/*
* Some subplans may not belong to any of the identified partitioned rels.
@@ -441,13 +454,18 @@ add_part_relids(List *allpartrelids, Bitmapset *partrelids)
* If we cannot find any useful run-time pruning steps, return NIL.
* However, on success, each rel identified in partrelids will have
* an element in the result list, even if some of them are useless.
+ * *needs_init_pruning and *needs_exec_pruning are set to indicate that the
+ * returned PartitionedRelPruneInfos contain pruning steps that can be
+ * performed before and after execution begins, respectively.
*/
static List *
make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
List *prunequal,
Bitmapset *partrelids,
int *relid_subplan_map,
- Bitmapset **matchedsubplans)
+ Bitmapset **matchedsubplans,
+ bool *needs_init_pruning,
+ bool *needs_exec_pruning)
{
RelOptInfo *targetpart = NULL;
List *pinfolist = NIL;
@@ -458,6 +476,10 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int rti;
int i;
+ /* Will find out below. */
+ *needs_init_pruning = false;
+ *needs_exec_pruning = false;
+
/*
* Examine each partitioned rel, constructing a temporary array to map
* from planner relids to index of the partitioned rel, and building a
@@ -545,6 +567,9 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
* executor per-scan pruning steps. This first pass creates startup
* pruning steps and detects whether there's any possibly-useful quals
* that would require per-scan pruning.
+ *
+ * In the first pass, we note whether the 2nd pass is necessary by
+ * noting the presence of EXEC parameters.
*/
gen_partprune_steps(subpart, partprunequal, PARTTARGET_INITIAL,
&context);
@@ -619,6 +644,12 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
pinfo->execparamids = execparamids;
/* Remaining fields will be filled in the next loop */
+ /* record which types of pruning steps we've seen so far */
+ if (initial_pruning_steps != NIL)
+ *needs_init_pruning = true;
+ if (exec_pruning_steps != NIL)
+ *needs_exec_pruning = true;
+
pinfolist = lappend(pinfolist, pinfo);
}
@@ -646,6 +677,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int *subplan_map;
int *subpart_map;
Oid *relid_map;
+ Index *rti_map;
/*
* Construct the subplan and subpart maps for this partitioning level.
@@ -658,6 +690,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
subpart_map = (int *) palloc(nparts * sizeof(int));
memset(subpart_map, -1, nparts * sizeof(int));
relid_map = (Oid *) palloc0(nparts * sizeof(Oid));
+ rti_map = (Index *) palloc0(nparts * sizeof(Index));
present_parts = NULL;
i = -1;
@@ -672,6 +705,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
subplan_map[i] = subplanidx = relid_subplan_map[partrel->relid] - 1;
subpart_map[i] = subpartidx = relid_subpart_map[partrel->relid] - 1;
relid_map[i] = planner_rt_fetch(partrel->relid, root)->relid;
+ rti_map[i] = partrel->relid;
if (subplanidx >= 0)
{
present_parts = bms_add_member(present_parts, i);
@@ -696,6 +730,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
pinfo->subplan_map = subplan_map;
pinfo->subpart_map = subpart_map;
pinfo->relid_map = relid_map;
+ pinfo->rti_map = rti_map;
}
pfree(relid_subpart_map);
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 27dee29f42..5a37c4160b 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1598,6 +1598,7 @@ exec_bind_message(StringInfo input_message)
int16 *rformats = NULL;
CachedPlanSource *psrc;
CachedPlan *cplan;
+ List *part_prune_result_list;
Portal portal;
char *query_string;
char *saved_stmt_name;
@@ -1972,7 +1973,9 @@ exec_bind_message(StringInfo input_message)
* will be generated in MessageContext. The plan refcount will be
* assigned to the Portal, so it will be released at portal destruction.
*/
- cplan = GetCachedPlan(psrc, params, NULL, NULL);
+ cplan = GetCachedPlan(psrc, params, NULL, NULL, &part_prune_result_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_result_list));
/*
* Now we can define the portal.
@@ -1987,6 +1990,9 @@ exec_bind_message(StringInfo input_message)
cplan->stmt_list,
cplan);
+ /* Copy PartitionPruneResults into the portal's context. */
+ PortalStorePartitionPruneResults(portal, part_prune_result_list);
+
/* Done with the snapshot used for parameter I/O and parsing/planning */
if (snapshot_set)
PopActiveSnapshot();
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 5aa5a350f3..8cc2e2162d 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -35,7 +35,7 @@
Portal ActivePortal = NULL;
-static void ProcessQuery(PlannedStmt *plan,
+static void ProcessQuery(PlannedStmt *plan, PartitionPruneResult *part_prune_result,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -65,6 +65,7 @@ static void DoPortalRewind(Portal portal);
*/
QueryDesc *
CreateQueryDesc(PlannedStmt *plannedstmt,
+ PartitionPruneResult *part_prune_result,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
@@ -77,6 +78,8 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
qd->operation = plannedstmt->commandType; /* operation */
qd->plannedstmt = plannedstmt; /* plan */
+ qd->part_prune_result = part_prune_result; /* ExecutorDoInitialPruning()
+ * output for plan */
qd->sourceText = sourceText; /* query text */
qd->snapshot = RegisterSnapshot(snapshot); /* snapshot */
/* RI check snapshot */
@@ -122,6 +125,7 @@ FreeQueryDesc(QueryDesc *qdesc)
* PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
*
* plan: the plan tree for the query
+ * part_prune_result: ExecutorDoInitialPruning() output for the plan tree
* sourceText: the source text of the query
* params: any parameters needed
* dest: where to send results
@@ -134,6 +138,7 @@ FreeQueryDesc(QueryDesc *qdesc)
*/
static void
ProcessQuery(PlannedStmt *plan,
+ PartitionPruneResult *part_prune_result,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -145,7 +150,7 @@ ProcessQuery(PlannedStmt *plan,
/*
* Create the QueryDesc object
*/
- queryDesc = CreateQueryDesc(plan, sourceText,
+ queryDesc = CreateQueryDesc(plan, part_prune_result, sourceText,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
@@ -491,8 +496,13 @@ PortalStart(Portal portal, ParamListInfo params,
/*
* Create QueryDesc in portal's context; for the moment, set
* the destination to DestNone.
+ *
+ * There is no PartitionPruneResult unless the PlannedStmt is
+ * from a CachedPlan.
*/
queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+ portal->part_prune_results == NIL ? NULL :
+ linitial(portal->part_prune_results),
portal->sourceText,
GetActiveSnapshot(),
InvalidSnapshot,
@@ -1225,6 +1235,8 @@ PortalRunMulti(Portal portal,
if (pstmt->utilityStmt == NULL)
{
+ PartitionPruneResult *part_prune_result = NULL;
+
/*
* process a plannable query.
*/
@@ -1271,10 +1283,18 @@ PortalRunMulti(Portal portal,
else
UpdateActiveSnapshotCommandId();
+ /*
+ * Determine if there's a corresponding PartitionPruneResult for
+ * this PlannedStmt.
+ */
+ if (portal->part_prune_results != NIL)
+ part_prune_result = list_nth(portal->part_prune_results,
+ foreach_current_index(stmtlist_item));
+
if (pstmt->canSetTag)
{
/* statement can set tag string */
- ProcessQuery(pstmt,
+ ProcessQuery(pstmt, part_prune_result,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
@@ -1283,7 +1303,7 @@ PortalRunMulti(Portal portal,
else
{
/* stmt added by rewrite cannot set tag */
- ProcessQuery(pstmt,
+ ProcessQuery(pstmt, part_prune_result,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 0d6a295674..c8281e7201 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -99,14 +99,19 @@ static dlist_head cached_expression_list = DLIST_STATIC_INIT(cached_expression_l
static void ReleaseGenericPlan(CachedPlanSource *plansource);
static List *RevalidateCachedQuery(CachedPlanSource *plansource,
QueryEnvironment *queryEnv);
-static bool CheckCachedPlan(CachedPlanSource *plansource);
+static bool CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+ List **part_prune_result_list);
static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
- ParamListInfo boundParams, QueryEnvironment *queryEnv);
+ ParamListInfo boundParams, QueryEnvironment *queryEnv,
+ List **part_prune_result_list);
static bool choose_custom_plan(CachedPlanSource *plansource,
ParamListInfo boundParams);
static double cached_plan_cost(CachedPlan *plan, bool include_planner);
static Query *QueryListGetPrimaryStmt(List *stmts);
-static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static void AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+ List **part_prune_result_list,
+ List **lockedRelids_per_stmt);
+static void ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt);
static void AcquirePlannerLocks(List *stmt_list, bool acquire);
static void ScanQueryForLocks(Query *parsetree, bool acquire);
static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -790,15 +795,20 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
*
* On a "true" return, we have acquired the locks needed to run the plan.
* (We must do this for the "true" result to be race-condition-free.)
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_result_list.
*/
static bool
-CheckCachedPlan(CachedPlanSource *plansource)
+CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+ List **part_prune_result_list)
{
CachedPlan *plan = plansource->gplan;
/* Assert that caller checked the querytree */
Assert(plansource->is_valid);
+ *part_prune_result_list = NIL;
+
/* If there's no generic plan, just say "false" */
if (!plan)
return false;
@@ -820,13 +830,21 @@ CheckCachedPlan(CachedPlanSource *plansource)
*/
if (plan->is_valid)
{
+ List *lockedRelids_per_stmt;
+
/*
* Plan must have positive refcount because it is referenced by
* plansource; so no need to fear it disappears under us here.
*/
Assert(plan->refcount > 0);
- AcquireExecutorLocks(plan->stmt_list, true);
+ /*
+ * Lock relations scanned by the plan. This is where the pruning
+ * happens if needed.
+ */
+ AcquireExecutorLocks(plan->stmt_list, boundParams,
+ part_prune_result_list,
+ &lockedRelids_per_stmt);
/*
* If plan was transient, check to see if TransactionXmin has
@@ -848,7 +866,14 @@ CheckCachedPlan(CachedPlanSource *plansource)
}
/* Oops, the race case happened. Release useless locks. */
- AcquireExecutorLocks(plan->stmt_list, false);
+ ReleaseExecutorLocks(plan->stmt_list, lockedRelids_per_stmt);
+
+ /*
+ * The output list and any objects therein have been allocated in the
+ * caller's hopefully short-lived context, so they will not stay leaked
+ * for long; still, reset the pointer so nobody looks at it by accident.
+ */
+ *part_prune_result_list = NIL;
}
/*
@@ -874,10 +899,15 @@ CheckCachedPlan(CachedPlanSource *plansource)
* Planning work is done in the caller's memory context. The finished plan
* is in a child memory context, which typically should get reparented
* (unless this is a one-shot plan, in which case we don't copy the plan).
+ *
+ * A list of NULLs is returned in *part_prune_result_list, meaning that no
+ * PartitionPruneResult nodes have yet been created for the plans in
+ * stmt_list.
*/
static CachedPlan *
BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
- ParamListInfo boundParams, QueryEnvironment *queryEnv)
+ ParamListInfo boundParams, QueryEnvironment *queryEnv,
+ List **part_prune_result_list)
{
CachedPlan *plan;
List *plist;
@@ -1007,6 +1037,17 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
MemoryContextSwitchTo(oldcxt);
+ /*
+ * There are no actual PartitionPruneResults to add yet, but the list must
+ * still be initialized to have the same number of elements as the list of
+ * PlannedStmts.
+ */
+ *part_prune_result_list = NIL;
+ foreach(lc, plist)
+ {
+ *part_prune_result_list = lappend(*part_prune_result_list, NULL);
+ }
+
return plan;
}
@@ -1126,6 +1167,17 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
* plan or a custom plan for the given parameters: the caller does not know
* which it will get.
*
+ * If part_prune_result_list is not NULL, one element is added to
+ * *part_prune_result_list for every PlannedStmt found in the returned
+ * CachedPlan: either a PartitionPruneResult or a NULL. It is the former if
+ * the PlannedStmt is from an existing CachedPlan that is otherwise valid and
+ * contains at least one PartitionPruneInfo that has "initial" pruning steps.
+ * Those steps are performed by calling ExecutorDoInitialPruning(), which
+ * prunes away subplans that don't match the pruning conditions, so that
+ * AcquireExecutorLocks() locks only the leaf partitions that survive. The
+ * PartitionPruneResult contains a list of bitmapsets of the indexes of
+ * matching subplans, one for each PartitionPruneInfo.
+ *
* On return, the plan is valid and we have sufficient locks to begin
* execution.
*
@@ -1139,11 +1191,13 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
*/
CachedPlan *
GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
- ResourceOwner owner, QueryEnvironment *queryEnv)
+ ResourceOwner owner, QueryEnvironment *queryEnv,
+ List **part_prune_result_list)
{
CachedPlan *plan = NULL;
List *qlist;
bool customplan;
+ List *my_part_prune_result_list;
/* Assert caller is doing things in a sane order */
Assert(plansource->magic == CACHEDPLANSOURCE_MAGIC);
@@ -1160,7 +1214,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
if (!customplan)
{
- if (CheckCachedPlan(plansource))
+ if (CheckCachedPlan(plansource, boundParams,
+ &my_part_prune_result_list))
{
/* We want a generic plan, and we already have a valid one */
plan = plansource->gplan;
@@ -1169,7 +1224,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
else
{
/* Build a new generic plan */
- plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv);
+ plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv,
+ &my_part_prune_result_list);
/* Just make real sure plansource->gplan is clear */
ReleaseGenericPlan(plansource);
/* Link the new generic plan into the plansource */
@@ -1214,7 +1270,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
if (customplan)
{
/* Build a custom plan */
- plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv);
+ plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv,
+ &my_part_prune_result_list);
/* Accumulate total costs of custom plans */
plansource->total_custom_cost += cached_plan_cost(plan, true);
@@ -1246,6 +1303,9 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
plan->is_saved = true;
}
+ if (part_prune_result_list)
+ *part_prune_result_list = my_part_prune_result_list;
+
return plan;
}
@@ -1737,17 +1797,29 @@ QueryListGetPrimaryStmt(List *stmts)
/*
* AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
- * or release them if acquire is false.
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_result_list.
+ *
+ * On return, *lockedRelids_per_stmt will contain a bitmapset for every
+ * PlannedStmt in stmt_list, containing the RT indexes of relation entries
+ * in its range table that were actually locked, or NULL if the PlannedStmt
+ * contains a utility statement.
*/
static void
-AcquireExecutorLocks(List *stmt_list, bool acquire)
+AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+ List **part_prune_result_list,
+ List **lockedRelids_per_stmt)
{
ListCell *lc1;
+ *part_prune_result_list = *lockedRelids_per_stmt = NIL;
foreach(lc1, stmt_list)
{
PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
- ListCell *lc2;
+ PartitionPruneResult *part_prune_result = NULL;
+ Bitmapset *allLockRelids;
+ Bitmapset *lockedRelids = NULL;
+ int rti;
if (plannedstmt->commandType == CMD_UTILITY)
{
@@ -1761,13 +1833,37 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
Query *query = UtilityContainsQuery(plannedstmt->utilityStmt);
if (query)
- ScanQueryForLocks(query, acquire);
+ ScanQueryForLocks(query, true);
+ *part_prune_result_list = lappend(*part_prune_result_list, NULL);
continue;
}
- foreach(lc2, plannedstmt->rtable)
+ /*
+ * Figure out the set of relations that would need to be locked
+ * before executing the plan.
+ */
+ if (plannedstmt->containsInitialPruning)
{
- RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
+ /*
+ * Obtain the set of leaf partitions to be locked.
+ *
+ * The following does initial partition pruning using the
+ * PartitionPruneInfos found in plannedstmt->partPruneInfos and
+ * finds leaf partitions that survive that pruning across all the
+ * nodes in the plan tree.
+ */
+ part_prune_result = ExecutorDoInitialPruning(plannedstmt,
+ boundParams);
+ allLockRelids = bms_union(plannedstmt->minLockRelids,
+ part_prune_result->scan_leafpart_rtis);
+ }
+ else
+ allLockRelids = plannedstmt->minLockRelids;
+
+ rti = -1;
+ while ((rti = bms_next_member(allLockRelids, rti)) > 0)
+ {
+ RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
if (rte->rtekind != RTE_RELATION)
continue;
@@ -1778,10 +1874,59 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
* fail if it's been dropped entirely --- we'll just transiently
* acquire a non-conflicting lock.
*/
- if (acquire)
- LockRelationOid(rte->relid, rte->rellockmode);
- else
- UnlockRelationOid(rte->relid, rte->rellockmode);
+ LockRelationOid(rte->relid, rte->rellockmode);
+ lockedRelids = bms_add_member(lockedRelids, rti);
+ }
+
+ *part_prune_result_list = lappend(*part_prune_result_list,
+ part_prune_result);
+ *lockedRelids_per_stmt = lappend(*lockedRelids_per_stmt, lockedRelids);
+ }
+}
+
+/*
+ * ReleaseExecutorLocks
+ * Release locks that would've been acquired by an earlier call to
+ * AcquireExecutorLocks()
+ */
+static void
+ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt)
+{
+ ListCell *lc1,
+ *lc2;
+
+ forboth(lc1, stmt_list, lc2, lockedRelids_per_stmt)
+ {
+ PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
+ Bitmapset *lockedRelids = lfirst(lc2);
+ int rti;
+
+ if (plannedstmt->commandType == CMD_UTILITY)
+ {
+ /*
+ * Ignore utility statements, except those (such as EXPLAIN) that
+ * contain a parsed-but-not-planned query. Note: it's okay to use
+ * ScanQueryForLocks, even though the query hasn't been through
+ * rule rewriting, because rewriting doesn't change the query
+ * representation.
+ */
+ Query *query = UtilityContainsQuery(plannedstmt->utilityStmt);
+
+ Assert(lockedRelids == NULL);
+ if (query)
+ ScanQueryForLocks(query, false);
+ continue;
+ }
+
+ rti = -1;
+ while ((rti = bms_next_member(lockedRelids, rti)) >= 0)
+ {
+ RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
+
+ Assert(rte->rtekind == RTE_RELATION);
+
+ /* See the comment in AcquireExecutorLocks(). */
+ UnlockRelationOid(rte->relid, rte->rellockmode);
}
}
}
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index 3a161bdb88..27407a7f0f 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -303,6 +303,25 @@ PortalDefineQuery(Portal portal,
portal->status = PORTAL_DEFINED;
}
+/*
+ * PortalStorePartitionPruneResults
+ * Copy the given list of PartitionPruneResults into the portal's
+ * context
+ *
+ * This allows the caller to ensure that the list exists as long as the portal
+ * does.
+ */
+void
+PortalStorePartitionPruneResults(Portal portal, List *part_prune_results)
+{
+ MemoryContext oldcxt;
+
+ AssertArg(PortalIsValid(portal));
+ oldcxt = MemoryContextSwitchTo(portal->portalContext);
+ portal->part_prune_results = copyObject(part_prune_results);
+ MemoryContextSwitchTo(oldcxt);
+}
+
/*
* PortalReleaseCachedPlan
* Release a portal's reference to its cached plan, if any.
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 9ebde089ae..e57e133f0e 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -87,7 +87,9 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
ExplainState *es, const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv);
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt,
+ PartitionPruneResult *part_prune_result,
+ IntoClause *into,
ExplainState *es, const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv,
const instr_time *planduration,
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index bf962af7af..bd8776402e 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -45,6 +45,7 @@ extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
* nparts Length of subplan_map[] and subpart_map[].
* subplan_map Subplan index by partition index, or -1.
* subpart_map Subpart index by partition index, or -1.
+ * rti_map Range table index by partition index, or 0.
* present_parts A Bitmapset of the partition indexes that we
* have subplans or subparts for.
* initial_pruning_steps List of PartitionPruneSteps used to
@@ -61,6 +62,7 @@ typedef struct PartitionedRelPruningData
int nparts;
int *subplan_map;
int *subpart_map;
+ Index *rti_map;
Bitmapset *present_parts;
List *initial_pruning_steps;
List *exec_pruning_steps;
@@ -126,5 +128,10 @@ extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
int part_prune_index,
Bitmapset **initially_valid_subplans);
extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
- bool initial_prune);
+ bool initial_prune,
+ Bitmapset **scan_leafpart_rtis);
+extern Bitmapset *ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt,
+ ParamListInfo params,
+ PartitionPruneInfo *pruneinfo,
+ Bitmapset **scan_leafpart_rtis);
#endif /* EXECPARTITION_H */
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index e79e2c001f..60d5644908 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -35,6 +35,8 @@ typedef struct QueryDesc
/* These fields are provided by CreateQueryDesc */
CmdType operation; /* CMD_SELECT, CMD_UPDATE, etc. */
PlannedStmt *plannedstmt; /* planner's output (could be utility, too) */
+ PartitionPruneResult *part_prune_result; /* ExecutorDoInitialPruning()'s
+ * output for plannedstmt */
const char *sourceText; /* source text of the query */
Snapshot snapshot; /* snapshot to use for query */
Snapshot crosscheck_snapshot; /* crosscheck for RI update/delete */
@@ -57,6 +59,7 @@ typedef struct QueryDesc
/* in pquery.c */
extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
+ PartitionPruneResult *part_prune_result,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index ed95ed1176..6ae897d5d1 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -185,6 +185,8 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
/*
* prototypes from functions in execMain.c
*/
+extern PartitionPruneResult *ExecutorDoInitialPruning(PlannedStmt *plannedstmt,
+ ParamListInfo params);
extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
extern void ExecutorRun(QueryDesc *queryDesc,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 4a741b053f..63a89474db 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -612,6 +612,7 @@ typedef struct EState
* ExecRowMarks, or NULL if none */
PlannedStmt *es_plannedstmt; /* link to top of plan tree */
List *es_part_prune_infos; /* PlannedStmt.partPruneInfos */
+ struct PartitionPruneResult *es_part_prune_result; /* QueryDesc.part_prune_result */
const char *es_sourceText; /* Source text from QueryDesc */
JunkFilter *es_junkFilter; /* top-level junk filter, if any */
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index a80f43e540..937cc4629d 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -212,6 +212,7 @@ extern struct Bitmapset *readBitmapset(void);
extern uintptr_t readDatum(bool typbyval);
extern bool *readBoolCols(int numCols);
extern int *readIntCols(int numCols);
+extern Index *readIndexCols(int numCols);
extern Oid *readOidCols(int numCols);
extern int16 *readAttrNumberCols(int numCols);
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index e392fb6fc0..494ae461be 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -125,6 +125,18 @@ typedef struct PlannerGlobal
/* List of PartitionPruneInfo contained in the plan */
List *partPruneInfos;
+ /*
+ * Do any of those PartitionPruneInfos have initial pruning steps in them?
+ */
+ bool containsInitialPruning;
+
+ /*
+ * Indexes of all range table entries minus indexes of range table entries
+ * of the leaf partitions scanned by prunable subplans; see
+ * AcquireExecutorLocks()
+ */
+ Bitmapset *minLockRelids;
+
/* OIDs of relations the plan depends on */
List *relationOids;
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 3eb3e6e527..a1e06719e6 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -73,8 +73,17 @@ typedef struct PlannedStmt
List *partPruneInfos; /* List of PartitionPruneInfo contained in
* the plan */
+ bool containsInitialPruning; /* Do any of those PartitionPruneInfos
+ * have initial pruning steps in them?
+ */
+
List *rtable; /* list of RangeTblEntry nodes */
+ Bitmapset *minLockRelids; /* Indexes of all range table entries minus
+ * indexes of range table entries of the leaf
+ * partitions scanned by prunable subplans;
+ * see AcquireExecutorLocks() */
+
/* rtable indexes of target relations for INSERT/UPDATE/DELETE */
List *resultRelations; /* integer list of RT indexes, or NIL */
@@ -1410,6 +1419,13 @@ typedef struct PlanRowMark
* prune_infos List of Lists containing PartitionedRelPruneInfo nodes,
* one sublist per run-time-prunable partition hierarchy
* appearing in the parent plan node's subplans.
+ *
+ * needs_init_pruning Does any of the PartitionedRelPruneInfos in
+ * prune_infos have its initial_pruning_steps set?
+ *
+ * needs_exec_pruning Does any of the PartitionedRelPruneInfos in
+ * prune_infos have its exec_pruning_steps set?
+ *
* other_subplans Indexes of any subplans that are not accounted for
* by any of the PartitionedRelPruneInfo nodes in
* "prune_infos". These subplans must not be pruned.
@@ -1420,6 +1436,8 @@ typedef struct PartitionPruneInfo
NodeTag type;
List *prune_infos;
+ bool needs_init_pruning;
+ bool needs_exec_pruning;
Bitmapset *other_subplans;
} PartitionPruneInfo;
@@ -1464,6 +1482,9 @@ typedef struct PartitionedRelPruneInfo
/* relation OID by partition index, or 0 */
Oid *relid_map pg_node_attr(array_size(nparts));
+ /* Range table index by partition index, or 0. */
+ Index *rti_map pg_node_attr(array_size(nparts));
+
/*
* initial_pruning_steps shows how to prune during executor startup (i.e.,
* without use of any PARAM_EXEC Params); it is NIL if no startup pruning
@@ -1548,6 +1569,32 @@ typedef struct PartitionPruneStepCombine
List *source_stepids;
} PartitionPruneStepCombine;
+/*----------------
+ * PartitionPruneResult
+ *
+ * The result of invoking ExecutorDoInitialPruning() on a given
+ * PlannedStmt.
+ *
+ * Contains a list of Bitmapset of the indexes of the subplans remaining after
+ * performing initial pruning by calling ExecFindMatchingSubPlans() for every
+ * PartitionPruneInfo found in PlannedStmt.partPruneInfos. RT indexes of the
+ * leaf partitions scanned by those subplans across all PartitionPruneInfos
+ * are added into scan_leafpart_rtis.
+ *
+ * This is used by GetCachedPlan() to inform its callers of the pruning
+ * decisions made when performing AcquireExecutorLocks() on a given cached
+ * PlannedStmt, which the callers then pass on to the executor. The executor
+ * refers to this node when initializing the plan nodes which contain subplans
+ * that may have been pruned by ExecutorDoInitialPruning(), rather than
+ * redoing initial pruning.
+ */
+typedef struct PartitionPruneResult
+{
+ NodeTag type;
+
+ List *valid_subplan_offs_list;
+ Bitmapset *scan_leafpart_rtis;
+} PartitionPruneResult;
/*
* Plan invalidation info
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 0499635f59..1c5bb5ece1 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -220,7 +220,8 @@ extern List *CachedPlanGetTargetList(CachedPlanSource *plansource,
extern CachedPlan *GetCachedPlan(CachedPlanSource *plansource,
ParamListInfo boundParams,
ResourceOwner owner,
- QueryEnvironment *queryEnv);
+ QueryEnvironment *queryEnv,
+ List **part_prune_result_list);
extern void ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner);
extern bool CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index aeddbdafe5..9f7727a837 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -138,6 +138,7 @@ typedef struct PortalData
QueryCompletion qc; /* command completion data for executed query */
List *stmts; /* list of PlannedStmts */
CachedPlan *cplan; /* CachedPlan, if stmts are from one */
+ List *part_prune_results; /* list of PartitionPruneResults */
ParamListInfo portalParams; /* params to pass to query */
QueryEnvironment *queryEnv; /* environment for query */
@@ -242,6 +243,8 @@ extern void PortalDefineQuery(Portal portal,
CommandTag commandTag,
List *stmts,
CachedPlan *cplan);
+extern void PortalStorePartitionPruneResults(Portal portal,
+ List *part_prune_result_list);
extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
extern void PortalCreateHoldStore(Portal portal);
extern void PortalHashTableDeleteAll(void);
--
2.35.3
[application/octet-stream] v21-0002-Allow-adding-Bitmapsets-as-Nodes-into-plan-trees.patch (5.5K, 4-v21-0002-Allow-adding-Bitmapsets-as-Nodes-into-plan-trees.patch)
From 41465f94e426a0b22b070ab8034de19cfdb6daa4 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Thu, 6 Oct 2022 17:31:37 +0900
Subject: [PATCH v21 2/3] Allow adding Bitmapsets as Nodes into plan trees
Note that this only adds some infrastructure bits and none of the
existing bitmapsets that are added to plan trees have been changed
to instead add the Node version. So, the plan trees, or really the
bitmapsets contained in them, look the same as before as far as
Node write/read functionality is concerned.
This is needed, because it is not currently possible to write and
then read back Bitmapsets that are not direct members of write/read
capable Nodes; for example, if one needs to add a List of Bitmapsets
to a plan tree. The most straightforward way to do that is to make
Bitmapsets be written with outNode() and read with nodeRead().
---
src/backend/nodes/Makefile | 3 ++-
src/backend/nodes/copyfuncs.c | 11 +++++++++++
src/backend/nodes/equalfuncs.c | 6 ++++++
src/backend/nodes/gen_node_support.pl | 1 +
src/backend/nodes/outfuncs.c | 11 +++++++++++
src/backend/nodes/readfuncs.c | 4 ++++
src/backend/optimizer/prep/preptlist.c | 1 -
src/include/nodes/bitmapset.h | 5 +++++
src/include/nodes/meson.build | 1 +
9 files changed, 41 insertions(+), 2 deletions(-)
diff --git a/src/backend/nodes/Makefile b/src/backend/nodes/Makefile
index 7450e191ee..da5307771b 100644
--- a/src/backend/nodes/Makefile
+++ b/src/backend/nodes/Makefile
@@ -57,7 +57,8 @@ node_headers = \
nodes/replnodes.h \
nodes/supportnodes.h \
nodes/value.h \
- utils/rel.h
+ utils/rel.h \
+ nodes/bitmapset.h
# see also catalog/Makefile for an explanation of these make rules
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index e76fda8eba..1482019327 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -160,6 +160,17 @@ _copyExtensibleNode(const ExtensibleNode *from)
return newnode;
}
+/* Custom copy routine for Node bitmapsets */
+static Bitmapset *
+_copyBitmapset(const Bitmapset *from)
+{
+ Bitmapset *newnode = bms_copy(from);
+
+ newnode->type = T_Bitmapset;
+
+ return newnode;
+}
+
/*
* copyObjectImpl -- implementation of copyObject(); see nodes/nodes.h
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 0373aa30fe..e8706c461a 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -210,6 +210,12 @@ _equalList(const List *a, const List *b)
return true;
}
+/* Custom equal routine for Node bitmapsets */
+static bool
+_equalBitmapset(const Bitmapset *a, const Bitmapset *b)
+{
+ return bms_equal(a, b);
+}
/*
* equal
diff --git a/src/backend/nodes/gen_node_support.pl b/src/backend/nodes/gen_node_support.pl
index 81b8c184a9..ccb5aff874 100644
--- a/src/backend/nodes/gen_node_support.pl
+++ b/src/backend/nodes/gen_node_support.pl
@@ -71,6 +71,7 @@ my @all_input_files = qw(
nodes/supportnodes.h
nodes/value.h
utils/rel.h
+ nodes/bitmapset.h
);
# Nodes from these input files are automatically treated as nodetag_only.
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 64c65f060b..b3ffd8cec2 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -328,6 +328,17 @@ outBitmapset(StringInfo str, const Bitmapset *bms)
appendStringInfoChar(str, ')');
}
+/* Custom write routine for Node bitmapsets */
+static void
+_outBitmapset(StringInfo str, const Bitmapset *bms)
+{
+ Assert(IsA(bms, Bitmapset));
+ WRITE_NODE_TYPE("BITMAPSET");
+
+ outBitmapset(str, bms);
+}
+
+
/*
* Print the value of a Datum given its type.
*/
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index b4ff855f7c..4d6902d3ac 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -230,6 +230,10 @@ _readBitmapset(void)
result = bms_add_member(result, val);
}
+ /* XXX maybe do `result = makeNode(Bitmapset);` at the top? */
+ if (result)
+ result->type = T_Bitmapset;
+
return result;
}
diff --git a/src/backend/optimizer/prep/preptlist.c b/src/backend/optimizer/prep/preptlist.c
index 137b28323d..e5c1103316 100644
--- a/src/backend/optimizer/prep/preptlist.c
+++ b/src/backend/optimizer/prep/preptlist.c
@@ -337,7 +337,6 @@ extract_update_targetlist_colnos(List *tlist)
return update_colnos;
}
-
/*****************************************************************************
*
* TARGETLIST EXPANSION
diff --git a/src/include/nodes/bitmapset.h b/src/include/nodes/bitmapset.h
index 75b5ce1a8e..9046ca177f 100644
--- a/src/include/nodes/bitmapset.h
+++ b/src/include/nodes/bitmapset.h
@@ -20,6 +20,8 @@
#ifndef BITMAPSET_H
#define BITMAPSET_H
+#include "nodes/nodes.h"
+
/*
* Forward decl to save including pg_list.h
*/
@@ -48,6 +50,9 @@ typedef int32 signedbitmapword; /* must be the matching signed type */
typedef struct Bitmapset
{
+ pg_node_attr(custom_copy_equal, custom_read_write)
+
+ NodeTag type;
int nwords; /* number of words in array */
bitmapword words[FLEXIBLE_ARRAY_MEMBER]; /* really [nwords] */
} Bitmapset;
diff --git a/src/include/nodes/meson.build b/src/include/nodes/meson.build
index b7df232081..94701af8e1 100644
--- a/src/include/nodes/meson.build
+++ b/src/include/nodes/meson.build
@@ -19,6 +19,7 @@ node_support_input_i = [
'nodes/supportnodes.h',
'nodes/value.h',
'utils/rel.h',
+ 'nodes/bitmapset.h',
]
node_support_input = []
--
2.35.3
* Re: generic plans and "initial" pruning
@ 2022-10-17 09:29 Amit Langote <[email protected]>
parent: Amit Langote <[email protected]>
0 siblings, 1 reply; 71+ messages in thread
From: Amit Langote @ 2022-10-17 09:29 UTC (permalink / raw)
To: Robert Haas <[email protected]>; +Cc: Jacob Champion <[email protected]>; Zhihong Yu <[email protected]>; David Rowley <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers
On Wed, Oct 12, 2022 at 4:36 PM Amit Langote <[email protected]> wrote:
> On Fri, Jul 29, 2022 at 1:20 PM Amit Langote <[email protected]> wrote:
> > On Thu, Jul 28, 2022 at 1:27 AM Robert Haas <[email protected]> wrote:
> > > 0001 adds es_part_prune_result but does not use it, so maybe the
> > > introduction of that field should be deferred until it's needed for
> > > something.
> >
> > Oops, looks like a mistake when breaking the patch. Will move that bit to 0002.
>
> Fixed that and also noticed that I had defined PartitionPruneResult in
> the wrong header (execnodes.h). That led to PartitionPruneResult
> nodes not being able to be written and read, because
> src/backend/nodes/gen_node_support.pl doesn't create _out* and _read*
> routines for the nodes defined in execnodes.h. I moved its definition
> to plannodes.h, even though it is not actually the planner that
> instantiates those; no other include/nodes header sounds better.
>
> One more thing I realized is that Bitmapsets added to the List
> PartitionPruneResult.valid_subplan_offs_list are not actually
> read/write-able. That's a problem that I also faced in [1], so I
> proposed a patch there to make Bitmapset a read/write-able Node and
> mark (only) the Bitmapsets that are added into read/write-able node
> trees with the corresponding NodeTag. I'm including that patch here
> as well (0002) for the main patch to work (pass
> -DWRITE_READ_PARSE_PLAN_TREES build tests), though it might make sense
> to discuss it in its own thread?
I had second thoughts about using a List of Bitmapsets for this, and
with the change described below the make-Bitmapset-Nodes patch is no
longer needed.
I had defined PartitionPruneResult such that it stood for the results
of pruning for all PartitionPruneInfos contained in
PlannedStmt.partPruneInfos (covering all Append/MergeAppend nodes that
can use partition pruning in a given plan). So, it had a List of
Bitmapset. I think it's perhaps better for PartitionPruneResult to
cover only one PartitionPruneInfo and thus need only a Bitmapset and
not a List thereof, which I have implemented in the attached updated
patch 0002. So, instead of passing a single PartitionPruneResult along
with each PlannedStmt, this now passes a List of PartitionPruneResults,
with one entry for each PartitionPruneInfo in
PlannedStmt.partPruneInfos.
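As a rough illustration only (this is not code from the patch), the one-result-per-PartitionPruneInfo layout can be modeled as below. The names PruneResultSketch, PruneResultList, lookup_prune_result and subplan_is_valid are hypothetical, and a 64-bit mask stands in for a real Bitmapset. The sketch assumes that a plan node finds its result by the same index it uses to find its PartitionPruneInfo.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a Bitmapset: one bit per subplan offset (at most 64). */
typedef uint64_t SubplanSet;

/* Simplified model of a PartitionPruneResult covering ONE PartitionPruneInfo. */
typedef struct PruneResultSketch
{
    SubplanSet  valid_subplan_offs; /* offsets of subplans that survived */
} PruneResultSketch;

/* Simplified model of the per-statement list: entry i pairs with
 * the i'th element of PlannedStmt.partPruneInfos. */
typedef struct PruneResultList
{
    size_t                   nresults;
    const PruneResultSketch *results;
} PruneResultList;

/*
 * Return the result for the given index, or NULL when no precomputed result
 * applies (no list, index -1, or index out of range).  In this model, NULL
 * means the caller has to perform initial pruning itself.
 */
const PruneResultSketch *
lookup_prune_result(const PruneResultList *list, int part_prune_index)
{
    if (list == NULL || part_prune_index < 0)
        return NULL;
    if ((size_t) part_prune_index >= list->nresults)
        return NULL;
    return &list->results[part_prune_index];
}

/* True if the subplan at offset 'off' survived; 'result' must not be NULL. */
bool
subplan_is_valid(const PruneResultSketch *result, int off)
{
    if (off < 0 || off >= 64)
        return false;
    return ((result->valid_subplan_offs >> off) & 1) != 0;
}
```

With this shape, each entry needs only a single set of offsets, so no List of Bitmapsets has to be written or read.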
--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com
Attachments:
[application/octet-stream] v22-0001-Move-PartitioPruneInfo-out-of-plan-nodes-into-Pl.patch (17.2K, 2-v22-0001-Move-PartitioPruneInfo-out-of-plan-nodes-into-Pl.patch)
From 27db8ab066dace77953d71a6446788190b66ce60 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Fri, 27 May 2022 16:00:28 +0900
Subject: [PATCH v22 1/2] Move PartitionPruneInfo out of plan nodes into
PlannedStmt
The planner will now add a given PartitionPruneInfo to
PlannedStmt.partPruneInfos instead of directly to the
Append/MergeAppend plan node. What gets set instead in the
latter is an index field which points to the list element
of PlannedStmt.partPruneInfos containing the PartitionPruneInfo
belonging to the plan node.
A later commit will make AcquireExecutorLocks() do the initial
partition pruning to determine a minimal set of partitions to be
locked when validating a plan tree and it will need to consult the
PartitionPruneInfos referenced therein to do so. It is better for
the PartitionPruneInfos to be directly accessible by simply
iterating over PlannedStmt.partPruneInfos than to require a walk
of the plan tree to find them.
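The index scheme described above can be sketched in isolation as follows. This is a simplified model, not the patch's code: InfoList, info_list_append, rebase_prune_index and info_list_concat are hypothetical names, and plain integers stand in for PartitionPruneInfo nodes. It assumes a per-query list whose entries are later appended to a global list, so a local index must be shifted by the global list's current length before the append happens.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_INFOS 16

/* Toy list of "prune infos"; each item is just an integer id here. */
typedef struct InfoList
{
    int n;
    int items[MAX_INFOS];
} InfoList;

/* Append an item and return its 0-based index, or -1 if the list is full. */
int
info_list_append(InfoList *list, int info_id)
{
    if (list->n >= MAX_INFOS)
        return -1;
    list->items[list->n] = info_id;
    return list->n++;
}

/*
 * Convert an index into a per-query list into an index into the global
 * list, assuming the per-query entries will be appended to the global list
 * afterwards.  An index of -1 ("no run-time pruning") is left alone.
 */
int
rebase_prune_index(int local_index, const InfoList *global)
{
    return (local_index >= 0) ? local_index + global->n : -1;
}

/* Append every entry of 'local' to the end of 'global', preserving order. */
void
info_list_concat(InfoList *global, const InfoList *local)
{
    for (int i = 0; i < local->n; i++)
        (void) info_list_append(global, local->items[i]);
}
```

The ordering matters in this model: indexes are rebased while the global list still has its old length, and only then are the local entries appended.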
---
src/backend/executor/execMain.c | 1 +
src/backend/executor/execParallel.c | 1 +
src/backend/executor/execPartition.c | 4 +-
src/backend/executor/execUtils.c | 1 +
src/backend/executor/nodeAppend.c | 4 +-
src/backend/executor/nodeMergeAppend.c | 4 +-
src/backend/optimizer/plan/createplan.c | 24 ++++-----
src/backend/optimizer/plan/planner.c | 1 +
src/backend/optimizer/plan/setrefs.c | 65 +++++++++++++------------
src/backend/partitioning/partprune.c | 18 ++++---
src/include/executor/execPartition.h | 3 +-
src/include/nodes/execnodes.h | 1 +
src/include/nodes/pathnodes.h | 6 +++
src/include/nodes/plannodes.h | 11 +++--
src/include/partitioning/partprune.h | 8 +--
15 files changed, 90 insertions(+), 62 deletions(-)
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index d78862e660..32475e33ff 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -825,6 +825,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
ExecInitRangeTable(estate, rangeTable);
estate->es_plannedstmt = plannedstmt;
+ estate->es_part_prune_infos = plannedstmt->partPruneInfos;
/*
* Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index 99512826c5..aca0c6f323 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -183,6 +183,7 @@ ExecSerializePlan(Plan *plan, EState *estate)
pstmt->dependsOnRole = false;
pstmt->parallelModeNeeded = false;
pstmt->planTree = plan;
+ pstmt->partPruneInfos = estate->es_part_prune_infos;
pstmt->rtable = estate->es_range_table;
pstmt->resultRelations = NIL;
pstmt->appendRelations = NIL;
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 40e3c07693..80197d5141 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -1791,11 +1791,13 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
PartitionPruneState *
ExecInitPartitionPruning(PlanState *planstate,
int n_total_subplans,
- PartitionPruneInfo *pruneinfo,
+ int part_prune_index,
Bitmapset **initially_valid_subplans)
{
PartitionPruneState *prunestate;
EState *estate = planstate->state;
+ PartitionPruneInfo *pruneinfo = list_nth(estate->es_part_prune_infos,
+ part_prune_index);
/* We may need an expression context to evaluate partition exprs */
ExecAssignExprContext(estate, planstate);
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 9df1f81ea8..21f4c10937 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -119,6 +119,7 @@ CreateExecutorState(void)
estate->es_relations = NULL;
estate->es_rowmarks = NULL;
estate->es_plannedstmt = NULL;
+ estate->es_part_prune_infos = NIL;
estate->es_junkFilter = NULL;
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 357e10a1d7..c6f86a6510 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -134,7 +134,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
appendstate->as_begun = false;
/* If run-time partition pruning is enabled, then set that up now */
- if (node->part_prune_info != NULL)
+ if (node->part_prune_index >= 0)
{
PartitionPruneState *prunestate;
@@ -145,7 +145,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
*/
prunestate = ExecInitPartitionPruning(&appendstate->ps,
list_length(node->appendplans),
- node->part_prune_info,
+ node->part_prune_index,
&validsubplans);
appendstate->as_prune_state = prunestate;
nplans = bms_num_members(validsubplans);
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index c5c62fa5c7..8d35860c30 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -82,7 +82,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
mergestate->ps.ExecProcNode = ExecMergeAppend;
/* If run-time partition pruning is enabled, then set that up now */
- if (node->part_prune_info != NULL)
+ if (node->part_prune_index >= 0)
{
PartitionPruneState *prunestate;
@@ -93,7 +93,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
*/
prunestate = ExecInitPartitionPruning(&mergestate->ps,
list_length(node->mergeplans),
- node->part_prune_info,
+ node->part_prune_index,
&validsubplans);
mergestate->ms_prune_state = prunestate;
nplans = bms_num_members(validsubplans);
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index ac86ce9003..50a5719ac6 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -1203,7 +1203,6 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
ListCell *subpaths;
int nasyncplans = 0;
RelOptInfo *rel = best_path->path.parent;
- PartitionPruneInfo *partpruneinfo = NULL;
int nodenumsortkeys = 0;
AttrNumber *nodeSortColIdx = NULL;
Oid *nodeSortOperators = NULL;
@@ -1354,6 +1353,9 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
subplans = lappend(subplans, subplan);
}
+ /* Set below if we find quals that we can use to run-time prune */
+ plan->part_prune_index = -1;
+
/*
* If any quals exist, they may be useful to perform further partition
* pruning during execution. Gather information needed by the executor to
@@ -1377,16 +1379,14 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
}
if (prunequal != NIL)
- partpruneinfo =
- make_partition_pruneinfo(root, rel,
- best_path->subpaths,
- prunequal);
+ plan->part_prune_index = make_partition_pruneinfo(root, rel,
+ best_path->subpaths,
+ prunequal);
}
plan->appendplans = subplans;
plan->nasyncplans = nasyncplans;
plan->first_partial_plan = best_path->first_partial_path;
- plan->part_prune_info = partpruneinfo;
copy_generic_path_info(&plan->plan, (Path *) best_path);
@@ -1425,7 +1425,6 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
List *subplans = NIL;
ListCell *subpaths;
RelOptInfo *rel = best_path->path.parent;
- PartitionPruneInfo *partpruneinfo = NULL;
/*
* We don't have the actual creation of the MergeAppend node split out
@@ -1518,6 +1517,9 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
subplans = lappend(subplans, subplan);
}
+ /* Set below if we find quals that we can use to run-time prune */
+ node->part_prune_index = -1;
+
/*
* If any quals exist, they may be useful to perform further partition
* pruning during execution. Gather information needed by the executor to
@@ -1541,13 +1543,13 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
}
if (prunequal != NIL)
- partpruneinfo = make_partition_pruneinfo(root, rel,
- best_path->subpaths,
- prunequal);
+ node->part_prune_index = make_partition_pruneinfo(root, rel,
+ best_path->subpaths,
+ prunequal);
}
node->mergeplans = subplans;
- node->part_prune_info = partpruneinfo;
+
/*
* If prepare_sort_from_pathkeys added sort columns, but we were told to
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 5d0fd6e072..31fff597a7 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -519,6 +519,7 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
result->dependsOnRole = glob->dependsOnRole;
result->parallelModeNeeded = glob->parallelModeNeeded;
result->planTree = top_plan;
+ result->partPruneInfos = glob->partPruneInfos;
result->rtable = glob->finalrtable;
result->resultRelations = glob->resultRelations;
result->appendRelations = glob->appendRelations;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 1cb0abdbc1..720f20f563 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -348,6 +348,29 @@ set_plan_references(PlannerInfo *root, Plan *plan)
}
}
+ /* Also fix up the information in PartitionPruneInfos. */
+ foreach (lc, root->partPruneInfos)
+ {
+ PartitionPruneInfo *pruneinfo = lfirst(lc);
+ ListCell *l;
+
+ foreach(l, pruneinfo->prune_infos)
+ {
+ List *prune_infos = lfirst(l);
+ ListCell *l2;
+
+ foreach(l2, prune_infos)
+ {
+ PartitionedRelPruneInfo *pinfo = lfirst(l2);
+
+ /* RT index of the table to which the pinfo belongs. */
+ pinfo->rtindex += rtoffset;
+ }
+ }
+
+ glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
+ }
+
return result;
}
@@ -1658,21 +1681,12 @@ set_append_references(PlannerInfo *root,
aplan->apprelids = offset_relid_set(aplan->apprelids, rtoffset);
- if (aplan->part_prune_info)
- {
- foreach(l, aplan->part_prune_info->prune_infos)
- {
- List *prune_infos = lfirst(l);
- ListCell *l2;
-
- foreach(l2, prune_infos)
- {
- PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
- pinfo->rtindex += rtoffset;
- }
- }
- }
+ /*
+ * PartitionPruneInfos will be added to a list in PlannerGlobal, so update
+ * the index.
+ */
+ if (aplan->part_prune_index >= 0)
+ aplan->part_prune_index += list_length(root->glob->partPruneInfos);
/* We don't need to recurse to lefttree or righttree ... */
Assert(aplan->plan.lefttree == NULL);
@@ -1734,21 +1748,12 @@ set_mergeappend_references(PlannerInfo *root,
mplan->apprelids = offset_relid_set(mplan->apprelids, rtoffset);
- if (mplan->part_prune_info)
- {
- foreach(l, mplan->part_prune_info->prune_infos)
- {
- List *prune_infos = lfirst(l);
- ListCell *l2;
-
- foreach(l2, prune_infos)
- {
- PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
- pinfo->rtindex += rtoffset;
- }
- }
- }
+ /*
+ * PartitionPruneInfos will be added to a list in PlannerGlobal, so update
+ * the index.
+ */
+ if (mplan->part_prune_index >= 0)
+ mplan->part_prune_index += list_length(root->glob->partPruneInfos);
/* We don't need to recurse to lefttree or righttree ... */
Assert(mplan->plan.lefttree == NULL);
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 6188bf69cb..6565b6ed01 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -209,16 +209,20 @@ static void partkey_datum_from_expr(PartitionPruneContext *context,
/*
* make_partition_pruneinfo
- * Builds a PartitionPruneInfo which can be used in the executor to allow
- * additional partition pruning to take place. Returns NULL when
- * partition pruning would be useless.
+ * Checks if the given set of quals can be used to build pruning steps
+ * that the executor can use to prune away unneeded partitions. If
+ * suitable quals are found then a PartitionPruneInfo is built and tagged
+ * onto the PlannerInfo's partPruneInfos list.
+ *
+ * The return value is the 0-based index of the item added to the
+ * partPruneInfos list or -1 if nothing was added.
*
* 'parentrel' is the RelOptInfo for an appendrel, and 'subpaths' is the list
* of scan paths for its child rels.
* 'prunequal' is a list of potential pruning quals (i.e., restriction
* clauses that are applicable to the appendrel).
*/
-PartitionPruneInfo *
+int
make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
List *subpaths,
List *prunequal)
@@ -332,7 +336,7 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
* quals, then we can just not bother with run-time pruning.
*/
if (prunerelinfos == NIL)
- return NULL;
+ return -1;
/* Else build the result data structure */
pruneinfo = makeNode(PartitionPruneInfo);
@@ -358,7 +362,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
else
pruneinfo->other_subplans = NULL;
- return pruneinfo;
+ root->partPruneInfos = lappend(root->partPruneInfos, pruneinfo);
+
+ return list_length(root->partPruneInfos) - 1;
}
/*
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 708435e952..bf962af7af 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -123,9 +123,8 @@ typedef struct PartitionPruneState
extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
int n_total_subplans,
- PartitionPruneInfo *pruneinfo,
+ int part_prune_index,
Bitmapset **initially_valid_subplans);
extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
bool initial_prune);
-
#endif /* EXECPARTITION_H */
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 01b1727fc0..4a741b053f 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -611,6 +611,7 @@ typedef struct EState
struct ExecRowMark **es_rowmarks; /* Array of per-range-table-entry
* ExecRowMarks, or NULL if none */
PlannedStmt *es_plannedstmt; /* link to top of plan tree */
+ List *es_part_prune_infos; /* PlannedStmt.partPruneInfos */
const char *es_sourceText; /* Source text from QueryDesc */
JunkFilter *es_junkFilter; /* top-level junk filter, if any */
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 6bda383bea..e392fb6fc0 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -122,6 +122,9 @@ typedef struct PlannerGlobal
/* "flat" list of AppendRelInfos */
List *appendRelations;
+ /* List of PartitionPruneInfo contained in the plan */
+ List *partPruneInfos;
+
/* OIDs of relations the plan depends on */
List *relationOids;
@@ -503,6 +506,9 @@ struct PlannerInfo
/* Does this query modify any partition key columns? */
bool partColsUpdated;
+
+ /* PartitionPruneInfos added in this query's plan. */
+ List *partPruneInfos;
};
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 21e642a64c..3eb3e6e527 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -70,6 +70,9 @@ typedef struct PlannedStmt
struct Plan *planTree; /* tree of Plan nodes */
+ List *partPruneInfos; /* List of PartitionPruneInfo contained in
+ * the plan */
+
List *rtable; /* list of RangeTblEntry nodes */
/* rtable indexes of target relations for INSERT/UPDATE/DELETE */
@@ -270,8 +273,8 @@ typedef struct Append
*/
int first_partial_plan;
- /* Info for run-time subplan pruning; NULL if we're not doing that */
- struct PartitionPruneInfo *part_prune_info;
+ /* Index to PlannerInfo.partPruneInfos or -1 if no run-time pruning */
+ int part_prune_index;
} Append;
/* ----------------
@@ -305,8 +308,8 @@ typedef struct MergeAppend
/* NULLS FIRST/LAST directions */
bool *nullsFirst pg_node_attr(array_size(numCols));
- /* Info for run-time subplan pruning; NULL if we're not doing that */
- struct PartitionPruneInfo *part_prune_info;
+ /* Index to PlannerInfo.partPruneInfos or -1 if no run-time pruning */
+ int part_prune_index;
} MergeAppend;
/* ----------------
diff --git a/src/include/partitioning/partprune.h b/src/include/partitioning/partprune.h
index 90684efa25..ebf0dcff8c 100644
--- a/src/include/partitioning/partprune.h
+++ b/src/include/partitioning/partprune.h
@@ -70,10 +70,10 @@ typedef struct PartitionPruneContext
#define PruneCxtStateIdx(partnatts, step_id, keyno) \
((partnatts) * (step_id) + (keyno))
-extern PartitionPruneInfo *make_partition_pruneinfo(struct PlannerInfo *root,
- struct RelOptInfo *parentrel,
- List *subpaths,
- List *prunequal);
+extern int make_partition_pruneinfo(struct PlannerInfo *root,
+ struct RelOptInfo *parentrel,
+ List *subpaths,
+ List *prunequal);
extern Bitmapset *prune_append_rel_partitions(struct RelOptInfo *rel);
extern Bitmapset *get_matching_partitions(PartitionPruneContext *context,
List *pruning_steps);
--
2.35.3
[application/octet-stream] v22-0002-Optimize-AcquireExecutorLocks-by-locking-only-un.patch (82.3K, 3-v22-0002-Optimize-AcquireExecutorLocks-by-locking-only-un.patch)
From 5f2d5ca36111f8007a7850fd985c7e965d621149 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Wed, 22 Dec 2021 16:55:17 +0900
Subject: [PATCH v22 2/2] Optimize AcquireExecutorLocks() by locking only
unpruned partitions
This commit teaches AcquireExecutorLocks() to perform initial
partition pruning to notionally eliminate the subnodes contained in a
generic cached plan that need not be initialized during the actual
execution of the plan, and to skip locking the partitions scanned by
those subnodes.
The result of performing initial partition pruning this way, before the
actual execution has started, is passed to the executor as a List of
PartitionPruneResult nodes, supplied along with the PlannedStmt by the
callers of the executor that used plancache.c to get the plan. The List
is NIL in the cases in which the plan is obtained by calling the planner
directly or if the plan obtained by plancache.c is not a generic one.
---
src/backend/commands/copyto.c | 2 +-
src/backend/commands/createas.c | 2 +-
src/backend/commands/explain.c | 7 +-
src/backend/commands/extension.c | 2 +-
src/backend/commands/matview.c | 2 +-
src/backend/commands/prepare.c | 26 ++-
src/backend/executor/README | 32 ++++
src/backend/executor/execMain.c | 51 ++++++
src/backend/executor/execParallel.c | 26 ++-
src/backend/executor/execPartition.c | 241 +++++++++++++++++++++----
src/backend/executor/execUtils.c | 1 +
src/backend/executor/functions.c | 2 +-
src/backend/executor/nodeAppend.c | 11 +-
src/backend/executor/nodeMergeAppend.c | 5 +-
src/backend/executor/spi.c | 27 ++-
src/backend/nodes/readfuncs.c | 8 +-
src/backend/optimizer/plan/planner.c | 2 +
src/backend/optimizer/plan/setrefs.c | 46 +++++
src/backend/partitioning/partprune.c | 41 ++++-
src/backend/tcop/postgres.c | 8 +-
src/backend/tcop/pquery.c | 28 ++-
src/backend/utils/cache/plancache.c | 208 ++++++++++++++++++---
src/backend/utils/mmgr/portalmem.c | 19 ++
src/include/commands/explain.h | 4 +-
src/include/executor/execPartition.h | 9 +-
src/include/executor/execdesc.h | 3 +
src/include/executor/executor.h | 3 +
src/include/nodes/execnodes.h | 1 +
src/include/nodes/nodes.h | 1 +
src/include/nodes/pathnodes.h | 12 ++
src/include/nodes/plannodes.h | 46 +++++
src/include/utils/plancache.h | 3 +-
src/include/utils/portal.h | 3 +
33 files changed, 782 insertions(+), 100 deletions(-)
diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index 2527e66059..fb8779fec0 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -558,7 +558,7 @@ BeginCopyTo(ParseState *pstate,
((DR_copy *) dest)->cstate = cstate;
/* Create a QueryDesc requesting no output */
- cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ cstate->queryDesc = CreateQueryDesc(plan, NIL, pstate->p_sourcetext,
GetActiveSnapshot(),
InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 152c29b551..942449544c 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -325,7 +325,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ queryDesc = CreateQueryDesc(plan, NIL, pstate->p_sourcetext,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index f86983c660..2f2b558608 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -407,7 +407,7 @@ ExplainOneQuery(Query *query, int cursorOptions,
}
/* run it (if needed) and produce output */
- ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+ ExplainOnePlan(plan, NIL, into, es, queryString, params, queryEnv,
&planduration, (es->buffers ? &bufusage : NULL));
}
}
@@ -515,7 +515,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
* to call it.
*/
void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, List *part_prune_results,
+ IntoClause *into, ExplainState *es,
const char *queryString, ParamListInfo params,
QueryEnvironment *queryEnv, const instr_time *planduration,
const BufferUsage *bufusage)
@@ -563,7 +564,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
dest = None_Receiver;
/* Create a QueryDesc for the query */
- queryDesc = CreateQueryDesc(plannedstmt, queryString,
+ queryDesc = CreateQueryDesc(plannedstmt, part_prune_results, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, instrument_option);
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index 6b6720c690..06dfcd4d84 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -776,7 +776,7 @@ execute_sql_string(const char *sql)
{
QueryDesc *qdesc;
- qdesc = CreateQueryDesc(stmt,
+ qdesc = CreateQueryDesc(stmt, NIL,
sql,
GetActiveSnapshot(), NULL,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 9ac0383459..65c8d0aa59 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -408,7 +408,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, queryString,
+ queryDesc = CreateQueryDesc(plan, NIL, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index c4b54d0547..b469e05672 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -155,6 +155,7 @@ ExecuteQuery(ParseState *pstate,
PreparedStatement *entry;
CachedPlan *cplan;
List *plan_list;
+ List *part_prune_results_list;
ParamListInfo paramLI = NULL;
EState *estate = NULL;
Portal portal;
@@ -193,7 +194,10 @@ ExecuteQuery(ParseState *pstate,
entry->plansource->query_string);
/* Replan if needed, and increment plan refcount for portal */
- cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL);
+ cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL,
+ &part_prune_results_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_results_list));
plan_list = cplan->stmt_list;
/*
@@ -207,6 +211,9 @@ ExecuteQuery(ParseState *pstate,
plan_list,
cplan);
+ /* Copy Lists of PartitionPruneResults into the portal's context. */
+ PortalStorePartitionPruneResults(portal, part_prune_results_list);
+
/*
* For CREATE TABLE ... AS EXECUTE, we must verify that the prepared
* statement is one that produces tuples. Currently we insist that it be
@@ -576,7 +583,9 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
const char *query_string;
CachedPlan *cplan;
List *plan_list;
- ListCell *p;
+ List *part_prune_results_list;
+ ListCell *p,
+ *pp;
ParamListInfo paramLI = NULL;
EState *estate = NULL;
instr_time planstart;
@@ -619,7 +628,10 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
/* Replan if needed, and acquire a transient refcount */
cplan = GetCachedPlan(entry->plansource, paramLI,
- CurrentResourceOwner, queryEnv);
+ CurrentResourceOwner, queryEnv,
+ &part_prune_results_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_results_list));
INSTR_TIME_SET_CURRENT(planduration);
INSTR_TIME_SUBTRACT(planduration, planstart);
@@ -634,13 +646,15 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
plan_list = cplan->stmt_list;
/* Explain each query */
- foreach(p, plan_list)
+ forboth(p, plan_list, pp, part_prune_results_list)
{
PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
+ List *part_prune_results = lfirst_node(List, pp);
if (pstmt->commandType != CMD_UTILITY)
- ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv,
- &planduration, (es->buffers ? &bufusage : NULL));
+ ExplainOnePlan(pstmt, part_prune_results, into, es, query_string,
+ paramLI, queryEnv, &planduration,
+ (es->buffers ? &bufusage : NULL));
else
ExplainOneUtility(pstmt->utilityStmt, into, es, query_string,
paramLI, queryEnv);
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 0b5183fc4a..f14f9197b5 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -65,6 +65,34 @@ found there. This currently only occurs for Append and MergeAppend nodes. In
this case the non-required subplans are ignored and the executor state's
subnode array will become out of sequence to the plan's subplan list.
+Actually, the so-called execution time pruning may also occur even before the
+execution has started. One case where that occurs is when a cached generic
+plan is being validated for execution by plancache.c: GetCachedPlan(), which
+works by locking all the relations that will be scanned by that plan. If the
+generic plan contains nodes that can perform execution time partition pruning
+(that is, contain a PartitionPruneInfo), a subset of pruning steps contained
+in a given node's PartitionPruneInfo that do not depend on the execution
+actually having started (called "initial" pruning steps) are performed to
+figure out the minimal set of child subplans that satisfy those pruning steps.
+AcquireExecutorLocks() looking at a given generic plan will then lock only the
+relations scanned by the child subplans that survived such pruning, along with
+those present in PlannedStmt.minLockRelids. Note that the subplans are only
+notionally pruned, that is, they are not removed from the plan tree as such.
+
+To prevent the executor and any third party execution code that can look at
+the plan tree from trying to execute the subplans that were pruned as
+described above, the result of pruning is passed to the executor as a List
+of PartitionPruneResult nodes via the QueryDesc. Each PartitionPruneResult
+consists of the set of indexes of surviving subplans in the respective parent
+plan node's (the one to which the corresponding PartitionPruneInfo belongs)
+list of child subplans, saved as a bitmapset (valid_subplan_offs). In other
+words, the executor executing a generic plan should not re-evaluate the set of
+initially valid subplans for a given plan node by redoing the initial pruning
+if it was already done by AcquireExecutorLocks() when validating the plan.
+Such re-evaluation of the pruning steps may very well end up resulting in a
+different set of subplans, containing some whose relations were not locked by
+AcquireExecutorLocks().
+
Each Plan node may have expression trees associated with it, to represent
its target list, qualification conditions, etc. These trees are also
read-only to the executor, but the executor state for expression evaluation
@@ -286,6 +314,10 @@ Query Processing Control Flow
This is a sketch of control flow for full query processing:
+ [ ExecutorDoInitialPruning ] --- an optional step to perform initial
+ partition pruning on the plan tree the result of which is passed
+ to the executor via QueryDesc
+
CreateQueryDesc
ExecutorStart
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 32475e33ff..b59474841f 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -49,6 +49,7 @@
#include "commands/matview.h"
#include "commands/trigger.h"
#include "executor/execdebug.h"
+#include "executor/execPartition.h"
#include "executor/nodeSubplan.h"
#include "foreign/fdwapi.h"
#include "jit/jit.h"
@@ -104,6 +105,54 @@ static void EvalPlanQualStart(EPQState *epqstate, Plan *planTree);
/* end of local decls */
+/* ----------------------------------------------------------------
+ * ExecutorDoInitialPruning
+ *
+ * For each plan tree node that has been assigned a PartitionPruneInfo,
+ * this performs initial partition pruning using the information contained
+ * therein to determine the set of child subplans that satisfy the initial
+ * pruning steps, to be returned as a bitmapset of their indexes in the
+ * node's list of child subplans (for example, an Append's appendplans).
+ *
+ * Return value is a List of PartitionPruneResult nodes, one for every
+ * PartitionPruneInfo, each containing that bitmapset. The RT indexes of all
+ * the leaf partitions scanned by the chosen subplans are returned in
+ * *scan_leafpart_rtis, which is shared across all PartitionPruneInfos.
+ *
+ * The executor must see exactly the same set of subplans as valid for
+ * execution when doing ExecInitNode() on the plan nodes whose
+ * PartitionPruneInfos are processed here. So, it must get the set from the
+ * aforementioned PartitionPruneResults, instead of computing it all over
+ * again by redoing the initial pruning. It's the caller's job to pass the
+ * PartitionPruneResults to the executor.
+ *
+ * Note: Partitioned tables mentioned in PartitionedRelPruneInfo nodes that
+ * drive the pruning will be locked before doing the pruning.
+ * ----------------------------------------------------------------
+ */
+List *
+ExecutorDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
+ Bitmapset **scan_leafpart_rtis)
+{
+ List *part_prune_results = NIL;
+ ListCell *lc;
+
+ /* Only get here if there is any pruning to do. */
+ Assert(plannedstmt->containsInitialPruning);
+
+ foreach(lc, plannedstmt->partPruneInfos)
+ {
+ PartitionPruneInfo *pruneinfo = lfirst(lc);
+ PartitionPruneResult *pruneresult = makeNode(PartitionPruneResult);
+
+ pruneresult->valid_subplan_offs =
+ ExecPartitionDoInitialPruning(plannedstmt, params, pruneinfo,
+ scan_leafpart_rtis);
+ part_prune_results = lappend(part_prune_results, pruneresult);
+ }
+
+ return part_prune_results;
+}
/* ----------------------------------------------------------------
* ExecutorStart
@@ -806,6 +855,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
{
CmdType operation = queryDesc->operation;
PlannedStmt *plannedstmt = queryDesc->plannedstmt;
+ List *part_prune_results = queryDesc->part_prune_results;
Plan *plan = plannedstmt->planTree;
List *rangeTable = plannedstmt->rtable;
EState *estate = queryDesc->estate;
@@ -826,6 +876,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
estate->es_plannedstmt = plannedstmt;
estate->es_part_prune_infos = plannedstmt->partPruneInfos;
+ estate->es_part_prune_results = part_prune_results;
/*
* Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index aca0c6f323..917079a034 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -66,6 +66,7 @@
#define PARALLEL_KEY_QUERY_TEXT UINT64CONST(0xE000000000000008)
#define PARALLEL_KEY_JIT_INSTRUMENTATION UINT64CONST(0xE000000000000009)
#define PARALLEL_KEY_WAL_USAGE UINT64CONST(0xE00000000000000A)
+#define PARALLEL_KEY_PARTITION_PRUNE_RESULTS UINT64CONST(0xE00000000000000B)
#define PARALLEL_TUPLE_QUEUE_SIZE 65536
@@ -182,6 +183,7 @@ ExecSerializePlan(Plan *plan, EState *estate)
pstmt->transientPlan = false;
pstmt->dependsOnRole = false;
pstmt->parallelModeNeeded = false;
+ pstmt->containsInitialPruning = false;
pstmt->planTree = plan;
pstmt->partPruneInfos = estate->es_part_prune_infos;
pstmt->rtable = estate->es_range_table;
@@ -597,12 +599,15 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
FixedParallelExecutorState *fpes;
char *pstmt_data;
char *pstmt_space;
+ char *part_prune_results_data;
+ char *part_prune_results_space;
char *paramlistinfo_space;
BufferUsage *bufusage_space;
WalUsage *walusage_space;
SharedExecutorInstrumentation *instrumentation = NULL;
SharedJitInstrumentation *jit_instrumentation = NULL;
int pstmt_len;
+ int part_prune_results_len;
int paramlistinfo_len;
int instrumentation_len = 0;
int jit_instrumentation_len = 0;
@@ -631,6 +636,7 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
/* Fix up and serialize plan to be sent to workers. */
pstmt_data = ExecSerializePlan(planstate->plan, estate);
+ part_prune_results_data = nodeToString(estate->es_part_prune_results);
/* Create a parallel context. */
pcxt = CreateParallelContext("postgres", "ParallelQueryMain", nworkers);
@@ -657,6 +663,11 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
shm_toc_estimate_chunk(&pcxt->estimator, pstmt_len);
shm_toc_estimate_keys(&pcxt->estimator, 1);
+ /* Estimate space for serialized List of PartitionPruneResult. */
+ part_prune_results_len = strlen(part_prune_results_data) + 1;
+ shm_toc_estimate_chunk(&pcxt->estimator, part_prune_results_len);
+ shm_toc_estimate_keys(&pcxt->estimator, 1);
+
/* Estimate space for serialized ParamListInfo. */
paramlistinfo_len = EstimateParamListSpace(estate->es_param_list_info);
shm_toc_estimate_chunk(&pcxt->estimator, paramlistinfo_len);
@@ -751,6 +762,12 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
memcpy(pstmt_space, pstmt_data, pstmt_len);
shm_toc_insert(pcxt->toc, PARALLEL_KEY_PLANNEDSTMT, pstmt_space);
+ /* Store serialized List of PartitionPruneResult */
+ part_prune_results_space = shm_toc_allocate(pcxt->toc, part_prune_results_len);
+ memcpy(part_prune_results_space, part_prune_results_data, part_prune_results_len);
+ shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARTITION_PRUNE_RESULTS,
+ part_prune_results_space);
+
/* Store serialized ParamListInfo. */
paramlistinfo_space = shm_toc_allocate(pcxt->toc, paramlistinfo_len);
shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARAMLISTINFO, paramlistinfo_space);
@@ -1232,8 +1249,10 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
int instrument_options)
{
char *pstmtspace;
+ char *part_prune_results_space;
char *paramspace;
PlannedStmt *pstmt;
+ List *part_prune_results;
ParamListInfo paramLI;
char *queryString;
@@ -1244,12 +1263,17 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
pstmtspace = shm_toc_lookup(toc, PARALLEL_KEY_PLANNEDSTMT, false);
pstmt = (PlannedStmt *) stringToNode(pstmtspace);
+ /* Reconstruct leader-supplied PartitionPruneResult. */
+ part_prune_results_space =
+ shm_toc_lookup(toc, PARALLEL_KEY_PARTITION_PRUNE_RESULTS, false);
+ part_prune_results = (List *) stringToNode(part_prune_results_space);
+
/* Reconstruct ParamListInfo. */
paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
paramLI = RestoreParamList(&paramspace);
/* Create a QueryDesc for the query. */
- return CreateQueryDesc(pstmt,
+ return CreateQueryDesc(pstmt, part_prune_results,
queryString,
GetActiveSnapshot(), InvalidSnapshot,
receiver, paramLI, NULL, instrument_options);
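For readers following the execParallel.c changes above, here is a standalone toy sketch (not part of the patch) of the same hand-off pattern: serialize a result to a NUL-terminated string, size the chunk as strlen() + 1, memcpy it into the destination, and parse it back on the other side. The toy_* functions and the "(b 0 2)" text form are assumptions made for illustration only; malloc() merely stands in for shm_toc_allocate(), and a 32-bit mask stands in for a Bitmapset.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define TOY_BUFSZ 128			/* enough for "(b", 32 members, ")" and NUL */

/* Hypothetical text form of a set of subplan offsets, e.g. "(b 0 2)". */
static char *
toy_serialize(uint32_t set)
{
	char	   *buf = malloc(TOY_BUFSZ);
	size_t		len = 0;

	if (buf == NULL)
		abort();
	len += (size_t) snprintf(buf + len, TOY_BUFSZ - len, "(b");
	for (int i = 0; i < 32; i++)
	{
		if (set & (UINT32_C(1) << i))
			len += (size_t) snprintf(buf + len, TOY_BUFSZ - len, " %d", i);
	}
	snprintf(buf + len, TOY_BUFSZ - len, ")");
	return buf;
}

/* Parse the text form produced by toy_serialize() back into a bitmask. */
static uint32_t
toy_deserialize(const char *str)
{
	uint32_t	set = 0;
	const char *p = str + 2;	/* skip the leading "(b" */

	for (;;)
	{
		char	   *end;
		long		member = strtol(p, &end, 10);

		if (end == p)
			break;				/* no more numbers; we are at ")" */
		set |= UINT32_C(1) << member;
		p = end;
	}
	return set;
}

/* "Leader" serializes and copies; "worker" reads the copy back. */
static uint32_t
toy_roundtrip(uint32_t set)
{
	char	   *data = toy_serialize(set);
	size_t		len = strlen(data) + 1; /* include the terminating NUL */
	char	   *space = malloc(len);	/* stand-in for a shared chunk */
	uint32_t	result;

	if (space == NULL)
		abort();
	memcpy(space, data, len);
	result = toy_deserialize(space);
	free(space);
	free(data);
	return result;
}
```

Calling toy_roundtrip() with a mask that has bits 0 and 2 set should hand back the same mask, which is the property the worker relies on when it rebuilds the leader's pruning results instead of redoing the pruning.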
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 80197d5141..8728745c44 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -25,6 +25,7 @@
#include "mb/pg_wchar.h"
#include "miscadmin.h"
#include "nodes/makefuncs.h"
+#include "parser/parsetree.h"
#include "partitioning/partbounds.h"
#include "partitioning/partdesc.h"
#include "partitioning/partprune.h"
@@ -185,7 +186,11 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
static PartitionPruneState *CreatePartitionPruneState(PlanState *planstate,
- PartitionPruneInfo *pruneinfo);
+ PartitionPruneInfo *pruneinfo,
+ bool consider_initial_steps,
+ bool consider_exec_steps,
+ List *rtable, ExprContext *econtext,
+ PartitionDirectory partdir);
static void InitPartitionPruneContext(PartitionPruneContext *context,
List *pruning_steps,
PartitionDesc partdesc,
@@ -198,7 +203,8 @@ static void PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
- Bitmapset **validsubplans);
+ Bitmapset **validsubplans,
+ Bitmapset **scan_leafpart_rtis);
/*
@@ -1746,8 +1752,10 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
* considered to be a stable expression, it can change value from one plan
* node scan to the next during query execution. Stable comparison
* expressions that don't involve such Params allow partition pruning to be
- * done once during executor startup. Expressions that do involve such Params
- * require us to prune separately for each scan of the parent plan node.
+ * done once during executor startup or during ExecutorDoInitialPruning(),
+ * which runs as part of performing AcquireExecutorLocks() on a given plan tree.
+ * Expressions that do involve such Params require us to prune separately for
+ * each scan of the parent plan node.
*
* Note that pruning away unneeded subplans during executor startup has the
* added benefit of not having to initialize the unneeded subplans at all.
@@ -1764,6 +1772,13 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
* account for initial pruning possibly having eliminated some of the
* subplans.
*
+ * ExecPartitionDoInitialPruning:
+ * Do initial pruning with the information contained in a given
+ * PartitionPruneInfo to determine the minimal set of child subplans
+ * of the parent plan node (the one to which the PartitionPruneInfo
+ * belongs) that need to be executed, and also the set of RT indexes of
+ * the leaf partitions that will be scanned by those subplans.
+ *
* ExecFindMatchingSubPlans:
* Returns indexes of matching subplans after evaluating the expressions
* that are safe to evaluate at a given point. This function is first
@@ -1781,8 +1796,9 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
*
* On return, *initially_valid_subplans is assigned the set of indexes of
* child subplans that must be initialized along with the parent plan node.
- * Initial pruning is performed here if needed and in that case only the
- * surviving subplans' indexes are added.
+ * Initial pruning is performed here if needed (unless it has already been done
+ * by ExecutorDoInitialPruning()), and in that case only the surviving
+ * subplans' indexes are added.
*
* If subplans are indeed pruned, subplan_map arrays contained in the returned
* PartitionPruneState are re-sequenced to not count those, though only if the
@@ -1794,28 +1810,65 @@ ExecInitPartitionPruning(PlanState *planstate,
int part_prune_index,
Bitmapset **initially_valid_subplans)
{
- PartitionPruneState *prunestate;
+ PartitionPruneState *prunestate = NULL;
EState *estate = planstate->state;
PartitionPruneInfo *pruneinfo = list_nth(estate->es_part_prune_infos,
part_prune_index);
+ PartitionPruneResult *pruneresult = NULL;
+ bool do_pruning = (pruneinfo->needs_init_pruning ||
+ pruneinfo->needs_exec_pruning);
+
+ /*
+ * No need to do initial pruning if it was done already by
+ * ExecutorDoInitialPruning(), which it would be if es_part_prune_results
+ * is set.
+ */
+ if (estate->es_part_prune_results)
+ {
+ pruneresult = list_nth(estate->es_part_prune_results, part_prune_index);
+ Assert(IsA(pruneresult, PartitionPruneResult));
+ do_pruning = pruneinfo->needs_exec_pruning;
+ }
- /* We may need an expression context to evaluate partition exprs */
- ExecAssignExprContext(estate, planstate);
+ if (do_pruning)
+ {
+ /* We may need an expression context to evaluate partition exprs */
+ ExecAssignExprContext(estate, planstate);
- /* Create the working data structure for pruning */
- prunestate = CreatePartitionPruneState(planstate, pruneinfo);
+ /* For data reading, executor always omits detached partitions */
+ if (estate->es_partition_directory == NULL)
+ estate->es_partition_directory =
+ CreatePartitionDirectory(estate->es_query_cxt, false);
+
+ /*
+ * Create the working data structure for pruning. No need to consider
+ * initial pruning steps if we have a PartitionPruneResult.
+ */
+ prunestate = CreatePartitionPruneState(planstate, pruneinfo,
+ pruneresult == NULL, true,
+ NIL, planstate->ps_ExprContext,
+ estate->es_partition_directory);
+ }
/*
* Perform an initial partition prune pass, if required.
*/
- if (prunestate->do_initial_prune)
- *initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true);
+ if (pruneresult)
+ {
+ *initially_valid_subplans = bms_copy(pruneresult->valid_subplan_offs);
+ }
+ else if (prunestate && prunestate->do_initial_prune)
+ {
+ *initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true,
+ NULL);
+ }
else
{
- /* No pruning, so we'll need to initialize all subplans */
+ /* No initial pruning, so we'll need to initialize all subplans */
Assert(n_total_subplans > 0);
*initially_valid_subplans = bms_add_range(NULL, 0,
n_total_subplans - 1);
+ return prunestate;
}
/*
@@ -1823,7 +1876,8 @@ ExecInitPartitionPruning(PlanState *planstate,
* that were removed above due to initial pruning. No need to do this if
* no steps were removed.
*/
- if (bms_num_members(*initially_valid_subplans) < n_total_subplans)
+ if (prunestate &&
+ bms_num_members(*initially_valid_subplans) < n_total_subplans)
{
/*
* We can safely skip this when !do_exec_prune, even though that
@@ -1839,11 +1893,74 @@ ExecInitPartitionPruning(PlanState *planstate,
return prunestate;
}
+/*
+ * ExecPartitionDoInitialPruning
+ * Perform initial pruning using given PartitionPruneInfo to determine
+ * the minimal set of child subplans that will be executed and also the
+ * set of RT indexes of the leaf partitions scanned by those subplans.
+ */
+Bitmapset *
+ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
+ PartitionPruneInfo *pruneinfo,
+ Bitmapset **scan_leafpart_rtis)
+{
+ List *rtable = plannedstmt->rtable;
+ ExprContext *econtext;
+ PartitionDirectory pdir;
+ MemoryContext oldcontext,
+ tmpcontext;
+ PartitionPruneState *prunestate;
+ Bitmapset *valid_subplan_offs;
+
+ /*
+ * A temporary context for memory allocations required while executing
+ * partition pruning steps.
+ */
+ tmpcontext = AllocSetContextCreate(CurrentMemoryContext,
+ "initial pruning working data",
+ ALLOCSET_DEFAULT_SIZES);
+ oldcontext = MemoryContextSwitchTo(tmpcontext);
+
+ /*
+ * PartitionDirectory to look up partition descriptors.
+ * Note that we don't omit detached partitions, just like during
+ * execution proper.
+ */
+ pdir = CreatePartitionDirectory(CurrentMemoryContext, false);
+
+ /*
+ * We don't yet have a PlanState for the parent plan node, so we must
+ * create a standalone ExprContext to evaluate pruning expressions,
+ * equipped with the information about the EXTERN parameters that the
+ * caller passed us. That's okay because the initial pruning steps
+ * contain nothing that requires execution to have started, and so
+ * nothing that needs the information contained in a PlanState.
+ */
+ econtext = CreateStandaloneExprContext();
+ econtext->ecxt_param_list_info = params;
+ prunestate = CreatePartitionPruneState(NULL, pruneinfo, true, false,
+ rtable, econtext, pdir);
+ MemoryContextSwitchTo(oldcontext);
+
+ /* Do the initial pruning. */
+ valid_subplan_offs = ExecFindMatchingSubPlans(prunestate, true,
+ scan_leafpart_rtis);
+
+ FreeExprContext(econtext, true);
+ DestroyPartitionDirectory(pdir);
+ MemoryContextDelete(tmpcontext);
+
+ return valid_subplan_offs;
+}
+
/*
* CreatePartitionPruneState
* Build the data structure required for calling ExecFindMatchingSubPlans
*
- * 'planstate' is the parent plan node's execution state.
+ * 'planstate', if not NULL, is the parent plan node's execution state. It
+ * can be NULL if being called before ExecutorStart(), in which case,
+ * 'rtable' (range table), 'econtext', and 'partdir' must be explicitly
+ * provided.
*
* 'pruneinfo' is a PartitionPruneInfo as generated by
* make_partition_pruneinfo. Here we build a PartitionPruneState containing a
@@ -1857,19 +1974,21 @@ ExecInitPartitionPruning(PlanState *planstate,
* PartitionedRelPruneInfo.
*/
static PartitionPruneState *
-CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
+CreatePartitionPruneState(PlanState *planstate,
+ PartitionPruneInfo *pruneinfo,
+ bool consider_initial_steps,
+ bool consider_exec_steps,
+ List *rtable, ExprContext *econtext,
+ PartitionDirectory partdir)
{
- EState *estate = planstate->state;
+ EState *estate = planstate ? planstate->state : NULL;
PartitionPruneState *prunestate;
int n_part_hierarchies;
ListCell *lc;
int i;
- ExprContext *econtext = planstate->ps_ExprContext;
- /* For data reading, executor always omits detached partitions */
- if (estate->es_partition_directory == NULL)
- estate->es_partition_directory =
- CreatePartitionDirectory(estate->es_query_cxt, false);
+ Assert((estate != NULL) ||
+ (partdir != NULL && econtext != NULL && rtable != NIL));
n_part_hierarchies = list_length(pruneinfo->prune_infos);
Assert(n_part_hierarchies > 0);
@@ -1924,15 +2043,42 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
PartitionKey partkey;
/*
- * We can rely on the copies of the partitioned table's partition
- * key and partition descriptor appearing in its relcache entry,
- * because that entry will be held open and locked for the
- * duration of this executor run.
+ * Must open the relation ourselves when called before the
+ * execution has started, such as when called during
+ * ExecutorDoInitialPruning() on a cached plan. In that case,
+ * sub-partitions must be locked, because AcquirePlannerLocks()
+ * would not have seen them. (1st relation in a partrelpruneinfos
+ * list is always the root partitioned table appearing in the
+ * query, which AcquirePlannerLocks() would have locked; the
+ * Assert in relation_open() guards that assumption.)
+ */
+ if (estate == NULL)
+ {
+ RangeTblEntry *rte = rt_fetch(pinfo->rtindex, rtable);
+ int lockmode = (j == 0) ? NoLock : rte->rellockmode;
+
+ partrel = table_open(rte->relid, lockmode);
+ }
+ else
+ partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
+
+ /*
+ * We can rely on the copy of the partitioned table's partition
+ * key in its relcache entry, because it can't change (or
+ * get destroyed) as long as the relation is locked. Partition
+ * descriptor is taken from the PartitionDirectory associated with
+ * the table that is held open long enough for the descriptor to
+ * remain valid while it's used to perform the pruning steps.
*/
- partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
partkey = RelationGetPartitionKey(partrel);
- partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
- partrel);
+ partdesc = PartitionDirectoryLookup(partdir, partrel);
+
+ /*
+ * Must close partrel, keeping the lock taken, if we're not using
+ * EState's entry.
+ */
+ if (estate == NULL)
+ table_close(partrel, NoLock);
/*
* Initialize the subplan_map and subpart_map.
@@ -1946,6 +2092,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
Assert(partdesc->nparts >= pinfo->nparts);
pprune->nparts = partdesc->nparts;
pprune->subplan_map = palloc(sizeof(int) * partdesc->nparts);
+ pprune->rti_map = palloc(sizeof(Index) * partdesc->nparts);
if (partdesc->nparts == pinfo->nparts)
{
/*
@@ -1956,6 +2103,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
pprune->subpart_map = pinfo->subpart_map;
memcpy(pprune->subplan_map, pinfo->subplan_map,
sizeof(int) * pinfo->nparts);
+ memcpy(pprune->rti_map, pinfo->rti_map,
+ sizeof(Index) * pinfo->nparts);
/*
* Double-check that the list of unpruned relations has not
@@ -2006,6 +2155,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
pinfo->subplan_map[pd_idx];
pprune->subpart_map[pp_idx] =
pinfo->subpart_map[pd_idx];
+ pprune->rti_map[pp_idx] =
+ pinfo->rti_map[pd_idx];
pd_idx++;
}
else
@@ -2013,6 +2164,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
/* this partdesc entry is not in the plan */
pprune->subplan_map[pp_idx] = -1;
pprune->subpart_map[pp_idx] = -1;
+ pprune->rti_map[pp_idx] = 0;
}
}
@@ -2034,7 +2186,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
* Initialize pruning contexts as needed.
*/
pprune->initial_pruning_steps = pinfo->initial_pruning_steps;
- if (pinfo->initial_pruning_steps)
+ if (consider_initial_steps && pinfo->initial_pruning_steps)
{
InitPartitionPruneContext(&pprune->initial_context,
pinfo->initial_pruning_steps,
@@ -2044,7 +2196,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
prunestate->do_initial_prune = true;
}
pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
- if (pinfo->exec_pruning_steps)
+ if (consider_exec_steps && pinfo->exec_pruning_steps)
{
InitPartitionPruneContext(&pprune->exec_context,
pinfo->exec_pruning_steps,
@@ -2272,10 +2424,14 @@ PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
* Pass initial_prune if PARAM_EXEC Params cannot yet be evaluated. This
* differentiates the initial executor-time pruning step from later
* runtime pruning.
+ *
+ * RT indexes of leaf partitions scanned by the chosen subplans are added to
+ * *scan_leafpart_rtis if the pointer is non-NULL.
*/
Bitmapset *
ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
- bool initial_prune)
+ bool initial_prune,
+ Bitmapset **scan_leafpart_rtis)
{
Bitmapset *result = NULL;
MemoryContext oldcontext;
@@ -2310,7 +2466,7 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
*/
pprune = &prunedata->partrelprunedata[0];
find_matching_subplans_recurse(prunedata, pprune, initial_prune,
- &result);
+ &result, scan_leafpart_rtis);
/* Expression eval may have used space in ExprContext too */
if (pprune->exec_pruning_steps)
@@ -2324,6 +2480,8 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
/* Copy result out of the temp context before we reset it */
result = bms_copy(result);
+ if (scan_leafpart_rtis)
+ *scan_leafpart_rtis = bms_copy(*scan_leafpart_rtis);
MemoryContextReset(prunestate->prune_context);
@@ -2334,13 +2492,15 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
* find_matching_subplans_recurse
* Recursive worker function for ExecFindMatchingSubPlans
*
- * Adds valid (non-prunable) subplan IDs to *validsubplans
+ * Adds valid (non-prunable) subplan IDs to *validsubplans and RT indexes of
+ * the corresponding leaf partitions to *scan_leafpart_rtis (if asked for).
*/
static void
find_matching_subplans_recurse(PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
- Bitmapset **validsubplans)
+ Bitmapset **validsubplans,
+ Bitmapset **scan_leafpart_rtis)
{
Bitmapset *partset;
int i;
@@ -2367,8 +2527,14 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
while ((i = bms_next_member(partset, i)) >= 0)
{
if (pprune->subplan_map[i] >= 0)
+ {
*validsubplans = bms_add_member(*validsubplans,
pprune->subplan_map[i]);
+ Assert(pprune->rti_map[i] > 0);
+ if (scan_leafpart_rtis)
+ *scan_leafpart_rtis = bms_add_member(*scan_leafpart_rtis,
+ pprune->rti_map[i]);
+ }
else
{
int partidx = pprune->subpart_map[i];
@@ -2376,7 +2542,8 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
if (partidx >= 0)
find_matching_subplans_recurse(prunedata,
&prunedata->partrelprunedata[partidx],
- initial_prune, validsubplans);
+ initial_prune, validsubplans,
+ scan_leafpart_rtis);
else
{
/*
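To make the recursion in find_matching_subplans_recurse() easier to picture, here is a standalone toy sketch (not part of the patch). It assumes small sets that fit in a 64-bit mask in place of Bitmapset, and the ToyPartRel struct and toy_* functions are hypothetical stand-ins for PartitionedRelPruningData and the real functions. It only shows how surviving leaf partitions contribute both a subplan index and, when the caller asks for it, an RT index.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_PARTS 8

/* Hypothetical, simplified stand-in for PartitionedRelPruningData. */
typedef struct ToyPartRel
{
	int			nparts;
	uint64_t	present;		/* bit i set => partition i survived pruning */
	int			subplan_map[TOY_MAX_PARTS]; /* subplan index, or -1 */
	int			subpart_map[TOY_MAX_PARTS]; /* index of sub-partitioned rel, or -1 */
	unsigned	rti_map[TOY_MAX_PARTS];		/* RT index of leaf partition, or 0 */
} ToyPartRel;

/*
 * Walk the surviving partitions of rels[relidx].  Leaf partitions add their
 * subplan index to *validsubplans and, if scan_leafpart_rtis is non-NULL,
 * their RT index to *scan_leafpart_rtis.  Sub-partitioned tables recurse.
 */
static void
toy_find_matching(const ToyPartRel *rels, int relidx,
				  uint64_t *validsubplans, uint64_t *scan_leafpart_rtis)
{
	const ToyPartRel *rel = &rels[relidx];

	for (int i = 0; i < rel->nparts; i++)
	{
		if ((rel->present & (UINT64_C(1) << i)) == 0)
			continue;			/* pruned */

		if (rel->subplan_map[i] >= 0)
		{
			*validsubplans |= UINT64_C(1) << rel->subplan_map[i];
			if (scan_leafpart_rtis != NULL)
				*scan_leafpart_rtis |= UINT64_C(1) << rel->rti_map[i];
		}
		else if (rel->subpart_map[i] >= 0)
			toy_find_matching(rels, rel->subpart_map[i],
							  validsubplans, scan_leafpart_rtis);
	}
}

/*
 * Example hierarchy: the root has one leaf (subplan 0, RT index 2) and one
 * sub-partitioned child, whose two leaves are subplans 1 and 2 (RT indexes
 * 4 and 5).  Only the second of those two leaves survives pruning.
 */
static void
toy_example(uint64_t *validsubplans, uint64_t *scan_leafpart_rtis)
{
	ToyPartRel	rels[2] = {
		{2, 0x3, {0, -1}, {-1, 1}, {2, 0}},
		{2, 0x2, {1, 2}, {-1, -1}, {4, 5}},
	};

	*validsubplans = 0;
	if (scan_leafpart_rtis != NULL)
		*scan_leafpart_rtis = 0;
	toy_find_matching(rels, 0, validsubplans, scan_leafpart_rtis);
}
```

Passing NULL for the RT index set mirrors the callers in nodeAppend.c and nodeMergeAppend.c that only want the subplan indexes.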
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 21f4c10937..67a58c7163 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -134,6 +134,7 @@ CreateExecutorState(void)
estate->es_param_exec_vals = NULL;
estate->es_queryEnv = NULL;
+ estate->es_part_prune_results = NIL;
estate->es_query_cxt = qcontext;
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index e134a82ff7..18d3b98cdc 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -842,7 +842,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
else
dest = None_Receiver;
- es->qd = CreateQueryDesc(es->stmt,
+ es->qd = CreateQueryDesc(es->stmt, NIL,
fcache->src,
GetActiveSnapshot(),
InvalidSnapshot,
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index c6f86a6510..96880e122a 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -155,7 +155,8 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
* subplan, we can fill as_valid_subplans immediately, preventing
* later calls to ExecFindMatchingSubPlans.
*/
- if (!prunestate->do_exec_prune && nplans > 0)
+ if (appendstate->as_prune_state == NULL ||
+ (!appendstate->as_prune_state->do_exec_prune && nplans > 0))
appendstate->as_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
}
else
@@ -577,7 +578,7 @@ choose_next_subplan_locally(AppendState *node)
}
else if (node->as_valid_subplans == NULL)
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
whichplan = -1;
}
@@ -642,7 +643,7 @@ choose_next_subplan_for_leader(AppendState *node)
if (node->as_valid_subplans == NULL)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
/*
* Mark each invalid plan as finished to allow the loop below to
@@ -717,7 +718,7 @@ choose_next_subplan_for_worker(AppendState *node)
else if (node->as_valid_subplans == NULL)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
mark_invalid_subplans_as_finished(node);
}
@@ -868,7 +869,7 @@ ExecAppendAsyncBegin(AppendState *node)
if (node->as_valid_subplans == NULL)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
classify_matching_subplans(node);
}
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index 8d35860c30..2312e5a633 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -103,7 +103,8 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
* subplan, we can fill ms_valid_subplans immediately, preventing
* later calls to ExecFindMatchingSubPlans.
*/
- if (!prunestate->do_exec_prune && nplans > 0)
+ if (mergestate->ms_prune_state == NULL ||
+ (!mergestate->ms_prune_state->do_exec_prune && nplans > 0))
mergestate->ms_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
}
else
@@ -218,7 +219,7 @@ ExecMergeAppend(PlanState *pstate)
*/
if (node->ms_valid_subplans == NULL)
node->ms_valid_subplans =
- ExecFindMatchingSubPlans(node->ms_prune_state, false);
+ ExecFindMatchingSubPlans(node->ms_prune_state, false, NULL);
/*
* First time through: pull the first tuple from each valid subplan,
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index fd5796f1b9..93012a5b3b 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1578,6 +1578,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
CachedPlanSource *plansource;
CachedPlan *cplan;
List *stmt_list;
+ List *part_prune_results_list;
char *query_string;
Snapshot snapshot;
MemoryContext oldcontext;
@@ -1657,7 +1658,10 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
*/
/* Replan if needed, and increment plan refcount for portal */
- cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv);
+ cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv,
+ &part_prune_results_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_results_list));
stmt_list = cplan->stmt_list;
if (!plan->saved)
@@ -1685,6 +1689,9 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
stmt_list,
cplan);
+ /* Copy Lists of PartitionPruneResults into the portal's context. */
+ PortalStorePartitionPruneResults(portal, part_prune_results_list);
+
/*
* Set up options for portal. Default SCROLL type is chosen the same way
* as PerformCursorOpen does it.
@@ -2092,7 +2099,8 @@ SPI_plan_get_cached_plan(SPIPlanPtr plan)
/* Get the generic plan for the query */
cplan = GetCachedPlan(plansource, NULL,
plan->saved ? CurrentResourceOwner : NULL,
- _SPI_current->queryEnv);
+ _SPI_current->queryEnv,
+ NULL /* Not interested in PartitionPruneResults */);
Assert(cplan == plansource->gplan);
/* Pop the error context stack */
@@ -2473,7 +2481,9 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
{
CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
List *stmt_list;
- ListCell *lc2;
+ List *part_prune_results_list;
+ ListCell *lc2,
+ *lc3;
spicallbackarg.query = plansource->query_string;
@@ -2549,8 +2559,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
* plan, the refcount must be backed by the plan_owner.
*/
cplan = GetCachedPlan(plansource, options->params,
- plan_owner, _SPI_current->queryEnv);
-
+ plan_owner, _SPI_current->queryEnv,
+ &part_prune_results_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_results_list));
stmt_list = cplan->stmt_list;
/*
@@ -2589,9 +2601,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
}
}
- foreach(lc2, stmt_list)
+ forboth(lc2, stmt_list, lc3, part_prune_results_list)
{
PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+ List *part_prune_results = lfirst_node(List, lc3);
bool canSetTag = stmt->canSetTag;
DestReceiver *dest;
@@ -2663,7 +2676,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
else
snap = InvalidSnapshot;
- qdesc = CreateQueryDesc(stmt,
+ qdesc = CreateQueryDesc(stmt, part_prune_results,
plansource->query_string,
snap, crosscheck_snapshot,
dest,
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index b4ff855f7c..77990a2732 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -158,6 +158,11 @@
token = pg_strtok(&length); /* skip :fldname */ \
local_node->fldname = readIntCols(len)
+/* Read an Index array */
+#define READ_INDEX_ARRAY(fldname, len) \
+ token = pg_strtok(&length); /* skip :fldname */ \
+ local_node->fldname = readIndexCols(len)
+
/* Read a bool array */
#define READ_BOOL_ARRAY(fldname, len) \
token = pg_strtok(&length); /* skip :fldname */ \
@@ -795,7 +800,6 @@ fnname(int numCols) \
*/
READ_SCALAR_ARRAY(readAttrNumberCols, int16, atoi)
READ_SCALAR_ARRAY(readOidCols, Oid, atooid)
-/* outfuncs.c has writeIndexCols, but we don't yet need that here */
-/* READ_SCALAR_ARRAY(readIndexCols, Index, atoui) */
+READ_SCALAR_ARRAY(readIndexCols, Index, atoui)
READ_SCALAR_ARRAY(readIntCols, int, atoi)
READ_SCALAR_ARRAY(readBoolCols, bool, strtobool)
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 31fff597a7..4097cf7164 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -520,7 +520,9 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
result->parallelModeNeeded = glob->parallelModeNeeded;
result->planTree = top_plan;
result->partPruneInfos = glob->partPruneInfos;
+ result->containsInitialPruning = glob->containsInitialPruning;
result->rtable = glob->finalrtable;
+ result->minLockRelids = glob->minLockRelids;
result->resultRelations = glob->resultRelations;
result->appendRelations = glob->appendRelations;
result->subplans = glob->subplans;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 720f20f563..61d6934978 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -270,6 +270,16 @@ set_plan_references(PlannerInfo *root, Plan *plan)
*/
add_rtes_to_flat_rtable(root, false);
+ /*
+ * Add the query's adjusted range of RT indexes to glob->minLockRelids.
+ * The adjusted RT indexes of prunable relations will be deleted from the
+ * set below where PartitionPruneInfos are processed.
+ */
+ glob->minLockRelids =
+ bms_add_range(glob->minLockRelids,
+ rtoffset + 1,
+ rtoffset + list_length(root->parse->rtable));
+
/*
* Adjust RT indexes of PlanRowMarks and add to final rowmarks list
*/
@@ -352,6 +362,7 @@ set_plan_references(PlannerInfo *root, Plan *plan)
foreach (lc, root->partPruneInfos)
{
PartitionPruneInfo *pruneinfo = lfirst(lc);
+ Bitmapset *leafpart_rtis = NULL;
ListCell *l;
foreach(l, pruneinfo->prune_infos)
@@ -362,15 +373,50 @@ set_plan_references(PlannerInfo *root, Plan *plan)
foreach(l2, prune_infos)
{
PartitionedRelPruneInfo *pinfo = lfirst(l2);
+ int i;
/* RT index of the table to which the pinfo belongs. */
pinfo->rtindex += rtoffset;
+
+ /* Also of the leaf partitions that might be scanned. */
+ for (i = 0; i < pinfo->nparts; i++)
+ {
+ if (pinfo->rti_map[i] > 0 && pinfo->subplan_map[i] >= 0)
+ {
+ pinfo->rti_map[i] += rtoffset;
+ leafpart_rtis = bms_add_member(leafpart_rtis,
+ pinfo->rti_map[i]);
+ }
+ }
}
}
+ if (pruneinfo->needs_init_pruning)
+ {
+ glob->containsInitialPruning = true;
+
+ /*
+ * Delete the leaf partition RTIs from the global set of relations
+ * to be locked before executing the plan. AcquireExecutorLocks()
+ * will find the ones to add to the set after performing initial
+ * pruning.
+ */
+ glob->minLockRelids = bms_del_members(glob->minLockRelids,
+ leafpart_rtis);
+ }
+
glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
}
+ /*
+ * It seems worth doing a bms_copy() on glob->minLockRelids if we deleted
+ * bits from it above to get rid of any empty tail bits. It seems better
+ * for the loop over this set in AcquireExecutorLocks() to not have to go
+ * through those useless bit words.
+ */
+ if (glob->containsInitialPruning)
+ glob->minLockRelids = bms_copy(glob->minLockRelids);
+
return result;
}
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 6565b6ed01..37f3e6af61 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -144,7 +144,9 @@ static List *make_partitionedrel_pruneinfo(PlannerInfo *root,
List *prunequal,
Bitmapset *partrelids,
int *relid_subplan_map,
- Bitmapset **matchedsubplans);
+ Bitmapset **matchedsubplans,
+ bool *needs_init_pruning,
+ bool *needs_exec_pruning);
static void gen_partprune_steps(RelOptInfo *rel, List *clauses,
PartClauseTarget target,
GeneratePruningStepsContext *context);
@@ -234,6 +236,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int *relid_subplan_map;
ListCell *lc;
int i;
+ bool needs_init_pruning = false;
+ bool needs_exec_pruning = false;
/*
* Scan the subpaths to see which ones are scans of partition child
@@ -313,12 +317,16 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
Bitmapset *partrelids = (Bitmapset *) lfirst(lc);
List *pinfolist;
Bitmapset *matchedsubplans = NULL;
+ bool partrel_needs_init_pruning;
+ bool partrel_needs_exec_pruning;
pinfolist = make_partitionedrel_pruneinfo(root, parentrel,
prunequal,
partrelids,
relid_subplan_map,
- &matchedsubplans);
+ &matchedsubplans,
+ &partrel_needs_init_pruning,
+ &partrel_needs_exec_pruning);
/* When pruning is possible, record the matched subplans */
if (pinfolist != NIL)
@@ -327,6 +335,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
allmatchedsubplans = bms_join(matchedsubplans,
allmatchedsubplans);
}
+
+ needs_init_pruning |= partrel_needs_init_pruning;
+ needs_exec_pruning |= partrel_needs_exec_pruning;
}
pfree(relid_subplan_map);
@@ -341,6 +352,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
/* Else build the result data structure */
pruneinfo = makeNode(PartitionPruneInfo);
pruneinfo->prune_infos = prunerelinfos;
+ pruneinfo->needs_init_pruning = needs_init_pruning;
+ pruneinfo->needs_exec_pruning = needs_exec_pruning;
/*
* Some subplans may not belong to any of the identified partitioned rels.
@@ -441,13 +454,18 @@ add_part_relids(List *allpartrelids, Bitmapset *partrelids)
* If we cannot find any useful run-time pruning steps, return NIL.
* However, on success, each rel identified in partrelids will have
* an element in the result list, even if some of them are useless.
+ * *needs_init_pruning and *needs_exec_pruning are set to indicate that the
+ * returned PartitionedRelPruneInfos contain pruning steps that can be
+ * performed before and after execution begins, respectively.
*/
static List *
make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
List *prunequal,
Bitmapset *partrelids,
int *relid_subplan_map,
- Bitmapset **matchedsubplans)
+ Bitmapset **matchedsubplans,
+ bool *needs_init_pruning,
+ bool *needs_exec_pruning)
{
RelOptInfo *targetpart = NULL;
List *pinfolist = NIL;
@@ -458,6 +476,10 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int rti;
int i;
+ /* Will find out below. */
+ *needs_init_pruning = false;
+ *needs_exec_pruning = false;
+
/*
* Examine each partitioned rel, constructing a temporary array to map
* from planner relids to index of the partitioned rel, and building a
@@ -545,6 +567,9 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
* executor per-scan pruning steps. This first pass creates startup
* pruning steps and detects whether there's any possibly-useful quals
* that would require per-scan pruning.
+ *
+ * The first pass also determines whether the second pass is needed,
+ * by checking for the presence of EXEC parameters.
*/
gen_partprune_steps(subpart, partprunequal, PARTTARGET_INITIAL,
&context);
@@ -619,6 +644,12 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
pinfo->execparamids = execparamids;
/* Remaining fields will be filled in the next loop */
+ /* record which types of pruning steps we've seen so far */
+ if (initial_pruning_steps != NIL)
+ *needs_init_pruning = true;
+ if (exec_pruning_steps != NIL)
+ *needs_exec_pruning = true;
+
pinfolist = lappend(pinfolist, pinfo);
}
@@ -646,6 +677,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int *subplan_map;
int *subpart_map;
Oid *relid_map;
+ Index *rti_map;
/*
* Construct the subplan and subpart maps for this partitioning level.
@@ -658,6 +690,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
subpart_map = (int *) palloc(nparts * sizeof(int));
memset(subpart_map, -1, nparts * sizeof(int));
relid_map = (Oid *) palloc0(nparts * sizeof(Oid));
+ rti_map = (Index *) palloc0(nparts * sizeof(Index));
present_parts = NULL;
i = -1;
@@ -672,6 +705,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
subplan_map[i] = subplanidx = relid_subplan_map[partrel->relid] - 1;
subpart_map[i] = subpartidx = relid_subpart_map[partrel->relid] - 1;
relid_map[i] = planner_rt_fetch(partrel->relid, root)->relid;
+ rti_map[i] = partrel->relid;
if (subplanidx >= 0)
{
present_parts = bms_add_member(present_parts, i);
@@ -696,6 +730,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
pinfo->subplan_map = subplan_map;
pinfo->subpart_map = subpart_map;
pinfo->relid_map = relid_map;
+ pinfo->rti_map = rti_map;
}
pfree(relid_subpart_map);
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index a9a1851c94..a1be8179e8 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1598,6 +1598,7 @@ exec_bind_message(StringInfo input_message)
int16 *rformats = NULL;
CachedPlanSource *psrc;
CachedPlan *cplan;
+ List *part_prune_results_list;
Portal portal;
char *query_string;
char *saved_stmt_name;
@@ -1972,7 +1973,9 @@ exec_bind_message(StringInfo input_message)
* will be generated in MessageContext. The plan refcount will be
* assigned to the Portal, so it will be released at portal destruction.
*/
- cplan = GetCachedPlan(psrc, params, NULL, NULL);
+ cplan = GetCachedPlan(psrc, params, NULL, NULL, &part_prune_results_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_results_list));
/*
* Now we can define the portal.
@@ -1987,6 +1990,9 @@ exec_bind_message(StringInfo input_message)
cplan->stmt_list,
cplan);
+ /* Copy Lists of PartitionPruneResults into the portal's context. */
+ PortalStorePartitionPruneResults(portal, part_prune_results_list);
+
/* Done with the snapshot used for parameter I/O and parsing/planning */
if (snapshot_set)
PopActiveSnapshot();
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 5aa5a350f3..226ee81b63 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -35,7 +35,7 @@
Portal ActivePortal = NULL;
-static void ProcessQuery(PlannedStmt *plan,
+static void ProcessQuery(PlannedStmt *plan, List *part_prune_results,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -65,6 +65,7 @@ static void DoPortalRewind(Portal portal);
*/
QueryDesc *
CreateQueryDesc(PlannedStmt *plannedstmt,
+ List *part_prune_results,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
@@ -77,6 +78,8 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
qd->operation = plannedstmt->commandType; /* operation */
qd->plannedstmt = plannedstmt; /* plan */
+ qd->part_prune_results = part_prune_results; /* ExecutorDoInitialPruning()
+ * output for plan */
qd->sourceText = sourceText; /* query text */
qd->snapshot = RegisterSnapshot(snapshot); /* snapshot */
/* RI check snapshot */
@@ -122,6 +125,7 @@ FreeQueryDesc(QueryDesc *qdesc)
* PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
*
* plan: the plan tree for the query
+ * part_prune_results: ExecutorDoInitialPruning() output for the PlannedStmt
* sourceText: the source text of the query
* params: any parameters needed
* dest: where to send results
@@ -134,6 +138,7 @@ FreeQueryDesc(QueryDesc *qdesc)
*/
static void
ProcessQuery(PlannedStmt *plan,
+ List *part_prune_results,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -145,7 +150,7 @@ ProcessQuery(PlannedStmt *plan,
/*
* Create the QueryDesc object
*/
- queryDesc = CreateQueryDesc(plan, sourceText,
+ queryDesc = CreateQueryDesc(plan, part_prune_results, sourceText,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
@@ -491,8 +496,13 @@ PortalStart(Portal portal, ParamListInfo params,
/*
* Create QueryDesc in portal's context; for the moment, set
* the destination to DestNone.
+ *
+ * There is no PartitionPruneResult unless the PlannedStmt is
+ * from a CachedPlan.
*/
queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+ portal->part_prune_results_list == NIL ? NIL :
+ linitial(portal->part_prune_results_list),
portal->sourceText,
GetActiveSnapshot(),
InvalidSnapshot,
@@ -1225,6 +1235,8 @@ PortalRunMulti(Portal portal,
if (pstmt->utilityStmt == NULL)
{
+ List *part_prune_results = NIL;
+
/*
* process a plannable query.
*/
@@ -1271,10 +1283,18 @@ PortalRunMulti(Portal portal,
else
UpdateActiveSnapshotCommandId();
+ /*
+ * Determine if there's a corresponding List of PartitionPruneResult
+ * for this PlannedStmt.
+ */
+ if (portal->part_prune_results_list != NIL)
+ part_prune_results = list_nth(portal->part_prune_results_list,
+ foreach_current_index(stmtlist_item));
+
if (pstmt->canSetTag)
{
/* statement can set tag string */
- ProcessQuery(pstmt,
+ ProcessQuery(pstmt, part_prune_results,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
@@ -1283,7 +1303,7 @@ PortalRunMulti(Portal portal,
else
{
/* stmt added by rewrite cannot set tag */
- ProcessQuery(pstmt,
+ ProcessQuery(pstmt, part_prune_results,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 0d6a295674..957221c47e 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -99,14 +99,19 @@ static dlist_head cached_expression_list = DLIST_STATIC_INIT(cached_expression_l
static void ReleaseGenericPlan(CachedPlanSource *plansource);
static List *RevalidateCachedQuery(CachedPlanSource *plansource,
QueryEnvironment *queryEnv);
-static bool CheckCachedPlan(CachedPlanSource *plansource);
+static bool CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+ List **part_prune_results_list);
static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
- ParamListInfo boundParams, QueryEnvironment *queryEnv);
+ ParamListInfo boundParams, QueryEnvironment *queryEnv,
+ List **part_prune_results_list);
static bool choose_custom_plan(CachedPlanSource *plansource,
ParamListInfo boundParams);
static double cached_plan_cost(CachedPlan *plan, bool include_planner);
static Query *QueryListGetPrimaryStmt(List *stmts);
-static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static void AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+ List **part_prune_results_list,
+ List **lockedRelids_per_stmt);
+static void ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt);
static void AcquirePlannerLocks(List *stmt_list, bool acquire);
static void ScanQueryForLocks(Query *parsetree, bool acquire);
static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -782,6 +787,26 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
return tlist;
}
+/*
+ * FreePartitionPruneResults
+ * Frees the List of Lists of PartitionPruneResults for CheckCachedPlan()
+ */
+static void
+FreePartitionPruneResults(List *part_prune_results_list)
+{
+ ListCell *lc;
+
+ foreach(lc, part_prune_results_list)
+ {
+ List *part_prune_results = lfirst(lc);
+
+ /* Free both the PartitionPruneResults and the containing List. */
+ list_free_deep(part_prune_results);
+ }
+
+ list_free(part_prune_results_list);
+}
+
/*
* CheckCachedPlan: see if the CachedPlanSource's generic plan is valid.
*
@@ -790,15 +815,20 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
*
* On a "true" return, we have acquired the locks needed to run the plan.
* (We must do this for the "true" result to be race-condition-free.)
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_results_list.
*/
static bool
-CheckCachedPlan(CachedPlanSource *plansource)
+CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+ List **part_prune_results_list)
{
CachedPlan *plan = plansource->gplan;
/* Assert that caller checked the querytree */
Assert(plansource->is_valid);
+ *part_prune_results_list = NIL;
+
/* If there's no generic plan, just say "false" */
if (!plan)
return false;
@@ -820,13 +850,21 @@ CheckCachedPlan(CachedPlanSource *plansource)
*/
if (plan->is_valid)
{
+ List *lockedRelids_per_stmt;
+
/*
* Plan must have positive refcount because it is referenced by
* plansource; so no need to fear it disappears under us here.
*/
Assert(plan->refcount > 0);
- AcquireExecutorLocks(plan->stmt_list, true);
+ /*
+ * Lock the relations scanned by the plan. Initial partition pruning,
+ * if the plan needs any, is performed here as well.
+ */
+ AcquireExecutorLocks(plan->stmt_list, boundParams,
+ part_prune_results_list,
+ &lockedRelids_per_stmt);
/*
* If plan was transient, check to see if TransactionXmin has
@@ -848,7 +886,11 @@ CheckCachedPlan(CachedPlanSource *plansource)
}
/* Oops, the race case happened. Release useless locks. */
- AcquireExecutorLocks(plan->stmt_list, false);
+ ReleaseExecutorLocks(plan->stmt_list, lockedRelids_per_stmt);
+
+ /* Release any PartitionPruneResults that may have been created. */
+ FreePartitionPruneResults(*part_prune_results_list);
+ *part_prune_results_list = NIL;
}
/*
@@ -874,10 +916,14 @@ CheckCachedPlan(CachedPlanSource *plansource)
* Planning work is done in the caller's memory context. The finished plan
* is in a child memory context, which typically should get reparented
* (unless this is a one-shot plan, in which case we don't copy the plan).
+ *
+ * A list of NILs is returned in *part_prune_results_list, meaning that
+ * no partition pruning has been done yet for the plans in stmt_list.
*/
static CachedPlan *
BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
- ParamListInfo boundParams, QueryEnvironment *queryEnv)
+ ParamListInfo boundParams, QueryEnvironment *queryEnv,
+ List **part_prune_results_list)
{
CachedPlan *plan;
List *plist;
@@ -1007,6 +1053,17 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
MemoryContextSwitchTo(oldcxt);
+ /*
+ * There are no PartitionPruneResults to add yet, but the list must
+ * still be initialized to have the same number of elements as the
+ * list of PlannedStmts.
+ */
+ *part_prune_results_list = NIL;
+ foreach(lc, plist)
+ {
+ *part_prune_results_list = lappend(*part_prune_results_list, NIL);
+ }
+
return plan;
}
@@ -1126,6 +1183,19 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
* plan or a custom plan for the given parameters: the caller does not know
* which it will get.
*
+ * For every PlannedStmt found in the returned CachedPlan, an element that
+ * is either a List of PartitionPruneResults or NIL is added to
+ * *part_prune_results_list. It is the former if the PlannedStmt comes
+ * from an existing CachedPlan that is otherwise valid and has
+ * containsInitialPruning set to true. Before returning such a CachedPlan,
+ * the "initial" pruning steps are performed by calling
+ * ExecutorDoInitialPruning(), so that AcquireExecutorLocks() locks only
+ * the leaf partitions whose subplans survive the "initial" pruning
+ * conditions. For each PartitionPruneInfo found in
+ * PlannedStmt.partPruneInfos, a PartitionPruneResult containing the bitmapset
+ * of the indexes of surviving subplans is added to the List for the
+ * PlannedStmt.
+ *
* On return, the plan is valid and we have sufficient locks to begin
* execution.
*
@@ -1139,11 +1209,13 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
*/
CachedPlan *
GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
- ResourceOwner owner, QueryEnvironment *queryEnv)
+ ResourceOwner owner, QueryEnvironment *queryEnv,
+ List **part_prune_results_list)
{
CachedPlan *plan = NULL;
List *qlist;
bool customplan;
+ List *my_part_prune_results_list;
/* Assert caller is doing things in a sane order */
Assert(plansource->magic == CACHEDPLANSOURCE_MAGIC);
@@ -1160,7 +1232,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
if (!customplan)
{
- if (CheckCachedPlan(plansource))
+ if (CheckCachedPlan(plansource, boundParams,
+ &my_part_prune_results_list))
{
/* We want a generic plan, and we already have a valid one */
plan = plansource->gplan;
@@ -1169,7 +1242,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
else
{
/* Build a new generic plan */
- plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv);
+ plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv,
+ &my_part_prune_results_list);
/* Just make real sure plansource->gplan is clear */
ReleaseGenericPlan(plansource);
/* Link the new generic plan into the plansource */
@@ -1214,7 +1288,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
if (customplan)
{
/* Build a custom plan */
- plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv);
+ plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv,
+ &my_part_prune_results_list);
/* Accumulate total costs of custom plans */
plansource->total_custom_cost += cached_plan_cost(plan, true);
@@ -1246,6 +1321,9 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
plan->is_saved = true;
}
+ if (part_prune_results_list)
+ *part_prune_results_list = my_part_prune_results_list;
+
return plan;
}
@@ -1737,17 +1815,29 @@ QueryListGetPrimaryStmt(List *stmts)
/*
* AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
- * or release them if acquire is false.
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_results_list.
+ *
+ * On return, *lockedRelids_per_stmt will contain a bitmapset for every
+ * PlannedStmt in stmt_list, containing the RT indexes of relation entries
+ * in its range table that were actually locked, or NULL if the PlannedStmt
+ * contains a utility statement.
*/
static void
-AcquireExecutorLocks(List *stmt_list, bool acquire)
+AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+ List **part_prune_results_list,
+ List **lockedRelids_per_stmt)
{
ListCell *lc1;
+ *part_prune_results_list = *lockedRelids_per_stmt = NIL;
foreach(lc1, stmt_list)
{
PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
- ListCell *lc2;
+ List *part_prune_results = NIL;
+ Bitmapset *allLockRelids;
+ Bitmapset *lockedRelids = NULL;
+ int rti;
if (plannedstmt->commandType == CMD_UTILITY)
{
@@ -1761,13 +1851,40 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
Query *query = UtilityContainsQuery(plannedstmt->utilityStmt);
if (query)
- ScanQueryForLocks(query, acquire);
+ ScanQueryForLocks(query, true);
+ *part_prune_results_list = lappend(*part_prune_results_list, NIL);
continue;
}
- foreach(lc2, plannedstmt->rtable)
+ /*
+ * Figure out the set of relations that would need to be locked
+ * before executing the plan.
+ */
+ if (plannedstmt->containsInitialPruning)
{
- RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
+ Bitmapset *scan_leafpart_rtis = NULL;
+
+ /*
+ * Obtain the set of leaf partitions to be locked.
+ *
+ * The following does initial partition pruning using the
+ * PartitionPruneInfos found in plannedstmt->partPruneInfos and
+ * finds leaf partitions that survive that pruning across all the
+ * nodes in the plan tree.
+ */
+ part_prune_results = ExecutorDoInitialPruning(plannedstmt,
+ boundParams,
+ &scan_leafpart_rtis);
+ allLockRelids = bms_union(plannedstmt->minLockRelids,
+ scan_leafpart_rtis);
+ }
+ else
+ allLockRelids = plannedstmt->minLockRelids;
+
+ rti = -1;
+ while ((rti = bms_next_member(allLockRelids, rti)) > 0)
+ {
+ RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
if (rte->rtekind != RTE_RELATION)
continue;
@@ -1778,10 +1895,59 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
* fail if it's been dropped entirely --- we'll just transiently
* acquire a non-conflicting lock.
*/
- if (acquire)
- LockRelationOid(rte->relid, rte->rellockmode);
- else
- UnlockRelationOid(rte->relid, rte->rellockmode);
+ LockRelationOid(rte->relid, rte->rellockmode);
+ lockedRelids = bms_add_member(lockedRelids, rti);
+ }
+
+ *part_prune_results_list = lappend(*part_prune_results_list,
+ part_prune_results);
+ *lockedRelids_per_stmt = lappend(*lockedRelids_per_stmt, lockedRelids);
+ }
+}
+
+/*
+ * ReleaseExecutorLocks
+ * Release locks that were acquired by an earlier call to
+ * AcquireExecutorLocks()
+ */
+static void
+ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt)
+{
+ ListCell *lc1,
+ *lc2;
+
+ forboth(lc1, stmt_list, lc2, lockedRelids_per_stmt)
+ {
+ PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
+ Bitmapset *lockedRelids = lfirst(lc2);
+ int rti;
+
+ if (plannedstmt->commandType == CMD_UTILITY)
+ {
+ /*
+ * Ignore utility statements, except those (such as EXPLAIN) that
+ * contain a parsed-but-not-planned query. Note: it's okay to use
+ * ScanQueryForLocks, even though the query hasn't been through
+ * rule rewriting, because rewriting doesn't change the query
+ * representation.
+ */
+ Query *query = UtilityContainsQuery(plannedstmt->utilityStmt);
+
+ Assert(lockedRelids == NULL);
+ if (query)
+ ScanQueryForLocks(query, false);
+ continue;
+ }
+
+ rti = -1;
+ while ((rti = bms_next_member(lockedRelids, rti)) >= 0)
+ {
+ RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
+
+ Assert(rte->rtekind == RTE_RELATION);
+
+ /* See the comment in AcquireExecutorLocks(). */
+ UnlockRelationOid(rte->relid, rte->rellockmode);
}
}
}
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index 3a161bdb88..4b156de524 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -303,6 +303,25 @@ PortalDefineQuery(Portal portal,
portal->status = PORTAL_DEFINED;
}
+/*
+ * PortalStorePartitionPruneResults
+ * Copy the given List of Lists of PartitionPruneResults into the
+ * portal's context
+ *
+ * This allows the caller to ensure that the list exists as long as the portal
+ * does.
+ */
+void
+PortalStorePartitionPruneResults(Portal portal, List *part_prune_results_list)
+{
+ MemoryContext oldcxt;
+
+ AssertArg(PortalIsValid(portal));
+ oldcxt = MemoryContextSwitchTo(portal->portalContext);
+ portal->part_prune_results_list = copyObject(part_prune_results_list);
+ MemoryContextSwitchTo(oldcxt);
+}
+
/*
* PortalReleaseCachedPlan
* Release a portal's reference to its cached plan, if any.
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 9ebde089ae..269cc4d562 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -87,7 +87,9 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
ExplainState *es, const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv);
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt,
+ List *part_prune_results,
+ IntoClause *into,
ExplainState *es, const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv,
const instr_time *planduration,
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index bf962af7af..bd8776402e 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -45,6 +45,7 @@ extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
* nparts Length of subplan_map[] and subpart_map[].
* subplan_map Subplan index by partition index, or -1.
* subpart_map Subpart index by partition index, or -1.
+ * rti_map Range table index by partition index, or 0.
* present_parts A Bitmapset of the partition indexes that we
* have subplans or subparts for.
* initial_pruning_steps List of PartitionPruneSteps used to
@@ -61,6 +62,7 @@ typedef struct PartitionedRelPruningData
int nparts;
int *subplan_map;
int *subpart_map;
+ Index *rti_map;
Bitmapset *present_parts;
List *initial_pruning_steps;
List *exec_pruning_steps;
@@ -126,5 +128,10 @@ extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
int part_prune_index,
Bitmapset **initially_valid_subplans);
extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
- bool initial_prune);
+ bool initial_prune,
+ Bitmapset **scan_leafpart_rtis);
+extern Bitmapset *ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt,
+ ParamListInfo params,
+ PartitionPruneInfo *pruneinfo,
+ Bitmapset **scan_leafpart_rtis);
#endif /* EXECPARTITION_H */
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index e79e2c001f..7d4379da7b 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -35,6 +35,8 @@ typedef struct QueryDesc
/* These fields are provided by CreateQueryDesc */
CmdType operation; /* CMD_SELECT, CMD_UPDATE, etc. */
PlannedStmt *plannedstmt; /* planner's output (could be utility, too) */
+ List *part_prune_results; /* ExecutorDoInitialPruning()'s
+ * output for plannedstmt */
const char *sourceText; /* source text of the query */
Snapshot snapshot; /* snapshot to use for query */
Snapshot crosscheck_snapshot; /* crosscheck for RI update/delete */
@@ -57,6 +59,7 @@ typedef struct QueryDesc
/* in pquery.c */
extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
+ List *part_prune_results,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index ed95ed1176..c9a5e5fb68 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -185,6 +185,9 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
/*
* prototypes from functions in execMain.c
*/
+extern List *ExecutorDoInitialPruning(PlannedStmt *plannedstmt,
+ ParamListInfo params,
+ Bitmapset **scan_leafpart_rtis);
extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
extern void ExecutorRun(QueryDesc *queryDesc,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 4a741b053f..521a60b988 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -612,6 +612,7 @@ typedef struct EState
* ExecRowMarks, or NULL if none */
PlannedStmt *es_plannedstmt; /* link to top of plan tree */
List *es_part_prune_infos; /* PlannedStmt.partPruneInfos */
+ List *es_part_prune_results; /* QueryDesc.part_prune_results */
const char *es_sourceText; /* Source text from QueryDesc */
JunkFilter *es_junkFilter; /* top-level junk filter, if any */
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index a80f43e540..937cc4629d 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -212,6 +212,7 @@ extern struct Bitmapset *readBitmapset(void);
extern uintptr_t readDatum(bool typbyval);
extern bool *readBoolCols(int numCols);
extern int *readIntCols(int numCols);
+extern Index *readIndexCols(int numCols);
extern Oid *readOidCols(int numCols);
extern int16 *readAttrNumberCols(int numCols);
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index e392fb6fc0..494ae461be 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -125,6 +125,18 @@ typedef struct PlannerGlobal
/* List of PartitionPruneInfo contained in the plan */
List *partPruneInfos;
+ /*
+ * Do any of those PartitionPruneInfos have initial pruning steps in them?
+ */
+ bool containsInitialPruning;
+
+ /*
+ * Indexes of all range table entries minus indexes of range table entries
+ * of the leaf partitions scanned by prunable subplans; see
+ * AcquireExecutorLocks()
+ */
+ Bitmapset *minLockRelids;
+
/* OIDs of relations the plan depends on */
List *relationOids;
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 3eb3e6e527..0bc4c8130a 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -73,8 +73,17 @@ typedef struct PlannedStmt
List *partPruneInfos; /* List of PartitionPruneInfo contained in
* the plan */
+ bool containsInitialPruning; /* Do any of those PartitionPruneInfos
+ * have initial pruning steps in them?
+ */
+
List *rtable; /* list of RangeTblEntry nodes */
+ Bitmapset *minLockRelids; /* Indexes of all range table entries minus
+ * indexes of range table entries of the leaf
+ * partitions scanned by prunable subplans;
+ * see AcquireExecutorLocks() */
+
/* rtable indexes of target relations for INSERT/UPDATE/DELETE */
List *resultRelations; /* integer list of RT indexes, or NIL */
@@ -1410,6 +1419,13 @@ typedef struct PlanRowMark
* prune_infos List of Lists containing PartitionedRelPruneInfo nodes,
* one sublist per run-time-prunable partition hierarchy
* appearing in the parent plan node's subplans.
+ *
+ * needs_init_pruning Does any of the PartitionedRelPruneInfos in
+ * prune_infos have its initial_pruning_steps set?
+ *
+ * needs_exec_pruning Does any of the PartitionedRelPruneInfos in
+ * prune_infos have its exec_pruning_steps set?
+ *
* other_subplans Indexes of any subplans that are not accounted for
* by any of the PartitionedRelPruneInfo nodes in
* "prune_infos". These subplans must not be pruned.
@@ -1420,6 +1436,8 @@ typedef struct PartitionPruneInfo
NodeTag type;
List *prune_infos;
+ bool needs_init_pruning;
+ bool needs_exec_pruning;
Bitmapset *other_subplans;
} PartitionPruneInfo;
@@ -1464,6 +1482,9 @@ typedef struct PartitionedRelPruneInfo
/* relation OID by partition index, or 0 */
Oid *relid_map pg_node_attr(array_size(nparts));
+ /* Range table index by partition index, or 0. */
+ Index *rti_map pg_node_attr(array_size(nparts));
+
/*
* initial_pruning_steps shows how to prune during executor startup (i.e.,
* without use of any PARAM_EXEC Params); it is NIL if no startup pruning
@@ -1548,6 +1569,31 @@ typedef struct PartitionPruneStepCombine
List *source_stepids;
} PartitionPruneStepCombine;
+/*----------------
+ * PartitionPruneResult
+ *
+ * The result of performing ExecPartitionDoInitialPruning() on a given
+ * PartitionPruneInfo.
+ *
+ * valid_subplan_offs contains the indexes of subplans remaining after
+ * performing initial pruning by calling ExecFindMatchingSubPlans() on the
+ * PartitionPruneInfo.
+ *
+ * This is used to store the result of initial partition pruning that is
+ * performed before the execution has started. A module that needs to do so
+ * should call ExecutorDoInitialPruning() on a given PlannedStmt, which
+ * returns a List of PartitionPruneResult containing an entry for each
+ * PartitionPruneInfo present in PlannedStmt.partPruneInfos. The module
+ * should then pass that list, along with the PlannedStmt, to the executor,
+ * so that it can reuse the result of initial partition pruning when
+ * initializing the subplans for execution.
+ */
+typedef struct PartitionPruneResult
+{
+ NodeTag type;
+
+ Bitmapset *valid_subplan_offs;
+} PartitionPruneResult;
/*
* Plan invalidation info
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 0499635f59..32579d4788 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -220,7 +220,8 @@ extern List *CachedPlanGetTargetList(CachedPlanSource *plansource,
extern CachedPlan *GetCachedPlan(CachedPlanSource *plansource,
ParamListInfo boundParams,
ResourceOwner owner,
- QueryEnvironment *queryEnv);
+ QueryEnvironment *queryEnv,
+ List **part_prune_results_list);
extern void ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner);
extern bool CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index aeddbdafe5..1901fc5f28 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -138,6 +138,7 @@ typedef struct PortalData
QueryCompletion qc; /* command completion data for executed query */
List *stmts; /* list of PlannedStmts */
CachedPlan *cplan; /* CachedPlan, if stmts are from one */
+ List *part_prune_results_list; /* List of Lists of PartitionPruneResults */
ParamListInfo portalParams; /* params to pass to query */
QueryEnvironment *queryEnv; /* environment for query */
@@ -242,6 +243,8 @@ extern void PortalDefineQuery(Portal portal,
CommandTag commandTag,
List *stmts,
CachedPlan *cplan);
+extern void PortalStorePartitionPruneResults(Portal portal,
+ List *part_prune_results_list);
extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
extern void PortalCreateHoldStore(Portal portal);
extern void PortalHashTableDeleteAll(void);
--
2.35.3
* Re: generic plans and "initial" pruning
@ 2022-10-27 02:41 Amit Langote <[email protected]>
parent: Amit Langote <[email protected]>
0 siblings, 1 reply; 71+ messages in thread
From: Amit Langote @ 2022-10-27 02:41 UTC (permalink / raw)
To: Robert Haas <[email protected]>; +Cc: Jacob Champion <[email protected]>; Zhihong Yu <[email protected]>; David Rowley <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers
On Mon, Oct 17, 2022 at 6:29 PM Amit Langote <[email protected]> wrote:
> On Wed, Oct 12, 2022 at 4:36 PM Amit Langote <[email protected]> wrote:
> > On Fri, Jul 29, 2022 at 1:20 PM Amit Langote <[email protected]> wrote:
> > > On Thu, Jul 28, 2022 at 1:27 AM Robert Haas <[email protected]> wrote:
> > > > 0001 adds es_part_prune_result but does not use it, so maybe the
> > > > introduction of that field should be deferred until it's needed for
> > > > something.
> > >
> > > Oops, looks like a mistake when breaking the patch. Will move that bit to 0002.
> >
> > Fixed that and also noticed that I had defined PartitionPruneResult in
> > the wrong header (execnodes.h). That led to PartitionPruneResult
> > nodes not being able to be written and read, because
> > src/backend/nodes/gen_node_support.pl doesn't create _out* and _read*
> > routines for the nodes defined in execnodes.h. I moved its definition
> > to plannodes.h, even though it is not actually the planner that
> > instantiates those; no other include/nodes header sounds better.
> >
> > One more thing I realized is that Bitmapsets added to the List
> > PartitionPruneResult.valid_subplan_offs_list are not actually
> > read/write-able. That's a problem that I also faced in [1], so I
> > proposed a patch there to make Bitmapset a read/write-able Node and
> > mark (only) the Bitmapsets that are added into read/write-able node
> > trees with the corresponding NodeTag. I'm including that patch here
> > as well (0002) for the main patch to work (pass
> > -DWRITE_READ_PARSE_PLAN_TREES build tests), though it might make sense
> > to discuss it in its own thread?
>
> Had second thoughts on the use of List of Bitmapsets for this, such
> that the make-Bitmapset-Nodes patch is no longer needed.
>
> I had defined PartitionPruneResult such that it stood for the results
> of pruning for all PartitionPruneInfos contained in
> PlannedStmt.partPruneInfos (covering all Append/MergeAppend nodes that
> can use partition pruning in a given plan). So, it had a List of
> Bitmapset. I think it's perhaps better for PartitionPruneResult to
> cover only one PartitionPruneInfo and thus need only a Bitmapset and
> not a List thereof, which I have implemented in the attached updated
> patch 0002. So, instead of needing to pass around a
> PartitionPruneResult with each PlannedStmt, this now passes a List of
> PartitionPruneResult with an entry for each in
> PlannedStmt.partPruneInfos.
Rebased over 3b2db22fe.
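For readers following the data-structure change described above, here is a minimal, self-contained sketch of the "one result per PartitionPruneInfo" layout. It does not use PostgreSQL's real types: DemoPruneInfo, DemoPruneResult, demo_do_initial_pruning() and demo_subplan_is_valid() are hypothetical names, a uint64_t stands in for a Bitmapset, and the pruning rule (keep subplans at or above first_valid) is a toy assumption chosen only to make the example deterministic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a Bitmapset: bit i set means subplan i survived. */
typedef uint64_t DemoSubplanSet;

/* Simplified stand-in for a PartitionPruneInfo (one per Append/MergeAppend). */
typedef struct DemoPruneInfo
{
    int nsubplans;      /* number of child subplans; at most 64 in this sketch */
    int first_valid;    /* toy pruning rule: subplans below this offset are pruned */
} DemoPruneInfo;

/* Simplified stand-in for a PartitionPruneResult: a single set, not a list. */
typedef struct DemoPruneResult
{
    DemoSubplanSet valid_subplan_offs;
} DemoPruneResult;

/*
 * Fill results[i] for infos[i], so that the two arrays stay parallel, much as
 * the list of results is described as having one entry per PartitionPruneInfo.
 */
static void
demo_do_initial_pruning(const DemoPruneInfo *infos, size_t ninfos,
                        DemoPruneResult *results)
{
    for (size_t i = 0; i < ninfos; i++)
    {
        DemoSubplanSet valid = 0;

        assert(infos[i].nsubplans >= 0 && infos[i].nsubplans <= 64);
        for (int off = 0; off < infos[i].nsubplans; off++)
        {
            if (off >= infos[i].first_valid)
                valid |= UINT64_C(1) << off;
        }
        results[i].valid_subplan_offs = valid;
    }
}

/* A plan node holding part_prune_index looks up its own result by position. */
static bool
demo_subplan_is_valid(const DemoPruneResult *results, int part_prune_index,
                      int off)
{
    return ((results[part_prune_index].valid_subplan_offs >> off) & 1) != 0;
}
```

The point of the layout is that a node only needs its own index to find both its pruning info and the matching result, so no per-statement wrapper holding a list of sets is required.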
--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com
Attachments:
[application/octet-stream] v23-0001-Move-PartitioPruneInfo-out-of-plan-nodes-into-Pl.patch (17.2K, 2-v23-0001-Move-PartitioPruneInfo-out-of-plan-nodes-into-Pl.patch)
From c805965cadc12217406309221e2c89e3c17be433 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Fri, 27 May 2022 16:00:28 +0900
Subject: [PATCH v23 1/2] Move PartitionPruneInfo out of plan nodes into
PlannedStmt
The planner will now add a given PartitionPruneInfo to
PlannedStmt.partPruneInfos instead of directly to the
Append/MergeAppend plan node. What gets set instead in the
latter is an index field which points to the list element
of PlannedStmt.partPruneInfos containing the PartitionPruneInfo
belonging to the plan node.
A later commit will make AcquireExecutorLocks() do the initial
partition pruning to determine a minimal set of partitions to be
locked when validating a plan tree, and it will need to consult the
PartitionPruneInfos referenced therein to do so. It is better for
the PartitionPruneInfos to be directly accessible than to require
a walk of the plan tree to find them; with this change they can be
found by simply iterating over PlannedStmt.partPruneInfos.
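As an illustration only, the index scheme described in this commit message can be sketched with plain arrays. Everything below is a simplified assumption rather than the patch's code: DemoGlobal, DemoQuery, demo_flatten_query() and demo_lookup() are hypothetical names, and integers stand in for PartitionPruneInfo nodes. The sketch mirrors the two steps visible in the setrefs.c hunks: a node's index is first shifted by the current length of the global list, and only then are the query-level entries appended to it.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_INFOS 16

/* Stand-in for the statement-wide list of pruning infos. */
typedef struct DemoGlobal
{
    int infos[DEMO_MAX_INFOS];  /* each int stands in for one pruning info */
    int ninfos;
} DemoGlobal;

/* Stand-in for one (sub)query: its own infos and the nodes referring to them. */
typedef struct DemoQuery
{
    int local_infos[4];
    int nlocal;
    int node_prune_index[4];    /* index into local_infos, or -1 if none */
    int nnodes;
} DemoQuery;

/* Shift node indexes by the global length, then append the local infos. */
static void
demo_flatten_query(DemoGlobal *glob, DemoQuery *query)
{
    assert(glob->ninfos + query->nlocal <= DEMO_MAX_INFOS);

    for (int i = 0; i < query->nnodes; i++)
    {
        if (query->node_prune_index[i] >= 0)
            query->node_prune_index[i] += glob->ninfos;
    }
    for (int i = 0; i < query->nlocal; i++)
        glob->infos[glob->ninfos++] = query->local_infos[i];
}

/* What an executor-side lookup by index amounts to in this sketch. */
static int
demo_lookup(const DemoGlobal *glob, int part_prune_index)
{
    assert(part_prune_index >= 0 && part_prune_index < glob->ninfos);
    return glob->infos[part_prune_index];
}
```

With this arrangement, code that needs every pruning info can loop over the global array directly, while each node still reaches its own entry in constant time.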
---
src/backend/executor/execMain.c | 1 +
src/backend/executor/execParallel.c | 1 +
src/backend/executor/execPartition.c | 4 +-
src/backend/executor/execUtils.c | 1 +
src/backend/executor/nodeAppend.c | 4 +-
src/backend/executor/nodeMergeAppend.c | 4 +-
src/backend/optimizer/plan/createplan.c | 24 ++++-----
src/backend/optimizer/plan/planner.c | 1 +
src/backend/optimizer/plan/setrefs.c | 65 +++++++++++++------------
src/backend/partitioning/partprune.c | 18 ++++---
src/include/executor/execPartition.h | 3 +-
src/include/nodes/execnodes.h | 1 +
src/include/nodes/pathnodes.h | 6 +++
src/include/nodes/plannodes.h | 11 +++--
src/include/partitioning/partprune.h | 8 +--
15 files changed, 90 insertions(+), 62 deletions(-)
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index d78862e660..32475e33ff 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -825,6 +825,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
ExecInitRangeTable(estate, rangeTable);
estate->es_plannedstmt = plannedstmt;
+ estate->es_part_prune_infos = plannedstmt->partPruneInfos;
/*
* Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index 99512826c5..aca0c6f323 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -183,6 +183,7 @@ ExecSerializePlan(Plan *plan, EState *estate)
pstmt->dependsOnRole = false;
pstmt->parallelModeNeeded = false;
pstmt->planTree = plan;
+ pstmt->partPruneInfos = estate->es_part_prune_infos;
pstmt->rtable = estate->es_range_table;
pstmt->resultRelations = NIL;
pstmt->appendRelations = NIL;
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 40e3c07693..80197d5141 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -1791,11 +1791,13 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
PartitionPruneState *
ExecInitPartitionPruning(PlanState *planstate,
int n_total_subplans,
- PartitionPruneInfo *pruneinfo,
+ int part_prune_index,
Bitmapset **initially_valid_subplans)
{
PartitionPruneState *prunestate;
EState *estate = planstate->state;
+ PartitionPruneInfo *pruneinfo = list_nth(estate->es_part_prune_infos,
+ part_prune_index);
/* We may need an expression context to evaluate partition exprs */
ExecAssignExprContext(estate, planstate);
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 9df1f81ea8..21f4c10937 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -119,6 +119,7 @@ CreateExecutorState(void)
estate->es_relations = NULL;
estate->es_rowmarks = NULL;
estate->es_plannedstmt = NULL;
+ estate->es_part_prune_infos = NIL;
estate->es_junkFilter = NULL;
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 357e10a1d7..c6f86a6510 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -134,7 +134,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
appendstate->as_begun = false;
/* If run-time partition pruning is enabled, then set that up now */
- if (node->part_prune_info != NULL)
+ if (node->part_prune_index >= 0)
{
PartitionPruneState *prunestate;
@@ -145,7 +145,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
*/
prunestate = ExecInitPartitionPruning(&appendstate->ps,
list_length(node->appendplans),
- node->part_prune_info,
+ node->part_prune_index,
&validsubplans);
appendstate->as_prune_state = prunestate;
nplans = bms_num_members(validsubplans);
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index c5c62fa5c7..8d35860c30 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -82,7 +82,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
mergestate->ps.ExecProcNode = ExecMergeAppend;
/* If run-time partition pruning is enabled, then set that up now */
- if (node->part_prune_info != NULL)
+ if (node->part_prune_index >= 0)
{
PartitionPruneState *prunestate;
@@ -93,7 +93,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
*/
prunestate = ExecInitPartitionPruning(&mergestate->ps,
list_length(node->mergeplans),
- node->part_prune_info,
+ node->part_prune_index,
&validsubplans);
mergestate->ms_prune_state = prunestate;
nplans = bms_num_members(validsubplans);
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index ac86ce9003..50a5719ac6 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -1203,7 +1203,6 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
ListCell *subpaths;
int nasyncplans = 0;
RelOptInfo *rel = best_path->path.parent;
- PartitionPruneInfo *partpruneinfo = NULL;
int nodenumsortkeys = 0;
AttrNumber *nodeSortColIdx = NULL;
Oid *nodeSortOperators = NULL;
@@ -1354,6 +1353,9 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
subplans = lappend(subplans, subplan);
}
+ /* Set below if we find quals that we can use to run-time prune */
+ plan->part_prune_index = -1;
+
/*
* If any quals exist, they may be useful to perform further partition
* pruning during execution. Gather information needed by the executor to
@@ -1377,16 +1379,14 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
}
if (prunequal != NIL)
- partpruneinfo =
- make_partition_pruneinfo(root, rel,
- best_path->subpaths,
- prunequal);
+ plan->part_prune_index = make_partition_pruneinfo(root, rel,
+ best_path->subpaths,
+ prunequal);
}
plan->appendplans = subplans;
plan->nasyncplans = nasyncplans;
plan->first_partial_plan = best_path->first_partial_path;
- plan->part_prune_info = partpruneinfo;
copy_generic_path_info(&plan->plan, (Path *) best_path);
@@ -1425,7 +1425,6 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
List *subplans = NIL;
ListCell *subpaths;
RelOptInfo *rel = best_path->path.parent;
- PartitionPruneInfo *partpruneinfo = NULL;
/*
* We don't have the actual creation of the MergeAppend node split out
@@ -1518,6 +1517,9 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
subplans = lappend(subplans, subplan);
}
+ /* Set below if we find quals that we can use to run-time prune */
+ node->part_prune_index = -1;
+
/*
* If any quals exist, they may be useful to perform further partition
* pruning during execution. Gather information needed by the executor to
@@ -1541,13 +1543,13 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
}
if (prunequal != NIL)
- partpruneinfo = make_partition_pruneinfo(root, rel,
- best_path->subpaths,
- prunequal);
+ node->part_prune_index = make_partition_pruneinfo(root, rel,
+ best_path->subpaths,
+ prunequal);
}
node->mergeplans = subplans;
- node->part_prune_info = partpruneinfo;
+
/*
* If prepare_sort_from_pathkeys added sort columns, but we were told to
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 78a8174534..240d50f1c0 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -519,6 +519,7 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
result->dependsOnRole = glob->dependsOnRole;
result->parallelModeNeeded = glob->parallelModeNeeded;
result->planTree = top_plan;
+ result->partPruneInfos = glob->partPruneInfos;
result->rtable = glob->finalrtable;
result->resultRelations = glob->resultRelations;
result->appendRelations = glob->appendRelations;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 1cb0abdbc1..720f20f563 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -348,6 +348,29 @@ set_plan_references(PlannerInfo *root, Plan *plan)
}
}
+ /* Also fix up the information in PartitionPruneInfos. */
+ foreach (lc, root->partPruneInfos)
+ {
+ PartitionPruneInfo *pruneinfo = lfirst(lc);
+ ListCell *l;
+
+ foreach(l, pruneinfo->prune_infos)
+ {
+ List *prune_infos = lfirst(l);
+ ListCell *l2;
+
+ foreach(l2, prune_infos)
+ {
+ PartitionedRelPruneInfo *pinfo = lfirst(l2);
+
+ /* RT index of the table to which the pinfo belongs. */
+ pinfo->rtindex += rtoffset;
+ }
+ }
+
+ glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
+ }
+
return result;
}
@@ -1658,21 +1681,12 @@ set_append_references(PlannerInfo *root,
aplan->apprelids = offset_relid_set(aplan->apprelids, rtoffset);
- if (aplan->part_prune_info)
- {
- foreach(l, aplan->part_prune_info->prune_infos)
- {
- List *prune_infos = lfirst(l);
- ListCell *l2;
-
- foreach(l2, prune_infos)
- {
- PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
- pinfo->rtindex += rtoffset;
- }
- }
- }
+ /*
+ * PartitionPruneInfos will be added to a list in PlannerGlobal, so update
+ * the index.
+ */
+ if (aplan->part_prune_index >= 0)
+ aplan->part_prune_index += list_length(root->glob->partPruneInfos);
/* We don't need to recurse to lefttree or righttree ... */
Assert(aplan->plan.lefttree == NULL);
@@ -1734,21 +1748,12 @@ set_mergeappend_references(PlannerInfo *root,
mplan->apprelids = offset_relid_set(mplan->apprelids, rtoffset);
- if (mplan->part_prune_info)
- {
- foreach(l, mplan->part_prune_info->prune_infos)
- {
- List *prune_infos = lfirst(l);
- ListCell *l2;
-
- foreach(l2, prune_infos)
- {
- PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
- pinfo->rtindex += rtoffset;
- }
- }
- }
+ /*
+ * PartitionPruneInfos will be added to a list in PlannerGlobal, so update
+ * the index.
+ */
+ if (mplan->part_prune_index >= 0)
+ mplan->part_prune_index += list_length(root->glob->partPruneInfos);
/* We don't need to recurse to lefttree or righttree ... */
Assert(mplan->plan.lefttree == NULL);
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 6188bf69cb..6565b6ed01 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -209,16 +209,20 @@ static void partkey_datum_from_expr(PartitionPruneContext *context,
/*
* make_partition_pruneinfo
- * Builds a PartitionPruneInfo which can be used in the executor to allow
- * additional partition pruning to take place. Returns NULL when
- * partition pruning would be useless.
+ * Checks if the given set of quals can be used to build pruning steps
+ * that the executor can use to prune away unneeded partitions. If
+ * suitable quals are found then a PartitionPruneInfo is built and tagged
+ * onto the PlannerInfo's partPruneInfos list.
+ *
+ * The return value is the 0-based index of the item added to the
+ * partPruneInfos list or -1 if nothing was added.
*
* 'parentrel' is the RelOptInfo for an appendrel, and 'subpaths' is the list
* of scan paths for its child rels.
* 'prunequal' is a list of potential pruning quals (i.e., restriction
* clauses that are applicable to the appendrel).
*/
-PartitionPruneInfo *
+int
make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
List *subpaths,
List *prunequal)
@@ -332,7 +336,7 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
* quals, then we can just not bother with run-time pruning.
*/
if (prunerelinfos == NIL)
- return NULL;
+ return -1;
/* Else build the result data structure */
pruneinfo = makeNode(PartitionPruneInfo);
@@ -358,7 +362,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
else
pruneinfo->other_subplans = NULL;
- return pruneinfo;
+ root->partPruneInfos = lappend(root->partPruneInfos, pruneinfo);
+
+ return list_length(root->partPruneInfos) - 1;
}
/*
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 708435e952..bf962af7af 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -123,9 +123,8 @@ typedef struct PartitionPruneState
extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
int n_total_subplans,
- PartitionPruneInfo *pruneinfo,
+ int part_prune_index,
Bitmapset **initially_valid_subplans);
extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
bool initial_prune);
-
#endif /* EXECPARTITION_H */
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 01b1727fc0..4a741b053f 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -611,6 +611,7 @@ typedef struct EState
struct ExecRowMark **es_rowmarks; /* Array of per-range-table-entry
* ExecRowMarks, or NULL if none */
PlannedStmt *es_plannedstmt; /* link to top of plan tree */
+ List *es_part_prune_infos; /* PlannedStmt.partPruneInfos */
const char *es_sourceText; /* Source text from QueryDesc */
JunkFilter *es_junkFilter; /* top-level junk filter, if any */
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 09342d128d..fbe75dca0f 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -122,6 +122,9 @@ typedef struct PlannerGlobal
/* "flat" list of AppendRelInfos */
List *appendRelations;
+ /* List of PartitionPruneInfo contained in the plan */
+ List *partPruneInfos;
+
/* OIDs of relations the plan depends on */
List *relationOids;
@@ -503,6 +506,9 @@ struct PlannerInfo
/* Does this query modify any partition key columns? */
bool partColsUpdated;
+
+ /* PartitionPruneInfos added in this query's plan. */
+ List *partPruneInfos;
};
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 5c2ab1b379..2e132afc5a 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -70,6 +70,9 @@ typedef struct PlannedStmt
struct Plan *planTree; /* tree of Plan nodes */
+ List *partPruneInfos; /* List of PartitionPruneInfo contained in
+ * the plan */
+
List *rtable; /* list of RangeTblEntry nodes */
/* rtable indexes of target relations for INSERT/UPDATE/DELETE/MERGE */
@@ -270,8 +273,8 @@ typedef struct Append
*/
int first_partial_plan;
- /* Info for run-time subplan pruning; NULL if we're not doing that */
- struct PartitionPruneInfo *part_prune_info;
+ /* Index into PlannedStmt.partPruneInfos, or -1 if no run-time pruning */
+ int part_prune_index;
} Append;
/* ----------------
@@ -305,8 +308,8 @@ typedef struct MergeAppend
/* NULLS FIRST/LAST directions */
bool *nullsFirst pg_node_attr(array_size(numCols));
- /* Info for run-time subplan pruning; NULL if we're not doing that */
- struct PartitionPruneInfo *part_prune_info;
+ /* Index into PlannedStmt.partPruneInfos, or -1 if no run-time pruning */
+ int part_prune_index;
} MergeAppend;
/* ----------------
diff --git a/src/include/partitioning/partprune.h b/src/include/partitioning/partprune.h
index 90684efa25..ebf0dcff8c 100644
--- a/src/include/partitioning/partprune.h
+++ b/src/include/partitioning/partprune.h
@@ -70,10 +70,10 @@ typedef struct PartitionPruneContext
#define PruneCxtStateIdx(partnatts, step_id, keyno) \
((partnatts) * (step_id) + (keyno))
-extern PartitionPruneInfo *make_partition_pruneinfo(struct PlannerInfo *root,
- struct RelOptInfo *parentrel,
- List *subpaths,
- List *prunequal);
+extern int make_partition_pruneinfo(struct PlannerInfo *root,
+ struct RelOptInfo *parentrel,
+ List *subpaths,
+ List *prunequal);
extern Bitmapset *prune_append_rel_partitions(struct RelOptInfo *rel);
extern Bitmapset *get_matching_partitions(PartitionPruneContext *context,
List *pruning_steps);
--
2.35.3
[application/octet-stream] v23-0002-Optimize-AcquireExecutorLocks-by-locking-only-un.patch (82.3K, 3-v23-0002-Optimize-AcquireExecutorLocks-by-locking-only-un.patch)
From ae9a6b7186c77888fd85dd7e4056dd3cd607617c Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Wed, 22 Dec 2021 16:55:17 +0900
Subject: [PATCH v23 2/2] Optimize AcquireExecutorLocks() by locking only
unpruned partitions
This commit teaches AcquireExecutorLocks() to perform initial
partition pruning to notionally eliminate the subnodes contained in a
generic cached plan that need not be initialized during the actual
execution of the plan, and to skip locking the partitions scanned by
those subnodes.
The result of performing initial partition pruning this way, before the
actual execution has started, is passed to the executor via
PartitionPruneResult, supplied along with the PlannedStmt by the
callers of the executor that used plancache.c to get the plan. It is NULL
in the cases in which the plan is obtained by calling the planner
directly or if the plan obtained by plancache.c is not a generic one.
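A rough sketch of the locking decision may help here. It is not the patch's AcquireExecutorLocks() code: demo_relids_to_lock() is a hypothetical helper, uint64_t bitmasks stand in for Bitmapsets, and the subplan_relid array (mapping each subplan offset to the range-table index it scans) is an assumption made for illustration. The idea shown is that the lock set starts from the relations that must always be locked (the README text in this patch calls that set PlannedStmt.minLockRelids) and then adds only the relations scanned by subplans that survived initial pruning.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a Bitmapset of range-table indexes or offsets. */
typedef uint64_t DemoBitset;

/*
 * Compute the set of relations to lock: the always-locked set plus the
 * relation scanned by each subplan whose offset is in valid_subplan_offs.
 * Pruned subplans contribute nothing, so their partitions stay unlocked.
 */
static DemoBitset
demo_relids_to_lock(DemoBitset min_lock_relids,
                    const int *subplan_relid, int nsubplans,
                    DemoBitset valid_subplan_offs)
{
    DemoBitset to_lock = min_lock_relids;

    assert(nsubplans >= 0 && nsubplans <= 64);
    for (int off = 0; off < nsubplans; off++)
    {
        if ((valid_subplan_offs >> off) & 1)
        {
            assert(subplan_relid[off] >= 0 && subplan_relid[off] < 64);
            to_lock |= UINT64_C(1) << subplan_relid[off];
        }
    }
    return to_lock;
}
```

Because the executor later receives the same valid_subplan_offs instead of recomputing it, the subplans it initializes are the ones whose relations were locked in this step.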
---
src/backend/commands/copyto.c | 2 +-
src/backend/commands/createas.c | 2 +-
src/backend/commands/explain.c | 7 +-
src/backend/commands/extension.c | 2 +-
src/backend/commands/matview.c | 2 +-
src/backend/commands/prepare.c | 26 ++-
src/backend/executor/README | 32 ++++
src/backend/executor/execMain.c | 51 ++++++
src/backend/executor/execParallel.c | 26 ++-
src/backend/executor/execPartition.c | 241 +++++++++++++++++++++----
src/backend/executor/execUtils.c | 1 +
src/backend/executor/functions.c | 2 +-
src/backend/executor/nodeAppend.c | 11 +-
src/backend/executor/nodeMergeAppend.c | 5 +-
src/backend/executor/spi.c | 27 ++-
src/backend/nodes/readfuncs.c | 8 +-
src/backend/optimizer/plan/planner.c | 2 +
src/backend/optimizer/plan/setrefs.c | 46 +++++
src/backend/partitioning/partprune.c | 41 ++++-
src/backend/tcop/postgres.c | 8 +-
src/backend/tcop/pquery.c | 28 ++-
src/backend/utils/cache/plancache.c | 208 ++++++++++++++++++---
src/backend/utils/mmgr/portalmem.c | 19 ++
src/include/commands/explain.h | 4 +-
src/include/executor/execPartition.h | 9 +-
src/include/executor/execdesc.h | 3 +
src/include/executor/executor.h | 3 +
src/include/nodes/execnodes.h | 1 +
src/include/nodes/nodes.h | 1 +
src/include/nodes/pathnodes.h | 12 ++
src/include/nodes/plannodes.h | 46 +++++
src/include/utils/plancache.h | 3 +-
src/include/utils/portal.h | 3 +
33 files changed, 782 insertions(+), 100 deletions(-)
diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index 2527e66059..fb8779fec0 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -558,7 +558,7 @@ BeginCopyTo(ParseState *pstate,
((DR_copy *) dest)->cstate = cstate;
/* Create a QueryDesc requesting no output */
- cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ cstate->queryDesc = CreateQueryDesc(plan, NIL, pstate->p_sourcetext,
GetActiveSnapshot(),
InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 152c29b551..942449544c 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -325,7 +325,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ queryDesc = CreateQueryDesc(plan, NIL, pstate->p_sourcetext,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index f86983c660..2f2b558608 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -407,7 +407,7 @@ ExplainOneQuery(Query *query, int cursorOptions,
}
/* run it (if needed) and produce output */
- ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+ ExplainOnePlan(plan, NIL, into, es, queryString, params, queryEnv,
&planduration, (es->buffers ? &bufusage : NULL));
}
}
@@ -515,7 +515,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
* to call it.
*/
void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, List *part_prune_results,
+ IntoClause *into, ExplainState *es,
const char *queryString, ParamListInfo params,
QueryEnvironment *queryEnv, const instr_time *planduration,
const BufferUsage *bufusage)
@@ -563,7 +564,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
dest = None_Receiver;
/* Create a QueryDesc for the query */
- queryDesc = CreateQueryDesc(plannedstmt, queryString,
+ queryDesc = CreateQueryDesc(plannedstmt, part_prune_results, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, instrument_option);
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index 1a62e5dac5..cc36b6fd15 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -776,7 +776,7 @@ execute_sql_string(const char *sql)
{
QueryDesc *qdesc;
- qdesc = CreateQueryDesc(stmt,
+ qdesc = CreateQueryDesc(stmt, NIL,
sql,
GetActiveSnapshot(), NULL,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 9ac0383459..65c8d0aa59 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -408,7 +408,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, queryString,
+ queryDesc = CreateQueryDesc(plan, NIL, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 9e29584d93..29b45539d3 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -155,6 +155,7 @@ ExecuteQuery(ParseState *pstate,
PreparedStatement *entry;
CachedPlan *cplan;
List *plan_list;
+ List *part_prune_results_list;
ParamListInfo paramLI = NULL;
EState *estate = NULL;
Portal portal;
@@ -193,7 +194,10 @@ ExecuteQuery(ParseState *pstate,
entry->plansource->query_string);
/* Replan if needed, and increment plan refcount for portal */
- cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL);
+ cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL,
+ &part_prune_results_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_results_list));
plan_list = cplan->stmt_list;
/*
@@ -207,6 +211,9 @@ ExecuteQuery(ParseState *pstate,
plan_list,
cplan);
+ /* Copy Lists of PartitionPruneResults into the portal's context. */
+ PortalStorePartitionPruneResults(portal, part_prune_results_list);
+
/*
* For CREATE TABLE ... AS EXECUTE, we must verify that the prepared
* statement is one that produces tuples. Currently we insist that it be
@@ -576,7 +583,9 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
const char *query_string;
CachedPlan *cplan;
List *plan_list;
- ListCell *p;
+ List *part_prune_results_list;
+ ListCell *p,
+ *pp;
ParamListInfo paramLI = NULL;
EState *estate = NULL;
instr_time planstart;
@@ -619,7 +628,10 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
/* Replan if needed, and acquire a transient refcount */
cplan = GetCachedPlan(entry->plansource, paramLI,
- CurrentResourceOwner, queryEnv);
+ CurrentResourceOwner, queryEnv,
+ &part_prune_results_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_results_list));
INSTR_TIME_SET_CURRENT(planduration);
INSTR_TIME_SUBTRACT(planduration, planstart);
@@ -634,13 +646,15 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
plan_list = cplan->stmt_list;
/* Explain each query */
- foreach(p, plan_list)
+ forboth(p, plan_list, pp, part_prune_results_list)
{
PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
+ List *part_prune_results = lfirst_node(List, pp);
if (pstmt->commandType != CMD_UTILITY)
- ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv,
- &planduration, (es->buffers ? &bufusage : NULL));
+ ExplainOnePlan(pstmt, part_prune_results, into, es, query_string,
+ paramLI, queryEnv, &planduration,
+ (es->buffers ? &bufusage : NULL));
else
ExplainOneUtility(pstmt->utilityStmt, into, es, query_string,
paramLI, queryEnv);
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 0b5183fc4a..f14f9197b5 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -65,6 +65,34 @@ found there. This currently only occurs for Append and MergeAppend nodes. In
this case the non-required subplans are ignored and the executor state's
subnode array will become out of sequence to the plan's subplan list.
+Actually, the so-called execution time pruning may also occur even before the
+execution has started. One case where that occurs is when a cached generic
+plan is being validated for execution by plancache.c: GetCachedPlan(), which
+works by locking all the relations that will be scanned by that plan. If the
+generic plan contains nodes that can perform execution time partition pruning
+(that is, contain a PartitionPruneInfo), a subset of pruning steps contained
+in a given node's PartitionPruneInfo that do not depend on the execution
+actually having started (called "initial" pruning steps) are performed to
+figure out the minimal set of child subplans that satisfy those pruning steps.
+AcquireExecutorLocks() looking at a given generic plan will then lock only the
+relations scanned by the child subplans that survived such pruning, along with
+those present in PlannedStmt.minLockRelids. Note that the subplans are only
+notionally pruned, that is, they are not removed from the plan tree as such.
+
+To prevent the executor and any third party execution code that can look at
+the plan tree from trying to execute the subplans that were pruned as
+described above, the result of pruning is passed to the executor as a List
+of PartitionPruneResult nodes via the QueryDesc. Each PartitionPruneResult
+consists of the set of indexes of surviving subplans in the respective parent
+plan node's (the one to which the corresponding PartitionPruneInfo belongs)
+list of child subplans, saved as a bitmapset (valid_subplan_offs). In other
+words, the executor executing a generic plan should not re-evaluate the set of
+initially valid subplans for a given plan node by redoing the initial pruning
+if it was already done by AcquireExecutorLocks() when validating the plan.
+Such re-evaluation of the pruning steps may very well end up resulting in a
+different set of subplans, containing some whose relations were not locked by
+AcquireExecutorLocks().
+
Each Plan node may have expression trees associated with it, to represent
its target list, qualification conditions, etc. These trees are also
read-only to the executor, but the executor state for expression evaluation
@@ -286,6 +314,10 @@ Query Processing Control Flow
This is a sketch of control flow for full query processing:
+ [ ExecutorDoInitialPruning ] --- an optional step to perform initial
+ partition pruning on the plan tree, the result of which is passed
+ to the executor via QueryDesc
+
CreateQueryDesc
ExecutorStart
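
The locking rule described in the README text above can be illustrated with a minimal standalone sketch. This is not PostgreSQL code: a 64-bit mask stands in for a Bitmapset, and the names RelidSet, relids_to_lock and subplan_rti are hypothetical, introduced only for illustration. It assumes all RT indexes and subplan offsets are below 64.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a Bitmapset: bit i set means i is a member. */
typedef uint64_t RelidSet;

/*
 * Compute the set of RT indexes to lock for a generic plan: the set that
 * is always locked (cf. PlannedStmt.minLockRelids) plus the RT index
 * scanned by every child subplan that survived initial pruning.
 *
 * subplan_rti[i] is the RT index scanned by child subplan i.
 * valid_subplan_offs has bit i set if subplan i survived pruning.
 */
static RelidSet
relids_to_lock(RelidSet min_lock_relids,
               RelidSet valid_subplan_offs,
               const unsigned *subplan_rti, int nsubplans)
{
    RelidSet    result = min_lock_relids;

    assert(nsubplans >= 0 && nsubplans <= 64);
    for (int i = 0; i < nsubplans; i++)
    {
        /* Pruned subplans stay in the plan tree but add no lock. */
        if (valid_subplan_offs & (UINT64_C(1) << i))
        {
            assert(subplan_rti[i] < 64);
            result |= UINT64_C(1) << subplan_rti[i];
        }
    }
    return result;
}
```

With three child subplans scanning RT indexes 2, 3 and 4, and only subplans 0 and 2 surviving, the relation at RT index 3 is left unlocked. That is why the executor must reuse the same surviving set rather than recompute it.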
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 32475e33ff..b59474841f 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -49,6 +49,7 @@
#include "commands/matview.h"
#include "commands/trigger.h"
#include "executor/execdebug.h"
+#include "executor/execPartition.h"
#include "executor/nodeSubplan.h"
#include "foreign/fdwapi.h"
#include "jit/jit.h"
@@ -104,6 +105,54 @@ static void EvalPlanQualStart(EPQState *epqstate, Plan *planTree);
/* end of local decls */
+/* ----------------------------------------------------------------
+ * ExecutorDoInitialPruning
+ *
+ * For each plan tree node that has been assigned a PartitionPruneInfo,
+ * this performs initial partition pruning using the information contained
+ * therein to determine the set of child subplans that satisfy the initial
+ * pruning steps, to be returned as a bitmapset of their indexes in the
+ * node's list of child subplans (for example, an Append's appendplans).
+ *
+ * Return value is a List of PartitionPruneResult nodes, one for every
+ * PartitionPruneInfo, each containing the bitmapset for that node. The RT
+ * indexes of all the leaf partitions scanned by the chosen subplans are added
+ * to *scan_leafpart_rtis, which is shared across all PartitionPruneInfos.
+ *
+ * The executor must see exactly the same set of subplans as valid for
+ * execution when doing ExecInitNode() on the plan nodes whose
+ * PartitionPruneInfos are processed here. So, it must get the set from the
+ * aforementioned PartitionPruneResult, instead of computing it all over
+ * again by redoing the initial pruning. It's the caller's job to pass the
+ * PartitionPruneResult to the executor.
+ *
+ * Note: Partitioned tables mentioned in PartitionedRelPruneInfo nodes that
+ * drive the pruning will be locked before doing the pruning.
+ * ----------------------------------------------------------------
+ */
+List *
+ExecutorDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
+ Bitmapset **scan_leafpart_rtis)
+{
+ List *part_prune_results = NIL;
+ ListCell *lc;
+
+ /* Only get here if there is any pruning to do. */
+ Assert(plannedstmt->containsInitialPruning);
+
+ foreach(lc, plannedstmt->partPruneInfos)
+ {
+ PartitionPruneInfo *pruneinfo = lfirst(lc);
+ PartitionPruneResult *pruneresult = makeNode(PartitionPruneResult);
+
+ pruneresult->valid_subplan_offs =
+ ExecPartitionDoInitialPruning(plannedstmt, params, pruneinfo,
+ scan_leafpart_rtis);
+ part_prune_results = lappend(part_prune_results, pruneresult);
+ }
+
+ return part_prune_results;
+}
/* ----------------------------------------------------------------
* ExecutorStart
@@ -806,6 +855,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
{
CmdType operation = queryDesc->operation;
PlannedStmt *plannedstmt = queryDesc->plannedstmt;
+ List *part_prune_results = queryDesc->part_prune_results;
Plan *plan = plannedstmt->planTree;
List *rangeTable = plannedstmt->rtable;
EState *estate = queryDesc->estate;
@@ -826,6 +876,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
estate->es_plannedstmt = plannedstmt;
estate->es_part_prune_infos = plannedstmt->partPruneInfos;
+ estate->es_part_prune_results = part_prune_results;
/*
* Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index aca0c6f323..917079a034 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -66,6 +66,7 @@
#define PARALLEL_KEY_QUERY_TEXT UINT64CONST(0xE000000000000008)
#define PARALLEL_KEY_JIT_INSTRUMENTATION UINT64CONST(0xE000000000000009)
#define PARALLEL_KEY_WAL_USAGE UINT64CONST(0xE00000000000000A)
+#define PARALLEL_KEY_PARTITION_PRUNE_RESULTS UINT64CONST(0xE00000000000000B)
#define PARALLEL_TUPLE_QUEUE_SIZE 65536
@@ -182,6 +183,7 @@ ExecSerializePlan(Plan *plan, EState *estate)
pstmt->transientPlan = false;
pstmt->dependsOnRole = false;
pstmt->parallelModeNeeded = false;
+ pstmt->containsInitialPruning = false;
pstmt->planTree = plan;
pstmt->partPruneInfos = estate->es_part_prune_infos;
pstmt->rtable = estate->es_range_table;
@@ -597,12 +599,15 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
FixedParallelExecutorState *fpes;
char *pstmt_data;
char *pstmt_space;
+ char *part_prune_results_data;
+ char *part_prune_results_space;
char *paramlistinfo_space;
BufferUsage *bufusage_space;
WalUsage *walusage_space;
SharedExecutorInstrumentation *instrumentation = NULL;
SharedJitInstrumentation *jit_instrumentation = NULL;
int pstmt_len;
+ int part_prune_results_len;
int paramlistinfo_len;
int instrumentation_len = 0;
int jit_instrumentation_len = 0;
@@ -631,6 +636,7 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
/* Fix up and serialize plan to be sent to workers. */
pstmt_data = ExecSerializePlan(planstate->plan, estate);
+ part_prune_results_data = nodeToString(estate->es_part_prune_results);
/* Create a parallel context. */
pcxt = CreateParallelContext("postgres", "ParallelQueryMain", nworkers);
@@ -657,6 +663,11 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
shm_toc_estimate_chunk(&pcxt->estimator, pstmt_len);
shm_toc_estimate_keys(&pcxt->estimator, 1);
+ /* Estimate space for serialized List of PartitionPruneResult. */
+ part_prune_results_len = strlen(part_prune_results_data) + 1;
+ shm_toc_estimate_chunk(&pcxt->estimator, part_prune_results_len);
+ shm_toc_estimate_keys(&pcxt->estimator, 1);
+
/* Estimate space for serialized ParamListInfo. */
paramlistinfo_len = EstimateParamListSpace(estate->es_param_list_info);
shm_toc_estimate_chunk(&pcxt->estimator, paramlistinfo_len);
@@ -751,6 +762,12 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
memcpy(pstmt_space, pstmt_data, pstmt_len);
shm_toc_insert(pcxt->toc, PARALLEL_KEY_PLANNEDSTMT, pstmt_space);
+ /* Store serialized List of PartitionPruneResult */
+ part_prune_results_space = shm_toc_allocate(pcxt->toc, part_prune_results_len);
+ memcpy(part_prune_results_space, part_prune_results_data, part_prune_results_len);
+ shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARTITION_PRUNE_RESULTS,
+ part_prune_results_space);
+
/* Store serialized ParamListInfo. */
paramlistinfo_space = shm_toc_allocate(pcxt->toc, paramlistinfo_len);
shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARAMLISTINFO, paramlistinfo_space);
@@ -1232,8 +1249,10 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
int instrument_options)
{
char *pstmtspace;
+ char *part_prune_results_space;
char *paramspace;
PlannedStmt *pstmt;
+ List *part_prune_results;
ParamListInfo paramLI;
char *queryString;
@@ -1244,12 +1263,17 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
pstmtspace = shm_toc_lookup(toc, PARALLEL_KEY_PLANNEDSTMT, false);
pstmt = (PlannedStmt *) stringToNode(pstmtspace);
+ /* Reconstruct leader-supplied PartitionPruneResult. */
+ part_prune_results_space =
+ shm_toc_lookup(toc, PARALLEL_KEY_PARTITION_PRUNE_RESULTS, false);
+ part_prune_results = (List *) stringToNode(part_prune_results_space);
+
/* Reconstruct ParamListInfo. */
paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
paramLI = RestoreParamList(&paramspace);
/* Create a QueryDesc for the query. */
- return CreateQueryDesc(pstmt,
+ return CreateQueryDesc(pstmt, part_prune_results,
queryString,
GetActiveSnapshot(), InvalidSnapshot,
receiver, paramLI, NULL, instrument_options);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 80197d5141..8728745c44 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -25,6 +25,7 @@
#include "mb/pg_wchar.h"
#include "miscadmin.h"
#include "nodes/makefuncs.h"
+#include "parser/parsetree.h"
#include "partitioning/partbounds.h"
#include "partitioning/partdesc.h"
#include "partitioning/partprune.h"
@@ -185,7 +186,11 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
static PartitionPruneState *CreatePartitionPruneState(PlanState *planstate,
- PartitionPruneInfo *pruneinfo);
+ PartitionPruneInfo *pruneinfo,
+ bool consider_initial_steps,
+ bool consider_exec_steps,
+ List *rtable, ExprContext *econtext,
+ PartitionDirectory partdir);
static void InitPartitionPruneContext(PartitionPruneContext *context,
List *pruning_steps,
PartitionDesc partdesc,
@@ -198,7 +203,8 @@ static void PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
- Bitmapset **validsubplans);
+ Bitmapset **validsubplans,
+ Bitmapset **scan_leafpart_rtis);
/*
@@ -1746,8 +1752,10 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
* considered to be a stable expression, it can change value from one plan
* node scan to the next during query execution. Stable comparison
* expressions that don't involve such Params allow partition pruning to be
- * done once during executor startup. Expressions that do involve such Params
- * require us to prune separately for each scan of the parent plan node.
+ * done once, either during executor startup or in ExecutorDoInitialPruning(),
+ * which runs as part of performing AcquireExecutorLocks() on a given plan tree.
+ * Expressions that do involve such Params require us to prune separately for
+ * each scan of the parent plan node.
*
* Note that pruning away unneeded subplans during executor startup has the
* added benefit of not having to initialize the unneeded subplans at all.
@@ -1764,6 +1772,13 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
* account for initial pruning possibly having eliminated some of the
* subplans.
*
+ * ExecPartitionDoInitialPruning:
+ * Do initial pruning with the information contained in a given
+ * PartitionPruneInfo to determine the minimal set of child subplans
+ * of the parent plan node (the one to which the PartitionPruneInfo
+ * belongs) that need to be executed, and also the set of RT indexes of
+ * the leaf partitions that will be scanned by those subplans.
+ *
* ExecFindMatchingSubPlans:
* Returns indexes of matching subplans after evaluating the expressions
* that are safe to evaluate at a given point. This function is first
@@ -1781,8 +1796,9 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
*
* On return, *initially_valid_subplans is assigned the set of indexes of
* child subplans that must be initialized along with the parent plan node.
- * Initial pruning is performed here if needed and in that case only the
- * surviving subplans' indexes are added.
+ * Initial pruning is performed here if needed (unless it has already been done
+ * by ExecutorDoInitialPruning()), and in that case only the surviving
+ * subplans' indexes are added.
*
* If subplans are indeed pruned, subplan_map arrays contained in the returned
* PartitionPruneState are re-sequenced to not count those, though only if the
@@ -1794,28 +1810,65 @@ ExecInitPartitionPruning(PlanState *planstate,
int part_prune_index,
Bitmapset **initially_valid_subplans)
{
- PartitionPruneState *prunestate;
+ PartitionPruneState *prunestate = NULL;
EState *estate = planstate->state;
PartitionPruneInfo *pruneinfo = list_nth(estate->es_part_prune_infos,
part_prune_index);
+ PartitionPruneResult *pruneresult = NULL;
+ bool do_pruning = (pruneinfo->needs_init_pruning ||
+ pruneinfo->needs_exec_pruning);
+
+ /*
+ * No need to do initial pruning if it was done already by
+ * ExecutorDoInitialPruning(), which it would be if es_part_prune_results
+ * is set.
+ */
+ if (estate->es_part_prune_results)
+ {
+ pruneresult = list_nth(estate->es_part_prune_results, part_prune_index);
+ Assert(IsA(pruneresult, PartitionPruneResult));
+ do_pruning = pruneinfo->needs_exec_pruning;
+ }
- /* We may need an expression context to evaluate partition exprs */
- ExecAssignExprContext(estate, planstate);
+ if (do_pruning)
+ {
+ /* We may need an expression context to evaluate partition exprs */
+ ExecAssignExprContext(estate, planstate);
- /* Create the working data structure for pruning */
- prunestate = CreatePartitionPruneState(planstate, pruneinfo);
+ /* For data reading, executor always omits detached partitions */
+ if (estate->es_partition_directory == NULL)
+ estate->es_partition_directory =
+ CreatePartitionDirectory(estate->es_query_cxt, false);
+
+ /*
+ * Create the working data structure for pruning. No need to consider
+ * initial pruning steps if we have a PartitionPruneResult.
+ */
+ prunestate = CreatePartitionPruneState(planstate, pruneinfo,
+ pruneresult == NULL, true,
+ NIL, planstate->ps_ExprContext,
+ estate->es_partition_directory);
+ }
/*
* Perform an initial partition prune pass, if required.
*/
- if (prunestate->do_initial_prune)
- *initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true);
+ if (pruneresult)
+ {
+ *initially_valid_subplans = bms_copy(pruneresult->valid_subplan_offs);
+ }
+ else if (prunestate && prunestate->do_initial_prune)
+ {
+ *initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true,
+ NULL);
+ }
else
{
- /* No pruning, so we'll need to initialize all subplans */
+ /* No initial pruning, so we'll need to initialize all subplans */
Assert(n_total_subplans > 0);
*initially_valid_subplans = bms_add_range(NULL, 0,
n_total_subplans - 1);
+ return prunestate;
}
/*
@@ -1823,7 +1876,8 @@ ExecInitPartitionPruning(PlanState *planstate,
* that were removed above due to initial pruning. No need to do this if
* no steps were removed.
*/
- if (bms_num_members(*initially_valid_subplans) < n_total_subplans)
+ if (prunestate &&
+ bms_num_members(*initially_valid_subplans) < n_total_subplans)
{
/*
* We can safely skip this when !do_exec_prune, even though that
@@ -1839,11 +1893,74 @@ ExecInitPartitionPruning(PlanState *planstate,
return prunestate;
}
+/*
+ * ExecPartitionDoInitialPruning
+ * Perform initial pruning using given PartitionPruneInfo to determine
+ * the minimal set of child subplans that will be executed and also the
+ * set of RT indexes of the leaf partitions scanned by those subplans.
+ */
+Bitmapset *
+ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
+ PartitionPruneInfo *pruneinfo,
+ Bitmapset **scan_leafpart_rtis)
+{
+ List *rtable = plannedstmt->rtable;
+ ExprContext *econtext;
+ PartitionDirectory pdir;
+ MemoryContext oldcontext,
+ tmpcontext;
+ PartitionPruneState *prunestate;
+ Bitmapset *valid_subplan_offs;
+
+ /*
+ * A temporary context for memory allocations required while executing
+ * partition pruning steps.
+ */
+ tmpcontext = AllocSetContextCreate(CurrentMemoryContext,
+ "initial pruning working data",
+ ALLOCSET_DEFAULT_SIZES);
+ oldcontext = MemoryContextSwitchTo(tmpcontext);
+
+ /*
+ * PartitionDirectory to look up partition descriptors.
+ * Note that we don't omit detached partitions, just like during
+ * execution proper.
+ */
+ pdir = CreatePartitionDirectory(CurrentMemoryContext, false);
+
+ /*
+ * We don't yet have a PlanState for the parent plan node, so we must
+ * create a standalone ExprContext to evaluate pruning expressions,
+ * equipped with the information about the EXTERN parameters that the
+ * caller passed us. Note that that's okay because the initial pruning
+ * steps do not contain anything that requires the execution to have
+ * started and thus need the information contained in a PlanState.
+ */
+ econtext = CreateStandaloneExprContext();
+ econtext->ecxt_param_list_info = params;
+ prunestate = CreatePartitionPruneState(NULL, pruneinfo, true, false,
+ rtable, econtext, pdir);
+ MemoryContextSwitchTo(oldcontext);
+
+ /* Do the initial pruning. */
+ valid_subplan_offs = ExecFindMatchingSubPlans(prunestate, true,
+ scan_leafpart_rtis);
+
+ FreeExprContext(econtext, true);
+ DestroyPartitionDirectory(pdir);
+ MemoryContextDelete(tmpcontext);
+
+ return valid_subplan_offs;
+}
+
/*
* CreatePartitionPruneState
* Build the data structure required for calling ExecFindMatchingSubPlans
*
- * 'planstate' is the parent plan node's execution state.
+ * 'planstate', if not NULL, is the parent plan node's execution state. It
+ * can be NULL if being called before ExecutorStart(), in which case,
+ * 'rtable' (range table), 'econtext', and 'partdir' must be explicitly
+ * provided.
*
* 'pruneinfo' is a PartitionPruneInfo as generated by
* make_partition_pruneinfo. Here we build a PartitionPruneState containing a
@@ -1857,19 +1974,21 @@ ExecInitPartitionPruning(PlanState *planstate,
* PartitionedRelPruneInfo.
*/
static PartitionPruneState *
-CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
+CreatePartitionPruneState(PlanState *planstate,
+ PartitionPruneInfo *pruneinfo,
+ bool consider_initial_steps,
+ bool consider_exec_steps,
+ List *rtable, ExprContext *econtext,
+ PartitionDirectory partdir)
{
- EState *estate = planstate->state;
+ EState *estate = planstate ? planstate->state : NULL;
PartitionPruneState *prunestate;
int n_part_hierarchies;
ListCell *lc;
int i;
- ExprContext *econtext = planstate->ps_ExprContext;
- /* For data reading, executor always omits detached partitions */
- if (estate->es_partition_directory == NULL)
- estate->es_partition_directory =
- CreatePartitionDirectory(estate->es_query_cxt, false);
+ Assert((estate != NULL) ||
+ (partdir != NULL && econtext != NULL && rtable != NIL));
n_part_hierarchies = list_length(pruneinfo->prune_infos);
Assert(n_part_hierarchies > 0);
@@ -1924,15 +2043,42 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
PartitionKey partkey;
/*
- * We can rely on the copies of the partitioned table's partition
- * key and partition descriptor appearing in its relcache entry,
- * because that entry will be held open and locked for the
- * duration of this executor run.
+ * We must open the relation ourselves when called before execution
+ * has started, such as during ExecutorDoInitialPruning() on a cached
+ * plan. In that case,
+ * sub-partitions must be locked, because AcquirePlannerLocks()
+ * would not have seen them. (1st relation in a partrelpruneinfos
+ * list is always the root partitioned table appearing in the
+ * query, which AcquirePlannerLocks() would have locked; the
+ * Assert in relation_open() guards that assumption.)
+ */
+ if (estate == NULL)
+ {
+ RangeTblEntry *rte = rt_fetch(pinfo->rtindex, rtable);
+ int lockmode = (j == 0) ? NoLock : rte->rellockmode;
+
+ partrel = table_open(rte->relid, lockmode);
+ }
+ else
+ partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
+
+ /*
+ * We can rely on the copy of the partitioned table's partition
+ * key from its relcache entry, because it can't change (or
+ * get destroyed) as long as the relation is locked. Partition
+ * descriptor is taken from the PartitionDirectory associated with
+ * the table that is held open long enough for the descriptor to
+ * remain valid while it's used to perform the pruning steps.
*/
- partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
partkey = RelationGetPartitionKey(partrel);
- partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
- partrel);
+ partdesc = PartitionDirectoryLookup(partdir, partrel);
+
+ /*
+ * Must close partrel, keeping the lock taken, if we're not using
+ * EState's entry.
+ */
+ if (estate == NULL)
+ table_close(partrel, NoLock);
/*
* Initialize the subplan_map and subpart_map.
@@ -1946,6 +2092,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
Assert(partdesc->nparts >= pinfo->nparts);
pprune->nparts = partdesc->nparts;
pprune->subplan_map = palloc(sizeof(int) * partdesc->nparts);
+ pprune->rti_map = palloc(sizeof(Index) * partdesc->nparts);
if (partdesc->nparts == pinfo->nparts)
{
/*
@@ -1956,6 +2103,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
pprune->subpart_map = pinfo->subpart_map;
memcpy(pprune->subplan_map, pinfo->subplan_map,
sizeof(int) * pinfo->nparts);
+ memcpy(pprune->rti_map, pinfo->rti_map,
+ sizeof(Index) * pinfo->nparts);
/*
* Double-check that the list of unpruned relations has not
@@ -2006,6 +2155,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
pinfo->subplan_map[pd_idx];
pprune->subpart_map[pp_idx] =
pinfo->subpart_map[pd_idx];
+ pprune->rti_map[pp_idx] =
+ pinfo->rti_map[pd_idx];
pd_idx++;
}
else
@@ -2013,6 +2164,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
/* this partdesc entry is not in the plan */
pprune->subplan_map[pp_idx] = -1;
pprune->subpart_map[pp_idx] = -1;
+ pprune->rti_map[pp_idx] = 0;
}
}
@@ -2034,7 +2186,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
* Initialize pruning contexts as needed.
*/
pprune->initial_pruning_steps = pinfo->initial_pruning_steps;
- if (pinfo->initial_pruning_steps)
+ if (consider_initial_steps && pinfo->initial_pruning_steps)
{
InitPartitionPruneContext(&pprune->initial_context,
pinfo->initial_pruning_steps,
@@ -2044,7 +2196,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
prunestate->do_initial_prune = true;
}
pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
- if (pinfo->exec_pruning_steps)
+ if (consider_exec_steps && pinfo->exec_pruning_steps)
{
InitPartitionPruneContext(&pprune->exec_context,
pinfo->exec_pruning_steps,
@@ -2272,10 +2424,14 @@ PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
* Pass initial_prune if PARAM_EXEC Params cannot yet be evaluated. This
* differentiates the initial executor-time pruning step from later
* runtime pruning.
+ *
+ * RT indexes of leaf partitions scanned by the chosen subplans are added to
+ * *scan_leafpart_rtis if the pointer is non-NULL.
*/
Bitmapset *
ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
- bool initial_prune)
+ bool initial_prune,
+ Bitmapset **scan_leafpart_rtis)
{
Bitmapset *result = NULL;
MemoryContext oldcontext;
@@ -2310,7 +2466,7 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
*/
pprune = &prunedata->partrelprunedata[0];
find_matching_subplans_recurse(prunedata, pprune, initial_prune,
- &result);
+ &result, scan_leafpart_rtis);
/* Expression eval may have used space in ExprContext too */
if (pprune->exec_pruning_steps)
@@ -2324,6 +2480,8 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
/* Copy result out of the temp context before we reset it */
result = bms_copy(result);
+ if (scan_leafpart_rtis)
+ *scan_leafpart_rtis = bms_copy(*scan_leafpart_rtis);
MemoryContextReset(prunestate->prune_context);
@@ -2334,13 +2492,15 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
* find_matching_subplans_recurse
* Recursive worker function for ExecFindMatchingSubPlans
*
- * Adds valid (non-prunable) subplan IDs to *validsubplans
+ * Adds valid (non-prunable) subplan IDs to *validsubplans and RT indexes
+ * of the corresponding leaf partitions to *scan_leafpart_rtis (if asked for).
*/
static void
find_matching_subplans_recurse(PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
- Bitmapset **validsubplans)
+ Bitmapset **validsubplans,
+ Bitmapset **scan_leafpart_rtis)
{
Bitmapset *partset;
int i;
@@ -2367,8 +2527,14 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
while ((i = bms_next_member(partset, i)) >= 0)
{
if (pprune->subplan_map[i] >= 0)
+ {
*validsubplans = bms_add_member(*validsubplans,
pprune->subplan_map[i]);
+ Assert(pprune->rti_map[i] > 0);
+ if (scan_leafpart_rtis)
+ *scan_leafpart_rtis = bms_add_member(*scan_leafpart_rtis,
+ pprune->rti_map[i]);
+ }
else
{
int partidx = pprune->subpart_map[i];
@@ -2376,7 +2542,8 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
if (partidx >= 0)
find_matching_subplans_recurse(prunedata,
&prunedata->partrelprunedata[partidx],
- initial_prune, validsubplans);
+ initial_prune, validsubplans,
+ scan_leafpart_rtis);
else
{
/*
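
The way find_matching_subplans_recurse() now collects both subplan offsets and leaf-partition RT indexes can be sketched in isolation. The following is a simplified model, not the patch's code: PruneLevel and collect_matching are hypothetical names, 64-bit masks replace Bitmapsets, and the set of partitions surviving the pruning steps is assumed to be precomputed in the "surviving" field rather than derived from pruning steps.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified analogue of PartitionedRelPruningData. */
typedef struct PruneLevel
{
    int             nparts;
    const int      *subplan_map;    /* >= 0: subplan offset of a leaf */
    const int      *subpart_map;    /* >= 0: index of a sub-partitioned level */
    const unsigned *rti_map;        /* RT index of a leaf, 0 if not a leaf */
    uint64_t        surviving;      /* bit i set: partition i survived */
} PruneLevel;

/*
 * Walk the partition hierarchy starting at levels[level], adding surviving
 * leaf subplans to *validsubplans and, if the caller asked for them, their
 * RT indexes to *scan_leafpart_rtis.
 */
static void
collect_matching(const PruneLevel *levels, int level,
                 uint64_t *validsubplans, uint64_t *scan_leafpart_rtis)
{
    const PruneLevel *pl = &levels[level];

    for (int i = 0; i < pl->nparts; i++)
    {
        if (!(pl->surviving & (UINT64_C(1) << i)))
            continue;           /* pruned at this level */

        if (pl->subplan_map[i] >= 0)
        {
            *validsubplans |= UINT64_C(1) << pl->subplan_map[i];
            assert(pl->rti_map[i] > 0);
            if (scan_leafpart_rtis)
                *scan_leafpart_rtis |= UINT64_C(1) << pl->rti_map[i];
        }
        else if (pl->subpart_map[i] >= 0)
            collect_matching(levels, pl->subpart_map[i],
                             validsubplans, scan_leafpart_rtis);
    }
}
```

Passing NULL for the second output mirrors the executor-time callers in the patch, which need only the subplan offsets because the locks have already been taken.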
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 21f4c10937..67a58c7163 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -134,6 +134,7 @@ CreateExecutorState(void)
estate->es_param_exec_vals = NULL;
estate->es_queryEnv = NULL;
+ estate->es_part_prune_results = NIL;
estate->es_query_cxt = qcontext;
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index e134a82ff7..18d3b98cdc 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -842,7 +842,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
else
dest = None_Receiver;
- es->qd = CreateQueryDesc(es->stmt,
+ es->qd = CreateQueryDesc(es->stmt, NIL,
fcache->src,
GetActiveSnapshot(),
InvalidSnapshot,
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index c6f86a6510..96880e122a 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -155,7 +155,8 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
* subplan, we can fill as_valid_subplans immediately, preventing
* later calls to ExecFindMatchingSubPlans.
*/
- if (!prunestate->do_exec_prune && nplans > 0)
+ if (appendstate->as_prune_state == NULL ||
+ (!appendstate->as_prune_state->do_exec_prune && nplans > 0))
appendstate->as_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
}
else
@@ -577,7 +578,7 @@ choose_next_subplan_locally(AppendState *node)
}
else if (node->as_valid_subplans == NULL)
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
whichplan = -1;
}
@@ -642,7 +643,7 @@ choose_next_subplan_for_leader(AppendState *node)
if (node->as_valid_subplans == NULL)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
/*
* Mark each invalid plan as finished to allow the loop below to
@@ -717,7 +718,7 @@ choose_next_subplan_for_worker(AppendState *node)
else if (node->as_valid_subplans == NULL)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
mark_invalid_subplans_as_finished(node);
}
@@ -868,7 +869,7 @@ ExecAppendAsyncBegin(AppendState *node)
if (node->as_valid_subplans == NULL)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
classify_matching_subplans(node);
}
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index 8d35860c30..2312e5a633 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -103,7 +103,8 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
* subplan, we can fill ms_valid_subplans immediately, preventing
* later calls to ExecFindMatchingSubPlans.
*/
- if (!prunestate->do_exec_prune && nplans > 0)
+ if (mergestate->ms_prune_state == NULL ||
+ (!mergestate->ms_prune_state->do_exec_prune && nplans > 0))
mergestate->ms_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
}
else
@@ -218,7 +219,7 @@ ExecMergeAppend(PlanState *pstate)
*/
if (node->ms_valid_subplans == NULL)
node->ms_valid_subplans =
- ExecFindMatchingSubPlans(node->ms_prune_state, false);
+ ExecFindMatchingSubPlans(node->ms_prune_state, false, NULL);
/*
* First time through: pull the first tuple from each valid subplan,
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index fd5796f1b9..93012a5b3b 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1578,6 +1578,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
CachedPlanSource *plansource;
CachedPlan *cplan;
List *stmt_list;
+ List *part_prune_results_list;
char *query_string;
Snapshot snapshot;
MemoryContext oldcontext;
@@ -1657,7 +1658,10 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
*/
/* Replan if needed, and increment plan refcount for portal */
- cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv);
+ cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv,
+ &part_prune_results_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_results_list));
stmt_list = cplan->stmt_list;
if (!plan->saved)
@@ -1685,6 +1689,9 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
stmt_list,
cplan);
+ /* Copy Lists of PartitionPruneResults into the portal's context. */
+ PortalStorePartitionPruneResults(portal, part_prune_results_list);
+
/*
* Set up options for portal. Default SCROLL type is chosen the same way
* as PerformCursorOpen does it.
@@ -2092,7 +2099,8 @@ SPI_plan_get_cached_plan(SPIPlanPtr plan)
/* Get the generic plan for the query */
cplan = GetCachedPlan(plansource, NULL,
plan->saved ? CurrentResourceOwner : NULL,
- _SPI_current->queryEnv);
+ _SPI_current->queryEnv,
+ NULL /* Not interested in PartitionPruneResults */);
Assert(cplan == plansource->gplan);
/* Pop the error context stack */
@@ -2473,7 +2481,9 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
{
CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
List *stmt_list;
- ListCell *lc2;
+ List *part_prune_results_list;
+ ListCell *lc2,
+ *lc3;
spicallbackarg.query = plansource->query_string;
@@ -2549,8 +2559,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
* plan, the refcount must be backed by the plan_owner.
*/
cplan = GetCachedPlan(plansource, options->params,
- plan_owner, _SPI_current->queryEnv);
-
+ plan_owner, _SPI_current->queryEnv,
+ &part_prune_results_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_results_list));
stmt_list = cplan->stmt_list;
/*
@@ -2589,9 +2601,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
}
}
- foreach(lc2, stmt_list)
+ forboth(lc2, stmt_list, lc3, part_prune_results_list)
{
PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+ List *part_prune_results = lfirst_node(List, lc3);
bool canSetTag = stmt->canSetTag;
DestReceiver *dest;
@@ -2663,7 +2676,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
else
snap = InvalidSnapshot;
- qdesc = CreateQueryDesc(stmt,
+ qdesc = CreateQueryDesc(stmt, part_prune_results,
plansource->query_string,
snap, crosscheck_snapshot,
dest,
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index b4ff855f7c..77990a2732 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -158,6 +158,11 @@
token = pg_strtok(&length); /* skip :fldname */ \
local_node->fldname = readIntCols(len)
+/* Read an Index array */
+#define READ_INDEX_ARRAY(fldname, len) \
+ token = pg_strtok(&length); /* skip :fldname */ \
+ local_node->fldname = readIndexCols(len)
+
/* Read a bool array */
#define READ_BOOL_ARRAY(fldname, len) \
token = pg_strtok(&length); /* skip :fldname */ \
@@ -795,7 +800,6 @@ fnname(int numCols) \
*/
READ_SCALAR_ARRAY(readAttrNumberCols, int16, atoi)
READ_SCALAR_ARRAY(readOidCols, Oid, atooid)
-/* outfuncs.c has writeIndexCols, but we don't yet need that here */
-/* READ_SCALAR_ARRAY(readIndexCols, Index, atoui) */
+READ_SCALAR_ARRAY(readIndexCols, Index, atoui)
READ_SCALAR_ARRAY(readIntCols, int, atoi)
READ_SCALAR_ARRAY(readBoolCols, bool, strtobool)
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 240d50f1c0..b7801ea04c 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -520,7 +520,9 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
result->parallelModeNeeded = glob->parallelModeNeeded;
result->planTree = top_plan;
result->partPruneInfos = glob->partPruneInfos;
+ result->containsInitialPruning = glob->containsInitialPruning;
result->rtable = glob->finalrtable;
+ result->minLockRelids = glob->minLockRelids;
result->resultRelations = glob->resultRelations;
result->appendRelations = glob->appendRelations;
result->subplans = glob->subplans;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 720f20f563..61d6934978 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -270,6 +270,16 @@ set_plan_references(PlannerInfo *root, Plan *plan)
*/
add_rtes_to_flat_rtable(root, false);
+ /*
+ * Add the query's adjusted range of RT indexes to glob->minLockRelids.
+ * The adjusted RT indexes of prunable relations will be deleted from the
+ * set below where PartitionPruneInfos are processed.
+ */
+ glob->minLockRelids =
+ bms_add_range(glob->minLockRelids,
+ rtoffset + 1,
+ rtoffset + list_length(root->parse->rtable));
+
/*
* Adjust RT indexes of PlanRowMarks and add to final rowmarks list
*/
@@ -352,6 +362,7 @@ set_plan_references(PlannerInfo *root, Plan *plan)
foreach (lc, root->partPruneInfos)
{
PartitionPruneInfo *pruneinfo = lfirst(lc);
+ Bitmapset *leafpart_rtis = NULL;
ListCell *l;
foreach(l, pruneinfo->prune_infos)
@@ -362,15 +373,50 @@ set_plan_references(PlannerInfo *root, Plan *plan)
foreach(l2, prune_infos)
{
PartitionedRelPruneInfo *pinfo = lfirst(l2);
+ int i;
/* RT index of the table to which the pinfo belongs. */
pinfo->rtindex += rtoffset;
+
+ /* Likewise for the RT indexes of leaf partitions that might be scanned. */
+ for (i = 0; i < pinfo->nparts; i++)
+ {
+ if (pinfo->rti_map[i] > 0 && pinfo->subplan_map[i] >= 0)
+ {
+ pinfo->rti_map[i] += rtoffset;
+ leafpart_rtis = bms_add_member(leafpart_rtis,
+ pinfo->rti_map[i]);
+ }
+ }
}
}
+ if (pruneinfo->needs_init_pruning)
+ {
+ glob->containsInitialPruning = true;
+
+ /*
+ * Delete the leaf partition RTIs from the global set of relations
+ * to be locked before executing the plan. AcquireExecutorLocks()
+ * will find the ones to add to the set after performing initial
+ * pruning.
+ */
+ glob->minLockRelids = bms_del_members(glob->minLockRelids,
+ leafpart_rtis);
+ }
+
glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
}
+ /*
+ * It seems worth doing a bms_copy() on glob->minLockRelids if we deleted
+ * bits from it above to get rid of any empty tail bits. It seems better
+ * for the loop over this set in AcquireExecutorLocks() to not have to go
+ * through those useless bit words.
+ */
+ if (glob->containsInitialPruning)
+ glob->minLockRelids = bms_copy(glob->minLockRelids);
+
return result;
}
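
Reader's note on the setrefs.c hunk above: the minLockRelids bookkeeping can be
sketched in isolation as below. This is only an illustration under stated
assumptions, not code from the patch. A 64-bit mask stands in for Bitmapset,
all toy_* names are hypothetical, and RT indexes are assumed to fit in 1..63.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for Bitmapset: bit i set means RT index i is a member. */
typedef uint64_t toy_relids;

/* Add every RT index in [lo, hi] to the set, in the spirit of bms_add_range(). */
static toy_relids
toy_add_range(toy_relids set, int lo, int hi)
{
    for (int i = lo; i <= hi; i++)
        set |= (UINT64_C(1) << i);
    return set;
}

/* Remove all members of "del" from "set", in the spirit of bms_del_members(). */
static toy_relids
toy_del_members(toy_relids set, toy_relids del)
{
    return set & ~del;
}

/*
 * Hypothetical helper: start from the query's whole (offset-adjusted) RT index
 * range, then drop the leaf partitions of a prunable node, but only when that
 * node has initial pruning steps.  leaf_rtis[] holds unadjusted RT indexes.
 */
static toy_relids
toy_min_lock_relids(int rtoffset, int nrtes,
                    const int *leaf_rtis, int nleaf,
                    int needs_init_pruning)
{
    toy_relids min_lock = toy_add_range(0, rtoffset + 1, rtoffset + nrtes);
    toy_relids leafparts = 0;

    for (int i = 0; i < nleaf; i++)
        leafparts |= (UINT64_C(1) << (leaf_rtis[i] + rtoffset));

    if (needs_init_pruning)
        min_lock = toy_del_members(min_lock, leafparts);

    return min_lock;
}
```

With one partitioned parent at RT index 1 and leaves at 2..4, only the parent
stays in the minimal set when initial pruning is possible; otherwise all four
entries stay in it.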
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 6565b6ed01..37f3e6af61 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -144,7 +144,9 @@ static List *make_partitionedrel_pruneinfo(PlannerInfo *root,
List *prunequal,
Bitmapset *partrelids,
int *relid_subplan_map,
- Bitmapset **matchedsubplans);
+ Bitmapset **matchedsubplans,
+ bool *needs_init_pruning,
+ bool *needs_exec_pruning);
static void gen_partprune_steps(RelOptInfo *rel, List *clauses,
PartClauseTarget target,
GeneratePruningStepsContext *context);
@@ -234,6 +236,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int *relid_subplan_map;
ListCell *lc;
int i;
+ bool needs_init_pruning = false;
+ bool needs_exec_pruning = false;
/*
* Scan the subpaths to see which ones are scans of partition child
@@ -313,12 +317,16 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
Bitmapset *partrelids = (Bitmapset *) lfirst(lc);
List *pinfolist;
Bitmapset *matchedsubplans = NULL;
+ bool partrel_needs_init_pruning;
+ bool partrel_needs_exec_pruning;
pinfolist = make_partitionedrel_pruneinfo(root, parentrel,
prunequal,
partrelids,
relid_subplan_map,
- &matchedsubplans);
+ &matchedsubplans,
+ &partrel_needs_init_pruning,
+ &partrel_needs_exec_pruning);
/* When pruning is possible, record the matched subplans */
if (pinfolist != NIL)
@@ -327,6 +335,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
allmatchedsubplans = bms_join(matchedsubplans,
allmatchedsubplans);
}
+
+ needs_init_pruning |= partrel_needs_init_pruning;
+ needs_exec_pruning |= partrel_needs_exec_pruning;
}
pfree(relid_subplan_map);
@@ -341,6 +352,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
/* Else build the result data structure */
pruneinfo = makeNode(PartitionPruneInfo);
pruneinfo->prune_infos = prunerelinfos;
+ pruneinfo->needs_init_pruning = needs_init_pruning;
+ pruneinfo->needs_exec_pruning = needs_exec_pruning;
/*
* Some subplans may not belong to any of the identified partitioned rels.
@@ -441,13 +454,18 @@ add_part_relids(List *allpartrelids, Bitmapset *partrelids)
* If we cannot find any useful run-time pruning steps, return NIL.
* However, on success, each rel identified in partrelids will have
* an element in the result list, even if some of them are useless.
+ * *needs_init_pruning and *needs_exec_pruning are set to indicate that the
+ * returned PartitionedRelPruneInfos contain pruning steps that can be
+ * performed before and after execution begins, respectively.
*/
static List *
make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
List *prunequal,
Bitmapset *partrelids,
int *relid_subplan_map,
- Bitmapset **matchedsubplans)
+ Bitmapset **matchedsubplans,
+ bool *needs_init_pruning,
+ bool *needs_exec_pruning)
{
RelOptInfo *targetpart = NULL;
List *pinfolist = NIL;
@@ -458,6 +476,10 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int rti;
int i;
+ /* Will find out below. */
+ *needs_init_pruning = false;
+ *needs_exec_pruning = false;
+
/*
* Examine each partitioned rel, constructing a temporary array to map
* from planner relids to index of the partitioned rel, and building a
@@ -545,6 +567,9 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
* executor per-scan pruning steps. This first pass creates startup
* pruning steps and detects whether there's any possibly-useful quals
* that would require per-scan pruning.
+ *
+ * In the first pass, we note whether the 2nd pass is necessary by
+ * noting the presence of EXEC parameters.
*/
gen_partprune_steps(subpart, partprunequal, PARTTARGET_INITIAL,
&context);
@@ -619,6 +644,12 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
pinfo->execparamids = execparamids;
/* Remaining fields will be filled in the next loop */
+ /* record which types of pruning steps we've seen so far */
+ if (initial_pruning_steps != NIL)
+ *needs_init_pruning = true;
+ if (exec_pruning_steps != NIL)
+ *needs_exec_pruning = true;
+
pinfolist = lappend(pinfolist, pinfo);
}
@@ -646,6 +677,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int *subplan_map;
int *subpart_map;
Oid *relid_map;
+ Index *rti_map;
/*
* Construct the subplan and subpart maps for this partitioning level.
@@ -658,6 +690,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
subpart_map = (int *) palloc(nparts * sizeof(int));
memset(subpart_map, -1, nparts * sizeof(int));
relid_map = (Oid *) palloc0(nparts * sizeof(Oid));
+ rti_map = (Index *) palloc0(nparts * sizeof(Index));
present_parts = NULL;
i = -1;
@@ -672,6 +705,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
subplan_map[i] = subplanidx = relid_subplan_map[partrel->relid] - 1;
subpart_map[i] = subpartidx = relid_subpart_map[partrel->relid] - 1;
relid_map[i] = planner_rt_fetch(partrel->relid, root)->relid;
+ rti_map[i] = partrel->relid;
if (subplanidx >= 0)
{
present_parts = bms_add_member(present_parts, i);
@@ -696,6 +730,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
pinfo->subplan_map = subplan_map;
pinfo->subpart_map = subpart_map;
pinfo->relid_map = relid_map;
+ pinfo->rti_map = rti_map;
}
pfree(relid_subpart_map);
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index a9a1851c94..a1be8179e8 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1598,6 +1598,7 @@ exec_bind_message(StringInfo input_message)
int16 *rformats = NULL;
CachedPlanSource *psrc;
CachedPlan *cplan;
+ List *part_prune_results_list;
Portal portal;
char *query_string;
char *saved_stmt_name;
@@ -1972,7 +1973,9 @@ exec_bind_message(StringInfo input_message)
* will be generated in MessageContext. The plan refcount will be
* assigned to the Portal, so it will be released at portal destruction.
*/
- cplan = GetCachedPlan(psrc, params, NULL, NULL);
+ cplan = GetCachedPlan(psrc, params, NULL, NULL, &part_prune_results_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_results_list));
/*
* Now we can define the portal.
@@ -1987,6 +1990,9 @@ exec_bind_message(StringInfo input_message)
cplan->stmt_list,
cplan);
+ /* Copy Lists of PartitionPruneResults into the portal's context. */
+ PortalStorePartitionPruneResults(portal, part_prune_results_list);
+
/* Done with the snapshot used for parameter I/O and parsing/planning */
if (snapshot_set)
PopActiveSnapshot();
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 5aa5a350f3..226ee81b63 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -35,7 +35,7 @@
Portal ActivePortal = NULL;
-static void ProcessQuery(PlannedStmt *plan,
+static void ProcessQuery(PlannedStmt *plan, List *part_prune_results,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -65,6 +65,7 @@ static void DoPortalRewind(Portal portal);
*/
QueryDesc *
CreateQueryDesc(PlannedStmt *plannedstmt,
+ List *part_prune_results,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
@@ -77,6 +78,8 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
qd->operation = plannedstmt->commandType; /* operation */
qd->plannedstmt = plannedstmt; /* plan */
+ qd->part_prune_results = part_prune_results; /* ExecutorDoInitialPruning()
+ * output for plan */
qd->sourceText = sourceText; /* query text */
qd->snapshot = RegisterSnapshot(snapshot); /* snapshot */
/* RI check snapshot */
@@ -122,6 +125,7 @@ FreeQueryDesc(QueryDesc *qdesc)
* PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
*
* plan: the plan tree for the query
+ * part_prune_results: ExecutorDoInitialPruning() output for the PlannedStmt
* sourceText: the source text of the query
* params: any parameters needed
* dest: where to send results
@@ -134,6 +138,7 @@ FreeQueryDesc(QueryDesc *qdesc)
*/
static void
ProcessQuery(PlannedStmt *plan,
+ List *part_prune_results,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -145,7 +150,7 @@ ProcessQuery(PlannedStmt *plan,
/*
* Create the QueryDesc object
*/
- queryDesc = CreateQueryDesc(plan, sourceText,
+ queryDesc = CreateQueryDesc(plan, part_prune_results, sourceText,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
@@ -491,8 +496,13 @@ PortalStart(Portal portal, ParamListInfo params,
/*
* Create QueryDesc in portal's context; for the moment, set
* the destination to DestNone.
+ *
+ * There is no PartitionPruneResult unless the PlannedStmt is
+ * from a CachedPlan.
*/
queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+ portal->part_prune_results_list == NIL ? NIL :
+ linitial(portal->part_prune_results_list),
portal->sourceText,
GetActiveSnapshot(),
InvalidSnapshot,
@@ -1225,6 +1235,8 @@ PortalRunMulti(Portal portal,
if (pstmt->utilityStmt == NULL)
{
+ List *part_prune_results = NIL;
+
/*
* process a plannable query.
*/
@@ -1271,10 +1283,18 @@ PortalRunMulti(Portal portal,
else
UpdateActiveSnapshotCommandId();
+ /*
+ * Determine if there's a corresponding List of PartitionPruneResult
+ * for this PlannedStmt.
+ */
+ if (portal->part_prune_results_list != NIL)
+ part_prune_results = list_nth(portal->part_prune_results_list,
+ foreach_current_index(stmtlist_item));
+
if (pstmt->canSetTag)
{
/* statement can set tag string */
- ProcessQuery(pstmt,
+ ProcessQuery(pstmt, part_prune_results,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
@@ -1283,7 +1303,7 @@ PortalRunMulti(Portal portal,
else
{
/* stmt added by rewrite cannot set tag */
- ProcessQuery(pstmt,
+ ProcessQuery(pstmt, part_prune_results,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 0d6a295674..957221c47e 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -99,14 +99,19 @@ static dlist_head cached_expression_list = DLIST_STATIC_INIT(cached_expression_l
static void ReleaseGenericPlan(CachedPlanSource *plansource);
static List *RevalidateCachedQuery(CachedPlanSource *plansource,
QueryEnvironment *queryEnv);
-static bool CheckCachedPlan(CachedPlanSource *plansource);
+static bool CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+ List **part_prune_results_list);
static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
- ParamListInfo boundParams, QueryEnvironment *queryEnv);
+ ParamListInfo boundParams, QueryEnvironment *queryEnv,
+ List **part_prune_results_list);
static bool choose_custom_plan(CachedPlanSource *plansource,
ParamListInfo boundParams);
static double cached_plan_cost(CachedPlan *plan, bool include_planner);
static Query *QueryListGetPrimaryStmt(List *stmts);
-static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static void AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+ List **part_prune_results_list,
+ List **lockedRelids_per_stmt);
+static void ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt);
static void AcquirePlannerLocks(List *stmt_list, bool acquire);
static void ScanQueryForLocks(Query *parsetree, bool acquire);
static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -782,6 +787,26 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
return tlist;
}
+/*
+ * FreePartitionPruneResults
+ * Frees the List of Lists of PartitionPruneResults for CheckCachedPlan()
+ */
+static void
+FreePartitionPruneResults(List *part_prune_results_list)
+{
+ ListCell *lc;
+
+ foreach(lc, part_prune_results_list)
+ {
+ List *part_prune_results = lfirst(lc);
+
+ /* Free both the PartitionPruneResults and the containing List. */
+ list_free_deep(part_prune_results);
+ }
+
+ list_free(part_prune_results_list);
+}
+
/*
* CheckCachedPlan: see if the CachedPlanSource's generic plan is valid.
*
@@ -790,15 +815,20 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
*
* On a "true" return, we have acquired the locks needed to run the plan.
* (We must do this for the "true" result to be race-condition-free.)
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_results_list.
*/
static bool
-CheckCachedPlan(CachedPlanSource *plansource)
+CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+ List **part_prune_results_list)
{
CachedPlan *plan = plansource->gplan;
/* Assert that caller checked the querytree */
Assert(plansource->is_valid);
+ *part_prune_results_list = NIL;
+
/* If there's no generic plan, just say "false" */
if (!plan)
return false;
@@ -820,13 +850,21 @@ CheckCachedPlan(CachedPlanSource *plansource)
*/
if (plan->is_valid)
{
+ List *lockedRelids_per_stmt;
+
/*
* Plan must have positive refcount because it is referenced by
* plansource; so no need to fear it disappears under us here.
*/
Assert(plan->refcount > 0);
- AcquireExecutorLocks(plan->stmt_list, true);
+ /*
+ * Lock relations scanned by the plan. This is where the pruning
+ * happens if needed.
+ */
+ AcquireExecutorLocks(plan->stmt_list, boundParams,
+ part_prune_results_list,
+ &lockedRelids_per_stmt);
/*
* If plan was transient, check to see if TransactionXmin has
@@ -848,7 +886,11 @@ CheckCachedPlan(CachedPlanSource *plansource)
}
/* Oops, the race case happened. Release useless locks. */
- AcquireExecutorLocks(plan->stmt_list, false);
+ ReleaseExecutorLocks(plan->stmt_list, lockedRelids_per_stmt);
+
+ /* Release any PartitionPruneResults that may have been created. */
+ FreePartitionPruneResults(*part_prune_results_list);
+ *part_prune_results_list = NIL;
}
/*
@@ -874,10 +916,14 @@ CheckCachedPlan(CachedPlanSource *plansource)
* Planning work is done in the caller's memory context. The finished plan
* is in a child memory context, which typically should get reparented
* (unless this is a one-shot plan, in which case we don't copy the plan).
+ *
+ * A list of NILs is returned in *part_prune_results_list, meaning that
+ * no partition pruning has been done yet for the plans in stmt_list.
*/
static CachedPlan *
BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
- ParamListInfo boundParams, QueryEnvironment *queryEnv)
+ ParamListInfo boundParams, QueryEnvironment *queryEnv,
+ List **part_prune_results_list)
{
CachedPlan *plan;
List *plist;
@@ -1007,6 +1053,17 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
MemoryContextSwitchTo(oldcxt);
+ /*
+ * No actual PartitionPruneResults yet to add, though must initialize
+ * the list to have the same number of elements as the list of
+ * PlannedStmts.
+ */
+ *part_prune_results_list = NIL;
+ foreach(lc, plist)
+ {
+ *part_prune_results_list = lappend(*part_prune_results_list, NIL);
+ }
+
return plan;
}
@@ -1126,6 +1183,19 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
* plan or a custom plan for the given parameters: the caller does not know
* which it will get.
*
+ * For every PlannedStmt found in the returned CachedPlan, an element that
+ * is either a List of PartitionPruneResult or a NIL is added to
+ * *part_prune_results_list. The former if the PlannedStmt is from
+ * the existing CachedPlan that is otherwise valid and has
+ * containsInitialPruning set to true. Before returning such a CachedPlan,
+ * those "initial" steps are performed by calling ExecutorDoInitialPruning()
+ * to determine only those leaf partitions that need to be locked by
+ * AcquireExecutorLocks() by pruning away subplans that don't match the
+ * "initial" pruning conditions. For each PartitionPruneInfo found in
+ * PlannedStmt.partPruneInfos, a PartitionPruneResult containing the bitmapset
+ * of the indexes of surviving subplans is added to the List for the
+ * PlannedStmt.
+ *
* On return, the plan is valid and we have sufficient locks to begin
* execution.
*
@@ -1139,11 +1209,13 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
*/
CachedPlan *
GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
- ResourceOwner owner, QueryEnvironment *queryEnv)
+ ResourceOwner owner, QueryEnvironment *queryEnv,
+ List **part_prune_results_list)
{
CachedPlan *plan = NULL;
List *qlist;
bool customplan;
+ List *my_part_prune_results_list;
/* Assert caller is doing things in a sane order */
Assert(plansource->magic == CACHEDPLANSOURCE_MAGIC);
@@ -1160,7 +1232,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
if (!customplan)
{
- if (CheckCachedPlan(plansource))
+ if (CheckCachedPlan(plansource, boundParams,
+ &my_part_prune_results_list))
{
/* We want a generic plan, and we already have a valid one */
plan = plansource->gplan;
@@ -1169,7 +1242,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
else
{
/* Build a new generic plan */
- plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv);
+ plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv,
+ &my_part_prune_results_list);
/* Just make real sure plansource->gplan is clear */
ReleaseGenericPlan(plansource);
/* Link the new generic plan into the plansource */
@@ -1214,7 +1288,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
if (customplan)
{
/* Build a custom plan */
- plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv);
+ plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv,
+ &my_part_prune_results_list);
/* Accumulate total costs of custom plans */
plansource->total_custom_cost += cached_plan_cost(plan, true);
@@ -1246,6 +1321,9 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
plan->is_saved = true;
}
+ if (part_prune_results_list)
+ *part_prune_results_list = my_part_prune_results_list;
+
return plan;
}
@@ -1737,17 +1815,29 @@ QueryListGetPrimaryStmt(List *stmts)
/*
* AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
- * or release them if acquire is false.
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_results_list.
+ *
+ * On return, *lockedRelids_per_stmt will contain a bitmapset for every
+ * PlannedStmt in stmt_list, containing the RT indexes of relation entries
+ * in its range table that were actually locked, or NULL if the PlannedStmt
+ * contains a utility statement.
*/
static void
-AcquireExecutorLocks(List *stmt_list, bool acquire)
+AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+ List **part_prune_results_list,
+ List **lockedRelids_per_stmt)
{
ListCell *lc1;
+ *part_prune_results_list = *lockedRelids_per_stmt = NIL;
foreach(lc1, stmt_list)
{
PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
- ListCell *lc2;
+ List *part_prune_results = NIL;
+ Bitmapset *allLockRelids;
+ Bitmapset *lockedRelids = NULL;
+ int rti;
if (plannedstmt->commandType == CMD_UTILITY)
{
@@ -1761,13 +1851,40 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
Query *query = UtilityContainsQuery(plannedstmt->utilityStmt);
if (query)
- ScanQueryForLocks(query, acquire);
+ ScanQueryForLocks(query, true);
+ *part_prune_results_list = lappend(*part_prune_results_list, NIL);
continue;
}
- foreach(lc2, plannedstmt->rtable)
+ /*
+ * Figure out the set of relations that would need to be locked
+ * before executing the plan.
+ */
+ if (plannedstmt->containsInitialPruning)
{
- RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
+ Bitmapset *scan_leafpart_rtis = NULL;
+
+ /*
+ * Obtain the set of leaf partitions to be locked.
+ *
+ * The following does initial partition pruning using the
+ * PartitionPruneInfos found in plannedstmt->partPruneInfos and
+ * finds leaf partitions that survive that pruning across all the
+ * nodes in the plan tree.
+ */
+ part_prune_results = ExecutorDoInitialPruning(plannedstmt,
+ boundParams,
+ &scan_leafpart_rtis);
+ allLockRelids = bms_union(plannedstmt->minLockRelids,
+ scan_leafpart_rtis);
+ }
+ else
+ allLockRelids = plannedstmt->minLockRelids;
+
+ rti = -1;
+ while ((rti = bms_next_member(allLockRelids, rti)) > 0)
+ {
+ RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
if (rte->rtekind != RTE_RELATION)
continue;
@@ -1778,10 +1895,59 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
* fail if it's been dropped entirely --- we'll just transiently
* acquire a non-conflicting lock.
*/
- if (acquire)
- LockRelationOid(rte->relid, rte->rellockmode);
- else
- UnlockRelationOid(rte->relid, rte->rellockmode);
+ LockRelationOid(rte->relid, rte->rellockmode);
+ lockedRelids = bms_add_member(lockedRelids, rti);
+ }
+
+ *part_prune_results_list = lappend(*part_prune_results_list,
+ part_prune_results);
+ *lockedRelids_per_stmt = lappend(*lockedRelids_per_stmt, lockedRelids);
+ }
+}
+
+/*
+ * ReleaseExecutorLocks
+ * Release the locks that were acquired by an earlier call to
+ * AcquireExecutorLocks()
+ */
+static void
+ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt)
+{
+ ListCell *lc1,
+ *lc2;
+
+ forboth(lc1, stmt_list, lc2, lockedRelids_per_stmt)
+ {
+ PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
+ Bitmapset *lockedRelids = lfirst(lc2);
+ int rti;
+
+ if (plannedstmt->commandType == CMD_UTILITY)
+ {
+ /*
+ * Ignore utility statements, except those (such as EXPLAIN) that
+ * contain a parsed-but-not-planned query. Note: it's okay to use
+ * ScanQueryForLocks, even though the query hasn't been through
+ * rule rewriting, because rewriting doesn't change the query
+ * representation.
+ */
+ Query *query = UtilityContainsQuery(plannedstmt->utilityStmt);
+
+ Assert(lockedRelids == NULL);
+ if (query)
+ ScanQueryForLocks(query, false);
+ continue;
+ }
+
+ rti = -1;
+ while ((rti = bms_next_member(lockedRelids, rti)) >= 0)
+ {
+ RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
+
+ Assert(rte->rtekind == RTE_RELATION);
+
+ /* See the comment in AcquireExecutorLocks(). */
+ UnlockRelationOid(rte->relid, rte->rellockmode);
}
}
}
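
Reader's note on the plancache.c hunks above: the point of returning
lockedRelids_per_stmt is that the release path must unlock exactly what the
acquire path locked, because the set now depends on pruning results. A toy
model of that contract follows. It is a sketch only: the lock table, the
64-bit masks and every toy_* name are assumptions made for illustration, and
nothing here is PostgreSQL API.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MAX_RTI 63

/* Toy stand-in for Bitmapset: bit i set means RT index i is a member. */
typedef uint64_t toy_relids;

/* Hypothetical lock table: hold count per RT index. */
typedef struct ToyLockTable
{
    int held[TOY_MAX_RTI + 1];
} ToyLockTable;

/*
 * Lock the union of the always-locked set and the leaf partitions that
 * survived initial pruning; return the set that was actually locked.
 */
static toy_relids
toy_acquire_locks(ToyLockTable *locks, toy_relids min_lock,
                  toy_relids surviving_leafparts)
{
    toy_relids all = min_lock | surviving_leafparts;
    toy_relids locked = 0;

    for (int rti = 1; rti <= TOY_MAX_RTI; rti++)
    {
        if (all & (UINT64_C(1) << rti))
        {
            locks->held[rti]++;
            locked |= (UINT64_C(1) << rti);
        }
    }
    return locked;
}

/* Release exactly the locks recorded by toy_acquire_locks(). */
static void
toy_release_locks(ToyLockTable *locks, toy_relids locked)
{
    for (int rti = 1; rti <= TOY_MAX_RTI; rti++)
    {
        if (locked & (UINT64_C(1) << rti))
            locks->held[rti]--;
    }
}

/* Total number of locks currently held in the toy table. */
static int
toy_total_held(const ToyLockTable *locks)
{
    int total = 0;

    for (int rti = 1; rti <= TOY_MAX_RTI; rti++)
        total += locks->held[rti];
    return total;
}

/* Demo helper: number of locks held right after acquiring. */
static int
toy_held_after_acquire(toy_relids min_lock, toy_relids surviving)
{
    ToyLockTable locks = {{0}};

    (void) toy_acquire_locks(&locks, min_lock, surviving);
    return toy_total_held(&locks);
}

/* Demo helper: number of locks held after an acquire/release round trip. */
static int
toy_held_after_roundtrip(toy_relids min_lock, toy_relids surviving)
{
    ToyLockTable locks = {{0}};
    toy_relids locked = toy_acquire_locks(&locks, min_lock, surviving);

    toy_release_locks(&locks, locked);
    return toy_total_held(&locks);
}
```

In the toy model, a parent at RT index 1 plus one surviving leaf at RT index 3
gives two locks after acquisition and none after the round trip, which is the
property the "race case" path in CheckCachedPlan() relies on.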
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index c3e95346b6..74950bd163 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -303,6 +303,25 @@ PortalDefineQuery(Portal portal,
portal->status = PORTAL_DEFINED;
}
+/*
+ * PortalStorePartitionPruneResults
+ * Copy the given List of Lists of PartitionPruneResults into the
+ * portal's context
+ *
+ * This allows the caller to ensure that the list exists as long as the portal
+ * does.
+ */
+void
+PortalStorePartitionPruneResults(Portal portal, List *part_prune_results_list)
+{
+ MemoryContext oldcxt;
+
+ AssertArg(PortalIsValid(portal));
+ oldcxt = MemoryContextSwitchTo(portal->portalContext);
+ portal->part_prune_results_list = copyObject(part_prune_results_list);
+ MemoryContextSwitchTo(oldcxt);
+}
+
/*
* PortalReleaseCachedPlan
* Release a portal's reference to its cached plan, if any.
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 9ebde089ae..269cc4d562 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -87,7 +87,9 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
ExplainState *es, const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv);
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt,
+ List *part_prune_results,
+ IntoClause *into,
ExplainState *es, const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv,
const instr_time *planduration,
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index bf962af7af..bd8776402e 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -45,6 +45,7 @@ extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
* nparts Length of subplan_map[] and subpart_map[].
* subplan_map Subplan index by partition index, or -1.
* subpart_map Subpart index by partition index, or -1.
+ * rti_map Range table index by partition index, or 0.
* present_parts A Bitmapset of the partition indexes that we
* have subplans or subparts for.
* initial_pruning_steps List of PartitionPruneSteps used to
@@ -61,6 +62,7 @@ typedef struct PartitionedRelPruningData
int nparts;
int *subplan_map;
int *subpart_map;
+ Index *rti_map;
Bitmapset *present_parts;
List *initial_pruning_steps;
List *exec_pruning_steps;
@@ -126,5 +128,10 @@ extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
int part_prune_index,
Bitmapset **initially_valid_subplans);
extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
- bool initial_prune);
+ bool initial_prune,
+ Bitmapset **scan_leafpart_rtis);
+extern Bitmapset *ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt,
+ ParamListInfo params,
+ PartitionPruneInfo *pruneinfo,
+ Bitmapset **scan_leafpart_rtis);
#endif /* EXECPARTITION_H */
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index e79e2c001f..7d4379da7b 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -35,6 +35,8 @@ typedef struct QueryDesc
/* These fields are provided by CreateQueryDesc */
CmdType operation; /* CMD_SELECT, CMD_UPDATE, etc. */
PlannedStmt *plannedstmt; /* planner's output (could be utility, too) */
+ List *part_prune_results; /* ExecutorDoInitialPruning()'s
+ * output for plannedstmt */
const char *sourceText; /* source text of the query */
Snapshot snapshot; /* snapshot to use for query */
Snapshot crosscheck_snapshot; /* crosscheck for RI update/delete */
@@ -57,6 +59,7 @@ typedef struct QueryDesc
/* in pquery.c */
extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
+ List *part_prune_results,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index ed95ed1176..c9a5e5fb68 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -185,6 +185,9 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
/*
* prototypes from functions in execMain.c
*/
+extern List *ExecutorDoInitialPruning(PlannedStmt *plannedstmt,
+ ParamListInfo params,
+ Bitmapset **scan_leafpart_rtis);
extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
extern void ExecutorRun(QueryDesc *queryDesc,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 4a741b053f..521a60b988 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -612,6 +612,7 @@ typedef struct EState
* ExecRowMarks, or NULL if none */
PlannedStmt *es_plannedstmt; /* link to top of plan tree */
List *es_part_prune_infos; /* PlannedStmt.partPruneInfos */
+ List *es_part_prune_results; /* QueryDesc.part_prune_results */
const char *es_sourceText; /* Source text from QueryDesc */
JunkFilter *es_junkFilter; /* top-level junk filter, if any */
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index a80f43e540..937cc4629d 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -212,6 +212,7 @@ extern struct Bitmapset *readBitmapset(void);
extern uintptr_t readDatum(bool typbyval);
extern bool *readBoolCols(int numCols);
extern int *readIntCols(int numCols);
+extern Index *readIndexCols(int numCols);
extern Oid *readOidCols(int numCols);
extern int16 *readAttrNumberCols(int numCols);
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index fbe75dca0f..354c2e96c3 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -125,6 +125,18 @@ typedef struct PlannerGlobal
/* List of PartitionPruneInfo contained in the plan */
List *partPruneInfos;
+ /*
+ * Do any of those PartitionPruneInfos have initial pruning steps in them?
+ */
+ bool containsInitialPruning;
+
+ /*
+ * Indexes of all range table entries minus indexes of range table entries
+ * of the leaf partitions scanned by prunable subplans; see
+ * AcquireExecutorLocks()
+ */
+ Bitmapset *minLockRelids;
+
/* OIDs of relations the plan depends on */
List *relationOids;
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 2e132afc5a..c0717bf45e 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -73,8 +73,17 @@ typedef struct PlannedStmt
List *partPruneInfos; /* List of PartitionPruneInfo contained in
* the plan */
+ bool containsInitialPruning; /* Do any of those PartitionPruneInfos
+ * have initial pruning steps in them?
+ */
+
List *rtable; /* list of RangeTblEntry nodes */
+ Bitmapset *minLockRelids; /* Indexes of all range table entries minus
+ * indexes of range table entries of the leaf
+ * partitions scanned by prunable subplans;
+ * see AcquireExecutorLocks() */
+
/* rtable indexes of target relations for INSERT/UPDATE/DELETE/MERGE */
List *resultRelations; /* integer list of RT indexes, or NIL */
@@ -1410,6 +1419,13 @@ typedef struct PlanRowMark
* prune_infos List of Lists containing PartitionedRelPruneInfo nodes,
* one sublist per run-time-prunable partition hierarchy
* appearing in the parent plan node's subplans.
+ *
+ * needs_init_pruning Do any of the PartitionedRelPruneInfos in
+ * prune_infos have initial_pruning_steps set?
+ *
+ * needs_exec_pruning Do any of the PartitionedRelPruneInfos in
+ * prune_infos have exec_pruning_steps set?
+ *
* other_subplans Indexes of any subplans that are not accounted for
* by any of the PartitionedRelPruneInfo nodes in
* "prune_infos". These subplans must not be pruned.
@@ -1420,6 +1436,8 @@ typedef struct PartitionPruneInfo
NodeTag type;
List *prune_infos;
+ bool needs_init_pruning;
+ bool needs_exec_pruning;
Bitmapset *other_subplans;
} PartitionPruneInfo;
@@ -1464,6 +1482,9 @@ typedef struct PartitionedRelPruneInfo
/* relation OID by partition index, or 0 */
Oid *relid_map pg_node_attr(array_size(nparts));
+ /* Range table index by partition index, or 0. */
+ Index *rti_map pg_node_attr(array_size(nparts));
+
/*
* initial_pruning_steps shows how to prune during executor startup (i.e.,
* without use of any PARAM_EXEC Params); it is NIL if no startup pruning
@@ -1548,6 +1569,31 @@ typedef struct PartitionPruneStepCombine
List *source_stepids;
} PartitionPruneStepCombine;
+/*----------------
+ * PartitionPruneResult
+ *
+ * The result of performing ExecPartitionDoInitialPruning() on a given
+ * PartitionPruneInfo.
+ *
+ * valid_subplan_offs contains the indexes of subplans remaining after
+ * performing initial pruning by calling ExecFindMatchingSubPlans() on the
+ * PartitionPruneInfo.
+ *
+ * This is used to store the result of initial partition pruning that is
+ * performed before the execution has started. A module that needs to do so
+ * should call ExecutorDoInitialPruning() on a given PlannedStmt, which
+ * returns a List of PartitionPruneResult containing an entry for each
+ * PartitionPruneInfo present in PlannedStmt.partPruneInfos. The module
+ * should then pass that list, along with the PlannedStmt, to the executor,
+ * so that it can reuse the result of initial partition pruning when
+ * initializing the subplans for execution.
+ */
+typedef struct PartitionPruneResult
+{
+ NodeTag type;
+
+ Bitmapset *valid_subplan_offs;
+} PartitionPruneResult;
/*
* Plan invalidation info
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 0499635f59..32579d4788 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -220,7 +220,8 @@ extern List *CachedPlanGetTargetList(CachedPlanSource *plansource,
extern CachedPlan *GetCachedPlan(CachedPlanSource *plansource,
ParamListInfo boundParams,
ResourceOwner owner,
- QueryEnvironment *queryEnv);
+ QueryEnvironment *queryEnv,
+ List **part_prune_results_list);
extern void ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner);
extern bool CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index aeddbdafe5..1901fc5f28 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -138,6 +138,7 @@ typedef struct PortalData
QueryCompletion qc; /* command completion data for executed query */
List *stmts; /* list of PlannedStmts */
CachedPlan *cplan; /* CachedPlan, if stmts are from one */
+ List *part_prune_results_list; /* List of Lists of PartitionPruneResults */
ParamListInfo portalParams; /* params to pass to query */
QueryEnvironment *queryEnv; /* environment for query */
@@ -242,6 +243,8 @@ extern void PortalDefineQuery(Portal portal,
CommandTag commandTag,
List *stmts,
CachedPlan *cplan);
+extern void PortalStorePartitionPruneResults(Portal portal,
+ List *part_prune_results_list);
extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
extern void PortalCreateHoldStore(Portal portal);
extern void PortalHashTableDeleteAll(void);
--
2.35.3
* Re: generic plans and "initial" pruning
@ 2022-11-08 06:22 Amit Langote <[email protected]>
parent: Amit Langote <[email protected]>
0 siblings, 1 reply; 71+ messages in thread
From: Amit Langote @ 2022-11-08 06:22 UTC (permalink / raw)
To: Robert Haas <[email protected]>; +Cc: Jacob Champion <[email protected]>; Zhihong Yu <[email protected]>; David Rowley <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers
On Thu, Oct 27, 2022 at 11:41 AM Amit Langote <[email protected]> wrote:
> On Mon, Oct 17, 2022 at 6:29 PM Amit Langote <[email protected]> wrote:
> > On Wed, Oct 12, 2022 at 4:36 PM Amit Langote <[email protected]> wrote:
> > > On Fri, Jul 29, 2022 at 1:20 PM Amit Langote <[email protected]> wrote:
> > > > On Thu, Jul 28, 2022 at 1:27 AM Robert Haas <[email protected]> wrote:
> > > > > 0001 adds es_part_prune_result but does not use it, so maybe the
> > > > > introduction of that field should be deferred until it's needed for
> > > > > something.
> > > >
> > > > Oops, looks like a mistake when breaking the patch. Will move that bit to 0002.
> > >
> > > Fixed that and also noticed that I had defined PartitionPruneResult in
> > > the wrong header (execnodes.h). That led to PartitionPruneResult
> > > nodes not being able to be written and read, because
> > > src/backend/nodes/gen_node_support.pl doesn't create _out* and _read*
> > > routines for the nodes defined in execnodes.h. I moved its definition
> > > to plannodes.h, even though it is not actually the planner that
> > > instantiates those; no other include/nodes header sounds better.
> > >
> > > One more thing I realized is that Bitmapsets added to the List
> > > PartitionPruneResult.valid_subplan_offs_list are not actually
> > > read/write-able. That's a problem that I also faced in [1], so I
> > > proposed a patch there to make Bitmapset a read/write-able Node and
> > > mark (only) the Bitmapsets that are added into read/write-able node
> > > trees with the corresponding NodeTag. I'm including that patch here
> > > as well (0002) for the main patch to work (pass
> > > -DWRITE_READ_PARSE_PLAN_TREES build tests), though it might make sense
> > > to discuss it in its own thread?
> >
> > Had second thoughts on the use of List of Bitmapsets for this, such
> > that the make-Bitmapset-Nodes patch is no longer needed.
> >
> > I had defined PartitionPruneResult such that it stood for the results
> > of pruning for all PartitionPruneInfos contained in
> > PlannedStmt.partPruneInfos (covering all Append/MergeAppend nodes that
> > can use partition pruning in a given plan). So, it had a List of
> > Bitmapset. I think it's perhaps better for PartitionPruneResult to
> > cover only one PartitionPruneInfo and thus need only a Bitmapset and
> > not a List thereof, which I have implemented in the attached updated
> > patch 0002. So, instead of needing to pass around a
> > PartitionPruneResult with each PlannedStmt, this now passes a List of
> > PartitionPruneResult with an entry for each in
> > PlannedStmt.partPruneInfos.
>
> Rebased over 3b2db22fe.
Updated 0002 to cope with AssertArg() being removed from the tree.
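To make the layout discussed above concrete, here is a minimal standalone sketch. It is not code from the patch: every Toy* and toy_* name is hypothetical, 64-bit masks stand in for Bitmapset, and a simple range check stands in for the real pruning steps. It only models the idea that there is one pruning result per PartitionPruneInfo, looked up by the same index, and that the RT indexes of the surviving leaf partitions are accumulated into a set shared across all of them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for PartitionPruneInfo (one per Append/MergeAppend). */
typedef struct ToyPruneInfo
{
	int			nsubplans;	/* number of child subplans, at most 64 */
	const int  *lower;		/* inclusive lower bound per subplan */
	const int  *upper;		/* exclusive upper bound per subplan */
	const int  *rti;		/* RT index of the leaf each subplan scans */
} ToyPruneInfo;

/* Hypothetical stand-in for PartitionPruneResult. */
typedef struct ToyPruneResult
{
	uint64_t	valid_subplan_offs; /* bit i set => subplan i survives */
} ToyPruneResult;

/*
 * Toy "initial pruning" for one prune info: keep the subplans whose range
 * contains the parameter value, and record the RT indexes of their leaf
 * partitions in the shared set.
 */
static ToyPruneResult
toy_do_initial_pruning(const ToyPruneInfo *info, int param,
					   uint64_t *scan_leafpart_rtis)
{
	ToyPruneResult result = {0};

	assert(info->nsubplans <= 64);
	for (int i = 0; i < info->nsubplans; i++)
	{
		if (param >= info->lower[i] && param < info->upper[i])
		{
			assert(info->rti[i] >= 0 && info->rti[i] < 64);
			result.valid_subplan_offs |= UINT64_C(1) << i;
			*scan_leafpart_rtis |= UINT64_C(1) << info->rti[i];
		}
	}
	return result;
}

/* Fill results[] with one entry per prune info, in the same order. */
static void
toy_executor_do_initial_pruning(const ToyPruneInfo *infos, size_t ninfos,
								int param, ToyPruneResult *results,
								uint64_t *scan_leafpart_rtis)
{
	for (size_t i = 0; i < ninfos; i++)
		results[i] = toy_do_initial_pruning(&infos[i], param,
											scan_leafpart_rtis);
}

/* Relations to lock: the always-locked set plus the surviving leaves. */
static uint64_t
toy_relids_to_lock(uint64_t min_lock_relids, uint64_t scan_leafpart_rtis)
{
	return min_lock_relids | scan_leafpart_rtis;
}

/* Executor side: reuse the stored result by index instead of re-pruning. */
static int
toy_subplan_is_valid(const ToyPruneResult *results, size_t part_prune_index,
					 int subplan_off)
{
	uint64_t	offs = results[part_prune_index].valid_subplan_offs;

	return (int) ((offs >> subplan_off) & 1);
}
```

A caller would run toy_executor_do_initial_pruning() once, lock whatever toy_relids_to_lock() returns, and then hand the results array to the code that initializes the subplans, so that both steps agree on which subplans are valid.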
--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com
Attachments:
[application/octet-stream] v24-0002-Optimize-AcquireExecutorLocks-by-locking-only-un.patch (82.3K, 2-v24-0002-Optimize-AcquireExecutorLocks-by-locking-only-un.patch)
From 8f6456d27efb8719a7dd8a52bf0ad3c5033b31a3 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Wed, 22 Dec 2021 16:55:17 +0900
Subject: [PATCH v24 2/2] Optimize AcquireExecutorLocks() by locking only
unpruned partitions
This commit teaches AcquireExecutorLocks() to perform initial
partition pruning to notionally eliminate the subnodes contained in a
generic cached plan that need not be initialized during the actual
execution of the plan and skip locking the partitions scanned by those
subnodes.
The result of performing initial partition pruning this way, before the
actual execution has started, is passed to the executor as a List of
PartitionPruneResult nodes, supplied along with the PlannedStmt by the
callers of the executor that used plancache.c to get the plan. The List is
NIL in the cases in which the plan is obtained by calling the planner
directly or if the plan obtained by plancache.c is not a generic one.
---
src/backend/commands/copyto.c | 2 +-
src/backend/commands/createas.c | 2 +-
src/backend/commands/explain.c | 7 +-
src/backend/commands/extension.c | 2 +-
src/backend/commands/matview.c | 2 +-
src/backend/commands/prepare.c | 26 ++-
src/backend/executor/README | 32 ++++
src/backend/executor/execMain.c | 51 ++++++
src/backend/executor/execParallel.c | 26 ++-
src/backend/executor/execPartition.c | 241 +++++++++++++++++++++----
src/backend/executor/execUtils.c | 1 +
src/backend/executor/functions.c | 2 +-
src/backend/executor/nodeAppend.c | 11 +-
src/backend/executor/nodeMergeAppend.c | 5 +-
src/backend/executor/spi.c | 27 ++-
src/backend/nodes/readfuncs.c | 8 +-
src/backend/optimizer/plan/planner.c | 2 +
src/backend/optimizer/plan/setrefs.c | 46 +++++
src/backend/partitioning/partprune.c | 41 ++++-
src/backend/tcop/postgres.c | 8 +-
src/backend/tcop/pquery.c | 28 ++-
src/backend/utils/cache/plancache.c | 208 ++++++++++++++++++---
src/backend/utils/mmgr/portalmem.c | 19 ++
src/include/commands/explain.h | 4 +-
src/include/executor/execPartition.h | 9 +-
src/include/executor/execdesc.h | 3 +
src/include/executor/executor.h | 3 +
src/include/nodes/execnodes.h | 1 +
src/include/nodes/nodes.h | 1 +
src/include/nodes/pathnodes.h | 12 ++
src/include/nodes/plannodes.h | 46 +++++
src/include/utils/plancache.h | 3 +-
src/include/utils/portal.h | 3 +
33 files changed, 782 insertions(+), 100 deletions(-)
diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index f26cc0d162..401a2280a3 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -558,7 +558,7 @@ BeginCopyTo(ParseState *pstate,
((DR_copy *) dest)->cstate = cstate;
/* Create a QueryDesc requesting no output */
- cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ cstate->queryDesc = CreateQueryDesc(plan, NIL, pstate->p_sourcetext,
GetActiveSnapshot(),
InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 152c29b551..942449544c 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -325,7 +325,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ queryDesc = CreateQueryDesc(plan, NIL, pstate->p_sourcetext,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index f86983c660..2f2b558608 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -407,7 +407,7 @@ ExplainOneQuery(Query *query, int cursorOptions,
}
/* run it (if needed) and produce output */
- ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+ ExplainOnePlan(plan, NIL, into, es, queryString, params, queryEnv,
&planduration, (es->buffers ? &bufusage : NULL));
}
}
@@ -515,7 +515,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
* to call it.
*/
void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, List *part_prune_results,
+ IntoClause *into, ExplainState *es,
const char *queryString, ParamListInfo params,
QueryEnvironment *queryEnv, const instr_time *planduration,
const BufferUsage *bufusage)
@@ -563,7 +564,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
dest = None_Receiver;
/* Create a QueryDesc for the query */
- queryDesc = CreateQueryDesc(plannedstmt, queryString,
+ queryDesc = CreateQueryDesc(plannedstmt, part_prune_results, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, instrument_option);
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index 1a62e5dac5..cc36b6fd15 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -776,7 +776,7 @@ execute_sql_string(const char *sql)
{
QueryDesc *qdesc;
- qdesc = CreateQueryDesc(stmt,
+ qdesc = CreateQueryDesc(stmt, NIL,
sql,
GetActiveSnapshot(), NULL,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 9ac0383459..65c8d0aa59 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -408,7 +408,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, queryString,
+ queryDesc = CreateQueryDesc(plan, NIL, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 9e29584d93..29b45539d3 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -155,6 +155,7 @@ ExecuteQuery(ParseState *pstate,
PreparedStatement *entry;
CachedPlan *cplan;
List *plan_list;
+ List *part_prune_results_list;
ParamListInfo paramLI = NULL;
EState *estate = NULL;
Portal portal;
@@ -193,7 +194,10 @@ ExecuteQuery(ParseState *pstate,
entry->plansource->query_string);
/* Replan if needed, and increment plan refcount for portal */
- cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL);
+ cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL,
+ &part_prune_results_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_results_list));
plan_list = cplan->stmt_list;
/*
@@ -207,6 +211,9 @@ ExecuteQuery(ParseState *pstate,
plan_list,
cplan);
+ /* Copy Lists of PartitionPruneResults into the portal's context. */
+ PortalStorePartitionPruneResults(portal, part_prune_results_list);
+
/*
* For CREATE TABLE ... AS EXECUTE, we must verify that the prepared
* statement is one that produces tuples. Currently we insist that it be
@@ -576,7 +583,9 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
const char *query_string;
CachedPlan *cplan;
List *plan_list;
- ListCell *p;
+ List *part_prune_results_list;
+ ListCell *p,
+ *pp;
ParamListInfo paramLI = NULL;
EState *estate = NULL;
instr_time planstart;
@@ -619,7 +628,10 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
/* Replan if needed, and acquire a transient refcount */
cplan = GetCachedPlan(entry->plansource, paramLI,
- CurrentResourceOwner, queryEnv);
+ CurrentResourceOwner, queryEnv,
+ &part_prune_results_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_results_list));
INSTR_TIME_SET_CURRENT(planduration);
INSTR_TIME_SUBTRACT(planduration, planstart);
@@ -634,13 +646,15 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
plan_list = cplan->stmt_list;
/* Explain each query */
- foreach(p, plan_list)
+ forboth(p, plan_list, pp, part_prune_results_list)
{
PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
+ List *part_prune_results = lfirst_node(List, pp);
if (pstmt->commandType != CMD_UTILITY)
- ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv,
- &planduration, (es->buffers ? &bufusage : NULL));
+ ExplainOnePlan(pstmt, part_prune_results, into, es, query_string,
+ paramLI, queryEnv, &planduration,
+ (es->buffers ? &bufusage : NULL));
else
ExplainOneUtility(pstmt->utilityStmt, into, es, query_string,
paramLI, queryEnv);
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 0b5183fc4a..f14f9197b5 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -65,6 +65,34 @@ found there. This currently only occurs for Append and MergeAppend nodes. In
this case the non-required subplans are ignored and the executor state's
subnode array will become out of sequence to the plan's subplan list.
+Actually, the so-called execution time pruning may occur even before the
+execution has started. One case where that occurs is when a cached generic
+plan is being validated for execution by plancache.c: GetCachedPlan(), which
+works by locking all the relations that will be scanned by that plan. If the
+generic plan contains nodes that can perform execution time partition pruning
+(that is, contain a PartitionPruneInfo), a subset of pruning steps contained
+in a given node's PartitionPruneInfo that do not depend on the execution
+actually having started (called "initial" pruning steps) are performed to
+figure out the minimal set of child subplans that satisfy those pruning steps.
+AcquireExecutorLocks() looking at a given generic plan will then lock only the
+relations scanned by the child subplans that survived such pruning, along with
+those present in PlannedStmt.minLockRelids. Note that the subplans are only
+notionally pruned, that is, they are not removed from the plan tree as such.
+
+To prevent the executor and any third party execution code that can look at
+the plan tree from trying to execute the subplans that were pruned as
+described above, the result of pruning is passed to the executor as a List
+of PartitionPruneResult nodes via the QueryDesc. Each PartitionPruneResult
+consists of the set of indexes of surviving subplans in the respective parent
+plan node's (the one to which the corresponding PartitionPruneInfo belongs)
+list of child subplans, saved as a bitmapset (valid_subplan_offs). In other
+words, the executor executing a generic plan should not re-evaluate the set of
+initially valid subplans for a given plan node by redoing the initial pruning
+if it was already done by AcquireExecutorLocks() when validating the plan.
+Such re-evaluation of the pruning steps may very well end up resulting in a
+different set of subplans, containing some whose relations were not locked by
+AcquireExecutorLocks().
+
Each Plan node may have expression trees associated with it, to represent
its target list, qualification conditions, etc. These trees are also
read-only to the executor, but the executor state for expression evaluation
@@ -286,6 +314,10 @@ Query Processing Control Flow
This is a sketch of control flow for full query processing:
+ [ ExecutorDoInitialPruning ] --- an optional step to perform initial
+ partition pruning on the plan tree, the result of which is passed
+ to the executor via QueryDesc
+
CreateQueryDesc
ExecutorStart
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 32475e33ff..b59474841f 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -49,6 +49,7 @@
#include "commands/matview.h"
#include "commands/trigger.h"
#include "executor/execdebug.h"
+#include "executor/execPartition.h"
#include "executor/nodeSubplan.h"
#include "foreign/fdwapi.h"
#include "jit/jit.h"
@@ -104,6 +105,54 @@ static void EvalPlanQualStart(EPQState *epqstate, Plan *planTree);
/* end of local decls */
+/* ----------------------------------------------------------------
+ * ExecutorDoInitialPruning
+ *
+ * For each plan tree node that has been assigned a PartitionPruneInfo,
+ * this performs initial partition pruning using the information contained
+ * therein to determine the set of child subplans that satisfy the initial
+ * pruning steps, to be returned as a bitmapset of their indexes in the
+ * node's list of child subplans (for example, an Append's appendplans).
+ *
+ * Return value is a List of PartitionPruneResult nodes, one for every
+ * PartitionPruneInfo, each containing one such bitmapset. The RT indexes of
+ * the leaf partitions scanned by the chosen subplans are added to
+ * *scan_leafpart_rtis, which is shared across all PartitionPruneInfos.
+ *
+ * The executor must see exactly the same set of subplans as valid for
+ * execution when doing ExecInitNode() on the plan nodes whose
+ * PartitionPruneInfos are processed here. So, it must get the set from the
+ * corresponding PartitionPruneResult, instead of computing it all over
+ * again by redoing the initial pruning. It's the caller's job to pass the
+ * List of PartitionPruneResults to the executor.
+ *
+ * Note: Partitioned tables mentioned in PartitionedRelPruneInfo nodes that
+ * drive the pruning will be locked before doing the pruning.
+ * ----------------------------------------------------------------
+ */
+List *
+ExecutorDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
+ Bitmapset **scan_leafpart_rtis)
+{
+ List *part_prune_results = NIL;
+ ListCell *lc;
+
+ /* Only get here if there is any pruning to do. */
+ Assert(plannedstmt->containsInitialPruning);
+
+ foreach(lc, plannedstmt->partPruneInfos)
+ {
+ PartitionPruneInfo *pruneinfo = lfirst(lc);
+ PartitionPruneResult *pruneresult = makeNode(PartitionPruneResult);
+
+ pruneresult->valid_subplan_offs =
+ ExecPartitionDoInitialPruning(plannedstmt, params, pruneinfo,
+ scan_leafpart_rtis);
+ part_prune_results = lappend(part_prune_results, pruneresult);
+ }
+
+ return part_prune_results;
+}
/* ----------------------------------------------------------------
* ExecutorStart
@@ -806,6 +855,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
{
CmdType operation = queryDesc->operation;
PlannedStmt *plannedstmt = queryDesc->plannedstmt;
+ List *part_prune_results = queryDesc->part_prune_results;
Plan *plan = plannedstmt->planTree;
List *rangeTable = plannedstmt->rtable;
EState *estate = queryDesc->estate;
@@ -826,6 +876,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
estate->es_plannedstmt = plannedstmt;
estate->es_part_prune_infos = plannedstmt->partPruneInfos;
+ estate->es_part_prune_results = part_prune_results;
/*
* Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index aca0c6f323..917079a034 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -66,6 +66,7 @@
#define PARALLEL_KEY_QUERY_TEXT UINT64CONST(0xE000000000000008)
#define PARALLEL_KEY_JIT_INSTRUMENTATION UINT64CONST(0xE000000000000009)
#define PARALLEL_KEY_WAL_USAGE UINT64CONST(0xE00000000000000A)
+#define PARALLEL_KEY_PARTITION_PRUNE_RESULTS UINT64CONST(0xE00000000000000B)
#define PARALLEL_TUPLE_QUEUE_SIZE 65536
@@ -182,6 +183,7 @@ ExecSerializePlan(Plan *plan, EState *estate)
pstmt->transientPlan = false;
pstmt->dependsOnRole = false;
pstmt->parallelModeNeeded = false;
+ pstmt->containsInitialPruning = false;
pstmt->planTree = plan;
pstmt->partPruneInfos = estate->es_part_prune_infos;
pstmt->rtable = estate->es_range_table;
@@ -597,12 +599,15 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
FixedParallelExecutorState *fpes;
char *pstmt_data;
char *pstmt_space;
+ char *part_prune_results_data;
+ char *part_prune_results_space;
char *paramlistinfo_space;
BufferUsage *bufusage_space;
WalUsage *walusage_space;
SharedExecutorInstrumentation *instrumentation = NULL;
SharedJitInstrumentation *jit_instrumentation = NULL;
int pstmt_len;
+ int part_prune_results_len;
int paramlistinfo_len;
int instrumentation_len = 0;
int jit_instrumentation_len = 0;
@@ -631,6 +636,7 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
/* Fix up and serialize plan to be sent to workers. */
pstmt_data = ExecSerializePlan(planstate->plan, estate);
+ part_prune_results_data = nodeToString(estate->es_part_prune_results);
/* Create a parallel context. */
pcxt = CreateParallelContext("postgres", "ParallelQueryMain", nworkers);
@@ -657,6 +663,11 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
shm_toc_estimate_chunk(&pcxt->estimator, pstmt_len);
shm_toc_estimate_keys(&pcxt->estimator, 1);
+ /* Estimate space for serialized List of PartitionPruneResult. */
+ part_prune_results_len = strlen(part_prune_results_data) + 1;
+ shm_toc_estimate_chunk(&pcxt->estimator, part_prune_results_len);
+ shm_toc_estimate_keys(&pcxt->estimator, 1);
+
/* Estimate space for serialized ParamListInfo. */
paramlistinfo_len = EstimateParamListSpace(estate->es_param_list_info);
shm_toc_estimate_chunk(&pcxt->estimator, paramlistinfo_len);
@@ -751,6 +762,12 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
memcpy(pstmt_space, pstmt_data, pstmt_len);
shm_toc_insert(pcxt->toc, PARALLEL_KEY_PLANNEDSTMT, pstmt_space);
+ /* Store serialized List of PartitionPruneResult */
+ part_prune_results_space = shm_toc_allocate(pcxt->toc, part_prune_results_len);
+ memcpy(part_prune_results_space, part_prune_results_data, part_prune_results_len);
+ shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARTITION_PRUNE_RESULTS,
+ part_prune_results_space);
+
/* Store serialized ParamListInfo. */
paramlistinfo_space = shm_toc_allocate(pcxt->toc, paramlistinfo_len);
shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARAMLISTINFO, paramlistinfo_space);
@@ -1232,8 +1249,10 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
int instrument_options)
{
char *pstmtspace;
+ char *part_prune_results_space;
char *paramspace;
PlannedStmt *pstmt;
+ List *part_prune_results;
ParamListInfo paramLI;
char *queryString;
@@ -1244,12 +1263,17 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
pstmtspace = shm_toc_lookup(toc, PARALLEL_KEY_PLANNEDSTMT, false);
pstmt = (PlannedStmt *) stringToNode(pstmtspace);
+ /* Reconstruct leader-supplied List of PartitionPruneResult. */
+ part_prune_results_space =
+ shm_toc_lookup(toc, PARALLEL_KEY_PARTITION_PRUNE_RESULTS, false);
+ part_prune_results = (List *) stringToNode(part_prune_results_space);
+
/* Reconstruct ParamListInfo. */
paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
paramLI = RestoreParamList(&paramspace);
/* Create a QueryDesc for the query. */
- return CreateQueryDesc(pstmt,
+ return CreateQueryDesc(pstmt, part_prune_results,
queryString,
GetActiveSnapshot(), InvalidSnapshot,
receiver, paramLI, NULL, instrument_options);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 80197d5141..8728745c44 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -25,6 +25,7 @@
#include "mb/pg_wchar.h"
#include "miscadmin.h"
#include "nodes/makefuncs.h"
+#include "parser/parsetree.h"
#include "partitioning/partbounds.h"
#include "partitioning/partdesc.h"
#include "partitioning/partprune.h"
@@ -185,7 +186,11 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
static PartitionPruneState *CreatePartitionPruneState(PlanState *planstate,
- PartitionPruneInfo *pruneinfo);
+ PartitionPruneInfo *pruneinfo,
+ bool consider_initial_steps,
+ bool consider_exec_steps,
+ List *rtable, ExprContext *econtext,
+ PartitionDirectory partdir);
static void InitPartitionPruneContext(PartitionPruneContext *context,
List *pruning_steps,
PartitionDesc partdesc,
@@ -198,7 +203,8 @@ static void PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
- Bitmapset **validsubplans);
+ Bitmapset **validsubplans,
+ Bitmapset **scan_leafpart_rtis);
/*
@@ -1746,8 +1752,10 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
* considered to be a stable expression, it can change value from one plan
* node scan to the next during query execution. Stable comparison
* expressions that don't involve such Params allow partition pruning to be
- * done once during executor startup. Expressions that do involve such Params
- * require us to prune separately for each scan of the parent plan node.
+ * done once during executor startup or during ExecutorDoInitialPruning() that
+ * runs as part of performing AcquireExecutorLocks() on a given plan tree.
+ * Expressions that do involve such Params require us to prune separately for
+ * each scan of the parent plan node.
*
* Note that pruning away unneeded subplans during executor startup has the
* added benefit of not having to initialize the unneeded subplans at all.
@@ -1764,6 +1772,13 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
* account for initial pruning possibly having eliminated some of the
* subplans.
*
+ * ExecPartitionDoInitialPruning:
+ * Do initial pruning with the information contained in a given
+ * PartitionPruneInfo to determine the minimal set of child subplans of
+ * the parent plan node (the one to which the PartitionPruneInfo belongs)
+ * that need to be executed, and also the set of RT indexes of the leaf
+ * partitions that will be scanned by those subplans.
+ *
* ExecFindMatchingSubPlans:
* Returns indexes of matching subplans after evaluating the expressions
* that are safe to evaluate at a given point. This function is first
@@ -1781,8 +1796,9 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
*
* On return, *initially_valid_subplans is assigned the set of indexes of
* child subplans that must be initialized along with the parent plan node.
- * Initial pruning is performed here if needed and in that case only the
- * surviving subplans' indexes are added.
+ * Initial pruning is performed here if needed (unless it has already been done
+ * by ExecutorDoInitialPruning()), and in that case only the surviving
+ * subplans' indexes are added.
*
* If subplans are indeed pruned, subplan_map arrays contained in the returned
* PartitionPruneState are re-sequenced to not count those, though only if the
@@ -1794,28 +1810,65 @@ ExecInitPartitionPruning(PlanState *planstate,
int part_prune_index,
Bitmapset **initially_valid_subplans)
{
- PartitionPruneState *prunestate;
+ PartitionPruneState *prunestate = NULL;
EState *estate = planstate->state;
PartitionPruneInfo *pruneinfo = list_nth(estate->es_part_prune_infos,
part_prune_index);
+ PartitionPruneResult *pruneresult = NULL;
+ bool do_pruning = (pruneinfo->needs_init_pruning ||
+ pruneinfo->needs_exec_pruning);
+
+ /*
+ * No need to do initial pruning if it was done already by
+ * ExecutorDoInitialPruning(), which it would be if es_part_prune_results
+ * is set.
+ */
+ if (estate->es_part_prune_results)
+ {
+ pruneresult = list_nth(estate->es_part_prune_results, part_prune_index);
+ Assert(IsA(pruneresult, PartitionPruneResult));
+ do_pruning = pruneinfo->needs_exec_pruning;
+ }
- /* We may need an expression context to evaluate partition exprs */
- ExecAssignExprContext(estate, planstate);
+ if (do_pruning)
+ {
+ /* We may need an expression context to evaluate partition exprs */
+ ExecAssignExprContext(estate, planstate);
- /* Create the working data structure for pruning */
- prunestate = CreatePartitionPruneState(planstate, pruneinfo);
+ /* For data reading, executor always omits detached partitions */
+ if (estate->es_partition_directory == NULL)
+ estate->es_partition_directory =
+ CreatePartitionDirectory(estate->es_query_cxt, false);
+
+ /*
+ * Create the working data structure for pruning. No need to consider
+ * initial pruning steps if we have a PartitionPruneResult.
+ */
+ prunestate = CreatePartitionPruneState(planstate, pruneinfo,
+ pruneresult == NULL, true,
+ NIL, planstate->ps_ExprContext,
+ estate->es_partition_directory);
+ }
/*
* Perform an initial partition prune pass, if required.
*/
- if (prunestate->do_initial_prune)
- *initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true);
+ if (pruneresult)
+ {
+ *initially_valid_subplans = bms_copy(pruneresult->valid_subplan_offs);
+ }
+ else if (prunestate && prunestate->do_initial_prune)
+ {
+ *initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true,
+ NULL);
+ }
else
{
- /* No pruning, so we'll need to initialize all subplans */
+ /* No initial pruning, so we'll need to initialize all subplans */
Assert(n_total_subplans > 0);
*initially_valid_subplans = bms_add_range(NULL, 0,
n_total_subplans - 1);
+ return prunestate;
}
/*
@@ -1823,7 +1876,8 @@ ExecInitPartitionPruning(PlanState *planstate,
* that were removed above due to initial pruning. No need to do this if
* no steps were removed.
*/
- if (bms_num_members(*initially_valid_subplans) < n_total_subplans)
+ if (prunestate &&
+ bms_num_members(*initially_valid_subplans) < n_total_subplans)
{
/*
* We can safely skip this when !do_exec_prune, even though that
@@ -1839,11 +1893,74 @@ ExecInitPartitionPruning(PlanState *planstate,
return prunestate;
}
+/*
+ * ExecPartitionDoInitialPruning
+ * Perform initial pruning using given PartitionPruneInfo to determine
+ * the minimal set of child subplans that will be executed and also the
+ * set of RT indexes of the leaf partitions scanned by those subplans.
+ */
+Bitmapset *
+ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
+ PartitionPruneInfo *pruneinfo,
+ Bitmapset **scan_leafpart_rtis)
+{
+ List *rtable = plannedstmt->rtable;
+ ExprContext *econtext;
+ PartitionDirectory pdir;
+ MemoryContext oldcontext,
+ tmpcontext;
+ PartitionPruneState *prunestate;
+ Bitmapset *valid_subplan_offs;
+
+ /*
+ * A temporary context for memory allocations required while executing
+ * partition pruning steps.
+ */
+ tmpcontext = AllocSetContextCreate(CurrentMemoryContext,
+ "initial pruning working data",
+ ALLOCSET_DEFAULT_SIZES);
+ oldcontext = MemoryContextSwitchTo(tmpcontext);
+
+ /*
+ * PartitionDirectory to look up partition descriptors.
+ * Note that we don't omit detached partitions, just like during
+ * execution proper.
+ */
+ pdir = CreatePartitionDirectory(CurrentMemoryContext, false);
+
+ /*
+ * We don't yet have a PlanState for the parent plan node, so we must
+ * create a standalone ExprContext to evaluate pruning expressions,
+ * equipped with the information about the EXTERN parameters that the
+ * caller passed us. That is okay because the initial pruning steps do
+ * not contain anything that requires execution to have started, and
+ * thus nothing that needs the information contained in a PlanState.
+ */
+ econtext = CreateStandaloneExprContext();
+ econtext->ecxt_param_list_info = params;
+ prunestate = CreatePartitionPruneState(NULL, pruneinfo, true, false,
+ rtable, econtext, pdir);
+ MemoryContextSwitchTo(oldcontext);
+
+ /* Do the initial pruning. */
+ valid_subplan_offs = ExecFindMatchingSubPlans(prunestate, true,
+ scan_leafpart_rtis);
+
+ FreeExprContext(econtext, true);
+ DestroyPartitionDirectory(pdir);
+ MemoryContextDelete(tmpcontext);
+
+ return valid_subplan_offs;
+}
+
/*
* CreatePartitionPruneState
* Build the data structure required for calling ExecFindMatchingSubPlans
*
- * 'planstate' is the parent plan node's execution state.
+ * 'planstate', if not NULL, is the parent plan node's execution state. It
+ * can be NULL if being called before ExecutorStart(), in which case,
+ * 'rtable' (range table), 'econtext', and 'partdir' must be explicitly
+ * provided.
*
* 'pruneinfo' is a PartitionPruneInfo as generated by
* make_partition_pruneinfo. Here we build a PartitionPruneState containing a
@@ -1857,19 +1974,21 @@ ExecInitPartitionPruning(PlanState *planstate,
* PartitionedRelPruneInfo.
*/
static PartitionPruneState *
-CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
+CreatePartitionPruneState(PlanState *planstate,
+ PartitionPruneInfo *pruneinfo,
+ bool consider_initial_steps,
+ bool consider_exec_steps,
+ List *rtable, ExprContext *econtext,
+ PartitionDirectory partdir)
{
- EState *estate = planstate->state;
+ EState *estate = planstate ? planstate->state : NULL;
PartitionPruneState *prunestate;
int n_part_hierarchies;
ListCell *lc;
int i;
- ExprContext *econtext = planstate->ps_ExprContext;
- /* For data reading, executor always omits detached partitions */
- if (estate->es_partition_directory == NULL)
- estate->es_partition_directory =
- CreatePartitionDirectory(estate->es_query_cxt, false);
+ Assert((estate != NULL) ||
+ (partdir != NULL && econtext != NULL && rtable != NIL));
n_part_hierarchies = list_length(pruneinfo->prune_infos);
Assert(n_part_hierarchies > 0);
@@ -1924,15 +2043,42 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
PartitionKey partkey;
/*
- * We can rely on the copies of the partitioned table's partition
- * key and partition descriptor appearing in its relcache entry,
- * because that entry will be held open and locked for the
- * duration of this executor run.
+ * We must open the relation ourselves when called before
+ * execution has started, for example during
+ * ExecutorDoInitialPruning() on a cached plan. In that case,
+ * sub-partitions must be locked, because AcquirePlannerLocks()
+ * would not have seen them. (The 1st relation in a partrelpruneinfos
+ * list is always the root partitioned table appearing in the
+ * query, which AcquirePlannerLocks() would have locked; the
+ * Assert in relation_open() guards that assumption.)
+ */
+ if (estate == NULL)
+ {
+ RangeTblEntry *rte = rt_fetch(pinfo->rtindex, rtable);
+ int lockmode = (j == 0) ? NoLock : rte->rellockmode;
+
+ partrel = table_open(rte->relid, lockmode);
+ }
+ else
+ partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
+
+ /*
+ * We can rely on the copy of the partitioned table's partition
+ * key in its relcache entry, because it can't change (or get
+ * destroyed) as long as the relation is locked. The partition
+ * descriptor is taken from the PartitionDirectory associated with
+ * the table, which is held open long enough for the descriptor to
+ * remain valid while it's used to perform the pruning steps.
*/
- partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
partkey = RelationGetPartitionKey(partrel);
- partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
- partrel);
+ partdesc = PartitionDirectoryLookup(partdir, partrel);
+
+ /*
+ * Must close partrel, keeping the lock taken, if we're not using
+ * EState's entry.
+ */
+ if (estate == NULL)
+ table_close(partrel, NoLock);
/*
* Initialize the subplan_map and subpart_map.
@@ -1946,6 +2092,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
Assert(partdesc->nparts >= pinfo->nparts);
pprune->nparts = partdesc->nparts;
pprune->subplan_map = palloc(sizeof(int) * partdesc->nparts);
+ pprune->rti_map = palloc(sizeof(Index) * partdesc->nparts);
if (partdesc->nparts == pinfo->nparts)
{
/*
@@ -1956,6 +2103,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
pprune->subpart_map = pinfo->subpart_map;
memcpy(pprune->subplan_map, pinfo->subplan_map,
sizeof(int) * pinfo->nparts);
+ memcpy(pprune->rti_map, pinfo->rti_map,
+ sizeof(Index) * pinfo->nparts);
/*
* Double-check that the list of unpruned relations has not
@@ -2006,6 +2155,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
pinfo->subplan_map[pd_idx];
pprune->subpart_map[pp_idx] =
pinfo->subpart_map[pd_idx];
+ pprune->rti_map[pp_idx] =
+ pinfo->rti_map[pd_idx];
pd_idx++;
}
else
@@ -2013,6 +2164,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
/* this partdesc entry is not in the plan */
pprune->subplan_map[pp_idx] = -1;
pprune->subpart_map[pp_idx] = -1;
+ pprune->rti_map[pp_idx] = 0;
}
}
@@ -2034,7 +2186,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
* Initialize pruning contexts as needed.
*/
pprune->initial_pruning_steps = pinfo->initial_pruning_steps;
- if (pinfo->initial_pruning_steps)
+ if (consider_initial_steps && pinfo->initial_pruning_steps)
{
InitPartitionPruneContext(&pprune->initial_context,
pinfo->initial_pruning_steps,
@@ -2044,7 +2196,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
prunestate->do_initial_prune = true;
}
pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
- if (pinfo->exec_pruning_steps)
+ if (consider_exec_steps && pinfo->exec_pruning_steps)
{
InitPartitionPruneContext(&pprune->exec_context,
pinfo->exec_pruning_steps,
@@ -2272,10 +2424,14 @@ PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
* Pass initial_prune if PARAM_EXEC Params cannot yet be evaluated. This
* differentiates the initial executor-time pruning step from later
* runtime pruning.
+ *
+ * RT indexes of leaf partitions scanned by the chosen subplans are added to
+ * *scan_leafpart_rtis if the pointer is non-NULL.
*/
Bitmapset *
ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
- bool initial_prune)
+ bool initial_prune,
+ Bitmapset **scan_leafpart_rtis)
{
Bitmapset *result = NULL;
MemoryContext oldcontext;
@@ -2310,7 +2466,7 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
*/
pprune = &prunedata->partrelprunedata[0];
find_matching_subplans_recurse(prunedata, pprune, initial_prune,
- &result);
+ &result, scan_leafpart_rtis);
/* Expression eval may have used space in ExprContext too */
if (pprune->exec_pruning_steps)
@@ -2324,6 +2480,8 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
/* Copy result out of the temp context before we reset it */
result = bms_copy(result);
+ if (scan_leafpart_rtis)
+ *scan_leafpart_rtis = bms_copy(*scan_leafpart_rtis);
MemoryContextReset(prunestate->prune_context);
@@ -2334,13 +2492,15 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
* find_matching_subplans_recurse
* Recursive worker function for ExecFindMatchingSubPlans
*
- * Adds valid (non-prunable) subplan IDs to *validsubplans
+ * Adds valid (non-prunable) subplan IDs to *validsubplans and RT indexes of
+ * the corresponding leaf partitions to *scan_leafpart_rtis (if asked for).
*/
static void
find_matching_subplans_recurse(PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
- Bitmapset **validsubplans)
+ Bitmapset **validsubplans,
+ Bitmapset **scan_leafpart_rtis)
{
Bitmapset *partset;
int i;
@@ -2367,8 +2527,14 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
while ((i = bms_next_member(partset, i)) >= 0)
{
if (pprune->subplan_map[i] >= 0)
+ {
*validsubplans = bms_add_member(*validsubplans,
pprune->subplan_map[i]);
+ Assert(pprune->rti_map[i] > 0);
+ if (scan_leafpart_rtis)
+ *scan_leafpart_rtis = bms_add_member(*scan_leafpart_rtis,
+ pprune->rti_map[i]);
+ }
else
{
int partidx = pprune->subpart_map[i];
@@ -2376,7 +2542,8 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
if (partidx >= 0)
find_matching_subplans_recurse(prunedata,
&prunedata->partrelprunedata[partidx],
- initial_prune, validsubplans);
+ initial_prune, validsubplans,
+ scan_leafpart_rtis);
else
{
/*
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 21f4c10937..67a58c7163 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -134,6 +134,7 @@ CreateExecutorState(void)
estate->es_param_exec_vals = NULL;
estate->es_queryEnv = NULL;
+ estate->es_part_prune_results = NIL;
estate->es_query_cxt = qcontext;
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index e134a82ff7..18d3b98cdc 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -842,7 +842,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
else
dest = None_Receiver;
- es->qd = CreateQueryDesc(es->stmt,
+ es->qd = CreateQueryDesc(es->stmt, NIL,
fcache->src,
GetActiveSnapshot(),
InvalidSnapshot,
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index c6f86a6510..96880e122a 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -155,7 +155,8 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
* subplan, we can fill as_valid_subplans immediately, preventing
* later calls to ExecFindMatchingSubPlans.
*/
- if (!prunestate->do_exec_prune && nplans > 0)
+ if (appendstate->as_prune_state == NULL ||
+ (!appendstate->as_prune_state->do_exec_prune && nplans > 0))
appendstate->as_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
}
else
@@ -577,7 +578,7 @@ choose_next_subplan_locally(AppendState *node)
}
else if (node->as_valid_subplans == NULL)
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
whichplan = -1;
}
@@ -642,7 +643,7 @@ choose_next_subplan_for_leader(AppendState *node)
if (node->as_valid_subplans == NULL)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
/*
* Mark each invalid plan as finished to allow the loop below to
@@ -717,7 +718,7 @@ choose_next_subplan_for_worker(AppendState *node)
else if (node->as_valid_subplans == NULL)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
mark_invalid_subplans_as_finished(node);
}
@@ -868,7 +869,7 @@ ExecAppendAsyncBegin(AppendState *node)
if (node->as_valid_subplans == NULL)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
classify_matching_subplans(node);
}
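A note on the scan_leafpart_rtis argument threaded through
ExecFindMatchingSubPlans() in the hunks above: the executor-time callers
pass NULL because they only need subplan offsets, whereas the pre-execution
caller also wants the RT indexes of the surviving leaf partitions. Below is
a minimal standalone sketch of that contract, not the patch's code. The
names PruneLevel and collect_matching are hypothetical, a 64-bit mask stands
in for Bitmapset, and the "surviving" mask is assumed to be precomputed
(the real function evaluates the pruning steps at each level).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for PartitionedRelPruningData. */
typedef struct PruneLevel
{
	int			nparts;
	const int  *subplan_map;	/* >= 0: partition is a leaf with a subplan */
	const int  *subpart_map;	/* >= 0: index of a sub-partitioned level */
	const unsigned *rti_map;	/* RT index of the leaf, 0 if none */
	uint64_t	surviving;		/* bit i set: partition i survived pruning */
} PruneLevel;

/*
 * Walk the surviving partitions of levels[level]. Leaf partitions add their
 * subplan offset to *validsubplans and, only if the caller asked for them,
 * their RT index to *scan_leafpart_rtis. Sub-partitioned entries recurse.
 * Assumes all subplan offsets and RT indexes are below 64.
 */
void
collect_matching(const PruneLevel *levels, int level,
				 uint64_t *validsubplans, uint64_t *scan_leafpart_rtis)
{
	const PruneLevel *p = &levels[level];

	for (int i = 0; i < p->nparts; i++)
	{
		if (((p->surviving >> i) & 1) == 0)
			continue;			/* pruned */

		if (p->subplan_map[i] >= 0)
		{
			*validsubplans |= UINT64_C(1) << p->subplan_map[i];
			assert(p->rti_map[i] > 0);
			if (scan_leafpart_rtis != NULL)
				*scan_leafpart_rtis |= UINT64_C(1) << p->rti_map[i];
		}
		else if (p->subpart_map[i] >= 0)
			collect_matching(levels, p->subpart_map[i],
							 validsubplans, scan_leafpart_rtis);
	}
}
```

Making the output pointer optional is what lets the existing callers in
nodeAppend.c and nodeMergeAppend.c stay unchanged apart from the extra NULL.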
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index 8d35860c30..2312e5a633 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -103,7 +103,8 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
* subplan, we can fill ms_valid_subplans immediately, preventing
* later calls to ExecFindMatchingSubPlans.
*/
- if (!prunestate->do_exec_prune && nplans > 0)
+ if (mergestate->ms_prune_state == NULL ||
+ (!mergestate->ms_prune_state->do_exec_prune && nplans > 0))
mergestate->ms_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
}
else
@@ -218,7 +219,7 @@ ExecMergeAppend(PlanState *pstate)
*/
if (node->ms_valid_subplans == NULL)
node->ms_valid_subplans =
- ExecFindMatchingSubPlans(node->ms_prune_state, false);
+ ExecFindMatchingSubPlans(node->ms_prune_state, false, NULL);
/*
* First time through: pull the first tuple from each valid subplan,
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index fd5796f1b9..93012a5b3b 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1578,6 +1578,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
CachedPlanSource *plansource;
CachedPlan *cplan;
List *stmt_list;
+ List *part_prune_results_list;
char *query_string;
Snapshot snapshot;
MemoryContext oldcontext;
@@ -1657,7 +1658,10 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
*/
/* Replan if needed, and increment plan refcount for portal */
- cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv);
+ cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv,
+ &part_prune_results_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_results_list));
stmt_list = cplan->stmt_list;
if (!plan->saved)
@@ -1685,6 +1689,9 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
stmt_list,
cplan);
+ /* Copy Lists of PartitionPruneResults into the portal's context. */
+ PortalStorePartitionPruneResults(portal, part_prune_results_list);
+
/*
* Set up options for portal. Default SCROLL type is chosen the same way
* as PerformCursorOpen does it.
@@ -2092,7 +2099,8 @@ SPI_plan_get_cached_plan(SPIPlanPtr plan)
/* Get the generic plan for the query */
cplan = GetCachedPlan(plansource, NULL,
plan->saved ? CurrentResourceOwner : NULL,
- _SPI_current->queryEnv);
+ _SPI_current->queryEnv,
+ NULL /* Not interested in PartitionPruneResults */);
Assert(cplan == plansource->gplan);
/* Pop the error context stack */
@@ -2473,7 +2481,9 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
{
CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
List *stmt_list;
- ListCell *lc2;
+ List *part_prune_results_list;
+ ListCell *lc2,
+ *lc3;
spicallbackarg.query = plansource->query_string;
@@ -2549,8 +2559,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
* plan, the refcount must be backed by the plan_owner.
*/
cplan = GetCachedPlan(plansource, options->params,
- plan_owner, _SPI_current->queryEnv);
-
+ plan_owner, _SPI_current->queryEnv,
+ &part_prune_results_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_results_list));
stmt_list = cplan->stmt_list;
/*
@@ -2589,9 +2601,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
}
}
- foreach(lc2, stmt_list)
+ forboth(lc2, stmt_list, lc3, part_prune_results_list)
{
PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+ List *part_prune_results = lfirst_node(List, lc3);
bool canSetTag = stmt->canSetTag;
DestReceiver *dest;
@@ -2663,7 +2676,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
else
snap = InvalidSnapshot;
- qdesc = CreateQueryDesc(stmt,
+ qdesc = CreateQueryDesc(stmt, part_prune_results,
plansource->query_string,
snap, crosscheck_snapshot,
dest,
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index b4ff855f7c..77990a2732 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -158,6 +158,11 @@
token = pg_strtok(&length); /* skip :fldname */ \
local_node->fldname = readIntCols(len)
+/* Read an Index array */
+#define READ_INDEX_ARRAY(fldname, len) \
+ token = pg_strtok(&length); /* skip :fldname */ \
+ local_node->fldname = readIndexCols(len)
+
/* Read a bool array */
#define READ_BOOL_ARRAY(fldname, len) \
token = pg_strtok(&length); /* skip :fldname */ \
@@ -795,7 +800,6 @@ fnname(int numCols) \
*/
READ_SCALAR_ARRAY(readAttrNumberCols, int16, atoi)
READ_SCALAR_ARRAY(readOidCols, Oid, atooid)
-/* outfuncs.c has writeIndexCols, but we don't yet need that here */
-/* READ_SCALAR_ARRAY(readIndexCols, Index, atoui) */
+READ_SCALAR_ARRAY(readIndexCols, Index, atoui)
READ_SCALAR_ARRAY(readIntCols, int, atoi)
READ_SCALAR_ARRAY(readBoolCols, bool, strtobool)
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 799602f5ea..a96d316dca 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -520,7 +520,9 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
result->parallelModeNeeded = glob->parallelModeNeeded;
result->planTree = top_plan;
result->partPruneInfos = glob->partPruneInfos;
+ result->containsInitialPruning = glob->containsInitialPruning;
result->rtable = glob->finalrtable;
+ result->minLockRelids = glob->minLockRelids;
result->resultRelations = glob->resultRelations;
result->appendRelations = glob->appendRelations;
result->subplans = glob->subplans;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 720f20f563..61d6934978 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -270,6 +270,16 @@ set_plan_references(PlannerInfo *root, Plan *plan)
*/
add_rtes_to_flat_rtable(root, false);
+ /*
+ * Add the query's adjusted range of RT indexes to glob->minLockRelids.
+ * The adjusted RT indexes of prunable relations will be deleted from the
+ * set below where PartitionPruneInfos are processed.
+ */
+ glob->minLockRelids =
+ bms_add_range(glob->minLockRelids,
+ rtoffset + 1,
+ rtoffset + list_length(root->parse->rtable));
+
/*
* Adjust RT indexes of PlanRowMarks and add to final rowmarks list
*/
@@ -352,6 +362,7 @@ set_plan_references(PlannerInfo *root, Plan *plan)
foreach (lc, root->partPruneInfos)
{
PartitionPruneInfo *pruneinfo = lfirst(lc);
+ Bitmapset *leafpart_rtis = NULL;
ListCell *l;
foreach(l, pruneinfo->prune_infos)
@@ -362,15 +373,50 @@ set_plan_references(PlannerInfo *root, Plan *plan)
foreach(l2, prune_infos)
{
PartitionedRelPruneInfo *pinfo = lfirst(l2);
+ int i;
/* RT index of the table to which the pinfo belongs. */
pinfo->rtindex += rtoffset;
+
+ /* Also of the leaf partitions that might be scanned. */
+ for (i = 0; i < pinfo->nparts; i++)
+ {
+ if (pinfo->rti_map[i] > 0 && pinfo->subplan_map[i] >= 0)
+ {
+ pinfo->rti_map[i] += rtoffset;
+ leafpart_rtis = bms_add_member(leafpart_rtis,
+ pinfo->rti_map[i]);
+ }
+ }
}
}
+ if (pruneinfo->needs_init_pruning)
+ {
+ glob->containsInitialPruning = true;
+
+ /*
+ * Delete the leaf partition RTIs from the global set of relations
+ * to be locked before executing the plan. AcquireExecutorLocks()
+ * will find the ones to add to the set after performing initial
+ * pruning.
+ */
+ glob->minLockRelids = bms_del_members(glob->minLockRelids,
+ leafpart_rtis);
+ }
+
glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
}
+ /*
+ * It seems worth doing a bms_copy() on glob->minLockRelids if we deleted
+ * bits from it above, to get rid of any empty tail words, so that the
+ * loop over this set in AcquireExecutorLocks() does not have to go
+ * through those useless words.
+ */
+ if (glob->containsInitialPruning)
+ glob->minLockRelids = bms_copy(glob->minLockRelids);
+
return result;
}
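To make the minLockRelids bookkeeping in set_plan_references() above
easier to follow, here is a small standalone sketch of the set arithmetic:
start with the whole adjusted RT-index range, then remove the leaf
partitions that initial pruning may eliminate. It is an illustration only.
RtiSet and the rti_* helpers are hypothetical one-word stand-ins for
Bitmapset, bms_add_range() and bms_del_members(), and min_lock_relids() is
not a function in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for Bitmapset: one word, RT indexes 1..63 only. */
typedef uint64_t RtiSet;

/* Add every index in [lo, hi] to the set (no-op if hi < lo). */
RtiSet
rti_add_range(RtiSet s, int lo, int hi)
{
	for (int i = lo; i <= hi; i++)
		s |= UINT64_C(1) << i;
	return s;
}

/* Remove all members of 'del' from 's'. */
RtiSet
rti_del_members(RtiSet s, RtiSet del)
{
	return s & ~del;
}

bool
rti_is_member(RtiSet s, int i)
{
	return ((s >> i) & 1) != 0;
}

/*
 * Relations that must be locked unconditionally: all of the query's RT
 * indexes, minus prunable leaf partitions when the plan contains initial
 * pruning steps. The removed ones are re-added after pruning, for the
 * partitions that survive.
 */
RtiSet
min_lock_relids(int rtoffset, int nrtes, RtiSet leafpart_rtis,
				bool needs_init_pruning)
{
	RtiSet		s = rti_add_range(0, rtoffset + 1, rtoffset + nrtes);

	if (needs_init_pruning)
		s = rti_del_members(s, leafpart_rtis);
	return s;
}
```

For a range table with a partitioned parent at RT index 1 and leaves at 2
to 4, only index 1 stays in the set when initial pruning is present, which
is the point of deferring the leaf locks.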
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 6565b6ed01..37f3e6af61 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -144,7 +144,9 @@ static List *make_partitionedrel_pruneinfo(PlannerInfo *root,
List *prunequal,
Bitmapset *partrelids,
int *relid_subplan_map,
- Bitmapset **matchedsubplans);
+ Bitmapset **matchedsubplans,
+ bool *needs_init_pruning,
+ bool *needs_exec_pruning);
static void gen_partprune_steps(RelOptInfo *rel, List *clauses,
PartClauseTarget target,
GeneratePruningStepsContext *context);
@@ -234,6 +236,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int *relid_subplan_map;
ListCell *lc;
int i;
+ bool needs_init_pruning = false;
+ bool needs_exec_pruning = false;
/*
* Scan the subpaths to see which ones are scans of partition child
@@ -313,12 +317,16 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
Bitmapset *partrelids = (Bitmapset *) lfirst(lc);
List *pinfolist;
Bitmapset *matchedsubplans = NULL;
+ bool partrel_needs_init_pruning;
+ bool partrel_needs_exec_pruning;
pinfolist = make_partitionedrel_pruneinfo(root, parentrel,
prunequal,
partrelids,
relid_subplan_map,
- &matchedsubplans);
+ &matchedsubplans,
+ &partrel_needs_init_pruning,
+ &partrel_needs_exec_pruning);
/* When pruning is possible, record the matched subplans */
if (pinfolist != NIL)
@@ -327,6 +335,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
allmatchedsubplans = bms_join(matchedsubplans,
allmatchedsubplans);
}
+
+ needs_init_pruning |= partrel_needs_init_pruning;
+ needs_exec_pruning |= partrel_needs_exec_pruning;
}
pfree(relid_subplan_map);
@@ -341,6 +352,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
/* Else build the result data structure */
pruneinfo = makeNode(PartitionPruneInfo);
pruneinfo->prune_infos = prunerelinfos;
+ pruneinfo->needs_init_pruning = needs_init_pruning;
+ pruneinfo->needs_exec_pruning = needs_exec_pruning;
/*
* Some subplans may not belong to any of the identified partitioned rels.
@@ -441,13 +454,18 @@ add_part_relids(List *allpartrelids, Bitmapset *partrelids)
* If we cannot find any useful run-time pruning steps, return NIL.
* However, on success, each rel identified in partrelids will have
* an element in the result list, even if some of them are useless.
+ * *needs_init_pruning and *needs_exec_pruning are set to indicate whether
+ * the returned PartitionedRelPruneInfos contain pruning steps that can be
+ * performed before and after execution begins, respectively.
*/
static List *
make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
List *prunequal,
Bitmapset *partrelids,
int *relid_subplan_map,
- Bitmapset **matchedsubplans)
+ Bitmapset **matchedsubplans,
+ bool *needs_init_pruning,
+ bool *needs_exec_pruning)
{
RelOptInfo *targetpart = NULL;
List *pinfolist = NIL;
@@ -458,6 +476,10 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int rti;
int i;
+ /* Will find out below. */
+ *needs_init_pruning = false;
+ *needs_exec_pruning = false;
+
/*
* Examine each partitioned rel, constructing a temporary array to map
* from planner relids to index of the partitioned rel, and building a
@@ -545,6 +567,9 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
* executor per-scan pruning steps. This first pass creates startup
* pruning steps and detects whether there's any possibly-useful quals
* that would require per-scan pruning.
+ *
+ * The first pass also determines whether the 2nd pass is necessary,
+ * by noting the presence of EXEC parameters.
*/
gen_partprune_steps(subpart, partprunequal, PARTTARGET_INITIAL,
&context);
@@ -619,6 +644,12 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
pinfo->execparamids = execparamids;
/* Remaining fields will be filled in the next loop */
+ /* record which types of pruning steps we've seen so far */
+ if (initial_pruning_steps != NIL)
+ *needs_init_pruning = true;
+ if (exec_pruning_steps != NIL)
+ *needs_exec_pruning = true;
+
pinfolist = lappend(pinfolist, pinfo);
}
@@ -646,6 +677,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int *subplan_map;
int *subpart_map;
Oid *relid_map;
+ Index *rti_map;
/*
* Construct the subplan and subpart maps for this partitioning level.
@@ -658,6 +690,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
subpart_map = (int *) palloc(nparts * sizeof(int));
memset(subpart_map, -1, nparts * sizeof(int));
relid_map = (Oid *) palloc0(nparts * sizeof(Oid));
+ rti_map = (Index *) palloc0(nparts * sizeof(Index));
present_parts = NULL;
i = -1;
@@ -672,6 +705,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
subplan_map[i] = subplanidx = relid_subplan_map[partrel->relid] - 1;
subpart_map[i] = subpartidx = relid_subpart_map[partrel->relid] - 1;
relid_map[i] = planner_rt_fetch(partrel->relid, root)->relid;
+ rti_map[i] = partrel->relid;
if (subplanidx >= 0)
{
present_parts = bms_add_member(present_parts, i);
@@ -696,6 +730,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
pinfo->subplan_map = subplan_map;
pinfo->subpart_map = subpart_map;
pinfo->relid_map = relid_map;
+ pinfo->rti_map = rti_map;
}
pfree(relid_subpart_map);
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 3082093d1e..95ab1d0eef 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1598,6 +1598,7 @@ exec_bind_message(StringInfo input_message)
int16 *rformats = NULL;
CachedPlanSource *psrc;
CachedPlan *cplan;
+ List *part_prune_results_list;
Portal portal;
char *query_string;
char *saved_stmt_name;
@@ -1972,7 +1973,9 @@ exec_bind_message(StringInfo input_message)
* will be generated in MessageContext. The plan refcount will be
* assigned to the Portal, so it will be released at portal destruction.
*/
- cplan = GetCachedPlan(psrc, params, NULL, NULL);
+ cplan = GetCachedPlan(psrc, params, NULL, NULL, &part_prune_results_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_results_list));
/*
* Now we can define the portal.
@@ -1987,6 +1990,9 @@ exec_bind_message(StringInfo input_message)
cplan->stmt_list,
cplan);
+ /* Copy Lists of PartitionPruneResults into the portal's context. */
+ PortalStorePartitionPruneResults(portal, part_prune_results_list);
+
/* Done with the snapshot used for parameter I/O and parsing/planning */
if (snapshot_set)
PopActiveSnapshot();
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 52e2db6452..280ed7d239 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -35,7 +35,7 @@
Portal ActivePortal = NULL;
-static void ProcessQuery(PlannedStmt *plan,
+static void ProcessQuery(PlannedStmt *plan, List *part_prune_results,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -65,6 +65,7 @@ static void DoPortalRewind(Portal portal);
*/
QueryDesc *
CreateQueryDesc(PlannedStmt *plannedstmt,
+ List *part_prune_results,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
@@ -77,6 +78,8 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
qd->operation = plannedstmt->commandType; /* operation */
qd->plannedstmt = plannedstmt; /* plan */
+ qd->part_prune_results = part_prune_results; /* ExecutorDoInitialPruning()
+ * output for plan */
qd->sourceText = sourceText; /* query text */
qd->snapshot = RegisterSnapshot(snapshot); /* snapshot */
/* RI check snapshot */
@@ -122,6 +125,7 @@ FreeQueryDesc(QueryDesc *qdesc)
* PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
*
* plan: the plan tree for the query
+ * part_prune_results: ExecutorDoInitialPruning() output for the PlannedStmt
* sourceText: the source text of the query
* params: any parameters needed
* dest: where to send results
@@ -134,6 +138,7 @@ FreeQueryDesc(QueryDesc *qdesc)
*/
static void
ProcessQuery(PlannedStmt *plan,
+ List *part_prune_results,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -145,7 +150,7 @@ ProcessQuery(PlannedStmt *plan,
/*
* Create the QueryDesc object
*/
- queryDesc = CreateQueryDesc(plan, sourceText,
+ queryDesc = CreateQueryDesc(plan, part_prune_results, sourceText,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
@@ -491,8 +496,13 @@ PortalStart(Portal portal, ParamListInfo params,
/*
* Create QueryDesc in portal's context; for the moment, set
* the destination to DestNone.
+ *
+ * There is no PartitionPruneResult unless the PlannedStmt is
+ * from a CachedPlan.
*/
queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+ portal->part_prune_results_list == NIL ? NIL :
+ linitial(portal->part_prune_results_list),
portal->sourceText,
GetActiveSnapshot(),
InvalidSnapshot,
@@ -1225,6 +1235,8 @@ PortalRunMulti(Portal portal,
if (pstmt->utilityStmt == NULL)
{
+ List *part_prune_results = NIL;
+
/*
* process a plannable query.
*/
@@ -1271,10 +1283,18 @@ PortalRunMulti(Portal portal,
else
UpdateActiveSnapshotCommandId();
+ /*
+ * Fetch the List of PartitionPruneResults corresponding to this
+ * PlannedStmt, if there is one.
+ */
+ if (portal->part_prune_results_list != NIL)
+ part_prune_results = list_nth(portal->part_prune_results_list,
+ foreach_current_index(stmtlist_item));
+
if (pstmt->canSetTag)
{
/* statement can set tag string */
- ProcessQuery(pstmt,
+ ProcessQuery(pstmt, part_prune_results,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
@@ -1283,7 +1303,7 @@ PortalRunMulti(Portal portal,
else
{
/* stmt added by rewrite cannot set tag */
- ProcessQuery(pstmt,
+ ProcessQuery(pstmt, part_prune_results,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index cc943205d3..af6fae6e3b 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -99,14 +99,19 @@ static dlist_head cached_expression_list = DLIST_STATIC_INIT(cached_expression_l
static void ReleaseGenericPlan(CachedPlanSource *plansource);
static List *RevalidateCachedQuery(CachedPlanSource *plansource,
QueryEnvironment *queryEnv);
-static bool CheckCachedPlan(CachedPlanSource *plansource);
+static bool CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+ List **part_prune_results_list);
static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
- ParamListInfo boundParams, QueryEnvironment *queryEnv);
+ ParamListInfo boundParams, QueryEnvironment *queryEnv,
+ List **part_prune_results_list);
static bool choose_custom_plan(CachedPlanSource *plansource,
ParamListInfo boundParams);
static double cached_plan_cost(CachedPlan *plan, bool include_planner);
static Query *QueryListGetPrimaryStmt(List *stmts);
-static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static void AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+ List **part_prune_results_list,
+ List **lockedRelids_per_stmt);
+static void ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt);
static void AcquirePlannerLocks(List *stmt_list, bool acquire);
static void ScanQueryForLocks(Query *parsetree, bool acquire);
static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -782,6 +787,26 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
return tlist;
}
+/*
+ * FreePartitionPruneResults
+ * Frees the List of Lists of PartitionPruneResults for CheckCachedPlan()
+ */
+static void
+FreePartitionPruneResults(List *part_prune_results_list)
+{
+ ListCell *lc;
+
+ foreach(lc, part_prune_results_list)
+ {
+ List *part_prune_results = lfirst(lc);
+
+ /* Free both the PartitionPruneResults and the containing List. */
+ list_free_deep(part_prune_results);
+ }
+
+ list_free(part_prune_results_list);
+}
+
/*
* CheckCachedPlan: see if the CachedPlanSource's generic plan is valid.
*
@@ -790,15 +815,20 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
*
* On a "true" return, we have acquired the locks needed to run the plan.
* (We must do this for the "true" result to be race-condition-free.)
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_results_list.
*/
static bool
-CheckCachedPlan(CachedPlanSource *plansource)
+CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+ List **part_prune_results_list)
{
CachedPlan *plan = plansource->gplan;
/* Assert that caller checked the querytree */
Assert(plansource->is_valid);
+ *part_prune_results_list = NIL;
+
/* If there's no generic plan, just say "false" */
if (!plan)
return false;
@@ -820,13 +850,21 @@ CheckCachedPlan(CachedPlanSource *plansource)
*/
if (plan->is_valid)
{
+ List *lockedRelids_per_stmt;
+
/*
* Plan must have positive refcount because it is referenced by
* plansource; so no need to fear it disappears under us here.
*/
Assert(plan->refcount > 0);
- AcquireExecutorLocks(plan->stmt_list, true);
+ /*
+ * Lock relations scanned by the plan. This is where the pruning
+ * happens if needed.
+ */
+ AcquireExecutorLocks(plan->stmt_list, boundParams,
+ part_prune_results_list,
+ &lockedRelids_per_stmt);
/*
* If plan was transient, check to see if TransactionXmin has
@@ -848,7 +886,11 @@ CheckCachedPlan(CachedPlanSource *plansource)
}
/* Oops, the race case happened. Release useless locks. */
- AcquireExecutorLocks(plan->stmt_list, false);
+ ReleaseExecutorLocks(plan->stmt_list, lockedRelids_per_stmt);
+
+ /* Release any PartitionPruneResults that may have been created. */
+ FreePartitionPruneResults(*part_prune_results_list);
+ *part_prune_results_list = NIL;
}
/*
@@ -874,10 +916,14 @@ CheckCachedPlan(CachedPlanSource *plansource)
* Planning work is done in the caller's memory context. The finished plan
* is in a child memory context, which typically should get reparented
* (unless this is a one-shot plan, in which case we don't copy the plan).
+ *
+ * A list of NILs is returned in *part_prune_results_list, meaning that no
+ * partition pruning has been done yet for the plans in stmt_list.
*/
static CachedPlan *
BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
- ParamListInfo boundParams, QueryEnvironment *queryEnv)
+ ParamListInfo boundParams, QueryEnvironment *queryEnv,
+ List **part_prune_results_list)
{
CachedPlan *plan;
List *plist;
@@ -1007,6 +1053,17 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
MemoryContextSwitchTo(oldcxt);
+ /*
+ * There are no PartitionPruneResults to add yet, but the list must still
+ * be initialized to have the same number of elements as the list of
+ * PlannedStmts.
+ */
+ *part_prune_results_list = NIL;
+ foreach(lc, plist)
+ {
+ *part_prune_results_list = lappend(*part_prune_results_list, NIL);
+ }
+
return plan;
}
@@ -1126,6 +1183,19 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
* plan or a custom plan for the given parameters: the caller does not know
* which it will get.
*
+ * For every PlannedStmt in the returned CachedPlan, one element is added to
+ * *part_prune_results_list: either a List of PartitionPruneResult or NIL.
+ * It is a List only if the PlannedStmt comes from the existing CachedPlan,
+ * which is otherwise valid, and has containsInitialPruning set to true.
+ * Before such a CachedPlan is returned, its "initial" pruning steps are
+ * performed by calling ExecutorDoInitialPruning(), which prunes away
+ * subplans that don't match the "initial" pruning conditions, so that
+ * AcquireExecutorLocks() locks only the leaf partitions that survive.
+ * For each PartitionPruneInfo found in PlannedStmt.partPruneInfos, a
+ * PartitionPruneResult containing the bitmapset of the indexes of the
+ * surviving subplans is added to the List for that
+ * PlannedStmt.
+ *
* On return, the plan is valid and we have sufficient locks to begin
* execution.
*
@@ -1139,11 +1209,13 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
*/
CachedPlan *
GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
- ResourceOwner owner, QueryEnvironment *queryEnv)
+ ResourceOwner owner, QueryEnvironment *queryEnv,
+ List **part_prune_results_list)
{
CachedPlan *plan = NULL;
List *qlist;
bool customplan;
+ List *my_part_prune_results_list;
/* Assert caller is doing things in a sane order */
Assert(plansource->magic == CACHEDPLANSOURCE_MAGIC);
@@ -1160,7 +1232,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
if (!customplan)
{
- if (CheckCachedPlan(plansource))
+ if (CheckCachedPlan(plansource, boundParams,
+ &my_part_prune_results_list))
{
/* We want a generic plan, and we already have a valid one */
plan = plansource->gplan;
@@ -1169,7 +1242,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
else
{
/* Build a new generic plan */
- plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv);
+ plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv,
+ &my_part_prune_results_list);
/* Just make real sure plansource->gplan is clear */
ReleaseGenericPlan(plansource);
/* Link the new generic plan into the plansource */
@@ -1214,7 +1288,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
if (customplan)
{
/* Build a custom plan */
- plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv);
+ plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv,
+ &my_part_prune_results_list);
/* Accumulate total costs of custom plans */
plansource->total_custom_cost += cached_plan_cost(plan, true);
@@ -1246,6 +1321,9 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
plan->is_saved = true;
}
+ if (part_prune_results_list)
+ *part_prune_results_list = my_part_prune_results_list;
+
return plan;
}
@@ -1737,17 +1815,29 @@ QueryListGetPrimaryStmt(List *stmts)
/*
- * AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
- * or release them if acquire is false.
+ * AcquireExecutorLocks: acquire locks needed for execution of a cached plan.
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_results_list.
+ *
+ * On return, *lockedRelids_per_stmt will contain a bitmapset for every
+ * PlannedStmt in stmt_list, containing the RT indexes of relation entries
+ * in its range table that were actually locked, or NULL if the PlannedStmt
+ * contains a utility statement.
*/
static void
-AcquireExecutorLocks(List *stmt_list, bool acquire)
+AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+ List **part_prune_results_list,
+ List **lockedRelids_per_stmt)
{
ListCell *lc1;
+ *part_prune_results_list = *lockedRelids_per_stmt = NIL;
foreach(lc1, stmt_list)
{
PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
- ListCell *lc2;
+ List *part_prune_results = NIL;
+ Bitmapset *allLockRelids;
+ Bitmapset *lockedRelids = NULL;
+ int rti;
if (plannedstmt->commandType == CMD_UTILITY)
{
@@ -1761,13 +1851,40 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
Query *query = UtilityContainsQuery(plannedstmt->utilityStmt);
if (query)
- ScanQueryForLocks(query, acquire);
+ ScanQueryForLocks(query, true);
+ *part_prune_results_list = lappend(*part_prune_results_list, NIL);
continue;
}
- foreach(lc2, plannedstmt->rtable)
+ /*
+ * Figure out the set of relations that would need to be locked
+ * before executing the plan.
+ */
+ if (plannedstmt->containsInitialPruning)
{
- RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
+ Bitmapset *scan_leafpart_rtis = NULL;
+
+ /*
+ * Obtain the set of leaf partitions to be locked.
+ *
+ * The following does initial partition pruning using the
+ * PartitionPruneInfos found in plannedstmt->partPruneInfos and
+ * finds leaf partitions that survive that pruning across all the
+ * nodes in the plan tree.
+ */
+ part_prune_results = ExecutorDoInitialPruning(plannedstmt,
+ boundParams,
+ &scan_leafpart_rtis);
+ allLockRelids = bms_union(plannedstmt->minLockRelids,
+ scan_leafpart_rtis);
+ }
+ else
+ allLockRelids = plannedstmt->minLockRelids;
+
+ rti = -1;
+ while ((rti = bms_next_member(allLockRelids, rti)) > 0)
+ {
+ RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
if (rte->rtekind != RTE_RELATION)
continue;
@@ -1778,10 +1895,59 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
* fail if it's been dropped entirely --- we'll just transiently
* acquire a non-conflicting lock.
*/
- if (acquire)
- LockRelationOid(rte->relid, rte->rellockmode);
- else
- UnlockRelationOid(rte->relid, rte->rellockmode);
+ LockRelationOid(rte->relid, rte->rellockmode);
+ lockedRelids = bms_add_member(lockedRelids, rti);
+ }
+
+ *part_prune_results_list = lappend(*part_prune_results_list,
+ part_prune_results);
+ *lockedRelids_per_stmt = lappend(*lockedRelids_per_stmt, lockedRelids);
+ }
+}
+
+/*
+ * ReleaseExecutorLocks
+ * Release locks that would've been acquired by an earlier call to
+ * AcquireExecutorLocks()
+ */
+static void
+ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt)
+{
+ ListCell *lc1,
+ *lc2;
+
+ forboth(lc1, stmt_list, lc2, lockedRelids_per_stmt)
+ {
+ PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
+ Bitmapset *lockedRelids = lfirst(lc2);
+ int rti;
+
+ if (plannedstmt->commandType == CMD_UTILITY)
+ {
+ /*
+ * Ignore utility statements, except those (such as EXPLAIN) that
+ * contain a parsed-but-not-planned query. Note: it's okay to use
+ * ScanQueryForLocks, even though the query hasn't been through
+ * rule rewriting, because rewriting doesn't change the query
+ * representation.
+ */
+ Query *query = UtilityContainsQuery(plannedstmt->utilityStmt);
+
+ Assert(lockedRelids == NULL);
+ if (query)
+ ScanQueryForLocks(query, false);
+ continue;
+ }
+
+ rti = -1;
+ while ((rti = bms_next_member(lockedRelids, rti)) >= 0)
+ {
+ RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
+
+ Assert(rte->rtekind == RTE_RELATION);
+
+ /* See the comment in AcquireExecutorLocks(). */
+ UnlockRelationOid(rte->relid, rte->rellockmode);
}
}
}
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index 7b1ae6fdcf..5b9098971b 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -303,6 +303,25 @@ PortalDefineQuery(Portal portal,
portal->status = PORTAL_DEFINED;
}
+/*
+ * PortalStorePartitionPruneResults
+ * Copy the given List of Lists of PartitionPruneResults into the
+ * portal's context
+ *
+ * This allows the caller to ensure that the list exists as long as the portal
+ * does.
+ */
+void
+PortalStorePartitionPruneResults(Portal portal, List *part_prune_results_list)
+{
+ MemoryContext oldcxt;
+
+ Assert(PortalIsValid(portal));
+ oldcxt = MemoryContextSwitchTo(portal->portalContext);
+ portal->part_prune_results_list = copyObject(part_prune_results_list);
+ MemoryContextSwitchTo(oldcxt);
+}
+
/*
* PortalReleaseCachedPlan
* Release a portal's reference to its cached plan, if any.
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 9ebde089ae..269cc4d562 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -87,7 +87,9 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
ExplainState *es, const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv);
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt,
+ List *part_prune_results,
+ IntoClause *into,
ExplainState *es, const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv,
const instr_time *planduration,
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index bf962af7af..bd8776402e 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -45,6 +45,7 @@ extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
* nparts Length of subplan_map[] and subpart_map[].
* subplan_map Subplan index by partition index, or -1.
* subpart_map Subpart index by partition index, or -1.
+ * rti_map Range table index by partition index, or 0.
* present_parts A Bitmapset of the partition indexes that we
* have subplans or subparts for.
* initial_pruning_steps List of PartitionPruneSteps used to
@@ -61,6 +62,7 @@ typedef struct PartitionedRelPruningData
int nparts;
int *subplan_map;
int *subpart_map;
+ Index *rti_map;
Bitmapset *present_parts;
List *initial_pruning_steps;
List *exec_pruning_steps;
@@ -126,5 +128,10 @@ extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
int part_prune_index,
Bitmapset **initially_valid_subplans);
extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
- bool initial_prune);
+ bool initial_prune,
+ Bitmapset **scan_leafpart_rtis);
+extern Bitmapset *ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt,
+ ParamListInfo params,
+ PartitionPruneInfo *pruneinfo,
+ Bitmapset **scan_leafpart_rtis);
#endif /* EXECPARTITION_H */
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index e79e2c001f..7d4379da7b 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -35,6 +35,8 @@ typedef struct QueryDesc
/* These fields are provided by CreateQueryDesc */
CmdType operation; /* CMD_SELECT, CMD_UPDATE, etc. */
PlannedStmt *plannedstmt; /* planner's output (could be utility, too) */
+ List *part_prune_results; /* ExecutorDoInitialPruning()'s
+ * output for plannedstmt */
const char *sourceText; /* source text of the query */
Snapshot snapshot; /* snapshot to use for query */
Snapshot crosscheck_snapshot; /* crosscheck for RI update/delete */
@@ -57,6 +59,7 @@ typedef struct QueryDesc
/* in pquery.c */
extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
+ List *part_prune_results,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index ed95ed1176..c9a5e5fb68 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -185,6 +185,9 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
/*
* prototypes from functions in execMain.c
*/
+extern List *ExecutorDoInitialPruning(PlannedStmt *plannedstmt,
+ ParamListInfo params,
+ Bitmapset **scan_leafpart_rtis);
extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
extern void ExecutorRun(QueryDesc *queryDesc,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 4a741b053f..521a60b988 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -612,6 +612,7 @@ typedef struct EState
* ExecRowMarks, or NULL if none */
PlannedStmt *es_plannedstmt; /* link to top of plan tree */
List *es_part_prune_infos; /* PlannedStmt.partPruneInfos */
+ List *es_part_prune_results; /* QueryDesc.part_prune_results */
const char *es_sourceText; /* Source text from QueryDesc */
JunkFilter *es_junkFilter; /* top-level junk filter, if any */
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index a80f43e540..937cc4629d 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -212,6 +212,7 @@ extern struct Bitmapset *readBitmapset(void);
extern uintptr_t readDatum(bool typbyval);
extern bool *readBoolCols(int numCols);
extern int *readIntCols(int numCols);
+extern Index *readIndexCols(int numCols);
extern Oid *readOidCols(int numCols);
extern int16 *readAttrNumberCols(int numCols);
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index fbe75dca0f..354c2e96c3 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -125,6 +125,18 @@ typedef struct PlannerGlobal
/* List of PartitionPruneInfo contained in the plan */
List *partPruneInfos;
+ /*
+ * Do any of those PartitionPruneInfos have initial pruning steps in them?
+ */
+ bool containsInitialPruning;
+
+ /*
+ * Indexes of all range table entries minus indexes of range table entries
+ * of the leaf partitions scanned by prunable subplans; see
+ * AcquireExecutorLocks()
+ */
+ Bitmapset *minLockRelids;
+
/* OIDs of relations the plan depends on */
List *relationOids;
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 2e132afc5a..c0717bf45e 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -73,8 +73,17 @@ typedef struct PlannedStmt
List *partPruneInfos; /* List of PartitionPruneInfo contained in
* the plan */
+ bool containsInitialPruning; /* Do any of those PartitionPruneInfos
+ * have initial pruning steps in them?
+ */
+
List *rtable; /* list of RangeTblEntry nodes */
+ Bitmapset *minLockRelids; /* Indexes of all range table entries minus
+ * indexes of range table entries of the leaf
+ * partitions scanned by prunable subplans;
+ * see AcquireExecutorLocks() */
+
/* rtable indexes of target relations for INSERT/UPDATE/DELETE/MERGE */
List *resultRelations; /* integer list of RT indexes, or NIL */
@@ -1410,6 +1419,13 @@ typedef struct PlanRowMark
* prune_infos List of Lists containing PartitionedRelPruneInfo nodes,
* one sublist per run-time-prunable partition hierarchy
* appearing in the parent plan node's subplans.
+ *
+ * needs_init_pruning Does any of the PartitionedRelPruneInfos in
+ * prune_infos have its initial_pruning_steps set?
+ *
+ * needs_exec_pruning Does any of the PartitionedRelPruneInfos in
+ * prune_infos have its exec_pruning_steps set?
+ *
* other_subplans Indexes of any subplans that are not accounted for
* by any of the PartitionedRelPruneInfo nodes in
* "prune_infos". These subplans must not be pruned.
@@ -1420,6 +1436,8 @@ typedef struct PartitionPruneInfo
NodeTag type;
List *prune_infos;
+ bool needs_init_pruning;
+ bool needs_exec_pruning;
Bitmapset *other_subplans;
} PartitionPruneInfo;
@@ -1464,6 +1482,9 @@ typedef struct PartitionedRelPruneInfo
/* relation OID by partition index, or 0 */
Oid *relid_map pg_node_attr(array_size(nparts));
+ /* Range table index by partition index, or 0. */
+ Index *rti_map pg_node_attr(array_size(nparts));
+
/*
* initial_pruning_steps shows how to prune during executor startup (i.e.,
* without use of any PARAM_EXEC Params); it is NIL if no startup pruning
@@ -1548,6 +1569,31 @@ typedef struct PartitionPruneStepCombine
List *source_stepids;
} PartitionPruneStepCombine;
+/*----------------
+ * PartitionPruneResult
+ *
+ * The result of performing ExecPartitionDoInitialPruning() on a given
+ * PartitionPruneInfo.
+ *
+ * valid_subplan_offs contains the indexes of subplans remaining after
+ * performing initial pruning by calling ExecFindMatchingSubPlans() on the
+ * PartitionPruneInfo.
+ *
+ * This is used to store the result of initial partition pruning that is
+ * performed before the execution has started. A module that needs to do so
+ * should call ExecutorDoInitialPruning() on a given PlannedStmt, which
+ * returns a List of PartitionPruneResult containing an entry for each
+ * PartitionPruneInfo present in PlannedStmt.partPruneInfos. The module
+ * should then pass that list, along with the PlannedStmt, to the executor,
+ * so that it can reuse the result of initial partition pruning when
+ * initializing the subplans for execution.
+ */
+typedef struct PartitionPruneResult
+{
+ NodeTag type;
+
+ Bitmapset *valid_subplan_offs;
+} PartitionPruneResult;
/*
* Plan invalidation info
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 0499635f59..32579d4788 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -220,7 +220,8 @@ extern List *CachedPlanGetTargetList(CachedPlanSource *plansource,
extern CachedPlan *GetCachedPlan(CachedPlanSource *plansource,
ParamListInfo boundParams,
ResourceOwner owner,
- QueryEnvironment *queryEnv);
+ QueryEnvironment *queryEnv,
+ List **part_prune_results_list);
extern void ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner);
extern bool CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index aeddbdafe5..1901fc5f28 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -138,6 +138,7 @@ typedef struct PortalData
QueryCompletion qc; /* command completion data for executed query */
List *stmts; /* list of PlannedStmts */
CachedPlan *cplan; /* CachedPlan, if stmts are from one */
+ List *part_prune_results_list; /* List of Lists of PartitionPruneResults */
ParamListInfo portalParams; /* params to pass to query */
QueryEnvironment *queryEnv; /* environment for query */
@@ -242,6 +243,8 @@ extern void PortalDefineQuery(Portal portal,
CommandTag commandTag,
List *stmts,
CachedPlan *cplan);
+extern void PortalStorePartitionPruneResults(Portal portal,
+ List *part_prune_results_list);
extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
extern void PortalCreateHoldStore(Portal portal);
extern void PortalHashTableDeleteAll(void);
--
2.35.3
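The locking scheme in the plancache.c changes above can be illustrated with a minimal, self-contained sketch. It is not code from the patch: RelidSet is a stand-in for Bitmapset, lock_count is a stand-in for the lock manager, and every function name here is hypothetical. It shows the set of relations to lock being formed as the union of minLockRelids and the leaf partitions that survive initial pruning, with the locked set remembered so that exactly those locks can be released if the plan turns out to be invalid.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a Bitmapset: bit i set means RT index i is a member. */
typedef uint64_t RelidSet;

#define MAX_RTI 63

/* Hypothetical stand-in for the lock manager: per-RT-index lock counts. */
static int lock_count[MAX_RTI + 1];

/* Build a one-member set for the given RT index (1..MAX_RTI). */
RelidSet
rti_bit(int rti)
{
    assert(rti >= 1 && rti <= MAX_RTI);
    return (RelidSet) 1 << rti;
}

/* Report whether the relation at the given RT index is currently locked. */
int
is_locked(int rti)
{
    return lock_count[rti] > 0;
}

/*
 * Lock every relation in (min_lock_relids UNION surviving_leaf_rtis) and
 * return the set of RT indexes that were actually locked, so the caller can
 * hand it to release_locks_sketch() later.
 */
RelidSet
acquire_locks_sketch(RelidSet min_lock_relids, RelidSet surviving_leaf_rtis)
{
    RelidSet all_lock_relids = min_lock_relids | surviving_leaf_rtis;
    RelidSet locked = 0;
    int      rti;

    for (rti = 1; rti <= MAX_RTI; rti++)
    {
        if (all_lock_relids & rti_bit(rti))
        {
            lock_count[rti]++;
            locked |= rti_bit(rti);
        }
    }
    return locked;
}

/* Release exactly the locks recorded by an earlier acquire_locks_sketch(). */
void
release_locks_sketch(RelidSet locked)
{
    int rti;

    for (rti = 1; rti <= MAX_RTI; rti++)
    {
        if (locked & rti_bit(rti))
        {
            assert(lock_count[rti] > 0);
            lock_count[rti]--;
        }
    }
}
```

In this sketch a partition whose RT index is in neither input set is never touched, which is the point of doing initial pruning before locking. Recording the locked set separately matters because the release path cannot recompute it without redoing the pruning.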
[application/octet-stream] v24-0001-Move-PartitioPruneInfo-out-of-plan-nodes-into-Pl.patch (17.2K, 3-v24-0001-Move-PartitioPruneInfo-out-of-plan-nodes-into-Pl.patch)
From 9819109681e87342bf22549f5ea316501f77235d Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Fri, 27 May 2022 16:00:28 +0900
Subject: [PATCH v24 1/2] Move PartitionPruneInfo out of plan nodes into
PlannedStmt
The planner will now add a given PartitionPruneInfo to
PlannedStmt.partPruneInfos instead of directly to the
Append/MergeAppend plan node. What gets set instead in the
latter is an index field which points to the list element
of PlannedStmt.partPruneInfos containing the PartitionPruneInfo
belonging to the plan node.
A later commit will make AcquireExecutorLocks() do the initial
partition pruning to determine a minimal set of partitions to be
locked when validating a plan tree and it will need to consult the
PartitionPruneInfos referenced therein to do so. It is better for
the PartitionPruneInfos to be directly accessible, by simply
iterating over PlannedStmt.partPruneInfos, than to require a walk
of the plan tree to find them.
---
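To illustrate the indirection described in the commit message above, here is a minimal, self-contained sketch. It is not taken from the patch: PruneInfoSketch, PruneInfoList, AppendSketch and the three functions are hypothetical stand-ins for PartitionPruneInfo, the partPruneInfos list, an Append node and the planner/executor code that touches them. It assumes, as the setrefs.c hunks suggest, that a node's index is shifted by the current length of the global list before the local entries are appended to it.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PRUNE_INFOS 16

/* Hypothetical stand-in for a PartitionPruneInfo. */
typedef struct PruneInfoSketch
{
    int rtindex;
} PruneInfoSketch;

/* Hypothetical stand-in for a partPruneInfos list. */
typedef struct PruneInfoList
{
    const PruneInfoSketch *items[MAX_PRUNE_INFOS];
    int length;
} PruneInfoList;

/* Hypothetical stand-in for an Append node: an index, not a pointer. */
typedef struct AppendSketch
{
    int part_prune_index;   /* -1 means "no run-time pruning" */
} AppendSketch;

/* Append an entry; return its 0-based index, or -1 if the list is full. */
int
add_prune_info(PruneInfoList *list, const PruneInfoSketch *info)
{
    if (list->length >= MAX_PRUNE_INFOS)
        return -1;
    list->items[list->length] = info;
    return list->length++;
}

/* Resolve a node's index against the list; NULL if the node has none. */
const PruneInfoSketch *
lookup_prune_info(const PruneInfoList *list, const AppendSketch *node)
{
    if (node->part_prune_index < 0)
        return NULL;
    assert(node->part_prune_index < list->length);
    return list->items[node->part_prune_index];
}

/*
 * Shift a node's index so that it stays correct once its entry is moved
 * from a per-subquery list to the end of a global list that already holds
 * global_length_before entries.
 */
void
rebase_prune_index(AppendSketch *node, int global_length_before)
{
    if (node->part_prune_index >= 0)
        node->part_prune_index += global_length_before;
}
```

With the entries held in one flat list, a caller that needs every pruning descriptor can loop over the list directly, while each plan node still reaches its own descriptor in constant time through the index.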
src/backend/executor/execMain.c | 1 +
src/backend/executor/execParallel.c | 1 +
src/backend/executor/execPartition.c | 4 +-
src/backend/executor/execUtils.c | 1 +
src/backend/executor/nodeAppend.c | 4 +-
src/backend/executor/nodeMergeAppend.c | 4 +-
src/backend/optimizer/plan/createplan.c | 24 ++++-----
src/backend/optimizer/plan/planner.c | 1 +
src/backend/optimizer/plan/setrefs.c | 65 +++++++++++++------------
src/backend/partitioning/partprune.c | 18 ++++---
src/include/executor/execPartition.h | 3 +-
src/include/nodes/execnodes.h | 1 +
src/include/nodes/pathnodes.h | 6 +++
src/include/nodes/plannodes.h | 11 +++--
src/include/partitioning/partprune.h | 8 +--
15 files changed, 90 insertions(+), 62 deletions(-)
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index d78862e660..32475e33ff 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -825,6 +825,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
ExecInitRangeTable(estate, rangeTable);
estate->es_plannedstmt = plannedstmt;
+ estate->es_part_prune_infos = plannedstmt->partPruneInfos;
/*
* Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index 99512826c5..aca0c6f323 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -183,6 +183,7 @@ ExecSerializePlan(Plan *plan, EState *estate)
pstmt->dependsOnRole = false;
pstmt->parallelModeNeeded = false;
pstmt->planTree = plan;
+ pstmt->partPruneInfos = estate->es_part_prune_infos;
pstmt->rtable = estate->es_range_table;
pstmt->resultRelations = NIL;
pstmt->appendRelations = NIL;
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 40e3c07693..80197d5141 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -1791,11 +1791,13 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
PartitionPruneState *
ExecInitPartitionPruning(PlanState *planstate,
int n_total_subplans,
- PartitionPruneInfo *pruneinfo,
+ int part_prune_index,
Bitmapset **initially_valid_subplans)
{
PartitionPruneState *prunestate;
EState *estate = planstate->state;
+ PartitionPruneInfo *pruneinfo = list_nth(estate->es_part_prune_infos,
+ part_prune_index);
/* We may need an expression context to evaluate partition exprs */
ExecAssignExprContext(estate, planstate);
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 9df1f81ea8..21f4c10937 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -119,6 +119,7 @@ CreateExecutorState(void)
estate->es_relations = NULL;
estate->es_rowmarks = NULL;
estate->es_plannedstmt = NULL;
+ estate->es_part_prune_infos = NIL;
estate->es_junkFilter = NULL;
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 357e10a1d7..c6f86a6510 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -134,7 +134,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
appendstate->as_begun = false;
/* If run-time partition pruning is enabled, then set that up now */
- if (node->part_prune_info != NULL)
+ if (node->part_prune_index >= 0)
{
PartitionPruneState *prunestate;
@@ -145,7 +145,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
*/
prunestate = ExecInitPartitionPruning(&appendstate->ps,
list_length(node->appendplans),
- node->part_prune_info,
+ node->part_prune_index,
&validsubplans);
appendstate->as_prune_state = prunestate;
nplans = bms_num_members(validsubplans);
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index c5c62fa5c7..8d35860c30 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -82,7 +82,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
mergestate->ps.ExecProcNode = ExecMergeAppend;
/* If run-time partition pruning is enabled, then set that up now */
- if (node->part_prune_info != NULL)
+ if (node->part_prune_index >= 0)
{
PartitionPruneState *prunestate;
@@ -93,7 +93,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
*/
prunestate = ExecInitPartitionPruning(&mergestate->ps,
list_length(node->mergeplans),
- node->part_prune_info,
+ node->part_prune_index,
&validsubplans);
mergestate->ms_prune_state = prunestate;
nplans = bms_num_members(validsubplans);
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index ac86ce9003..50a5719ac6 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -1203,7 +1203,6 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
ListCell *subpaths;
int nasyncplans = 0;
RelOptInfo *rel = best_path->path.parent;
- PartitionPruneInfo *partpruneinfo = NULL;
int nodenumsortkeys = 0;
AttrNumber *nodeSortColIdx = NULL;
Oid *nodeSortOperators = NULL;
@@ -1354,6 +1353,9 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
subplans = lappend(subplans, subplan);
}
+ /* Set below if we find quals that we can use to run-time prune */
+ plan->part_prune_index = -1;
+
/*
* If any quals exist, they may be useful to perform further partition
* pruning during execution. Gather information needed by the executor to
@@ -1377,16 +1379,14 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
}
if (prunequal != NIL)
- partpruneinfo =
- make_partition_pruneinfo(root, rel,
- best_path->subpaths,
- prunequal);
+ plan->part_prune_index = make_partition_pruneinfo(root, rel,
+ best_path->subpaths,
+ prunequal);
}
plan->appendplans = subplans;
plan->nasyncplans = nasyncplans;
plan->first_partial_plan = best_path->first_partial_path;
- plan->part_prune_info = partpruneinfo;
copy_generic_path_info(&plan->plan, (Path *) best_path);
@@ -1425,7 +1425,6 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
List *subplans = NIL;
ListCell *subpaths;
RelOptInfo *rel = best_path->path.parent;
- PartitionPruneInfo *partpruneinfo = NULL;
/*
* We don't have the actual creation of the MergeAppend node split out
@@ -1518,6 +1517,9 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
subplans = lappend(subplans, subplan);
}
+ /* Set below if we find quals that we can use to run-time prune */
+ node->part_prune_index = -1;
+
/*
* If any quals exist, they may be useful to perform further partition
* pruning during execution. Gather information needed by the executor to
@@ -1541,13 +1543,13 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
}
if (prunequal != NIL)
- partpruneinfo = make_partition_pruneinfo(root, rel,
- best_path->subpaths,
- prunequal);
+ node->part_prune_index = make_partition_pruneinfo(root, rel,
+ best_path->subpaths,
+ prunequal);
}
node->mergeplans = subplans;
- node->part_prune_info = partpruneinfo;
+
/*
* If prepare_sort_from_pathkeys added sort columns, but we were told to
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 493a3af0fa..799602f5ea 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -519,6 +519,7 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
result->dependsOnRole = glob->dependsOnRole;
result->parallelModeNeeded = glob->parallelModeNeeded;
result->planTree = top_plan;
+ result->partPruneInfos = glob->partPruneInfos;
result->rtable = glob->finalrtable;
result->resultRelations = glob->resultRelations;
result->appendRelations = glob->appendRelations;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 1cb0abdbc1..720f20f563 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -348,6 +348,29 @@ set_plan_references(PlannerInfo *root, Plan *plan)
}
}
+ /* Also fix up the information in PartitionPruneInfos. */
+ foreach (lc, root->partPruneInfos)
+ {
+ PartitionPruneInfo *pruneinfo = lfirst(lc);
+ ListCell *l;
+
+ foreach(l, pruneinfo->prune_infos)
+ {
+ List *prune_infos = lfirst(l);
+ ListCell *l2;
+
+ foreach(l2, prune_infos)
+ {
+ PartitionedRelPruneInfo *pinfo = lfirst(l2);
+
+ /* RT index of the table to which the pinfo belongs. */
+ pinfo->rtindex += rtoffset;
+ }
+ }
+
+ glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
+ }
+
return result;
}
@@ -1658,21 +1681,12 @@ set_append_references(PlannerInfo *root,
aplan->apprelids = offset_relid_set(aplan->apprelids, rtoffset);
- if (aplan->part_prune_info)
- {
- foreach(l, aplan->part_prune_info->prune_infos)
- {
- List *prune_infos = lfirst(l);
- ListCell *l2;
-
- foreach(l2, prune_infos)
- {
- PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
- pinfo->rtindex += rtoffset;
- }
- }
- }
+ /*
+ * PartitionPruneInfos will be added to a list in PlannerGlobal, so update
+ * the index.
+ */
+ if (aplan->part_prune_index >= 0)
+ aplan->part_prune_index += list_length(root->glob->partPruneInfos);
/* We don't need to recurse to lefttree or righttree ... */
Assert(aplan->plan.lefttree == NULL);
@@ -1734,21 +1748,12 @@ set_mergeappend_references(PlannerInfo *root,
mplan->apprelids = offset_relid_set(mplan->apprelids, rtoffset);
- if (mplan->part_prune_info)
- {
- foreach(l, mplan->part_prune_info->prune_infos)
- {
- List *prune_infos = lfirst(l);
- ListCell *l2;
-
- foreach(l2, prune_infos)
- {
- PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
- pinfo->rtindex += rtoffset;
- }
- }
- }
+ /*
+ * PartitionPruneInfos will be added to a list in PlannerGlobal, so update
+ * the index.
+ */
+ if (mplan->part_prune_index >= 0)
+ mplan->part_prune_index += list_length(root->glob->partPruneInfos);
/* We don't need to recurse to lefttree or righttree ... */
Assert(mplan->plan.lefttree == NULL);
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 6188bf69cb..6565b6ed01 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -209,16 +209,20 @@ static void partkey_datum_from_expr(PartitionPruneContext *context,
/*
* make_partition_pruneinfo
- * Builds a PartitionPruneInfo which can be used in the executor to allow
- * additional partition pruning to take place. Returns NULL when
- * partition pruning would be useless.
+ * Checks if the given set of quals can be used to build pruning steps
+ * that the executor can use to prune away unneeded partitions. If
+ * suitable quals are found then a PartitionPruneInfo is built and tagged
+ * onto the PlannerInfo's partPruneInfos list.
+ *
+ * The return value is the 0-based index of the item added to the
+ * partPruneInfos list or -1 if nothing was added.
*
* 'parentrel' is the RelOptInfo for an appendrel, and 'subpaths' is the list
* of scan paths for its child rels.
* 'prunequal' is a list of potential pruning quals (i.e., restriction
* clauses that are applicable to the appendrel).
*/
-PartitionPruneInfo *
+int
make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
List *subpaths,
List *prunequal)
@@ -332,7 +336,7 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
* quals, then we can just not bother with run-time pruning.
*/
if (prunerelinfos == NIL)
- return NULL;
+ return -1;
/* Else build the result data structure */
pruneinfo = makeNode(PartitionPruneInfo);
@@ -358,7 +362,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
else
pruneinfo->other_subplans = NULL;
- return pruneinfo;
+ root->partPruneInfos = lappend(root->partPruneInfos, pruneinfo);
+
+ return list_length(root->partPruneInfos) - 1;
}
/*
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 708435e952..bf962af7af 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -123,9 +123,8 @@ typedef struct PartitionPruneState
extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
int n_total_subplans,
- PartitionPruneInfo *pruneinfo,
+ int part_prune_index,
Bitmapset **initially_valid_subplans);
extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
bool initial_prune);
-
#endif /* EXECPARTITION_H */
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 01b1727fc0..4a741b053f 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -611,6 +611,7 @@ typedef struct EState
struct ExecRowMark **es_rowmarks; /* Array of per-range-table-entry
* ExecRowMarks, or NULL if none */
PlannedStmt *es_plannedstmt; /* link to top of plan tree */
+ List *es_part_prune_infos; /* PlannedStmt.partPruneInfos */
const char *es_sourceText; /* Source text from QueryDesc */
JunkFilter *es_junkFilter; /* top-level junk filter, if any */
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 09342d128d..fbe75dca0f 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -122,6 +122,9 @@ typedef struct PlannerGlobal
/* "flat" list of AppendRelInfos */
List *appendRelations;
+ /* List of PartitionPruneInfo contained in the plan */
+ List *partPruneInfos;
+
/* OIDs of relations the plan depends on */
List *relationOids;
@@ -503,6 +506,9 @@ struct PlannerInfo
/* Does this query modify any partition key columns? */
bool partColsUpdated;
+
+ /* PartitionPruneInfos added in this query's plan. */
+ List *partPruneInfos;
};
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 5c2ab1b379..2e132afc5a 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -70,6 +70,9 @@ typedef struct PlannedStmt
struct Plan *planTree; /* tree of Plan nodes */
+ List *partPruneInfos; /* List of PartitionPruneInfo contained in
+ * the plan */
+
List *rtable; /* list of RangeTblEntry nodes */
/* rtable indexes of target relations for INSERT/UPDATE/DELETE/MERGE */
@@ -270,8 +273,8 @@ typedef struct Append
*/
int first_partial_plan;
- /* Info for run-time subplan pruning; NULL if we're not doing that */
- struct PartitionPruneInfo *part_prune_info;
+ /* Index to PlannerInfo.partPruneInfos or -1 if no run-time pruning */
+ int part_prune_index;
} Append;
/* ----------------
@@ -305,8 +308,8 @@ typedef struct MergeAppend
/* NULLS FIRST/LAST directions */
bool *nullsFirst pg_node_attr(array_size(numCols));
- /* Info for run-time subplan pruning; NULL if we're not doing that */
- struct PartitionPruneInfo *part_prune_info;
+ /* Index to PlannerInfo.partPruneInfos or -1 if no run-time pruning */
+ int part_prune_index;
} MergeAppend;
/* ----------------
diff --git a/src/include/partitioning/partprune.h b/src/include/partitioning/partprune.h
index 90684efa25..ebf0dcff8c 100644
--- a/src/include/partitioning/partprune.h
+++ b/src/include/partitioning/partprune.h
@@ -70,10 +70,10 @@ typedef struct PartitionPruneContext
#define PruneCxtStateIdx(partnatts, step_id, keyno) \
((partnatts) * (step_id) + (keyno))
-extern PartitionPruneInfo *make_partition_pruneinfo(struct PlannerInfo *root,
- struct RelOptInfo *parentrel,
- List *subpaths,
- List *prunequal);
+extern int make_partition_pruneinfo(struct PlannerInfo *root,
+ struct RelOptInfo *parentrel,
+ List *subpaths,
+ List *prunequal);
extern Bitmapset *prune_append_rel_partitions(struct RelOptInfo *rel);
extern Bitmapset *get_matching_partitions(PartitionPruneContext *context,
List *pruning_steps);
--
2.35.3
* Re: generic plans and "initial" pruning
@ 2022-11-30 18:12 Alvaro Herrera <[email protected]>
parent: Amit Langote <[email protected]>
From: Alvaro Herrera @ 2022-11-30 18:12 UTC (permalink / raw)
To: Amit Langote <[email protected]>; +Cc: Robert Haas <[email protected]>; Jacob Champion <[email protected]>; Zhihong Yu <[email protected]>; David Rowley <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers
Looking at 0001, I wonder if we should have a crosscheck that a
PartitionPruneInfo you got from following an index is indeed constructed
for the relation that you think it is: previously, you were always sure
that the prune struct is for this node, because you followed a pointer
that was set up in the node itself. Now you only have an index, and you
have to trust that the index is correct.
I'm not sure how to implement this, or even if it's doable at all.
Keeping the OID of the partitioned table in the PartitionPruneInfo
struct is easy, but I don't know how to check it in ExecInitMergeAppend
and ExecInitAppend.
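
For readers following along, the reason the index has to be trusted is that 0001 rebases it when a subquery's PartitionPruneInfos are appended to the global list (the "part_prune_index += list_length(root->glob->partPruneInfos)" step in setrefs.c). A minimal sketch of that rebasing follows; SketchAppend and sketch_rebase_index are hypothetical stand-ins for illustration, not PostgreSQL's actual structs or functions.

```c
#include <assert.h>

/* Hypothetical stand-in for an Append/MergeAppend plan node. */
typedef struct SketchAppend
{
    int part_prune_index;   /* -1 means no run-time pruning */
} SketchAppend;

/*
 * Rebase a subquery-local index onto the global list, given how many
 * entries the global list held before this subquery's entries were
 * appended.  Nodes without pruning info (-1) are left untouched.
 */
void
sketch_rebase_index(SketchAppend *node, int n_global_before)
{
    assert(n_global_before >= 0);
    if (node->part_prune_index >= 0)
        node->part_prune_index += n_global_before;
}
```

If the offset and the order of appending ever disagree, the node silently ends up pointing at another node's entry, which is the failure a crosscheck would catch.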
--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
"Find a bug in a program, and fix it, and the program will work today.
Show the program how to find and fix a bug, and the program
will work forever" (Oliver Silfridge)
* Re: generic plans and "initial" pruning
@ 2022-12-01 07:59 Amit Langote <[email protected]>
parent: Alvaro Herrera <[email protected]>
From: Amit Langote @ 2022-12-01 07:59 UTC (permalink / raw)
To: Alvaro Herrera <[email protected]>; +Cc: Robert Haas <[email protected]>; Jacob Champion <[email protected]>; Zhihong Yu <[email protected]>; David Rowley <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers
Hi Alvaro,
Thanks for looking at this one.
On Thu, Dec 1, 2022 at 3:12 AM Alvaro Herrera <[email protected]> wrote:
> Looking at 0001, I wonder if we should have a crosscheck that a
> PartitionPruneInfo you got from following an index is indeed constructed
> for the relation that you think it is: previously, you were always sure
> that the prune struct is for this node, because you followed a pointer
> that was set up in the node itself. Now you only have an index, and you
> have to trust that the index is correct.
Yeah, a crosscheck sounds like a good idea.
> I'm not sure how to implement this, or even if it's doable at all.
> Keeping the OID of the partitioned table in the PartitionPruneInfo
> struct is easy, but I don't know how to check it in ExecInitMergeAppend
> and ExecInitAppend.
Hmm, how about keeping the [Merge]Append's parent relation's RT index
in the PartitionPruneInfo and passing it down to
ExecInitPartitionPruning() from ExecInit[Merge]Append() for
cross-checking? Both Append and MergeAppend already have an
'apprelids' field that we can save a copy of in the
PartitionPruneInfo. Tried that in the attached delta patch.
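
The shape of the proposed check can be sketched as below, assuming a plain unsigned bitmask in place of a Bitmapset and an array in place of es_part_prune_infos. SketchPruneInfo and sketch_lookup_checked are hypothetical names used only for illustration; the delta patch itself raises an error with elog() rather than returning NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for PartitionPruneInfo. */
typedef struct SketchPruneInfo
{
    unsigned root_parent_relids;    /* bitmask stand-in for a Bitmapset */
} SketchPruneInfo;

/*
 * Return the entry at 'index' only if it was built for the same parent
 * relids as the calling plan node.  NULL signals an out-of-range index
 * or a relids mismatch.
 */
const SketchPruneInfo *
sketch_lookup_checked(const SketchPruneInfo *infos, int ninfos,
                      int index, unsigned node_relids)
{
    if (index < 0 || index >= ninfos)
        return NULL;
    if (infos[index].root_parent_relids != node_relids)
        return NULL;
    return &infos[index];
}
```

The comparison costs little at executor startup and turns a silently wrong lookup into an immediate, reportable failure.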
--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com
Attachments:
[application/octet-stream] PartitionPruneInfo-relids.patch (5.3K, 2-PartitionPruneInfo-relids.patch)
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 2bd069d889..9a631a9192 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -1791,6 +1791,9 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
* Initialize data structure needed for run-time partition pruning and
* do initial pruning if needed
*
+ * 'root_parent_relids' identifies the relation to which both the parent plan
+ * and the PartitionPruneInfo given by 'part_prune_index' belong.
+ *
* On return, *initially_valid_subplans is assigned the set of indexes of
* child subplans that must be initialized along with the parent plan node.
* Initial pruning is performed here if needed and in that case only the
@@ -1804,6 +1807,7 @@ PartitionPruneState *
ExecInitPartitionPruning(PlanState *planstate,
int n_total_subplans,
int part_prune_index,
+ Bitmapset *root_parent_relids,
Bitmapset **initially_valid_subplans)
{
PartitionPruneState *prunestate;
@@ -1811,6 +1815,14 @@ ExecInitPartitionPruning(PlanState *planstate,
PartitionPruneInfo *pruneinfo = list_nth(estate->es_part_prune_infos,
part_prune_index);
+ /* Sanity: part_prune_index gives the correct PartitionPruneInfo. */
+ if (!bms_equal(root_parent_relids, pruneinfo->root_parent_relids))
+ elog(ERROR, "wrong relids (%s) found in PartitionPruneInfo at part_prune_index=%d which has root_parent_relids=%s",
+ bmsToString(root_parent_relids),
+ part_prune_index,
+ bmsToString(pruneinfo->root_parent_relids));
+
+
/* We may need an expression context to evaluate partition exprs */
ExecAssignExprContext(estate, planstate);
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index c6f86a6510..99830198bd 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -146,6 +146,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
prunestate = ExecInitPartitionPruning(&appendstate->ps,
list_length(node->appendplans),
node->part_prune_index,
+ node->apprelids,
&validsubplans);
appendstate->as_prune_state = prunestate;
nplans = bms_num_members(validsubplans);
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index 8d35860c30..f370f9f287 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -94,6 +94,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
prunestate = ExecInitPartitionPruning(&mergestate->ps,
list_length(node->mergeplans),
node->part_prune_index,
+ node->apprelids,
&validsubplans);
mergestate->ms_prune_state = prunestate;
nplans = bms_num_members(validsubplans);
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 720f20f563..e67f0e3509 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -354,6 +354,8 @@ set_plan_references(PlannerInfo *root, Plan *plan)
PartitionPruneInfo *pruneinfo = lfirst(lc);
ListCell *l;
+ pruneinfo->root_parent_relids =
+ offset_relid_set(pruneinfo->root_parent_relids, rtoffset);
foreach(l, pruneinfo->prune_infos)
{
List *prune_infos = lfirst(l);
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 6565b6ed01..d48f6784c1 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -340,6 +340,7 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
/* Else build the result data structure */
pruneinfo = makeNode(PartitionPruneInfo);
+ pruneinfo->root_parent_relids = parentrel->relids;
pruneinfo->prune_infos = prunerelinfos;
/*
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index bf962af7af..17fabc18c9 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -124,6 +124,7 @@ typedef struct PartitionPruneState
extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
int n_total_subplans,
int part_prune_index,
+ Bitmapset *root_parent_relids,
Bitmapset **initially_valid_subplans);
extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
bool initial_prune);
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 2e132afc5a..b2d6f8fb6e 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -1407,6 +1407,8 @@ typedef struct PlanRowMark
* Then, since an Append-type node could have multiple partitioning
* hierarchies among its children, we have an unordered List of those Lists.
*
+ * root_parent_relids RelOptInfo.relids of the relation to which the parent
+ * plan node and this PartitionPruneInfo node belong
* prune_infos List of Lists containing PartitionedRelPruneInfo nodes,
* one sublist per run-time-prunable partition hierarchy
* appearing in the parent plan node's subplans.
@@ -1419,6 +1421,7 @@ typedef struct PartitionPruneInfo
pg_node_attr(no_equal)
NodeTag type;
+ Bitmapset *root_parent_relids;
List *prune_infos;
Bitmapset *other_subplans;
} PartitionPruneInfo;
* Re: generic plans and "initial" pruning
@ 2022-12-01 11:21 Alvaro Herrera <[email protected]>
parent: Amit Langote <[email protected]>
From: Alvaro Herrera @ 2022-12-01 11:21 UTC (permalink / raw)
To: Amit Langote <[email protected]>; +Cc: Robert Haas <[email protected]>; Jacob Champion <[email protected]>; Zhihong Yu <[email protected]>; David Rowley <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers
On 2022-Dec-01, Amit Langote wrote:
> Hmm, how about keeping the [Merge]Append's parent relation's RT index
> in the PartitionPruneInfo and passing it down to
> ExecInitPartitionPruning() from ExecInit[Merge]Append() for
> cross-checking? Both Append and MergeAppend already have an
> 'apprelids' field that we can save a copy of in the
> PartitionPruneInfo. Tried that in the attached delta patch.
Ah yeah, that sounds about what I was thinking. I've merged that in and
pushed to github, which had a strange pg_upgrade failure on Windows
mentioning log files that were not captured by the CI tooling. So I
pushed another one trying to grab those files, in case it wasn't a
one-off failure. It's running now:
https://cirrus-ci.com/task/5857239638999040
If all goes well with this run, I'll get this 0001 pushed.
--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
"Investigación es lo que hago cuando no sé lo que estoy haciendo"
(Wernher von Braun)
* Re: generic plans and "initial" pruning
@ 2022-12-01 12:43 Amit Langote <[email protected]>
parent: Alvaro Herrera <[email protected]>
From: Amit Langote @ 2022-12-01 12:43 UTC (permalink / raw)
To: Alvaro Herrera <[email protected]>; +Cc: Robert Haas <[email protected]>; Jacob Champion <[email protected]>; Zhihong Yu <[email protected]>; David Rowley <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers
On Thu, Dec 1, 2022 at 8:21 PM Alvaro Herrera <[email protected]> wrote:
> On 2022-Dec-01, Amit Langote wrote:
> > Hmm, how about keeping the [Merge]Append's parent relation's RT index
> > in the PartitionPruneInfo and passing it down to
> > ExecInitPartitionPruning() from ExecInit[Merge]Append() for
> > cross-checking? Both Append and MergeAppend already have an
> > 'apprelids' field that we can save a copy of in the
> > PartitionPruneInfo. Tried that in the attached delta patch.
>
> Ah yeah, that sounds about what I was thinking. I've merged that in and
> pushed to github, which had a strange pg_upgrade failure on Windows
> mentioning log files that were not captured by the CI tooling. So I
> pushed another one trying to grab those files, in case it wasn't a
> one-off failure. It's running now:
> https://cirrus-ci.com/task/5857239638999040
>
> If all goes well with this run, I'll get this 0001 pushed.
Thanks for pushing 0001.
Rebased 0002 attached.
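
As a rough illustration of what the patch's commit message describes (lock only the relations scanned by subplans that survive initial pruning), here is a minimal sketch. It assumes bitmasks in place of Bitmapsets and one scanned RT index per subplan; sketch_rtis_to_lock and SKETCH_MAX_BITS are hypothetical names, not functions or macros from the patch.

```c
#include <assert.h>

#define SKETCH_MAX_BITS 32

/*
 * Given a bitmask of subplans that survived initial pruning and the RT
 * index scanned by each subplan, return the bitmask of RT indexes whose
 * relations should be locked.  Pruned subplans contribute nothing.
 */
unsigned
sketch_rtis_to_lock(unsigned valid_subplans, const int *subplan_rti,
                    int nsubplans)
{
    unsigned lock_rtis = 0;

    assert(nsubplans >= 0 && nsubplans <= SKETCH_MAX_BITS);
    for (int i = 0; i < nsubplans; i++)
    {
        if (valid_subplans & (1u << i))
        {
            assert(subplan_rti[i] >= 0 && subplan_rti[i] < SKETCH_MAX_BITS);
            lock_rtis |= 1u << subplan_rti[i];
        }
    }
    return lock_rtis;
}
```

The same valid_subplans set then has to be handed to the executor unchanged, since recomputing it later could select a subplan whose relation was never locked.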
--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com
Attachments:
[application/octet-stream] v25-0001-Optimize-AcquireExecutorLocks-by-locking-only-un.patch (82.4K, 2-v25-0001-Optimize-AcquireExecutorLocks-by-locking-only-un.patch)
From cff400af6c264d7a2651faec4d963e987797f588 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Wed, 22 Dec 2021 16:55:17 +0900
Subject: [PATCH v25] Optimize AcquireExecutorLocks() by locking only unpruned
partitions
This commit teaches AcquireExecutorLocks() to perform initial
partition pruning to notionally eliminate the subnodes contained in a
generic cached plan that need not be initialized during the actual
execution of the plan, and to skip locking the partitions scanned by
those subnodes.
The result of performing initial partition pruning this way, before the
actual execution has started, is passed to the executor as a list of
PartitionPruneResult nodes, which callers of the executor that used
plancache.c to get the plan supply along with the PlannedStmt. The list
is empty (NIL) when the plan is obtained by calling the planner
directly or when the plan obtained by plancache.c is not a generic one.
---
src/backend/commands/copyto.c | 2 +-
src/backend/commands/createas.c | 2 +-
src/backend/commands/explain.c | 7 +-
src/backend/commands/extension.c | 2 +-
src/backend/commands/matview.c | 2 +-
src/backend/commands/prepare.c | 26 ++-
src/backend/executor/README | 32 ++++
src/backend/executor/execMain.c | 51 ++++++
src/backend/executor/execParallel.c | 26 ++-
src/backend/executor/execPartition.c | 238 +++++++++++++++++++++----
src/backend/executor/execUtils.c | 1 +
src/backend/executor/functions.c | 2 +-
src/backend/executor/nodeAppend.c | 11 +-
src/backend/executor/nodeMergeAppend.c | 5 +-
src/backend/executor/spi.c | 27 ++-
src/backend/nodes/readfuncs.c | 8 +-
src/backend/optimizer/plan/planner.c | 2 +
src/backend/optimizer/plan/setrefs.c | 46 +++++
src/backend/partitioning/partprune.c | 41 ++++-
src/backend/tcop/postgres.c | 8 +-
src/backend/tcop/pquery.c | 28 ++-
src/backend/utils/cache/plancache.c | 208 ++++++++++++++++++---
src/backend/utils/mmgr/portalmem.c | 19 ++
src/include/commands/explain.h | 4 +-
src/include/executor/execPartition.h | 9 +-
src/include/executor/execdesc.h | 3 +
src/include/executor/executor.h | 3 +
src/include/nodes/execnodes.h | 1 +
src/include/nodes/nodes.h | 1 +
src/include/nodes/pathnodes.h | 12 ++
src/include/nodes/plannodes.h | 46 +++++
src/include/utils/plancache.h | 3 +-
src/include/utils/portal.h | 3 +
33 files changed, 781 insertions(+), 98 deletions(-)
diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index f26cc0d162..401a2280a3 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -558,7 +558,7 @@ BeginCopyTo(ParseState *pstate,
((DR_copy *) dest)->cstate = cstate;
/* Create a QueryDesc requesting no output */
- cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ cstate->queryDesc = CreateQueryDesc(plan, NIL, pstate->p_sourcetext,
GetActiveSnapshot(),
InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 152c29b551..942449544c 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -325,7 +325,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ queryDesc = CreateQueryDesc(plan, NIL, pstate->p_sourcetext,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index f86983c660..2f2b558608 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -407,7 +407,7 @@ ExplainOneQuery(Query *query, int cursorOptions,
}
/* run it (if needed) and produce output */
- ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+ ExplainOnePlan(plan, NIL, into, es, queryString, params, queryEnv,
&planduration, (es->buffers ? &bufusage : NULL));
}
}
@@ -515,7 +515,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
* to call it.
*/
void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, List *part_prune_results,
+ IntoClause *into, ExplainState *es,
const char *queryString, ParamListInfo params,
QueryEnvironment *queryEnv, const instr_time *planduration,
const BufferUsage *bufusage)
@@ -563,7 +564,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
dest = None_Receiver;
/* Create a QueryDesc for the query */
- queryDesc = CreateQueryDesc(plannedstmt, queryString,
+ queryDesc = CreateQueryDesc(plannedstmt, part_prune_results, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, instrument_option);
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index cf1b1ca571..904cbcba4a 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -779,7 +779,7 @@ execute_sql_string(const char *sql)
{
QueryDesc *qdesc;
- qdesc = CreateQueryDesc(stmt,
+ qdesc = CreateQueryDesc(stmt, NIL,
sql,
GetActiveSnapshot(), NULL,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 9ac0383459..65c8d0aa59 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -408,7 +408,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, queryString,
+ queryDesc = CreateQueryDesc(plan, NIL, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 9e29584d93..29b45539d3 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -155,6 +155,7 @@ ExecuteQuery(ParseState *pstate,
PreparedStatement *entry;
CachedPlan *cplan;
List *plan_list;
+ List *part_prune_results_list;
ParamListInfo paramLI = NULL;
EState *estate = NULL;
Portal portal;
@@ -193,7 +194,10 @@ ExecuteQuery(ParseState *pstate,
entry->plansource->query_string);
/* Replan if needed, and increment plan refcount for portal */
- cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL);
+ cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL,
+ &part_prune_results_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_results_list));
plan_list = cplan->stmt_list;
/*
@@ -207,6 +211,9 @@ ExecuteQuery(ParseState *pstate,
plan_list,
cplan);
+ /* Copy Lists of PartitionPruneResults into the portal's context. */
+ PortalStorePartitionPruneResults(portal, part_prune_results_list);
+
/*
* For CREATE TABLE ... AS EXECUTE, we must verify that the prepared
* statement is one that produces tuples. Currently we insist that it be
@@ -576,7 +583,9 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
const char *query_string;
CachedPlan *cplan;
List *plan_list;
- ListCell *p;
+ List *part_prune_results_list;
+ ListCell *p,
+ *pp;
ParamListInfo paramLI = NULL;
EState *estate = NULL;
instr_time planstart;
@@ -619,7 +628,10 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
/* Replan if needed, and acquire a transient refcount */
cplan = GetCachedPlan(entry->plansource, paramLI,
- CurrentResourceOwner, queryEnv);
+ CurrentResourceOwner, queryEnv,
+ &part_prune_results_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_results_list));
INSTR_TIME_SET_CURRENT(planduration);
INSTR_TIME_SUBTRACT(planduration, planstart);
@@ -634,13 +646,15 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
plan_list = cplan->stmt_list;
/* Explain each query */
- foreach(p, plan_list)
+ forboth(p, plan_list, pp, part_prune_results_list)
{
PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
+ List *part_prune_results = lfirst_node(List, pp);
if (pstmt->commandType != CMD_UTILITY)
- ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv,
- &planduration, (es->buffers ? &bufusage : NULL));
+ ExplainOnePlan(pstmt, part_prune_results, into, es, query_string,
+ paramLI, queryEnv, &planduration,
+ (es->buffers ? &bufusage : NULL));
else
ExplainOneUtility(pstmt->utilityStmt, into, es, query_string,
paramLI, queryEnv);
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 17775a49e2..5c59ac5da7 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -65,6 +65,34 @@ found there. This currently only occurs for Append and MergeAppend nodes. In
this case the non-required subplans are ignored and the executor state's
subnode array will become out of sequence to the plan's subplan list.
+Actually, the so-called execution time pruning may also occur even before the
+execution has started. One case where that occurs is when a cached generic
+plan is being validated for execution by plancache.c: GetCachedPlan(), which
+works by locking all the relations that will be scanned by that plan. If the
+generic plan contains nodes that can perform execution time partition pruning
+(that is, contain a PartitionPruneInfo), a subset of pruning steps contained
+in a given node's PartitionPruneInfo that do not depend on the execution
+actually having started (called "initial" pruning steps) are performed to
+figure out the minimal set of child subplans that satisfy those pruning steps.
+AcquireExecutorLocks() looking at a given generic plan will then lock only the
+relations scanned by the child subplans that survived such pruning, along with
+those present in PlannedStmt.minLockRelids. Note that the subplans are only
+notionally pruned, that is, they are not removed from the plan tree as such.
+
+To prevent the executor and any third party execution code that can look at
+the plan tree from trying to execute the subplans that were pruned as
+described above, the result of pruning is passed to the executor as a List
+of PartitionPruneResult nodes via the QueryDesc. Each PartitionPruneResult
+consists of the set of indexes of surviving subplans in the respective parent
+plan node's (the one to which the corresponding PartitionPruneInfo belongs)
+list of child subplans, saved as a bitmapset (valid_subplan_offs). In other
+words, the executor executing a generic plan should not re-evaluate the set of
+initially valid subplans for a given plan node by redoing the initial pruning
+if it was already done by AcquireExecutorLocks() when validating the plan.
+Such re-evaluation of the pruning steps may very well end up resulting in a
+different set of subplans, containing some whose relations were not locked by
+AcquireExecutorLocks().
+
Each Plan node may have expression trees associated with it, to represent
its target list, qualification conditions, etc. These trees are also
read-only to the executor, but the executor state for expression evaluation
@@ -286,6 +314,10 @@ Query Processing Control Flow
This is a sketch of control flow for full query processing:
+ [ ExecutorDoInitialPruning ] --- an optional step to perform initial
+ partition pruning on the plan tree the result of which is passed
+ to the executor via QueryDesc
+
CreateQueryDesc
ExecutorStart
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index b6751da574..7a4db80104 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -49,6 +49,7 @@
#include "commands/matview.h"
#include "commands/trigger.h"
#include "executor/execdebug.h"
+#include "executor/execPartition.h"
#include "executor/nodeSubplan.h"
#include "foreign/fdwapi.h"
#include "jit/jit.h"
@@ -104,6 +105,54 @@ static void EvalPlanQualStart(EPQState *epqstate, Plan *planTree);
/* end of local decls */
+/* ----------------------------------------------------------------
+ * ExecutorDoInitialPruning
+ *
+ * For each plan tree node that has been assigned a PartitionPruneInfo,
+ * this performs initial partition pruning using the information contained
+ * therein to determine the set of child subplans that satisfy the initial
+ * pruning steps, to be returned as a bitmapset of their indexes in the
+ * node's list of child subplans (for example, an Append's appendplans).
+ *
+ * Return value is a list of PartitionPruneResult nodes, one for every
+ * PartitionPruneInfo, each holding that node's bitmapset. The RT indexes of
+ * all the leaf partitions scanned by the chosen subplans are collected in
+ * *scan_leafpart_rtis, which is shared across all PartitionPruneInfos.
+ *
+ * The executor must see exactly the same set of subplans as valid for
+ * execution when doing ExecInitNode() on the plan nodes whose
+ * PartitionPruneInfos are processed here. So, it must get the set from the
+ * aforementioned PartitionPruneResult, instead of computing it all over
+ * again by redoing the initial pruning. It's the caller's job to pass the
+ * PartitionPruneResult to the executor.
+ *
+ * Note: Partitioned tables mentioned in PartitionedRelPruneInfo nodes that
+ * drive the pruning will be locked before doing the pruning.
+ * ----------------------------------------------------------------
+ */
+List *
+ExecutorDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
+ Bitmapset **scan_leafpart_rtis)
+{
+ List *part_prune_results = NIL;
+ ListCell *lc;
+
+ /* Only get here if there is any pruning to do. */
+ Assert(plannedstmt->containsInitialPruning);
+
+ foreach(lc, plannedstmt->partPruneInfos)
+ {
+ PartitionPruneInfo *pruneinfo = lfirst(lc);
+ PartitionPruneResult *pruneresult = makeNode(PartitionPruneResult);
+
+ pruneresult->valid_subplan_offs =
+ ExecPartitionDoInitialPruning(plannedstmt, params, pruneinfo,
+ scan_leafpart_rtis);
+ part_prune_results = lappend(part_prune_results, pruneresult);
+ }
+
+ return part_prune_results;
+}
/* ----------------------------------------------------------------
* ExecutorStart
@@ -806,6 +855,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
{
CmdType operation = queryDesc->operation;
PlannedStmt *plannedstmt = queryDesc->plannedstmt;
+ List *part_prune_results = queryDesc->part_prune_results;
Plan *plan = plannedstmt->planTree;
List *rangeTable = plannedstmt->rtable;
EState *estate = queryDesc->estate;
@@ -826,6 +876,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
estate->es_plannedstmt = plannedstmt;
estate->es_part_prune_infos = plannedstmt->partPruneInfos;
+ estate->es_part_prune_results = part_prune_results;
/*
* Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index aca0c6f323..917079a034 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -66,6 +66,7 @@
#define PARALLEL_KEY_QUERY_TEXT UINT64CONST(0xE000000000000008)
#define PARALLEL_KEY_JIT_INSTRUMENTATION UINT64CONST(0xE000000000000009)
#define PARALLEL_KEY_WAL_USAGE UINT64CONST(0xE00000000000000A)
+#define PARALLEL_KEY_PARTITION_PRUNE_RESULTS UINT64CONST(0xE00000000000000B)
#define PARALLEL_TUPLE_QUEUE_SIZE 65536
@@ -182,6 +183,7 @@ ExecSerializePlan(Plan *plan, EState *estate)
pstmt->transientPlan = false;
pstmt->dependsOnRole = false;
pstmt->parallelModeNeeded = false;
+ pstmt->containsInitialPruning = false;
pstmt->planTree = plan;
pstmt->partPruneInfos = estate->es_part_prune_infos;
pstmt->rtable = estate->es_range_table;
@@ -597,12 +599,15 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
FixedParallelExecutorState *fpes;
char *pstmt_data;
char *pstmt_space;
+ char *part_prune_results_data;
+ char *part_prune_results_space;
char *paramlistinfo_space;
BufferUsage *bufusage_space;
WalUsage *walusage_space;
SharedExecutorInstrumentation *instrumentation = NULL;
SharedJitInstrumentation *jit_instrumentation = NULL;
int pstmt_len;
+ int part_prune_results_len;
int paramlistinfo_len;
int instrumentation_len = 0;
int jit_instrumentation_len = 0;
@@ -631,6 +636,7 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
/* Fix up and serialize plan to be sent to workers. */
pstmt_data = ExecSerializePlan(planstate->plan, estate);
+ part_prune_results_data = nodeToString(estate->es_part_prune_results);
/* Create a parallel context. */
pcxt = CreateParallelContext("postgres", "ParallelQueryMain", nworkers);
@@ -657,6 +663,11 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
shm_toc_estimate_chunk(&pcxt->estimator, pstmt_len);
shm_toc_estimate_keys(&pcxt->estimator, 1);
+ /* Estimate space for serialized List of PartitionPruneResult. */
+ part_prune_results_len = strlen(part_prune_results_data) + 1;
+ shm_toc_estimate_chunk(&pcxt->estimator, part_prune_results_len);
+ shm_toc_estimate_keys(&pcxt->estimator, 1);
+
/* Estimate space for serialized ParamListInfo. */
paramlistinfo_len = EstimateParamListSpace(estate->es_param_list_info);
shm_toc_estimate_chunk(&pcxt->estimator, paramlistinfo_len);
@@ -751,6 +762,12 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
memcpy(pstmt_space, pstmt_data, pstmt_len);
shm_toc_insert(pcxt->toc, PARALLEL_KEY_PLANNEDSTMT, pstmt_space);
+ /* Store serialized List of PartitionPruneResult */
+ part_prune_results_space = shm_toc_allocate(pcxt->toc, part_prune_results_len);
+ memcpy(part_prune_results_space, part_prune_results_data, part_prune_results_len);
+ shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARTITION_PRUNE_RESULTS,
+ part_prune_results_space);
+
/* Store serialized ParamListInfo. */
paramlistinfo_space = shm_toc_allocate(pcxt->toc, paramlistinfo_len);
shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARAMLISTINFO, paramlistinfo_space);
@@ -1232,8 +1249,10 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
int instrument_options)
{
char *pstmtspace;
+ char *part_prune_results_space;
char *paramspace;
PlannedStmt *pstmt;
+ List *part_prune_results;
ParamListInfo paramLI;
char *queryString;
@@ -1244,12 +1263,17 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
pstmtspace = shm_toc_lookup(toc, PARALLEL_KEY_PLANNEDSTMT, false);
pstmt = (PlannedStmt *) stringToNode(pstmtspace);
+ /* Reconstruct leader-supplied PartitionPruneResult. */
+ part_prune_results_space =
+ shm_toc_lookup(toc, PARALLEL_KEY_PARTITION_PRUNE_RESULTS, false);
+ part_prune_results = (List *) stringToNode(part_prune_results_space);
+
/* Reconstruct ParamListInfo. */
paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
paramLI = RestoreParamList(&paramspace);
/* Create a QueryDesc for the query. */
- return CreateQueryDesc(pstmt,
+ return CreateQueryDesc(pstmt, part_prune_results,
queryString,
GetActiveSnapshot(), InvalidSnapshot,
receiver, paramLI, NULL, instrument_options);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 8e6453aec2..13e450c0fa 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -25,6 +25,7 @@
#include "mb/pg_wchar.h"
#include "miscadmin.h"
#include "nodes/makefuncs.h"
+#include "parser/parsetree.h"
#include "partitioning/partbounds.h"
#include "partitioning/partdesc.h"
#include "partitioning/partprune.h"
@@ -185,7 +186,11 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
static PartitionPruneState *CreatePartitionPruneState(PlanState *planstate,
- PartitionPruneInfo *pruneinfo);
+ PartitionPruneInfo *pruneinfo,
+ bool consider_initial_steps,
+ bool consider_exec_steps,
+ List *rtable, ExprContext *econtext,
+ PartitionDirectory partdir);
static void InitPartitionPruneContext(PartitionPruneContext *context,
List *pruning_steps,
PartitionDesc partdesc,
@@ -198,7 +203,8 @@ static void PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
- Bitmapset **validsubplans);
+ Bitmapset **validsubplans,
+ Bitmapset **scan_leafpart_rtis);
/*
@@ -1758,8 +1764,10 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
* considered to be a stable expression, it can change value from one plan
* node scan to the next during query execution. Stable comparison
* expressions that don't involve such Params allow partition pruning to be
- * done once during executor startup. Expressions that do involve such Params
- * require us to prune separately for each scan of the parent plan node.
+ * done once during executor startup or in ExecutorDoInitialPruning(), which
+ * runs as part of performing AcquireExecutorLocks() on a given plan tree.
+ * Expressions that do involve such Params require us to prune separately for
+ * each scan of the parent plan node.
*
* Note that pruning away unneeded subplans during executor startup has the
* added benefit of not having to initialize the unneeded subplans at all.
@@ -1776,6 +1784,13 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
* account for initial pruning possibly having eliminated some of the
* subplans.
*
+ * ExecPartitionDoInitialPruning:
+ * Do initial pruning with the information contained in a given
+ * PartitionPruneInfo to determine the minimal set of child subplans
+ * of the parent plan node (the one to which the PartitionPruneInfo
+ * belongs) to be executed, and also the set of RT indexes of the leaf
+ * partitions that will be scanned with those subplans.
+ *
* ExecFindMatchingSubPlans:
* Returns indexes of matching subplans after evaluating the expressions
* that are safe to evaluate at a given point. This function is first
@@ -1796,8 +1811,9 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
*
* On return, *initially_valid_subplans is assigned the set of indexes of
* child subplans that must be initialized along with the parent plan node.
- * Initial pruning is performed here if needed and in that case only the
- * surviving subplans' indexes are added.
+ * Initial pruning is performed here if needed (unless it has already been done
+ * by ExecutorDoInitialPruning()), and in that case only the surviving
+ * subplans' indexes are added.
*
* If subplans are indeed pruned, subplan_map arrays contained in the returned
* PartitionPruneState are re-sequenced to not count those, though only if the
@@ -1810,9 +1826,10 @@ ExecInitPartitionPruning(PlanState *planstate,
Bitmapset *root_parent_relids,
Bitmapset **initially_valid_subplans)
{
- PartitionPruneState *prunestate;
+ PartitionPruneState *prunestate = NULL;
EState *estate = planstate->state;
PartitionPruneInfo *pruneinfo;
+ PartitionPruneResult *pruneresult = NULL;
/* Obtain the pruneinfo we need, and make sure it's the right one */
pruneinfo = list_nth(estate->es_part_prune_infos, part_prune_index);
@@ -1828,20 +1845,57 @@ ExecInitPartitionPruning(PlanState *planstate,
/* We may need an expression context to evaluate partition exprs */
ExecAssignExprContext(estate, planstate);
- /* Create the working data structure for pruning */
- prunestate = CreatePartitionPruneState(planstate, pruneinfo);
+ /*
+ * No need to do initial pruning if it was done already by
+ * ExecutorDoInitialPruning(), which it would be if es_part_prune_results
+ * is set.
+ */
+ if (estate->es_part_prune_results)
+ {
+ pruneresult = list_nth(estate->es_part_prune_results, part_prune_index);
+ Assert(IsA(pruneresult, PartitionPruneResult));
+ }
+
+ if (pruneresult == NULL || pruneinfo->needs_exec_pruning)
+ {
+ /* We may need an expression context to evaluate partition exprs */
+ ExecAssignExprContext(estate, planstate);
+
+ /* For data reading, executor always omits detached partitions */
+ if (estate->es_partition_directory == NULL)
+ estate->es_partition_directory =
+ CreatePartitionDirectory(estate->es_query_cxt, false);
+
+ /*
+ * Create the working data structure for pruning. No need to consider
+ * initial pruning steps if we have a PartitionPruneResult.
+ */
+ prunestate = CreatePartitionPruneState(planstate, pruneinfo,
+ pruneresult == NULL,
+ pruneinfo->needs_exec_pruning,
+ NIL, planstate->ps_ExprContext,
+ estate->es_partition_directory);
+ }
/*
* Perform an initial partition prune pass, if required.
*/
- if (prunestate->do_initial_prune)
- *initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true);
+ if (pruneresult)
+ {
+ *initially_valid_subplans = bms_copy(pruneresult->valid_subplan_offs);
+ }
+ else if (prunestate && prunestate->do_initial_prune)
+ {
+ *initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true,
+ NULL);
+ }
else
{
- /* No pruning, so we'll need to initialize all subplans */
+ /* No initial pruning, so we'll need to initialize all subplans */
Assert(n_total_subplans > 0);
*initially_valid_subplans = bms_add_range(NULL, 0,
n_total_subplans - 1);
+ return prunestate;
}
/*
@@ -1849,7 +1903,8 @@ ExecInitPartitionPruning(PlanState *planstate,
* that were removed above due to initial pruning. No need to do this if
* no steps were removed.
*/
- if (bms_num_members(*initially_valid_subplans) < n_total_subplans)
+ if (prunestate &&
+ bms_num_members(*initially_valid_subplans) < n_total_subplans)
{
/*
* We can safely skip this when !do_exec_prune, even though that
@@ -1865,11 +1920,74 @@ ExecInitPartitionPruning(PlanState *planstate,
return prunestate;
}
+/*
+ * ExecPartitionDoInitialPruning
+ * Perform initial pruning using given PartitionPruneInfo to determine
+ * the minimal set of child subplans that will be executed and also the
+ * set of RT indexes of the leaf partitions scanned by those subplans.
+ */
+Bitmapset *
+ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
+ PartitionPruneInfo *pruneinfo,
+ Bitmapset **scan_leafpart_rtis)
+{
+ List *rtable = plannedstmt->rtable;
+ ExprContext *econtext;
+ PartitionDirectory pdir;
+ MemoryContext oldcontext,
+ tmpcontext;
+ PartitionPruneState *prunestate;
+ Bitmapset *valid_subplan_offs;
+
+ /*
+ * A temporary context for memory allocations required while executing
+ * partition pruning steps.
+ */
+ tmpcontext = AllocSetContextCreate(CurrentMemoryContext,
+ "initial pruning working data",
+ ALLOCSET_DEFAULT_SIZES);
+ oldcontext = MemoryContextSwitchTo(tmpcontext);
+
+ /*
+ * PartitionDirectory to look up partition descriptors.
+ * Note that we don't omit detached partitions, just like during
+ * execution proper.
+ */
+ pdir = CreatePartitionDirectory(CurrentMemoryContext, false);
+
+ /*
+ * We don't yet have a PlanState for the parent plan node, so we must
+ * create a standalone ExprContext to evaluate pruning expressions,
+ * equipped with the information about the EXTERN parameters that the
+ * caller passed us. That's okay because the initial pruning steps do
+ * not contain anything that requires execution to have started and
+ * thus needs the information contained in a PlanState.
+ */
+ econtext = CreateStandaloneExprContext();
+ econtext->ecxt_param_list_info = params;
+ prunestate = CreatePartitionPruneState(NULL, pruneinfo, true, false,
+ rtable, econtext, pdir);
+ MemoryContextSwitchTo(oldcontext);
+
+ /* Do the initial pruning. */
+ valid_subplan_offs = ExecFindMatchingSubPlans(prunestate, true,
+ scan_leafpart_rtis);
+
+ FreeExprContext(econtext, true);
+ DestroyPartitionDirectory(pdir);
+ MemoryContextDelete(tmpcontext);
+
+ return valid_subplan_offs;
+}
+
/*
* CreatePartitionPruneState
* Build the data structure required for calling ExecFindMatchingSubPlans
*
- * 'planstate' is the parent plan node's execution state.
+ * 'planstate', if not NULL, is the parent plan node's execution state. It
+ * can be NULL if being called before ExecutorStart(), in which case,
+ * 'rtable' (range table), 'econtext', and 'partdir' must be explicitly
+ * provided.
*
* 'pruneinfo' is a PartitionPruneInfo as generated by
* make_partition_pruneinfo. Here we build a PartitionPruneState containing a
@@ -1883,19 +2001,21 @@ ExecInitPartitionPruning(PlanState *planstate,
* PartitionedRelPruneInfo.
*/
static PartitionPruneState *
-CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
+CreatePartitionPruneState(PlanState *planstate,
+ PartitionPruneInfo *pruneinfo,
+ bool consider_initial_steps,
+ bool consider_exec_steps,
+ List *rtable, ExprContext *econtext,
+ PartitionDirectory partdir)
{
- EState *estate = planstate->state;
+ EState *estate = planstate ? planstate->state : NULL;
PartitionPruneState *prunestate;
int n_part_hierarchies;
ListCell *lc;
int i;
- ExprContext *econtext = planstate->ps_ExprContext;
- /* For data reading, executor always omits detached partitions */
- if (estate->es_partition_directory == NULL)
- estate->es_partition_directory =
- CreatePartitionDirectory(estate->es_query_cxt, false);
+ Assert((estate != NULL) ||
+ (partdir != NULL && econtext != NULL && rtable != NIL));
n_part_hierarchies = list_length(pruneinfo->prune_infos);
Assert(n_part_hierarchies > 0);
@@ -1950,15 +2070,42 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
PartitionKey partkey;
/*
- * We can rely on the copies of the partitioned table's partition
- * key and partition descriptor appearing in its relcache entry,
- * because that entry will be held open and locked for the
- * duration of this executor run.
+ * Must open the relation by ourselves when called before the
+ * execution has started, such as when called during
+ * ExecutorDoInitialPruning() on a cached plan. In that case,
+ * sub-partitions must be locked, because AcquirePlannerLocks()
+ * would not have seen them. (1st relation in a partrelpruneinfos
+ * list is always the root partitioned table appearing in the
+ * query, which AcquirePlannerLocks() would have locked; the
+ * Assert in relation_open() guards that assumption.)
+ */
+ if (estate == NULL)
+ {
+ RangeTblEntry *rte = rt_fetch(pinfo->rtindex, rtable);
+ int lockmode = (j == 0) ? NoLock : rte->rellockmode;
+
+ partrel = table_open(rte->relid, lockmode);
+ }
+ else
+ partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
+
+ /*
+ * We can rely on the copy of the partitioned table's partition
+ * key in its relcache entry, because it can't change (or
+ * get destroyed) as long as the relation is locked. Partition
+ * descriptor is taken from the PartitionDirectory associated with
+ * the table that is held open long enough for the descriptor to
+ * remain valid while it's used to perform the pruning steps.
*/
- partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
partkey = RelationGetPartitionKey(partrel);
- partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
- partrel);
+ partdesc = PartitionDirectoryLookup(partdir, partrel);
+
+ /*
+ * Must close partrel, keeping the lock taken, if we're not using
+ * EState's entry.
+ */
+ if (estate == NULL)
+ table_close(partrel, NoLock);
/*
* Initialize the subplan_map and subpart_map.
@@ -1972,6 +2119,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
Assert(partdesc->nparts >= pinfo->nparts);
pprune->nparts = partdesc->nparts;
pprune->subplan_map = palloc(sizeof(int) * partdesc->nparts);
+ pprune->rti_map = palloc(sizeof(Index) * partdesc->nparts);
if (partdesc->nparts == pinfo->nparts)
{
/*
@@ -1982,6 +2130,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
pprune->subpart_map = pinfo->subpart_map;
memcpy(pprune->subplan_map, pinfo->subplan_map,
sizeof(int) * pinfo->nparts);
+ memcpy(pprune->rti_map, pinfo->rti_map,
+ sizeof(Index) * pinfo->nparts);
/*
* Double-check that the list of unpruned relations has not
@@ -2032,6 +2182,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
pinfo->subplan_map[pd_idx];
pprune->subpart_map[pp_idx] =
pinfo->subpart_map[pd_idx];
+ pprune->rti_map[pp_idx] =
+ pinfo->rti_map[pd_idx];
pd_idx++;
}
else
@@ -2039,6 +2191,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
/* this partdesc entry is not in the plan */
pprune->subplan_map[pp_idx] = -1;
pprune->subpart_map[pp_idx] = -1;
+ pprune->rti_map[pp_idx] = 0;
}
}
@@ -2060,7 +2213,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
* Initialize pruning contexts as needed.
*/
pprune->initial_pruning_steps = pinfo->initial_pruning_steps;
- if (pinfo->initial_pruning_steps)
+ if (consider_initial_steps && pinfo->initial_pruning_steps)
{
InitPartitionPruneContext(&pprune->initial_context,
pinfo->initial_pruning_steps,
@@ -2070,7 +2223,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
prunestate->do_initial_prune = true;
}
pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
- if (pinfo->exec_pruning_steps)
+ if (consider_exec_steps && pinfo->exec_pruning_steps)
{
InitPartitionPruneContext(&pprune->exec_context,
pinfo->exec_pruning_steps,
@@ -2298,10 +2451,14 @@ PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
* Pass initial_prune if PARAM_EXEC Params cannot yet be evaluated. This
* differentiates the initial executor-time pruning step from later
* runtime pruning.
+ *
+ * RT indexes of leaf partitions scanned by the chosen subplans are added to
+ * *scan_leafpart_rtis if the pointer is non-NULL.
*/
Bitmapset *
ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
- bool initial_prune)
+ bool initial_prune,
+ Bitmapset **scan_leafpart_rtis)
{
Bitmapset *result = NULL;
MemoryContext oldcontext;
@@ -2336,7 +2493,7 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
*/
pprune = &prunedata->partrelprunedata[0];
find_matching_subplans_recurse(prunedata, pprune, initial_prune,
- &result);
+ &result, scan_leafpart_rtis);
/* Expression eval may have used space in ExprContext too */
if (pprune->exec_pruning_steps)
@@ -2350,6 +2507,8 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
/* Copy result out of the temp context before we reset it */
result = bms_copy(result);
+ if (scan_leafpart_rtis)
+ *scan_leafpart_rtis = bms_copy(*scan_leafpart_rtis);
MemoryContextReset(prunestate->prune_context);
@@ -2360,13 +2519,15 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
* find_matching_subplans_recurse
* Recursive worker function for ExecFindMatchingSubPlans
*
- * Adds valid (non-prunable) subplan IDs to *validsubplans
+ * Adds valid (non-prunable) subplan IDs to *validsubplans and RT indexes
+ * of the corresponding leaf partitions to *scan_leafpart_rtis (if asked for).
*/
static void
find_matching_subplans_recurse(PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
- Bitmapset **validsubplans)
+ Bitmapset **validsubplans,
+ Bitmapset **scan_leafpart_rtis)
{
Bitmapset *partset;
int i;
@@ -2393,8 +2554,14 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
while ((i = bms_next_member(partset, i)) >= 0)
{
if (pprune->subplan_map[i] >= 0)
+ {
*validsubplans = bms_add_member(*validsubplans,
pprune->subplan_map[i]);
+ Assert(pprune->rti_map[i] > 0);
+ if (scan_leafpart_rtis)
+ *scan_leafpart_rtis = bms_add_member(*scan_leafpart_rtis,
+ pprune->rti_map[i]);
+ }
else
{
int partidx = pprune->subpart_map[i];
@@ -2402,7 +2569,8 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
if (partidx >= 0)
find_matching_subplans_recurse(prunedata,
&prunedata->partrelprunedata[partidx],
- initial_prune, validsubplans);
+ initial_prune, validsubplans,
+ scan_leafpart_rtis);
else
{
/*
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 9695de85b9..dce93a8c9f 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -135,6 +135,7 @@ CreateExecutorState(void)
estate->es_param_exec_vals = NULL;
estate->es_queryEnv = NULL;
+ estate->es_part_prune_results = NIL;
estate->es_query_cxt = qcontext;
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index dc13625171..bffb42ce71 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -842,7 +842,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
else
dest = None_Receiver;
- es->qd = CreateQueryDesc(es->stmt,
+ es->qd = CreateQueryDesc(es->stmt, NIL,
fcache->src,
GetActiveSnapshot(),
InvalidSnapshot,
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 99830198bd..3b917584de 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -156,7 +156,8 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
* subplan, we can fill as_valid_subplans immediately, preventing
* later calls to ExecFindMatchingSubPlans.
*/
- if (!prunestate->do_exec_prune && nplans > 0)
+ if (appendstate->as_prune_state == NULL ||
+ (!appendstate->as_prune_state->do_exec_prune && nplans > 0))
appendstate->as_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
}
else
@@ -578,7 +579,7 @@ choose_next_subplan_locally(AppendState *node)
}
else if (node->as_valid_subplans == NULL)
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
whichplan = -1;
}
@@ -643,7 +644,7 @@ choose_next_subplan_for_leader(AppendState *node)
if (node->as_valid_subplans == NULL)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
/*
* Mark each invalid plan as finished to allow the loop below to
@@ -718,7 +719,7 @@ choose_next_subplan_for_worker(AppendState *node)
else if (node->as_valid_subplans == NULL)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
mark_invalid_subplans_as_finished(node);
}
@@ -869,7 +870,7 @@ ExecAppendAsyncBegin(AppendState *node)
if (node->as_valid_subplans == NULL)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
classify_matching_subplans(node);
}
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index f370f9f287..ccfa083945 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -104,7 +104,8 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
* subplan, we can fill ms_valid_subplans immediately, preventing
* later calls to ExecFindMatchingSubPlans.
*/
- if (!prunestate->do_exec_prune && nplans > 0)
+ if (mergestate->ms_prune_state == NULL ||
+ (!mergestate->ms_prune_state->do_exec_prune && nplans > 0))
mergestate->ms_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
}
else
@@ -219,7 +220,7 @@ ExecMergeAppend(PlanState *pstate)
*/
if (node->ms_valid_subplans == NULL)
node->ms_valid_subplans =
- ExecFindMatchingSubPlans(node->ms_prune_state, false);
+ ExecFindMatchingSubPlans(node->ms_prune_state, false, NULL);
/*
* First time through: pull the first tuple from each valid subplan,
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index fd5796f1b9..93012a5b3b 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1578,6 +1578,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
CachedPlanSource *plansource;
CachedPlan *cplan;
List *stmt_list;
+ List *part_prune_results_list;
char *query_string;
Snapshot snapshot;
MemoryContext oldcontext;
@@ -1657,7 +1658,10 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
*/
/* Replan if needed, and increment plan refcount for portal */
- cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv);
+ cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv,
+ &part_prune_results_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_results_list));
stmt_list = cplan->stmt_list;
if (!plan->saved)
@@ -1685,6 +1689,9 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
stmt_list,
cplan);
+ /* Copy Lists of PartitionPruneResults into the portal's context. */
+ PortalStorePartitionPruneResults(portal, part_prune_results_list);
+
/*
* Set up options for portal. Default SCROLL type is chosen the same way
* as PerformCursorOpen does it.
@@ -2092,7 +2099,8 @@ SPI_plan_get_cached_plan(SPIPlanPtr plan)
/* Get the generic plan for the query */
cplan = GetCachedPlan(plansource, NULL,
plan->saved ? CurrentResourceOwner : NULL,
- _SPI_current->queryEnv);
+ _SPI_current->queryEnv,
+ NULL /* Not interested in PartitionPruneResults */);
Assert(cplan == plansource->gplan);
/* Pop the error context stack */
@@ -2473,7 +2481,9 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
{
CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
List *stmt_list;
- ListCell *lc2;
+ List *part_prune_results_list;
+ ListCell *lc2,
+ *lc3;
spicallbackarg.query = plansource->query_string;
@@ -2549,8 +2559,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
* plan, the refcount must be backed by the plan_owner.
*/
cplan = GetCachedPlan(plansource, options->params,
- plan_owner, _SPI_current->queryEnv);
-
+ plan_owner, _SPI_current->queryEnv,
+ &part_prune_results_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_results_list));
stmt_list = cplan->stmt_list;
/*
@@ -2589,9 +2601,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
}
}
- foreach(lc2, stmt_list)
+ forboth(lc2, stmt_list, lc3, part_prune_results_list)
{
PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+ List *part_prune_results = lfirst_node(List, lc3);
bool canSetTag = stmt->canSetTag;
DestReceiver *dest;
@@ -2663,7 +2676,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
else
snap = InvalidSnapshot;
- qdesc = CreateQueryDesc(stmt,
+ qdesc = CreateQueryDesc(stmt, part_prune_results,
plansource->query_string,
snap, crosscheck_snapshot,
dest,
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 23776367c5..b01f55fb4f 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -158,6 +158,11 @@
token = pg_strtok(&length); /* skip :fldname */ \
local_node->fldname = readIntCols(len)
+/* Read an Index array */
+#define READ_INDEX_ARRAY(fldname, len) \
+ token = pg_strtok(&length); /* skip :fldname */ \
+ local_node->fldname = readIndexCols(len)
+
/* Read a bool array */
#define READ_BOOL_ARRAY(fldname, len) \
token = pg_strtok(&length); /* skip :fldname */ \
@@ -800,7 +805,6 @@ fnname(int numCols) \
*/
READ_SCALAR_ARRAY(readAttrNumberCols, int16, atoi)
READ_SCALAR_ARRAY(readOidCols, Oid, atooid)
-/* outfuncs.c has writeIndexCols, but we don't yet need that here */
-/* READ_SCALAR_ARRAY(readIndexCols, Index, atoui) */
+READ_SCALAR_ARRAY(readIndexCols, Index, atoui)
READ_SCALAR_ARRAY(readIntCols, int, atoi)
READ_SCALAR_ARRAY(readBoolCols, bool, strtobool)
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 799602f5ea..a96d316dca 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -520,7 +520,9 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
result->parallelModeNeeded = glob->parallelModeNeeded;
result->planTree = top_plan;
result->partPruneInfos = glob->partPruneInfos;
+ result->containsInitialPruning = glob->containsInitialPruning;
result->rtable = glob->finalrtable;
+ result->minLockRelids = glob->minLockRelids;
result->resultRelations = glob->resultRelations;
result->appendRelations = glob->appendRelations;
result->subplans = glob->subplans;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index e67f0e3509..5820f26fdb 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -270,6 +270,16 @@ set_plan_references(PlannerInfo *root, Plan *plan)
*/
add_rtes_to_flat_rtable(root, false);
+ /*
+ * Add the query's adjusted range of RT indexes to glob->minLockRelids.
+ * The adjusted RT indexes of prunable relations will be deleted from the
+ * set below where PartitionPruneInfos are processed.
+ */
+ glob->minLockRelids =
+ bms_add_range(glob->minLockRelids,
+ rtoffset + 1,
+ rtoffset + list_length(root->parse->rtable));
+
/*
* Adjust RT indexes of PlanRowMarks and add to final rowmarks list
*/
@@ -352,6 +362,7 @@ set_plan_references(PlannerInfo *root, Plan *plan)
foreach (lc, root->partPruneInfos)
{
PartitionPruneInfo *pruneinfo = lfirst(lc);
+ Bitmapset *leafpart_rtis = NULL;
ListCell *l;
pruneinfo->root_parent_relids =
@@ -364,15 +375,50 @@ set_plan_references(PlannerInfo *root, Plan *plan)
foreach(l2, prune_infos)
{
PartitionedRelPruneInfo *pinfo = lfirst(l2);
+ int i;
/* RT index of the table to which the pinfo belongs. */
pinfo->rtindex += rtoffset;
+
+ /* Also of the leaf partitions that might be scanned. */
+ for (i = 0; i < pinfo->nparts; i++)
+ {
+ if (pinfo->rti_map[i] > 0 && pinfo->subplan_map[i] >= 0)
+ {
+ pinfo->rti_map[i] += rtoffset;
+ leafpart_rtis = bms_add_member(leafpart_rtis,
+ pinfo->rti_map[i]);
+ }
+ }
}
}
+ if (pruneinfo->needs_init_pruning)
+ {
+ glob->containsInitialPruning = true;
+
+ /*
+ * Delete the leaf partition RTIs from the global set of relations
+ * to be locked before executing the plan. AcquireExecutorLocks()
+ * will find the ones to add to the set after performing initial
+ * pruning.
+ */
+ glob->minLockRelids = bms_del_members(glob->minLockRelids,
+ leafpart_rtis);
+ }
+
glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
}
+ /*
+ * It seems worth doing a bms_copy() on glob->minLockRelids if we deleted
+ * bits from it above to get rid of any empty tail bits. It seems better
+ * for the loop over this set in AcquireExecutorLocks() to not have to go
+ * through those useless bit words.
+ */
+ if (glob->containsInitialPruning)
+ glob->minLockRelids = bms_copy(glob->minLockRelids);
+
return result;
}
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index d48f6784c1..d5556354f7 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -144,7 +144,9 @@ static List *make_partitionedrel_pruneinfo(PlannerInfo *root,
List *prunequal,
Bitmapset *partrelids,
int *relid_subplan_map,
- Bitmapset **matchedsubplans);
+ Bitmapset **matchedsubplans,
+ bool *needs_init_pruning,
+ bool *needs_exec_pruning);
static void gen_partprune_steps(RelOptInfo *rel, List *clauses,
PartClauseTarget target,
GeneratePruningStepsContext *context);
@@ -234,6 +236,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int *relid_subplan_map;
ListCell *lc;
int i;
+ bool needs_init_pruning = false;
+ bool needs_exec_pruning = false;
/*
* Scan the subpaths to see which ones are scans of partition child
@@ -313,12 +317,16 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
Bitmapset *partrelids = (Bitmapset *) lfirst(lc);
List *pinfolist;
Bitmapset *matchedsubplans = NULL;
+ bool partrel_needs_init_pruning;
+ bool partrel_needs_exec_pruning;
pinfolist = make_partitionedrel_pruneinfo(root, parentrel,
prunequal,
partrelids,
relid_subplan_map,
- &matchedsubplans);
+ &matchedsubplans,
+ &partrel_needs_init_pruning,
+ &partrel_needs_exec_pruning);
/* When pruning is possible, record the matched subplans */
if (pinfolist != NIL)
@@ -327,6 +335,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
allmatchedsubplans = bms_join(matchedsubplans,
allmatchedsubplans);
}
+
+ needs_init_pruning |= partrel_needs_init_pruning;
+ needs_exec_pruning |= partrel_needs_exec_pruning;
}
pfree(relid_subplan_map);
@@ -342,6 +353,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
pruneinfo = makeNode(PartitionPruneInfo);
pruneinfo->root_parent_relids = parentrel->relids;
pruneinfo->prune_infos = prunerelinfos;
+ pruneinfo->needs_init_pruning = needs_init_pruning;
+ pruneinfo->needs_exec_pruning = needs_exec_pruning;
/*
* Some subplans may not belong to any of the identified partitioned rels.
@@ -442,13 +455,18 @@ add_part_relids(List *allpartrelids, Bitmapset *partrelids)
* If we cannot find any useful run-time pruning steps, return NIL.
* However, on success, each rel identified in partrelids will have
* an element in the result list, even if some of them are useless.
+ * *needs_init_pruning and *needs_exec_pruning are set to indicate that the
+ * returned PartitionedRelPruneInfos contain pruning steps that can be
+ * performed before and after execution begins, respectively.
*/
static List *
make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
List *prunequal,
Bitmapset *partrelids,
int *relid_subplan_map,
- Bitmapset **matchedsubplans)
+ Bitmapset **matchedsubplans,
+ bool *needs_init_pruning,
+ bool *needs_exec_pruning)
{
RelOptInfo *targetpart = NULL;
List *pinfolist = NIL;
@@ -459,6 +477,10 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int rti;
int i;
+ /* Will find out below. */
+ *needs_init_pruning = false;
+ *needs_exec_pruning = false;
+
/*
* Examine each partitioned rel, constructing a temporary array to map
* from planner relids to index of the partitioned rel, and building a
@@ -546,6 +568,9 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
* executor per-scan pruning steps. This first pass creates startup
* pruning steps and detects whether there's any possibly-useful quals
* that would require per-scan pruning.
+ *
+ * The first pass also records whether the second pass is necessary by
+ * checking for the presence of EXEC parameters.
*/
gen_partprune_steps(subpart, partprunequal, PARTTARGET_INITIAL,
&context);
@@ -620,6 +645,12 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
pinfo->execparamids = execparamids;
/* Remaining fields will be filled in the next loop */
+ /* record which types of pruning steps we've seen so far */
+ if (initial_pruning_steps != NIL)
+ *needs_init_pruning = true;
+ if (exec_pruning_steps != NIL)
+ *needs_exec_pruning = true;
+
pinfolist = lappend(pinfolist, pinfo);
}
@@ -647,6 +678,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int *subplan_map;
int *subpart_map;
Oid *relid_map;
+ Index *rti_map;
/*
* Construct the subplan and subpart maps for this partitioning level.
@@ -659,6 +691,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
subpart_map = (int *) palloc(nparts * sizeof(int));
memset(subpart_map, -1, nparts * sizeof(int));
relid_map = (Oid *) palloc0(nparts * sizeof(Oid));
+ rti_map = (Index *) palloc0(nparts * sizeof(Index));
present_parts = NULL;
i = -1;
@@ -673,6 +706,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
subplan_map[i] = subplanidx = relid_subplan_map[partrel->relid] - 1;
subpart_map[i] = subpartidx = relid_subpart_map[partrel->relid] - 1;
relid_map[i] = planner_rt_fetch(partrel->relid, root)->relid;
+ rti_map[i] = partrel->relid;
if (subplanidx >= 0)
{
present_parts = bms_add_member(present_parts, i);
@@ -697,6 +731,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
pinfo->subplan_map = subplan_map;
pinfo->subpart_map = subpart_map;
pinfo->relid_map = relid_map;
+ pinfo->rti_map = rti_map;
}
pfree(relid_subpart_map);
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 3082093d1e..95ab1d0eef 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1598,6 +1598,7 @@ exec_bind_message(StringInfo input_message)
int16 *rformats = NULL;
CachedPlanSource *psrc;
CachedPlan *cplan;
+ List *part_prune_results_list;
Portal portal;
char *query_string;
char *saved_stmt_name;
@@ -1972,7 +1973,9 @@ exec_bind_message(StringInfo input_message)
* will be generated in MessageContext. The plan refcount will be
* assigned to the Portal, so it will be released at portal destruction.
*/
- cplan = GetCachedPlan(psrc, params, NULL, NULL);
+ cplan = GetCachedPlan(psrc, params, NULL, NULL, &part_prune_results_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_results_list));
/*
* Now we can define the portal.
@@ -1987,6 +1990,9 @@ exec_bind_message(StringInfo input_message)
cplan->stmt_list,
cplan);
+ /* Copy Lists of PartitionPruneResults into the portal's context. */
+ PortalStorePartitionPruneResults(portal, part_prune_results_list);
+
/* Done with the snapshot used for parameter I/O and parsing/planning */
if (snapshot_set)
PopActiveSnapshot();
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 52e2db6452..280ed7d239 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -35,7 +35,7 @@
Portal ActivePortal = NULL;
-static void ProcessQuery(PlannedStmt *plan,
+static void ProcessQuery(PlannedStmt *plan, List *part_prune_results,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -65,6 +65,7 @@ static void DoPortalRewind(Portal portal);
*/
QueryDesc *
CreateQueryDesc(PlannedStmt *plannedstmt,
+ List *part_prune_results,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
@@ -77,6 +78,8 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
qd->operation = plannedstmt->commandType; /* operation */
qd->plannedstmt = plannedstmt; /* plan */
+ qd->part_prune_results = part_prune_results; /* ExecutorDoInitialPruning()
+ * output for plan */
qd->sourceText = sourceText; /* query text */
qd->snapshot = RegisterSnapshot(snapshot); /* snapshot */
/* RI check snapshot */
@@ -122,6 +125,7 @@ FreeQueryDesc(QueryDesc *qdesc)
* PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
*
* plan: the plan tree for the query
+ * part_prune_results: ExecutorDoInitialPruning() output for the PlannedStmt
* sourceText: the source text of the query
* params: any parameters needed
* dest: where to send results
@@ -134,6 +138,7 @@ FreeQueryDesc(QueryDesc *qdesc)
*/
static void
ProcessQuery(PlannedStmt *plan,
+ List *part_prune_results,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -145,7 +150,7 @@ ProcessQuery(PlannedStmt *plan,
/*
* Create the QueryDesc object
*/
- queryDesc = CreateQueryDesc(plan, sourceText,
+ queryDesc = CreateQueryDesc(plan, part_prune_results, sourceText,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
@@ -491,8 +496,13 @@ PortalStart(Portal portal, ParamListInfo params,
/*
* Create QueryDesc in portal's context; for the moment, set
* the destination to DestNone.
+ *
+ * There is no PartitionPruneResult unless the PlannedStmt is
+ * from a CachedPlan.
*/
queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+ portal->part_prune_results_list == NIL ? NIL :
+ linitial(portal->part_prune_results_list),
portal->sourceText,
GetActiveSnapshot(),
InvalidSnapshot,
@@ -1225,6 +1235,8 @@ PortalRunMulti(Portal portal,
if (pstmt->utilityStmt == NULL)
{
+ List *part_prune_results = NIL;
+
/*
* process a plannable query.
*/
@@ -1271,10 +1283,18 @@ PortalRunMulti(Portal portal,
else
UpdateActiveSnapshotCommandId();
+ /*
+ * Determine if there's a corresponding List of PartitionPruneResult
+ * for this PlannedStmt.
+ */
+ if (portal->part_prune_results_list != NIL)
+ part_prune_results = list_nth(portal->part_prune_results_list,
+ foreach_current_index(stmtlist_item));
+
if (pstmt->canSetTag)
{
/* statement can set tag string */
- ProcessQuery(pstmt,
+ ProcessQuery(pstmt, part_prune_results,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
@@ -1283,7 +1303,7 @@ PortalRunMulti(Portal portal,
else
{
/* stmt added by rewrite cannot set tag */
- ProcessQuery(pstmt,
+ ProcessQuery(pstmt, part_prune_results,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index cc943205d3..af6fae6e3b 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -99,14 +99,19 @@ static dlist_head cached_expression_list = DLIST_STATIC_INIT(cached_expression_l
static void ReleaseGenericPlan(CachedPlanSource *plansource);
static List *RevalidateCachedQuery(CachedPlanSource *plansource,
QueryEnvironment *queryEnv);
-static bool CheckCachedPlan(CachedPlanSource *plansource);
+static bool CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+ List **part_prune_results_list);
static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
- ParamListInfo boundParams, QueryEnvironment *queryEnv);
+ ParamListInfo boundParams, QueryEnvironment *queryEnv,
+ List **part_prune_results_list);
static bool choose_custom_plan(CachedPlanSource *plansource,
ParamListInfo boundParams);
static double cached_plan_cost(CachedPlan *plan, bool include_planner);
static Query *QueryListGetPrimaryStmt(List *stmts);
-static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static void AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+ List **part_prune_results_list,
+ List **lockedRelids_per_stmt);
+static void ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt);
static void AcquirePlannerLocks(List *stmt_list, bool acquire);
static void ScanQueryForLocks(Query *parsetree, bool acquire);
static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -782,6 +787,26 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
return tlist;
}
+/*
+ * FreePartitionPruneResults
+ * Frees the List of Lists of PartitionPruneResults for CheckCachedPlan()
+ */
+static void
+FreePartitionPruneResults(List *part_prune_results_list)
+{
+ ListCell *lc;
+
+ foreach(lc, part_prune_results_list)
+ {
+ List *part_prune_results = lfirst(lc);
+
+ /* Free both the PartitionPruneResults and the containing List. */
+ list_free_deep(part_prune_results);
+ }
+
+ list_free(part_prune_results_list);
+}
+
/*
* CheckCachedPlan: see if the CachedPlanSource's generic plan is valid.
*
@@ -790,15 +815,20 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
*
* On a "true" return, we have acquired the locks needed to run the plan.
* (We must do this for the "true" result to be race-condition-free.)
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_results_list.
*/
static bool
-CheckCachedPlan(CachedPlanSource *plansource)
+CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+ List **part_prune_results_list)
{
CachedPlan *plan = plansource->gplan;
/* Assert that caller checked the querytree */
Assert(plansource->is_valid);
+ *part_prune_results_list = NIL;
+
/* If there's no generic plan, just say "false" */
if (!plan)
return false;
@@ -820,13 +850,21 @@ CheckCachedPlan(CachedPlanSource *plansource)
*/
if (plan->is_valid)
{
+ List *lockedRelids_per_stmt;
+
/*
* Plan must have positive refcount because it is referenced by
* plansource; so no need to fear it disappears under us here.
*/
Assert(plan->refcount > 0);
- AcquireExecutorLocks(plan->stmt_list, true);
+ /*
+ * Lock relations scanned by the plan. This is where the pruning
+ * happens if needed.
+ */
+ AcquireExecutorLocks(plan->stmt_list, boundParams,
+ part_prune_results_list,
+ &lockedRelids_per_stmt);
/*
* If plan was transient, check to see if TransactionXmin has
@@ -848,7 +886,11 @@ CheckCachedPlan(CachedPlanSource *plansource)
}
/* Oops, the race case happened. Release useless locks. */
- AcquireExecutorLocks(plan->stmt_list, false);
+ ReleaseExecutorLocks(plan->stmt_list, lockedRelids_per_stmt);
+
+ /* Release any PartitionPruneResults that may have been created. */
+ FreePartitionPruneResults(*part_prune_results_list);
+ *part_prune_results_list = NIL;
}
/*
@@ -874,10 +916,14 @@ CheckCachedPlan(CachedPlanSource *plansource)
* Planning work is done in the caller's memory context. The finished plan
* is in a child memory context, which typically should get reparented
* (unless this is a one-shot plan, in which case we don't copy the plan).
+ *
+ * A list of NILs is returned in *part_prune_results_list, meaning that no
+ * partition pruning has been done yet for the plans in stmt_list.
*/
static CachedPlan *
BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
- ParamListInfo boundParams, QueryEnvironment *queryEnv)
+ ParamListInfo boundParams, QueryEnvironment *queryEnv,
+ List **part_prune_results_list)
{
CachedPlan *plan;
List *plist;
@@ -1007,6 +1053,17 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
MemoryContextSwitchTo(oldcxt);
+ /*
+ * There are no PartitionPruneResults to add yet, but the list must
+ * still be initialized to have the same number of elements as the list
+ * of PlannedStmts.
+ */
+ *part_prune_results_list = NIL;
+ foreach(lc, plist)
+ {
+ *part_prune_results_list = lappend(*part_prune_results_list, NIL);
+ }
+
return plan;
}
@@ -1126,6 +1183,19 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
* plan or a custom plan for the given parameters: the caller does not know
* which it will get.
*
+ * For every PlannedStmt found in the returned CachedPlan, an element that
+ * is either a List of PartitionPruneResult or NIL is added to
+ * *part_prune_results_list. It is the former only if the PlannedStmt is
+ * from an existing CachedPlan that is otherwise valid and has
+ * containsInitialPruning set to true. Before returning such a CachedPlan,
+ * those "initial" steps are performed by calling ExecutorDoInitialPruning(),
+ * which prunes away subplans that don't match the "initial" pruning
+ * conditions, so that AcquireExecutorLocks() locks only the leaf partitions
+ * that survive. For each PartitionPruneInfo found in
+ * PlannedStmt.partPruneInfos, a PartitionPruneResult containing the bitmapset
+ * of the indexes of surviving subplans is added to the List for the
+ * PlannedStmt.
+ *
* On return, the plan is valid and we have sufficient locks to begin
* execution.
*
@@ -1139,11 +1209,13 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
*/
CachedPlan *
GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
- ResourceOwner owner, QueryEnvironment *queryEnv)
+ ResourceOwner owner, QueryEnvironment *queryEnv,
+ List **part_prune_results_list)
{
CachedPlan *plan = NULL;
List *qlist;
bool customplan;
+ List *my_part_prune_results_list;
/* Assert caller is doing things in a sane order */
Assert(plansource->magic == CACHEDPLANSOURCE_MAGIC);
@@ -1160,7 +1232,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
if (!customplan)
{
- if (CheckCachedPlan(plansource))
+ if (CheckCachedPlan(plansource, boundParams,
+ &my_part_prune_results_list))
{
/* We want a generic plan, and we already have a valid one */
plan = plansource->gplan;
@@ -1169,7 +1242,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
else
{
/* Build a new generic plan */
- plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv);
+ plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv,
+ &my_part_prune_results_list);
/* Just make real sure plansource->gplan is clear */
ReleaseGenericPlan(plansource);
/* Link the new generic plan into the plansource */
@@ -1214,7 +1288,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
if (customplan)
{
/* Build a custom plan */
- plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv);
+ plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv,
+ &my_part_prune_results_list);
/* Accumulate total costs of custom plans */
plansource->total_custom_cost += cached_plan_cost(plan, true);
@@ -1246,6 +1321,9 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
plan->is_saved = true;
}
+ if (part_prune_results_list)
+ *part_prune_results_list = my_part_prune_results_list;
+
return plan;
}
@@ -1737,17 +1815,29 @@ QueryListGetPrimaryStmt(List *stmts)
/*
* AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
- * or release them if acquire is false.
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_results_list.
+ *
+ * On return, *lockedRelids_per_stmt will contain a bitmapset for every
+ * PlannedStmt in stmt_list, containing the RT indexes of relation entries
+ * in its range table that were actually locked, or NULL if the PlannedStmt
+ * contains a utility statement.
*/
static void
-AcquireExecutorLocks(List *stmt_list, bool acquire)
+AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+ List **part_prune_results_list,
+ List **lockedRelids_per_stmt)
{
ListCell *lc1;
+ *part_prune_results_list = *lockedRelids_per_stmt = NIL;
foreach(lc1, stmt_list)
{
PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
- ListCell *lc2;
+ List *part_prune_results = NIL;
+ Bitmapset *allLockRelids;
+ Bitmapset *lockedRelids = NULL;
+ int rti;
if (plannedstmt->commandType == CMD_UTILITY)
{
@@ -1761,13 +1851,40 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
Query *query = UtilityContainsQuery(plannedstmt->utilityStmt);
if (query)
- ScanQueryForLocks(query, acquire);
+ ScanQueryForLocks(query, true);
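+ *part_prune_results_list = lappend(*part_prune_results_list, NIL);
+ /* Keep *lockedRelids_per_stmt aligned with stmt_list. */
+ *lockedRelids_per_stmt = lappend(*lockedRelids_per_stmt, NULL);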
continue;
}
- foreach(lc2, plannedstmt->rtable)
+ /*
+ * Figure out the set of relations that would need to be locked
+ * before executing the plan.
+ */
+ if (plannedstmt->containsInitialPruning)
{
- RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
+ Bitmapset *scan_leafpart_rtis = NULL;
+
+ /*
+ * Obtain the set of leaf partitions to be locked.
+ *
+ * The following does initial partition pruning using the
+ * PartitionPruneInfos found in plannedstmt->partPruneInfos and
+ * finds leaf partitions that survive that pruning across all the
+ * nodes in the plan tree.
+ */
+ part_prune_results = ExecutorDoInitialPruning(plannedstmt,
+ boundParams,
+ &scan_leafpart_rtis);
+ allLockRelids = bms_union(plannedstmt->minLockRelids,
+ scan_leafpart_rtis);
+ }
+ else
+ allLockRelids = plannedstmt->minLockRelids;
+
+ rti = -1;
+ while ((rti = bms_next_member(allLockRelids, rti)) > 0)
+ {
+ RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
if (rte->rtekind != RTE_RELATION)
continue;
@@ -1778,10 +1895,59 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
* fail if it's been dropped entirely --- we'll just transiently
* acquire a non-conflicting lock.
*/
- if (acquire)
- LockRelationOid(rte->relid, rte->rellockmode);
- else
- UnlockRelationOid(rte->relid, rte->rellockmode);
+ LockRelationOid(rte->relid, rte->rellockmode);
+ lockedRelids = bms_add_member(lockedRelids, rti);
+ }
+
+ *part_prune_results_list = lappend(*part_prune_results_list,
+ part_prune_results);
+ *lockedRelids_per_stmt = lappend(*lockedRelids_per_stmt, lockedRelids);
+ }
+}
+
+/*
+ * ReleaseExecutorLocks
+ * Release locks that would've been acquired by an earlier call to
+ * AcquireExecutorLocks()
+ */
+static void
+ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt)
+{
+ ListCell *lc1,
+ *lc2;
+
+ forboth(lc1, stmt_list, lc2, lockedRelids_per_stmt)
+ {
+ PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
+ Bitmapset *lockedRelids = lfirst(lc2);
+ int rti;
+
+ if (plannedstmt->commandType == CMD_UTILITY)
+ {
+ /*
+ * Ignore utility statements, except those (such as EXPLAIN) that
+ * contain a parsed-but-not-planned query. Note: it's okay to use
+ * ScanQueryForLocks, even though the query hasn't been through
+ * rule rewriting, because rewriting doesn't change the query
+ * representation.
+ */
+ Query *query = UtilityContainsQuery(plannedstmt->utilityStmt);
+
+ Assert(lockedRelids == NULL);
+ if (query)
+ ScanQueryForLocks(query, false);
+ continue;
+ }
+
+ rti = -1;
+ while ((rti = bms_next_member(lockedRelids, rti)) >= 0)
+ {
+ RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
+
+ Assert(rte->rtekind == RTE_RELATION);
+
+ /* See the comment in AcquireExecutorLocks(). */
+ UnlockRelationOid(rte->relid, rte->rellockmode);
}
}
}
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index 7b1ae6fdcf..5b9098971b 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -303,6 +303,25 @@ PortalDefineQuery(Portal portal,
portal->status = PORTAL_DEFINED;
}
+/*
+ * PortalStorePartitionPruneResults
+ * Copy the given List of Lists of PartitionPruneResults into the
+ * portal's context
+ *
+ * This allows the caller to ensure that the list exists as long as the portal
+ * does.
+ */
+void
+PortalStorePartitionPruneResults(Portal portal, List *part_prune_results_list)
+{
+ MemoryContext oldcxt;
+
+ Assert(PortalIsValid(portal));
+ oldcxt = MemoryContextSwitchTo(portal->portalContext);
+ portal->part_prune_results_list = copyObject(part_prune_results_list);
+ MemoryContextSwitchTo(oldcxt);
+}
+
/*
* PortalReleaseCachedPlan
* Release a portal's reference to its cached plan, if any.
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 9ebde089ae..269cc4d562 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -87,7 +87,9 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
ExplainState *es, const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv);
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt,
+ List *part_prune_results,
+ IntoClause *into,
ExplainState *es, const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv,
const instr_time *planduration,
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 17fabc18c9..4b98d0d2ef 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -45,6 +45,7 @@ extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
* nparts Length of subplan_map[] and subpart_map[].
* subplan_map Subplan index by partition index, or -1.
* subpart_map Subpart index by partition index, or -1.
+ * rti_map Range table index by partition index, or 0.
* present_parts A Bitmapset of the partition indexes that we
* have subplans or subparts for.
* initial_pruning_steps List of PartitionPruneSteps used to
@@ -61,6 +62,7 @@ typedef struct PartitionedRelPruningData
int nparts;
int *subplan_map;
int *subpart_map;
+ Index *rti_map;
Bitmapset *present_parts;
List *initial_pruning_steps;
List *exec_pruning_steps;
@@ -127,5 +129,10 @@ extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
Bitmapset *root_parent_relids,
Bitmapset **initially_valid_subplans);
extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
- bool initial_prune);
+ bool initial_prune,
+ Bitmapset **scan_leafpart_rtis);
+extern Bitmapset *ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt,
+ ParamListInfo params,
+ PartitionPruneInfo *pruneinfo,
+ Bitmapset **scan_leafpart_rtis);
#endif /* EXECPARTITION_H */
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index e79e2c001f..7d4379da7b 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -35,6 +35,8 @@ typedef struct QueryDesc
/* These fields are provided by CreateQueryDesc */
CmdType operation; /* CMD_SELECT, CMD_UPDATE, etc. */
PlannedStmt *plannedstmt; /* planner's output (could be utility, too) */
+ List *part_prune_results; /* ExecutorDoInitialPruning()'s
+ * output for plannedstmt */
const char *sourceText; /* source text of the query */
Snapshot snapshot; /* snapshot to use for query */
Snapshot crosscheck_snapshot; /* crosscheck for RI update/delete */
@@ -57,6 +59,7 @@ typedef struct QueryDesc
/* in pquery.c */
extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
+ List *part_prune_results,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index ed95ed1176..c9a5e5fb68 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -185,6 +185,9 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
/*
* prototypes from functions in execMain.c
*/
+extern List *ExecutorDoInitialPruning(PlannedStmt *plannedstmt,
+ ParamListInfo params,
+ Bitmapset **scan_leafpart_rtis);
extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
extern void ExecutorRun(QueryDesc *queryDesc,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index a2008846c6..369de42caf 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -615,6 +615,7 @@ typedef struct EState
* ExecRowMarks, or NULL if none */
PlannedStmt *es_plannedstmt; /* link to top of plan tree */
List *es_part_prune_infos; /* PlannedStmt.partPruneInfos */
+ List *es_part_prune_results; /* QueryDesc.part_prune_results */
const char *es_sourceText; /* Source text from QueryDesc */
JunkFilter *es_junkFilter; /* top-level junk filter, if any */
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index a80f43e540..937cc4629d 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -212,6 +212,7 @@ extern struct Bitmapset *readBitmapset(void);
extern uintptr_t readDatum(bool typbyval);
extern bool *readBoolCols(int numCols);
extern int *readIntCols(int numCols);
+extern Index *readIndexCols(int numCols);
extern Oid *readOidCols(int numCols);
extern int16 *readAttrNumberCols(int numCols);
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index dd4eb8679d..36abe4cf9e 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -125,6 +125,18 @@ typedef struct PlannerGlobal
/* List of PartitionPruneInfo contained in the plan */
List *partPruneInfos;
+ /*
+ * Do any of those PartitionPruneInfos have initial pruning steps in them?
+ */
+ bool containsInitialPruning;
+
+ /*
+ * Indexes of all range table entries minus indexes of range table entries
+ * of the leaf partitions scanned by prunable subplans; see
+ * AcquireExecutorLocks()
+ */
+ Bitmapset *minLockRelids;
+
/* OIDs of relations the plan depends on */
List *relationOids;
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 2e202892a7..0cab6958d7 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -73,8 +73,17 @@ typedef struct PlannedStmt
List *partPruneInfos; /* List of PartitionPruneInfo contained in
* the plan */
+ bool containsInitialPruning; /* Do any of those PartitionPruneInfos
+ * have initial pruning steps in them?
+ */
+
List *rtable; /* list of RangeTblEntry nodes */
+ Bitmapset *minLockRelids; /* Indexes of all range table entries minus
+ * indexes of range table entries of the leaf
+ * partitions scanned by prunable subplans;
+ * see AcquireExecutorLocks() */
+
/* rtable indexes of target relations for INSERT/UPDATE/DELETE/MERGE */
List *resultRelations; /* integer list of RT indexes, or NIL */
@@ -1414,6 +1423,13 @@ typedef struct PlanRowMark
* prune_infos List of Lists containing PartitionedRelPruneInfo nodes,
* one sublist per run-time-prunable partition hierarchy
* appearing in the parent plan node's subplans.
+ *
+ * needs_init_pruning Does any of the PartitionedRelPruneInfos in
+ * prune_infos have its initial_pruning_steps set?
+ *
+ * needs_exec_pruning Does any of the PartitionedRelPruneInfos in
+ * prune_infos have its exec_pruning_steps set?
+ *
* other_subplans Indexes of any subplans that are not accounted for
* by any of the PartitionedRelPruneInfo nodes in
* "prune_infos". These subplans must not be pruned.
@@ -1425,6 +1441,8 @@ typedef struct PartitionPruneInfo
NodeTag type;
Bitmapset *root_parent_relids;
List *prune_infos;
+ bool needs_init_pruning;
+ bool needs_exec_pruning;
Bitmapset *other_subplans;
} PartitionPruneInfo;
@@ -1469,6 +1487,9 @@ typedef struct PartitionedRelPruneInfo
/* relation OID by partition index, or 0 */
Oid *relid_map pg_node_attr(array_size(nparts));
+ /* Range table index by partition index, or 0. */
+ Index *rti_map pg_node_attr(array_size(nparts));
+
/*
* initial_pruning_steps shows how to prune during executor startup (i.e.,
* without use of any PARAM_EXEC Params); it is NIL if no startup pruning
@@ -1553,6 +1574,31 @@ typedef struct PartitionPruneStepCombine
List *source_stepids;
} PartitionPruneStepCombine;
+/*----------------
+ * PartitionPruneResult
+ *
+ * The result of performing ExecPartitionDoInitialPruning() on a given
+ * PartitionPruneInfo.
+ *
+ * valid_subplan_offs contains the indexes of subplans remaining after
+ * performing initial pruning by calling ExecFindMatchingSubPlans() on the
+ * PartitionPruneInfo.
+ *
+ * This is used to store the result of initial partition pruning that is
+ * performed before the execution has started. A module that needs to do so
+ * should call ExecutorDoInitialPruning() on a given PlannedStmt, which
+ * returns a List of PartitionPruneResult containing an entry for each
+ * PartitionPruneInfo present in PlannedStmt.partPruneInfos. The module
+ * should then pass that list, along with the PlannedStmt, to the executor,
+ * so that it can reuse the result of initial partition pruning when
+ * initializing the subplans for execution.
+ */
+typedef struct PartitionPruneResult
+{
+ NodeTag type;
+
+ Bitmapset *valid_subplan_offs;
+} PartitionPruneResult;
/*
* Plan invalidation info
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 0499635f59..32579d4788 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -220,7 +220,8 @@ extern List *CachedPlanGetTargetList(CachedPlanSource *plansource,
extern CachedPlan *GetCachedPlan(CachedPlanSource *plansource,
ParamListInfo boundParams,
ResourceOwner owner,
- QueryEnvironment *queryEnv);
+ QueryEnvironment *queryEnv,
+ List **part_prune_results_list);
extern void ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner);
extern bool CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index aeddbdafe5..1901fc5f28 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -138,6 +138,7 @@ typedef struct PortalData
QueryCompletion qc; /* command completion data for executed query */
List *stmts; /* list of PlannedStmts */
CachedPlan *cplan; /* CachedPlan, if stmts are from one */
+ List *part_prune_results_list; /* List of Lists of PartitionPruneResults */
ParamListInfo portalParams; /* params to pass to query */
QueryEnvironment *queryEnv; /* environment for query */
@@ -242,6 +243,8 @@ extern void PortalDefineQuery(Portal portal,
CommandTag commandTag,
List *stmts,
CachedPlan *cplan);
+extern void PortalStorePartitionPruneResults(Portal portal,
+ List *part_prune_results_list);
extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
extern void PortalCreateHoldStore(Portal portal);
extern void PortalHashTableDeleteAll(void);
--
2.35.3
* Re: generic plans and "initial" pruning
@ 2022-12-02 10:40 Amit Langote <[email protected]>
parent: Amit Langote <[email protected]>
0 siblings, 1 reply; 71+ messages in thread
From: Amit Langote @ 2022-12-02 10:40 UTC (permalink / raw)
To: Alvaro Herrera <[email protected]>; +Cc: Robert Haas <[email protected]>; Jacob Champion <[email protected]>; Zhihong Yu <[email protected]>; David Rowley <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers
On Thu, Dec 1, 2022 at 9:43 PM Amit Langote <[email protected]> wrote:
> On Thu, Dec 1, 2022 at 8:21 PM Alvaro Herrera <[email protected]> wrote:
> > On 2022-Dec-01, Amit Langote wrote:
> > > Hmm, how about keeping the [Merge]Append's parent relation's RT index
> > > in the PartitionPruneInfo and passing it down to
> > > ExecInitPartitionPruning() from ExecInit[Merge]Append() for
> > > cross-checking? Both Append and MergeAppend already have a
> > > 'apprelids' field that we can save a copy of in the
> > > PartitionPruneInfo. Tried that in the attached delta patch.
> >
> > Ah yeah, that sounds about what I was thinking. I've merged that in and
> > pushed to github, which had a strange pg_upgrade failure on Windows
> > mentioning log files that were not captured by the CI tooling. So I
> > pushed another one trying to grab those files, in case it wasn't a
> > one-off failure. It's running now:
> > https://cirrus-ci.com/task/5857239638999040
> >
> > If all goes well with this run, I'll get this 0001 pushed.
>
> Thanks for pushing 0001.
>
> Rebased 0002 attached.
Thought it might be good for PartitionPruneResult to also have
root_parent_relids that matches with the corresponding
PartitionPruneInfo. ExecInitPartitionPruning() does a sanity check
that the root_parent_relids of a given pair of PartitionPrune{Info |
Result} match.
Posting the patch separately as the attached 0002, just in case you
might think that the extra cross-checking would be overkill.
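A minimal sketch of the cross-check described above, not the patch's code. It assumes a plain 64-bit mask can stand in for a Bitmapset, and every name in it (RelidSet, PruneInfoSketch, PruneResultSketch, make_prune_result, prune_result_matches, count_mismatches) is hypothetical. The idea shown is only that each result is stamped with the parent relids of the info it was computed from, so that a lookup by the same index on both lists can detect a misaligned pair.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for Bitmapset: bit N set means RT index N is a member. */
typedef uint64_t RelidSet;

/* Simplified stand-ins; the real nodes carry many more fields. */
typedef struct PruneInfoSketch
{
    RelidSet    root_parent_relids;
} PruneInfoSketch;

typedef struct PruneResultSketch
{
    RelidSet    root_parent_relids; /* copied from the matching info */
    uint64_t    valid_subplan_offs; /* surviving subplan offsets */
} PruneResultSketch;

/* Pruning step: stamp the result with the relids of the info it came from. */
static PruneResultSketch
make_prune_result(const PruneInfoSketch *info, uint64_t surviving)
{
    PruneResultSketch result;

    result.root_parent_relids = info->root_parent_relids;
    result.valid_subplan_offs = surviving;
    return result;
}

/* Executor-init step: the info and the result at one index must agree. */
static bool
prune_result_matches(const PruneInfoSketch *infos,
                     const PruneResultSketch *results,
                     size_t nentries, size_t part_prune_index)
{
    assert(part_prune_index < nentries);
    return infos[part_prune_index].root_parent_relids ==
        results[part_prune_index].root_parent_relids;
}

/* Count mismatching indexes, optionally after swapping the two results. */
static int
count_mismatches(bool swap_results)
{
    PruneInfoSketch infos[2] = {{UINT64_C(1) << 1}, {UINT64_C(1) << 4}};
    PruneResultSketch results[2];
    int         mismatches = 0;

    results[swap_results ? 1 : 0] = make_prune_result(&infos[0], 0x5);
    results[swap_results ? 0 : 1] = make_prune_result(&infos[1], 0x2);

    for (size_t i = 0; i < 2; i++)
        if (!prune_result_matches(infos, results, 2, i))
            mismatches++;
    return mismatches;
}
```

With the lists aligned, count_mismatches(false) finds no mismatch; with the two results swapped, count_mismatches(true) flags both indexes. In the patch a mismatch raises an ERROR rather than being counted.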
--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com
Attachments:
[application/octet-stream] v26-0002-Add-root_parent_relids-to-PartitionPruneResult.patch (3.4K, 2-v26-0002-Add-root_parent_relids-to-PartitionPruneResult.patch)
From f1af32816635254773386630b634835bd26d1227 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Fri, 2 Dec 2022 19:32:14 +0900
Subject: [PATCH v26 2/2] Add root_parent_relids to PartitionPruneResult
It's the same as the corresponding PartitionPruneInfo's root_parent_relids.
Like PartitionPruneInfo.root_parent_relids, it's there for
cross-checking that a PartitionPruneResult found at a given plan node's
part_prune_index actually matches the plan node.
---
src/backend/executor/execMain.c | 2 ++
src/backend/executor/execPartition.c | 13 +++++++++++--
src/include/nodes/plannodes.h | 7 +++++++
3 files changed, 20 insertions(+), 2 deletions(-)
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 7a4db80104..1e84e47d46 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -145,6 +145,8 @@ ExecutorDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
PartitionPruneInfo *pruneinfo = lfirst(lc);
PartitionPruneResult *pruneresult = makeNode(PartitionPruneResult);
+ pruneresult->root_parent_relids =
+ bms_copy(pruneinfo->root_parent_relids);
pruneresult->valid_subplan_offs =
ExecPartitionDoInitialPruning(plannedstmt, params, pruneinfo,
scan_leafpart_rtis);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 13e450c0fa..eda14d6241 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -1852,8 +1852,17 @@ ExecInitPartitionPruning(PlanState *planstate,
*/
if (estate->es_part_prune_results)
{
- pruneresult = list_nth(estate->es_part_prune_results, part_prune_index);
- Assert(IsA(pruneresult, PartitionPruneResult));
+ pruneresult = list_nth_node(PartitionPruneResult,
+ estate->es_part_prune_results,
+ part_prune_index);
+ if (!bms_equal(pruneresult->root_parent_relids,
+ pruneinfo->root_parent_relids))
+ ereport(ERROR,
+ errcode(ERRCODE_INTERNAL_ERROR),
+ errmsg_internal("mismatching PartitionPruneInfo and PartitionPruneResult at part_prune_index %d",
+ part_prune_index),
+ errdetail_internal("pruneresult relids %s, pruneinfo relids %s",
+ bmsToString(pruneresult->root_parent_relids),
+ bmsToString(pruneinfo->root_parent_relids)));
}
if (pruneresult == NULL || pruneinfo->needs_exec_pruning)
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 0cab6958d7..30f51414e9 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -1580,6 +1580,12 @@ typedef struct PartitionPruneStepCombine
* The result of performing ExecPartitionDoInitialPruning() on a given
* PartitionPruneInfo.
*
+ * root_parent_relids is the same as PartitionPruneInfo.root_parent_relids. It's
+ * there for cross-checking in ExecInitPartitionPruning() that the
+ * PartitionPruneResult and the PartitionPruneInfo at a given index in
+ * EState.es_part_prune_results and EState.es_part_prune_infos, respectively,
+ * belong to the same parent plan node.
+ *
* valid_subplan_offs contains the indexes of subplans remaining after
* performing initial pruning by calling ExecFindMatchingSubPlans() on the
* PartitionPruneInfo.
@@ -1597,6 +1603,7 @@ typedef struct PartitionPruneResult
{
NodeTag type;
+ Bitmapset *root_parent_relids;
Bitmapset *valid_subplan_offs;
} PartitionPruneResult;
--
2.35.3
[application/octet-stream] v26-0001-Optimize-AcquireExecutorLocks-by-locking-only-un.patch (82.5K, 3-v26-0001-Optimize-AcquireExecutorLocks-by-locking-only-un.patch)
From d8b8185b6ceb2a2a33a6af142f23a59fd93d5cdc Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Wed, 22 Dec 2021 16:55:17 +0900
Subject: [PATCH v26 1/2] Optimize AcquireExecutorLocks() by locking only
unpruned partitions
This commit teaches AcquireExecutorLocks() to perform initial
partition pruning to notionally eliminate the subnodes of a generic
cached plan that need not be initialized during the actual execution
of the plan, and to skip locking the partitions scanned by those
subnodes.
The result of performing initial partition pruning this way, before the
actual execution has started, is passed to the executor as a List of
PartitionPruneResult nodes, supplied along with the PlannedStmt by the
callers of the executor that used plancache.c to get the plan. It is NULL
in the cases in which the plan is obtained by calling the planner
directly or if the plan obtained by plancache.c is not a generic one.
---
src/backend/commands/copyto.c | 2 +-
src/backend/commands/createas.c | 2 +-
src/backend/commands/explain.c | 7 +-
src/backend/commands/extension.c | 2 +-
src/backend/commands/matview.c | 2 +-
src/backend/commands/prepare.c | 26 ++-
src/backend/executor/README | 32 ++++
src/backend/executor/execMain.c | 51 ++++++
src/backend/executor/execParallel.c | 26 ++-
src/backend/executor/execPartition.c | 238 +++++++++++++++++++++----
src/backend/executor/execUtils.c | 1 +
src/backend/executor/functions.c | 2 +-
src/backend/executor/nodeAppend.c | 11 +-
src/backend/executor/nodeMergeAppend.c | 5 +-
src/backend/executor/spi.c | 27 ++-
src/backend/nodes/readfuncs.c | 8 +-
src/backend/optimizer/plan/planner.c | 2 +
src/backend/optimizer/plan/setrefs.c | 46 +++++
src/backend/partitioning/partprune.c | 41 ++++-
src/backend/tcop/postgres.c | 8 +-
src/backend/tcop/pquery.c | 28 ++-
src/backend/utils/cache/plancache.c | 208 ++++++++++++++++++---
src/backend/utils/mmgr/portalmem.c | 19 ++
src/include/commands/explain.h | 4 +-
src/include/executor/execPartition.h | 9 +-
src/include/executor/execdesc.h | 3 +
src/include/executor/executor.h | 3 +
src/include/nodes/execnodes.h | 1 +
src/include/nodes/nodes.h | 1 +
src/include/nodes/pathnodes.h | 12 ++
src/include/nodes/plannodes.h | 46 +++++
src/include/utils/plancache.h | 3 +-
src/include/utils/portal.h | 3 +
33 files changed, 781 insertions(+), 98 deletions(-)
diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index f26cc0d162..401a2280a3 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -558,7 +558,7 @@ BeginCopyTo(ParseState *pstate,
((DR_copy *) dest)->cstate = cstate;
/* Create a QueryDesc requesting no output */
- cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ cstate->queryDesc = CreateQueryDesc(plan, NIL, pstate->p_sourcetext,
GetActiveSnapshot(),
InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 152c29b551..942449544c 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -325,7 +325,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ queryDesc = CreateQueryDesc(plan, NIL, pstate->p_sourcetext,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index f86983c660..2f2b558608 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -407,7 +407,7 @@ ExplainOneQuery(Query *query, int cursorOptions,
}
/* run it (if needed) and produce output */
- ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+ ExplainOnePlan(plan, NIL, into, es, queryString, params, queryEnv,
&planduration, (es->buffers ? &bufusage : NULL));
}
}
@@ -515,7 +515,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
* to call it.
*/
void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, List *part_prune_results,
+ IntoClause *into, ExplainState *es,
const char *queryString, ParamListInfo params,
QueryEnvironment *queryEnv, const instr_time *planduration,
const BufferUsage *bufusage)
@@ -563,7 +564,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
dest = None_Receiver;
/* Create a QueryDesc for the query */
- queryDesc = CreateQueryDesc(plannedstmt, queryString,
+ queryDesc = CreateQueryDesc(plannedstmt, part_prune_results, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, instrument_option);
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index cf1b1ca571..904cbcba4a 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -779,7 +779,7 @@ execute_sql_string(const char *sql)
{
QueryDesc *qdesc;
- qdesc = CreateQueryDesc(stmt,
+ qdesc = CreateQueryDesc(stmt, NIL,
sql,
GetActiveSnapshot(), NULL,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 9ac0383459..65c8d0aa59 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -408,7 +408,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, queryString,
+ queryDesc = CreateQueryDesc(plan, NIL, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 9e29584d93..29b45539d3 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -155,6 +155,7 @@ ExecuteQuery(ParseState *pstate,
PreparedStatement *entry;
CachedPlan *cplan;
List *plan_list;
+ List *part_prune_results_list;
ParamListInfo paramLI = NULL;
EState *estate = NULL;
Portal portal;
@@ -193,7 +194,10 @@ ExecuteQuery(ParseState *pstate,
entry->plansource->query_string);
/* Replan if needed, and increment plan refcount for portal */
- cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL);
+ cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL,
+ &part_prune_results_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_results_list));
plan_list = cplan->stmt_list;
/*
@@ -207,6 +211,9 @@ ExecuteQuery(ParseState *pstate,
plan_list,
cplan);
+ /* Copy Lists of PartitionPruneResults into the portal's context. */
+ PortalStorePartitionPruneResults(portal, part_prune_results_list);
+
/*
* For CREATE TABLE ... AS EXECUTE, we must verify that the prepared
* statement is one that produces tuples. Currently we insist that it be
@@ -576,7 +583,9 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
const char *query_string;
CachedPlan *cplan;
List *plan_list;
- ListCell *p;
+ List *part_prune_results_list;
+ ListCell *p,
+ *pp;
ParamListInfo paramLI = NULL;
EState *estate = NULL;
instr_time planstart;
@@ -619,7 +628,10 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
/* Replan if needed, and acquire a transient refcount */
cplan = GetCachedPlan(entry->plansource, paramLI,
- CurrentResourceOwner, queryEnv);
+ CurrentResourceOwner, queryEnv,
+ &part_prune_results_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_results_list));
INSTR_TIME_SET_CURRENT(planduration);
INSTR_TIME_SUBTRACT(planduration, planstart);
@@ -634,13 +646,15 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
plan_list = cplan->stmt_list;
/* Explain each query */
- foreach(p, plan_list)
+ forboth(p, plan_list, pp, part_prune_results_list)
{
PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
+ List *part_prune_results = lfirst_node(List, pp);
if (pstmt->commandType != CMD_UTILITY)
- ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv,
- &planduration, (es->buffers ? &bufusage : NULL));
+ ExplainOnePlan(pstmt, part_prune_results, into, es, query_string,
+ paramLI, queryEnv, &planduration,
+ (es->buffers ? &bufusage : NULL));
else
ExplainOneUtility(pstmt->utilityStmt, into, es, query_string,
paramLI, queryEnv);
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 17775a49e2..5c59ac5da7 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -65,6 +65,34 @@ found there. This currently only occurs for Append and MergeAppend nodes. In
this case the non-required subplans are ignored and the executor state's
subnode array will become out of sequence to the plan's subplan list.
+Actually, the so-called execution time pruning may also occur even before the
+execution has started. One case where that occurs is when a cached generic
+plan is being validated for execution by plancache.c: GetCachedPlan(), which
+works by locking all the relations that will be scanned by that plan. If the
+generic plan contains nodes that can perform execution time partition pruning
+(that is, contain a PartitionPruneInfo), a subset of pruning steps contained
+in a given node's PartitionPruneInfo that do not depend on the execution
+actually having started (called "initial" pruning steps) are performed to
+figure out the minimal set of child subplans that satisfy those pruning steps.
+AcquireExecutorLocks() looking at a given generic plan will then lock only the
+relations scanned by the child subplans that survived such pruning, along with
+those present in PlannedStmt.minLockRelids. Note that the subplans are only
+notionally pruned, that is, they are not removed from the plan tree as such.
+
+To prevent the executor and any third party execution code that can look at
+the plan tree from trying to execute the subplans that were pruned as
+described above, the result of pruning is passed to the executor as a List
+of PartitionPruneResult nodes via the QueryDesc. Each PartitionPruneResult
+consists of the set of indexes of surviving subplans in the respective parent
+plan node's (the one to which the corresponding PartitionPruneInfo belongs)
+list of child subplans, saved as a bitmapset (valid_subplan_offs). In other
+words, the executor executing a generic plan should not re-evaluate the set of
+initially valid subplans for a given plan node by redoing the initial pruning
+if it was already done by AcquireExecutorLocks() when validating the plan.
+Such re-evaluation of the pruning steps may very well end up resulting in a
+different set of subplans, containing some whose relations were not locked by
+AcquireExecutorLocks().
+
Each Plan node may have expression trees associated with it, to represent
its target list, qualification conditions, etc. These trees are also
read-only to the executor, but the executor state for expression evaluation
@@ -286,6 +314,10 @@ Query Processing Control Flow
This is a sketch of control flow for full query processing:
+ [ ExecutorDoInitialPruning ] --- an optional step to perform initial
+ partition pruning on the plan tree the result of which is passed
+ to the executor via QueryDesc
+
CreateQueryDesc
ExecutorStart
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index b6751da574..7a4db80104 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -49,6 +49,7 @@
#include "commands/matview.h"
#include "commands/trigger.h"
#include "executor/execdebug.h"
+#include "executor/execPartition.h"
#include "executor/nodeSubplan.h"
#include "foreign/fdwapi.h"
#include "jit/jit.h"
@@ -104,6 +105,54 @@ static void EvalPlanQualStart(EPQState *epqstate, Plan *planTree);
/* end of local decls */
+/* ----------------------------------------------------------------
+ * ExecutorDoInitialPruning
+ *
+ * For each plan tree node that has been assigned a PartitionPruneInfo,
+ * this performs initial partition pruning using the information contained
+ * therein to determine the set of child subplans that satisfy the initial
+ * pruning steps, to be returned as a bitmapset of their indexes in the
+ * node's list of child subplans (for example, an Append's appendplans).
+ *
+ * Return value is a List of PartitionPruneResult nodes, one for every
+ * PartitionPruneInfo, each containing the bitmapset computed for it. The RT
+ * indexes of all the leaf partitions scanned by the chosen subplans are
+ * returned in *scan_leafpart_rtis, which is shared across all
+ * PartitionPruneInfos.
+ *
+ * The executor must see exactly the same set of subplans as valid for
+ * execution when doing ExecInitNode() on the plan nodes whose
+ * PartitionPruneInfos are processed here. So, it must get the set from the
+ * corresponding PartitionPruneResult, instead of computing it all over
+ * again by redoing the initial pruning. It's the caller's job to pass the
+ * returned List to the executor.
+ *
+ * Note: Partitioned tables mentioned in PartitionedRelPruneInfo nodes that
+ * drive the pruning will be locked before doing the pruning.
+ * ----------------------------------------------------------------
+ */
+List *
+ExecutorDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
+ Bitmapset **scan_leafpart_rtis)
+{
+ List *part_prune_results = NIL;
+ ListCell *lc;
+
+ /* Only get here if there is any pruning to do. */
+ Assert(plannedstmt->containsInitialPruning);
+
+ foreach(lc, plannedstmt->partPruneInfos)
+ {
+ PartitionPruneInfo *pruneinfo = lfirst(lc);
+ PartitionPruneResult *pruneresult = makeNode(PartitionPruneResult);
+
+ pruneresult->valid_subplan_offs =
+ ExecPartitionDoInitialPruning(plannedstmt, params, pruneinfo,
+ scan_leafpart_rtis);
+ part_prune_results = lappend(part_prune_results, pruneresult);
+ }
+
+ return part_prune_results;
+}
/* ----------------------------------------------------------------
* ExecutorStart
@@ -806,6 +855,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
{
CmdType operation = queryDesc->operation;
PlannedStmt *plannedstmt = queryDesc->plannedstmt;
+ List *part_prune_results = queryDesc->part_prune_results;
Plan *plan = plannedstmt->planTree;
List *rangeTable = plannedstmt->rtable;
EState *estate = queryDesc->estate;
@@ -826,6 +876,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
estate->es_plannedstmt = plannedstmt;
estate->es_part_prune_infos = plannedstmt->partPruneInfos;
+ estate->es_part_prune_results = part_prune_results;
/*
* Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index aca0c6f323..917079a034 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -66,6 +66,7 @@
#define PARALLEL_KEY_QUERY_TEXT UINT64CONST(0xE000000000000008)
#define PARALLEL_KEY_JIT_INSTRUMENTATION UINT64CONST(0xE000000000000009)
#define PARALLEL_KEY_WAL_USAGE UINT64CONST(0xE00000000000000A)
+#define PARALLEL_KEY_PARTITION_PRUNE_RESULTS UINT64CONST(0xE00000000000000B)
#define PARALLEL_TUPLE_QUEUE_SIZE 65536
@@ -182,6 +183,7 @@ ExecSerializePlan(Plan *plan, EState *estate)
pstmt->transientPlan = false;
pstmt->dependsOnRole = false;
pstmt->parallelModeNeeded = false;
+ pstmt->containsInitialPruning = false;
pstmt->planTree = plan;
pstmt->partPruneInfos = estate->es_part_prune_infos;
pstmt->rtable = estate->es_range_table;
@@ -597,12 +599,15 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
FixedParallelExecutorState *fpes;
char *pstmt_data;
char *pstmt_space;
+ char *part_prune_results_data;
+ char *part_prune_results_space;
char *paramlistinfo_space;
BufferUsage *bufusage_space;
WalUsage *walusage_space;
SharedExecutorInstrumentation *instrumentation = NULL;
SharedJitInstrumentation *jit_instrumentation = NULL;
int pstmt_len;
+ int part_prune_results_len;
int paramlistinfo_len;
int instrumentation_len = 0;
int jit_instrumentation_len = 0;
@@ -631,6 +636,7 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
/* Fix up and serialize plan to be sent to workers. */
pstmt_data = ExecSerializePlan(planstate->plan, estate);
+ part_prune_results_data = nodeToString(estate->es_part_prune_results);
/* Create a parallel context. */
pcxt = CreateParallelContext("postgres", "ParallelQueryMain", nworkers);
@@ -657,6 +663,11 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
shm_toc_estimate_chunk(&pcxt->estimator, pstmt_len);
shm_toc_estimate_keys(&pcxt->estimator, 1);
+ /* Estimate space for serialized List of PartitionPruneResult. */
+ part_prune_results_len = strlen(part_prune_results_data) + 1;
+ shm_toc_estimate_chunk(&pcxt->estimator, part_prune_results_len);
+ shm_toc_estimate_keys(&pcxt->estimator, 1);
+
/* Estimate space for serialized ParamListInfo. */
paramlistinfo_len = EstimateParamListSpace(estate->es_param_list_info);
shm_toc_estimate_chunk(&pcxt->estimator, paramlistinfo_len);
@@ -751,6 +762,12 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
memcpy(pstmt_space, pstmt_data, pstmt_len);
shm_toc_insert(pcxt->toc, PARALLEL_KEY_PLANNEDSTMT, pstmt_space);
+ /* Store serialized List of PartitionPruneResult */
+ part_prune_results_space = shm_toc_allocate(pcxt->toc, part_prune_results_len);
+ memcpy(part_prune_results_space, part_prune_results_data, part_prune_results_len);
+ shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARTITION_PRUNE_RESULTS,
+ part_prune_results_space);
+
/* Store serialized ParamListInfo. */
paramlistinfo_space = shm_toc_allocate(pcxt->toc, paramlistinfo_len);
shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARAMLISTINFO, paramlistinfo_space);
@@ -1232,8 +1249,10 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
int instrument_options)
{
char *pstmtspace;
+ char *part_prune_results_space;
char *paramspace;
PlannedStmt *pstmt;
+ List *part_prune_results;
ParamListInfo paramLI;
char *queryString;
@@ -1244,12 +1263,17 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
pstmtspace = shm_toc_lookup(toc, PARALLEL_KEY_PLANNEDSTMT, false);
pstmt = (PlannedStmt *) stringToNode(pstmtspace);
+ /* Reconstruct leader-supplied PartitionPruneResult. */
+ part_prune_results_space =
+ shm_toc_lookup(toc, PARALLEL_KEY_PARTITION_PRUNE_RESULTS, false);
+ part_prune_results = (List *) stringToNode(part_prune_results_space);
+
/* Reconstruct ParamListInfo. */
paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
paramLI = RestoreParamList(&paramspace);
/* Create a QueryDesc for the query. */
- return CreateQueryDesc(pstmt,
+ return CreateQueryDesc(pstmt, part_prune_results,
queryString,
GetActiveSnapshot(), InvalidSnapshot,
receiver, paramLI, NULL, instrument_options);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 8e6453aec2..13e450c0fa 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -25,6 +25,7 @@
#include "mb/pg_wchar.h"
#include "miscadmin.h"
#include "nodes/makefuncs.h"
+#include "parser/parsetree.h"
#include "partitioning/partbounds.h"
#include "partitioning/partdesc.h"
#include "partitioning/partprune.h"
@@ -185,7 +186,11 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
static PartitionPruneState *CreatePartitionPruneState(PlanState *planstate,
- PartitionPruneInfo *pruneinfo);
+ PartitionPruneInfo *pruneinfo,
+ bool consider_initial_steps,
+ bool consider_exec_steps,
+ List *rtable, ExprContext *econtext,
+ PartitionDirectory partdir);
static void InitPartitionPruneContext(PartitionPruneContext *context,
List *pruning_steps,
PartitionDesc partdesc,
@@ -198,7 +203,8 @@ static void PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
- Bitmapset **validsubplans);
+ Bitmapset **validsubplans,
+ Bitmapset **scan_leafpart_rtis);
/*
@@ -1758,8 +1764,10 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
* considered to be a stable expression, it can change value from one plan
* node scan to the next during query execution. Stable comparison
* expressions that don't involve such Params allow partition pruning to be
- * done once during executor startup. Expressions that do involve such Params
- * require us to prune separately for each scan of the parent plan node.
+ * done once during executor startup or during ExecutorDoInitialPruning() that
+ * runs as part of performing AcquireExecutorLocks() on a given plan tree.
+ * Expressions that do involve such Params require us to prune separately for
+ * each scan of the parent plan node.
*
* Note that pruning away unneeded subplans during executor startup has the
* added benefit of not having to initialize the unneeded subplans at all.
@@ -1776,6 +1784,13 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
* account for initial pruning possibly having eliminated some of the
* subplans.
*
+ * ExecPartitionDoInitialPruning:
+ * Do initial pruning with the information contained in a given
+ * PartitionPruneInfo to determine the minimal set of child subplans
+ * to be executed of the parent plan node to which the PartitionPruneInfo
+ * belongs and also the set of the RT indexes of leaf partitions that will
+ * be scanned with those subplans.
+ *
* ExecFindMatchingSubPlans:
* Returns indexes of matching subplans after evaluating the expressions
* that are safe to evaluate at a given point. This function is first
@@ -1796,8 +1811,9 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
*
* On return, *initially_valid_subplans is assigned the set of indexes of
* child subplans that must be initialized along with the parent plan node.
- * Initial pruning is performed here if needed and in that case only the
- * surviving subplans' indexes are added.
+ * Initial pruning is performed here if needed (unless it has already been done
+ * by ExecutorDoInitialPruning()), and in that case only the surviving
+ * subplans' indexes are added.
*
* If subplans are indeed pruned, subplan_map arrays contained in the returned
* PartitionPruneState are re-sequenced to not count those, though only if the
@@ -1810,9 +1826,10 @@ ExecInitPartitionPruning(PlanState *planstate,
Bitmapset *root_parent_relids,
Bitmapset **initially_valid_subplans)
{
- PartitionPruneState *prunestate;
+ PartitionPruneState *prunestate = NULL;
EState *estate = planstate->state;
PartitionPruneInfo *pruneinfo;
+ PartitionPruneResult *pruneresult = NULL;
/* Obtain the pruneinfo we need, and make sure it's the right one */
pruneinfo = list_nth(estate->es_part_prune_infos, part_prune_index);
@@ -1828,20 +1845,57 @@ ExecInitPartitionPruning(PlanState *planstate,
/* We may need an expression context to evaluate partition exprs */
ExecAssignExprContext(estate, planstate);
- /* Create the working data structure for pruning */
- prunestate = CreatePartitionPruneState(planstate, pruneinfo);
+ /*
+ * No need to do initial pruning if it was done already by
+ * ExecutorDoInitialPruning(), which it would be if es_part_prune_results
+ * is set.
+ */
+ if (estate->es_part_prune_results)
+ {
+ pruneresult = list_nth(estate->es_part_prune_results, part_prune_index);
+ Assert(IsA(pruneresult, PartitionPruneResult));
+ }
+
+ if (pruneresult == NULL || pruneinfo->needs_exec_pruning)
+ {
+ /* We may need an expression context to evaluate partition exprs */
+ ExecAssignExprContext(estate, planstate);
+
+ /* For data reading, executor always omits detached partitions */
+ if (estate->es_partition_directory == NULL)
+ estate->es_partition_directory =
+ CreatePartitionDirectory(estate->es_query_cxt, false);
+
+ /*
+ * Create the working data structure for pruning. No need to consider
+ * initial pruning steps if we have a PartitionPruneResult.
+ */
+ prunestate = CreatePartitionPruneState(planstate, pruneinfo,
+ pruneresult == NULL,
+ pruneinfo->needs_exec_pruning,
+ NIL, planstate->ps_ExprContext,
+ estate->es_partition_directory);
+ }
/*
* Perform an initial partition prune pass, if required.
*/
- if (prunestate->do_initial_prune)
- *initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true);
+ if (pruneresult)
+ {
+ *initially_valid_subplans = bms_copy(pruneresult->valid_subplan_offs);
+ }
+ else if (prunestate && prunestate->do_initial_prune)
+ {
+ *initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true,
+ NULL);
+ }
else
{
- /* No pruning, so we'll need to initialize all subplans */
+ /* No initial pruning, so we'll need to initialize all subplans */
Assert(n_total_subplans > 0);
*initially_valid_subplans = bms_add_range(NULL, 0,
n_total_subplans - 1);
+ return prunestate;
}
/*
@@ -1849,7 +1903,8 @@ ExecInitPartitionPruning(PlanState *planstate,
* that were removed above due to initial pruning. No need to do this if
* no steps were removed.
*/
- if (bms_num_members(*initially_valid_subplans) < n_total_subplans)
+ if (prunestate &&
+ bms_num_members(*initially_valid_subplans) < n_total_subplans)
{
/*
* We can safely skip this when !do_exec_prune, even though that
@@ -1865,11 +1920,74 @@ ExecInitPartitionPruning(PlanState *planstate,
return prunestate;
}
+/*
+ * ExecPartitionDoInitialPruning
+ * Perform initial pruning using given PartitionPruneInfo to determine
+ * the minimal set of child subplans that will be executed and also the
+ * set of RT indexes of the leaf partitions scanned by those subplans.
+ */
+Bitmapset *
+ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
+ PartitionPruneInfo *pruneinfo,
+ Bitmapset **scan_leafpart_rtis)
+{
+ List *rtable = plannedstmt->rtable;
+ ExprContext *econtext;
+ PartitionDirectory pdir;
+ MemoryContext oldcontext,
+ tmpcontext;
+ PartitionPruneState *prunestate;
+ Bitmapset *valid_subplan_offs;
+
+ /*
+ * A temporary context for memory allocations required while executing
+ * partition pruning steps.
+ */
+ tmpcontext = AllocSetContextCreate(CurrentMemoryContext,
+ "initial pruning working data",
+ ALLOCSET_DEFAULT_SIZES);
+ oldcontext = MemoryContextSwitchTo(tmpcontext);
+
+ /*
+ * PartitionDirectory to look up partition descriptors.
+ * Note that we don't omit detached partitions, just like during
+ * execution proper.
+ */
+ pdir = CreatePartitionDirectory(CurrentMemoryContext, false);
+
+ /*
+ * We don't yet have a PlanState for the parent plan node, so we must
+ * create a standalone ExprContext to evaluate pruning expressions,
+ * equipped with the information about the EXTERN parameters that the
+ * caller passed us. That's okay because the initial pruning steps do
+ * not contain anything that requires execution to have started, so
+ * they don't need the information contained in a PlanState.
+ */
+ econtext = CreateStandaloneExprContext();
+ econtext->ecxt_param_list_info = params;
+ prunestate = CreatePartitionPruneState(NULL, pruneinfo, true, false,
+ rtable, econtext, pdir);
+ MemoryContextSwitchTo(oldcontext);
+
+ /* Do the initial pruning. */
+ valid_subplan_offs = ExecFindMatchingSubPlans(prunestate, true,
+ scan_leafpart_rtis);
+
+ FreeExprContext(econtext, true);
+ DestroyPartitionDirectory(pdir);
+ MemoryContextDelete(tmpcontext);
+
+ return valid_subplan_offs;
+}
+
/*
* CreatePartitionPruneState
* Build the data structure required for calling ExecFindMatchingSubPlans
*
- * 'planstate' is the parent plan node's execution state.
+ * 'planstate', if not NULL, is the parent plan node's execution state. It
+ * can be NULL if being called before ExecutorStart(), in which case,
+ * 'rtable' (range table), 'econtext', and 'partdir' must be explicitly
+ * provided.
*
* 'pruneinfo' is a PartitionPruneInfo as generated by
* make_partition_pruneinfo. Here we build a PartitionPruneState containing a
@@ -1883,19 +2001,21 @@ ExecInitPartitionPruning(PlanState *planstate,
* PartitionedRelPruneInfo.
*/
static PartitionPruneState *
-CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
+CreatePartitionPruneState(PlanState *planstate,
+ PartitionPruneInfo *pruneinfo,
+ bool consider_initial_steps,
+ bool consider_exec_steps,
+ List *rtable, ExprContext *econtext,
+ PartitionDirectory partdir)
{
- EState *estate = planstate->state;
+ EState *estate = planstate ? planstate->state : NULL;
PartitionPruneState *prunestate;
int n_part_hierarchies;
ListCell *lc;
int i;
- ExprContext *econtext = planstate->ps_ExprContext;
- /* For data reading, executor always omits detached partitions */
- if (estate->es_partition_directory == NULL)
- estate->es_partition_directory =
- CreatePartitionDirectory(estate->es_query_cxt, false);
+ Assert((estate != NULL) ||
+ (partdir != NULL && econtext != NULL && rtable != NIL));
n_part_hierarchies = list_length(pruneinfo->prune_infos);
Assert(n_part_hierarchies > 0);
@@ -1950,15 +2070,42 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
PartitionKey partkey;
/*
- * We can rely on the copies of the partitioned table's partition
- * key and partition descriptor appearing in its relcache entry,
- * because that entry will be held open and locked for the
- * duration of this executor run.
+ * We must open the relation ourselves when called before
+ * execution has started, such as when called during
+ * ExecutorDoInitialPruning() on a cached plan. In that case,
+ * sub-partitions must be locked, because AcquirePlannerLocks()
+ * would not have seen them. (1st relation in a partrelpruneinfos
+ * list is always the root partitioned table appearing in the
+ * query, which AcquirePlannerLocks() would have locked; the
+ * Assert in relation_open() guards that assumption.)
+ */
+ if (estate == NULL)
+ {
+ RangeTblEntry *rte = rt_fetch(pinfo->rtindex, rtable);
+ int lockmode = (j == 0) ? NoLock : rte->rellockmode;
+
+ partrel = table_open(rte->relid, lockmode);
+ }
+ else
+ partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
+
+ /*
+ * We can rely on the copy of the partitioned table's partition
+ * key in its relcache entry, because it can't change (or
+ * get destroyed) as long as the relation is locked. Partition
+ * descriptor is taken from the PartitionDirectory associated with
+ * the table that is held open long enough for the descriptor to
+ * remain valid while it's used to perform the pruning steps.
*/
- partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
partkey = RelationGetPartitionKey(partrel);
- partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
- partrel);
+ partdesc = PartitionDirectoryLookup(partdir, partrel);
+
+ /*
+ * Must close partrel, keeping the lock taken, if we're not using
+ * EState's entry.
+ */
+ if (estate == NULL)
+ table_close(partrel, NoLock);
/*
* Initialize the subplan_map and subpart_map.
@@ -1972,6 +2119,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
Assert(partdesc->nparts >= pinfo->nparts);
pprune->nparts = partdesc->nparts;
pprune->subplan_map = palloc(sizeof(int) * partdesc->nparts);
+ pprune->rti_map = palloc(sizeof(Index) * partdesc->nparts);
if (partdesc->nparts == pinfo->nparts)
{
/*
@@ -1982,6 +2130,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
pprune->subpart_map = pinfo->subpart_map;
memcpy(pprune->subplan_map, pinfo->subplan_map,
sizeof(int) * pinfo->nparts);
+ memcpy(pprune->rti_map, pinfo->rti_map,
+ sizeof(Index) * pinfo->nparts);
/*
* Double-check that the list of unpruned relations has not
@@ -2032,6 +2182,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
pinfo->subplan_map[pd_idx];
pprune->subpart_map[pp_idx] =
pinfo->subpart_map[pd_idx];
+ pprune->rti_map[pp_idx] =
+ pinfo->rti_map[pd_idx];
pd_idx++;
}
else
@@ -2039,6 +2191,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
/* this partdesc entry is not in the plan */
pprune->subplan_map[pp_idx] = -1;
pprune->subpart_map[pp_idx] = -1;
+ pprune->rti_map[pp_idx] = 0;
}
}
@@ -2060,7 +2213,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
* Initialize pruning contexts as needed.
*/
pprune->initial_pruning_steps = pinfo->initial_pruning_steps;
- if (pinfo->initial_pruning_steps)
+ if (consider_initial_steps && pinfo->initial_pruning_steps)
{
InitPartitionPruneContext(&pprune->initial_context,
pinfo->initial_pruning_steps,
@@ -2070,7 +2223,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
prunestate->do_initial_prune = true;
}
pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
- if (pinfo->exec_pruning_steps)
+ if (consider_exec_steps && pinfo->exec_pruning_steps)
{
InitPartitionPruneContext(&pprune->exec_context,
pinfo->exec_pruning_steps,
@@ -2298,10 +2451,14 @@ PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
* Pass initial_prune if PARAM_EXEC Params cannot yet be evaluated. This
* differentiates the initial executor-time pruning step from later
* runtime pruning.
+ *
+ * RT indexes of leaf partitions scanned by the chosen subplans are added to
+ * *scan_leafpart_rtis if the pointer is non-NULL.
*/
Bitmapset *
ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
- bool initial_prune)
+ bool initial_prune,
+ Bitmapset **scan_leafpart_rtis)
{
Bitmapset *result = NULL;
MemoryContext oldcontext;
@@ -2336,7 +2493,7 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
*/
pprune = &prunedata->partrelprunedata[0];
find_matching_subplans_recurse(prunedata, pprune, initial_prune,
- &result);
+ &result, scan_leafpart_rtis);
/* Expression eval may have used space in ExprContext too */
if (pprune->exec_pruning_steps)
@@ -2350,6 +2507,8 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
/* Copy result out of the temp context before we reset it */
result = bms_copy(result);
+ if (scan_leafpart_rtis)
+ *scan_leafpart_rtis = bms_copy(*scan_leafpart_rtis);
MemoryContextReset(prunestate->prune_context);
@@ -2360,13 +2519,15 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
* find_matching_subplans_recurse
* Recursive worker function for ExecFindMatchingSubPlans
*
- * Adds valid (non-prunable) subplan IDs to *validsubplans
+ * Adds valid (non-prunable) subplan IDs to *validsubplans and RT indexes of
+ * the corresponding leaf partitions to *scan_leafpart_rtis (if asked for).
*/
static void
find_matching_subplans_recurse(PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
- Bitmapset **validsubplans)
+ Bitmapset **validsubplans,
+ Bitmapset **scan_leafpart_rtis)
{
Bitmapset *partset;
int i;
@@ -2393,8 +2554,14 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
while ((i = bms_next_member(partset, i)) >= 0)
{
if (pprune->subplan_map[i] >= 0)
+ {
*validsubplans = bms_add_member(*validsubplans,
pprune->subplan_map[i]);
+ Assert(pprune->rti_map[i] > 0);
+ if (scan_leafpart_rtis)
+ *scan_leafpart_rtis = bms_add_member(*scan_leafpart_rtis,
+ pprune->rti_map[i]);
+ }
else
{
int partidx = pprune->subpart_map[i];
@@ -2402,7 +2569,8 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
if (partidx >= 0)
find_matching_subplans_recurse(prunedata,
&prunedata->partrelprunedata[partidx],
- initial_prune, validsubplans);
+ initial_prune, validsubplans,
+ scan_leafpart_rtis);
else
{
/*
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 9695de85b9..dce93a8c9f 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -135,6 +135,7 @@ CreateExecutorState(void)
estate->es_param_exec_vals = NULL;
estate->es_queryEnv = NULL;
+ estate->es_part_prune_results = NIL;
estate->es_query_cxt = qcontext;
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index dc13625171..bffb42ce71 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -842,7 +842,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
else
dest = None_Receiver;
- es->qd = CreateQueryDesc(es->stmt,
+ es->qd = CreateQueryDesc(es->stmt, NIL,
fcache->src,
GetActiveSnapshot(),
InvalidSnapshot,
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 99830198bd..3b917584de 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -156,7 +156,8 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
* subplan, we can fill as_valid_subplans immediately, preventing
* later calls to ExecFindMatchingSubPlans.
*/
- if (!prunestate->do_exec_prune && nplans > 0)
+ if (appendstate->as_prune_state == NULL ||
+ (!appendstate->as_prune_state->do_exec_prune && nplans > 0))
appendstate->as_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
}
else
@@ -578,7 +579,7 @@ choose_next_subplan_locally(AppendState *node)
}
else if (node->as_valid_subplans == NULL)
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
whichplan = -1;
}
@@ -643,7 +644,7 @@ choose_next_subplan_for_leader(AppendState *node)
if (node->as_valid_subplans == NULL)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
/*
* Mark each invalid plan as finished to allow the loop below to
@@ -718,7 +719,7 @@ choose_next_subplan_for_worker(AppendState *node)
else if (node->as_valid_subplans == NULL)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
mark_invalid_subplans_as_finished(node);
}
@@ -869,7 +870,7 @@ ExecAppendAsyncBegin(AppendState *node)
if (node->as_valid_subplans == NULL)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
classify_matching_subplans(node);
}
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index f370f9f287..ccfa083945 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -104,7 +104,8 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
* subplan, we can fill ms_valid_subplans immediately, preventing
* later calls to ExecFindMatchingSubPlans.
*/
- if (!prunestate->do_exec_prune && nplans > 0)
+ if (mergestate->ms_prune_state == NULL ||
+ (!mergestate->ms_prune_state->do_exec_prune && nplans > 0))
mergestate->ms_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
}
else
@@ -219,7 +220,7 @@ ExecMergeAppend(PlanState *pstate)
*/
if (node->ms_valid_subplans == NULL)
node->ms_valid_subplans =
- ExecFindMatchingSubPlans(node->ms_prune_state, false);
+ ExecFindMatchingSubPlans(node->ms_prune_state, false, NULL);
/*
* First time through: pull the first tuple from each valid subplan,
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index fd5796f1b9..93012a5b3b 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1578,6 +1578,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
CachedPlanSource *plansource;
CachedPlan *cplan;
List *stmt_list;
+ List *part_prune_results_list;
char *query_string;
Snapshot snapshot;
MemoryContext oldcontext;
@@ -1657,7 +1658,10 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
*/
/* Replan if needed, and increment plan refcount for portal */
- cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv);
+ cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv,
+ &part_prune_results_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_results_list));
stmt_list = cplan->stmt_list;
if (!plan->saved)
@@ -1685,6 +1689,9 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
stmt_list,
cplan);
+ /* Copy Lists of PartitionPruneResults into the portal's context. */
+ PortalStorePartitionPruneResults(portal, part_prune_results_list);
+
/*
* Set up options for portal. Default SCROLL type is chosen the same way
* as PerformCursorOpen does it.
@@ -2092,7 +2099,8 @@ SPI_plan_get_cached_plan(SPIPlanPtr plan)
/* Get the generic plan for the query */
cplan = GetCachedPlan(plansource, NULL,
plan->saved ? CurrentResourceOwner : NULL,
- _SPI_current->queryEnv);
+ _SPI_current->queryEnv,
+ NULL /* Not interested in PartitionPruneResults */);
Assert(cplan == plansource->gplan);
/* Pop the error context stack */
@@ -2473,7 +2481,9 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
{
CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
List *stmt_list;
- ListCell *lc2;
+ List *part_prune_results_list;
+ ListCell *lc2,
+ *lc3;
spicallbackarg.query = plansource->query_string;
@@ -2549,8 +2559,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
* plan, the refcount must be backed by the plan_owner.
*/
cplan = GetCachedPlan(plansource, options->params,
- plan_owner, _SPI_current->queryEnv);
-
+ plan_owner, _SPI_current->queryEnv,
+ &part_prune_results_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_results_list));
stmt_list = cplan->stmt_list;
/*
@@ -2589,9 +2601,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
}
}
- foreach(lc2, stmt_list)
+ forboth(lc2, stmt_list, lc3, part_prune_results_list)
{
PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+ List *part_prune_results = lfirst_node(List, lc3);
bool canSetTag = stmt->canSetTag;
DestReceiver *dest;
@@ -2663,7 +2676,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
else
snap = InvalidSnapshot;
- qdesc = CreateQueryDesc(stmt,
+ qdesc = CreateQueryDesc(stmt, part_prune_results,
plansource->query_string,
snap, crosscheck_snapshot,
dest,
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 23776367c5..b01f55fb4f 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -158,6 +158,11 @@
token = pg_strtok(&length); /* skip :fldname */ \
local_node->fldname = readIntCols(len)
+/* Read an Index array */
+#define READ_INDEX_ARRAY(fldname, len) \
+ token = pg_strtok(&length); /* skip :fldname */ \
+ local_node->fldname = readIndexCols(len)
+
/* Read a bool array */
#define READ_BOOL_ARRAY(fldname, len) \
token = pg_strtok(&length); /* skip :fldname */ \
@@ -800,7 +805,6 @@ fnname(int numCols) \
*/
READ_SCALAR_ARRAY(readAttrNumberCols, int16, atoi)
READ_SCALAR_ARRAY(readOidCols, Oid, atooid)
-/* outfuncs.c has writeIndexCols, but we don't yet need that here */
-/* READ_SCALAR_ARRAY(readIndexCols, Index, atoui) */
+READ_SCALAR_ARRAY(readIndexCols, Index, atoui)
READ_SCALAR_ARRAY(readIntCols, int, atoi)
READ_SCALAR_ARRAY(readBoolCols, bool, strtobool)
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 799602f5ea..a96d316dca 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -520,7 +520,9 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
result->parallelModeNeeded = glob->parallelModeNeeded;
result->planTree = top_plan;
result->partPruneInfos = glob->partPruneInfos;
+ result->containsInitialPruning = glob->containsInitialPruning;
result->rtable = glob->finalrtable;
+ result->minLockRelids = glob->minLockRelids;
result->resultRelations = glob->resultRelations;
result->appendRelations = glob->appendRelations;
result->subplans = glob->subplans;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index e67f0e3509..5820f26fdb 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -270,6 +270,16 @@ set_plan_references(PlannerInfo *root, Plan *plan)
*/
add_rtes_to_flat_rtable(root, false);
+ /*
+ * Add the query's adjusted range of RT indexes to glob->minLockRelids.
+ * The adjusted RT indexes of prunable relations will be deleted from the
+ * set below where PartitionPruneInfos are processed.
+ */
+ glob->minLockRelids =
+ bms_add_range(glob->minLockRelids,
+ rtoffset + 1,
+ rtoffset + list_length(root->parse->rtable));
+
/*
* Adjust RT indexes of PlanRowMarks and add to final rowmarks list
*/
@@ -352,6 +362,7 @@ set_plan_references(PlannerInfo *root, Plan *plan)
foreach (lc, root->partPruneInfos)
{
PartitionPruneInfo *pruneinfo = lfirst(lc);
+ Bitmapset *leafpart_rtis = NULL;
ListCell *l;
pruneinfo->root_parent_relids =
@@ -364,15 +375,50 @@ set_plan_references(PlannerInfo *root, Plan *plan)
foreach(l2, prune_infos)
{
PartitionedRelPruneInfo *pinfo = lfirst(l2);
+ int i;
/* RT index of the table to which the pinfo belongs. */
pinfo->rtindex += rtoffset;
+
+ /* Also of the leaf partitions that might be scanned. */
+ for (i = 0; i < pinfo->nparts; i++)
+ {
+ if (pinfo->rti_map[i] > 0 && pinfo->subplan_map[i] >= 0)
+ {
+ pinfo->rti_map[i] += rtoffset;
+ leafpart_rtis = bms_add_member(leafpart_rtis,
+ pinfo->rti_map[i]);
+ }
+ }
}
}
+ if (pruneinfo->needs_init_pruning)
+ {
+ glob->containsInitialPruning = true;
+
+ /*
+ * Delete the leaf partition RTIs from the global set of relations
+ * to be locked before executing the plan. AcquireExecutorLocks()
+ * will find the ones to add to the set after performing initial
+ * pruning.
+ */
+ glob->minLockRelids = bms_del_members(glob->minLockRelids,
+ leafpart_rtis);
+ }
+
glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
}
+ /*
+ * It seems worth doing a bms_copy() on glob->minLockRelids if we deleted
+ * bits from it above to get rid of any empty tail bits. It seems better
+ * for the loop over this set in AcquireExecutorLocks() to not have to go
+ * through those useless bit words.
+ */
+ if (glob->containsInitialPruning)
+ glob->minLockRelids = bms_copy(glob->minLockRelids);
+
return result;
}
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index d48f6784c1..d5556354f7 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -144,7 +144,9 @@ static List *make_partitionedrel_pruneinfo(PlannerInfo *root,
List *prunequal,
Bitmapset *partrelids,
int *relid_subplan_map,
- Bitmapset **matchedsubplans);
+ Bitmapset **matchedsubplans,
+ bool *needs_init_pruning,
+ bool *needs_exec_pruning);
static void gen_partprune_steps(RelOptInfo *rel, List *clauses,
PartClauseTarget target,
GeneratePruningStepsContext *context);
@@ -234,6 +236,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int *relid_subplan_map;
ListCell *lc;
int i;
+ bool needs_init_pruning = false;
+ bool needs_exec_pruning = false;
/*
* Scan the subpaths to see which ones are scans of partition child
@@ -313,12 +317,16 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
Bitmapset *partrelids = (Bitmapset *) lfirst(lc);
List *pinfolist;
Bitmapset *matchedsubplans = NULL;
+ bool partrel_needs_init_pruning;
+ bool partrel_needs_exec_pruning;
pinfolist = make_partitionedrel_pruneinfo(root, parentrel,
prunequal,
partrelids,
relid_subplan_map,
- &matchedsubplans);
+ &matchedsubplans,
+ &partrel_needs_init_pruning,
+ &partrel_needs_exec_pruning);
/* When pruning is possible, record the matched subplans */
if (pinfolist != NIL)
@@ -327,6 +335,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
allmatchedsubplans = bms_join(matchedsubplans,
allmatchedsubplans);
}
+
+ needs_init_pruning |= partrel_needs_init_pruning;
+ needs_exec_pruning |= partrel_needs_exec_pruning;
}
pfree(relid_subplan_map);
@@ -342,6 +353,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
pruneinfo = makeNode(PartitionPruneInfo);
pruneinfo->root_parent_relids = parentrel->relids;
pruneinfo->prune_infos = prunerelinfos;
+ pruneinfo->needs_init_pruning = needs_init_pruning;
+ pruneinfo->needs_exec_pruning = needs_exec_pruning;
/*
* Some subplans may not belong to any of the identified partitioned rels.
@@ -442,13 +455,18 @@ add_part_relids(List *allpartrelids, Bitmapset *partrelids)
* If we cannot find any useful run-time pruning steps, return NIL.
* However, on success, each rel identified in partrelids will have
* an element in the result list, even if some of them are useless.
+ * *needs_init_pruning and *needs_exec_pruning are set to indicate that the
+ * returned PartitionedRelPruneInfos contain pruning steps that can be
+ * performed before and after execution begins, respectively.
*/
static List *
make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
List *prunequal,
Bitmapset *partrelids,
int *relid_subplan_map,
- Bitmapset **matchedsubplans)
+ Bitmapset **matchedsubplans,
+ bool *needs_init_pruning,
+ bool *needs_exec_pruning)
{
RelOptInfo *targetpart = NULL;
List *pinfolist = NIL;
@@ -459,6 +477,10 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int rti;
int i;
+ /* Will find out below. */
+ *needs_init_pruning = false;
+ *needs_exec_pruning = false;
+
/*
* Examine each partitioned rel, constructing a temporary array to map
* from planner relids to index of the partitioned rel, and building a
@@ -546,6 +568,9 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
* executor per-scan pruning steps. This first pass creates startup
* pruning steps and detects whether there's any possibly-useful quals
* that would require per-scan pruning.
+ *
+ * The first pass also determines whether the second pass is necessary
+ * by checking for the presence of EXEC parameters.
*/
gen_partprune_steps(subpart, partprunequal, PARTTARGET_INITIAL,
&context);
@@ -620,6 +645,12 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
pinfo->execparamids = execparamids;
/* Remaining fields will be filled in the next loop */
+ /* record which types of pruning steps we've seen so far */
+ if (initial_pruning_steps != NIL)
+ *needs_init_pruning = true;
+ if (exec_pruning_steps != NIL)
+ *needs_exec_pruning = true;
+
pinfolist = lappend(pinfolist, pinfo);
}
@@ -647,6 +678,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int *subplan_map;
int *subpart_map;
Oid *relid_map;
+ Index *rti_map;
/*
* Construct the subplan and subpart maps for this partitioning level.
@@ -659,6 +691,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
subpart_map = (int *) palloc(nparts * sizeof(int));
memset(subpart_map, -1, nparts * sizeof(int));
relid_map = (Oid *) palloc0(nparts * sizeof(Oid));
+ rti_map = (Index *) palloc0(nparts * sizeof(Index));
present_parts = NULL;
i = -1;
@@ -673,6 +706,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
subplan_map[i] = subplanidx = relid_subplan_map[partrel->relid] - 1;
subpart_map[i] = subpartidx = relid_subpart_map[partrel->relid] - 1;
relid_map[i] = planner_rt_fetch(partrel->relid, root)->relid;
+ rti_map[i] = partrel->relid;
if (subplanidx >= 0)
{
present_parts = bms_add_member(present_parts, i);
@@ -697,6 +731,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
pinfo->subplan_map = subplan_map;
pinfo->subpart_map = subpart_map;
pinfo->relid_map = relid_map;
+ pinfo->rti_map = rti_map;
}
pfree(relid_subpart_map);
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 3082093d1e..95ab1d0eef 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1598,6 +1598,7 @@ exec_bind_message(StringInfo input_message)
int16 *rformats = NULL;
CachedPlanSource *psrc;
CachedPlan *cplan;
+ List *part_prune_results_list;
Portal portal;
char *query_string;
char *saved_stmt_name;
@@ -1972,7 +1973,9 @@ exec_bind_message(StringInfo input_message)
* will be generated in MessageContext. The plan refcount will be
* assigned to the Portal, so it will be released at portal destruction.
*/
- cplan = GetCachedPlan(psrc, params, NULL, NULL);
+ cplan = GetCachedPlan(psrc, params, NULL, NULL, &part_prune_results_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_results_list));
/*
* Now we can define the portal.
@@ -1987,6 +1990,9 @@ exec_bind_message(StringInfo input_message)
cplan->stmt_list,
cplan);
+ /* Copy Lists of PartitionPruneResults into the portal's context. */
+ PortalStorePartitionPruneResults(portal, part_prune_results_list);
+
/* Done with the snapshot used for parameter I/O and parsing/planning */
if (snapshot_set)
PopActiveSnapshot();
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 52e2db6452..280ed7d239 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -35,7 +35,7 @@
Portal ActivePortal = NULL;
-static void ProcessQuery(PlannedStmt *plan,
+static void ProcessQuery(PlannedStmt *plan, List *part_prune_results,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -65,6 +65,7 @@ static void DoPortalRewind(Portal portal);
*/
QueryDesc *
CreateQueryDesc(PlannedStmt *plannedstmt,
+ List *part_prune_results,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
@@ -77,6 +78,8 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
qd->operation = plannedstmt->commandType; /* operation */
qd->plannedstmt = plannedstmt; /* plan */
+ qd->part_prune_results = part_prune_results; /* ExecutorDoInitialPruning()
+ * output for plan */
qd->sourceText = sourceText; /* query text */
qd->snapshot = RegisterSnapshot(snapshot); /* snapshot */
/* RI check snapshot */
@@ -122,6 +125,7 @@ FreeQueryDesc(QueryDesc *qdesc)
* PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
*
* plan: the plan tree for the query
+ * part_prune_results: ExecutorDoInitialPruning() output for the PlannedStmt
* sourceText: the source text of the query
* params: any parameters needed
* dest: where to send results
@@ -134,6 +138,7 @@ FreeQueryDesc(QueryDesc *qdesc)
*/
static void
ProcessQuery(PlannedStmt *plan,
+ List *part_prune_results,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -145,7 +150,7 @@ ProcessQuery(PlannedStmt *plan,
/*
* Create the QueryDesc object
*/
- queryDesc = CreateQueryDesc(plan, sourceText,
+ queryDesc = CreateQueryDesc(plan, part_prune_results, sourceText,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
@@ -491,8 +496,13 @@ PortalStart(Portal portal, ParamListInfo params,
/*
* Create QueryDesc in portal's context; for the moment, set
* the destination to DestNone.
+ *
+ * There is no PartitionPruneResult unless the PlannedStmt is
+ * from a CachedPlan.
*/
queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+ portal->part_prune_results_list == NIL ? NIL :
+ linitial(portal->part_prune_results_list),
portal->sourceText,
GetActiveSnapshot(),
InvalidSnapshot,
@@ -1225,6 +1235,8 @@ PortalRunMulti(Portal portal,
if (pstmt->utilityStmt == NULL)
{
+ List *part_prune_results = NIL;
+
/*
* process a plannable query.
*/
@@ -1271,10 +1283,18 @@ PortalRunMulti(Portal portal,
else
UpdateActiveSnapshotCommandId();
+ /*
+ * Determine if there's a corresponding List of PartitionPruneResult
+ * for this PlannedStmt.
+ */
+ if (portal->part_prune_results_list != NIL)
+ part_prune_results = list_nth(portal->part_prune_results_list,
+ foreach_current_index(stmtlist_item));
+
if (pstmt->canSetTag)
{
/* statement can set tag string */
- ProcessQuery(pstmt,
+ ProcessQuery(pstmt, part_prune_results,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
@@ -1283,7 +1303,7 @@ PortalRunMulti(Portal portal,
else
{
/* stmt added by rewrite cannot set tag */
- ProcessQuery(pstmt,
+ ProcessQuery(pstmt, part_prune_results,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index cc943205d3..af6fae6e3b 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -99,14 +99,19 @@ static dlist_head cached_expression_list = DLIST_STATIC_INIT(cached_expression_l
static void ReleaseGenericPlan(CachedPlanSource *plansource);
static List *RevalidateCachedQuery(CachedPlanSource *plansource,
QueryEnvironment *queryEnv);
-static bool CheckCachedPlan(CachedPlanSource *plansource);
+static bool CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+ List **part_prune_results_list);
static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
- ParamListInfo boundParams, QueryEnvironment *queryEnv);
+ ParamListInfo boundParams, QueryEnvironment *queryEnv,
+ List **part_prune_results_list);
static bool choose_custom_plan(CachedPlanSource *plansource,
ParamListInfo boundParams);
static double cached_plan_cost(CachedPlan *plan, bool include_planner);
static Query *QueryListGetPrimaryStmt(List *stmts);
-static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static void AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+ List **part_prune_results_list,
+ List **lockedRelids_per_stmt);
+static void ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt);
static void AcquirePlannerLocks(List *stmt_list, bool acquire);
static void ScanQueryForLocks(Query *parsetree, bool acquire);
static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -782,6 +787,26 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
return tlist;
}
+/*
+ * FreePartitionPruneResults
+ * Frees the List of Lists of PartitionPruneResults for CheckCachedPlan()
+ */
+static void
+FreePartitionPruneResults(List *part_prune_results_list)
+{
+ ListCell *lc;
+
+ foreach(lc, part_prune_results_list)
+ {
+ List *part_prune_results = lfirst(lc);
+
+ /* Free both the PartitionPruneResults and the containing List. */
+ list_free_deep(part_prune_results);
+ }
+
+ list_free(part_prune_results_list);
+}
+
/*
* CheckCachedPlan: see if the CachedPlanSource's generic plan is valid.
*
@@ -790,15 +815,20 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
*
* On a "true" return, we have acquired the locks needed to run the plan.
* (We must do this for the "true" result to be race-condition-free.)
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_results_list.
*/
static bool
-CheckCachedPlan(CachedPlanSource *plansource)
+CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+ List **part_prune_results_list)
{
CachedPlan *plan = plansource->gplan;
/* Assert that caller checked the querytree */
Assert(plansource->is_valid);
+ *part_prune_results_list = NIL;
+
/* If there's no generic plan, just say "false" */
if (!plan)
return false;
@@ -820,13 +850,21 @@ CheckCachedPlan(CachedPlanSource *plansource)
*/
if (plan->is_valid)
{
+ List *lockedRelids_per_stmt;
+
/*
* Plan must have positive refcount because it is referenced by
* plansource; so no need to fear it disappears under us here.
*/
Assert(plan->refcount > 0);
- AcquireExecutorLocks(plan->stmt_list, true);
+ /*
+ * Lock relations scanned by the plan. This is where the pruning
+ * happens if needed.
+ */
+ AcquireExecutorLocks(plan->stmt_list, boundParams,
+ part_prune_results_list,
+ &lockedRelids_per_stmt);
/*
* If plan was transient, check to see if TransactionXmin has
@@ -848,7 +886,11 @@ CheckCachedPlan(CachedPlanSource *plansource)
}
/* Oops, the race case happened. Release useless locks. */
- AcquireExecutorLocks(plan->stmt_list, false);
+ ReleaseExecutorLocks(plan->stmt_list, lockedRelids_per_stmt);
+
+ /* Release any PartitionPruneResults that may have been created. */
+ FreePartitionPruneResults(*part_prune_results_list);
+ *part_prune_results_list = NIL;
}
/*
@@ -874,10 +916,14 @@ CheckCachedPlan(CachedPlanSource *plansource)
* Planning work is done in the caller's memory context. The finished plan
* is in a child memory context, which typically should get reparented
* (unless this is a one-shot plan, in which case we don't copy the plan).
+ *
+ * A list of NILs is returned in *part_prune_results_list, meaning that no
+ * partition pruning has been done yet for the plans in stmt_list.
*/
static CachedPlan *
BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
- ParamListInfo boundParams, QueryEnvironment *queryEnv)
+ ParamListInfo boundParams, QueryEnvironment *queryEnv,
+ List **part_prune_results_list)
{
CachedPlan *plan;
List *plist;
@@ -1007,6 +1053,17 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
MemoryContextSwitchTo(oldcxt);
+ /*
+ * There are no PartitionPruneResults to add yet, but the list must still
+ * be initialized to have the same number of elements as the list of
+ * PlannedStmts.
+ */
+ *part_prune_results_list = NIL;
+ foreach(lc, plist)
+ {
+ *part_prune_results_list = lappend(*part_prune_results_list, NIL);
+ }
+
return plan;
}
@@ -1126,6 +1183,19 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
* plan or a custom plan for the given parameters: the caller does not know
* which it will get.
*
+ * For every PlannedStmt in the returned CachedPlan, one element is added
+ * to *part_prune_results_list: either a List of PartitionPruneResults or
+ * NIL. It is a List if the PlannedStmt comes from an existing CachedPlan
+ * that is otherwise valid and the PlannedStmt has containsInitialPruning
+ * set to true. Before such a CachedPlan is returned, its "initial" pruning
+ * steps are performed by calling ExecutorDoInitialPruning(), which prunes
+ * away the subplans that don't match the "initial" pruning conditions, so
+ * that AcquireExecutorLocks() locks only the leaf partitions that survive.
+ * For each PartitionPruneInfo found in PlannedStmt.partPruneInfos, a
+ * PartitionPruneResult containing the bitmapset of the indexes of
+ * surviving subplans is added to the List for the
+ * PlannedStmt.
+ *
* On return, the plan is valid and we have sufficient locks to begin
* execution.
*
@@ -1139,11 +1209,13 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
*/
CachedPlan *
GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
- ResourceOwner owner, QueryEnvironment *queryEnv)
+ ResourceOwner owner, QueryEnvironment *queryEnv,
+ List **part_prune_results_list)
{
CachedPlan *plan = NULL;
List *qlist;
bool customplan;
+ List *my_part_prune_results_list;
/* Assert caller is doing things in a sane order */
Assert(plansource->magic == CACHEDPLANSOURCE_MAGIC);
@@ -1160,7 +1232,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
if (!customplan)
{
- if (CheckCachedPlan(plansource))
+ if (CheckCachedPlan(plansource, boundParams,
+ &my_part_prune_results_list))
{
/* We want a generic plan, and we already have a valid one */
plan = plansource->gplan;
@@ -1169,7 +1242,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
else
{
/* Build a new generic plan */
- plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv);
+ plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv,
+ &my_part_prune_results_list);
/* Just make real sure plansource->gplan is clear */
ReleaseGenericPlan(plansource);
/* Link the new generic plan into the plansource */
@@ -1214,7 +1288,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
if (customplan)
{
/* Build a custom plan */
- plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv);
+ plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv,
+ &my_part_prune_results_list);
/* Accumulate total costs of custom plans */
plansource->total_custom_cost += cached_plan_cost(plan, true);
@@ -1246,6 +1321,9 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
plan->is_saved = true;
}
+ if (part_prune_results_list)
+ *part_prune_results_list = my_part_prune_results_list;
+
return plan;
}
@@ -1737,17 +1815,29 @@ QueryListGetPrimaryStmt(List *stmts)
/*
* AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
- * or release them if acquire is false.
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_results_list.
+ *
+ * On return, *lockedRelids_per_stmt will contain a bitmapset for every
+ * PlannedStmt in stmt_list, containing the RT indexes of relation entries
+ * in its range table that were actually locked, or NULL if the PlannedStmt
+ * contains a utility statement.
*/
static void
-AcquireExecutorLocks(List *stmt_list, bool acquire)
+AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+ List **part_prune_results_list,
+ List **lockedRelids_per_stmt)
{
ListCell *lc1;
+ *part_prune_results_list = *lockedRelids_per_stmt = NIL;
foreach(lc1, stmt_list)
{
PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
- ListCell *lc2;
+ List *part_prune_results = NIL;
+ Bitmapset *allLockRelids;
+ Bitmapset *lockedRelids = NULL;
+ int rti;
if (plannedstmt->commandType == CMD_UTILITY)
{
@@ -1761,13 +1851,40 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
Query *query = UtilityContainsQuery(plannedstmt->utilityStmt);
if (query)
- ScanQueryForLocks(query, acquire);
+ ScanQueryForLocks(query, true);
+ *part_prune_results_list = lappend(*part_prune_results_list, NIL);
continue;
}
- foreach(lc2, plannedstmt->rtable)
+ /*
+ * Figure out the set of relations that would need to be locked
+ * before executing the plan.
+ */
+ if (plannedstmt->containsInitialPruning)
{
- RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
+ Bitmapset *scan_leafpart_rtis = NULL;
+
+ /*
+ * Obtain the set of leaf partitions to be locked.
+ *
+ * The following does initial partition pruning using the
+ * PartitionPruneInfos found in plannedstmt->partPruneInfos and
+ * finds leaf partitions that survive that pruning across all the
+ * nodes in the plan tree.
+ */
+ part_prune_results = ExecutorDoInitialPruning(plannedstmt,
+ boundParams,
+ &scan_leafpart_rtis);
+ allLockRelids = bms_union(plannedstmt->minLockRelids,
+ scan_leafpart_rtis);
+ }
+ else
+ allLockRelids = plannedstmt->minLockRelids;
+
+ rti = -1;
+ while ((rti = bms_next_member(allLockRelids, rti)) > 0)
+ {
+ RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
if (rte->rtekind != RTE_RELATION)
continue;
@@ -1778,10 +1895,59 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
* fail if it's been dropped entirely --- we'll just transiently
* acquire a non-conflicting lock.
*/
- if (acquire)
- LockRelationOid(rte->relid, rte->rellockmode);
- else
- UnlockRelationOid(rte->relid, rte->rellockmode);
+ LockRelationOid(rte->relid, rte->rellockmode);
+ lockedRelids = bms_add_member(lockedRelids, rti);
+ }
+
+ *part_prune_results_list = lappend(*part_prune_results_list,
+ part_prune_results);
+ *lockedRelids_per_stmt = lappend(*lockedRelids_per_stmt, lockedRelids);
+ }
+}
+
+/*
+ * ReleaseExecutorLocks
+ * Release locks that were acquired by an earlier call to
+ * AcquireExecutorLocks()
+ */
+static void
+ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt)
+{
+ ListCell *lc1,
+ *lc2;
+
+ forboth(lc1, stmt_list, lc2, lockedRelids_per_stmt)
+ {
+ PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
+ Bitmapset *lockedRelids = lfirst(lc2);
+ int rti;
+
+ if (plannedstmt->commandType == CMD_UTILITY)
+ {
+ /*
+ * Ignore utility statements, except those (such as EXPLAIN) that
+ * contain a parsed-but-not-planned query. Note: it's okay to use
+ * ScanQueryForLocks, even though the query hasn't been through
+ * rule rewriting, because rewriting doesn't change the query
+ * representation.
+ */
+ Query *query = UtilityContainsQuery(plannedstmt->utilityStmt);
+
+ Assert(lockedRelids == NULL);
+ if (query)
+ ScanQueryForLocks(query, false);
+ continue;
+ }
+
+ rti = -1;
+ while ((rti = bms_next_member(lockedRelids, rti)) >= 0)
+ {
+ RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
+
+ Assert(rte->rtekind == RTE_RELATION);
+
+ /* See the comment in AcquireExecutorLocks(). */
+ UnlockRelationOid(rte->relid, rte->rellockmode);
}
}
}
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index 7b1ae6fdcf..5b9098971b 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -303,6 +303,25 @@ PortalDefineQuery(Portal portal,
portal->status = PORTAL_DEFINED;
}
+/*
+ * PortalStorePartitionPruneResults
+ * Copy the given List of Lists of PartitionPruneResults into the
+ * portal's context
+ *
+ * This allows the caller to ensure that the list exists as long as the portal
+ * does.
+ */
+void
+PortalStorePartitionPruneResults(Portal portal, List *part_prune_results_list)
+{
+ MemoryContext oldcxt;
+
+ Assert(PortalIsValid(portal));
+ oldcxt = MemoryContextSwitchTo(portal->portalContext);
+ portal->part_prune_results_list = copyObject(part_prune_results_list);
+ MemoryContextSwitchTo(oldcxt);
+}
+
/*
* PortalReleaseCachedPlan
* Release a portal's reference to its cached plan, if any.
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 9ebde089ae..269cc4d562 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -87,7 +87,9 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
ExplainState *es, const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv);
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt,
+ List *part_prune_results,
+ IntoClause *into,
ExplainState *es, const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv,
const instr_time *planduration,
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 17fabc18c9..4b98d0d2ef 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -45,6 +45,7 @@ extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
* nparts Length of subplan_map[] and subpart_map[].
* subplan_map Subplan index by partition index, or -1.
* subpart_map Subpart index by partition index, or -1.
+ * rti_map Range table index by partition index, or 0.
* present_parts A Bitmapset of the partition indexes that we
* have subplans or subparts for.
* initial_pruning_steps List of PartitionPruneSteps used to
@@ -61,6 +62,7 @@ typedef struct PartitionedRelPruningData
int nparts;
int *subplan_map;
int *subpart_map;
+ Index *rti_map;
Bitmapset *present_parts;
List *initial_pruning_steps;
List *exec_pruning_steps;
@@ -127,5 +129,10 @@ extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
Bitmapset *root_parent_relids,
Bitmapset **initially_valid_subplans);
extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
- bool initial_prune);
+ bool initial_prune,
+ Bitmapset **scan_leafpart_rtis);
+extern Bitmapset *ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt,
+ ParamListInfo params,
+ PartitionPruneInfo *pruneinfo,
+ Bitmapset **scan_leafpart_rtis);
#endif /* EXECPARTITION_H */
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index e79e2c001f..7d4379da7b 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -35,6 +35,8 @@ typedef struct QueryDesc
/* These fields are provided by CreateQueryDesc */
CmdType operation; /* CMD_SELECT, CMD_UPDATE, etc. */
PlannedStmt *plannedstmt; /* planner's output (could be utility, too) */
+ List *part_prune_results; /* ExecutorDoInitialPruning()'s
+ * output for plannedstmt */
const char *sourceText; /* source text of the query */
Snapshot snapshot; /* snapshot to use for query */
Snapshot crosscheck_snapshot; /* crosscheck for RI update/delete */
@@ -57,6 +59,7 @@ typedef struct QueryDesc
/* in pquery.c */
extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
+ List *part_prune_results,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index ed95ed1176..c9a5e5fb68 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -185,6 +185,9 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
/*
* prototypes from functions in execMain.c
*/
+extern List *ExecutorDoInitialPruning(PlannedStmt *plannedstmt,
+ ParamListInfo params,
+ Bitmapset **scan_leafpart_rtis);
extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
extern void ExecutorRun(QueryDesc *queryDesc,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index a2008846c6..369de42caf 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -615,6 +615,7 @@ typedef struct EState
* ExecRowMarks, or NULL if none */
PlannedStmt *es_plannedstmt; /* link to top of plan tree */
List *es_part_prune_infos; /* PlannedStmt.partPruneInfos */
+ List *es_part_prune_results; /* QueryDesc.part_prune_results */
const char *es_sourceText; /* Source text from QueryDesc */
JunkFilter *es_junkFilter; /* top-level junk filter, if any */
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index a80f43e540..937cc4629d 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -212,6 +212,7 @@ extern struct Bitmapset *readBitmapset(void);
extern uintptr_t readDatum(bool typbyval);
extern bool *readBoolCols(int numCols);
extern int *readIntCols(int numCols);
+extern Index *readIndexCols(int numCols);
extern Oid *readOidCols(int numCols);
extern int16 *readAttrNumberCols(int numCols);
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index dd4eb8679d..36abe4cf9e 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -125,6 +125,18 @@ typedef struct PlannerGlobal
/* List of PartitionPruneInfo contained in the plan */
List *partPruneInfos;
+ /*
+ * Do any of those PartitionPruneInfos have initial pruning steps in them?
+ */
+ bool containsInitialPruning;
+
+ /*
+ * Indexes of all range table entries minus indexes of range table entries
+ * of the leaf partitions scanned by prunable subplans; see
+ * AcquireExecutorLocks()
+ */
+ Bitmapset *minLockRelids;
+
/* OIDs of relations the plan depends on */
List *relationOids;
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 2e202892a7..0cab6958d7 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -73,8 +73,17 @@ typedef struct PlannedStmt
List *partPruneInfos; /* List of PartitionPruneInfo contained in
* the plan */
+ bool containsInitialPruning; /* Do any of those PartitionPruneInfos
+ * have initial pruning steps in them?
+ */
+
List *rtable; /* list of RangeTblEntry nodes */
+ Bitmapset *minLockRelids; /* Indexes of all range table entries minus
+ * indexes of range table entries of the leaf
+ * partitions scanned by prunable subplans;
+ * see AcquireExecutorLocks() */
+
/* rtable indexes of target relations for INSERT/UPDATE/DELETE/MERGE */
List *resultRelations; /* integer list of RT indexes, or NIL */
@@ -1414,6 +1423,13 @@ typedef struct PlanRowMark
* prune_infos List of Lists containing PartitionedRelPruneInfo nodes,
* one sublist per run-time-prunable partition hierarchy
* appearing in the parent plan node's subplans.
+ *
+ * needs_init_pruning Does any of the PartitionedRelPruneInfos in
+ * prune_infos have its initial_pruning_steps set?
+ *
+ * needs_exec_pruning Does any of the PartitionedRelPruneInfos in
+ * prune_infos have its exec_pruning_steps set?
+ *
* other_subplans Indexes of any subplans that are not accounted for
* by any of the PartitionedRelPruneInfo nodes in
* "prune_infos". These subplans must not be pruned.
@@ -1425,6 +1441,8 @@ typedef struct PartitionPruneInfo
NodeTag type;
Bitmapset *root_parent_relids;
List *prune_infos;
+ bool needs_init_pruning;
+ bool needs_exec_pruning;
Bitmapset *other_subplans;
} PartitionPruneInfo;
@@ -1469,6 +1487,9 @@ typedef struct PartitionedRelPruneInfo
/* relation OID by partition index, or 0 */
Oid *relid_map pg_node_attr(array_size(nparts));
+ /* Range table index by partition index, or 0. */
+ Index *rti_map pg_node_attr(array_size(nparts));
+
/*
* initial_pruning_steps shows how to prune during executor startup (i.e.,
* without use of any PARAM_EXEC Params); it is NIL if no startup pruning
@@ -1553,6 +1574,31 @@ typedef struct PartitionPruneStepCombine
List *source_stepids;
} PartitionPruneStepCombine;
+/*----------------
+ * PartitionPruneResult
+ *
+ * The result of performing ExecPartitionDoInitialPruning() on a given
+ * PartitionPruneInfo.
+ *
+ * valid_subplan_offs contains the indexes of subplans remaining after
+ * performing initial pruning by calling ExecFindMatchingSubPlans() on the
+ * PartitionPruneInfo.
+ *
+ * This is used to store the result of initial partition pruning that is
+ * performed before the execution has started. A module that needs to do so
+ * should call ExecutorDoInitialPruning() on a given PlannedStmt, which
+ * returns a List of PartitionPruneResult containing an entry for each
+ * PartitionPruneInfo present in PlannedStmt.partPruneInfos. The module
+ * should then pass that list, along with the PlannedStmt, to the executor,
+ * so that it can reuse the result of initial partition pruning when
+ * initializing the subplans for execution.
+ */
+typedef struct PartitionPruneResult
+{
+ NodeTag type;
+
+ Bitmapset *valid_subplan_offs;
+} PartitionPruneResult;
/*
* Plan invalidation info
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 0499635f59..32579d4788 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -220,7 +220,8 @@ extern List *CachedPlanGetTargetList(CachedPlanSource *plansource,
extern CachedPlan *GetCachedPlan(CachedPlanSource *plansource,
ParamListInfo boundParams,
ResourceOwner owner,
- QueryEnvironment *queryEnv);
+ QueryEnvironment *queryEnv,
+ List **part_prune_results_list);
extern void ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner);
extern bool CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index aeddbdafe5..1901fc5f28 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -138,6 +138,7 @@ typedef struct PortalData
QueryCompletion qc; /* command completion data for executed query */
List *stmts; /* list of PlannedStmts */
CachedPlan *cplan; /* CachedPlan, if stmts are from one */
+ List *part_prune_results_list; /* List of Lists of PartitionPruneResults */
ParamListInfo portalParams; /* params to pass to query */
QueryEnvironment *queryEnv; /* environment for query */
@@ -242,6 +243,8 @@ extern void PortalDefineQuery(Portal portal,
CommandTag commandTag,
List *stmts,
CachedPlan *cplan);
+extern void PortalStorePartitionPruneResults(Portal portal,
+ List *part_prune_results_list);
extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
extern void PortalCreateHoldStore(Portal portal);
extern void PortalHashTableDeleteAll(void);
--
2.35.3
* Re: generic plans and "initial" pruning
@ 2022-12-05 03:00 Amit Langote <[email protected]>
parent: Amit Langote <[email protected]>
From: Amit Langote @ 2022-12-05 03:00 UTC (permalink / raw)
To: Alvaro Herrera <[email protected]>; +Cc: Robert Haas <[email protected]>; Jacob Champion <[email protected]>; David Rowley <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers
On Fri, Dec 2, 2022 at 7:40 PM Amit Langote <[email protected]> wrote:
> On Thu, Dec 1, 2022 at 9:43 PM Amit Langote <[email protected]> wrote:
> > On Thu, Dec 1, 2022 at 8:21 PM Alvaro Herrera <[email protected]> wrote:
> > > On 2022-Dec-01, Amit Langote wrote:
> > > > Hmm, how about keeping the [Merge]Append's parent relation's RT index
> > > > in the PartitionPruneInfo and passing it down to
> > > > ExecInitPartitionPruning() from ExecInit[Merge]Append() for
> > > > cross-checking? Both Append and MergeAppend already have a
> > > > 'apprelids' field that we can save a copy of in the
> > > > PartitionPruneInfo. Tried that in the attached delta patch.
> > >
> > > Ah yeah, that sounds about what I was thinking. I've merged that in and
> > > pushed to github, which had a strange pg_upgrade failure on Windows
> > > mentioning log files that were not captured by the CI tooling. So I
> > > pushed another one trying to grab those files, in case it wasn't an
> > > one-off failure. It's running now:
> > > https://cirrus-ci.com/task/5857239638999040
> > >
> > > If all goes well with this run, I'll get this 0001 pushed.
> >
> > Thanks for pushing 0001.
> >
> > Rebased 0002 attached.
>
> Thought it might be good for PartitionPruneResult to also have
> root_parent_relids that matches with the corresponding
> PartitionPruneInfo. ExecInitPartitionPruning() does a sanity check
> that the root_parent_relids of a given pair of PartitionPrune{Info |
> Result} match.
>
> Posting the patch separately as the attached 0002, just in case you
> might think that the extra cross-checking would be an overkill.
Rebased over 92c4dafe1eed and fixed some factual mistakes in the
comment above ExecutorDoInitialPruning().
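For anyone skimming the thread, the locking rule that AcquireExecutorLocks()
follows in 0001 can be sketched in isolation as below. This is only a toy
model, not code from the patch: ToyRelids is a hypothetical 64-bit stand-in
for Bitmapset, and the toy_* helpers are made-up names that mimic bms_union()
and the bms_next_member() loop. The set to lock is minLockRelids, plus the RT
indexes of the leaf partitions that survive initial pruning when the plan has
containsInitialPruning set.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a Bitmapset of range table indexes:
 * bit i set means RT index i is a member.  RT indexes start at 1.
 */
typedef uint64_t ToyRelids;

/* Add one RT index to the set (toy analogue of bms_add_member). */
static ToyRelids
toy_add_member(ToyRelids set, int rti)
{
	assert(rti > 0 && rti < 64);
	return set | (UINT64_C(1) << rti);
}

/*
 * Return the smallest member greater than prev, or -2 if there is none
 * (toy analogue of bms_next_member).
 */
static int
toy_next_member(ToyRelids set, int prev)
{
	for (int i = prev + 1; i < 64; i++)
	{
		if (set & (UINT64_C(1) << i))
			return i;
	}
	return -2;
}

/*
 * Relations to lock for one statement: the always-locked set, plus the
 * leaf partitions that survived initial pruning when the plan has any
 * initial pruning steps.
 */
static ToyRelids
toy_relids_to_lock(ToyRelids min_lock_relids,
				   int contains_initial_pruning,
				   ToyRelids surviving_leafpart_rtis)
{
	if (contains_initial_pruning)
		return min_lock_relids | surviving_leafpart_rtis;
	return min_lock_relids;
}

/* Walk the set the way the locking loop does and count the members. */
static int
toy_count_locks(ToyRelids relids)
{
	int			nlocks = 0;
	int			rti = -1;

	while ((rti = toy_next_member(relids, rti)) > 0)
		nlocks++;
	return nlocks;
}
```

With a partitioned parent at RT index 1, an unrelated table at 2, and leaf
partitions at 3..5 of which only 4 survives pruning, the sketch locks three
relations instead of five. Without initial pruning steps the planner would
presumably have put the leaf partitions into minLockRelids itself, so the
second branch still locks everything that is scanned.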
--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com
Attachments:
[application/octet-stream] v27-0001-Optimize-AcquireExecutorLocks-by-locking-only-un.patch (82.9K, 2-v27-0001-Optimize-AcquireExecutorLocks-by-locking-only-un.patch)
From 6c4cf0b0a03bfac62e87f76bb3be9c1e62125a0c Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Wed, 22 Dec 2021 16:55:17 +0900
Subject: [PATCH v27 1/2] Optimize AcquireExecutorLocks() by locking only
unpruned partitions
This commit teaches AcquireExecutorLocks() to perform initial
partition pruning to notionally eliminate the subnodes contained in a
generic cached plan that need not be initialized during the actual
execution of the plan, and to skip locking the partitions scanned by
those subnodes.
The result of performing initial partition pruning this way, before the
actual execution has started, is passed to the executor as a List of
PartitionPruneResults, supplied along with the PlannedStmt by the
callers of the executor that used plancache.c to get the plan. It is NIL
in the cases in which the plan is obtained by calling the planner
directly or if the plan obtained by plancache.c is not a generic one.
---
src/backend/commands/copyto.c | 2 +-
src/backend/commands/createas.c | 2 +-
src/backend/commands/explain.c | 7 +-
src/backend/commands/extension.c | 2 +-
src/backend/commands/matview.c | 2 +-
src/backend/commands/prepare.c | 26 ++-
src/backend/executor/README | 36 ++++
src/backend/executor/execMain.c | 53 ++++++
src/backend/executor/execParallel.c | 26 ++-
src/backend/executor/execPartition.c | 238 +++++++++++++++++++++----
src/backend/executor/execUtils.c | 1 +
src/backend/executor/functions.c | 2 +-
src/backend/executor/nodeAppend.c | 11 +-
src/backend/executor/nodeMergeAppend.c | 5 +-
src/backend/executor/spi.c | 27 ++-
src/backend/nodes/readfuncs.c | 8 +-
src/backend/optimizer/plan/planner.c | 2 +
src/backend/optimizer/plan/setrefs.c | 46 +++++
src/backend/partitioning/partprune.c | 41 ++++-
src/backend/tcop/postgres.c | 8 +-
src/backend/tcop/pquery.c | 28 ++-
src/backend/utils/cache/plancache.c | 208 ++++++++++++++++++---
src/backend/utils/mmgr/portalmem.c | 19 ++
src/include/commands/explain.h | 4 +-
src/include/executor/execPartition.h | 9 +-
src/include/executor/execdesc.h | 3 +
src/include/executor/executor.h | 3 +
src/include/nodes/execnodes.h | 1 +
src/include/nodes/nodes.h | 1 +
src/include/nodes/pathnodes.h | 12 ++
src/include/nodes/plannodes.h | 46 +++++
src/include/utils/plancache.h | 3 +-
src/include/utils/portal.h | 3 +
33 files changed, 787 insertions(+), 98 deletions(-)
diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index f26cc0d162..401a2280a3 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -558,7 +558,7 @@ BeginCopyTo(ParseState *pstate,
((DR_copy *) dest)->cstate = cstate;
/* Create a QueryDesc requesting no output */
- cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ cstate->queryDesc = CreateQueryDesc(plan, NIL, pstate->p_sourcetext,
GetActiveSnapshot(),
InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 152c29b551..942449544c 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -325,7 +325,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ queryDesc = CreateQueryDesc(plan, NIL, pstate->p_sourcetext,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index f86983c660..2f2b558608 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -407,7 +407,7 @@ ExplainOneQuery(Query *query, int cursorOptions,
}
/* run it (if needed) and produce output */
- ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+ ExplainOnePlan(plan, NIL, into, es, queryString, params, queryEnv,
&planduration, (es->buffers ? &bufusage : NULL));
}
}
@@ -515,7 +515,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
* to call it.
*/
void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, List *part_prune_results,
+ IntoClause *into, ExplainState *es,
const char *queryString, ParamListInfo params,
QueryEnvironment *queryEnv, const instr_time *planduration,
const BufferUsage *bufusage)
@@ -563,7 +564,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
dest = None_Receiver;
/* Create a QueryDesc for the query */
- queryDesc = CreateQueryDesc(plannedstmt, queryString,
+ queryDesc = CreateQueryDesc(plannedstmt, part_prune_results, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, instrument_option);
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index cf1b1ca571..904cbcba4a 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -779,7 +779,7 @@ execute_sql_string(const char *sql)
{
QueryDesc *qdesc;
- qdesc = CreateQueryDesc(stmt,
+ qdesc = CreateQueryDesc(stmt, NIL,
sql,
GetActiveSnapshot(), NULL,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 9ac0383459..65c8d0aa59 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -408,7 +408,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, queryString,
+ queryDesc = CreateQueryDesc(plan, NIL, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 9e29584d93..29b45539d3 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -155,6 +155,7 @@ ExecuteQuery(ParseState *pstate,
PreparedStatement *entry;
CachedPlan *cplan;
List *plan_list;
+ List *part_prune_results_list;
ParamListInfo paramLI = NULL;
EState *estate = NULL;
Portal portal;
@@ -193,7 +194,10 @@ ExecuteQuery(ParseState *pstate,
entry->plansource->query_string);
/* Replan if needed, and increment plan refcount for portal */
- cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL);
+ cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL,
+ &part_prune_results_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_results_list));
plan_list = cplan->stmt_list;
/*
@@ -207,6 +211,9 @@ ExecuteQuery(ParseState *pstate,
plan_list,
cplan);
+ /* Copy Lists of PartitionPruneResults into the portal's context. */
+ PortalStorePartitionPruneResults(portal, part_prune_results_list);
+
/*
* For CREATE TABLE ... AS EXECUTE, we must verify that the prepared
* statement is one that produces tuples. Currently we insist that it be
@@ -576,7 +583,9 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
const char *query_string;
CachedPlan *cplan;
List *plan_list;
- ListCell *p;
+ List *part_prune_results_list;
+ ListCell *p,
+ *pp;
ParamListInfo paramLI = NULL;
EState *estate = NULL;
instr_time planstart;
@@ -619,7 +628,10 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
/* Replan if needed, and acquire a transient refcount */
cplan = GetCachedPlan(entry->plansource, paramLI,
- CurrentResourceOwner, queryEnv);
+ CurrentResourceOwner, queryEnv,
+ &part_prune_results_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_results_list));
INSTR_TIME_SET_CURRENT(planduration);
INSTR_TIME_SUBTRACT(planduration, planstart);
@@ -634,13 +646,15 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
plan_list = cplan->stmt_list;
/* Explain each query */
- foreach(p, plan_list)
+ forboth(p, plan_list, pp, part_prune_results_list)
{
PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
+ List *part_prune_results = lfirst_node(List, pp);
if (pstmt->commandType != CMD_UTILITY)
- ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv,
- &planduration, (es->buffers ? &bufusage : NULL));
+ ExplainOnePlan(pstmt, part_prune_results, into, es, query_string,
+ paramLI, queryEnv, &planduration,
+ (es->buffers ? &bufusage : NULL));
else
ExplainOneUtility(pstmt->utilityStmt, into, es, query_string,
paramLI, queryEnv);
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 17775a49e2..7f8cf1494f 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -65,6 +65,38 @@ found there. This currently only occurs for Append and MergeAppend nodes. In
this case the non-required subplans are ignored and the executor state's
subnode array will become out of sequence to the plan's subplan list.
+The so-called execution time pruning may also occur before execution has
+actually started. One case where that occurs is when a cached generic
+plan is being validated for execution by plancache.c:GetCachedPlan(), which
+works by locking all the relations that will be scanned by that plan. If the
+generic plan contains nodes that can perform execution time partition pruning
+(that is, contain a PartitionPruneInfo), a subset of pruning steps contained
+in a given node's PartitionPruneInfo that do not depend on the execution
+actually having started (called "initial" pruning steps) are performed as part
+of the plan validation step, by calling ExecutorDoInitialPruning(). That
+returns the minimal set of child subplans that satisfy those initial pruning
+steps contained in each PartitionPruneInfo. AcquireExecutorLocks() will then
+lock only the relations scanned by those subplans, in addition to those present
+in PlannedStmt.minLockRelids. Note that the subplans are not actually pruned
+in the sense of being removed from the plan tree, so care is needed by
+downstream users of a plan that has undergone pre-execution initial pruning.
+
+To prevent the executor and any third party execution code that can look at
+the plan tree from trying to execute the subplans that were pruned as
+described above, the result of that pruning is passed to the executor as a
+List of PartitionPruneResult nodes via the QueryDesc, which is subsequently
+assigned to EState.es_part_prune_results. Each PartitionPruneResult therein
+consists of the set of indexes of surviving subplans in the respective parent
+plan node's (the one to which the corresponding PartitionPruneInfo belongs)
+list of child subplans, saved as a bitmapset valid_subplan_offs. The executor
+or any third party execution code working on a generic plan should not
+re-evaluate the set of initially valid subplans for a given plan node by
+redoing the initial pruning if a PartitionPruneResult belonging to that plan
+node is present in es_part_prune_results. Note that that is not simply a
+performance optimization, because such re-evaluation of the pruning steps may
+very well end up resulting in a different set of initially valid subplans,
+containing some whose relations were not locked by AcquireExecutorLocks().
+
Each Plan node may have expression trees associated with it, to represent
its target list, qualification conditions, etc. These trees are also
read-only to the executor, but the executor state for expression evaluation
@@ -286,6 +318,10 @@ Query Processing Control Flow
This is a sketch of control flow for full query processing:
+ [ ExecutorDoInitialPruning ] --- an optional step to perform initial
+ partition pruning on the plan tree, the result of which is passed
+ to the executor via QueryDesc
+
CreateQueryDesc
ExecutorStart
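
To make the README contract above concrete, here is a minimal standalone sketch (not code from the patch). It assumes a 32-bit mask as a stand-in for Bitmapset, and the names SketchPruneResult, sketch_relids_to_lock, and sketch_initially_valid_subplans are hypothetical. It shows the two halves of the contract: the lock set is minLockRelids plus the relations of the surviving subplans, and the executor side reuses the saved result rather than recomputing it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a Bitmapset: bit i set means member i is present. */
typedef uint32_t bitset32;

/* Hypothetical, simplified analogue of PartitionPruneResult. */
typedef struct SketchPruneResult
{
	bitset32	valid_subplan_offs;	/* offsets of surviving child subplans */
} SketchPruneResult;

/*
 * Lock set = relations that are always locked, plus the relation scanned by
 * each surviving subplan.  subplan_rti[i] is the RT index scanned by
 * subplan i (assumed to be below 32 in this sketch).
 */
bitset32
sketch_relids_to_lock(bitset32 min_lock_relids,
					  const SketchPruneResult *result,
					  const int *subplan_rti, int nsubplans)
{
	bitset32	relids = min_lock_relids;

	for (int i = 0; i < nsubplans; i++)
	{
		if (result->valid_subplan_offs & ((bitset32) 1 << i))
			relids |= (bitset32) 1 << subplan_rti[i];
	}
	return relids;
}

/*
 * Executor side: if a saved result exists, reuse it verbatim so that no
 * subplan with an unlocked relation can become valid.  With no saved result,
 * this sketch simply treats all subplans as valid (nsubplans must be < 32).
 */
bitset32
sketch_initially_valid_subplans(const SketchPruneResult *result, int nsubplans)
{
	if (result != NULL)
		return result->valid_subplan_offs;
	return ((bitset32) 1 << nsubplans) - 1;
}
```

The point of returning the saved set unchanged is the one the README makes: re-running the pruning steps could yield a different set, including subplans whose relations were never locked.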
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 12ff4f3de5..4d8c8e2e43 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -49,6 +49,7 @@
#include "commands/matview.h"
#include "commands/trigger.h"
#include "executor/execdebug.h"
+#include "executor/execPartition.h"
#include "executor/nodeSubplan.h"
#include "foreign/fdwapi.h"
#include "jit/jit.h"
@@ -104,6 +105,56 @@ static void EvalPlanQualStart(EPQState *epqstate, Plan *planTree);
/* end of local decls */
+/* ----------------------------------------------------------------
+ * ExecutorDoInitialPruning
+ *
+ * For each plan tree node that has been assigned a PartitionPruneInfo,
+ * this performs initial partition pruning using the information contained
+ * therein to determine the set of child subplans that satisfy the initial
+ * pruning steps, to be returned as a bitmapset of their indexes in the
+ * node's list of child subplans (for example, an Append's appendplans).
+ *
+ * Return value is a List of PartitionPruneResult nodes, one for each
+ * PartitionPruneInfo found in plannedstmt->partPruneInfos, each
+ * containing a bitmapset of the indexes of unpruned child subplans.
+ * A bitmapset of the RT indexes of the leaf partitions scanned by those
+ * subplans is returned in *scan_leafpart_rtis, which is shared across all
+ * of those PartitionPruneResults.
+ *
+ * The executor must see exactly the same set of subplans as valid for
+ * execution when doing ExecInitNode() on the plan nodes whose
+ * PartitionPruneInfos are processed here. So, it must get the set from the
+ * aforementioned PartitionPruneResult, instead of computing it all over
+ * again by redoing the initial pruning. It's the caller's job to pass the
+ * PartitionPruneResult to the executor.
+ *
+ * Note: Partitioned tables mentioned in PartitionedRelPruneInfo nodes that
+ * drive the pruning will be locked before doing the pruning.
+ * ----------------------------------------------------------------
+ */
+List *
+ExecutorDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
+ Bitmapset **scan_leafpart_rtis)
+{
+ List *part_prune_results = NIL;
+ ListCell *lc;
+
+ /* Only get here if there is any pruning to do. */
+ Assert(plannedstmt->containsInitialPruning);
+
+ foreach(lc, plannedstmt->partPruneInfos)
+ {
+ PartitionPruneInfo *pruneinfo = lfirst(lc);
+ PartitionPruneResult *pruneresult = makeNode(PartitionPruneResult);
+
+ pruneresult->valid_subplan_offs =
+ ExecPartitionDoInitialPruning(plannedstmt, params, pruneinfo,
+ scan_leafpart_rtis);
+ part_prune_results = lappend(part_prune_results, pruneresult);
+ }
+
+ return part_prune_results;
+}
/* ----------------------------------------------------------------
* ExecutorStart
@@ -806,6 +857,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
{
CmdType operation = queryDesc->operation;
PlannedStmt *plannedstmt = queryDesc->plannedstmt;
+ List *part_prune_results = queryDesc->part_prune_results;
Plan *plan = plannedstmt->planTree;
List *rangeTable = plannedstmt->rtable;
EState *estate = queryDesc->estate;
@@ -826,6 +878,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
estate->es_plannedstmt = plannedstmt;
estate->es_part_prune_infos = plannedstmt->partPruneInfos;
+ estate->es_part_prune_results = part_prune_results;
/*
* Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index aca0c6f323..917079a034 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -66,6 +66,7 @@
#define PARALLEL_KEY_QUERY_TEXT UINT64CONST(0xE000000000000008)
#define PARALLEL_KEY_JIT_INSTRUMENTATION UINT64CONST(0xE000000000000009)
#define PARALLEL_KEY_WAL_USAGE UINT64CONST(0xE00000000000000A)
+#define PARALLEL_KEY_PARTITION_PRUNE_RESULTS UINT64CONST(0xE00000000000000B)
#define PARALLEL_TUPLE_QUEUE_SIZE 65536
@@ -182,6 +183,7 @@ ExecSerializePlan(Plan *plan, EState *estate)
pstmt->transientPlan = false;
pstmt->dependsOnRole = false;
pstmt->parallelModeNeeded = false;
+ pstmt->containsInitialPruning = false;
pstmt->planTree = plan;
pstmt->partPruneInfos = estate->es_part_prune_infos;
pstmt->rtable = estate->es_range_table;
@@ -597,12 +599,15 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
FixedParallelExecutorState *fpes;
char *pstmt_data;
char *pstmt_space;
+ char *part_prune_results_data;
+ char *part_prune_results_space;
char *paramlistinfo_space;
BufferUsage *bufusage_space;
WalUsage *walusage_space;
SharedExecutorInstrumentation *instrumentation = NULL;
SharedJitInstrumentation *jit_instrumentation = NULL;
int pstmt_len;
+ int part_prune_results_len;
int paramlistinfo_len;
int instrumentation_len = 0;
int jit_instrumentation_len = 0;
@@ -631,6 +636,7 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
/* Fix up and serialize plan to be sent to workers. */
pstmt_data = ExecSerializePlan(planstate->plan, estate);
+ part_prune_results_data = nodeToString(estate->es_part_prune_results);
/* Create a parallel context. */
pcxt = CreateParallelContext("postgres", "ParallelQueryMain", nworkers);
@@ -657,6 +663,11 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
shm_toc_estimate_chunk(&pcxt->estimator, pstmt_len);
shm_toc_estimate_keys(&pcxt->estimator, 1);
+ /* Estimate space for serialized List of PartitionPruneResult. */
+ part_prune_results_len = strlen(part_prune_results_data) + 1;
+ shm_toc_estimate_chunk(&pcxt->estimator, part_prune_results_len);
+ shm_toc_estimate_keys(&pcxt->estimator, 1);
+
/* Estimate space for serialized ParamListInfo. */
paramlistinfo_len = EstimateParamListSpace(estate->es_param_list_info);
shm_toc_estimate_chunk(&pcxt->estimator, paramlistinfo_len);
@@ -751,6 +762,12 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
memcpy(pstmt_space, pstmt_data, pstmt_len);
shm_toc_insert(pcxt->toc, PARALLEL_KEY_PLANNEDSTMT, pstmt_space);
+ /* Store serialized List of PartitionPruneResult */
+ part_prune_results_space = shm_toc_allocate(pcxt->toc, part_prune_results_len);
+ memcpy(part_prune_results_space, part_prune_results_data, part_prune_results_len);
+ shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARTITION_PRUNE_RESULTS,
+ part_prune_results_space);
+
/* Store serialized ParamListInfo. */
paramlistinfo_space = shm_toc_allocate(pcxt->toc, paramlistinfo_len);
shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARAMLISTINFO, paramlistinfo_space);
@@ -1232,8 +1249,10 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
int instrument_options)
{
char *pstmtspace;
+ char *part_prune_results_space;
char *paramspace;
PlannedStmt *pstmt;
+ List *part_prune_results;
ParamListInfo paramLI;
char *queryString;
@@ -1244,12 +1263,17 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
pstmtspace = shm_toc_lookup(toc, PARALLEL_KEY_PLANNEDSTMT, false);
pstmt = (PlannedStmt *) stringToNode(pstmtspace);
+ /* Reconstruct leader-supplied PartitionPruneResult. */
+ part_prune_results_space =
+ shm_toc_lookup(toc, PARALLEL_KEY_PARTITION_PRUNE_RESULTS, false);
+ part_prune_results = (List *) stringToNode(part_prune_results_space);
+
/* Reconstruct ParamListInfo. */
paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
paramLI = RestoreParamList(&paramspace);
/* Create a QueryDesc for the query. */
- return CreateQueryDesc(pstmt,
+ return CreateQueryDesc(pstmt, part_prune_results,
queryString,
GetActiveSnapshot(), InvalidSnapshot,
receiver, paramLI, NULL, instrument_options);
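
As an aside on the execParallel.c hunks above: the estimate/allocate/copy sequence relies on the length including the terminating NUL, so the worker can hand the chunk straight to a string parser. Below is a minimal standalone sketch of that pattern only. It uses malloc in place of shm_toc_allocate, and sketch_publish_string is a hypothetical name, not something in the patch.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Leader side of the pattern: size the chunk as strlen + 1 so that the
 * terminating NUL travels with the data, then copy exactly that many bytes.
 * malloc() stands in for the shared-memory allocator in this sketch.
 * Returns NULL on allocation failure; *len is set either way.
 */
char *
sketch_publish_string(const char *serialized, size_t *len)
{
	char	   *space;

	*len = strlen(serialized) + 1;	/* +1 keeps the terminating NUL */
	space = malloc(*len);
	if (space != NULL)
		memcpy(space, serialized, *len);
	return space;
}
```

A reader of the chunk can then treat it as an ordinary C string without being told the length separately, which is what the worker-side lookup followed by stringToNode() depends on.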
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 88d0ea3adb..b0eb15b982 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -25,6 +25,7 @@
#include "mb/pg_wchar.h"
#include "miscadmin.h"
#include "nodes/makefuncs.h"
+#include "parser/parsetree.h"
#include "partitioning/partbounds.h"
#include "partitioning/partdesc.h"
#include "partitioning/partprune.h"
@@ -185,7 +186,11 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
static PartitionPruneState *CreatePartitionPruneState(PlanState *planstate,
- PartitionPruneInfo *pruneinfo);
+ PartitionPruneInfo *pruneinfo,
+ bool consider_initial_steps,
+ bool consider_exec_steps,
+ List *rtable, ExprContext *econtext,
+ PartitionDirectory partdir);
static void InitPartitionPruneContext(PartitionPruneContext *context,
List *pruning_steps,
PartitionDesc partdesc,
@@ -198,7 +203,8 @@ static void PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
- Bitmapset **validsubplans);
+ Bitmapset **validsubplans,
+ Bitmapset **scan_leafpart_rtis);
/*
@@ -1749,8 +1755,10 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
* considered to be a stable expression, it can change value from one plan
* node scan to the next during query execution. Stable comparison
* expressions that don't involve such Params allow partition pruning to be
- * done once during executor startup. Expressions that do involve such Params
- * require us to prune separately for each scan of the parent plan node.
+ * done once during executor startup or during ExecutorDoInitialPruning() that
+ * runs as part of performing AcquireExecutorLocks() on a given plan tree.
+ * Expressions that do involve such Params require us to prune separately for
+ * each scan of the parent plan node.
*
* Note that pruning away unneeded subplans during executor startup has the
* added benefit of not having to initialize the unneeded subplans at all.
@@ -1767,6 +1775,13 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
* account for initial pruning possibly having eliminated some of the
* subplans.
*
+ * ExecPartitionDoInitialPruning:
+ * Do initial pruning with the information contained in a given
+ * PartitionPruneInfo to determine the minimal set of child subplans
+ * to be executed of the parent plan node to which the PartitionPruneInfo
+ * belongs and also the set of the RT indexes of leaf partitions that will
+ * be scanned with those subplans.
+ *
* ExecFindMatchingSubPlans:
* Returns indexes of matching subplans after evaluating the expressions
* that are safe to evaluate at a given point. This function is first
@@ -1787,8 +1802,9 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
*
* On return, *initially_valid_subplans is assigned the set of indexes of
* child subplans that must be initialized along with the parent plan node.
- * Initial pruning is performed here if needed and in that case only the
- * surviving subplans' indexes are added.
+ * Initial pruning is performed here if needed (unless it has already been done
+ * by ExecutorDoInitialPruning()), and in that case only the surviving
+ * subplans' indexes are added.
*
* If subplans are indeed pruned, subplan_map arrays contained in the returned
* PartitionPruneState are re-sequenced to not count those, though only if the
@@ -1801,9 +1817,10 @@ ExecInitPartitionPruning(PlanState *planstate,
Bitmapset *root_parent_relids,
Bitmapset **initially_valid_subplans)
{
- PartitionPruneState *prunestate;
+ PartitionPruneState *prunestate = NULL;
EState *estate = planstate->state;
PartitionPruneInfo *pruneinfo;
+ PartitionPruneResult *pruneresult = NULL;
/* Obtain the pruneinfo we need, and make sure it's the right one */
pruneinfo = list_nth(estate->es_part_prune_infos, part_prune_index);
@@ -1819,20 +1836,57 @@ ExecInitPartitionPruning(PlanState *planstate,
/* We may need an expression context to evaluate partition exprs */
ExecAssignExprContext(estate, planstate);
- /* Create the working data structure for pruning */
- prunestate = CreatePartitionPruneState(planstate, pruneinfo);
+ /*
+ * No need to do initial pruning if it was done already by
+ * ExecutorDoInitialPruning(), which it would be if es_part_prune_results
+ * is set.
+ */
+ if (estate->es_part_prune_results)
+ {
+ pruneresult = list_nth(estate->es_part_prune_results, part_prune_index);
+ Assert(IsA(pruneresult, PartitionPruneResult));
+ }
+
+ if (pruneresult == NULL || pruneinfo->needs_exec_pruning)
+ {
+ /* We may need an expression context to evaluate partition exprs */
+ ExecAssignExprContext(estate, planstate);
+
+ /* For data reading, executor always includes detached partitions */
+ if (estate->es_partition_directory == NULL)
+ estate->es_partition_directory =
+ CreatePartitionDirectory(estate->es_query_cxt, false);
+
+ /*
+ * Create the working data structure for pruning. No need to consider
+ * initial pruning steps if we have a PartitionPruneResult.
+ */
+ prunestate = CreatePartitionPruneState(planstate, pruneinfo,
+ pruneresult == NULL,
+ pruneinfo->needs_exec_pruning,
+ NIL, planstate->ps_ExprContext,
+ estate->es_partition_directory);
+ }
/*
* Perform an initial partition prune pass, if required.
*/
- if (prunestate->do_initial_prune)
- *initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true);
+ if (pruneresult)
+ {
+ *initially_valid_subplans = bms_copy(pruneresult->valid_subplan_offs);
+ }
+ else if (prunestate && prunestate->do_initial_prune)
+ {
+ *initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true,
+ NULL);
+ }
else
{
- /* No pruning, so we'll need to initialize all subplans */
+ /* No initial pruning, so we'll need to initialize all subplans */
Assert(n_total_subplans > 0);
*initially_valid_subplans = bms_add_range(NULL, 0,
n_total_subplans - 1);
+ return prunestate;
}
/*
@@ -1840,7 +1894,8 @@ ExecInitPartitionPruning(PlanState *planstate,
* that were removed above due to initial pruning. No need to do this if
* no steps were removed.
*/
- if (bms_num_members(*initially_valid_subplans) < n_total_subplans)
+ if (prunestate &&
+ bms_num_members(*initially_valid_subplans) < n_total_subplans)
{
/*
* We can safely skip this when !do_exec_prune, even though that
@@ -1856,11 +1911,74 @@ ExecInitPartitionPruning(PlanState *planstate,
return prunestate;
}
+/*
+ * ExecPartitionDoInitialPruning
+ * Perform initial pruning using given PartitionPruneInfo to determine
+ * the minimal set of child subplans that will be executed and also the
+ * set of RT indexes of the leaf partitions scanned by those subplans.
+ */
+Bitmapset *
+ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
+ PartitionPruneInfo *pruneinfo,
+ Bitmapset **scan_leafpart_rtis)
+{
+ List *rtable = plannedstmt->rtable;
+ ExprContext *econtext;
+ PartitionDirectory pdir;
+ MemoryContext oldcontext,
+ tmpcontext;
+ PartitionPruneState *prunestate;
+ Bitmapset *valid_subplan_offs;
+
+ /*
+ * A temporary context for memory allocations required while executing
+ * partition pruning steps.
+ */
+ tmpcontext = AllocSetContextCreate(CurrentMemoryContext,
+ "initial pruning working data",
+ ALLOCSET_DEFAULT_SIZES);
+ oldcontext = MemoryContextSwitchTo(tmpcontext);
+
+ /*
+ * PartitionDirectory to look up partition descriptors.
+ * Note that we don't omit detached partitions, just like during
+ * execution proper.
+ */
+ pdir = CreatePartitionDirectory(CurrentMemoryContext, false);
+
+ /*
+ * We don't yet have a PlanState for the parent plan node, so we must
+ * create a standalone ExprContext to evaluate pruning expressions,
+ * equipped with the information about the EXTERN parameters that the
+ * caller passed us. That's okay because the initial pruning steps do not
+ * contain anything that requires the execution to have started and thus
+ * needs the information contained in a PlanState.
+ */
+ econtext = CreateStandaloneExprContext();
+ econtext->ecxt_param_list_info = params;
+ prunestate = CreatePartitionPruneState(NULL, pruneinfo, true, false,
+ rtable, econtext, pdir);
+ MemoryContextSwitchTo(oldcontext);
+
+ /* Do the initial pruning. */
+ valid_subplan_offs = ExecFindMatchingSubPlans(prunestate, true,
+ scan_leafpart_rtis);
+
+ FreeExprContext(econtext, true);
+ DestroyPartitionDirectory(pdir);
+ MemoryContextDelete(tmpcontext);
+
+ return valid_subplan_offs;
+}
+
/*
* CreatePartitionPruneState
* Build the data structure required for calling ExecFindMatchingSubPlans
*
- * 'planstate' is the parent plan node's execution state.
+ * 'planstate', if not NULL, is the parent plan node's execution state. It
+ * can be NULL if being called before ExecutorStart(), in which case,
+ * 'rtable' (range table), 'econtext', and 'partdir' must be explicitly
+ * provided.
*
* 'pruneinfo' is a PartitionPruneInfo as generated by
* make_partition_pruneinfo. Here we build a PartitionPruneState containing a
@@ -1874,19 +1992,21 @@ ExecInitPartitionPruning(PlanState *planstate,
* PartitionedRelPruneInfo.
*/
static PartitionPruneState *
-CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
+CreatePartitionPruneState(PlanState *planstate,
+ PartitionPruneInfo *pruneinfo,
+ bool consider_initial_steps,
+ bool consider_exec_steps,
+ List *rtable, ExprContext *econtext,
+ PartitionDirectory partdir)
{
- EState *estate = planstate->state;
+ EState *estate = planstate ? planstate->state : NULL;
PartitionPruneState *prunestate;
int n_part_hierarchies;
ListCell *lc;
int i;
- ExprContext *econtext = planstate->ps_ExprContext;
- /* For data reading, executor always omits detached partitions */
- if (estate->es_partition_directory == NULL)
- estate->es_partition_directory =
- CreatePartitionDirectory(estate->es_query_cxt, false);
+ Assert((estate != NULL) ||
+ (partdir != NULL && econtext != NULL && rtable != NIL));
n_part_hierarchies = list_length(pruneinfo->prune_infos);
Assert(n_part_hierarchies > 0);
@@ -1941,15 +2061,42 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
PartitionKey partkey;
/*
- * We can rely on the copies of the partitioned table's partition
- * key and partition descriptor appearing in its relcache entry,
- * because that entry will be held open and locked for the
- * duration of this executor run.
+ * Must open the relation by ourselves when called before the
+ * execution has started, such as, when called during
+ * ExecutorDoInitialPruning() on a cached plan. In that case,
+ * sub-partitions must be locked, because AcquirePlannerLocks()
+ * would not have seen them. (1st relation in a partrelpruneinfos
+ * list is always the root partitioned table appearing in the
+ * query, which AcquirePlannerLocks() would have locked; the
+ * Assert in relation_open() guards that assumption.)
+ */
+ if (estate == NULL)
+ {
+ RangeTblEntry *rte = rt_fetch(pinfo->rtindex, rtable);
+ int lockmode = (j == 0) ? NoLock : rte->rellockmode;
+
+ partrel = table_open(rte->relid, lockmode);
+ }
+ else
+ partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
+
+ /*
+ * We can rely on the copy of the partitioned table's partition
+ * key in its relcache entry, because it can't change (or
+ * get destroyed) as long as the relation is locked. Partition
+ * descriptor is taken from the PartitionDirectory associated with
+ * the table that is held open long enough for the descriptor to
+ * remain valid while it's used to perform the pruning steps.
*/
- partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
partkey = RelationGetPartitionKey(partrel);
- partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
- partrel);
+ partdesc = PartitionDirectoryLookup(partdir, partrel);
+
+ /*
+ * Must close partrel, keeping the lock taken, if we're not using
+ * EState's entry.
+ */
+ if (estate == NULL)
+ table_close(partrel, NoLock);
/*
* Initialize the subplan_map and subpart_map.
@@ -1963,6 +2110,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
Assert(partdesc->nparts >= pinfo->nparts);
pprune->nparts = partdesc->nparts;
pprune->subplan_map = palloc(sizeof(int) * partdesc->nparts);
+ pprune->rti_map = palloc(sizeof(Index) * partdesc->nparts);
if (partdesc->nparts == pinfo->nparts)
{
/*
@@ -1973,6 +2121,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
pprune->subpart_map = pinfo->subpart_map;
memcpy(pprune->subplan_map, pinfo->subplan_map,
sizeof(int) * pinfo->nparts);
+ memcpy(pprune->rti_map, pinfo->rti_map,
+ sizeof(Index) * pinfo->nparts);
/*
* Double-check that the list of unpruned relations has not
@@ -2023,6 +2173,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
pinfo->subplan_map[pd_idx];
pprune->subpart_map[pp_idx] =
pinfo->subpart_map[pd_idx];
+ pprune->rti_map[pp_idx] =
+ pinfo->rti_map[pd_idx];
pd_idx++;
}
else
@@ -2030,6 +2182,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
/* this partdesc entry is not in the plan */
pprune->subplan_map[pp_idx] = -1;
pprune->subpart_map[pp_idx] = -1;
+ pprune->rti_map[pp_idx] = 0;
}
}
@@ -2051,7 +2204,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
* Initialize pruning contexts as needed.
*/
pprune->initial_pruning_steps = pinfo->initial_pruning_steps;
- if (pinfo->initial_pruning_steps)
+ if (consider_initial_steps && pinfo->initial_pruning_steps)
{
InitPartitionPruneContext(&pprune->initial_context,
pinfo->initial_pruning_steps,
@@ -2061,7 +2214,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
prunestate->do_initial_prune = true;
}
pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
- if (pinfo->exec_pruning_steps)
+ if (consider_exec_steps && pinfo->exec_pruning_steps)
{
InitPartitionPruneContext(&pprune->exec_context,
pinfo->exec_pruning_steps,
@@ -2289,10 +2442,14 @@ PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
* Pass initial_prune if PARAM_EXEC Params cannot yet be evaluated. This
* differentiates the initial executor-time pruning step from later
* runtime pruning.
+ *
+ * RT indexes of leaf partitions scanned by the chosen subplans are added to
+ * *scan_leafpart_rtis if the pointer is non-NULL.
*/
Bitmapset *
ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
- bool initial_prune)
+ bool initial_prune,
+ Bitmapset **scan_leafpart_rtis)
{
Bitmapset *result = NULL;
MemoryContext oldcontext;
@@ -2327,7 +2484,7 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
*/
pprune = &prunedata->partrelprunedata[0];
find_matching_subplans_recurse(prunedata, pprune, initial_prune,
- &result);
+ &result, scan_leafpart_rtis);
/* Expression eval may have used space in ExprContext too */
if (pprune->exec_pruning_steps)
@@ -2341,6 +2498,8 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
/* Copy result out of the temp context before we reset it */
result = bms_copy(result);
+ if (scan_leafpart_rtis)
+ *scan_leafpart_rtis = bms_copy(*scan_leafpart_rtis);
MemoryContextReset(prunestate->prune_context);
@@ -2351,13 +2510,15 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
* find_matching_subplans_recurse
* Recursive worker function for ExecFindMatchingSubPlans
*
- * Adds valid (non-prunable) subplan IDs to *validsubplans
+ * Adds valid (non-prunable) subplan IDs to *validsubplans and RT indexes of
+ * the corresponding leaf partitions to *scan_leafpart_rtis (if asked for).
*/
static void
find_matching_subplans_recurse(PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
- Bitmapset **validsubplans)
+ Bitmapset **validsubplans,
+ Bitmapset **scan_leafpart_rtis)
{
Bitmapset *partset;
int i;
@@ -2384,8 +2545,14 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
while ((i = bms_next_member(partset, i)) >= 0)
{
if (pprune->subplan_map[i] >= 0)
+ {
*validsubplans = bms_add_member(*validsubplans,
pprune->subplan_map[i]);
+ Assert(pprune->rti_map[i] > 0);
+ if (scan_leafpart_rtis)
+ *scan_leafpart_rtis = bms_add_member(*scan_leafpart_rtis,
+ pprune->rti_map[i]);
+ }
else
{
int partidx = pprune->subpart_map[i];
@@ -2393,7 +2560,8 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
if (partidx >= 0)
find_matching_subplans_recurse(prunedata,
&prunedata->partrelprunedata[partidx],
- initial_prune, validsubplans);
+ initial_prune, validsubplans,
+ scan_leafpart_rtis);
else
{
/*
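
For readers following the find_matching_subplans_recurse() change above, here is a minimal standalone sketch of the same traversal shape (not the patch's code). It assumes 32-bit masks in place of Bitmapsets and a precomputed "surviving" mask in place of actually running pruning steps. SketchPartRel and sketch_find_matching are hypothetical names. As in the patch, the RT-index output pointer is optional and is skipped when NULL.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a Bitmapset: bit i set means member i is present. */
typedef uint32_t bitset32;

/* Hypothetical, simplified analogue of PartitionedRelPruningData. */
typedef struct SketchPartRel
{
	int			nparts;
	const int  *subplan_map;	/* >= 0: subplan index of a leaf partition */
	const int  *subpart_map;	/* >= 0: index of a sub-partitioned table */
	const int  *rti_map;		/* RT index of a leaf partition, else 0 */
	bitset32	surviving;		/* partitions assumed to survive pruning */
} SketchPartRel;

/*
 * Walk the surviving partitions of rels[relidx].  Leaf partitions contribute
 * their subplan index and, if the caller asked for it, their RT index;
 * sub-partitioned tables are visited recursively.
 */
void
sketch_find_matching(const SketchPartRel *rels, int relidx,
					 bitset32 *validsubplans, bitset32 *scan_leafpart_rtis)
{
	const SketchPartRel *rel = &rels[relidx];

	for (int i = 0; i < rel->nparts; i++)
	{
		if (!(rel->surviving & ((bitset32) 1 << i)))
			continue;			/* pruned */

		if (rel->subplan_map[i] >= 0)
		{
			*validsubplans |= (bitset32) 1 << rel->subplan_map[i];
			if (scan_leafpart_rtis != NULL)
				*scan_leafpart_rtis |= (bitset32) 1 << rel->rti_map[i];
		}
		else if (rel->subpart_map[i] >= 0)
			sketch_find_matching(rels, rel->subpart_map[i],
								 validsubplans, scan_leafpart_rtis);
	}
}
```

Passing NULL for the last argument mirrors the executor-time callers in nodeAppend.c and nodeMergeAppend.c, which only want the subplan set.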
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 572c87e453..044bf3f491 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -135,6 +135,7 @@ CreateExecutorState(void)
estate->es_param_exec_vals = NULL;
estate->es_queryEnv = NULL;
+ estate->es_part_prune_results = NIL;
estate->es_query_cxt = qcontext;
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index dc13625171..bffb42ce71 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -842,7 +842,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
else
dest = None_Receiver;
- es->qd = CreateQueryDesc(es->stmt,
+ es->qd = CreateQueryDesc(es->stmt, NIL,
fcache->src,
GetActiveSnapshot(),
InvalidSnapshot,
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 99830198bd..3b917584de 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -156,7 +156,8 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
* subplan, we can fill as_valid_subplans immediately, preventing
* later calls to ExecFindMatchingSubPlans.
*/
- if (!prunestate->do_exec_prune && nplans > 0)
+ if (appendstate->as_prune_state == NULL ||
+ (!appendstate->as_prune_state->do_exec_prune && nplans > 0))
appendstate->as_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
}
else
@@ -578,7 +579,7 @@ choose_next_subplan_locally(AppendState *node)
}
else if (node->as_valid_subplans == NULL)
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
whichplan = -1;
}
@@ -643,7 +644,7 @@ choose_next_subplan_for_leader(AppendState *node)
if (node->as_valid_subplans == NULL)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
/*
* Mark each invalid plan as finished to allow the loop below to
@@ -718,7 +719,7 @@ choose_next_subplan_for_worker(AppendState *node)
else if (node->as_valid_subplans == NULL)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
mark_invalid_subplans_as_finished(node);
}
@@ -869,7 +870,7 @@ ExecAppendAsyncBegin(AppendState *node)
if (node->as_valid_subplans == NULL)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
classify_matching_subplans(node);
}
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index f370f9f287..ccfa083945 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -104,7 +104,8 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
* subplan, we can fill ms_valid_subplans immediately, preventing
* later calls to ExecFindMatchingSubPlans.
*/
- if (!prunestate->do_exec_prune && nplans > 0)
+ if (mergestate->ms_prune_state == NULL ||
+ (!mergestate->ms_prune_state->do_exec_prune && nplans > 0))
mergestate->ms_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
}
else
@@ -219,7 +220,7 @@ ExecMergeAppend(PlanState *pstate)
*/
if (node->ms_valid_subplans == NULL)
node->ms_valid_subplans =
- ExecFindMatchingSubPlans(node->ms_prune_state, false);
+ ExecFindMatchingSubPlans(node->ms_prune_state, false, NULL);
/*
* First time through: pull the first tuple from each valid subplan,
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index fd5796f1b9..93012a5b3b 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1578,6 +1578,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
CachedPlanSource *plansource;
CachedPlan *cplan;
List *stmt_list;
+ List *part_prune_results_list;
char *query_string;
Snapshot snapshot;
MemoryContext oldcontext;
@@ -1657,7 +1658,10 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
*/
/* Replan if needed, and increment plan refcount for portal */
- cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv);
+ cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv,
+ &part_prune_results_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_results_list));
stmt_list = cplan->stmt_list;
if (!plan->saved)
@@ -1685,6 +1689,9 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
stmt_list,
cplan);
+ /* Copy Lists of PartitionPruneResults into the portal's context. */
+ PortalStorePartitionPruneResults(portal, part_prune_results_list);
+
/*
* Set up options for portal. Default SCROLL type is chosen the same way
* as PerformCursorOpen does it.
@@ -2092,7 +2099,8 @@ SPI_plan_get_cached_plan(SPIPlanPtr plan)
/* Get the generic plan for the query */
cplan = GetCachedPlan(plansource, NULL,
plan->saved ? CurrentResourceOwner : NULL,
- _SPI_current->queryEnv);
+ _SPI_current->queryEnv,
+ NULL /* Not interested in PartitionPruneResults */);
Assert(cplan == plansource->gplan);
/* Pop the error context stack */
@@ -2473,7 +2481,9 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
{
CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
List *stmt_list;
- ListCell *lc2;
+ List *part_prune_results_list;
+ ListCell *lc2,
+ *lc3;
spicallbackarg.query = plansource->query_string;
@@ -2549,8 +2559,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
* plan, the refcount must be backed by the plan_owner.
*/
cplan = GetCachedPlan(plansource, options->params,
- plan_owner, _SPI_current->queryEnv);
-
+ plan_owner, _SPI_current->queryEnv,
+ &part_prune_results_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_results_list));
stmt_list = cplan->stmt_list;
/*
@@ -2589,9 +2601,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
}
}
- foreach(lc2, stmt_list)
+ forboth(lc2, stmt_list, lc3, part_prune_results_list)
{
PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+ List *part_prune_results = lfirst_node(List, lc3);
bool canSetTag = stmt->canSetTag;
DestReceiver *dest;
@@ -2663,7 +2676,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
else
snap = InvalidSnapshot;
- qdesc = CreateQueryDesc(stmt,
+ qdesc = CreateQueryDesc(stmt, part_prune_results,
plansource->query_string,
snap, crosscheck_snapshot,
dest,
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 23776367c5..b01f55fb4f 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -158,6 +158,11 @@
token = pg_strtok(&length); /* skip :fldname */ \
local_node->fldname = readIntCols(len)
+/* Read an Index array */
+#define READ_INDEX_ARRAY(fldname, len) \
+ token = pg_strtok(&length); /* skip :fldname */ \
+ local_node->fldname = readIndexCols(len)
+
/* Read a bool array */
#define READ_BOOL_ARRAY(fldname, len) \
token = pg_strtok(&length); /* skip :fldname */ \
@@ -800,7 +805,6 @@ fnname(int numCols) \
*/
READ_SCALAR_ARRAY(readAttrNumberCols, int16, atoi)
READ_SCALAR_ARRAY(readOidCols, Oid, atooid)
-/* outfuncs.c has writeIndexCols, but we don't yet need that here */
-/* READ_SCALAR_ARRAY(readIndexCols, Index, atoui) */
+READ_SCALAR_ARRAY(readIndexCols, Index, atoui)
READ_SCALAR_ARRAY(readIntCols, int, atoi)
READ_SCALAR_ARRAY(readBoolCols, bool, strtobool)
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 799602f5ea..a96d316dca 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -520,7 +520,9 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
result->parallelModeNeeded = glob->parallelModeNeeded;
result->planTree = top_plan;
result->partPruneInfos = glob->partPruneInfos;
+ result->containsInitialPruning = glob->containsInitialPruning;
result->rtable = glob->finalrtable;
+ result->minLockRelids = glob->minLockRelids;
result->resultRelations = glob->resultRelations;
result->appendRelations = glob->appendRelations;
result->subplans = glob->subplans;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 399c1812d4..44ffe71c49 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -270,6 +270,16 @@ set_plan_references(PlannerInfo *root, Plan *plan)
*/
add_rtes_to_flat_rtable(root, false);
+ /*
+ * Add the query's adjusted range of RT indexes to glob->minLockRelids.
+ * The adjusted RT indexes of prunable relations will be deleted from the
+ * set below where PartitionPruneInfos are processed.
+ */
+ glob->minLockRelids =
+ bms_add_range(glob->minLockRelids,
+ rtoffset + 1,
+ rtoffset + list_length(root->parse->rtable));
+
/*
* Adjust RT indexes of PlanRowMarks and add to final rowmarks list
*/
@@ -353,6 +363,7 @@ set_plan_references(PlannerInfo *root, Plan *plan)
{
PartitionPruneInfo *pruneinfo = lfirst(lc);
ListCell *l;
+ Bitmapset *leafpart_rtis = NULL;
pruneinfo->root_parent_relids =
offset_relid_set(pruneinfo->root_parent_relids, rtoffset);
@@ -364,15 +375,50 @@ set_plan_references(PlannerInfo *root, Plan *plan)
foreach(l2, prune_infos)
{
PartitionedRelPruneInfo *pinfo = lfirst(l2);
+ int i;
/* RT index of the table to which the pinfo belongs. */
pinfo->rtindex += rtoffset;
+
+ /* Also adjust the RT indexes of leaf partitions that might be scanned. */
+ for (i = 0; i < pinfo->nparts; i++)
+ {
+ if (pinfo->rti_map[i] > 0 && pinfo->subplan_map[i] >= 0)
+ {
+ pinfo->rti_map[i] += rtoffset;
+ leafpart_rtis = bms_add_member(leafpart_rtis,
+ pinfo->rti_map[i]);
+ }
+ }
}
}
+ if (pruneinfo->needs_init_pruning)
+ {
+ glob->containsInitialPruning = true;
+
+ /*
+ * Delete the leaf partition RTIs from the global set of relations
+ * to be locked before executing the plan. AcquireExecutorLocks()
+ * will find the ones to add to the set after performing initial
+ * pruning.
+ */
+ glob->minLockRelids = bms_del_members(glob->minLockRelids,
+ leafpart_rtis);
+ }
+
glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
}
+ /*
+ * It seems worth doing a bms_copy() on glob->minLockRelids if we deleted
+ * bits from it above to get rid of any empty tail bits. It seems better
+ * for the loop over this set in AcquireExecutorLocks() to not have to go
+ * through those useless bit words.
+ */
+ if (glob->containsInitialPruning)
+ glob->minLockRelids = bms_copy(glob->minLockRelids);
+
return result;
}
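The minLockRelids bookkeeping in the setrefs.c hunks above can be illustrated outside PostgreSQL with a minimal standalone sketch. It is not part of the patch: RelidSet is a hypothetical one-word stand-in for Bitmapset, and compute_min_lock_relids() and the set_* helpers are made-up names that only mirror the order of operations (add the whole RT index range, collect the leaf partition RTIs that have a subplan, delete them when initial pruning is present).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for Bitmapset: a single 64-bit word, members 0..63. */
typedef uint64_t RelidSet;

static RelidSet
set_add_member(RelidSet s, int m)
{
    assert(m >= 0 && m < 64);
    return s | (UINT64_C(1) << m);
}

static RelidSet
set_add_range(RelidSet s, int lower, int upper)
{
    for (int i = lower; i <= upper; i++)
        s = set_add_member(s, i);
    return s;
}

static RelidSet
set_del_members(RelidSet s, RelidSet del)
{
    return s & ~del;
}

static int
set_is_member(RelidSet s, int m)
{
    return m >= 0 && m < 64 && ((s >> m) & 1) != 0;
}

/*
 * Start with every RT index of the query, then, only if initial pruning
 * steps exist, remove the leaf partitions that pruning might eliminate.
 * rti_map[i] is 0 for "no RT index"; subplan_map[i] is -1 for "no subplan".
 */
static RelidSet
compute_min_lock_relids(int rtoffset, int nrtes,
                        const int *rti_map, const int *subplan_map,
                        int nparts, int needs_init_pruning)
{
    RelidSet min_lock = set_add_range(0, rtoffset + 1, rtoffset + nrtes);
    RelidSet leafpart_rtis = 0;

    for (int i = 0; i < nparts; i++)
    {
        if (rti_map[i] > 0 && subplan_map[i] >= 0)
            leafpart_rtis = set_add_member(leafpart_rtis,
                                           rti_map[i] + rtoffset);
    }

    if (needs_init_pruning)
        min_lock = set_del_members(min_lock, leafpart_rtis);

    return min_lock;
}
```

With a parent at RT index 1 and two leaf partitions at 2 and 3, the result holds only index 1 when initial pruning is needed, and all three indexes otherwise. The leaf partitions are added back at lock time, once pruning has determined which of them survive.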
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index d48f6784c1..d5556354f7 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -144,7 +144,9 @@ static List *make_partitionedrel_pruneinfo(PlannerInfo *root,
List *prunequal,
Bitmapset *partrelids,
int *relid_subplan_map,
- Bitmapset **matchedsubplans);
+ Bitmapset **matchedsubplans,
+ bool *needs_init_pruning,
+ bool *needs_exec_pruning);
static void gen_partprune_steps(RelOptInfo *rel, List *clauses,
PartClauseTarget target,
GeneratePruningStepsContext *context);
@@ -234,6 +236,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int *relid_subplan_map;
ListCell *lc;
int i;
+ bool needs_init_pruning = false;
+ bool needs_exec_pruning = false;
/*
* Scan the subpaths to see which ones are scans of partition child
@@ -313,12 +317,16 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
Bitmapset *partrelids = (Bitmapset *) lfirst(lc);
List *pinfolist;
Bitmapset *matchedsubplans = NULL;
+ bool partrel_needs_init_pruning;
+ bool partrel_needs_exec_pruning;
pinfolist = make_partitionedrel_pruneinfo(root, parentrel,
prunequal,
partrelids,
relid_subplan_map,
- &matchedsubplans);
+ &matchedsubplans,
+ &partrel_needs_init_pruning,
+ &partrel_needs_exec_pruning);
/* When pruning is possible, record the matched subplans */
if (pinfolist != NIL)
@@ -327,6 +335,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
allmatchedsubplans = bms_join(matchedsubplans,
allmatchedsubplans);
}
+
+ needs_init_pruning |= partrel_needs_init_pruning;
+ needs_exec_pruning |= partrel_needs_exec_pruning;
}
pfree(relid_subplan_map);
@@ -342,6 +353,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
pruneinfo = makeNode(PartitionPruneInfo);
pruneinfo->root_parent_relids = parentrel->relids;
pruneinfo->prune_infos = prunerelinfos;
+ pruneinfo->needs_init_pruning = needs_init_pruning;
+ pruneinfo->needs_exec_pruning = needs_exec_pruning;
/*
* Some subplans may not belong to any of the identified partitioned rels.
@@ -442,13 +455,18 @@ add_part_relids(List *allpartrelids, Bitmapset *partrelids)
* If we cannot find any useful run-time pruning steps, return NIL.
* However, on success, each rel identified in partrelids will have
* an element in the result list, even if some of them are useless.
+ * *needs_init_pruning and *needs_exec_pruning are set to indicate that the
+ * returned PartitionedRelPruneInfos contain pruning steps that can be
+ * performed before and after execution begins, respectively.
*/
static List *
make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
List *prunequal,
Bitmapset *partrelids,
int *relid_subplan_map,
- Bitmapset **matchedsubplans)
+ Bitmapset **matchedsubplans,
+ bool *needs_init_pruning,
+ bool *needs_exec_pruning)
{
RelOptInfo *targetpart = NULL;
List *pinfolist = NIL;
@@ -459,6 +477,10 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int rti;
int i;
+ /* Will find out below. */
+ *needs_init_pruning = false;
+ *needs_exec_pruning = false;
+
/*
* Examine each partitioned rel, constructing a temporary array to map
* from planner relids to index of the partitioned rel, and building a
@@ -546,6 +568,9 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
* executor per-scan pruning steps. This first pass creates startup
* pruning steps and detects whether there's any possibly-useful quals
* that would require per-scan pruning.
+ *
+ * The first pass also determines whether the 2nd pass is necessary by
+ * checking for the presence of EXEC parameters.
*/
gen_partprune_steps(subpart, partprunequal, PARTTARGET_INITIAL,
&context);
@@ -620,6 +645,12 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
pinfo->execparamids = execparamids;
/* Remaining fields will be filled in the next loop */
+ /* record which types of pruning steps we've seen so far */
+ if (initial_pruning_steps != NIL)
+ *needs_init_pruning = true;
+ if (exec_pruning_steps != NIL)
+ *needs_exec_pruning = true;
+
pinfolist = lappend(pinfolist, pinfo);
}
@@ -647,6 +678,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int *subplan_map;
int *subpart_map;
Oid *relid_map;
+ Index *rti_map;
/*
* Construct the subplan and subpart maps for this partitioning level.
@@ -659,6 +691,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
subpart_map = (int *) palloc(nparts * sizeof(int));
memset(subpart_map, -1, nparts * sizeof(int));
relid_map = (Oid *) palloc0(nparts * sizeof(Oid));
+ rti_map = (Index *) palloc0(nparts * sizeof(Index));
present_parts = NULL;
i = -1;
@@ -673,6 +706,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
subplan_map[i] = subplanidx = relid_subplan_map[partrel->relid] - 1;
subpart_map[i] = subpartidx = relid_subpart_map[partrel->relid] - 1;
relid_map[i] = planner_rt_fetch(partrel->relid, root)->relid;
+ rti_map[i] = partrel->relid;
if (subplanidx >= 0)
{
present_parts = bms_add_member(present_parts, i);
@@ -697,6 +731,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
pinfo->subplan_map = subplan_map;
pinfo->subpart_map = subpart_map;
pinfo->relid_map = relid_map;
+ pinfo->rti_map = rti_map;
}
pfree(relid_subpart_map);
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 3082093d1e..95ab1d0eef 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1598,6 +1598,7 @@ exec_bind_message(StringInfo input_message)
int16 *rformats = NULL;
CachedPlanSource *psrc;
CachedPlan *cplan;
+ List *part_prune_results_list;
Portal portal;
char *query_string;
char *saved_stmt_name;
@@ -1972,7 +1973,9 @@ exec_bind_message(StringInfo input_message)
* will be generated in MessageContext. The plan refcount will be
* assigned to the Portal, so it will be released at portal destruction.
*/
- cplan = GetCachedPlan(psrc, params, NULL, NULL);
+ cplan = GetCachedPlan(psrc, params, NULL, NULL, &part_prune_results_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_results_list));
/*
* Now we can define the portal.
@@ -1987,6 +1990,9 @@ exec_bind_message(StringInfo input_message)
cplan->stmt_list,
cplan);
+ /* Copy Lists of PartitionPruneResults into the portal's context. */
+ PortalStorePartitionPruneResults(portal, part_prune_results_list);
+
/* Done with the snapshot used for parameter I/O and parsing/planning */
if (snapshot_set)
PopActiveSnapshot();
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 52e2db6452..280ed7d239 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -35,7 +35,7 @@
Portal ActivePortal = NULL;
-static void ProcessQuery(PlannedStmt *plan,
+static void ProcessQuery(PlannedStmt *plan, List *part_prune_results,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -65,6 +65,7 @@ static void DoPortalRewind(Portal portal);
*/
QueryDesc *
CreateQueryDesc(PlannedStmt *plannedstmt,
+ List *part_prune_results,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
@@ -77,6 +78,8 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
qd->operation = plannedstmt->commandType; /* operation */
qd->plannedstmt = plannedstmt; /* plan */
+ qd->part_prune_results = part_prune_results; /* ExecutorDoInitialPruning()
+ * output for plan */
qd->sourceText = sourceText; /* query text */
qd->snapshot = RegisterSnapshot(snapshot); /* snapshot */
/* RI check snapshot */
@@ -122,6 +125,7 @@ FreeQueryDesc(QueryDesc *qdesc)
* PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
*
* plan: the plan tree for the query
+ * part_prune_results: ExecutorDoInitialPruning() output for the PlannedStmt
* sourceText: the source text of the query
* params: any parameters needed
* dest: where to send results
@@ -134,6 +138,7 @@ FreeQueryDesc(QueryDesc *qdesc)
*/
static void
ProcessQuery(PlannedStmt *plan,
+ List *part_prune_results,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -145,7 +150,7 @@ ProcessQuery(PlannedStmt *plan,
/*
* Create the QueryDesc object
*/
- queryDesc = CreateQueryDesc(plan, sourceText,
+ queryDesc = CreateQueryDesc(plan, part_prune_results, sourceText,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
@@ -491,8 +496,13 @@ PortalStart(Portal portal, ParamListInfo params,
/*
* Create QueryDesc in portal's context; for the moment, set
* the destination to DestNone.
+ *
+ * There is no PartitionPruneResult unless the PlannedStmt is
+ * from a CachedPlan.
*/
queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+ portal->part_prune_results_list == NIL ? NIL :
+ linitial(portal->part_prune_results_list),
portal->sourceText,
GetActiveSnapshot(),
InvalidSnapshot,
@@ -1225,6 +1235,8 @@ PortalRunMulti(Portal portal,
if (pstmt->utilityStmt == NULL)
{
+ List *part_prune_results = NIL;
+
/*
* process a plannable query.
*/
@@ -1271,10 +1283,18 @@ PortalRunMulti(Portal portal,
else
UpdateActiveSnapshotCommandId();
+ /*
+ * Determine if there's a corresponding List of PartitionPruneResult
+ * for this PlannedStmt.
+ */
+ if (portal->part_prune_results_list != NIL)
+ part_prune_results = list_nth(portal->part_prune_results_list,
+ foreach_current_index(stmtlist_item));
+
if (pstmt->canSetTag)
{
/* statement can set tag string */
- ProcessQuery(pstmt,
+ ProcessQuery(pstmt, part_prune_results,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
@@ -1283,7 +1303,7 @@ PortalRunMulti(Portal portal,
else
{
/* stmt added by rewrite cannot set tag */
- ProcessQuery(pstmt,
+ ProcessQuery(pstmt, part_prune_results,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
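The pquery.c hunks above rely on a simple convention: the list of prune results runs parallel to the statement list, and either the whole list or a single entry may be empty. A minimal standalone sketch of that lookup follows. It is not part of the patch: PruneResult and prune_result_for_stmt() are hypothetical names, and a plain C array stands in for the List.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-statement List of PartitionPruneResults. */
typedef struct PruneResult
{
    int nsurviving;     /* number of subplans that survived pruning */
} PruneResult;

/*
 * Return the entry that corresponds to statement number stmt_index, or NULL
 * if there is no list at all (the statement did not come from a cached plan)
 * or if no initial pruning was done for that statement.
 */
static const PruneResult *
prune_result_for_stmt(const PruneResult *const *results, size_t nresults,
                      size_t stmt_index)
{
    if (results == NULL)
        return NULL;

    /* The two lists must be the same length, so the index must be valid. */
    assert(stmt_index < nresults);
    return results[stmt_index];
}
```

Keeping one entry per statement, even when it is empty, is what lets callers index by statement position without any further bookkeeping.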
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index cc943205d3..af6fae6e3b 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -99,14 +99,19 @@ static dlist_head cached_expression_list = DLIST_STATIC_INIT(cached_expression_l
static void ReleaseGenericPlan(CachedPlanSource *plansource);
static List *RevalidateCachedQuery(CachedPlanSource *plansource,
QueryEnvironment *queryEnv);
-static bool CheckCachedPlan(CachedPlanSource *plansource);
+static bool CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+ List **part_prune_results_list);
static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
- ParamListInfo boundParams, QueryEnvironment *queryEnv);
+ ParamListInfo boundParams, QueryEnvironment *queryEnv,
+ List **part_prune_results_list);
static bool choose_custom_plan(CachedPlanSource *plansource,
ParamListInfo boundParams);
static double cached_plan_cost(CachedPlan *plan, bool include_planner);
static Query *QueryListGetPrimaryStmt(List *stmts);
-static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static void AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+ List **part_prune_results_list,
+ List **lockedRelids_per_stmt);
+static void ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt);
static void AcquirePlannerLocks(List *stmt_list, bool acquire);
static void ScanQueryForLocks(Query *parsetree, bool acquire);
static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -782,6 +787,26 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
return tlist;
}
+/*
+ * FreePartitionPruneResults
+ * Frees the List of Lists of PartitionPruneResults for CheckCachedPlan()
+ */
+static void
+FreePartitionPruneResults(List *part_prune_results_list)
+{
+ ListCell *lc;
+
+ foreach(lc, part_prune_results_list)
+ {
+ List *part_prune_results = lfirst(lc);
+
+ /* Free both the PartitionPruneResults and the containing List. */
+ list_free_deep(part_prune_results);
+ }
+
+ list_free(part_prune_results_list);
+}
+
/*
* CheckCachedPlan: see if the CachedPlanSource's generic plan is valid.
*
@@ -790,15 +815,20 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
*
* On a "true" return, we have acquired the locks needed to run the plan.
* (We must do this for the "true" result to be race-condition-free.)
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_results_list.
*/
static bool
-CheckCachedPlan(CachedPlanSource *plansource)
+CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+ List **part_prune_results_list)
{
CachedPlan *plan = plansource->gplan;
/* Assert that caller checked the querytree */
Assert(plansource->is_valid);
+ *part_prune_results_list = NIL;
+
/* If there's no generic plan, just say "false" */
if (!plan)
return false;
@@ -820,13 +850,21 @@ CheckCachedPlan(CachedPlanSource *plansource)
*/
if (plan->is_valid)
{
+ List *lockedRelids_per_stmt;
+
/*
* Plan must have positive refcount because it is referenced by
* plansource; so no need to fear it disappears under us here.
*/
Assert(plan->refcount > 0);
- AcquireExecutorLocks(plan->stmt_list, true);
+ /*
+ * Lock relations scanned by the plan. This is where the pruning
+ * happens if needed.
+ */
+ AcquireExecutorLocks(plan->stmt_list, boundParams,
+ part_prune_results_list,
+ &lockedRelids_per_stmt);
/*
* If plan was transient, check to see if TransactionXmin has
@@ -848,7 +886,11 @@ CheckCachedPlan(CachedPlanSource *plansource)
}
/* Oops, the race case happened. Release useless locks. */
- AcquireExecutorLocks(plan->stmt_list, false);
+ ReleaseExecutorLocks(plan->stmt_list, lockedRelids_per_stmt);
+
+ /* Release any PartitionPruneResults that may have been created. */
+ FreePartitionPruneResults(*part_prune_results_list);
+ *part_prune_results_list = NIL;
}
/*
@@ -874,10 +916,14 @@ CheckCachedPlan(CachedPlanSource *plansource)
* Planning work is done in the caller's memory context. The finished plan
* is in a child memory context, which typically should get reparented
* (unless this is a one-shot plan, in which case we don't copy the plan).
+ *
+ * A list of NILs is returned in *part_prune_results_list, meaning that no
+ * partition pruning has been done yet for the plans in stmt_list.
*/
static CachedPlan *
BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
- ParamListInfo boundParams, QueryEnvironment *queryEnv)
+ ParamListInfo boundParams, QueryEnvironment *queryEnv,
+ List **part_prune_results_list)
{
CachedPlan *plan;
List *plist;
@@ -1007,6 +1053,17 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
MemoryContextSwitchTo(oldcxt);
+ /*
+ * There are no actual PartitionPruneResults to add yet, but the list must
+ * still be initialized to have the same number of elements as the list of
+ * PlannedStmts.
+ */
+ *part_prune_results_list = NIL;
+ foreach(lc, plist)
+ {
+ *part_prune_results_list = lappend(*part_prune_results_list, NIL);
+ }
+
return plan;
}
@@ -1126,6 +1183,19 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
* plan or a custom plan for the given parameters: the caller does not know
* which it will get.
*
+ * For every PlannedStmt found in the returned CachedPlan, an element that
+ * is either a List of PartitionPruneResults or NIL is added to
+ * *part_prune_results_list. It is the former only if the PlannedStmt comes
+ * from an existing CachedPlan that is otherwise valid and has
+ * containsInitialPruning set to true. Before returning such a CachedPlan,
+ * those "initial" steps are performed by calling ExecutorDoInitialPruning(),
+ * which prunes away subplans that don't match the "initial" pruning
+ * conditions so that AcquireExecutorLocks() locks only the leaf partitions
+ * that survive. For each PartitionPruneInfo found in
+ * PlannedStmt.partPruneInfos, a PartitionPruneResult containing the bitmapset
+ * of the indexes of surviving subplans is added to the List for the
+ * PlannedStmt.
+ *
* On return, the plan is valid and we have sufficient locks to begin
* execution.
*
@@ -1139,11 +1209,13 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
*/
CachedPlan *
GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
- ResourceOwner owner, QueryEnvironment *queryEnv)
+ ResourceOwner owner, QueryEnvironment *queryEnv,
+ List **part_prune_results_list)
{
CachedPlan *plan = NULL;
List *qlist;
bool customplan;
+ List *my_part_prune_results_list;
/* Assert caller is doing things in a sane order */
Assert(plansource->magic == CACHEDPLANSOURCE_MAGIC);
@@ -1160,7 +1232,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
if (!customplan)
{
- if (CheckCachedPlan(plansource))
+ if (CheckCachedPlan(plansource, boundParams,
+ &my_part_prune_results_list))
{
/* We want a generic plan, and we already have a valid one */
plan = plansource->gplan;
@@ -1169,7 +1242,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
else
{
/* Build a new generic plan */
- plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv);
+ plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv,
+ &my_part_prune_results_list);
/* Just make real sure plansource->gplan is clear */
ReleaseGenericPlan(plansource);
/* Link the new generic plan into the plansource */
@@ -1214,7 +1288,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
if (customplan)
{
/* Build a custom plan */
- plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv);
+ plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv,
+ &my_part_prune_results_list);
/* Accumulate total costs of custom plans */
plansource->total_custom_cost += cached_plan_cost(plan, true);
@@ -1246,6 +1321,9 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
plan->is_saved = true;
}
+ if (part_prune_results_list)
+ *part_prune_results_list = my_part_prune_results_list;
+
return plan;
}
@@ -1737,17 +1815,29 @@ QueryListGetPrimaryStmt(List *stmts)
/*
* AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
- * or release them if acquire is false.
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_results_list.
+ *
+ * On return, *lockedRelids_per_stmt will contain a bitmapset for every
+ * PlannedStmt in stmt_list, containing the RT indexes of relation entries
+ * in its range table that were actually locked, or NULL if the PlannedStmt
+ * contains a utility statement.
*/
static void
-AcquireExecutorLocks(List *stmt_list, bool acquire)
+AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+ List **part_prune_results_list,
+ List **lockedRelids_per_stmt)
{
ListCell *lc1;
+ *part_prune_results_list = *lockedRelids_per_stmt = NIL;
foreach(lc1, stmt_list)
{
PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
- ListCell *lc2;
+ List *part_prune_results = NIL;
+ Bitmapset *allLockRelids;
+ Bitmapset *lockedRelids = NULL;
+ int rti;
if (plannedstmt->commandType == CMD_UTILITY)
{
@@ -1761,13 +1851,40 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
Query *query = UtilityContainsQuery(plannedstmt->utilityStmt);
if (query)
- ScanQueryForLocks(query, acquire);
+ ScanQueryForLocks(query, true);
+ *part_prune_results_list = lappend(*part_prune_results_list, NIL);
continue;
}
- foreach(lc2, plannedstmt->rtable)
+ /*
+ * Figure out the set of relations that would need to be locked
+ * before executing the plan.
+ */
+ if (plannedstmt->containsInitialPruning)
{
- RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
+ Bitmapset *scan_leafpart_rtis = NULL;
+
+ /*
+ * Obtain the set of leaf partitions to be locked.
+ *
+ * The following does initial partition pruning using the
+ * PartitionPruneInfos found in plannedstmt->partPruneInfos and
+ * finds leaf partitions that survive that pruning across all the
+ * nodes in the plan tree.
+ */
+ part_prune_results = ExecutorDoInitialPruning(plannedstmt,
+ boundParams,
+ &scan_leafpart_rtis);
+ allLockRelids = bms_union(plannedstmt->minLockRelids,
+ scan_leafpart_rtis);
+ }
+ else
+ allLockRelids = plannedstmt->minLockRelids;
+
+ rti = -1;
+ while ((rti = bms_next_member(allLockRelids, rti)) > 0)
+ {
+ RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
if (rte->rtekind != RTE_RELATION)
continue;
@@ -1778,10 +1895,59 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
* fail if it's been dropped entirely --- we'll just transiently
* acquire a non-conflicting lock.
*/
- if (acquire)
- LockRelationOid(rte->relid, rte->rellockmode);
- else
- UnlockRelationOid(rte->relid, rte->rellockmode);
+ LockRelationOid(rte->relid, rte->rellockmode);
+ lockedRelids = bms_add_member(lockedRelids, rti);
+ }
+
+ *part_prune_results_list = lappend(*part_prune_results_list,
+ part_prune_results);
+ *lockedRelids_per_stmt = lappend(*lockedRelids_per_stmt, lockedRelids);
+ }
+}
+
+/*
+ * ReleaseExecutorLocks
+ * Release locks that were acquired by an earlier call to
+ * AcquireExecutorLocks()
+ */
+static void
+ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt)
+{
+ ListCell *lc1,
+ *lc2;
+
+ forboth(lc1, stmt_list, lc2, lockedRelids_per_stmt)
+ {
+ PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
+ Bitmapset *lockedRelids = lfirst(lc2);
+ int rti;
+
+ if (plannedstmt->commandType == CMD_UTILITY)
+ {
+ /*
+ * Ignore utility statements, except those (such as EXPLAIN) that
+ * contain a parsed-but-not-planned query. Note: it's okay to use
+ * ScanQueryForLocks, even though the query hasn't been through
+ * rule rewriting, because rewriting doesn't change the query
+ * representation.
+ */
+ Query *query = UtilityContainsQuery(plannedstmt->utilityStmt);
+
+ Assert(lockedRelids == NULL);
+ if (query)
+ ScanQueryForLocks(query, false);
+ continue;
+ }
+
+ rti = -1;
+ while ((rti = bms_next_member(lockedRelids, rti)) >= 0)
+ {
+ RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
+
+ Assert(rte->rtekind == RTE_RELATION);
+
+ /* See the comment in AcquireExecutorLocks(). */
+ UnlockRelationOid(rte->relid, rte->rellockmode);
}
}
}
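The reason AcquireExecutorLocks() now reports which relations it locked is that the set is no longer derivable from the range table alone: it depends on what initial pruning let through. A minimal standalone sketch of that acquire/release pairing follows. It is not part of the patch: RelidSet, lock_count[], acquire_locks() and release_locks() are hypothetical, and a per-relation counter stands in for the lock manager.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_RELS 64

/* Hypothetical stand-in for Bitmapset: a single 64-bit word, members 0..63. */
typedef uint64_t RelidSet;

/* Hypothetical stand-in for the lock manager: one counter per RT index. */
static int lock_count[MAX_RELS];

/* Next member after prev, or -2 if none (the bms_next_member convention). */
static int
set_next_member(RelidSet s, int prev)
{
    for (int i = prev + 1; i < MAX_RELS; i++)
    {
        if ((s >> i) & 1)
            return i;
    }
    return -2;
}

/*
 * Lock the always-needed relations plus the leaf partitions that survived
 * initial pruning, and return exactly the set that was locked.
 */
static RelidSet
acquire_locks(RelidSet min_lock, RelidSet surviving_leafparts)
{
    RelidSet all = min_lock | surviving_leafparts;
    RelidSet locked = 0;
    int rti = -1;

    while ((rti = set_next_member(all, rti)) >= 0)
    {
        lock_count[rti]++;
        locked |= UINT64_C(1) << rti;
    }
    return locked;
}

/* Release only what an earlier acquire_locks() call reported as locked. */
static void
release_locks(RelidSet locked)
{
    int rti = -1;

    while ((rti = set_next_member(locked, rti)) >= 0)
    {
        assert(lock_count[rti] > 0);
        lock_count[rti]--;
    }
}
```

Releasing from the recorded set, rather than re-running pruning, guarantees that the release side touches the same relations as the acquire side even if the pruning inputs were to change in between.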
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index 7b1ae6fdcf..5b9098971b 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -303,6 +303,25 @@ PortalDefineQuery(Portal portal,
portal->status = PORTAL_DEFINED;
}
+/*
+ * PortalStorePartitionPruneResults
+ * Copy the given List of Lists of PartitionPruneResults into the
+ * portal's context
+ *
+ * This allows the caller to ensure that the list exists as long as the portal
+ * does.
+ */
+void
+PortalStorePartitionPruneResults(Portal portal, List *part_prune_results_list)
+{
+ MemoryContext oldcxt;
+
+ Assert(PortalIsValid(portal));
+ oldcxt = MemoryContextSwitchTo(portal->portalContext);
+ portal->part_prune_results_list = copyObject(part_prune_results_list);
+ MemoryContextSwitchTo(oldcxt);
+}
+
/*
* PortalReleaseCachedPlan
* Release a portal's reference to its cached plan, if any.
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 9ebde089ae..269cc4d562 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -87,7 +87,9 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
ExplainState *es, const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv);
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt,
+ List *part_prune_results,
+ IntoClause *into,
ExplainState *es, const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv,
const instr_time *planduration,
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 17fabc18c9..4b98d0d2ef 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -45,6 +45,7 @@ extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
* nparts Length of subplan_map[] and subpart_map[].
* subplan_map Subplan index by partition index, or -1.
* subpart_map Subpart index by partition index, or -1.
+ * rti_map Range table index by partition index, or 0.
* present_parts A Bitmapset of the partition indexes that we
* have subplans or subparts for.
* initial_pruning_steps List of PartitionPruneSteps used to
@@ -61,6 +62,7 @@ typedef struct PartitionedRelPruningData
int nparts;
int *subplan_map;
int *subpart_map;
+ Index *rti_map;
Bitmapset *present_parts;
List *initial_pruning_steps;
List *exec_pruning_steps;
@@ -127,5 +129,10 @@ extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
Bitmapset *root_parent_relids,
Bitmapset **initially_valid_subplans);
extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
- bool initial_prune);
+ bool initial_prune,
+ Bitmapset **scan_leafpart_rtis);
+extern Bitmapset *ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt,
+ ParamListInfo params,
+ PartitionPruneInfo *pruneinfo,
+ Bitmapset **scan_leafpart_rtis);
#endif /* EXECPARTITION_H */
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index e79e2c001f..7d4379da7b 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -35,6 +35,8 @@ typedef struct QueryDesc
/* These fields are provided by CreateQueryDesc */
CmdType operation; /* CMD_SELECT, CMD_UPDATE, etc. */
PlannedStmt *plannedstmt; /* planner's output (could be utility, too) */
+ List *part_prune_results; /* ExecutorDoInitialPruning()'s
+ * output for plannedstmt */
const char *sourceText; /* source text of the query */
Snapshot snapshot; /* snapshot to use for query */
Snapshot crosscheck_snapshot; /* crosscheck for RI update/delete */
@@ -57,6 +59,7 @@ typedef struct QueryDesc
/* in pquery.c */
extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
+ List *part_prune_results,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index aaf2bc78b9..32bbbc5927 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -185,6 +185,9 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
/*
* prototypes from functions in execMain.c
*/
+extern List *ExecutorDoInitialPruning(PlannedStmt *plannedstmt,
+ ParamListInfo params,
+ Bitmapset **scan_leafpart_rtis);
extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
extern void ExecutorRun(QueryDesc *queryDesc,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 71248a9466..9c6e8f5e13 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -619,6 +619,7 @@ typedef struct EState
* ExecRowMarks, or NULL if none */
PlannedStmt *es_plannedstmt; /* link to top of plan tree */
List *es_part_prune_infos; /* PlannedStmt.partPruneInfos */
+ List *es_part_prune_results; /* QueryDesc.part_prune_results */
const char *es_sourceText; /* Source text from QueryDesc */
JunkFilter *es_junkFilter; /* top-level junk filter, if any */
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 1f33902947..c2f2544df5 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -218,6 +218,7 @@ extern struct Bitmapset *readBitmapset(void);
extern uintptr_t readDatum(bool typbyval);
extern bool *readBoolCols(int numCols);
extern int *readIntCols(int numCols);
+extern Index *readIndexCols(int numCols);
extern Oid *readOidCols(int numCols);
extern int16 *readAttrNumberCols(int numCols);
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index dbaa9bb54d..e0e5c15b09 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -125,6 +125,18 @@ typedef struct PlannerGlobal
/* List of PartitionPruneInfo contained in the plan */
List *partPruneInfos;
+ /*
+ * Do any of those PartitionPruneInfos have initial pruning steps in them?
+ */
+ bool containsInitialPruning;
+
+ /*
+ * Indexes of all range table entries minus indexes of range table entries
+ * of the leaf partitions scanned by prunable subplans; see
+ * AcquireExecutorLocks()
+ */
+ Bitmapset *minLockRelids;
+
/* OIDs of relations the plan depends on */
List *relationOids;
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index c36a15bd09..714e2cf2c7 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -73,8 +73,17 @@ typedef struct PlannedStmt
List *partPruneInfos; /* List of PartitionPruneInfo contained in the
* plan */
+ bool containsInitialPruning; /* Do any of those PartitionPruneInfos
+ * have initial pruning steps in them?
+ */
+
List *rtable; /* list of RangeTblEntry nodes */
+ Bitmapset *minLockRelids; /* Indexes of all range table entries minus
+ * indexes of range table entries of the leaf
+ * partitions scanned by prunable subplans;
+ * see AcquireExecutorLocks() */
+
/* rtable indexes of target relations for INSERT/UPDATE/DELETE/MERGE */
List *resultRelations; /* integer list of RT indexes, or NIL */
@@ -1414,6 +1423,13 @@ typedef struct PlanRowMark
* prune_infos List of Lists containing PartitionedRelPruneInfo nodes,
* one sublist per run-time-prunable partition hierarchy
* appearing in the parent plan node's subplans.
+ *
+ * needs_init_pruning Does any of the PartitionedRelPruneInfos in
+ * prune_infos have its initial_pruning_steps set?
+ *
+ * needs_exec_pruning Does any of the PartitionedRelPruneInfos in
+ * prune_infos have its exec_pruning_steps set?
+ *
* other_subplans Indexes of any subplans that are not accounted for
* by any of the PartitionedRelPruneInfo nodes in
* "prune_infos". These subplans must not be pruned.
@@ -1425,6 +1441,8 @@ typedef struct PartitionPruneInfo
NodeTag type;
Bitmapset *root_parent_relids;
List *prune_infos;
+ bool needs_init_pruning;
+ bool needs_exec_pruning;
Bitmapset *other_subplans;
} PartitionPruneInfo;
@@ -1469,6 +1487,9 @@ typedef struct PartitionedRelPruneInfo
/* relation OID by partition index, or 0 */
Oid *relid_map pg_node_attr(array_size(nparts));
+ /* Range table index by partition index, or 0. */
+ Index *rti_map pg_node_attr(array_size(nparts));
+
/*
* initial_pruning_steps shows how to prune during executor startup (i.e.,
* without use of any PARAM_EXEC Params); it is NIL if no startup pruning
@@ -1553,6 +1574,31 @@ typedef struct PartitionPruneStepCombine
List *source_stepids;
} PartitionPruneStepCombine;
+/*----------------
+ * PartitionPruneResult
+ *
+ * The result of performing ExecPartitionDoInitialPruning() on a given
+ * PartitionPruneInfo.
+ *
+ * valid_subplan_offs contains the indexes of subplans remaining after
+ * performing initial pruning by calling ExecFindMatchingSubPlans() on the
+ * PartitionPruneInfo.
+ *
+ * This is used to store the result of initial partition pruning that is
+ * performed before the execution has started. A module that needs to do so
+ * should call ExecutorDoInitialPruning() on a given PlannedStmt, which
+ * returns a List of PartitionPruneResult containing an entry for each
+ * PartitionPruneInfo present in PlannedStmt.partPruneInfos. The module
+ * should then pass that list, along with the PlannedStmt, to the executor,
+ * so that it can reuse the result of initial partition pruning when
+ * initializing the subplans for execution.
+ */
+typedef struct PartitionPruneResult
+{
+ NodeTag type;
+
+ Bitmapset *valid_subplan_offs;
+} PartitionPruneResult;
/*
* Plan invalidation info
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 0499635f59..32579d4788 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -220,7 +220,8 @@ extern List *CachedPlanGetTargetList(CachedPlanSource *plansource,
extern CachedPlan *GetCachedPlan(CachedPlanSource *plansource,
ParamListInfo boundParams,
ResourceOwner owner,
- QueryEnvironment *queryEnv);
+ QueryEnvironment *queryEnv,
+ List **part_prune_results_list);
extern void ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner);
extern bool CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index aeddbdafe5..1901fc5f28 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -138,6 +138,7 @@ typedef struct PortalData
QueryCompletion qc; /* command completion data for executed query */
List *stmts; /* list of PlannedStmts */
CachedPlan *cplan; /* CachedPlan, if stmts are from one */
+ List *part_prune_results_list; /* List of Lists of PartitionPruneResults */
ParamListInfo portalParams; /* params to pass to query */
QueryEnvironment *queryEnv; /* environment for query */
@@ -242,6 +243,8 @@ extern void PortalDefineQuery(Portal portal,
CommandTag commandTag,
List *stmts,
CachedPlan *cplan);
+extern void PortalStorePartitionPruneResults(Portal portal,
+ List *part_prune_results_list);
extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
extern void PortalCreateHoldStore(Portal portal);
extern void PortalHashTableDeleteAll(void);
--
2.35.3
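Editorial sketch, not part of the patch: the header changes above add
rti_map and the scan_leafpart_rtis output argument so that initial pruning
can report which leaf partitions still need to be locked. The toy C model
below illustrates that bookkeeping. It assumes 64-bit masks in place of
Bitmapset, takes the set of surviving partitions as an input rather than
evaluating pruning steps, and uses hypothetical names (ToySet, ToyPruneInfo,
toy_initial_pruning, toy_lock_relids) that do not exist in PostgreSQL.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for Bitmapset: bit i set means member i is present. */
typedef uint64_t ToySet;

#define TOY_NPARTS 4

/* Hypothetical, much simplified analogue of PartitionedRelPruningData. */
typedef struct ToyPruneInfo
{
    int      nparts;
    int      subplan_map[TOY_NPARTS]; /* subplan index by partition index, or -1 */
    unsigned rti_map[TOY_NPARTS];     /* range table index by partition index, or 0 */
} ToyPruneInfo;

/*
 * Map the partitions that survived initial pruning to subplan offsets, and
 * add the RT indexes of the surviving leaf partitions to *scan_leafpart_rtis.
 * Returns the set of surviving subplan offsets.
 */
static ToySet
toy_initial_pruning(const ToyPruneInfo *info, ToySet surviving_parts,
                    ToySet *scan_leafpart_rtis)
{
    ToySet valid_subplan_offs = 0;

    for (int i = 0; i < info->nparts; i++)
    {
        if ((surviving_parts & (UINT64_C(1) << i)) == 0)
            continue;           /* partition was pruned */
        if (info->subplan_map[i] < 0)
            continue;           /* no subplan for this partition */

        valid_subplan_offs |= UINT64_C(1) << info->subplan_map[i];
        if (info->rti_map[i] != 0)
            *scan_leafpart_rtis |= UINT64_C(1) << info->rti_map[i];
    }

    return valid_subplan_offs;
}

/* Relations to lock: the always-locked set plus the surviving leaf partitions. */
static ToySet
toy_lock_relids(ToySet min_lock_relids, ToySet scan_leafpart_rtis)
{
    return min_lock_relids | scan_leafpart_rtis;
}
```

For instance, with subplan_map {0, 1, 2, -1}, rti_map {2, 3, 4, 0} and only
partition 1 surviving, toy_initial_pruning() returns a set containing
subplan offset 1 and adds RT index 3 to the leaf set. Combining that with a
minimal lock set containing only RT index 1 yields locks on RT indexes 1
and 3.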
[application/octet-stream] v27-0002-Add-root_parent_relids-to-PartitionPruneResult.patch (3.4K, 3-v27-0002-Add-root_parent_relids-to-PartitionPruneResult.patch)
From 4ef1d918405a7c7c63a3e7376ccef57cf844796d Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Fri, 2 Dec 2022 19:32:14 +0900
Subject: [PATCH v27 2/2] Add root_parent_relids to PartitionPruneResult
It's the same as the corresponding PartitionPruneInfo's root_parent_relids.
Like PartitionPruneInfo.root_parent_relids, it's there for
cross-checking that a PartitionPruneResult found at a given plan node's
part_prune_index actually matches the plan node.
---
src/backend/executor/execMain.c | 2 ++
src/backend/executor/execPartition.c | 13 +++++++++++--
src/include/nodes/plannodes.h | 7 +++++++
3 files changed, 20 insertions(+), 2 deletions(-)
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 4d8c8e2e43..3293a65d15 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -147,6 +147,8 @@ ExecutorDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
PartitionPruneInfo *pruneinfo = lfirst(lc);
PartitionPruneResult *pruneresult = makeNode(PartitionPruneResult);
+ pruneresult->root_parent_relids =
+ bms_copy(pruneinfo->root_parent_relids);
pruneresult->valid_subplan_offs =
ExecPartitionDoInitialPruning(plannedstmt, params, pruneinfo,
scan_leafpart_rtis);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index b0eb15b982..2eadc30ec8 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -1843,8 +1843,17 @@ ExecInitPartitionPruning(PlanState *planstate,
*/
if (estate->es_part_prune_results)
{
- pruneresult = list_nth(estate->es_part_prune_results, part_prune_index);
- Assert(IsA(pruneresult, PartitionPruneResult));
+ pruneresult = list_nth_node(PartitionPruneResult,
+ estate->es_part_prune_results,
+ part_prune_index);
+ if (!bms_equal(pruneresult->root_parent_relids, pruneinfo->root_parent_relids))
+ ereport(ERROR,
+ errcode(ERRCODE_INTERNAL_ERROR),
+ errmsg_internal("mismatching PartitionPruneInfo and PartitionPruneResult at part_prune_index %d",
+ part_prune_index),
+ errdetail_internal("pruneresult relids %s, pruneinfo relids %s",
+ bmsToString(pruneresult->root_parent_relids),
+ bmsToString(pruneinfo->root_parent_relids)));
}
if (pruneresult == NULL || pruneinfo->needs_exec_pruning)
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 714e2cf2c7..ed664c5469 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -1580,6 +1580,12 @@ typedef struct PartitionPruneStepCombine
* The result of performing ExecPartitionDoInitialPruning() on a given
* PartitionPruneInfo.
*
+ * root_parent_relids is the same as PartitionPruneInfo.root_parent_relids.
+ * It's there for cross-checking in ExecInitPartitionPruning() that the
+ * PartitionPruneResult and the PartitionPruneInfo at a given index in
+ * EState.es_part_prune_results and EState.es_part_prune_infos, respectively,
+ * belong to the same parent plan node.
+ *
* valid_subplan_offs contains the indexes of subplans remaining after
* performing initial pruning by calling ExecFindMatchingSubPlans() on the
* PartitionPruneInfo.
@@ -1597,6 +1603,7 @@ typedef struct PartitionPruneResult
{
NodeTag type;
+ Bitmapset *root_parent_relids;
Bitmapset *valid_subplan_offs;
} PartitionPruneResult;
--
2.35.3
* Re: generic plans and "initial" pruning
@ 2022-12-05 06:08 Amit Langote <[email protected]>
parent: Amit Langote <[email protected]>
0 siblings, 1 reply; 71+ messages in thread
From: Amit Langote @ 2022-12-05 06:08 UTC (permalink / raw)
To: Alvaro Herrera <[email protected]>; +Cc: Robert Haas <[email protected]>; Jacob Champion <[email protected]>; David Rowley <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers
On Mon, Dec 5, 2022 at 12:00 PM Amit Langote <[email protected]> wrote:
> On Fri, Dec 2, 2022 at 7:40 PM Amit Langote <[email protected]> wrote:
> > Thought it might be good for PartitionPruneResult to also have
> > root_parent_relids that matches with the corresponding
> > PartitionPruneInfo. ExecInitPartitionPruning() does a sanity check
> > that the root_parent_relids of a given pair of PartitionPrune{Info |
> > Result} match.
> >
> > Posting the patch separately as the attached 0002, just in case you
> > might think that the extra cross-checking would be an overkill.
>
> Rebased over 92c4dafe1eed and fixed some factual mistakes in the
> comment above ExecutorDoInitialPruning().
Sorry, I had forgotten to git-add hunks including some cosmetic
changes in that one. Here's another version.
--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com
Attachments:
[application/octet-stream] v28-0002-Add-root_parent_relids-to-PartitionPruneResult.patch (3.3K, 2-v28-0002-Add-root_parent_relids-to-PartitionPruneResult.patch)
From 04f156396309f8c34a853ce1ad4e293fe4e2c4a2 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Fri, 2 Dec 2022 19:32:14 +0900
Subject: [PATCH v28 2/2] Add root_parent_relids to PartitionPruneResult
It's the same as the corresponding PartitionPruneInfo's root_parent_relids.
Like PartitionPruneInfo.root_parent_relids, it's there for
cross-checking that a PartitionPruneResult found at a given plan node's
part_prune_index actually matches the plan node.
---
src/backend/executor/execMain.c | 2 ++
src/backend/executor/execPartition.c | 10 ++++++++++
src/include/nodes/plannodes.h | 7 +++++++
3 files changed, 19 insertions(+)
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index f15265716a..554623751b 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -147,6 +147,8 @@ ExecutorDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
PartitionPruneInfo *pruneinfo = lfirst_node(PartitionPruneInfo, lc);
PartitionPruneResult *pruneresult = makeNode(PartitionPruneResult);
+ pruneresult->root_parent_relids =
+ bms_copy(pruneinfo->root_parent_relids);
pruneresult->valid_subplan_offs =
ExecPartitionDoInitialPruning(plannedstmt, params, pruneinfo,
scan_leafpart_rtis);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index bc8331a222..2eadc30ec8 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -1842,9 +1842,19 @@ ExecInitPartitionPruning(PlanState *planstate,
* is set.
*/
if (estate->es_part_prune_results)
+ {
pruneresult = list_nth_node(PartitionPruneResult,
estate->es_part_prune_results,
part_prune_index);
+ if (!bms_equal(pruneresult->root_parent_relids, pruneinfo->root_parent_relids))
+ ereport(ERROR,
+ errcode(ERRCODE_INTERNAL_ERROR),
+ errmsg_internal("mismatching PartitionPruneInfo and PartitionPruneResult at part_prune_index %d",
+ part_prune_index),
+ errdetail_internal("pruneresult relids %s, pruneinfo relids %s",
+ bmsToString(pruneresult->root_parent_relids),
+ bmsToString(pruneinfo->root_parent_relids)));
+ }
if (pruneresult == NULL || pruneinfo->needs_exec_pruning)
{
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 714e2cf2c7..ed664c5469 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -1580,6 +1580,12 @@ typedef struct PartitionPruneStepCombine
* The result of performing ExecPartitionDoInitialPruning() on a given
* PartitionPruneInfo.
*
+ * root_parent_relids is the same as PartitionPruneInfo.root_parent_relids.
+ * It's there for cross-checking in ExecInitPartitionPruning() that the
+ * PartitionPruneResult and the PartitionPruneInfo at a given index in
+ * EState.es_part_prune_results and EState.es_part_prune_infos, respectively,
+ * belong to the same parent plan node.
+ *
* valid_subplan_offs contains the indexes of subplans remaining after
* performing initial pruning by calling ExecFindMatchingSubPlans() on the
* PartitionPruneInfo.
@@ -1597,6 +1603,7 @@ typedef struct PartitionPruneResult
{
NodeTag type;
+ Bitmapset *root_parent_relids;
Bitmapset *valid_subplan_offs;
} PartitionPruneResult;
--
2.35.3
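Editorial sketch, not part of the patch: the 0002 patch above pairs each
PartitionPruneResult with the PartitionPruneInfo at the same list index and
reports an internal error when their root_parent_relids differ. The toy C
model below shows the shape of that check. It assumes 64-bit masks in place
of Bitmapset and returns a status code instead of calling ereport(); all
names in it (DemoSet, DemoPruneInfo, DemoPruneResult, DemoCheck,
demo_check_prune_result) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for Bitmapset: bit i set means RT index i is a member. */
typedef uint64_t DemoSet;

typedef struct DemoPruneInfo
{
    DemoSet root_parent_relids;
} DemoPruneInfo;

typedef struct DemoPruneResult
{
    DemoSet root_parent_relids; /* copied from the matching DemoPruneInfo */
    DemoSet valid_subplan_offs; /* subplans that survived initial pruning */
} DemoPruneResult;

typedef enum DemoCheck
{
    DEMO_NO_RESULT,             /* no pre-execution pruning was done */
    DEMO_MATCH,                 /* result belongs to this plan node */
    DEMO_MISMATCH               /* result was built for a different node */
} DemoCheck;

/*
 * Compare the result's root_parent_relids with the info's. A NULL result
 * models the case where es_part_prune_results is empty, in which the
 * executor would do the initial pruning itself.
 */
static DemoCheck
demo_check_prune_result(const DemoPruneInfo *info,
                        const DemoPruneResult *result)
{
    if (result == NULL)
        return DEMO_NO_RESULT;

    return (result->root_parent_relids == info->root_parent_relids)
        ? DEMO_MATCH
        : DEMO_MISMATCH;
}
```

In this model a caller would reuse valid_subplan_offs only on DEMO_MATCH,
and would treat DEMO_MISMATCH as the internal-error case.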
[application/octet-stream] v28-0001-Optimize-AcquireExecutorLocks-by-locking-only-un.patch (83.0K, 3-v28-0001-Optimize-AcquireExecutorLocks-by-locking-only-un.patch)
From 28bdd07ae15228bc3173257ab5968864455dda16 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Wed, 22 Dec 2021 16:55:17 +0900
Subject: [PATCH v28 1/2] Optimize AcquireExecutorLocks() by locking only
unpruned partitions
This commit teaches AcquireExecutorLocks() to perform initial
partition pruning to notionally eliminate the subnodes contained in a
generic cached plan that need not be initialized during the actual
execution of the plan and skip locking the partitions scanned by those
subnodes.
The result of performing initial partition pruning this way, before
execution has started, is passed to the executor as PartitionPruneResult
nodes, which callers that obtained the plan from plancache.c supply along
with the PlannedStmt. It is NULL when the plan is obtained by calling the
planner directly or when the plan obtained from plancache.c is not a
generic one.
---
src/backend/commands/copyto.c | 2 +-
src/backend/commands/createas.c | 2 +-
src/backend/commands/explain.c | 7 +-
src/backend/commands/extension.c | 2 +-
src/backend/commands/matview.c | 2 +-
src/backend/commands/prepare.c | 26 ++-
src/backend/executor/README | 36 ++++
src/backend/executor/execMain.c | 53 ++++++
src/backend/executor/execParallel.c | 26 ++-
src/backend/executor/execPartition.c | 237 +++++++++++++++++++++----
src/backend/executor/execUtils.c | 1 +
src/backend/executor/functions.c | 2 +-
src/backend/executor/nodeAppend.c | 11 +-
src/backend/executor/nodeMergeAppend.c | 5 +-
src/backend/executor/spi.c | 27 ++-
src/backend/nodes/readfuncs.c | 8 +-
src/backend/optimizer/plan/planner.c | 2 +
src/backend/optimizer/plan/setrefs.c | 46 +++++
src/backend/partitioning/partprune.c | 41 ++++-
src/backend/tcop/postgres.c | 8 +-
src/backend/tcop/pquery.c | 29 ++-
src/backend/utils/cache/plancache.c | 208 +++++++++++++++++++---
src/backend/utils/mmgr/portalmem.c | 19 ++
src/include/commands/explain.h | 4 +-
src/include/executor/execPartition.h | 9 +-
src/include/executor/execdesc.h | 3 +
src/include/executor/executor.h | 3 +
src/include/nodes/execnodes.h | 1 +
src/include/nodes/nodes.h | 1 +
src/include/nodes/pathnodes.h | 12 ++
src/include/nodes/plannodes.h | 46 +++++
src/include/utils/plancache.h | 3 +-
src/include/utils/portal.h | 3 +
33 files changed, 787 insertions(+), 98 deletions(-)
diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index f26cc0d162..401a2280a3 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -558,7 +558,7 @@ BeginCopyTo(ParseState *pstate,
((DR_copy *) dest)->cstate = cstate;
/* Create a QueryDesc requesting no output */
- cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ cstate->queryDesc = CreateQueryDesc(plan, NIL, pstate->p_sourcetext,
GetActiveSnapshot(),
InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 152c29b551..942449544c 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -325,7 +325,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ queryDesc = CreateQueryDesc(plan, NIL, pstate->p_sourcetext,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index f86983c660..2f2b558608 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -407,7 +407,7 @@ ExplainOneQuery(Query *query, int cursorOptions,
}
/* run it (if needed) and produce output */
- ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+ ExplainOnePlan(plan, NIL, into, es, queryString, params, queryEnv,
&planduration, (es->buffers ? &bufusage : NULL));
}
}
@@ -515,7 +515,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
* to call it.
*/
void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, List *part_prune_results,
+ IntoClause *into, ExplainState *es,
const char *queryString, ParamListInfo params,
QueryEnvironment *queryEnv, const instr_time *planduration,
const BufferUsage *bufusage)
@@ -563,7 +564,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
dest = None_Receiver;
/* Create a QueryDesc for the query */
- queryDesc = CreateQueryDesc(plannedstmt, queryString,
+ queryDesc = CreateQueryDesc(plannedstmt, part_prune_results, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, instrument_option);
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index cf1b1ca571..904cbcba4a 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -779,7 +779,7 @@ execute_sql_string(const char *sql)
{
QueryDesc *qdesc;
- qdesc = CreateQueryDesc(stmt,
+ qdesc = CreateQueryDesc(stmt, NIL,
sql,
GetActiveSnapshot(), NULL,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 9ac0383459..65c8d0aa59 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -408,7 +408,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, queryString,
+ queryDesc = CreateQueryDesc(plan, NIL, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 9e29584d93..29b45539d3 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -155,6 +155,7 @@ ExecuteQuery(ParseState *pstate,
PreparedStatement *entry;
CachedPlan *cplan;
List *plan_list;
+ List *part_prune_results_list;
ParamListInfo paramLI = NULL;
EState *estate = NULL;
Portal portal;
@@ -193,7 +194,10 @@ ExecuteQuery(ParseState *pstate,
entry->plansource->query_string);
/* Replan if needed, and increment plan refcount for portal */
- cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL);
+ cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL,
+ &part_prune_results_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_results_list));
plan_list = cplan->stmt_list;
/*
@@ -207,6 +211,9 @@ ExecuteQuery(ParseState *pstate,
plan_list,
cplan);
+ /* Copy Lists of PartitionPruneResults into the portal's context. */
+ PortalStorePartitionPruneResults(portal, part_prune_results_list);
+
/*
* For CREATE TABLE ... AS EXECUTE, we must verify that the prepared
* statement is one that produces tuples. Currently we insist that it be
@@ -576,7 +583,9 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
const char *query_string;
CachedPlan *cplan;
List *plan_list;
- ListCell *p;
+ List *part_prune_results_list;
+ ListCell *p,
+ *pp;
ParamListInfo paramLI = NULL;
EState *estate = NULL;
instr_time planstart;
@@ -619,7 +628,10 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
/* Replan if needed, and acquire a transient refcount */
cplan = GetCachedPlan(entry->plansource, paramLI,
- CurrentResourceOwner, queryEnv);
+ CurrentResourceOwner, queryEnv,
+ &part_prune_results_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_results_list));
INSTR_TIME_SET_CURRENT(planduration);
INSTR_TIME_SUBTRACT(planduration, planstart);
@@ -634,13 +646,15 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
plan_list = cplan->stmt_list;
/* Explain each query */
- foreach(p, plan_list)
+ forboth(p, plan_list, pp, part_prune_results_list)
{
PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
+ List *part_prune_results = lfirst_node(List, pp);
if (pstmt->commandType != CMD_UTILITY)
- ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv,
- &planduration, (es->buffers ? &bufusage : NULL));
+ ExplainOnePlan(pstmt, part_prune_results, into, es, query_string,
+ paramLI, queryEnv, &planduration,
+ (es->buffers ? &bufusage : NULL));
else
ExplainOneUtility(pstmt->utilityStmt, into, es, query_string,
paramLI, queryEnv);
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 17775a49e2..7f8cf1494f 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -65,6 +65,38 @@ found there. This currently only occurs for Append and MergeAppend nodes. In
this case the non-required subplans are ignored and the executor state's
subnode array will become out of sequence to the plan's subplan list.
+The so-called execution time pruning may also occur even before the execution
+has actually started. One case where that occurs is when a cached generic
+plan is being validated for execution by plancache.c:GetCachedPlan(), which
+works by locking all the relations that will be scanned by that plan. If the
+generic plan contains nodes that can perform execution time partition pruning
+(that is, contain a PartitionPruneInfo), a subset of pruning steps contained
+in a given node's PartitionPruneInfo that do not depend on the execution
+actually having started (called "initial" pruning steps) are performed as part
+of the plan validation step, by calling ExecutorDoInitialPruning(). That
+returns the minimal set of child subplans that satisfy those initial pruning
+steps contained in each PartitionPruneInfo. AcquireExecutorLocks() will then
+lock only the relations scanned by those subplans, in addition to those present
+in PlannedStmt.minLockRelids. Note that the subplans are not really pruned as
+in being removed from the plan tree, so care is needed by the downstream
+users of such a plan that has undergone pre-execution initial pruning.
+
+To prevent the executor and any third party execution code that can look at
+the plan tree from trying to execute the subplans that were pruned as
+described above, the result of that pruning is passed to the executor as a
+List of PartitionPruneResult nodes via the QueryDesc, which is subsequently
+assigned to EState.es_part_prune_results. Each PartitionPruneResult therein
+consists of the set of indexes of surviving subplans in the respective parent
+plan node's (the one to which the corresponding PartitionPruneInfo belongs)
+list of child subplans, saved as a bitmapset valid_subplan_offs. The executor
+or any third party execution code working on a generic plan should not
+re-evaluate the set of initially valid subplans for a given plan node by
+redoing the initial pruning if a PartitionPruneResult belonging to that plan
+node is present in es_part_prune_results. Note that that is not simply a
+performance optimization, because such re-evaluation of the pruning steps may
+very well end up resulting in a different set of initially valid subplans,
+containing some whose relations were not locked by AcquireExecutorLocks().
+
Each Plan node may have expression trees associated with it, to represent
its target list, qualification conditions, etc. These trees are also
read-only to the executor, but the executor state for expression evaluation
@@ -286,6 +318,10 @@ Query Processing Control Flow
This is a sketch of control flow for full query processing:
+ [ ExecutorDoInitialPruning ] --- an optional step to perform initial
+ partition pruning on the plan tree the result of which is passed
+ to the executor via QueryDesc
+
CreateQueryDesc
ExecutorStart
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 12ff4f3de5..f15265716a 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -49,6 +49,7 @@
#include "commands/matview.h"
#include "commands/trigger.h"
#include "executor/execdebug.h"
+#include "executor/execPartition.h"
#include "executor/nodeSubplan.h"
#include "foreign/fdwapi.h"
#include "jit/jit.h"
@@ -104,6 +105,56 @@ static void EvalPlanQualStart(EPQState *epqstate, Plan *planTree);
/* end of local decls */
+/* ----------------------------------------------------------------
+ * ExecutorDoInitialPruning
+ *
+ * For each plan tree node that has been assigned a PartitionPruneInfo,
+ * this performs initial partition pruning using the information contained
+ * therein to determine the set of child subplans that satisfy the initial
+ * pruning steps, to be returned as a bitmapset of their indexes in the
+ * node's list of child subplans (for example, an Append's appendplans).
+ *
+ * Return value is a List of PartitionPruneResult nodes, one for each
+ * PartitionPruneInfo found in plannedstmt->partPruneInfos, each
+ * containing a bitmapset of the indexes of unpruned child subplans.
+ * A bitmapset of the RT indexes of the leaf partitions scanned by those
+ * subplans is returned in *scan_leafpart_rtis, which is shared across all
+ * of those PartitionPruneResults.
+ *
+ * The executor must see exactly the same set of subplans as valid for
+ * execution when doing ExecInitNode() on the plan nodes whose
+ * PartitionPruneInfos are processed here. So, it must get the set from the
+ * aforementioned PartitionPruneResult, instead of computing it all over
+ * again by redoing the initial pruning. It's the caller's job to pass the
+ * PartitionPruneResult to the executor.
+ *
+ * Note: Partitioned tables mentioned in PartitionedRelPruneInfo nodes that
+ * drive the pruning will be locked before doing the pruning.
+ * ----------------------------------------------------------------
+ */
+List *
+ExecutorDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
+ Bitmapset **scan_leafpart_rtis)
+{
+ List *part_prune_results = NIL;
+ ListCell *lc;
+
+ /* Only get here if there is any pruning to do. */
+ Assert(plannedstmt->containsInitialPruning);
+
+ foreach(lc, plannedstmt->partPruneInfos)
+ {
+ PartitionPruneInfo *pruneinfo = lfirst_node(PartitionPruneInfo, lc);
+ PartitionPruneResult *pruneresult = makeNode(PartitionPruneResult);
+
+ pruneresult->valid_subplan_offs =
+ ExecPartitionDoInitialPruning(plannedstmt, params, pruneinfo,
+ scan_leafpart_rtis);
+ part_prune_results = lappend(part_prune_results, pruneresult);
+ }
+
+ return part_prune_results;
+}
/* ----------------------------------------------------------------
* ExecutorStart
@@ -806,6 +857,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
{
CmdType operation = queryDesc->operation;
PlannedStmt *plannedstmt = queryDesc->plannedstmt;
+ List *part_prune_results = queryDesc->part_prune_results;
Plan *plan = plannedstmt->planTree;
List *rangeTable = plannedstmt->rtable;
EState *estate = queryDesc->estate;
@@ -826,6 +878,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
estate->es_plannedstmt = plannedstmt;
estate->es_part_prune_infos = plannedstmt->partPruneInfos;
+ estate->es_part_prune_results = part_prune_results;
/*
* Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index aca0c6f323..917079a034 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -66,6 +66,7 @@
#define PARALLEL_KEY_QUERY_TEXT UINT64CONST(0xE000000000000008)
#define PARALLEL_KEY_JIT_INSTRUMENTATION UINT64CONST(0xE000000000000009)
#define PARALLEL_KEY_WAL_USAGE UINT64CONST(0xE00000000000000A)
+#define PARALLEL_KEY_PARTITION_PRUNE_RESULTS UINT64CONST(0xE00000000000000B)
#define PARALLEL_TUPLE_QUEUE_SIZE 65536
@@ -182,6 +183,7 @@ ExecSerializePlan(Plan *plan, EState *estate)
pstmt->transientPlan = false;
pstmt->dependsOnRole = false;
pstmt->parallelModeNeeded = false;
+ pstmt->containsInitialPruning = false;
pstmt->planTree = plan;
pstmt->partPruneInfos = estate->es_part_prune_infos;
pstmt->rtable = estate->es_range_table;
@@ -597,12 +599,15 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
FixedParallelExecutorState *fpes;
char *pstmt_data;
char *pstmt_space;
+ char *part_prune_results_data;
+ char *part_prune_results_space;
char *paramlistinfo_space;
BufferUsage *bufusage_space;
WalUsage *walusage_space;
SharedExecutorInstrumentation *instrumentation = NULL;
SharedJitInstrumentation *jit_instrumentation = NULL;
int pstmt_len;
+ int part_prune_results_len;
int paramlistinfo_len;
int instrumentation_len = 0;
int jit_instrumentation_len = 0;
@@ -631,6 +636,7 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
/* Fix up and serialize plan to be sent to workers. */
pstmt_data = ExecSerializePlan(planstate->plan, estate);
+ part_prune_results_data = nodeToString(estate->es_part_prune_results);
/* Create a parallel context. */
pcxt = CreateParallelContext("postgres", "ParallelQueryMain", nworkers);
@@ -657,6 +663,11 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
shm_toc_estimate_chunk(&pcxt->estimator, pstmt_len);
shm_toc_estimate_keys(&pcxt->estimator, 1);
+ /* Estimate space for serialized List of PartitionPruneResult. */
+ part_prune_results_len = strlen(part_prune_results_data) + 1;
+ shm_toc_estimate_chunk(&pcxt->estimator, part_prune_results_len);
+ shm_toc_estimate_keys(&pcxt->estimator, 1);
+
/* Estimate space for serialized ParamListInfo. */
paramlistinfo_len = EstimateParamListSpace(estate->es_param_list_info);
shm_toc_estimate_chunk(&pcxt->estimator, paramlistinfo_len);
@@ -751,6 +762,12 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
memcpy(pstmt_space, pstmt_data, pstmt_len);
shm_toc_insert(pcxt->toc, PARALLEL_KEY_PLANNEDSTMT, pstmt_space);
+ /* Store serialized List of PartitionPruneResult */
+ part_prune_results_space = shm_toc_allocate(pcxt->toc, part_prune_results_len);
+ memcpy(part_prune_results_space, part_prune_results_data, part_prune_results_len);
+ shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARTITION_PRUNE_RESULTS,
+ part_prune_results_space);
+
/* Store serialized ParamListInfo. */
paramlistinfo_space = shm_toc_allocate(pcxt->toc, paramlistinfo_len);
shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARAMLISTINFO, paramlistinfo_space);
@@ -1232,8 +1249,10 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
int instrument_options)
{
char *pstmtspace;
+ char *part_prune_results_space;
char *paramspace;
PlannedStmt *pstmt;
+ List *part_prune_results;
ParamListInfo paramLI;
char *queryString;
@@ -1244,12 +1263,17 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
pstmtspace = shm_toc_lookup(toc, PARALLEL_KEY_PLANNEDSTMT, false);
pstmt = (PlannedStmt *) stringToNode(pstmtspace);
+ /* Reconstruct leader-supplied PartitionPruneResult. */
+ part_prune_results_space =
+ shm_toc_lookup(toc, PARALLEL_KEY_PARTITION_PRUNE_RESULTS, false);
+ part_prune_results = (List *) stringToNode(part_prune_results_space);
+
/* Reconstruct ParamListInfo. */
paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
paramLI = RestoreParamList(&paramspace);
/* Create a QueryDesc for the query. */
- return CreateQueryDesc(pstmt,
+ return CreateQueryDesc(pstmt, part_prune_results,
queryString,
GetActiveSnapshot(), InvalidSnapshot,
receiver, paramLI, NULL, instrument_options);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 88d0ea3adb..bc8331a222 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -25,6 +25,7 @@
#include "mb/pg_wchar.h"
#include "miscadmin.h"
#include "nodes/makefuncs.h"
+#include "parser/parsetree.h"
#include "partitioning/partbounds.h"
#include "partitioning/partdesc.h"
#include "partitioning/partprune.h"
@@ -185,7 +186,11 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
static PartitionPruneState *CreatePartitionPruneState(PlanState *planstate,
- PartitionPruneInfo *pruneinfo);
+ PartitionPruneInfo *pruneinfo,
+ bool consider_initial_steps,
+ bool consider_exec_steps,
+ List *rtable, ExprContext *econtext,
+ PartitionDirectory partdir);
static void InitPartitionPruneContext(PartitionPruneContext *context,
List *pruning_steps,
PartitionDesc partdesc,
@@ -198,7 +203,8 @@ static void PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
- Bitmapset **validsubplans);
+ Bitmapset **validsubplans,
+ Bitmapset **scan_leafpart_rtis);
/*
@@ -1749,8 +1755,10 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
* considered to be a stable expression, it can change value from one plan
* node scan to the next during query execution. Stable comparison
* expressions that don't involve such Params allow partition pruning to be
- * done once during executor startup. Expressions that do involve such Params
- * require us to prune separately for each scan of the parent plan node.
+ * done once during executor startup or in ExecutorDoInitialPruning(), which
+ * runs as part of performing AcquireExecutorLocks() on a given plan tree.
+ * Expressions that do involve such Params require us to prune separately for
+ * each scan of the parent plan node.
*
* Note that pruning away unneeded subplans during executor startup has the
* added benefit of not having to initialize the unneeded subplans at all.
@@ -1767,6 +1775,13 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
* account for initial pruning possibly having eliminated some of the
* subplans.
*
+ * ExecPartitionDoInitialPruning:
+ * Do initial pruning with the information contained in a given
+ * PartitionPruneInfo to determine the minimal set of child subplans
+ * of the parent plan node (the one to which the PartitionPruneInfo
+ * belongs) that must be executed, and also the set of RT indexes of
+ * the leaf partitions that will be scanned with those subplans.
+ *
* ExecFindMatchingSubPlans:
* Returns indexes of matching subplans after evaluating the expressions
* that are safe to evaluate at a given point. This function is first
@@ -1787,8 +1802,9 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
*
* On return, *initially_valid_subplans is assigned the set of indexes of
* child subplans that must be initialized along with the parent plan node.
- * Initial pruning is performed here if needed and in that case only the
- * surviving subplans' indexes are added.
+ * Initial pruning is performed here if needed (unless it has already been done
+ * by ExecutorDoInitialPruning()), and in that case only the surviving
+ * subplans' indexes are added.
*
* If subplans are indeed pruned, subplan_map arrays contained in the returned
* PartitionPruneState are re-sequenced to not count those, though only if the
@@ -1801,9 +1817,10 @@ ExecInitPartitionPruning(PlanState *planstate,
Bitmapset *root_parent_relids,
Bitmapset **initially_valid_subplans)
{
- PartitionPruneState *prunestate;
+ PartitionPruneState *prunestate = NULL;
EState *estate = planstate->state;
PartitionPruneInfo *pruneinfo;
+ PartitionPruneResult *pruneresult = NULL;
/* Obtain the pruneinfo we need, and make sure it's the right one */
pruneinfo = list_nth(estate->es_part_prune_infos, part_prune_index);
@@ -1819,20 +1836,56 @@ ExecInitPartitionPruning(PlanState *planstate,
/* We may need an expression context to evaluate partition exprs */
ExecAssignExprContext(estate, planstate);
- /* Create the working data structure for pruning */
- prunestate = CreatePartitionPruneState(planstate, pruneinfo);
+ /*
+ * No need to do initial pruning if it was done already by
+ * ExecutorDoInitialPruning(), which it would be if es_part_prune_results
+ * is set.
+ */
+ if (estate->es_part_prune_results)
+ pruneresult = list_nth_node(PartitionPruneResult,
+ estate->es_part_prune_results,
+ part_prune_index);
+
+ if (pruneresult == NULL || pruneinfo->needs_exec_pruning)
+ {
+ /* We may need an expression context to evaluate partition exprs */
+ ExecAssignExprContext(estate, planstate);
+
+ /* For data reading, executor always omits detached partitions */
+ if (estate->es_partition_directory == NULL)
+ estate->es_partition_directory =
+ CreatePartitionDirectory(estate->es_query_cxt, false);
+
+ /*
+ * Create the working data structure for pruning. No need to consider
+ * initial pruning steps if we have a PartitionPruneResult.
+ */
+ prunestate = CreatePartitionPruneState(planstate, pruneinfo,
+ pruneresult == NULL,
+ pruneinfo->needs_exec_pruning,
+ NIL, planstate->ps_ExprContext,
+ estate->es_partition_directory);
+ }
/*
* Perform an initial partition prune pass, if required.
*/
- if (prunestate->do_initial_prune)
- *initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true);
+ if (pruneresult)
+ {
+ *initially_valid_subplans = bms_copy(pruneresult->valid_subplan_offs);
+ }
+ else if (prunestate && prunestate->do_initial_prune)
+ {
+ *initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true,
+ NULL);
+ }
else
{
- /* No pruning, so we'll need to initialize all subplans */
+ /* No initial pruning, so we'll need to initialize all subplans */
Assert(n_total_subplans > 0);
*initially_valid_subplans = bms_add_range(NULL, 0,
n_total_subplans - 1);
+ return prunestate;
}
/*
@@ -1840,7 +1893,8 @@ ExecInitPartitionPruning(PlanState *planstate,
* that were removed above due to initial pruning. No need to do this if
* no steps were removed.
*/
- if (bms_num_members(*initially_valid_subplans) < n_total_subplans)
+ if (prunestate &&
+ bms_num_members(*initially_valid_subplans) < n_total_subplans)
{
/*
* We can safely skip this when !do_exec_prune, even though that
@@ -1856,11 +1910,74 @@ ExecInitPartitionPruning(PlanState *planstate,
return prunestate;
}
+/*
+ * ExecPartitionDoInitialPruning
+ * Perform initial pruning using given PartitionPruneInfo to determine
+ * the minimal set of child subplans that will be executed and also the
+ * set of RT indexes of the leaf partitions scanned by those subplans.
+ */
+Bitmapset *
+ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
+ PartitionPruneInfo *pruneinfo,
+ Bitmapset **scan_leafpart_rtis)
+{
+ List *rtable = plannedstmt->rtable;
+ ExprContext *econtext;
+ PartitionDirectory pdir;
+ MemoryContext oldcontext,
+ tmpcontext;
+ PartitionPruneState *prunestate;
+ Bitmapset *valid_subplan_offs;
+
+ /*
+ * A temporary context for memory allocations required while executing
+ * partition pruning steps.
+ */
+ tmpcontext = AllocSetContextCreate(CurrentMemoryContext,
+ "initial pruning working data",
+ ALLOCSET_DEFAULT_SIZES);
+ oldcontext = MemoryContextSwitchTo(tmpcontext);
+
+ /*
+ * PartitionDirectory to look up partition descriptors.
+ * Note that we don't omit detached partitions, just like during
+ * execution proper.
+ */
+ pdir = CreatePartitionDirectory(CurrentMemoryContext, false);
+
+ /*
+ * We don't yet have a PlanState for the parent plan node, so we must
+ * create a standalone ExprContext to evaluate pruning expressions,
+ * equipped with the information about the EXTERN parameters that the
+ * caller passed us. That's okay because the initial pruning steps
+ * contain nothing that requires execution to have started, and so
+ * need none of the information contained in a PlanState.
+ */
+ econtext = CreateStandaloneExprContext();
+ econtext->ecxt_param_list_info = params;
+ prunestate = CreatePartitionPruneState(NULL, pruneinfo, true, false,
+ rtable, econtext, pdir);
+ MemoryContextSwitchTo(oldcontext);
+
+ /* Do the initial pruning. */
+ valid_subplan_offs = ExecFindMatchingSubPlans(prunestate, true,
+ scan_leafpart_rtis);
+
+ FreeExprContext(econtext, true);
+ DestroyPartitionDirectory(pdir);
+ MemoryContextDelete(tmpcontext);
+
+ return valid_subplan_offs;
+}
+
/*
* CreatePartitionPruneState
* Build the data structure required for calling ExecFindMatchingSubPlans
*
- * 'planstate' is the parent plan node's execution state.
+ * 'planstate', if not NULL, is the parent plan node's execution state. It
+ * can be NULL if being called before ExecutorStart(), in which case,
+ * 'rtable' (range table), 'econtext', and 'partdir' must be explicitly
+ * provided.
*
* 'pruneinfo' is a PartitionPruneInfo as generated by
* make_partition_pruneinfo. Here we build a PartitionPruneState containing a
@@ -1874,19 +1991,21 @@ ExecInitPartitionPruning(PlanState *planstate,
* PartitionedRelPruneInfo.
*/
static PartitionPruneState *
-CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
+CreatePartitionPruneState(PlanState *planstate,
+ PartitionPruneInfo *pruneinfo,
+ bool consider_initial_steps,
+ bool consider_exec_steps,
+ List *rtable, ExprContext *econtext,
+ PartitionDirectory partdir)
{
- EState *estate = planstate->state;
+ EState *estate = planstate ? planstate->state : NULL;
PartitionPruneState *prunestate;
int n_part_hierarchies;
ListCell *lc;
int i;
- ExprContext *econtext = planstate->ps_ExprContext;
- /* For data reading, executor always omits detached partitions */
- if (estate->es_partition_directory == NULL)
- estate->es_partition_directory =
- CreatePartitionDirectory(estate->es_query_cxt, false);
+ Assert((estate != NULL) ||
+ (partdir != NULL && econtext != NULL && rtable != NIL));
n_part_hierarchies = list_length(pruneinfo->prune_infos);
Assert(n_part_hierarchies > 0);
@@ -1941,15 +2060,42 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
PartitionKey partkey;
/*
- * We can rely on the copies of the partitioned table's partition
- * key and partition descriptor appearing in its relcache entry,
- * because that entry will be held open and locked for the
- * duration of this executor run.
+ * Must open the relation ourselves when called before
+ * execution has started, such as when called during
+ * ExecutorDoInitialPruning() on a cached plan. In that case,
+ * sub-partitions must be locked, because AcquirePlannerLocks()
+ * would not have seen them. (1st relation in a partrelpruneinfos
+ * list is always the root partitioned table appearing in the
+ * query, which AcquirePlannerLocks() would have locked; the
+ * Assert in relation_open() guards that assumption.)
+ */
+ if (estate == NULL)
+ {
+ RangeTblEntry *rte = rt_fetch(pinfo->rtindex, rtable);
+ int lockmode = (j == 0) ? NoLock : rte->rellockmode;
+
+ partrel = table_open(rte->relid, lockmode);
+ }
+ else
+ partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
+
+ /*
+ * We can rely on the copy of the partitioned table's partition
+ * key in its relcache entry, because it can't change (or
+ * get destroyed) as long as the relation is locked. Partition
+ * descriptor is taken from the PartitionDirectory associated with
+ * the table that is held open long enough for the descriptor to
+ * remain valid while it's used to perform the pruning steps.
*/
- partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
partkey = RelationGetPartitionKey(partrel);
- partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
- partrel);
+ partdesc = PartitionDirectoryLookup(partdir, partrel);
+
+ /*
+ * Must close partrel, keeping the lock taken, if we're not using
+ * EState's entry.
+ */
+ if (estate == NULL)
+ table_close(partrel, NoLock);
/*
* Initialize the subplan_map and subpart_map.
@@ -1963,6 +2109,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
Assert(partdesc->nparts >= pinfo->nparts);
pprune->nparts = partdesc->nparts;
pprune->subplan_map = palloc(sizeof(int) * partdesc->nparts);
+ pprune->rti_map = palloc(sizeof(Index) * partdesc->nparts);
if (partdesc->nparts == pinfo->nparts)
{
/*
@@ -1973,6 +2120,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
pprune->subpart_map = pinfo->subpart_map;
memcpy(pprune->subplan_map, pinfo->subplan_map,
sizeof(int) * pinfo->nparts);
+ memcpy(pprune->rti_map, pinfo->rti_map,
+ sizeof(Index) * pinfo->nparts);
/*
* Double-check that the list of unpruned relations has not
@@ -2023,6 +2172,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
pinfo->subplan_map[pd_idx];
pprune->subpart_map[pp_idx] =
pinfo->subpart_map[pd_idx];
+ pprune->rti_map[pp_idx] =
+ pinfo->rti_map[pd_idx];
pd_idx++;
}
else
@@ -2030,6 +2181,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
/* this partdesc entry is not in the plan */
pprune->subplan_map[pp_idx] = -1;
pprune->subpart_map[pp_idx] = -1;
+ pprune->rti_map[pp_idx] = 0;
}
}
@@ -2051,7 +2203,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
* Initialize pruning contexts as needed.
*/
pprune->initial_pruning_steps = pinfo->initial_pruning_steps;
- if (pinfo->initial_pruning_steps)
+ if (consider_initial_steps && pinfo->initial_pruning_steps)
{
InitPartitionPruneContext(&pprune->initial_context,
pinfo->initial_pruning_steps,
@@ -2061,7 +2213,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
prunestate->do_initial_prune = true;
}
pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
- if (pinfo->exec_pruning_steps)
+ if (consider_exec_steps && pinfo->exec_pruning_steps)
{
InitPartitionPruneContext(&pprune->exec_context,
pinfo->exec_pruning_steps,
@@ -2289,10 +2441,14 @@ PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
* Pass initial_prune if PARAM_EXEC Params cannot yet be evaluated. This
* differentiates the initial executor-time pruning step from later
* runtime pruning.
+ *
+ * RT indexes of leaf partitions scanned by the chosen subplans are added to
+ * *scan_leafpart_rtis if the pointer is non-NULL.
*/
Bitmapset *
ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
- bool initial_prune)
+ bool initial_prune,
+ Bitmapset **scan_leafpart_rtis)
{
Bitmapset *result = NULL;
MemoryContext oldcontext;
@@ -2327,7 +2483,7 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
*/
pprune = &prunedata->partrelprunedata[0];
find_matching_subplans_recurse(prunedata, pprune, initial_prune,
- &result);
+ &result, scan_leafpart_rtis);
/* Expression eval may have used space in ExprContext too */
if (pprune->exec_pruning_steps)
@@ -2341,6 +2497,8 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
/* Copy result out of the temp context before we reset it */
result = bms_copy(result);
+ if (scan_leafpart_rtis)
+ *scan_leafpart_rtis = bms_copy(*scan_leafpart_rtis);
MemoryContextReset(prunestate->prune_context);
@@ -2351,13 +2509,15 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
* find_matching_subplans_recurse
* Recursive worker function for ExecFindMatchingSubPlans
*
- * Adds valid (non-prunable) subplan IDs to *validsubplans
+ * Adds valid (non-prunable) subplan IDs to *validsubplans and RT indexes of
+ * the corresponding leaf partitions to *scan_leafpart_rtis (if asked for).
*/
static void
find_matching_subplans_recurse(PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
- Bitmapset **validsubplans)
+ Bitmapset **validsubplans,
+ Bitmapset **scan_leafpart_rtis)
{
Bitmapset *partset;
int i;
@@ -2384,8 +2544,14 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
while ((i = bms_next_member(partset, i)) >= 0)
{
if (pprune->subplan_map[i] >= 0)
+ {
*validsubplans = bms_add_member(*validsubplans,
pprune->subplan_map[i]);
+ Assert(pprune->rti_map[i] > 0);
+ if (scan_leafpart_rtis)
+ *scan_leafpart_rtis = bms_add_member(*scan_leafpart_rtis,
+ pprune->rti_map[i]);
+ }
else
{
int partidx = pprune->subpart_map[i];
@@ -2393,7 +2559,8 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
if (partidx >= 0)
find_matching_subplans_recurse(prunedata,
&prunedata->partrelprunedata[partidx],
- initial_prune, validsubplans);
+ initial_prune, validsubplans,
+ scan_leafpart_rtis);
else
{
/*
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 572c87e453..044bf3f491 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -135,6 +135,7 @@ CreateExecutorState(void)
estate->es_param_exec_vals = NULL;
estate->es_queryEnv = NULL;
+ estate->es_part_prune_results = NIL;
estate->es_query_cxt = qcontext;
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index dc13625171..bffb42ce71 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -842,7 +842,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
else
dest = None_Receiver;
- es->qd = CreateQueryDesc(es->stmt,
+ es->qd = CreateQueryDesc(es->stmt, NIL,
fcache->src,
GetActiveSnapshot(),
InvalidSnapshot,
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 99830198bd..3b917584de 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -156,7 +156,8 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
* subplan, we can fill as_valid_subplans immediately, preventing
* later calls to ExecFindMatchingSubPlans.
*/
- if (!prunestate->do_exec_prune && nplans > 0)
+ if (appendstate->as_prune_state == NULL ||
+ (!appendstate->as_prune_state->do_exec_prune && nplans > 0))
appendstate->as_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
}
else
@@ -578,7 +579,7 @@ choose_next_subplan_locally(AppendState *node)
}
else if (node->as_valid_subplans == NULL)
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
whichplan = -1;
}
@@ -643,7 +644,7 @@ choose_next_subplan_for_leader(AppendState *node)
if (node->as_valid_subplans == NULL)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
/*
* Mark each invalid plan as finished to allow the loop below to
@@ -718,7 +719,7 @@ choose_next_subplan_for_worker(AppendState *node)
else if (node->as_valid_subplans == NULL)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
mark_invalid_subplans_as_finished(node);
}
@@ -869,7 +870,7 @@ ExecAppendAsyncBegin(AppendState *node)
if (node->as_valid_subplans == NULL)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
classify_matching_subplans(node);
}
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index f370f9f287..ccfa083945 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -104,7 +104,8 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
* subplan, we can fill ms_valid_subplans immediately, preventing
* later calls to ExecFindMatchingSubPlans.
*/
- if (!prunestate->do_exec_prune && nplans > 0)
+ if (mergestate->ms_prune_state == NULL ||
+ (!mergestate->ms_prune_state->do_exec_prune && nplans > 0))
mergestate->ms_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
}
else
@@ -219,7 +220,7 @@ ExecMergeAppend(PlanState *pstate)
*/
if (node->ms_valid_subplans == NULL)
node->ms_valid_subplans =
- ExecFindMatchingSubPlans(node->ms_prune_state, false);
+ ExecFindMatchingSubPlans(node->ms_prune_state, false, NULL);
/*
* First time through: pull the first tuple from each valid subplan,
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index fd5796f1b9..93012a5b3b 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1578,6 +1578,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
CachedPlanSource *plansource;
CachedPlan *cplan;
List *stmt_list;
+ List *part_prune_results_list;
char *query_string;
Snapshot snapshot;
MemoryContext oldcontext;
@@ -1657,7 +1658,10 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
*/
/* Replan if needed, and increment plan refcount for portal */
- cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv);
+ cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv,
+ &part_prune_results_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_results_list));
stmt_list = cplan->stmt_list;
if (!plan->saved)
@@ -1685,6 +1689,9 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
stmt_list,
cplan);
+ /* Copy Lists of PartitionPruneResults into the portal's context. */
+ PortalStorePartitionPruneResults(portal, part_prune_results_list);
+
/*
* Set up options for portal. Default SCROLL type is chosen the same way
* as PerformCursorOpen does it.
@@ -2092,7 +2099,8 @@ SPI_plan_get_cached_plan(SPIPlanPtr plan)
/* Get the generic plan for the query */
cplan = GetCachedPlan(plansource, NULL,
plan->saved ? CurrentResourceOwner : NULL,
- _SPI_current->queryEnv);
+ _SPI_current->queryEnv,
+ NULL /* Not interested in PartitionPruneResults */);
Assert(cplan == plansource->gplan);
/* Pop the error context stack */
@@ -2473,7 +2481,9 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
{
CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
List *stmt_list;
- ListCell *lc2;
+ List *part_prune_results_list;
+ ListCell *lc2,
+ *lc3;
spicallbackarg.query = plansource->query_string;
@@ -2549,8 +2559,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
* plan, the refcount must be backed by the plan_owner.
*/
cplan = GetCachedPlan(plansource, options->params,
- plan_owner, _SPI_current->queryEnv);
-
+ plan_owner, _SPI_current->queryEnv,
+ &part_prune_results_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_results_list));
stmt_list = cplan->stmt_list;
/*
@@ -2589,9 +2601,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
}
}
- foreach(lc2, stmt_list)
+ forboth(lc2, stmt_list, lc3, part_prune_results_list)
{
PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+ List *part_prune_results = lfirst_node(List, lc3);
bool canSetTag = stmt->canSetTag;
DestReceiver *dest;
@@ -2663,7 +2676,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
else
snap = InvalidSnapshot;
- qdesc = CreateQueryDesc(stmt,
+ qdesc = CreateQueryDesc(stmt, part_prune_results,
plansource->query_string,
snap, crosscheck_snapshot,
dest,
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 23776367c5..b01f55fb4f 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -158,6 +158,11 @@
token = pg_strtok(&length); /* skip :fldname */ \
local_node->fldname = readIntCols(len)
+/* Read an Index array */
+#define READ_INDEX_ARRAY(fldname, len) \
+ token = pg_strtok(&length); /* skip :fldname */ \
+ local_node->fldname = readIndexCols(len)
+
/* Read a bool array */
#define READ_BOOL_ARRAY(fldname, len) \
token = pg_strtok(&length); /* skip :fldname */ \
@@ -800,7 +805,6 @@ fnname(int numCols) \
*/
READ_SCALAR_ARRAY(readAttrNumberCols, int16, atoi)
READ_SCALAR_ARRAY(readOidCols, Oid, atooid)
-/* outfuncs.c has writeIndexCols, but we don't yet need that here */
-/* READ_SCALAR_ARRAY(readIndexCols, Index, atoui) */
+READ_SCALAR_ARRAY(readIndexCols, Index, atoui)
READ_SCALAR_ARRAY(readIntCols, int, atoi)
READ_SCALAR_ARRAY(readBoolCols, bool, strtobool)
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 799602f5ea..a96d316dca 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -520,7 +520,9 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
result->parallelModeNeeded = glob->parallelModeNeeded;
result->planTree = top_plan;
result->partPruneInfos = glob->partPruneInfos;
+ result->containsInitialPruning = glob->containsInitialPruning;
result->rtable = glob->finalrtable;
+ result->minLockRelids = glob->minLockRelids;
result->resultRelations = glob->resultRelations;
result->appendRelations = glob->appendRelations;
result->subplans = glob->subplans;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 399c1812d4..44ffe71c49 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -270,6 +270,16 @@ set_plan_references(PlannerInfo *root, Plan *plan)
*/
add_rtes_to_flat_rtable(root, false);
+ /*
+ * Add the query's adjusted range of RT indexes to glob->minLockRelids.
+ * The adjusted RT indexes of prunable relations will be deleted from the
+ * set below where PartitionPruneInfos are processed.
+ */
+ glob->minLockRelids =
+ bms_add_range(glob->minLockRelids,
+ rtoffset + 1,
+ rtoffset + list_length(root->parse->rtable));
+
/*
* Adjust RT indexes of PlanRowMarks and add to final rowmarks list
*/
@@ -353,6 +363,7 @@ set_plan_references(PlannerInfo *root, Plan *plan)
{
PartitionPruneInfo *pruneinfo = lfirst(lc);
ListCell *l;
+ Bitmapset *leafpart_rtis = NULL;
pruneinfo->root_parent_relids =
offset_relid_set(pruneinfo->root_parent_relids, rtoffset);
@@ -364,15 +375,50 @@ set_plan_references(PlannerInfo *root, Plan *plan)
foreach(l2, prune_infos)
{
PartitionedRelPruneInfo *pinfo = lfirst(l2);
+ int i;
/* RT index of the table to which the pinfo belongs. */
pinfo->rtindex += rtoffset;
+
+ /* Also of the leaf partitions that might be scanned. */
+ for (i = 0; i < pinfo->nparts; i++)
+ {
+ if (pinfo->rti_map[i] > 0 && pinfo->subplan_map[i] >= 0)
+ {
+ pinfo->rti_map[i] += rtoffset;
+ leafpart_rtis = bms_add_member(leafpart_rtis,
+ pinfo->rti_map[i]);
+ }
+ }
}
}
+ if (pruneinfo->needs_init_pruning)
+ {
+ glob->containsInitialPruning = true;
+
+ /*
+ * Delete the leaf partition RTIs from the global set of relations
+ * to be locked before executing the plan. AcquireExecutorLocks()
+ * will find the ones to add to the set after performing initial
+ * pruning.
+ */
+ glob->minLockRelids = bms_del_members(glob->minLockRelids,
+ leafpart_rtis);
+ }
+
glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
}
+ /*
+ * It seems worth doing a bms_copy() on glob->minLockRelids if we deleted
+ * bits from it above to get rid of any empty tail bits. It seems better
+ * for the loop over this set in AcquireExecutorLocks() to not have to go
+ * through those useless bit words.
+ */
+ if (glob->containsInitialPruning)
+ glob->minLockRelids = bms_copy(glob->minLockRelids);
+
return result;
}
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index d48f6784c1..d5556354f7 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -144,7 +144,9 @@ static List *make_partitionedrel_pruneinfo(PlannerInfo *root,
List *prunequal,
Bitmapset *partrelids,
int *relid_subplan_map,
- Bitmapset **matchedsubplans);
+ Bitmapset **matchedsubplans,
+ bool *needs_init_pruning,
+ bool *needs_exec_pruning);
static void gen_partprune_steps(RelOptInfo *rel, List *clauses,
PartClauseTarget target,
GeneratePruningStepsContext *context);
@@ -234,6 +236,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int *relid_subplan_map;
ListCell *lc;
int i;
+ bool needs_init_pruning = false;
+ bool needs_exec_pruning = false;
/*
* Scan the subpaths to see which ones are scans of partition child
@@ -313,12 +317,16 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
Bitmapset *partrelids = (Bitmapset *) lfirst(lc);
List *pinfolist;
Bitmapset *matchedsubplans = NULL;
+ bool partrel_needs_init_pruning;
+ bool partrel_needs_exec_pruning;
pinfolist = make_partitionedrel_pruneinfo(root, parentrel,
prunequal,
partrelids,
relid_subplan_map,
- &matchedsubplans);
+ &matchedsubplans,
+ &partrel_needs_init_pruning,
+ &partrel_needs_exec_pruning);
/* When pruning is possible, record the matched subplans */
if (pinfolist != NIL)
@@ -327,6 +335,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
allmatchedsubplans = bms_join(matchedsubplans,
allmatchedsubplans);
}
+
+ needs_init_pruning |= partrel_needs_init_pruning;
+ needs_exec_pruning |= partrel_needs_exec_pruning;
}
pfree(relid_subplan_map);
@@ -342,6 +353,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
pruneinfo = makeNode(PartitionPruneInfo);
pruneinfo->root_parent_relids = parentrel->relids;
pruneinfo->prune_infos = prunerelinfos;
+ pruneinfo->needs_init_pruning = needs_init_pruning;
+ pruneinfo->needs_exec_pruning = needs_exec_pruning;
/*
* Some subplans may not belong to any of the identified partitioned rels.
@@ -442,13 +455,18 @@ add_part_relids(List *allpartrelids, Bitmapset *partrelids)
* If we cannot find any useful run-time pruning steps, return NIL.
* However, on success, each rel identified in partrelids will have
* an element in the result list, even if some of them are useless.
+ * *needs_init_pruning and *needs_exec_pruning are set to indicate that the
+ * returned PartitionedRelPruneInfos contain pruning steps that can be
+ * performed before and after execution begins, respectively.
*/
static List *
make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
List *prunequal,
Bitmapset *partrelids,
int *relid_subplan_map,
- Bitmapset **matchedsubplans)
+ Bitmapset **matchedsubplans,
+ bool *needs_init_pruning,
+ bool *needs_exec_pruning)
{
RelOptInfo *targetpart = NULL;
List *pinfolist = NIL;
@@ -459,6 +477,10 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int rti;
int i;
+ /* Will find out below. */
+ *needs_init_pruning = false;
+ *needs_exec_pruning = false;
+
/*
* Examine each partitioned rel, constructing a temporary array to map
* from planner relids to index of the partitioned rel, and building a
@@ -546,6 +568,9 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
* executor per-scan pruning steps. This first pass creates startup
* pruning steps and detects whether there's any possibly-useful quals
* that would require per-scan pruning.
+ *
+ * In the first pass, we note whether the 2nd pass is necessary by
+ * noting the presence of EXEC parameters.
*/
gen_partprune_steps(subpart, partprunequal, PARTTARGET_INITIAL,
&context);
@@ -620,6 +645,12 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
pinfo->execparamids = execparamids;
/* Remaining fields will be filled in the next loop */
+ /* record which types of pruning steps we've seen so far */
+ if (initial_pruning_steps != NIL)
+ *needs_init_pruning = true;
+ if (exec_pruning_steps != NIL)
+ *needs_exec_pruning = true;
+
pinfolist = lappend(pinfolist, pinfo);
}
@@ -647,6 +678,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int *subplan_map;
int *subpart_map;
Oid *relid_map;
+ Index *rti_map;
/*
* Construct the subplan and subpart maps for this partitioning level.
@@ -659,6 +691,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
subpart_map = (int *) palloc(nparts * sizeof(int));
memset(subpart_map, -1, nparts * sizeof(int));
relid_map = (Oid *) palloc0(nparts * sizeof(Oid));
+ rti_map = (Index *) palloc0(nparts * sizeof(Index));
present_parts = NULL;
i = -1;
@@ -673,6 +706,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
subplan_map[i] = subplanidx = relid_subplan_map[partrel->relid] - 1;
subpart_map[i] = subpartidx = relid_subpart_map[partrel->relid] - 1;
relid_map[i] = planner_rt_fetch(partrel->relid, root)->relid;
+ rti_map[i] = partrel->relid;
if (subplanidx >= 0)
{
present_parts = bms_add_member(present_parts, i);
@@ -697,6 +731,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
pinfo->subplan_map = subplan_map;
pinfo->subpart_map = subpart_map;
pinfo->relid_map = relid_map;
+ pinfo->rti_map = rti_map;
}
pfree(relid_subpart_map);
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 3082093d1e..95ab1d0eef 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1598,6 +1598,7 @@ exec_bind_message(StringInfo input_message)
int16 *rformats = NULL;
CachedPlanSource *psrc;
CachedPlan *cplan;
+ List *part_prune_results_list;
Portal portal;
char *query_string;
char *saved_stmt_name;
@@ -1972,7 +1973,9 @@ exec_bind_message(StringInfo input_message)
* will be generated in MessageContext. The plan refcount will be
* assigned to the Portal, so it will be released at portal destruction.
*/
- cplan = GetCachedPlan(psrc, params, NULL, NULL);
+ cplan = GetCachedPlan(psrc, params, NULL, NULL, &part_prune_results_list);
+ Assert(list_length(cplan->stmt_list) ==
+ list_length(part_prune_results_list));
/*
* Now we can define the portal.
@@ -1987,6 +1990,9 @@ exec_bind_message(StringInfo input_message)
cplan->stmt_list,
cplan);
+ /* Copy Lists of PartitionPruneResults into the portal's context. */
+ PortalStorePartitionPruneResults(portal, part_prune_results_list);
+
/* Done with the snapshot used for parameter I/O and parsing/planning */
if (snapshot_set)
PopActiveSnapshot();
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 52e2db6452..f582ff177b 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -35,7 +35,7 @@
Portal ActivePortal = NULL;
-static void ProcessQuery(PlannedStmt *plan,
+static void ProcessQuery(PlannedStmt *plan, List *part_prune_results,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -65,6 +65,7 @@ static void DoPortalRewind(Portal portal);
*/
QueryDesc *
CreateQueryDesc(PlannedStmt *plannedstmt,
+ List *part_prune_results,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
@@ -77,6 +78,8 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
qd->operation = plannedstmt->commandType; /* operation */
qd->plannedstmt = plannedstmt; /* plan */
+ qd->part_prune_results = part_prune_results; /* ExecutorDoInitialPruning()
+ * output for plan */
qd->sourceText = sourceText; /* query text */
qd->snapshot = RegisterSnapshot(snapshot); /* snapshot */
/* RI check snapshot */
@@ -122,6 +125,7 @@ FreeQueryDesc(QueryDesc *qdesc)
* PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
*
* plan: the plan tree for the query
+ * part_prune_results: ExecutorDoInitialPruning() output for the PlannedStmt
* sourceText: the source text of the query
* params: any parameters needed
* dest: where to send results
@@ -134,6 +138,7 @@ FreeQueryDesc(QueryDesc *qdesc)
*/
static void
ProcessQuery(PlannedStmt *plan,
+ List *part_prune_results,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -145,7 +150,7 @@ ProcessQuery(PlannedStmt *plan,
/*
* Create the QueryDesc object
*/
- queryDesc = CreateQueryDesc(plan, sourceText,
+ queryDesc = CreateQueryDesc(plan, part_prune_results, sourceText,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
@@ -491,8 +496,13 @@ PortalStart(Portal portal, ParamListInfo params,
/*
* Create QueryDesc in portal's context; for the moment, set
* the destination to DestNone.
+ *
+ * There is no PartitionPruneResult unless the PlannedStmt is
+ * from a CachedPlan.
*/
queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+ portal->part_prune_results_list == NIL ? NIL :
+ linitial(portal->part_prune_results_list),
portal->sourceText,
GetActiveSnapshot(),
InvalidSnapshot,
@@ -1225,6 +1235,8 @@ PortalRunMulti(Portal portal,
if (pstmt->utilityStmt == NULL)
{
+ List *part_prune_results = NIL;
+
/*
* process a plannable query.
*/
@@ -1271,10 +1283,19 @@ PortalRunMulti(Portal portal,
else
UpdateActiveSnapshotCommandId();
+ /*
+ * Determine if there's a corresponding List of PartitionPruneResult
+ * for this PlannedStmt.
+ */
+ if (portal->part_prune_results_list != NIL)
+ part_prune_results = list_nth_node(List,
+ portal->part_prune_results_list,
+ foreach_current_index(stmtlist_item));
+
if (pstmt->canSetTag)
{
/* statement can set tag string */
- ProcessQuery(pstmt,
+ ProcessQuery(pstmt, part_prune_results,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
@@ -1283,7 +1304,7 @@ PortalRunMulti(Portal portal,
else
{
/* stmt added by rewrite cannot set tag */
- ProcessQuery(pstmt,
+ ProcessQuery(pstmt, part_prune_results,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index cc943205d3..8ff42153a1 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -99,14 +99,19 @@ static dlist_head cached_expression_list = DLIST_STATIC_INIT(cached_expression_l
static void ReleaseGenericPlan(CachedPlanSource *plansource);
static List *RevalidateCachedQuery(CachedPlanSource *plansource,
QueryEnvironment *queryEnv);
-static bool CheckCachedPlan(CachedPlanSource *plansource);
+static bool CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+ List **part_prune_results_list);
static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
- ParamListInfo boundParams, QueryEnvironment *queryEnv);
+ ParamListInfo boundParams, QueryEnvironment *queryEnv,
+ List **part_prune_results_list);
static bool choose_custom_plan(CachedPlanSource *plansource,
ParamListInfo boundParams);
static double cached_plan_cost(CachedPlan *plan, bool include_planner);
static Query *QueryListGetPrimaryStmt(List *stmts);
-static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static void AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+ List **part_prune_results_list,
+ List **lockedRelids_per_stmt);
+static void ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt);
static void AcquirePlannerLocks(List *stmt_list, bool acquire);
static void ScanQueryForLocks(Query *parsetree, bool acquire);
static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -782,6 +787,26 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
return tlist;
}
+/*
+ * FreePartitionPruneResults
+ * Frees the List of Lists of PartitionPruneResults for CheckCachedPlan()
+ */
+static void
+FreePartitionPruneResults(List *part_prune_results_list)
+{
+ ListCell *lc;
+
+ foreach(lc, part_prune_results_list)
+ {
+ List *part_prune_results = lfirst_node(List, lc);
+
+ /* Free both the PartitionPruneResults and the containing List. */
+ list_free_deep(part_prune_results);
+ }
+
+ list_free(part_prune_results_list);
+}
+
/*
* CheckCachedPlan: see if the CachedPlanSource's generic plan is valid.
*
@@ -790,15 +815,20 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
*
* On a "true" return, we have acquired the locks needed to run the plan.
* (We must do this for the "true" result to be race-condition-free.)
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_results_list.
*/
static bool
-CheckCachedPlan(CachedPlanSource *plansource)
+CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+ List **part_prune_results_list)
{
CachedPlan *plan = plansource->gplan;
/* Assert that caller checked the querytree */
Assert(plansource->is_valid);
+ *part_prune_results_list = NIL;
+
/* If there's no generic plan, just say "false" */
if (!plan)
return false;
@@ -820,13 +850,21 @@ CheckCachedPlan(CachedPlanSource *plansource)
*/
if (plan->is_valid)
{
+ List *lockedRelids_per_stmt;
+
/*
* Plan must have positive refcount because it is referenced by
* plansource; so no need to fear it disappears under us here.
*/
Assert(plan->refcount > 0);
- AcquireExecutorLocks(plan->stmt_list, true);
+ /*
+ * Lock relations scanned by the plan. This is where the pruning
+ * happens if needed.
+ */
+ AcquireExecutorLocks(plan->stmt_list, boundParams,
+ part_prune_results_list,
+ &lockedRelids_per_stmt);
/*
* If plan was transient, check to see if TransactionXmin has
@@ -848,7 +886,11 @@ CheckCachedPlan(CachedPlanSource *plansource)
}
/* Oops, the race case happened. Release useless locks. */
- AcquireExecutorLocks(plan->stmt_list, false);
+ ReleaseExecutorLocks(plan->stmt_list, lockedRelids_per_stmt);
+
+ /* Release any PartitionPruneResults that may have been created. */
+ FreePartitionPruneResults(*part_prune_results_list);
+ *part_prune_results_list = NIL;
}
/*
@@ -874,10 +916,14 @@ CheckCachedPlan(CachedPlanSource *plansource)
* Planning work is done in the caller's memory context. The finished plan
* is in a child memory context, which typically should get reparented
* (unless this is a one-shot plan, in which case we don't copy the plan).
+ *
+ * A list of NILs is returned in *part_prune_results_list, meaning that no
+ * partition pruning has been done yet for the plans in stmt_list.
*/
static CachedPlan *
BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
- ParamListInfo boundParams, QueryEnvironment *queryEnv)
+ ParamListInfo boundParams, QueryEnvironment *queryEnv,
+ List **part_prune_results_list)
{
CachedPlan *plan;
List *plist;
@@ -1007,6 +1053,17 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
MemoryContextSwitchTo(oldcxt);
+ /*
+ * No actual PartitionPruneResults yet to add, though must initialize
+ * the list to have the same number of elements as the list of
+ * PlannedStmts.
+ */
+ *part_prune_results_list = NIL;
+ foreach(lc, plist)
+ {
+ *part_prune_results_list = lappend(*part_prune_results_list, NIL);
+ }
+
return plan;
}
@@ -1126,6 +1183,19 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
* plan or a custom plan for the given parameters: the caller does not know
* which it will get.
*
+ * For every PlannedStmt found in the returned CachedPlan, an element that
+ * is either a List of PartitionPruneResult or a NIL is added to
+ * *part_prune_results_list. The former if the PlannedStmt is from
+ * the existing CachedPlan that is otherwise valid and has
+ * containsInitialPruning set to true. Before returning such a CachedPlan,
+ * those "initial" steps are performed by calling ExecutorDoInitialPruning()
+ * to determine only those leaf partitions that need to be locked by
+ * AcquireExecutorLocks() by pruning away subplans that don't match the
+ * "initial" pruning conditions. For each PartitionPruneInfo found in
+ * PlannedStmt.partPruneInfos, a PartitionPruneResult containing the bitmapset
+ * of the indexes of surviving subplans is added to the List for the
+ * PlannedStmt.
+ *
* On return, the plan is valid and we have sufficient locks to begin
* execution.
*
@@ -1139,11 +1209,13 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
*/
CachedPlan *
GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
- ResourceOwner owner, QueryEnvironment *queryEnv)
+ ResourceOwner owner, QueryEnvironment *queryEnv,
+ List **part_prune_results_list)
{
CachedPlan *plan = NULL;
List *qlist;
bool customplan;
+ List *my_part_prune_results_list;
/* Assert caller is doing things in a sane order */
Assert(plansource->magic == CACHEDPLANSOURCE_MAGIC);
@@ -1160,7 +1232,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
if (!customplan)
{
- if (CheckCachedPlan(plansource))
+ if (CheckCachedPlan(plansource, boundParams,
+ &my_part_prune_results_list))
{
/* We want a generic plan, and we already have a valid one */
plan = plansource->gplan;
@@ -1169,7 +1242,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
else
{
/* Build a new generic plan */
- plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv);
+ plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv,
+ &my_part_prune_results_list);
/* Just make real sure plansource->gplan is clear */
ReleaseGenericPlan(plansource);
/* Link the new generic plan into the plansource */
@@ -1214,7 +1288,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
if (customplan)
{
/* Build a custom plan */
- plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv);
+ plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv,
+ &my_part_prune_results_list);
/* Accumulate total costs of custom plans */
plansource->total_custom_cost += cached_plan_cost(plan, true);
@@ -1246,6 +1321,9 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
plan->is_saved = true;
}
+ if (part_prune_results_list)
+ *part_prune_results_list = my_part_prune_results_list;
+
return plan;
}
@@ -1737,17 +1815,29 @@ QueryListGetPrimaryStmt(List *stmts)
/*
* AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
- * or release them if acquire is false.
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_results_list.
+ *
+ * On return, *lockedRelids_per_stmt will contain a bitmapset for every
+ * PlannedStmt in stmt_list, containing the RT indexes of relation entries
+ * in its range table that were actually locked, or NULL if the PlannedStmt
+ * contains a utility statement.
*/
static void
-AcquireExecutorLocks(List *stmt_list, bool acquire)
+AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+ List **part_prune_results_list,
+ List **lockedRelids_per_stmt)
{
ListCell *lc1;
+ *part_prune_results_list = *lockedRelids_per_stmt = NIL;
foreach(lc1, stmt_list)
{
PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
- ListCell *lc2;
+ List *part_prune_results = NIL;
+ Bitmapset *allLockRelids;
+ Bitmapset *lockedRelids = NULL;
+ int rti;
if (plannedstmt->commandType == CMD_UTILITY)
{
@@ -1761,13 +1851,40 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
Query *query = UtilityContainsQuery(plannedstmt->utilityStmt);
if (query)
- ScanQueryForLocks(query, acquire);
+ ScanQueryForLocks(query, true);
+ *part_prune_results_list = lappend(*part_prune_results_list, NIL);
continue;
}
- foreach(lc2, plannedstmt->rtable)
+ /*
+ * Figure out the set of relations that would need to be locked
+ * before executing the plan.
+ */
+ if (plannedstmt->containsInitialPruning)
{
- RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
+ Bitmapset *scan_leafpart_rtis = NULL;
+
+ /*
+ * Obtain the set of leaf partitions to be locked.
+ *
+ * The following does initial partition pruning using the
+ * PartitionPruneInfos found in plannedstmt->partPruneInfos and
+ * finds leaf partitions that survive that pruning across all the
+ * nodes in the plan tree.
+ */
+ part_prune_results = ExecutorDoInitialPruning(plannedstmt,
+ boundParams,
+ &scan_leafpart_rtis);
+ allLockRelids = bms_union(plannedstmt->minLockRelids,
+ scan_leafpart_rtis);
+ }
+ else
+ allLockRelids = plannedstmt->minLockRelids;
+
+ rti = -1;
+ while ((rti = bms_next_member(allLockRelids, rti)) > 0)
+ {
+ RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
if (rte->rtekind != RTE_RELATION)
continue;
@@ -1778,10 +1895,59 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
* fail if it's been dropped entirely --- we'll just transiently
* acquire a non-conflicting lock.
*/
- if (acquire)
- LockRelationOid(rte->relid, rte->rellockmode);
- else
- UnlockRelationOid(rte->relid, rte->rellockmode);
+ LockRelationOid(rte->relid, rte->rellockmode);
+ lockedRelids = bms_add_member(lockedRelids, rti);
+ }
+
+ *part_prune_results_list = lappend(*part_prune_results_list,
+ part_prune_results);
+ *lockedRelids_per_stmt = lappend(*lockedRelids_per_stmt, lockedRelids);
+ }
+}
+
+/*
+ * ReleaseExecutorLocks
+ * Release locks that would've been acquired by an earlier call to
+ * AcquireExecutorLocks()
+ */
+static void
+ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt)
+{
+ ListCell *lc1,
+ *lc2;
+
+ forboth(lc1, stmt_list, lc2, lockedRelids_per_stmt)
+ {
+ PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
+ Bitmapset *lockedRelids = lfirst_node(Bitmapset, lc2);
+ int rti;
+
+ if (plannedstmt->commandType == CMD_UTILITY)
+ {
+ /*
+ * Ignore utility statements, except those (such as EXPLAIN) that
+ * contain a parsed-but-not-planned query. Note: it's okay to use
+ * ScanQueryForLocks, even though the query hasn't been through
+ * rule rewriting, because rewriting doesn't change the query
+ * representation.
+ */
+ Query *query = UtilityContainsQuery(plannedstmt->utilityStmt);
+
+ Assert(lockedRelids == NULL);
+ if (query)
+ ScanQueryForLocks(query, false);
+ continue;
+ }
+
+ rti = -1;
+ while ((rti = bms_next_member(lockedRelids, rti)) >= 0)
+ {
+ RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
+
+ Assert(rte->rtekind == RTE_RELATION);
+
+ /* See the comment in AcquireExecutorLocks(). */
+ UnlockRelationOid(rte->relid, rte->rellockmode);
}
}
}
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index 7b1ae6fdcf..5b9098971b 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -303,6 +303,25 @@ PortalDefineQuery(Portal portal,
portal->status = PORTAL_DEFINED;
}
+/*
+ * PortalStorePartitionPruneResults
+ * Copy the given List of Lists of PartitionPruneResults into the
+ * portal's context
+ *
+ * This allows the caller to ensure that the list exists as long as the portal
+ * does.
+ */
+void
+PortalStorePartitionPruneResults(Portal portal, List *part_prune_results_list)
+{
+ MemoryContext oldcxt;
+
+ Assert(PortalIsValid(portal));
+ oldcxt = MemoryContextSwitchTo(portal->portalContext);
+ portal->part_prune_results_list = copyObject(part_prune_results_list);
+ MemoryContextSwitchTo(oldcxt);
+}
+
/*
* PortalReleaseCachedPlan
* Release a portal's reference to its cached plan, if any.
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 9ebde089ae..269cc4d562 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -87,7 +87,9 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
ExplainState *es, const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv);
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt,
+ List *part_prune_results,
+ IntoClause *into,
ExplainState *es, const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv,
const instr_time *planduration,
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 17fabc18c9..4b98d0d2ef 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -45,6 +45,7 @@ extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
* nparts Length of subplan_map[] and subpart_map[].
* subplan_map Subplan index by partition index, or -1.
* subpart_map Subpart index by partition index, or -1.
+ * rti_map Range table index by partition index, or 0.
* present_parts A Bitmapset of the partition indexes that we
* have subplans or subparts for.
* initial_pruning_steps List of PartitionPruneSteps used to
@@ -61,6 +62,7 @@ typedef struct PartitionedRelPruningData
int nparts;
int *subplan_map;
int *subpart_map;
+ Index *rti_map;
Bitmapset *present_parts;
List *initial_pruning_steps;
List *exec_pruning_steps;
@@ -127,5 +129,10 @@ extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
Bitmapset *root_parent_relids,
Bitmapset **initially_valid_subplans);
extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
- bool initial_prune);
+ bool initial_prune,
+ Bitmapset **scan_leafpart_rtis);
+extern Bitmapset *ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt,
+ ParamListInfo params,
+ PartitionPruneInfo *pruneinfo,
+ Bitmapset **scan_leafpart_rtis);
#endif /* EXECPARTITION_H */
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index e79e2c001f..7d4379da7b 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -35,6 +35,8 @@ typedef struct QueryDesc
/* These fields are provided by CreateQueryDesc */
CmdType operation; /* CMD_SELECT, CMD_UPDATE, etc. */
PlannedStmt *plannedstmt; /* planner's output (could be utility, too) */
+ List *part_prune_results; /* ExecutorDoInitialPruning()'s
+ * output for plannedstmt */
const char *sourceText; /* source text of the query */
Snapshot snapshot; /* snapshot to use for query */
Snapshot crosscheck_snapshot; /* crosscheck for RI update/delete */
@@ -57,6 +59,7 @@ typedef struct QueryDesc
/* in pquery.c */
extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
+ List *part_prune_results,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index aaf2bc78b9..32bbbc5927 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -185,6 +185,9 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
/*
* prototypes from functions in execMain.c
*/
+extern List *ExecutorDoInitialPruning(PlannedStmt *plannedstmt,
+ ParamListInfo params,
+ Bitmapset **scan_leafpart_rtis);
extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
extern void ExecutorRun(QueryDesc *queryDesc,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 71248a9466..9c6e8f5e13 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -619,6 +619,7 @@ typedef struct EState
* ExecRowMarks, or NULL if none */
PlannedStmt *es_plannedstmt; /* link to top of plan tree */
List *es_part_prune_infos; /* PlannedStmt.partPruneInfos */
+ List *es_part_prune_results; /* QueryDesc.part_prune_results */
const char *es_sourceText; /* Source text from QueryDesc */
JunkFilter *es_junkFilter; /* top-level junk filter, if any */
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 1f33902947..c2f2544df5 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -218,6 +218,7 @@ extern struct Bitmapset *readBitmapset(void);
extern uintptr_t readDatum(bool typbyval);
extern bool *readBoolCols(int numCols);
extern int *readIntCols(int numCols);
+extern Index *readIndexCols(int numCols);
extern Oid *readOidCols(int numCols);
extern int16 *readAttrNumberCols(int numCols);
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index dbaa9bb54d..e0e5c15b09 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -125,6 +125,18 @@ typedef struct PlannerGlobal
/* List of PartitionPruneInfo contained in the plan */
List *partPruneInfos;
+ /*
+ * Do any of those PartitionPruneInfos have initial pruning steps in them?
+ */
+ bool containsInitialPruning;
+
+ /*
+ * Indexes of all range table entries minus indexes of range table entries
+ * of the leaf partitions scanned by prunable subplans; see
+ * AcquireExecutorLocks()
+ */
+ Bitmapset *minLockRelids;
+
/* OIDs of relations the plan depends on */
List *relationOids;
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index c36a15bd09..714e2cf2c7 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -73,8 +73,17 @@ typedef struct PlannedStmt
List *partPruneInfos; /* List of PartitionPruneInfo contained in the
* plan */
+ bool containsInitialPruning; /* Do any of those PartitionPruneInfos
+ * have initial pruning steps in them?
+ */
+
List *rtable; /* list of RangeTblEntry nodes */
+ Bitmapset *minLockRelids; /* Indexes of all range table entries minus
+ * indexes of range table entries of the leaf
+ * partitions scanned by prunable subplans;
+ * see AcquireExecutorLocks() */
+
/* rtable indexes of target relations for INSERT/UPDATE/DELETE/MERGE */
List *resultRelations; /* integer list of RT indexes, or NIL */
@@ -1414,6 +1423,13 @@ typedef struct PlanRowMark
* prune_infos List of Lists containing PartitionedRelPruneInfo nodes,
* one sublist per run-time-prunable partition hierarchy
* appearing in the parent plan node's subplans.
+ *
+ * needs_init_pruning Does any of the PartitionedRelPruneInfos in
+ * prune_infos have its initial_pruning_steps set?
+ *
+ * needs_exec_pruning Does any of the PartitionedRelPruneInfos in
+ * prune_infos have its exec_pruning_steps set?
+ *
* other_subplans Indexes of any subplans that are not accounted for
* by any of the PartitionedRelPruneInfo nodes in
* "prune_infos". These subplans must not be pruned.
@@ -1425,6 +1441,8 @@ typedef struct PartitionPruneInfo
NodeTag type;
Bitmapset *root_parent_relids;
List *prune_infos;
+ bool needs_init_pruning;
+ bool needs_exec_pruning;
Bitmapset *other_subplans;
} PartitionPruneInfo;
@@ -1469,6 +1487,9 @@ typedef struct PartitionedRelPruneInfo
/* relation OID by partition index, or 0 */
Oid *relid_map pg_node_attr(array_size(nparts));
+ /* Range table index by partition index, or 0. */
+ Index *rti_map pg_node_attr(array_size(nparts));
+
/*
* initial_pruning_steps shows how to prune during executor startup (i.e.,
* without use of any PARAM_EXEC Params); it is NIL if no startup pruning
@@ -1553,6 +1574,31 @@ typedef struct PartitionPruneStepCombine
List *source_stepids;
} PartitionPruneStepCombine;
+/*----------------
+ * PartitionPruneResult
+ *
+ * The result of performing ExecPartitionDoInitialPruning() on a given
+ * PartitionPruneInfo.
+ *
+ * valid_subplan_offs contains the indexes of subplans remaining after
+ * performing initial pruning by calling ExecFindMatchingSubPlans() on the
+ * PartitionPruneInfo.
+ *
+ * This is used to store the result of initial partition pruning that is
+ * performed before the execution has started. A module that needs to do so
+ * should call ExecutorDoInitialPruning() on a given PlannedStmt, which
+ * returns a List of PartitionPruneResult containing an entry for each
+ * PartitionPruneInfo present in PlannedStmt.partPruneInfos. The module
+ * should then pass that list, along with the PlannedStmt, to the executor,
+ * so that it can reuse the result of initial partition pruning when
+ * initializing the subplans for execution.
+ */
+typedef struct PartitionPruneResult
+{
+ NodeTag type;
+
+ Bitmapset *valid_subplan_offs;
+} PartitionPruneResult;
/*
* Plan invalidation info
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 0499635f59..32579d4788 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -220,7 +220,8 @@ extern List *CachedPlanGetTargetList(CachedPlanSource *plansource,
extern CachedPlan *GetCachedPlan(CachedPlanSource *plansource,
ParamListInfo boundParams,
ResourceOwner owner,
- QueryEnvironment *queryEnv);
+ QueryEnvironment *queryEnv,
+ List **part_prune_results_list);
extern void ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner);
extern bool CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index aeddbdafe5..1901fc5f28 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -138,6 +138,7 @@ typedef struct PortalData
QueryCompletion qc; /* command completion data for executed query */
List *stmts; /* list of PlannedStmts */
CachedPlan *cplan; /* CachedPlan, if stmts are from one */
+ List *part_prune_results_list; /* List of Lists of PartitionPruneResults */
ParamListInfo portalParams; /* params to pass to query */
QueryEnvironment *queryEnv; /* environment for query */
@@ -242,6 +243,8 @@ extern void PortalDefineQuery(Portal portal,
CommandTag commandTag,
List *stmts,
CachedPlan *cplan);
+extern void PortalStorePartitionPruneResults(Portal portal,
+ List *part_prune_results_list);
extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
extern void PortalCreateHoldStore(Portal portal);
extern void PortalHashTableDeleteAll(void);
--
2.35.3
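For readers skimming the patch, the lock-set computation in
AcquireExecutorLocks() can be modeled with a small standalone sketch. This
is not PostgreSQL code: RtiSet is a toy 64-bit stand-in for Bitmapset,
ToyPlannedStmt and every toy_* function are hypothetical names, and the
"pruning" step simply assumes that the bound parameter selects one leaf
partition. Only the shape of the logic (always lock minLockRelids, and add
the leaf partitions that survive initial pruning) is taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for a Bitmapset: bit N set means RT index N is a member. */
typedef uint64_t RtiSet;

/* Hypothetical, simplified view of the PlannedStmt fields the patch adds. */
typedef struct ToyPlannedStmt
{
    bool    containsInitialPruning;
    RtiSet  minLockRelids;      /* relations that must always be locked */
    RtiSet  prunableLeafRtis;   /* leaf partitions under prunable subplans */
} ToyPlannedStmt;

/* Build a one-member set; RT indexes start at 1. */
static RtiSet
rti_singleton(int rti)
{
    assert(rti > 0 && rti < 64);
    return (RtiSet) 1 << rti;
}

/*
 * Stand-in for initial pruning: pretend the bound parameter selects exactly
 * one leaf partition. Real pruning evaluates the initial pruning steps.
 */
static RtiSet
toy_initial_pruning(const ToyPlannedStmt *stmt, int param_rti)
{
    return stmt->prunableLeafRtis & rti_singleton(param_rti);
}

/* Models the allLockRelids computation in AcquireExecutorLocks(). */
static RtiSet
toy_relids_to_lock(const ToyPlannedStmt *stmt, int param_rti)
{
    if (!stmt->containsInitialPruning)
        return stmt->minLockRelids;
    return stmt->minLockRelids | toy_initial_pruning(stmt, param_rti);
}

/* Count members, i.e. how many relations would be locked. */
static int
rtiset_count(RtiSet set)
{
    int     n = 0;

    while (set != 0)
    {
        set &= set - 1;         /* clear the lowest set bit */
        n++;
    }
    return n;
}
```

With a parent at RT index 1 and leaf partitions at 2, 3 and 4, a parameter
that matches only partition 3 gives a lock set of two relations in this
model, rather than all four.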
* Re: generic plans and "initial" pruning
@ 2022-12-06 19:00 Alvaro Herrera <[email protected]>
parent: Amit Langote <[email protected]>
0 siblings, 1 reply; 71+ messages in thread
From: Alvaro Herrera @ 2022-12-06 19:00 UTC (permalink / raw)
To: Amit Langote <[email protected]>; +Cc: Robert Haas <[email protected]>; Jacob Champion <[email protected]>; David Rowley <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers
I find the API of GetCachedPlan a little weird after this patch. I
think it may be better to have it return a pointer to a new struct --
one that contains both the CachedPlan pointer and the list of pruning
results. (As I understand, the sole caller that isn't interested in the
pruning results, SPI_plan_get_cached_plan, can be explained by the fact
that it knows there won't be any. So I don't think we need to worry
about this case?)
And I think you should make that struct also be the last argument of
PortalDefineQuery, so you don't need the separate
PortalStorePartitionPruneResults function -- because as far as I can
tell, the callers that pass a non-NULL pointer there are exactly the
same ones that later call PortalStorePartitionPruneResults.
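To make the suggestion concrete, here is a minimal sketch of such a container, assuming stand-in types. CachedPlan and List below are placeholders rather than the real PostgreSQL definitions, and the names CachedPlanWithPruning and make_plan_with_pruning are hypothetical, not taken from any posted patch version.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the PostgreSQL types named above; NOT the real definitions. */
typedef struct CachedPlan { int generation; } CachedPlan;
typedef struct List { int length; } List;

/* Hypothetical container: the name and fields are illustrative only. */
typedef struct CachedPlanWithPruning
{
    CachedPlan *cplan;                   /* the cached plan itself */
    List       *part_prune_results_list; /* pruning results, or NULL if none */
} CachedPlanWithPruning;

/* Hypothetical helper that bundles a plan with its pruning results. */
static CachedPlanWithPruning
make_plan_with_pruning(CachedPlan *cplan, List *results)
{
    CachedPlanWithPruning r;

    assert(cplan != NULL);
    r.cplan = cplan;
    r.part_prune_results_list = results;
    return r;
}
```

A caller that knows there are no pruning results, as SPI_plan_get_cached_plan is said to, would simply see a NULL list in the returned struct.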
--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
"La primera ley de las demostraciones en vivo es: no trate de usar el sistema.
Escriba un guión que no toque nada para no causar daños." (Jakob Nielsen)
* Re: generic plans and "initial" pruning
@ 2022-12-09 08:26 Amit Langote <[email protected]>
parent: Alvaro Herrera <[email protected]>
0 siblings, 1 reply; 71+ messages in thread
From: Amit Langote @ 2022-12-09 08:26 UTC (permalink / raw)
To: Alvaro Herrera <[email protected]>; +Cc: Robert Haas <[email protected]>; Jacob Champion <[email protected]>; David Rowley <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers
Thanks for the review.
On Wed, Dec 7, 2022 at 4:00 AM Alvaro Herrera <[email protected]> wrote:
> I find the API of GetCachedPlan a little weird after this patch. I
> think it may be better to have it return a pointer of a new struct --
> one that contains both the CachedPlan pointer and the list of pruning
> results. (As I understand, the sole caller that isn't interested in the
> pruning results, SPI_plan_get_cached_plan, can be explained by the fact
> that it knows there won't be any. So I don't think we need to worry
> about this case?)
David, in his Apr 7 reply on this thread, also seemed to suggest
something similar.
Hmm, I was / am not so sure if GetCachedPlan() should return something
that is not CachedPlan. An idea I had today was to replace the
part_prune_results_list output List parameter with, say,
QueryInitPruningResult, or something like that and put the current
list into that struct. Was looking at QueryEnvironment to come up
with *that* name. Any thoughts?
> And I think you should make that struct also be the last argument of
> PortalDefineQuery, so you don't need the separate
> PortalStorePartitionPruneResults function -- because as far as I can
> tell, the callers that pass a non-NULL pointer there are exactly the
> same ones that later call PortalStorePartitionPruneResults.
Yes, it would be better to not need PortalStorePartitionPruneResults.
--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com
* Re: generic plans and "initial" pruning
@ 2022-12-09 09:52 Alvaro Herrera <[email protected]>
parent: Amit Langote <[email protected]>
0 siblings, 1 reply; 71+ messages in thread
From: Alvaro Herrera @ 2022-12-09 09:52 UTC (permalink / raw)
To: Amit Langote <[email protected]>; +Cc: Robert Haas <[email protected]>; Jacob Champion <[email protected]>; David Rowley <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers
On 2022-Dec-09, Amit Langote wrote:
> On Wed, Dec 7, 2022 at 4:00 AM Alvaro Herrera <[email protected]> wrote:
> > I find the API of GetCachedPlan a little weird after this patch.
> David, in his Apr 7 reply on this thread, also seemed to suggest
> something similar.
>
> Hmm, I was / am not so sure if GetCachedPlan() should return something
> that is not CachedPlan. An idea I had today was to replace the
> part_prune_results_list output List parameter with, say,
> QueryInitPruningResult, or something like that and put the current
> list into that struct. Was looking at QueryEnvironment to come up
> with *that* name. Any thoughts?
Remind me again why is part_prune_results_list not part of struct
CachedPlan then? I tried to understand that based on comments upthread,
but I was unable to find anything.
(My first reaction to your above comment was "well, rename GetCachedPlan
then, maybe to GetRunnablePlan", but then I'm wondering if CachedPlan is
in any way a structure that must be "immutable" in the way parser output
is. Looking at the comment at the top of plancache.c it appears to me
that it isn't, but maybe I'm missing something.)
--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
"The Postgresql hackers have what I call a "NASA space shot" mentality.
Quite refreshing in a world of "weekend drag racer" developers."
(Scott Marlowe)
* Re: generic plans and "initial" pruning
@ 2022-12-09 10:34 Amit Langote <[email protected]>
parent: Alvaro Herrera <[email protected]>
0 siblings, 1 reply; 71+ messages in thread
From: Amit Langote @ 2022-12-09 10:34 UTC (permalink / raw)
To: Alvaro Herrera <[email protected]>; +Cc: Robert Haas <[email protected]>; Jacob Champion <[email protected]>; David Rowley <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers
On Fri, Dec 9, 2022 at 6:52 PM Alvaro Herrera <[email protected]> wrote:
> On 2022-Dec-09, Amit Langote wrote:
> > On Wed, Dec 7, 2022 at 4:00 AM Alvaro Herrera <[email protected]> wrote:
> > > I find the API of GetCachedPlan a little weird after this patch.
>
> > David, in his Apr 7 reply on this thread, also seemed to suggest
> > something similar.
> >
> > Hmm, I was / am not so sure if GetCachedPlan() should return something
> > that is not CachedPlan. An idea I had today was to replace the
> > part_prune_results_list output List parameter with, say,
> > QueryInitPruningResult, or something like that and put the current
> > list into that struct. Was looking at QueryEnvironment to come up
> > with *that* name. Any thoughts?
>
> Remind me again why is part_prune_results_list not part of struct
> CachedPlan then? I tried to understand that based on comments upthread,
> but I was unable to find anything.
It used to be part of CachedPlan for a brief period of time (in patch
v12 I posted in [1]), but David, in his reply to [1], said he wasn't
so sure that it belonged there.
> (My first reaction to your above comment was "well, rename GetCachedPlan
> then, maybe to GetRunnablePlan", but then I'm wondering if CachedPlan is
> in any way a structure that must be "immutable" in the way parser output
> is. Looking at the comment at the top of plancache.c it appears to me
> that it isn't, but maybe I'm missing something.)
CachedPlan *is* supposed to be read-only per the comment above
CachedPlanSource definition:
* ...If we are using a generic
* cached plan then it is meant to be re-used across multiple executions, so
* callers must always treat CachedPlans as read-only.
FYI, there was even an idea of putting a PartitionPruneResults for a
given PlannedStmt into the PlannedStmt itself [2], but PlannedStmt is
supposed to be read-only too [3].
Maybe we need some new overarching context when invoking plancache, if
Portal can't already be it, whose struct can be passed to
GetCachedPlan() to put the pruning results in? Perhaps,
GetRunnablePlan() that you floated could be a wrapper for
GetCachedPlan(), owning that new context.
--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com
[1] https://www.postgresql.org/message-id/CA%2BHiwqH4qQ_YVROr7TY0jSCuGn0oHhH79_DswOdXWN5UnMCBtQ%40mail.g...
[2] https://www.postgresql.org/message-id/CAApHDvp_DjVVkgSV24%2BUF7p_yKWeepgoo%2BW2SWLLhNmjwHTVYQ%40mail...
[3] https://www.postgresql.org/message-id/922566.1648784745%40sss.pgh.pa.us
* Re: generic plans and "initial" pruning
@ 2022-12-09 10:49 Alvaro Herrera <[email protected]>
parent: Amit Langote <[email protected]>
0 siblings, 1 reply; 71+ messages in thread
From: Alvaro Herrera @ 2022-12-09 10:49 UTC (permalink / raw)
To: Amit Langote <[email protected]>; +Cc: Robert Haas <[email protected]>; Jacob Champion <[email protected]>; David Rowley <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers
On 2022-Dec-09, Amit Langote wrote:
> On Fri, Dec 9, 2022 at 6:52 PM Alvaro Herrera <[email protected]> wrote:
> > Remind me again why is part_prune_results_list not part of struct
> > CachedPlan then? I tried to understand that based on comments upthread,
> > but I was unable to find anything.
>
> It used to be part of CachedPlan for a brief period of time (in patch
> v12 I posted in [1]), but David, in his reply to [1], said he wasn't
> so sure that it belonged there.
I'm not sure I necessarily agree with that. I'll have a look at v12 to
try and understand what David was so unhappy about.
> > (My first reaction to your above comment was "well, rename GetCachedPlan
> > then, maybe to GetRunnablePlan", but then I'm wondering if CachedPlan is
> > in any way a structure that must be "immutable" in the way parser output
> > is. Looking at the comment at the top of plancache.c it appears to me
> > that it isn't, but maybe I'm missing something.)
>
> CachedPlan *is* supposed to be read-only per the comment above
> CachedPlanSource definition:
>
> * ...If we are using a generic
> * cached plan then it is meant to be re-used across multiple executions, so
> * callers must always treat CachedPlans as read-only.
I read that as implying that the part_prune_results_list must remain
intact as long as no invalidations occur. Does part_prune_results_list
really change as a result of something other than a sinval event?
Keep in mind that if a sinval message that touches one of the relations
in the plan arrives, then we'll discard it and generate it afresh. I
don't see that the part_prune_results_list would change otherwise, but
maybe I misunderstand?
> FYI, there was even an idea of putting a PartitionPruneResults for a
> given PlannedStmt into the PlannedStmt itself [2], but PlannedStmt is
> supposed to be read-only too [3].
Hmm, I'm not familiar with PlannedStmt lifetime, but I'm definitely not
betting that Tom is wrong about this.
> Maybe we need some new overarching context when invoking plancache, if
> Portal can't already be it, whose struct can be passed to
> GetCachedPlan() to put the pruning results in? Perhaps,
> GetRunnablePlan() that you floated could be a wrapper for
> GetCachedPlan(), owning that new context.
Perhaps that is a solution. I'm not sure.
--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
"Uno puede defenderse de los ataques; contra los elogios se esta indefenso"
* Re: generic plans and "initial" pruning
@ 2022-12-09 11:02 Amit Langote <[email protected]>
parent: Alvaro Herrera <[email protected]>
0 siblings, 1 reply; 71+ messages in thread
From: Amit Langote @ 2022-12-09 11:02 UTC (permalink / raw)
To: Alvaro Herrera <[email protected]>; +Cc: Robert Haas <[email protected]>; Jacob Champion <[email protected]>; David Rowley <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers
On Fri, Dec 9, 2022 at 7:49 PM Alvaro Herrera <[email protected]> wrote:
> On 2022-Dec-09, Amit Langote wrote:
> > On Fri, Dec 9, 2022 at 6:52 PM Alvaro Herrera <[email protected]> wrote:
> > > Remind me again why is part_prune_results_list not part of struct
> > > CachedPlan then? I tried to understand that based on comments upthread,
> > > but I was unable to find anything.
> >
> > > (My first reaction to your above comment was "well, rename GetCachedPlan
> > > then, maybe to GetRunnablePlan", but then I'm wondering if CachedPlan is
> > > in any way a structure that must be "immutable" in the way parser output
> > > is. Looking at the comment at the top of plancache.c it appears to me
> > > that it isn't, but maybe I'm missing something.)
> >
> > CachedPlan *is* supposed to be read-only per the comment above
> > CachedPlanSource definition:
> >
> > * ...If we are using a generic
> > * cached plan then it is meant to be re-used across multiple executions, so
> > * callers must always treat CachedPlans as read-only.
>
> I read that as implying that the part_prune_results_list must remain
> intact as long as no invalidations occur. Does part_prune_results_list
> really change as a result of something other than a sinval event?
> Keep in mind that if a sinval message that touches one of the relations
> in the plan arrives, then we'll discard it and generate it afresh. I
> don't see that the part_prune_results_list would change otherwise, but
> maybe I misunderstand?
Pruning will be done afresh on every fetch of a given cached plan when
CheckCachedPlan() is called on it, so the part_prune_results_list part
will be discarded and rebuilt as many times as the plan is executed.
You'll find a description around CachedPlanSavePartitionPruneResults()
that's in v12.
--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com
* Re: generic plans and "initial" pruning
@ 2022-12-09 11:37 Alvaro Herrera <[email protected]>
parent: Amit Langote <[email protected]>
0 siblings, 1 reply; 71+ messages in thread
From: Alvaro Herrera @ 2022-12-09 11:37 UTC (permalink / raw)
To: Amit Langote <[email protected]>; +Cc: Robert Haas <[email protected]>; Jacob Champion <[email protected]>; David Rowley <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers
On 2022-Dec-09, Amit Langote wrote:
> Pruning will be done afresh on every fetch of a given cached plan when
> CheckCachedPlan() is called on it, so the part_prune_results_list part
> will be discarded and rebuilt as many times as the plan is executed.
> You'll find a description around CachedPlanSavePartitionPruneResults()
> that's in v12.
I see.
In that case, a separate container struct seems warranted.
--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
"Industry suffers from the managerial dogma that for the sake of stability
and continuity, the company should be independent of the competence of
individual employees." (E. Dijkstra)
* Re: generic plans and "initial" pruning
@ 2022-12-12 11:19 Amit Langote <[email protected]>
parent: Alvaro Herrera <[email protected]>
0 siblings, 1 reply; 71+ messages in thread
From: Amit Langote @ 2022-12-12 11:19 UTC (permalink / raw)
To: Alvaro Herrera <[email protected]>; +Cc: Robert Haas <[email protected]>; Jacob Champion <[email protected]>; David Rowley <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers
On Fri, Dec 9, 2022 at 8:37 PM Alvaro Herrera <[email protected]> wrote:
> On 2022-Dec-09, Amit Langote wrote:
>
> > Pruning will be done afresh on every fetch of a given cached plan when
> > CheckCachedPlan() is called on it, so the part_prune_results_list part
> > will be discarded and rebuilt as many times as the plan is executed.
> > You'll find a description around CachedPlanSavePartitionPruneResults()
> > that's in v12.
>
> I see.
>
> In that case, a separate container struct seems warranted.
I thought about this today and played around with some container struct ideas.
Though, I started feeling like putting all the new logic being added
by this patch into plancache.c at the heart of GetCachedPlan() and
tweaking its API in kind of unintuitive ways may not have been such a
good idea to begin with. So I started thinking again about your
GetRunnablePlan() wrapper idea and thought maybe we could do something
with it. Let's say we name it GetCachedPlanLockPartitions() and put
the logic that does initial pruning with the new
ExecutorDoInitialPruning() in it, instead of in the normal
GetCachedPlan() path. Any callers that call GetCachedPlan() instead
call GetCachedPlanLockPartitions() with either the List ** parameter
as now or some container struct if that seems better. Whether
GetCachedPlanLockPartitions() needs to do anything other than return
the CachedPlan returned by GetCachedPlan() can be decided by the
latter setting, say, CachedPlan.has_unlocked_partitions. That will be
done by AcquireExecutorLocks() when it sees containsInitialPruning in
any of the PlannedStmts it sees, locking only the
PlannedStmt.minLockRelids set (which is all relations where no pruning
is needed!), leaving the partition locking to
GetCachedPlanLockPartitions(). If the CachedPlan is invalidated
during the partition locking phase, it calls GetCachedPlan() again;
maybe some refactoring is needed to avoid too much useless work in
such cases.
Thoughts?
--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com
* Re: generic plans and "initial" pruning
@ 2022-12-12 17:24 Alvaro Herrera <[email protected]>
parent: Amit Langote <[email protected]>
0 siblings, 1 reply; 71+ messages in thread
From: Alvaro Herrera @ 2022-12-12 17:24 UTC (permalink / raw)
To: Amit Langote <[email protected]>; +Cc: Robert Haas <[email protected]>; Jacob Champion <[email protected]>; David Rowley <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers
On 2022-Dec-12, Amit Langote wrote:
> I started feeling like putting all the new logic being added
> by this patch into plancache.c at the heart of GetCachedPlan() and
> tweaking its API in kind of unintuitive ways may not have been such a
> good idea to begin with. So I started thinking again about your
> GetRunnablePlan() wrapper idea and thought maybe we could do something
> with it. Let's say we name it GetCachedPlanLockPartitions() and put
> the logic that does initial pruning with the new
> ExecutorDoInitialPruning() in it, instead of in the normal
> GetCachedPlan() path. Any callers that call GetCachedPlan() instead
> call GetCachedPlanLockPartitions() with either the List ** parameter
> as now or some container struct if that seems better. Whether
> GetCachedPlanLockPartitions() needs to do anything other than return
> the CachedPlan returned by GetCachedPlan() can be decided by the
> latter setting, say, CachedPlan.has_unlocked_partitions. That will be
> done by AcquireExecutorLocks() when it sees containsInitialPruning in
> any of the PlannedStmts it sees, locking only the
> PlannedStmt.minLockRelids set (which is all relations where no pruning
> is needed!), leaving the partition locking to
> GetCachedPlanLockPartitions().
Hmm. This doesn't sound totally unreasonable, except to the point David
was making that perhaps we may want this container struct to accommodate
other things in the future than just the partition pruning results, so I
think its name (and that of the function that produces it) ought to be a
little more generic than that.
(I think this also answers your question on whether a List ** is better
than a container struct.)
--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
"Las cosas son buenas o malas segun las hace nuestra opinión" (Lisias)
* Re: generic plans and "initial" pruning
@ 2022-12-14 08:35 Amit Langote <[email protected]>
parent: Alvaro Herrera <[email protected]>
0 siblings, 1 reply; 71+ messages in thread
From: Amit Langote @ 2022-12-14 08:35 UTC (permalink / raw)
To: Alvaro Herrera <[email protected]>; +Cc: Robert Haas <[email protected]>; Jacob Champion <[email protected]>; David Rowley <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers
On Tue, Dec 13, 2022 at 2:24 AM Alvaro Herrera <[email protected]> wrote:
> On 2022-Dec-12, Amit Langote wrote:
> > I started feeling like putting all the new logic being added
> > by this patch into plancache.c at the heart of GetCachedPlan() and
> > tweaking its API in kind of unintuitive ways may not have been such a
> > good idea to begin with. So I started thinking again about your
> > GetRunnablePlan() wrapper idea and thought maybe we could do something
> > with it. Let's say we name it GetCachedPlanLockPartitions() and put
> > the logic that does initial pruning with the new
> > ExecutorDoInitialPruning() in it, instead of in the normal
> > GetCachedPlan() path. Any callers that call GetCachedPlan() instead
> > call GetCachedPlanLockPartitions() with either the List ** parameter
> > as now or some container struct if that seems better. Whether
> > GetCachedPlanLockPartitions() needs to do anything other than return
> > the CachedPlan returned by GetCachedPlan() can be decided by the
> > latter setting, say, CachedPlan.has_unlocked_partitions. That will be
> > done by AcquireExecutorLocks() when it sees containsInitialPruning in
> > any of the PlannedStmts it sees, locking only the
> > PlannedStmt.minLockRelids set (which is all relations where no pruning
> > is needed!), leaving the partition locking to
> > GetCachedPlanLockPartitions().
>
> Hmm. This doesn't sound totally unreasonable, except to the point David
> was making that perhaps we may want this container struct to accommodate
> other things in the future than just the partition pruning results, so I
> think its name (and that of the function that produces it) ought to be a
> little more generic than that.
>
> (I think this also answers your question on whether a List ** is better
> than a container struct.)
OK, so here's a WIP attempt at that.
I have moved the original functionality of GetCachedPlan() to
GetCachedPlanInternal(), turning the former into a sort of controller
as described shortly. The latter's CheckCachedPlan() part now locks
only the "minimal" set of non-prunable relations, making a note of
whether the plan contains any prunable subnodes and thus prunable
relations whose locking is deferred to the caller, GetCachedPlan().
GetCachedPlan(), as a sort of controller as mentioned before, does the
pruning if needed on the minimally valid plan returned by
GetCachedPlanInternal(), locks the partitions that survive, and redoes
the whole thing if the locking of partitions invalidates the plan.
The pruning results are returned through the new output parameter of
GetCachedPlan() of type CachedPlanExtra. I named it so after much
consideration, because all the new logic that produces stuff to put
into it is a part of the plancache module and has to do with
manipulating a CachedPlan. (I had considered CachedPlanExecInfo to
indicate that it contains information that is to be forwarded to the
executor, though that just didn't seem to fit in plancache.h.)
I have broken out a few things into a preparatory patch 0001. Mainly,
it invents PlannedStmt.minLockRelids to replace the
AcquireExecutorLocks()'s current loop over the range table to figure
out the relations to lock. I also threw in a couple of pruning
related non-functional changes in there to make it easier to read the
0002, which is the main patch.
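For readers unfamiliar with the idea behind minLockRelids, here is a self-contained toy version. It uses a 64-bit mask in place of PostgreSQL's Bitmapset, and the toy_* names are invented for this sketch: start with the full range of RT indexes, delete the prunable ones, and have the locking step visit only what remains.

```c
#include <assert.h>
#include <stdint.h>

/* Toy set of RT indexes 1..63; a stand-in for PostgreSQL's Bitmapset. */
typedef uint64_t ToyRelids;

/* Add every index in [lower, upper], like adding the query's RT index range. */
static ToyRelids
toy_add_range(ToyRelids set, int lower, int upper)
{
    assert(lower >= 1 && upper <= 63 && lower <= upper);
    for (int i = lower; i <= upper; i++)
        set |= (UINT64_C(1) << i);
    return set;
}

/* Remove one index, as is done for each prunable relation. */
static ToyRelids
toy_del_member(ToyRelids set, int member)
{
    assert(member >= 1 && member <= 63);
    return set & ~(UINT64_C(1) << member);
}

/*
 * Collect the RT indexes that would be locked up front.
 * Returns how many were written into locked[].
 */
static int
toy_lock_min_relids(ToyRelids set, int *locked, int maxlocked)
{
    int n = 0;

    for (int rti = 1; rti <= 63 && n < maxlocked; rti++)
    {
        if (set & (UINT64_C(1) << rti))
            locked[n++] = rti;
    }
    return n;
}
```

With RT indexes 1 to 5 and indexes 3 and 4 marked prunable, only 1, 2 and 5 are locked up front; the prunable ones are left for the later partition-locking step.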
--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com
Attachments:
[application/octet-stream] v29-0001-Preparatory-refactoring-before-reworking-CachedP.patch (17.2K, 2-v29-0001-Preparatory-refactoring-before-reworking-CachedP.patch)
From 14a1198bdaad007b1dc835f24caa42d3667c7048 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Tue, 13 Dec 2022 11:58:07 +0900
Subject: [PATCH v29 1/2] Preparatory refactoring before reworking CachedPlan
locking
Remember the RT indexes of RTEs that AcquireExecutorLocks() must
look at to consider locking in a bitmapset, so that instead of looping
over the range table to find those RTEs, it can look them up using
the RT indexes set in the bitmapset.
This also adds some extra information related to execution-time
pruning to the relevant plan nodes.
---
src/backend/executor/execParallel.c | 1 +
src/backend/executor/execPartition.c | 6 ++++
src/backend/nodes/readfuncs.c | 8 ++++--
src/backend/optimizer/plan/planner.c | 2 ++
src/backend/optimizer/plan/setrefs.c | 12 ++++++++
src/backend/partitioning/partprune.c | 42 ++++++++++++++++++++++++++--
src/backend/utils/cache/plancache.c | 10 +++++--
src/include/executor/execPartition.h | 2 ++
src/include/nodes/nodes.h | 1 +
src/include/nodes/pathnodes.h | 11 ++++++++
src/include/nodes/plannodes.h | 19 +++++++++++++
11 files changed, 106 insertions(+), 8 deletions(-)
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index a5b8e43ec5..65c4b63bbd 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -182,6 +182,7 @@ ExecSerializePlan(Plan *plan, EState *estate)
pstmt->transientPlan = false;
pstmt->dependsOnRole = false;
pstmt->parallelModeNeeded = false;
+ pstmt->containsInitialPruning = false; /* workers need not know! */
pstmt->planTree = plan;
pstmt->partPruneInfos = estate->es_part_prune_infos;
pstmt->rtable = estate->es_range_table;
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 76d79b9741..5b62157712 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -1956,6 +1956,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
Assert(partdesc->nparts >= pinfo->nparts);
pprune->nparts = partdesc->nparts;
pprune->subplan_map = palloc(sizeof(int) * partdesc->nparts);
+ pprune->rti_map = palloc(sizeof(Index) * partdesc->nparts);
if (partdesc->nparts == pinfo->nparts)
{
/*
@@ -1966,6 +1967,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
pprune->subpart_map = pinfo->subpart_map;
memcpy(pprune->subplan_map, pinfo->subplan_map,
sizeof(int) * pinfo->nparts);
+ memcpy(pprune->rti_map, pinfo->rti_map,
+ sizeof(Index) * pinfo->nparts);
/*
* Double-check that the list of unpruned relations has not
@@ -2016,6 +2019,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
pinfo->subplan_map[pd_idx];
pprune->subpart_map[pp_idx] =
pinfo->subpart_map[pd_idx];
+ pprune->rti_map[pp_idx] =
+ pinfo->rti_map[pd_idx];
pd_idx++;
}
else
@@ -2023,6 +2028,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
/* this partdesc entry is not in the plan */
pprune->subplan_map[pp_idx] = -1;
pprune->subpart_map[pp_idx] = -1;
+ pprune->rti_map[pp_idx] = 0;
}
}
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 966b75f5a6..1161671fa4 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -158,6 +158,11 @@
token = pg_strtok(&length); /* skip :fldname */ \
local_node->fldname = readIntCols(len)
+/* Read an Index array */
+#define READ_INDEX_ARRAY(fldname, len) \
+ token = pg_strtok(&length); /* skip :fldname */ \
+ local_node->fldname = readIndexCols(len)
+
/* Read a bool array */
#define READ_BOOL_ARRAY(fldname, len) \
token = pg_strtok(&length); /* skip :fldname */ \
@@ -796,7 +801,6 @@ fnname(int numCols) \
*/
READ_SCALAR_ARRAY(readAttrNumberCols, int16, atoi)
READ_SCALAR_ARRAY(readOidCols, Oid, atooid)
-/* outfuncs.c has writeIndexCols, but we don't yet need that here */
-/* READ_SCALAR_ARRAY(readIndexCols, Index, atoui) */
+READ_SCALAR_ARRAY(readIndexCols, Index, atoui)
READ_SCALAR_ARRAY(readIntCols, int, atoi)
READ_SCALAR_ARRAY(readBoolCols, bool, strtobool)
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 5dd4f92720..620b163ef9 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -523,8 +523,10 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
result->parallelModeNeeded = glob->parallelModeNeeded;
result->planTree = top_plan;
result->partPruneInfos = glob->partPruneInfos;
+ result->containsInitialPruning = glob->containsInitialPruning;
result->rtable = glob->finalrtable;
result->permInfos = glob->finalrteperminfos;
+ result->minLockRelids = glob->minLockRelids;
result->resultRelations = glob->resultRelations;
result->appendRelations = glob->appendRelations;
result->subplans = glob->subplans;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 596f1fbc8e..ed43d5936d 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -279,6 +279,16 @@ set_plan_references(PlannerInfo *root, Plan *plan)
*/
add_rtes_to_flat_rtable(root, false);
+ /*
+ * Add the query's adjusted range of RT indexes to glob->minLockRelids.
+ * The adjusted RT indexes of prunable relations will be deleted from the
+ * set below where PartitionPruneInfos are processed.
+ */
+ glob->minLockRelids =
+ bms_add_range(glob->minLockRelids,
+ rtoffset + 1,
+ rtoffset + list_length(root->parse->rtable));
+
/*
* Adjust RT indexes of PlanRowMarks and add to final rowmarks list
*/
@@ -377,9 +387,11 @@ set_plan_references(PlannerInfo *root, Plan *plan)
/* RT index of the table to which the pinfo belongs. */
pinfo->rtindex += rtoffset;
}
+
}
glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
+ glob->containsInitialPruning |= pruneinfo->needs_init_pruning;
}
return result;
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index d48f6784c1..56270d7670 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -144,7 +144,9 @@ static List *make_partitionedrel_pruneinfo(PlannerInfo *root,
List *prunequal,
Bitmapset *partrelids,
int *relid_subplan_map,
- Bitmapset **matchedsubplans);
+ Bitmapset **matchedsubplans,
+ bool *needs_init_pruning,
+ bool *needs_exec_pruning);
static void gen_partprune_steps(RelOptInfo *rel, List *clauses,
PartClauseTarget target,
GeneratePruningStepsContext *context);
@@ -234,6 +236,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int *relid_subplan_map;
ListCell *lc;
int i;
+ bool needs_init_pruning = false;
+ bool needs_exec_pruning = false;
/*
* Scan the subpaths to see which ones are scans of partition child
@@ -313,12 +317,16 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
Bitmapset *partrelids = (Bitmapset *) lfirst(lc);
List *pinfolist;
Bitmapset *matchedsubplans = NULL;
+ bool partrel_needs_init_pruning;
+ bool partrel_needs_exec_pruning;
pinfolist = make_partitionedrel_pruneinfo(root, parentrel,
prunequal,
partrelids,
relid_subplan_map,
- &matchedsubplans);
+ &matchedsubplans,
+ &partrel_needs_init_pruning,
+ &partrel_needs_exec_pruning);
/* When pruning is possible, record the matched subplans */
if (pinfolist != NIL)
@@ -327,6 +335,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
allmatchedsubplans = bms_join(matchedsubplans,
allmatchedsubplans);
}
+
+ needs_init_pruning |= partrel_needs_init_pruning;
+ needs_exec_pruning |= partrel_needs_exec_pruning;
}
pfree(relid_subplan_map);
@@ -342,6 +353,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
pruneinfo = makeNode(PartitionPruneInfo);
pruneinfo->root_parent_relids = parentrel->relids;
pruneinfo->prune_infos = prunerelinfos;
+ pruneinfo->needs_init_pruning = needs_init_pruning;
+ pruneinfo->needs_exec_pruning = needs_exec_pruning;
/*
* Some subplans may not belong to any of the identified partitioned rels.
@@ -442,13 +455,19 @@ add_part_relids(List *allpartrelids, Bitmapset *partrelids)
* If we cannot find any useful run-time pruning steps, return NIL.
* However, on success, each rel identified in partrelids will have
* an element in the result list, even if some of them are useless.
+ * *needs_init_pruning and *needs_exec_pruning are set to indicate whether
+ * the pruning steps contained in the returned PartitionedRelPruneInfos
+ * can be performed during executor startup and during execution,
+ * respectively.
*/
static List *
make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
List *prunequal,
Bitmapset *partrelids,
int *relid_subplan_map,
- Bitmapset **matchedsubplans)
+ Bitmapset **matchedsubplans,
+ bool *needs_init_pruning,
+ bool *needs_exec_pruning)
{
RelOptInfo *targetpart = NULL;
List *pinfolist = NIL;
@@ -459,6 +478,10 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int rti;
int i;
+ /* Will find out below. */
+ *needs_init_pruning = false;
+ *needs_exec_pruning = false;
+
/*
* Examine each partitioned rel, constructing a temporary array to map
* from planner relids to index of the partitioned rel, and building a
@@ -546,6 +569,9 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
* executor per-scan pruning steps. This first pass creates startup
* pruning steps and detects whether there's any possibly-useful quals
* that would require per-scan pruning.
+ *
+ * In the first pass, we note whether the 2nd pass is necessary by
+ * noting the presence of EXEC parameters.
*/
gen_partprune_steps(subpart, partprunequal, PARTTARGET_INITIAL,
&context);
@@ -620,6 +646,12 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
pinfo->execparamids = execparamids;
/* Remaining fields will be filled in the next loop */
+ /* record which types of pruning steps we've seen so far */
+ if (initial_pruning_steps != NIL)
+ *needs_init_pruning = true;
+ if (exec_pruning_steps != NIL)
+ *needs_exec_pruning = true;
+
pinfolist = lappend(pinfolist, pinfo);
}
@@ -647,6 +679,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int *subplan_map;
int *subpart_map;
Oid *relid_map;
+ Index *rti_map;
/*
* Construct the subplan and subpart maps for this partitioning level.
@@ -659,6 +692,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
subpart_map = (int *) palloc(nparts * sizeof(int));
memset(subpart_map, -1, nparts * sizeof(int));
relid_map = (Oid *) palloc0(nparts * sizeof(Oid));
+ rti_map = (Index *) palloc0(nparts * sizeof(Index));
present_parts = NULL;
i = -1;
@@ -673,6 +707,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
subplan_map[i] = subplanidx = relid_subplan_map[partrel->relid] - 1;
subpart_map[i] = subpartidx = relid_subpart_map[partrel->relid] - 1;
relid_map[i] = planner_rt_fetch(partrel->relid, root)->relid;
+ rti_map[i] = partrel->relid;
if (subplanidx >= 0)
{
present_parts = bms_add_member(present_parts, i);
@@ -697,6 +732,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
pinfo->subplan_map = subplan_map;
pinfo->subpart_map = subpart_map;
pinfo->relid_map = relid_map;
+ pinfo->rti_map = rti_map;
}
pfree(relid_subpart_map);
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index cc943205d3..339bb603f7 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -1747,7 +1747,8 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
foreach(lc1, stmt_list)
{
PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
- ListCell *lc2;
+ Bitmapset *allLockRelids;
+ int rti;
if (plannedstmt->commandType == CMD_UTILITY)
{
@@ -1760,14 +1761,17 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
*/
Query *query = UtilityContainsQuery(plannedstmt->utilityStmt);
+ Assert(plannedstmt->minLockRelids == NULL);
if (query)
ScanQueryForLocks(query, acquire);
continue;
}
- foreach(lc2, plannedstmt->rtable)
+ allLockRelids = plannedstmt->minLockRelids;
+ rti = -1;
+ while ((rti = bms_next_member(allLockRelids, rti)) > 0)
{
- RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
+ RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
if (rte->rtekind != RTE_RELATION)
continue;
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 17fabc18c9..aeeaeb7884 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -45,6 +45,7 @@ extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
* nparts Length of subplan_map[] and subpart_map[].
* subplan_map Subplan index by partition index, or -1.
* subpart_map Subpart index by partition index, or -1.
+ * rti_map Range table index by partition index, or 0.
* present_parts A Bitmapset of the partition indexes that we
* have subplans or subparts for.
* initial_pruning_steps List of PartitionPruneSteps used to
@@ -61,6 +62,7 @@ typedef struct PartitionedRelPruningData
int nparts;
int *subplan_map;
int *subpart_map;
+ Index *rti_map;
Bitmapset *present_parts;
List *initial_pruning_steps;
List *exec_pruning_steps;
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 1f33902947..c2f2544df5 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -218,6 +218,7 @@ extern struct Bitmapset *readBitmapset(void);
extern uintptr_t readDatum(bool typbyval);
extern bool *readBoolCols(int numCols);
extern int *readIntCols(int numCols);
+extern Index *readIndexCols(int numCols);
extern Oid *readOidCols(int numCols);
extern int16 *readAttrNumberCols(int numCols);
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 654dba61aa..4337e7aa34 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -128,6 +128,17 @@ typedef struct PlannerGlobal
/* List of PartitionPruneInfo contained in the plan */
List *partPruneInfos;
+ /*
+ * Do any of those PartitionPruneInfos have initial pruning steps in them?
+ */
+ bool containsInitialPruning;
+
+ /*
+ * Indexes of all range table entries; for AcquireExecutorLocks()'s
+ * perusal.
+ */
+ Bitmapset *minLockRelids;
+
/* OIDs of relations the plan depends on */
List *relationOids;
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index bddfe86191..eb0a007946 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -73,11 +73,18 @@ typedef struct PlannedStmt
List *partPruneInfos; /* List of PartitionPruneInfo contained in the
* plan */
+ bool containsInitialPruning; /* Do any of those PartitionPruneInfos
+ * have initial pruning steps in them?
+ */
+
List *rtable; /* list of RangeTblEntry nodes */
List *permInfos; /* list of RTEPermissionInfo nodes for rtable
* entries needing one */
+ Bitmapset *minLockRelids; /* Indexes of all range table entries; for
+ * AcquireExecutorLocks()'s perusal */
+
/* rtable indexes of target relations for INSERT/UPDATE/DELETE/MERGE */
List *resultRelations; /* integer list of RT indexes, or NIL */
@@ -1417,6 +1424,13 @@ typedef struct PlanRowMark
* prune_infos List of Lists containing PartitionedRelPruneInfo nodes,
* one sublist per run-time-prunable partition hierarchy
* appearing in the parent plan node's subplans.
+ *
+ * needs_init_pruning Does any of the PartitionedRelPruneInfos in
+ * prune_infos have its initial_pruning_steps set?
+ *
+ * needs_exec_pruning Does any of the PartitionedRelPruneInfos in
+ * prune_infos have its exec_pruning_steps set?
+ *
* other_subplans Indexes of any subplans that are not accounted for
* by any of the PartitionedRelPruneInfo nodes in
* "prune_infos". These subplans must not be pruned.
@@ -1428,6 +1442,8 @@ typedef struct PartitionPruneInfo
NodeTag type;
Bitmapset *root_parent_relids;
List *prune_infos;
+ bool needs_init_pruning;
+ bool needs_exec_pruning;
Bitmapset *other_subplans;
} PartitionPruneInfo;
@@ -1472,6 +1488,9 @@ typedef struct PartitionedRelPruneInfo
/* relation OID by partition index, or 0 */
Oid *relid_map pg_node_attr(array_size(nparts));
+ /* Range table index by partition index, or 0. */
+ Index *rti_map pg_node_attr(array_size(nparts));
+
/*
* initial_pruning_steps shows how to prune during executor startup (i.e.,
* without use of any PARAM_EXEC Params); it is NIL if no startup pruning
--
2.35.3
[application/octet-stream] v29-0002-In-GetCachedPlan-only-lock-unpruned-partitions.patch (67.1K, 3-v29-0002-In-GetCachedPlan-only-lock-unpruned-partitions.patch)
From 69855fffacf69575471beb69da761babadc9f75c Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Wed, 22 Dec 2021 16:55:17 +0900
Subject: [PATCH v29 2/2] In GetCachedPlan(), only lock unpruned partitions
This does two things mainly:
* The planner now removes the RT indexes of "initially prunable"
partitions from PlannedStmt.minLockRelids such that the set only
contains the relations not subject to initial partition pruning. So,
AcquireExecutorLocks only locks a subset of the relations contained
in a plan, deferring the locking of prunable relations to the caller.
* GetCachedPlan(), if there are prunable relations in the plan,
performs the initial partition pruning using available EXTERN params
and locks the partitions remaining after that, so that the CachedPlan
that's returned is valid in a race-free manner including for any
partitions that will be scanned during execution.
To make the pruning possible before entering ExecutorStart(), this
also adds ExecPartitionDoInitialPruning(), which can be called by
GetCachedPlan() for a given PlannedStmt.
The result of performing initial partition pruning this way is made
available to the actual execution via PartitionPruneResult, of which
there is one for every PartitionPruneInfo contained in the PlannedStmt.
The List of PartitionPruneResults for a given PlannedStmt is returned
to the callers of GetCachedPlan() via its new output parameter of type
CachedPlanExtra, whose members currently only include said List.
---
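To illustrate the "prune first, then lock only the survivors" idea described
above, here is a minimal standalone sketch. It is a toy model, not the patch's
code: the 64-bit mask stands in for a Bitmapset, and every name in it
(RelidSet, ToyPartition, ToyPlan, toy_initial_pruning, toy_relids_to_lock) is
hypothetical. It assumes each leaf partition holds a single key value and that
one EXTERN parameter value drives the pruning.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for a Bitmapset: bit i set means RT index i is a member.
 * Only RT indexes below 64 are supported in this sketch. */
typedef uint64_t RelidSet;

/* Hypothetical leaf partition: its RT index and the one key value it holds. */
typedef struct ToyPartition
{
    int rti;    /* range table index, 1-based */
    int key;    /* the only partition key value stored in this partition */
} ToyPartition;

/* Hypothetical plan: always-locked relations plus prunable leaf partitions. */
typedef struct ToyPlan
{
    RelidSet            min_lock_relids;  /* not subject to initial pruning */
    const ToyPartition *parts;            /* "initially prunable" partitions */
    int                 nparts;
} ToyPlan;

static RelidSet
relid_add(RelidSet set, int rti)
{
    return set | (UINT64_C(1) << rti);
}

static bool
relid_is_member(RelidSet set, int rti)
{
    return ((set >> rti) & 1) != 0;
}

/* Toy "initial pruning": keep partitions whose key equals the parameter. */
static RelidSet
toy_initial_pruning(const ToyPlan *plan, int param_value)
{
    RelidSet survivors = 0;

    for (int i = 0; i < plan->nparts; i++)
    {
        if (plan->parts[i].key == param_value)
            survivors = relid_add(survivors, plan->parts[i].rti);
    }
    return survivors;
}

/* Relations to lock: the always-locked set plus the pruning survivors. */
static RelidSet
toy_relids_to_lock(const ToyPlan *plan, int param_value)
{
    return plan->min_lock_relids | toy_initial_pruning(plan, param_value);
}
```

In this model, a caller would build a ToyPlan whose min_lock_relids holds the
parent's RT index, call toy_relids_to_lock() with the parameter value, and
lock only the relations whose bits are set in the result; partitions pruned
away by the parameter are never locked.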
src/backend/commands/copyto.c | 2 +-
src/backend/commands/createas.c | 2 +-
src/backend/commands/explain.c | 7 +-
src/backend/commands/extension.c | 2 +-
src/backend/commands/matview.c | 2 +-
src/backend/commands/prepare.c | 28 ++-
src/backend/executor/README | 31 ++-
src/backend/executor/execMain.c | 2 +
src/backend/executor/execParallel.c | 25 ++-
src/backend/executor/execPartition.c | 215 +++++++++++++++++----
src/backend/executor/execUtils.c | 1 +
src/backend/executor/functions.c | 2 +-
src/backend/executor/nodeAppend.c | 11 +-
src/backend/executor/nodeMergeAppend.c | 5 +-
src/backend/executor/spi.c | 31 ++-
src/backend/optimizer/plan/setrefs.c | 36 ++++
src/backend/tcop/postgres.c | 9 +-
src/backend/tcop/pquery.c | 28 ++-
src/backend/utils/cache/plancache.c | 257 +++++++++++++++++++++++--
src/backend/utils/mmgr/portalmem.c | 16 ++
src/include/commands/explain.h | 4 +-
src/include/executor/execPartition.h | 7 +-
src/include/executor/execdesc.h | 3 +
src/include/nodes/execnodes.h | 1 +
src/include/nodes/pathnodes.h | 4 +-
src/include/nodes/plannodes.h | 31 ++-
src/include/utils/plancache.h | 11 +-
src/include/utils/portal.h | 3 +
28 files changed, 694 insertions(+), 82 deletions(-)
diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index f26cc0d162..401a2280a3 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -558,7 +558,7 @@ BeginCopyTo(ParseState *pstate,
((DR_copy *) dest)->cstate = cstate;
/* Create a QueryDesc requesting no output */
- cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ cstate->queryDesc = CreateQueryDesc(plan, NIL, pstate->p_sourcetext,
GetActiveSnapshot(),
InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 152c29b551..942449544c 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -325,7 +325,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ queryDesc = CreateQueryDesc(plan, NIL, pstate->p_sourcetext,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index f86983c660..2f2b558608 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -407,7 +407,7 @@ ExplainOneQuery(Query *query, int cursorOptions,
}
/* run it (if needed) and produce output */
- ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+ ExplainOnePlan(plan, NIL, into, es, queryString, params, queryEnv,
&planduration, (es->buffers ? &bufusage : NULL));
}
}
@@ -515,7 +515,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
* to call it.
*/
void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, List *part_prune_results,
+ IntoClause *into, ExplainState *es,
const char *queryString, ParamListInfo params,
QueryEnvironment *queryEnv, const instr_time *planduration,
const BufferUsage *bufusage)
@@ -563,7 +564,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
dest = None_Receiver;
/* Create a QueryDesc for the query */
- queryDesc = CreateQueryDesc(plannedstmt, queryString,
+ queryDesc = CreateQueryDesc(plannedstmt, part_prune_results, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, instrument_option);
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index cf1b1ca571..904cbcba4a 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -779,7 +779,7 @@ execute_sql_string(const char *sql)
{
QueryDesc *qdesc;
- qdesc = CreateQueryDesc(stmt,
+ qdesc = CreateQueryDesc(stmt, NIL,
sql,
GetActiveSnapshot(), NULL,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 9ac0383459..65c8d0aa59 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -408,7 +408,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, queryString,
+ queryDesc = CreateQueryDesc(plan, NIL, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 9e29584d93..729384a9a6 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -154,6 +154,7 @@ ExecuteQuery(ParseState *pstate,
{
PreparedStatement *entry;
CachedPlan *cplan;
+ CachedPlanExtra *cplan_extra = NULL;
List *plan_list;
ParamListInfo paramLI = NULL;
EState *estate = NULL;
@@ -193,7 +194,11 @@ ExecuteQuery(ParseState *pstate,
entry->plansource->query_string);
/* Replan if needed, and increment plan refcount for portal */
- cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL);
+ cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL,
+ &cplan_extra);
+ Assert(cplan_extra == NULL ||
+ (list_length(cplan->stmt_list) ==
+ list_length(cplan_extra->part_prune_results_list)));
plan_list = cplan->stmt_list;
/*
@@ -207,6 +212,9 @@ ExecuteQuery(ParseState *pstate,
plan_list,
cplan);
+ if (cplan_extra)
+ PortalSaveCachedPlanExtra(portal, cplan_extra);
+
/*
* For CREATE TABLE ... AS EXECUTE, we must verify that the prepared
* statement is one that produces tuples. Currently we insist that it be
@@ -575,6 +583,7 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
PreparedStatement *entry;
const char *query_string;
CachedPlan *cplan;
+ CachedPlanExtra *cplan_extra = NULL;
List *plan_list;
ListCell *p;
ParamListInfo paramLI = NULL;
@@ -619,7 +628,11 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
/* Replan if needed, and acquire a transient refcount */
cplan = GetCachedPlan(entry->plansource, paramLI,
- CurrentResourceOwner, queryEnv);
+ CurrentResourceOwner, queryEnv,
+ &cplan_extra);
+ Assert(cplan_extra == NULL ||
+ (list_length(cplan->stmt_list) ==
+ list_length(cplan_extra->part_prune_results_list)));
INSTR_TIME_SET_CURRENT(planduration);
INSTR_TIME_SUBTRACT(planduration, planstart);
@@ -637,10 +650,17 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
foreach(p, plan_list)
{
PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
+ List *part_prune_results = NIL;
+
+ if (cplan_extra)
+ part_prune_results = list_nth_node(List,
+ cplan_extra->part_prune_results_list,
+ foreach_current_index(p));
if (pstmt->commandType != CMD_UTILITY)
- ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv,
- &planduration, (es->buffers ? &bufusage : NULL));
+ ExplainOnePlan(pstmt, part_prune_results, into, es, query_string,
+ paramLI, queryEnv, &planduration,
+ (es->buffers ? &bufusage : NULL));
else
ExplainOneUtility(pstmt->utilityStmt, into, es, query_string,
paramLI, queryEnv);
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 17775a49e2..2222b3ed6f 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -63,7 +63,36 @@ if the executor determines that an entire subplan is not required due to
execution time partition pruning determining that no matching records will be
found there. This currently only occurs for Append and MergeAppend nodes. In
this case the non-required subplans are ignored and the executor state's
-subnode array will become out of sequence to the plan's subplan list.
+subnode array will become out of sequence to the plan's subplan list. Note
+that this is referred to as "initial" pruning, because it needs to occur only
+once during the execution startup, and uses a set of pruning steps called
+initial pruning steps (see PartitionedRelPruneInfo.initial_pruning_steps).
+
+Actually, "initial" pruning may occur even before the execution startup in
+some cases, for example, when a cached generic plan is validated for
+execution, which works by locking all the relations that will be scanned by
+that plan during execution. If the generic plan contains plan nodes that have
+prunable child subnodes, then this validation locking is performed after
+pruning child subnodes that need not be scanned during execution, that is,
+using initial pruning steps. When such a generic plan is forwarded for
+execution, it must be accompanied by the set of PartitionPruneResult nodes that
+contain the result of that pruning, which basically consists of a bitmapset of
+child subnode indexes that survived the pruning and thus whose relations would
+have been locked for execution. This is important, because, unlike the
+plan-time pruning and actual executor-startup pruning, this does not actually
+remove the pruned subnodes from the plan tree, but only marks them as being
+pruned. So, the executor code (core or third party), especially one that runs
+before ExecutorStart() and thus looks at bare Plan trees (not PlanState trees)
+must beware of plan nodes that may actually have been pruned and are thus
+subject to being invalidated by concurrent schema changes. For plan nodes
+that can have prunable child subnodes and thus contain a PartitionPruneInfo,
+such code must always check if the corresponding PartitionPruneResult exists
+in EState.es_part_prune_results at the given part_prune_index and use that to
+decide which subplans are valid for execution instead of redoing the pruning.
+Note that that is not just a performance optimization but also necessary to
+avoid possibly ending up considering a different set of child subnodes as valid
+than the set CachedPlanLockPartitions() would have locked the relations of, if
+the pruning steps produce a different result when executed multiple times.
Each Plan node may have expression trees associated with it, to represent
its target list, qualification conditions, etc. These trees are also
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 2c2b3a8874..229f61f72e 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -798,6 +798,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
{
CmdType operation = queryDesc->operation;
PlannedStmt *plannedstmt = queryDesc->plannedstmt;
+ List *part_prune_results = queryDesc->part_prune_results;
Plan *plan = plannedstmt->planTree;
List *rangeTable = plannedstmt->rtable;
EState *estate = queryDesc->estate;
@@ -819,6 +820,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
estate->es_plannedstmt = plannedstmt;
estate->es_part_prune_infos = plannedstmt->partPruneInfos;
+ estate->es_part_prune_results = part_prune_results;
/*
* Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index 65c4b63bbd..9745eba0af 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -66,6 +66,7 @@
#define PARALLEL_KEY_QUERY_TEXT UINT64CONST(0xE000000000000008)
#define PARALLEL_KEY_JIT_INSTRUMENTATION UINT64CONST(0xE000000000000009)
#define PARALLEL_KEY_WAL_USAGE UINT64CONST(0xE00000000000000A)
+#define PARALLEL_KEY_PARTITION_PRUNE_RESULTS UINT64CONST(0xE00000000000000B)
#define PARALLEL_TUPLE_QUEUE_SIZE 65536
@@ -599,12 +600,15 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
FixedParallelExecutorState *fpes;
char *pstmt_data;
char *pstmt_space;
+ char *part_prune_results_data;
+ char *part_prune_results_space;
char *paramlistinfo_space;
BufferUsage *bufusage_space;
WalUsage *walusage_space;
SharedExecutorInstrumentation *instrumentation = NULL;
SharedJitInstrumentation *jit_instrumentation = NULL;
int pstmt_len;
+ int part_prune_results_len;
int paramlistinfo_len;
int instrumentation_len = 0;
int jit_instrumentation_len = 0;
@@ -633,6 +637,7 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
/* Fix up and serialize plan to be sent to workers. */
pstmt_data = ExecSerializePlan(planstate->plan, estate);
+ part_prune_results_data = nodeToString(estate->es_part_prune_results);
/* Create a parallel context. */
pcxt = CreateParallelContext("postgres", "ParallelQueryMain", nworkers);
@@ -659,6 +664,11 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
shm_toc_estimate_chunk(&pcxt->estimator, pstmt_len);
shm_toc_estimate_keys(&pcxt->estimator, 1);
+ /* Estimate space for serialized List of PartitionPruneResult. */
+ part_prune_results_len = strlen(part_prune_results_data) + 1;
+ shm_toc_estimate_chunk(&pcxt->estimator, part_prune_results_len);
+ shm_toc_estimate_keys(&pcxt->estimator, 1);
+
/* Estimate space for serialized ParamListInfo. */
paramlistinfo_len = EstimateParamListSpace(estate->es_param_list_info);
shm_toc_estimate_chunk(&pcxt->estimator, paramlistinfo_len);
@@ -753,6 +763,12 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
memcpy(pstmt_space, pstmt_data, pstmt_len);
shm_toc_insert(pcxt->toc, PARALLEL_KEY_PLANNEDSTMT, pstmt_space);
+ /* Store serialized List of PartitionPruneResult */
+ part_prune_results_space = shm_toc_allocate(pcxt->toc, part_prune_results_len);
+ memcpy(part_prune_results_space, part_prune_results_data, part_prune_results_len);
+ shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARTITION_PRUNE_RESULTS,
+ part_prune_results_space);
+
/* Store serialized ParamListInfo. */
paramlistinfo_space = shm_toc_allocate(pcxt->toc, paramlistinfo_len);
shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARAMLISTINFO, paramlistinfo_space);
@@ -1234,8 +1250,10 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
int instrument_options)
{
char *pstmtspace;
+ char *part_prune_results_space;
char *paramspace;
PlannedStmt *pstmt;
+ List *part_prune_results;
ParamListInfo paramLI;
char *queryString;
@@ -1246,12 +1264,17 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
pstmtspace = shm_toc_lookup(toc, PARALLEL_KEY_PLANNEDSTMT, false);
pstmt = (PlannedStmt *) stringToNode(pstmtspace);
+ /* Reconstruct leader-supplied PartitionPruneResult. */
+ part_prune_results_space =
+ shm_toc_lookup(toc, PARALLEL_KEY_PARTITION_PRUNE_RESULTS, false);
+ part_prune_results = (List *) stringToNode(part_prune_results_space);
+
/* Reconstruct ParamListInfo. */
paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
paramLI = RestoreParamList(&paramspace);
/* Create a QueryDesc for the query. */
- return CreateQueryDesc(pstmt,
+ return CreateQueryDesc(pstmt, part_prune_results,
queryString,
GetActiveSnapshot(), InvalidSnapshot,
receiver, paramLI, NULL, instrument_options);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 5b62157712..dcd2bb0f90 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -25,6 +25,7 @@
#include "mb/pg_wchar.h"
#include "miscadmin.h"
#include "nodes/makefuncs.h"
+#include "parser/parsetree.h"
#include "partitioning/partbounds.h"
#include "partitioning/partdesc.h"
#include "partitioning/partprune.h"
@@ -185,7 +186,11 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
static PartitionPruneState *CreatePartitionPruneState(PlanState *planstate,
- PartitionPruneInfo *pruneinfo);
+ PartitionPruneInfo *pruneinfo,
+ bool consider_initial_steps,
+ bool consider_exec_steps,
+ List *rtable, ExprContext *econtext,
+ PartitionDirectory partdir);
static void InitPartitionPruneContext(PartitionPruneContext *context,
List *pruning_steps,
PartitionDesc partdesc,
@@ -198,7 +203,8 @@ static void PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
- Bitmapset **validsubplans);
+ Bitmapset **validsubplans,
+ Bitmapset **scan_leafpart_rtis);
/*
@@ -1742,7 +1748,8 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
* considered to be a stable expression, it can change value from one plan
* node scan to the next during query execution. Stable comparison
* expressions that don't involve such Params allow partition pruning to be
- * done once during executor startup. Expressions that do involve such Params
+ * done once during executor startup or even before that, such as when called
+ * from CachedPlanLockPartitions(). Expressions that do involve such Params
* require us to prune separately for each scan of the parent plan node.
*
* Note that pruning away unneeded subplans during executor startup has the
@@ -1760,6 +1767,12 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
* account for initial pruning possibly having eliminated some of the
* subplans.
*
+ * ExecPartitionDoInitialPruning:
+ * Do initial pruning with the information contained in a given
+ * PartitionPruneInfo to determine the set of the parent plan node's
+ * child subnodes that are valid for execution and also the set of the RT
+ * indexes of leaf partitions scanned by those subnodes.
+ *
* ExecFindMatchingSubPlans:
* Returns indexes of matching subplans after evaluating the expressions
* that are safe to evaluate at a given point. This function is first
@@ -1780,8 +1793,10 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
*
* On return, *initially_valid_subplans is assigned the set of indexes of
* child subplans that must be initialized along with the parent plan node.
- * Initial pruning is performed here if needed and in that case only the
- * surviving subplans' indexes are added.
+ * That set is computed by either performing the "initial pruning" here or
+ * reusing the one present in EState.es_part_prune_results[part_prune_index]
+ * if it has been set, which is the case when CachedPlanLockPartitions() has
+ * already done the initial pruning.
*
* If subplans are indeed pruned, subplan_map arrays contained in the returned
* PartitionPruneState are re-sequenced to not count those, though only if the
@@ -1794,9 +1809,10 @@ ExecInitPartitionPruning(PlanState *planstate,
Bitmapset *root_parent_relids,
Bitmapset **initially_valid_subplans)
{
- PartitionPruneState *prunestate;
+ PartitionPruneState *prunestate = NULL;
EState *estate = planstate->state;
PartitionPruneInfo *pruneinfo;
+ PartitionPruneResult *pruneresult = NULL;
/* Obtain the pruneinfo we need, and make sure it's the right one */
pruneinfo = list_nth(estate->es_part_prune_infos, part_prune_index);
@@ -1812,20 +1828,62 @@ ExecInitPartitionPruning(PlanState *planstate,
/* We may need an expression context to evaluate partition exprs */
ExecAssignExprContext(estate, planstate);
- /* Create the working data structure for pruning */
- prunestate = CreatePartitionPruneState(planstate, pruneinfo);
+ /* Initial pruning already done if es_part_prune_results has been set. */
+ if (estate->es_part_prune_results)
+ {
+ pruneresult = list_nth_node(PartitionPruneResult,
+ estate->es_part_prune_results,
+ part_prune_index);
+ if (!bms_equal(pruneresult->root_parent_relids, pruneinfo->root_parent_relids))
+ ereport(ERROR,
+ errcode(ERRCODE_INTERNAL_ERROR),
+ errmsg_internal("mismatching PartitionPruneInfo and PartitionPruneResult at part_prune_index %d",
+ part_prune_index),
+ errdetail_internal("pruneresult relids %s, pruneinfo relids %s",
+ bmsToString(pruneresult->root_parent_relids),
+ bmsToString(pruneinfo->root_parent_relids)));
+ }
+
+ if (pruneresult == NULL || pruneinfo->needs_exec_pruning)
+ {
+ /* We may need an expression context to evaluate partition exprs */
+ ExecAssignExprContext(estate, planstate);
+
+ /* For data reading, executor always omits detached partitions */
+ if (estate->es_partition_directory == NULL)
+ estate->es_partition_directory =
+ CreatePartitionDirectory(estate->es_query_cxt, false);
+
+ /*
+ * Create the working data structure for pruning. No need to consider
+ * initial pruning steps if we have a PartitionPruneResult.
+ */
+ prunestate = CreatePartitionPruneState(planstate, pruneinfo,
+ pruneresult == NULL,
+ pruneinfo->needs_exec_pruning,
+ NIL, planstate->ps_ExprContext,
+ estate->es_partition_directory);
+ }
/*
* Perform an initial partition prune pass, if required.
*/
- if (prunestate->do_initial_prune)
- *initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true);
+ if (pruneresult)
+ {
+ *initially_valid_subplans = bms_copy(pruneresult->valid_subplan_offs);
+ }
+ else if (prunestate && prunestate->do_initial_prune)
+ {
+ *initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true,
+ NULL);
+ }
else
{
- /* No pruning, so we'll need to initialize all subplans */
+ /* No initial pruning, so we'll need to initialize all subplans */
Assert(n_total_subplans > 0);
*initially_valid_subplans = bms_add_range(NULL, 0,
n_total_subplans - 1);
+ return prunestate;
}
/*
@@ -1833,7 +1891,8 @@ ExecInitPartitionPruning(PlanState *planstate,
* that were removed above due to initial pruning. No need to do this if
* no steps were removed.
*/
- if (bms_num_members(*initially_valid_subplans) < n_total_subplans)
+ if (prunestate &&
+ bms_num_members(*initially_valid_subplans) < n_total_subplans)
{
/*
* We can safely skip this when !do_exec_prune, even though that
@@ -1849,11 +1908,58 @@ ExecInitPartitionPruning(PlanState *planstate,
return prunestate;
}
+/*
+ * ExecPartitionDoInitialPruning
+ * Perform initial pruning using given PartitionPruneInfo to determine
+ * the set of the parent plan node's child subnodes that are valid for
+ * execution
+ *
+ * On return, *scan_leafpart_rtis will contain the RT indexes of leaf
+ * partitions scanned by those valid subnodes.
+ *
+ * Note that this does not share state with the actual execution, so it must
+ * make do with the information present in the PlannedStmt. For example,
+ * there isn't a PlanState for the parent plan node yet, so we must create a
+ * standalone ExprContext to evaluate pruning expressions, equipped with the
+ * information about the EXTERN parameters that we do have. That's okay because
+ * the initial pruning steps do not contain anything that would require the
+ * execution to have started. Likewise, we create our own PartitionDirectory
+ * to look up the PartitionDescs to use.
+ */
+Bitmapset *
+ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
+ PartitionPruneInfo *pruneinfo,
+ Bitmapset **scan_leafpart_rtis)
+{
+ List *rtable = plannedstmt->rtable;
+ ExprContext *econtext;
+ PartitionDirectory pdir;
+ PartitionPruneState *prunestate;
+ Bitmapset *valid_subplan_offs;
+
+ /* Don't omit detached partitions, just like during execution proper. */
+ pdir = CreatePartitionDirectory(CurrentMemoryContext, false);
+ econtext = CreateStandaloneExprContext();
+ econtext->ecxt_param_list_info = params;
+ prunestate = CreatePartitionPruneState(NULL, pruneinfo, true, false,
+ rtable, econtext, pdir);
+ valid_subplan_offs = ExecFindMatchingSubPlans(prunestate, true,
+ scan_leafpart_rtis);
+
+ FreeExprContext(econtext, true);
+ DestroyPartitionDirectory(pdir);
+
+ return valid_subplan_offs;
+}
+
/*
* CreatePartitionPruneState
* Build the data structure required for calling ExecFindMatchingSubPlans
*
- * 'planstate' is the parent plan node's execution state.
+ * 'planstate', if not NULL, is the parent plan node's execution state. It
+ * can be NULL if being called before ExecutorStart(), in which case,
+ * 'rtable' (range table), 'econtext', and 'partdir' must be explicitly
+ * provided.
*
* 'pruneinfo' is a PartitionPruneInfo as generated by
* make_partition_pruneinfo. Here we build a PartitionPruneState containing a
@@ -1867,19 +1973,21 @@ ExecInitPartitionPruning(PlanState *planstate,
* PartitionedRelPruneInfo.
*/
static PartitionPruneState *
-CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
+CreatePartitionPruneState(PlanState *planstate,
+ PartitionPruneInfo *pruneinfo,
+ bool consider_initial_steps,
+ bool consider_exec_steps,
+ List *rtable, ExprContext *econtext,
+ PartitionDirectory partdir)
{
- EState *estate = planstate->state;
+ EState *estate = planstate ? planstate->state : NULL;
PartitionPruneState *prunestate;
int n_part_hierarchies;
ListCell *lc;
int i;
- ExprContext *econtext = planstate->ps_ExprContext;
- /* For data reading, executor always omits detached partitions */
- if (estate->es_partition_directory == NULL)
- estate->es_partition_directory =
- CreatePartitionDirectory(estate->es_query_cxt, false);
+ Assert((estate != NULL) ||
+ (partdir != NULL && econtext != NULL && rtable != NIL));
n_part_hierarchies = list_length(pruneinfo->prune_infos);
Assert(n_part_hierarchies > 0);
@@ -1934,15 +2042,39 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
PartitionKey partkey;
/*
- * We can rely on the copies of the partitioned table's partition
- * key and partition descriptor appearing in its relcache entry,
- * because that entry will be held open and locked for the
- * duration of this executor run.
+ * We must open the relation ourselves when called before the
+ * execution has started, for example, when called from
+ * CachedPlanLockPartitions(). In that case, sub-partitions must
+ * be locked, because AcquirePlannerLocks() would have locked only
+ * the root parent.
+ */
+ if (estate == NULL)
+ {
+ RangeTblEntry *rte = rt_fetch(pinfo->rtindex, rtable);
+ int lockmode = (j == 0) ? NoLock : rte->rellockmode;
+
+ partrel = table_open(rte->relid, lockmode);
+ }
+ else
+ partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
+
+ /*
+ * We can rely on the copy of the partitioned table's partition
+ * key in its relcache entry, because it can't change (or get
+ * destroyed) as long as the relation is locked. The partition
+ * descriptor is taken from the PartitionDirectory associated with
+ * the table, which is held open long enough for the descriptor to
+ * remain valid while it's used to perform the pruning steps.
*/
- partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
partkey = RelationGetPartitionKey(partrel);
- partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
- partrel);
+ partdesc = PartitionDirectoryLookup(partdir, partrel);
+
+ /*
+ * Must close partrel, keeping the lock taken, if we're not using
+ * EState's entry.
+ */
+ if (estate == NULL)
+ table_close(partrel, NoLock);
/*
* Initialize the subplan_map and subpart_map.
@@ -2050,7 +2182,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
* Initialize pruning contexts as needed.
*/
pprune->initial_pruning_steps = pinfo->initial_pruning_steps;
- if (pinfo->initial_pruning_steps)
+ if (consider_initial_steps && pinfo->initial_pruning_steps)
{
InitPartitionPruneContext(&pprune->initial_context,
pinfo->initial_pruning_steps,
@@ -2060,7 +2192,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
prunestate->do_initial_prune = true;
}
pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
- if (pinfo->exec_pruning_steps)
+ if (consider_exec_steps && pinfo->exec_pruning_steps)
{
InitPartitionPruneContext(&pprune->exec_context,
pinfo->exec_pruning_steps,
@@ -2288,10 +2420,14 @@ PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
* Pass initial_prune if PARAM_EXEC Params cannot yet be evaluated. This
* differentiates the initial executor-time pruning step from later
* runtime pruning.
+ *
+ * RT indexes of leaf partitions scanned by the chosen subplans are added to
+ * *scan_leafpart_rtis if the pointer is non-NULL.
*/
Bitmapset *
ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
- bool initial_prune)
+ bool initial_prune,
+ Bitmapset **scan_leafpart_rtis)
{
Bitmapset *result = NULL;
MemoryContext oldcontext;
@@ -2326,7 +2462,7 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
*/
pprune = &prunedata->partrelprunedata[0];
find_matching_subplans_recurse(prunedata, pprune, initial_prune,
- &result);
+ &result, scan_leafpart_rtis);
/* Expression eval may have used space in ExprContext too */
if (pprune->exec_pruning_steps)
@@ -2340,6 +2476,8 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
/* Copy result out of the temp context before we reset it */
result = bms_copy(result);
+ if (scan_leafpart_rtis)
+ *scan_leafpart_rtis = bms_copy(*scan_leafpart_rtis);
MemoryContextReset(prunestate->prune_context);
@@ -2350,13 +2488,15 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
* find_matching_subplans_recurse
* Recursive worker function for ExecFindMatchingSubPlans
*
- * Adds valid (non-prunable) subplan IDs to *validsubplans
+ * Adds valid (non-prunable) subplan IDs to *validsubplans and RT indexes of
+ * the corresponding leaf partitions to *scan_leafpart_rtis (if asked for).
*/
static void
find_matching_subplans_recurse(PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
- Bitmapset **validsubplans)
+ Bitmapset **validsubplans,
+ Bitmapset **scan_leafpart_rtis)
{
Bitmapset *partset;
int i;
@@ -2383,8 +2523,14 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
while ((i = bms_next_member(partset, i)) >= 0)
{
if (pprune->subplan_map[i] >= 0)
+ {
*validsubplans = bms_add_member(*validsubplans,
pprune->subplan_map[i]);
+ Assert(pprune->rti_map[i] > 0);
+ if (scan_leafpart_rtis)
+ *scan_leafpart_rtis = bms_add_member(*scan_leafpart_rtis,
+ pprune->rti_map[i]);
+ }
else
{
int partidx = pprune->subpart_map[i];
@@ -2392,7 +2538,8 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
if (partidx >= 0)
find_matching_subplans_recurse(prunedata,
&prunedata->partrelprunedata[partidx],
- initial_prune, validsubplans);
+ initial_prune, validsubplans,
+ scan_leafpart_rtis);
else
{
/*
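For readers following the find_matching_subplans_recurse() change above, here
is a small standalone sketch of the same idea: walk the surviving partitions,
record the subplan of each surviving leaf and, only if the caller asked for it,
the leaf's RT index. This is not PostgreSQL code. ToyPartRel,
toy_find_matching() and the use of a uint32_t bitmask in place of a Bitmapset
are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for PartitionedRelPruningData; every name here is hypothetical. */
typedef struct ToyPartRel
{
	int			nparts;
	const int  *subplan_map;	/* >= 0: subplan index of a leaf, else -1 */
	const int  *subpart_map;	/* >= 0: index of a sub-partitioned child, else -1 */
	const int  *rti_map;		/* RT index of each leaf partition, 0 if none */
	uint32_t	surviving;		/* bit i set: partition i survived pruning */
} ToyPartRel;

/*
 * Add the subplans of surviving leaf partitions to *validsubplans and, if
 * leafpart_rtis is not NULL, their RT indexes to *leafpart_rtis.
 */
static void
toy_find_matching(const ToyPartRel *rels, int relidx,
				  uint32_t *validsubplans, uint32_t *leafpart_rtis)
{
	const ToyPartRel *rel = &rels[relidx];

	for (int i = 0; i < rel->nparts; i++)
	{
		if ((rel->surviving & (UINT32_C(1) << i)) == 0)
			continue;			/* pruned */

		if (rel->subplan_map[i] >= 0)
		{
			/* Leaf partition: remember its subplan and, optionally, its RTI. */
			*validsubplans |= UINT32_C(1) << rel->subplan_map[i];
			if (leafpart_rtis != NULL)
				*leafpart_rtis |= UINT32_C(1) << rel->rti_map[i];
		}
		else if (rel->subpart_map[i] >= 0)
		{
			/* Sub-partitioned table: recurse into its own pruning data. */
			toy_find_matching(rels, rel->subpart_map[i],
							  validsubplans, leafpart_rtis);
		}
	}
}
```

Passing NULL for the last argument mirrors the existing executor call sites in
the patch, which only want the subplan set and pass NULL for
scan_leafpart_rtis.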
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 87f4d53ca7..7d36c972d3 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -139,6 +139,7 @@ CreateExecutorState(void)
estate->es_param_exec_vals = NULL;
estate->es_queryEnv = NULL;
+ estate->es_part_prune_results = NIL;
estate->es_query_cxt = qcontext;
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index dc13625171..bffb42ce71 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -842,7 +842,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
else
dest = None_Receiver;
- es->qd = CreateQueryDesc(es->stmt,
+ es->qd = CreateQueryDesc(es->stmt, NIL,
fcache->src,
GetActiveSnapshot(),
InvalidSnapshot,
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 99830198bd..3b917584de 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -156,7 +156,8 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
* subplan, we can fill as_valid_subplans immediately, preventing
* later calls to ExecFindMatchingSubPlans.
*/
- if (!prunestate->do_exec_prune && nplans > 0)
+ if (appendstate->as_prune_state == NULL ||
+ (!appendstate->as_prune_state->do_exec_prune && nplans > 0))
appendstate->as_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
}
else
@@ -578,7 +579,7 @@ choose_next_subplan_locally(AppendState *node)
}
else if (node->as_valid_subplans == NULL)
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
whichplan = -1;
}
@@ -643,7 +644,7 @@ choose_next_subplan_for_leader(AppendState *node)
if (node->as_valid_subplans == NULL)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
/*
* Mark each invalid plan as finished to allow the loop below to
@@ -718,7 +719,7 @@ choose_next_subplan_for_worker(AppendState *node)
else if (node->as_valid_subplans == NULL)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
mark_invalid_subplans_as_finished(node);
}
@@ -869,7 +870,7 @@ ExecAppendAsyncBegin(AppendState *node)
if (node->as_valid_subplans == NULL)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
classify_matching_subplans(node);
}
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index f370f9f287..ccfa083945 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -104,7 +104,8 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
* subplan, we can fill ms_valid_subplans immediately, preventing
* later calls to ExecFindMatchingSubPlans.
*/
- if (!prunestate->do_exec_prune && nplans > 0)
+ if (mergestate->ms_prune_state == NULL ||
+ (!mergestate->ms_prune_state->do_exec_prune && nplans > 0))
mergestate->ms_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
}
else
@@ -219,7 +220,7 @@ ExecMergeAppend(PlanState *pstate)
*/
if (node->ms_valid_subplans == NULL)
node->ms_valid_subplans =
- ExecFindMatchingSubPlans(node->ms_prune_state, false);
+ ExecFindMatchingSubPlans(node->ms_prune_state, false, NULL);
/*
* First time through: pull the first tuple from each valid subplan,
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index fd5796f1b9..2ecb9193aa 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1577,6 +1577,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
{
CachedPlanSource *plansource;
CachedPlan *cplan;
+ CachedPlanExtra *cplan_extra;
List *stmt_list;
char *query_string;
Snapshot snapshot;
@@ -1657,7 +1658,11 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
*/
/* Replan if needed, and increment plan refcount for portal */
- cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv);
+ cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv,
+ &cplan_extra);
+ Assert(cplan_extra == NULL ||
+ (list_length(cplan->stmt_list) ==
+ list_length(cplan_extra->part_prune_results_list)));
stmt_list = cplan->stmt_list;
if (!plan->saved)
@@ -1685,6 +1690,9 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
stmt_list,
cplan);
+ if (cplan_extra)
+ PortalSaveCachedPlanExtra(portal, cplan_extra);
+
/*
* Set up options for portal. Default SCROLL type is chosen the same way
* as PerformCursorOpen does it.
@@ -2067,6 +2075,7 @@ SPI_plan_get_cached_plan(SPIPlanPtr plan)
{
CachedPlanSource *plansource;
CachedPlan *cplan;
+ CachedPlanExtra *cplan_extra = NULL;
SPICallbackArg spicallbackarg;
ErrorContextCallback spierrcontext;
@@ -2092,8 +2101,12 @@ SPI_plan_get_cached_plan(SPIPlanPtr plan)
/* Get the generic plan for the query */
cplan = GetCachedPlan(plansource, NULL,
plan->saved ? CurrentResourceOwner : NULL,
- _SPI_current->queryEnv);
+ _SPI_current->queryEnv,
+ &cplan_extra);
Assert(cplan == plansource->gplan);
+ Assert(cplan_extra == NULL ||
+ (list_length(cplan->stmt_list) ==
+ list_length(cplan_extra->part_prune_results_list)));
/* Pop the error context stack */
error_context_stack = spierrcontext.previous;
@@ -2399,6 +2412,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
SPICallbackArg spicallbackarg;
ErrorContextCallback spierrcontext;
CachedPlan *cplan = NULL;
+ CachedPlanExtra *cplan_extra = NULL;
ListCell *lc1;
/*
@@ -2549,8 +2563,12 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
* plan, the refcount must be backed by the plan_owner.
*/
cplan = GetCachedPlan(plansource, options->params,
- plan_owner, _SPI_current->queryEnv);
+ plan_owner, _SPI_current->queryEnv,
+ &cplan_extra);
+ Assert(cplan_extra == NULL ||
+ (list_length(cplan->stmt_list) ==
+ list_length(cplan_extra->part_prune_results_list)));
stmt_list = cplan->stmt_list;
/*
@@ -2592,9 +2610,14 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
foreach(lc2, stmt_list)
{
PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+ List *part_prune_results = NIL;
bool canSetTag = stmt->canSetTag;
DestReceiver *dest;
+ if (cplan_extra)
+ part_prune_results = list_nth_node(List,
+ cplan_extra->part_prune_results_list,
+ foreach_current_index(lc2));
/*
* Reset output state. (Note that if a non-SPI receiver is used,
* _SPI_current->processed will stay zero, and that's what we'll
@@ -2663,7 +2686,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
else
snap = InvalidSnapshot;
- qdesc = CreateQueryDesc(stmt,
+ qdesc = CreateQueryDesc(stmt, part_prune_results,
plansource->query_string,
snap, crosscheck_snapshot,
dest,
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index ed43d5936d..db27cae297 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -372,6 +372,7 @@ set_plan_references(PlannerInfo *root, Plan *plan)
{
PartitionPruneInfo *pruneinfo = lfirst(lc);
ListCell *l;
+ Bitmapset *leafpart_rtis = NULL;
pruneinfo->root_parent_relids =
offset_relid_set(pruneinfo->root_parent_relids, rtoffset);
@@ -383,17 +384,52 @@ set_plan_references(PlannerInfo *root, Plan *plan)
foreach(l2, prune_infos)
{
PartitionedRelPruneInfo *pinfo = lfirst(l2);
+ int i;
/* RT index of the table to which the pinfo belongs. */
pinfo->rtindex += rtoffset;
+
+ /* Likewise, offset the RT indexes of leaf partitions that might be scanned. */
+ for (i = 0; i < pinfo->nparts; i++)
+ {
+ if (pinfo->rti_map[i] > 0 && pinfo->subplan_map[i] >= 0)
+ {
+ pinfo->rti_map[i] += rtoffset;
+ leafpart_rtis = bms_add_member(leafpart_rtis,
+ pinfo->rti_map[i]);
+ }
+ }
}
}
+ if (pruneinfo->needs_init_pruning)
+ {
+ glob->containsInitialPruning = true;
+
+ /*
+ * Delete the leaf partition RTIs from the set of relations to be
+ * locked by AcquireExecutorLocks(). The actual set of leaf
+ * partitions to be locked is computed by
+ * CachedPlanLockPartitions().
+ */
+ glob->minLockRelids = bms_del_members(glob->minLockRelids,
+ leafpart_rtis);
+ }
+
glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
glob->containsInitialPruning |= pruneinfo->needs_init_pruning;
}
+ /*
+ * It seems worth doing a bms_copy() on glob->minLockRelids if we deleted
+ * bits from it above to get rid of any empty tail bits. It seems better
+ * for the loop over this set in AcquireExecutorLocks() to not have to go
+ * through those useless bit words.
+ */
+ if (glob->containsInitialPruning)
+ glob->minLockRelids = bms_copy(glob->minLockRelids);
+
return result;
}
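The effect of the set_plan_references() change above on the set of relations
locked up front can be pictured with a tiny standalone model. This is only a
sketch under stated assumptions: toy_min_lock_relids() is a hypothetical
helper, and a uint32_t bitmask of RT indexes stands in for the Bitmapset kept
in glob->minLockRelids.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model of the up-front lock set: if the statement needs
 * initial pruning, leaf partition RT indexes are removed so that only the
 * partitions surviving pruning get locked later; otherwise everything is
 * locked up front.
 */
static uint32_t
toy_min_lock_relids(uint32_t all_relids, uint32_t leafpart_rtis,
					int needs_init_pruning)
{
	if (needs_init_pruning)
		return all_relids & ~leafpart_rtis;
	return all_relids;
}
```

With RT indexes 1 to 5 in the plan and 3 to 5 being leaf partitions, only 1
and 2 remain in the up-front set when initial pruning applies.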
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index f8808d2191..9c1c7bfa9e 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1598,6 +1598,7 @@ exec_bind_message(StringInfo input_message)
int16 *rformats = NULL;
CachedPlanSource *psrc;
CachedPlan *cplan;
+ CachedPlanExtra *cplan_extra = NULL;
Portal portal;
char *query_string;
char *saved_stmt_name;
@@ -1972,7 +1973,10 @@ exec_bind_message(StringInfo input_message)
* will be generated in MessageContext. The plan refcount will be
* assigned to the Portal, so it will be released at portal destruction.
*/
- cplan = GetCachedPlan(psrc, params, NULL, NULL);
+ cplan = GetCachedPlan(psrc, params, NULL, NULL, &cplan_extra);
+ Assert(cplan_extra == NULL ||
+ (list_length(cplan->stmt_list) ==
+ list_length(cplan_extra->part_prune_results_list)));
/*
* Now we can define the portal.
@@ -1987,6 +1991,9 @@ exec_bind_message(StringInfo input_message)
cplan->stmt_list,
cplan);
+ if (cplan_extra)
+ PortalSaveCachedPlanExtra(portal, cplan_extra);
+
/* Done with the snapshot used for parameter I/O and parsing/planning */
if (snapshot_set)
PopActiveSnapshot();
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 52e2db6452..32e6b7b767 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -35,7 +35,7 @@
Portal ActivePortal = NULL;
-static void ProcessQuery(PlannedStmt *plan,
+static void ProcessQuery(PlannedStmt *plan, List *part_prune_results,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -65,6 +65,7 @@ static void DoPortalRewind(Portal portal);
*/
QueryDesc *
CreateQueryDesc(PlannedStmt *plannedstmt,
+ List *part_prune_results,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
@@ -77,6 +78,7 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
qd->operation = plannedstmt->commandType; /* operation */
qd->plannedstmt = plannedstmt; /* plan */
+ qd->part_prune_results = part_prune_results;
qd->sourceText = sourceText; /* query text */
qd->snapshot = RegisterSnapshot(snapshot); /* snapshot */
/* RI check snapshot */
@@ -122,6 +124,7 @@ FreeQueryDesc(QueryDesc *qdesc)
* PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
*
* plan: the plan tree for the query
+ * part_prune_results: pruning results returned by CachedPlanLockPartitions()
* sourceText: the source text of the query
* params: any parameters needed
* dest: where to send results
@@ -134,6 +137,7 @@ FreeQueryDesc(QueryDesc *qdesc)
*/
static void
ProcessQuery(PlannedStmt *plan,
+ List *part_prune_results,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -145,7 +149,7 @@ ProcessQuery(PlannedStmt *plan,
/*
* Create the QueryDesc object
*/
- queryDesc = CreateQueryDesc(plan, sourceText,
+ queryDesc = CreateQueryDesc(plan, part_prune_results, sourceText,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
@@ -491,8 +495,13 @@ PortalStart(Portal portal, ParamListInfo params,
/*
* Create QueryDesc in portal's context; for the moment, set
* the destination to DestNone.
+ *
+ * There is no PartitionPruneResult unless the PlannedStmt is
+ * from a CachedPlan.
*/
queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+ portal->cplan_extra == NULL ? NIL :
+ linitial(portal->cplan_extra->part_prune_results_list),
portal->sourceText,
GetActiveSnapshot(),
InvalidSnapshot,
@@ -1225,6 +1234,8 @@ PortalRunMulti(Portal portal,
if (pstmt->utilityStmt == NULL)
{
+ List *part_prune_results = NIL;
+
/*
* process a plannable query.
*/
@@ -1271,10 +1282,19 @@ PortalRunMulti(Portal portal,
else
UpdateActiveSnapshotCommandId();
+ /*
+ * Determine if there's a corresponding List of PartitionPruneResult
+ * for this PlannedStmt.
+ */
+ if (portal->cplan_extra)
+ part_prune_results = list_nth_node(List,
+ portal->cplan_extra->part_prune_results_list,
+ foreach_current_index(stmtlist_item));
+
if (pstmt->canSetTag)
{
/* statement can set tag string */
- ProcessQuery(pstmt,
+ ProcessQuery(pstmt, part_prune_results,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
@@ -1283,7 +1303,7 @@ PortalRunMulti(Portal portal,
else
{
/* stmt added by rewrite cannot set tag */
- ProcessQuery(pstmt,
+ ProcessQuery(pstmt, part_prune_results,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 339bb603f7..7bd94e7632 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -59,6 +59,7 @@
#include "access/transam.h"
#include "catalog/namespace.h"
#include "executor/executor.h"
+#include "executor/execPartition.h"
#include "miscadmin.h"
#include "nodes/nodeFuncs.h"
#include "optimizer/optimizer.h"
@@ -96,17 +97,20 @@ static dlist_head saved_plan_list = DLIST_STATIC_INIT(saved_plan_list);
*/
static dlist_head cached_expression_list = DLIST_STATIC_INIT(cached_expression_list);
+static CachedPlan *GetCachedPlanInternal(CachedPlanSource *plansource,
+ ParamListInfo boundParams, ResourceOwner owner,
+ QueryEnvironment *queryEnv, bool *hasUnlockedParts);
static void ReleaseGenericPlan(CachedPlanSource *plansource);
static List *RevalidateCachedQuery(CachedPlanSource *plansource,
QueryEnvironment *queryEnv);
-static bool CheckCachedPlan(CachedPlanSource *plansource);
+static bool CheckCachedPlan(CachedPlanSource *plansource, bool *hasUnlockedParts);
static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
ParamListInfo boundParams, QueryEnvironment *queryEnv);
static bool choose_custom_plan(CachedPlanSource *plansource,
ParamListInfo boundParams);
static double cached_plan_cost(CachedPlan *plan, bool include_planner);
static Query *QueryListGetPrimaryStmt(List *stmts);
-static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static bool AcquireExecutorLocks(List *stmt_list, bool acquire);
static void AcquirePlannerLocks(List *stmt_list, bool acquire);
static void ScanQueryForLocks(Query *parsetree, bool acquire);
static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -783,16 +787,23 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
}
/*
- * CheckCachedPlan: see if the CachedPlanSource's generic plan is valid.
+ * CheckCachedPlan: see if the CachedPlanSource's generic plan is valid, and
+ * set *hasUnlockedParts if any PlannedStmt contains "initially" prunable
+ * subnodes, whose partitions are not locked until initial pruning is done.
*
* Caller must have already called RevalidateCachedQuery to verify that the
* querytree is up to date.
*
- * On a "true" return, we have acquired the locks needed to run the plan.
+ * On a "true" return, we have acquired the minimal set of locks needed to run
+ * the plan, that is, excluding partitions that are subject to being pruned
+ * before execution. The caller must perform initial pruning and lock the
+ * partitions that survive it before actually telling the world that the
+ * plan is "valid".
+ *
* (We must do this for the "true" result to be race-condition-free.)
*/
static bool
-CheckCachedPlan(CachedPlanSource *plansource)
+CheckCachedPlan(CachedPlanSource *plansource, bool *hasUnlockedParts)
{
CachedPlan *plan = plansource->gplan;
@@ -826,7 +837,7 @@ CheckCachedPlan(CachedPlanSource *plansource)
*/
Assert(plan->refcount > 0);
- AcquireExecutorLocks(plan->stmt_list, true);
+ *hasUnlockedParts = AcquireExecutorLocks(plan->stmt_list, true);
/*
* If plan was transient, check to see if TransactionXmin has
@@ -848,7 +859,7 @@ CheckCachedPlan(CachedPlanSource *plansource)
}
/* Oops, the race case happened. Release useless locks. */
- AcquireExecutorLocks(plan->stmt_list, false);
+ (void) AcquireExecutorLocks(plan->stmt_list, false);
}
/*
@@ -1120,7 +1131,125 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
}
/*
- * GetCachedPlan: get a cached plan from a CachedPlanSource.
+ * For each PlannedStmt in plan->stmt_list, do initial partition pruning if
+ * needed and lock partitions that survive.
+ *
+ * On return, *part_prune_results_list, of the same length as plan->stmt_list,
+ * will contain for each PlannedStmt either NIL, if it has no
+ * PartitionPruneInfos requiring initial pruning, or a List of
+ * PartitionPruneResult with one element per PartitionPruneInfo found in
+ * stmt->partPruneInfos.
+ *
+ * Likewise, *lockedRelids_per_stmt, of the same length as plan->stmt_list,
+ * will contain for each PlannedStmt either NULL, if no additional relations
+ * needed to be locked, or a bitmapset of the RT indexes of partitions locked.
+ */
+static bool
+CachedPlanLockPartitions(CachedPlan *plan,
+ ParamListInfo boundParams,
+ ResourceOwner owner,
+ List **part_prune_results_list,
+ List **lockedRelids_per_stmt)
+{
+ List *my_part_prune_results_list = NIL;
+ List *my_lockedRelids_per_stmt = NIL;
+ ListCell *lc1;
+ MemoryContext oldcontext,
+ tmpcontext;
+
+ *part_prune_results_list = NIL;
+ *lockedRelids_per_stmt = NIL;
+
+ /*
+ * Create a temporary context for memory allocations required while
+ * executing partition pruning steps.
+ */
+ tmpcontext = AllocSetContextCreate(CurrentMemoryContext,
+ "CachedPlanLockPartitions() working data",
+ ALLOCSET_DEFAULT_SIZES);
+ oldcontext = MemoryContextSwitchTo(tmpcontext);
+ foreach(lc1, plan->stmt_list)
+ {
+ PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
+ Bitmapset *lockPartRelids = NULL;
+ int rti;
+ List *part_prune_results = NIL;
+ Bitmapset *lockedRelids = NULL;
+
+ if (plannedstmt->commandType == CMD_UTILITY)
+ {
+ /*
+ * Ignore utility statements, because AcquireExecutorLocks on the
+ * parent CachedPlan would have dealt with these. Though, do let
+ * the caller know that no pruning is applicable to this statement.
+ */
+ my_part_prune_results_list = lappend(my_part_prune_results_list,
+ NIL);
+ my_lockedRelids_per_stmt = lappend(my_lockedRelids_per_stmt, NULL);
+ continue;
+ }
+
+ /* Figure out the partitions that would need to be locked. */
+ if (plannedstmt->containsInitialPruning)
+ {
+ ListCell *lc2;
+
+ foreach(lc2, plannedstmt->partPruneInfos)
+ {
+ PartitionPruneInfo *pruneinfo = lfirst_node(PartitionPruneInfo, lc2);
+ PartitionPruneResult *pruneresult = makeNode(PartitionPruneResult);
+
+ pruneresult->root_parent_relids =
+ bms_copy(pruneinfo->root_parent_relids);
+ pruneresult->valid_subplan_offs =
+ ExecPartitionDoInitialPruning(plannedstmt, boundParams,
+ pruneinfo,
+ &lockPartRelids);
+ part_prune_results = lappend(part_prune_results, pruneresult);
+ }
+ }
+
+ rti = -1;
+ while ((rti = bms_next_member(lockPartRelids, rti)) > 0)
+ {
+ RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
+
+ Assert(rte->rtekind == RTE_RELATION);
+
+ /*
+ * Acquire the appropriate type of lock on each relation OID. Note
+ * that we don't actually try to open the rel, and hence will not
+ * fail if it's been dropped entirely --- we'll just transiently
+ * acquire a non-conflicting lock.
+ */
+ LockRelationOid(rte->relid, rte->rellockmode);
+ lockedRelids = bms_add_member(lockedRelids, rti);
+ }
+
+ my_part_prune_results_list = lappend(my_part_prune_results_list,
+ part_prune_results);
+ my_lockedRelids_per_stmt = lappend(my_lockedRelids_per_stmt,
+ lockedRelids);
+ }
+
+ /*
+ * If the plan is still valid, copy the prune results and lockedRelids
+ * bitmapsets into the caller's context.
+ */
+ MemoryContextSwitchTo(oldcontext);
+ if (plan->is_valid)
+ {
+ *part_prune_results_list = copyObject(my_part_prune_results_list);
+ *lockedRelids_per_stmt = copyObject(my_lockedRelids_per_stmt);
+ }
+
+ /* Clean up the temporary context. */
+ MemoryContextDelete(tmpcontext);
+ return plan->is_valid;
+}
+
+/*
+ * GetCachedPlan: get a cached plan from a CachedPlanSource
*
* This function hides the logic that decides whether to use a generic
* plan or a custom plan for the given parameters: the caller does not know
@@ -1139,7 +1268,97 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
*/
CachedPlan *
GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
- ResourceOwner owner, QueryEnvironment *queryEnv)
+ ResourceOwner owner, QueryEnvironment *queryEnv,
+ CachedPlanExtra **extra)
+{
+ CachedPlan *plan;
+
+ Assert(extra != NULL);
+ *extra = NULL;
+ for (;;)
+ {
+ bool hasUnlockedParts = false;
+
+ /* Actually get the plan. */
+ plan = GetCachedPlanInternal(plansource, boundParams, owner, queryEnv,
+ &hasUnlockedParts);
+ Assert(plan->is_valid);
+
+ /* Nothing to do if all relations already locked. */
+ if (!hasUnlockedParts)
+ return plan;
+ else
+ {
+ /*
+ * Do initial pruning to filter out partitions that need not be
+ * locked for execution.
+ */
+ ListCell *lc1,
+ *lc2;
+ List *part_prune_results_list;
+ List *lockedRelids_per_stmt;
+
+ /* Only a generic plan can ever have unlocked partitions in it. */
+ Assert(plan == plansource->gplan);
+
+ /*
+ * This does two things:
+ *
+ * 1) performs the pruning, returning in part_prune_results_list
+ * the PartitionPruneResult Lists for all statements;
+ *
+ * 2) locks the partitions that survive in each statement, returning
+ * in lockedRelids_per_stmt the RT indexes of those locked.
+ *
+ * True is returned if the plan is still valid after locking all
+ * partitions; false otherwise, in which case we must get a new
+ * plan.
+ */
+ if (CachedPlanLockPartitions(plan, boundParams, owner,
+ &part_prune_results_list,
+ &lockedRelids_per_stmt))
+ {
+ Assert(plan->is_valid);
+ *extra = (CachedPlanExtra *) palloc(sizeof(CachedPlanExtra));
+ (*extra)->part_prune_results_list = part_prune_results_list;
+ return plan;
+ }
+
+ /*
+ * Release the locks and start over. This is the same as what
+ * CheckCachedPlan does when doing AcquireExecutorLocks() causes
+ * the plan to be invalidated.
+ */
+ forboth(lc1, plan->stmt_list, lc2, lockedRelids_per_stmt)
+ {
+ PlannedStmt *plannedstmt = lfirst(lc1);
+ Bitmapset *lockedRelids = lfirst(lc2);
+ int rti;
+
+ if (plannedstmt->commandType == CMD_UTILITY)
+ continue;
+ rti = -1;
+ while ((rti = bms_next_member(lockedRelids, rti)) > 0)
+ {
+ RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
+
+ Assert(rte->rtekind == RTE_RELATION);
+
+ UnlockRelationOid(rte->relid, rte->rellockmode);
+ }
+ }
+ }
+ }
+
+ Assert(false);
+ return NULL;
+}
+
+/* Internal workhorse of GetCachedPlan() */
+static CachedPlan *
+GetCachedPlanInternal(CachedPlanSource *plansource, ParamListInfo boundParams,
+ ResourceOwner owner, QueryEnvironment *queryEnv,
+ bool *hasUnlockedParts)
{
CachedPlan *plan = NULL;
List *qlist;
@@ -1160,7 +1379,7 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
if (!customplan)
{
- if (CheckCachedPlan(plansource))
+ if (CheckCachedPlan(plansource, hasUnlockedParts))
{
/* We want a generic plan, and we already have a valid one */
plan = plansource->gplan;
@@ -1738,11 +1957,16 @@ QueryListGetPrimaryStmt(List *stmts)
/*
* AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
* or release them if acquire is false.
+ *
+ * If some PlannedStmt(s) contain "initially prunable" partitions, they are not
+ * locked here. Instead, the caller is informed of their existence so that it
+ * can lock them after doing the initial pruning.
*/
-static void
+static bool
AcquireExecutorLocks(List *stmt_list, bool acquire)
{
ListCell *lc1;
+ bool hasUnlockedParts = false;
foreach(lc1, stmt_list)
{
@@ -1763,10 +1987,17 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
Assert(plannedstmt->minLockRelids == NULL);
if (query)
ScanQueryForLocks(query, acquire);
continue;
}
+ /*
+ * If partitions can be pruned before execution, defer their locking to
+ * the caller.
+ */
+ if (plannedstmt->containsInitialPruning)
+ hasUnlockedParts = true;
+
allLockRelids = plannedstmt->minLockRelids;
rti = -1;
while ((rti = bms_next_member(allLockRelids, rti)) > 0)
@@ -1788,6 +2019,8 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
UnlockRelationOid(rte->relid, rte->rellockmode);
}
}
+
+ return hasUnlockedParts;
}
/*
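The control flow that the plancache.c changes above add to GetCachedPlan() is:
get a plan, do initial pruning, lock the survivors, and if taking those locks
invalidated the plan, release them and start over. A minimal standalone sketch
of that loop follows. It is not PostgreSQL code and makes several assumptions:
ToyPlan, toy_lock_partitions() and toy_get_cached_plan() are hypothetical
names, locks are modelled as bits in a uint32_t, and an invalidation is
simulated with a counter rather than by a real invalidation message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for a cached generic plan; every name here is hypothetical. */
typedef struct ToyPlan
{
	bool		is_valid;
	uint32_t	surviving_parts;	/* partitions left after "initial pruning" */
	int			invalidations_pending;	/* simulated concurrent invalidations */
} ToyPlan;

/*
 * Lock the surviving partitions.  Taking a lock may process a pending
 * invalidation, which marks the plan invalid.  Returns the plan's validity.
 */
static bool
toy_lock_partitions(ToyPlan *plan, uint32_t *locks_held)
{
	*locks_held |= plan->surviving_parts;
	if (plan->invalidations_pending > 0)
	{
		plan->invalidations_pending--;
		plan->is_valid = false;
	}
	return plan->is_valid;
}

/* Retry until the plan is still valid after locking; returns the attempt count. */
static int
toy_get_cached_plan(ToyPlan *plan, uint32_t *locks_held)
{
	int			attempts = 0;

	for (;;)
	{
		attempts++;

		/* Stands in for fetching or rebuilding a valid plan. */
		plan->is_valid = true;

		if (toy_lock_partitions(plan, locks_held))
			return attempts;

		/* Invalidated while locking: release what we took and start over. */
		*locks_held &= ~plan->surviving_parts;
	}
}
```

In this model the release step can only work because the caller still knows
which partitions were locked on the failed attempt, which is the role
lockedRelids_per_stmt plays in the patch.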
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index 7b1ae6fdcf..94a9db84e3 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -303,6 +303,22 @@ PortalDefineQuery(Portal portal,
portal->status = PORTAL_DEFINED;
}
+/*
+ * Copies the given CachedPlanExtra struct into the portal.
+ */
+void
+PortalSaveCachedPlanExtra(Portal portal, CachedPlanExtra *extra)
+{
+ MemoryContext oldcxt = MemoryContextSwitchTo(portal->portalContext);
+
+ Assert(portal->cplan_extra == NULL && extra != NULL);
+ portal->cplan_extra = (CachedPlanExtra *)
+ palloc(sizeof(CachedPlanExtra));
+ portal->cplan_extra->part_prune_results_list =
+ copyObject(extra->part_prune_results_list);
+ MemoryContextSwitchTo(oldcxt);
+}
+
/*
* PortalReleaseCachedPlan
* Release a portal's reference to its cached plan, if any.
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 9ebde089ae..269cc4d562 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -87,7 +87,9 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
ExplainState *es, const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv);
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt,
+ List *part_prune_results,
+ IntoClause *into,
ExplainState *es, const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv,
const instr_time *planduration,
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index aeeaeb7884..4b98d0d2ef 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -129,5 +129,10 @@ extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
Bitmapset *root_parent_relids,
Bitmapset **initially_valid_subplans);
extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
- bool initial_prune);
+ bool initial_prune,
+ Bitmapset **scan_leafpart_rtis);
+extern Bitmapset *ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt,
+ ParamListInfo params,
+ PartitionPruneInfo *pruneinfo,
+ Bitmapset **scan_leafpart_rtis);
#endif /* EXECPARTITION_H */
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index e79e2c001f..5a7d075750 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -35,6 +35,8 @@ typedef struct QueryDesc
/* These fields are provided by CreateQueryDesc */
CmdType operation; /* CMD_SELECT, CMD_UPDATE, etc. */
PlannedStmt *plannedstmt; /* planner's output (could be utility, too) */
+ List *part_prune_results; /* PartitionPruneResults returned by
+ * CachedPlanLockPartitions() */
const char *sourceText; /* source text of the query */
Snapshot snapshot; /* snapshot to use for query */
Snapshot crosscheck_snapshot; /* crosscheck for RI update/delete */
@@ -57,6 +59,7 @@ typedef struct QueryDesc
/* in pquery.c */
extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
+ List *part_prune_results,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 9a64a830a2..f1374057e5 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -617,6 +617,7 @@ typedef struct EState
List *es_rteperminfos; /* List of RTEPermissionInfo */
PlannedStmt *es_plannedstmt; /* link to top of plan tree */
List *es_part_prune_infos; /* PlannedStmt.partPruneInfos */
+ List *es_part_prune_results; /* QueryDesc.part_prune_results */
const char *es_sourceText; /* Source text from QueryDesc */
JunkFilter *es_junkFilter; /* top-level junk filter, if any */
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 4337e7aa34..10f12e780e 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -134,8 +134,8 @@ typedef struct PlannerGlobal
bool containsInitialPruning;
/*
- * Indexes of all range table entries; for AcquireExecutorLocks()'s
- * perusal.
+ * Indexes of all range table entries except those of leaf partitions
+ * scanned by prunable subplans; for AcquireExecutorLocks()'s perusal.
*/
Bitmapset *minLockRelids;
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index eb0a007946..ab8bc74e4a 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -82,7 +82,9 @@ typedef struct PlannedStmt
List *permInfos; /* list of RTEPermissionInfo nodes for rtable
* entries needing one */
- Bitmapset *minLockRelids; /* Indexes of all range table entries; for
+ Bitmapset *minLockRelids; /* Indexes of all range table entries except
+ * those of leaf partitions scanned by
+ * prunable subplans; for
* AcquireExecutorLocks()'s perusal */
/* rtable indexes of target relations for INSERT/UPDATE/DELETE/MERGE */
@@ -1575,6 +1577,33 @@ typedef struct PartitionPruneStepCombine
List *source_stepids;
} PartitionPruneStepCombine;
+/*----------------
+ * PartitionPruneResult
+ *
+ * The result of performing ExecPartitionDoInitialPruning() on a given
+ * PartitionPruneInfo.
+ *
+ * root_parent_relids is the same as PartitionPruneInfo.root_parent_relids. It's
+ * there for cross-checking in ExecInitPartitionPruning() that the
+ * PartitionPruneResult and the PartitionPruneInfo at a given index in
+ * EState.es_part_prune_results and EState.es_part_prune_infos, respectively,
+ * belong to the same parent plan node.
+ *
+ * valid_subplan_offs contains the indexes of subplans remaining after
+ * performing initial pruning by calling ExecFindMatchingSubPlans() on the
+ * PartitionPruneInfo.
+ *
+ * This is used to store the result of initial partition pruning that is
+ * performed before the execution has started, such as in
+ * CachedPlanLockPartitions().
+ */
+typedef struct PartitionPruneResult
+{
+ NodeTag type;
+
+ Bitmapset *root_parent_relids;
+ Bitmapset *valid_subplan_offs;
+} PartitionPruneResult;
/*
* Plan invalidation info
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 0499635f59..4ac66d2761 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -160,6 +160,14 @@ typedef struct CachedPlan
MemoryContext context; /* context containing this CachedPlan */
} CachedPlan;
+/*
+ * Additional information to pass to the executor when executing a CachedPlan.
+ */
+typedef struct CachedPlanExtra
+{
+ List *part_prune_results_list;
+} CachedPlanExtra;
+
/*
* CachedExpression is a low-overhead mechanism for caching the planned form
* of standalone scalar expressions. While such expressions are not usually
@@ -220,7 +228,8 @@ extern List *CachedPlanGetTargetList(CachedPlanSource *plansource,
extern CachedPlan *GetCachedPlan(CachedPlanSource *plansource,
ParamListInfo boundParams,
ResourceOwner owner,
- QueryEnvironment *queryEnv);
+ QueryEnvironment *queryEnv,
+ CachedPlanExtra **extra);
extern void ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner);
extern bool CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index aeddbdafe5..49bb00cda5 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -138,6 +138,8 @@ typedef struct PortalData
QueryCompletion qc; /* command completion data for executed query */
List *stmts; /* list of PlannedStmts */
CachedPlan *cplan; /* CachedPlan, if stmts are from one */
+ CachedPlanExtra *cplan_extra; /* CachedPlanExtra for cplan in Portal's
+ * memory */
ParamListInfo portalParams; /* params to pass to query */
QueryEnvironment *queryEnv; /* environment for query */
@@ -242,6 +244,7 @@ extern void PortalDefineQuery(Portal portal,
CommandTag commandTag,
List *stmts,
CachedPlan *cplan);
+extern void PortalSaveCachedPlanExtra(Portal portal, CachedPlanExtra *extra);
extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
extern void PortalCreateHoldStore(Portal portal);
extern void PortalHashTableDeleteAll(void);
--
2.35.3
* Re: generic plans and "initial" pruning
@ 2022-12-16 02:33 Amit Langote <[email protected]>
parent: Amit Langote <[email protected]>
0 siblings, 1 reply; 71+ messages in thread
From: Amit Langote @ 2022-12-16 02:33 UTC (permalink / raw)
To: Alvaro Herrera <[email protected]>; +Cc: Robert Haas <[email protected]>; Jacob Champion <[email protected]>; David Rowley <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers
On Wed, Dec 14, 2022 at 5:35 PM Amit Langote <[email protected]> wrote:
> I have moved the original functionality of GetCachedPlan() to
> GetCachedPlanInternal(), turning the former into a sort of controller
> as described shortly. The latter's CheckCachedPlan() part now only
> locks the "minimal" set of, non-prunable, relations, making a note of
> whether the plan contains any prunable subnodes and thus prunable
> relations whose locking is deferred to the caller, GetCachedPlan().
> GetCachedPlan(), as a sort of controller as mentioned before, does the
> pruning if needed on the minimally valid plan returned by
> GetCachedPlanInternal(), locks the partitions that survive, and redoes
> the whole thing if the locking of partitions invalidates the plan.
After sleeping on it, I realized this doesn't have to be that
complicated. Rather than turn GetCachedPlan() into a wrapper for
handling deferred partition locking as outlined above, I could have
changed it more simply as follows to get the same thing done:
if (!customplan)
{
- if (CheckCachedPlan(plansource))
+ bool hasUnlockedParts = false;
+
+ if (CheckCachedPlan(plansource, &hasUnlockedParts) &&
+ hasUnlockedParts &&
+ CachedPlanLockPartitions(plansource, boundParams, owner, extra))
{
/* We want a generic plan, and we already have a valid one */
plan = plansource->gplan;
The attached updated patch does it like that.
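For readers following along, the decision that hunk is meant to encode can be sketched in standalone C. This is only a toy model, not code from the patch: ToyPlan and the toy_* functions are hypothetical stand-ins for the CachedPlan, CheckCachedPlan() and CachedPlanLockPartitions(), and the sketch assumes that a plan with no deferred partition locks is reusable as soon as the basic validity check passes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a cached generic plan; not a PostgreSQL struct. */
typedef struct ToyPlan
{
	bool		is_valid;				/* cleared when an invalidation arrives */
	bool		has_initial_pruning;	/* some partition locks are deferred */
	bool		invalidated_by_part_lock;	/* simulate a concurrent DDL that is
											 * noticed while locking partitions */
} ToyPlan;

/* Models CheckCachedPlan(): lock non-prunable relations, report deferral. */
static bool
toy_check_plan(const ToyPlan *plan, bool *has_unlocked_parts)
{
	assert(plan != NULL && has_unlocked_parts != NULL);
	*has_unlocked_parts = plan->has_initial_pruning;
	return plan->is_valid;
}

/* Models CachedPlanLockPartitions(): prune, lock survivors, recheck. */
static bool
toy_lock_partitions(ToyPlan *plan)
{
	if (plan->invalidated_by_part_lock)
		plan->is_valid = false;
	return plan->is_valid;
}

/* Returns true if the existing generic plan can be used as is. */
static bool
toy_can_reuse_generic_plan(ToyPlan *plan)
{
	bool		has_unlocked_parts = false;

	if (!toy_check_plan(plan, &has_unlocked_parts))
		return false;			/* caller must rebuild the generic plan */
	if (!has_unlocked_parts)
		return true;			/* assumption: nothing was deferred */
	return toy_lock_partitions(plan);
}
```

In this model, a plan that is invalidated only while its surviving partitions are being locked is reported as not reusable, so the caller would fall through to rebuilding the generic plan.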
--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com
Attachments:
[application/octet-stream] v30-0002-In-GetCachedPlan-only-lock-unpruned-partitions.patch (66.2K, 2-v30-0002-In-GetCachedPlan-only-lock-unpruned-partitions.patch)
From 4176843628ef29c1ff173ad0dfbdd13f7d07c225 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Wed, 22 Dec 2021 16:55:17 +0900
Subject: [PATCH v30 2/2] In GetCachedPlan(), only lock unpruned partitions
This does two things mainly:
* The planner now removes the RT indexes of "initially prunable"
partitions from PlannedStmt.minLockRelids such that the set only
contains the relations not subject to initial partition pruning. So,
AcquireExecutorLocks only locks a subset of the relations contained
in a plan, deferring the locking of prunable relations to the caller.
* GetCachedPlan(), if there are prunable relations in the plan,
performs the initial partition pruning using available EXTERN params
and locks the partitions remaining after that, so that the CachedPlan
that's returned is valid in a race-free manner, including for any
partitions that will be scanned during execution.
To make the pruning possible before entering ExecutorStart(), this
also adds ExecPartitionDoInitialPruning(), which can be called by
GetCachedPlan() for a given PlannedStmt.
The result of performing initial partition pruning this way is made
available to the actual execution via PartitionPruneResult, of which
there is one for every PartitionPruneInfo contained in the PlannedStmt.
The list of PartitionPruneResult nodes for a given PlannedStmt is returned
to the callers of GetCachedPlan() via its new output parameter of type
CachedPlanExtra, whose members currently only include said List.
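
As a rough illustration of the two lock sets described above, here is a standalone C sketch. It is not taken from the patch: RelSet is a hypothetical 32-bit stand-in for a Bitmapset of range table indexes, and both toy_* functions are made-up names that only model the set arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/* Toy relation set: bit i set means range table index i is a member. */
typedef uint32_t RelSet;

/*
 * Models the planner side: the minimal lock set is every relation in the
 * plan except the leaf partitions subject to initial pruning.
 */
static RelSet
toy_min_lock_relids(RelSet all_relids, RelSet prunable_leaf_parts)
{
	return all_relids & ~prunable_leaf_parts;
}

/*
 * Models the plancache side: once initial pruning has run, the relations
 * locked in total are the minimal set plus the surviving leaf partitions.
 */
static RelSet
toy_locked_after_pruning(RelSet min_lock_relids,
						 RelSet prunable_leaf_parts,
						 RelSet surviving_leaf_parts)
{
	/* Survivors can only come from the prunable set. */
	assert((surviving_leaf_parts & ~prunable_leaf_parts) == 0);
	return min_lock_relids | surviving_leaf_parts;
}
```

For example, with range table indexes 1 to 5 in the plan and 3 to 5 being prunable leaf partitions, only indexes 1 and 2 are locked up front; if pruning leaves just index 4, the final locked set is 1, 2 and 4.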
---
src/backend/commands/copyto.c | 2 +-
src/backend/commands/createas.c | 2 +-
src/backend/commands/explain.c | 7 +-
src/backend/commands/extension.c | 2 +-
src/backend/commands/matview.c | 2 +-
src/backend/commands/prepare.c | 28 +++-
src/backend/executor/README | 31 +++-
src/backend/executor/execMain.c | 2 +
src/backend/executor/execParallel.c | 25 ++-
src/backend/executor/execPartition.c | 215 +++++++++++++++++++++----
src/backend/executor/execUtils.c | 1 +
src/backend/executor/functions.c | 2 +-
src/backend/executor/nodeAppend.c | 11 +-
src/backend/executor/nodeMergeAppend.c | 5 +-
src/backend/executor/spi.c | 31 +++-
src/backend/optimizer/plan/setrefs.c | 36 +++++
src/backend/tcop/postgres.c | 9 +-
src/backend/tcop/pquery.c | 28 +++-
src/backend/utils/cache/plancache.c | 204 +++++++++++++++++++++--
src/backend/utils/mmgr/portalmem.c | 16 ++
src/include/commands/explain.h | 4 +-
src/include/executor/execPartition.h | 7 +-
src/include/executor/execdesc.h | 3 +
src/include/nodes/execnodes.h | 1 +
src/include/nodes/pathnodes.h | 4 +-
src/include/nodes/plannodes.h | 31 +++-
src/include/utils/plancache.h | 11 +-
src/include/utils/portal.h | 3 +
28 files changed, 640 insertions(+), 83 deletions(-)
diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index f26cc0d162..401a2280a3 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -558,7 +558,7 @@ BeginCopyTo(ParseState *pstate,
((DR_copy *) dest)->cstate = cstate;
/* Create a QueryDesc requesting no output */
- cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ cstate->queryDesc = CreateQueryDesc(plan, NIL, pstate->p_sourcetext,
GetActiveSnapshot(),
InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 152c29b551..942449544c 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -325,7 +325,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ queryDesc = CreateQueryDesc(plan, NIL, pstate->p_sourcetext,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index f86983c660..2f2b558608 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -407,7 +407,7 @@ ExplainOneQuery(Query *query, int cursorOptions,
}
/* run it (if needed) and produce output */
- ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+ ExplainOnePlan(plan, NIL, into, es, queryString, params, queryEnv,
&planduration, (es->buffers ? &bufusage : NULL));
}
}
@@ -515,7 +515,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
* to call it.
*/
void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, List *part_prune_results,
+ IntoClause *into, ExplainState *es,
const char *queryString, ParamListInfo params,
QueryEnvironment *queryEnv, const instr_time *planduration,
const BufferUsage *bufusage)
@@ -563,7 +564,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
dest = None_Receiver;
/* Create a QueryDesc for the query */
- queryDesc = CreateQueryDesc(plannedstmt, queryString,
+ queryDesc = CreateQueryDesc(plannedstmt, part_prune_results, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, instrument_option);
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index cf1b1ca571..904cbcba4a 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -779,7 +779,7 @@ execute_sql_string(const char *sql)
{
QueryDesc *qdesc;
- qdesc = CreateQueryDesc(stmt,
+ qdesc = CreateQueryDesc(stmt, NIL,
sql,
GetActiveSnapshot(), NULL,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 8ba2436a71..049a90f49d 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -409,7 +409,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, queryString,
+ queryDesc = CreateQueryDesc(plan, NIL, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 9e29584d93..729384a9a6 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -154,6 +154,7 @@ ExecuteQuery(ParseState *pstate,
{
PreparedStatement *entry;
CachedPlan *cplan;
+ CachedPlanExtra *cplan_extra = NULL;
List *plan_list;
ParamListInfo paramLI = NULL;
EState *estate = NULL;
@@ -193,7 +194,11 @@ ExecuteQuery(ParseState *pstate,
entry->plansource->query_string);
/* Replan if needed, and increment plan refcount for portal */
- cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL);
+ cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL,
+ &cplan_extra);
+ Assert(cplan_extra == NULL ||
+ (list_length(cplan->stmt_list) ==
+ list_length(cplan_extra->part_prune_results_list)));
plan_list = cplan->stmt_list;
/*
@@ -207,6 +212,9 @@ ExecuteQuery(ParseState *pstate,
plan_list,
cplan);
+ if (cplan_extra)
+ PortalSaveCachedPlanExtra(portal, cplan_extra);
+
/*
* For CREATE TABLE ... AS EXECUTE, we must verify that the prepared
* statement is one that produces tuples. Currently we insist that it be
@@ -575,6 +583,7 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
PreparedStatement *entry;
const char *query_string;
CachedPlan *cplan;
+ CachedPlanExtra *cplan_extra = NULL;
List *plan_list;
ListCell *p;
ParamListInfo paramLI = NULL;
@@ -619,7 +628,11 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
/* Replan if needed, and acquire a transient refcount */
cplan = GetCachedPlan(entry->plansource, paramLI,
- CurrentResourceOwner, queryEnv);
+ CurrentResourceOwner, queryEnv,
+ &cplan_extra);
+ Assert(cplan_extra == NULL ||
+ (list_length(cplan->stmt_list) ==
+ list_length(cplan_extra->part_prune_results_list)));
INSTR_TIME_SET_CURRENT(planduration);
INSTR_TIME_SUBTRACT(planduration, planstart);
@@ -637,10 +650,17 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
foreach(p, plan_list)
{
PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
+ List *part_prune_results = NIL;
+
+ if (cplan_extra)
+ part_prune_results = list_nth_node(List,
+ cplan_extra->part_prune_results_list,
+ foreach_current_index(p));
if (pstmt->commandType != CMD_UTILITY)
- ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv,
- &planduration, (es->buffers ? &bufusage : NULL));
+ ExplainOnePlan(pstmt, part_prune_results, into, es, query_string,
+ paramLI, queryEnv, &planduration,
+ (es->buffers ? &bufusage : NULL));
else
ExplainOneUtility(pstmt->utilityStmt, into, es, query_string,
paramLI, queryEnv);
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 17775a49e2..2222b3ed6f 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -63,7 +63,36 @@ if the executor determines that an entire subplan is not required due to
execution time partition pruning determining that no matching records will be
found there. This currently only occurs for Append and MergeAppend nodes. In
this case the non-required subplans are ignored and the executor state's
-subnode array will become out of sequence to the plan's subplan list.
+subnode array will become out of sequence to the plan's subplan list. Note
+that this is referred to as "initial" pruning, because it needs to occur only
+once during the execution startup, and uses a set of pruning steps called
+initial pruning steps (see PartitionedRelPruneInfo.initial_pruning_steps).
+
+Actually, "initial" pruning may occur even before the execution startup in
+some cases. One example is when a cached generic plan is validated for
+execution, which works by locking all the relations that will be scanned by
+that plan during execution. If the generic plan contains plan nodes that have
+prunable child subnodes, then this validation locking is performed after
+pruning child subnodes that need not be scanned during execution, that is,
+using initial pruning steps. When such a generic plan is forwarded for
+execution, it must be accompanied by the set of PartitionPruneResult nodes that
+contain the result of that pruning, which basically consists of a bitmapset of
+child subnode indexes that survived the pruning and thus whose relations would
+have been locked for execution. This is important, because, unlike the
+plan-time pruning and actual executor-startup pruning, this does not actually
+remove the pruned subnodes from the plan tree, but only marks them as being
+pruned. So, the executor code (core or third party), especially one that runs
+before ExecutorStart() and thus looks at bare Plan trees (not PlanState trees)
+must beware of plan nodes that may actually have been pruned and thus subject
+to being invalidated by concurrent schema changes. For plan nodes that can
+have prunable child subnodes and thus contain a PartitionPruneInfo, such code
+must always check if the corresponding PartitionPruneResult exists
+in EState.es_part_prune_results at the given part_prune_index and use that to
+decide which subplans are valid for execution instead of redoing the pruning.
+Note that that is not just a performance optimization but also necessary to
+avoid possibly ending up considering a different set of child subnodes as valid
+than the set CachedPlanLockPartitions() would have locked the relations of, if
+the pruning steps produce a different result when executed multiple times.
Each Plan node may have expression trees associated with it, to represent
its target list, qualification conditions, etc. These trees are also
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 2c2b3a8874..229f61f72e 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -798,6 +798,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
{
CmdType operation = queryDesc->operation;
PlannedStmt *plannedstmt = queryDesc->plannedstmt;
+ List *part_prune_results = queryDesc->part_prune_results;
Plan *plan = plannedstmt->planTree;
List *rangeTable = plannedstmt->rtable;
EState *estate = queryDesc->estate;
@@ -819,6 +820,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
estate->es_plannedstmt = plannedstmt;
estate->es_part_prune_infos = plannedstmt->partPruneInfos;
+ estate->es_part_prune_results = part_prune_results;
/*
* Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index 65c4b63bbd..9745eba0af 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -66,6 +66,7 @@
#define PARALLEL_KEY_QUERY_TEXT UINT64CONST(0xE000000000000008)
#define PARALLEL_KEY_JIT_INSTRUMENTATION UINT64CONST(0xE000000000000009)
#define PARALLEL_KEY_WAL_USAGE UINT64CONST(0xE00000000000000A)
+#define PARALLEL_KEY_PARTITION_PRUNE_RESULTS UINT64CONST(0xE00000000000000B)
#define PARALLEL_TUPLE_QUEUE_SIZE 65536
@@ -599,12 +600,15 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
FixedParallelExecutorState *fpes;
char *pstmt_data;
char *pstmt_space;
+ char *part_prune_results_data;
+ char *part_prune_results_space;
char *paramlistinfo_space;
BufferUsage *bufusage_space;
WalUsage *walusage_space;
SharedExecutorInstrumentation *instrumentation = NULL;
SharedJitInstrumentation *jit_instrumentation = NULL;
int pstmt_len;
+ int part_prune_results_len;
int paramlistinfo_len;
int instrumentation_len = 0;
int jit_instrumentation_len = 0;
@@ -633,6 +637,7 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
/* Fix up and serialize plan to be sent to workers. */
pstmt_data = ExecSerializePlan(planstate->plan, estate);
+ part_prune_results_data = nodeToString(estate->es_part_prune_results);
/* Create a parallel context. */
pcxt = CreateParallelContext("postgres", "ParallelQueryMain", nworkers);
@@ -659,6 +664,11 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
shm_toc_estimate_chunk(&pcxt->estimator, pstmt_len);
shm_toc_estimate_keys(&pcxt->estimator, 1);
+ /* Estimate space for serialized List of PartitionPruneResult. */
+ part_prune_results_len = strlen(part_prune_results_data) + 1;
+ shm_toc_estimate_chunk(&pcxt->estimator, part_prune_results_len);
+ shm_toc_estimate_keys(&pcxt->estimator, 1);
+
/* Estimate space for serialized ParamListInfo. */
paramlistinfo_len = EstimateParamListSpace(estate->es_param_list_info);
shm_toc_estimate_chunk(&pcxt->estimator, paramlistinfo_len);
@@ -753,6 +763,12 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
memcpy(pstmt_space, pstmt_data, pstmt_len);
shm_toc_insert(pcxt->toc, PARALLEL_KEY_PLANNEDSTMT, pstmt_space);
+ /* Store serialized List of PartitionPruneResult */
+ part_prune_results_space = shm_toc_allocate(pcxt->toc, part_prune_results_len);
+ memcpy(part_prune_results_space, part_prune_results_data, part_prune_results_len);
+ shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARTITION_PRUNE_RESULTS,
+ part_prune_results_space);
+
/* Store serialized ParamListInfo. */
paramlistinfo_space = shm_toc_allocate(pcxt->toc, paramlistinfo_len);
shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARAMLISTINFO, paramlistinfo_space);
@@ -1234,8 +1250,10 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
int instrument_options)
{
char *pstmtspace;
+ char *part_prune_results_space;
char *paramspace;
PlannedStmt *pstmt;
+ List *part_prune_results;
ParamListInfo paramLI;
char *queryString;
@@ -1246,12 +1264,17 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
pstmtspace = shm_toc_lookup(toc, PARALLEL_KEY_PLANNEDSTMT, false);
pstmt = (PlannedStmt *) stringToNode(pstmtspace);
+ /* Reconstruct leader-supplied PartitionPruneResult. */
+ part_prune_results_space =
+ shm_toc_lookup(toc, PARALLEL_KEY_PARTITION_PRUNE_RESULTS, false);
+ part_prune_results = (List *) stringToNode(part_prune_results_space);
+
/* Reconstruct ParamListInfo. */
paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
paramLI = RestoreParamList(&paramspace);
/* Create a QueryDesc for the query. */
- return CreateQueryDesc(pstmt,
+ return CreateQueryDesc(pstmt, part_prune_results,
queryString,
GetActiveSnapshot(), InvalidSnapshot,
receiver, paramLI, NULL, instrument_options);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 5b62157712..dcd2bb0f90 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -25,6 +25,7 @@
#include "mb/pg_wchar.h"
#include "miscadmin.h"
#include "nodes/makefuncs.h"
+#include "parser/parsetree.h"
#include "partitioning/partbounds.h"
#include "partitioning/partdesc.h"
#include "partitioning/partprune.h"
@@ -185,7 +186,11 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
static PartitionPruneState *CreatePartitionPruneState(PlanState *planstate,
- PartitionPruneInfo *pruneinfo);
+ PartitionPruneInfo *pruneinfo,
+ bool consider_initial_steps,
+ bool consider_exec_steps,
+ List *rtable, ExprContext *econtext,
+ PartitionDirectory partdir);
static void InitPartitionPruneContext(PartitionPruneContext *context,
List *pruning_steps,
PartitionDesc partdesc,
@@ -198,7 +203,8 @@ static void PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
- Bitmapset **validsubplans);
+ Bitmapset **validsubplans,
+ Bitmapset **scan_leafpart_rtis);
/*
@@ -1742,7 +1748,8 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
* considered to be a stable expression, it can change value from one plan
* node scan to the next during query execution. Stable comparison
* expressions that don't involve such Params allow partition pruning to be
- * done once during executor startup. Expressions that do involve such Params
+ * done once during executor startup or even before that, such as when called
+ * from CachedPlanLockPartitions(). Expressions that do involve such Params
* require us to prune separately for each scan of the parent plan node.
*
* Note that pruning away unneeded subplans during executor startup has the
@@ -1760,6 +1767,12 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
* account for initial pruning possibly having eliminated some of the
* subplans.
*
+ * ExecPartitionDoInitialPruning:
+ * Do initial pruning with the information contained in a given
+ * PartitionPruneInfo to determine the set of the parent plan node's
+ * child subnodes that are valid for execution and also the set of the RT
+ * indexes of leaf partitions scanned by those subnodes.
+ *
* ExecFindMatchingSubPlans:
* Returns indexes of matching subplans after evaluating the expressions
* that are safe to evaluate at a given point. This function is first
@@ -1780,8 +1793,10 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
*
* On return, *initially_valid_subplans is assigned the set of indexes of
* child subplans that must be initialized along with the parent plan node.
- * Initial pruning is performed here if needed and in that case only the
- * surviving subplans' indexes are added.
+ * That set is computed by either performing the "initial pruning" here or
+ * reusing the one present in EState.es_part_prune_results[part_prune_index]
+ * if it has been set, which is the case when CachedPlanLockPartitions() has
+ * already done the initial pruning.
*
* If subplans are indeed pruned, subplan_map arrays contained in the returned
* PartitionPruneState are re-sequenced to not count those, though only if the
@@ -1794,9 +1809,10 @@ ExecInitPartitionPruning(PlanState *planstate,
Bitmapset *root_parent_relids,
Bitmapset **initially_valid_subplans)
{
- PartitionPruneState *prunestate;
+ PartitionPruneState *prunestate = NULL;
EState *estate = planstate->state;
PartitionPruneInfo *pruneinfo;
+ PartitionPruneResult *pruneresult = NULL;
/* Obtain the pruneinfo we need, and make sure it's the right one */
pruneinfo = list_nth(estate->es_part_prune_infos, part_prune_index);
@@ -1812,20 +1828,62 @@ ExecInitPartitionPruning(PlanState *planstate,
/* We may need an expression context to evaluate partition exprs */
ExecAssignExprContext(estate, planstate);
- /* Create the working data structure for pruning */
- prunestate = CreatePartitionPruneState(planstate, pruneinfo);
+ /* Initial pruning already done if es_part_prune_results has been set. */
+ if (estate->es_part_prune_results)
+ {
+ pruneresult = list_nth_node(PartitionPruneResult,
+ estate->es_part_prune_results,
+ part_prune_index);
+ if (!bms_equal(pruneresult->root_parent_relids, pruneinfo->root_parent_relids))
+ ereport(ERROR,
+ errcode(ERRCODE_INTERNAL_ERROR),
+ errmsg_internal("mismatching PartitionPruneInfo and PartitionPruneResult at part_prune_index %d",
+ part_prune_index),
+ errdetail_internal("pruneresult relids %s, pruneinfo relids %s",
+ bmsToString(pruneresult->root_parent_relids),
+ bmsToString(pruneinfo->root_parent_relids)));
+ }
+
+ if (pruneresult == NULL || pruneinfo->needs_exec_pruning)
+ {
+ /* We may need an expression context to evaluate partition exprs */
+ ExecAssignExprContext(estate, planstate);
+
+ /* For data reading, executor always omits detached partitions */
+ if (estate->es_partition_directory == NULL)
+ estate->es_partition_directory =
+ CreatePartitionDirectory(estate->es_query_cxt, false);
+
+ /*
+ * Create the working data structure for pruning. No need to consider
+ * initial pruning steps if we have a PartitionPruneResult.
+ */
+ prunestate = CreatePartitionPruneState(planstate, pruneinfo,
+ pruneresult == NULL,
+ pruneinfo->needs_exec_pruning,
+ NIL, planstate->ps_ExprContext,
+ estate->es_partition_directory);
+ }
/*
* Perform an initial partition prune pass, if required.
*/
- if (prunestate->do_initial_prune)
- *initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true);
+ if (pruneresult)
+ {
+ *initially_valid_subplans = bms_copy(pruneresult->valid_subplan_offs);
+ }
+ else if (prunestate && prunestate->do_initial_prune)
+ {
+ *initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true,
+ NULL);
+ }
else
{
- /* No pruning, so we'll need to initialize all subplans */
+ /* No initial pruning, so we'll need to initialize all subplans */
Assert(n_total_subplans > 0);
*initially_valid_subplans = bms_add_range(NULL, 0,
n_total_subplans - 1);
+ return prunestate;
}
/*
@@ -1833,7 +1891,8 @@ ExecInitPartitionPruning(PlanState *planstate,
* that were removed above due to initial pruning. No need to do this if
* no steps were removed.
*/
- if (bms_num_members(*initially_valid_subplans) < n_total_subplans)
+ if (prunestate &&
+ bms_num_members(*initially_valid_subplans) < n_total_subplans)
{
/*
* We can safely skip this when !do_exec_prune, even though that
@@ -1849,11 +1908,58 @@ ExecInitPartitionPruning(PlanState *planstate,
return prunestate;
}
+/*
+ * ExecPartitionDoInitialPruning
+ * Perform initial pruning using given PartitionPruneInfo to determine
+ * the set of the parent plan node's child subnodes that are valid for
+ * execution
+ *
+ * On return, *scan_leafpart_rtis will contain the RT indexes of leaf
+ * partitions scanned by those valid subnodes.
+ *
+ * Note that this does not share state with the actual execution, so it must
+ * make do with the information present in the PlannedStmt. For example, there
+ * isn't a PlanState for the parent plan node yet, so we must create a
+ * standalone ExprContext to evaluate pruning expressions, equipped with the
+ * information about the EXTERN parameters that we do have. That's okay because
+ * the initial pruning steps do not contain anything that would require the
+ * execution to have started. Likewise, we create our own PartitionDirectory
+ * to look up the PartitionDescs to use.
+ */
+Bitmapset *
+ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
+ PartitionPruneInfo *pruneinfo,
+ Bitmapset **scan_leafpart_rtis)
+{
+ List *rtable = plannedstmt->rtable;
+ ExprContext *econtext;
+ PartitionDirectory pdir;
+ PartitionPruneState *prunestate;
+ Bitmapset *valid_subplan_offs;
+
+ /* Don't omit detached partitions, just like during execution proper. */
+ pdir = CreatePartitionDirectory(CurrentMemoryContext, false);
+ econtext = CreateStandaloneExprContext();
+ econtext->ecxt_param_list_info = params;
+ prunestate = CreatePartitionPruneState(NULL, pruneinfo, true, false,
+ rtable, econtext, pdir);
+ valid_subplan_offs = ExecFindMatchingSubPlans(prunestate, true,
+ scan_leafpart_rtis);
+
+ FreeExprContext(econtext, true);
+ DestroyPartitionDirectory(pdir);
+
+ return valid_subplan_offs;
+}
+
/*
* CreatePartitionPruneState
* Build the data structure required for calling ExecFindMatchingSubPlans
*
- * 'planstate' is the parent plan node's execution state.
+ * 'planstate', if not NULL, is the parent plan node's execution state. It
+ * can be NULL if being called before ExecutorStart(), in which case,
+ * 'rtable' (range table), 'econtext', and 'partdir' must be explicitly
+ * provided.
*
* 'pruneinfo' is a PartitionPruneInfo as generated by
* make_partition_pruneinfo. Here we build a PartitionPruneState containing a
@@ -1867,19 +1973,21 @@ ExecInitPartitionPruning(PlanState *planstate,
* PartitionedRelPruneInfo.
*/
static PartitionPruneState *
-CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
+CreatePartitionPruneState(PlanState *planstate,
+ PartitionPruneInfo *pruneinfo,
+ bool consider_initial_steps,
+ bool consider_exec_steps,
+ List *rtable, ExprContext *econtext,
+ PartitionDirectory partdir)
{
- EState *estate = planstate->state;
+ EState *estate = planstate ? planstate->state : NULL;
PartitionPruneState *prunestate;
int n_part_hierarchies;
ListCell *lc;
int i;
- ExprContext *econtext = planstate->ps_ExprContext;
- /* For data reading, executor always omits detached partitions */
- if (estate->es_partition_directory == NULL)
- estate->es_partition_directory =
- CreatePartitionDirectory(estate->es_query_cxt, false);
+ Assert((estate != NULL) ||
+ (partdir != NULL && econtext != NULL && rtable != NIL));
n_part_hierarchies = list_length(pruneinfo->prune_infos);
Assert(n_part_hierarchies > 0);
@@ -1934,15 +2042,39 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
PartitionKey partkey;
/*
- * We can rely on the copies of the partitioned table's partition
- * key and partition descriptor appearing in its relcache entry,
- * because that entry will be held open and locked for the
- * duration of this executor run.
+ * Must open the relation ourselves when called before execution
+ * has started, such as when called from
+ * CachedPlanLockPartitions(). In that case, sub-partitions must
+ * be locked, because AcquirePlannerLocks() would have locked only
+ * the root parent.
+ */
+ if (estate == NULL)
+ {
+ RangeTblEntry *rte = rt_fetch(pinfo->rtindex, rtable);
+ int lockmode = (j == 0) ? NoLock : rte->rellockmode;
+
+ partrel = table_open(rte->relid, lockmode);
+ }
+ else
+ partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
+
+ /*
+ * We can rely on the copy of the partitioned table's partition
+ * key in its relcache entry, because it can't change (or
+ * get destroyed) as long as the relation is locked. Partition
+ * descriptor is taken from the PartitionDirectory associated with
+ * the table that is held open long enough for the descriptor to
+ * remain valid while it's used to perform the pruning steps.
*/
- partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
partkey = RelationGetPartitionKey(partrel);
- partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
- partrel);
+ partdesc = PartitionDirectoryLookup(partdir, partrel);
+
+ /*
+ * Must close partrel, keeping the lock taken, if we're not using
+ * EState's entry.
+ */
+ if (estate == NULL)
+ table_close(partrel, NoLock);
/*
* Initialize the subplan_map and subpart_map.
@@ -2050,7 +2182,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
* Initialize pruning contexts as needed.
*/
pprune->initial_pruning_steps = pinfo->initial_pruning_steps;
- if (pinfo->initial_pruning_steps)
+ if (consider_initial_steps && pinfo->initial_pruning_steps)
{
InitPartitionPruneContext(&pprune->initial_context,
pinfo->initial_pruning_steps,
@@ -2060,7 +2192,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
prunestate->do_initial_prune = true;
}
pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
- if (pinfo->exec_pruning_steps)
+ if (consider_exec_steps && pinfo->exec_pruning_steps)
{
InitPartitionPruneContext(&pprune->exec_context,
pinfo->exec_pruning_steps,
@@ -2288,10 +2420,14 @@ PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
* Pass initial_prune if PARAM_EXEC Params cannot yet be evaluated. This
* differentiates the initial executor-time pruning step from later
* runtime pruning.
+ *
+ * RT indexes of leaf partitions scanned by the chosen subplans are added to
+ * *scan_leafpart_rtis if the pointer is non-NULL.
*/
Bitmapset *
ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
- bool initial_prune)
+ bool initial_prune,
+ Bitmapset **scan_leafpart_rtis)
{
Bitmapset *result = NULL;
MemoryContext oldcontext;
@@ -2326,7 +2462,7 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
*/
pprune = &prunedata->partrelprunedata[0];
find_matching_subplans_recurse(prunedata, pprune, initial_prune,
- &result);
+ &result, scan_leafpart_rtis);
/* Expression eval may have used space in ExprContext too */
if (pprune->exec_pruning_steps)
@@ -2340,6 +2476,8 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
/* Copy result out of the temp context before we reset it */
result = bms_copy(result);
+ if (scan_leafpart_rtis)
+ *scan_leafpart_rtis = bms_copy(*scan_leafpart_rtis);
MemoryContextReset(prunestate->prune_context);
@@ -2350,13 +2488,15 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
* find_matching_subplans_recurse
* Recursive worker function for ExecFindMatchingSubPlans
*
- * Adds valid (non-prunable) subplan IDs to *validsubplans
+ * Adds valid (non-prunable) subplan IDs to *validsubplans and RT indexes of
+ * the corresponding leaf partitions to *scan_leafpart_rtis (if asked for).
*/
static void
find_matching_subplans_recurse(PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
- Bitmapset **validsubplans)
+ Bitmapset **validsubplans,
+ Bitmapset **scan_leafpart_rtis)
{
Bitmapset *partset;
int i;
@@ -2383,8 +2523,14 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
while ((i = bms_next_member(partset, i)) >= 0)
{
if (pprune->subplan_map[i] >= 0)
+ {
*validsubplans = bms_add_member(*validsubplans,
pprune->subplan_map[i]);
+ Assert(pprune->rti_map[i] > 0);
+ if (scan_leafpart_rtis)
+ *scan_leafpart_rtis = bms_add_member(*scan_leafpart_rtis,
+ pprune->rti_map[i]);
+ }
else
{
int partidx = pprune->subpart_map[i];
@@ -2392,7 +2538,8 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
if (partidx >= 0)
find_matching_subplans_recurse(prunedata,
&prunedata->partrelprunedata[partidx],
- initial_prune, validsubplans);
+ initial_prune, validsubplans,
+ scan_leafpart_rtis);
else
{
/*
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 87f4d53ca7..7d36c972d3 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -139,6 +139,7 @@ CreateExecutorState(void)
estate->es_param_exec_vals = NULL;
estate->es_queryEnv = NULL;
+ estate->es_part_prune_results = NIL;
estate->es_query_cxt = qcontext;
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index dc13625171..bffb42ce71 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -842,7 +842,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
else
dest = None_Receiver;
- es->qd = CreateQueryDesc(es->stmt,
+ es->qd = CreateQueryDesc(es->stmt, NIL,
fcache->src,
GetActiveSnapshot(),
InvalidSnapshot,
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 99830198bd..3b917584de 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -156,7 +156,8 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
* subplan, we can fill as_valid_subplans immediately, preventing
* later calls to ExecFindMatchingSubPlans.
*/
- if (!prunestate->do_exec_prune && nplans > 0)
+ if (appendstate->as_prune_state == NULL ||
+ (!appendstate->as_prune_state->do_exec_prune && nplans > 0))
appendstate->as_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
}
else
@@ -578,7 +579,7 @@ choose_next_subplan_locally(AppendState *node)
}
else if (node->as_valid_subplans == NULL)
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
whichplan = -1;
}
@@ -643,7 +644,7 @@ choose_next_subplan_for_leader(AppendState *node)
if (node->as_valid_subplans == NULL)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
/*
* Mark each invalid plan as finished to allow the loop below to
@@ -718,7 +719,7 @@ choose_next_subplan_for_worker(AppendState *node)
else if (node->as_valid_subplans == NULL)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
mark_invalid_subplans_as_finished(node);
}
@@ -869,7 +870,7 @@ ExecAppendAsyncBegin(AppendState *node)
if (node->as_valid_subplans == NULL)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
classify_matching_subplans(node);
}
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index f370f9f287..ccfa083945 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -104,7 +104,8 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
* subplan, we can fill ms_valid_subplans immediately, preventing
* later calls to ExecFindMatchingSubPlans.
*/
- if (!prunestate->do_exec_prune && nplans > 0)
+ if (mergestate->ms_prune_state == NULL ||
+ (!mergestate->ms_prune_state->do_exec_prune && nplans > 0))
mergestate->ms_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
}
else
@@ -219,7 +220,7 @@ ExecMergeAppend(PlanState *pstate)
*/
if (node->ms_valid_subplans == NULL)
node->ms_valid_subplans =
- ExecFindMatchingSubPlans(node->ms_prune_state, false);
+ ExecFindMatchingSubPlans(node->ms_prune_state, false, NULL);
/*
* First time through: pull the first tuple from each valid subplan,
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index fd5796f1b9..2ecb9193aa 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1577,6 +1577,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
{
CachedPlanSource *plansource;
CachedPlan *cplan;
+ CachedPlanExtra *cplan_extra;
List *stmt_list;
char *query_string;
Snapshot snapshot;
@@ -1657,7 +1658,11 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
*/
/* Replan if needed, and increment plan refcount for portal */
- cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv);
+ cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv,
+ &cplan_extra);
+ Assert(cplan_extra == NULL ||
+ (list_length(cplan->stmt_list) ==
+ list_length(cplan_extra->part_prune_results_list)));
stmt_list = cplan->stmt_list;
if (!plan->saved)
@@ -1685,6 +1690,9 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
stmt_list,
cplan);
+ if (cplan_extra)
+ PortalSaveCachedPlanExtra(portal, cplan_extra);
+
/*
* Set up options for portal. Default SCROLL type is chosen the same way
* as PerformCursorOpen does it.
@@ -2067,6 +2075,7 @@ SPI_plan_get_cached_plan(SPIPlanPtr plan)
{
CachedPlanSource *plansource;
CachedPlan *cplan;
+ CachedPlanExtra *cplan_extra = NULL;
SPICallbackArg spicallbackarg;
ErrorContextCallback spierrcontext;
@@ -2092,8 +2101,12 @@ SPI_plan_get_cached_plan(SPIPlanPtr plan)
/* Get the generic plan for the query */
cplan = GetCachedPlan(plansource, NULL,
plan->saved ? CurrentResourceOwner : NULL,
- _SPI_current->queryEnv);
+ _SPI_current->queryEnv,
+ &cplan_extra);
Assert(cplan == plansource->gplan);
+ Assert(cplan_extra == NULL ||
+ (list_length(cplan->stmt_list) ==
+ list_length(cplan_extra->part_prune_results_list)));
/* Pop the error context stack */
error_context_stack = spierrcontext.previous;
@@ -2399,6 +2412,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
SPICallbackArg spicallbackarg;
ErrorContextCallback spierrcontext;
CachedPlan *cplan = NULL;
+ CachedPlanExtra *cplan_extra = NULL;
ListCell *lc1;
/*
@@ -2549,8 +2563,12 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
* plan, the refcount must be backed by the plan_owner.
*/
cplan = GetCachedPlan(plansource, options->params,
- plan_owner, _SPI_current->queryEnv);
+ plan_owner, _SPI_current->queryEnv,
+ &cplan_extra);
+ Assert(cplan_extra == NULL ||
+ (list_length(cplan->stmt_list) ==
+ list_length(cplan_extra->part_prune_results_list)));
stmt_list = cplan->stmt_list;
/*
@@ -2592,9 +2610,14 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
foreach(lc2, stmt_list)
{
PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+ List *part_prune_results = NIL;
bool canSetTag = stmt->canSetTag;
DestReceiver *dest;
+ if (cplan_extra)
+ part_prune_results = list_nth_node(List,
+ cplan_extra->part_prune_results_list,
+ foreach_current_index(lc2));
/*
* Reset output state. (Note that if a non-SPI receiver is used,
* _SPI_current->processed will stay zero, and that's what we'll
@@ -2663,7 +2686,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
else
snap = InvalidSnapshot;
- qdesc = CreateQueryDesc(stmt,
+ qdesc = CreateQueryDesc(stmt, part_prune_results,
plansource->query_string,
snap, crosscheck_snapshot,
dest,
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index ed43d5936d..db27cae297 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -372,6 +372,7 @@ set_plan_references(PlannerInfo *root, Plan *plan)
{
PartitionPruneInfo *pruneinfo = lfirst(lc);
ListCell *l;
+ Bitmapset *leafpart_rtis = NULL;
pruneinfo->root_parent_relids =
offset_relid_set(pruneinfo->root_parent_relids, rtoffset);
@@ -383,17 +384,52 @@ set_plan_references(PlannerInfo *root, Plan *plan)
foreach(l2, prune_infos)
{
PartitionedRelPruneInfo *pinfo = lfirst(l2);
+ int i;
/* RT index of the table to which the pinfo belongs. */
pinfo->rtindex += rtoffset;
+
+ /* Also adjust the RT indexes of leaf partitions that might be scanned. */
+ for (i = 0; i < pinfo->nparts; i++)
+ {
+ if (pinfo->rti_map[i] > 0 && pinfo->subplan_map[i] >= 0)
+ {
+ pinfo->rti_map[i] += rtoffset;
+ leafpart_rtis = bms_add_member(leafpart_rtis,
+ pinfo->rti_map[i]);
+ }
+ }
}
}
+ if (pruneinfo->needs_init_pruning)
+ {
+ glob->containsInitialPruning = true;
+
+ /*
+ * Delete the leaf partition RTIs from the set of relations to be
+ * locked by AcquireExecutorLocks(). The actual set of leaf
+ * partitions to be locked is computed by
+ * CachedPlanLockPartitions().
+ */
+ glob->minLockRelids = bms_del_members(glob->minLockRelids,
+ leafpart_rtis);
+ }
+
glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
glob->containsInitialPruning |= pruneinfo->needs_init_pruning;
}
+ /*
+ * It seems worth doing a bms_copy() on glob->minLockRelids if we deleted
+ * bits from it above to get rid of any empty tail bits. It seems better
+ * for the loop over this set in AcquireExecutorLocks() to not have to go
+ * through those useless bit words.
+ */
+ if (glob->containsInitialPruning)
+ glob->minLockRelids = bms_copy(glob->minLockRelids);
+
return result;
}
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 01d264b5ab..e11e07658d 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1598,6 +1598,7 @@ exec_bind_message(StringInfo input_message)
int16 *rformats = NULL;
CachedPlanSource *psrc;
CachedPlan *cplan;
+ CachedPlanExtra *cplan_extra = NULL;
Portal portal;
char *query_string;
char *saved_stmt_name;
@@ -1972,7 +1973,10 @@ exec_bind_message(StringInfo input_message)
* will be generated in MessageContext. The plan refcount will be
* assigned to the Portal, so it will be released at portal destruction.
*/
- cplan = GetCachedPlan(psrc, params, NULL, NULL);
+ cplan = GetCachedPlan(psrc, params, NULL, NULL, &cplan_extra);
+ Assert(cplan_extra == NULL ||
+ (list_length(cplan->stmt_list) ==
+ list_length(cplan_extra->part_prune_results_list)));
/*
* Now we can define the portal.
@@ -1987,6 +1991,9 @@ exec_bind_message(StringInfo input_message)
cplan->stmt_list,
cplan);
+ if (cplan_extra)
+ PortalSaveCachedPlanExtra(portal, cplan_extra);
+
/* Done with the snapshot used for parameter I/O and parsing/planning */
if (snapshot_set)
PopActiveSnapshot();
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 52e2db6452..32e6b7b767 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -35,7 +35,7 @@
Portal ActivePortal = NULL;
-static void ProcessQuery(PlannedStmt *plan,
+static void ProcessQuery(PlannedStmt *plan, List *part_prune_results,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -65,6 +65,7 @@ static void DoPortalRewind(Portal portal);
*/
QueryDesc *
CreateQueryDesc(PlannedStmt *plannedstmt,
+ List *part_prune_results,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
@@ -77,6 +78,7 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
qd->operation = plannedstmt->commandType; /* operation */
qd->plannedstmt = plannedstmt; /* plan */
+ qd->part_prune_results = part_prune_results;
qd->sourceText = sourceText; /* query text */
qd->snapshot = RegisterSnapshot(snapshot); /* snapshot */
/* RI check snapshot */
@@ -122,6 +124,7 @@ FreeQueryDesc(QueryDesc *qdesc)
* PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
*
* plan: the plan tree for the query
+ * part_prune_results: pruning results returned by CachedPlanLockPartitions()
* sourceText: the source text of the query
* params: any parameters needed
* dest: where to send results
@@ -134,6 +137,7 @@ FreeQueryDesc(QueryDesc *qdesc)
*/
static void
ProcessQuery(PlannedStmt *plan,
+ List *part_prune_results,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -145,7 +149,7 @@ ProcessQuery(PlannedStmt *plan,
/*
* Create the QueryDesc object
*/
- queryDesc = CreateQueryDesc(plan, sourceText,
+ queryDesc = CreateQueryDesc(plan, part_prune_results, sourceText,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
@@ -491,8 +495,13 @@ PortalStart(Portal portal, ParamListInfo params,
/*
* Create QueryDesc in portal's context; for the moment, set
* the destination to DestNone.
+ *
+ * There is no PartitionPruneResult unless the PlannedStmt is
+ * from a CachedPlan.
*/
queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+ portal->cplan_extra == NULL ? NIL :
+ linitial(portal->cplan_extra->part_prune_results_list),
portal->sourceText,
GetActiveSnapshot(),
InvalidSnapshot,
@@ -1225,6 +1234,8 @@ PortalRunMulti(Portal portal,
if (pstmt->utilityStmt == NULL)
{
+ List *part_prune_results = NIL;
+
/*
* process a plannable query.
*/
@@ -1271,10 +1282,19 @@ PortalRunMulti(Portal portal,
else
UpdateActiveSnapshotCommandId();
+ /*
+ * Determine if there's a corresponding List of PartitionPruneResult
+ * for this PlannedStmt.
+ */
+ if (portal->cplan_extra)
+ part_prune_results = list_nth_node(List,
+ portal->cplan_extra->part_prune_results_list,
+ foreach_current_index(stmtlist_item));
+
if (pstmt->canSetTag)
{
/* statement can set tag string */
- ProcessQuery(pstmt,
+ ProcessQuery(pstmt, part_prune_results,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
@@ -1283,7 +1303,7 @@ PortalRunMulti(Portal portal,
else
{
/* stmt added by rewrite cannot set tag */
- ProcessQuery(pstmt,
+ ProcessQuery(pstmt, part_prune_results,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 339bb603f7..16b9869fae 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -59,6 +59,7 @@
#include "access/transam.h"
#include "catalog/namespace.h"
#include "executor/executor.h"
+#include "executor/execPartition.h"
#include "miscadmin.h"
#include "nodes/nodeFuncs.h"
#include "optimizer/optimizer.h"
@@ -99,14 +100,18 @@ static dlist_head cached_expression_list = DLIST_STATIC_INIT(cached_expression_l
static void ReleaseGenericPlan(CachedPlanSource *plansource);
static List *RevalidateCachedQuery(CachedPlanSource *plansource,
QueryEnvironment *queryEnv);
-static bool CheckCachedPlan(CachedPlanSource *plansource);
+static bool CheckCachedPlan(CachedPlanSource *plansource, bool *hasUnlockedParts);
static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
ParamListInfo boundParams, QueryEnvironment *queryEnv);
static bool choose_custom_plan(CachedPlanSource *plansource,
ParamListInfo boundParams);
static double cached_plan_cost(CachedPlan *plan, bool include_planner);
static Query *QueryListGetPrimaryStmt(List *stmts);
-static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static bool AcquireExecutorLocks(List *stmt_list, bool acquire);
+static bool CachedPlanLockPartitions(CachedPlanSource *plansource,
+ ParamListInfo boundParams,
+ ResourceOwner owner,
+ CachedPlanExtra **extra);
static void AcquirePlannerLocks(List *stmt_list, bool acquire);
static void ScanQueryForLocks(Query *parsetree, bool acquire);
static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -783,16 +788,23 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
}
/*
- * CheckCachedPlan: see if the CachedPlanSource's generic plan is valid.
+ * CheckCachedPlan: see if the CachedPlanSource's generic plan is valid and
+ * set *hasUnlockedParts if any PlannedStmt contains "initially" prunable
+ * subnodes; partitions are not locked till initial pruning is done.
*
* Caller must have already called RevalidateCachedQuery to verify that the
* querytree is up to date.
*
- * On a "true" return, we have acquired the locks needed to run the plan.
+ * On a "true" return, we have acquired the minimal set of locks needed to run
+ * the plan, that is, excluding partitions that are subject to being pruned
+ * before execution. The caller must perform that pruning and lock the
+ * partitions that remain before actually telling the world that the plan is
+ * "valid".
+ *
* (We must do this for the "true" result to be race-condition-free.)
*/
static bool
-CheckCachedPlan(CachedPlanSource *plansource)
+CheckCachedPlan(CachedPlanSource *plansource, bool *hasUnlockedParts)
{
CachedPlan *plan = plansource->gplan;
@@ -826,7 +838,7 @@ CheckCachedPlan(CachedPlanSource *plansource)
*/
Assert(plan->refcount > 0);
- AcquireExecutorLocks(plan->stmt_list, true);
+ *hasUnlockedParts = AcquireExecutorLocks(plan->stmt_list, true);
/*
* If plan was transient, check to see if TransactionXmin has
@@ -848,7 +860,7 @@ CheckCachedPlan(CachedPlanSource *plansource)
}
/* Oops, the race case happened. Release useless locks. */
- AcquireExecutorLocks(plan->stmt_list, false);
+ (void) AcquireExecutorLocks(plan->stmt_list, false);
}
/*
@@ -1120,14 +1132,17 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
}
/*
- * GetCachedPlan: get a cached plan from a CachedPlanSource.
+ * GetCachedPlan: get a cached plan from a CachedPlanSource
*
* This function hides the logic that decides whether to use a generic
* plan or a custom plan for the given parameters: the caller does not know
* which it will get.
*
* On return, the plan is valid and we have sufficient locks to begin
- * execution.
+ * execution. If the plan is a generic plan containing prunable partitions,
+ * the locks on partitions are taken after the pruning and the result of that
+ * pruning is saved in (*extra)->part_prune_results_list for the caller to
+ * pass to the executor, along with plan->stmt_list.
*
* On return, the refcount of the plan has been incremented; a later
* ReleaseCachedPlan() call is expected. If "owner" is not NULL then
@@ -1139,12 +1154,16 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
*/
CachedPlan *
GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
- ResourceOwner owner, QueryEnvironment *queryEnv)
+ ResourceOwner owner, QueryEnvironment *queryEnv,
+ CachedPlanExtra **extra)
{
CachedPlan *plan = NULL;
List *qlist;
bool customplan;
+ Assert(extra != NULL);
+ *extra = NULL;
+
/* Assert caller is doing things in a sane order */
Assert(plansource->magic == CACHEDPLANSOURCE_MAGIC);
Assert(plansource->is_complete);
@@ -1160,7 +1179,11 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
if (!customplan)
{
- if (CheckCachedPlan(plansource))
+ bool hasUnlockedParts = false;
+
+ if (CheckCachedPlan(plansource, &hasUnlockedParts) &&
+ (!hasUnlockedParts ||
+ CachedPlanLockPartitions(plansource, boundParams, owner, extra)))
{
/* We want a generic plan, and we already have a valid one */
plan = plansource->gplan;
@@ -1282,6 +1305,147 @@ ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner)
}
}
+/*
+ * For each PlannedStmt in the generic plan, do the "initial" partition pruning
+ * if needed and lock only partitions that survive.
+ *
+ * On return, (*extra)->part_prune_results_list will contain an element for
+ * each PlannedStmt in the generic plan's stmt_list, which is NIL if the
+ * PlannedStmt does not contain any PartitionPruneInfos requiring initial
+ * pruning or a List of PartitionPruneResult containing elements corresponding
+ * to the PartitionPruneInfos in PlannedStmt.partPruneInfos.
+ */
+static bool
+CachedPlanLockPartitions(CachedPlanSource *plansource,
+ ParamListInfo boundParams,
+ ResourceOwner owner,
+ CachedPlanExtra **extra)
+{
+ CachedPlan *plan = plansource->gplan;
+ List *part_prune_results_list = NIL;
+ List *lockedRelids_per_stmt = NIL;
+ ListCell *lc1,
+ *lc2;
+ MemoryContext oldcontext,
+ tmpcontext;
+
+ /*
+ * Won't be here without CheckCachedPlan() having validated a generic
+ * plan.
+ */
+ Assert(plansource->gplan != NULL);
+
+ /*
+ * Create a temporary context for memory allocations required while
+ * executing partition pruning steps.
+ */
+ tmpcontext = AllocSetContextCreate(CurrentMemoryContext,
+ "CachedPlanLockPartitions() working data",
+ ALLOCSET_DEFAULT_SIZES);
+ oldcontext = MemoryContextSwitchTo(tmpcontext);
+ foreach(lc1, plan->stmt_list)
+ {
+ PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
+ Bitmapset *lockPartRelids = NULL;
+ int rti;
+ List *part_prune_results = NIL;
+ Bitmapset *lockedRelids = NULL;
+
+ if (plannedstmt->commandType == CMD_UTILITY)
+ {
+ /*
+ * Ignore utility statements, because AcquireExecutorLocks on the
+ * parent CachedPlan would have dealt with these. Though, do let
+ * the caller know that no pruning is applicable to this statement.
+ */
+ part_prune_results_list = lappend(part_prune_results_list, NIL);
+ lockedRelids_per_stmt = lappend(lockedRelids_per_stmt, NULL);
+ continue;
+ }
+
+ /* Figure out the partitions that would need to be locked. */
+ if (plannedstmt->containsInitialPruning)
+ {
+ foreach(lc2, plannedstmt->partPruneInfos)
+ {
+ PartitionPruneInfo *pruneinfo = lfirst_node(PartitionPruneInfo, lc2);
+ PartitionPruneResult *pruneresult = makeNode(PartitionPruneResult);
+
+ pruneresult->root_parent_relids =
+ bms_copy(pruneinfo->root_parent_relids);
+ pruneresult->valid_subplan_offs =
+ ExecPartitionDoInitialPruning(plannedstmt, boundParams,
+ pruneinfo,
+ &lockPartRelids);
+ part_prune_results = lappend(part_prune_results, pruneresult);
+ }
+ }
+
+ /* Lock 'em. */
+ rti = -1;
+ while ((rti = bms_next_member(lockPartRelids, rti)) > 0)
+ {
+ RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
+
+ Assert(rte->rtekind == RTE_RELATION);
+
+ /*
+ * Acquire the appropriate type of lock on each relation OID. Note
+ * that we don't actually try to open the rel, and hence will not
+ * fail if it's been dropped entirely --- we'll just transiently
+ * acquire a non-conflicting lock.
+ */
+ LockRelationOid(rte->relid, rte->rellockmode);
+ lockedRelids = bms_add_member(lockedRelids, rti);
+ }
+
+ part_prune_results_list = lappend(part_prune_results_list,
+ part_prune_results);
+ lockedRelids_per_stmt = lappend(lockedRelids_per_stmt,
+ lockedRelids);
+ }
+
+ /*
+ * If the plan is still valid, set *extra, returning in it a copy of the
+ * pruning results obtained above, allocated in the caller's context.
+ */
+ MemoryContextSwitchTo(oldcontext);
+ if (plan->is_valid)
+ {
+ *extra = (CachedPlanExtra *) palloc(sizeof(CachedPlanExtra));
+ (*extra)->part_prune_results_list = copyObject(part_prune_results_list);
+ }
+ else
+ {
+ /*
+ * Release the now useless locks. Note that this is the same as what
+ * CheckCachedPlan() does when the locks taken by
+ * AcquireExecutorLocks() cause the plan to be invalidated.
+ */
+ forboth(lc1, plan->stmt_list, lc2, lockedRelids_per_stmt)
+ {
+ PlannedStmt *plannedstmt = lfirst(lc1);
+ Bitmapset *lockedRelids = lfirst(lc2);
+ int rti;
+
+ if (plannedstmt->commandType == CMD_UTILITY)
+ continue;
+ rti = -1;
+ while ((rti = bms_next_member(lockedRelids, rti)) > 0)
+ {
+ RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
+
+ Assert(rte->rtekind == RTE_RELATION);
+ UnlockRelationOid(rte->relid, rte->rellockmode);
+ }
+ }
+ }
+
+ /* Clear up the temporary context. */
+ MemoryContextDelete(tmpcontext);
+ return plan->is_valid;
+}
+
/*
* CachedPlanAllowsSimpleValidityCheck: can we use CachedPlanIsSimplyValid?
*
@@ -1738,11 +1902,16 @@ QueryListGetPrimaryStmt(List *stmts)
/*
* AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
* or release them if acquire is false.
+ *
+ * If some PlannedStmt(s) contain "initially prunable" partitions, they are not
+ * locked here. Instead, the caller is informed of their existence so that it
+ * can lock them after doing the initial pruning.
*/
-static void
+static bool
AcquireExecutorLocks(List *stmt_list, bool acquire)
{
ListCell *lc1;
+ bool hasUnlockedParts = false;
foreach(lc1, stmt_list)
{
@@ -1763,10 +1932,17 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
Assert(plannedstmt->minLockRelids == NULL);
if (query)
- ScanQueryForLocks(query, acquire);
+ ScanQueryForLocks(query, true);
continue;
}
+ /*
+ * If partitions can be pruned before execution, defer their locking to
+ * the caller.
+ */
+ if (plannedstmt->containsInitialPruning)
+ hasUnlockedParts = true;
+
allLockRelids = plannedstmt->minLockRelids;
rti = -1;
while ((rti = bms_next_member(allLockRelids, rti)) > 0)
@@ -1788,6 +1964,8 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
UnlockRelationOid(rte->relid, rte->rellockmode);
}
}
+
+ return hasUnlockedParts;
}
/*
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index 7b1ae6fdcf..94a9db84e3 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -303,6 +303,22 @@ PortalDefineQuery(Portal portal,
portal->status = PORTAL_DEFINED;
}
+/*
+ * Copies the given CachedPlanExtra struct into the portal.
+ */
+void
+PortalSaveCachedPlanExtra(Portal portal, CachedPlanExtra *extra)
+{
+ MemoryContext oldcxt = MemoryContextSwitchTo(portal->portalContext);
+
+ Assert(portal->cplan_extra == NULL && extra != NULL);
+ portal->cplan_extra = (CachedPlanExtra *)
+ palloc(sizeof(CachedPlanExtra));
+ portal->cplan_extra->part_prune_results_list =
+ copyObject(extra->part_prune_results_list);
+ MemoryContextSwitchTo(oldcxt);
+}
+
/*
* PortalReleaseCachedPlan
* Release a portal's reference to its cached plan, if any.
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 9ebde089ae..269cc4d562 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -87,7 +87,9 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
ExplainState *es, const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv);
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt,
+ List *part_prune_results,
+ IntoClause *into,
ExplainState *es, const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv,
const instr_time *planduration,
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index aeeaeb7884..4b98d0d2ef 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -129,5 +129,10 @@ extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
Bitmapset *root_parent_relids,
Bitmapset **initially_valid_subplans);
extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
- bool initial_prune);
+ bool initial_prune,
+ Bitmapset **scan_leafpart_rtis);
+extern Bitmapset *ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt,
+ ParamListInfo params,
+ PartitionPruneInfo *pruneinfo,
+ Bitmapset **scan_leafpart_rtis);
#endif /* EXECPARTITION_H */
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index e79e2c001f..5a7d075750 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -35,6 +35,8 @@ typedef struct QueryDesc
/* These fields are provided by CreateQueryDesc */
CmdType operation; /* CMD_SELECT, CMD_UPDATE, etc. */
PlannedStmt *plannedstmt; /* planner's output (could be utility, too) */
+ List *part_prune_results; /* PartitionPruneResults returned by
+ * CachedPlanLockPartitions() */
const char *sourceText; /* source text of the query */
Snapshot snapshot; /* snapshot to use for query */
Snapshot crosscheck_snapshot; /* crosscheck for RI update/delete */
@@ -57,6 +59,7 @@ typedef struct QueryDesc
/* in pquery.c */
extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
+ List *part_prune_results,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 9a64a830a2..f1374057e5 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -617,6 +617,7 @@ typedef struct EState
List *es_rteperminfos; /* List of RTEPermissionInfo */
PlannedStmt *es_plannedstmt; /* link to top of plan tree */
List *es_part_prune_infos; /* PlannedStmt.partPruneInfos */
+ List *es_part_prune_results; /* QueryDesc.part_prune_results */
const char *es_sourceText; /* Source text from QueryDesc */
JunkFilter *es_junkFilter; /* top-level junk filter, if any */
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 4337e7aa34..10f12e780e 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -134,8 +134,8 @@ typedef struct PlannerGlobal
bool containsInitialPruning;
/*
- * Indexes of all range table entries; for AcquireExecutorLocks()'s
- * perusal.
+ * Indexes of all range table entries except those of leaf partitions
+ * scanned by prunable subplans; for AcquireExecutorLocks()'s perusal.
*/
Bitmapset *minLockRelids;
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index eb0a007946..ab8bc74e4a 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -82,7 +82,9 @@ typedef struct PlannedStmt
List *permInfos; /* list of RTEPermissionInfo nodes for rtable
* entries needing one */
- Bitmapset *minLockRelids; /* Indexes of all range table entries; for
+ Bitmapset *minLockRelids; /* Indexes of all range table entries except
+ * those of leaf partitions scanned by
+ * prunable subplans; for
* AcquireExecutorLocks()'s perusal */
/* rtable indexes of target relations for INSERT/UPDATE/DELETE/MERGE */
@@ -1575,6 +1577,33 @@ typedef struct PartitionPruneStepCombine
List *source_stepids;
} PartitionPruneStepCombine;
+/*----------------
+ * PartitionPruneResult
+ *
+ * The result of performing ExecPartitionDoInitialPruning() on a given
+ * PartitionPruneInfo.
+ *
+ * root_parent_relids is the same as PartitionPruneInfo.root_parent_relids. It's
+ * there for cross-checking in ExecInitPartitionPruning() that the
+ * PartitionPruneResult and the PartitionPruneInfo at a given index in
+ * EState.es_part_prune_results and EState.es_part_prune_infos, respectively,
+ * belong to the same parent plan node.
+ *
+ * valid_subplan_offs contains the indexes of subplans remaining after
+ * performing initial pruning by calling ExecFindMatchingSubPlans() on the
+ * PartitionPruneInfo.
+ *
+ * This is used to store the result of initial partition pruning that is
+ * performed before the execution has started, such as in
+ * CachedPlanLockPartitions().
+ */
+typedef struct PartitionPruneResult
+{
+ NodeTag type;
+
+ Bitmapset *root_parent_relids;
+ Bitmapset *valid_subplan_offs;
+} PartitionPruneResult;
/*
* Plan invalidation info
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 0499635f59..4ac66d2761 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -160,6 +160,14 @@ typedef struct CachedPlan
MemoryContext context; /* context containing this CachedPlan */
} CachedPlan;
+/*
+ * Additional information to pass to the executor when executing a CachedPlan.
+ */
+typedef struct CachedPlanExtra
+{
+ List *part_prune_results_list;
+} CachedPlanExtra;
+
/*
* CachedExpression is a low-overhead mechanism for caching the planned form
* of standalone scalar expressions. While such expressions are not usually
@@ -220,7 +228,8 @@ extern List *CachedPlanGetTargetList(CachedPlanSource *plansource,
extern CachedPlan *GetCachedPlan(CachedPlanSource *plansource,
ParamListInfo boundParams,
ResourceOwner owner,
- QueryEnvironment *queryEnv);
+ QueryEnvironment *queryEnv,
+ CachedPlanExtra **extra);
extern void ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner);
extern bool CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index aeddbdafe5..49bb00cda5 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -138,6 +138,8 @@ typedef struct PortalData
QueryCompletion qc; /* command completion data for executed query */
List *stmts; /* list of PlannedStmts */
CachedPlan *cplan; /* CachedPlan, if stmts are from one */
+ CachedPlanExtra *cplan_extra; /* CachedPlanExtra for cplan in Portal's
+ * memory */
ParamListInfo portalParams; /* params to pass to query */
QueryEnvironment *queryEnv; /* environment for query */
@@ -242,6 +244,7 @@ extern void PortalDefineQuery(Portal portal,
CommandTag commandTag,
List *stmts,
CachedPlan *cplan);
+extern void PortalSaveCachedPlanExtra(Portal portal, CachedPlanExtra *extra);
extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
extern void PortalCreateHoldStore(Portal portal);
extern void PortalHashTableDeleteAll(void);
--
2.35.3
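
For readers skimming the diff above, here is a minimal, self-contained sketch of the deferred-locking flow it describes. It is not code from the patch: ToyPlan, toy_lock_count, toy_acquire_executor_locks() and toy_lock_partitions() are hypothetical stand-ins, and the plan's validity is modeled as a plain flag rather than something that changes as a side effect of taking a lock. The sketch only shows the shape of the idea: lock the unconditionally needed relations first, report whether prunable partitions were skipped, then lock only the partitions that survive initial pruning and release those locks again if the plan turned out to be invalid.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NRELS 8

/* Hypothetical, heavily simplified stand-in for a cached plan. */
typedef struct ToyPlan
{
    bool is_valid;                  /* assumed to be set by invalidation */
    bool contains_initial_pruning;  /* any initially prunable partitions? */
    bool min_lock[TOY_NRELS];       /* relations locked unconditionally */
    bool prunable[TOY_NRELS];       /* leaf partitions subject to pruning */
} ToyPlan;

/* Toy lock table: number of locks currently held on each relation. */
static int toy_lock_count[TOY_NRELS];

/*
 * Lock only the relations that must always be locked.  Returns true if
 * some partitions were left unlocked for the caller to deal with.
 */
static bool
toy_acquire_executor_locks(const ToyPlan *plan)
{
    for (int i = 0; i < TOY_NRELS; i++)
    {
        if (plan->min_lock[i])
            toy_lock_count[i]++;
    }
    return plan->contains_initial_pruning;
}

/*
 * Lock the prunable partitions that survived initial pruning.  If the plan
 * is no longer valid afterwards, release exactly the locks taken here.
 * Returns the plan's validity.
 */
static bool
toy_lock_partitions(const ToyPlan *plan, const bool *survives)
{
    bool locked[TOY_NRELS] = {false};

    for (int i = 0; i < TOY_NRELS; i++)
    {
        if (plan->prunable[i] && survives[i])
        {
            toy_lock_count[i]++;
            locked[i] = true;
        }
    }

    if (!plan->is_valid)
    {
        /* Release the now useless locks. */
        for (int i = 0; i < TOY_NRELS; i++)
        {
            if (locked[i])
                toy_lock_count[i]--;
        }
    }
    return plan->is_valid;
}
```

A caller would invoke toy_lock_partitions() only when toy_acquire_executor_locks() returns true, which mirrors how hasUnlockedParts is meant to be consumed in the patch.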
[application/octet-stream] v30-0001-Preparatory-refactoring-before-reworking-CachedP.patch (17.2K, 3-v30-0001-Preparatory-refactoring-before-reworking-CachedP.patch)
From 22c64b3d1ade0cb0f413c17d84a9bb0dd4e6d734 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Tue, 13 Dec 2022 11:58:07 +0900
Subject: [PATCH v30 1/2] Preparatory refactoring before reworking CachedPlan
locking
Remember the RT indexes of RTEs that AcquireExecutorLocks() must
look at to consider locking in a bitmapset, so that instead of looping
over the range table to find those RTEs, it can look them up using
the RT indexes set in the bitmapset.
This also adds some extra information related to execution-time
pruning to the relevant plan nodes.
---
src/backend/executor/execParallel.c | 1 +
src/backend/executor/execPartition.c | 6 ++++
src/backend/nodes/readfuncs.c | 8 ++++--
src/backend/optimizer/plan/planner.c | 2 ++
src/backend/optimizer/plan/setrefs.c | 12 ++++++++
src/backend/partitioning/partprune.c | 42 ++++++++++++++++++++++++++--
src/backend/utils/cache/plancache.c | 10 +++++--
src/include/executor/execPartition.h | 2 ++
src/include/nodes/nodes.h | 1 +
src/include/nodes/pathnodes.h | 11 ++++++++
src/include/nodes/plannodes.h | 19 +++++++++++++
11 files changed, 106 insertions(+), 8 deletions(-)
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index a5b8e43ec5..65c4b63bbd 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -182,6 +182,7 @@ ExecSerializePlan(Plan *plan, EState *estate)
pstmt->transientPlan = false;
pstmt->dependsOnRole = false;
pstmt->parallelModeNeeded = false;
+ pstmt->containsInitialPruning = false; /* workers need not know! */
pstmt->planTree = plan;
pstmt->partPruneInfos = estate->es_part_prune_infos;
pstmt->rtable = estate->es_range_table;
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 76d79b9741..5b62157712 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -1956,6 +1956,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
Assert(partdesc->nparts >= pinfo->nparts);
pprune->nparts = partdesc->nparts;
pprune->subplan_map = palloc(sizeof(int) * partdesc->nparts);
+ pprune->rti_map = palloc(sizeof(Index) * partdesc->nparts);
if (partdesc->nparts == pinfo->nparts)
{
/*
@@ -1966,6 +1967,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
pprune->subpart_map = pinfo->subpart_map;
memcpy(pprune->subplan_map, pinfo->subplan_map,
sizeof(int) * pinfo->nparts);
+ memcpy(pprune->rti_map, pinfo->rti_map,
+ sizeof(Index) * pinfo->nparts);
/*
* Double-check that the list of unpruned relations has not
@@ -2016,6 +2019,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
pinfo->subplan_map[pd_idx];
pprune->subpart_map[pp_idx] =
pinfo->subpart_map[pd_idx];
+ pprune->rti_map[pp_idx] =
+ pinfo->rti_map[pd_idx];
pd_idx++;
}
else
@@ -2023,6 +2028,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
/* this partdesc entry is not in the plan */
pprune->subplan_map[pp_idx] = -1;
pprune->subpart_map[pp_idx] = -1;
+ pprune->rti_map[pp_idx] = 0;
}
}
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 966b75f5a6..1161671fa4 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -158,6 +158,11 @@
token = pg_strtok(&length); /* skip :fldname */ \
local_node->fldname = readIntCols(len)
+/* Read an Index array */
+#define READ_INDEX_ARRAY(fldname, len) \
+ token = pg_strtok(&length); /* skip :fldname */ \
+ local_node->fldname = readIndexCols(len)
+
/* Read a bool array */
#define READ_BOOL_ARRAY(fldname, len) \
token = pg_strtok(&length); /* skip :fldname */ \
@@ -796,7 +801,6 @@ fnname(int numCols) \
*/
READ_SCALAR_ARRAY(readAttrNumberCols, int16, atoi)
READ_SCALAR_ARRAY(readOidCols, Oid, atooid)
-/* outfuncs.c has writeIndexCols, but we don't yet need that here */
-/* READ_SCALAR_ARRAY(readIndexCols, Index, atoui) */
+READ_SCALAR_ARRAY(readIndexCols, Index, atoui)
READ_SCALAR_ARRAY(readIntCols, int, atoi)
READ_SCALAR_ARRAY(readBoolCols, bool, strtobool)
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 5dd4f92720..620b163ef9 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -523,8 +523,10 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
result->parallelModeNeeded = glob->parallelModeNeeded;
result->planTree = top_plan;
result->partPruneInfos = glob->partPruneInfos;
+ result->containsInitialPruning = glob->containsInitialPruning;
result->rtable = glob->finalrtable;
result->permInfos = glob->finalrteperminfos;
+ result->minLockRelids = glob->minLockRelids;
result->resultRelations = glob->resultRelations;
result->appendRelations = glob->appendRelations;
result->subplans = glob->subplans;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 596f1fbc8e..ed43d5936d 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -279,6 +279,16 @@ set_plan_references(PlannerInfo *root, Plan *plan)
*/
add_rtes_to_flat_rtable(root, false);
+ /*
+ * Add the query's adjusted range of RT indexes to glob->minLockRelids.
+ * The adjusted RT indexes of prunable relations will be deleted from the
+ * set below where PartitionPruneInfos are processed.
+ */
+ glob->minLockRelids =
+ bms_add_range(glob->minLockRelids,
+ rtoffset + 1,
+ rtoffset + list_length(root->parse->rtable));
+
/*
* Adjust RT indexes of PlanRowMarks and add to final rowmarks list
*/
@@ -377,9 +387,11 @@ set_plan_references(PlannerInfo *root, Plan *plan)
/* RT index of the table to which the pinfo belongs. */
pinfo->rtindex += rtoffset;
}
+
}
glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
+ glob->containsInitialPruning |= pruneinfo->needs_init_pruning;
}
return result;
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index d48f6784c1..56270d7670 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -144,7 +144,9 @@ static List *make_partitionedrel_pruneinfo(PlannerInfo *root,
List *prunequal,
Bitmapset *partrelids,
int *relid_subplan_map,
- Bitmapset **matchedsubplans);
+ Bitmapset **matchedsubplans,
+ bool *needs_init_pruning,
+ bool *needs_exec_pruning);
static void gen_partprune_steps(RelOptInfo *rel, List *clauses,
PartClauseTarget target,
GeneratePruningStepsContext *context);
@@ -234,6 +236,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int *relid_subplan_map;
ListCell *lc;
int i;
+ bool needs_init_pruning = false;
+ bool needs_exec_pruning = false;
/*
* Scan the subpaths to see which ones are scans of partition child
@@ -313,12 +317,16 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
Bitmapset *partrelids = (Bitmapset *) lfirst(lc);
List *pinfolist;
Bitmapset *matchedsubplans = NULL;
+ bool partrel_needs_init_pruning;
+ bool partrel_needs_exec_pruning;
pinfolist = make_partitionedrel_pruneinfo(root, parentrel,
prunequal,
partrelids,
relid_subplan_map,
- &matchedsubplans);
+ &matchedsubplans,
+ &partrel_needs_init_pruning,
+ &partrel_needs_exec_pruning);
/* When pruning is possible, record the matched subplans */
if (pinfolist != NIL)
@@ -327,6 +335,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
allmatchedsubplans = bms_join(matchedsubplans,
allmatchedsubplans);
}
+
+ needs_init_pruning |= partrel_needs_init_pruning;
+ needs_exec_pruning |= partrel_needs_exec_pruning;
}
pfree(relid_subplan_map);
@@ -342,6 +353,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
pruneinfo = makeNode(PartitionPruneInfo);
pruneinfo->root_parent_relids = parentrel->relids;
pruneinfo->prune_infos = prunerelinfos;
+ pruneinfo->needs_init_pruning = needs_init_pruning;
+ pruneinfo->needs_exec_pruning = needs_exec_pruning;
/*
* Some subplans may not belong to any of the identified partitioned rels.
@@ -442,13 +455,19 @@ add_part_relids(List *allpartrelids, Bitmapset *partrelids)
* If we cannot find any useful run-time pruning steps, return NIL.
* However, on success, each rel identified in partrelids will have
* an element in the result list, even if some of them are useless.
+ * *needs_init_pruning and *needs_exec_pruning are set to indicate whether
+ * the pruning steps contained in the returned PartitionedRelPruneInfos
+ * can be performed during executor startup and during execution,
+ * respectively.
*/
static List *
make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
List *prunequal,
Bitmapset *partrelids,
int *relid_subplan_map,
- Bitmapset **matchedsubplans)
+ Bitmapset **matchedsubplans,
+ bool *needs_init_pruning,
+ bool *needs_exec_pruning)
{
RelOptInfo *targetpart = NULL;
List *pinfolist = NIL;
@@ -459,6 +478,10 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int rti;
int i;
+ /* Will find out below. */
+ *needs_init_pruning = false;
+ *needs_exec_pruning = false;
+
/*
* Examine each partitioned rel, constructing a temporary array to map
* from planner relids to index of the partitioned rel, and building a
@@ -546,6 +569,9 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
* executor per-scan pruning steps. This first pass creates startup
* pruning steps and detects whether there's any possibly-useful quals
* that would require per-scan pruning.
+ *
+ * In the first pass, we note whether the 2nd pass is necessary by
+ * noting the presence of EXEC parameters.
*/
gen_partprune_steps(subpart, partprunequal, PARTTARGET_INITIAL,
&context);
@@ -620,6 +646,12 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
pinfo->execparamids = execparamids;
/* Remaining fields will be filled in the next loop */
+ /* record which types of pruning steps we've seen so far */
+ if (initial_pruning_steps != NIL)
+ *needs_init_pruning = true;
+ if (exec_pruning_steps != NIL)
+ *needs_exec_pruning = true;
+
pinfolist = lappend(pinfolist, pinfo);
}
@@ -647,6 +679,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int *subplan_map;
int *subpart_map;
Oid *relid_map;
+ Index *rti_map;
/*
* Construct the subplan and subpart maps for this partitioning level.
@@ -659,6 +692,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
subpart_map = (int *) palloc(nparts * sizeof(int));
memset(subpart_map, -1, nparts * sizeof(int));
relid_map = (Oid *) palloc0(nparts * sizeof(Oid));
+ rti_map = (Index *) palloc0(nparts * sizeof(Index));
present_parts = NULL;
i = -1;
@@ -673,6 +707,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
subplan_map[i] = subplanidx = relid_subplan_map[partrel->relid] - 1;
subpart_map[i] = subpartidx = relid_subpart_map[partrel->relid] - 1;
relid_map[i] = planner_rt_fetch(partrel->relid, root)->relid;
+ rti_map[i] = partrel->relid;
if (subplanidx >= 0)
{
present_parts = bms_add_member(present_parts, i);
@@ -697,6 +732,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
pinfo->subplan_map = subplan_map;
pinfo->subpart_map = subpart_map;
pinfo->relid_map = relid_map;
+ pinfo->rti_map = rti_map;
}
pfree(relid_subpart_map);
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index cc943205d3..339bb603f7 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -1747,7 +1747,8 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
foreach(lc1, stmt_list)
{
PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
- ListCell *lc2;
+ Bitmapset *allLockRelids;
+ int rti;
if (plannedstmt->commandType == CMD_UTILITY)
{
@@ -1760,14 +1761,17 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
*/
Query *query = UtilityContainsQuery(plannedstmt->utilityStmt);
+ Assert(plannedstmt->minLockRelids == NULL);
if (query)
ScanQueryForLocks(query, acquire);
continue;
}
- foreach(lc2, plannedstmt->rtable)
+ allLockRelids = plannedstmt->minLockRelids;
+ rti = -1;
+ while ((rti = bms_next_member(allLockRelids, rti)) > 0)
{
- RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
+ RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
if (rte->rtekind != RTE_RELATION)
continue;
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 17fabc18c9..aeeaeb7884 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -45,6 +45,7 @@ extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
* nparts Length of subplan_map[] and subpart_map[].
* subplan_map Subplan index by partition index, or -1.
* subpart_map Subpart index by partition index, or -1.
+ * rti_map Range table index by partition index, or 0.
* present_parts A Bitmapset of the partition indexes that we
* have subplans or subparts for.
* initial_pruning_steps List of PartitionPruneSteps used to
@@ -61,6 +62,7 @@ typedef struct PartitionedRelPruningData
int nparts;
int *subplan_map;
int *subpart_map;
+ Index *rti_map;
Bitmapset *present_parts;
List *initial_pruning_steps;
List *exec_pruning_steps;
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 1f33902947..c2f2544df5 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -218,6 +218,7 @@ extern struct Bitmapset *readBitmapset(void);
extern uintptr_t readDatum(bool typbyval);
extern bool *readBoolCols(int numCols);
extern int *readIntCols(int numCols);
+extern Index *readIndexCols(int numCols);
extern Oid *readOidCols(int numCols);
extern int16 *readAttrNumberCols(int numCols);
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 654dba61aa..4337e7aa34 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -128,6 +128,17 @@ typedef struct PlannerGlobal
/* List of PartitionPruneInfo contained in the plan */
List *partPruneInfos;
+ /*
+ * Do any of those PartitionPruneInfos have initial pruning steps in them?
+ */
+ bool containsInitialPruning;
+
+ /*
+ * Indexes of all range table entries; for AcquireExecutorLocks()'s
+ * perusal.
+ */
+ Bitmapset *minLockRelids;
+
/* OIDs of relations the plan depends on */
List *relationOids;
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index bddfe86191..eb0a007946 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -73,11 +73,18 @@ typedef struct PlannedStmt
List *partPruneInfos; /* List of PartitionPruneInfo contained in the
* plan */
+ bool containsInitialPruning; /* Do any of those PartitionPruneInfos
+ * have initial pruning steps in them?
+ */
+
List *rtable; /* list of RangeTblEntry nodes */
List *permInfos; /* list of RTEPermissionInfo nodes for rtable
* entries needing one */
+ Bitmapset *minLockRelids; /* Indexes of all range table entries; for
+ * AcquireExecutorLocks()'s perusal */
+
/* rtable indexes of target relations for INSERT/UPDATE/DELETE/MERGE */
List *resultRelations; /* integer list of RT indexes, or NIL */
@@ -1417,6 +1424,13 @@ typedef struct PlanRowMark
* prune_infos List of Lists containing PartitionedRelPruneInfo nodes,
* one sublist per run-time-prunable partition hierarchy
* appearing in the parent plan node's subplans.
+ *
+ * needs_init_pruning Does any of the PartitionedRelPruneInfos in
+ * prune_infos have its initial_pruning_steps set?
+ *
+ * needs_exec_pruning Does any of the PartitionedRelPruneInfos in
+ * prune_infos have its exec_pruning_steps set?
+ *
* other_subplans Indexes of any subplans that are not accounted for
* by any of the PartitionedRelPruneInfo nodes in
* "prune_infos". These subplans must not be pruned.
@@ -1428,6 +1442,8 @@ typedef struct PartitionPruneInfo
NodeTag type;
Bitmapset *root_parent_relids;
List *prune_infos;
+ bool needs_init_pruning;
+ bool needs_exec_pruning;
Bitmapset *other_subplans;
} PartitionPruneInfo;
@@ -1472,6 +1488,9 @@ typedef struct PartitionedRelPruneInfo
/* relation OID by partition index, or 0 */
Oid *relid_map pg_node_attr(array_size(nparts));
+ /* Range table index by partition index, or 0. */
+ Index *rti_map pg_node_attr(array_size(nparts));
+
/*
* initial_pruning_steps shows how to prune during executor startup (i.e.,
* without use of any PARAM_EXEC Params); it is NIL if no startup pruning
--
2.35.3
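
As a rough illustration of the minLockRelids idea in the two patches above, the following self-contained sketch uses a toy one-word bitmap in place of PostgreSQL's Bitmapset. All names here (ToySet, toy_add_range(), toy_del_member(), toy_next_member(), toy_min_lock_relids(), toy_count_lockable()) are hypothetical and exist only for this example. The assumption is that the set starts out as the full range of RT indexes for a query, that the indexes of prunable leaf partitions are then removed, and that the locking code walks the remaining members with a next-member loop of the same shape as the one in AcquireExecutorLocks().

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for a Bitmapset: a single 64-bit word, members 0..63. */
typedef uint64_t ToySet;

/* Add all members in [lower, upper] to the set. */
static ToySet
toy_add_range(ToySet s, int lower, int upper)
{
    for (int i = lower; i <= upper; i++)
        s |= (UINT64_C(1) << i);
    return s;
}

/* Remove one member from the set. */
static ToySet
toy_del_member(ToySet s, int x)
{
    return s & ~(UINT64_C(1) << x);
}

/* Smallest member greater than prev, or -2 if there is none. */
static int
toy_next_member(ToySet s, int prev)
{
    for (int i = prev + 1; i < 64; i++)
    {
        if (s & (UINT64_C(1) << i))
            return i;
    }
    return -2;
}

/*
 * Build the set of RT indexes to lock up front: every entry of a range
 * table of nrtes entries placed at offset rtoffset, minus the prunable
 * leaf partitions.  Assumes rtoffset + nrtes < 64.
 */
static ToySet
toy_min_lock_relids(int rtoffset, int nrtes,
                    const int *prunable, int nprunable)
{
    ToySet s = toy_add_range(0, rtoffset + 1, rtoffset + nrtes);

    for (int i = 0; i < nprunable; i++)
        s = toy_del_member(s, prunable[i]);
    return s;
}

/* Visit the members the way the locking loop would, counting them. */
static int
toy_count_lockable(ToySet s)
{
    int n = 0;
    int rti = -1;

    while ((rti = toy_next_member(s, rti)) > 0)
        n++;
    return n;
}
```

With a five-entry range table at offset 0 and RT indexes 3 and 4 marked prunable, the loop would visit only indexes 1, 2 and 5, so the two partitions stay unlocked until pruning has decided whether they are needed.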
* Re: generic plans and "initial" pruning
@ 2022-12-21 10:18 Alvaro Herrera <[email protected]>
parent: Amit Langote <[email protected]>
0 siblings, 2 replies; 71+ messages in thread
From: Alvaro Herrera @ 2022-12-21 10:18 UTC (permalink / raw)
To: Amit Langote <[email protected]>; +Cc: Robert Haas <[email protected]>; Jacob Champion <[email protected]>; David Rowley <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers
This version of the patch looks not entirely unreasonable to me. I'll
set this as Ready for Committer in case David or Tom or someone else
want to have a look and potentially commit it.
--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
* Re: generic plans and "initial" pruning
@ 2022-12-21 10:47 Amit Langote <[email protected]>
parent: Alvaro Herrera <[email protected]>
1 sibling, 0 replies; 71+ messages in thread
From: Amit Langote @ 2022-12-21 10:47 UTC (permalink / raw)
To: Alvaro Herrera <[email protected]>; +Cc: Robert Haas <[email protected]>; Jacob Champion <[email protected]>; David Rowley <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers
On Wed, Dec 21, 2022 at 7:18 PM Alvaro Herrera <[email protected]> wrote:
> This version of the patch looks not entirely unreasonable to me. I'll
> set this as Ready for Committer in case David or Tom or someone else
> want to have a look and potentially commit it.
Thank you, Alvaro.
--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com
* Re: generic plans and "initial" pruning
@ 2022-12-21 15:18 Tom Lane <[email protected]>
parent: Alvaro Herrera <[email protected]>
1 sibling, 0 replies; 71+ messages in thread
From: Tom Lane @ 2022-12-21 15:18 UTC (permalink / raw)
To: Alvaro Herrera <[email protected]>; +Cc: Amit Langote <[email protected]>; Robert Haas <[email protected]>; Jacob Champion <[email protected]>; David Rowley <[email protected]>; pgsql-hackers
Alvaro Herrera <[email protected]> writes:
> This version of the patch looks not entirely unreasonable to me. I'll
> set this as Ready for Committer in case David or Tom or someone else
> want to have a look and potentially commit it.
I will have a look during the January CF.
regards, tom lane
Thread overview: 71+ messages
2022-02-10 08:13 Re: generic plans and "initial" pruning Amit Langote <[email protected]>
2022-02-10 22:01 ` Robert Haas <[email protected]>
2022-03-07 14:18 ` Amit Langote <[email protected]>
2022-03-11 14:35 ` Amit Langote <[email protected]>
2022-03-11 15:06 ` Amit Langote <[email protected]>
2022-03-14 18:42 ` Robert Haas <[email protected]>
2022-03-14 19:38 ` Tom Lane <[email protected]>
2022-03-14 20:06 ` Robert Haas <[email protected]>
2022-03-15 06:19 ` Amit Langote <[email protected]>
2022-03-22 12:44 ` Amit Langote <[email protected]>
2022-03-28 07:17 ` Amit Langote <[email protected]>
2022-03-28 07:28 ` Amit Langote <[email protected]>
2022-03-31 03:25 ` Amit Langote <[email protected]>
2022-03-31 09:56 ` Alvaro Herrera <[email protected]>
2022-03-31 11:11 ` Amit Langote <[email protected]>
2022-04-01 01:31 ` David Rowley <[email protected]>
2022-04-01 03:09 ` Amit Langote <[email protected]>
2022-04-01 03:45 ` Tom Lane <[email protected]>
2022-04-01 07:01 ` Amit Langote <[email protected]>
2022-04-01 04:08 ` David Rowley <[email protected]>
2022-04-01 06:58 ` Amit Langote <[email protected]>
2022-04-01 08:19 ` David Rowley <[email protected]>
2022-04-01 08:36 ` Amit Langote <[email protected]>
2022-04-06 07:20 ` Amit Langote <[email protected]>
2022-04-07 08:27 ` Amit Langote <[email protected]>
2022-04-07 12:41 ` David Rowley <[email protected]>
2022-04-08 05:49 ` Amit Langote <[email protected]>
2022-04-08 11:15 ` David Rowley <[email protected]>
2022-04-08 11:45 ` Amit Langote <[email protected]>
2022-04-11 03:05 ` Amit Langote <[email protected]>
2022-04-11 03:58 ` Zhihong Yu <[email protected]>
2022-05-27 08:09 ` Amit Langote <[email protected]>
2022-05-27 20:08 ` Zhihong Yu <[email protected]>
2022-07-05 17:43 ` Jacob Champion <[email protected]>
2022-07-06 02:37 ` Amit Langote <[email protected]>
2022-07-13 06:40 ` Amit Langote <[email protected]>
2022-07-13 07:03 ` Amit Langote <[email protected]>
2022-07-27 03:00 ` Amit Langote <[email protected]>
2022-07-27 16:27 ` Robert Haas <[email protected]>
2022-07-29 04:20 ` Amit Langote <[email protected]>
2022-10-12 07:36 ` Amit Langote <[email protected]>
2022-10-17 09:29 ` Amit Langote <[email protected]>
2022-10-27 02:41 ` Amit Langote <[email protected]>
2022-11-08 06:22 ` Amit Langote <[email protected]>
2022-11-30 18:12 ` Alvaro Herrera <[email protected]>
2022-12-01 07:59 ` Amit Langote <[email protected]>
2022-12-01 11:21 ` Alvaro Herrera <[email protected]>
2022-12-01 12:43 ` Amit Langote <[email protected]>
2022-12-02 10:40 ` Amit Langote <[email protected]>
2022-12-05 03:00 ` Amit Langote <[email protected]>
2022-12-05 06:08 ` Amit Langote <[email protected]>
2022-12-06 19:00 ` Alvaro Herrera <[email protected]>
2022-12-09 08:26 ` Amit Langote <[email protected]>
2022-12-09 09:52 ` Alvaro Herrera <[email protected]>
2022-12-09 10:34 ` Amit Langote <[email protected]>
2022-12-09 10:49 ` Alvaro Herrera <[email protected]>
2022-12-09 11:02 ` Amit Langote <[email protected]>
2022-12-09 11:37 ` Alvaro Herrera <[email protected]>
2022-12-12 11:19 ` Amit Langote <[email protected]>
2022-12-12 17:24 ` Alvaro Herrera <[email protected]>
2022-12-14 08:35 ` Amit Langote <[email protected]>
2022-12-16 02:33 ` Amit Langote <[email protected]>
2022-12-21 10:18 ` Alvaro Herrera <[email protected]>
2022-12-21 10:47 ` Amit Langote <[email protected]>
2022-12-21 15:18 ` Tom Lane <[email protected]>
2022-07-29 04:55 ` Tom Lane <[email protected]>
2022-07-29 12:22 ` Robert Haas <[email protected]>
2022-07-29 16:47 ` Tom Lane <[email protected]>
2022-07-29 16:55 ` Robert Haas <[email protected]>
2022-07-29 15:04 ` Tom Lane <[email protected]>
2022-07-29 15:56 ` Robert Haas <[email protected]>