generic plans and "initial" pruning
108+ messages / 13 participants

* generic plans and "initial" pruning
@ 2021-12-25 03:36  Amit Langote <[email protected]>
  0 siblings, 2 replies; 108+ messages in thread

From: Amit Langote @ 2021-12-25 03:36 UTC (permalink / raw)
  To: pgsql-hackers

Executing generic plans involving partitions is known to become slower
as partition count grows due to a number of bottlenecks, with
AcquireExecutorLocks() showing at the top in profiles.

A previous attempt at solving that problem was made by David Rowley [1],
where he proposed delaying locking of *all* partitions appearing under
an Append/MergeAppend until "initial" pruning is done during the
executor initialization phase.  A problem with that approach, which he
described in [2], is that leaving partitions unlocked can lead to
race conditions: the Plan node belonging to a partition can be
invalidated if a concurrent session successfully alters the
partition between AcquireExecutorLocks() saying the plan is okay to
execute and the plan actually being executed.

However, using an idea that Robert suggested to me off-list a little
while back, it seems possible to determine the set of partitions that
we can safely skip locking.  The idea is to look at the "initial" or
"pre-execution" pruning instructions contained in a given Append or
MergeAppend node when AcquireExecutorLocks() is collecting the
relations to lock and consider relations from only those sub-nodes
that survive performing those instructions.  I've attempted to
implement that idea in the attached patch.
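
For illustration only (this is not code from the patch), the idea can be
sketched with a toy model in plain C.  All of the names below (ToyPartition,
ToyAppend, toy_collect_lock_rtindexes) are hypothetical and do not exist in
PostgreSQL, and range bounds stand in for the real pruning steps.  As a
simplifying assumption, the parent is always locked; only the partitions
that survive "initial" pruning against the known parameter value are added
to the set of relations to lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model; none of these types exist in PostgreSQL. */
typedef struct ToyPartition
{
	int			rtindex;	/* range table index of the partition */
	int			lower;		/* inclusive lower bound of its range */
	int			upper;		/* exclusive upper bound of its range */
} ToyPartition;

typedef struct ToyAppend
{
	int			parent_rtindex;			/* RT index of the partitioned table */
	const ToyPartition *parts;			/* one entry per child subplan */
	size_t		nparts;
	bool		has_initial_pruning;	/* prunable before execution starts? */
} ToyAppend;

/*
 * Collect the RT indexes that would need locking for a given parameter
 * value.  'out' must have room for nparts + 1 entries.  Returns the number
 * of entries written.
 */
size_t
toy_collect_lock_rtindexes(const ToyAppend *append, int param, int *out)
{
	size_t		n = 0;

	/* Simplifying assumption: the parent is always locked. */
	out[n++] = append->parent_rtindex;

	for (size_t i = 0; i < append->nparts; i++)
	{
		const ToyPartition *p = &append->parts[i];

		/* Skip partitions that "initial" pruning would eliminate. */
		if (append->has_initial_pruning &&
			(param < p->lower || param >= p->upper))
			continue;

		out[n++] = p->rtindex;
	}

	return n;
}
```

With three range partitions and a parameter that falls into the second one,
only the parent and that one partition end up in the lock set; without
initial pruning steps, every partition is included, which corresponds to
the existing behavior.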

Note that "initial" pruning steps are now performed twice when
executing generic plans: once in AcquireExecutorLocks() to find
partitions to be locked, and a second time in ExecInit[Merge]Append()
to determine the set of partition sub-nodes to be initialized for
execution.  I wasn't able to come up with a good idea to avoid
this duplication.

Using the following benchmark setup:

pgbench testdb -i --partitions=$nparts > /dev/null 2>&1
pgbench -n testdb -S -T 30 -Mprepared

and plan_cache_mode = force_generic_plan,

I get the following numbers (the first column is $nparts):

HEAD:

32      tps = 20561.776403 (without initial connection time)
64      tps = 12553.131423 (without initial connection time)
128     tps = 13330.365696 (without initial connection time)
256     tps = 8605.723120 (without initial connection time)
512     tps = 4435.951139 (without initial connection time)
1024    tps = 2346.902973 (without initial connection time)
2048    tps = 1334.680971 (without initial connection time)

Patched:

32      tps = 27554.156077 (without initial connection time)
64      tps = 27531.161310 (without initial connection time)
128     tps = 27138.305677 (without initial connection time)
256     tps = 25825.467724 (without initial connection time)
512     tps = 19864.386305 (without initial connection time)
1024    tps = 18742.668944 (without initial connection time)
2048    tps = 16312.412704 (without initial connection time)
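
As a reading aid (not part of the original benchmark), the ratio of patched
to HEAD throughput can be computed from the figures above.  The struct and
function names below are hypothetical; the tps values are copied from the
two tables.  The ratio works out to roughly 1.3x at 32 partitions and
roughly 12x at 2048 partitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper types; the tps values are taken from the tables above. */
typedef struct TpsSample
{
	int			nparts;			/* number of partitions */
	double		head_tps;		/* tps on HEAD */
	double		patched_tps;	/* tps with the patch applied */
} TpsSample;

const TpsSample tps_samples[] = {
	{32, 20561.776403, 27554.156077},
	{64, 12553.131423, 27531.161310},
	{128, 13330.365696, 27138.305677},
	{256, 8605.723120, 25825.467724},
	{512, 4435.951139, 19864.386305},
	{1024, 2346.902973, 18742.668944},
	{2048, 1334.680971, 16312.412704},
};

const size_t n_tps_samples = sizeof(tps_samples) / sizeof(tps_samples[0]);

/* Ratio of patched throughput to HEAD throughput for one sample. */
double
tps_speedup(const TpsSample *s)
{
	return s->patched_tps / s->head_tps;
}
```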

-- 
Amit Langote
EDB: http://www.enterprisedb.com

[1] https://www.postgresql.org/message-id/[email protected]...

[2] https://www.postgresql.org/message-id/CAKJS1f99JNe%2Bsw5E3qWmS%2BHeLMFaAhehKO67J1Ym3pXv0XBsxw%40mail...


Attachments:

  [application/octet-stream] v1-0001-Teach-AcquireExecutorLocks-to-acquire-fewer-locks.patch (62.1K, 2-v1-0001-Teach-AcquireExecutorLocks-to-acquire-fewer-locks.patch)
From ed4de69e7ae180eca380ae581152b6650175661f Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Wed, 22 Dec 2021 16:55:17 +0900
Subject: [PATCH v1] Teach AcquireExecutorLocks() to acquire fewer locks in
 some cases

Currently, AcquireExecutorLocks() loops over the range table of a
given PlannedStmt and locks all relations found therein, even those
that won't actually be scanned during execution due to being
eliminated by "initial" pruning that is applied during the
initialization of their owning Append or MergeAppend node. This makes
AcquireExecutorLocks() itself do the "initial" pruning on nodes that
support it and lock only those relations that are contained in the
subnodes that survive the pruning.

To that end, AcquireExecutorLocks() now loops over a bitmapset of
RT indexes, those of the RTEs of "lockable" relations, instead of
the whole range table to find such entries.  When pruning is possible,
the bitmapset is constructed by walking the plan tree to locate
nodes that allow "initial" (or "pre-execution") pruning and
disregarding relations from subnodes that don't survive the pruning
instructions.

PlannedStmt gets a bitmapset field to store the RT indexes of
lockable relations that is populated when constructing the flat range
table in setrefs.c.  It is used as is in the absence of any prunable
nodes.

PlannedStmt also gets a new field that indicates whether any of the
nodes of the plan tree contain "initial" (or "pre-execution") pruning
steps, which saves the trouble of walking the plan tree only to find
whether that's the case.

ExecFindInitialMatchingSubPlans() is refactored to allow being
called outside a full-fledged executor context.
---
 src/backend/executor/execParallel.c    |   2 +
 src/backend/executor/execPartition.c   | 534 ++++++++++++++++++-------
 src/backend/executor/nodeAppend.c      |  39 +-
 src/backend/executor/nodeMergeAppend.c |  39 +-
 src/backend/nodes/copyfuncs.c          |   4 +
 src/backend/nodes/nodeFuncs.c          | 121 +++++-
 src/backend/nodes/outfuncs.c           |   5 +
 src/backend/nodes/readfuncs.c          |   4 +
 src/backend/optimizer/plan/planner.c   |   2 +
 src/backend/optimizer/plan/setrefs.c   |  10 +
 src/backend/partitioning/partprune.c   |  57 ++-
 src/backend/utils/cache/plancache.c    | 217 +++++++++-
 src/include/executor/execPartition.h   |  13 +-
 src/include/nodes/nodeFuncs.h          |   3 +
 src/include/nodes/pathnodes.h          |   6 +
 src/include/nodes/plannodes.h          |  15 +
 src/include/partitioning/partprune.h   |   3 +
 17 files changed, 866 insertions(+), 208 deletions(-)

diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index f8a4a40e7b..d14e60724b 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -182,8 +182,10 @@ ExecSerializePlan(Plan *plan, EState *estate)
 	pstmt->transientPlan = false;
 	pstmt->dependsOnRole = false;
 	pstmt->parallelModeNeeded = false;
+	pstmt->usesPreExecPruning = false;
 	pstmt->planTree = plan;
 	pstmt->rtable = estate->es_range_table;
+	pstmt->relationRTIs = NULL;
 	pstmt->resultRelations = NIL;
 	pstmt->appendRelations = NIL;
 
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 5c723bc54e..8c63272398 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -24,6 +24,7 @@
 #include "mb/pg_wchar.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
+#include "parser/parsetree.h"
 #include "partitioning/partbounds.h"
 #include "partitioning/partdesc.h"
 #include "partitioning/partprune.h"
@@ -186,7 +187,8 @@ static void ExecInitPruningContext(PartitionPruneContext *context,
 								   List *pruning_steps,
 								   PartitionDesc partdesc,
 								   PartitionKey partkey,
-								   PlanState *planstate);
+								   PlanState *planstate,
+								   ExprContext *econtext);
 static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
 										   PartitionedRelPruningData *pprune,
 										   bool initial_prune,
@@ -1511,8 +1513,7 @@ adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri)
 
 /*
  * ExecCreatePartitionPruneState
- *		Build the data structure required for calling
- *		ExecFindInitialMatchingSubPlans and ExecFindMatchingSubPlans.
+ *		Build the data structure for run-time pruning
  *
  * 'planstate' is the parent plan node's execution state.
  *
@@ -1526,10 +1527,20 @@ adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri)
  * as children.  The data stored in each PartitionedRelPruningData can be
  * re-used each time we re-evaluate which partitions match the pruning steps
  * provided in each PartitionedRelPruneInfo.
+ *
+ * This does not consider initial_pruning_steps because they must already have
+ * been performed by the caller and the subplans remaining after doing so are
+ * given as 'initially_valid_subplans'.  The translation data to be put into
+ * PartitionPruneState that allows conversion of partition indexes into subplan
+ * indexes are updated here to account for the unneeded subplans having been
+ * removed by initial pruning. 'nsubplans' gives the number of subplans that
+ * were present before initial pruning.
  */
 PartitionPruneState *
 ExecCreatePartitionPruneState(PlanState *planstate,
-							  PartitionPruneInfo *partitionpruneinfo)
+							  PartitionPruneInfo *partitionpruneinfo,
+							  Bitmapset *initially_valid_subplans,
+							  int nsubplans)
 {
 	EState	   *estate = planstate->state;
 	PartitionPruneState *prunestate;
@@ -1537,6 +1548,15 @@ ExecCreatePartitionPruneState(PlanState *planstate,
 	ListCell   *lc;
 	int			i;
 
+	/*
+	 * Only create a PartitionPruneState if pruning needs to be performed
+	 * during the execution of the owning plan.  Note that this means the
+	 * initial pruning steps which are used to determine the set of subplans
+	 * that are valid for actual execution are performed without creating a
+	 * PartitionPruneState; see ExecFindInitialMatchingSubPlans().
+	 */
+	Assert(partitionpruneinfo->contains_exec_steps);
+
 	/* For data reading, executor always omits detached partitions */
 	if (estate->es_partition_directory == NULL)
 		estate->es_partition_directory =
@@ -1555,7 +1575,6 @@ ExecCreatePartitionPruneState(PlanState *planstate,
 	prunestate->execparamids = NULL;
 	/* other_subplans can change at runtime, so we need our own copy */
 	prunestate->other_subplans = bms_copy(partitionpruneinfo->other_subplans);
-	prunestate->do_initial_prune = false;	/* may be set below */
 	prunestate->do_exec_prune = false;	/* may be set below */
 	prunestate->num_partprunedata = n_part_hierarchies;
 
@@ -1702,23 +1721,17 @@ ExecCreatePartitionPruneState(PlanState *planstate,
 			pprune->present_parts = bms_copy(pinfo->present_parts);
 
 			/*
-			 * Initialize pruning contexts as needed.
+			 * Initialize pruning contexts as needed, ignoring any
+			 * initial_pruning_steps because they must already have been
+			 * performed.
 			 */
-			pprune->initial_pruning_steps = pinfo->initial_pruning_steps;
-			if (pinfo->initial_pruning_steps)
-			{
-				ExecInitPruningContext(&pprune->initial_context,
-									   pinfo->initial_pruning_steps,
-									   partdesc, partkey, planstate);
-				/* Record whether initial pruning is needed at any level */
-				prunestate->do_initial_prune = true;
-			}
 			pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
 			if (pinfo->exec_pruning_steps)
 			{
 				ExecInitPruningContext(&pprune->exec_context,
 									   pinfo->exec_pruning_steps,
-									   partdesc, partkey, planstate);
+									   partdesc, partkey, planstate,
+									   planstate->ps_ExprContext);
 				/* Record whether exec pruning is needed at any level */
 				prunestate->do_exec_prune = true;
 			}
@@ -1735,18 +1748,136 @@ ExecCreatePartitionPruneState(PlanState *planstate,
 		i++;
 	}
 
+	/*
+	 * If exec-time pruning is required and subplans appear to have been
+	 * pruned by initial pruning steps, then we must re-sequence the subplan
+	 * indexes so that ExecFindMatchingSubPlans() properly returns the indexes
+	 * of the subplans that have remained after initial pruning, that is,
+	 * initially_valid_subplans.
+	 *
+	 * We can safely skip this when !do_exec_prune, even though that leaves
+	 * invalid data in pruneinfo, because that data won't be consulted again
+	 * (cf initial Assert in ExecFindMatchingSubPlans).
+	 */
+	if (prunestate->do_exec_prune &&
+		bms_num_members(initially_valid_subplans) < nsubplans)
+	{
+		int		   *new_subplan_indexes;
+		Bitmapset  *new_other_subplans;
+		int			i;
+		int			newidx;
+
+		/*
+		 * First we must build a temporary array which maps old subplan
+		 * indexes to new ones.  For convenience of initialization, we use
+		 * 1-based indexes in this array and leave pruned items as 0.
+		 */
+		new_subplan_indexes = (int *) palloc0(sizeof(int) * nsubplans);
+		newidx = 1;
+		i = -1;
+		while ((i = bms_next_member(initially_valid_subplans, i)) >= 0)
+		{
+			Assert(i < nsubplans);
+			new_subplan_indexes[i] = newidx++;
+		}
+
+		/*
+		 * Now we can update each PartitionedRelPruneInfo's subplan_map with
+		 * new subplan indexes.  We must also recompute its present_parts
+		 * bitmap.
+		 */
+		for (i = 0; i < prunestate->num_partprunedata; i++)
+		{
+			PartitionPruningData *prunedata = prunestate->partprunedata[i];
+			int			j;
+
+			/*
+			 * Within each hierarchy, we perform this loop in back-to-front
+			 * order so that we determine present_parts for the lowest-level
+			 * partitioned tables first.  This way we can tell whether a
+			 * sub-partitioned table's partitions were entirely pruned so we
+			 * can exclude it from the current level's present_parts.
+			 */
+			for (j = prunedata->num_partrelprunedata - 1; j >= 0; j--)
+			{
+				PartitionedRelPruningData *pprune = &prunedata->partrelprunedata[j];
+				int			nparts = pprune->nparts;
+				int			k;
+
+				/* We just rebuild present_parts from scratch */
+				bms_free(pprune->present_parts);
+				pprune->present_parts = NULL;
+
+				for (k = 0; k < nparts; k++)
+				{
+					int			oldidx = pprune->subplan_map[k];
+					int			subidx;
+
+					/*
+					 * If this partition existed as a subplan then change the
+					 * old subplan index to the new subplan index.  The new
+					 * index may become -1 if the partition was pruned above,
+					 * or it may just come earlier in the subplan list due to
+					 * some subplans being removed earlier in the list.  If
+					 * it's a subpartition, add it to present_parts unless
+					 * it's entirely pruned.
+					 */
+					if (oldidx >= 0)
+					{
+						Assert(oldidx < nsubplans);
+						pprune->subplan_map[k] = new_subplan_indexes[oldidx] - 1;
+
+						if (new_subplan_indexes[oldidx] > 0)
+							pprune->present_parts =
+								bms_add_member(pprune->present_parts, k);
+					}
+					else if ((subidx = pprune->subpart_map[k]) >= 0)
+					{
+						PartitionedRelPruningData *subprune;
+
+						subprune = &prunedata->partrelprunedata[subidx];
+
+						if (!bms_is_empty(subprune->present_parts))
+							pprune->present_parts =
+								bms_add_member(pprune->present_parts, k);
+					}
+				}
+			}
+		}
+
+		/*
+		 * We must also recompute the other_subplans set, since indexes in it
+		 * may change.
+		 */
+		new_other_subplans = NULL;
+		i = -1;
+		while ((i = bms_next_member(prunestate->other_subplans, i)) >= 0)
+			new_other_subplans = bms_add_member(new_other_subplans,
+												new_subplan_indexes[i] - 1);
+
+		bms_free(prunestate->other_subplans);
+		prunestate->other_subplans = new_other_subplans;
+
+		pfree(new_subplan_indexes);
+	}
+
 	return prunestate;
 }
 
 /*
  * Initialize a PartitionPruneContext for the given list of pruning steps.
+ *
+ * At least one of 'planstate' or 'econtext' must be passed to be able to
+ * successfully evaluate any non-Const expressions contained in the
+ * steps.
  */
 static void
 ExecInitPruningContext(PartitionPruneContext *context,
 					   List *pruning_steps,
 					   PartitionDesc partdesc,
 					   PartitionKey partkey,
-					   PlanState *planstate)
+					   PlanState *planstate,
+					   ExprContext *econtext)
 {
 	int			n_steps;
 	int			partnatts;
@@ -1767,6 +1898,7 @@ ExecInitPruningContext(PartitionPruneContext *context,
 
 	context->ppccontext = CurrentMemoryContext;
 	context->planstate = planstate;
+	context->exprcontext = econtext;
 
 	/* Initialize expression state for each expression we need */
 	context->exprstates = (ExprState **)
@@ -1795,8 +1927,13 @@ ExecInitPruningContext(PartitionPruneContext *context,
 														step->step.step_id,
 														keyno);
 
-				context->exprstates[stateidx] =
-					ExecInitExpr(expr, context->planstate);
+				if (planstate == NULL)
+					context->exprstates[stateidx] =
+						ExecInitExprWithParams(expr,
+											   econtext->ecxt_param_list_info);
+				else
+					context->exprstates[stateidx] =
+						ExecInitExpr(expr, context->planstate);
 			}
 			keyno++;
 		}
@@ -1809,171 +1946,283 @@ ExecInitPruningContext(PartitionPruneContext *context,
  *		pruning, disregarding any pruning constraints involving PARAM_EXEC
  *		Params.
  *
- * If additional pruning passes will be required (because of PARAM_EXEC
- * Params), we must also update the translation data that allows conversion
- * of partition indexes into subplan indexes to account for the unneeded
- * subplans having been removed.
+ * Must only be called once per 'pruneinfo', and only if initial pruning is
+ * required.
  *
- * Must only be called once per 'prunestate', and only if initial pruning
- * is required.
+ * 'params' contains information about any EXTERN parameters that might be
+ * present in the initial pruning steps.
  *
- * 'nsubplans' must be passed as the total number of unpruned subplans.
+ * The RT indexes of unpruned parents are returned in *parentrelids if asked
+ * for by the caller.
  */
 Bitmapset *
-ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate, int nsubplans)
+ExecFindInitialMatchingSubPlans(PartitionPruneInfo *pruneinfo,
+								EState *estate, List *rtable,
+								ParamListInfo params,
+								Bitmapset **parentrelids)
 {
 	Bitmapset  *result = NULL;
 	MemoryContext oldcontext;
+	MemoryContext tmpcontext;
 	int			i;
+	ListCell   *lc;
+	int			n_part_hierarchies;
+	bool		free_estate = false;
+	ExprContext *econtext;
+	PartitionPruningData **partprunedata;
+	PartitionDirectory	pdir;
 
-	/* Caller error if we get here without do_initial_prune */
-	Assert(prunestate->do_initial_prune);
-
-	/*
-	 * Switch to a temp context to avoid leaking memory in the executor's
-	 * query-lifespan memory context.
-	 */
-	oldcontext = MemoryContextSwitchTo(prunestate->prune_context);
-
-	/*
-	 * For each hierarchy, do the pruning tests, and add nondeletable
-	 * subplans' indexes to "result".
-	 */
-	for (i = 0; i < prunestate->num_partprunedata; i++)
-	{
-		PartitionPruningData *prunedata;
-		PartitionedRelPruningData *pprune;
+	/* Caller error if we get here without contains_init_steps */
+	Assert(pruneinfo->contains_init_steps);
 
-		prunedata = prunestate->partprunedata[i];
-		pprune = &prunedata->partrelprunedata[0];
 
-		/* Perform pruning without using PARAM_EXEC Params */
-		find_matching_subplans_recurse(prunedata, pprune, true, &result);
+	if (parentrelids)
+		*parentrelids = NULL;
 
-		/* Expression eval may have used space in node's ps_ExprContext too */
-		if (pprune->initial_pruning_steps)
-			ResetExprContext(pprune->initial_context.planstate->ps_ExprContext);
+	/* Set up EState if not in the executor proper. */
+	if (estate == NULL)
+	{
+		estate = CreateExecutorState();
+		estate->es_param_list_info = params;
+		free_estate = true;
 	}
 
-	/* Add in any subplans that partition pruning didn't account for */
-	result = bms_add_members(result, prunestate->other_subplans);
-
-	MemoryContextSwitchTo(oldcontext);
+	/* An ExprContext to evaluate expressions. */
+	econtext = CreateExprContext(estate);
 
-	/* Copy result out of the temp context before we reset it */
-	result = bms_copy(result);
+	/* PartitionDirectory, creating one if not there already. */
+	pdir = estate->es_partition_directory;
+	if (pdir == NULL)
+	{
+		/* Omits detached partitions, just like in the executor proper. */
+		pdir = CreatePartitionDirectory(CurrentMemoryContext, false);
+		estate->es_partition_directory = pdir;
+	}
 
-	MemoryContextReset(prunestate->prune_context);
+	/* A temporary context to allocate stuff needed to run pruning steps. */
+	tmpcontext = AllocSetContextCreate(CurrentMemoryContext,
+									   "initial pruning working data",
+									   ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(tmpcontext);
 
 	/*
-	 * If exec-time pruning is required and we pruned subplans above, then we
-	 * must re-sequence the subplan indexes so that ExecFindMatchingSubPlans
-	 * properly returns the indexes from the subplans which will remain after
-	 * execution of this function.
+	 * Stuff that follows matches exactly what ExecCreatePartitionPruneState()
+	 * does, except we don't need a PartitionPruneState here, so don't call
+	 * that function.
 	 *
-	 * We can safely skip this when !do_exec_prune, even though that leaves
-	 * invalid data in prunestate, because that data won't be consulted again
-	 * (cf initial Assert in ExecFindMatchingSubPlans).
+	 * XXX some refactoring might be good.
 	 */
-	if (prunestate->do_exec_prune && bms_num_members(result) < nsubplans)
+
+	/* PartitionPruningData for each partition hierarchy. */
+	n_part_hierarchies = list_length(pruneinfo->prune_infos);
+	Assert(n_part_hierarchies > 0);
+	partprunedata = (PartitionPruningData **)
+			palloc(sizeof(PartitionPruningData *) * n_part_hierarchies);
+	i = 0;
+	foreach(lc, pruneinfo->prune_infos)
 	{
-		int		   *new_subplan_indexes;
-		Bitmapset  *new_other_subplans;
-		int			i;
-		int			newidx;
+		PartitionPruningData *prunedata;
+		List	   *partrelpruneinfos = lfirst_node(List, lc);
+		int			npartrelpruneinfos = list_length(partrelpruneinfos);
+		ListCell   *lc2;
+		int			j;
 
-		/*
-		 * First we must build a temporary array which maps old subplan
-		 * indexes to new ones.  For convenience of initialization, we use
-		 * 1-based indexes in this array and leave pruned items as 0.
-		 */
-		new_subplan_indexes = (int *) palloc0(sizeof(int) * nsubplans);
-		newidx = 1;
-		i = -1;
-		while ((i = bms_next_member(result, i)) >= 0)
-		{
-			Assert(i < nsubplans);
-			new_subplan_indexes[i] = newidx++;
-		}
+		/* PartitionedRelPruningData per parent in the hierarchy. */
+		prunedata = (PartitionPruningData *)
+			palloc(offsetof(PartitionPruningData, partrelprunedata) +
+				   npartrelpruneinfos * sizeof(PartitionedRelPruningData));
+		partprunedata[i] = prunedata;
+		prunedata->num_partrelprunedata = npartrelpruneinfos;
 
-		/*
-		 * Now we can update each PartitionedRelPruneInfo's subplan_map with
-		 * new subplan indexes.  We must also recompute its present_parts
-		 * bitmap.
-		 */
-		for (i = 0; i < prunestate->num_partprunedata; i++)
+		j = 0;
+		foreach(lc2, partrelpruneinfos)
 		{
-			PartitionPruningData *prunedata = prunestate->partprunedata[i];
-			int			j;
+			PartitionedRelPruneInfo *pinfo = lfirst_node(PartitionedRelPruneInfo, lc2);
+			PartitionedRelPruningData *pprune = &prunedata->partrelprunedata[j];
+			RangeTblEntry *partrte = rt_fetch(pinfo->rtindex, rtable);
+			Relation	partrel;
+			PartitionDesc partdesc;
+			PartitionKey partkey;
 
 			/*
-			 * Within each hierarchy, we perform this loop in back-to-front
-			 * order so that we determine present_parts for the lowest-level
-			 * partitioned tables first.  This way we can tell whether a
-			 * sub-partitioned table's partitions were entirely pruned so we
-			 * can exclude it from the current level's present_parts.
+			 * We can rely on the copies of the partitioned table's partition
+			 * key and partition descriptor appearing in its relcache entry,
+			 * because that entry will be held open and locked while the
+			 * PartitionedRelPruningData is in use.
 			 */
-			for (j = prunedata->num_partrelprunedata - 1; j >= 0; j--)
+			partrel = table_open(partrte->relid, partrte->rellockmode);
+			partkey = RelationGetPartitionKey(partrel);
+			partdesc = PartitionDirectoryLookup(pdir, partrel);
+
+			/*
+			 * Initialize the subplan_map and subpart_map.
+			 *
+			 * Because we request detached partitions to be included, and
+			 * detaching waits for old transactions, it is safe to assume that
+			 * no partitions have disappeared since this query was planned.
+			 *
+			 * However, new partitions may have been added.
+			 */
+			Assert(partdesc->nparts >= pinfo->nparts);
+			pprune->nparts = partdesc->nparts;
+			pprune->subplan_map = palloc(sizeof(int) * partdesc->nparts);
+			if (partdesc->nparts == pinfo->nparts)
 			{
-				PartitionedRelPruningData *pprune = &prunedata->partrelprunedata[j];
-				int			nparts = pprune->nparts;
-				int			k;
+				/*
+				 * There are no new partitions, so this is simple.  We can
+				 * simply point to the subpart_map from the plan, but we must
+				 * copy the subplan_map since we may change it later.
+				 */
+				pprune->subpart_map = pinfo->subpart_map;
+				memcpy(pprune->subplan_map, pinfo->subplan_map,
+					   sizeof(int) * pinfo->nparts);
 
-				/* We just rebuild present_parts from scratch */
-				bms_free(pprune->present_parts);
-				pprune->present_parts = NULL;
+				/*
+				 * Double-check that the list of unpruned relations has not
+				 * changed.  (Pruned partitions are not in relid_map[].)
+				 */
+#ifdef USE_ASSERT_CHECKING
+				for (int k = 0; k < pinfo->nparts; k++)
+				{
+					Assert(partdesc->oids[k] == pinfo->relid_map[k] ||
+						   pinfo->subplan_map[k] == -1);
+				}
+#endif
+			}
+			else
+			{
+				int			pd_idx = 0;
+				int			pp_idx;
 
-				for (k = 0; k < nparts; k++)
+				/*
+				 * Some new partitions have appeared since plan time, and
+				 * those are reflected in our PartitionDesc but were not
+				 * present in the one used to construct subplan_map and
+				 * subpart_map.  So we must construct new and longer arrays
+				 * where the partitions that were originally present map to
+				 * the same sub-structures, and any added partitions map to
+				 * -1, as if the new partitions had been pruned.
+				 *
+				 * Note: pinfo->relid_map[] may contain InvalidOid entries for
+				 * partitions pruned by the planner.  We cannot tell exactly
+				 * which of the partdesc entries these correspond to, but we
+				 * don't have to; just skip over them.  The non-pruned
+				 * relid_map entries, however, had better be a subset of the
+				 * partdesc entries and in the same order.
+				 */
+				pprune->subpart_map = palloc(sizeof(int) * partdesc->nparts);
+				for (pp_idx = 0; pp_idx < partdesc->nparts; pp_idx++)
 				{
-					int			oldidx = pprune->subplan_map[k];
-					int			subidx;
+					/* Skip any InvalidOid relid_map entries */
+					while (pd_idx < pinfo->nparts &&
+						   !OidIsValid(pinfo->relid_map[pd_idx]))
+						pd_idx++;
 
-					/*
-					 * If this partition existed as a subplan then change the
-					 * old subplan index to the new subplan index.  The new
-					 * index may become -1 if the partition was pruned above,
-					 * or it may just come earlier in the subplan list due to
-					 * some subplans being removed earlier in the list.  If
-					 * it's a subpartition, add it to present_parts unless
-					 * it's entirely pruned.
-					 */
-					if (oldidx >= 0)
+					if (pd_idx < pinfo->nparts &&
+						pinfo->relid_map[pd_idx] == partdesc->oids[pp_idx])
 					{
-						Assert(oldidx < nsubplans);
-						pprune->subplan_map[k] = new_subplan_indexes[oldidx] - 1;
-
-						if (new_subplan_indexes[oldidx] > 0)
-							pprune->present_parts =
-								bms_add_member(pprune->present_parts, k);
+						/* match... */
+						pprune->subplan_map[pp_idx] =
+							pinfo->subplan_map[pd_idx];
+						pprune->subpart_map[pp_idx] =
+							pinfo->subpart_map[pd_idx];
+						pd_idx++;
 					}
-					else if ((subidx = pprune->subpart_map[k]) >= 0)
+					else
 					{
-						PartitionedRelPruningData *subprune;
-
-						subprune = &prunedata->partrelprunedata[subidx];
-
-						if (!bms_is_empty(subprune->present_parts))
-							pprune->present_parts =
-								bms_add_member(pprune->present_parts, k);
+						/* this partdesc entry is not in the plan */
+						pprune->subplan_map[pp_idx] = -1;
+						pprune->subpart_map[pp_idx] = -1;
 					}
 				}
+
+				/*
+				 * It might seem that we need to skip any trailing InvalidOid
+				 * entries in pinfo->relid_map before checking that we scanned
+				 * all of the relid_map.  But we will have skipped them above,
+				 * because they must correspond to some partdesc->oids
+				 * entries; we just couldn't tell which.
+				 */
+				if (pd_idx != pinfo->nparts)
+					elog(ERROR, "could not match partition child tables to plan elements");
 			}
+
+			/* present_parts is also subject to later modification */
+			pprune->present_parts = bms_copy(pinfo->present_parts);
+			pprune->initial_pruning_steps = pinfo->initial_pruning_steps;
+			if (pprune->initial_pruning_steps)
+				ExecInitPruningContext(&pprune->initial_context,
+									   pprune->initial_pruning_steps,
+									   partdesc, partkey, NULL, econtext);
+
+			table_close(partrel, NoLock);
+			j++;
 		}
+		i++;
+	}
+
+	/*
+	 * For each hierarchy, do the pruning tests, and add nondeletable
+	 * subplans' indexes to result.
+	 */
+	for (i = 0; i < n_part_hierarchies; i++)
+	{
+		PartitionPruningData *prunedata = partprunedata[i];
+		PartitionedRelPruningData *pprune;
 
 		/*
-		 * We must also recompute the other_subplans set, since indexes in it
-		 * may change.
+		 * We pass the 1st item belonging to the root table of the hierarchy
+		 * and find_matching_subplans_recurse() takes care of recursing to
+		 * other (lower-level) parents as needed.
 		 */
-		new_other_subplans = NULL;
-		i = -1;
-		while ((i = bms_next_member(prunestate->other_subplans, i)) >= 0)
-			new_other_subplans = bms_add_member(new_other_subplans,
-												new_subplan_indexes[i] - 1);
+		pprune = &prunedata->partrelprunedata[0];
+		find_matching_subplans_recurse(prunedata, pprune, true, &result);
 
-		bms_free(prunestate->other_subplans);
-		prunestate->other_subplans = new_other_subplans;
+		/*
+		 * Collect the RT indexes of surviving parents if the caller asked
+		 * to see them.
+		 */
+		if (parentrelids)
+		{
+			int		j;
+			List   *partrelpruneinfos = list_nth_node(List,
+													  pruneinfo->prune_infos,
+													  i);
 
-		pfree(new_subplan_indexes);
+			for (j = 0; j < prunedata->num_partrelprunedata; j++)
+			{
+				PartitionedRelPruneInfo *pinfo = list_nth_node(PartitionedRelPruneInfo,
+															   partrelpruneinfos, j);
+
+				pprune = &prunedata->partrelprunedata[j];
+				if (!bms_is_empty(pprune->present_parts))
+					*parentrelids = bms_add_member(*parentrelids, pinfo->rtindex);
+			}
+		}
+
+		/* Release space used up in our ExprContext. */
+		ResetExprContext(econtext);
+	}
+
+	/* Add in any subplans that partition pruning didn't account for. */
+	result = bms_add_members(result, pruneinfo->other_subplans);
+
+	MemoryContextSwitchTo(oldcontext);
+
+	/* Copy result out of the temp context before we reset it */
+	result = bms_copy(result);
+	if (parentrelids)
+		*parentrelids = bms_copy(*parentrelids);
+
+	/* Safe to drop the temporary context */
+	MemoryContextDelete(tmpcontext);
+
+	/* Free the ExprContext, and EState if needed. */
+	FreeExprContext(econtext, true);
+	if (free_estate)
+	{
+		FreeExecutorState(estate);
+		estate = NULL;
 	}
 
 	return result;
@@ -2018,6 +2267,11 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate)
 		prunedata = prunestate->partprunedata[i];
 		pprune = &prunedata->partrelprunedata[0];
 
+		/*
+		 * We pass the 1st item belonging to the root table of the hierarchy
+		 * and find_matching_subplans_recurse() takes care of recursing to
+		 * other (lower-level) parents as needed.
+		 */
 		find_matching_subplans_recurse(prunedata, pprune, false, &result);
 
 		/* Expression eval may have used space in node's ps_ExprContext too */
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 6a2daa6e76..7f813476ab 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -136,24 +136,15 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 	/* If run-time partition pruning is enabled, then set that up now */
 	if (node->part_prune_info != NULL)
 	{
-		PartitionPruneState *prunestate;
-
-		/* We may need an expression context to evaluate partition exprs */
-		ExecAssignExprContext(estate, &appendstate->ps);
-
-		/* Create the working data structure for pruning. */
-		prunestate = ExecCreatePartitionPruneState(&appendstate->ps,
-												   node->part_prune_info);
-		appendstate->as_prune_state = prunestate;
-
-		/* Perform an initial partition prune, if required. */
-		if (prunestate->do_initial_prune)
+		if (node->part_prune_info->contains_init_steps)
 		{
-			/* Determine which subplans survive initial pruning */
-			validsubplans = ExecFindInitialMatchingSubPlans(prunestate,
-															list_length(node->appendplans));
-
+			validsubplans =
+				ExecFindInitialMatchingSubPlans(node->part_prune_info,
+												estate, estate->es_range_table,
+												estate->es_param_list_info,
+												NULL);
 			nplans = bms_num_members(validsubplans);
+			Assert(nplans >= 0);
 		}
 		else
 		{
@@ -163,12 +154,26 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 			validsubplans = bms_add_range(NULL, 0, nplans - 1);
 		}
 
+		/* Create the working data structure for run-time pruning. */
+		if (node->part_prune_info->contains_exec_steps)
+		{
+			PartitionPruneState *prunestate;
+
+			/* We may need an expression context to evaluate partition exprs */
+			ExecAssignExprContext(estate, &appendstate->ps);
+			prunestate = ExecCreatePartitionPruneState(&appendstate->ps,
+													   node->part_prune_info,
+													   validsubplans,
+													   list_length(node->appendplans));
+
+			appendstate->as_prune_state = prunestate;
+		}
 		/*
 		 * When no run-time pruning is required and there's at least one
 		 * subplan, we can fill as_valid_subplans immediately, preventing
 		 * later calls to ExecFindMatchingSubPlans.
 		 */
-		if (!prunestate->do_exec_prune && nplans > 0)
+		else
 			appendstate->as_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
 	}
 	else
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index 617bffb206..51c5c3433d 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -84,23 +84,15 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
 	/* If run-time partition pruning is enabled, then set that up now */
 	if (node->part_prune_info != NULL)
 	{
-		PartitionPruneState *prunestate;
-
-		/* We may need an expression context to evaluate partition exprs */
-		ExecAssignExprContext(estate, &mergestate->ps);
-
-		prunestate = ExecCreatePartitionPruneState(&mergestate->ps,
-												   node->part_prune_info);
-		mergestate->ms_prune_state = prunestate;
-
-		/* Perform an initial partition prune, if required. */
-		if (prunestate->do_initial_prune)
+		if (node->part_prune_info->contains_init_steps)
 		{
-			/* Determine which subplans survive initial pruning */
-			validsubplans = ExecFindInitialMatchingSubPlans(prunestate,
-															list_length(node->mergeplans));
-
+			validsubplans =
+				ExecFindInitialMatchingSubPlans(node->part_prune_info,
+												estate, estate->es_range_table,
+												estate->es_param_list_info,
+												NULL);
 			nplans = bms_num_members(validsubplans);
+			Assert(nplans >= 0);
 		}
 		else
 		{
@@ -110,13 +102,28 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
 			validsubplans = bms_add_range(NULL, 0, nplans - 1);
 		}
 
+		/* Create the working data structure for run-time pruning. */
+		if (node->part_prune_info->contains_exec_steps)
+		{
+			PartitionPruneState *prunestate;
+
+			/* We may need an expression context to evaluate partition exprs */
+			ExecAssignExprContext(estate, &mergestate->ps);
+			prunestate = ExecCreatePartitionPruneState(&mergestate->ps,
+													   node->part_prune_info,
+													   validsubplans,
+													   list_length(node->mergeplans));
+
+			mergestate->ms_prune_state = prunestate;
+		}
 		/*
 		 * When no run-time pruning is required and there's at least one
 		 * subplan, we can fill as_valid_subplans immediately, preventing
 		 * later calls to ExecFindMatchingSubPlans.
 		 */
-		if (!prunestate->do_exec_prune && nplans > 0)
+		else
 			mergestate->ms_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
+
 	}
 	else
 	{
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index df0b747883..57f2fce3d4 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -94,9 +94,11 @@ _copyPlannedStmt(const PlannedStmt *from)
 	COPY_SCALAR_FIELD(transientPlan);
 	COPY_SCALAR_FIELD(dependsOnRole);
 	COPY_SCALAR_FIELD(parallelModeNeeded);
+	COPY_SCALAR_FIELD(usesPreExecPruning);
 	COPY_SCALAR_FIELD(jitFlags);
 	COPY_NODE_FIELD(planTree);
 	COPY_NODE_FIELD(rtable);
+	COPY_BITMAPSET_FIELD(relationRTIs);
 	COPY_NODE_FIELD(resultRelations);
 	COPY_NODE_FIELD(appendRelations);
 	COPY_NODE_FIELD(subplans);
@@ -1277,6 +1279,8 @@ _copyPartitionPruneInfo(const PartitionPruneInfo *from)
 	PartitionPruneInfo *newnode = makeNode(PartitionPruneInfo);
 
 	COPY_NODE_FIELD(prune_infos);
+	COPY_SCALAR_FIELD(contains_init_steps);
+	COPY_SCALAR_FIELD(contains_exec_steps);
 	COPY_BITMAPSET_FIELD(other_subplans);
 
 	return newnode;
diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c
index e276264882..a13ee087a8 100644
--- a/src/backend/nodes/nodeFuncs.c
+++ b/src/backend/nodes/nodeFuncs.c
@@ -31,7 +31,10 @@ static bool planstate_walk_subplans(List *plans, bool (*walker) (),
 									void *context);
 static bool planstate_walk_members(PlanState **planstates, int nplans,
 								   bool (*walker) (), void *context);
-
+static bool plan_walk_subplans(List *plans,
+				   bool (*walker) (),
+				   void *context);
+static bool plan_walk_members(List *plans, bool (*walker) (), void *context);
 
 /*
  *	exprType -
@@ -4105,3 +4108,119 @@ planstate_walk_members(PlanState **planstates, int nplans,
 
 	return false;
 }
+
+/*
+ * plan_tree_walker --- walk plantrees
+ *
+ * The walker has already visited the current node, and so we need only
+ * recurse into any sub-nodes it has.
+ */
+bool
+plan_tree_walker(Plan *plan,
+				 bool (*walker) (),
+				 void *context)
+{
+	ListCell   *lc;
+
+	/* Guard against stack overflow due to overly complex plan trees */
+	check_stack_depth();
+
+	/* initPlan-s */
+	if (plan_walk_subplans(plan->initPlan, walker, context))
+		return true;
+
+	/* lefttree */
+	if (outerPlan(plan))
+	{
+		if (walker(outerPlan(plan), context))
+			return true;
+	}
+
+	/* righttree */
+	if (innerPlan(plan))
+	{
+		if (walker(innerPlan(plan), context))
+			return true;
+	}
+
+	/* special child plans */
+	switch (nodeTag(plan))
+	{
+		case T_Append:
+			if (plan_walk_members(((Append *) plan)->appendplans,
+								  walker, context))
+				return true;
+			break;
+		case T_MergeAppend:
+			if (plan_walk_members(((MergeAppend *) plan)->mergeplans,
+								  walker, context))
+				return true;
+			break;
+		case T_BitmapAnd:
+			if (plan_walk_members(((BitmapAnd *) plan)->bitmapplans,
+								  walker, context))
+				return true;
+			break;
+		case T_BitmapOr:
+			if (plan_walk_members(((BitmapOr *) plan)->bitmapplans,
+								  walker, context))
+				return true;
+			break;
+		case T_SubqueryScan:
+			if (walker(((SubqueryScan *) plan)->subplan, context))
+				return true;
+			break;
+		case T_CustomScan:
+			foreach(lc, ((CustomScan *) plan)->custom_plans)
+			{
+				if (walker((Plan *) lfirst(lc), context))
+					return true;
+			}
+			break;
+		default:
+			break;
+	}
+
+	return false;
+}
+
+/*
+ * Walk a list of SubPlans (or initPlans, which also use SubPlan nodes).
+ */
+static bool
+plan_walk_subplans(List *plans,
+				   bool (*walker) (),
+				   void *context)
+{
+	ListCell   *lc;
+	PlannedStmt *plannedstmt = (PlannedStmt *) context;
+
+	foreach(lc, plans)
+	{
+		SubPlan *sp = lfirst_node(SubPlan, lc);
+		Plan *p = list_nth(plannedstmt->subplans, sp->plan_id - 1);
+
+		if (walker(p, context))
+			return true;
+	}
+
+	return false;
+}
+
+/*
+ * Walk the constituent plans of an Append, MergeAppend, BitmapAnd, or
+ * BitmapOr node.
+ */
+static bool
+plan_walk_members(List *plans, bool (*walker) (), void *context)
+{
+	ListCell *lc;
+
+	foreach(lc, plans)
+	{
+		if (walker(lfirst(lc), context))
+			return true;
+	}
+
+	return false;
+}
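For readers less familiar with the walker convention that plan_tree_walker() follows, here is a minimal standalone sketch of it. It is not part of the patch: ToyPlan, toy_plan_tree_walker() and count_scans_walker() are hypothetical names, and the node struct is a drastically simplified stand-in for Plan. The point it illustrates is the division of labor: the caller's walker inspects the current node, and the generic walker only recurses into children, stopping as soon as a walker call returns true.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a Plan node (not PostgreSQL's struct). */
typedef struct ToyPlan
{
	int			scanrelid;		/* > 0 for a leaf scan node, 0 otherwise */
	struct ToyPlan *left;		/* plays the role of outerPlan() */
	struct ToyPlan *right;		/* plays the role of innerPlan() */
} ToyPlan;

typedef bool (*toy_walker) (ToyPlan *plan, void *context);

/*
 * Recurse into the children of 'plan' only; the caller's walker has already
 * visited 'plan' itself.  Returns true as soon as any walker call does.
 */
static bool
toy_plan_tree_walker(ToyPlan *plan, toy_walker walker, void *context)
{
	assert(plan != NULL);

	if (plan->left && walker(plan->left, context))
		return true;
	if (plan->right && walker(plan->right, context))
		return true;

	return false;
}

/* Example walker: counts leaf scan nodes; 'context' points to an int. */
static bool
count_scans_walker(ToyPlan *plan, void *context)
{
	if (plan == NULL)
		return false;

	if (plan->scanrelid > 0)
	{
		(*(int *) context)++;
		return false;			/* leaf: nothing to recurse into */
	}

	return toy_plan_tree_walker(plan, count_scans_walker, context);
}

/* Convenience entry point for the sketch. */
static int
toy_count_scans(ToyPlan *root)
{
	int			nscans = 0;

	(void) count_scans_walker(root, &nscans);
	return nscans;
}
```

Calling toy_count_scans() on a two-child join node whose children are both scans visits each leaf exactly once. GetLockableRelations_worker() in the plancache.c changes uses the same shape, with the context carrying the accumulated set of RT indexes instead of a counter.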
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 91a89b6d51..8364633d2e 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -312,9 +312,11 @@ _outPlannedStmt(StringInfo str, const PlannedStmt *node)
 	WRITE_BOOL_FIELD(transientPlan);
 	WRITE_BOOL_FIELD(dependsOnRole);
 	WRITE_BOOL_FIELD(parallelModeNeeded);
+	WRITE_BOOL_FIELD(usesPreExecPruning);
 	WRITE_INT_FIELD(jitFlags);
 	WRITE_NODE_FIELD(planTree);
 	WRITE_NODE_FIELD(rtable);
+	WRITE_BITMAPSET_FIELD(relationRTIs);
 	WRITE_NODE_FIELD(resultRelations);
 	WRITE_NODE_FIELD(appendRelations);
 	WRITE_NODE_FIELD(subplans);
@@ -1003,6 +1005,8 @@ _outPartitionPruneInfo(StringInfo str, const PartitionPruneInfo *node)
 	WRITE_NODE_TYPE("PARTITIONPRUNEINFO");
 
 	WRITE_NODE_FIELD(prune_infos);
+	WRITE_BOOL_FIELD(contains_init_steps);
+	WRITE_BOOL_FIELD(contains_exec_steps);
 	WRITE_BITMAPSET_FIELD(other_subplans);
 }
 
@@ -2273,6 +2277,7 @@ _outPlannerGlobal(StringInfo str, const PlannerGlobal *node)
 	WRITE_NODE_FIELD(subplans);
 	WRITE_BITMAPSET_FIELD(rewindPlanIDs);
 	WRITE_NODE_FIELD(finalrtable);
+	WRITE_BITMAPSET_FIELD(relationRTIs);
 	WRITE_NODE_FIELD(finalrowmarks);
 	WRITE_NODE_FIELD(resultRelations);
 	WRITE_NODE_FIELD(appendRelations);
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index d79af6e56e..df06782c3c 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -1585,9 +1585,11 @@ _readPlannedStmt(void)
 	READ_BOOL_FIELD(transientPlan);
 	READ_BOOL_FIELD(dependsOnRole);
 	READ_BOOL_FIELD(parallelModeNeeded);
+	READ_BOOL_FIELD(usesPreExecPruning);
 	READ_INT_FIELD(jitFlags);
 	READ_NODE_FIELD(planTree);
 	READ_NODE_FIELD(rtable);
+	READ_BITMAPSET_FIELD(relationRTIs);
 	READ_NODE_FIELD(resultRelations);
 	READ_NODE_FIELD(appendRelations);
 	READ_NODE_FIELD(subplans);
@@ -2533,6 +2535,8 @@ _readPartitionPruneInfo(void)
 	READ_LOCALS(PartitionPruneInfo);
 
 	READ_NODE_FIELD(prune_infos);
+	READ_BOOL_FIELD(contains_init_steps);
+	READ_BOOL_FIELD(contains_exec_steps);
 	READ_BITMAPSET_FIELD(other_subplans);
 
 	READ_DONE();
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index bd01ec0526..37a07cb258 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -517,8 +517,10 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 	result->transientPlan = glob->transientPlan;
 	result->dependsOnRole = glob->dependsOnRole;
 	result->parallelModeNeeded = glob->parallelModeNeeded;
+	result->usesPreExecPruning = glob->usesPreExecPruning;
 	result->planTree = top_plan;
 	result->rtable = glob->finalrtable;
+	result->relationRTIs = glob->relationRTIs;
 	result->resultRelations = glob->resultRelations;
 	result->appendRelations = glob->appendRelations;
 	result->subplans = glob->subplans;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 6ccec759bd..4616dc675d 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -483,6 +483,7 @@ static void
 add_rte_to_flat_rtable(PlannerGlobal *glob, RangeTblEntry *rte)
 {
 	RangeTblEntry *newrte;
+	Index		rti = list_length(glob->finalrtable) + 1;
 
 	/* flat copy to duplicate all the scalar fields */
 	newrte = (RangeTblEntry *) palloc(sizeof(RangeTblEntry));
@@ -517,7 +518,10 @@ add_rte_to_flat_rtable(PlannerGlobal *glob, RangeTblEntry *rte)
 	 * but it would probably cost more cycles than it would save.
 	 */
 	if (newrte->rtekind == RTE_RELATION)
+	{
+		glob->relationRTIs = bms_add_member(glob->relationRTIs, rti);
 		glob->relationOids = lappend_oid(glob->relationOids, newrte->relid);
+	}
 }
 
 /*
@@ -1515,6 +1519,9 @@ set_append_references(PlannerInfo *root,
 				pinfo->rtindex += rtoffset;
 			}
 		}
+
+		if (aplan->part_prune_info->contains_init_steps)
+			root->glob->usesPreExecPruning = true;
 	}
 
 	/* We don't need to recurse to lefttree or righttree ... */
@@ -1579,6 +1586,9 @@ set_mergeappend_references(PlannerInfo *root,
 				pinfo->rtindex += rtoffset;
 			}
 		}
+
+		if (mplan->part_prune_info->contains_init_steps)
+			root->glob->usesPreExecPruning = true;
 	}
 
 	/* We don't need to recurse to lefttree or righttree ... */
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index e00edbe5c8..d2874f716e 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -144,7 +144,9 @@ static List *make_partitionedrel_pruneinfo(PlannerInfo *root,
 										   List *prunequal,
 										   Bitmapset *partrelids,
 										   int *relid_subplan_map,
-										   Bitmapset **matchedsubplans);
+										   Bitmapset **matchedsubplans,
+										   bool *contains_init_steps,
+										   bool *contains_exec_steps);
 static void gen_partprune_steps(RelOptInfo *rel, List *clauses,
 								PartClauseTarget target,
 								GeneratePruningStepsContext *context);
@@ -230,6 +232,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	int		   *relid_subplan_map;
 	ListCell   *lc;
 	int			i;
+	bool		contains_init_steps = false;
+	bool		contains_exec_steps = false;
 
 	/*
 	 * Scan the subpaths to see which ones are scans of partition child
@@ -309,12 +313,16 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		Bitmapset  *partrelids = (Bitmapset *) lfirst(lc);
 		List	   *pinfolist;
 		Bitmapset  *matchedsubplans = NULL;
+		bool		partrel_contains_init_steps,
+					partrel_contains_exec_steps;
 
 		pinfolist = make_partitionedrel_pruneinfo(root, parentrel,
 												  prunequal,
 												  partrelids,
 												  relid_subplan_map,
-												  &matchedsubplans);
+												  &matchedsubplans,
+												  &partrel_contains_init_steps,
+												  &partrel_contains_exec_steps);
 
 		/* When pruning is possible, record the matched subplans */
 		if (pinfolist != NIL)
@@ -323,6 +331,10 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 			allmatchedsubplans = bms_join(matchedsubplans,
 										  allmatchedsubplans);
 		}
+		if (!contains_init_steps)
+			contains_init_steps = partrel_contains_init_steps;
+		if (!contains_exec_steps)
+			contains_exec_steps = partrel_contains_exec_steps;
 	}
 
 	pfree(relid_subplan_map);
@@ -337,6 +349,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	/* Else build the result data structure */
 	pruneinfo = makeNode(PartitionPruneInfo);
 	pruneinfo->prune_infos = prunerelinfos;
+	pruneinfo->contains_init_steps = contains_init_steps;
+	pruneinfo->contains_exec_steps = contains_exec_steps;
 
 	/*
 	 * Some subplans may not belong to any of the identified partitioned rels.
@@ -435,13 +449,18 @@ add_part_relids(List *allpartrelids, Bitmapset *partrelids)
  * If we cannot find any useful run-time pruning steps, return NIL.
  * However, on success, each rel identified in partrelids will have
  * an element in the result list, even if some of them are useless.
+ * *contains_init_steps and *contains_exec_steps are set to indicate
+ * whether the returned PartitionedRelPruneInfos contain pruning steps
+ * that can be performed before and during execution, respectively.
  */
 static List *
 make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 							  List *prunequal,
 							  Bitmapset *partrelids,
 							  int *relid_subplan_map,
-							  Bitmapset **matchedsubplans)
+							  Bitmapset **matchedsubplans,
+							  bool *contains_init_steps,
+							  bool *contains_exec_steps)
 {
 	RelOptInfo *targetpart = NULL;
 	List	   *pinfolist = NIL;
@@ -452,6 +471,10 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	int			rti;
 	int			i;
 
+	/* Will find out below. */
+	*contains_init_steps = false;
+	*contains_exec_steps = false;
+
 	/*
 	 * Examine each partitioned rel, constructing a temporary array to map
 	 * from planner relids to index of the partitioned rel, and building a
@@ -539,6 +562,9 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		 * executor per-scan pruning steps.  This first pass creates startup
 		 * pruning steps and detects whether there's any possibly-useful quals
 		 * that would require per-scan pruning.
+		 *
+		 * In the first pass, we determine whether the 2nd pass is necessary
+		 * by noting the presence of EXEC parameters.
 		 */
 		gen_partprune_steps(subpart, partprunequal, PARTTARGET_INITIAL,
 							&context);
@@ -613,6 +639,11 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		pinfo->execparamids = execparamids;
 		/* Remaining fields will be filled in the next loop */
 
+		if (!*contains_init_steps)
+			*contains_init_steps = (initial_pruning_steps != NIL);
+		if (!*contains_exec_steps)
+			*contains_exec_steps = (exec_pruning_steps != NIL);
+
 		pinfolist = lappend(pinfolist, pinfo);
 	}
 
@@ -798,6 +829,7 @@ prune_append_rel_partitions(RelOptInfo *rel)
 
 	/* These are not valid when being called from the planner */
 	context.planstate = NULL;
+	context.exprcontext = NULL;
 	context.exprstates = NULL;
 
 	/* Actual pruning happens here. */
@@ -808,8 +840,8 @@ prune_append_rel_partitions(RelOptInfo *rel)
  * get_matching_partitions
  *		Determine partitions that survive partition pruning
  *
- * Note: context->planstate must be set to a valid PlanState when the
- * pruning_steps were generated with a target other than PARTTARGET_PLANNER.
+ * Note: context->exprcontext must be valid when the pruning_steps were
+ * generated with a target other than PARTTARGET_PLANNER.
  *
  * Returns a Bitmapset of the RelOptInfo->part_rels indexes of the surviving
  * partitions.
@@ -3654,7 +3686,7 @@ match_boolean_partition_clause(Oid partopfamily, Expr *clause, Expr *partkey,
  * exprstate array.
  *
  * Note that the evaluated result may be in the per-tuple memory context of
- * context->planstate->ps_ExprContext, and we may have leaked other memory
+ * context->exprcontext, and we may have leaked other memory
  * there too.  This memory must be recovered by resetting that ExprContext
  * after we're done with the pruning operation (see execPartition.c).
  */
@@ -3677,13 +3709,18 @@ partkey_datum_from_expr(PartitionPruneContext *context,
 		ExprContext *ectx;
 
 		/*
-		 * We should never see a non-Const in a step unless we're running in
-		 * the executor.
+		 * We should never see a non-Const in a step unless the caller has
+		 * passed a valid ExprContext.
+		 *
+		 * When context->planstate is valid, context->exprcontext is the
+		 * same as context->planstate->ps_ExprContext.
 		 */
-		Assert(context->planstate != NULL);
+		Assert(context->planstate != NULL || context->exprcontext != NULL);
+		Assert(context->planstate == NULL ||
+			   (context->exprcontext == context->planstate->ps_ExprContext));
 
 		exprstate = context->exprstates[stateidx];
-		ectx = context->planstate->ps_ExprContext;
+		ectx = context->exprcontext;
 		*value = ExecEvalExprSwitchContext(exprstate, ectx, isnull);
 	}
 }
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 6767eae8f2..6161907ace 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -58,6 +58,7 @@
 
 #include "access/transam.h"
 #include "catalog/namespace.h"
+#include "executor/execPartition.h"
 #include "executor/executor.h"
 #include "miscadmin.h"
 #include "nodes/nodeFuncs.h"
@@ -99,14 +100,26 @@ static dlist_head cached_expression_list = DLIST_STATIC_INIT(cached_expression_l
 static void ReleaseGenericPlan(CachedPlanSource *plansource);
 static List *RevalidateCachedQuery(CachedPlanSource *plansource,
 								   QueryEnvironment *queryEnv);
-static bool CheckCachedPlan(CachedPlanSource *plansource);
+static bool CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams);
 static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 								   ParamListInfo boundParams, QueryEnvironment *queryEnv);
 static bool choose_custom_plan(CachedPlanSource *plansource,
 							   ParamListInfo boundParams);
 static double cached_plan_cost(CachedPlan *plan, bool include_planner);
 static Query *QueryListGetPrimaryStmt(List *stmts);
-static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static void AcquireExecutorLocks(List *stmt_list, bool acquire,
+								 ParamListInfo boundParams);
+struct GetLockableRelations_context
+{
+	PlannedStmt *plannedstmt;
+	Bitmapset *relations;
+	ParamListInfo params;
+};
+static Bitmapset *GetLockableRelations(PlannedStmt *plannedstmt,
+									   ParamListInfo boundParams);
+static bool GetLockableRelations_worker(Plan *plan,
+							struct GetLockableRelations_context *context);
+static Bitmapset *get_plan_scanrelids(Plan *plan);
 static void AcquirePlannerLocks(List *stmt_list, bool acquire);
 static void ScanQueryForLocks(Query *parsetree, bool acquire);
 static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -792,7 +805,7 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
  * (We must do this for the "true" result to be race-condition-free.)
  */
 static bool
-CheckCachedPlan(CachedPlanSource *plansource)
+CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams)
 {
 	CachedPlan *plan = plansource->gplan;
 
@@ -826,7 +839,7 @@ CheckCachedPlan(CachedPlanSource *plansource)
 		 */
 		Assert(plan->refcount > 0);
 
-		AcquireExecutorLocks(plan->stmt_list, true);
+		AcquireExecutorLocks(plan->stmt_list, true, boundParams);
 
 		/*
 		 * If plan was transient, check to see if TransactionXmin has
@@ -848,7 +861,7 @@ CheckCachedPlan(CachedPlanSource *plansource)
 		}
 
 		/* Oops, the race case happened.  Release useless locks. */
-		AcquireExecutorLocks(plan->stmt_list, false);
+		AcquireExecutorLocks(plan->stmt_list, false, boundParams);
 	}
 
 	/*
@@ -1160,7 +1173,7 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 
 	if (!customplan)
 	{
-		if (CheckCachedPlan(plansource))
+		if (CheckCachedPlan(plansource, boundParams))
 		{
 			/* We want a generic plan, and we already have a valid one */
 			plan = plansource->gplan;
@@ -1366,7 +1379,6 @@ CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
 	foreach(lc, plan->stmt_list)
 	{
 		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc);
-		ListCell   *lc2;
 
 		if (plannedstmt->commandType == CMD_UTILITY)
 			return false;
@@ -1375,13 +1387,8 @@ CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
 		 * We have to grovel through the rtable because it's likely to contain
 		 * an RTE_RESULT relation, rather than being totally empty.
 		 */
-		foreach(lc2, plannedstmt->rtable)
-		{
-			RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
-
-			if (rte->rtekind == RTE_RELATION)
-				return false;
-		}
+		if (!bms_is_empty(plannedstmt->relationRTIs))
+			return false;
 	}
 
 	/*
@@ -1740,14 +1747,15 @@ QueryListGetPrimaryStmt(List *stmts)
  * or release them if acquire is false.
  */
 static void
-AcquireExecutorLocks(List *stmt_list, bool acquire)
+AcquireExecutorLocks(List *stmt_list, bool acquire, ParamListInfo boundParams)
 {
 	ListCell   *lc1;
 
 	foreach(lc1, stmt_list)
 	{
 		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
-		ListCell   *lc2;
+		Bitmapset  *relations;
+		int			rti;
 
 		if (plannedstmt->commandType == CMD_UTILITY)
 		{
@@ -1765,9 +1773,22 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 			continue;
 		}
 
-		foreach(lc2, plannedstmt->rtable)
+		/*
+		 * Fetch the RT indexes of only the relations that will actually be
+		 * scanned when the plan is executed.  This skips over pruned scan
+		 * nodes appearing as child subnodes of any Append/MergeAppend nodes
+		 * present in the plan tree.  It does so by performing
+		 * ExecFindInitialMatchingSubPlans() to run any pruning steps
+		 * contained in those nodes that can be safely run at this point, using
+		 * 'boundParams' to evaluate any EXTERN parameters contained in the
+		 * steps.
+		 */
+		relations = GetLockableRelations(plannedstmt, boundParams);
+
+		rti = -1;
+		while ((rti = bms_next_member(relations, rti)) >= 0)
 		{
-			RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
+			RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
 
 			if (rte->rtekind != RTE_RELATION)
 				continue;
@@ -1786,6 +1807,166 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 	}
 }
 
+/*
+ * GetLockableRelations
+ *		Returns set of RT indexes of relations that must be locked by
+ *		AcquireExecutorLocks()
+ */
+static Bitmapset *
+GetLockableRelations(PlannedStmt *plannedstmt, ParamListInfo boundParams)
+{
+	ListCell *lc;
+	struct GetLockableRelations_context context;
+
+	/* None of the relation scanning nodes are prunable here. */
+	if (!plannedstmt->usesPreExecPruning)
+		return plannedstmt->relationRTIs;
+
+	/*
+	 * Look for prunable nodes in the main plan tree, followed by those in
+	 * subplans.
+	 */
+	context.plannedstmt = plannedstmt;
+	context.params = boundParams;
+	context.relations = NULL;
+
+	(void) GetLockableRelations_worker(plannedstmt->planTree, &context);
+
+	foreach(lc, plannedstmt->subplans)
+	{
+		Plan *subplan = lfirst(lc);
+
+		(void) GetLockableRelations_worker(subplan, &context);
+	}
+
+	return context.relations;
+}
+
+/*
+ * GetLockableRelations_worker
+ *		Adds RT indexes of relations to be scanned by plan to
+ *		context->relations
+ *
+ * For plan node types that support pruning, this only adds relations
+ * scanned by child subnodes that survive the "initial" pruning steps.
+ */
+static bool
+GetLockableRelations_worker(Plan *plan,
+							struct GetLockableRelations_context *context)
+{
+	if (plan == NULL)
+		return false;
+
+	switch (nodeTag(plan))
+	{
+		/* Nodes scanning a relation or relations. */
+		case T_SeqScan:
+		case T_SampleScan:
+		case T_IndexScan:
+		case T_IndexOnlyScan:
+		case T_BitmapHeapScan:
+		case T_TidScan:
+		case T_TidRangeScan:
+			context->relations = bms_add_member(context->relations,
+												((Scan *) plan)->scanrelid);
+			return false;
+		case T_ForeignScan:
+			context->relations = bms_add_members(context->relations,
+												 ((ForeignScan *) plan)->fs_relids);
+			return false;
+		case T_CustomScan:
+			context->relations = bms_add_members(context->relations,
+												 ((CustomScan *) plan)->custom_relids);
+			return false;
+
+		/* Nodes containing prunable subnodes. */
+		case T_Append:
+		case T_MergeAppend:
+			{
+				PlannedStmt *plannedstmt = context->plannedstmt;
+				List	   *rtable = plannedstmt->rtable;
+				ParamListInfo params = context->params;
+				PartitionPruneInfo *pruneinfo;
+				Bitmapset  *validsubplans;
+				Bitmapset  *parentrelids;
+
+				pruneinfo = IsA(plan, Append) ?
+					((Append *) plan)->part_prune_info :
+					((MergeAppend *) plan)->part_prune_info;
+
+				if (pruneinfo && pruneinfo->contains_init_steps)
+				{
+					int 	i;
+					List   *subplans = IsA(plan, Append) ?
+						((Append *) plan)->appendplans :
+						((MergeAppend *) plan)->mergeplans;
+
+					validsubplans =
+						ExecFindInitialMatchingSubPlans(pruneinfo,
+														NULL, rtable,
+														params,
+														&parentrelids);
+
+					/* All relevant parents must be locked. */
+					Assert(bms_num_members(parentrelids) > 0);
+					context->relations = bms_add_members(context->relations,
+														 parentrelids);
+
+					/* And all leaf partitions that will be scanned. */
+					i = -1;
+					while ((i = bms_next_member(validsubplans, i)) >= 0)
+					{
+						Plan   *subplan = list_nth(subplans, i);
+
+						context->relations =
+							bms_add_members(context->relations,
+											get_plan_scanrelids(subplan));
+					}
+
+					return false;
+				}
+			}
+			break;
+
+		default:
+			break;
+	}
+
+	return plan_tree_walker(plan, GetLockableRelations_worker,
+							(void *) context);
+}
+
+/*
+ * get_plan_scanrelids
+ *		Returns RT indexes of the relation(s) scanned by plan
+ */
+static Bitmapset *
+get_plan_scanrelids(Plan *plan)
+{
+	if (plan == NULL)
+		return NULL;
+
+	switch (nodeTag(plan))
+	{
+		case T_SeqScan:
+		case T_SampleScan:
+		case T_IndexScan:
+		case T_IndexOnlyScan:
+		case T_BitmapHeapScan:
+		case T_TidScan:
+		case T_TidRangeScan:
+			return bms_make_singleton(((Scan *) plan)->scanrelid);
+		case T_ForeignScan:
+			return ((ForeignScan *) plan)->fs_relids;
+		case T_CustomScan:
+			return ((CustomScan *) plan)->custom_relids;
+		default:
+			break;
+	}
+
+	return NULL;
+}
+
 /*
  * AcquirePlannerLocks: acquire locks needed for planning of a querytree list;
  * or release them if acquire is false.
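To make the effect of GetLockableRelations() easier to see in isolation, here is a small standalone model of the idea, not code from the patch. ToyAppend and toy_lockable_relations() are hypothetical; the model assumes a single Append over list partitions with one key value each and a pruning condition of the form "key = $1". A uint64_t bitmask stands in for the Bitmapset of RT indexes. As a simplification, the sketch always locks the parent, whereas the patch collects parent RT indexes from the pruning result.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_MAX_PARTS 8

/* Hypothetical model of an Append over list partitions (one key per child). */
typedef struct ToyAppend
{
	int			parent_rti;		/* RT index of the partitioned parent */
	int			nparts;			/* number of child scan nodes */
	int			part_rti[TOY_MAX_PARTS];	/* RT index of each child */
	int			part_key[TOY_MAX_PARTS];	/* key value each child accepts */
	bool		has_init_steps; /* can "key = $1" be evaluated pre-execution? */
} ToyAppend;

/*
 * Return a bitmask of the RT indexes to lock: the parent, plus each child
 * that survives "initial" pruning with the given parameter value.  Without
 * initial pruning steps, every child is locked, as before the patch.
 */
static uint64_t
toy_lockable_relations(const ToyAppend *node, int param)
{
	uint64_t	relations = 0;
	int			i;

	assert(node->nparts <= TOY_MAX_PARTS);
	assert(node->parent_rti > 0 && node->parent_rti < 64);

	/* Simplification: the parent is always locked in this sketch. */
	relations |= UINT64_C(1) << node->parent_rti;

	for (i = 0; i < node->nparts; i++)
	{
		/* Skip children that the pre-execution pruning step eliminates. */
		if (node->has_init_steps && node->part_key[i] != param)
			continue;

		assert(node->part_rti[i] > 0 && node->part_rti[i] < 64);
		relations |= UINT64_C(1) << node->part_rti[i];
	}

	return relations;
}
```

With three partitions at RT indexes 2, 3 and 4 holding keys 10, 20 and 30, a parameter value of 20 yields only the parent and RT index 3; with has_init_steps false, all four relations are returned. That difference is the per-execution locking cost the patch tries to avoid as the partition count grows.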
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 694e38b7dd..0eeaf3e79d 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -90,8 +90,6 @@ typedef struct PartitionPruningData
  *						These must not be pruned.
  * prune_context		A short-lived memory context in which to execute the
  *						partition pruning functions.
- * do_initial_prune		true if pruning should be performed during executor
- *						startup (at any hierarchy level).
  * do_exec_prune		true if pruning should be performed during
  *						executor run (at any hierarchy level).
  * num_partprunedata	Number of items in "partprunedata" array.
@@ -104,7 +102,6 @@ typedef struct PartitionPruneState
 	Bitmapset  *execparamids;
 	Bitmapset  *other_subplans;
 	MemoryContext prune_context;
-	bool		do_initial_prune;
 	bool		do_exec_prune;
 	int			num_partprunedata;
 	PartitionPruningData *partprunedata[FLEXIBLE_ARRAY_MEMBER];
@@ -120,9 +117,13 @@ extern ResultRelInfo *ExecFindPartition(ModifyTableState *mtstate,
 extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
 									PartitionTupleRouting *proute);
 extern PartitionPruneState *ExecCreatePartitionPruneState(PlanState *planstate,
-														  PartitionPruneInfo *partitionpruneinfo);
+														  PartitionPruneInfo *partitionpruneinfo,
+														  Bitmapset *initially_valid_subplans,
+														  int nsubplans);
 extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate);
-extern Bitmapset *ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate,
-												  int nsubplans);
+extern Bitmapset *ExecFindInitialMatchingSubPlans(PartitionPruneInfo *pruneinfo,
+								EState *estate, List *rtable,
+								ParamListInfo params,
+								Bitmapset **parentrelids);
 
 #endif							/* EXECPARTITION_H */
diff --git a/src/include/nodes/nodeFuncs.h b/src/include/nodes/nodeFuncs.h
index 03a346c01d..8b985a4706 100644
--- a/src/include/nodes/nodeFuncs.h
+++ b/src/include/nodes/nodeFuncs.h
@@ -158,5 +158,8 @@ extern bool raw_expression_tree_walker(Node *node, bool (*walker) (),
 struct PlanState;
 extern bool planstate_tree_walker(struct PlanState *planstate, bool (*walker) (),
 								  void *context);
+struct Plan;
+extern bool plan_tree_walker(struct Plan *plan, bool (*walker) (),
+				 void *context);
 
 #endif							/* NODEFUNCS_H */
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 324d92880b..d041b4d924 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -101,6 +101,9 @@ typedef struct PlannerGlobal
 
 	List	   *finalrtable;	/* "flat" rangetable for executor */
 
+	Bitmapset  *relationRTIs;	/* Indexes of RTE_RELATION entries in range
+								 * table */
+
 	List	   *finalrowmarks;	/* "flat" list of PlanRowMarks */
 
 	List	   *resultRelations;	/* "flat" list of integer RT indexes */
@@ -129,6 +132,9 @@ typedef struct PlannerGlobal
 
 	char		maxParallelHazard;	/* worst PROPARALLEL hazard level */
 
+	bool		usesPreExecPruning;	/* Do some Plan nodes use pre-execution
+									 * partition pruning */
+
 	PartitionDirectory partition_directory; /* partition descriptors */
 } PlannerGlobal;
 
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index be3c30704a..23bf04578b 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -59,12 +59,18 @@ typedef struct PlannedStmt
 
 	bool		parallelModeNeeded; /* parallel mode required to execute? */
 
+	bool		usesPreExecPruning;	/* Do some Plan nodes use pre-execution
+									 * partition pruning */
+
 	int			jitFlags;		/* which forms of JIT should be performed */
 
 	struct Plan *planTree;		/* tree of Plan nodes */
 
 	List	   *rtable;			/* list of RangeTblEntry nodes */
 
+	Bitmapset  *relationRTIs;	/* Indexes of RTE_RELATION entries in range
+								 * table */
+
 	/* rtable indexes of target relations for INSERT/UPDATE/DELETE */
 	List	   *resultRelations;	/* integer list of RT indexes, or NIL */
 
@@ -1157,6 +1163,13 @@ typedef struct PlanRowMark
  * prune_infos			List of Lists containing PartitionedRelPruneInfo nodes,
  *						one sublist per run-time-prunable partition hierarchy
  *						appearing in the parent plan node's subplans.
+ *
+ * contains_init_steps	Does any of the PartitionedRelPruneInfos in
+ *						prune_infos have its initial_pruning_steps set?
+ *
+ * contains_exec_steps	Does any of the PartitionedRelPruneInfos in
+ *						prune_infos have its exec_pruning_steps set?
+ *
  * other_subplans		Indexes of any subplans that are not accounted for
  *						by any of the PartitionedRelPruneInfo nodes in
  *						"prune_infos".  These subplans must not be pruned.
@@ -1165,6 +1178,8 @@ typedef struct PartitionPruneInfo
 {
 	NodeTag		type;
 	List	   *prune_infos;
+	bool		contains_init_steps;
+	bool		contains_exec_steps;
 	Bitmapset  *other_subplans;
 } PartitionPruneInfo;
 
diff --git a/src/include/partitioning/partprune.h b/src/include/partitioning/partprune.h
index 5f51e73a4d..1c9c408f00 100644
--- a/src/include/partitioning/partprune.h
+++ b/src/include/partitioning/partprune.h
@@ -41,6 +41,8 @@ struct RelOptInfo;
  *					subsidiary data, such as the FmgrInfos.
  * planstate		Points to the parent plan node's PlanState when called
  *					during execution; NULL when called from the planner.
+ * exprcontext		ExprContext to use during pre-execution pruning; planstate
+ *					would be NULL in that case.
  * exprstates		Array of ExprStates, indexed as per PruneCxtStateIdx; one
  *					for each partition key in each pruning step.  Allocated if
  *					planstate is non-NULL, otherwise NULL.
@@ -56,6 +58,7 @@ typedef struct PartitionPruneContext
 	FmgrInfo   *stepcmpfuncs;
 	MemoryContext ppccontext;
 	PlanState  *planstate;
+	ExprContext *exprcontext;
 	ExprState **exprstates;
 } PartitionPruneContext;
 
-- 
2.24.1




* Re: generic plans and "initial" pruning
@ 2021-12-28 13:12  Ashutosh Bapat <[email protected]>
  parent: Amit Langote <[email protected]>
  1 sibling, 1 reply; 108+ messages in thread

From: Ashutosh Bapat @ 2021-12-28 13:12 UTC (permalink / raw)
  To: Amit Langote <[email protected]>; +Cc: pgsql-hackers

On Sat, Dec 25, 2021 at 9:06 AM Amit Langote <[email protected]> wrote:
>
> Executing generic plans involving partitions is known to become slower
> as partition count grows due to a number of bottlenecks, with
> AcquireExecutorLocks() showing at the top in profiles.
>
> Previous attempt at solving that problem was by David Rowley [1],
> where he proposed delaying locking of *all* partitions appearing under
> an Append/MergeAppend until "initial" pruning is done during the
> executor initialization phase.  A problem with that approach that he
> has described in [2] is that leaving partitions unlocked can lead to
> race conditions where the Plan node belonging to a partition can be
> invalidated when a concurrent session successfully alters the
> partition between AcquireExecutorLocks() saying the plan is okay to
> execute and then actually executing it.
>
> However, using an idea that Robert suggested to me off-list a little
> while back, it seems possible to determine the set of partitions that
> we can safely skip locking.  The idea is to look at the "initial" or
> "pre-execution" pruning instructions contained in a given Append or
> MergeAppend node when AcquireExecutorLocks() is collecting the
> relations to lock and consider relations from only those sub-nodes
> that survive performing those instructions.   I've attempted
> implementing that idea in the attached patch.
>

In which cases will we have "pre-execution" pruning instructions that
can be used to skip locking partitions? Can you please give a few
examples where this approach will be useful?

The benchmark is showing good results, indeed.


-- 
Best Wishes,
Ashutosh Bapat






* Re: generic plans and "initial" pruning
@ 2021-12-31 02:26  Amit Langote <[email protected]>
  parent: Ashutosh Bapat <[email protected]>
  0 siblings, 0 replies; 108+ messages in thread

From: Amit Langote @ 2021-12-31 02:26 UTC (permalink / raw)
  To: Ashutosh Bapat <[email protected]>; +Cc: pgsql-hackers

On Tue, Dec 28, 2021 at 22:12 Ashutosh Bapat <[email protected]>
wrote:

> On Sat, Dec 25, 2021 at 9:06 AM Amit Langote <[email protected]>
> wrote:
> >
> > Executing generic plans involving partitions is known to become slower
> > as partition count grows due to a number of bottlenecks, with
> > AcquireExecutorLocks() showing at the top in profiles.
> >
> > Previous attempt at solving that problem was by David Rowley [1],
> > where he proposed delaying locking of *all* partitions appearing under
> > an Append/MergeAppend until "initial" pruning is done during the
> > executor initialization phase.  A problem with that approach that he
> > has described in [2] is that leaving partitions unlocked can lead to
> > race conditions where the Plan node belonging to a partition can be
> > invalidated when a concurrent session successfully alters the
> > partition between AcquireExecutorLocks() saying the plan is okay to
> > execute and then actually executing it.
> >
> > However, using an idea that Robert suggested to me off-list a little
> > while back, it seems possible to determine the set of partitions that
> > we can safely skip locking.  The idea is to look at the "initial" or
> > "pre-execution" pruning instructions contained in a given Append or
> > MergeAppend node when AcquireExecutorLocks() is collecting the
> > relations to lock and consider relations from only those sub-nodes
> > that survive performing those instructions.   I've attempted
> > implementing that idea in the attached patch.
> >
>
> In which cases will we have "pre-execution" pruning instructions that
> can be used to skip locking partitions? Can you please give a few
> examples where this approach will be useful?


This is mainly useful for prepared queries, so something like:

prepare q as select * from partitioned_table where key = $1;

And only when execute q(…) uses a generic plan. A generic plan is
problematic because it must contain nodes for all partitions (without any
plan-time pruning), which means CheckCachedPlan() has to spend time
proportional to the number of partitions to determine that the plan is
still usable / has not been invalidated; most of that time is spent in
AcquireExecutorLocks().

Other bottlenecks, not addressed in this patch, pertain to some executor
startup/shutdown subroutines that process the range table of a PlannedStmt
in its entirety, whose length is also proportional to the number of
partitions when the plan is generic.
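
To make the cost argument concrete, here is a minimal C sketch (not code
from the patch) of a range-partitioned table queried with "key = $1".
All Toy* names are hypothetical, and the sketch ignores the lock on the
parent table.  It only illustrates that pruning with the bound parameter
leaves at most one partition to lock, while a generic plan without such
pruning has to lock every partition.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one range partition covering [lower, upper). */
typedef struct ToyPartition
{
	int			lower;
	int			upper;
} ToyPartition;

/*
 * Toy "initial pruning" for "key = $1": return the index of the only
 * partition whose range can contain the bound parameter value, or -1 if
 * no partition can.
 */
static int
toy_initial_prune(const ToyPartition *parts, size_t nparts, int param)
{
	assert(parts != NULL || nparts == 0);

	for (size_t i = 0; i < nparts; i++)
	{
		if (param >= parts[i].lower && param < parts[i].upper)
			return (int) i;
	}
	return -1;
}

/* Locks taken when every partition named in the generic plan is locked. */
static size_t
toy_locks_without_pruning(size_t nparts)
{
	return nparts;
}

/* Locks taken when only partitions surviving initial pruning are locked. */
static size_t
toy_locks_with_pruning(const ToyPartition *parts, size_t nparts, int param)
{
	return toy_initial_prune(parts, nparts, param) >= 0 ? 1 : 0;
}
```

With N partitions, the first count grows linearly with N while the second
stays at one or zero, which is the effect the patch is after.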

> The benchmark is showing good results, indeed.


Thanks.
-- 
Amit Langote
EDB: http://www.enterprisedb.com



* Re: generic plans and "initial" pruning
@ 2022-02-10 08:13  Amit Langote <[email protected]>
  parent: Amit Langote <[email protected]>
  1 sibling, 1 reply; 108+ messages in thread

From: Amit Langote @ 2022-02-10 08:13 UTC (permalink / raw)
  To: Robert Haas <[email protected]>; +Cc: pgsql-hackers

On Thu, Jan 13, 2022 at 3:20 AM Robert Haas <[email protected]> wrote:
> On Wed, Jan 12, 2022 at 9:32 AM Amit Langote <[email protected]> wrote:
> > Or, maybe this won't be a concern if performing ExecutorStart() is
> > made a part of CheckCachedPlan() somehow, which would then take locks
> > on the relation as the PlanState tree is built capturing any plan
> > invalidations, instead of AcquireExecutorLocks(). That does sound like
> > an ambitious undertaking though.
>
> On the surface that would seem to involve abstraction violations, but
> maybe that could be finessed somehow. The plancache shouldn't know too
> much about what the executor is going to do with the plan, but it
> could ask the executor to perform a step that has been designed for
> use by the plancache. I guess the core problem here is how to pass
> around information that is node-specific before we've stood up the
> executor state tree. Maybe the executor could have a function that
> does the pruning and returns some kind of array of results that can be
> used both to decide what to lock and also what to consider as pruned
> at the start of execution. (I'm hand-waving about the details because
> I don't know.)

The attached patch implements this idea.  Sorry for the delay in
getting this out and thanks to Robert for the off-list discussions on
this.

So the new executor "step" you mention is the function ExecutorPrep in
the patch, which calls a recursive function ExecPrepNode on the plan
tree's top node, much as ExecutorStart calls (via InitPlan)
ExecInitNode to construct a PlanState tree for actual execution
paralleling the plan tree.

For now, ExecutorPrep() / ExecPrepNode() mainly does two things as it
walks the plan tree: 1) extract the RT indexes of RTE_RELATION
entries and add them to a bitmapset in the result struct; 2) if the
node contains a PartitionPruneInfo, perform its "initial pruning
steps" and store the result in a per-plan-node node called
PlanPrepOutput.  The bitmapset and the array containing the per-plan-node
PlanPrepOutput nodes are returned in a node called ExecPrepOutput,
which is the result of ExecutorPrep(), to its calling module (say,
plancache.c), which, after it's done using that information, must pass
it forward to subsequent execution steps.  That is done by passing it,
via the module's callers, to CreateQueryDesc(), which remembers the
ExecPrepOutput in the QueryDesc that is eventually passed to
ExecutorStart().
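
The walk described above can be sketched roughly as follows.  This is a
self-contained toy, not the patch's code: every Toy* type and function is
hypothetical, bitmapsets are replaced by a 64-bit mask, and "initial
pruning" is reduced to comparing a per-child key with the bound
parameter.  It only shows the shape of the result: one set of RT indexes
to lock, plus an array of per-node outputs indexed by plan_node_id, with
NULL for nodes that have nothing to report.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define TOY_MAX_CHILDREN 8

typedef enum ToyNodeKind
{
	TOY_SCAN,
	TOY_APPEND
} ToyNodeKind;

typedef struct ToyPlan
{
	ToyNodeKind kind;
	int			plan_node_id;	/* index into the per-node output array */
	int			scanrelid;		/* RT index (1..63), TOY_SCAN only */
	int			nchildren;		/* TOY_APPEND only */
	struct ToyPlan *children[TOY_MAX_CHILDREN];
	int			child_key[TOY_MAX_CHILDREN];	/* toy pruning key per child */
} ToyPlan;

typedef struct ToyPlanPrepOutput
{
	uint64_t	valid_subplans; /* bit i set => child i survived pruning */
} ToyPlanPrepOutput;

typedef struct ToyExecPrepOutput
{
	uint64_t	relations;		/* bit rti set => RT index rti needs a lock */
	int			num_plan_nodes;
	ToyPlanPrepOutput **plan_outputs;	/* indexed by plan_node_id */
} ToyExecPrepOutput;

/* Recursive part: collect RT indexes, prune under Append nodes. */
static void
toy_prep_node(const ToyPlan *plan, int param, ToyExecPrepOutput *out)
{
	ToyPlanPrepOutput *ppo;

	assert(plan->plan_node_id >= 0 &&
		   plan->plan_node_id < out->num_plan_nodes);

	if (plan->kind == TOY_SCAN)
	{
		assert(plan->scanrelid > 0 && plan->scanrelid < 64);
		out->relations |= UINT64_C(1) << plan->scanrelid;
		return;
	}

	/* TOY_APPEND: keep only children whose key matches the parameter. */
	ppo = (ToyPlanPrepOutput *) calloc(1, sizeof(ToyPlanPrepOutput));
	assert(ppo != NULL);
	for (int i = 0; i < plan->nchildren; i++)
	{
		if (plan->child_key[i] != param)
			continue;			/* pruned: never visited, never locked */
		ppo->valid_subplans |= UINT64_C(1) << i;
		toy_prep_node(plan->children[i], param, out);
	}
	out->plan_outputs[plan->plan_node_id] = ppo;
}

/* Entry point: allocate the result and walk from the top node. */
static ToyExecPrepOutput *
toy_executor_prep(const ToyPlan *top, int num_plan_nodes, int param)
{
	ToyExecPrepOutput *out;

	assert(num_plan_nodes > 0);
	out = (ToyExecPrepOutput *) calloc(1, sizeof(ToyExecPrepOutput));
	assert(out != NULL);
	out->num_plan_nodes = num_plan_nodes;
	out->plan_outputs = (ToyPlanPrepOutput **)
		calloc((size_t) num_plan_nodes, sizeof(ToyPlanPrepOutput *));
	assert(out->plan_outputs != NULL);
	toy_prep_node(top, param, out);
	return out;
}
```

Because pruned children are never visited, their RT indexes never reach
the relations set, which is what lets the caller skip locking them.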

A bunch of other details are mentioned in the patch's commit message,
which I'm pasting below for anyone reading to spot any obvious flaws
(no-go's) of this approach:

    Invent a new executor "prep" phase

    The new phase, implemented by execMain.c:ExecutorPrep() and its
    recursive underling execProcnode.c:ExecPrepNode(), takes a query's
    PlannedStmt and processes the plan tree contained in it to produce
    an ExecPrepOutput node as result.

    As the plan tree is walked, each node must add the RT index(es) of
    any relation(s) that it directly manipulates to a bitmapset member of
    ExecPrepOutput (for example, an IndexScan node must add the Scan's
    scanrelid).  Also, each node may want to make a PlanPrepOutput node
    containing additional information that may be of interest to the
    calling module or to the later execution phases, if the node can
    provide one (for example, an Append node may perform initial pruning
    and add a set of "initially valid subplans" to the PlanPrepOutput).
    The PlanPrepOutput nodes of all the plan nodes are added to an array
    in the ExecPrepOutput, which is indexed using the individual nodes'
    plan_node_id; a NULL is stored in the array slots of nodes that
    don't have anything interesting to add to the PlanPrepOutput.

    The ExecPrepOutput thus produced is passed to CreateQueryDesc()
    and subsequently to ExecutorStart() via QueryDesc, which then makes
    it available to the executor routines via the query's EState.

    The main goal of adding this new phase is, for now, to allow
    cached generic plans containing scans of partitioned tables using
    Append/MergeAppend to be executed more efficiently by the prep phase
    doing any initial pruning, instead of deferring that to
    ExecutorStart().  That may allow AcquireExecutorLocks() on the plan
    to lock only the minimal set of relations/partitions, that is,
    those whose subplans survive the initial pruning.

    Implementation notes:

    * To allow initial pruning to be done as part of the pre-execution
    prep phase as opposed to as part of ExecutorStart(), this refactors
    ExecCreatePartitionPruneState() and ExecFindInitialMatchingSubPlans()
    to pass the information needed to do initial pruning directly as
    parameters instead of getting that from the EState and the PlanState
    of the parent Append/MergeAppend, both of which would not be
    available in ExecutorPrep().  Another refactoring, not essential to
    this goal, is moving the partition pruning initialization stanzas in
    ExecInitAppend() and ExecInitMergeAppend(), both of which contain
    the same code, into a new function ExecInitPartitionPruning().

    * To pass the ExecPrepOutput(s) created by the plancache module's
    invocation of ExecutorPrep() to the callers of the module, which in
    turn would pass them down to ExecutorStart(), CachedPlan gets a new
    List field that stores those ExecPrepOutputs, containing one element
    for each PlannedStmt also contained in the CachedPlan.  The new list
    is stored in a child context of the context containing the
    PlannedStmts, though unlike the latter, it is reset on every
    invocation of CheckCachedPlan(), which in turn calls ExecutorPrep()
    with a new set of bound Params.

    * AcquireExecutorLocks() is now made to loop over a bitmapset of RT
    indexes, those of relations returned in ExecPrepOutput, instead of
    over the whole range table.  With initial pruning that is also done
    as part of ExecutorPrep(), only relations from non-pruned nodes of
    the plan tree would get locked as a result of this new arrangement.

    * PlannedStmt gets a new field usesPreExecPruning that indicates
    whether any of the nodes of the plan tree contain "initial" (or
    "pre-execution") pruning steps, which saves ExecutorPrep() the
    trouble of walking the plan tree only to find out whether that's
    the case.

    * PartitionPruneInfo nodes now explicitly store whether the steps
    contained in any of the individual PartitionedRelPruneInfos embedded
    in it contain initial pruning steps (those that can be performed
    during ExecutorPrep) and execution pruning steps (those that can only
    be performed during ExecutorRun), as flags contains_initial_steps and
    contains_exec_steps, respectively.  In fact, the aforementioned
    PlannedStmt field's value is a logical OR of the values of the former
    across all PartitionPruneInfo nodes embedded in the plan tree.

    * PlannedStmt also gets a bitmapset field to store the RT indexes of
    all relation RTEs referenced in the query.  It is populated when
    constructing the flat range table in setrefs.c and effectively
    contains all the relations that the planner must have locked.  In the
    case of a cached plan, AcquireExecutorLocks() must lock all of those
    relations, except those whose subnodes get pruned as a result of
    ExecutorPrep().

    * PlannedStmt gets yet another field numPlanNodes that records the
    highest plan_node_id assigned to any of the nodes contained in the
    tree, which serves as the size to use when allocating the
    PlanPrepOutput array.
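
As an illustration of the AcquireExecutorLocks() note above, here is a
hedged, self-contained C sketch of looping over a set of RT indexes
instead of over the whole range table.  It is not the patch's code: the
Toy* names are hypothetical, a 64-bit mask stands in for a Bitmapset, and
"locking" is modeled by recording the relation's OID in an output array.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, stripped-down range table entry. */
typedef struct ToyRangeTblEntry
{
	int			is_relation;	/* nonzero for an RTE_RELATION-like entry */
	unsigned	relid;			/* hypothetical table OID */
} ToyRangeTblEntry;

/* Return the smallest member of "set" greater than "prev", or -1 if none. */
static int
toy_next_member(uint64_t set, int prev)
{
	for (int i = prev + 1; i < 64; i++)
	{
		if (set & (UINT64_C(1) << i))
			return i;
	}
	return -1;
}

/*
 * "Lock" only the range table entries whose 1-based RT index is a member
 * of "relids".  The OIDs of the locked relations are written to "locked",
 * which must have room for at least "nrtes" entries.  Returns the number
 * of locks taken.
 */
static int
toy_acquire_locks(const ToyRangeTblEntry *rtable, int nrtes,
				  uint64_t relids, unsigned *locked)
{
	int			nlocked = 0;
	int			rti = 0;		/* RT indexes start at 1 */

	while ((rti = toy_next_member(relids, rti)) > 0)
	{
		const ToyRangeTblEntry *rte;

		assert(rti <= nrtes);
		rte = &rtable[rti - 1];
		assert(rte->is_relation);
		locked[nlocked++] = rte->relid;
	}
	return nlocked;
}
```

The loop's cost depends on the number of members in the set rather than
on the length of the range table, so partitions pruned during the prep
phase cost nothing here.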

Maybe this should be more than one patch?  Say:

0001 to add ExecutorPrep and the boilerplate,
0002 to teach plancache.c to use the new facility

Thoughts?

--
Amit Langote
EDB: http://www.enterprisedb.com


Attachments:

  [application/octet-stream] v4-0001-Invent-a-new-executor-prep-phase.patch (169.2K, 2-v4-0001-Invent-a-new-executor-prep-phase.patch)
  download | inline diff:
From 7d29fea0fcf8e6aec2877804555dd0239fdaf1be Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Wed, 22 Dec 2021 16:55:17 +0900
Subject: [PATCH v4] Invent a new executor "prep" phase

The new phase, implemented by execMain.c:ExecutorPrep() and its
recursive underling execProcnode.c:ExecPrepNode(), takes a query's
PlannedStmt and processes the plan tree contained in it to produce
an ExecPrepOutput node as result.

As the plan tree is walked, each node must add the RT index(es) of
any relation(s) that it directly manipulates to a bitmapset member of
ExecPrepOutput (for example, an IndexScan node must add the Scan's
scanrelid).  Also, each node may want to make a PlanPrepOutput node
containing additional information that may be of interest to the
calling module or to the later execution phases, if the node can
provide one (for example, an Append node may perform initial pruning
and add a set of "initially valid subplans" to the PlanPrepOutput).
The PlanPrepOutput nodes of all the plan nodes are added to an array
in the ExecPrepOutput, which is indexed using the individual nodes'
plan_node_id; a NULL is stored in the array slots of nodes that
don't have anything interesting to add to the PlanPrepOutput.

The ExecPrepOutput thus produced is passed to CreateQueryDesc()
and subsequently to ExecutorStart() via QueryDesc, which then makes
it available to the executor routines via the query's EState.

The main goal of adding this new phase is, for now, to allow
cached generic plans containing scans of partitioned tables using
Append/MergeAppend to be executed more efficiently by the prep phase
doing any initial pruning, instead of deferring that to
ExecutorStart().  That may allow AcquireExecutorLocks() on the plan
to lock only the minimal set of relations/partitions, that is,
those whose subplans survive the initial pruning.

Implementation notes:

* To allow initial pruning to be done as part of the pre-execution
prep phase as opposed to as part of ExecutorStart(), this refactors
ExecCreatePartitionPruneState() and ExecFindInitialMatchingSubPlans()
to pass the information needed to do initial pruning directly as
parameters instead of getting that from the EState and the PlanState
of the parent Append/MergeAppend, both of which would not be
available in ExecutorPrep().  Another refactoring, not essential to
this goal, is moving the partition pruning initialization stanzas in
ExecInitAppend() and ExecInitMergeAppend(), both of which contain
the same code, into a new function ExecInitPartitionPruning().

* To pass the ExecPrepOutput(s) created by the plancache module's
invocation of ExecutorPrep() to the callers of the module, which in
turn would pass them down to ExecutorStart(), CachedPlan gets a new
List field that stores those ExecPrepOutputs, containing one element
for each PlannedStmt also contained in the CachedPlan.  The new list
is stored in a child context of the context containing the
PlannedStmts, though unlike the latter, it is reset on every
invocation of CheckCachedPlan(), which in turn calls ExecutorPrep()
with a new set of bound Params.

* AcquireExecutorLocks() is now made to loop over a bitmapset of RT
indexes, those of relations returned in ExecPrepOutput, instead of
over the whole range table.  With initial pruning that is also done
as part of ExecutorPrep(), only relations from non-pruned nodes of
the plan tree would get locked as a result of this new arrangement.

* PlannedStmt gets a new field usesPreExecPruning that indicates
whether any of the nodes of the plan tree contain "initial" (or
"pre-execution") pruning steps, which saves ExecutorPrep() the
trouble of walking the plan tree only to find out whether that's
the case.

* PartitionPruneInfo nodes now explicitly store whether the steps
contained in any of the individual PartitionedRelPruneInfos embedded
in it contain initial pruning steps (those that can be performed
during ExecutorPrep) and execution pruning steps (those that can only
be performed during ExecutorRun), as flags contains_initial_steps and
contains_exec_steps, respectively.  In fact, the aforementioned
PlannedStmt field's value is a logical OR of the values of the former
across all PartitionPruneInfo nodes embedded in the plan tree.

* PlannedStmt also gets a bitmapset field to store the RT indexes of
all relation RTEs referenced in the query.  It is populated when
constructing the flat range table in setrefs.c and effectively
contains all the relations that the planner must have locked.  In the
case of a cached plan, AcquireExecutorLocks() must lock all of those
relations, except those whose subnodes get pruned as a result of
ExecutorPrep().

* PlannedStmt gets yet another field numPlanNodes that records the
highest plan_node_id assigned to any of the nodes contained in the
tree, which serves as the size to use when allocating the
PlanPrepOutput array.
---
 src/backend/commands/copyto.c                 |   2 +-
 src/backend/commands/createas.c               |   2 +-
 src/backend/commands/explain.c                |   7 +-
 src/backend/commands/extension.c              |  13 +-
 src/backend/commands/matview.c                |   2 +-
 src/backend/commands/portalcmds.c             |   1 +
 src/backend/commands/prepare.c                |  17 +-
 src/backend/executor/README                   |  18 +
 src/backend/executor/execMain.c               |  48 ++
 src/backend/executor/execParallel.c           |   4 +-
 src/backend/executor/execPartition.c          | 538 +++++++++++++-----
 src/backend/executor/execProcnode.c           | 206 +++++++
 src/backend/executor/execUtils.c              |   8 +
 src/backend/executor/functions.c              |   2 +-
 src/backend/executor/nodeAgg.c                |  13 +
 src/backend/executor/nodeAppend.c             |  91 ++-
 src/backend/executor/nodeBitmapAnd.c          |  18 +
 src/backend/executor/nodeBitmapHeapscan.c     |  14 +
 src/backend/executor/nodeBitmapIndexscan.c    |  14 +
 src/backend/executor/nodeBitmapOr.c           |  18 +
 src/backend/executor/nodeCtescan.c            |  12 +
 src/backend/executor/nodeCustom.c             |  18 +
 src/backend/executor/nodeForeignscan.c        |  12 +
 src/backend/executor/nodeFunctionscan.c       |  13 +
 src/backend/executor/nodeGather.c             |  13 +
 src/backend/executor/nodeGatherMerge.c        |  13 +
 src/backend/executor/nodeGroup.c              |  13 +
 src/backend/executor/nodeHash.c               |  13 +
 src/backend/executor/nodeHashjoin.c           |  14 +
 src/backend/executor/nodeIncrementalSort.c    |  14 +
 src/backend/executor/nodeIndexonlyscan.c      |  14 +
 src/backend/executor/nodeIndexscan.c          |  14 +
 src/backend/executor/nodeLimit.c              |  13 +
 src/backend/executor/nodeLockRows.c           |  13 +
 src/backend/executor/nodeMaterial.c           |  13 +
 src/backend/executor/nodeMemoize.c            |  13 +
 src/backend/executor/nodeMergeAppend.c        |  90 ++-
 src/backend/executor/nodeMergejoin.c          |  14 +
 src/backend/executor/nodeModifyTable.c        |  26 +
 .../executor/nodeNamedtuplestorescan.c        |  13 +
 src/backend/executor/nodeNestloop.c           |  14 +
 src/backend/executor/nodeProjectSet.c         |  13 +
 src/backend/executor/nodeRecursiveunion.c     |  14 +
 src/backend/executor/nodeResult.c             |  13 +
 src/backend/executor/nodeSamplescan.c         |  14 +
 src/backend/executor/nodeSeqscan.c            |  13 +
 src/backend/executor/nodeSetOp.c              |  13 +
 src/backend/executor/nodeSort.c               |  13 +
 src/backend/executor/nodeSubplan.c            |  12 +
 src/backend/executor/nodeSubqueryscan.c       |  14 +
 src/backend/executor/nodeTableFuncscan.c      |  13 +
 src/backend/executor/nodeTidrangescan.c       |  14 +
 src/backend/executor/nodeTidscan.c            |  15 +-
 src/backend/executor/nodeUnique.c             |  13 +
 src/backend/executor/nodeValuesscan.c         |  13 +
 src/backend/executor/nodeWindowAgg.c          |  13 +
 src/backend/executor/nodeWorktablescan.c      |  12 +
 src/backend/executor/spi.c                    |  14 +-
 src/backend/nodes/copyfuncs.c                 |  49 ++
 src/backend/nodes/outfuncs.c                  |   6 +
 src/backend/nodes/readfuncs.c                 |   5 +
 src/backend/optimizer/plan/planner.c          |   3 +
 src/backend/optimizer/plan/setrefs.c          |  10 +
 src/backend/partitioning/partprune.c          |  57 +-
 src/backend/tcop/postgres.c                   |  15 +-
 src/backend/tcop/pquery.c                     |  21 +-
 src/backend/utils/cache/plancache.c           | 155 +++--
 src/backend/utils/mmgr/portalmem.c            |   2 +
 src/include/commands/explain.h                |   3 +-
 src/include/executor/execPartition.h          |  19 +-
 src/include/executor/execdesc.h               |   2 +
 src/include/executor/executor.h               |   3 +
 src/include/executor/nodeAgg.h                |   1 +
 src/include/executor/nodeAppend.h             |   1 +
 src/include/executor/nodeBitmapAnd.h          |   1 +
 src/include/executor/nodeBitmapHeapscan.h     |   1 +
 src/include/executor/nodeBitmapIndexscan.h    |   1 +
 src/include/executor/nodeBitmapOr.h           |   1 +
 src/include/executor/nodeCtescan.h            |   1 +
 src/include/executor/nodeCustom.h             |   1 +
 src/include/executor/nodeForeignscan.h        |   1 +
 src/include/executor/nodeFunctionscan.h       |   1 +
 src/include/executor/nodeGather.h             |   1 +
 src/include/executor/nodeGatherMerge.h        |   1 +
 src/include/executor/nodeGroup.h              |   1 +
 src/include/executor/nodeHash.h               |   1 +
 src/include/executor/nodeHashjoin.h           |   1 +
 src/include/executor/nodeIncrementalSort.h    |   1 +
 src/include/executor/nodeIndexonlyscan.h      |   1 +
 src/include/executor/nodeIndexscan.h          |   1 +
 src/include/executor/nodeLimit.h              |   1 +
 src/include/executor/nodeLockRows.h           |   1 +
 src/include/executor/nodeMaterial.h           |   1 +
 src/include/executor/nodeMemoize.h            |   1 +
 src/include/executor/nodeMergeAppend.h        |   1 +
 src/include/executor/nodeMergejoin.h          |   1 +
 src/include/executor/nodeModifyTable.h        |   1 +
 .../executor/nodeNamedtuplestorescan.h        |   1 +
 src/include/executor/nodeNestloop.h           |   1 +
 src/include/executor/nodeProjectSet.h         |   1 +
 src/include/executor/nodeRecursiveunion.h     |   1 +
 src/include/executor/nodeResult.h             |   2 +
 src/include/executor/nodeSamplescan.h         |   1 +
 src/include/executor/nodeSeqscan.h            |   1 +
 src/include/executor/nodeSetOp.h              |   1 +
 src/include/executor/nodeSort.h               |   1 +
 src/include/executor/nodeSubplan.h            |   1 +
 src/include/executor/nodeSubqueryscan.h       |   1 +
 src/include/executor/nodeTableFuncscan.h      |   1 +
 src/include/executor/nodeTidrangescan.h       |   1 +
 src/include/executor/nodeTidscan.h            |   1 +
 src/include/executor/nodeUnique.h             |   1 +
 src/include/executor/nodeValuesscan.h         |   1 +
 src/include/executor/nodeWindowAgg.h          |   1 +
 src/include/executor/nodeWorktablescan.h      |   1 +
 src/include/nodes/execnodes.h                 |  78 +++
 src/include/nodes/nodeFuncs.h                 |   3 +
 src/include/nodes/nodes.h                     |   5 +
 src/include/nodes/pathnodes.h                 |   6 +
 src/include/nodes/plannodes.h                 |  17 +
 src/include/partitioning/partprune.h          |   2 +
 src/include/tcop/tcopprot.h                   |   2 +-
 src/include/utils/plancache.h                 |   5 +
 src/include/utils/portal.h                    |   5 +
 124 files changed, 1866 insertions(+), 285 deletions(-)

diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index 3283ef50d0..bb7d5e65ea 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -542,7 +542,7 @@ BeginCopyTo(ParseState *pstate,
 		((DR_copy *) dest)->cstate = cstate;
 
 		/* Create a QueryDesc requesting no output */
-		cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+		cstate->queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
 											dest, NULL, NULL, 0);
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 9abbb6b555..f6607f2454 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -325,7 +325,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
 		UpdateActiveSnapshotCommandId();
 
 		/* Create a QueryDesc, redirecting output to our tuple receiver */
-		queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+		queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
 									GetActiveSnapshot(), InvalidSnapshot,
 									dest, params, queryEnv, 0);
 
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index b970997c34..9ee82824a1 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -407,7 +407,7 @@ ExplainOneQuery(Query *query, int cursorOptions,
 		}
 
 		/* run it (if needed) and produce output */
-		ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+		ExplainOnePlan(plan, NULL, into, es, queryString, params, queryEnv,
 					   &planduration, (es->buffers ? &bufusage : NULL));
 	}
 }
@@ -515,7 +515,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
  * to call it.
  */
 void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, ExecPrepOutput *execprep,
+			   IntoClause *into, ExplainState *es,
 			   const char *queryString, ParamListInfo params,
 			   QueryEnvironment *queryEnv, const instr_time *planduration,
 			   const BufferUsage *bufusage)
@@ -563,7 +564,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
 		dest = None_Receiver;
 
 	/* Create a QueryDesc for the query */
-	queryDesc = CreateQueryDesc(plannedstmt, queryString,
+	queryDesc = CreateQueryDesc(plannedstmt, execprep, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, params, queryEnv, instrument_option);
 
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index a2e77c418a..214a345aa2 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -741,8 +741,10 @@ execute_sql_string(const char *sql)
 		RawStmt    *parsetree = lfirst_node(RawStmt, lc1);
 		MemoryContext per_parsetree_context,
 					oldcontext;
-		List	   *stmt_list;
-		ListCell   *lc2;
+		List	   *stmt_list,
+				   *stmt_execprep_list;
+		ListCell   *lc2,
+				   *lc3;
 
 		/*
 		 * We do the work for each parsetree in a short-lived context, to
@@ -762,11 +764,13 @@ execute_sql_string(const char *sql)
 										   NULL,
 										   0,
 										   NULL);
-		stmt_list = pg_plan_queries(stmt_list, sql, CURSOR_OPT_PARALLEL_OK, NULL);
+		stmt_list = pg_plan_queries(stmt_list, sql, CURSOR_OPT_PARALLEL_OK, NULL,
+									&stmt_execprep_list);
 
-		foreach(lc2, stmt_list)
+		forboth(lc2, stmt_list, lc3, stmt_execprep_list)
 		{
 			PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+			ExecPrepOutput *execprep = lfirst_node(ExecPrepOutput, lc3);
 
 			CommandCounterIncrement();
 
@@ -777,6 +781,7 @@ execute_sql_string(const char *sql)
 				QueryDesc  *qdesc;
 
 				qdesc = CreateQueryDesc(stmt,
+										execprep,
 										sql,
 										GetActiveSnapshot(), NULL,
 										dest, NULL, NULL, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 05e7b60059..4ef44aaf23 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -416,7 +416,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
 	UpdateActiveSnapshotCommandId();
 
 	/* Create a QueryDesc, redirecting output to our tuple receiver */
-	queryDesc = CreateQueryDesc(plan, queryString,
+	queryDesc = CreateQueryDesc(plan, NULL, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, NULL, NULL, 0);
 
diff --git a/src/backend/commands/portalcmds.c b/src/backend/commands/portalcmds.c
index 9902c5c566..0bea2dd18f 100644
--- a/src/backend/commands/portalcmds.c
+++ b/src/backend/commands/portalcmds.c
@@ -107,6 +107,7 @@ PerformCursorOpen(ParseState *pstate, DeclareCursorStmt *cstmt, ParamListInfo pa
 					  queryString,
 					  CMDTAG_SELECT,	/* cursor's query is always a SELECT */
 					  list_make1(plan),
+					  list_make1(NULL),	/* no ExecPrepOutput to pass */
 					  NULL);
 
 	/*----------
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 206d2bbbf9..ac188a7347 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -189,6 +189,7 @@ ExecuteQuery(ParseState *pstate,
 	PreparedStatement *entry;
 	CachedPlan *cplan;
 	List	   *plan_list;
+	List	   *plan_execprep_list;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
 	Portal		portal;
@@ -229,6 +230,7 @@ ExecuteQuery(ParseState *pstate,
 	/* Replan if needed, and increment plan refcount for portal */
 	cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL);
 	plan_list = cplan->stmt_list;
+	plan_execprep_list = cplan->stmt_execprep_list;
 
 	/*
 	 * DO NOT add any logic that could possibly throw an error between
@@ -238,7 +240,7 @@ ExecuteQuery(ParseState *pstate,
 					  NULL,
 					  query_string,
 					  entry->plansource->commandTag,
-					  plan_list,
+					  plan_list, plan_execprep_list,
 					  cplan);
 
 	/*
@@ -610,7 +612,9 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	const char *query_string;
 	CachedPlan *cplan;
 	List	   *plan_list;
-	ListCell   *p;
+	List	   *plan_execprep_list;
+	ListCell   *p,
+			   *pe;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
 	instr_time	planstart;
@@ -666,15 +670,18 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	}
 
 	plan_list = cplan->stmt_list;
+	plan_execprep_list = cplan->stmt_execprep_list;
 
 	/* Explain each query */
-	foreach(p, plan_list)
+	forboth(p, plan_list, pe, plan_execprep_list)
 	{
 		PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
+		ExecPrepOutput *execprep = lfirst_node(ExecPrepOutput, pe);
 
 		if (pstmt->commandType != CMD_UTILITY)
-			ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv,
-						   &planduration, (es->buffers ? &bufusage : NULL));
+			ExplainOnePlan(pstmt, execprep, into, es, query_string, paramLI,
+						   queryEnv, &planduration,
+						   (es->buffers ? &bufusage : NULL));
 		else
 			ExplainOneUtility(pstmt->utilityStmt, into, es, query_string,
 							  paramLI, queryEnv);
diff --git a/src/backend/executor/README b/src/backend/executor/README
index bf5e70860d..c25db66ff0 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -65,6 +65,21 @@ found there.  This currently only occurs for Append and MergeAppend nodes.  In
 this case the non-required subplans are ignored and the executor state's
 subnode array will become out of sequence to the plan's subplan list.
 
+A plan tree may also be made to go through ExecutorPrep() to collect some
+information about the individual plan nodes that may help optimize the
+actual execution of the plan.  Such information about each plan node is put
+into a PlanPrepOutput node if the plan node type supports producing one and
+stored in an array in ExecPrepOutput that in turn represents the output of
+an ExecutorPrep() run.  The PlanPrepOutput array is indexed with plan_node_id
+of the individual plan nodes.  An example of what such information may look
+like is in the "prep" routine of the Append node (ExecPrepAppend), which does
+partition pruning using "initial steps", that is, pruning with expressions
+that can be evaluated even before actual execution has started.  That
+produces a set of "initially valid subplans" that is put into the
+PlanPrepOutput belonging to the Append node and can be used as-is by the
+node's initializer routine (nodeAppend.c: ExecInitAppend) to initialize only
+the plan state trees of those subplans.
+
 Each Plan node may have expression trees associated with it, to represent
 its target list, qualification conditions, etc.  These trees are also
 read-only to the executor, but the executor state for expression evaluation
@@ -247,6 +262,9 @@ Query Processing Control Flow
 
 This is a sketch of control flow for full query processing:
 
+  [ ExecutorPrep ] --- an optional step to walk over the plan tree to produce
+		an ExecPrepOutput to be passed to CreateQueryDesc
+
 	CreateQueryDesc
 
 	ExecutorStart
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 549d9eb696..e38966295e 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -103,6 +103,52 @@ static void EvalPlanQualStart(EPQState *epqstate, Plan *planTree);
 
 /* end of local decls */
 
+/* ----------------------------------------------------------------
+ *		ExecutorPrep
+ *
+ *		This optional executor routine must be called if the PlannedStmt
+ *		indicates that some nodes in the planTree can perform preparatory
+ *		actions, such as pre-execution/initial pruning.
+ *
+ * Returned information includes the set of RT indexes of relations referenced
+ * in the plan, and a PlanPrepOutput node for each node in the planTree if the
+ * node type supports producing one.
+ *
+ * This may lock relations whose information is needed to produce the
+ * PlanPrepOutput nodes.  For example, a partitioned table is locked before
+ * its PartitionPruneInfo in an Append node is used to do the pruning whose
+ * result populates the Append node's PlanPrepOutput.
+ */
+ExecPrepOutput *
+ExecutorPrep(ExecPrepContext *context)
+{
+	ExecPrepOutput *result = makeNode(ExecPrepOutput);
+
+	result->numPlanNodes = context->stmt->numPlanNodes;
+	result->planPrepResults = palloc0(sizeof(PlanPrepOutput *) *
+									  result->numPlanNodes);
+	if (!context->stmt->usesPreExecPruning)
+	{
+		/* Shortcut */
+		result->relationRTIs = bms_copy(context->stmt->relationRTIs);
+	}
+	else
+	{
+		/* Go find the nodes that need any "prep" work done. */
+		ListCell *lc;
+
+		foreach(lc, context->stmt->subplans)
+		{
+			Plan *subplan = lfirst(lc);
+
+			ExecPrepNode(subplan, context, result);
+		}
+
+		ExecPrepNode(context->stmt->planTree, context, result);
+	}
+
+	return result;
+}
 
 /* ----------------------------------------------------------------
  *		ExecutorStart
@@ -804,6 +850,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 {
 	CmdType		operation = queryDesc->operation;
 	PlannedStmt *plannedstmt = queryDesc->plannedstmt;
+	ExecPrepOutput *execprep = queryDesc->execprep;
 	Plan	   *plan = plannedstmt->planTree;
 	List	   *rangeTable = plannedstmt->rtable;
 	EState	   *estate = queryDesc->estate;
@@ -823,6 +870,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 	ExecInitRangeTable(estate, rangeTable);
 
 	estate->es_plannedstmt = plannedstmt;
+	estate->es_execprep = execprep;
 
 	/*
 	 * Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index 5dd8ab7db2..0567534358 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -182,8 +182,10 @@ ExecSerializePlan(Plan *plan, EState *estate)
 	pstmt->transientPlan = false;
 	pstmt->dependsOnRole = false;
 	pstmt->parallelModeNeeded = false;
+	pstmt->usesPreExecPruning = false;
 	pstmt->planTree = plan;
 	pstmt->rtable = estate->es_range_table;
+	pstmt->relationRTIs = NULL;
 	pstmt->resultRelations = NIL;
 	pstmt->appendRelations = NIL;
 
@@ -1248,7 +1250,7 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 	paramLI = RestoreParamList(&paramspace);
 
 	/* Create a QueryDesc for the query. */
-	return CreateQueryDesc(pstmt,
+	return CreateQueryDesc(pstmt, NULL,	/* XXX pass ExecPrepOutput too? */
 						   queryString,
 						   GetActiveSnapshot(), InvalidSnapshot,
 						   receiver, paramLI, NULL, instrument_options);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 90ed1485d1..75292fbd21 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -24,6 +24,7 @@
 #include "mb/pg_wchar.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
+#include "parser/parsetree.h"
 #include "partitioning/partbounds.h"
 #include "partitioning/partdesc.h"
 #include "partitioning/partprune.h"
@@ -186,7 +187,11 @@ static void ExecInitPruningContext(PartitionPruneContext *context,
 								   List *pruning_steps,
 								   PartitionDesc partdesc,
 								   PartitionKey partkey,
-								   PlanState *planstate);
+								   PlanState *planstate,
+								   ExprContext *econtext);
+static void ExecPartitionPruneFixSubPlanIndexes(PartitionPruneState *prunestate,
+									Bitmapset *initially_valid_subplans,
+									int n_total_subplans);
 static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
 										   PartitionedRelPruningData *pprune,
 										   bool initial_prune,
@@ -1476,8 +1481,9 @@ adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri)
  * considered to be a stable expression, it can change value from one plan
  * node scan to the next during query execution.  Stable comparison
  * expressions that don't involve such Params allow partition pruning to be
- * done once during executor startup.  Expressions that do involve such Params
- * require us to prune separately for each scan of the parent plan node.
+ * done once during executor startup, or even earlier during ExecutorPrep().
+ * Expressions that do involve such Params require us to prune separately for
+ * each scan of the parent plan node.
  *
  * Note that pruning away unneeded subplans during executor startup has the
  * added benefit of not having to initialize the unneeded subplans at all.
@@ -1485,10 +1491,28 @@ adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri)
  *
  * Functions:
  *
+ * ExecInitPartitionPruning:
+ *		This determines the initially valid subplans by doing pruning with
+ *		only pre-execution pruning expressions, that is, expressions in the
+ *		query that were matched to the partition key(s), whose values are
+ *		known at executor startup (excluding expressions containing
+ *		PARAM_EXEC Params); see ExecFindInitialMatchingSubPlans().  The
+ *		PartitionPruneState thus created, which stores the details about
+ *		mapping the partition indexes returned by the partition pruning code
+ *		into subplan indexes, is also returned for use during subsequent
+ *		pruning.  Pruned subplans must be removed from the parent plan's list
+ *		of subplans to be executed, so this also remaps the partition indexes
+ *		in the PartitionPruneState to the new indexes of surviving subplans.
+ *
+ * ExecPrepDoInitialPruning:
+ *		Do ExecFindInitialMatchingSubPlans as part of ExecPrepNode() on the
+ *		parent plan node.
+ *
  * ExecCreatePartitionPruneState:
  *		Creates the PartitionPruneState required by each of the two pruning
  *		functions.  Details stored include how to map the partition index
  *		returned by the partition pruning code into subplan indexes.
+ *		(Note: Use ExecInitPartitionPruning() rather than use this directly.)
  *
  * ExecFindInitialMatchingSubPlans:
  *		Returns indexes of matching subplans.  Partition pruning is attempted
@@ -1500,6 +1524,7 @@ adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri)
  *		remap of the partition index to subplan index map and the newly
  *		created map provides indexes only for subplans which remain after
  *		calling this function.
+ *		(Note: Use ExecInitPartitionPruning() rather than use this directly.)
  *
  * ExecFindMatchingSubPlans:
  *		Returns indexes of matching subplans after evaluating all available
@@ -1514,7 +1539,9 @@ adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri)
  *		Build the data structure required for calling
  *		ExecFindInitialMatchingSubPlans and ExecFindMatchingSubPlans.
  *
- * 'planstate' is the parent plan node's execution state.
+ * 'planstate', if not NULL, is the parent plan node's execution state.  It
+ * can be NULL if being called before ExecutorStart(), in which case,
+ * 'rtable', 'econtext', and 'partdir' must be provided.
  *
  * 'partitionpruneinfo' is a PartitionPruneInfo as generated by
  * make_partition_pruneinfo.  Here we build a PartitionPruneState containing a
@@ -1529,18 +1556,20 @@ adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri)
  */
 PartitionPruneState *
 ExecCreatePartitionPruneState(PlanState *planstate,
-							  PartitionPruneInfo *partitionpruneinfo)
+							  PartitionPruneInfo *partitionpruneinfo,
+							  bool consider_initial_steps,
+							  bool consider_exec_steps,
+							  List *rtable, ExprContext *econtext,
+							  PartitionDirectory partdir)
 {
-	EState	   *estate = planstate->state;
+	EState	   *estate = planstate ? planstate->state : NULL;
 	PartitionPruneState *prunestate;
 	int			n_part_hierarchies;
 	ListCell   *lc;
 	int			i;
 
-	/* For data reading, executor always omits detached partitions */
-	if (estate->es_partition_directory == NULL)
-		estate->es_partition_directory =
-			CreatePartitionDirectory(estate->es_query_cxt, false);
+	Assert(partdir != NULL && econtext != NULL &&
+		   (estate != NULL || rtable != NIL));
 
 	n_part_hierarchies = list_length(partitionpruneinfo->prune_infos);
 	Assert(n_part_hierarchies > 0);
@@ -1591,19 +1620,34 @@ ExecCreatePartitionPruneState(PlanState *planstate,
 			PartitionedRelPruneInfo *pinfo = lfirst_node(PartitionedRelPruneInfo, lc2);
 			PartitionedRelPruningData *pprune = &prunedata->partrelprunedata[j];
 			Relation	partrel;
+			bool		close_partrel = false;
 			PartitionDesc partdesc;
 			PartitionKey partkey;
 
 			/*
-			 * We can rely on the copies of the partitioned table's partition
-			 * key and partition descriptor appearing in its relcache entry,
-			 * because that entry will be held open and locked for the
-			 * duration of this executor run.
+			 * We can rely on the copy of the partitioned table's partition
+			 * key in its relcache entry, because it can't change (or get
+			 * destroyed) as long as the relation is locked.  The partition
+			 * descriptor is taken from the PartitionDirectory, which is
+			 * kept around long enough for the descriptor to remain valid
+			 * while it's used to perform the pruning steps.
 			 */
-			partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
+			if (estate == NULL)
+			{
+				RangeTblEntry *rte = rt_fetch(pinfo->rtindex, rtable);
+
+				partrel = table_open(rte->relid, rte->rellockmode);
+				close_partrel = true;
+			}
+			else
+				partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
+
 			partkey = RelationGetPartitionKey(partrel);
-			partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
-												partrel);
+			partdesc = PartitionDirectoryLookup(partdir, partrel);
+
+			/* Safe to close partrel, if necessary, keeping the lock taken. */
+			if (close_partrel)
+				table_close(partrel, NoLock);
 
 			/*
 			 * Initialize the subplan_map and subpart_map.
@@ -1705,30 +1749,32 @@ ExecCreatePartitionPruneState(PlanState *planstate,
 			 * Initialize pruning contexts as needed.
 			 */
 			pprune->initial_pruning_steps = pinfo->initial_pruning_steps;
-			if (pinfo->initial_pruning_steps)
+			if (consider_initial_steps && pinfo->initial_pruning_steps)
 			{
 				ExecInitPruningContext(&pprune->initial_context,
 									   pinfo->initial_pruning_steps,
-									   partdesc, partkey, planstate);
+									   partdesc, partkey, planstate,
+									   econtext);
 				/* Record whether initial pruning is needed at any level */
 				prunestate->do_initial_prune = true;
 			}
 			pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
-			if (pinfo->exec_pruning_steps)
+			if (consider_exec_steps && pinfo->exec_pruning_steps)
 			{
 				ExecInitPruningContext(&pprune->exec_context,
 									   pinfo->exec_pruning_steps,
-									   partdesc, partkey, planstate);
+									   partdesc, partkey, planstate,
+									   econtext);
 				/* Record whether exec pruning is needed at any level */
 				prunestate->do_exec_prune = true;
-			}
 
-			/*
-			 * Accumulate the IDs of all PARAM_EXEC Params affecting the
-			 * partitioning decisions at this plan node.
-			 */
-			prunestate->execparamids = bms_add_members(prunestate->execparamids,
-													   pinfo->execparamids);
+				/*
+				 * Accumulate the IDs of all PARAM_EXEC Params affecting the
+				 * partitioning decisions at this plan node.
+				 */
+				prunestate->execparamids = bms_add_members(prunestate->execparamids,
+														   pinfo->execparamids);
+			}
 
 			j++;
 		}
@@ -1740,13 +1786,18 @@ ExecCreatePartitionPruneState(PlanState *planstate,
 
 /*
  * Initialize a PartitionPruneContext for the given list of pruning steps.
+ *
+ * At least one of 'planstate' or 'econtext' must be passed to be able to
+ * successfully evaluate any non-Const expressions contained in the
+ * steps.
  */
 static void
 ExecInitPruningContext(PartitionPruneContext *context,
 					   List *pruning_steps,
 					   PartitionDesc partdesc,
 					   PartitionKey partkey,
-					   PlanState *planstate)
+					   PlanState *planstate,
+					   ExprContext *econtext)
 {
 	int			n_steps;
 	int			partnatts;
@@ -1767,6 +1818,7 @@ ExecInitPruningContext(PartitionPruneContext *context,
 
 	context->ppccontext = CurrentMemoryContext;
 	context->planstate = planstate;
+	context->exprcontext = econtext;
 
 	/* Initialize expression state for each expression we need */
 	context->exprstates = (ExprState **)
@@ -1795,14 +1847,269 @@ ExecInitPruningContext(PartitionPruneContext *context,
 														step->step.step_id,
 														keyno);
 
-				context->exprstates[stateidx] =
-					ExecInitExpr(expr, context->planstate);
+				if (planstate == NULL)
+					context->exprstates[stateidx] =
+						ExecInitExprWithParams(expr,
+											   econtext->ecxt_param_list_info);
+				else
+					context->exprstates[stateidx] =
+						ExecInitExpr(expr, context->planstate);
 			}
 			keyno++;
 		}
 	}
 }
 
+Bitmapset *
+ExecInitPartitionPruning(PlanState *planstate, int n_total_subplans,
+						 PartitionPruneInfo *pruneinfo,
+						 PartitionPruneState **prunestate)
+{
+	Bitmapset *validsubplans;
+	Plan   *plan = planstate->plan;
+	EState *estate = planstate->state;
+	PlanPrepOutput *planPrepResult = NULL;
+	bool	do_pruning = (pruneinfo->contains_init_steps ||
+						  pruneinfo->contains_exec_steps);
+
+	*prunestate = NULL;
+	if (estate->es_execprep)
+	{
+		planPrepResult = ExecPrepFetchPlanPrepOutput(estate->es_execprep,
+													 plan);
+
+		Assert(planPrepResult != NULL);
+		/* No need to do initial pruning again, only exec pruning. */
+		do_pruning = pruneinfo->contains_exec_steps;
+	}
+
+	if (do_pruning)
+	{
+		/* We may need an expression context to evaluate partition exprs */
+		ExecAssignExprContext(estate, planstate);
+
+		/* For data reading, executor always omits detached partitions */
+		if (estate->es_partition_directory == NULL)
+			estate->es_partition_directory =
+				CreatePartitionDirectory(estate->es_query_cxt, false);
+
+		/*
+		 * Create the working data structure for pruning.  No need to consider
+		 * initial pruning steps if we have a PlanPrepOutput.
+		 */
+		*prunestate = ExecCreatePartitionPruneState(planstate, pruneinfo,
+													planPrepResult == NULL, true,
+													NIL, planstate->ps_ExprContext,
+													estate->es_partition_directory);
+	}
+
+	/*
+	 * Perform an initial partition prune, if required.
+	 */
+	if (planPrepResult)
+	{
+		/* ExecutorPrep() already did it for us! */
+		validsubplans = planPrepResult->initially_valid_subnodes;
+	}
+	else if (*prunestate && (*prunestate)->do_initial_prune)
+	{
+		/* Determine which subplans survive initial pruning */
+		validsubplans = ExecFindInitialMatchingSubPlans(*prunestate, pruneinfo,
+														NULL);
+	}
+	else
+	{
+		/* We'll need to initialize all subplans */
+		Assert(n_total_subplans > 0);
+		validsubplans = bms_add_range(NULL, 0, n_total_subplans - 1);
+	}
+
+	/*
+	 * If exec-time pruning is required and subplans are pruned by initial
+	 * pruning, then we must re-sequence the subplan indexes so that
+	 * ExecFindMatchingSubPlans properly returns the indexes from the
+	 * subplans which will remain after initial pruning.
+	 *
+	 * We can safely skip this when !do_exec_prune, even though that leaves
+	 * invalid data in prunestate, because that data won't be consulted again
+	 * (cf initial Assert in ExecFindMatchingSubPlans).
+	 */
+	if (*prunestate && (*prunestate)->do_exec_prune &&
+		bms_num_members(validsubplans) < n_total_subplans)
+		ExecPartitionPruneFixSubPlanIndexes(*prunestate, validsubplans,
+											n_total_subplans);
+
+	return validsubplans;
+}
+
+/*
+ * ExecPrepDoInitialPruning
+ *		Perform initial pruning as part of doing ExecPrepNode() on the parent
+ *		plan node
+ */
+Bitmapset *
+ExecPrepDoInitialPruning(PartitionPruneInfo *pruneinfo,
+						 List *rtable, ParamListInfo params,
+						 Bitmapset **parentrelids)
+{
+	ExprContext	 *econtext;
+	PartitionDirectory pdir;
+	MemoryContext oldcontext,
+				  tmpcontext;
+	PartitionPruneState *prunestate;
+	Bitmapset	 *validsubplans;
+
+	/*
+	 * A temporary context to allocate stuff needed to run
+	 * the pruning steps.
+	 */
+	tmpcontext = AllocSetContextCreate(CurrentMemoryContext,
+									   "initial pruning working data",
+									   ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(tmpcontext);
+
+	/* An ExprContext to evaluate expressions. */
+	econtext = CreateStandaloneExprContext();
+	econtext->ecxt_param_list_info = params;
+
+	/*
+	 * A PartitionDirectory to look up partition descriptors.
+	 * It omits detached partitions, just like in the executor
+	 * proper.
+	 */
+	pdir = CreatePartitionDirectory(CurrentMemoryContext, false);
+	prunestate = ExecCreatePartitionPruneState(NULL, pruneinfo,
+											   true, false,
+											   rtable, econtext,
+											   pdir);
+	MemoryContextSwitchTo(oldcontext);
+
+	/* Do the "initial" pruning. */
+	validsubplans =
+		ExecFindInitialMatchingSubPlans(prunestate,
+										pruneinfo,
+										parentrelids);
+
+	FreeExprContext(econtext, true);
+	DestroyPartitionDirectory(pdir);
+	MemoryContextDelete(tmpcontext);
+
+	return validsubplans;
+}
+
+/*
+ * ExecPartitionPruneFixSubPlanIndexes
+ *		Fix mapping of partition indexes to subplan indexes contained in
+ *		prunestate by considering the new list of subplans that survived
+ *		initial pruning
+ *
+ * Subplans were previously indexed 0..(n_total_subplans - 1), but should
+ * now be indexed 0..(num(initially_valid_subplans) - 1).
+ */
+static void
+ExecPartitionPruneFixSubPlanIndexes(PartitionPruneState *prunestate,
+									Bitmapset *initially_valid_subplans,
+									int n_total_subplans)
+{
+	int		   *new_subplan_indexes;
+	Bitmapset  *new_other_subplans;
+	int			i;
+	int			newidx;
+
+	/*
+	 * First we must build a temporary array which maps old subplan
+	 * indexes to new ones.  For convenience of initialization, we use
+	 * 1-based indexes in this array and leave pruned items as 0.
+	 */
+	new_subplan_indexes = (int *) palloc0(sizeof(int) * n_total_subplans);
+	newidx = 1;
+	i = -1;
+	while ((i = bms_next_member(initially_valid_subplans, i)) >= 0)
+	{
+		Assert(i < n_total_subplans);
+		new_subplan_indexes[i] = newidx++;
+	}
+
+	/*
+	 * Now we can update each PartitionedRelPruneInfo's subplan_map with
+	 * new subplan indexes.  We must also recompute its present_parts
+	 * bitmap.
+	 */
+	for (i = 0; i < prunestate->num_partprunedata; i++)
+	{
+		PartitionPruningData *prunedata = prunestate->partprunedata[i];
+		int			j;
+
+		/*
+		 * Within each hierarchy, we perform this loop in back-to-front
+		 * order so that we determine present_parts for the lowest-level
+		 * partitioned tables first.  This way we can tell whether a
+		 * sub-partitioned table's partitions were entirely pruned so we
+		 * can exclude it from the current level's present_parts.
+		 */
+		for (j = prunedata->num_partrelprunedata - 1; j >= 0; j--)
+		{
+			PartitionedRelPruningData *pprune = &prunedata->partrelprunedata[j];
+			int			nparts = pprune->nparts;
+			int			k;
+
+			/* We just rebuild present_parts from scratch */
+			bms_free(pprune->present_parts);
+			pprune->present_parts = NULL;
+
+			for (k = 0; k < nparts; k++)
+			{
+				int			oldidx = pprune->subplan_map[k];
+				int			subidx;
+
+				/*
+				 * If this partition existed as a subplan then change the
+				 * old subplan index to the new subplan index.  The new
+				 * index may become -1 if the partition was pruned above,
+				 * or it may just come earlier in the subplan list due to
+				 * some subplans being removed earlier in the list.  If
+				 * it's a subpartition, add it to present_parts unless
+				 * it's entirely pruned.
+				 */
+				if (oldidx >= 0)
+				{
+					Assert(oldidx < n_total_subplans);
+					pprune->subplan_map[k] = new_subplan_indexes[oldidx] - 1;
+
+					if (new_subplan_indexes[oldidx] > 0)
+						pprune->present_parts =
+							bms_add_member(pprune->present_parts, k);
+				}
+				else if ((subidx = pprune->subpart_map[k]) >= 0)
+				{
+					PartitionedRelPruningData *subprune;
+
+					subprune = &prunedata->partrelprunedata[subidx];
+
+					if (!bms_is_empty(subprune->present_parts))
+						pprune->present_parts =
+							bms_add_member(pprune->present_parts, k);
+				}
+			}
+		}
+	}
+
+	/*
+	 * We must also recompute the other_subplans set, since indexes in it
+	 * may change.
+	 */
+	new_other_subplans = NULL;
+	i = -1;
+	while ((i = bms_next_member(prunestate->other_subplans, i)) >= 0)
+		new_other_subplans = bms_add_member(new_other_subplans,
+											new_subplan_indexes[i] - 1);
+
+	bms_free(prunestate->other_subplans);
+	prunestate->other_subplans = new_other_subplans;
+
+	pfree(new_subplan_indexes);
+}
+
 /*
  * ExecFindInitialMatchingSubPlans
  *		Identify the set of subplans that cannot be eliminated by initial
@@ -1817,10 +2124,14 @@ ExecInitPruningContext(PartitionPruneContext *context,
  * Must only be called once per 'prunestate', and only if initial pruning
  * is required.
  *
- * 'nsubplans' must be passed as the total number of unpruned subplans.
+ * The RT indexes of unpruned parents are returned in *parentrelids if asked
+ * for by the caller, in which case 'pruneinfo' must also be passed because
+ * that is where the RT indexes are to be found.
  */
 Bitmapset *
-ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate, int nsubplans)
+ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate,
+								PartitionPruneInfo *pruneinfo,
+								Bitmapset **parentrelids)
 {
 	Bitmapset  *result = NULL;
 	MemoryContext oldcontext;
@@ -1830,11 +2141,14 @@ ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate, int nsubplans)
 	Assert(prunestate->do_initial_prune);
 
 	/*
-	 * Switch to a temp context to avoid leaking memory in the executor's
-	 * query-lifespan memory context.
+	 * Switch to a temp context to avoid leaking memory into the caller's
+	 * longer-lived memory context.
 	 */
 	oldcontext = MemoryContextSwitchTo(prunestate->prune_context);
 
+	if (parentrelids)
+		*parentrelids = NULL;
+
 	/*
 	 * For each hierarchy, do the pruning tests, and add nondeletable
 	 * subplans' indexes to "result".
@@ -1845,14 +2159,42 @@ ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate, int nsubplans)
 		PartitionedRelPruningData *pprune;
 
 		prunedata = prunestate->partprunedata[i];
+
+		/*
+		 * We pass the 1st item belonging to the root table of the hierarchy
+		 * and find_matching_subplans_recurse() takes care of recursing to
+		 * other (lower-level) parents as needed.
+		 */
 		pprune = &prunedata->partrelprunedata[0];
 
 		/* Perform pruning without using PARAM_EXEC Params */
 		find_matching_subplans_recurse(prunedata, pprune, true, &result);
 
-		/* Expression eval may have used space in node's ps_ExprContext too */
+		/*
+		 * Collect the RT indexes of surviving parents if the caller asked
+		 * to see them.
+		 */
+		if (parentrelids)
+		{
+			int		j;
+			List   *partrelpruneinfos = list_nth_node(List,
+													  pruneinfo->prune_infos,
+													  i);
+
+			for (j = 0; j < prunedata->num_partrelprunedata; j++)
+			{
+				PartitionedRelPruneInfo *pinfo = list_nth_node(PartitionedRelPruneInfo,
+															   partrelpruneinfos, j);
+
+				pprune = &prunedata->partrelprunedata[j];
+				if (!bms_is_empty(pprune->present_parts))
+					*parentrelids = bms_add_member(*parentrelids, pinfo->rtindex);
+			}
+		}
+
+		/* Expression eval may have used space in ExprContext too */
 		if (pprune->initial_pruning_steps)
-			ResetExprContext(pprune->initial_context.planstate->ps_ExprContext);
+			ResetExprContext(pprune->initial_context.exprcontext);
 	}
 
 	/* Add in any subplans that partition pruning didn't account for */
@@ -1862,120 +2204,11 @@ ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate, int nsubplans)
 
 	/* Copy result out of the temp context before we reset it */
 	result = bms_copy(result);
+	if (parentrelids)
+		*parentrelids = bms_copy(*parentrelids);
 
 	MemoryContextReset(prunestate->prune_context);
 
-	/*
-	 * If exec-time pruning is required and we pruned subplans above, then we
-	 * must re-sequence the subplan indexes so that ExecFindMatchingSubPlans
-	 * properly returns the indexes from the subplans which will remain after
-	 * execution of this function.
-	 *
-	 * We can safely skip this when !do_exec_prune, even though that leaves
-	 * invalid data in prunestate, because that data won't be consulted again
-	 * (cf initial Assert in ExecFindMatchingSubPlans).
-	 */
-	if (prunestate->do_exec_prune && bms_num_members(result) < nsubplans)
-	{
-		int		   *new_subplan_indexes;
-		Bitmapset  *new_other_subplans;
-		int			i;
-		int			newidx;
-
-		/*
-		 * First we must build a temporary array which maps old subplan
-		 * indexes to new ones.  For convenience of initialization, we use
-		 * 1-based indexes in this array and leave pruned items as 0.
-		 */
-		new_subplan_indexes = (int *) palloc0(sizeof(int) * nsubplans);
-		newidx = 1;
-		i = -1;
-		while ((i = bms_next_member(result, i)) >= 0)
-		{
-			Assert(i < nsubplans);
-			new_subplan_indexes[i] = newidx++;
-		}
-
-		/*
-		 * Now we can update each PartitionedRelPruneInfo's subplan_map with
-		 * new subplan indexes.  We must also recompute its present_parts
-		 * bitmap.
-		 */
-		for (i = 0; i < prunestate->num_partprunedata; i++)
-		{
-			PartitionPruningData *prunedata = prunestate->partprunedata[i];
-			int			j;
-
-			/*
-			 * Within each hierarchy, we perform this loop in back-to-front
-			 * order so that we determine present_parts for the lowest-level
-			 * partitioned tables first.  This way we can tell whether a
-			 * sub-partitioned table's partitions were entirely pruned so we
-			 * can exclude it from the current level's present_parts.
-			 */
-			for (j = prunedata->num_partrelprunedata - 1; j >= 0; j--)
-			{
-				PartitionedRelPruningData *pprune = &prunedata->partrelprunedata[j];
-				int			nparts = pprune->nparts;
-				int			k;
-
-				/* We just rebuild present_parts from scratch */
-				bms_free(pprune->present_parts);
-				pprune->present_parts = NULL;
-
-				for (k = 0; k < nparts; k++)
-				{
-					int			oldidx = pprune->subplan_map[k];
-					int			subidx;
-
-					/*
-					 * If this partition existed as a subplan then change the
-					 * old subplan index to the new subplan index.  The new
-					 * index may become -1 if the partition was pruned above,
-					 * or it may just come earlier in the subplan list due to
-					 * some subplans being removed earlier in the list.  If
-					 * it's a subpartition, add it to present_parts unless
-					 * it's entirely pruned.
-					 */
-					if (oldidx >= 0)
-					{
-						Assert(oldidx < nsubplans);
-						pprune->subplan_map[k] = new_subplan_indexes[oldidx] - 1;
-
-						if (new_subplan_indexes[oldidx] > 0)
-							pprune->present_parts =
-								bms_add_member(pprune->present_parts, k);
-					}
-					else if ((subidx = pprune->subpart_map[k]) >= 0)
-					{
-						PartitionedRelPruningData *subprune;
-
-						subprune = &prunedata->partrelprunedata[subidx];
-
-						if (!bms_is_empty(subprune->present_parts))
-							pprune->present_parts =
-								bms_add_member(pprune->present_parts, k);
-					}
-				}
-			}
-		}
-
-		/*
-		 * We must also recompute the other_subplans set, since indexes in it
-		 * may change.
-		 */
-		new_other_subplans = NULL;
-		i = -1;
-		while ((i = bms_next_member(prunestate->other_subplans, i)) >= 0)
-			new_other_subplans = bms_add_member(new_other_subplans,
-												new_subplan_indexes[i] - 1);
-
-		bms_free(prunestate->other_subplans);
-		prunestate->other_subplans = new_other_subplans;
-
-		pfree(new_subplan_indexes);
-	}
-
 	return result;
 }
 
@@ -2018,11 +2251,16 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate)
 		prunedata = prunestate->partprunedata[i];
 		pprune = &prunedata->partrelprunedata[0];
 
+		/*
+		 * We pass the first item, which belongs to the root table of the
+		 * hierarchy; find_matching_subplans_recurse() takes care of recursing
+		 * to other (lower-level) parents as needed.
+		 */
 		find_matching_subplans_recurse(prunedata, pprune, false, &result);
 
-		/* Expression eval may have used space in node's ps_ExprContext too */
+		/* Expression eval may have used space in ExprContext too */
 		if (pprune->exec_pruning_steps)
-			ResetExprContext(pprune->exec_context.planstate->ps_ExprContext);
+			ResetExprContext(pprune->exec_context.exprcontext);
 	}
 
 	/* Add in any subplans that partition pruning didn't account for */
diff --git a/src/backend/executor/execProcnode.c b/src/backend/executor/execProcnode.c
index b5667e53e5..d5e10756ac 100644
--- a/src/backend/executor/execProcnode.c
+++ b/src/backend/executor/execProcnode.c
@@ -123,6 +123,209 @@ static TupleTableSlot *ExecProcNodeFirst(PlanState *node);
 static TupleTableSlot *ExecProcNodeInstr(PlanState *node);
 
 
+/* ------------------------------------------------------------------------
+ * ExecPrepNode
+ *		Recursively "prep" all the nodes in the plan tree rooted
+ *		at 'node'.
+ *
+ *		'node' is the current node of the plan produced by the query planner
+ *		'context' is the information that may be necessary to do the prep
+ *			work (such as any EXTERN parameters in the query to do partition
+ *			pruning with)
+ *		'result' is the output variable to add the result into
+ *
+ * NOTE: The ExecPrepNode subroutine for a given node must add the RT indexes
+ * of any relations that it manipulates to result->relationRTIs.  Optionally,
+ * it can produce a PlanPrepOutput node containing information that may be of
+ * interest to later execution steps, or to any intervening modules that have
+ * access to the ExecPrepOutput, and store it in
+ * result->planPrepResults[plan->plan_node_id].  For example, nodes that
+ * support partition pruning can perform the "initial" pruning steps to
+ * produce the set of "initially valid" subnodes, which the node's ExecInit*
+ * routine can use as-is to initialize only those subnodes.
+ * ------------------------------------------------------------------------
+ */
+void
+ExecPrepNode(Plan *node, ExecPrepContext *context, ExecPrepOutput *result)
+{
+	ListCell *l;
+
+	/* Do nothing when we get to the end of a leaf on tree. */
+	if (node == NULL)
+		return;
+
+	/* Make sure there's enough stack available. */
+	check_stack_depth();
+
+	/*
+	 * Write NULL for the node's PlanPrepOutput, which the node's Prep routine
+	 * might overwrite.
+	 */
+	ExecPrepStorePlanPrepOutput(result, NULL, node);
+
+	switch (nodeTag(node))
+	{
+			/*
+			 * control nodes
+			 */
+		case T_Result:
+			ExecPrepResult((Result *) node, context, result);
+			break;
+		case T_ProjectSet:
+			ExecPrepProjectSet((ProjectSet *) node, context, result);
+			break;
+		case T_RecursiveUnion:
+			ExecPrepRecursiveUnion((RecursiveUnion *) node, context, result);
+			break;
+		case T_BitmapAnd:
+			ExecPrepBitmapAnd((BitmapAnd *) node, context, result);
+			break;
+		case T_BitmapOr:
+			ExecPrepBitmapOr((BitmapOr *) node, context, result);
+			break;
+		case T_ModifyTable:
+			ExecPrepModifyTable((ModifyTable *) node, context, result);
+			break;
+		case T_Append:
+			ExecPrepAppend((Append *) node, context, result);
+			break;
+		case T_MergeAppend:
+			ExecPrepMergeAppend((MergeAppend *) node, context, result);
+			break;
+
+			/*
+			 * scan nodes
+			 */
+		case T_SeqScan:
+			ExecPrepSeqScan((SeqScan *) node, context, result);
+			break;
+		case T_SampleScan:
+			ExecPrepSampleScan((SampleScan *) node, context, result);
+			break;
+		case T_IndexScan:
+			ExecPrepIndexScan((IndexScan *) node, context, result);
+			break;
+		case T_IndexOnlyScan:
+			ExecPrepIndexOnlyScan((IndexOnlyScan *) node, context, result);
+			break;
+		case T_BitmapIndexScan:
+			ExecPrepBitmapIndexScan((BitmapIndexScan *) node, context, result);
+			break;
+		case T_BitmapHeapScan:
+			ExecPrepBitmapHeapScan((BitmapHeapScan *) node, context, result);
+			break;
+		case T_TidScan:
+			ExecPrepTidScan((TidScan *) node, context, result);
+			break;
+		case T_TidRangeScan:
+			ExecPrepTidRangeScan((TidRangeScan *) node, context, result);
+			break;
+		case T_SubqueryScan:
+			ExecPrepSubqueryScan((SubqueryScan *) node, context, result);
+			break;
+		case T_FunctionScan:
+			ExecPrepFunctionScan((FunctionScan *) node, context, result);
+			break;
+		case T_TableFuncScan:
+			ExecPrepTableFuncScan((TableFuncScan *) node, context, result);
+			break;
+		case T_ValuesScan:
+			ExecPrepValuesScan((ValuesScan *) node, context, result);
+			break;
+		case T_CteScan:
+			ExecPrepCteScan((CteScan *) node, context, result);
+			break;
+		case T_NamedTuplestoreScan:
+			ExecPrepNamedTuplestoreScan((NamedTuplestoreScan *) node, context, result);
+			break;
+		case T_WorkTableScan:
+			ExecPrepWorkTableScan((WorkTableScan *) node, context, result);
+			break;
+		case T_ForeignScan:
+			ExecPrepForeignScan((ForeignScan *) node, context, result);
+			break;
+		case T_CustomScan:
+			ExecPrepCustomScan((CustomScan *) node, context, result);
+			break;
+
+			/*
+			 * join nodes: each Prep routine recurses to its subnodes
+			 */
+		case T_NestLoop:
+			ExecPrepNestLoop((NestLoop *) node, context, result);
+			break;
+		case T_MergeJoin:
+			ExecPrepMergeJoin((MergeJoin *) node, context, result);
+			break;
+		case T_HashJoin:
+			ExecPrepHashJoin((HashJoin *) node, context, result);
+			break;
+
+			/*
+			 * materialization nodes: each Prep routine recurses to its subnode
+			 */
+		case T_Material:
+			ExecPrepMaterial((Material *) node, context, result);
+			break;
+		case T_Sort:
+			ExecPrepSort((Sort *) node, context, result);
+			break;
+		case T_IncrementalSort:
+			ExecPrepIncrementalSort((IncrementalSort *) node, context, result);
+			break;
+		case T_Memoize:
+			ExecPrepMemoize((Memoize *) node, context, result);
+			break;
+		case T_Group:
+			ExecPrepGroup((Group *) node, context, result);
+			break;
+		case T_Agg:
+			ExecPrepAgg((Agg *) node, context, result);
+			break;
+		case T_WindowAgg:
+			ExecPrepWindowAgg((WindowAgg *) node, context, result);
+			break;
+		case T_Unique:
+			ExecPrepUnique((Unique *) node, context, result);
+			break;
+		case T_Gather:
+			ExecPrepGather((Gather *) node, context, result);
+			break;
+		case T_GatherMerge:
+			ExecPrepGatherMerge((GatherMerge *) node, context, result);
+			break;
+		case T_Hash:
+			ExecPrepHash((Hash *) node, context, result);
+			break;
+		case T_SetOp:
+			ExecPrepSetOp((SetOp *) node, context, result);
+			break;
+		case T_LockRows:
+			ExecPrepLockRows((LockRows *) node, context, result);
+			break;
+		case T_Limit:
+			ExecPrepLimit((Limit *) node, context, result);
+			break;
+
+		default:
+			elog(ERROR, "unrecognized node type: %d", (int) nodeTag(node));
+			result = NULL;		/* keep compiler quiet */
+			break;
+	}
+
+	/*
+	 * Prep any initPlans present in this node.  The planner put them in
+	 * a separate list for us.
+	 */
+	foreach(l, node->initPlan)
+	{
+		SubPlan    *subplan = (SubPlan *) lfirst(l);
+
+		Assert(IsA(subplan, SubPlan));
+		ExecPrepSubPlan(subplan, context, result);
+	}
+}
+
 /* ------------------------------------------------------------------------
  *		ExecInitNode
  *
@@ -157,6 +360,9 @@ ExecInitNode(Plan *node, EState *estate, int eflags)
 	 */
 	check_stack_depth();
 
+	/* Check that the node's PlanPrepOutput, if any, looks sane. */
+	EXEC_PREP_OUTPUT_SANITY(node, estate);
+
 	switch (nodeTag(node))
 	{
 			/*
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 9df1f81ea8..5c85148b37 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -119,6 +119,7 @@ CreateExecutorState(void)
 	estate->es_relations = NULL;
 	estate->es_rowmarks = NULL;
 	estate->es_plannedstmt = NULL;
+	estate->es_execprep = NULL;
 
 	estate->es_junkFilter = NULL;
 
@@ -785,6 +786,13 @@ ExecGetRangeTableRelation(EState *estate, Index rti)
 
 	Assert(rti > 0 && rti <= estate->es_range_table_size);
 
+	/*
+	 * Cross-check that AcquireExecutorLocks() hasn't missed any relation
+	 * that it should have locked.
+	 */
+	Assert(estate->es_execprep == NULL ||
+		   bms_is_member(rti, estate->es_execprep->relationRTIs));
+
 	rel = estate->es_relations[rti - 1];
 	if (rel == NULL)
 	{
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index 29a68879ee..5f0ff2df2a 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -844,7 +844,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
 	else
 		dest = None_Receiver;
 
-	es->qd = CreateQueryDesc(es->stmt,
+	es->qd = CreateQueryDesc(es->stmt, NULL,
 							 fcache->src,
 							 GetActiveSnapshot(),
 							 InvalidSnapshot,
diff --git a/src/backend/executor/nodeAgg.c b/src/backend/executor/nodeAgg.c
index 08cf569d8f..f3b0ec75d3 100644
--- a/src/backend/executor/nodeAgg.c
+++ b/src/backend/executor/nodeAgg.c
@@ -3142,6 +3142,19 @@ hashagg_reset_spill_state(AggState *aggstate)
 	}
 }
 
+/* ----------------------------------------------------------------
+ *		ExecPrepAgg
+ *
+ *		This "preps" the Agg node and the node's subplan.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepAgg(Agg *node, ExecPrepContext *context, ExecPrepOutput *result)
+{
+	/* Nothing to do besides recursing to the subplan. */
+	ExecPrepNode(outerPlan(node), context, result);
+}
+
 
 /* -----------------
  * ExecInitAgg
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 7937f1c88f..a44c8079bd 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -62,6 +62,7 @@
 #include "executor/execPartition.h"
 #include "executor/nodeAppend.h"
 #include "miscadmin.h"
+#include "partitioning/partdesc.h"
 #include "pgstat.h"
 #include "storage/latch.h"
 
@@ -94,6 +95,62 @@ static bool ExecAppendAsyncRequest(AppendState *node, TupleTableSlot **result);
 static void ExecAppendAsyncEventWait(AppendState *node);
 static void classify_matching_subplans(AppendState *node);
 
+/* ----------------------------------------------------------------
+ *		ExecPrepAppend
+ *
+ *		Prep an Append node
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepAppend(Append *node, ExecPrepContext *context,
+			   ExecPrepOutput *result)
+{
+	PartitionPruneInfo *pruneinfo = node->part_prune_info;
+
+	if (pruneinfo && pruneinfo->contains_init_steps)
+	{
+		List	   *rtable = context->stmt->rtable;
+		List	   *subplans = node->appendplans;
+		ParamListInfo params = context->params;
+		Bitmapset  *parentrelids;
+		int			i;
+		PlanPrepOutput *planPrepResult = makeNode(PlanPrepOutput);
+
+		planPrepResult->plan_node_id = node->plan.plan_node_id;
+		planPrepResult->initially_valid_subnodes =
+			ExecPrepDoInitialPruning(pruneinfo, rtable, params, &parentrelids);
+		/* Replace the NULL that ExecPrepNode() would've written. */
+		ExecPrepStorePlanPrepOutput(result, planPrepResult, &node->plan);
+
+		/* All relevant parents must be reported too. */
+		Assert(bms_num_members(parentrelids) > 0);
+		result->relationRTIs = bms_add_members(result->relationRTIs,
+											   parentrelids);
+
+		/* And all leaf partitions that will be scanned. */
+		i = -1;
+		while ((i = bms_next_member(planPrepResult->initially_valid_subnodes, i)) >= 0)
+		{
+			Plan   *subplan = list_nth(subplans, i);
+
+			ExecPrepNode(subplan, context, result);
+		}
+	}
+	else
+	{
+		List	   *subplans = node->appendplans;
+		ListCell   *lc;
+
+		/* Recurse to prep *all* of the node's child subplans. */
+		foreach(lc, subplans)
+		{
+			Plan *subplan = (Plan *) lfirst(lc);
+
+			ExecPrepNode(subplan, context, result);
+		}
+	}
+}
+
 /* ----------------------------------------------------------------
  *		ExecInitAppend
  *
@@ -136,39 +193,19 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 	/* If run-time partition pruning is enabled, then set that up now */
 	if (node->part_prune_info != NULL)
 	{
-		PartitionPruneState *prunestate;
-
-		/* We may need an expression context to evaluate partition exprs */
-		ExecAssignExprContext(estate, &appendstate->ps);
-
-		/* Create the working data structure for pruning. */
-		prunestate = ExecCreatePartitionPruneState(&appendstate->ps,
-												   node->part_prune_info);
-		appendstate->as_prune_state = prunestate;
-
-		/* Perform an initial partition prune, if required. */
-		if (prunestate->do_initial_prune)
-		{
-			/* Determine which subplans survive initial pruning */
-			validsubplans = ExecFindInitialMatchingSubPlans(prunestate,
-															list_length(node->appendplans));
-
-			nplans = bms_num_members(validsubplans);
-		}
-		else
-		{
-			/* We'll need to initialize all subplans */
-			nplans = list_length(node->appendplans);
-			Assert(nplans > 0);
-			validsubplans = bms_add_range(NULL, 0, nplans - 1);
-		}
+		validsubplans = ExecInitPartitionPruning(&appendstate->ps,
+												 list_length(node->appendplans),
+												 node->part_prune_info,
+												 &appendstate->as_prune_state);
+		nplans = bms_num_members(validsubplans);
 
 		/*
 		 * When no run-time pruning is required and there's at least one
 		 * subplan, we can fill as_valid_subplans immediately, preventing
 		 * later calls to ExecFindMatchingSubPlans.
 		 */
-		if (!prunestate->do_exec_prune && nplans > 0)
+		if (appendstate->as_prune_state == NULL ||
+			(!appendstate->as_prune_state->do_exec_prune && nplans > 0))
 			appendstate->as_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
 	}
 	else
diff --git a/src/backend/executor/nodeBitmapAnd.c b/src/backend/executor/nodeBitmapAnd.c
index b54c79f853..4ad3e5ff81 100644
--- a/src/backend/executor/nodeBitmapAnd.c
+++ b/src/backend/executor/nodeBitmapAnd.c
@@ -45,6 +45,24 @@ ExecBitmapAnd(PlanState *pstate)
 	return NULL;
 }
 
+/* ----------------------------------------------------------------
+ *		ExecPrepBitmapAnd
+ *
+ *		This "preps" the BitmapAnd node and the subplans.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepBitmapAnd(BitmapAnd *node, ExecPrepContext *context,
+				  ExecPrepOutput *result)
+{
+	ListCell *lc;
+
+	foreach(lc, node->bitmapplans)
+	{
+		ExecPrepNode((Plan *) lfirst(lc), context, result);
+	}
+}
+
 /* ----------------------------------------------------------------
  *		ExecInitBitmapAnd
  *
diff --git a/src/backend/executor/nodeBitmapHeapscan.c b/src/backend/executor/nodeBitmapHeapscan.c
index f6fe07ad70..aaf215a4cc 100644
--- a/src/backend/executor/nodeBitmapHeapscan.c
+++ b/src/backend/executor/nodeBitmapHeapscan.c
@@ -696,6 +696,20 @@ ExecEndBitmapHeapScan(BitmapHeapScanState *node)
 	table_endscan(scanDesc);
 }
 
+/* ----------------------------------------------------------------
+ *		ExecPrepBitmapHeapScan
+ *
+ *		This "preps" the BitmapHeapScan node.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepBitmapHeapScan(BitmapHeapScan *node, ExecPrepContext *context,
+					   ExecPrepOutput *result)
+{
+	result->relationRTIs = bms_add_member(result->relationRTIs,
+										  node->scan.scanrelid);
+}
+
 /* ----------------------------------------------------------------
  *		ExecInitBitmapHeapScan
  *
diff --git a/src/backend/executor/nodeBitmapIndexscan.c b/src/backend/executor/nodeBitmapIndexscan.c
index 551e47630d..bb766f71a2 100644
--- a/src/backend/executor/nodeBitmapIndexscan.c
+++ b/src/backend/executor/nodeBitmapIndexscan.c
@@ -201,6 +201,20 @@ ExecEndBitmapIndexScan(BitmapIndexScanState *node)
 		index_close(indexRelationDesc, NoLock);
 }
 
+/* ----------------------------------------------------------------
+ *		ExecPrepBitmapIndexScan
+ *
+ *		This "preps" the BitmapIndexScan node.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepBitmapIndexScan(BitmapIndexScan *node, ExecPrepContext *context,
+						ExecPrepOutput *result)
+{
+	result->relationRTIs = bms_add_member(result->relationRTIs,
+										  node->scan.scanrelid);
+}
+
 /* ----------------------------------------------------------------
  *		ExecInitBitmapIndexScan
  *
diff --git a/src/backend/executor/nodeBitmapOr.c b/src/backend/executor/nodeBitmapOr.c
index 2d57f11fe7..feb3e4a8d6 100644
--- a/src/backend/executor/nodeBitmapOr.c
+++ b/src/backend/executor/nodeBitmapOr.c
@@ -46,6 +46,24 @@ ExecBitmapOr(PlanState *pstate)
 	return NULL;
 }
 
+/* ----------------------------------------------------------------
+ *		ExecPrepBitmapOr
+ *
+ *		This "preps" the BitmapOr node and the subplans.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepBitmapOr(BitmapOr *node, ExecPrepContext *context,
+				 ExecPrepOutput *result)
+{
+	ListCell *lc;
+
+	foreach(lc, node->bitmapplans)
+	{
+		ExecPrepNode((Plan *) lfirst(lc), context, result);
+	}
+}
+
 /* ----------------------------------------------------------------
  *		ExecInitBitmapOr
  *
diff --git a/src/backend/executor/nodeCtescan.c b/src/backend/executor/nodeCtescan.c
index b9d7dec8a2..533cfb7874 100644
--- a/src/backend/executor/nodeCtescan.c
+++ b/src/backend/executor/nodeCtescan.c
@@ -166,6 +166,18 @@ ExecCteScan(PlanState *pstate)
 					(ExecScanRecheckMtd) CteScanRecheck);
 }
 
+/* ----------------------------------------------------------------
+ *		ExecPrepCteScan
+ *
+ *		This "preps" the CteScan node.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepCteScan(CteScan *node, ExecPrepContext *context,
+				ExecPrepOutput *result)
+{
+	/* nothing to do */
+}
 
 /* ----------------------------------------------------------------
  *		ExecInitCteScan
diff --git a/src/backend/executor/nodeCustom.c b/src/backend/executor/nodeCustom.c
index 8f56bd8a23..0bf1636326 100644
--- a/src/backend/executor/nodeCustom.c
+++ b/src/backend/executor/nodeCustom.c
@@ -24,6 +24,24 @@
 
 static TupleTableSlot *ExecCustomScan(PlanState *pstate);
 
+/* ----------------------------------------------------------------
+ *		ExecPrepCustomScan
+ *
+ *		This "preps" the CustomScan node.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepCustomScan(CustomScan *node, ExecPrepContext *context, ExecPrepOutput *result)
+{
+	ListCell *lc;
+
+	result->relationRTIs = bms_add_members(result->relationRTIs,
+										   node->custom_relids);
+	foreach(lc, node->custom_plans)
+	{
+		ExecPrepNode((Plan *) lfirst(lc), context, result);
+	}
+}
 
 CustomScanState *
 ExecInitCustomScan(CustomScan *cscan, EState *estate, int eflags)
diff --git a/src/backend/executor/nodeForeignscan.c b/src/backend/executor/nodeForeignscan.c
index 5b9737c2ab..ffe17ec6d5 100644
--- a/src/backend/executor/nodeForeignscan.c
+++ b/src/backend/executor/nodeForeignscan.c
@@ -134,6 +134,18 @@ ExecForeignScan(PlanState *pstate)
 					(ExecScanRecheckMtd) ForeignRecheck);
 }
 
+/* ----------------------------------------------------------------
+ *		ExecPrepForeignScan
+ *
+ *		This "preps" the ForeignScan node.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepForeignScan(ForeignScan *node, ExecPrepContext *context, ExecPrepOutput *result)
+{
+	result->relationRTIs = bms_add_members(result->relationRTIs,
+										   node->fs_relids);
+}
 
 /* ----------------------------------------------------------------
  *		ExecInitForeignScan
diff --git a/src/backend/executor/nodeFunctionscan.c b/src/backend/executor/nodeFunctionscan.c
index 434379a5aa..df055ce01f 100644
--- a/src/backend/executor/nodeFunctionscan.c
+++ b/src/backend/executor/nodeFunctionscan.c
@@ -272,6 +272,19 @@ ExecFunctionScan(PlanState *pstate)
 					(ExecScanRecheckMtd) FunctionRecheck);
 }
 
+/* ----------------------------------------------------------------
+ *		ExecPrepFunctionScan
+ *
+ *		This "preps" the FunctionScan node.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepFunctionScan(FunctionScan *node, ExecPrepContext *context,
+					 ExecPrepOutput *result)
+{
+	/* nothing to do */
+}
+
 /* ----------------------------------------------------------------
  *		ExecInitFunctionScan
  * ----------------------------------------------------------------
diff --git a/src/backend/executor/nodeGather.c b/src/backend/executor/nodeGather.c
index 4f8a17df7d..0edb0ae13a 100644
--- a/src/backend/executor/nodeGather.c
+++ b/src/backend/executor/nodeGather.c
@@ -49,6 +49,19 @@ static TupleTableSlot *gather_getnext(GatherState *gatherstate);
 static MinimalTuple gather_readnext(GatherState *gatherstate);
 static void ExecShutdownGatherWorkers(GatherState *node);
 
+/* ----------------------------------------------------------------
+ *		ExecPrepGather
+ *
+ *		This "preps" the Gather node and the node's subplan.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepGather(Gather *node, ExecPrepContext *context, ExecPrepOutput *result)
+{
+	/* Nothing to do besides recursing to the subplan. */
+	ExecPrepNode(outerPlan(node), context, result);
+}
+
 
 /* ----------------------------------------------------------------
  *		ExecInitGather
diff --git a/src/backend/executor/nodeGatherMerge.c b/src/backend/executor/nodeGatherMerge.c
index a488cc6d8b..c564d4ac25 100644
--- a/src/backend/executor/nodeGatherMerge.c
+++ b/src/backend/executor/nodeGatherMerge.c
@@ -64,6 +64,19 @@ static bool gather_merge_readnext(GatherMergeState *gm_state, int reader,
 								  bool nowait);
 static void load_tuple_array(GatherMergeState *gm_state, int reader);
 
+/* ----------------------------------------------------------------
+ *		ExecPrepGatherMerge
+ *
+ *		This "preps" the GatherMerge node and the node's subplan.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepGatherMerge(GatherMerge *node, ExecPrepContext *context, ExecPrepOutput *result)
+{
+	/* Nothing to do besides recursing to the subplan. */
+	ExecPrepNode(outerPlan(node), context, result);
+}
+
 /* ----------------------------------------------------------------
  *		ExecInitGather
  * ----------------------------------------------------------------
diff --git a/src/backend/executor/nodeGroup.c b/src/backend/executor/nodeGroup.c
index 666d02b58f..0e5bcf89bf 100644
--- a/src/backend/executor/nodeGroup.c
+++ b/src/backend/executor/nodeGroup.c
@@ -151,6 +151,19 @@ ExecGroup(PlanState *pstate)
 	}
 }
 
+/* ----------------------------------------------------------------
+ *		ExecPrepGroup
+ *
+ *		This "preps" the Group node and the node's subplan.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepGroup(Group *node, ExecPrepContext *context, ExecPrepOutput *result)
+{
+	/* Nothing to do besides recursing to the subplan. */
+	ExecPrepNode(outerPlan(node), context, result);
+}
+
 /* -----------------
  * ExecInitGroup
  *
diff --git a/src/backend/executor/nodeHash.c b/src/backend/executor/nodeHash.c
index 4d68a8b97b..d20e14c7fc 100644
--- a/src/backend/executor/nodeHash.c
+++ b/src/backend/executor/nodeHash.c
@@ -344,6 +344,19 @@ MultiExecParallelHash(HashState *node)
 		   BarrierPhase(build_barrier) == PHJ_BUILD_DONE);
 }
 
+/* ----------------------------------------------------------------
+ *		ExecPrepHash
+ *
+ *		This "preps" the Hash node and the node's subplan.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepHash(Hash *node, ExecPrepContext *context, ExecPrepOutput *result)
+{
+	/* Nothing to do besides recursing to the subplan. */
+	ExecPrepNode(outerPlan(node), context, result);
+}
+
 /* ----------------------------------------------------------------
  *		ExecInitHash
  *
diff --git a/src/backend/executor/nodeHashjoin.c b/src/backend/executor/nodeHashjoin.c
index 88b870655e..5665c31873 100644
--- a/src/backend/executor/nodeHashjoin.c
+++ b/src/backend/executor/nodeHashjoin.c
@@ -607,6 +607,20 @@ ExecParallelHashJoin(PlanState *pstate)
 	return ExecHashJoinImpl(pstate, true);
 }
 
+/* ----------------------------------------------------------------
+ *		ExecPrepHashJoin
+ *
+ *		This "preps" the HashJoin node and the node's children.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepHashJoin(HashJoin *node, ExecPrepContext *context, ExecPrepOutput *result)
+{
+	/* Nothing to do besides recursing to the children. */
+	ExecPrepNode(outerPlan(node), context, result);
+	ExecPrepNode(innerPlan(node), context, result);
+}
+
 /* ----------------------------------------------------------------
  *		ExecInitHashJoin
  *
diff --git a/src/backend/executor/nodeIncrementalSort.c b/src/backend/executor/nodeIncrementalSort.c
index d6fb56dec7..c1c8fe2af6 100644
--- a/src/backend/executor/nodeIncrementalSort.c
+++ b/src/backend/executor/nodeIncrementalSort.c
@@ -964,6 +964,20 @@ ExecIncrementalSort(PlanState *pstate)
 	return slot;
 }
 
+/* ----------------------------------------------------------------
+ *		ExecPrepIncrementalSort
+ *
+ *		This "preps" the IncrementalSort node and the node's subplan.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepIncrementalSort(IncrementalSort *node, ExecPrepContext *context,
+						ExecPrepOutput *result)
+{
+	/* Nothing to do besides recursing to the subplan. */
+	ExecPrepNode(outerPlan(node), context, result);
+}
+
 /* ----------------------------------------------------------------
  *		ExecInitIncrementalSort
  *
diff --git a/src/backend/executor/nodeIndexonlyscan.c b/src/backend/executor/nodeIndexonlyscan.c
index eb3ddd2943..ccc60c38f5 100644
--- a/src/backend/executor/nodeIndexonlyscan.c
+++ b/src/backend/executor/nodeIndexonlyscan.c
@@ -476,6 +476,20 @@ ExecIndexOnlyRestrPos(IndexOnlyScanState *node)
 	index_restrpos(node->ioss_ScanDesc);
 }
 
+/* ----------------------------------------------------------------
+ *		ExecPrepIndexOnlyScan
+ *
+ *		This "preps" the IndexOnlyScan node.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepIndexOnlyScan(IndexOnlyScan *node, ExecPrepContext *context,
+					  ExecPrepOutput *result)
+{
+	result->relationRTIs = bms_add_member(result->relationRTIs,
+										  node->scan.scanrelid);
+}
+
 /* ----------------------------------------------------------------
  *		ExecInitIndexOnlyScan
  *
diff --git a/src/backend/executor/nodeIndexscan.c b/src/backend/executor/nodeIndexscan.c
index a91f135be7..5080abdd9d 100644
--- a/src/backend/executor/nodeIndexscan.c
+++ b/src/backend/executor/nodeIndexscan.c
@@ -885,6 +885,20 @@ ExecIndexRestrPos(IndexScanState *node)
 	index_restrpos(node->iss_ScanDesc);
 }
 
+/* ----------------------------------------------------------------
+ *		ExecPrepIndexScan
+ *
+ *		This "preps" the IndexScan node.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepIndexScan(IndexScan *node, ExecPrepContext *context,
+				  ExecPrepOutput *result)
+{
+	result->relationRTIs = bms_add_member(result->relationRTIs,
+										  node->scan.scanrelid);
+}
+
 /* ----------------------------------------------------------------
  *		ExecInitIndexScan
  *
diff --git a/src/backend/executor/nodeLimit.c b/src/backend/executor/nodeLimit.c
index 1b91b123fa..00aa5dd577 100644
--- a/src/backend/executor/nodeLimit.c
+++ b/src/backend/executor/nodeLimit.c
@@ -437,6 +437,19 @@ compute_tuples_needed(LimitState *node)
 	return node->count + node->offset;
 }
 
+/* ----------------------------------------------------------------
+ *		ExecPrepLimit
+ *
+ *		This "preps" the Limit node and the node's subplan.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepLimit(Limit *node, ExecPrepContext *context, ExecPrepOutput *result)
+{
+	/* Nothing to do besides recursing to the subplan. */
+	ExecPrepNode(outerPlan(node), context, result);
+}
+
 /* ----------------------------------------------------------------
  *		ExecInitLimit
  *
diff --git a/src/backend/executor/nodeLockRows.c b/src/backend/executor/nodeLockRows.c
index 1a9dab25dd..9a3d2c5583 100644
--- a/src/backend/executor/nodeLockRows.c
+++ b/src/backend/executor/nodeLockRows.c
@@ -281,6 +281,19 @@ lnext:
 	return slot;
 }
 
+/* ----------------------------------------------------------------
+ *		ExecPrepLockRows
+ *
+ *		This "preps" the LockRows node and the node's subplan.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepLockRows(LockRows *node, ExecPrepContext *context, ExecPrepOutput *result)
+{
+	/* Nothing to do besides recursing to the subplan. */
+	ExecPrepNode(outerPlan(node), context, result);
+}
+
 /* ----------------------------------------------------------------
  *		ExecInitLockRows
  *
diff --git a/src/backend/executor/nodeMaterial.c b/src/backend/executor/nodeMaterial.c
index 2cb27e0e9a..802bf37ff1 100644
--- a/src/backend/executor/nodeMaterial.c
+++ b/src/backend/executor/nodeMaterial.c
@@ -156,6 +156,19 @@ ExecMaterial(PlanState *pstate)
 	return ExecClearTuple(slot);
 }
 
+/* ----------------------------------------------------------------
+ *		ExecPrepMaterial
+ *
+ *		This "preps" the Material node and the node's subplan.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepMaterial(Material *node, ExecPrepContext *context, ExecPrepOutput *result)
+{
+	/* Nothing to do besides recursing to the subplan. */
+	ExecPrepNode(outerPlan(node), context, result);
+}
+
 /* ----------------------------------------------------------------
  *		ExecInitMaterial
  * ----------------------------------------------------------------
diff --git a/src/backend/executor/nodeMemoize.c b/src/backend/executor/nodeMemoize.c
index 55cdd5c4d9..eacfd5f3cb 100644
--- a/src/backend/executor/nodeMemoize.c
+++ b/src/backend/executor/nodeMemoize.c
@@ -902,6 +902,19 @@ ExecMemoize(PlanState *pstate)
 	}							/* switch */
 }
 
+/* ----------------------------------------------------------------
+ *		ExecPrepMemoize
+ *
+ *		This "preps" the Memoize node and the node's subplan.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepMemoize(Memoize *node, ExecPrepContext *context, ExecPrepOutput *result)
+{
+	/* Nothing to do besides recursing to the subplan. */
+	ExecPrepNode(outerPlan(node), context, result);
+}
+
 MemoizeState *
 ExecInitMemoize(Memoize *node, EState *estate, int eflags)
 {
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index 418f89dea8..50f6429533 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -43,6 +43,7 @@
 #include "executor/nodeMergeAppend.h"
 #include "lib/binaryheap.h"
 #include "miscadmin.h"
+#include "partitioning/partdesc.h"
 
 /*
  * We have one slot for each item in the heap array.  We use SlotNumber
@@ -54,6 +55,62 @@ typedef int32 SlotNumber;
 static TupleTableSlot *ExecMergeAppend(PlanState *pstate);
 static int	heap_compare_slots(Datum a, Datum b, void *arg);
 
+/* ----------------------------------------------------------------
+ *		ExecPrepMergeAppend
+ *
+ *		Prep a MergeAppend node
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepMergeAppend(MergeAppend *node, ExecPrepContext *context,
+					ExecPrepOutput *result)
+{
+	PartitionPruneInfo *pruneinfo = node->part_prune_info;
+
+	if (pruneinfo && pruneinfo->contains_init_steps)
+	{
+		List	   *rtable = context->stmt->rtable;
+		List	   *subplans = node->mergeplans;
+		ParamListInfo params = context->params;
+		Bitmapset  *parentrelids;
+		int			i;
+		PlanPrepOutput *planPrepResult = makeNode(PlanPrepOutput);
+
+		planPrepResult->plan_node_id = node->plan.plan_node_id;
+		planPrepResult->initially_valid_subnodes =
+			ExecPrepDoInitialPruning(pruneinfo, rtable, params, &parentrelids);
+		/* Replace the NULL that ExecPrepNode() would've written. */
+		ExecPrepStorePlanPrepOutput(result, planPrepResult, &node->plan);
+
+		/* All relevant parents must be reported too. */
+		Assert(bms_num_members(parentrelids) > 0);
+		result->relationRTIs = bms_add_members(result->relationRTIs,
+											   parentrelids);
+
+		/* And all leaf partitions that will be scanned. */
+		i = -1;
+		while ((i = bms_next_member(planPrepResult->initially_valid_subnodes, i)) >= 0)
+		{
+			Plan   *subplan = list_nth(subplans, i);
+
+			ExecPrepNode(subplan, context, result);
+		}
+	}
+	else
+	{
+		List	   *subplans = node->mergeplans;
+		ListCell   *lc;
+
+		/* Recurse to prep *all* of the node's child subplans. */
+		foreach(lc, subplans)
+		{
+			Plan *subplan = (Plan *) lfirst(lc);
+
+			ExecPrepNode(subplan, context, result);
+		}
+	}
+}
+
 
 /* ----------------------------------------------------------------
  *		ExecInitMergeAppend
@@ -84,38 +141,19 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
 	/* If run-time partition pruning is enabled, then set that up now */
 	if (node->part_prune_info != NULL)
 	{
-		PartitionPruneState *prunestate;
-
-		/* We may need an expression context to evaluate partition exprs */
-		ExecAssignExprContext(estate, &mergestate->ps);
-
-		prunestate = ExecCreatePartitionPruneState(&mergestate->ps,
-												   node->part_prune_info);
-		mergestate->ms_prune_state = prunestate;
-
-		/* Perform an initial partition prune, if required. */
-		if (prunestate->do_initial_prune)
-		{
-			/* Determine which subplans survive initial pruning */
-			validsubplans = ExecFindInitialMatchingSubPlans(prunestate,
-															list_length(node->mergeplans));
-
-			nplans = bms_num_members(validsubplans);
-		}
-		else
-		{
-			/* We'll need to initialize all subplans */
-			nplans = list_length(node->mergeplans);
-			Assert(nplans > 0);
-			validsubplans = bms_add_range(NULL, 0, nplans - 1);
-		}
+		validsubplans = ExecInitPartitionPruning(&mergestate->ps,
+												 list_length(node->mergeplans),
+												 node->part_prune_info,
+												 &mergestate->ms_prune_state);
+		nplans = bms_num_members(validsubplans);
 
 		/*
 		 * When no run-time pruning is required and there's at least one
 		 * subplan, we can fill as_valid_subplans immediately, preventing
 		 * later calls to ExecFindMatchingSubPlans.
 		 */
-		if (!prunestate->do_exec_prune && nplans > 0)
+		if (mergestate->ms_prune_state == NULL ||
+			(!mergestate->ms_prune_state->do_exec_prune && nplans > 0))
 			mergestate->ms_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
 	}
 	else
diff --git a/src/backend/executor/nodeMergejoin.c b/src/backend/executor/nodeMergejoin.c
index a049bc4ae0..12b1790c8a 100644
--- a/src/backend/executor/nodeMergejoin.c
+++ b/src/backend/executor/nodeMergejoin.c
@@ -1428,6 +1428,20 @@ ExecMergeJoin(PlanState *pstate)
 	}
 }
 
+/* ----------------------------------------------------------------
+ *		ExecPrepMergeJoin
+ *
+ *		This "preps" the MergeJoin node and the node's children.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepMergeJoin(MergeJoin *node, ExecPrepContext *context, ExecPrepOutput *result)
+{
+	/* Nothing to do besides recursing to the children. */
+	ExecPrepNode(outerPlan(node), context, result);
+	ExecPrepNode(innerPlan(node), context, result);
+}
+
 /* ----------------------------------------------------------------
  *		ExecInitMergeJoin
  * ----------------------------------------------------------------
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 5ec699a9bd..93a6ac062f 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -2700,6 +2700,32 @@ ExecLookupResultRelByOid(ModifyTableState *node, Oid resultoid,
 	return NULL;
 }
 
+/* ----------------------------------------------------------------
+ *		ExecPrepModifyTable
+ *
+ *		This "preps" the ModifyTable node and the subplan.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepModifyTable(ModifyTable *node, ExecPrepContext *context,
+					ExecPrepOutput *result)
+{
+	ListCell *lc;
+
+	if (node->rootRelation > 0)
+		result->relationRTIs = bms_add_member(result->relationRTIs,
+											  node->rootRelation);
+	result->relationRTIs = bms_add_member(result->relationRTIs,
+										  node->nominalRelation);
+	foreach(lc, node->resultRelations)
+	{
+		result->relationRTIs = bms_add_member(result->relationRTIs,
+											  lfirst_int(lc));
+	}
+
+	ExecPrepNode(outerPlan(node), context, result);
+}
+
 /* ----------------------------------------------------------------
  *		ExecInitModifyTable
  * ----------------------------------------------------------------
diff --git a/src/backend/executor/nodeNamedtuplestorescan.c b/src/backend/executor/nodeNamedtuplestorescan.c
index ca637b1b0e..5db23af93c 100644
--- a/src/backend/executor/nodeNamedtuplestorescan.c
+++ b/src/backend/executor/nodeNamedtuplestorescan.c
@@ -74,6 +74,19 @@ ExecNamedTuplestoreScan(PlanState *pstate)
 					(ExecScanRecheckMtd) NamedTuplestoreScanRecheck);
 }
 
+/* ----------------------------------------------------------------
+ *		ExecPrepNamedTuplestoreScan
+ *
+ *		This "preps" the NamedTuplestoreScan node.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepNamedTuplestoreScan(NamedTuplestoreScan *node,
+							ExecPrepContext *context,
+							ExecPrepOutput *result)
+{
+	/* nothing to do */
+}
 
 /* ----------------------------------------------------------------
  *		ExecInitNamedTuplestoreScan
diff --git a/src/backend/executor/nodeNestloop.c b/src/backend/executor/nodeNestloop.c
index 06767c3133..ffb3a94f07 100644
--- a/src/backend/executor/nodeNestloop.c
+++ b/src/backend/executor/nodeNestloop.c
@@ -255,6 +255,20 @@ ExecNestLoop(PlanState *pstate)
 	}
 }
 
+/* ----------------------------------------------------------------
+ *		ExecPrepNestLoop
+ *
+ *		This "preps" the NestLoop node and the node's children.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepNestLoop(NestLoop *node, ExecPrepContext *context, ExecPrepOutput *result)
+{
+	/* Nothing to do besides recursing to the children. */
+	ExecPrepNode(outerPlan(node), context, result);
+	ExecPrepNode(innerPlan(node), context, result);
+}
+
 /* ----------------------------------------------------------------
  *		ExecInitNestLoop
  * ----------------------------------------------------------------
diff --git a/src/backend/executor/nodeProjectSet.c b/src/backend/executor/nodeProjectSet.c
index ea40d61b0b..1d6085a3b4 100644
--- a/src/backend/executor/nodeProjectSet.c
+++ b/src/backend/executor/nodeProjectSet.c
@@ -208,6 +208,19 @@ ExecProjectSRF(ProjectSetState *node, bool continuing)
 	return NULL;
 }
 
+/* ----------------------------------------------------------------
+ *		ExecPrepProjectSet
+ *
+ *		This "preps" the ProjectSet node and the subplan.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepProjectSet(ProjectSet *node, ExecPrepContext *context,
+				   ExecPrepOutput *result)
+{
+	ExecPrepNode(outerPlan(node), context, result);
+}
+
 /* ----------------------------------------------------------------
  *		ExecInitProjectSet
  *
diff --git a/src/backend/executor/nodeRecursiveunion.c b/src/backend/executor/nodeRecursiveunion.c
index 2d01ed7711..806c653c56 100644
--- a/src/backend/executor/nodeRecursiveunion.c
+++ b/src/backend/executor/nodeRecursiveunion.c
@@ -159,6 +159,20 @@ ExecRecursiveUnion(PlanState *pstate)
 	return NULL;
 }
 
+/* ----------------------------------------------------------------
+ *		ExecPrepRecursiveUnion
+ *
+ *		This "preps" the RecursiveUnion node and the children.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepRecursiveUnion(RecursiveUnion *node, ExecPrepContext *context,
+					   ExecPrepOutput *result)
+{
+	ExecPrepNode(outerPlan(node), context, result);
+	ExecPrepNode(innerPlan(node), context, result);
+}
+
 /* ----------------------------------------------------------------
  *		ExecInitRecursiveUnion
  * ----------------------------------------------------------------
diff --git a/src/backend/executor/nodeResult.c b/src/backend/executor/nodeResult.c
index d0413e05de..14883b6764 100644
--- a/src/backend/executor/nodeResult.c
+++ b/src/backend/executor/nodeResult.c
@@ -169,6 +169,19 @@ ExecResultRestrPos(ResultState *node)
 		elog(ERROR, "Result nodes do not support mark/restore");
 }
 
+/* ----------------------------------------------------------------
+ *		ExecPrepResult
+ *
+ *		This "preps" the Result node and the subplan.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepResult(Result *node, ExecPrepContext *context,
+			   ExecPrepOutput *result)
+{
+	ExecPrepNode(outerPlan(node), context, result);
+}
+
 /* ----------------------------------------------------------------
  *		ExecInitResult
  *
diff --git a/src/backend/executor/nodeSamplescan.c b/src/backend/executor/nodeSamplescan.c
index a03ae120f8..ef4c0775f7 100644
--- a/src/backend/executor/nodeSamplescan.c
+++ b/src/backend/executor/nodeSamplescan.c
@@ -89,6 +89,20 @@ ExecSampleScan(PlanState *pstate)
 					(ExecScanRecheckMtd) SampleRecheck);
 }
 
+/* ----------------------------------------------------------------
+ *		ExecPrepSampleScan
+ *
+ *		This "preps" the SampleScan node.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepSampleScan(SampleScan *node, ExecPrepContext *context,
+				   ExecPrepOutput *result)
+{
+	result->relationRTIs = bms_add_member(result->relationRTIs,
+										  node->scan.scanrelid);
+}
+
 /* ----------------------------------------------------------------
  *		ExecInitSampleScan
  * ----------------------------------------------------------------
diff --git a/src/backend/executor/nodeSeqscan.c b/src/backend/executor/nodeSeqscan.c
index 7b58cd9162..8964c1e9b2 100644
--- a/src/backend/executor/nodeSeqscan.c
+++ b/src/backend/executor/nodeSeqscan.c
@@ -114,6 +114,19 @@ ExecSeqScan(PlanState *pstate)
 					(ExecScanRecheckMtd) SeqRecheck);
 }
 
+/* ----------------------------------------------------------------
+ *		ExecPrepSeqScan
+ *
+ *		This "preps" the SeqScan node.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepSeqScan(SeqScan *node, ExecPrepContext *context,
+				ExecPrepOutput *result)
+{
+	result->relationRTIs = bms_add_member(result->relationRTIs,
+										  node->scan.scanrelid);
+}
 
 /* ----------------------------------------------------------------
  *		ExecInitSeqScan
diff --git a/src/backend/executor/nodeSetOp.c b/src/backend/executor/nodeSetOp.c
index 4b428cfa39..312aa8511f 100644
--- a/src/backend/executor/nodeSetOp.c
+++ b/src/backend/executor/nodeSetOp.c
@@ -470,6 +470,19 @@ setop_retrieve_hash_table(SetOpState *setopstate)
 	return NULL;
 }
 
+/* ----------------------------------------------------------------
+ *		ExecPrepSetOp
+ *
+ *		This "preps" the SetOp node and the node's subplan.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepSetOp(SetOp *node, ExecPrepContext *context, ExecPrepOutput *result)
+{
+	/* Nothing to do besides recursing to the subplan. */
+	ExecPrepNode(outerPlan(node), context, result);
+}
+
 /* ----------------------------------------------------------------
  *		ExecInitSetOp
  *
diff --git a/src/backend/executor/nodeSort.c b/src/backend/executor/nodeSort.c
index 9481a622bf..c31f2634e8 100644
--- a/src/backend/executor/nodeSort.c
+++ b/src/backend/executor/nodeSort.c
@@ -203,6 +203,19 @@ ExecSort(PlanState *pstate)
 	return slot;
 }
 
+/* ----------------------------------------------------------------
+ *		ExecPrepSort
+ *
+ *		This "preps" the Sort node and the node's subplan.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepSort(Sort *node, ExecPrepContext *context, ExecPrepOutput *result)
+{
+	/* Nothing to do besides recursing to the subplan. */
+	ExecPrepNode(outerPlan(node), context, result);
+}
+
 /* ----------------------------------------------------------------
  *		ExecInitSort
  *
diff --git a/src/backend/executor/nodeSubplan.c b/src/backend/executor/nodeSubplan.c
index 60d2290030..b95084ddb2 100644
--- a/src/backend/executor/nodeSubplan.c
+++ b/src/backend/executor/nodeSubplan.c
@@ -775,6 +775,18 @@ slotNoNulls(TupleTableSlot *slot)
 	return true;
 }
 
+/* ----------------------------------------------------------------
+ *		ExecPrepSubPlan
+ *
+ *		This "preps" the SubPlan node; there is currently nothing to do.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepSubPlan(SubPlan *node, ExecPrepContext *context, ExecPrepOutput *result)
+{
+	/* nothing to do */
+}
+
 /* ----------------------------------------------------------------
  *		ExecInitSubPlan
  *
diff --git a/src/backend/executor/nodeSubqueryscan.c b/src/backend/executor/nodeSubqueryscan.c
index 242c9cd4b9..cc0d62ca85 100644
--- a/src/backend/executor/nodeSubqueryscan.c
+++ b/src/backend/executor/nodeSubqueryscan.c
@@ -89,6 +89,20 @@ ExecSubqueryScan(PlanState *pstate)
 					(ExecScanRecheckMtd) SubqueryRecheck);
 }
 
+/* ----------------------------------------------------------------
+ *		ExecPrepSubqueryScan
+ *
+ *		This "preps" the SubqueryScan node and the subplan.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepSubqueryScan(SubqueryScan *node, ExecPrepContext *context,
+					 ExecPrepOutput *result)
+{
+	/* Nothing to do besides recursing to the subplan. */
+	ExecPrepNode((Plan *) node->subplan, context, result);
+}
+
 /* ----------------------------------------------------------------
  *		ExecInitSubqueryScan
  * ----------------------------------------------------------------
diff --git a/src/backend/executor/nodeTableFuncscan.c b/src/backend/executor/nodeTableFuncscan.c
index 0db4ed0c2f..dccecb3916 100644
--- a/src/backend/executor/nodeTableFuncscan.c
+++ b/src/backend/executor/nodeTableFuncscan.c
@@ -83,6 +83,19 @@ TableFuncRecheck(TableFuncScanState *node, TupleTableSlot *slot)
 	return true;
 }
 
+/* ----------------------------------------------------------------
+ *		ExecPrepTableFuncScan
+ *
+ *		This "preps" the TableFuncScan node.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepTableFuncScan(TableFuncScan *node, ExecPrepContext *context,
+					  ExecPrepOutput *result)
+{
+	/* nothing to do */
+}
+
 /* ----------------------------------------------------------------
  *		ExecTableFuncScan(node)
  *
diff --git a/src/backend/executor/nodeTidrangescan.c b/src/backend/executor/nodeTidrangescan.c
index d5bf1be787..1c05ce8035 100644
--- a/src/backend/executor/nodeTidrangescan.c
+++ b/src/backend/executor/nodeTidrangescan.c
@@ -340,6 +340,20 @@ ExecEndTidRangeScan(TidRangeScanState *node)
 	ExecClearTuple(node->ss.ss_ScanTupleSlot);
 }
 
+/* ----------------------------------------------------------------
+ *		ExecPrepTidRangeScan
+ *
+ *		This "preps" the TidRangeScan node.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepTidRangeScan(TidRangeScan *node, ExecPrepContext *context,
+					 ExecPrepOutput *result)
+{
+	result->relationRTIs = bms_add_member(result->relationRTIs,
+										  node->scan.scanrelid);
+}
+
 /* ----------------------------------------------------------------
  *		ExecInitTidRangeScan
  *
diff --git a/src/backend/executor/nodeTidscan.c b/src/backend/executor/nodeTidscan.c
index 4116d1f3b5..6031ab52b6 100644
--- a/src/backend/executor/nodeTidscan.c
+++ b/src/backend/executor/nodeTidscan.c
@@ -408,7 +408,6 @@ TidRecheck(TidScanState *node, TupleTableSlot *slot)
 	return true;
 }
 
-
 /* ----------------------------------------------------------------
  *		ExecTidScan(node)
  *
@@ -483,6 +482,20 @@ ExecEndTidScan(TidScanState *node)
 	ExecClearTuple(node->ss.ss_ScanTupleSlot);
 }
 
+/* ----------------------------------------------------------------
+ *		ExecPrepTidScan
+ *
+ *		This "preps" the TidScan node.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepTidScan(TidScan *node, ExecPrepContext *context,
+				ExecPrepOutput *result)
+{
+	result->relationRTIs = bms_add_member(result->relationRTIs,
+										  node->scan.scanrelid);
+}
+
 /* ----------------------------------------------------------------
  *		ExecInitTidScan
  *
diff --git a/src/backend/executor/nodeUnique.c b/src/backend/executor/nodeUnique.c
index 6c99d13a39..87c1b53515 100644
--- a/src/backend/executor/nodeUnique.c
+++ b/src/backend/executor/nodeUnique.c
@@ -104,6 +104,19 @@ ExecUnique(PlanState *pstate)
 	return ExecCopySlot(resultTupleSlot, slot);
 }
 
+/* ----------------------------------------------------------------
+ *		ExecPrepUnique
+ *
+ *		This "preps" the Unique node and the node's subplan.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepUnique(Unique *node, ExecPrepContext *context, ExecPrepOutput *result)
+{
+	/* Nothing to do besides recursing to the subplan. */
+	ExecPrepNode(outerPlan(node), context, result);
+}
+
 /* ----------------------------------------------------------------
  *		ExecInitUnique
  *
diff --git a/src/backend/executor/nodeValuesscan.c b/src/backend/executor/nodeValuesscan.c
index dda1c59b23..6cf7fd77d6 100644
--- a/src/backend/executor/nodeValuesscan.c
+++ b/src/backend/executor/nodeValuesscan.c
@@ -203,6 +203,19 @@ ExecValuesScan(PlanState *pstate)
 					(ExecScanRecheckMtd) ValuesRecheck);
 }
 
+/* ----------------------------------------------------------------
+ *		ExecPrepValuesScan
+ *
+ *		This "preps" the ValuesScan node.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepValuesScan(ValuesScan *node, ExecPrepContext *context,
+				   ExecPrepOutput *result)
+{
+	/* nothing to do */
+}
+
 /* ----------------------------------------------------------------
  *		ExecInitValuesScan
  * ----------------------------------------------------------------
diff --git a/src/backend/executor/nodeWindowAgg.c b/src/backend/executor/nodeWindowAgg.c
index 08ce05ca5a..90b7494bee 100644
--- a/src/backend/executor/nodeWindowAgg.c
+++ b/src/backend/executor/nodeWindowAgg.c
@@ -2238,6 +2238,19 @@ ExecWindowAgg(PlanState *pstate)
 	return ExecProject(winstate->ss.ps.ps_ProjInfo);
 }
 
+/* ----------------------------------------------------------------
+ *		ExecPrepWindowAgg
+ *
+ *		This "preps" the WindowAgg node and the node's subplan.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepWindowAgg(WindowAgg *node, ExecPrepContext *context, ExecPrepOutput *result)
+{
+	/* Nothing to do besides recursing to the subplan. */
+	ExecPrepNode(outerPlan(node), context, result);
+}
+
 /* -----------------
  * ExecInitWindowAgg
  *
diff --git a/src/backend/executor/nodeWorktablescan.c b/src/backend/executor/nodeWorktablescan.c
index 15fd71fb32..71a2ac7e40 100644
--- a/src/backend/executor/nodeWorktablescan.c
+++ b/src/backend/executor/nodeWorktablescan.c
@@ -121,6 +121,18 @@ ExecWorkTableScan(PlanState *pstate)
 					(ExecScanRecheckMtd) WorkTableScanRecheck);
 }
 
+/* ----------------------------------------------------------------
+ *		ExecPrepWorkTableScan
+ *
+ *		This "preps" the WorkTableScan node.
+ * ----------------------------------------------------------------
+ */
+void
+ExecPrepWorkTableScan(WorkTableScan *node, ExecPrepContext *context,
+					  ExecPrepOutput *result)
+{
+	/* nothing to do */
+}
 
 /* ----------------------------------------------------------------
  *		ExecInitWorkTableScan
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index c93f90de9b..84c1b22ccb 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1485,6 +1485,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 	CachedPlanSource *plansource;
 	CachedPlan *cplan;
 	List	   *stmt_list;
+	List	   *stmt_execprep_list;
 	char	   *query_string;
 	Snapshot	snapshot;
 	MemoryContext oldcontext;
@@ -1566,6 +1567,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 	/* Replan if needed, and increment plan refcount for portal */
 	cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv);
 	stmt_list = cplan->stmt_list;
+	stmt_execprep_list = cplan->stmt_execprep_list;
 
 	if (!plan->saved)
 	{
@@ -1577,6 +1579,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 		 */
 		oldcontext = MemoryContextSwitchTo(portal->portalContext);
 		stmt_list = copyObject(stmt_list);
+		stmt_execprep_list = copyObject(stmt_execprep_list);
 		MemoryContextSwitchTo(oldcontext);
 		ReleaseCachedPlan(cplan, NULL);
 		cplan = NULL;			/* portal shouldn't depend on cplan */
@@ -1590,6 +1593,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 					  query_string,
 					  plansource->commandTag,
 					  stmt_list,
+					  stmt_execprep_list,
 					  cplan);
 
 	/*
@@ -2380,7 +2384,9 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 	{
 		CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
 		List	   *stmt_list;
-		ListCell   *lc2;
+		List	   *stmt_execprep_list;
+		ListCell   *lc2,
+				   *lc3;
 
 		spicallbackarg.query = plansource->query_string;
 
@@ -2459,6 +2465,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 							  plan_owner, _SPI_current->queryEnv);
 
 		stmt_list = cplan->stmt_list;
+		stmt_execprep_list = cplan->stmt_execprep_list;
 
 		/*
 		 * If we weren't given a specific snapshot to use, and the statement
@@ -2496,9 +2503,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 			}
 		}
 
-		foreach(lc2, stmt_list)
+		forboth(lc2, stmt_list, lc3, stmt_execprep_list)
 		{
 			PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+			ExecPrepOutput *execprep = lfirst_node(ExecPrepOutput, lc3);
 			bool		canSetTag = stmt->canSetTag;
 			DestReceiver *dest;
 
@@ -2570,7 +2578,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 				else
 					snap = InvalidSnapshot;
 
-				qdesc = CreateQueryDesc(stmt,
+				qdesc = CreateQueryDesc(stmt, execprep,
 										plansource->query_string,
 										snap, crosscheck_snapshot,
 										dest,
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 6bd95bbce2..89101256cf 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -68,6 +68,18 @@
 		} \
 	} while (0)
 
+/* Copy a field that is an array of numElem Node objects */
+#define COPY_NODE_ARRAY(fldname, numElem) \
+	do { \
+		int		i; \
+		newnode->fldname = (numElem) > 0 ? \
+			palloc((numElem) * sizeof(from->fldname[0])) : NULL; \
+		for (i = 0; i < (numElem); i++) \
+		{ \
+			newnode->fldname[i] = copyObject(from->fldname[i]); \
+		} \
+	} while (0)
+
 /* Copy a parse location field (for Copy, this is same as scalar case) */
 #define COPY_LOCATION_FIELD(fldname) \
 	(newnode->fldname = from->fldname)
@@ -94,9 +106,12 @@ _copyPlannedStmt(const PlannedStmt *from)
 	COPY_SCALAR_FIELD(transientPlan);
 	COPY_SCALAR_FIELD(dependsOnRole);
 	COPY_SCALAR_FIELD(parallelModeNeeded);
+	COPY_SCALAR_FIELD(usesPreExecPruning);
 	COPY_SCALAR_FIELD(jitFlags);
 	COPY_NODE_FIELD(planTree);
+	COPY_SCALAR_FIELD(numPlanNodes);
 	COPY_NODE_FIELD(rtable);
+	COPY_BITMAPSET_FIELD(relationRTIs);
 	COPY_NODE_FIELD(resultRelations);
 	COPY_NODE_FIELD(appendRelations);
 	COPY_NODE_FIELD(subplans);
@@ -1278,6 +1293,8 @@ _copyPartitionPruneInfo(const PartitionPruneInfo *from)
 	PartitionPruneInfo *newnode = makeNode(PartitionPruneInfo);
 
 	COPY_NODE_FIELD(prune_infos);
+	COPY_SCALAR_FIELD(contains_init_steps);
+	COPY_SCALAR_FIELD(contains_exec_steps);
 	COPY_BITMAPSET_FIELD(other_subplans);
 
 	return newnode;
@@ -4984,6 +5001,28 @@ _copyBitString(const BitString *from)
 	return newnode;
 }
 
+static ExecPrepOutput *
+_copyExecPrepOutput(const ExecPrepOutput *from)
+{
+	ExecPrepOutput *newnode = makeNode(ExecPrepOutput);
+
+	COPY_BITMAPSET_FIELD(relationRTIs);
+	COPY_SCALAR_FIELD(numPlanNodes);
+	COPY_NODE_ARRAY(planPrepResults, from->numPlanNodes);
+
+	return newnode;
+}
+
+static PlanPrepOutput *
+_copyPlanPrepOutput(const PlanPrepOutput *from)
+{
+	PlanPrepOutput *newnode = makeNode(PlanPrepOutput);
+
+	COPY_SCALAR_FIELD(plan_node_id);
+	COPY_BITMAPSET_FIELD(initially_valid_subnodes);
+
+	return newnode;
+}
 
 static ForeignKeyCacheInfo *
 _copyForeignKeyCacheInfo(const ForeignKeyCacheInfo *from)
@@ -5930,6 +5969,16 @@ copyObjectImpl(const void *from)
 			retval = _copyPublicationTable(from);
 			break;
 
+			/*
+			 * EXECUTION NODES
+			 */
+		case T_ExecPrepOutput:
+			retval = _copyExecPrepOutput(from);
+			break;
+		case T_PlanPrepOutput:
+			retval = _copyPlanPrepOutput(from);
+			break;
+
 			/*
 			 * MISCELLANEOUS NODES
 			 */
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 6bdad462c7..9fe247d505 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -312,9 +312,12 @@ _outPlannedStmt(StringInfo str, const PlannedStmt *node)
 	WRITE_BOOL_FIELD(transientPlan);
 	WRITE_BOOL_FIELD(dependsOnRole);
 	WRITE_BOOL_FIELD(parallelModeNeeded);
+	WRITE_BOOL_FIELD(usesPreExecPruning);
 	WRITE_INT_FIELD(jitFlags);
 	WRITE_NODE_FIELD(planTree);
+	WRITE_INT_FIELD(numPlanNodes);
 	WRITE_NODE_FIELD(rtable);
+	WRITE_BITMAPSET_FIELD(relationRTIs);
 	WRITE_NODE_FIELD(resultRelations);
 	WRITE_NODE_FIELD(appendRelations);
 	WRITE_NODE_FIELD(subplans);
@@ -1004,6 +1007,8 @@ _outPartitionPruneInfo(StringInfo str, const PartitionPruneInfo *node)
 	WRITE_NODE_TYPE("PARTITIONPRUNEINFO");
 
 	WRITE_NODE_FIELD(prune_infos);
+	WRITE_BOOL_FIELD(contains_init_steps);
+	WRITE_BOOL_FIELD(contains_exec_steps);
 	WRITE_BITMAPSET_FIELD(other_subplans);
 }
 
@@ -2274,6 +2279,7 @@ _outPlannerGlobal(StringInfo str, const PlannerGlobal *node)
 	WRITE_NODE_FIELD(subplans);
 	WRITE_BITMAPSET_FIELD(rewindPlanIDs);
 	WRITE_NODE_FIELD(finalrtable);
+	WRITE_BITMAPSET_FIELD(relationRTIs);
 	WRITE_NODE_FIELD(finalrowmarks);
 	WRITE_NODE_FIELD(resultRelations);
 	WRITE_NODE_FIELD(appendRelations);
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 3f68f7c18d..7ecb9ad73c 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -1585,9 +1585,12 @@ _readPlannedStmt(void)
 	READ_BOOL_FIELD(transientPlan);
 	READ_BOOL_FIELD(dependsOnRole);
 	READ_BOOL_FIELD(parallelModeNeeded);
+	READ_BOOL_FIELD(usesPreExecPruning);
 	READ_INT_FIELD(jitFlags);
 	READ_NODE_FIELD(planTree);
+	READ_INT_FIELD(numPlanNodes);
 	READ_NODE_FIELD(rtable);
+	READ_BITMAPSET_FIELD(relationRTIs);
 	READ_NODE_FIELD(resultRelations);
 	READ_NODE_FIELD(appendRelations);
 	READ_NODE_FIELD(subplans);
@@ -2534,6 +2537,8 @@ _readPartitionPruneInfo(void)
 	READ_LOCALS(PartitionPruneInfo);
 
 	READ_NODE_FIELD(prune_infos);
+	READ_BOOL_FIELD(contains_init_steps);
+	READ_BOOL_FIELD(contains_exec_steps);
 	READ_BITMAPSET_FIELD(other_subplans);
 
 	READ_DONE();
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index bd09f85aea..70c5b9d88b 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -517,8 +517,11 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 	result->transientPlan = glob->transientPlan;
 	result->dependsOnRole = glob->dependsOnRole;
 	result->parallelModeNeeded = glob->parallelModeNeeded;
+	result->usesPreExecPruning = glob->usesPreExecPruning;
 	result->planTree = top_plan;
+	result->numPlanNodes = glob->lastPlanNodeId;
 	result->rtable = glob->finalrtable;
+	result->relationRTIs = glob->relationRTIs;
 	result->resultRelations = glob->resultRelations;
 	result->appendRelations = glob->appendRelations;
 	result->subplans = glob->subplans;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index a7b11b7f03..c1b1cf503d 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -483,6 +483,7 @@ static void
 add_rte_to_flat_rtable(PlannerGlobal *glob, RangeTblEntry *rte)
 {
 	RangeTblEntry *newrte;
+	Index		rti = list_length(glob->finalrtable) + 1;
 
 	/* flat copy to duplicate all the scalar fields */
 	newrte = (RangeTblEntry *) palloc(sizeof(RangeTblEntry));
@@ -517,7 +518,10 @@ add_rte_to_flat_rtable(PlannerGlobal *glob, RangeTblEntry *rte)
 	 * but it would probably cost more cycles than it would save.
 	 */
 	if (newrte->rtekind == RTE_RELATION)
+	{
+		glob->relationRTIs = bms_add_member(glob->relationRTIs, rti);
 		glob->relationOids = lappend_oid(glob->relationOids, newrte->relid);
+	}
 }
 
 /*
@@ -1548,6 +1552,9 @@ set_append_references(PlannerInfo *root,
 				pinfo->rtindex += rtoffset;
 			}
 		}
+
+		if (aplan->part_prune_info->contains_init_steps)
+			root->glob->usesPreExecPruning = true;
 	}
 
 	/* We don't need to recurse to lefttree or righttree ... */
@@ -1620,6 +1627,9 @@ set_mergeappend_references(PlannerInfo *root,
 				pinfo->rtindex += rtoffset;
 			}
 		}
+
+		if (mplan->part_prune_info->contains_init_steps)
+			root->glob->usesPreExecPruning = true;
 	}
 
 	/* We don't need to recurse to lefttree or righttree ... */
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 1bc00826c1..390d4e4c06 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -144,7 +144,9 @@ static List *make_partitionedrel_pruneinfo(PlannerInfo *root,
 										   List *prunequal,
 										   Bitmapset *partrelids,
 										   int *relid_subplan_map,
-										   Bitmapset **matchedsubplans);
+										   Bitmapset **matchedsubplans,
+										   bool *contains_init_steps,
+										   bool *contains_exec_steps);
 static void gen_partprune_steps(RelOptInfo *rel, List *clauses,
 								PartClauseTarget target,
 								GeneratePruningStepsContext *context);
@@ -230,6 +232,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	int		   *relid_subplan_map;
 	ListCell   *lc;
 	int			i;
+	bool		contains_init_steps = false;
+	bool		contains_exec_steps = false;
 
 	/*
 	 * Scan the subpaths to see which ones are scans of partition child
@@ -309,12 +313,16 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		Bitmapset  *partrelids = (Bitmapset *) lfirst(lc);
 		List	   *pinfolist;
 		Bitmapset  *matchedsubplans = NULL;
+		bool		partrel_contains_init_steps;
+		bool		partrel_contains_exec_steps;
 
 		pinfolist = make_partitionedrel_pruneinfo(root, parentrel,
 												  prunequal,
 												  partrelids,
 												  relid_subplan_map,
-												  &matchedsubplans);
+												  &matchedsubplans,
+												  &partrel_contains_init_steps,
+												  &partrel_contains_exec_steps);
 
 		/* When pruning is possible, record the matched subplans */
 		if (pinfolist != NIL)
@@ -323,6 +331,10 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 			allmatchedsubplans = bms_join(matchedsubplans,
 										  allmatchedsubplans);
 		}
+		if (!contains_init_steps)
+			contains_init_steps = partrel_contains_init_steps;
+		if (!contains_exec_steps)
+			contains_exec_steps = partrel_contains_exec_steps;
 	}
 
 	pfree(relid_subplan_map);
@@ -337,6 +349,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	/* Else build the result data structure */
 	pruneinfo = makeNode(PartitionPruneInfo);
 	pruneinfo->prune_infos = prunerelinfos;
+	pruneinfo->contains_init_steps = contains_init_steps;
+	pruneinfo->contains_exec_steps = contains_exec_steps;
 
 	/*
 	 * Some subplans may not belong to any of the identified partitioned rels.
@@ -435,13 +449,18 @@ add_part_relids(List *allpartrelids, Bitmapset *partrelids)
  * If we cannot find any useful run-time pruning steps, return NIL.
  * However, on success, each rel identified in partrelids will have
  * an element in the result list, even if some of them are useless.
+ * *contains_init_steps and *contains_exec_steps are set to indicate that the
+ * returned PartitionedRelPruneInfos contain pruning steps that can be
+ * performed before and after execution begins, respectively.
  */
 static List *
 make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 							  List *prunequal,
 							  Bitmapset *partrelids,
 							  int *relid_subplan_map,
-							  Bitmapset **matchedsubplans)
+							  Bitmapset **matchedsubplans,
+							  bool *contains_init_steps,
+							  bool *contains_exec_steps)
 {
 	RelOptInfo *targetpart = NULL;
 	List	   *pinfolist = NIL;
@@ -452,6 +471,10 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	int			rti;
 	int			i;
 
+	/* Will find out below. */
+	*contains_init_steps = false;
+	*contains_exec_steps = false;
+
 	/*
 	 * Examine each partitioned rel, constructing a temporary array to map
 	 * from planner relids to index of the partitioned rel, and building a
@@ -539,6 +562,9 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		 * executor per-scan pruning steps.  This first pass creates startup
 		 * pruning steps and detects whether there's any possibly-useful quals
 		 * that would require per-scan pruning.
+		 *
+		 * In the first pass, we also determine whether the 2nd pass is
+		 * necessary by noting the presence of EXEC parameters.
 		 */
 		gen_partprune_steps(subpart, partprunequal, PARTTARGET_INITIAL,
 							&context);
@@ -613,6 +639,11 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		pinfo->execparamids = execparamids;
 		/* Remaining fields will be filled in the next loop */
 
+		if (!*contains_init_steps)
+			*contains_init_steps = (initial_pruning_steps != NIL);
+		if (!*contains_exec_steps)
+			*contains_exec_steps = (exec_pruning_steps != NIL);
+
 		pinfolist = lappend(pinfolist, pinfo);
 	}
 
@@ -798,6 +829,7 @@ prune_append_rel_partitions(RelOptInfo *rel)
 
 	/* These are not valid when being called from the planner */
 	context.planstate = NULL;
+	context.exprcontext = NULL;
 	context.exprstates = NULL;
 
 	/* Actual pruning happens here. */
@@ -808,8 +840,8 @@ prune_append_rel_partitions(RelOptInfo *rel)
  * get_matching_partitions
  *		Determine partitions that survive partition pruning
  *
- * Note: context->planstate must be set to a valid PlanState when the
- * pruning_steps were generated with a target other than PARTTARGET_PLANNER.
+ * Note: context->exprcontext must be valid when the pruning_steps were
+ * generated with a target other than PARTTARGET_PLANNER.
  *
  * Returns a Bitmapset of the RelOptInfo->part_rels indexes of the surviving
  * partitions.
@@ -3654,7 +3686,7 @@ match_boolean_partition_clause(Oid partopfamily, Expr *clause, Expr *partkey,
  * exprstate array.
  *
  * Note that the evaluated result may be in the per-tuple memory context of
- * context->planstate->ps_ExprContext, and we may have leaked other memory
+ * context->exprcontext, and we may have leaked other memory
  * there too.  This memory must be recovered by resetting that ExprContext
  * after we're done with the pruning operation (see execPartition.c).
  */
@@ -3677,13 +3709,18 @@ partkey_datum_from_expr(PartitionPruneContext *context,
 		ExprContext *ectx;
 
 		/*
-		 * We should never see a non-Const in a step unless we're running in
-		 * the executor.
+		 * We should never see a non-Const in a step unless the caller has
+		 * passed a valid ExprContext.
+		 *
+		 * When context->planstate is valid, context->exprcontext is the
+		 * same as context->planstate->ps_ExprContext.
 		 */
-		Assert(context->planstate != NULL);
+		Assert(context->planstate != NULL || context->exprcontext != NULL);
+		Assert(context->planstate == NULL ||
+			   (context->exprcontext == context->planstate->ps_ExprContext));
 
 		exprstate = context->exprstates[stateidx];
-		ectx = context->planstate->ps_ExprContext;
+		ectx = context->exprcontext;
 		*value = ExecEvalExprSwitchContext(exprstate, ectx, isnull);
 	}
 }
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index fda2e9360e..5d8f3fc3cb 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -910,15 +910,17 @@ pg_plan_query(Query *querytree, const char *query_string, int cursorOptions,
  * For normal optimizable statements, invoke the planner.  For utility
  * statements, just make a wrapper PlannedStmt node.
  *
- * The result is a list of PlannedStmt nodes.
+ * The result is a list of PlannedStmt nodes.  Also, a NULL is appended to
+ * *stmt_execprep_list for each PlannedStmt added to the returned list.
  */
 List *
 pg_plan_queries(List *querytrees, const char *query_string, int cursorOptions,
-				ParamListInfo boundParams)
+				ParamListInfo boundParams, List **stmt_execprep_list)
 {
 	List	   *stmt_list = NIL;
 	ListCell   *query_list;
 
+	*stmt_execprep_list = NIL;
 	foreach(query_list, querytrees)
 	{
 		Query	   *query = lfirst_node(Query, query_list);
@@ -942,6 +944,7 @@ pg_plan_queries(List *querytrees, const char *query_string, int cursorOptions,
 		}
 
 		stmt_list = lappend(stmt_list, stmt);
+		*stmt_execprep_list = lappend(*stmt_execprep_list, NULL);
 	}
 
 	return stmt_list;
@@ -1045,7 +1048,8 @@ exec_simple_query(const char *query_string)
 		QueryCompletion qc;
 		MemoryContext per_parsetree_context = NULL;
 		List	   *querytree_list,
-				   *plantree_list;
+				   *plantree_list,
+				   *plantree_execprep_list;
 		Portal		portal;
 		DestReceiver *receiver;
 		int16		format;
@@ -1132,7 +1136,8 @@ exec_simple_query(const char *query_string)
 												NULL, 0, NULL);
 
 		plantree_list = pg_plan_queries(querytree_list, query_string,
-										CURSOR_OPT_PARALLEL_OK, NULL);
+										CURSOR_OPT_PARALLEL_OK, NULL,
+										&plantree_execprep_list);
 
 		/*
 		 * Done with the snapshot used for parsing/planning.
@@ -1168,6 +1173,7 @@ exec_simple_query(const char *query_string)
 						  query_string,
 						  commandTag,
 						  plantree_list,
+						  plantree_execprep_list,
 						  NULL);
 
 		/*
@@ -1978,6 +1984,7 @@ exec_bind_message(StringInfo input_message)
 					  query_string,
 					  psrc->commandTag,
 					  cplan->stmt_list,
+					  cplan->stmt_execprep_list,
 					  cplan);
 
 	/* Done with the snapshot used for parameter I/O and parsing/planning */
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 5f907831a3..b76aa3ef3b 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -35,7 +35,7 @@
 Portal		ActivePortal = NULL;
 
 
-static void ProcessQuery(PlannedStmt *plan,
+static void ProcessQuery(PlannedStmt *plan, ExecPrepOutput *execprep,
 						 const char *sourceText,
 						 ParamListInfo params,
 						 QueryEnvironment *queryEnv,
@@ -65,6 +65,7 @@ static void DoPortalRewind(Portal portal);
  */
 QueryDesc *
 CreateQueryDesc(PlannedStmt *plannedstmt,
+				ExecPrepOutput *execprep,
 				const char *sourceText,
 				Snapshot snapshot,
 				Snapshot crosscheck_snapshot,
@@ -77,6 +78,7 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
 
 	qd->operation = plannedstmt->commandType;	/* operation */
 	qd->plannedstmt = plannedstmt;	/* plan */
+	qd->execprep = execprep;		/* ExecutorPrep() output for plan */
 	qd->sourceText = sourceText;	/* query text */
 	qd->snapshot = RegisterSnapshot(snapshot);	/* snapshot */
 	/* RI check snapshot */
@@ -122,6 +124,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  *		PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
  *
  *	plan: the plan tree for the query
+ *	execprep: ExecutorPrep() output for the plan tree
  *	sourceText: the source text of the query
  *	params: any parameters needed
  *	dest: where to send results
@@ -134,6 +137,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  */
 static void
 ProcessQuery(PlannedStmt *plan,
+			 ExecPrepOutput *execprep,
 			 const char *sourceText,
 			 ParamListInfo params,
 			 QueryEnvironment *queryEnv,
@@ -145,7 +149,7 @@ ProcessQuery(PlannedStmt *plan,
 	/*
 	 * Create the QueryDesc object
 	 */
-	queryDesc = CreateQueryDesc(plan, sourceText,
+	queryDesc = CreateQueryDesc(plan, execprep, sourceText,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, params, queryEnv, 0);
 
@@ -490,6 +494,7 @@ PortalStart(Portal portal, ParamListInfo params,
 				 * the destination to DestNone.
 				 */
 				queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+											linitial_node(ExecPrepOutput, portal->stmt_execpreps),
 											portal->sourceText,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
@@ -1190,7 +1195,8 @@ PortalRunMulti(Portal portal,
 			   QueryCompletion *qc)
 {
 	bool		active_snapshot_set = false;
-	ListCell   *stmtlist_item;
+	ListCell   *stmtlist_item,
+			   *execpreplist_item;
 
 	/*
 	 * If the destination is DestRemoteExecute, change to DestNone.  The
@@ -1211,9 +1217,12 @@ PortalRunMulti(Portal portal,
 	 * Loop to handle the individual queries generated from a single parsetree
 	 * by analysis and rewrite.
 	 */
-	foreach(stmtlist_item, portal->stmts)
+	forboth(stmtlist_item, portal->stmts,
+			execpreplist_item, portal->stmt_execpreps)
 	{
 		PlannedStmt *pstmt = lfirst_node(PlannedStmt, stmtlist_item);
+		ExecPrepOutput *execprep = lfirst_node(ExecPrepOutput,
+											   execpreplist_item);
 
 		/*
 		 * If we got a cancel signal in prior command, quit
@@ -1271,7 +1280,7 @@ PortalRunMulti(Portal portal,
 			if (pstmt->canSetTag)
 			{
 				/* statement can set tag string */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, execprep,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
@@ -1280,7 +1289,7 @@ PortalRunMulti(Portal portal,
 			else
 			{
 				/* stmt added by rewrite cannot set tag */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, execprep,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 4a9055e6bb..221738dddc 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -58,12 +58,14 @@
 
 #include "access/transam.h"
 #include "catalog/namespace.h"
+#include "executor/execPartition.h"
 #include "executor/executor.h"
 #include "miscadmin.h"
 #include "nodes/nodeFuncs.h"
 #include "optimizer/optimizer.h"
 #include "parser/analyze.h"
 #include "parser/parsetree.h"
+#include "partitioning/partdesc.h"
 #include "storage/lmgr.h"
 #include "tcop/pquery.h"
 #include "tcop/utility.h"
@@ -99,14 +101,15 @@ static dlist_head cached_expression_list = DLIST_STATIC_INIT(cached_expression_l
 static void ReleaseGenericPlan(CachedPlanSource *plansource);
 static List *RevalidateCachedQuery(CachedPlanSource *plansource,
 								   QueryEnvironment *queryEnv);
-static bool CheckCachedPlan(CachedPlanSource *plansource);
+static bool CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams);
 static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 								   ParamListInfo boundParams, QueryEnvironment *queryEnv);
 static bool choose_custom_plan(CachedPlanSource *plansource,
 							   ParamListInfo boundParams);
 static double cached_plan_cost(CachedPlan *plan, bool include_planner);
 static Query *QueryListGetPrimaryStmt(List *stmts);
-static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static List *AcquireExecutorLocks(List *stmt_list, bool acquire,
+								 ParamListInfo boundParams);
 static void AcquirePlannerLocks(List *stmt_list, bool acquire);
 static void ScanQueryForLocks(Query *parsetree, bool acquire);
 static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -782,6 +785,47 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
 	return tlist;
 }
 
+/*
+ * CachedPlanSaveExecPrepOutputs
+ *		Save the list containing ExecPrepOutput nodes in the given CachedPlan
+ *
+ * The provided list is copied into a dedicated context that is a child of
+ * plan->context.
+ */
+static void
+CachedPlanSaveExecPrepOutputs(CachedPlan *plan, List *execprep_list)
+{
+	MemoryContext	execprep_context = plan->execprep_context,
+					oldcontext = CurrentMemoryContext;
+	List		   *execprep_list_copy;
+
+	/*
+	 * Set up the dedicated context if not already done, saving it as a child
+	 * of the CachedPlan's context.
+	 */
+	if (execprep_context == NULL)
+	{
+		execprep_context = AllocSetContextCreate(CurrentMemoryContext,
+												 "CachedPlan execprep list",
+												 ALLOCSET_START_SMALL_SIZES);
+		MemoryContextSetParent(execprep_context, plan->context);
+		MemoryContextSetIdentifier(execprep_context, plan->context->ident);
+		plan->execprep_context = execprep_context;
+	}
+	else
+	{
+		/* Just clear existing contents by resetting the context. */
+		Assert(MemoryContextIsValid(execprep_context));
+		MemoryContextReset(execprep_context);
+	}
+
+	MemoryContextSwitchTo(execprep_context);
+	execprep_list_copy = copyObject(execprep_list);
+	MemoryContextSwitchTo(oldcontext);
+
+	plan->stmt_execprep_list = execprep_list_copy;
+}
+
 /*
  * CheckCachedPlan: see if the CachedPlanSource's generic plan is valid.
  *
@@ -790,9 +834,16 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
  *
  * On a "true" return, we have acquired the locks needed to run the plan.
  * (We must do this for the "true" result to be race-condition-free.)
+ *
+ * If the CachedPlan is valid, this prepares the PlannedStmts contained in it
+ * for execution by invoking ExecutorPrep() on each.  Resulting ExecPrepOutput
+ * nodes, allocated in a child context of the context containing the plan
+ * itself, are added into plan->stmt_execprep_list.  ExecPrepOutput nodes that
+ * may be present in the list from the last invocation of CheckCachedPlan() on
+ * the same CachedPlan are deleted.
  */
 static bool
-CheckCachedPlan(CachedPlanSource *plansource)
+CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams)
 {
 	CachedPlan *plan = plansource->gplan;
 
@@ -820,13 +871,22 @@ CheckCachedPlan(CachedPlanSource *plansource)
 	 */
 	if (plan->is_valid)
 	{
+		List   *execprep_list;
+
 		/*
 		 * Plan must have positive refcount because it is referenced by
 		 * plansource; so no need to fear it disappears under us here.
 		 */
 		Assert(plan->refcount > 0);
 
-		AcquireExecutorLocks(plan->stmt_list, true);
+		/*
+		 * Take executor locks on the plan tree and perform other
+		 * preparatory actions on it by invoking ExecutorPrep().  The
+		 * resulting list of ExecPrepOutput nodes is saved in the
+		 * CachedPlan.
+		 */
+		execprep_list = AcquireExecutorLocks(plan->stmt_list, true, boundParams);
+		CachedPlanSaveExecPrepOutputs(plan, execprep_list);
 
 		/*
 		 * If plan was transient, check to see if TransactionXmin has
@@ -848,7 +908,7 @@ CheckCachedPlan(CachedPlanSource *plansource)
 		}
 
 		/* Oops, the race case happened.  Release useless locks. */
-		AcquireExecutorLocks(plan->stmt_list, false);
+		(void) AcquireExecutorLocks(plan->stmt_list, false, boundParams);
 	}
 
 	/*
@@ -880,7 +940,8 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 				ParamListInfo boundParams, QueryEnvironment *queryEnv)
 {
 	CachedPlan *plan;
-	List	   *plist;
+	List	   *plist,
+			   *execprep_list;
 	bool		snapshot_set;
 	bool		is_transient;
 	MemoryContext plan_context;
@@ -933,7 +994,8 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 	 * Generate the plan.
 	 */
 	plist = pg_plan_queries(qlist, plansource->query_string,
-							plansource->cursor_options, boundParams);
+							plansource->cursor_options, boundParams,
+							&execprep_list);
 
 	/* Release snapshot if we got one */
 	if (snapshot_set)
@@ -1002,6 +1064,11 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 	plan->is_saved = false;
 	plan->is_valid = true;
 
+	/* Save the dummy ExecPrepOutput list. */
+	plan->execprep_context = NULL;
+	CachedPlanSaveExecPrepOutputs(plan, execprep_list);
+	Assert(MemoryContextIsValid(plan->execprep_context));
+
 	/* assign generation number to new plan */
 	plan->generation = ++(plansource->generation);
 
@@ -1160,7 +1227,7 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 
 	if (!customplan)
 	{
-		if (CheckCachedPlan(plansource))
+		if (CheckCachedPlan(plansource, boundParams))
 		{
 			/* We want a generic plan, and we already have a valid one */
 			plan = plansource->gplan;
@@ -1366,7 +1433,6 @@ CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
 	foreach(lc, plan->stmt_list)
 	{
 		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc);
-		ListCell   *lc2;
 
 		if (plannedstmt->commandType == CMD_UTILITY)
 			return false;
@@ -1375,13 +1441,8 @@ CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
 		 * We have to grovel through the rtable because it's likely to contain
 		 * an RTE_RESULT relation, rather than being totally empty.
 		 */
-		foreach(lc2, plannedstmt->rtable)
-		{
-			RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
-
-			if (rte->rtekind == RTE_RELATION)
-				return false;
-		}
+		if (!bms_is_empty(plannedstmt->relationRTIs))
+			return false;
 	}
 
 	/*
@@ -1738,16 +1799,22 @@ QueryListGetPrimaryStmt(List *stmts)
 /*
  * AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
  * or release them if acquire is false.
+ *
+ * Returns a list with one element for each PlannedStmt in stmt_list: an
+ * ExecPrepOutput node, or NULL if the PlannedStmt is a utility statement.
  */
-static void
-AcquireExecutorLocks(List *stmt_list, bool acquire)
+static List *
+AcquireExecutorLocks(List *stmt_list, bool acquire, ParamListInfo boundParams)
 {
 	ListCell   *lc1;
+	List	   *stmt_execprep_list = NIL;
 
 	foreach(lc1, stmt_list)
 	{
 		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
-		ListCell   *lc2;
+		ExecPrepContext *context;
+		ExecPrepOutput *execprep = NULL;
+		int			rti;
 
 		if (plannedstmt->commandType == CMD_UTILITY)
 		{
@@ -1762,28 +1829,46 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 
 			if (query)
 				ScanQueryForLocks(query, acquire);
-			continue;
 		}
-
-		foreach(lc2, plannedstmt->rtable)
+		else
 		{
-			RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
-
-			if (rte->rtekind != RTE_RELATION)
-				continue;
-
 			/*
-			 * Acquire the appropriate type of lock on each relation OID. Note
-			 * that we don't actually try to open the rel, and hence will not
-			 * fail if it's been dropped entirely --- we'll just transiently
-			 * acquire a non-conflicting lock.
+			 * Prep the plan tree for execution.
 			 */
-			if (acquire)
-				LockRelationOid(rte->relid, rte->rellockmode);
-			else
-				UnlockRelationOid(rte->relid, rte->rellockmode);
+			context = makeNode(ExecPrepContext);
+			context->stmt = plannedstmt;
+			context->params = boundParams;
+			execprep = ExecutorPrep(context);
+
+			rti = -1;
+			while ((rti = bms_next_member(execprep->relationRTIs, rti)) >= 0)
+			{
+				RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
+
+				if (rte->rtekind != RTE_RELATION)
+					continue;
+
+				/*
+				 * Acquire the appropriate type of lock on each relation OID.
+				 * Note that we don't actually try to open the rel, and hence
+				 * will not fail if it's been dropped entirely --- we'll just
+				 * transiently acquire a non-conflicting lock.
+				 */
+				if (acquire)
+					LockRelationOid(rte->relid, rte->rellockmode);
+				else
+					UnlockRelationOid(rte->relid, rte->rellockmode);
+			}
 		}
+
+		/*
+		 * Keep the invariant that stmt_execprep_list is the same length as
+		 * stmt_list.
+		 */
+		stmt_execprep_list = lappend(stmt_execprep_list, execprep);
 	}
+
+	return stmt_execprep_list;
 }
 
 /*
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index 236f450a2b..5cf1339ffd 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -284,6 +284,7 @@ PortalDefineQuery(Portal portal,
 				  const char *sourceText,
 				  CommandTag commandTag,
 				  List *stmts,
+				  List *stmt_execpreps,
 				  CachedPlan *cplan)
 {
 	AssertArg(PortalIsValid(portal));
@@ -298,6 +299,7 @@ PortalDefineQuery(Portal portal,
 	portal->qc.nprocessed = 0;
 	portal->commandTag = commandTag;
 	portal->stmts = stmts;
+	portal->stmt_execpreps = stmt_execpreps;
 	portal->cplan = cplan;
 	portal->status = PORTAL_DEFINED;
 }
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 666977fb1f..f553649a5d 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -87,7 +87,8 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
 							  ExplainState *es, const char *queryString,
 							  ParamListInfo params, QueryEnvironment *queryEnv);
 
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt, ExecPrepOutput *execprep,
+						   IntoClause *into,
 						   ExplainState *es, const char *queryString,
 						   ParamListInfo params, QueryEnvironment *queryEnv,
 						   const instr_time *planduration,
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 603d8becc4..785a09f15f 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -119,10 +119,21 @@ extern ResultRelInfo *ExecFindPartition(ModifyTableState *mtstate,
 										EState *estate);
 extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
 									PartitionTupleRouting *proute);
+
 extern PartitionPruneState *ExecCreatePartitionPruneState(PlanState *planstate,
-														  PartitionPruneInfo *partitionpruneinfo);
-extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate);
+														  PartitionPruneInfo *partitionpruneinfo,
+														  bool consider_initial_steps,
+														  bool consider_exec_steps,
+														  List *rtable, ExprContext *econtext,
+														  PartitionDirectory partdir);
 extern Bitmapset *ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate,
-												  int nsubplans);
-
+								PartitionPruneInfo *pruneinfo,
+								Bitmapset **parentrelids);
+extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate);
+extern Bitmapset *ExecInitPartitionPruning(PlanState *planstate, int n_total_subplans,
+						 PartitionPruneInfo *pruneinfo,
+						 PartitionPruneState **prunestate);
+extern Bitmapset *ExecPrepDoInitialPruning(PartitionPruneInfo *pruneinfo,
+						 List *rtable, ParamListInfo params,
+						 Bitmapset **parentrelids);
 #endif							/* EXECPARTITION_H */
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index e79e2c001f..491ceef401 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -35,6 +35,7 @@ typedef struct QueryDesc
 	/* These fields are provided by CreateQueryDesc */
 	CmdType		operation;		/* CMD_SELECT, CMD_UPDATE, etc. */
 	PlannedStmt *plannedstmt;	/* planner's output (could be utility, too) */
+	ExecPrepOutput *execprep;	/* ExecutorPrep()'s output given plannedstmt */
 	const char *sourceText;		/* source text of the query */
 	Snapshot	snapshot;		/* snapshot to use for query */
 	Snapshot	crosscheck_snapshot;	/* crosscheck for RI update/delete */
@@ -57,6 +58,7 @@ typedef struct QueryDesc
 
 /* in pquery.c */
 extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
+								  ExecPrepOutput *execprep,
 								  const char *sourceText,
 								  Snapshot snapshot,
 								  Snapshot crosscheck_snapshot,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index 344399f6a8..627cb19a4c 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -185,6 +185,7 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
 /*
  * prototypes from functions in execMain.c
  */
+extern ExecPrepOutput *ExecutorPrep(ExecPrepContext *context);
 extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
 extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
 extern void ExecutorRun(QueryDesc *queryDesc,
@@ -233,6 +234,8 @@ extern void EvalPlanQualEnd(EPQState *epqstate);
 /*
  * functions in execProcnode.c
  */
+extern void ExecPrepNode(Plan *node, ExecPrepContext *context,
+						 ExecPrepOutput *result);
 extern PlanState *ExecInitNode(Plan *node, EState *estate, int eflags);
 extern void ExecSetExecProcNode(PlanState *node, ExecProcNodeMtd function);
 extern Node *MultiExecProcNode(PlanState *node);
diff --git a/src/include/executor/nodeAgg.h b/src/include/executor/nodeAgg.h
index 4d1bd92999..2dd7570067 100644
--- a/src/include/executor/nodeAgg.h
+++ b/src/include/executor/nodeAgg.h
@@ -314,6 +314,7 @@ typedef struct AggStatePerHashData
 }			AggStatePerHashData;
 
 
+extern void ExecPrepAgg(Agg *node, ExecPrepContext *context, ExecPrepOutput *result);
 extern AggState *ExecInitAgg(Agg *node, EState *estate, int eflags);
 extern void ExecEndAgg(AggState *node);
 extern void ExecReScanAgg(AggState *node);
diff --git a/src/include/executor/nodeAppend.h b/src/include/executor/nodeAppend.h
index 4cb78ee5b6..85bc9d30a6 100644
--- a/src/include/executor/nodeAppend.h
+++ b/src/include/executor/nodeAppend.h
@@ -17,6 +17,7 @@
 #include "access/parallel.h"
 #include "nodes/execnodes.h"
 
+extern void ExecPrepAppend(Append *node, ExecPrepContext *context, ExecPrepOutput *execprep);
 extern AppendState *ExecInitAppend(Append *node, EState *estate, int eflags);
 extern void ExecEndAppend(AppendState *node);
 extern void ExecReScanAppend(AppendState *node);
diff --git a/src/include/executor/nodeBitmapAnd.h b/src/include/executor/nodeBitmapAnd.h
index bae6a83826..aafb10a2aa 100644
--- a/src/include/executor/nodeBitmapAnd.h
+++ b/src/include/executor/nodeBitmapAnd.h
@@ -16,6 +16,7 @@
 
 #include "nodes/execnodes.h"
 
+extern void ExecPrepBitmapAnd(BitmapAnd *node, ExecPrepContext *context, ExecPrepOutput *result);
 extern BitmapAndState *ExecInitBitmapAnd(BitmapAnd *node, EState *estate, int eflags);
 extern Node *MultiExecBitmapAnd(BitmapAndState *node);
 extern void ExecEndBitmapAnd(BitmapAndState *node);
diff --git a/src/include/executor/nodeBitmapHeapscan.h b/src/include/executor/nodeBitmapHeapscan.h
index 789522cb8d..7240d9fa93 100644
--- a/src/include/executor/nodeBitmapHeapscan.h
+++ b/src/include/executor/nodeBitmapHeapscan.h
@@ -17,6 +17,7 @@
 #include "access/parallel.h"
 #include "nodes/execnodes.h"
 
+extern void ExecPrepBitmapHeapScan(BitmapHeapScan *node, ExecPrepContext *context, ExecPrepOutput *result);
 extern BitmapHeapScanState *ExecInitBitmapHeapScan(BitmapHeapScan *node, EState *estate, int eflags);
 extern void ExecEndBitmapHeapScan(BitmapHeapScanState *node);
 extern void ExecReScanBitmapHeapScan(BitmapHeapScanState *node);
diff --git a/src/include/executor/nodeBitmapIndexscan.h b/src/include/executor/nodeBitmapIndexscan.h
index 01fb6ef536..6759724c2e 100644
--- a/src/include/executor/nodeBitmapIndexscan.h
+++ b/src/include/executor/nodeBitmapIndexscan.h
@@ -16,6 +16,7 @@
 
 #include "nodes/execnodes.h"
 
+extern void ExecPrepBitmapIndexScan(BitmapIndexScan *node, ExecPrepContext *context, ExecPrepOutput *result);
 extern BitmapIndexScanState *ExecInitBitmapIndexScan(BitmapIndexScan *node, EState *estate, int eflags);
 extern Node *MultiExecBitmapIndexScan(BitmapIndexScanState *node);
 extern void ExecEndBitmapIndexScan(BitmapIndexScanState *node);
diff --git a/src/include/executor/nodeBitmapOr.h b/src/include/executor/nodeBitmapOr.h
index ad90812cc1..66ddc18f63 100644
--- a/src/include/executor/nodeBitmapOr.h
+++ b/src/include/executor/nodeBitmapOr.h
@@ -16,6 +16,7 @@
 
 #include "nodes/execnodes.h"
 
+extern void ExecPrepBitmapOr(BitmapOr *node, ExecPrepContext *context, ExecPrepOutput *result);
 extern BitmapOrState *ExecInitBitmapOr(BitmapOr *node, EState *estate, int eflags);
 extern Node *MultiExecBitmapOr(BitmapOrState *node);
 extern void ExecEndBitmapOr(BitmapOrState *node);
diff --git a/src/include/executor/nodeCtescan.h b/src/include/executor/nodeCtescan.h
index 317d142b16..7908ae51df 100644
--- a/src/include/executor/nodeCtescan.h
+++ b/src/include/executor/nodeCtescan.h
@@ -16,6 +16,7 @@
 
 #include "nodes/execnodes.h"
 
+extern void ExecPrepCteScan(CteScan *node, ExecPrepContext *context, ExecPrepOutput *result);
 extern CteScanState *ExecInitCteScan(CteScan *node, EState *estate, int eflags);
 extern void ExecEndCteScan(CteScanState *node);
 extern void ExecReScanCteScan(CteScanState *node);
diff --git a/src/include/executor/nodeCustom.h b/src/include/executor/nodeCustom.h
index 5ef890144f..8c1d05f64b 100644
--- a/src/include/executor/nodeCustom.h
+++ b/src/include/executor/nodeCustom.h
@@ -18,6 +18,7 @@
 /*
  * General executor code
  */
+extern void ExecPrepCustomScan(CustomScan *node, ExecPrepContext *context, ExecPrepOutput *result);
 extern CustomScanState *ExecInitCustomScan(CustomScan *cscan,
 										   EState *estate, int eflags);
 extern void ExecEndCustomScan(CustomScanState *node);
diff --git a/src/include/executor/nodeForeignscan.h b/src/include/executor/nodeForeignscan.h
index c9fbaed79c..a2d6667011 100644
--- a/src/include/executor/nodeForeignscan.h
+++ b/src/include/executor/nodeForeignscan.h
@@ -17,6 +17,7 @@
 #include "access/parallel.h"
 #include "nodes/execnodes.h"
 
+extern void ExecPrepForeignScan(ForeignScan *node, ExecPrepContext *context, ExecPrepOutput *result);
 extern ForeignScanState *ExecInitForeignScan(ForeignScan *node, EState *estate, int eflags);
 extern void ExecEndForeignScan(ForeignScanState *node);
 extern void ExecReScanForeignScan(ForeignScanState *node);
diff --git a/src/include/executor/nodeFunctionscan.h b/src/include/executor/nodeFunctionscan.h
index 7a598a1d46..8686bb5c09 100644
--- a/src/include/executor/nodeFunctionscan.h
+++ b/src/include/executor/nodeFunctionscan.h
@@ -16,6 +16,7 @@
 
 #include "nodes/execnodes.h"
 
+extern void ExecPrepFunctionScan(FunctionScan *node, ExecPrepContext *context, ExecPrepOutput *result);
 extern FunctionScanState *ExecInitFunctionScan(FunctionScan *node, EState *estate, int eflags);
 extern void ExecEndFunctionScan(FunctionScanState *node);
 extern void ExecReScanFunctionScan(FunctionScanState *node);
diff --git a/src/include/executor/nodeGather.h b/src/include/executor/nodeGather.h
index 29829ffe9a..206185ffbc 100644
--- a/src/include/executor/nodeGather.h
+++ b/src/include/executor/nodeGather.h
@@ -16,6 +16,7 @@
 
 #include "nodes/execnodes.h"
 
+extern void ExecPrepGather(Gather *node, ExecPrepContext *context, ExecPrepOutput *result);
 extern GatherState *ExecInitGather(Gather *node, EState *estate, int eflags);
 extern void ExecEndGather(GatherState *node);
 extern void ExecShutdownGather(GatherState *node);
diff --git a/src/include/executor/nodeGatherMerge.h b/src/include/executor/nodeGatherMerge.h
index d724d5fea4..b124a3fe99 100644
--- a/src/include/executor/nodeGatherMerge.h
+++ b/src/include/executor/nodeGatherMerge.h
@@ -16,6 +16,7 @@
 
 #include "nodes/execnodes.h"
 
+extern void ExecPrepGatherMerge(GatherMerge *node, ExecPrepContext *context, ExecPrepOutput *result);
 extern GatherMergeState *ExecInitGatherMerge(GatherMerge *node,
 											 EState *estate,
 											 int eflags);
diff --git a/src/include/executor/nodeGroup.h b/src/include/executor/nodeGroup.h
index 816ed2c099..7e86abab01 100644
--- a/src/include/executor/nodeGroup.h
+++ b/src/include/executor/nodeGroup.h
@@ -16,6 +16,7 @@
 
 #include "nodes/execnodes.h"
 
+extern void ExecPrepGroup(Group *node, ExecPrepContext *context, ExecPrepOutput *result);
 extern GroupState *ExecInitGroup(Group *node, EState *estate, int eflags);
 extern void ExecEndGroup(GroupState *node);
 extern void ExecReScanGroup(GroupState *node);
diff --git a/src/include/executor/nodeHash.h b/src/include/executor/nodeHash.h
index e1e0dec24b..1426a6e9a1 100644
--- a/src/include/executor/nodeHash.h
+++ b/src/include/executor/nodeHash.h
@@ -19,6 +19,7 @@
 
 struct SharedHashJoinBatch;
 
+extern void ExecPrepHash(Hash *node, ExecPrepContext *context, ExecPrepOutput *result);
 extern HashState *ExecInitHash(Hash *node, EState *estate, int eflags);
 extern Node *MultiExecHash(HashState *node);
 extern void ExecEndHash(HashState *node);
diff --git a/src/include/executor/nodeHashjoin.h b/src/include/executor/nodeHashjoin.h
index b3b5a2c3f2..6dc88282d4 100644
--- a/src/include/executor/nodeHashjoin.h
+++ b/src/include/executor/nodeHashjoin.h
@@ -18,6 +18,7 @@
 #include "nodes/execnodes.h"
 #include "storage/buffile.h"
 
+extern void ExecPrepHashJoin(HashJoin *node, ExecPrepContext *context, ExecPrepOutput *result);
 extern HashJoinState *ExecInitHashJoin(HashJoin *node, EState *estate, int eflags);
 extern void ExecEndHashJoin(HashJoinState *node);
 extern void ExecReScanHashJoin(HashJoinState *node);
diff --git a/src/include/executor/nodeIncrementalSort.h b/src/include/executor/nodeIncrementalSort.h
index 84cfd96b13..e909cb784b 100644
--- a/src/include/executor/nodeIncrementalSort.h
+++ b/src/include/executor/nodeIncrementalSort.h
@@ -15,6 +15,7 @@
 #include "access/parallel.h"
 #include "nodes/execnodes.h"
 
+extern void ExecPrepIncrementalSort(IncrementalSort *node, ExecPrepContext *context, ExecPrepOutput *result);
 extern IncrementalSortState *ExecInitIncrementalSort(IncrementalSort *node, EState *estate, int eflags);
 extern void ExecEndIncrementalSort(IncrementalSortState *node);
 extern void ExecReScanIncrementalSort(IncrementalSortState *node);
diff --git a/src/include/executor/nodeIndexonlyscan.h b/src/include/executor/nodeIndexonlyscan.h
index 47b03950ea..d0aca7a303 100644
--- a/src/include/executor/nodeIndexonlyscan.h
+++ b/src/include/executor/nodeIndexonlyscan.h
@@ -17,6 +17,7 @@
 #include "access/parallel.h"
 #include "nodes/execnodes.h"
 
+extern void ExecPrepIndexOnlyScan(IndexOnlyScan *node, ExecPrepContext *context, ExecPrepOutput *result);
 extern IndexOnlyScanState *ExecInitIndexOnlyScan(IndexOnlyScan *node, EState *estate, int eflags);
 extern void ExecEndIndexOnlyScan(IndexOnlyScanState *node);
 extern void ExecIndexOnlyMarkPos(IndexOnlyScanState *node);
diff --git a/src/include/executor/nodeIndexscan.h b/src/include/executor/nodeIndexscan.h
index 0a075f9aea..d57c370466 100644
--- a/src/include/executor/nodeIndexscan.h
+++ b/src/include/executor/nodeIndexscan.h
@@ -18,6 +18,7 @@
 #include "access/parallel.h"
 #include "nodes/execnodes.h"
 
+extern void ExecPrepIndexScan(IndexScan *node, ExecPrepContext *context, ExecPrepOutput *result);
 extern IndexScanState *ExecInitIndexScan(IndexScan *node, EState *estate, int eflags);
 extern void ExecEndIndexScan(IndexScanState *node);
 extern void ExecIndexMarkPos(IndexScanState *node);
diff --git a/src/include/executor/nodeLimit.h b/src/include/executor/nodeLimit.h
index 6da0c4026c..05d7e4797b 100644
--- a/src/include/executor/nodeLimit.h
+++ b/src/include/executor/nodeLimit.h
@@ -16,6 +16,7 @@
 
 #include "nodes/execnodes.h"
 
+extern void ExecPrepLimit(Limit *node, ExecPrepContext *context, ExecPrepOutput *result);
 extern LimitState *ExecInitLimit(Limit *node, EState *estate, int eflags);
 extern void ExecEndLimit(LimitState *node);
 extern void ExecReScanLimit(LimitState *node);
diff --git a/src/include/executor/nodeLockRows.h b/src/include/executor/nodeLockRows.h
index 125a32b608..157d4a7f0e 100644
--- a/src/include/executor/nodeLockRows.h
+++ b/src/include/executor/nodeLockRows.h
@@ -16,6 +16,7 @@
 
 #include "nodes/execnodes.h"
 
+extern void ExecPrepLockRows(LockRows *node, ExecPrepContext *context, ExecPrepOutput *result);
 extern LockRowsState *ExecInitLockRows(LockRows *node, EState *estate, int eflags);
 extern void ExecEndLockRows(LockRowsState *node);
 extern void ExecReScanLockRows(LockRowsState *node);
diff --git a/src/include/executor/nodeMaterial.h b/src/include/executor/nodeMaterial.h
index 21a6860a1a..9b70d6e97b 100644
--- a/src/include/executor/nodeMaterial.h
+++ b/src/include/executor/nodeMaterial.h
@@ -16,6 +16,7 @@
 
 #include "nodes/execnodes.h"
 
+extern void ExecPrepMaterial(Material *node, ExecPrepContext *context, ExecPrepOutput *result);
 extern MaterialState *ExecInitMaterial(Material *node, EState *estate, int eflags);
 extern void ExecEndMaterial(MaterialState *node);
 extern void ExecMaterialMarkPos(MaterialState *node);
diff --git a/src/include/executor/nodeMemoize.h b/src/include/executor/nodeMemoize.h
index 4643163dc7..53a784f012 100644
--- a/src/include/executor/nodeMemoize.h
+++ b/src/include/executor/nodeMemoize.h
@@ -17,6 +17,7 @@
 #include "access/parallel.h"
 #include "nodes/execnodes.h"
 
+extern void ExecPrepMemoize(Memoize *node, ExecPrepContext *context, ExecPrepOutput *result);
 extern MemoizeState *ExecInitMemoize(Memoize *node, EState *estate, int eflags);
 extern void ExecEndMemoize(MemoizeState *node);
 extern void ExecReScanMemoize(MemoizeState *node);
diff --git a/src/include/executor/nodeMergeAppend.h b/src/include/executor/nodeMergeAppend.h
index 97fe3b0665..60a9136de6 100644
--- a/src/include/executor/nodeMergeAppend.h
+++ b/src/include/executor/nodeMergeAppend.h
@@ -16,6 +16,7 @@
 
 #include "nodes/execnodes.h"
 
+extern void ExecPrepMergeAppend(MergeAppend *node, ExecPrepContext *context, ExecPrepOutput *result);
 extern MergeAppendState *ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags);
 extern void ExecEndMergeAppend(MergeAppendState *node);
 extern void ExecReScanMergeAppend(MergeAppendState *node);
diff --git a/src/include/executor/nodeMergejoin.h b/src/include/executor/nodeMergejoin.h
index 26ab517508..29553d5dd0 100644
--- a/src/include/executor/nodeMergejoin.h
+++ b/src/include/executor/nodeMergejoin.h
@@ -16,6 +16,7 @@
 
 #include "nodes/execnodes.h"
 
+extern void ExecPrepMergeJoin(MergeJoin *node, ExecPrepContext *context, ExecPrepOutput *result);
 extern MergeJoinState *ExecInitMergeJoin(MergeJoin *node, EState *estate, int eflags);
 extern void ExecEndMergeJoin(MergeJoinState *node);
 extern void ExecReScanMergeJoin(MergeJoinState *node);
diff --git a/src/include/executor/nodeModifyTable.h b/src/include/executor/nodeModifyTable.h
index 1d225bc88d..4b1846f8ff 100644
--- a/src/include/executor/nodeModifyTable.h
+++ b/src/include/executor/nodeModifyTable.h
@@ -19,6 +19,7 @@ extern void ExecComputeStoredGenerated(ResultRelInfo *resultRelInfo,
 									   EState *estate, TupleTableSlot *slot,
 									   CmdType cmdtype);
 
+extern void ExecPrepModifyTable(ModifyTable *node, ExecPrepContext *context, ExecPrepOutput *result);
 extern ModifyTableState *ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags);
 extern void ExecEndModifyTable(ModifyTableState *node);
 extern void ExecReScanModifyTable(ModifyTableState *node);
diff --git a/src/include/executor/nodeNamedtuplestorescan.h b/src/include/executor/nodeNamedtuplestorescan.h
index d595124e54..964afcd816 100644
--- a/src/include/executor/nodeNamedtuplestorescan.h
+++ b/src/include/executor/nodeNamedtuplestorescan.h
@@ -16,6 +16,7 @@
 
 #include "nodes/execnodes.h"
 
+extern void ExecPrepNamedTuplestoreScan(NamedTuplestoreScan *node, ExecPrepContext *context, ExecPrepOutput *result);
 extern NamedTuplestoreScanState *ExecInitNamedTuplestoreScan(NamedTuplestoreScan *node, EState *estate, int eflags);
 extern void ExecEndNamedTuplestoreScan(NamedTuplestoreScanState *node);
 extern void ExecReScanNamedTuplestoreScan(NamedTuplestoreScanState *node);
diff --git a/src/include/executor/nodeNestloop.h b/src/include/executor/nodeNestloop.h
index b1411faf57..13ea4cc870 100644
--- a/src/include/executor/nodeNestloop.h
+++ b/src/include/executor/nodeNestloop.h
@@ -16,6 +16,7 @@
 
 #include "nodes/execnodes.h"
 
+extern void ExecPrepNestLoop(NestLoop *node, ExecPrepContext *context, ExecPrepOutput *result);
 extern NestLoopState *ExecInitNestLoop(NestLoop *node, EState *estate, int eflags);
 extern void ExecEndNestLoop(NestLoopState *node);
 extern void ExecReScanNestLoop(NestLoopState *node);
diff --git a/src/include/executor/nodeProjectSet.h b/src/include/executor/nodeProjectSet.h
index 2c2b58282c..c9b44356ba 100644
--- a/src/include/executor/nodeProjectSet.h
+++ b/src/include/executor/nodeProjectSet.h
@@ -16,6 +16,7 @@
 
 #include "nodes/execnodes.h"
 
+extern void ExecPrepProjectSet(ProjectSet *node, ExecPrepContext *context, ExecPrepOutput *result);
 extern ProjectSetState *ExecInitProjectSet(ProjectSet *node, EState *estate, int eflags);
 extern void ExecEndProjectSet(ProjectSetState *node);
 extern void ExecReScanProjectSet(ProjectSetState *node);
diff --git a/src/include/executor/nodeRecursiveunion.h b/src/include/executor/nodeRecursiveunion.h
index 2d20470da2..7b7585d594 100644
--- a/src/include/executor/nodeRecursiveunion.h
+++ b/src/include/executor/nodeRecursiveunion.h
@@ -16,6 +16,7 @@
 
 #include "nodes/execnodes.h"
 
+extern void ExecPrepRecursiveUnion(RecursiveUnion *node, ExecPrepContext *context, ExecPrepOutput *result);
 extern RecursiveUnionState *ExecInitRecursiveUnion(RecursiveUnion *node, EState *estate, int eflags);
 extern void ExecEndRecursiveUnion(RecursiveUnionState *node);
 extern void ExecReScanRecursiveUnion(RecursiveUnionState *node);
diff --git a/src/include/executor/nodeResult.h b/src/include/executor/nodeResult.h
index ebb131d265..998a50ae27 100644
--- a/src/include/executor/nodeResult.h
+++ b/src/include/executor/nodeResult.h
@@ -16,6 +16,8 @@
 
 #include "nodes/execnodes.h"
 
+extern void ExecPrepResult(Result *node, ExecPrepContext *context, ExecPrepOutput *result);
+extern ResultState *ExecInitResult(Result *node, EState *estate, int eflags);
 extern ResultState *ExecInitResult(Result *node, EState *estate, int eflags);
 extern void ExecEndResult(ResultState *node);
 extern void ExecResultMarkPos(ResultState *node);
diff --git a/src/include/executor/nodeSamplescan.h b/src/include/executor/nodeSamplescan.h
index 340b41a427..c0dd45b8bc 100644
--- a/src/include/executor/nodeSamplescan.h
+++ b/src/include/executor/nodeSamplescan.h
@@ -16,6 +16,7 @@
 
 #include "nodes/execnodes.h"
 
+extern void ExecPrepSampleScan(SampleScan *node, ExecPrepContext *context, ExecPrepOutput *result);
 extern SampleScanState *ExecInitSampleScan(SampleScan *node, EState *estate, int eflags);
 extern void ExecEndSampleScan(SampleScanState *node);
 extern void ExecReScanSampleScan(SampleScanState *node);
diff --git a/src/include/executor/nodeSeqscan.h b/src/include/executor/nodeSeqscan.h
index c225ba6e04..5452742622 100644
--- a/src/include/executor/nodeSeqscan.h
+++ b/src/include/executor/nodeSeqscan.h
@@ -17,6 +17,7 @@
 #include "access/parallel.h"
 #include "nodes/execnodes.h"
 
+extern void ExecPrepSeqScan(SeqScan *node, ExecPrepContext *context, ExecPrepOutput *result);
 extern SeqScanState *ExecInitSeqScan(SeqScan *node, EState *estate, int eflags);
 extern void ExecEndSeqScan(SeqScanState *node);
 extern void ExecReScanSeqScan(SeqScanState *node);
diff --git a/src/include/executor/nodeSetOp.h b/src/include/executor/nodeSetOp.h
index a504cf8613..bc80011513 100644
--- a/src/include/executor/nodeSetOp.h
+++ b/src/include/executor/nodeSetOp.h
@@ -16,6 +16,7 @@
 
 #include "nodes/execnodes.h"
 
+extern void ExecPrepSetOp(SetOp *node, ExecPrepContext *context, ExecPrepOutput *result);
 extern SetOpState *ExecInitSetOp(SetOp *node, EState *estate, int eflags);
 extern void ExecEndSetOp(SetOpState *node);
 extern void ExecReScanSetOp(SetOpState *node);
diff --git a/src/include/executor/nodeSort.h b/src/include/executor/nodeSort.h
index 008e6a6bc6..def930a8bc 100644
--- a/src/include/executor/nodeSort.h
+++ b/src/include/executor/nodeSort.h
@@ -17,6 +17,7 @@
 #include "access/parallel.h"
 #include "nodes/execnodes.h"
 
+extern void ExecPrepSort(Sort *node, ExecPrepContext *context, ExecPrepOutput *result);
 extern SortState *ExecInitSort(Sort *node, EState *estate, int eflags);
 extern void ExecEndSort(SortState *node);
 extern void ExecSortMarkPos(SortState *node);
diff --git a/src/include/executor/nodeSubplan.h b/src/include/executor/nodeSubplan.h
index 75cc6d5104..f6e21007fa 100644
--- a/src/include/executor/nodeSubplan.h
+++ b/src/include/executor/nodeSubplan.h
@@ -16,6 +16,7 @@
 
 #include "nodes/execnodes.h"
 
+extern void ExecPrepSubPlan(SubPlan *node, ExecPrepContext *context, ExecPrepOutput *result);
 extern SubPlanState *ExecInitSubPlan(SubPlan *subplan, PlanState *parent);
 
 extern Datum ExecSubPlan(SubPlanState *node, ExprContext *econtext, bool *isNull);
diff --git a/src/include/executor/nodeSubqueryscan.h b/src/include/executor/nodeSubqueryscan.h
index a09e2be423..3fbf053e04 100644
--- a/src/include/executor/nodeSubqueryscan.h
+++ b/src/include/executor/nodeSubqueryscan.h
@@ -16,6 +16,7 @@
 
 #include "nodes/execnodes.h"
 
+extern void ExecPrepSubqueryScan(SubqueryScan *node, ExecPrepContext *context, ExecPrepOutput *result);
 extern SubqueryScanState *ExecInitSubqueryScan(SubqueryScan *node, EState *estate, int eflags);
 extern void ExecEndSubqueryScan(SubqueryScanState *node);
 extern void ExecReScanSubqueryScan(SubqueryScanState *node);
diff --git a/src/include/executor/nodeTableFuncscan.h b/src/include/executor/nodeTableFuncscan.h
index 2b82e7d7ed..ba2e7774f1 100644
--- a/src/include/executor/nodeTableFuncscan.h
+++ b/src/include/executor/nodeTableFuncscan.h
@@ -16,6 +16,7 @@
 
 #include "nodes/execnodes.h"
 
+extern void ExecPrepTableFuncScan(TableFuncScan *node, ExecPrepContext *context, ExecPrepOutput *result);
 extern TableFuncScanState *ExecInitTableFuncScan(TableFuncScan *node, EState *estate, int eflags);
 extern void ExecEndTableFuncScan(TableFuncScanState *node);
 extern void ExecReScanTableFuncScan(TableFuncScanState *node);
diff --git a/src/include/executor/nodeTidrangescan.h b/src/include/executor/nodeTidrangescan.h
index f122e09583..333cfbb5c6 100644
--- a/src/include/executor/nodeTidrangescan.h
+++ b/src/include/executor/nodeTidrangescan.h
@@ -16,6 +16,7 @@
 
 #include "nodes/execnodes.h"
 
+extern void ExecPrepTidRangeScan(TidRangeScan *node, ExecPrepContext *context, ExecPrepOutput *result);
 extern TidRangeScanState *ExecInitTidRangeScan(TidRangeScan *node,
 											   EState *estate, int eflags);
 extern void ExecEndTidRangeScan(TidRangeScanState *node);
diff --git a/src/include/executor/nodeTidscan.h b/src/include/executor/nodeTidscan.h
index 91a5f89f42..188f3f3f97 100644
--- a/src/include/executor/nodeTidscan.h
+++ b/src/include/executor/nodeTidscan.h
@@ -16,6 +16,7 @@
 
 #include "nodes/execnodes.h"
 
+extern void ExecPrepTidScan(TidScan *node, ExecPrepContext *context, ExecPrepOutput *result);
 extern TidScanState *ExecInitTidScan(TidScan *node, EState *estate, int eflags);
 extern void ExecEndTidScan(TidScanState *node);
 extern void ExecReScanTidScan(TidScanState *node);
diff --git a/src/include/executor/nodeUnique.h b/src/include/executor/nodeUnique.h
index 61f09d9853..970e894681 100644
--- a/src/include/executor/nodeUnique.h
+++ b/src/include/executor/nodeUnique.h
@@ -16,6 +16,7 @@
 
 #include "nodes/execnodes.h"
 
+extern void ExecPrepUnique(Unique *node, ExecPrepContext *context, ExecPrepOutput *result);
 extern UniqueState *ExecInitUnique(Unique *node, EState *estate, int eflags);
 extern void ExecEndUnique(UniqueState *node);
 extern void ExecReScanUnique(UniqueState *node);
diff --git a/src/include/executor/nodeValuesscan.h b/src/include/executor/nodeValuesscan.h
index 07c13ef123..f08bb080eb 100644
--- a/src/include/executor/nodeValuesscan.h
+++ b/src/include/executor/nodeValuesscan.h
@@ -16,6 +16,7 @@
 
 #include "nodes/execnodes.h"
 
+extern void ExecPrepValuesScan(ValuesScan *node, ExecPrepContext *context, ExecPrepOutput *result);
 extern ValuesScanState *ExecInitValuesScan(ValuesScan *node, EState *estate, int eflags);
 extern void ExecEndValuesScan(ValuesScanState *node);
 extern void ExecReScanValuesScan(ValuesScanState *node);
diff --git a/src/include/executor/nodeWindowAgg.h b/src/include/executor/nodeWindowAgg.h
index 4e62c8936d..a4d8487aba 100644
--- a/src/include/executor/nodeWindowAgg.h
+++ b/src/include/executor/nodeWindowAgg.h
@@ -16,6 +16,7 @@
 
 #include "nodes/execnodes.h"
 
+extern void ExecPrepWindowAgg(WindowAgg *node, ExecPrepContext *context, ExecPrepOutput *result);
 extern WindowAggState *ExecInitWindowAgg(WindowAgg *node, EState *estate, int eflags);
 extern void ExecEndWindowAgg(WindowAggState *node);
 extern void ExecReScanWindowAgg(WindowAggState *node);
diff --git a/src/include/executor/nodeWorktablescan.h b/src/include/executor/nodeWorktablescan.h
index 17842de576..5f7f76ec85 100644
--- a/src/include/executor/nodeWorktablescan.h
+++ b/src/include/executor/nodeWorktablescan.h
@@ -16,6 +16,7 @@
 
 #include "nodes/execnodes.h"
 
+extern void ExecPrepWorkTableScan(WorkTableScan *node, ExecPrepContext *context, ExecPrepOutput *result);
 extern WorkTableScanState *ExecInitWorkTableScan(WorkTableScan *node, EState *estate, int eflags);
 extern void ExecEndWorkTableScan(WorkTableScanState *node);
 extern void ExecReScanWorkTableScan(WorkTableScanState *node);
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index dd95dc40c7..7b03f46966 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -570,6 +570,8 @@ typedef struct EState
 	struct ExecRowMark **es_rowmarks;	/* Array of per-range-table-entry
 										 * ExecRowMarks, or NULL if none */
 	PlannedStmt *es_plannedstmt;	/* link to top of plan tree */
+	struct ExecPrepOutput *es_execprep;	/* link to ExecPrepOutput, if one was
+										 * passed to ExecutorStart() */
 	const char *es_sourceText;	/* Source text from QueryDesc */
 
 	JunkFilter *es_junkFilter;	/* top-level junk filter, if any */
@@ -958,6 +960,82 @@ typedef struct DomainConstraintState
  */
 typedef TupleTableSlot *(*ExecProcNodeMtd) (struct PlanState *pstate);
 
+/*----------------
+ * ExecPrepContext
+ *
+ * Context information for performing ExecutorPrep() on a given plan
+ */
+typedef struct ExecPrepContext
+{
+	NodeTag		type;
+
+	PlannedStmt	   *stmt;		/* target plan */
+	ParamListInfo	params;		/* EXTERN parameters to prune with */
+} ExecPrepContext;
+
+/*----------------
+ * ExecPrepOutput
+ *
+ * Result of performing ExecutorPrep() for a given PlannedStmt
+ */
+typedef struct ExecPrepOutput
+{
+	NodeTag		type;
+
+	Bitmapset  *relationRTIs;		/* RT indexes of RTE_RELATIONs */
+	int			numPlanNodes;		/* PlannedStmt.numPlanNodes */
+
+	/*
+	 * Array of 'numPlanNodes' elements containing PlanPrepOutput nodes
+	 * for each node in the plan tree, indexed using the node's plan_node_id.
+	 * A NULL value means that the corresponding plan node does not have a
+	 * PlanPrepOutput associated with it.
+	 */
+	struct PlanPrepOutput **planPrepResults;
+} ExecPrepOutput;
+
+#define	ExecPrepStorePlanPrepOutput(execprep, planPrepResult, plannode) \
+	(execprep)->planPrepResults[(plannode)->plan_node_id] = (planPrepResult)
+
+#define	ExecPrepFetchPlanPrepOutput(execprep, plannode) \
+		((execprep) != NULL ? \
+		 (execprep)->planPrepResults[(plannode)->plan_node_id] : NULL)
+
+#ifdef USE_ASSERT_CHECKING
+#define	EXEC_PREP_OUTPUT_SANITY(plannode, estate) \
+	do { \
+		PlanPrepOutput *planPrepOutput = \
+		ExecPrepFetchPlanPrepOutput((estate)->es_execprep, plannode); \
+		Assert(planPrepOutput == NULL || \
+			   (IsA(planPrepOutput, PlanPrepOutput) && \
+				planPrepOutput->plan_node_id == (plannode)->plan_node_id)); \
+	} while (0)
+#else
+#define	EXEC_PREP_OUTPUT_SANITY(plannode, estate)
+#endif
+
+/* ---------------
+ * PlanPrepOutput
+ *
+ * ExecutorPrep() creates a node of this type for every node in the Plan tree
+ * that does some "prep" work.
+ */
+typedef struct PlanPrepOutput
+{
+	NodeTag		type;
+
+	int			plan_node_id;		/* associated Plan node */
+
+	/* Information collected by ExecPrepNode subroutine for the node */
+
+	/*
+	 * For nodes that contain a list of prunable subnodes, the following
+	 * contains offsets into that list, of the subnodes that survive initial
+	 * partition pruning.
+	 */
+	Bitmapset  *initially_valid_subnodes;
+} PlanPrepOutput;
+
 /* ----------------
  *		PlanState node
  *
diff --git a/src/include/nodes/nodeFuncs.h b/src/include/nodes/nodeFuncs.h
index 93c60bde66..fca107ad65 100644
--- a/src/include/nodes/nodeFuncs.h
+++ b/src/include/nodes/nodeFuncs.h
@@ -158,5 +158,8 @@ extern bool raw_expression_tree_walker(Node *node, bool (*walker) (),
 struct PlanState;
 extern bool planstate_tree_walker(struct PlanState *planstate, bool (*walker) (),
 								  void *context);
+struct Plan;
+extern bool plan_tree_walker(struct Plan *plan, bool (*walker) (),
+				 void *context);
 
 #endif							/* NODEFUNCS_H */
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index da35f2c272..8db017a138 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -96,6 +96,11 @@ typedef enum NodeTag
 	T_PartitionPruneStepCombine,
 	T_PlanInvalItem,
 
+	/* TAGS FOR EXECUTOR PREP NODES (execnodes.h) */
+	T_ExecPrepContext,
+	T_ExecPrepOutput,
+	T_PlanPrepOutput,
+
 	/*
 	 * TAGS FOR PLAN STATE NODES (execnodes.h)
 	 *
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 1f3845b3fe..ffde93ef13 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -101,6 +101,9 @@ typedef struct PlannerGlobal
 
 	List	   *finalrtable;	/* "flat" rangetable for executor */
 
+	Bitmapset  *relationRTIs;	/* Indexes of RTE_RELATION entries in range
+								 * table */
+
 	List	   *finalrowmarks;	/* "flat" list of PlanRowMarks */
 
 	List	   *resultRelations;	/* "flat" list of integer RT indexes */
@@ -129,6 +132,9 @@ typedef struct PlannerGlobal
 
 	char		maxParallelHazard;	/* worst PROPARALLEL hazard level */
 
+	bool		usesPreExecPruning;	/* Do some Plan nodes use pre-execution
+									 * partition pruning */
+
 	PartitionDirectory partition_directory; /* partition descriptors */
 } PlannerGlobal;
 
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 0b518ce6b2..69bc5f918c 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -59,12 +59,20 @@ typedef struct PlannedStmt
 
 	bool		parallelModeNeeded; /* parallel mode required to execute? */
 
+	bool		usesPreExecPruning;	/* Do some Plan nodes use pre-execution
+									 * partition pruning */
+
 	int			jitFlags;		/* which forms of JIT should be performed */
 
 	struct Plan *planTree;		/* tree of Plan nodes */
 
+	int			numPlanNodes;	/* number of nodes in planTree */
+
 	List	   *rtable;			/* list of RangeTblEntry nodes */
 
+	Bitmapset  *relationRTIs;	/* Indexes of RTE_RELATION entries in range
+								 * table */
+
 	/* rtable indexes of target relations for INSERT/UPDATE/DELETE */
 	List	   *resultRelations;	/* integer list of RT indexes, or NIL */
 
@@ -1172,6 +1180,13 @@ typedef struct PlanRowMark
  * prune_infos			List of Lists containing PartitionedRelPruneInfo nodes,
  *						one sublist per run-time-prunable partition hierarchy
  *						appearing in the parent plan node's subplans.
+ *
+ * contains_init_steps	Does any of the PartitionedRelPruneInfos in
+ *						prune_infos have its initial_pruning_steps set?
+ *
+ * contains_exec_steps	Does any of the PartitionedRelPruneInfos in
+ *						prune_infos have its exec_pruning_steps set?
+ *
  * other_subplans		Indexes of any subplans that are not accounted for
  *						by any of the PartitionedRelPruneInfo nodes in
  *						"prune_infos".  These subplans must not be pruned.
@@ -1180,6 +1195,8 @@ typedef struct PartitionPruneInfo
 {
 	NodeTag		type;
 	List	   *prune_infos;
+	bool		contains_init_steps;
+	bool		contains_exec_steps;
 	Bitmapset  *other_subplans;
 } PartitionPruneInfo;
 
diff --git a/src/include/partitioning/partprune.h b/src/include/partitioning/partprune.h
index ee11b6feae..90684efa25 100644
--- a/src/include/partitioning/partprune.h
+++ b/src/include/partitioning/partprune.h
@@ -41,6 +41,7 @@ struct RelOptInfo;
  *					subsidiary data, such as the FmgrInfos.
  * planstate		Points to the parent plan node's PlanState when called
  *					during execution; NULL when called from the planner.
+ * exprcontext		ExprContext to use when evaluating pruning expressions
  * exprstates		Array of ExprStates, indexed as per PruneCxtStateIdx; one
  *					for each partition key in each pruning step.  Allocated if
  *					planstate is non-NULL, otherwise NULL.
@@ -56,6 +57,7 @@ typedef struct PartitionPruneContext
 	FmgrInfo   *stepcmpfuncs;
 	MemoryContext ppccontext;
 	PlanState  *planstate;
+	ExprContext *exprcontext;
 	ExprState **exprstates;
 } PartitionPruneContext;
 
diff --git a/src/include/tcop/tcopprot.h b/src/include/tcop/tcopprot.h
index 15a11bc3ff..02124af4ed 100644
--- a/src/include/tcop/tcopprot.h
+++ b/src/include/tcop/tcopprot.h
@@ -59,7 +59,7 @@ extern PlannedStmt *pg_plan_query(Query *querytree, const char *query_string,
 								  ParamListInfo boundParams);
 extern List *pg_plan_queries(List *querytrees, const char *query_string,
 							 int cursorOptions,
-							 ParamListInfo boundParams);
+							 ParamListInfo boundParams, List **stmt_execprep_list);
 
 extern bool check_max_stack_depth(int *newval, void **extra, GucSource source);
 extern void assign_max_stack_depth(int newval, void *extra);
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 95b99e3d25..14794972a0 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -148,6 +148,9 @@ typedef struct CachedPlan
 {
 	int			magic;			/* should equal CACHEDPLAN_MAGIC */
 	List	   *stmt_list;		/* list of PlannedStmts */
+	List	   *stmt_execprep_list;	/* list of ExecPrepOutput nodes, one for
+									 * each element of stmt_list; NIL if
+									 * not a generic plan */
 	bool		is_oneshot;		/* is it a "oneshot" plan? */
 	bool		is_saved;		/* is CachedPlan in a long-lived context? */
 	bool		is_valid;		/* is the stmt_list currently valid? */
@@ -158,6 +161,8 @@ typedef struct CachedPlan
 	int			generation;		/* parent's generation number for this plan */
 	int			refcount;		/* count of live references to this struct */
 	MemoryContext context;		/* context containing this CachedPlan */
+	MemoryContext execprep_context;	/* context containing stmt_execprep_list,
+									 * a child of the above context */
 } CachedPlan;
 
 /*
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index aeddbdafe5..03c39ff97a 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -137,6 +137,10 @@ typedef struct PortalData
 	CommandTag	commandTag;		/* command tag for original query */
 	QueryCompletion qc;			/* command completion data for executed query */
 	List	   *stmts;			/* list of PlannedStmts */
+	List	   *stmt_execpreps;	/* list of ExecPrepOutput nodes, one for each
+								 * element of 'stmts'; same as
+								 * cplan->stmt_execprep_list if cplan is
+								 * not NULL */
 	CachedPlan *cplan;			/* CachedPlan, if stmts are from one */
 
 	ParamListInfo portalParams; /* params to pass to query */
@@ -241,6 +245,7 @@ extern void PortalDefineQuery(Portal portal,
 							  const char *sourceText,
 							  CommandTag commandTag,
 							  List *stmts,
+							  List *stmt_execpreps,
 							  CachedPlan *cplan);
 extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
 extern void PortalCreateHoldStore(Portal portal);
-- 
2.24.1



^ permalink  raw  reply  [nested|flat] 108+ messages in thread

* Re: generic plans and "initial" pruning
@ 2022-02-10 22:01  Robert Haas <[email protected]>
  parent: Amit Langote <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: Robert Haas @ 2022-02-10 22:01 UTC (permalink / raw)
  To: Amit Langote <[email protected]>; +Cc: pgsql-hackers; David Rowley *EXTERN* <[email protected]>

On Thu, Feb 10, 2022 at 3:14 AM Amit Langote <[email protected]> wrote:
> Maybe this should be more than one patch?  Say:
>
> 0001 to add ExecutorPrep and the boilerplate,
> 0002 to teach plancache.c to use the new facility

Could be, not sure. I agree that if it's possible to split this in a
meaningful way, it would facilitate review. I notice that there is
some straight code movement e.g. the creation of
ExecPartitionPruneFixSubPlanIndexes. It would be best, I think, to do
pure code movement in a preparatory patch so that the main patch is
just adding the new stuff we need and not moving stuff around.

David Rowley recently proposed a patch for some parallel-safety
debugging cross checks which added a plan tree walker. I'm not sure
whether he's going to press that patch forward to commit, but I think
we should get something like that into the tree and start using it,
rather than adding more bespoke code. Maybe you/we should steal that
part of his patch and commit it separately. What I'm imagining is that
plan_tree_walker() would know which nodes have subnodes and how to
recurse over the tree structure, and you'd have a walker function to
use with it that would know which executor nodes have ExecPrep
functions and call them, and just do nothing for the others. That
would spare you adding stub functions for nodes that don't need to do
anything, or don't need to do anything other than recurse. Admittedly
it would look a bit different from the existing executor phases, but
I'd argue that it's a better coding model.

Actually, you might've had this in the patch at some point, because
you have a declaration for plan_tree_walker but no implementation. I
guess one thing that's a bit awkward about this idea is that in some
cases you want to recurse to some subnodes but not other subnodes. But
maybe it would work to put the recursion in the walker function in
that case, and then just return true; but if you want to walk all
children, return false.

+ bool contains_init_steps;
+ bool contains_exec_steps;

s/steps/pruning/? maybe with contains -> needs or performs or requires as well?

+ * Returned information includes the set of RT indexes of relations referenced
+ * in the plan, and a PlanPrepOutput node for each node in the planTree if the
+ * node type supports producing one.

Aren't all RT indexes referenced in the plan?

+ * This may lock relations whose information may be used to produce the
+ * PlanPrepOutput nodes. For example, a partitioned table before perusing its
+ * PartitionPruneInfo contained in an Append node to do the pruning the result
+ * of which is used to populate the Append node's PlanPrepOutput.

"may lock" feels awfully fuzzy to me. How am I supposed to rely on
something that "may" happen? And don't we need to have tight logic
around locking, with specific guarantees about what is locked at which
points in the code and what is not?

+ * At least one of 'planstate' or 'econtext' must be passed to be able to
+ * successfully evaluate any non-Const expressions contained in the
+ * steps.

This also seems fuzzy. If I'm thinking of calling this function, I
don't know how I'd know whether this criterion is met.

I don't love PlanPrepOutput the way you have it. I think one of the
basic design issues for this patch is: should we think of the prep
phase as specifically pruning, or is it general prep and pruning is
the first thing for which we're going to use it? If it's really a
pre-pruning phase, we could name it that way instead of calling it
"prep". If it's really a general prep phase, then why does
PlanPrepOutput contain initially_valid_subnodes as a field? One could
imagine letting each prep function decide what kind of prep node it
would like to return, with partition pruning being just one of the
options. But is that a useful generalization of the basic concept, or
just pretending that a special-purpose mechanism is more general than
it really is?

+ return CreateQueryDesc(pstmt, NULL, /* XXX pass ExecPrepOutput too? */

It seems to me that we should do what the XXX suggests. It doesn't
seem nice if the parallel workers could theoretically decide to prune
a different set of nodes than the leader.

+ * known at executor startup (excludeing expressions containing

Extra e.

+ * into subplan indexes, is also returned for use during subsquent

Missing e.

Somewhere, we're going to need to document the idea that this may
permit us to execute a plan that isn't actually fully valid, but that
we expect to survive because we'll never do anything with the parts of
it that aren't. Maybe that should be added to the executor README, or
maybe there's some better place, but I don't think that should remain
something that's just implicit.

This is not a full review, just some initial thoughts looking through this.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

* Re: generic plans and "initial" pruning
@ 2022-03-07 14:18  Amit Langote <[email protected]>
  parent: Robert Haas <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: Amit Langote @ 2022-03-07 14:18 UTC (permalink / raw)
  To: Robert Haas <[email protected]>; +Cc: pgsql-hackers; David Rowley *EXTERN* <[email protected]>

On Fri, Feb 11, 2022 at 7:02 AM Robert Haas <[email protected]> wrote:
> On Thu, Feb 10, 2022 at 3:14 AM Amit Langote <[email protected]> wrote:
> > Maybe this should be more than one patch?  Say:
> >
> > 0001 to add ExecutorPrep and the boilerplate,
> > 0002 to teach plancache.c to use the new facility

Thanks for taking a look and sorry about the delay.

> Could be, not sure. I agree that if it's possible to split this in a
> meaningful way, it would facilitate review. I notice that there is
> some straight code movement e.g. the creation of
> ExecPartitionPruneFixSubPlanIndexes. It would be best, I think, to do
> pure code movement in a preparatory patch so that the main patch is
> just adding the new stuff we need and not moving stuff around.

Okay, created 0001 for moving around the execution pruning code.

> David Rowley recently proposed a patch for some parallel-safety
> debugging cross checks which added a plan tree walker. I'm not sure
> whether he's going to press that patch forward to commit, but I think
> we should get something like that into the tree and start using it,
> rather than adding more bespoke code. Maybe you/we should steal that
> part of his patch and commit it separately.

I looked at the thread you mentioned (I guess [1]), though it seems
David's proposing a path_tree_walker(), so I guess only useful within
the planner and not here.

> What I'm imagining is that
> plan_tree_walker() would know which nodes have subnodes and how to
> recurse over the tree structure, and you'd have a walker function to
> use with it that would know which executor nodes have ExecPrep
> functions and call them, and just do nothing for the others. That
> would spare you adding stub functions for nodes that don't need to do
> anything, or don't need to do anything other than recurse. Admittedly
> it would look a bit different from the existing executor phases, but
> I'd argue that it's a better coding model.
>
> Actually, you might've had this in the patch at some point, because
> you have a declaration for plan_tree_walker but no implementation.

Right, the previous patch did use a plan_tree_walker() for this, and I
think it worked the way you seem to think it should.

I do agree that plan_tree_walker() allows for a better implementation
of the idea of this patch and may also be generally useful, so I've
created a separate patch that adds it to nodeFuncs.c.

> I guess one thing that's a bit awkward about this idea is that in some
> cases you want to recurse to some subnodes but not other subnodes. But
> maybe it would work to put the recursion in the walker function in
> that case, and then just return true; but if you want to walk all
> children, return false.

Right, that's how I've made ExecPrepAppend() etc. do it.
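
For readers following along, here is a minimal sketch of that arrangement
using a toy tree rather than real Plan nodes.  Every name in it (ToyNode,
toy_tree_walker, collect_rels_walker, toy_demo) is hypothetical and not taken
from the patch.  It assumes the usual nodeFuncs.c convention, in which the
walker visits a node and then calls the generic tree walker to descend, and a
true return aborts the walk.  A node that wants only some of its children
visited does that recursion itself and skips the generic one.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_KIDS 4

typedef enum { TOY_SCAN, TOY_JOIN, TOY_APPEND } ToyTag;

/* Hypothetical stand-in for a Plan node; not a PostgreSQL structure. */
typedef struct ToyNode
{
	ToyTag		tag;
	int			rti;			/* range table index (scans only) */
	unsigned	live_kids;		/* Append only: bit i set = child i survives */
	int			nkids;
	struct ToyNode *kids[MAX_KIDS];
} ToyNode;

typedef bool (*toy_walker) (ToyNode *node, void *context);

/* Generic recursion: visit every child; a true result aborts the walk. */
static bool
toy_tree_walker(ToyNode *node, toy_walker walker, void *context)
{
	for (int i = 0; i < node->nkids; i++)
	{
		if (walker(node->kids[i], context))
			return true;
	}
	return false;
}

/* Records the RT indexes of scans that remain reachable after pruning. */
static bool
collect_rels_walker(ToyNode *node, void *context)
{
	unsigned   *rels = (unsigned *) context;

	if (node == NULL)
		return false;

	switch (node->tag)
	{
		case TOY_SCAN:
			*rels |= 1u << node->rti;
			return false;
		case TOY_APPEND:
			/* Selective recursion: descend only into surviving children. */
			for (int i = 0; i < node->nkids; i++)
			{
				if ((node->live_kids & (1u << i)) &&
					collect_rels_walker(node->kids[i], context))
					return true;
			}
			return false;		/* generic recursion deliberately skipped */
		default:
			/* Nothing special to do: let the generic walker visit all. */
			return toy_tree_walker(node, collect_rels_walker, context);
	}
}

/* Join of scan 1 with an Append over scans 2, 3, 4; only scan 3 survives. */
static unsigned
toy_demo(void)
{
	ToyNode		s1 = {TOY_SCAN, 1, 0, 0, {NULL}};
	ToyNode		s2 = {TOY_SCAN, 2, 0, 0, {NULL}};
	ToyNode		s3 = {TOY_SCAN, 3, 0, 0, {NULL}};
	ToyNode		s4 = {TOY_SCAN, 4, 0, 0, {NULL}};
	ToyNode		app = {TOY_APPEND, 0, 1u << 1, 3, {&s2, &s3, &s4}};
	ToyNode		join = {TOY_JOIN, 0, 0, 2, {&s1, &app}};
	unsigned	rels = 0;

	(void) collect_rels_walker(&join, &rels);
	return rels;
}
```

In this toy, toy_demo() collects only RT indexes 1 and 3; the pruned Append
children contribute nothing, which is the property the real patch is after.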

> + bool contains_init_steps;
> + bool contains_exec_steps;
>
> s/steps/pruning/? maybe with contains -> needs or performs or requires as well?

Went with: needs_{init|exec}_pruning

> + * Returned information includes the set of RT indexes of relations referenced
> + * in the plan, and a PlanPrepOutput node for each node in the planTree if the
> + * node type supports producing one.
>
> Aren't all RT indexes referenced in the plan?

Ah yes.  How about:

 * Returned information includes the set of RT indexes of relations that must
 * be locked to safely execute the plan,

> + * This may lock relations whose information may be used to produce the
> + * PlanPrepOutput nodes. For example, a partitioned table before perusing its
> + * PartitionPruneInfo contained in an Append node to do the pruning the result
> + * of which is used to populate the Append node's PlanPrepOutput.
>
> "may lock" feels awfully fuzzy to me. How am I supposed to rely on
> something that "may" happen? And don't we need to have tight logic
> around locking, with specific guarantees about what is locked at which
> points in the code and what is not?

Agreed, the wording was fuzzy.  I've rewritten it as:

 * This locks relations whose information is needed to produce the
 * PlanPrepOutput nodes. For example, a partitioned table before perusing its
 * PartitionedRelPruneInfo contained in an Append node to do the pruning, the
 * result of which is used to populate the Append node's PlanPrepOutput.

BTW, I've added an Assert in ExecGetRangeTableRelation():

   /*
    * A cross-check that AcquireExecutorLocks() hasn't missed any relations
    * that it should have locked.
    */
   Assert(estate->es_execprep == NULL ||
          bms_is_member(rti, estate->es_execprep->relationRTIs));

which IOW ensures that the actual execution of a plan only sees
relations that ExecutorPrep() would've told AcquireExecutorLocks() to
take a lock on.
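
To spell out the invariant that Assert is checking, here is a small
standalone sketch.  It is only an illustration: ToyLockInfo, ToyEState,
toy_rel_is_locked and toy_get_relation are made-up names, and a 64-bit mask
stands in for the Bitmapset of RT indexes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the prep output: bit N set = RT index N locked. */
typedef struct ToyLockInfo
{
	uint64_t	lockrels;
} ToyLockInfo;

/* Hypothetical stand-in for EState; lockinfo is NULL when no prep was done. */
typedef struct ToyEState
{
	const ToyLockInfo *lockinfo;
} ToyEState;

/*
 * With no prep output, every relation in the range table is assumed to have
 * been locked the old way, so any RT index passes.  Otherwise the index must
 * be a member of the set that was handed to the lock-taking code.
 * Valid only for 0 <= rti < 64 in this toy.
 */
static bool
toy_rel_is_locked(const ToyEState *estate, int rti)
{
	if (estate->lockinfo == NULL)
		return true;
	return ((estate->lockinfo->lockrels >> rti) & 1) != 0;
}

/* Opening a relation cross-checks that it was in the locked set. */
static int
toy_get_relation(const ToyEState *estate, int rti)
{
	assert(toy_rel_is_locked(estate, rti));
	return rti;
}
```

If execution ever reached a relation belonging to a pruned subplan, the check
would fail, because that relation's RT index was never added to the set.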

> + * At least one of 'planstate' or 'econtext' must be passed to be able to
> + * successfully evaluate any non-Const expressions contained in the
> + * steps.
>
> This also seems fuzzy. If I'm thinking of calling this function, I
> don't know how I'd know whether this criterion is met.

OK, I have removed this comment (which was on top of a static local
function) in favor of adding some commentary on this in places where
it belongs.  For example, in ExecPrepDoInitialPruning():

    /*
     * We don't yet have a PlanState for the parent plan node, so must create
     * a standalone ExprContext to evaluate pruning expressions, equipped with
     * the information about the EXTERN parameters that the caller passed us.
     * Note that that's okay because the initial pruning steps do not
     * involve anything that requires the execution to have started.
     */
    econtext = CreateStandaloneExprContext();
    econtext->ecxt_param_list_info = params;
    prunestate = ExecCreatePartitionPruneState(NULL, pruneinfo,
                                               true, false,
                                               rtable, econtext,
                                               pdir, parentrelids);

> I don't love PlanPrepOutput the way you have it. I think one of the
> basic design issues for this patch is: should we think of the prep
> phase as specifically pruning, or is it general prep and pruning is
> the first thing for which we're going to use it? If it's really a
> pre-pruning phase, we could name it that way instead of calling it
> "prep". If it's really a general prep phase, then why does
> PlanPrepOutput contain initially_valid_subnodes as a field? One could
> imagine letting each prep function decide what kind of prep node it
> would like to return, with partition pruning being just one of the
> options. But is that a useful generalization of the basic concept, or
> just pretending that a special-purpose mechanism is more general than
> it really is?

While it can feel like the latter TBH, I'm inclined to keep
ExecutorPrep generalized.  What bothers me about the
alternative of calling the new phase something less generalized like
ExecutorDoInitPruning() is that that makes the somewhat elaborate API
changes needed to put the phase's output into QueryDesc, through
which it ultimately reaches the main executor, seem less worthwhile.

I agree that PlanPrepOutput design needs to be likewise generalized,
maybe like you suggest -- using PlanInitPruningOutput, a child class
of PlanPrepOutput, to return the prep output for plan nodes that
support pruning.

Thoughts?

> + return CreateQueryDesc(pstmt, NULL, /* XXX pass ExecPrepOutput too? */
>
> It seems to me that we should do what the XXX suggests. It doesn't
> seem nice if the parallel workers could theoretically decide to prune
> a different set of nodes than the leader.

OK, will fix.

> + * known at executor startup (excludeing expressions containing
>
> Extra e.
>
> + * into subplan indexes, is also returned for use during subsquent
>
> Missing e.

Will fix.

> Somewhere, we're going to need to document the idea that this may
> permit us to execute a plan that isn't actually fully valid, but that
> we expect to survive because we'll never do anything with the parts of
> it that aren't. Maybe that should be added to the executor README, or
> maybe there's some better place, but I don't think that should remain
> something that's just implicit.

Agreed.  I'd added a description of the new prep phase to executor
README, though the text didn't mention this particular bit.  Will fix
to mention it.

> This is not a full review, just some initial thoughts looking through this.

Thanks again. Will post a new version soon after a bit more polishing.

--
Amit Langote
EDB: http://www.enterprisedb.com

[1] https://www.postgresql.org/message-id/flat/b59605fecb20ba9ea94e70ab60098c237c870628.camel%40postgres...

* Re: generic plans and "initial" pruning
@ 2022-03-11 14:35  Amit Langote <[email protected]>
  parent: Amit Langote <[email protected]>
  0 siblings, 2 replies; 108+ messages in thread

From: Amit Langote @ 2022-03-11 14:35 UTC (permalink / raw)
  To: Robert Haas <[email protected]>; +Cc: pgsql-hackers; David Rowley *EXTERN* <[email protected]>

On Mon, Mar 7, 2022 at 11:18 PM Amit Langote <[email protected]> wrote:
> On Fri, Feb 11, 2022 at 7:02 AM Robert Haas <[email protected]> wrote:
> > I don't love PlanPrepOutput the way you have it. I think one of the
> > basic design issues for this patch is: should we think of the prep
> > phase as specifically pruning, or is it general prep and pruning is
> > the first thing for which we're going to use it? If it's really a
> > pre-pruning phase, we could name it that way instead of calling it
> > "prep". If it's really a general prep phase, then why does
> > PlanPrepOutput contain initially_valid_subnodes as a field? One could
> > imagine letting each prep function decide what kind of prep node it
> > would like to return, with partition pruning being just one of the
> > options. But is that a useful generalization of the basic concept, or
> > just pretending that a special-purpose mechanism is more general than
> > it really is?
>
> While it can feel like the latter TBH, I'm inclined to keep
> ExecutorPrep generalized.  What bothers me about the
> alternative of calling the new phase something less generalized like
> ExecutorDoInitPruning() is that that makes the somewhat elaborate API
> changes needed to put the phase's output into QueryDesc, through
> which it ultimately reaches the main executor, seem less worthwhile.
>
> I agree that PlanPrepOutput design needs to be likewise generalized,
> maybe like you suggest -- using PlanInitPruningOutput, a child class
> of PlanPrepOutput, to return the prep output for plan nodes that
> support pruning.
>
> Thoughts?

So I decided to agree with you after all about limiting the scope of
this new executor interface, or IOW call it what it is.

I have named it ExecutorGetLockRels() to go with the only use case we
know for it -- get the set of relations for AcquireExecutorLocks() to
lock to validate a plan tree.  Its result is returned in a node named
ExecLockRelsInfo, which contains the set of relations scanned in the
plan tree (lockrels) and a list of PlanInitPruningOutput nodes for all
nodes that undergo pruning.

> > + return CreateQueryDesc(pstmt, NULL, /* XXX pass ExecPrepOutput too? */
> >
> > It seems to me that we should do what the XXX suggests. It doesn't
> > seem nice if the parallel workers could theoretically decide to prune
> > a different set of nodes than the leader.
>
> OK, will fix.

Done.  This required adding nodeToString() and stringToNode() support
for the nodes produced by the new executor function that wasn't there
before.

> > Somewhere, we're going to need to document the idea that this may
> > permit us to execute a plan that isn't actually fully valid, but that
> > we expect to survive because we'll never do anything with the parts of
> > it that aren't. Maybe that should be added to the executor README, or
> > maybe there's some better place, but I don't think that should remain
> > something that's just implicit.
>
> Agreed.  I'd added a description of the new prep phase to executor
> README, though the text didn't mention this particular bit.  Will fix
> to mention it.

Rewrote the comments above ExecutorGetLockRels() (previously
ExecutorPrep()) and the executor README text to be explicit about the
fact that not locking some relations effectively invalidates pruned
parts of the plan tree.

> > This is not a full review, just some initial thoughts looking through this.
>
> Thanks again. Will post a new version soon after a bit more polishing.

Attached is v5, now broken into 3 patches:

0001: Some refactoring of runtime pruning code
0002: Add a plan_tree_walker
0003: Teach AcquireExecutorLocks to skip locking pruned relations

-- 
Amit Langote
EDB: http://www.enterprisedb.com


Attachments:

  [application/octet-stream] v5-0002-Add-a-plan_tree_walker.patch (3.9K, 2-v5-0002-Add-a-plan_tree_walker.patch)
  download | inline diff:
From 22ff31c7b052eabb32f4a529c48fe48180332156 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Thu, 3 Mar 2022 16:04:13 +0900
Subject: [PATCH v5 2/3] Add a plan_tree_walker()

Like planstate_tree_walker() but for uninitialized plan trees.
---
 src/backend/nodes/nodeFuncs.c | 116 ++++++++++++++++++++++++++++++++++
 src/include/nodes/nodeFuncs.h |   3 +
 2 files changed, 119 insertions(+)

diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c
index 47d0564fa2..cdf937f127 100644
--- a/src/backend/nodes/nodeFuncs.c
+++ b/src/backend/nodes/nodeFuncs.c
@@ -31,6 +31,10 @@ static bool planstate_walk_subplans(List *plans, bool (*walker) (),
 									void *context);
 static bool planstate_walk_members(PlanState **planstates, int nplans,
 								   bool (*walker) (), void *context);
+static bool plan_walk_subplans(List *plans,
+				   bool (*walker) (),
+				   void *context);
+static bool plan_walk_members(List *plans, bool (*walker) (), void *context);
 
 
 /*
@@ -4148,3 +4152,115 @@ planstate_walk_members(PlanState **planstates, int nplans,
 
 	return false;
 }
+
+/*
+ * plan_tree_walker --- walk plantrees
+ *
+ * The walker has already visited the current node, and so we need only
+ * recurse into any sub-nodes it has.
+ */
+bool
+plan_tree_walker(Plan *plan,
+				 bool (*walker) (),
+				 void *context)
+{
+	/* Guard against stack overflow due to overly complex plan trees */
+	check_stack_depth();
+
+	/* initPlan-s */
+	if (plan_walk_subplans(plan->initPlan, walker, context))
+		return true;
+
+	/* lefttree */
+	if (outerPlan(plan))
+	{
+		if (walker(outerPlan(plan), context))
+			return true;
+	}
+
+	/* righttree */
+	if (innerPlan(plan))
+	{
+		if (walker(innerPlan(plan), context))
+			return true;
+	}
+
+	/* special child plans */
+	switch (nodeTag(plan))
+	{
+		case T_Append:
+			if (plan_walk_members(((Append *) plan)->appendplans,
+								  walker, context))
+				return true;
+			break;
+		case T_MergeAppend:
+			if (plan_walk_members(((MergeAppend *) plan)->mergeplans,
+								  walker, context))
+				return true;
+			break;
+		case T_BitmapAnd:
+			if (plan_walk_members(((BitmapAnd *) plan)->bitmapplans,
+								  walker, context))
+				return true;
+			break;
+		case T_BitmapOr:
+			if (plan_walk_members(((BitmapOr *) plan)->bitmapplans,
+								  walker, context))
+				return true;
+			break;
+		case T_CustomScan:
+			if (plan_walk_members(((CustomScan *) plan)->custom_plans,
+								  walker, context))
+				return true;
+			break;
+		case T_SubqueryScan:
+			if (walker(((SubqueryScan *) plan)->subplan, context))
+				return true;
+			break;
+		default:
+			break;
+	}
+
+	return false;
+}
+
+/*
+ * Walk a list of SubPlans (or initPlans, which also use SubPlan nodes).
+ */
+static bool
+plan_walk_subplans(List *plans,
+				   bool (*walker) (),
+				   void *context)
+{
+	ListCell   *lc;
+	PlannedStmt *plannedstmt = (PlannedStmt *) context;
+
+	foreach(lc, plans)
+	{
+		SubPlan *sp = lfirst_node(SubPlan, lc);
+		Plan *p = list_nth(plannedstmt->subplans, sp->plan_id - 1);
+
+		if (walker(p, context))
+			return true;
+	}
+
+	return false;
+}
+
+/*
+ * Walk the constituent plans of a ModifyTable, Append, MergeAppend,
+ * BitmapAnd, or BitmapOr node.
+ */
+static bool
+plan_walk_members(List *plans, bool (*walker) (), void *context)
+{
+	ListCell *lc;
+
+	foreach(lc, plans)
+	{
+		if (walker(lfirst(lc), context))
+			return true;
+	}
+
+	return false;
+}
diff --git a/src/include/nodes/nodeFuncs.h b/src/include/nodes/nodeFuncs.h
index 93c60bde66..fca107ad65 100644
--- a/src/include/nodes/nodeFuncs.h
+++ b/src/include/nodes/nodeFuncs.h
@@ -158,5 +158,8 @@ extern bool raw_expression_tree_walker(Node *node, bool (*walker) (),
 struct PlanState;
 extern bool planstate_tree_walker(struct PlanState *planstate, bool (*walker) (),
 								  void *context);
+struct Plan;
+extern bool plan_tree_walker(struct Plan *plan, bool (*walker) (),
+				 void *context);
 
 #endif							/* NODEFUNCS_H */
-- 
2.24.1



  [application/octet-stream] v5-0003-Teach-AcquireExecutorLocks-to-skip-locking-pruned.patch (93.2K, 3-v5-0003-Teach-AcquireExecutorLocks-to-skip-locking-pruned.patch)
  download | inline diff:
From 62fd8ca887f62dcd89010bf4475529eb16f07d52 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Wed, 22 Dec 2021 16:55:17 +0900
Subject: [PATCH v5 3/3] Teach AcquireExecutorLocks() to skip locking pruned
 partitions

Instead of locking all relations listed in the range table, this
asks the new executor function ExecutorGetLockRels() to return a set
of relations (their RT indexes) to lock or simply use the set
given by PlannedStmt.lockrels.  To wit, ExecutorGetLockRels() must be
called if some nodes in the plan tree contain initial pruning steps
(pruning steps containing expressions that can be computed before
the executor proper has started), which results in the lockrels
set being computed such that any subplans that are pruned as a result
of doing initial pruning do not contribute any relations to the set.
That can result in a much smaller lockrels set when the plan contains
thousands of child subplans, of which only a small number remain
after pruning.

The result of doing the initial pruning during ExecutorGetLockRels()
is preserved for use later during actual execution by creating a
new node called PlanInitPruningOutput for each plan node that
undergoes pruning and a set of those for the whole plan tree are
put into another new node ExecLockRelsInfo that represents the output
of a given ExecutorGetLockRels() invocation.  ExecLockRelsInfos are
passed down the executor alongside the PlannedStmts.  This
arrangement ensures that the set of plan tree nodes that
AcquireExecutorLocks() has acquired locks to protect and the one
that the executor will initialize and execute are one and the same.
---
 src/backend/commands/copyto.c          |   2 +-
 src/backend/commands/createas.c        |   2 +-
 src/backend/commands/explain.c         |   7 +-
 src/backend/commands/extension.c       |  13 +-
 src/backend/commands/matview.c         |   2 +-
 src/backend/commands/portalcmds.c      |   1 +
 src/backend/commands/prepare.c         |  17 +-
 src/backend/executor/README            |  22 ++-
 src/backend/executor/execMain.c        | 181 +++++++++++++++++++
 src/backend/executor/execParallel.c    |  27 ++-
 src/backend/executor/execPartition.c   | 233 +++++++++++++++++++++----
 src/backend/executor/execUtils.c       |   8 +
 src/backend/executor/functions.c       |   2 +-
 src/backend/executor/nodeAppend.c      |  42 ++++-
 src/backend/executor/nodeMergeAppend.c |  42 ++++-
 src/backend/executor/nodeModifyTable.c |  24 +++
 src/backend/executor/spi.c             |  14 +-
 src/backend/nodes/copyfuncs.c          |  50 +++++-
 src/backend/nodes/outfuncs.c           |  41 +++++
 src/backend/nodes/readfuncs.c          |  38 ++++
 src/backend/optimizer/plan/planner.c   |   3 +
 src/backend/optimizer/plan/setrefs.c   |  10 ++
 src/backend/partitioning/partprune.c   |  37 +++-
 src/backend/tcop/postgres.c            |  15 +-
 src/backend/tcop/pquery.c              |  21 ++-
 src/backend/utils/cache/plancache.c    | 220 +++++++++++++++++++----
 src/backend/utils/mmgr/portalmem.c     |   2 +
 src/include/commands/explain.h         |   3 +-
 src/include/executor/execPartition.h   |   2 +
 src/include/executor/execdesc.h        |   2 +
 src/include/executor/executor.h        |   2 +
 src/include/executor/nodeAppend.h      |   1 +
 src/include/executor/nodeMergeAppend.h |   1 +
 src/include/executor/nodeModifyTable.h |   1 +
 src/include/nodes/execnodes.h          |  87 +++++++++
 src/include/nodes/nodes.h              |   5 +
 src/include/nodes/pathnodes.h          |   7 +
 src/include/nodes/plannodes.h          |  18 ++
 src/include/tcop/tcopprot.h            |   2 +-
 src/include/utils/plancache.h          |   5 +
 src/include/utils/portal.h             |   5 +
 41 files changed, 1108 insertions(+), 109 deletions(-)

diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index 55c38b04c4..d403eb2309 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -542,7 +542,7 @@ BeginCopyTo(ParseState *pstate,
 		((DR_copy *) dest)->cstate = cstate;
 
 		/* Create a QueryDesc requesting no output */
-		cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+		cstate->queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
 											dest, NULL, NULL, 0);
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 9abbb6b555..f6607f2454 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -325,7 +325,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
 		UpdateActiveSnapshotCommandId();
 
 		/* Create a QueryDesc, redirecting output to our tuple receiver */
-		queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+		queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
 									GetActiveSnapshot(), InvalidSnapshot,
 									dest, params, queryEnv, 0);
 
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index de81379da3..a9dc6d1755 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -407,7 +407,7 @@ ExplainOneQuery(Query *query, int cursorOptions,
 		}
 
 		/* run it (if needed) and produce output */
-		ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+		ExplainOnePlan(plan, NULL, into, es, queryString, params, queryEnv,
 					   &planduration, (es->buffers ? &bufusage : NULL));
 	}
 }
@@ -515,7 +515,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
  * to call it.
  */
 void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, ExecLockRelsInfo *execlockrelsinfo,
+			   IntoClause *into, ExplainState *es,
 			   const char *queryString, ParamListInfo params,
 			   QueryEnvironment *queryEnv, const instr_time *planduration,
 			   const BufferUsage *bufusage)
@@ -563,7 +564,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
 		dest = None_Receiver;
 
 	/* Create a QueryDesc for the query */
-	queryDesc = CreateQueryDesc(plannedstmt, queryString,
+	queryDesc = CreateQueryDesc(plannedstmt, execlockrelsinfo, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, params, queryEnv, instrument_option);
 
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index 1013790dbb..008b8ce0e9 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -741,8 +741,10 @@ execute_sql_string(const char *sql)
 		RawStmt    *parsetree = lfirst_node(RawStmt, lc1);
 		MemoryContext per_parsetree_context,
 					oldcontext;
-		List	   *stmt_list;
-		ListCell   *lc2;
+		List	   *stmt_list,
+				   *execlockrelsinfo_list;
+		ListCell   *lc2,
+				   *lc3;
 
 		/*
 		 * We do the work for each parsetree in a short-lived context, to
@@ -762,11 +764,13 @@ execute_sql_string(const char *sql)
 										   NULL,
 										   0,
 										   NULL);
-		stmt_list = pg_plan_queries(stmt_list, sql, CURSOR_OPT_PARALLEL_OK, NULL);
+		stmt_list = pg_plan_queries(stmt_list, sql, CURSOR_OPT_PARALLEL_OK, NULL,
+									&execlockrelsinfo_list);
 
-		foreach(lc2, stmt_list)
+		forboth(lc2, stmt_list, lc3, execlockrelsinfo_list)
 		{
 			PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+			ExecLockRelsInfo *execlockrelsinfo = lfirst_node(ExecLockRelsInfo, lc3);
 
 			CommandCounterIncrement();
 
@@ -777,6 +781,7 @@ execute_sql_string(const char *sql)
 				QueryDesc  *qdesc;
 
 				qdesc = CreateQueryDesc(stmt,
+										execlockrelsinfo,
 										sql,
 										GetActiveSnapshot(), NULL,
 										dest, NULL, NULL, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 05e7b60059..4ef44aaf23 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -416,7 +416,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
 	UpdateActiveSnapshotCommandId();
 
 	/* Create a QueryDesc, redirecting output to our tuple receiver */
-	queryDesc = CreateQueryDesc(plan, queryString,
+	queryDesc = CreateQueryDesc(plan, NULL, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, NULL, NULL, 0);
 
diff --git a/src/backend/commands/portalcmds.c b/src/backend/commands/portalcmds.c
index 9902c5c566..85e73ddded 100644
--- a/src/backend/commands/portalcmds.c
+++ b/src/backend/commands/portalcmds.c
@@ -107,6 +107,7 @@ PerformCursorOpen(ParseState *pstate, DeclareCursorStmt *cstmt, ParamListInfo pa
 					  queryString,
 					  CMDTAG_SELECT,	/* cursor's query is always a SELECT */
 					  list_make1(plan),
+					  list_make1(NULL),	/* no ExecLockRelsInfo to pass */
 					  NULL);
 
 	/*----------
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 80738547ed..bbbf8bbcbd 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -155,6 +155,7 @@ ExecuteQuery(ParseState *pstate,
 	PreparedStatement *entry;
 	CachedPlan *cplan;
 	List	   *plan_list;
+	List	   *plan_execlockrelsinfo_list;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
 	Portal		portal;
@@ -195,6 +196,7 @@ ExecuteQuery(ParseState *pstate,
 	/* Replan if needed, and increment plan refcount for portal */
 	cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL);
 	plan_list = cplan->stmt_list;
+	plan_execlockrelsinfo_list = cplan->execlockrelsinfo_list;
 
 	/*
 	 * DO NOT add any logic that could possibly throw an error between
@@ -204,7 +206,7 @@ ExecuteQuery(ParseState *pstate,
 					  NULL,
 					  query_string,
 					  entry->plansource->commandTag,
-					  plan_list,
+					  plan_list, plan_execlockrelsinfo_list,
 					  cplan);
 
 	/*
@@ -576,7 +578,9 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	const char *query_string;
 	CachedPlan *cplan;
 	List	   *plan_list;
-	ListCell   *p;
+	List	   *plan_execlockrelsinfo_list;
+	ListCell   *p,
+			   *pe;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
 	instr_time	planstart;
@@ -632,15 +636,18 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	}
 
 	plan_list = cplan->stmt_list;
+	plan_execlockrelsinfo_list = cplan->execlockrelsinfo_list;
 
 	/* Explain each query */
-	foreach(p, plan_list)
+	forboth(p, plan_list, pe, plan_execlockrelsinfo_list)
 	{
 		PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
+		ExecLockRelsInfo *execlockrelsinfo = lfirst_node(ExecLockRelsInfo, pe);
 
 		if (pstmt->commandType != CMD_UTILITY)
-			ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv,
-						   &planduration, (es->buffers ? &bufusage : NULL));
+			ExplainOnePlan(pstmt, execlockrelsinfo, into, es, query_string,
+						   paramLI, queryEnv, &planduration,
+						   (es->buffers ? &bufusage : NULL));
 		else
 			ExplainOneUtility(pstmt->utilityStmt, into, es, query_string,
 							  paramLI, queryEnv);
diff --git a/src/backend/executor/README b/src/backend/executor/README
index bf5e70860d..27341a2818 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -59,11 +59,20 @@ state tree.  Read-only plan trees make life much simpler for plan caching and
 reuse.
 
 A corresponding executor state node may not be created during executor startup
-if the executor determines that an entire subplan is not required due to
-execution time partition pruning determining that no matching records will be
-found there.  This currently only occurs for Append and MergeAppend nodes.  In
-this case the non-required subplans are ignored and the executor state's
-subnode array will become out of sequence to the plan's subplan list.
+if ExecutorGetLockRels() determines that an entire subplan is not required
+due to initial partition pruning determining that no matching records will be
+found there, while also skipping the locking of relation(s) that would be
+scanned by the subplan were it not pruned.  This currently only occurs for
+Append and MergeAppend nodes (see ExecGet[Merge]AppendLockRels()).  In this
+case, the non-required subplans are ignored and the executor state's subnode
+array will become out of sequence to the plan's subplan list.
+ExecutorGetLockRels() typically runs before the execution starts, for example,
+as part of checking if a cached generic plan is still valid, though the
+result it produces (ExecLockRelsInfo) is made available to ExecutorStart() via
+the QueryDesc.  ExecInitNode() on the plan nodes whose child subplans may have
+been pruned as part of ExecutorGetLockRels() must look up the surviving set of
+subplans to initialize in the ExecLockRelsInfo, instead of repeating the
+initial pruning computation.
 
 Each Plan node may have expression trees associated with it, to represent
 its target list, qualification conditions, etc.  These trees are also
@@ -247,6 +256,9 @@ Query Processing Control Flow
 
 This is a sketch of control flow for full query processing:
 
+  [ ExecutorGetLockRels ] --- an optional step to walk over the plan tree
+		to produce an ExecLockRelsInfo to be passed to CreateQueryDesc
+
 	CreateQueryDesc
 
 	ExecutorStart
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 549d9eb696..3b1f588321 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -48,11 +48,15 @@
 #include "commands/matview.h"
 #include "commands/trigger.h"
 #include "executor/execdebug.h"
+#include "executor/nodeAppend.h"
+#include "executor/nodeMergeAppend.h"
+#include "executor/nodeModifyTable.h"
 #include "executor/nodeSubplan.h"
 #include "foreign/fdwapi.h"
 #include "jit/jit.h"
 #include "mb/pg_wchar.h"
 #include "miscadmin.h"
+#include "nodes/nodeFuncs.h"
 #include "parser/parsetree.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
@@ -100,9 +104,184 @@ static char *ExecBuildSlotValueDescription(Oid reloid,
 										   Bitmapset *modifiedCols,
 										   int maxfieldlen);
 static void EvalPlanQualStart(EPQState *epqstate, Plan *planTree);
+static bool ExecGetScanLockRels(Scan *scan, ExecGetLockRelsContext *context);
 
 /* end of local decls */
 
+/* ----------------------------------------------------------------
+ *		ExecutorGetLockRels
+ *
+ *		Figure out the set of relations to lock to be able to execute a given
+ *		plan, after taking into account the result of performing any initial
+ *		pruning steps present in the plan.  Performing those pruning steps
+ *		would effectively invalidate the pruned subplans (that is, they will
+ *		not be looked at during the actual execution of the parent plan), so
+ *		the relations that those subplans scan need not be locked.
+ *
+ * Along with the set of RT indexes of relations that must be locked, the
+ * returned struct also contains the information needed to look up the
+ * PlanInitPruningOutput nodes, which hold the result of initial pruning (the
+ * surviving partition subnodes) for each plan node that undergoes pruning.
+ *
+ * The caller must arrange to pass on the returned struct down to the
+ * executor, so that the latter can reuse the result of initial pruning to
+ * initialize the same set of surviving subplans, instead of doing the pruning
+ * again by itself.
+ *
+ * This locks relations whose information is needed to do the pruning.  For
+ * example, a partitioned table is locked before its PartitionedRelPruneInfo,
+ * contained in an Append node, is used for pruning in ExecGetAppendLockRels().
+ */
+ExecLockRelsInfo *
+ExecutorGetLockRels(PlannedStmt *plannedstmt, ParamListInfo params)
+{
+	int		numPlanNodes = plannedstmt->numPlanNodes;
+	ExecGetLockRelsContext context;
+	ExecLockRelsInfo *result;
+	ListCell *lc;
+
+	/* Only get here if there is any pruning to do. */
+	Assert(plannedstmt->containsInitialPruning);
+
+	context.stmt = plannedstmt;
+	context.params = params;
+
+	/* Go do init pruning and fill lockrels. */
+	context.lockrels = NULL;
+	context.initPruningOutputs = NIL;
+	context.ipoIndexes = palloc0(sizeof(int) * numPlanNodes);
+	foreach(lc, plannedstmt->subplans)
+	{
+		Plan *subplan = lfirst(lc);
+
+		(void) ExecGetLockRels(subplan, &context);
+	}
+
+	(void) ExecGetLockRels(plannedstmt->planTree, &context);
+
+	result = makeNode(ExecLockRelsInfo);
+	result->lockrels = context.lockrels;
+	result->numPlanNodes = numPlanNodes;
+	result->initPruningOutputs = context.initPruningOutputs;
+	result->ipoIndexes = context.ipoIndexes;
+
+	return result;
+}
+
+/* ------------------------------------------------------------------------
+ * ExecGetLockRels
+ *		Recursively find relations to lock in the plan tree rooted at 'node',
+ *		performing initial pruning if the node contains the information to
+ *		do so
+ *
+ *		'node' is the current node of the plan produced by the query planner
+ *		'context' contains the PlannedStmt and the information about EXTERN
+ *			parameters to use for partition pruning and also where to add the
+ *			result -- lockrels and PlanInitPruningOutput nodes
+ *
+ * NOTE: ExecGetLockRels subroutine for a given node must add the RT indexes of
+ * any relations that it manipulates to context->lockrels.  If the node needs
+ * initial pruning, it must add the resulting PlanInitPruningOutput node to
+ * context using the ExecStorePlanInitPruningOutput() macro.
+ * ------------------------------------------------------------------------
+ */
+bool
+ExecGetLockRels(Plan *node, ExecGetLockRelsContext *context)
+{
+	/* Do nothing when we get to the end of a leaf on tree. */
+	if (node == NULL)
+		return true;
+
+	/* Make sure there's enough stack available. */
+	check_stack_depth();
+
+	switch (nodeTag(node))
+	{
+		case T_Append:
+			if (ExecGetAppendLockRels((Append *) node, context))
+				return true;
+			break;
+		case T_MergeAppend:
+			if (ExecGetMergeAppendLockRels((MergeAppend *) node, context))
+				return true;
+			break;
+
+		case T_SeqScan:
+		case T_SampleScan:
+		case T_IndexScan:
+		case T_IndexOnlyScan:
+		case T_BitmapIndexScan:
+		case T_BitmapHeapScan:
+		case T_TidScan:
+		case T_TidRangeScan:
+		case T_ForeignScan:
+		case T_SubqueryScan:
+		case T_CustomScan:
+			if (ExecGetScanLockRels((Scan *) node, context))
+				return true;
+			break;
+
+		case T_ModifyTable:
+			if (ExecGetModifyTableLockRels((ModifyTable *) node, context))
+				return true;
+			/* plan_tree_walker() will visit the subplan (outerNode) */
+			break;
+
+		default:
+			break;
+	}
+
+	return plan_tree_walker(node, ExecGetLockRels, (void *) context);
+}
+
+/*
+ * ExecGetScanLockRels
+ * 		Do ExecGetLockRels()'s work for a Scan plan
+ */
+static bool
+ExecGetScanLockRels(Scan *scan, ExecGetLockRelsContext *context)
+{
+	switch (nodeTag(scan))
+	{
+		case T_ForeignScan:
+			{
+				ForeignScan *fscan = (ForeignScan *) scan;
+
+				context->lockrels = bms_add_members(context->lockrels,
+													fscan->fs_relids);
+			}
+			break;
+
+		case T_SubqueryScan:
+			{
+				SubqueryScan *sscan = (SubqueryScan *) scan;
+
+				(void) ExecGetLockRels((Plan *) sscan->subplan, context);
+			}
+			break;
+
+		case T_CustomScan:
+			{
+				CustomScan *cscan = (CustomScan *) scan;
+				ListCell *lc;
+
+				context->lockrels = bms_add_members(context->lockrels,
+													cscan->custom_relids);
+				foreach(lc, cscan->custom_plans)
+				{
+					(void) ExecGetLockRels((Plan *) lfirst(lc), context);
+				}
+			}
+			break;
+
+		default:
+			context->lockrels = bms_add_member(context->lockrels,
+											   scan->scanrelid);
+			break;
+	}
+
+	return true;
+}
 
 /* ----------------------------------------------------------------
  *		ExecutorStart
@@ -804,6 +983,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 {
 	CmdType		operation = queryDesc->operation;
 	PlannedStmt *plannedstmt = queryDesc->plannedstmt;
+	ExecLockRelsInfo *execlockrelsinfo = queryDesc->execlockrelsinfo;
 	Plan	   *plan = plannedstmt->planTree;
 	List	   *rangeTable = plannedstmt->rtable;
 	EState	   *estate = queryDesc->estate;
@@ -823,6 +1003,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 	ExecInitRangeTable(estate, rangeTable);
 
 	estate->es_plannedstmt = plannedstmt;
+	estate->es_execlockrelsinfo = execlockrelsinfo;
 
 	/*
 	 * Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index 5dd8ab7db2..f27f85ab4f 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -66,6 +66,7 @@
 #define PARALLEL_KEY_QUERY_TEXT		UINT64CONST(0xE000000000000008)
 #define PARALLEL_KEY_JIT_INSTRUMENTATION UINT64CONST(0xE000000000000009)
 #define PARALLEL_KEY_WAL_USAGE			UINT64CONST(0xE00000000000000A)
+#define PARALLEL_KEY_EXECLOCKRELSINFO	UINT64CONST(0xE00000000000000B)
 
 #define PARALLEL_TUPLE_QUEUE_SIZE		65536
 
@@ -182,8 +183,10 @@ ExecSerializePlan(Plan *plan, EState *estate)
 	pstmt->transientPlan = false;
 	pstmt->dependsOnRole = false;
 	pstmt->parallelModeNeeded = false;
+	pstmt->containsInitialPruning = false;
 	pstmt->planTree = plan;
 	pstmt->rtable = estate->es_range_table;
+	pstmt->lockrels = NULL;
 	pstmt->resultRelations = NIL;
 	pstmt->appendRelations = NIL;
 
@@ -596,12 +599,15 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	FixedParallelExecutorState *fpes;
 	char	   *pstmt_data;
 	char	   *pstmt_space;
+	char	   *execlockrelsinfo_data;
+	char	   *execlockrelsinfo_space;
 	char	   *paramlistinfo_space;
 	BufferUsage *bufusage_space;
 	WalUsage   *walusage_space;
 	SharedExecutorInstrumentation *instrumentation = NULL;
 	SharedJitInstrumentation *jit_instrumentation = NULL;
 	int			pstmt_len;
+	int			execlockrelsinfo_len;
 	int			paramlistinfo_len;
 	int			instrumentation_len = 0;
 	int			jit_instrumentation_len = 0;
@@ -630,6 +636,7 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 
 	/* Fix up and serialize plan to be sent to workers. */
 	pstmt_data = ExecSerializePlan(planstate->plan, estate);
+	execlockrelsinfo_data = nodeToString(estate->es_execlockrelsinfo);
 
 	/* Create a parallel context. */
 	pcxt = CreateParallelContext("postgres", "ParallelQueryMain", nworkers);
@@ -656,6 +663,11 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	shm_toc_estimate_chunk(&pcxt->estimator, pstmt_len);
 	shm_toc_estimate_keys(&pcxt->estimator, 1);
 
+	/* Estimate space for serialized ExecLockRelsInfo. */
+	execlockrelsinfo_len = strlen(execlockrelsinfo_data) + 1;
+	shm_toc_estimate_chunk(&pcxt->estimator, execlockrelsinfo_len);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
 	/* Estimate space for serialized ParamListInfo. */
 	paramlistinfo_len = EstimateParamListSpace(estate->es_param_list_info);
 	shm_toc_estimate_chunk(&pcxt->estimator, paramlistinfo_len);
@@ -750,6 +762,12 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	memcpy(pstmt_space, pstmt_data, pstmt_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PLANNEDSTMT, pstmt_space);
 
+	/* Store serialized ExecLockRelsInfo */
+	execlockrelsinfo_space = shm_toc_allocate(pcxt->toc, execlockrelsinfo_len);
+	memcpy(execlockrelsinfo_space, execlockrelsinfo_data, execlockrelsinfo_len);
+	shm_toc_insert(pcxt->toc, PARALLEL_KEY_EXECLOCKRELSINFO,
+				   execlockrelsinfo_space);
+
 	/* Store serialized ParamListInfo. */
 	paramlistinfo_space = shm_toc_allocate(pcxt->toc, paramlistinfo_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARAMLISTINFO, paramlistinfo_space);
@@ -1231,8 +1249,10 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 						 int instrument_options)
 {
 	char	   *pstmtspace;
+	char	   *execlockrelsinfospace;
 	char	   *paramspace;
 	PlannedStmt *pstmt;
+	ExecLockRelsInfo *execlockrelsinfo;
 	ParamListInfo paramLI;
 	char	   *queryString;
 
@@ -1243,12 +1263,17 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 	pstmtspace = shm_toc_lookup(toc, PARALLEL_KEY_PLANNEDSTMT, false);
 	pstmt = (PlannedStmt *) stringToNode(pstmtspace);
 
+	/* Reconstruct leader-supplied ExecLockRelsInfo. */
+	execlockrelsinfospace = shm_toc_lookup(toc, PARALLEL_KEY_EXECLOCKRELSINFO,
+										  false);
+	execlockrelsinfo = (ExecLockRelsInfo *) stringToNode(execlockrelsinfospace);
+
 	/* Reconstruct ParamListInfo. */
 	paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
 	paramLI = RestoreParamList(&paramspace);
 
 	/* Create a QueryDesc for the query. */
-	return CreateQueryDesc(pstmt,
+	return CreateQueryDesc(pstmt, execlockrelsinfo,
 						   queryString,
 						   GetActiveSnapshot(), InvalidSnapshot,
 						   receiver, paramLI, NULL, instrument_options);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 21953f253b..db8c4cd719 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -24,6 +24,7 @@
 #include "mb/pg_wchar.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
+#include "parser/parsetree.h"
 #include "partitioning/partbounds.h"
 #include "partitioning/partdesc.h"
 #include "partitioning/partprune.h"
@@ -183,8 +184,14 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
 												  int maxfieldlen);
 static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
 static PartitionPruneState *ExecCreatePartitionPruneState(PlanState *planstate,
-							  PartitionPruneInfo *partitionpruneinfo);
-static Bitmapset *ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate);
+							  PartitionPruneInfo *partitionpruneinfo,
+							  bool consider_initial_steps,
+							  bool consider_exec_steps,
+							  List *rtable, ExprContext *econtext,
+							  PartitionDirectory partdir,
+							  Bitmapset **parentrelids);
+static Bitmapset *ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate,
+								PartitionPruneInfo *pruneinfo);
 static void ExecInitPruningContext(PartitionPruneContext *context,
 								   List *pruning_steps,
 								   PartitionDesc partdesc,
@@ -1483,8 +1490,9 @@ adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri)
  * considered to be a stable expression, it can change value from one plan
  * node scan to the next during query execution.  Stable comparison
  * expressions that don't involve such Params allow partition pruning to be
- * done once during executor startup.  Expressions that do involve such Params
- * require us to prune separately for each scan of the parent plan node.
+ * done once during executor startup, or even earlier, during
+ * ExecutorGetLockRels().  Expressions that do involve such Params require us
+ * to prune separately for each scan of the parent plan node.
  *
  * Note that pruning away unneeded subplans during executor startup has the
  * added benefit of not having to initialize the unneeded subplans at all.
@@ -1503,6 +1511,10 @@ adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri)
  *		updated to account for initial pruning having eliminated some of the
  *		subplans, if any.
  *
+ * ExecGetLockRelsDoInitialPruning:
+ * 		Do initial pruning as part of ExecGetLockRels() on the parent plan
+ * 		node
+ *
  * ExecFindMatchingSubPlans:
  *		Returns indexes of matching subplans after evaluating all available
  *		expressions, that is, using execution pruning steps.  This function can
@@ -1531,22 +1543,57 @@ ExecInitPartitionPruning(PlanState *planstate,
 {
 	PartitionPruneState *prunestate;
 	EState *estate = planstate->state;
+	Plan   *plan = planstate->plan;
+	PlanInitPruningOutput *initPruningOutput = NULL;
+	bool	do_pruning = (pruneinfo->needs_init_pruning ||
+						  pruneinfo->needs_exec_pruning);
 
-	/* We may need an expression context to evaluate partition exprs */
-	ExecAssignExprContext(estate, planstate);
+	if (estate->es_execlockrelsinfo)
+	{
+		initPruningOutput = (PlanInitPruningOutput *)
+			ExecFetchPlanInitPruningOutput(estate->es_execlockrelsinfo, plan);
 
-	/*
-	 * Create the working data structure for pruning.
-	 */
-	prunestate = ExecCreatePartitionPruneState(planstate, pruneinfo);
+		Assert(initPruningOutput != NULL &&
+			   IsA(initPruningOutput, PlanInitPruningOutput));
+		/* No need to do initial pruning again, only exec pruning. */
+		do_pruning = pruneinfo->needs_exec_pruning;
+	}
+
+	prunestate = NULL;
+	if (do_pruning)
+	{
+		/* We may need an expression context to evaluate partition exprs */
+		ExecAssignExprContext(estate, planstate);
+
+		/* For data reading, executor always omits detached partitions */
+		if (estate->es_partition_directory == NULL)
+			estate->es_partition_directory =
+				CreatePartitionDirectory(estate->es_query_cxt, false);
+
+		/*
+		 * Create the working data structure for pruning.  No need to consider
+		 * initial pruning steps if we have a PlanInitPruningOutput.
+		 */
+		prunestate = ExecCreatePartitionPruneState(planstate, pruneinfo,
+												   initPruningOutput == NULL, true,
+												   NIL, planstate->ps_ExprContext,
+												   estate->es_partition_directory,
+												   NULL);
+	}
 
 	/*
 	 * Perform an initial partition prune, if required.
 	 */
-	if (prunestate->do_initial_prune)
+	if (initPruningOutput)
+	{
+		/* ExecutorGetLockRels() already did it for us! */
+		*initially_valid_subplans = initPruningOutput->initially_valid_subplans;
+	}
+	else if (prunestate && prunestate->do_initial_prune)
 	{
 		/* Determine which subplans survive initial pruning */
-		*initially_valid_subplans = ExecFindInitialMatchingSubPlans(prunestate);
+		*initially_valid_subplans = ExecFindInitialMatchingSubPlans(prunestate,
+																	pruneinfo);
 	}
 	else
 	{
@@ -1564,7 +1611,7 @@ ExecInitPartitionPruning(PlanState *planstate,
 	 * invalid data in prunestate, because that data won't be consulted again
 	 * (cf initial Assert in ExecFindMatchingSubPlans).
 	 */
-	if (prunestate->do_exec_prune &&
+	if (prunestate && prunestate->do_exec_prune &&
 		bms_num_members(*initially_valid_subplans) < n_total_subplans)
 		ExecPartitionPruneFixSubPlanIndexes(prunestate,
 											*initially_valid_subplans,
@@ -1573,12 +1620,83 @@ ExecInitPartitionPruning(PlanState *planstate,
 	return prunestate;
 }
 
+/*
+ * ExecGetLockRelsDoInitialPruning
+ *		Perform initial pruning as part of doing ExecGetLockRels() on the parent
+ *		plan node
+ */
+Bitmapset *
+ExecGetLockRelsDoInitialPruning(Plan *plan, ExecGetLockRelsContext *context,
+								PartitionPruneInfo *pruneinfo)
+{
+	List		 *rtable = context->stmt->rtable;
+	ParamListInfo params = context->params;
+	ExprContext	 *econtext;
+	PartitionDirectory pdir;
+	MemoryContext oldcontext,
+				  tmpcontext;
+	Bitmapset	 *parentrelids;
+	PartitionPruneState *prunestate;
+	PlanInitPruningOutput *initPruningOutput;
+
+	/*
+	 * A temporary context to allocate stuff needed to run the pruning steps.
+	 */
+	tmpcontext = AllocSetContextCreate(CurrentMemoryContext,
+									   "initial pruning working data",
+									   ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(tmpcontext);
+
+	/*
+	 * PartitionDirectory to look up partition descriptors, which omits
+	 * detached partitions, just like in the executor proper.
+	 */
+	pdir = CreatePartitionDirectory(CurrentMemoryContext, false);
+
+	/*
+	 * We don't yet have a PlanState for the parent plan node, so must create
+	 * a standalone ExprContext to evaluate pruning expressions, equipped with
+	 * the information about the EXTERN parameters that the caller passed us.
+	 * Note that that's okay because the initial pruning steps do not contain
+	 * anything that requires the execution to have started.
+	 */
+	econtext = CreateStandaloneExprContext();
+	econtext->ecxt_param_list_info = params;
+	prunestate = ExecCreatePartitionPruneState(NULL, pruneinfo,
+											   true, false,
+											   rtable, econtext,
+											   pdir, &parentrelids);
+	MemoryContextSwitchTo(oldcontext);
+
+	/* Do the pruning and populate a PlanInitPruningOutput for this node. */
+	initPruningOutput = makeNode(PlanInitPruningOutput);
+	initPruningOutput->initially_valid_subplans =
+		ExecFindInitialMatchingSubPlans(prunestate, pruneinfo);
+	ExecStorePlanInitPruningOutput(context, initPruningOutput, plan);
+
+	/*
+	 * Report parent partitioned tables as locking targets, though they
+	 * would already be locked by ExecCreatePartitionPruneState().
+	 */
+	Assert(bms_num_members(parentrelids) > 0);
+	context->lockrels = bms_add_members(context->lockrels, parentrelids);
+
+	FreeExprContext(econtext, true);
+	DestroyPartitionDirectory(pdir);
+	MemoryContextDelete(tmpcontext);
+
+	return initPruningOutput->initially_valid_subplans;
+}
+
 /*
  * ExecCreatePartitionPruneState
  *		Build the data structure required for calling
  *		ExecFindInitialMatchingSubPlans and ExecFindMatchingSubPlans.
  *
- * 'planstate' is the parent plan node's execution state.
+ * 'planstate', if not NULL, is the parent plan node's execution state.  It
+ * can be NULL if being called before ExecutorStart(), in which case,
+ * 'rtable' (range table), 'econtext', and 'partdir' must be explicitly
+ * provided.
  *
  * 'partitionpruneinfo' is a PartitionPruneInfo as generated by
  * make_partition_pruneinfo.  Here we build a PartitionPruneState containing a
@@ -1590,26 +1708,35 @@ ExecInitPartitionPruning(PlanState *planstate,
  * as children.  The data stored in each PartitionedRelPruningData can be
  * re-used each time we re-evaluate which partitions match the pruning steps
  * provided in each PartitionedRelPruneInfo.
+ *
+ * The RT indexes of parent partitioned tables that are locked here to peruse
+ * their PartitionedRelPruneInfo are returned in *parentrelids if asked
+ * for by the caller.
  */
 static PartitionPruneState *
 ExecCreatePartitionPruneState(PlanState *planstate,
-							  PartitionPruneInfo *partitionpruneinfo)
+							  PartitionPruneInfo *partitionpruneinfo,
+							  bool consider_initial_steps,
+							  bool consider_exec_steps,
+							  List *rtable, ExprContext *econtext,
+							  PartitionDirectory partdir,
+							  Bitmapset **parentrelids)
 {
-	EState	   *estate = planstate->state;
+	EState	   *estate = planstate ? planstate->state : NULL;
 	PartitionPruneState *prunestate;
 	int			n_part_hierarchies;
 	ListCell   *lc;
 	int			i;
-	ExprContext	*econtext = planstate->ps_ExprContext;
 
-	/* For data reading, executor always omits detached partitions */
-	if (estate->es_partition_directory == NULL)
-		estate->es_partition_directory =
-			CreatePartitionDirectory(estate->es_query_cxt, false);
+	Assert((estate != NULL) ||
+			(partdir != NULL && econtext != NULL && rtable != NIL));
 
 	n_part_hierarchies = list_length(partitionpruneinfo->prune_infos);
 	Assert(n_part_hierarchies > 0);
 
+	if (parentrelids)
+		*parentrelids = NULL;
+
 	/*
 	 * Allocate the data structure
 	 */
@@ -1656,19 +1783,58 @@ ExecCreatePartitionPruneState(PlanState *planstate,
 			PartitionedRelPruneInfo *pinfo = lfirst_node(PartitionedRelPruneInfo, lc2);
 			PartitionedRelPruningData *pprune = &prunedata->partrelprunedata[j];
 			Relation	partrel;
+			bool		close_partrel = false;
 			PartitionDesc partdesc;
 			PartitionKey partkey;
 
 			/*
-			 * We can rely on the copies of the partitioned table's partition
-			 * key and partition descriptor appearing in its relcache entry,
-			 * because that entry will be held open and locked for the
-			 * duration of this executor run.
+			 * Must open the relation by ourselves when called before the
+			 * execution has started, such as, when called during
+			 * ExecutorGetLockRels() on a cached plan.  In that case,
+			 * sub-partitions must be locked, because AcquirePlannerLocks()
+			 * would not have seen them. (1st relation in a partrelpruneinfos
+			 * list is always the root partitioned table appearing in the
+			 * query, which AcquirePlannerLocks() would have locked; the
+			 * Assert in relation_open() guards that assumption.)
+			 */
+			if (estate == NULL)
+			{
+				RangeTblEntry *rte = rt_fetch(pinfo->rtindex, rtable);
+				int		lockmode = (j == 0) ? NoLock : rte->rellockmode;
+
+				partrel = table_open(rte->relid, lockmode);
+				close_partrel = true;
+
+				/*
+				 * Also report the partitioned table as having been locked.
+				 * XXX - actually, *parentrelids set is later merged by the
+				 * caller into the set of relations "to-be locked" by
+				 * AcquireExecutorLocks(), thus causing the lock on this
+				 * table to be requested again.
+				 */
+				Assert(parentrelids != NULL);
+				*parentrelids = bms_add_member(*parentrelids, pinfo->rtindex);
+			}
+			else
+				partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
+
+			/*
+			 * We can rely on the copy of the partitioned table's partition
+			 * key in its relcache entry, because it can't change (or
+			 * get destroyed) as long as the relation is locked.  Partition
+			 * descriptor is taken from the PartitionDirectory associated with
+			 * the table that is held open long enough for the descriptor to
+			 * remain valid while it's used to perform the pruning steps.
 			 */
-			partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
 			partkey = RelationGetPartitionKey(partrel);
-			partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
-												partrel);
+			partdesc = PartitionDirectoryLookup(partdir, partrel);
+
+			/*
+			 * Must close partrel, keeping the lock taken, if we're not using
+			 * EState's entry.
+			 */
+			if (close_partrel)
+				table_close(partrel, NoLock);
 
 			/*
 			 * Initialize the subplan_map and subpart_map.
@@ -1770,7 +1936,7 @@ ExecCreatePartitionPruneState(PlanState *planstate,
 			 * Initialize pruning contexts as needed.
 			 */
 			pprune->initial_pruning_steps = pinfo->initial_pruning_steps;
-			if (pinfo->initial_pruning_steps)
+			if (consider_initial_steps && pinfo->initial_pruning_steps)
 			{
 				ExecInitPruningContext(&pprune->initial_context,
 									   pinfo->initial_pruning_steps,
@@ -1780,7 +1946,7 @@ ExecCreatePartitionPruneState(PlanState *planstate,
 				prunestate->do_initial_prune = true;
 			}
 			pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
-			if (pinfo->exec_pruning_steps)
+			if (consider_exec_steps && pinfo->exec_pruning_steps)
 			{
 				ExecInitPruningContext(&pprune->exec_context,
 									   pinfo->exec_pruning_steps,
@@ -1899,7 +2065,8 @@ ExecInitPruningContext(PartitionPruneContext *context,
  * is required.
  */
 static Bitmapset *
-ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate)
+ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate,
+								PartitionPruneInfo *pruneinfo)
 {
 	Bitmapset  *result = NULL;
 	MemoryContext oldcontext;
@@ -1909,8 +2076,8 @@ ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate)
 	Assert(prunestate->do_initial_prune);
 
 	/*
-	 * Switch to a temp context to avoid leaking memory in the executor's
-	 * query-lifespan memory context.
+	 * Switch to a temp context to avoid leaking memory in the longer-term
+	 * memory context.
 	 */
 	oldcontext = MemoryContextSwitchTo(prunestate->prune_context);
 
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 9df1f81ea8..7246f9175f 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -119,6 +119,7 @@ CreateExecutorState(void)
 	estate->es_relations = NULL;
 	estate->es_rowmarks = NULL;
 	estate->es_plannedstmt = NULL;
+	estate->es_execlockrelsinfo = NULL;
 
 	estate->es_junkFilter = NULL;
 
@@ -785,6 +786,13 @@ ExecGetRangeTableRelation(EState *estate, Index rti)
 
 	Assert(rti > 0 && rti <= estate->es_range_table_size);
 
+	/*
+	 * A cross-check that AcquireExecutorLocks() hasn't missed any relations
+	 * that it should have locked.
+	 */
+	Assert(estate->es_execlockrelsinfo == NULL ||
+		   bms_is_member(rti, estate->es_execlockrelsinfo->lockrels));
+
 	rel = estate->es_relations[rti - 1];
 	if (rel == NULL)
 	{
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index f9460ae506..a2182a6b1f 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -844,7 +844,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
 	else
 		dest = None_Receiver;
 
-	es->qd = CreateQueryDesc(es->stmt,
+	es->qd = CreateQueryDesc(es->stmt, NULL,
 							 fcache->src,
 							 GetActiveSnapshot(),
 							 InvalidSnapshot,
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 5b6d3eb23b..966615f670 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -94,6 +94,45 @@ static bool ExecAppendAsyncRequest(AppendState *node, TupleTableSlot **result);
 static void ExecAppendAsyncEventWait(AppendState *node);
 static void classify_matching_subplans(AppendState *node);
 
+/* ----------------------------------------------------------------
+ *		ExecGetAppendLockRels
+ *			Do ExecGetLockRels()'s work for an Append plan
+ * ----------------------------------------------------------------
+ */
+bool
+ExecGetAppendLockRels(Append *node, ExecGetLockRelsContext *context)
+{
+	PartitionPruneInfo *pruneinfo = node->part_prune_info;
+
+	if (pruneinfo && pruneinfo->needs_init_pruning)
+	{
+		List	   *subplans = node->appendplans;
+		Bitmapset  *validsubplans;
+		int			i;
+
+		validsubplans = ExecGetLockRelsDoInitialPruning((Plan *) node,
+														context, pruneinfo);
+
+		/* Collect relations to lock from the surviving subplans. */
+		i = -1;
+		while ((i = bms_next_member(validsubplans, i)) >= 0)
+		{
+			Plan   *subplan = list_nth(subplans, i);
+
+			(void) ExecGetLockRels(subplan, context);
+		}
+
+		/* done with this node */
+		return true;
+	}
+
+	/*
+	 * Look at all subplans, which the caller would do by calling
+	 * plan_tree_walker() on the node.
+	 */
+	return false;
+}
+
 /* ----------------------------------------------------------------
  *		ExecInitAppend
  *
@@ -155,7 +194,8 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 		 * subplan, we can fill as_valid_subplans immediately, preventing
 		 * later calls to ExecFindMatchingSubPlans.
 		 */
-		if (!prunestate->do_exec_prune && nplans > 0)
+		if (appendstate->as_prune_state == NULL ||
+			(!appendstate->as_prune_state->do_exec_prune && nplans > 0))
 			appendstate->as_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
 	}
 	else
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index 9a9f29e845..869b836a14 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -54,6 +54,45 @@ typedef int32 SlotNumber;
 static TupleTableSlot *ExecMergeAppend(PlanState *pstate);
 static int	heap_compare_slots(Datum a, Datum b, void *arg);
 
+/* ----------------------------------------------------------------
+ *		ExecGetMergeAppendLockRels
+ *			Do ExecGetLockRels()'s work for a MergeAppend plan
+ * ----------------------------------------------------------------
+ */
+bool
+ExecGetMergeAppendLockRels(MergeAppend *node, ExecGetLockRelsContext *context)
+{
+	PartitionPruneInfo *pruneinfo = node->part_prune_info;
+
+	if (pruneinfo && pruneinfo->needs_init_pruning)
+	{
+		List	   *subplans = node->mergeplans;
+		Bitmapset  *validsubplans;
+		int			i;
+
+		validsubplans = ExecGetLockRelsDoInitialPruning((Plan *) node,
+														context, pruneinfo);
+
+		/* Collect relations to lock from the surviving subplans. */
+		i = -1;
+		while ((i = bms_next_member(validsubplans, i)) >= 0)
+		{
+			Plan   *subplan = list_nth(subplans, i);
+
+			(void) ExecGetLockRels(subplan, context);
+		}
+
+		/* done with this node */
+		return true;
+	}
+
+	/*
+	 * Look at all subplans, which the caller would do by calling
+	 * plan_tree_walker() on the node.
+	 */
+	return false;
+}
+
 
 /* ----------------------------------------------------------------
  *		ExecInitMergeAppend
@@ -103,7 +142,8 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
 		 * subplan, we can fill as_valid_subplans immediately, preventing
 		 * later calls to ExecFindMatchingSubPlans.
 		 */
-		if (!prunestate->do_exec_prune && nplans > 0)
+		if (mergestate->ms_prune_state == NULL ||
+			(!mergestate->ms_prune_state->do_exec_prune && nplans > 0))
 			mergestate->ms_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
 	}
 	else
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 5ec699a9bd..c860045fcb 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -2700,6 +2700,30 @@ ExecLookupResultRelByOid(ModifyTableState *node, Oid resultoid,
 	return NULL;
 }
 
+/*
+ * ExecGetModifyTableLockRels
+ * 		Do ExecGetLockRels()'s work for a ModifyTable plan
+ */
+bool
+ExecGetModifyTableLockRels(ModifyTable *plan, ExecGetLockRelsContext *context)
+{
+	ListCell *lc;
+
+	if (plan->rootRelation > 0)
+		context->lockrels = bms_add_member(context->lockrels,
+										   plan->rootRelation);
+	context->lockrels = bms_add_member(context->lockrels,
+									   plan->nominalRelation);
+	foreach(lc, plan->resultRelations)
+	{
+		context->lockrels = bms_add_member(context->lockrels,
+										   lfirst_int(lc));
+	}
+
+	/* caller will look at the source subplan */
+	return false;
+}
+
 /* ----------------------------------------------------------------
  *		ExecInitModifyTable
  * ----------------------------------------------------------------
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index a82e986667..2107009591 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1578,6 +1578,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 	CachedPlanSource *plansource;
 	CachedPlan *cplan;
 	List	   *stmt_list;
+	List	   *execlockrelsinfo_list;
 	char	   *query_string;
 	Snapshot	snapshot;
 	MemoryContext oldcontext;
@@ -1659,6 +1660,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 	/* Replan if needed, and increment plan refcount for portal */
 	cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv);
 	stmt_list = cplan->stmt_list;
+	execlockrelsinfo_list = cplan->execlockrelsinfo_list;
 
 	if (!plan->saved)
 	{
@@ -1670,6 +1672,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 		 */
 		oldcontext = MemoryContextSwitchTo(portal->portalContext);
 		stmt_list = copyObject(stmt_list);
+		execlockrelsinfo_list = copyObject(execlockrelsinfo_list);
 		MemoryContextSwitchTo(oldcontext);
 		ReleaseCachedPlan(cplan, NULL);
 		cplan = NULL;			/* portal shouldn't depend on cplan */
@@ -1683,6 +1686,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 					  query_string,
 					  plansource->commandTag,
 					  stmt_list,
+					  execlockrelsinfo_list,
 					  cplan);
 
 	/*
@@ -2473,7 +2477,9 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 	{
 		CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
 		List	   *stmt_list;
-		ListCell   *lc2;
+		List	   *execlockrelsinfo_list;
+		ListCell   *lc2,
+				   *lc3;
 
 		spicallbackarg.query = plansource->query_string;
 
@@ -2552,6 +2558,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 							  plan_owner, _SPI_current->queryEnv);
 
 		stmt_list = cplan->stmt_list;
+		execlockrelsinfo_list = cplan->execlockrelsinfo_list;
 
 		/*
 		 * If we weren't given a specific snapshot to use, and the statement
@@ -2589,9 +2596,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 			}
 		}
 
-		foreach(lc2, stmt_list)
+		forboth(lc2, stmt_list, lc3, execlockrelsinfo_list)
 		{
 			PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+			ExecLockRelsInfo *execlockrelsinfo = lfirst_node(ExecLockRelsInfo, lc3);
 			bool		canSetTag = stmt->canSetTag;
 			DestReceiver *dest;
 
@@ -2663,7 +2671,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 				else
 					snap = InvalidSnapshot;
 
-				qdesc = CreateQueryDesc(stmt,
+				qdesc = CreateQueryDesc(stmt, execlockrelsinfo,
 										plansource->query_string,
 										snap, crosscheck_snapshot,
 										dest,
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index d4f8455a2b..68c664070c 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -68,6 +68,13 @@
 		} \
 	} while (0)
 
+/* Copy a field that is an array with numElem ints */
+#define COPY_INT_ARRAY(fldname, numElem) \
+	do { \
+		newnode->fldname = (numElem) > 0 ? palloc((numElem) * sizeof(int)) : NULL; \
+		memcpy(newnode->fldname, from->fldname, sizeof(int) * (numElem)); \
+	} while (0)
+
 /* Copy a parse location field (for Copy, this is same as scalar case) */
 #define COPY_LOCATION_FIELD(fldname) \
 	(newnode->fldname = from->fldname)
@@ -94,9 +101,12 @@ _copyPlannedStmt(const PlannedStmt *from)
 	COPY_SCALAR_FIELD(transientPlan);
 	COPY_SCALAR_FIELD(dependsOnRole);
 	COPY_SCALAR_FIELD(parallelModeNeeded);
+	COPY_SCALAR_FIELD(containsInitialPruning);
 	COPY_SCALAR_FIELD(jitFlags);
 	COPY_NODE_FIELD(planTree);
+	COPY_SCALAR_FIELD(numPlanNodes);
 	COPY_NODE_FIELD(rtable);
+	COPY_BITMAPSET_FIELD(lockrels);
 	COPY_NODE_FIELD(resultRelations);
 	COPY_NODE_FIELD(appendRelations);
 	COPY_NODE_FIELD(subplans);
@@ -1278,6 +1288,8 @@ _copyPartitionPruneInfo(const PartitionPruneInfo *from)
 	PartitionPruneInfo *newnode = makeNode(PartitionPruneInfo);
 
 	COPY_NODE_FIELD(prune_infos);
+	COPY_SCALAR_FIELD(needs_init_pruning);
+	COPY_SCALAR_FIELD(needs_exec_pruning);
 	COPY_BITMAPSET_FIELD(other_subplans);
 
 	return newnode;
@@ -4941,6 +4953,33 @@ _copyExtensibleNode(const ExtensibleNode *from)
 	return newnode;
 }
 
+/* ****************************************************************
+ *					execnodes.h copy functions
+ * ****************************************************************
+ */
+static ExecLockRelsInfo *
+_copyExecLockRelsInfo(const ExecLockRelsInfo *from)
+{
+	ExecLockRelsInfo *newnode = makeNode(ExecLockRelsInfo);
+
+	COPY_BITMAPSET_FIELD(lockrels);
+	COPY_SCALAR_FIELD(numPlanNodes);
+	COPY_NODE_FIELD(initPruningOutputs);
+	COPY_INT_ARRAY(ipoIndexes, from->numPlanNodes);
+
+	return newnode;
+}
+
+static PlanInitPruningOutput *
+_copyPlanInitPruningOutput(const PlanInitPruningOutput *from)
+{
+	PlanInitPruningOutput *newnode = makeNode(PlanInitPruningOutput);
+
+	COPY_BITMAPSET_FIELD(initially_valid_subplans);
+
+	return newnode;
+}
+
 /* ****************************************************************
  *					value.h copy functions
  * ****************************************************************
@@ -4995,7 +5034,6 @@ _copyBitString(const BitString *from)
 	return newnode;
 }
 
-
 static ForeignKeyCacheInfo *
 _copyForeignKeyCacheInfo(const ForeignKeyCacheInfo *from)
 {
@@ -5944,6 +5982,16 @@ copyObjectImpl(const void *from)
 			retval = _copyPublicationTable(from);
 			break;
 
+			/*
+			 * EXECUTION NODES
+			 */
+		case T_ExecLockRelsInfo:
+			retval = _copyExecLockRelsInfo(from);
+			break;
+		case T_PlanInitPruningOutput:
+			retval = _copyPlanInitPruningOutput(from);
+			break;
+
 			/*
 			 * MISCELLANEOUS NODES
 			 */
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 6bdad462c7..e0e09d7abd 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -312,9 +312,12 @@ _outPlannedStmt(StringInfo str, const PlannedStmt *node)
 	WRITE_BOOL_FIELD(transientPlan);
 	WRITE_BOOL_FIELD(dependsOnRole);
 	WRITE_BOOL_FIELD(parallelModeNeeded);
+	WRITE_BOOL_FIELD(containsInitialPruning);
 	WRITE_INT_FIELD(jitFlags);
 	WRITE_NODE_FIELD(planTree);
+	WRITE_INT_FIELD(numPlanNodes);
 	WRITE_NODE_FIELD(rtable);
+	WRITE_BITMAPSET_FIELD(lockrels);
 	WRITE_NODE_FIELD(resultRelations);
 	WRITE_NODE_FIELD(appendRelations);
 	WRITE_NODE_FIELD(subplans);
@@ -1004,6 +1007,8 @@ _outPartitionPruneInfo(StringInfo str, const PartitionPruneInfo *node)
 	WRITE_NODE_TYPE("PARTITIONPRUNEINFO");
 
 	WRITE_NODE_FIELD(prune_infos);
+	WRITE_BOOL_FIELD(needs_init_pruning);
+	WRITE_BOOL_FIELD(needs_exec_pruning);
 	WRITE_BITMAPSET_FIELD(other_subplans);
 }
 
@@ -2274,6 +2279,7 @@ _outPlannerGlobal(StringInfo str, const PlannerGlobal *node)
 	WRITE_NODE_FIELD(subplans);
 	WRITE_BITMAPSET_FIELD(rewindPlanIDs);
 	WRITE_NODE_FIELD(finalrtable);
+	WRITE_BITMAPSET_FIELD(lockrels);
 	WRITE_NODE_FIELD(finalrowmarks);
 	WRITE_NODE_FIELD(resultRelations);
 	WRITE_NODE_FIELD(appendRelations);
@@ -2697,6 +2703,31 @@ _outExtensibleNode(StringInfo str, const ExtensibleNode *node)
 	methods->nodeOut(str, node);
 }
 
+/*****************************************************************************
+ *
+ *	Stuff from execnodes.h
+ *
+ *****************************************************************************/
+
+static void
+_outExecLockRelsInfo(StringInfo str, const ExecLockRelsInfo *node)
+{
+	WRITE_NODE_TYPE("EXECLOCKRELSINFO");
+
+	WRITE_BITMAPSET_FIELD(lockrels);
+	WRITE_INT_FIELD(numPlanNodes);
+	WRITE_NODE_FIELD(initPruningOutputs);
+	WRITE_INT_ARRAY(ipoIndexes, node->numPlanNodes);
+}
+
+static void
+_outPlanInitPruningOutput(StringInfo str, const PlanInitPruningOutput *node)
+{
+	WRITE_NODE_TYPE("PARTITIONINITPRUNINGOUTPUT");
+
+	WRITE_BITMAPSET_FIELD(initially_valid_subplans);
+}
+
 /*****************************************************************************
  *
  *	Stuff from parsenodes.h.
@@ -4538,6 +4569,16 @@ outNode(StringInfo str, const void *obj)
 				_outPartitionRangeDatum(str, obj);
 				break;
 
+				/*
+				 * EXECUTION NODES
+				 */
+			case T_ExecLockRelsInfo:
+				_outExecLockRelsInfo(str, obj);
+				break;
+			case T_PlanInitPruningOutput:
+				_outPlanInitPruningOutput(str, obj);
+				break;
+
 			default:
 
 				/*
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 3f68f7c18d..41ded72c4c 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -1585,9 +1585,12 @@ _readPlannedStmt(void)
 	READ_BOOL_FIELD(transientPlan);
 	READ_BOOL_FIELD(dependsOnRole);
 	READ_BOOL_FIELD(parallelModeNeeded);
+	READ_BOOL_FIELD(containsInitialPruning);
 	READ_INT_FIELD(jitFlags);
 	READ_NODE_FIELD(planTree);
+	READ_INT_FIELD(numPlanNodes);
 	READ_NODE_FIELD(rtable);
+	READ_BITMAPSET_FIELD(lockrels);
 	READ_NODE_FIELD(resultRelations);
 	READ_NODE_FIELD(appendRelations);
 	READ_NODE_FIELD(subplans);
@@ -2534,6 +2537,8 @@ _readPartitionPruneInfo(void)
 	READ_LOCALS(PartitionPruneInfo);
 
 	READ_NODE_FIELD(prune_infos);
+	READ_BOOL_FIELD(needs_init_pruning);
+	READ_BOOL_FIELD(needs_exec_pruning);
 	READ_BITMAPSET_FIELD(other_subplans);
 
 	READ_DONE();
@@ -2703,6 +2708,35 @@ _readPartitionRangeDatum(void)
 	READ_DONE();
 }
 
+/*
+ * _readExecLockRelsInfo
+ */
+static ExecLockRelsInfo *
+_readExecLockRelsInfo(void)
+{
+	READ_LOCALS(ExecLockRelsInfo);
+
+	READ_BITMAPSET_FIELD(lockrels);
+	READ_INT_FIELD(numPlanNodes);
+	READ_NODE_FIELD(initPruningOutputs);
+	READ_INT_ARRAY(ipoIndexes, local_node->numPlanNodes);
+
+	READ_DONE();
+}
+
+/*
+ * _readPlanInitPruningOutput
+ */
+static PlanInitPruningOutput *
+_readPlanInitPruningOutput(void)
+{
+	READ_LOCALS(PlanInitPruningOutput);
+
+	READ_BITMAPSET_FIELD(initially_valid_subplans);
+
+	READ_DONE();
+}
+
 /*
  * parseNodeString
  *
@@ -2974,6 +3008,10 @@ parseNodeString(void)
 		return_value = _readPartitionBoundSpec();
 	else if (MATCH("PARTITIONRANGEDATUM", 19))
 		return_value = _readPartitionRangeDatum();
+	else if (MATCH("EXECLOCKRELSINFO", 16))
+		return_value = _readExecLockRelsInfo();
+	else if (MATCH("PARTITIONINITPRUNINGOUTPUT", 26))
+		return_value = _readPlanInitPruningOutput();
 	else
 	{
 		elog(ERROR, "badly formatted node string \"%.32s\"...", token);
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index bd09f85aea..9e41bbd228 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -517,8 +517,11 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 	result->transientPlan = glob->transientPlan;
 	result->dependsOnRole = glob->dependsOnRole;
 	result->parallelModeNeeded = glob->parallelModeNeeded;
+	result->containsInitialPruning = glob->containsInitialPruning;
 	result->planTree = top_plan;
+	result->numPlanNodes = glob->lastPlanNodeId;
 	result->rtable = glob->finalrtable;
+	result->lockrels = glob->lockrels;
 	result->resultRelations = glob->resultRelations;
 	result->appendRelations = glob->appendRelations;
 	result->subplans = glob->subplans;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index a7b11b7f03..cee8c570fd 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -483,6 +483,7 @@ static void
 add_rte_to_flat_rtable(PlannerGlobal *glob, RangeTblEntry *rte)
 {
 	RangeTblEntry *newrte;
+	Index		rti = list_length(glob->finalrtable) + 1;
 
 	/* flat copy to duplicate all the scalar fields */
 	newrte = (RangeTblEntry *) palloc(sizeof(RangeTblEntry));
@@ -517,7 +518,10 @@ add_rte_to_flat_rtable(PlannerGlobal *glob, RangeTblEntry *rte)
 	 * but it would probably cost more cycles than it would save.
 	 */
 	if (newrte->rtekind == RTE_RELATION)
+	{
+		glob->lockrels = bms_add_member(glob->lockrels, rti);
 		glob->relationOids = lappend_oid(glob->relationOids, newrte->relid);
+	}
 }
 
 /*
@@ -1548,6 +1552,9 @@ set_append_references(PlannerInfo *root,
 				pinfo->rtindex += rtoffset;
 			}
 		}
+
+		if (aplan->part_prune_info->needs_init_pruning)
+			root->glob->containsInitialPruning = true;
 	}
 
 	/* We don't need to recurse to lefttree or righttree ... */
@@ -1620,6 +1627,9 @@ set_mergeappend_references(PlannerInfo *root,
 				pinfo->rtindex += rtoffset;
 			}
 		}
+
+		if (mplan->part_prune_info->needs_init_pruning)
+			root->glob->containsInitialPruning = true;
 	}
 
 	/* We don't need to recurse to lefttree or righttree ... */
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 7080cb25d9..3322dc79f2 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -144,7 +144,9 @@ static List *make_partitionedrel_pruneinfo(PlannerInfo *root,
 										   List *prunequal,
 										   Bitmapset *partrelids,
 										   int *relid_subplan_map,
-										   Bitmapset **matchedsubplans);
+										   Bitmapset **matchedsubplans,
+										   bool *needs_init_pruning,
+										   bool *needs_exec_pruning);
 static void gen_partprune_steps(RelOptInfo *rel, List *clauses,
 								PartClauseTarget target,
 								GeneratePruningStepsContext *context);
@@ -230,6 +232,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	int		   *relid_subplan_map;
 	ListCell   *lc;
 	int			i;
+	bool		needs_init_pruning = false;
+	bool		needs_exec_pruning = false;
 
 	/*
 	 * Scan the subpaths to see which ones are scans of partition child
@@ -309,12 +313,16 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		Bitmapset  *partrelids = (Bitmapset *) lfirst(lc);
 		List	   *pinfolist;
 		Bitmapset  *matchedsubplans = NULL;
+		bool		partrel_needs_init_pruning;
+		bool		partrel_needs_exec_pruning;
 
 		pinfolist = make_partitionedrel_pruneinfo(root, parentrel,
 												  prunequal,
 												  partrelids,
 												  relid_subplan_map,
-												  &matchedsubplans);
+												  &matchedsubplans,
+												  &partrel_needs_init_pruning,
+												  &partrel_needs_exec_pruning);
 
 		/* When pruning is possible, record the matched subplans */
 		if (pinfolist != NIL)
@@ -323,6 +331,10 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 			allmatchedsubplans = bms_join(matchedsubplans,
 										  allmatchedsubplans);
 		}
+		if (!needs_init_pruning)
+			needs_init_pruning = partrel_needs_init_pruning;
+		if (!needs_exec_pruning)
+			needs_exec_pruning = partrel_needs_exec_pruning;
 	}
 
 	pfree(relid_subplan_map);
@@ -337,6 +349,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	/* Else build the result data structure */
 	pruneinfo = makeNode(PartitionPruneInfo);
 	pruneinfo->prune_infos = prunerelinfos;
+	pruneinfo->needs_init_pruning = needs_init_pruning;
+	pruneinfo->needs_exec_pruning = needs_exec_pruning;
 
 	/*
 	 * Some subplans may not belong to any of the identified partitioned rels.
@@ -435,13 +449,18 @@ add_part_relids(List *allpartrelids, Bitmapset *partrelids)
  * If we cannot find any useful run-time pruning steps, return NIL.
  * However, on success, each rel identified in partrelids will have
  * an element in the result list, even if some of them are useless.
+ * *needs_init_pruning and *needs_exec_pruning are set to indicate that the
+ * returned PartitionedRelPruneInfos contain pruning steps that can be
+ * performed before and after execution begins, respectively.
  */
 static List *
 make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 							  List *prunequal,
 							  Bitmapset *partrelids,
 							  int *relid_subplan_map,
-							  Bitmapset **matchedsubplans)
+							  Bitmapset **matchedsubplans,
+							  bool *needs_init_pruning,
+							  bool *needs_exec_pruning)
 {
 	RelOptInfo *targetpart = NULL;
 	List	   *pinfolist = NIL;
@@ -452,6 +471,10 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	int			rti;
 	int			i;
 
+	/* Will find out below. */
+	*needs_init_pruning = false;
+	*needs_exec_pruning = false;
+
 	/*
 	 * Examine each partitioned rel, constructing a temporary array to map
 	 * from planner relids to index of the partitioned rel, and building a
@@ -539,6 +562,9 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		 * executor per-scan pruning steps.  This first pass creates startup
 		 * pruning steps and detects whether there's any possibly-useful quals
 		 * that would require per-scan pruning.
+		 *
+		 * In the first pass, we note whether the 2nd pass is necessary by
+		 * noting the presence of EXEC parameters.
 		 */
 		gen_partprune_steps(subpart, partprunequal, PARTTARGET_INITIAL,
 							&context);
@@ -613,6 +639,11 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		pinfo->execparamids = execparamids;
 		/* Remaining fields will be filled in the next loop */
 
+		if (!*needs_init_pruning)
+			*needs_init_pruning = (initial_pruning_steps != NIL);
+		if (!*needs_exec_pruning)
+			*needs_exec_pruning = (exec_pruning_steps != NIL);
+
 		pinfolist = lappend(pinfolist, pinfo);
 	}
 
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index ba2fcfeb4a..085eb3f209 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -945,15 +945,17 @@ pg_plan_query(Query *querytree, const char *query_string, int cursorOptions,
  * For normal optimizable statements, invoke the planner.  For utility
  * statements, just make a wrapper PlannedStmt node.
  *
- * The result is a list of PlannedStmt nodes.
+ * The result is a list of PlannedStmt nodes.  Also, a NULL is appended to
+ * *execlockrelsinfo_list for each PlannedStmt added to the returned list.
  */
 List *
 pg_plan_queries(List *querytrees, const char *query_string, int cursorOptions,
-				ParamListInfo boundParams)
+				ParamListInfo boundParams, List **execlockrelsinfo_list)
 {
 	List	   *stmt_list = NIL;
 	ListCell   *query_list;
 
+	*execlockrelsinfo_list = NIL;
 	foreach(query_list, querytrees)
 	{
 		Query	   *query = lfirst_node(Query, query_list);
@@ -977,6 +979,7 @@ pg_plan_queries(List *querytrees, const char *query_string, int cursorOptions,
 		}
 
 		stmt_list = lappend(stmt_list, stmt);
+		*execlockrelsinfo_list = lappend(*execlockrelsinfo_list, NULL);
 	}
 
 	return stmt_list;
@@ -1080,7 +1083,8 @@ exec_simple_query(const char *query_string)
 		QueryCompletion qc;
 		MemoryContext per_parsetree_context = NULL;
 		List	   *querytree_list,
-				   *plantree_list;
+				   *plantree_list,
+				   *plantree_execlockrelsinfo_list;
 		Portal		portal;
 		DestReceiver *receiver;
 		int16		format;
@@ -1167,7 +1171,8 @@ exec_simple_query(const char *query_string)
 												NULL, 0, NULL);
 
 		plantree_list = pg_plan_queries(querytree_list, query_string,
-										CURSOR_OPT_PARALLEL_OK, NULL);
+										CURSOR_OPT_PARALLEL_OK, NULL,
+										&plantree_execlockrelsinfo_list);
 
 		/*
 		 * Done with the snapshot used for parsing/planning.
@@ -1203,6 +1208,7 @@ exec_simple_query(const char *query_string)
 						  query_string,
 						  commandTag,
 						  plantree_list,
+						  plantree_execlockrelsinfo_list,
 						  NULL);
 
 		/*
@@ -1991,6 +1997,7 @@ exec_bind_message(StringInfo input_message)
 					  query_string,
 					  psrc->commandTag,
 					  cplan->stmt_list,
+					  cplan->execlockrelsinfo_list,
 					  cplan);
 
 	/* Done with the snapshot used for parameter I/O and parsing/planning */
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 5f907831a3..972ddc014e 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -35,7 +35,7 @@
 Portal		ActivePortal = NULL;
 
 
-static void ProcessQuery(PlannedStmt *plan,
+static void ProcessQuery(PlannedStmt *plan, ExecLockRelsInfo *execlockrelsinfo,
 						 const char *sourceText,
 						 ParamListInfo params,
 						 QueryEnvironment *queryEnv,
@@ -65,6 +65,7 @@ static void DoPortalRewind(Portal portal);
  */
 QueryDesc *
 CreateQueryDesc(PlannedStmt *plannedstmt,
+				ExecLockRelsInfo *execlockrelsinfo,
 				const char *sourceText,
 				Snapshot snapshot,
 				Snapshot crosscheck_snapshot,
@@ -77,6 +78,7 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
 
 	qd->operation = plannedstmt->commandType;	/* operation */
 	qd->plannedstmt = plannedstmt;	/* plan */
+	qd->execlockrelsinfo = execlockrelsinfo;		/* ExecutorGetLockRels() output for plan */
 	qd->sourceText = sourceText;	/* query text */
 	qd->snapshot = RegisterSnapshot(snapshot);	/* snapshot */
 	/* RI check snapshot */
@@ -122,6 +124,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  *		PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
  *
  *	plan: the plan tree for the query
+ *	execlockrelsinfo: ExecutorGetLockRels() output for the plan tree
  *	sourceText: the source text of the query
  *	params: any parameters needed
  *	dest: where to send results
@@ -134,6 +137,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  */
 static void
 ProcessQuery(PlannedStmt *plan,
+			 ExecLockRelsInfo *execlockrelsinfo,
 			 const char *sourceText,
 			 ParamListInfo params,
 			 QueryEnvironment *queryEnv,
@@ -145,7 +149,7 @@ ProcessQuery(PlannedStmt *plan,
 	/*
 	 * Create the QueryDesc object
 	 */
-	queryDesc = CreateQueryDesc(plan, sourceText,
+	queryDesc = CreateQueryDesc(plan, execlockrelsinfo, sourceText,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, params, queryEnv, 0);
 
@@ -490,6 +494,7 @@ PortalStart(Portal portal, ParamListInfo params,
 				 * the destination to DestNone.
 				 */
 				queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+											linitial_node(ExecLockRelsInfo, portal->execlockrelsinfos),
 											portal->sourceText,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
@@ -1190,7 +1195,8 @@ PortalRunMulti(Portal portal,
 			   QueryCompletion *qc)
 {
 	bool		active_snapshot_set = false;
-	ListCell   *stmtlist_item;
+	ListCell   *stmtlist_item,
+			   *execlockrelsinfolist_item;
 
 	/*
 	 * If the destination is DestRemoteExecute, change to DestNone.  The
@@ -1211,9 +1217,12 @@ PortalRunMulti(Portal portal,
 	 * Loop to handle the individual queries generated from a single parsetree
 	 * by analysis and rewrite.
 	 */
-	foreach(stmtlist_item, portal->stmts)
+	forboth(stmtlist_item, portal->stmts,
+			execlockrelsinfolist_item, portal->execlockrelsinfos)
 	{
 		PlannedStmt *pstmt = lfirst_node(PlannedStmt, stmtlist_item);
+		ExecLockRelsInfo *execlockrelsinfo = lfirst_node(ExecLockRelsInfo,
+											   execlockrelsinfolist_item);
 
 		/*
 		 * If we got a cancel signal in prior command, quit
@@ -1271,7 +1280,7 @@ PortalRunMulti(Portal portal,
 			if (pstmt->canSetTag)
 			{
 				/* statement can set tag string */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, execlockrelsinfo,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
@@ -1280,7 +1289,7 @@ PortalRunMulti(Portal portal,
 			else
 			{
 				/* stmt added by rewrite cannot set tag */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, execlockrelsinfo,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 4cf6db504f..c40a6f19d6 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -99,14 +99,15 @@ static dlist_head cached_expression_list = DLIST_STATIC_INIT(cached_expression_l
 static void ReleaseGenericPlan(CachedPlanSource *plansource);
 static List *RevalidateCachedQuery(CachedPlanSource *plansource,
 								   QueryEnvironment *queryEnv);
-static bool CheckCachedPlan(CachedPlanSource *plansource);
+static bool CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams);
 static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 								   ParamListInfo boundParams, QueryEnvironment *queryEnv);
 static bool choose_custom_plan(CachedPlanSource *plansource,
 							   ParamListInfo boundParams);
 static double cached_plan_cost(CachedPlan *plan, bool include_planner);
 static Query *QueryListGetPrimaryStmt(List *stmts);
-static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static List *AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams);
+static void ReleaseExecutorLocks(List *stmt_list, List *execlockrelsinfo_list);
 static void AcquirePlannerLocks(List *stmt_list, bool acquire);
 static void ScanQueryForLocks(Query *parsetree, bool acquire);
 static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -782,6 +783,47 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
 	return tlist;
 }
 
+/*
+ * CachedPlanSaveExecLockRelsInfos
+ *		Save the list containing ExecLockRelsInfo nodes in the given CachedPlan
+ *
+ * The provided list is copied into a dedicated context that is a child of
+ * plan->context.
+ */
+static void
+CachedPlanSaveExecLockRelsInfos(CachedPlan *plan, List *execlockrelsinfo_list)
+{
+	MemoryContext	execlockrelsinfo_context = plan->execlockrelsinfo_context,
+					oldcontext = CurrentMemoryContext;
+	List		   *execlockrelsinfo_list_copy;
+
+	/*
+	 * Set up the dedicated context if not already done, saving it as a child
+	 * of the CachedPlan's context.
+	 */
+	if (execlockrelsinfo_context == NULL)
+	{
+		execlockrelsinfo_context = AllocSetContextCreate(CurrentMemoryContext,
+												 "CachedPlan execlockrelsinfo list",
+												 ALLOCSET_START_SMALL_SIZES);
+		MemoryContextSetParent(execlockrelsinfo_context, plan->context);
+		MemoryContextSetIdentifier(execlockrelsinfo_context, plan->context->ident);
+		plan->execlockrelsinfo_context = execlockrelsinfo_context;
+	}
+	else
+	{
+		/* Just clear existing contents by resetting the context. */
+		Assert(MemoryContextIsValid(execlockrelsinfo_context));
+		MemoryContextReset(execlockrelsinfo_context);
+	}
+
+	MemoryContextSwitchTo(execlockrelsinfo_context);
+	execlockrelsinfo_list_copy = copyObject(execlockrelsinfo_list);
+	MemoryContextSwitchTo(oldcontext);
+
+	plan->execlockrelsinfo_list = execlockrelsinfo_list_copy;
+}
+
 /*
  * CheckCachedPlan: see if the CachedPlanSource's generic plan is valid.
  *
@@ -790,9 +832,17 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
  *
  * On a "true" return, we have acquired the locks needed to run the plan.
  * (We must do this for the "true" result to be race-condition-free.)
+ *
+ * If the CachedPlan is valid, this calls ExecutorGetLockRels on each
+ * PlannedStmt contained in it to determine the set of relations to lock by
+ * AcquireExecutorLocks().  Resulting ExecLockRelsInfo nodes, allocated in a
+ * child context of the context containing the plan itself, are added into
+ * plan->execlockrelsinfo_list.  ExecLockRelsInfo nodes that may be present
+ * in the list from the last invocation of CheckCachedPlan() on the same
+ * CachedPlan are deleted.
  */
 static bool
-CheckCachedPlan(CachedPlanSource *plansource)
+CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams)
 {
 	CachedPlan *plan = plansource->gplan;
 
@@ -820,13 +870,22 @@ CheckCachedPlan(CachedPlanSource *plansource)
 	 */
 	if (plan->is_valid)
 	{
+		List   *execlockrelsinfo_list;
+
 		/*
 		 * Plan must have positive refcount because it is referenced by
 		 * plansource; so no need to fear it disappears under us here.
 		 */
 		Assert(plan->refcount > 0);
 
-		AcquireExecutorLocks(plan->stmt_list, true);
+		/*
+		 * Lock relations scanned by the plan.  This also invokes
+		 * ExecutorGetLockRels() to do initial partition pruning on the plan
+		 * tree iff some nodes in it are marked as needing it.  Relations whose
+		 * scan nodes are pruned as a result of that are not locked here.
+		 */
+		execlockrelsinfo_list = AcquireExecutorLocks(plan->stmt_list,
+													boundParams);
 
 		/*
 		 * If plan was transient, check to see if TransactionXmin has
@@ -844,11 +903,14 @@ CheckCachedPlan(CachedPlanSource *plansource)
 		if (plan->is_valid)
 		{
 			/* Successfully revalidated and locked the query. */
+
+			/* Remember ExecLockRelsInfos in the CachedPlan. */
+			CachedPlanSaveExecLockRelsInfos(plan, execlockrelsinfo_list);
 			return true;
 		}
 
 		/* Oops, the race case happened.  Release useless locks. */
-		AcquireExecutorLocks(plan->stmt_list, false);
+		ReleaseExecutorLocks(plan->stmt_list, execlockrelsinfo_list);
 	}
 
 	/*
@@ -880,7 +942,8 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 				ParamListInfo boundParams, QueryEnvironment *queryEnv)
 {
 	CachedPlan *plan;
-	List	   *plist;
+	List	   *plist,
+			   *execlockrelsinfo_list;
 	bool		snapshot_set;
 	bool		is_transient;
 	MemoryContext plan_context;
@@ -933,7 +996,8 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 	 * Generate the plan.
 	 */
 	plist = pg_plan_queries(qlist, plansource->query_string,
-							plansource->cursor_options, boundParams);
+							plansource->cursor_options, boundParams,
+							&execlockrelsinfo_list);
 
 	/* Release snapshot if we got one */
 	if (snapshot_set)
@@ -1002,6 +1066,11 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 	plan->is_saved = false;
 	plan->is_valid = true;
 
+	/* Save the dummy ExecLockRelsInfo list. */
+	plan->execlockrelsinfo_context = NULL;
+	CachedPlanSaveExecLockRelsInfos(plan, execlockrelsinfo_list);
+	Assert(MemoryContextIsValid(plan->execlockrelsinfo_context));
+
 	/* assign generation number to new plan */
 	plan->generation = ++(plansource->generation);
 
@@ -1160,7 +1229,7 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 
 	if (!customplan)
 	{
-		if (CheckCachedPlan(plansource))
+		if (CheckCachedPlan(plansource, boundParams))
 		{
 			/* We want a generic plan, and we already have a valid one */
 			plan = plansource->gplan;
@@ -1366,7 +1435,6 @@ CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
 	foreach(lc, plan->stmt_list)
 	{
 		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc);
-		ListCell   *lc2;
 
 		if (plannedstmt->commandType == CMD_UTILITY)
 			return false;
@@ -1375,13 +1443,8 @@ CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
 		 * We have to grovel through the rtable because it's likely to contain
 		 * an RTE_RESULT relation, rather than being totally empty.
 		 */
-		foreach(lc2, plannedstmt->rtable)
-		{
-			RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
-
-			if (rte->rtekind == RTE_RELATION)
-				return false;
-		}
+		if (!bms_is_empty(plannedstmt->lockrels))
+			return false;
 	}
 
 	/*
@@ -1737,17 +1800,22 @@ QueryListGetPrimaryStmt(List *stmts)
 
 /*
  * AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
- * or release them if acquire is false.
+ *
+ * Returns a list of ExecLockRelsInfo nodes containing one element for each
+ * PlannedStmt in stmt_list; the element is NULL if the PlannedStmt is a
+ * utility statement or its containsInitialPruning is false.
  */
-static void
-AcquireExecutorLocks(List *stmt_list, bool acquire)
+static List *
+AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams)
 {
 	ListCell   *lc1;
+	List	   *execlockrelsinfo_list = NIL;
 
 	foreach(lc1, stmt_list)
 	{
 		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
-		ListCell   *lc2;
+		ExecLockRelsInfo *execlockrelsinfo = NULL;
+		int			rti;
 
 		if (plannedstmt->commandType == CMD_UTILITY)
 		{
@@ -1761,27 +1829,113 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 			Query	   *query = UtilityContainsQuery(plannedstmt->utilityStmt);
 
 			if (query)
-				ScanQueryForLocks(query, acquire);
-			continue;
+				ScanQueryForLocks(query, true);
 		}
-
-		foreach(lc2, plannedstmt->rtable)
+		else
 		{
-			RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
-
-			if (rte->rtekind != RTE_RELATION)
-				continue;
+			Bitmapset   *lockrels;
 
 			/*
-			 * Acquire the appropriate type of lock on each relation OID. Note
-			 * that we don't actually try to open the rel, and hence will not
-			 * fail if it's been dropped entirely --- we'll just transiently
-			 * acquire a non-conflicting lock.
+			 * Figure out the set of relations that would need to be locked
+			 * before executing the plan.
 			 */
-			if (acquire)
+			if (!plannedstmt->containsInitialPruning)
+			{
+				/*
+				 * If the plan contains no initial pruning steps, the executor
+				 * would just need to lock whatever relations the planner would
+				 * have locked to make the plan.
+				 */
+				lockrels = plannedstmt->lockrels;
+			}
+			else
+			{
+				/*
+				 * Ask the executor to perform initial pruning steps to skip
+				 * relations that are pruned away.
+				 */
+				execlockrelsinfo = ExecutorGetLockRels(plannedstmt, boundParams);
+				lockrels = execlockrelsinfo->lockrels;
+			}
+
+			rti = -1;
+			while ((rti = bms_next_member(lockrels, rti)) >= 0)
+			{
+				RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
+
+				Assert(rte->rtekind == RTE_RELATION);
+
+				/*
+				 * Acquire the appropriate type of lock on each relation OID.
+				 * Note that we don't actually try to open the rel, and hence
+				 * will not fail if it's been dropped entirely --- we'll just
+				 * transiently acquire a non-conflicting lock.
+				 */
 				LockRelationOid(rte->relid, rte->rellockmode);
+			}
+		}
+
+		/*
+		 * Remember ExecLockRelsInfo for later adding to the QueryDesc that
+		 * will be passed to the executor when executing this plan.  May be
+		 * NULL, but must keep the list the same length as stmt_list.
+		 */
+		execlockrelsinfo_list = lappend(execlockrelsinfo_list,
+										execlockrelsinfo);
+	}
+
+	return execlockrelsinfo_list;
+}
+
+/*
+ * ReleaseExecutorLocks
+ * 		Release locks that would've been acquired by an earlier call to
+ * 		AcquireExecutorLocks()
+ */
+static void
+ReleaseExecutorLocks(List *stmt_list, List *execlockrelsinfo_list)
+{
+	ListCell   *lc1,
+			   *lc2;
+
+	forboth(lc1, stmt_list, lc2, execlockrelsinfo_list)
+	{
+		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
+		ExecLockRelsInfo *execlockrelsinfo = lfirst_node(ExecLockRelsInfo, lc2);
+		int			rti;
+
+		if (plannedstmt->commandType == CMD_UTILITY)
+		{
+			/*
+			 * Ignore utility statements, except those (such as EXPLAIN) that
+			 * contain a parsed-but-not-planned query.  Note: it's okay to use
+			 * ScanQueryForLocks, even though the query hasn't been through
+			 * rule rewriting, because rewriting doesn't change the query
+			 * representation.
+			 */
+			Query	   *query = UtilityContainsQuery(plannedstmt->utilityStmt);
+
+			if (query)
+				ScanQueryForLocks(query, false);
+		}
+		else
+		{
+			Bitmapset   *lockrels;
+
+			if (execlockrelsinfo == NULL)
+				lockrels = plannedstmt->lockrels;
 			else
+				lockrels = execlockrelsinfo->lockrels;
+
+			rti = -1;
+			while ((rti = bms_next_member(lockrels, rti)) >= 0)
+			{
+				RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
+
+				Assert(rte->rtekind == RTE_RELATION);
+
 				UnlockRelationOid(rte->relid, rte->rellockmode);
+			}
 		}
 	}
 }
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index d549f66d4a..896f51be08 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -285,6 +285,7 @@ PortalDefineQuery(Portal portal,
 				  const char *sourceText,
 				  CommandTag commandTag,
 				  List *stmts,
+				  List *execlockrelsinfos,
 				  CachedPlan *cplan)
 {
 	AssertArg(PortalIsValid(portal));
@@ -299,6 +300,7 @@ PortalDefineQuery(Portal portal,
 	portal->qc.nprocessed = 0;
 	portal->commandTag = commandTag;
 	portal->stmts = stmts;
+	portal->execlockrelsinfos = execlockrelsinfos;
 	portal->cplan = cplan;
 	portal->status = PORTAL_DEFINED;
 }
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 666977fb1f..fef75ba147 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -87,7 +87,8 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
 							  ExplainState *es, const char *queryString,
 							  ParamListInfo params, QueryEnvironment *queryEnv);
 
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt, ExecLockRelsInfo *execlockrelsinfo,
+						   IntoClause *into,
 						   ExplainState *es, const char *queryString,
 						   ParamListInfo params, QueryEnvironment *queryEnv,
 						   const instr_time *planduration,
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index fd5735a946..ded19b8cbb 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -124,4 +124,6 @@ extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
 						 PartitionPruneInfo *pruneinfo,
 						 Bitmapset **initially_valid_subplans);
 extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate);
+extern Bitmapset *ExecGetLockRelsDoInitialPruning(Plan *plan, ExecGetLockRelsContext *context,
+								PartitionPruneInfo *pruneinfo);
 #endif							/* EXECPARTITION_H */
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index e79e2c001f..4338463479 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -35,6 +35,7 @@ typedef struct QueryDesc
 	/* These fields are provided by CreateQueryDesc */
 	CmdType		operation;		/* CMD_SELECT, CMD_UPDATE, etc. */
 	PlannedStmt *plannedstmt;	/* planner's output (could be utility, too) */
+	ExecLockRelsInfo *execlockrelsinfo;	/* ExecutorGetLockRels()'s output given plannedstmt */
 	const char *sourceText;		/* source text of the query */
 	Snapshot	snapshot;		/* snapshot to use for query */
 	Snapshot	crosscheck_snapshot;	/* crosscheck for RI update/delete */
@@ -57,6 +58,7 @@ typedef struct QueryDesc
 
 /* in pquery.c */
 extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
+								  ExecLockRelsInfo *execlockrelsinfo,
 								  const char *sourceText,
 								  Snapshot snapshot,
 								  Snapshot crosscheck_snapshot,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index 344399f6a8..5959d67221 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -185,6 +185,8 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
 /*
  * prototypes from functions in execMain.c
  */
+extern ExecLockRelsInfo *ExecutorGetLockRels(PlannedStmt *plannedstmt, ParamListInfo params);
+extern bool ExecGetLockRels(Plan *node, ExecGetLockRelsContext *context);
 extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
 extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
 extern void ExecutorRun(QueryDesc *queryDesc,
diff --git a/src/include/executor/nodeAppend.h b/src/include/executor/nodeAppend.h
index 4cb78ee5b6..b53535c2a4 100644
--- a/src/include/executor/nodeAppend.h
+++ b/src/include/executor/nodeAppend.h
@@ -17,6 +17,7 @@
 #include "access/parallel.h"
 #include "nodes/execnodes.h"
 
+extern bool ExecGetAppendLockRels(Append *node, ExecGetLockRelsContext *context);
 extern AppendState *ExecInitAppend(Append *node, EState *estate, int eflags);
 extern void ExecEndAppend(AppendState *node);
 extern void ExecReScanAppend(AppendState *node);
diff --git a/src/include/executor/nodeMergeAppend.h b/src/include/executor/nodeMergeAppend.h
index 97fe3b0665..8eb4e9df93 100644
--- a/src/include/executor/nodeMergeAppend.h
+++ b/src/include/executor/nodeMergeAppend.h
@@ -16,6 +16,7 @@
 
 #include "nodes/execnodes.h"
 
+extern bool ExecGetMergeAppendLockRels(MergeAppend *node, ExecGetLockRelsContext *context);
 extern MergeAppendState *ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags);
 extern void ExecEndMergeAppend(MergeAppendState *node);
 extern void ExecReScanMergeAppend(MergeAppendState *node);
diff --git a/src/include/executor/nodeModifyTable.h b/src/include/executor/nodeModifyTable.h
index 1d225bc88d..5006499088 100644
--- a/src/include/executor/nodeModifyTable.h
+++ b/src/include/executor/nodeModifyTable.h
@@ -19,6 +19,7 @@ extern void ExecComputeStoredGenerated(ResultRelInfo *resultRelInfo,
 									   EState *estate, TupleTableSlot *slot,
 									   CmdType cmdtype);
 
+extern bool ExecGetModifyTableLockRels(ModifyTable *plan, ExecGetLockRelsContext *context);
 extern ModifyTableState *ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags);
 extern void ExecEndModifyTable(ModifyTableState *node);
 extern void ExecReScanModifyTable(ModifyTableState *node);
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index dd95dc40c7..718603d400 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -570,6 +570,7 @@ typedef struct EState
 	struct ExecRowMark **es_rowmarks;	/* Array of per-range-table-entry
 										 * ExecRowMarks, or NULL if none */
 	PlannedStmt *es_plannedstmt;	/* link to top of plan tree */
+	struct ExecLockRelsInfo *es_execlockrelsinfo; /* QueryDesc.execlockrelsinfo */
 	const char *es_sourceText;	/* Source text from QueryDesc */
 
 	JunkFilter *es_junkFilter;	/* top-level junk filter, if any */
@@ -958,6 +959,92 @@ typedef struct DomainConstraintState
  */
 typedef TupleTableSlot *(*ExecProcNodeMtd) (struct PlanState *pstate);
 
+/*----------------
+ * ExecLockRelsInfo
+ *
+ * Result of performing ExecutorGetLockRels() for a given PlannedStmt
+ */
+typedef struct ExecLockRelsInfo
+{
+	NodeTag		type;
+
+	/*
+	 * Relations that must be locked to execute the plan tree contained in
+	 * the PlannedStmt.
+	 */
+	Bitmapset  *lockrels;
+
+	/* PlannedStmt.numPlanNodes */
+	int			numPlanNodes;
+
+	/*
+	 * List of PlanInitPruningOutput, each representing the output of
+	 * performing initial pruning on a given plan node, for all nodes in the
+	 * plan tree that have been marked as needing initial pruning.
+	 *
+	 * 'ipoIndexes' is an array of 'numPlanNodes' elements, indexed by the
+	 * plan_node_id of each node in the plan tree.  Each element is a 1-based
+	 * index into the 'initPruningOutputs' list for that plan node; 0 means
+	 * that the node has no entry in the list because it does not need any
+	 * initial pruning.
+	 */
+	List	   *initPruningOutputs;
+	int		   *ipoIndexes;
+} ExecLockRelsInfo;
+
+/*----------------
+ * ExecGetLockRelsContext
+ *
+ * Context information for performing ExecutorGetLockRels() on a given plan
+ */
+typedef struct ExecGetLockRelsContext
+{
+	NodeTag		type;
+
+	PlannedStmt	   *stmt;		/* target plan */
+	ParamListInfo	params;		/* EXTERN parameters to prune with */
+
+	/* Output parameters for ExecGetLockRels and its subroutines. */
+	Bitmapset	   *lockrels;
+
+	/* See above comment. */
+	List		   *initPruningOutputs;
+	int			   *ipoIndexes;
+} ExecGetLockRelsContext;
+
+#define ExecStorePlanInitPruningOutput(prepcxt, initPruningOutput, plannode) \
+	do { \
+		(prepcxt)->initPruningOutputs = lappend((prepcxt)->initPruningOutputs, initPruningOutput); \
+		(prepcxt)->ipoIndexes[(plannode)->plan_node_id] = list_length((prepcxt)->initPruningOutputs); \
+	} while (0)
+
+#define ExecFetchPlanInitPruningOutput(prepres, plannode) \
+		(((prepres) != NULL && (prepres)->initPruningOutputs != NIL) ? \
+		 list_nth((prepres)->initPruningOutputs, \
+				  (prepres)->ipoIndexes[(plannode)->plan_node_id] - 1) : NULL)
+
+/* ---------------
+ * PlanInitPruningOutput
+ *
+ * Node to remember the result of performing initial partition pruning steps
+ * during ExecutorGetLockRels() on nodes that support pruning.
+ *
+ * ExecGetLockRelsDoInitialPruning(), which runs during ExecutorGetLockRels(),
+ * creates it and stores it in the corresponding ExecLockRelsInfo.
+ *
+ * ExecInitPartitionPruning(), which runs during ExecutorStart(), fetches it
+ * from the EState's ExecLockRelsInfo (if any) and uses the value of
+ * initially_valid_subplans contained in it as-is to select the subplans to be
+ * initialized for execution, instead of re-evaluating that by performing
+ * initial pruning again.
+ */
+typedef struct PlanInitPruningOutput
+{
+	NodeTag		type;
+
+	Bitmapset  *initially_valid_subplans;
+} PlanInitPruningOutput;
+
 /* ----------------
  *		PlanState node
  *
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 5d075f0c34..d365fc4402 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -96,6 +96,11 @@ typedef enum NodeTag
 	T_PartitionPruneStepCombine,
 	T_PlanInvalItem,
 
+	/* TAGS FOR EXECUTOR PREP NODES (execnodes.h) */
+	T_ExecGetLockRelsContext,
+	T_ExecLockRelsInfo,
+	T_PlanInitPruningOutput,
+
 	/*
 	 * TAGS FOR PLAN STATE NODES (execnodes.h)
 	 *
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 1f3845b3fe..96c652ebaf 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -101,6 +101,9 @@ typedef struct PlannerGlobal
 
 	List	   *finalrtable;	/* "flat" rangetable for executor */
 
+	Bitmapset  *lockrels;	/* Indexes of RTE_RELATION entries in range
+								 * table */
+
 	List	   *finalrowmarks;	/* "flat" list of PlanRowMarks */
 
 	List	   *resultRelations;	/* "flat" list of integer RT indexes */
@@ -129,6 +132,10 @@ typedef struct PlannerGlobal
 
 	char		maxParallelHazard;	/* worst PROPARALLEL hazard level */
 
+	bool		containsInitialPruning;	/* Do some Plan nodes in the tree
+										 * have initial (pre-exec) pruning
+										 * steps? */
+
 	PartitionDirectory partition_directory; /* partition descriptors */
 } PlannerGlobal;
 
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 0b518ce6b2..5a8c34bdf6 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -59,12 +59,21 @@ typedef struct PlannedStmt
 
 	bool		parallelModeNeeded; /* parallel mode required to execute? */
 
+	bool		containsInitialPruning;	/* Do some Plan nodes in the tree
+										 * have initial (pre-exec) pruning
+										 * steps? */
+
 	int			jitFlags;		/* which forms of JIT should be performed */
 
 	struct Plan *planTree;		/* tree of Plan nodes */
 
+	int			numPlanNodes;	/* number of nodes in planTree */
+
 	List	   *rtable;			/* list of RangeTblEntry nodes */
 
+	Bitmapset  *lockrels;	/* Indexes of RTE_RELATION entries in range
+								 * table */
+
 	/* rtable indexes of target relations for INSERT/UPDATE/DELETE */
 	List	   *resultRelations;	/* integer list of RT indexes, or NIL */
 
@@ -1172,6 +1181,13 @@ typedef struct PlanRowMark
  * prune_infos			List of Lists containing PartitionedRelPruneInfo nodes,
  *						one sublist per run-time-prunable partition hierarchy
  *						appearing in the parent plan node's subplans.
+ *
+ * needs_init_pruning	Does any of the PartitionedRelPruneInfos in
+ *						prune_infos have its initial_pruning_steps set?
+ *
+ * needs_exec_pruning	Does any of the PartitionedRelPruneInfos in
+ *						prune_infos have its exec_pruning_steps set?
+ *
  * other_subplans		Indexes of any subplans that are not accounted for
  *						by any of the PartitionedRelPruneInfo nodes in
  *						"prune_infos".  These subplans must not be pruned.
@@ -1180,6 +1196,8 @@ typedef struct PartitionPruneInfo
 {
 	NodeTag		type;
 	List	   *prune_infos;
+	bool		needs_init_pruning;
+	bool		needs_exec_pruning;
 	Bitmapset  *other_subplans;
 } PartitionPruneInfo;
 
diff --git a/src/include/tcop/tcopprot.h b/src/include/tcop/tcopprot.h
index 92291a750d..bf80c53bed 100644
--- a/src/include/tcop/tcopprot.h
+++ b/src/include/tcop/tcopprot.h
@@ -64,7 +64,7 @@ extern PlannedStmt *pg_plan_query(Query *querytree, const char *query_string,
 								  ParamListInfo boundParams);
 extern List *pg_plan_queries(List *querytrees, const char *query_string,
 							 int cursorOptions,
-							 ParamListInfo boundParams);
+							 ParamListInfo boundParams, List **execlockrelsinfo_list);
 
 extern bool check_max_stack_depth(int *newval, void **extra, GucSource source);
 extern void assign_max_stack_depth(int newval, void *extra);
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 95b99e3d25..2a847f54da 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -148,6 +148,9 @@ typedef struct CachedPlan
 {
 	int			magic;			/* should equal CACHEDPLAN_MAGIC */
 	List	   *stmt_list;		/* list of PlannedStmts */
+	List	   *execlockrelsinfo_list;	/* list of ExecLockRelsInfo nodes with one
+									 * element for each of stmt_list; NIL
+									 * if not a generic plan */
 	bool		is_oneshot;		/* is it a "oneshot" plan? */
 	bool		is_saved;		/* is CachedPlan in a long-lived context? */
 	bool		is_valid;		/* is the stmt_list currently valid? */
@@ -158,6 +161,8 @@ typedef struct CachedPlan
 	int			generation;		/* parent's generation number for this plan */
 	int			refcount;		/* count of live references to this struct */
 	MemoryContext context;		/* context containing this CachedPlan */
+	MemoryContext execlockrelsinfo_context;	/* context containing execlockrelsinfo_list,
+									 * a child of the above context */
 } CachedPlan;
 
 /*
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index aeddbdafe5..9abace6734 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -137,6 +137,10 @@ typedef struct PortalData
 	CommandTag	commandTag;		/* command tag for original query */
 	QueryCompletion qc;			/* command completion data for executed query */
 	List	   *stmts;			/* list of PlannedStmts */
+	List	   *execlockrelsinfos;	/* list of ExecLockRelsInfo nodes with one element
+								 * for each of 'stmts'; same as
+								 * cplan->execlockrelsinfo_list if cplan is
+								 * not NULL */
 	CachedPlan *cplan;			/* CachedPlan, if stmts are from one */
 
 	ParamListInfo portalParams; /* params to pass to query */
@@ -241,6 +245,7 @@ extern void PortalDefineQuery(Portal portal,
 							  const char *sourceText,
 							  CommandTag commandTag,
 							  List *stmts,
+							  List *execlockrelsinfos,
 							  CachedPlan *cplan);
 extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
 extern void PortalCreateHoldStore(Portal portal);
-- 
2.24.1
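
For readers following the ExecLockRelsInfo definition in the patch above,
here is a minimal standalone sketch of the 'initPruningOutputs' / 'ipoIndexes'
bookkeeping.  It is not code from the patch: IpoTable and the ipo_table_*
functions are hypothetical names, a fixed-size array stands in for the
PostgreSQL List, and the fetch function checks for index 0 explicitly, which
the ExecFetchPlanInitPruningOutput macro as posted does not appear to do.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PLAN_NODES 16		/* arbitrary bound for this sketch */

/* Hypothetical stand-in for the two related fields of ExecLockRelsInfo. */
typedef struct IpoTable
{
	const void *outputs[MAX_PLAN_NODES];	/* plays the role of initPruningOutputs */
	int			noutputs;		/* number of stored outputs */
	int			ipo_indexes[MAX_PLAN_NODES];	/* by plan_node_id; 0 = no entry */
	int			num_plan_nodes;
} IpoTable;

static void
ipo_table_init(IpoTable *t, int num_plan_nodes)
{
	assert(num_plan_nodes >= 0 && num_plan_nodes <= MAX_PLAN_NODES);
	t->noutputs = 0;
	t->num_plan_nodes = num_plan_nodes;
	for (int i = 0; i < MAX_PLAN_NODES; i++)
	{
		t->outputs[i] = NULL;
		t->ipo_indexes[i] = 0;
	}
}

/* Append an output and remember its 1-based position for the plan node. */
static void
ipo_table_store(IpoTable *t, int plan_node_id, const void *output)
{
	assert(plan_node_id >= 0 && plan_node_id < t->num_plan_nodes);
	assert(t->noutputs < MAX_PLAN_NODES);
	t->outputs[t->noutputs++] = output;
	t->ipo_indexes[plan_node_id] = t->noutputs;
}

/* Return the output stored for the plan node, or NULL if it has none. */
static const void *
ipo_table_fetch(const IpoTable *t, int plan_node_id)
{
	int			idx;

	assert(plan_node_id >= 0 && plan_node_id < t->num_plan_nodes);
	idx = t->ipo_indexes[plan_node_id];
	return (idx == 0) ? NULL : t->outputs[idx - 1];
}
```

Using 1-based indexes lets a zero-filled array mean "no entry", so only nodes
that actually had initial pruning done need to be written.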



  [application/octet-stream] v5-0001-Some-refactoring-of-runtime-pruning-code.patch (26.4K, 4-v5-0001-Some-refactoring-of-runtime-pruning-code.patch)
  download | inline diff:
From 1164015d8561151d1fb5d861b236961e237102ff Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Wed, 2 Mar 2022 15:17:55 +0900
Subject: [PATCH v5 1/3] Some refactoring of runtime pruning code

This does two things mainly:

* Move the execution pruning initialization steps that are common
between both ExecInitAppend() and ExecInitMergeAppend() into a new
function ExecInitPartitionPruning() defined in execPartition.c.
Thus, ExecFindInitialMatchingSubPlans() need not be exported.

* Add an ExprContext field to PartitionPruneContext to remove the
implicit assumption in the runtime pruning code that the ExprContext
to use to compute pruning expressions that need one can always rely
on the PlanState providing it.  A future patch will allow runtime
pruning (at least the initial pruning steps) to be performed without
the corresponding PlanState yet having been created, so this will
help.
---
 src/backend/executor/execPartition.c   | 340 ++++++++++++++++---------
 src/backend/executor/nodeAppend.c      |  33 +--
 src/backend/executor/nodeMergeAppend.c |  32 +--
 src/backend/partitioning/partprune.c   |  20 +-
 src/include/executor/execPartition.h   |   9 +-
 src/include/partitioning/partprune.h   |   2 +
 6 files changed, 255 insertions(+), 181 deletions(-)
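
As a reading aid for ExecPartitionPruneFixSubPlanIndexes() in the diff below,
here is a rough standalone sketch of the re-sequencing step.  It assumes plain
C arrays instead of Bitmapsets, and build_new_subplan_indexes() and
remap_subplan_map() are hypothetical names, not functions in the patch.  The
temporary array is 1-based with 0 for pruned subplans, as in the patch
comment; -1 in a subplan map means "no subplan for this partition".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Build an array mapping old subplan indexes to new ones.  Entries are
 * 1-based; subplans that did not survive initial pruning are left as 0.
 * The caller frees the result.  Returns NULL on allocation failure.
 */
static int *
build_new_subplan_indexes(const bool *survives, int n_total_subplans)
{
	int		   *new_indexes;
	int			newidx = 1;

	assert(n_total_subplans > 0);
	new_indexes = calloc((size_t) n_total_subplans, sizeof(int));
	if (new_indexes == NULL)
		return NULL;

	for (int i = 0; i < n_total_subplans; i++)
	{
		if (survives[i])
			new_indexes[i] = newidx++;
	}
	return new_indexes;
}

/*
 * Rewrite a partition-index-to-subplan-index map in place.  Partitions whose
 * subplan was pruned end up with -1.  Returns the number of partitions that
 * still have a subplan, i.e. the size of the recomputed "present" set.
 */
static int
remap_subplan_map(int *subplan_map, int nparts, const int *new_indexes)
{
	int			npresent = 0;

	for (int k = 0; k < nparts; k++)
	{
		int			oldidx = subplan_map[k];

		if (oldidx >= 0)
		{
			/* 1-based entry minus one: pruned (0) becomes -1 */
			subplan_map[k] = new_indexes[oldidx] - 1;
			if (subplan_map[k] >= 0)
				npresent++;
		}
	}
	return npresent;
}
```

With five subplans of which 0, 2 and 4 survive, the survivors are renumbered
0, 1 and 2, and any partition that pointed at subplan 1 or 3 is marked as
having none.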

diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 90ed1485d1..21953f253b 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -182,11 +182,18 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
 												  bool *isnull,
 												  int maxfieldlen);
 static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
+static PartitionPruneState *ExecCreatePartitionPruneState(PlanState *planstate,
+							  PartitionPruneInfo *partitionpruneinfo);
+static Bitmapset *ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate);
 static void ExecInitPruningContext(PartitionPruneContext *context,
 								   List *pruning_steps,
 								   PartitionDesc partdesc,
 								   PartitionKey partkey,
-								   PlanState *planstate);
+								   PlanState *planstate,
+								   ExprContext *econtext);
+static void ExecPartitionPruneFixSubPlanIndexes(PartitionPruneState *prunestate,
+									Bitmapset *initially_valid_subplans,
+									int n_total_subplans);
 static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
 										   PartitionedRelPruningData *pprune,
 										   bool initial_prune,
@@ -1485,30 +1492,87 @@ adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri)
  *
  * Functions:
  *
- * ExecCreatePartitionPruneState:
- *		Creates the PartitionPruneState required by each of the two pruning
- *		functions.  Details stored include how to map the partition index
- *		returned by the partition pruning code into subplan indexes.
- *
- * ExecFindInitialMatchingSubPlans:
- *		Returns indexes of matching subplans.  Partition pruning is attempted
- *		without any evaluation of expressions containing PARAM_EXEC Params.
- *		This function must be called during executor startup for the parent
- *		plan before the subplans themselves are initialized.  Subplans which
- *		are found not to match by this function must be removed from the
- *		plan's list of subplans during execution, as this function performs a
- *		remap of the partition index to subplan index map and the newly
- *		created map provides indexes only for subplans which remain after
- *		calling this function.
+ * ExecInitPartitionPruning:
+ *		Sets up run-time pruning data structure (PartitionPruneState) that is
+ *		needed by each of the two pruning functions.  Also determines the set
+ *		of initially valid subplans by performing initial pruning steps,
+ *		telling the caller (such as ExecInitAppend) to initialize only those
+ *		for execution.  Maps in PartitionPruneState that are used to map the
+ *		partition indexes returned by partprune.c functions into the indexes
+ *		of partition's subplans in the parent node (such as Append) are
+ *		updated to account for initial pruning having eliminated some of the
+ *		subplans, if any.
  *
  * ExecFindMatchingSubPlans:
  *		Returns indexes of matching subplans after evaluating all available
- *		expressions.  This function can only be called during execution and
- *		must be called again each time the value of a Param listed in
- *		PartitionPruneState's 'execparamids' changes.
+ *		expressions, that is, using execution pruning steps.  This function
+ *		can only be called during execution and must be called again each time
+ *		the value of a Param listed in PartitionPruneState's 'execparamids'
+ *		changes.
  *-------------------------------------------------------------------------
  */
 
+/*
+ * ExecInitPartitionPruning
+ * 		Initialize data structure needed for run-time partition pruning
+ *
+ * Initial pruning can be done immediately, so it is done here if needed and
+ * the indexes of the surviving partition subplans are added to the output
+ * parameter *initially_valid_subplans.  If subplans are indeed pruned,
+ * subplan_map arrays contained in the returned PartitionPruneState are
+ * re-sequenced to not count those, though only if the maps will be needed
+ * for subsequent execution pruning passes.
+ */
+PartitionPruneState *
+ExecInitPartitionPruning(PlanState *planstate,
+						 int n_total_subplans,
+						 PartitionPruneInfo *pruneinfo,
+						 Bitmapset **initially_valid_subplans)
+{
+	PartitionPruneState *prunestate;
+	EState *estate = planstate->state;
+
+	/* We may need an expression context to evaluate partition exprs */
+	ExecAssignExprContext(estate, planstate);
+
+	/*
+	 * Create the working data structure for pruning.
+	 */
+	prunestate = ExecCreatePartitionPruneState(planstate, pruneinfo);
+
+	/*
+	 * Perform an initial partition prune, if required.
+	 */
+	if (prunestate->do_initial_prune)
+	{
+		/* Determine which subplans survive initial pruning */
+		*initially_valid_subplans = ExecFindInitialMatchingSubPlans(prunestate);
+	}
+	else
+	{
+		/* We'll need to initialize all subplans */
+		Assert(n_total_subplans > 0);
+		*initially_valid_subplans = bms_add_range(NULL, 0,
+												  n_total_subplans - 1);
+	}
+
+	/*
+	 * Re-sequence subplan indexes contained in prunestate to account for any
+	 * that were removed above due to initial pruning.
+	 *
+	 * We can safely skip this when !do_exec_prune, even though that leaves
+	 * invalid data in prunestate, because that data won't be consulted again
+	 * (cf initial Assert in ExecFindMatchingSubPlans).
+	 */
+	if (prunestate->do_exec_prune &&
+		bms_num_members(*initially_valid_subplans) < n_total_subplans)
+		ExecPartitionPruneFixSubPlanIndexes(prunestate,
+											*initially_valid_subplans,
+											n_total_subplans);
+
+	return prunestate;
+}
+
 /*
  * ExecCreatePartitionPruneState
  *		Build the data structure required for calling
@@ -1527,7 +1591,7 @@ adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri)
  * re-used each time we re-evaluate which partitions match the pruning steps
  * provided in each PartitionedRelPruneInfo.
  */
-PartitionPruneState *
+static PartitionPruneState *
 ExecCreatePartitionPruneState(PlanState *planstate,
 							  PartitionPruneInfo *partitionpruneinfo)
 {
@@ -1536,6 +1600,7 @@ ExecCreatePartitionPruneState(PlanState *planstate,
 	int			n_part_hierarchies;
 	ListCell   *lc;
 	int			i;
+	ExprContext	*econtext = planstate->ps_ExprContext;
 
 	/* For data reading, executor always omits detached partitions */
 	if (estate->es_partition_directory == NULL)
@@ -1709,7 +1774,8 @@ ExecCreatePartitionPruneState(PlanState *planstate,
 			{
 				ExecInitPruningContext(&pprune->initial_context,
 									   pinfo->initial_pruning_steps,
-									   partdesc, partkey, planstate);
+									   partdesc, partkey, planstate,
+									   econtext);
 				/* Record whether initial pruning is needed at any level */
 				prunestate->do_initial_prune = true;
 			}
@@ -1718,7 +1784,8 @@ ExecCreatePartitionPruneState(PlanState *planstate,
 			{
 				ExecInitPruningContext(&pprune->exec_context,
 									   pinfo->exec_pruning_steps,
-									   partdesc, partkey, planstate);
+									   partdesc, partkey, planstate,
+									   econtext);
 				/* Record whether exec pruning is needed at any level */
 				prunestate->do_exec_prune = true;
 			}
@@ -1746,7 +1813,8 @@ ExecInitPruningContext(PartitionPruneContext *context,
 					   List *pruning_steps,
 					   PartitionDesc partdesc,
 					   PartitionKey partkey,
-					   PlanState *planstate)
+					   PlanState *planstate,
+					   ExprContext *econtext)
 {
 	int			n_steps;
 	int			partnatts;
@@ -1767,6 +1835,7 @@ ExecInitPruningContext(PartitionPruneContext *context,
 
 	context->ppccontext = CurrentMemoryContext;
 	context->planstate = planstate;
+	context->exprcontext = econtext;
 
 	/* Initialize expression state for each expression we need */
 	context->exprstates = (ExprState **)
@@ -1795,8 +1864,20 @@ ExecInitPruningContext(PartitionPruneContext *context,
 														step->step.step_id,
 														keyno);
 
-				context->exprstates[stateidx] =
-					ExecInitExpr(expr, context->planstate);
+				/*
+				 * When planstate is NULL, pruning_steps is known not to
+				 * contain any expressions that depend on the parent plan.
+				 * Information about any available EXTERN parameters must
+				 * be passed explicitly in that case, and the caller must
+				 * have made it available via econtext.
+				 */
+				if (planstate == NULL)
+					context->exprstates[stateidx] =
+						ExecInitExprWithParams(expr,
+											   econtext->ecxt_param_list_info);
+				else
+					context->exprstates[stateidx] =
+						ExecInitExpr(expr, context->planstate);
 			}
 			keyno++;
 		}
@@ -1816,11 +1897,9 @@ ExecInitPruningContext(PartitionPruneContext *context,
  *
  * Must only be called once per 'prunestate', and only if initial pruning
  * is required.
- *
- * 'nsubplans' must be passed as the total number of unpruned subplans.
  */
-Bitmapset *
-ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate, int nsubplans)
+static Bitmapset *
+ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate)
 {
 	Bitmapset  *result = NULL;
 	MemoryContext oldcontext;
@@ -1845,14 +1924,20 @@ ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate, int nsubplans)
 		PartitionedRelPruningData *pprune;
 
 		prunedata = prunestate->partprunedata[i];
+
+		/*
+		 * We pass the 1st item belonging to the root table of the hierarchy
+		 * and find_matching_subplans_recurse() takes care of recursing to
+		 * other (lower-level) parents as needed.
+		 */
 		pprune = &prunedata->partrelprunedata[0];
 
 		/* Perform pruning without using PARAM_EXEC Params */
 		find_matching_subplans_recurse(prunedata, pprune, true, &result);
 
-		/* Expression eval may have used space in node's ps_ExprContext too */
+		/* Expression eval may have used space in ExprContext too */
 		if (pprune->initial_pruning_steps)
-			ResetExprContext(pprune->initial_context.planstate->ps_ExprContext);
+			ResetExprContext(pprune->initial_context.exprcontext);
 	}
 
 	/* Add in any subplans that partition pruning didn't account for */
@@ -1865,118 +1950,120 @@ ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate, int nsubplans)
 
 	MemoryContextReset(prunestate->prune_context);
 
+	return result;
+}
+
+/*
+ * ExecPartitionPruneFixSubPlanIndexes
+ *		Fix mapping of partition indexes to subplan indexes contained in
+ *		prunestate by considering the new list of subplans that survived
+ *		initial pruning
+ *
+ * Subplans were previously indexed 0..(n_total_subplans - 1), but should now
+ * be indexed 0..(num(initially_valid_subplans) - 1).
+ */
+static void
+ExecPartitionPruneFixSubPlanIndexes(PartitionPruneState *prunestate,
+									Bitmapset *initially_valid_subplans,
+									int n_total_subplans)
+{
+	int		   *new_subplan_indexes;
+	Bitmapset  *new_other_subplans;
+	int			i;
+	int			newidx;
+
 	/*
-	 * If exec-time pruning is required and we pruned subplans above, then we
-	 * must re-sequence the subplan indexes so that ExecFindMatchingSubPlans
-	 * properly returns the indexes from the subplans which will remain after
-	 * execution of this function.
-	 *
-	 * We can safely skip this when !do_exec_prune, even though that leaves
-	 * invalid data in prunestate, because that data won't be consulted again
-	 * (cf initial Assert in ExecFindMatchingSubPlans).
+	 * First we must build a temporary array which maps old subplan
+	 * indexes to new ones.  For convenience of initialization, we use
+	 * 1-based indexes in this array and leave pruned items as 0.
 	 */
-	if (prunestate->do_exec_prune && bms_num_members(result) < nsubplans)
+	new_subplan_indexes = (int *) palloc0(sizeof(int) * n_total_subplans);
+	newidx = 1;
+	i = -1;
+	while ((i = bms_next_member(initially_valid_subplans, i)) >= 0)
 	{
-		int		   *new_subplan_indexes;
-		Bitmapset  *new_other_subplans;
-		int			i;
-		int			newidx;
+		Assert(i < n_total_subplans);
+		new_subplan_indexes[i] = newidx++;
+	}
 
-		/*
-		 * First we must build a temporary array which maps old subplan
-		 * indexes to new ones.  For convenience of initialization, we use
-		 * 1-based indexes in this array and leave pruned items as 0.
-		 */
-		new_subplan_indexes = (int *) palloc0(sizeof(int) * nsubplans);
-		newidx = 1;
-		i = -1;
-		while ((i = bms_next_member(result, i)) >= 0)
-		{
-			Assert(i < nsubplans);
-			new_subplan_indexes[i] = newidx++;
-		}
+	/*
+	 * Now we can update each PartitionedRelPruneInfo's subplan_map with
+	 * new subplan indexes.  We must also recompute its present_parts
+	 * bitmap.
+	 */
+	for (i = 0; i < prunestate->num_partprunedata; i++)
+	{
+		PartitionPruningData *prunedata = prunestate->partprunedata[i];
+		int			j;
 
 		/*
-		 * Now we can update each PartitionedRelPruneInfo's subplan_map with
-		 * new subplan indexes.  We must also recompute its present_parts
-		 * bitmap.
+		 * Within each hierarchy, we perform this loop in back-to-front
+		 * order so that we determine present_parts for the lowest-level
+		 * partitioned tables first.  This way we can tell whether a
+		 * sub-partitioned table's partitions were entirely pruned so we
+		 * can exclude it from the current level's present_parts.
 		 */
-		for (i = 0; i < prunestate->num_partprunedata; i++)
+		for (j = prunedata->num_partrelprunedata - 1; j >= 0; j--)
 		{
-			PartitionPruningData *prunedata = prunestate->partprunedata[i];
-			int			j;
+			PartitionedRelPruningData *pprune = &prunedata->partrelprunedata[j];
+			int			nparts = pprune->nparts;
+			int			k;
 
-			/*
-			 * Within each hierarchy, we perform this loop in back-to-front
-			 * order so that we determine present_parts for the lowest-level
-			 * partitioned tables first.  This way we can tell whether a
-			 * sub-partitioned table's partitions were entirely pruned so we
-			 * can exclude it from the current level's present_parts.
-			 */
-			for (j = prunedata->num_partrelprunedata - 1; j >= 0; j--)
-			{
-				PartitionedRelPruningData *pprune = &prunedata->partrelprunedata[j];
-				int			nparts = pprune->nparts;
-				int			k;
+			/* We just rebuild present_parts from scratch */
+			bms_free(pprune->present_parts);
+			pprune->present_parts = NULL;
 
-				/* We just rebuild present_parts from scratch */
-				bms_free(pprune->present_parts);
-				pprune->present_parts = NULL;
+			for (k = 0; k < nparts; k++)
+			{
+				int			oldidx = pprune->subplan_map[k];
+				int			subidx;
 
-				for (k = 0; k < nparts; k++)
+				/*
+				 * If this partition existed as a subplan then change the
+				 * old subplan index to the new subplan index.  The new
+				 * index may become -1 if the partition was pruned above,
+				 * or it may just come earlier in the subplan list due to
+				 * some subplans being removed earlier in the list.  If
+				 * it's a subpartition, add it to present_parts unless
+				 * it's entirely pruned.
+				 */
+				if (oldidx >= 0)
 				{
-					int			oldidx = pprune->subplan_map[k];
-					int			subidx;
-
-					/*
-					 * If this partition existed as a subplan then change the
-					 * old subplan index to the new subplan index.  The new
-					 * index may become -1 if the partition was pruned above,
-					 * or it may just come earlier in the subplan list due to
-					 * some subplans being removed earlier in the list.  If
-					 * it's a subpartition, add it to present_parts unless
-					 * it's entirely pruned.
-					 */
-					if (oldidx >= 0)
-					{
-						Assert(oldidx < nsubplans);
-						pprune->subplan_map[k] = new_subplan_indexes[oldidx] - 1;
+					Assert(oldidx < n_total_subplans);
+					pprune->subplan_map[k] = new_subplan_indexes[oldidx] - 1;
 
-						if (new_subplan_indexes[oldidx] > 0)
-							pprune->present_parts =
-								bms_add_member(pprune->present_parts, k);
-					}
-					else if ((subidx = pprune->subpart_map[k]) >= 0)
-					{
-						PartitionedRelPruningData *subprune;
+					if (new_subplan_indexes[oldidx] > 0)
+						pprune->present_parts =
+							bms_add_member(pprune->present_parts, k);
+				}
+				else if ((subidx = pprune->subpart_map[k]) >= 0)
+				{
+					PartitionedRelPruningData *subprune;
 
-						subprune = &prunedata->partrelprunedata[subidx];
+					subprune = &prunedata->partrelprunedata[subidx];
 
-						if (!bms_is_empty(subprune->present_parts))
-							pprune->present_parts =
-								bms_add_member(pprune->present_parts, k);
-					}
+					if (!bms_is_empty(subprune->present_parts))
+						pprune->present_parts =
+							bms_add_member(pprune->present_parts, k);
 				}
 			}
 		}
+	}
 
-		/*
-		 * We must also recompute the other_subplans set, since indexes in it
-		 * may change.
-		 */
-		new_other_subplans = NULL;
-		i = -1;
-		while ((i = bms_next_member(prunestate->other_subplans, i)) >= 0)
-			new_other_subplans = bms_add_member(new_other_subplans,
-												new_subplan_indexes[i] - 1);
-
-		bms_free(prunestate->other_subplans);
-		prunestate->other_subplans = new_other_subplans;
+	/*
+	 * We must also recompute the other_subplans set, since indexes in it
+	 * may change.
+	 */
+	new_other_subplans = NULL;
+	i = -1;
+	while ((i = bms_next_member(prunestate->other_subplans, i)) >= 0)
+		new_other_subplans = bms_add_member(new_other_subplans,
+											new_subplan_indexes[i] - 1);
 
-		pfree(new_subplan_indexes);
-	}
+	bms_free(prunestate->other_subplans);
+	prunestate->other_subplans = new_other_subplans;
 
-	return result;
+	pfree(new_subplan_indexes);
 }
 
 /*
@@ -2018,11 +2105,16 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate)
 		prunedata = prunestate->partprunedata[i];
 		pprune = &prunedata->partrelprunedata[0];
 
+		/*
+		 * We pass the 1st item belonging to the root table of the hierarchy
+		 * and find_matching_subplans_recurse() takes care of recursing to
+		 * other (lower-level) parents as needed.
+		 */
 		find_matching_subplans_recurse(prunedata, pprune, false, &result);
 
-		/* Expression eval may have used space in node's ps_ExprContext too */
+		/* Expression eval may have used space in ExprContext too */
 		if (pprune->exec_pruning_steps)
-			ResetExprContext(pprune->exec_context.planstate->ps_ExprContext);
+			ResetExprContext(pprune->exec_context.exprcontext);
 	}
 
 	/* Add in any subplans that partition pruning didn't account for */
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 7937f1c88f..5b6d3eb23b 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -138,30 +138,17 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 	{
 		PartitionPruneState *prunestate;
 
-		/* We may need an expression context to evaluate partition exprs */
-		ExecAssignExprContext(estate, &appendstate->ps);
-
-		/* Create the working data structure for pruning. */
-		prunestate = ExecCreatePartitionPruneState(&appendstate->ps,
-												   node->part_prune_info);
+		/*
+		 * Set up pruning data structure.  Initial pruning steps, if any, are
+		 * performed as part of the setup, adding the set of indexes of
+		 * surviving subplans to 'validsubplans'.
+		 */
+		prunestate = ExecInitPartitionPruning(&appendstate->ps,
+											  list_length(node->appendplans),
+											  node->part_prune_info,
+											  &validsubplans);
 		appendstate->as_prune_state = prunestate;
-
-		/* Perform an initial partition prune, if required. */
-		if (prunestate->do_initial_prune)
-		{
-			/* Determine which subplans survive initial pruning */
-			validsubplans = ExecFindInitialMatchingSubPlans(prunestate,
-															list_length(node->appendplans));
-
-			nplans = bms_num_members(validsubplans);
-		}
-		else
-		{
-			/* We'll need to initialize all subplans */
-			nplans = list_length(node->appendplans);
-			Assert(nplans > 0);
-			validsubplans = bms_add_range(NULL, 0, nplans - 1);
-		}
+		nplans = bms_num_members(validsubplans);
 
 		/*
 		 * When no run-time pruning is required and there's at least one
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index 418f89dea8..9a9f29e845 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -86,29 +86,17 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
 	{
 		PartitionPruneState *prunestate;
 
-		/* We may need an expression context to evaluate partition exprs */
-		ExecAssignExprContext(estate, &mergestate->ps);
-
-		prunestate = ExecCreatePartitionPruneState(&mergestate->ps,
-												   node->part_prune_info);
+		/*
+		 * Set up pruning data structure.  Initial pruning steps, if any, are
+		 * performed as part of the setup, adding the set of indexes of
+		 * surviving subplans to 'validsubplans'.
+		 */
+		prunestate = ExecInitPartitionPruning(&mergestate->ps,
+											  list_length(node->mergeplans),
+											  node->part_prune_info,
+											  &validsubplans);
 		mergestate->ms_prune_state = prunestate;
-
-		/* Perform an initial partition prune, if required. */
-		if (prunestate->do_initial_prune)
-		{
-			/* Determine which subplans survive initial pruning */
-			validsubplans = ExecFindInitialMatchingSubPlans(prunestate,
-															list_length(node->mergeplans));
-
-			nplans = bms_num_members(validsubplans);
-		}
-		else
-		{
-			/* We'll need to initialize all subplans */
-			nplans = list_length(node->mergeplans);
-			Assert(nplans > 0);
-			validsubplans = bms_add_range(NULL, 0, nplans - 1);
-		}
+		nplans = bms_num_members(validsubplans);
 
 		/*
 		 * When no run-time pruning is required and there's at least one
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 1bc00826c1..7080cb25d9 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -798,6 +798,7 @@ prune_append_rel_partitions(RelOptInfo *rel)
 
 	/* These are not valid when being called from the planner */
 	context.planstate = NULL;
+	context.exprcontext = NULL;
 	context.exprstates = NULL;
 
 	/* Actual pruning happens here. */
@@ -808,8 +809,8 @@ prune_append_rel_partitions(RelOptInfo *rel)
  * get_matching_partitions
  *		Determine partitions that survive partition pruning
  *
- * Note: context->planstate must be set to a valid PlanState when the
- * pruning_steps were generated with a target other than PARTTARGET_PLANNER.
+ * Note: context->exprcontext must be valid when the pruning_steps were
+ * generated with a target other than PARTTARGET_PLANNER.
  *
  * Returns a Bitmapset of the RelOptInfo->part_rels indexes of the surviving
  * partitions.
@@ -3654,7 +3655,7 @@ match_boolean_partition_clause(Oid partopfamily, Expr *clause, Expr *partkey,
  * exprstate array.
  *
  * Note that the evaluated result may be in the per-tuple memory context of
- * context->planstate->ps_ExprContext, and we may have leaked other memory
+ * context->exprcontext, and we may have leaked other memory
  * there too.  This memory must be recovered by resetting that ExprContext
  * after we're done with the pruning operation (see execPartition.c).
  */
@@ -3677,13 +3678,18 @@ partkey_datum_from_expr(PartitionPruneContext *context,
 		ExprContext *ectx;
 
 		/*
-		 * We should never see a non-Const in a step unless we're running in
-		 * the executor.
+		 * We should never see a non-Const in a step unless the caller has
+		 * passed a valid ExprContext.
+		 *
+		 * When context->planstate is valid, context->exprcontext is the
+		 * same as context->planstate->ps_ExprContext.
 		 */
-		Assert(context->planstate != NULL);
+		Assert(context->planstate != NULL || context->exprcontext != NULL);
+		Assert(context->planstate == NULL ||
+			   (context->exprcontext == context->planstate->ps_ExprContext));
 
 		exprstate = context->exprstates[stateidx];
-		ectx = context->planstate->ps_ExprContext;
+		ectx = context->exprcontext;
 		*value = ExecEvalExprSwitchContext(exprstate, ectx, isnull);
 	}
 }
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 603d8becc4..fd5735a946 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -119,10 +119,9 @@ extern ResultRelInfo *ExecFindPartition(ModifyTableState *mtstate,
 										EState *estate);
 extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
 									PartitionTupleRouting *proute);
-extern PartitionPruneState *ExecCreatePartitionPruneState(PlanState *planstate,
-														  PartitionPruneInfo *partitionpruneinfo);
+extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
+						 int n_total_subplans,
+						 PartitionPruneInfo *pruneinfo,
+						 Bitmapset **initially_valid_subplans);
 extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate);
-extern Bitmapset *ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate,
-												  int nsubplans);
-
 #endif							/* EXECPARTITION_H */
diff --git a/src/include/partitioning/partprune.h b/src/include/partitioning/partprune.h
index ee11b6feae..90684efa25 100644
--- a/src/include/partitioning/partprune.h
+++ b/src/include/partitioning/partprune.h
@@ -41,6 +41,7 @@ struct RelOptInfo;
  *					subsidiary data, such as the FmgrInfos.
  * planstate		Points to the parent plan node's PlanState when called
  *					during execution; NULL when called from the planner.
+ * exprcontext		ExprContext to use when evaluating pruning expressions
  * exprstates		Array of ExprStates, indexed as per PruneCxtStateIdx; one
  *					for each partition key in each pruning step.  Allocated if
  *					planstate is non-NULL, otherwise NULL.
@@ -56,6 +57,7 @@ typedef struct PartitionPruneContext
 	FmgrInfo   *stepcmpfuncs;
 	MemoryContext ppccontext;
 	PlanState  *planstate;
+	ExprContext *exprcontext;
 	ExprState **exprstates;
 } PartitionPruneContext;
 
-- 
2.24.1
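
The index re-sequencing that ExecPartitionPruneFixSubPlanIndexes() performs
in the patch above can be illustrated outside the executor.  The following is
only a minimal sketch, not code from the patch: it assumes a plain bool array
in place of the Bitmapset, a single flat subplan_map, and a hypothetical
function name (remap_subplan_indexes).  It keeps the same trick of using
1-based indexes in a zero-initialized temporary array so that pruned entries
come out as -1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for the re-sequencing step.  valid[i] is true if old
 * subplan i survived initial pruning.  Each entry of subplan_map holds an old
 * subplan index, or -1 for "no subplan"; on return it holds the new index, or
 * -1 if that subplan was pruned.  Returns the number of surviving subplans,
 * or -1 if memory could not be allocated.
 */
static int
remap_subplan_indexes(const bool *valid, int n_total_subplans,
                      int *subplan_map, int nparts)
{
    int *new_idx;
    int  next = 1;

    if (n_total_subplans <= 0)
        return 0;

    /* 1-based new indexes; entries left at zero mean "pruned" */
    new_idx = calloc((size_t) n_total_subplans, sizeof(int));
    if (new_idx == NULL)
        return -1;

    for (int i = 0; i < n_total_subplans; i++)
    {
        if (valid[i])
            new_idx[i] = next++;
    }

    for (int k = 0; k < nparts; k++)
    {
        int oldidx = subplan_map[k];

        if (oldidx >= 0)
        {
            assert(oldidx < n_total_subplans);
            /* a pruned subplan has new_idx 0, which becomes -1 here */
            subplan_map[k] = new_idx[oldidx] - 1;
        }
    }

    free(new_idx);
    return next - 1;
}
```

With five subplans of which only the 2nd, 4th and 5th survive, the old
indexes 1, 3 and 4 become 0, 1 and 2, and the pruned ones become -1.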



^ permalink  raw  reply  [nested|flat] 108+ messages in thread

* Re: generic plans and "initial" pruning
@ 2022-03-11 15:06  Amit Langote <[email protected]>
  parent: Amit Langote <[email protected]>
  1 sibling, 0 replies; 108+ messages in thread

From: Amit Langote @ 2022-03-11 15:06 UTC (permalink / raw)
  To: Robert Haas <[email protected]>; +Cc: pgsql-hackers; David Rowley *EXTERN* <[email protected]>

On Fri, Mar 11, 2022 at 11:35 PM Amit Langote <[email protected]> wrote:
> Attached is v5, now broken into 3 patches:
>
> 0001: Some refactoring of runtime pruning code
> 0002: Add a plan_tree_walker
> 0003: Teach AcquireExecutorLocks to skip locking pruned relations

Repeated the performance tests described in the 1st email of this thread:

HEAD: (copied from the 1st email)

32      tps = 20561.776403 (without initial connection time)
64      tps = 12553.131423 (without initial connection time)
128     tps = 13330.365696 (without initial connection time)
256     tps = 8605.723120 (without initial connection time)
512     tps = 4435.951139 (without initial connection time)
1024    tps = 2346.902973 (without initial connection time)
2048    tps = 1334.680971 (without initial connection time)

Patched v1: (copied from the 1st email)

32      tps = 27554.156077 (without initial connection time)
64      tps = 27531.161310 (without initial connection time)
128     tps = 27138.305677 (without initial connection time)
256     tps = 25825.467724 (without initial connection time)
512     tps = 19864.386305 (without initial connection time)
1024    tps = 18742.668944 (without initial connection time)
2048    tps = 16312.412704 (without initial connection time)

Patched v5:

32      tps = 28204.197738 (without initial connection time)
64      tps = 26795.385318 (without initial connection time)
128     tps = 26387.920550 (without initial connection time)
256     tps = 25601.141556 (without initial connection time)
512     tps = 19911.947502 (without initial connection time)
1024    tps = 20158.692952 (without initial connection time)
2048    tps = 16180.195463 (without initial connection time)

Good to see that these rewrites haven't really hurt the numbers much,
which makes sense because the rewrites have really been about putting
the code in the right place.

BTW, these are the numbers for the same benchmark repeated with
plan_cache_mode = auto, which causes a custom plan to be chosen for
every execution and so is unaffected by this patch.

32      tps = 13359.225082 (without initial connection time)
64      tps = 15760.533280 (without initial connection time)
128     tps = 15825.734482 (without initial connection time)
256     tps = 15017.693905 (without initial connection time)
512     tps = 13479.973395 (without initial connection time)
1024    tps = 13200.444397 (without initial connection time)
2048    tps = 12884.645475 (without initial connection time)

Comparing them to the numbers obtained with force_generic_plan shows
that making the generic plans faster is indeed worthwhile.

-- 
Amit Langote
EDB: http://www.enterprisedb.com






^ permalink  raw  reply  [nested|flat] 108+ messages in thread

* Re: generic plans and "initial" pruning
@ 2022-03-14 18:42  Robert Haas <[email protected]>
  parent: Amit Langote <[email protected]>
  1 sibling, 1 reply; 108+ messages in thread

From: Robert Haas @ 2022-03-14 18:42 UTC (permalink / raw)
  To: Amit Langote <[email protected]>; +Cc: pgsql-hackers; David Rowley *EXTERN* <[email protected]>; Tom Lane <[email protected]>

On Fri, Mar 11, 2022 at 9:35 AM Amit Langote <[email protected]> wrote:
> Attached is v5, now broken into 3 patches:
>
> 0001: Some refactoring of runtime pruning code
> 0002: Add a plan_tree_walker
> 0003: Teach AcquireExecutorLocks to skip locking pruned relations

So is any other committer planning to look at this? Tom, perhaps?
David? This strikes me as important work, and I don't mind going
through and trying to do some detailed review, but (A) I am not the
person most familiar with the code being modified here and (B) there
are some important theoretical questions about the approach that we
might want to try to cover before we get down into the details.

In my opinion, the most important theoretical issue here is around
reuse of plans that are no longer entirely valid, but the parts that
are no longer valid are certain to be pruned. If, because we know that
some parameter has some particular value, we skip locking a bunch of
partitions, then when we're executing the plan, those partitions need
not exist any more -- or they could have different indexes, be
detached from the partitioning hierarchy and subsequently altered,
whatever. That seems fine to me provided that all of our code (and any
third-party code) is careful not to rely on the portion of the plan
that we've pruned away, and doesn't assume that (for example) we can
still fetch the name of an index whose OID appears in there someplace.
I cannot think of a hazard where the fact that the part of a plan is
no longer valid because some DDL has been executed "infects" the
remainder of the plan. As long as we lock the partitioned tables named
in the plan and their descendants down to the level just above the one
at which something is pruned, and are careful, I think we should be
OK. It would be nice to know if someone has a fundamentally different
view of the hazards here, though.

Just to state my position here clearly, I would be more than happy if
somebody else plans to pick this up and try to get some or all of it
committed, and will cheerfully defer to such person in the event that
they have that plan. If, however, no such person exists, I may try my
hand at that myself.

Thanks,

-- 
Robert Haas
EDB: http://www.enterprisedb.com






^ permalink  raw  reply  [nested|flat] 108+ messages in thread

* Re: generic plans and "initial" pruning
@ 2022-03-14 19:38  Tom Lane <[email protected]>
  parent: Robert Haas <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: Tom Lane @ 2022-03-14 19:38 UTC (permalink / raw)
  To: Robert Haas <[email protected]>; +Cc: Amit Langote <[email protected]>; pgsql-hackers; David Rowley *EXTERN* <[email protected]>

Robert Haas <[email protected]> writes:
> In my opinion, the most important theoretical issue here is around
> reuse of plans that are no longer entirely valid, but the parts that
> are no longer valid are certain to be pruned. If, because we know that
> some parameter has some particular value, we skip locking a bunch of
> partitions, then when we're executing the plan, those partitions need
> not exist any more -- or they could have different indexes, be
> detached from the partitioning hierarchy and subsequently altered,
> whatever.

Check.

> That seems fine to me provided that all of our code (and any
> third-party code) is careful not to rely on the portion of the plan
> that we've pruned away, and doesn't assume that (for example) we can
> still fetch the name of an index whose OID appears in there someplace.

... like EXPLAIN, for example?

If "pruning" means physical removal from the plan tree, then it's
probably all right.  However, it looks to me like that doesn't
actually happen, or at least doesn't happen till much later, so
there's room for worry about a disconnect between what plancache.c
has verified and what executor startup will try to touch.  As you
say, in the absence of any bugs, that's not a problem ... but if
there are such bugs, tracking them down would be really hard.

What I am skeptical about is that this work actually accomplishes
anything under real-world conditions.  That's because if pruning would
save enough to make skipping the lock-acquisition phase worth the
trouble, the plan cache is almost certainly going to decide it should
be using a custom plan not a generic plan.  Now if we had a better
cost model (or, indeed, any model at all) for run-time pruning effects
then maybe that situation could be improved.  I think we'd be better
served to worry about that end of it before we spend more time making
the executor even less predictable.

Also, while I've not spent much time at all reading this patch,
it seems rather desperately undercommented, and a lot of the
new names are unintelligible.  In particular, I suspect that the
patch is significantly redesigning when/where run-time pruning
happens (unless it's just letting that be run twice); but I don't
see any documentation or name changes suggesting where that
responsibility is now.

			regards, tom lane






^ permalink  raw  reply  [nested|flat] 108+ messages in thread

* Re: generic plans and "initial" pruning
@ 2022-03-14 20:06  Robert Haas <[email protected]>
  parent: Tom Lane <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: Robert Haas @ 2022-03-14 20:06 UTC (permalink / raw)
  To: Tom Lane <[email protected]>; +Cc: Amit Langote <[email protected]>; pgsql-hackers; David Rowley *EXTERN* <[email protected]>

On Mon, Mar 14, 2022 at 3:38 PM Tom Lane <[email protected]> wrote:
> ... like EXPLAIN, for example?

Exactly! I think that's the foremost example, but extension modules
like auto_explain or even third-party extensions are also a risk. I
think there was some discussion of this previously.

> If "pruning" means physical removal from the plan tree, then it's
> probably all right.  However, it looks to me like that doesn't
> actually happen, or at least doesn't happen till much later, so
> there's room for worry about a disconnect between what plancache.c
> has verified and what executor startup will try to touch.  As you
> say, in the absence of any bugs, that's not a problem ... but if
> there are such bugs, tracking them down would be really hard.

Surgery on the plan would violate the general principle that plans are
read only once constructed. I think the idea ought to be to pass a
secondary data structure around with the plan that defines which parts
you must ignore. Any code that fails to use that other data structure
in the appropriate manner gets defined to be buggy and has to be fixed
by making it follow the new rules.
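
As a rough illustration of that idea (not taken from the patch), the plan
can stay read-only while a separate structure records which subplans were
pruned, and every consumer checks that structure before touching a subplan.
All names below (SketchPlanNode, SketchPrunedSet, collect_lockable_relids)
are hypothetical and the plan is flattened to a simple array for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified plan node; never modified after planning. */
typedef struct SketchPlanNode
{
    unsigned int relid;         /* relation scanned by this subplan */
} SketchPlanNode;

/* Hypothetical side structure passed around alongside the read-only plan. */
typedef struct SketchPrunedSet
{
    const bool *pruned;         /* pruned[i] is true if subplan i must be ignored */
    size_t      nsubplans;
} SketchPrunedSet;

/*
 * Collect the relids of subplans that may still be touched.  A NULL
 * 'ps' means no pruning information is available, so every subplan counts.
 * 'out' must have room for nsubplans entries.  Returns the number written.
 */
static size_t
collect_lockable_relids(const SketchPlanNode *subplans, size_t nsubplans,
                        const SketchPrunedSet *ps, unsigned int *out)
{
    size_t n = 0;

    assert(out != NULL);
    for (size_t i = 0; i < nsubplans; i++)
    {
        /* consult the side structure instead of editing the plan */
        if (ps != NULL && i < ps->nsubplans && ps->pruned[i])
            continue;
        out[n++] = subplans[i].relid;
    }
    return n;
}
```

The point of the design is that the plan array is only ever read; code that
walks it without consulting the side structure is what would count as buggy
under the rule described above.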

> What I am skeptical about is that this work actually accomplishes
> anything under real-world conditions.  That's because if pruning would
> save enough to make skipping the lock-acquisition phase worth the
> trouble, the plan cache is almost certainly going to decide it should
> be using a custom plan not a generic plan.  Now if we had a better
> cost model (or, indeed, any model at all) for run-time pruning effects
> then maybe that situation could be improved.  I think we'd be better
> served to worry about that end of it before we spend more time making
> the executor even less predictable.

I don't agree with that analysis, because setting plan_cache_mode is
not uncommon. Even if that GUC didn't exist, I'm pretty sure there are
cases where the planner naturally falls into a generic plan anyway,
even though pruning is happening. But as it is, the GUC does exist,
and people use it. Consequently, while I'd love to see something done
about the costing side of things, I do not accept that all other
improvements should wait for that to happen.

> Also, while I've not spent much time at all reading this patch,
> it seems rather desperately undercommented, and a lot of the
> new names are unintelligible.  In particular, I suspect that the
> patch is significantly redesigning when/where run-time pruning
> happens (unless it's just letting that be run twice); but I don't
> see any documentation or name changes suggesting where that
> responsibility is now.

I am sympathetic to that concern. I spent a while staring at a
baffling comment in 0001 only to discover it had just been moved from
elsewhere. I really don't feel that things in this are as clear as
they could be -- although I hasten to add that I respect the people
who have done work in this area previously and am grateful for what
they did. It's been a huge benefit to the project in spite of the
bumps in the road. Moreover, this isn't the only code in PostgreSQL
that needs improvement, or the worst. That said, I do think there are
problems. I don't yet have a position on whether this patch is making
that better or worse.

That said, I believe that the core idea of the patch is to optionally
perform pruning before we acquire locks or spin up the main executor
and then remember the decisions we made. If once the main executor is
spun up we already made those decisions, then we must stick with what
we decided. If not, we make those pruning decisions at the same point
we do currently - more or less on demand, at the point when we'd need
to know whether to descend that branch of the plan tree or not. I
think this scheme comes about because there are a couple of different
interfaces to the parameterized query stuff, and in some code paths we
have the values early enough to use them for pre-pruning, and in
others we don't.

-- 
Robert Haas
EDB: http://www.enterprisedb.com






^ permalink  raw  reply  [nested|flat] 108+ messages in thread

* Re: generic plans and "initial" pruning
@ 2022-03-15 06:19  Amit Langote <[email protected]>
  parent: Robert Haas <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: Amit Langote @ 2022-03-15 06:19 UTC (permalink / raw)
  To: Robert Haas <[email protected]>; +Cc: Tom Lane <[email protected]>; pgsql-hackers; David Rowley *EXTERN* <[email protected]>

On Tue, Mar 15, 2022 at 5:06 AM Robert Haas <[email protected]> wrote:
> On Mon, Mar 14, 2022 at 3:38 PM Tom Lane <[email protected]> wrote:
> > What I am skeptical about is that this work actually accomplishes
> > anything under real-world conditions.  That's because if pruning would
> > save enough to make skipping the lock-acquisition phase worth the
> > trouble, the plan cache is almost certainly going to decide it should
> > be using a custom plan not a generic plan.  Now if we had a better
> > cost model (or, indeed, any model at all) for run-time pruning effects
> > then maybe that situation could be improved.  I think we'd be better
> > served to worry about that end of it before we spend more time making
> > the executor even less predictable.
>
> I don't agree with that analysis, because setting plan_cache_mode is
> not uncommon. Even if that GUC didn't exist, I'm pretty sure there are
> cases where the planner naturally falls into a generic plan anyway,
> even though pruning is happening. But as it is, the GUC does exist,
> and people use it. Consequently, while I'd love to see something done
> about the costing side of things, I do not accept that all other
> improvements should wait for that to happen.

I agree that making generic plans execute faster has merit even before
we make the costing changes to allow plancache.c to prefer generic plans
over custom ones in these cases.  As the numbers in my previous email
show, simply executing a generic plan with the proposed improvements
applied is significantly cheaper than having the planner do the
pruning on every execution:

nparts      auto/custom     generic
======      ==========      ======
32          13359           28204
64          15760           26795
128         15825           26387
256         15017           25601
512         13479           19911
1024        13200           20158
2048        12884           16180
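
To put the table in relative terms, the ratio of generic to custom
throughput can be computed directly from the figures above.  This is just a
small arithmetic sketch; the array and function names are made up for the
illustration, and the tps values are the truncated ones from the table.  The
ratio works out to roughly 2.1x at 32 partitions and about 1.26x at 2048.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define NROWS 7

/* Figures copied from the table above (tps, truncated) */
static const int    nparts[NROWS]      = {32, 64, 128, 256, 512, 1024, 2048};
static const double custom_tps[NROWS]  = {13359, 15760, 15825, 15017,
                                          13479, 13200, 12884};
static const double generic_tps[NROWS] = {28204, 26795, 26387, 25601,
                                          19911, 20158, 16180};

/* Ratio of generic-plan throughput to custom-plan throughput for row i */
static double
generic_speedup(size_t i)
{
    assert(i < NROWS);
    return generic_tps[i] / custom_tps[i];
}

/* Print one line per partition count */
static void
print_speedups(void)
{
    for (size_t i = 0; i < NROWS; i++)
        printf("%4d partitions: generic/custom = %.2f\n",
               nparts[i], generic_speedup(i));
}
```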

> > Also, while I've not spent much time at all reading this patch,
> > it seems rather desperately undercommented, and a lot of the
> > new names are unintelligible.  In particular, I suspect that the
> > patch is significantly redesigning when/where run-time pruning
> > happens (unless it's just letting that be run twice); but I don't
> > see any documentation or name changes suggesting where that
> > responsibility is now.
>
> I am sympathetic to that concern. I spent a while staring at a
> baffling comment in 0001 only to discover it had just been moved from
> elsewhere. I really don't feel that things in this are as clear as
> they could be -- although I hasten to add that I respect the people
> who have done work in this area previously and am grateful for what
> they did. It's been a huge benefit to the project in spite of the
> bumps in the road. Moreover, this isn't the only code in PostgreSQL
> that needs improvement, or the worst. That said, I do think there are
> problems. I don't yet have a position on whether this patch is making
> that better or worse.

Okay, I'd like to post a new version with the comments edited to make
them a bit more intelligible.  I understand that the comments around
the new invocation mode(s) of runtime pruning are not as clear as they
should be, especially as the changes that this patch wants to make to
how things work are not very localized.

> That said, I believe that the core idea of the patch is to optionally
> perform pruning before we acquire locks or spin up the main executor
> and then remember the decisions we made. If once the main executor is
> spun up we already made those decisions, then we must stick with what
> we decided. If not, we make those pruning decisions at the same point
> we do currently

Right.  The "initial" pruning, which this patch wants to perform at an
earlier point (in plancache.c), is currently performed in
ExecInit[Merge]Append().

If it does occur early because the plan is a cached one,
ExecInit[Merge]Append() simply uses the result, which is made
available via a new data structure that plancache.c passes down to the
executor alongside the plan tree.

If it does not, ExecInit[Merge]Append() does the pruning in the same
way it does now.  Such cases include initial pruning that uses only
STABLE expressions, which the planner does not evaluate itself because
the resulting plan might be cached, and no EXTERN parameters.
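
To make the control flow described above concrete, here is a minimal
standalone sketch.  It is not code from the patch: PruneResult,
do_initial_pruning(), prune_before_locking() and init_append() are
hypothetical names, and a 64-bit mask stands in for a Bitmapset of
surviving subplans.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the data structure passed down with the plan:
 * bit i of 'surviving' is set if subplan i survived initial pruning. */
typedef struct PruneResult
{
	bool		valid;
	uint64_t	surviving;
} PruneResult;

/* Hypothetical pruning step: only the subplan whose index equals 'key'
 * survives; nothing survives if 'key' is out of range. */
static uint64_t
do_initial_pruning(int nsubplans, int key)
{
	assert(nsubplans >= 0 && nsubplans <= 64);
	if (key < 0 || key >= nsubplans)
		return 0;
	return UINT64_C(1) << key;
}

/* Early path: prune before locking and remember the decision. */
static PruneResult
prune_before_locking(int nsubplans, int key)
{
	PruneResult r = {true, do_initial_pruning(nsubplans, key)};

	return r;
}

/* Executor-init path: reuse an early result if there is one, otherwise
 * prune now.  '*npruned_here' counts how often pruning ran at init. */
static uint64_t
init_append(int nsubplans, int key, const PruneResult *early,
			int *npruned_here)
{
	if (early != NULL && early->valid)
		return early->surviving;
	(*npruned_here)++;
	return do_initial_pruning(nsubplans, key);
}
```

The point of the sketch is only that the init-time code must stick
with an early decision when one exists, and otherwise behaves as it
does today.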

--
Amit Langote
EDB: http://www.enterprisedb.com

* Re: generic plans and "initial" pruning
@ 2022-03-22 12:44  Amit Langote <[email protected]>
  parent: Amit Langote <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: Amit Langote @ 2022-03-22 12:44 UTC (permalink / raw)
  To: Robert Haas <[email protected]>; +Cc: Tom Lane <[email protected]>; pgsql-hackers; David Rowley *EXTERN* <[email protected]>

On Tue, Mar 15, 2022 at 3:19 PM Amit Langote <[email protected]> wrote:
> Okay, I'd like to post a new version with the comments edited to make
> them a bit more intelligible.  I understand that the comments around
> the new invocation mode(s) of runtime pruning are not as clear as they
> should be, especially as the changes that this patch wants to make to
> how things work are not very localized.

Actually, another area where the comments may not be as clear as they
should have been is the patch's changes to the AcquireExecutorLocks()
logic that decides which relations are locked to safeguard the plan
tree for execution.  Those relations are the ones given by
RTE_RELATION entries in the range table.

Without the patch, they are found by actually scanning the range table.

With the patch, it's the same set of RTEs if the plan doesn't contain
any pruning nodes, though instead of the range table, what is scanned
is a bitmapset of their RT indexes that is made available by the
planner in the form of PlannedStmt.lockrels.  When the plan does
contain a pruning node (PlannedStmt.containsInitialPruning), the
bitmapset is constructed by calling ExecutorGetLockRels() on the plan
tree, which walks it to add the RT indexes of relations mentioned in
the Scan nodes, while skipping any nodes that are pruned by the
initial pruning steps that may be present in their containing parent
node's PartitionPruneInfo.  The RT indexes of partitioned tables that
are present in the PartitionPruneInfo itself are also added to the
set.

While expanding comments added by the patch to make this clear, I
realized that there are two problems, one of them quite glaring:

* The planner's constructing this bitmapset, and its being copied along
with the PlannedStmt, is pure overhead in the cases that this patch
has nothing to do with, which is the kind of thing that Andres
cautioned against upthread.

* Not all partitioned tables that would have been locked without the
patch to come up with an Append/MergeAppend plan may be returned by
ExecutorGetLockRels().  For example, if none of the query's
runtime-prunable quals were found to match the partition key of an
intermediate partitioned table, that partitioned table is not
included in the PartitionPruneInfo.  Or, if an Append/MergeAppend
covering a partition tree doesn't contain any PartitionPruneInfo to
begin with, only the leaf partitions and none of the partitioned
parents would be accounted for by the ExecutorGetLockRels() logic.

The first one seems easy to fix by not inventing PlannedStmt.lockrels
and just doing what's being done now: scan the range table if
(!PlannedStmt.containsInitialPruning).
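
As a rough illustration of that choice between the two ways of
finding the relations to lock, here is a standalone sketch.  It is
not code from the patch: the PlanNode and Stmt structs,
walk_plan() and get_lock_rels() are hypothetical, a 64-bit mask
stands in for a Bitmapset of RT indexes, and the 'pruned' flag
stands in for actually running the initial pruning steps.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { KIND_RELATION, KIND_OTHER } RteKind;	/* hypothetical */

typedef struct PlanNode
{
	int			scanrelid;		/* RT index scanned by a leaf, else 0 */
	bool		pruned;			/* assumed outcome of initial pruning */
	uint64_t	parted_rels;	/* partitioned tables known to this node */
	const struct PlanNode *children;
	int			nchildren;
} PlanNode;

typedef struct Stmt
{
	const RteKind *rtable;		/* entry i describes RT index i + 1 */
	int			nrtable;
	const PlanNode *plan;
	bool		containsInitialPruning;
} Stmt;

static uint64_t
rel_bit(int rtindex)
{
	assert(rtindex >= 1 && rtindex < 64);
	return UINT64_C(1) << rtindex;
}

/* Collect RT indexes from the plan tree, skipping pruned subtrees. */
static uint64_t
walk_plan(const PlanNode *node)
{
	uint64_t	rels;

	if (node == NULL || node->pruned)
		return 0;
	rels = node->parted_rels;
	if (node->scanrelid > 0)
		rels |= rel_bit(node->scanrelid);
	for (int i = 0; i < node->nchildren; i++)
		rels |= walk_plan(&node->children[i]);
	return rels;
}

/* No initial pruning: scan the range table, as is done today.
 * Otherwise: derive the set from the plan tree. */
static uint64_t
get_lock_rels(const Stmt *stmt)
{
	uint64_t	rels = 0;

	if (stmt->containsInitialPruning)
		return walk_plan(stmt->plan);
	for (int i = 0; i < stmt->nrtable; i++)
		if (stmt->rtable[i] == KIND_RELATION)
			rels |= rel_bit(i + 1);
	return rels;
}
```

Note that the tree walk only finds a partitioned parent if some node
carries its RT index (parted_rels here), which is the gap that the
second problem above is about.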

Perhaps the only way to fix the second one is to reconsider the
decision we made in the following commit:

    commit 52ed730d511b7b1147f2851a7295ef1fb5273776
    Author: Tom Lane <[email protected]>
    Date:   Sun Oct 7 14:33:17 2018 -0400

    Remove some unnecessary fields from Plan trees.

    In the wake of commit f2343653f, we no longer need some fields that
    were used before to control executor lock acquisitions:

    * PlannedStmt.nonleafResultRelations can go away entirely.

    * partitioned_rels can go away from Append, MergeAppend, and ModifyTable.
    However, ModifyTable still needs to know the RT index of the partition
    root table if any, which was formerly kept in the first entry of that
    list.  Add a new field "rootRelation" to remember that.  rootRelation is
    partly redundant with nominalRelation, in that if it's set it will have
    the same value as nominalRelation.  However, the latter field has a
    different purpose so it seems best to keep them distinct.

That is, add back the partitioned_rels field, at least to Append and
MergeAppend, to store the RT indexes of partitioned tables whose
children's paths are present in Append/MergeAppend.subpaths.

Thoughts?


--
Amit Langote
EDB: http://www.enterprisedb.com

* Re: generic plans and "initial" pruning
@ 2022-03-28 07:17  Amit Langote <[email protected]>
  parent: Amit Langote <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: Amit Langote @ 2022-03-28 07:17 UTC (permalink / raw)
  To: Robert Haas <[email protected]>; +Cc: Tom Lane <[email protected]>; pgsql-hackers; David Rowley *EXTERN* <[email protected]>

On Tue, Mar 22, 2022 at 9:44 PM Amit Langote <[email protected]> wrote:
> Actually, another area where the comments may not be as clear as they
> should have been is the changes that the patch makes to the
> AcquireExecutorLocks() logic that decides which relations are locked
> to safeguard the plan tree for execution, which are those given by
> RTE_RELATION entries in the range table.
>
> Without the patch, they are found by actually scanning the range table.
>
> With the patch, it's the same set of RTEs if the plan doesn't contain
> any pruning nodes, though instead of the range table, what is scanned
> is a bitmapset of their RT indexes that is made available by the
> planner in the form of PlannedStmt.lockrels.  When the plan does
> contain a pruning node (PlannedStmt.containsInitialPruning), the
> bitmapset is constructed by calling ExecutorGetLockRels() on the plan
> tree, which walks it to add RT indexes of relations mentioned in the
> Scan nodes, while skipping any nodes that are pruned after performing
> initial pruning steps that may be present in their containing parent
> node's PartitionPruneInfo.  Also, the RT indexes of partitioned tables
> that are present in the PartitionPruneInfo itself are also added to
> the set.
>
> While expanding comments added by the patch to make this clear, I
> realized that there are two problems, one of them quite glaring:
>
> * Planner's constructing this bitmapset and its copying along with the
> PlannedStmt is pure overhead in the cases that this patch has nothing
> to do with, which is the kind of thing that Andres cautioned against
> upthread.
>
> * Not all partitioned tables that would have been locked without the
> patch to come up with a Append/MergeAppend plan may be returned by
> ExecutorGetLockRels().  For example, if none of the query's
> runtime-prunable quals were found to match the partition key of an
> intermediate partitioned table and thus that partitioned table not
> included in the PartitionPruneInfo.  Or if an Append/MergeAppend
> covering a partition tree doesn't contain any PartitionPruneInfo to
> begin with, in which case, only the leaf partitions and none of
> partitioned parents would be accounted for by the
> ExecutorGetLockRels() logic.
>
> The 1st one seems easy to fix by not inventing PlannedStmt.lockrels
> and just doing what's being done now: scan the range table if
> (!PlannedStmt.containsInitialPruning).

The attached updated patch does it like this.

> The only way perhaps to fix the second one is to reconsider the
> decision we made in the following commit:
>
>     commit 52ed730d511b7b1147f2851a7295ef1fb5273776
>     Author: Tom Lane <[email protected]>
>     Date:   Sun Oct 7 14:33:17 2018 -0400
>
>     Remove some unnecessary fields from Plan trees.
>
>     In the wake of commit f2343653f, we no longer need some fields that
>     were used before to control executor lock acquisitions:
>
>     * PlannedStmt.nonleafResultRelations can go away entirely.
>
>     * partitioned_rels can go away from Append, MergeAppend, and ModifyTable.
>     However, ModifyTable still needs to know the RT index of the partition
>     root table if any, which was formerly kept in the first entry of that
>     list.  Add a new field "rootRelation" to remember that.  rootRelation is
>     partly redundant with nominalRelation, in that if it's set it will have
>     the same value as nominalRelation.  However, the latter field has a
>     different purpose so it seems best to keep them distinct.
>
> That is, add back the partitioned_rels field, at least to Append and
> MergeAppend, to store the RT indexes of partitioned tables whose
> children's paths are present in Append/MergeAppend.subpaths.

I have implemented this in the attached 0002, which reintroduces
partitioned_rels in Append/MergeAppend nodes as a bitmapset of RT
indexes.  The set contains the RT indexes of the partitioned ancestors
whose expansion produced the leaf partitions that a given
Append/MergeAppend node scans.  This project needs such a way of
knowing the partitioned tables involved in producing an
Append/MergeAppend node, because we'd like to give plancache.c the
ability to glean the set of relations to be locked to make the plan
tree ready for execution by scanning the tree itself rather than the
range table, and the only relations missing from the tree right now
are partitioned tables.
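
The way such a set can be accumulated from the bottom of a partition
tree upward might be sketched as follows.  This is a standalone
illustration, not code from 0002: the Rel struct and expand_rel()
are hypothetical, and a 64-bit mask stands in for a Bitmapset of RT
indexes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_CHILDREN 4

/* Hypothetical, much simplified stand-in for a planner relation. */
typedef struct Rel
{
	int			rtindex;
	bool		is_partitioned;
	struct Rel *children[MAX_CHILDREN];
	int			nchildren;
	uint64_t	partitioned_rels;
} Rel;

/*
 * A partitioned rel starts with its own RT index in the set; each
 * child's set is then added to the parent's, so that a parent's set is
 * always a superset of the sets of all its children, direct or indirect.
 */
static uint64_t
expand_rel(Rel *rel)
{
	assert(rel->rtindex >= 1 && rel->rtindex < 64);
	assert(rel->nchildren >= 0 && rel->nchildren <= MAX_CHILDREN);

	rel->partitioned_rels =
		rel->is_partitioned ? UINT64_C(1) << rel->rtindex : 0;
	for (int i = 0; i < rel->nchildren; i++)
		rel->partitioned_rels |= expand_rel(rel->children[i]);
	return rel->partitioned_rels;
}
```

With a two-level tree (root 1, sub-partitioned child 2, leaves 3, 4
and 5), the root ends up with {1, 2}, which is the set that a node
appending those leaves would need to carry.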

One fly in the ointment that I faced when doing that is the fact that
setrefs.c in most situations removes the Append/MergeAppend from the
final plan if it contains only one child subplan.  I got around it by
inventing a PlannerGlobal/PlannedStmt.elidedAppendPartedRels set,
which is the union of the partitioned_rels of all the
Append/MergeAppend nodes in the plan tree that were removed as
described.
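
A standalone sketch of that workaround, under simplifying
assumptions: the Plan and Glob structs and finish_append() are
hypothetical, a 64-bit mask stands in for a Bitmapset, and the
parallel_aware check that setrefs.c also makes before removing the
node is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, much simplified plan node. */
typedef struct Plan
{
	struct Plan *children[4];
	int			nchildren;
	uint64_t	partitioned_rels;
} Plan;

/* Hypothetical stand-in for the planner-wide state. */
typedef struct Glob
{
	uint64_t	elidedAppendPartedRels;
} Glob;

/*
 * If the Append has a single child, save its partitioned_rels in the
 * global set before dropping the node and returning the child in its
 * place; otherwise keep the Append as is.
 */
static Plan *
finish_append(Glob *glob, Plan *append)
{
	assert(append->nchildren >= 1 && append->nchildren <= 4);
	if (append->nchildren == 1)
	{
		glob->elidedAppendPartedRels |= append->partitioned_rels;
		return append->children[0];
	}
	return append;
}
```

The global set only grows when a node is actually removed, so plans
that keep all their Append/MergeAppend nodes leave it empty.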

Other than the changes mentioned above, the updated patch now contains
a bit more commentary than earlier versions, mostly around
AcquireExecutorLocks()'s new way of determining the set of relations
to lock and the significantly redesigned working of the "initial"
execution pruning.

-- 
Amit Langote
EDB: http://www.enterprisedb.com


Attachments:

  [application/x-patch] v6-0003-Add-a-plan_tree_walker.patch (3.9K, 2-v6-0003-Add-a-plan_tree_walker.patch)
From 47a00a6b8cf695e5890fc6555e2df2980eb2115b Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Thu, 3 Mar 2022 16:04:13 +0900
Subject: [PATCH v6 3/4] Add a plan_tree_walker()

Like planstate_tree_walker() but for uninitialized plan trees.
---
 src/backend/nodes/nodeFuncs.c | 116 ++++++++++++++++++++++++++++++++++
 src/include/nodes/nodeFuncs.h |   3 +
 2 files changed, 119 insertions(+)

diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c
index ec25aae6e3..c16f9c6b40 100644
--- a/src/backend/nodes/nodeFuncs.c
+++ b/src/backend/nodes/nodeFuncs.c
@@ -31,6 +31,10 @@ static bool planstate_walk_subplans(List *plans, bool (*walker) (),
 									void *context);
 static bool planstate_walk_members(PlanState **planstates, int nplans,
 								   bool (*walker) (), void *context);
+static bool plan_walk_subplans(List *plans,
+				   bool (*walker) (),
+				   void *context);
+static bool plan_walk_members(List *plans, bool (*walker) (), void *context);
 
 
 /*
@@ -4150,3 +4154,115 @@ planstate_walk_members(PlanState **planstates, int nplans,
 
 	return false;
 }
+
+/*
+ * plan_tree_walker --- walk plantrees
+ *
+ * The walker has already visited the current node, and so we need only
+ * recurse into any sub-nodes it has.
+ */
+bool
+plan_tree_walker(Plan *plan,
+				 bool (*walker) (),
+				 void *context)
+{
+	/* Guard against stack overflow due to overly complex plan trees */
+	check_stack_depth();
+
+	/* initPlan-s */
+	if (plan_walk_subplans(plan->initPlan, walker, context))
+		return true;
+
+	/* lefttree */
+	if (outerPlan(plan))
+	{
+		if (walker(outerPlan(plan), context))
+			return true;
+	}
+
+	/* righttree */
+	if (innerPlan(plan))
+	{
+		if (walker(innerPlan(plan), context))
+			return true;
+	}
+
+	/* special child plans */
+	switch (nodeTag(plan))
+	{
+		case T_Append:
+			if (plan_walk_members(((Append *) plan)->appendplans,
+								  walker, context))
+				return true;
+			break;
+		case T_MergeAppend:
+			if (plan_walk_members(((MergeAppend *) plan)->mergeplans,
+								  walker, context))
+				return true;
+			break;
+		case T_BitmapAnd:
+			if (plan_walk_members(((BitmapAnd *) plan)->bitmapplans,
+								  walker, context))
+				return true;
+			break;
+		case T_BitmapOr:
+			if (plan_walk_members(((BitmapOr *) plan)->bitmapplans,
+								  walker, context))
+				return true;
+			break;
+		case T_CustomScan:
+			if (plan_walk_members(((CustomScan *) plan)->custom_plans,
+								  walker, context))
+				return true;
+			break;
+		case T_SubqueryScan:
+			if (walker(((SubqueryScan *) plan)->subplan, context))
+				return true;
+			break;
+		default:
+			break;
+	}
+
+	return false;
+}
+
+/*
+ * Walk a list of SubPlans (or initPlans, which also use SubPlan nodes).
+ */
+static bool
+plan_walk_subplans(List *plans,
+				   bool (*walker) (),
+				   void *context)
+{
+	ListCell   *lc;
+	PlannedStmt *plannedstmt = (PlannedStmt *) context;
+
+	foreach(lc, plans)
+	{
+		SubPlan *sp = lfirst_node(SubPlan, lc);
+		Plan *p = list_nth(plannedstmt->subplans, sp->plan_id - 1);
+
+		if (walker(p, context))
+			return true;
+	}
+
+	return false;
+}
+
+/*
+ * Walk the constituent plans of an Append, MergeAppend, BitmapAnd,
+ * BitmapOr, or CustomScan node.
+ */
+static bool
+plan_walk_members(List *plans, bool (*walker) (), void *context)
+{
+	ListCell *lc;
+
+	foreach(lc, plans)
+	{
+		if (walker(lfirst(lc), context))
+			return true;
+	}
+
+	return false;
+}
diff --git a/src/include/nodes/nodeFuncs.h b/src/include/nodes/nodeFuncs.h
index 93c60bde66..fca107ad65 100644
--- a/src/include/nodes/nodeFuncs.h
+++ b/src/include/nodes/nodeFuncs.h
@@ -158,5 +158,8 @@ extern bool raw_expression_tree_walker(Node *node, bool (*walker) (),
 struct PlanState;
 extern bool planstate_tree_walker(struct PlanState *planstate, bool (*walker) (),
 								  void *context);
+struct Plan;
+extern bool plan_tree_walker(struct Plan *plan, bool (*walker) (),
+				 void *context);
 
 #endif							/* NODEFUNCS_H */
-- 
2.24.1



  [application/x-patch] v6-0002-Add-Merge-Append.partitioned_rels.patch (17.4K, 3-v6-0002-Add-Merge-Append.partitioned_rels.patch)
From 8c81237402922ebf82786f3ff34972a6a3cb8c03 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Thu, 24 Mar 2022 22:47:03 +0900
Subject: [PATCH v6 2/4] Add [Merge]Append.partitioned_rels

To record the RT indexes of all partitioned ancestors leading up to
leaf partitions that are appended by the node.

If a given [Merge]Append node is left out from the plan due to there
being only one element in its list of child subplans, then its
partitioned_rels set is added to PlannerGlobal.elidedAppendPartedRels
that is passed down to the executor through PlannedStmt.

There are no users for partitioned_rels and elidedAppendPartedRels
as of this commit, though a later commit will require the ability
to extract the set of relations that must be locked to make a plan
tree safe for execution by walking the plan tree itself, so having
the partitioned tables be also present in the plan tree will be
helpful.  Note that currently the executor relies on the fact that
the set of relations to be locked can be obtained by simply scanning
the range table that's made available in PlannedStmt along with the
plan tree.
---
 src/backend/nodes/copyfuncs.c           |  3 +++
 src/backend/nodes/outfuncs.c            |  5 +++++
 src/backend/nodes/readfuncs.c           |  3 +++
 src/backend/optimizer/path/joinrels.c   |  9 ++++++++
 src/backend/optimizer/plan/createplan.c | 18 +++++++++++++++-
 src/backend/optimizer/plan/planner.c    |  8 +++++++
 src/backend/optimizer/plan/setrefs.c    | 28 +++++++++++++++++++++++++
 src/backend/optimizer/util/inherit.c    | 16 ++++++++++++++
 src/backend/optimizer/util/relnode.c    | 20 ++++++++++++++++++
 src/include/nodes/pathnodes.h           | 22 +++++++++++++++++++
 src/include/nodes/plannodes.h           | 17 +++++++++++++++
 11 files changed, 148 insertions(+), 1 deletion(-)

diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 55f720a88f..dc68a12486 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -106,6 +106,7 @@ _copyPlannedStmt(const PlannedStmt *from)
 	COPY_NODE_FIELD(invalItems);
 	COPY_NODE_FIELD(paramExecTypes);
 	COPY_NODE_FIELD(utilityStmt);
+	COPY_BITMAPSET_FIELD(elidedAppendPartedRels);
 	COPY_LOCATION_FIELD(stmt_location);
 	COPY_SCALAR_FIELD(stmt_len);
 
@@ -253,6 +254,7 @@ _copyAppend(const Append *from)
 	COPY_SCALAR_FIELD(nasyncplans);
 	COPY_SCALAR_FIELD(first_partial_plan);
 	COPY_NODE_FIELD(part_prune_info);
+	COPY_BITMAPSET_FIELD(partitioned_rels);
 
 	return newnode;
 }
@@ -281,6 +283,7 @@ _copyMergeAppend(const MergeAppend *from)
 	COPY_POINTER_FIELD(collations, from->numCols * sizeof(Oid));
 	COPY_POINTER_FIELD(nullsFirst, from->numCols * sizeof(bool));
 	COPY_NODE_FIELD(part_prune_info);
+	COPY_BITMAPSET_FIELD(partitioned_rels);
 
 	return newnode;
 }
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 6bdad462c7..bc178d53bf 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -324,6 +324,7 @@ _outPlannedStmt(StringInfo str, const PlannedStmt *node)
 	WRITE_NODE_FIELD(invalItems);
 	WRITE_NODE_FIELD(paramExecTypes);
 	WRITE_NODE_FIELD(utilityStmt);
+	WRITE_BITMAPSET_FIELD(elidedAppendPartedRels);
 	WRITE_LOCATION_FIELD(stmt_location);
 	WRITE_INT_FIELD(stmt_len);
 }
@@ -443,6 +444,7 @@ _outAppend(StringInfo str, const Append *node)
 	WRITE_INT_FIELD(nasyncplans);
 	WRITE_INT_FIELD(first_partial_plan);
 	WRITE_NODE_FIELD(part_prune_info);
+	WRITE_BITMAPSET_FIELD(partitioned_rels);
 }
 
 static void
@@ -460,6 +462,7 @@ _outMergeAppend(StringInfo str, const MergeAppend *node)
 	WRITE_OID_ARRAY(collations, node->numCols);
 	WRITE_BOOL_ARRAY(nullsFirst, node->numCols);
 	WRITE_NODE_FIELD(part_prune_info);
+	WRITE_BITMAPSET_FIELD(partitioned_rels);
 }
 
 static void
@@ -2288,6 +2291,7 @@ _outPlannerGlobal(StringInfo str, const PlannerGlobal *node)
 	WRITE_BOOL_FIELD(parallelModeOK);
 	WRITE_BOOL_FIELD(parallelModeNeeded);
 	WRITE_CHAR_FIELD(maxParallelHazard);
+	WRITE_BITMAPSET_FIELD(elidedAppendPartedRels);
 }
 
 static void
@@ -2399,6 +2403,7 @@ _outRelOptInfo(StringInfo str, const RelOptInfo *node)
 	WRITE_BOOL_FIELD(partbounds_merged);
 	WRITE_BITMAPSET_FIELD(live_parts);
 	WRITE_BITMAPSET_FIELD(all_partrels);
+	WRITE_BITMAPSET_FIELD(partitioned_rels);
 }
 
 static void
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 3f68f7c18d..3c673c42d5 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -1597,6 +1597,7 @@ _readPlannedStmt(void)
 	READ_NODE_FIELD(invalItems);
 	READ_NODE_FIELD(paramExecTypes);
 	READ_NODE_FIELD(utilityStmt);
+	READ_BITMAPSET_FIELD(elidedAppendPartedRels);
 	READ_LOCATION_FIELD(stmt_location);
 	READ_INT_FIELD(stmt_len);
 
@@ -1719,6 +1720,7 @@ _readAppend(void)
 	READ_INT_FIELD(nasyncplans);
 	READ_INT_FIELD(first_partial_plan);
 	READ_NODE_FIELD(part_prune_info);
+	READ_BITMAPSET_FIELD(partitioned_rels);
 
 	READ_DONE();
 }
@@ -1741,6 +1743,7 @@ _readMergeAppend(void)
 	READ_OID_ARRAY(collations, local_node->numCols);
 	READ_BOOL_ARRAY(nullsFirst, local_node->numCols);
 	READ_NODE_FIELD(part_prune_info);
+	READ_BITMAPSET_FIELD(partitioned_rels);
 
 	READ_DONE();
 }
diff --git a/src/backend/optimizer/path/joinrels.c b/src/backend/optimizer/path/joinrels.c
index 9da3ff2f9a..e74d40fee3 100644
--- a/src/backend/optimizer/path/joinrels.c
+++ b/src/backend/optimizer/path/joinrels.c
@@ -1549,6 +1549,15 @@ try_partitionwise_join(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
 		populate_joinrel_with_paths(root, child_rel1, child_rel2,
 									child_joinrel, child_sjinfo,
 									child_restrictlist);
+
+		/*
+		 * A parent relation's partitioned_rels must be a superset of the sets
+		 * of all its children, direct or indirect, so bubble up the child
+		 * joinrel's set.
+		 */
+		joinrel->partitioned_rels =
+			bms_add_members(joinrel->partitioned_rels,
+							child_joinrel->partitioned_rels);
 	}
 }
 
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index fa069a217c..0026086591 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -26,10 +26,12 @@
 #include "nodes/extensible.h"
 #include "nodes/makefuncs.h"
 #include "nodes/nodeFuncs.h"
+#include "optimizer/appendinfo.h"
 #include "optimizer/clauses.h"
 #include "optimizer/cost.h"
 #include "optimizer/optimizer.h"
 #include "optimizer/paramassign.h"
+#include "optimizer/pathnode.h"
 #include "optimizer/paths.h"
 #include "optimizer/placeholder.h"
 #include "optimizer/plancat.h"
@@ -1331,11 +1333,11 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
 										 best_path->subpaths,
 										 prunequal);
 	}
-
 	plan->appendplans = subplans;
 	plan->nasyncplans = nasyncplans;
 	plan->first_partial_plan = best_path->first_partial_path;
 	plan->part_prune_info = partpruneinfo;
+	plan->partitioned_rels = bms_copy(rel->partitioned_rels);
 
 	copy_generic_path_info(&plan->plan, (Path *) best_path);
 
@@ -1499,6 +1501,20 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
 	node->mergeplans = subplans;
 	node->part_prune_info = partpruneinfo;
 
+	/*
+	 * We need to explicitly add to the plan node the RT indexes of any
+	 * partitioned tables whose partitions will be scanned by the nodes in
+	 * 'subplans'.  There can be multiple RT indexes in the set due to the
+	 * partition tree being multi-level and/or this being a plan for UNION ALL
+	 * over multiple partition trees.  Along with scanrelids of leaf-level Scan
+	 * nodes, this allows the executor to lock the full set of relations being
+	 * scanned by this node.
+	 *
+	 * Note that 'apprelids' only contains the top-level base relation(s), so
+	 * is not sufficient for the purpose.
+	 */
+	node->partitioned_rels = bms_copy(rel->partitioned_rels);
+
 	/*
 	 * If prepare_sort_from_pathkeys added sort columns, but we were told to
 	 * produce either the exact tlist or a narrow tlist, we should get rid of
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index bd09f85aea..374a9d9753 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -529,6 +529,7 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 	result->paramExecTypes = glob->paramExecTypes;
 	/* utilityStmt should be null, but we might as well copy it */
 	result->utilityStmt = parse->utilityStmt;
+	result->elidedAppendPartedRels = glob->elidedAppendPartedRels;
 	result->stmt_location = parse->stmt_location;
 	result->stmt_len = parse->stmt_len;
 
@@ -7365,6 +7366,13 @@ create_partitionwise_grouping_paths(PlannerInfo *root,
 
 		add_paths_to_append_rel(root, grouped_rel, grouped_live_children);
 	}
+
+	/*
+	 * Input rel might be a partitioned appendrel, though grouped_rel has at
+	 * this point taken its role as the appendrel owning the former's
+	 * children, so copy the former's partitioned_rels set into the latter.
+	 */
+	grouped_rel->partitioned_rels = bms_copy(input_rel->partitioned_rels);
 }
 
 /*
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index a7b11b7f03..dbdeb8ec9d 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -1512,6 +1512,10 @@ set_append_references(PlannerInfo *root,
 		lfirst(l) = set_plan_refs(root, (Plan *) lfirst(l), rtoffset);
 	}
 
+	/* Fix up partitioned_rels before possibly removing the Append below. */
+	aplan->partitioned_rels = offset_relid_set(aplan->partitioned_rels,
+											   rtoffset);
+
 	/*
 	 * See if it's safe to get rid of the Append entirely.  For this to be
 	 * safe, there must be only one child plan and that child plan's parallel
@@ -1522,8 +1526,17 @@ set_append_references(PlannerInfo *root,
 	 */
 	if (list_length(aplan->appendplans) == 1 &&
 		((Plan *) linitial(aplan->appendplans))->parallel_aware == aplan->plan.parallel_aware)
+	{
+		/*
+		 * Partitioned tables involved, if any, must be made known to the
+		 * executor.
+		 */
+		root->glob->elidedAppendPartedRels =
+			bms_add_members(root->glob->elidedAppendPartedRels,
+							aplan->partitioned_rels);
 		return clean_up_removed_plan_level((Plan *) aplan,
 										   (Plan *) linitial(aplan->appendplans));
+	}
 
 	/*
 	 * Otherwise, clean up the Append as needed.  It's okay to do this after
@@ -1584,6 +1597,12 @@ set_mergeappend_references(PlannerInfo *root,
 		lfirst(l) = set_plan_refs(root, (Plan *) lfirst(l), rtoffset);
 	}
 
+	/*
+	 * Fix up partitioned_rels before possibly removing the MergeAppend below.
+	 */
+	mplan->partitioned_rels = offset_relid_set(mplan->partitioned_rels,
+											   rtoffset);
+
 	/*
 	 * See if it's safe to get rid of the MergeAppend entirely.  For this to
 	 * be safe, there must be only one child plan and that child plan's
@@ -1594,8 +1613,17 @@ set_mergeappend_references(PlannerInfo *root,
 	 */
 	if (list_length(mplan->mergeplans) == 1 &&
 		((Plan *) linitial(mplan->mergeplans))->parallel_aware == mplan->plan.parallel_aware)
+	{
+		/*
+		 * Partitioned tables involved, if any, must be made known to the
+		 * executor.
+		 */
+		root->glob->elidedAppendPartedRels =
+			bms_add_members(root->glob->elidedAppendPartedRels,
+							mplan->partitioned_rels);
 		return clean_up_removed_plan_level((Plan *) mplan,
 										   (Plan *) linitial(mplan->mergeplans));
+	}
 
 	/*
 	 * Otherwise, clean up the MergeAppend as needed.  It's okay to do this
diff --git a/src/backend/optimizer/util/inherit.c b/src/backend/optimizer/util/inherit.c
index 7e134822f3..56912e4101 100644
--- a/src/backend/optimizer/util/inherit.c
+++ b/src/backend/optimizer/util/inherit.c
@@ -406,6 +406,14 @@ expand_partitioned_rtentry(PlannerInfo *root, RelOptInfo *relinfo,
 									   childrte, childRTindex,
 									   childrel, top_parentrc, lockmode);
 
+		/*
+		 * A parent relation's partitioned_rels must be a superset of the sets
+		 * of all its children, direct or indirect, so bubble up the child
+		 * rel's set.
+		 */
+		relinfo->partitioned_rels = bms_add_members(relinfo->partitioned_rels,
+													childrelinfo->partitioned_rels);
+
 		/* Close child relation, but keep locks */
 		table_close(childrel, NoLock);
 	}
@@ -737,6 +745,14 @@ expand_appendrel_subquery(PlannerInfo *root, RelOptInfo *rel,
 		/* Child may itself be an inherited rel, either table or subquery. */
 		if (childrte->inh)
 			expand_inherited_rtentry(root, childrel, childrte, childRTindex);
+
+		/*
+		 * A parent relation's partitioned_rels must be a superset of the sets
+		 * of all its children, direct or indirect, so bubble up the child
+		 * rel's set.
+		 */
+		rel->partitioned_rels = bms_add_members(rel->partitioned_rels,
+												childrel->partitioned_rels);
 	}
 }
 
diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c
index 520409f4ba..1d082a8fdd 100644
--- a/src/backend/optimizer/util/relnode.c
+++ b/src/backend/optimizer/util/relnode.c
@@ -361,6 +361,10 @@ build_simple_rel(PlannerInfo *root, int relid, RelOptInfo *parent)
 		}
 	}
 
+	/* A partitioned appendrel. */
+	if (rel->part_scheme != NULL)
+		rel->partitioned_rels = bms_copy(rel->relids);
+
 	/* Save the finished struct in the query's simple_rel_array */
 	root->simple_rel_array[relid] = rel;
 
@@ -729,6 +733,14 @@ build_join_rel(PlannerInfo *root,
 	set_joinrel_size_estimates(root, joinrel, outer_rel, inner_rel,
 							   sjinfo, restrictlist);
 
+	/*
+	 * The joinrel may get processed as an appendrel via partitionwise join
+	 * if both outer and inner rels are partitioned, so set partitioned_rels
+	 * appropriately.
+	 */
+	joinrel->partitioned_rels = bms_union(outer_rel->partitioned_rels,
+										  inner_rel->partitioned_rels);
+
 	/*
 	 * Set the consider_parallel flag if this joinrel could potentially be
 	 * scanned within a parallel worker.  If this flag is false for either
@@ -897,6 +909,14 @@ build_child_join_rel(PlannerInfo *root, RelOptInfo *outer_rel,
 	set_joinrel_size_estimates(root, joinrel, outer_rel, inner_rel,
 							   sjinfo, restrictlist);
 
+	/*
+	 * The joinrel may get processed as an appendrel via partitionwise join
+	 * if both outer and inner rels are partitioned, so set partitioned_rels
+	 * appropriately.
+	 */
+	joinrel->partitioned_rels = bms_union(outer_rel->partitioned_rels,
+										  inner_rel->partitioned_rels);
+
 	/* We build the join only once. */
 	Assert(!find_join_rel(root, joinrel->relids));
 
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 1f3845b3fe..5327d9ba8b 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -130,6 +130,11 @@ typedef struct PlannerGlobal
 	char		maxParallelHazard;	/* worst PROPARALLEL hazard level */
 
 	PartitionDirectory partition_directory; /* partition descriptors */
+
+	Bitmapset *elidedAppendPartedRels;	/* Combined partitioned_rels of all
+										 * single-subplan [Merge]Append nodes
+										 * that have been removed from the
+										 * various plan trees. */
 } PlannerGlobal;
 
 /* macro for fetching the Plan associated with a SubPlan node */
@@ -773,6 +778,23 @@ typedef struct RelOptInfo
 	Relids		all_partrels;	/* Relids set of all partition relids */
 	List	  **partexprs;		/* Non-nullable partition key expressions */
 	List	  **nullable_partexprs; /* Nullable partition key expressions */
+
+	/*
+	 * For an appendrel parent relation (base, join, or upper) that is
+	 * partitioned, this stores the RT indexes of all the partitioned ancestors
+	 * including itself that lead up to the individual leaf partitions that
+	 * will be scanned to produce this relation's output rows.  The relid set
+	 * is copied into the resulting Append or MergeAppend plan node for
+	 * allowing the executor to take appropriate locks on those relations,
+	 * unless the node is deemed useless in setrefs.c due to having a single
+	 * leaf subplan and thus elided from the final plan, in which case, the set
+	 * is added into PlannerGlobal.elidedAppendPartedRels.
+	 *
+	 * Note that 'apprelids' of those nodes only contains the top-level base
+	 * relation(s), so is not sufficient for said purpose.
+	 */
+
+	Bitmapset  *partitioned_rels;
 } RelOptInfo;
 
 /*
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 0b518ce6b2..bd87c35d6c 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -85,6 +85,11 @@ typedef struct PlannedStmt
 
 	Node	   *utilityStmt;	/* non-null if this is utility stmt */
 
+	Bitmapset *elidedAppendPartedRels;	/* Combined partitioned_rels of all
+										 * single-subplan [Merge]Append nodes
+										 * that have been removed from the
+										 * various plan trees. */
+
 	/* statement location in source string (copied from Query) */
 	int			stmt_location;	/* start location, or -1 if unknown */
 	int			stmt_len;		/* length in bytes; 0 means "rest of string" */
@@ -261,6 +266,12 @@ typedef struct Append
 
 	/* Info for run-time subplan pruning; NULL if we're not doing that */
 	struct PartitionPruneInfo *part_prune_info;
+
+	/*
+	 * RT indexes of all partitioned parents whose partitions' plans are
+	 * present in appendplans.
+	 */
+	Bitmapset  *partitioned_rels;
 } Append;
 
 /* ----------------
@@ -281,6 +292,12 @@ typedef struct MergeAppend
 	bool	   *nullsFirst;		/* NULLS FIRST/LAST directions */
 	/* Info for run-time subplan pruning; NULL if we're not doing that */
 	struct PartitionPruneInfo *part_prune_info;
+
+	/*
+	 * RT indexes of all partitioned parents whose partitions' plans are
+	 * present in appendplans.
+	 */
+	Bitmapset  *partitioned_rels;
 } MergeAppend;
 
 /* ----------------
-- 
2.24.1
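
The "bubble up" rule for partitioned_rels used throughout the patch above can
be illustrated with a small standalone model.  This is only a sketch under
stated assumptions: ToyRel, ToyRelids, toy_init_rel() and toy_bubble_up() are
hypothetical names invented for illustration, and a 64-bit mask stands in for
a Bitmapset; none of it is PostgreSQL code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a Bitmapset of RT indexes (0..63 only). */
typedef uint64_t ToyRelids;

/* Hypothetical, heavily reduced model of a RelOptInfo. */
typedef struct ToyRel
{
	int			relid;				/* RT index of this relation */
	int			is_partitioned;		/* nonzero if it has a partition scheme */
	ToyRelids	partitioned_rels;	/* partitioned rels at or below this one */
} ToyRel;

/* A partitioned rel starts with itself in the set; a leaf starts empty. */
static void
toy_init_rel(ToyRel *rel, int relid, int is_partitioned)
{
	assert(relid >= 0 && relid < 64);
	rel->relid = relid;
	rel->is_partitioned = is_partitioned;
	rel->partitioned_rels = is_partitioned ? (UINT64_C(1) << relid) : 0;
}

/* The parent's set must be a superset of each child's set. */
static void
toy_bubble_up(ToyRel *parent, const ToyRel *child)
{
	parent->partitioned_rels |= child->partitioned_rels;
}
```

In this model the children must be processed before their parent so that
indirect descendants are included, which is also the order in which the patch
expands a child before adding the child's set to the parent.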



  [application/x-patch] v6-0004-Optimize-AcquireExecutorLocks-to-skip-pruned-part.patch (94.2K, 4-v6-0004-Optimize-AcquireExecutorLocks-to-skip-pruned-part.patch)
From 5e076f58274f6cd05afc8533af130e165c9b862e Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Wed, 22 Dec 2021 16:55:17 +0900
Subject: [PATCH v6 4/4] Optimize AcquireExecutorLocks() to skip pruned
 partitions

Instead of locking all relations listed in the range table, in cases
where the PlannedStmt indicates that some nodes in the plan tree can
do partition pruning without depending on execution having started
(so-called "initial" pruning), AcquireExecutorLocks() now calls the
new executor function ExecutorGetLockRels(), which returns the set of
relations (their RT indexes) to be locked, not including those
scanned by the subplans that were pruned.

The result of pruning done this way must be remembered and reused
during actual execution of the plan.  This is done by creating a
PlanInitPruningOutput node for each plan node that undergoes pruning;
the set of those for the whole plan tree is added to an
ExecLockRelsInfo, which also stores the bitmapset of RT indexes of
relations that are actually locked by AcquireExecutorLocks().
ExecLockRelsInfos are passed down the executor alongside the
PlannedStmts.  This arrangement ensures that the executor doesn't
accidentally try to process plan tree subnodes that have been
deemed pruned by AcquireExecutorLocks().
---
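
The two-phase arrangement described above (prune once while collecting locks,
then reuse the saved result at executor startup) can be sketched with a toy
model.  Everything here is an assumption made for illustration: ToySet,
ToyAppend, ToyLockInfo, toy_get_lock_rels() and toy_subplan_is_valid() are
hypothetical names, a 64-bit mask stands in for a Bitmapset, and the pruning
result is simply passed in rather than computed from pruning steps.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a Bitmapset: bit i set means member i. */
typedef uint64_t ToySet;

/* Hypothetical, reduced model of an Append node over partitions. */
typedef struct ToyAppend
{
	int			nsubplans;			/* number of child subplans, at most 64 */
	int			scanrelid[64];		/* RT index scanned by each subplan */
	ToySet		partitioned_rels;	/* RT indexes of partitioned parents */
} ToyAppend;

/* Hypothetical model of what gets handed to the executor with the plan. */
typedef struct ToyLockInfo
{
	ToySet		lockrels;			/* RT indexes that must be locked */
	ToySet		valid_subplans;		/* indexes of subplans that survived */
} ToyLockInfo;

/*
 * Phase 1: given the subplans that survived initial pruning, collect the
 * relations to lock.  Partitioned parents are always included; leaf
 * relations are included only if their subplan survived.
 */
static ToyLockInfo
toy_get_lock_rels(const ToyAppend *node, ToySet surviving)
{
	ToyLockInfo info;

	assert(node->nsubplans >= 0 && node->nsubplans <= 64);
	info.lockrels = node->partitioned_rels;
	info.valid_subplans = surviving;
	for (int i = 0; i < node->nsubplans; i++)
	{
		assert(node->scanrelid[i] >= 0 && node->scanrelid[i] < 64);
		if (surviving & (UINT64_C(1) << i))
			info.lockrels |= UINT64_C(1) << node->scanrelid[i];
	}
	return info;
}

/*
 * Phase 2: at executor startup, consult the saved result instead of
 * redoing the pruning.  A pruned subplan is still in the plan tree but
 * its relation is unlocked, so it must not be initialized.
 */
static int
toy_subplan_is_valid(const ToyLockInfo *info, int subplan_index)
{
	assert(subplan_index >= 0 && subplan_index < 64);
	return (int) ((info->valid_subplans >> subplan_index) & 1);
}
```

The point of keeping valid_subplans next to lockrels in this model is that
both phases see the same answer; if pruning were recomputed at startup and
came out differently, a subplan over an unlocked relation could be
initialized.
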
 src/backend/commands/copyto.c          |   2 +-
 src/backend/commands/createas.c        |   2 +-
 src/backend/commands/explain.c         |   7 +-
 src/backend/commands/extension.c       |  13 +-
 src/backend/commands/matview.c         |   2 +-
 src/backend/commands/portalcmds.c      |   1 +
 src/backend/commands/prepare.c         |  17 +-
 src/backend/executor/README            |  24 +++
 src/backend/executor/execMain.c        | 202 ++++++++++++++++++++
 src/backend/executor/execParallel.c    |  26 ++-
 src/backend/executor/execPartition.c   | 224 ++++++++++++++++++----
 src/backend/executor/execUtils.c       |   8 +
 src/backend/executor/functions.c       |   2 +-
 src/backend/executor/nodeAppend.c      |  52 ++++-
 src/backend/executor/nodeMergeAppend.c |  52 ++++-
 src/backend/executor/nodeModifyTable.c |  25 +++
 src/backend/executor/spi.c             |  14 +-
 src/backend/nodes/copyfuncs.c          |  49 ++++-
 src/backend/nodes/outfuncs.c           |  39 ++++
 src/backend/nodes/readfuncs.c          |  37 ++++
 src/backend/optimizer/plan/planner.c   |   2 +
 src/backend/optimizer/plan/setrefs.c   |   6 +
 src/backend/partitioning/partprune.c   |  37 +++-
 src/backend/tcop/postgres.c            |  15 +-
 src/backend/tcop/pquery.c              |  21 ++-
 src/backend/utils/cache/plancache.c    | 252 ++++++++++++++++++++++---
 src/backend/utils/mmgr/portalmem.c     |   2 +
 src/include/commands/explain.h         |   3 +-
 src/include/executor/execPartition.h   |   2 +
 src/include/executor/execdesc.h        |   2 +
 src/include/executor/executor.h        |   2 +
 src/include/executor/nodeAppend.h      |   1 +
 src/include/executor/nodeMergeAppend.h |   1 +
 src/include/executor/nodeModifyTable.h |   1 +
 src/include/nodes/execnodes.h          |  96 ++++++++++
 src/include/nodes/nodes.h              |   5 +
 src/include/nodes/pathnodes.h          |   4 +
 src/include/nodes/plannodes.h          |  15 ++
 src/include/tcop/tcopprot.h            |   2 +-
 src/include/utils/plancache.h          |   6 +
 src/include/utils/portal.h             |   5 +
 41 files changed, 1174 insertions(+), 104 deletions(-)

diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index 55c38b04c4..d403eb2309 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -542,7 +542,7 @@ BeginCopyTo(ParseState *pstate,
 		((DR_copy *) dest)->cstate = cstate;
 
 		/* Create a QueryDesc requesting no output */
-		cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+		cstate->queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
 											dest, NULL, NULL, 0);
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 9abbb6b555..f6607f2454 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -325,7 +325,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
 		UpdateActiveSnapshotCommandId();
 
 		/* Create a QueryDesc, redirecting output to our tuple receiver */
-		queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+		queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
 									GetActiveSnapshot(), InvalidSnapshot,
 									dest, params, queryEnv, 0);
 
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 9f632285b6..1f1a44b9bb 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -407,7 +407,7 @@ ExplainOneQuery(Query *query, int cursorOptions,
 		}
 
 		/* run it (if needed) and produce output */
-		ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+		ExplainOnePlan(plan, NULL, into, es, queryString, params, queryEnv,
 					   &planduration, (es->buffers ? &bufusage : NULL));
 	}
 }
@@ -515,7 +515,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
  * to call it.
  */
 void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, ExecLockRelsInfo *execlockrelsinfo,
+			   IntoClause *into, ExplainState *es,
 			   const char *queryString, ParamListInfo params,
 			   QueryEnvironment *queryEnv, const instr_time *planduration,
 			   const BufferUsage *bufusage)
@@ -563,7 +564,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
 		dest = None_Receiver;
 
 	/* Create a QueryDesc for the query */
-	queryDesc = CreateQueryDesc(plannedstmt, queryString,
+	queryDesc = CreateQueryDesc(plannedstmt, execlockrelsinfo, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, params, queryEnv, instrument_option);
 
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index 1013790dbb..008b8ce0e9 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -741,8 +741,10 @@ execute_sql_string(const char *sql)
 		RawStmt    *parsetree = lfirst_node(RawStmt, lc1);
 		MemoryContext per_parsetree_context,
 					oldcontext;
-		List	   *stmt_list;
-		ListCell   *lc2;
+		List	   *stmt_list,
+				   *execlockrelsinfo_list;
+		ListCell   *lc2,
+				   *lc3;
 
 		/*
 		 * We do the work for each parsetree in a short-lived context, to
@@ -762,11 +764,13 @@ execute_sql_string(const char *sql)
 										   NULL,
 										   0,
 										   NULL);
-		stmt_list = pg_plan_queries(stmt_list, sql, CURSOR_OPT_PARALLEL_OK, NULL);
+		stmt_list = pg_plan_queries(stmt_list, sql, CURSOR_OPT_PARALLEL_OK, NULL,
+									&execlockrelsinfo_list);
 
-		foreach(lc2, stmt_list)
+		forboth(lc2, stmt_list, lc3, execlockrelsinfo_list)
 		{
 			PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+			ExecLockRelsInfo *execlockrelsinfo = lfirst_node(ExecLockRelsInfo, lc3);
 
 			CommandCounterIncrement();
 
@@ -777,6 +781,7 @@ execute_sql_string(const char *sql)
 				QueryDesc  *qdesc;
 
 				qdesc = CreateQueryDesc(stmt,
+										execlockrelsinfo,
 										sql,
 										GetActiveSnapshot(), NULL,
 										dest, NULL, NULL, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 05e7b60059..4ef44aaf23 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -416,7 +416,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
 	UpdateActiveSnapshotCommandId();
 
 	/* Create a QueryDesc, redirecting output to our tuple receiver */
-	queryDesc = CreateQueryDesc(plan, queryString,
+	queryDesc = CreateQueryDesc(plan, NULL, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, NULL, NULL, 0);
 
diff --git a/src/backend/commands/portalcmds.c b/src/backend/commands/portalcmds.c
index 9902c5c566..85e73ddded 100644
--- a/src/backend/commands/portalcmds.c
+++ b/src/backend/commands/portalcmds.c
@@ -107,6 +107,7 @@ PerformCursorOpen(ParseState *pstate, DeclareCursorStmt *cstmt, ParamListInfo pa
 					  queryString,
 					  CMDTAG_SELECT,	/* cursor's query is always a SELECT */
 					  list_make1(plan),
+					  list_make1(NULL),	/* no ExecLockRelsInfo to pass */
 					  NULL);
 
 	/*----------
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 80738547ed..bbbf8bbcbd 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -155,6 +155,7 @@ ExecuteQuery(ParseState *pstate,
 	PreparedStatement *entry;
 	CachedPlan *cplan;
 	List	   *plan_list;
+	List	   *plan_execlockrelsinfo_list;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
 	Portal		portal;
@@ -195,6 +196,7 @@ ExecuteQuery(ParseState *pstate,
 	/* Replan if needed, and increment plan refcount for portal */
 	cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL);
 	plan_list = cplan->stmt_list;
+	plan_execlockrelsinfo_list = cplan->execlockrelsinfo_list;
 
 	/*
 	 * DO NOT add any logic that could possibly throw an error between
@@ -204,7 +206,7 @@ ExecuteQuery(ParseState *pstate,
 					  NULL,
 					  query_string,
 					  entry->plansource->commandTag,
-					  plan_list,
+					  plan_list, plan_execlockrelsinfo_list,
 					  cplan);
 
 	/*
@@ -576,7 +578,9 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	const char *query_string;
 	CachedPlan *cplan;
 	List	   *plan_list;
-	ListCell   *p;
+	List	   *plan_execlockrelsinfo_list;
+	ListCell   *p,
+			   *pe;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
 	instr_time	planstart;
@@ -632,15 +636,18 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	}
 
 	plan_list = cplan->stmt_list;
+	plan_execlockrelsinfo_list = cplan->execlockrelsinfo_list;
 
 	/* Explain each query */
-	foreach(p, plan_list)
+	forboth(p, plan_list, pe, plan_execlockrelsinfo_list)
 	{
 		PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
+		ExecLockRelsInfo *execlockrelsinfo = lfirst_node(ExecLockRelsInfo, pe);
 
 		if (pstmt->commandType != CMD_UTILITY)
-			ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv,
-						   &planduration, (es->buffers ? &bufusage : NULL));
+			ExplainOnePlan(pstmt, execlockrelsinfo, into, es, query_string,
+						   paramLI, queryEnv, &planduration,
+						   (es->buffers ? &bufusage : NULL));
 		else
 			ExplainOneUtility(pstmt->utilityStmt, into, es, query_string,
 							  paramLI, queryEnv);
diff --git a/src/backend/executor/README b/src/backend/executor/README
index bf5e70860d..9720d0ac2c 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -65,6 +65,27 @@ found there.  This currently only occurs for Append and MergeAppend nodes.  In
 this case the non-required subplans are ignored and the executor state's
 subnode array will become out of sequence to the plan's subplan list.
 
+Actually, the so-called execution time pruning may also occur even before the
+execution has started.  One case where that occurs is when a cached generic
+plan is being validated for execution by plancache.c: GetCachedPlan(), which
+proceeds by locking all the relations that will be scanned by that plan.  If
+the generic plan has nodes that contain so-called initial pruning steps (a
+subset of execution pruning steps that do not depend on full-fledged execution
+having started), they are performed at this point to figure out the minimal
+set of child subplans that satisfy those pruning instructions and the result
+of performing that pruning is saved in a data structure that gets passed to
+the executor alongside the plan tree.  Relations scanned by only those
+surviving subplans are then locked while those scanned by the pruned subplans
+are not, even though the pruned subplans themselves are not removed from the
+plan tree.  So, it is imperative that the executor and any third-party code
+invoked by it that gets passed the plan tree look at the initial pruning result
+made available via the aforementioned data structure to determine whether or
+not a particular subplan is valid.  (The data structure basically consists of
+an array of PlanInitPruningOutput nodes containing one element for each node
+of the plan tree indexable using plan_node_id of the individual plan nodes,
+where each node contains a bitmapset of indexes of unpruned child subplans of
+a given node.)
+
 Each Plan node may have expression trees associated with it, to represent
 its target list, qualification conditions, etc.  These trees are also
 read-only to the executor, but the executor state for expression evaluation
@@ -247,6 +268,9 @@ Query Processing Control Flow
 
 This is a sketch of control flow for full query processing:
 
+  [ ExecutorGetLockRels ] --- an optional step to walk over the plan tree
+		to produce an ExecLockRelsInfo to be passed to CreateQueryDesc
+
 	CreateQueryDesc
 
 	ExecutorStart
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 473d2e00a2..1ddd1dfb83 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -49,11 +49,15 @@
 #include "commands/matview.h"
 #include "commands/trigger.h"
 #include "executor/execdebug.h"
+#include "executor/nodeAppend.h"
+#include "executor/nodeMergeAppend.h"
+#include "executor/nodeModifyTable.h"
 #include "executor/nodeSubplan.h"
 #include "foreign/fdwapi.h"
 #include "jit/jit.h"
 #include "mb/pg_wchar.h"
 #include "miscadmin.h"
+#include "nodes/nodeFuncs.h"
 #include "parser/parsetree.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
@@ -101,9 +105,205 @@ static char *ExecBuildSlotValueDescription(Oid reloid,
 										   Bitmapset *modifiedCols,
 										   int maxfieldlen);
 static void EvalPlanQualStart(EPQState *epqstate, Plan *planTree);
+static bool ExecGetScanLockRels(Scan *scan, ExecGetLockRelsContext *context);
 
 /* end of local decls */
 
+/* ----------------------------------------------------------------
+ *		ExecutorGetLockRels
+ *
+ *		Figure out the minimal set of relations to lock to be able to safely
+ *		execute a given plan
+ *
+ * This ignores the relations scanned by child subplans that are pruned away
+ * after performing initial pruning steps present in the plan using the
+ * provided set of EXTERN parameters.
+ *
+ * Along with the set of RT indexes of relations that must be locked, the
+ * returned struct also contains an array of PlanInitPruningOutput nodes each
+ * of which contains the result of initial pruning for a given plan node, which
+ * is basically a bitmapset of the indexes of surviving child subplans.  Each
+ * plan node in the tree that undergoes pruning will have an element in the
+ * array.
+ *
+ * Note that while relations scanned by the subplans that are pruned will not
+ * be locked, the subplans themselves are left as-is in the plan tree, assuming
+ * anything that reads the plan tree during execution knows to ignore them by
+ * looking at the PlanInitPruningOutput's list of valid subplans.
+ *
+ * Partitioned tables mentioned in PartitionedRelPruneInfo nodes that drive
+ * the pruning will be locked before doing the pruning and also added to the
+ * returned set.
+ */
+ExecLockRelsInfo *
+ExecutorGetLockRels(PlannedStmt *plannedstmt, ParamListInfo params)
+{
+	int		numPlanNodes = plannedstmt->numPlanNodes;
+	ExecGetLockRelsContext context;
+	ExecLockRelsInfo *result;
+	ListCell *lc;
+
+	/* Only get here if there is any pruning to do. */
+	Assert(plannedstmt->containsInitialPruning);
+
+	context.stmt = plannedstmt;
+	context.params = params;
+
+	/*
+	 * Go walk all the plan tree(s) present in the PlannedStmt, filling
+	 * context.lockrels with only the relations from plan nodes that
+	 * survive initial pruning and also the tables mentioned in
+	 * partitioned_rels sets found in the plan.
+	 */
+	context.lockrels = NULL;
+	context.initPruningOutputs = NIL;
+	context.ipoIndexes = palloc0(sizeof(int) * numPlanNodes);
+
+	/* All the subplans. */
+	foreach(lc, plannedstmt->subplans)
+	{
+		Plan *subplan = lfirst(lc);
+
+		(void) ExecGetLockRels(subplan, &context);
+	}
+
+	/* And the main tree. */
+	(void) ExecGetLockRels(plannedstmt->planTree, &context);
+
+	/*
+	 * Also be sure to lock partitioned relations from any [Merge]Append nodes
+	 * that were originally present but were ultimately left out from the plan
+	 * due to being deemed no-op nodes.
+	 */
+	context.lockrels = bms_add_members(context.lockrels,
+									   plannedstmt->elidedAppendPartedRels);
+
+	result = makeNode(ExecLockRelsInfo);
+	result->lockrels = context.lockrels;
+	result->numPlanNodes = numPlanNodes;
+	result->initPruningOutputs = context.initPruningOutputs;
+	result->ipoIndexes = context.ipoIndexes;
+
+	return result;
+}
+
+/* ------------------------------------------------------------------------
+ * ExecGetLockRels
+ *		Adds all the relations that will be scanned by 'node' and its child
+ *		plans to context->lockrels after taking into account the effect
+ *		of performing initial pruning, if any.
+ *
+ * context->stmt gives the PlannedStmt being inspected to access the plan's
+ * range table if needed and context->params the set of EXTERN parameters
+ * available to evaluate pruning parameters.
+ *
+ * If initial pruning is done, a PlanInitPruningOutput node containing the
+ * result of pruning will be stored in context->initPruningOutputs that will
+ * be made available to the executor to reuse.
+ * ------------------------------------------------------------------------
+ */
+bool
+ExecGetLockRels(Plan *node, ExecGetLockRelsContext *context)
+{
+	/* Do nothing when we get to the end of a leaf on tree. */
+	if (node == NULL)
+		return true;
+
+	/* Make sure there's enough stack available. */
+	check_stack_depth();
+
+	switch (nodeTag(node))
+	{
+		/* Currently, only these two nodes have prunable child subplans. */
+		case T_Append:
+			if (ExecGetAppendLockRels((Append *) node, context))
+				return true;
+			break;
+		case T_MergeAppend:
+			if (ExecGetMergeAppendLockRels((MergeAppend *) node,
+												context))
+				return true;
+			break;
+
+		/*
+		 * And these manipulate relations that must be added to context->lockrels.
+		 */
+		case T_SeqScan:
+		case T_SampleScan:
+		case T_IndexScan:
+		case T_IndexOnlyScan:
+		case T_BitmapIndexScan:
+		case T_BitmapHeapScan:
+		case T_TidScan:
+		case T_TidRangeScan:
+		case T_ForeignScan:
+		case T_SubqueryScan:
+		case T_CustomScan:
+			if (ExecGetScanLockRels((Scan *) node, context))
+				return true;
+			break;
+		case T_ModifyTable:
+			if (ExecGetModifyTableLockRels((ModifyTable *) node, context))
+				return true;
+			/* plan_tree_walker() will visit the subplan (outerNode) */
+			break;
+
+		default:
+			break;
+	}
+
+	/* Recurse to subnodes. */
+	return plan_tree_walker(node, ExecGetLockRels, (void *) context);
+}
+
+/*
+ * ExecGetScanLockRels
+ * 		Do ExecGetLockRels()'s work for a leaf Scan node
+ */
+static bool
+ExecGetScanLockRels(Scan *scan, ExecGetLockRelsContext *context)
+{
+	switch (nodeTag(scan))
+	{
+		case T_ForeignScan:
+			{
+				ForeignScan *fscan = (ForeignScan *) scan;
+
+				context->lockrels = bms_add_members(context->lockrels,
+													fscan->fs_relids);
+			}
+			break;
+
+		case T_SubqueryScan:
+			{
+				SubqueryScan *sscan = (SubqueryScan *) scan;
+
+				(void) ExecGetLockRels((Plan *) sscan->subplan, context);
+			}
+			break;
+
+		case T_CustomScan:
+			{
+				CustomScan *cscan = (CustomScan *) scan;
+				ListCell *lc;
+
+				context->lockrels = bms_add_members(context->lockrels,
+													cscan->custom_relids);
+				foreach(lc, cscan->custom_plans)
+				{
+					(void) ExecGetLockRels((Plan *) lfirst(lc), context);
+				}
+			}
+			break;
+
+		default:
+			context->lockrels = bms_add_member(context->lockrels,
+											   scan->scanrelid);
+			break;
+	}
+
+	return true;
+}
 
 /* ----------------------------------------------------------------
  *		ExecutorStart
@@ -805,6 +1005,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 {
 	CmdType		operation = queryDesc->operation;
 	PlannedStmt *plannedstmt = queryDesc->plannedstmt;
+	ExecLockRelsInfo *execlockrelsinfo = queryDesc->execlockrelsinfo;
 	Plan	   *plan = plannedstmt->planTree;
 	List	   *rangeTable = plannedstmt->rtable;
 	EState	   *estate = queryDesc->estate;
@@ -824,6 +1025,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 	ExecInitRangeTable(estate, rangeTable);
 
 	estate->es_plannedstmt = plannedstmt;
+	estate->es_execlockrelsinfo = execlockrelsinfo;
 
 	/*
 	 * Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index 5dd8ab7db2..02f2c27fdf 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -66,6 +66,7 @@
 #define PARALLEL_KEY_QUERY_TEXT		UINT64CONST(0xE000000000000008)
 #define PARALLEL_KEY_JIT_INSTRUMENTATION UINT64CONST(0xE000000000000009)
 #define PARALLEL_KEY_WAL_USAGE			UINT64CONST(0xE00000000000000A)
+#define PARALLEL_KEY_EXECLOCKRELSINFO	UINT64CONST(0xE00000000000000B)
 
 #define PARALLEL_TUPLE_QUEUE_SIZE		65536
 
@@ -182,6 +183,7 @@ ExecSerializePlan(Plan *plan, EState *estate)
 	pstmt->transientPlan = false;
 	pstmt->dependsOnRole = false;
 	pstmt->parallelModeNeeded = false;
+	pstmt->containsInitialPruning = false;
 	pstmt->planTree = plan;
 	pstmt->rtable = estate->es_range_table;
 	pstmt->resultRelations = NIL;
@@ -596,12 +598,15 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	FixedParallelExecutorState *fpes;
 	char	   *pstmt_data;
 	char	   *pstmt_space;
+	char	   *execlockrelsinfo_data;
+	char	   *execlockrelsinfo_space;
 	char	   *paramlistinfo_space;
 	BufferUsage *bufusage_space;
 	WalUsage   *walusage_space;
 	SharedExecutorInstrumentation *instrumentation = NULL;
 	SharedJitInstrumentation *jit_instrumentation = NULL;
 	int			pstmt_len;
+	int			execlockrelsinfo_len;
 	int			paramlistinfo_len;
 	int			instrumentation_len = 0;
 	int			jit_instrumentation_len = 0;
@@ -630,6 +635,7 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 
 	/* Fix up and serialize plan to be sent to workers. */
 	pstmt_data = ExecSerializePlan(planstate->plan, estate);
+	execlockrelsinfo_data = nodeToString(estate->es_execlockrelsinfo);
 
 	/* Create a parallel context. */
 	pcxt = CreateParallelContext("postgres", "ParallelQueryMain", nworkers);
@@ -656,6 +662,11 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	shm_toc_estimate_chunk(&pcxt->estimator, pstmt_len);
 	shm_toc_estimate_keys(&pcxt->estimator, 1);
 
+	/* Estimate space for serialized ExecLockRelsInfo. */
+	execlockrelsinfo_len = strlen(execlockrelsinfo_data) + 1;
+	shm_toc_estimate_chunk(&pcxt->estimator, execlockrelsinfo_len);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
 	/* Estimate space for serialized ParamListInfo. */
 	paramlistinfo_len = EstimateParamListSpace(estate->es_param_list_info);
 	shm_toc_estimate_chunk(&pcxt->estimator, paramlistinfo_len);
@@ -750,6 +761,12 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	memcpy(pstmt_space, pstmt_data, pstmt_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PLANNEDSTMT, pstmt_space);
 
+	/* Store serialized ExecLockRelsInfo */
+	execlockrelsinfo_space = shm_toc_allocate(pcxt->toc, execlockrelsinfo_len);
+	memcpy(execlockrelsinfo_space, execlockrelsinfo_data, execlockrelsinfo_len);
+	shm_toc_insert(pcxt->toc, PARALLEL_KEY_EXECLOCKRELSINFO,
+				   execlockrelsinfo_space);
+
 	/* Store serialized ParamListInfo. */
 	paramlistinfo_space = shm_toc_allocate(pcxt->toc, paramlistinfo_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARAMLISTINFO, paramlistinfo_space);
@@ -1231,8 +1248,10 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 						 int instrument_options)
 {
 	char	   *pstmtspace;
+	char	   *execlockrelsinfospace;
 	char	   *paramspace;
 	PlannedStmt *pstmt;
+	ExecLockRelsInfo *execlockrelsinfo;
 	ParamListInfo paramLI;
 	char	   *queryString;
 
@@ -1243,12 +1262,17 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 	pstmtspace = shm_toc_lookup(toc, PARALLEL_KEY_PLANNEDSTMT, false);
 	pstmt = (PlannedStmt *) stringToNode(pstmtspace);
 
+	/* Reconstruct leader-supplied ExecLockRelsInfo. */
+	execlockrelsinfospace = shm_toc_lookup(toc, PARALLEL_KEY_EXECLOCKRELSINFO,
+										  false);
+	execlockrelsinfo = (ExecLockRelsInfo *) stringToNode(execlockrelsinfospace);
+
 	/* Reconstruct ParamListInfo. */
 	paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
 	paramLI = RestoreParamList(&paramspace);
 
 	/* Create a QueryDesc for the query. */
-	return CreateQueryDesc(pstmt,
+	return CreateQueryDesc(pstmt, execlockrelsinfo,
 						   queryString,
 						   GetActiveSnapshot(), InvalidSnapshot,
 						   receiver, paramLI, NULL, instrument_options);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 7ff5a95f05..fddc97280e 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -24,6 +24,7 @@
 #include "mb/pg_wchar.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
+#include "parser/parsetree.h"
 #include "partitioning/partbounds.h"
 #include "partitioning/partdesc.h"
 #include "partitioning/partprune.h"
@@ -183,8 +184,13 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
 												  int maxfieldlen);
 static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
 static PartitionPruneState *ExecCreatePartitionPruneState(PlanState *planstate,
-							  PartitionPruneInfo *partitionpruneinfo);
-static Bitmapset *ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate);
+							  PartitionPruneInfo *partitionpruneinfo,
+							  bool consider_initial_steps,
+							  bool consider_exec_steps,
+							  List *rtable, ExprContext *econtext,
+							  PartitionDirectory partdir);
+static Bitmapset *ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate,
+								PartitionPruneInfo *pruneinfo);
 static void ExecInitPruningContext(PartitionPruneContext *context,
 								   List *pruning_steps,
 								   PartitionDesc partdesc,
@@ -1483,8 +1489,9 @@ adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri)
  * considered to be a stable expression, it can change value from one plan
  * node scan to the next during query execution.  Stable comparison
  * expressions that don't involve such Params allow partition pruning to be
- * done once during executor startup.  Expressions that do involve such Params
- * require us to prune separately for each scan of the parent plan node.
+ * done once during executor startup or even before during ExecutorGetLockRels().
+ * Expressions that do involve such Params require us to prune separately for
+ * each scan of the parent plan node.
  *
  * Note that pruning away unneeded subplans during executor startup has the
  * added benefit of not having to initialize the unneeded subplans at all.
@@ -1496,10 +1503,17 @@ adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri)
  *		Creates the PartitionPruneState required by each of the two pruning
  *		functions.  Details stored include how to map the partition index
  *		returned by the partition pruning code into subplan indexes.  Also
- *		determines the set of initially valid subplans by performing initial
- *		pruning steps, only which need be initialized by the caller such as
- *		ExecInitAppend.  Maps in PartitionPruneState are updated to account
- *		for initial pruning having eliminated some of the subplans, if any.
+ *		determines the set of initially valid subplans by either looking that
+ *		up in the plan node's PlanInitPruningOutput, if one is found in
+ *		EState.es_execlockrelsinfo, or by performing initial pruning steps.
+ *		Only the subplans included in that need be initialized by the caller
+ *		such as ExecInitAppend.  Maps in PartitionPruneState are updated to
+ *		account for initial pruning having eliminated some of the subplans,
+ *		if any.
+ *
+ * ExecGetLockRelsDoInitialPruning:
+ *		Do initial pruning as part of ExecGetLockRels() on the parent plan
+ *		node.
  *
  * ExecFindMatchingSubPlans:
  *		Returns indexes of matching subplans after evaluating all available
@@ -1514,9 +1528,10 @@ adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri)
  * ExecInitPartitionPruning
  * 		Initialize data structure needed for run-time partition pruning
  *
- * Initial pruning can be done immediately, so it is done here if needed and
- * the set of surviving partition subplans' indexes are added to the output
- * parameter *initially_valid_subplans.
+ * Initial pruning can be done immediately, so it is done here unless it has
+ * already been done by ExecGetLockRelsDoInitialPruning(), and the set of
+ * surviving partition subplans' indexes is added to the output parameter
+ * *initially_valid_subplans.
  *
  * If subplans are indeed pruned, subplan_map arrays contained in the returned
  * PartitionPruneState are re-sequenced to not count those, though only if the
@@ -1530,22 +1545,57 @@ ExecInitPartitionPruning(PlanState *planstate,
 {
 	PartitionPruneState *prunestate;
 	EState *estate = planstate->state;
+	Plan   *plan = planstate->plan;
+	PlanInitPruningOutput *initPruningOutput = NULL;
+	bool	do_pruning = (pruneinfo->needs_init_pruning ||
+						  pruneinfo->needs_exec_pruning);
 
-	/* We may need an expression context to evaluate partition exprs */
-	ExecAssignExprContext(estate, planstate);
+	/* Retrieve the parent plan's PlanInitPruningOutput, if any. */
+	if (estate->es_execlockrelsinfo)
+	{
+		initPruningOutput = (PlanInitPruningOutput *)
+			ExecFetchPlanInitPruningOutput(estate->es_execlockrelsinfo, plan);
 
-	/*
-	 * Create the working data structure for pruning.
-	 */
-	prunestate = ExecCreatePartitionPruneState(planstate, pruneinfo);
+		Assert(initPruningOutput != NULL &&
+			   IsA(initPruningOutput, PlanInitPruningOutput));
+		/* No need to do initial pruning again, only exec pruning. */
+		do_pruning = pruneinfo->needs_exec_pruning;
+	}
+
+	prunestate = NULL;
+	if (do_pruning)
+	{
+		/* We may need an expression context to evaluate partition exprs */
+		ExecAssignExprContext(estate, planstate);
+
+		/* For data reading, executor always omits detached partitions */
+		if (estate->es_partition_directory == NULL)
+			estate->es_partition_directory =
+				CreatePartitionDirectory(estate->es_query_cxt, false);
+
+		/*
+		 * Create the working data structure for pruning.  No need to consider
+		 * initial pruning steps if we have a PlanInitPruningOutput.
+		 */
+		prunestate = ExecCreatePartitionPruneState(planstate, pruneinfo,
+												   initPruningOutput == NULL, true,
+												   NIL, planstate->ps_ExprContext,
+												   estate->es_partition_directory);
+	}
 
 	/*
 	 * Perform an initial partition prune, if required.
 	 */
-	if (prunestate->do_initial_prune)
+	if (initPruningOutput)
+	{
+		/* ExecGetLockRelsDoInitialPruning() already did it for us! */
+		*initially_valid_subplans = initPruningOutput->initially_valid_subplans;
+	}
+	else if (prunestate && prunestate->do_initial_prune)
 	{
 		/* Determine which subplans survive initial pruning */
-		*initially_valid_subplans = ExecFindInitialMatchingSubPlans(prunestate);
+		*initially_valid_subplans = ExecFindInitialMatchingSubPlans(prunestate,
+																	pruneinfo);
 	}
 	else
 	{
@@ -1563,7 +1613,7 @@ ExecInitPartitionPruning(PlanState *planstate,
 	 * invalid data in prunestate, because that data won't be consulted again
 	 * (cf initial Assert in ExecFindMatchingSubPlans).
 	 */
-	if (prunestate->do_exec_prune &&
+	if (prunestate && prunestate->do_exec_prune &&
 		bms_num_members(*initially_valid_subplans) < n_total_subplans)
 		PartitionPruneStateFixSubPlanMap(prunestate,
 										 *initially_valid_subplans,
@@ -1572,12 +1622,75 @@ ExecInitPartitionPruning(PlanState *planstate,
 	return prunestate;
 }
 
+/*
+ * ExecGetLockRelsDoInitialPruning
+ *		Perform initial pruning as part of doing ExecGetLockRels() on the parent
+ *		plan node
+ */
+Bitmapset *
+ExecGetLockRelsDoInitialPruning(Plan *plan, ExecGetLockRelsContext *context,
+								PartitionPruneInfo *pruneinfo)
+{
+	List		 *rtable = context->stmt->rtable;
+	ParamListInfo params = context->params;
+	ExprContext	 *econtext;
+	PartitionDirectory pdir;
+	MemoryContext oldcontext,
+				  tmpcontext;
+	PartitionPruneState *prunestate;
+	PlanInitPruningOutput *initPruningOutput;
+
+	/*
+	 * A temporary context to allocate stuff needed to run the pruning steps.
+	 */
+	tmpcontext = AllocSetContextCreate(CurrentMemoryContext,
+									   "initial pruning working data",
+									   ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(tmpcontext);
+
+	/*
+	 * PartitionDirectory to look up partition descriptors, which omits
+	 * detached partitions, just like in the executor proper.
+	 */
+	pdir = CreatePartitionDirectory(CurrentMemoryContext, false);
+
+	/*
+	 * We don't yet have a PlanState for the parent plan node, so must create
+	 * a standalone ExprContext to evaluate pruning expressions, equipped with
+	 * the information about the EXTERN parameters that the caller passed us.
+	 * Note that that's okay because the initial pruning steps do not contain
+	 * anything that requires the execution to have started.
+	 */
+	econtext = CreateStandaloneExprContext();
+	econtext->ecxt_param_list_info = params;
+	prunestate = ExecCreatePartitionPruneState(NULL, pruneinfo,
+											   true, false,
+											   rtable, econtext,
+											   pdir);
+	MemoryContextSwitchTo(oldcontext);
+
+	/* Do the pruning and populate a PlanInitPruningOutput for this node. */
+	initPruningOutput = makeNode(PlanInitPruningOutput);
+	initPruningOutput->initially_valid_subplans =
+		ExecFindInitialMatchingSubPlans(prunestate, pruneinfo);
+	ExecStorePlanInitPruningOutput(context, initPruningOutput, plan);
+
+	FreeExprContext(econtext, true);
+	DestroyPartitionDirectory(pdir);
+	MemoryContextDelete(tmpcontext);
+
+	return initPruningOutput->initially_valid_subplans;
+}
+
 /*
  * ExecCreatePartitionPruneState
  *		Build the data structure required for calling
  *		ExecFindInitialMatchingSubPlans and ExecFindMatchingSubPlans.
  *
- * 'planstate' is the parent plan node's execution state.
+ * 'planstate', if not NULL, is the parent plan node's execution state.  It
+ * can be NULL if being called before ExecutorStart(), in which case,
+ * 'rtable' (range table), 'econtext', and 'partdir' must be explicitly
+ * provided.
  *
  * 'partitionpruneinfo' is a PartitionPruneInfo as generated by
  * make_partition_pruneinfo.  Here we build a PartitionPruneState containing a
@@ -1592,19 +1705,20 @@ ExecInitPartitionPruning(PlanState *planstate,
  */
 static PartitionPruneState *
 ExecCreatePartitionPruneState(PlanState *planstate,
-							  PartitionPruneInfo *partitionpruneinfo)
+							  PartitionPruneInfo *partitionpruneinfo,
+							  bool consider_initial_steps,
+							  bool consider_exec_steps,
+							  List *rtable, ExprContext *econtext,
+							  PartitionDirectory partdir)
 {
-	EState	   *estate = planstate->state;
+	EState	   *estate = planstate ? planstate->state : NULL;
 	PartitionPruneState *prunestate;
 	int			n_part_hierarchies;
 	ListCell   *lc;
 	int			i;
-	ExprContext	*econtext = planstate->ps_ExprContext;
 
-	/* For data reading, executor always omits detached partitions */
-	if (estate->es_partition_directory == NULL)
-		estate->es_partition_directory =
-			CreatePartitionDirectory(estate->es_query_cxt, false);
+	Assert((estate != NULL) ||
+			(partdir != NULL && econtext != NULL && rtable != NIL));
 
 	n_part_hierarchies = list_length(partitionpruneinfo->prune_infos);
 	Assert(n_part_hierarchies > 0);
@@ -1655,19 +1769,48 @@ ExecCreatePartitionPruneState(PlanState *planstate,
 			PartitionedRelPruneInfo *pinfo = lfirst_node(PartitionedRelPruneInfo, lc2);
 			PartitionedRelPruningData *pprune = &prunedata->partrelprunedata[j];
 			Relation	partrel;
+			bool		close_partrel = false;
 			PartitionDesc partdesc;
 			PartitionKey partkey;
 
 			/*
-			 * We can rely on the copies of the partitioned table's partition
-			 * key and partition descriptor appearing in its relcache entry,
-			 * because that entry will be held open and locked for the
-			 * duration of this executor run.
+			 * Must open the relation ourselves when called before
+			 * execution has started, for example, when called during
+			 * ExecGetLockRels() on a cached plan.  In that case,
+			 * sub-partitions must be locked, because AcquirePlannerLocks()
+			 * would not have seen them. (1st relation in a partrelpruneinfos
+			 * list is always the root partitioned table appearing in the
+			 * query, which AcquirePlannerLocks() would have locked; the
+			 * Assert in relation_open() guards that assumption.)
+			 */
+			if (estate == NULL)
+			{
+				RangeTblEntry *rte = rt_fetch(pinfo->rtindex, rtable);
+				int		lockmode = (j == 0) ? NoLock : rte->rellockmode;
+
+				partrel = table_open(rte->relid, lockmode);
+				close_partrel = true;
+			}
+			else
+				partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
+
+			/*
+			 * We can rely on the copy of the partitioned table's partition
+			 * key in its relcache entry, because it can't change (or
+			 * get destroyed) as long as the relation is locked.  Partition
+			 * descriptor is taken from the PartitionDirectory associated with
+			 * the table that is held open long enough for the descriptor to
+			 * remain valid while it's used to perform the pruning steps.
 			 */
-			partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
 			partkey = RelationGetPartitionKey(partrel);
-			partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
-												partrel);
+			partdesc = PartitionDirectoryLookup(partdir, partrel);
+
+			/*
+			 * Must close partrel, keeping the lock taken, if we're not using
+			 * EState's entry.
+			 */
+			if (close_partrel)
+				table_close(partrel, NoLock);
 
 			/*
 			 * Initialize the subplan_map and subpart_map.
@@ -1769,7 +1912,7 @@ ExecCreatePartitionPruneState(PlanState *planstate,
 			 * Initialize pruning contexts as needed.
 			 */
 			pprune->initial_pruning_steps = pinfo->initial_pruning_steps;
-			if (pinfo->initial_pruning_steps)
+			if (consider_initial_steps && pinfo->initial_pruning_steps)
 			{
 				ExecInitPruningContext(&pprune->initial_context,
 									   pinfo->initial_pruning_steps,
@@ -1779,7 +1922,7 @@ ExecCreatePartitionPruneState(PlanState *planstate,
 				prunestate->do_initial_prune = true;
 			}
 			pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
-			if (pinfo->exec_pruning_steps)
+			if (consider_exec_steps && pinfo->exec_pruning_steps)
 			{
 				ExecInitPruningContext(&pprune->exec_context,
 									   pinfo->exec_pruning_steps,
@@ -1893,7 +2036,8 @@ ExecInitPruningContext(PartitionPruneContext *context,
  * is required.
  */
 static Bitmapset *
-ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate)
+ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate,
+								PartitionPruneInfo *pruneinfo)
 {
 	Bitmapset  *result = NULL;
 	MemoryContext oldcontext;
@@ -1903,8 +2047,8 @@ ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate)
 	Assert(prunestate->do_initial_prune);
 
 	/*
-	 * Switch to a temp context to avoid leaking memory in the executor's
-	 * query-lifespan memory context.
+	 * Switch to a temp context to avoid leaking memory in the longer-term
+	 * memory context.
 	 */
 	oldcontext = MemoryContextSwitchTo(prunestate->prune_context);
 
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 9df1f81ea8..7246f9175f 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -119,6 +119,7 @@ CreateExecutorState(void)
 	estate->es_relations = NULL;
 	estate->es_rowmarks = NULL;
 	estate->es_plannedstmt = NULL;
+	estate->es_execlockrelsinfo = NULL;
 
 	estate->es_junkFilter = NULL;
 
@@ -785,6 +786,13 @@ ExecGetRangeTableRelation(EState *estate, Index rti)
 
 	Assert(rti > 0 && rti <= estate->es_range_table_size);
 
+	/*
+	 * Cross-check that AcquireExecutorLocks() hasn't missed any relation that
+	 * it should have locked.
+	 */
+	Assert(estate->es_execlockrelsinfo == NULL ||
+		   bms_is_member(rti, estate->es_execlockrelsinfo->lockrels));
+
 	rel = estate->es_relations[rti - 1];
 	if (rel == NULL)
 	{
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index f9460ae506..a2182a6b1f 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -844,7 +844,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
 	else
 		dest = None_Receiver;
 
-	es->qd = CreateQueryDesc(es->stmt,
+	es->qd = CreateQueryDesc(es->stmt, NULL,
 							 fcache->src,
 							 GetActiveSnapshot(),
 							 InvalidSnapshot,
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 5b6d3eb23b..9c6f907687 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -94,6 +94,55 @@ static bool ExecAppendAsyncRequest(AppendState *node, TupleTableSlot **result);
 static void ExecAppendAsyncEventWait(AppendState *node);
 static void classify_matching_subplans(AppendState *node);
 
+/* ----------------------------------------------------------------
+ *		ExecGetAppendLockRels
+ *			Do ExecGetLockRels()'s work for an Append plan
+ * ----------------------------------------------------------------
+ */
+bool
+ExecGetAppendLockRels(Append *node, ExecGetLockRelsContext *context)
+{
+	PartitionPruneInfo *pruneinfo = node->part_prune_info;
+
+	/*
+	 * Must always lock all the partitioned tables whose direct and indirect
+	 * partitions will be scanned by this Append.
+	 */
+	context->lockrels = bms_add_members(context->lockrels,
+										node->partitioned_rels);
+
+	/*
+	 * Now recurse to subplans to add relations scanned therein.
+	 *
+	 * If initial pruning can be done, do that now and only recurse to the
+	 * surviving subplans.
+	 */
+	if (pruneinfo && pruneinfo->needs_init_pruning)
+	{
+		List	   *subplans = node->appendplans;
+		Bitmapset  *validsubplans;
+		int			i;
+
+		validsubplans = ExecGetLockRelsDoInitialPruning((Plan *) node,
+														context, pruneinfo);
+
+		/* Recurse to surviving subplans. */
+		i = -1;
+		while ((i = bms_next_member(validsubplans, i)) >= 0)
+		{
+			Plan   *subplan = list_nth(subplans, i);
+
+			(void) ExecGetLockRels(subplan, context);
+		}
+
+		/* done with this node */
+		return true;
+	}
+
+	/* Tell the caller to recurse to *all* the subplans. */
+	return false;
+}
+
 /* ----------------------------------------------------------------
  *		ExecInitAppend
  *
@@ -155,7 +204,8 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 		 * subplan, we can fill as_valid_subplans immediately, preventing
 		 * later calls to ExecFindMatchingSubPlans.
 		 */
-		if (!prunestate->do_exec_prune && nplans > 0)
+		if (appendstate->as_prune_state == NULL ||
+			(!appendstate->as_prune_state->do_exec_prune && nplans > 0))
 			appendstate->as_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
 	}
 	else
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index 9a9f29e845..4b04fcdbc2 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -54,6 +54,55 @@ typedef int32 SlotNumber;
 static TupleTableSlot *ExecMergeAppend(PlanState *pstate);
 static int	heap_compare_slots(Datum a, Datum b, void *arg);
 
+/* ----------------------------------------------------------------
+ *		ExecGetMergeAppendLockRels
+ *			Do ExecGetLockRels()'s work for a MergeAppend plan
+ * ----------------------------------------------------------------
+ */
+bool
+ExecGetMergeAppendLockRels(MergeAppend *node, ExecGetLockRelsContext *context)
+{
+	PartitionPruneInfo *pruneinfo = node->part_prune_info;
+
+	/*
+	 * Must always lock all the partitioned tables whose direct and indirect
+	 * partitions will be scanned by this MergeAppend.
+	 */
+	context->lockrels = bms_add_members(context->lockrels,
+										node->partitioned_rels);
+
+	/*
+	 * Now recurse to subplans to add relations scanned therein.
+	 *
+	 * If initial pruning can be done, do that now and only recurse to the
+	 * surviving subplans.
+	 */
+	if (pruneinfo && pruneinfo->needs_init_pruning)
+	{
+		List	   *subplans = node->mergeplans;
+		Bitmapset  *validsubplans;
+		int			i;
+
+		validsubplans = ExecGetLockRelsDoInitialPruning((Plan *) node,
+														context, pruneinfo);
+
+		/* Recurse to surviving subplans. */
+		i = -1;
+		while ((i = bms_next_member(validsubplans, i)) >= 0)
+		{
+			Plan   *subplan = list_nth(subplans, i);
+
+			(void) ExecGetLockRels(subplan, context);
+		}
+
+		/* done with this node */
+		return true;
+	}
+
+	/* Tell the caller to recurse to *all* the subplans. */
+	return false;
+}
+
 
 /* ----------------------------------------------------------------
  *		ExecInitMergeAppend
@@ -103,7 +152,8 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
 		 * subplan, we can fill as_valid_subplans immediately, preventing
 		 * later calls to ExecFindMatchingSubPlans.
 		 */
-		if (!prunestate->do_exec_prune && nplans > 0)
+		if (mergestate->ms_prune_state == NULL ||
+			(!mergestate->ms_prune_state->do_exec_prune && nplans > 0))
 			mergestate->ms_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
 	}
 	else
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 701fe05296..23df3efef0 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -3008,6 +3008,31 @@ ExecLookupResultRelByOid(ModifyTableState *node, Oid resultoid,
 	return NULL;
 }
 
+/*
+ * ExecGetModifyTableLockRels
+ * 		Do ExecGetLockRels()'s work for a ModifyTable plan
+ */
+bool
+ExecGetModifyTableLockRels(ModifyTable *plan, ExecGetLockRelsContext *context)
+{
+	ListCell *lc;
+
+	/* First add the result relation RTIs mentioned in the node. */
+	if (plan->rootRelation > 0)
+		context->lockrels = bms_add_member(context->lockrels,
+										   plan->rootRelation);
+	context->lockrels = bms_add_member(context->lockrels,
+									   plan->nominalRelation);
+	foreach(lc, plan->resultRelations)
+	{
+		context->lockrels = bms_add_member(context->lockrels,
+										   lfirst_int(lc));
+	}
+
+	/* Tell the caller to recurse to the subplan (outerPlan(plan)). */
+	return false;
+}
+
 /* ----------------------------------------------------------------
  *		ExecInitModifyTable
  * ----------------------------------------------------------------
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index a82e986667..2107009591 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1578,6 +1578,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 	CachedPlanSource *plansource;
 	CachedPlan *cplan;
 	List	   *stmt_list;
+	List	   *execlockrelsinfo_list;
 	char	   *query_string;
 	Snapshot	snapshot;
 	MemoryContext oldcontext;
@@ -1659,6 +1660,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 	/* Replan if needed, and increment plan refcount for portal */
 	cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv);
 	stmt_list = cplan->stmt_list;
+	execlockrelsinfo_list = cplan->execlockrelsinfo_list;
 
 	if (!plan->saved)
 	{
@@ -1670,6 +1672,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 		 */
 		oldcontext = MemoryContextSwitchTo(portal->portalContext);
 		stmt_list = copyObject(stmt_list);
+		execlockrelsinfo_list = copyObject(execlockrelsinfo_list);
 		MemoryContextSwitchTo(oldcontext);
 		ReleaseCachedPlan(cplan, NULL);
 		cplan = NULL;			/* portal shouldn't depend on cplan */
@@ -1683,6 +1686,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 					  query_string,
 					  plansource->commandTag,
 					  stmt_list,
+					  execlockrelsinfo_list,
 					  cplan);
 
 	/*
@@ -2473,7 +2477,9 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 	{
 		CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
 		List	   *stmt_list;
-		ListCell   *lc2;
+		List	   *execlockrelsinfo_list;
+		ListCell   *lc2,
+				   *lc3;
 
 		spicallbackarg.query = plansource->query_string;
 
@@ -2552,6 +2558,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 							  plan_owner, _SPI_current->queryEnv);
 
 		stmt_list = cplan->stmt_list;
+		execlockrelsinfo_list = cplan->execlockrelsinfo_list;
 
 		/*
 		 * If we weren't given a specific snapshot to use, and the statement
@@ -2589,9 +2596,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 			}
 		}
 
-		foreach(lc2, stmt_list)
+		forboth(lc2, stmt_list, lc3, execlockrelsinfo_list)
 		{
 			PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+			ExecLockRelsInfo *execlockrelsinfo = lfirst_node(ExecLockRelsInfo, lc3);
 			bool		canSetTag = stmt->canSetTag;
 			DestReceiver *dest;
 
@@ -2663,7 +2671,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 				else
 					snap = InvalidSnapshot;
 
-				qdesc = CreateQueryDesc(stmt,
+				qdesc = CreateQueryDesc(stmt, execlockrelsinfo,
 										plansource->query_string,
 										snap, crosscheck_snapshot,
 										dest,
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index dc68a12486..1b94d7c881 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -68,6 +68,13 @@
 		} \
 	} while (0)
 
+/* Copy a field that is an array with numElem ints */
+#define COPY_INT_ARRAY(fldname, numElem) \
+	do { \
+		newnode->fldname = (numElem) > 0 ? palloc((numElem) * sizeof(int)) : NULL; \
+		if ((numElem) > 0) memcpy(newnode->fldname, from->fldname, sizeof(int) * (numElem)); \
+	} while (0)
+
 /* Copy a parse location field (for Copy, this is same as scalar case) */
 #define COPY_LOCATION_FIELD(fldname) \
 	(newnode->fldname = from->fldname)
@@ -94,8 +101,10 @@ _copyPlannedStmt(const PlannedStmt *from)
 	COPY_SCALAR_FIELD(transientPlan);
 	COPY_SCALAR_FIELD(dependsOnRole);
 	COPY_SCALAR_FIELD(parallelModeNeeded);
+	COPY_SCALAR_FIELD(containsInitialPruning);
 	COPY_SCALAR_FIELD(jitFlags);
 	COPY_NODE_FIELD(planTree);
+	COPY_SCALAR_FIELD(numPlanNodes);
 	COPY_NODE_FIELD(rtable);
 	COPY_NODE_FIELD(resultRelations);
 	COPY_NODE_FIELD(appendRelations);
@@ -1281,6 +1290,8 @@ _copyPartitionPruneInfo(const PartitionPruneInfo *from)
 	PartitionPruneInfo *newnode = makeNode(PartitionPruneInfo);
 
 	COPY_NODE_FIELD(prune_infos);
+	COPY_SCALAR_FIELD(needs_init_pruning);
+	COPY_SCALAR_FIELD(needs_exec_pruning);
 	COPY_BITMAPSET_FIELD(other_subplans);
 
 	return newnode;
@@ -4944,6 +4955,33 @@ _copyExtensibleNode(const ExtensibleNode *from)
 	return newnode;
 }
 
+/* ****************************************************************
+ *					execnodes.h copy functions
+ * ****************************************************************
+ */
+static ExecLockRelsInfo *
+_copyExecLockRelsInfo(const ExecLockRelsInfo *from)
+{
+	ExecLockRelsInfo *newnode = makeNode(ExecLockRelsInfo);
+
+	COPY_BITMAPSET_FIELD(lockrels);
+	COPY_SCALAR_FIELD(numPlanNodes);
+	COPY_NODE_FIELD(initPruningOutputs);
+	COPY_INT_ARRAY(ipoIndexes, from->numPlanNodes);
+
+	return newnode;
+}
+
+static PlanInitPruningOutput *
+_copyPlanInitPruningOutput(const PlanInitPruningOutput *from)
+{
+	PlanInitPruningOutput *newnode = makeNode(PlanInitPruningOutput);
+
+	COPY_BITMAPSET_FIELD(initially_valid_subplans);
+
+	return newnode;
+}
+
 /* ****************************************************************
  *					value.h copy functions
  * ****************************************************************
@@ -4998,7 +5036,6 @@ _copyBitString(const BitString *from)
 	return newnode;
 }
 
-
 static ForeignKeyCacheInfo *
 _copyForeignKeyCacheInfo(const ForeignKeyCacheInfo *from)
 {
@@ -5947,6 +5984,16 @@ copyObjectImpl(const void *from)
 			retval = _copyPublicationTable(from);
 			break;
 
+			/*
+			 * EXECUTION NODES
+			 */
+		case T_ExecLockRelsInfo:
+			retval = _copyExecLockRelsInfo(from);
+			break;
+		case T_PlanInitPruningOutput:
+			retval = _copyPlanInitPruningOutput(from);
+			break;
+
 			/*
 			 * MISCELLANEOUS NODES
 			 */
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index bc178d53bf..6c404c8664 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -312,8 +312,10 @@ _outPlannedStmt(StringInfo str, const PlannedStmt *node)
 	WRITE_BOOL_FIELD(transientPlan);
 	WRITE_BOOL_FIELD(dependsOnRole);
 	WRITE_BOOL_FIELD(parallelModeNeeded);
+	WRITE_BOOL_FIELD(containsInitialPruning);
 	WRITE_INT_FIELD(jitFlags);
 	WRITE_NODE_FIELD(planTree);
+	WRITE_INT_FIELD(numPlanNodes);
 	WRITE_NODE_FIELD(rtable);
 	WRITE_NODE_FIELD(resultRelations);
 	WRITE_NODE_FIELD(appendRelations);
@@ -1007,6 +1009,8 @@ _outPartitionPruneInfo(StringInfo str, const PartitionPruneInfo *node)
 	WRITE_NODE_TYPE("PARTITIONPRUNEINFO");
 
 	WRITE_NODE_FIELD(prune_infos);
+	WRITE_BOOL_FIELD(needs_init_pruning);
+	WRITE_BOOL_FIELD(needs_exec_pruning);
 	WRITE_BITMAPSET_FIELD(other_subplans);
 }
 
@@ -2702,6 +2706,31 @@ _outExtensibleNode(StringInfo str, const ExtensibleNode *node)
 	methods->nodeOut(str, node);
 }
 
+/*****************************************************************************
+ *
+ *	Stuff from execnodes.h
+ *
+ *****************************************************************************/
+
+static void
+_outExecLockRelsInfo(StringInfo str, const ExecLockRelsInfo *node)
+{
+	WRITE_NODE_TYPE("EXECLOCKRELSINFO");
+
+	WRITE_BITMAPSET_FIELD(lockrels);
+	WRITE_INT_FIELD(numPlanNodes);
+	WRITE_NODE_FIELD(initPruningOutputs);
+	WRITE_INT_ARRAY(ipoIndexes, node->numPlanNodes);
+}
+
+static void
+_outPlanInitPruningOutput(StringInfo str, const PlanInitPruningOutput *node)
+{
+	WRITE_NODE_TYPE("PARTITIONINITPRUNINGOUTPUT");
+
+	WRITE_BITMAPSET_FIELD(initially_valid_subplans);
+}
+
 /*****************************************************************************
  *
  *	Stuff from parsenodes.h.
@@ -4543,6 +4572,16 @@ outNode(StringInfo str, const void *obj)
 				_outPartitionRangeDatum(str, obj);
 				break;
 
+				/*
+				 * EXECUTION NODES
+				 */
+			case T_ExecLockRelsInfo:
+				_outExecLockRelsInfo(str, obj);
+				break;
+			case T_PlanInitPruningOutput:
+				_outPlanInitPruningOutput(str, obj);
+				break;
+
 			default:
 
 				/*
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 3c673c42d5..863f082729 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -1585,8 +1585,10 @@ _readPlannedStmt(void)
 	READ_BOOL_FIELD(transientPlan);
 	READ_BOOL_FIELD(dependsOnRole);
 	READ_BOOL_FIELD(parallelModeNeeded);
+	READ_BOOL_FIELD(containsInitialPruning);
 	READ_INT_FIELD(jitFlags);
 	READ_NODE_FIELD(planTree);
+	READ_INT_FIELD(numPlanNodes);
 	READ_NODE_FIELD(rtable);
 	READ_NODE_FIELD(resultRelations);
 	READ_NODE_FIELD(appendRelations);
@@ -2537,6 +2539,8 @@ _readPartitionPruneInfo(void)
 	READ_LOCALS(PartitionPruneInfo);
 
 	READ_NODE_FIELD(prune_infos);
+	READ_BOOL_FIELD(needs_init_pruning);
+	READ_BOOL_FIELD(needs_exec_pruning);
 	READ_BITMAPSET_FIELD(other_subplans);
 
 	READ_DONE();
@@ -2706,6 +2710,35 @@ _readPartitionRangeDatum(void)
 	READ_DONE();
 }
 
+/*
+ * _readExecLockRelsInfo
+ */
+static ExecLockRelsInfo *
+_readExecLockRelsInfo(void)
+{
+	READ_LOCALS(ExecLockRelsInfo);
+
+	READ_BITMAPSET_FIELD(lockrels);
+	READ_INT_FIELD(numPlanNodes);
+	READ_NODE_FIELD(initPruningOutputs);
+	READ_INT_ARRAY(ipoIndexes, local_node->numPlanNodes);
+
+	READ_DONE();
+}
+
+/*
+ * _readPlanInitPruningOutput
+ */
+static PlanInitPruningOutput *
+_readPlanInitPruningOutput(void)
+{
+	READ_LOCALS(PlanInitPruningOutput);
+
+	READ_BITMAPSET_FIELD(initially_valid_subplans);
+
+	READ_DONE();
+}
+
 /*
  * parseNodeString
  *
@@ -2977,6 +3010,10 @@ parseNodeString(void)
 		return_value = _readPartitionBoundSpec();
 	else if (MATCH("PARTITIONRANGEDATUM", 19))
 		return_value = _readPartitionRangeDatum();
+	else if (MATCH("EXECLOCKRELSINFO", 16))
+		return_value = _readExecLockRelsInfo();
+	else if (MATCH("PARTITIONINITPRUNINGOUTPUT", 26))
+		return_value = _readPlanInitPruningOutput();
 	else
 	{
 		elog(ERROR, "badly formatted node string \"%.32s\"...", token);
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 374a9d9753..329fb9d6e7 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -517,7 +517,9 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 	result->transientPlan = glob->transientPlan;
 	result->dependsOnRole = glob->dependsOnRole;
 	result->parallelModeNeeded = glob->parallelModeNeeded;
+	result->containsInitialPruning = glob->containsInitialPruning;
 	result->planTree = top_plan;
+	result->numPlanNodes = glob->lastPlanNodeId;
 	result->rtable = glob->finalrtable;
 	result->resultRelations = glob->resultRelations;
 	result->appendRelations = glob->appendRelations;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index dbdeb8ec9d..ac795ae9d9 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -1561,6 +1561,9 @@ set_append_references(PlannerInfo *root,
 				pinfo->rtindex += rtoffset;
 			}
 		}
+
+		if (aplan->part_prune_info->needs_init_pruning)
+			root->glob->containsInitialPruning = true;
 	}
 
 	/* We don't need to recurse to lefttree or righttree ... */
@@ -1648,6 +1651,9 @@ set_mergeappend_references(PlannerInfo *root,
 				pinfo->rtindex += rtoffset;
 			}
 		}
+
+		if (mplan->part_prune_info->needs_init_pruning)
+			root->glob->containsInitialPruning = true;
 	}
 
 	/* We don't need to recurse to lefttree or righttree ... */
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 7080cb25d9..3322dc79f2 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -144,7 +144,9 @@ static List *make_partitionedrel_pruneinfo(PlannerInfo *root,
 										   List *prunequal,
 										   Bitmapset *partrelids,
 										   int *relid_subplan_map,
-										   Bitmapset **matchedsubplans);
+										   Bitmapset **matchedsubplans,
+										   bool *needs_init_pruning,
+										   bool *needs_exec_pruning);
 static void gen_partprune_steps(RelOptInfo *rel, List *clauses,
 								PartClauseTarget target,
 								GeneratePruningStepsContext *context);
@@ -230,6 +232,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	int		   *relid_subplan_map;
 	ListCell   *lc;
 	int			i;
+	bool		needs_init_pruning = false;
+	bool		needs_exec_pruning = false;
 
 	/*
 	 * Scan the subpaths to see which ones are scans of partition child
@@ -309,12 +313,16 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		Bitmapset  *partrelids = (Bitmapset *) lfirst(lc);
 		List	   *pinfolist;
 		Bitmapset  *matchedsubplans = NULL;
+		bool		partrel_needs_init_pruning;
+		bool		partrel_needs_exec_pruning;
 
 		pinfolist = make_partitionedrel_pruneinfo(root, parentrel,
 												  prunequal,
 												  partrelids,
 												  relid_subplan_map,
-												  &matchedsubplans);
+												  &matchedsubplans,
+												  &partrel_needs_init_pruning,
+												  &partrel_needs_exec_pruning);
 
 		/* When pruning is possible, record the matched subplans */
 		if (pinfolist != NIL)
@@ -323,6 +331,10 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 			allmatchedsubplans = bms_join(matchedsubplans,
 										  allmatchedsubplans);
 		}
+		if (!needs_init_pruning)
+			needs_init_pruning = partrel_needs_init_pruning;
+		if (!needs_exec_pruning)
+			needs_exec_pruning = partrel_needs_exec_pruning;
 	}
 
 	pfree(relid_subplan_map);
@@ -337,6 +349,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	/* Else build the result data structure */
 	pruneinfo = makeNode(PartitionPruneInfo);
 	pruneinfo->prune_infos = prunerelinfos;
+	pruneinfo->needs_init_pruning = needs_init_pruning;
+	pruneinfo->needs_exec_pruning = needs_exec_pruning;
 
 	/*
 	 * Some subplans may not belong to any of the identified partitioned rels.
@@ -435,13 +449,18 @@ add_part_relids(List *allpartrelids, Bitmapset *partrelids)
  * If we cannot find any useful run-time pruning steps, return NIL.
  * However, on success, each rel identified in partrelids will have
  * an element in the result list, even if some of them are useless.
+ * *needs_init_pruning and *needs_exec_pruning are set to indicate that the
+ * returned PartitionedRelPruneInfos contain pruning steps that can be
+ * performed before and after execution begins, respectively.
  */
 static List *
 make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 							  List *prunequal,
 							  Bitmapset *partrelids,
 							  int *relid_subplan_map,
-							  Bitmapset **matchedsubplans)
+							  Bitmapset **matchedsubplans,
+							  bool *needs_init_pruning,
+							  bool *needs_exec_pruning)
 {
 	RelOptInfo *targetpart = NULL;
 	List	   *pinfolist = NIL;
@@ -452,6 +471,10 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	int			rti;
 	int			i;
 
+	/* Will find out below. */
+	*needs_init_pruning = false;
+	*needs_exec_pruning = false;
+
 	/*
 	 * Examine each partitioned rel, constructing a temporary array to map
 	 * from planner relids to index of the partitioned rel, and building a
@@ -539,6 +562,9 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		 * executor per-scan pruning steps.  This first pass creates startup
 		 * pruning steps and detects whether there's any possibly-useful quals
 		 * that would require per-scan pruning.
+		 *
+		 * In the first pass, we also determine whether the 2nd pass is
+		 * necessary by noting the presence of EXEC parameters.
 		 */
 		gen_partprune_steps(subpart, partprunequal, PARTTARGET_INITIAL,
 							&context);
@@ -613,6 +639,11 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		pinfo->execparamids = execparamids;
 		/* Remaining fields will be filled in the next loop */
 
+		if (!*needs_init_pruning)
+			*needs_init_pruning = (initial_pruning_steps != NIL);
+		if (!*needs_exec_pruning)
+			*needs_exec_pruning = (exec_pruning_steps != NIL);
+
 		pinfolist = lappend(pinfolist, pinfo);
 	}
 
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index ba2fcfeb4a..085eb3f209 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -945,15 +945,17 @@ pg_plan_query(Query *querytree, const char *query_string, int cursorOptions,
  * For normal optimizable statements, invoke the planner.  For utility
  * statements, just make a wrapper PlannedStmt node.
  *
- * The result is a list of PlannedStmt nodes.
+ * The result is a list of PlannedStmt nodes.  Also, a NULL is appended to
+ * *execlockrelsinfo_list for each PlannedStmt added to the returned list.
  */
 List *
 pg_plan_queries(List *querytrees, const char *query_string, int cursorOptions,
-				ParamListInfo boundParams)
+				ParamListInfo boundParams, List **execlockrelsinfo_list)
 {
 	List	   *stmt_list = NIL;
 	ListCell   *query_list;
 
+	*execlockrelsinfo_list = NIL;
 	foreach(query_list, querytrees)
 	{
 		Query	   *query = lfirst_node(Query, query_list);
@@ -977,6 +979,7 @@ pg_plan_queries(List *querytrees, const char *query_string, int cursorOptions,
 		}
 
 		stmt_list = lappend(stmt_list, stmt);
+		*execlockrelsinfo_list = lappend(*execlockrelsinfo_list, NULL);
 	}
 
 	return stmt_list;
@@ -1080,7 +1083,8 @@ exec_simple_query(const char *query_string)
 		QueryCompletion qc;
 		MemoryContext per_parsetree_context = NULL;
 		List	   *querytree_list,
-				   *plantree_list;
+				   *plantree_list,
+				   *plantree_execlockrelsinfo_list;
 		Portal		portal;
 		DestReceiver *receiver;
 		int16		format;
@@ -1167,7 +1171,8 @@ exec_simple_query(const char *query_string)
 												NULL, 0, NULL);
 
 		plantree_list = pg_plan_queries(querytree_list, query_string,
-										CURSOR_OPT_PARALLEL_OK, NULL);
+										CURSOR_OPT_PARALLEL_OK, NULL,
+										&plantree_execlockrelsinfo_list);
 
 		/*
 		 * Done with the snapshot used for parsing/planning.
@@ -1203,6 +1208,7 @@ exec_simple_query(const char *query_string)
 						  query_string,
 						  commandTag,
 						  plantree_list,
+						  plantree_execlockrelsinfo_list,
 						  NULL);
 
 		/*
@@ -1991,6 +1997,7 @@ exec_bind_message(StringInfo input_message)
 					  query_string,
 					  psrc->commandTag,
 					  cplan->stmt_list,
+					  cplan->execlockrelsinfo_list,
 					  cplan);
 
 	/* Done with the snapshot used for parameter I/O and parsing/planning */
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 5f907831a3..972ddc014e 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -35,7 +35,7 @@
 Portal		ActivePortal = NULL;
 
 
-static void ProcessQuery(PlannedStmt *plan,
+static void ProcessQuery(PlannedStmt *plan, ExecLockRelsInfo *execlockrelsinfo,
 						 const char *sourceText,
 						 ParamListInfo params,
 						 QueryEnvironment *queryEnv,
@@ -65,6 +65,7 @@ static void DoPortalRewind(Portal portal);
  */
 QueryDesc *
 CreateQueryDesc(PlannedStmt *plannedstmt,
+				ExecLockRelsInfo *execlockrelsinfo,
 				const char *sourceText,
 				Snapshot snapshot,
 				Snapshot crosscheck_snapshot,
@@ -77,6 +78,7 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
 
 	qd->operation = plannedstmt->commandType;	/* operation */
 	qd->plannedstmt = plannedstmt;	/* plan */
+	qd->execlockrelsinfo = execlockrelsinfo;		/* ExecutorGetLockRels() output for plan */
 	qd->sourceText = sourceText;	/* query text */
 	qd->snapshot = RegisterSnapshot(snapshot);	/* snapshot */
 	/* RI check snapshot */
@@ -122,6 +124,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  *		PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
  *
  *	plan: the plan tree for the query
+ *	execlockrelsinfo: ExecutorGetLockRels() output for the plan tree
  *	sourceText: the source text of the query
  *	params: any parameters needed
  *	dest: where to send results
@@ -134,6 +137,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  */
 static void
 ProcessQuery(PlannedStmt *plan,
+			 ExecLockRelsInfo *execlockrelsinfo,
 			 const char *sourceText,
 			 ParamListInfo params,
 			 QueryEnvironment *queryEnv,
@@ -145,7 +149,7 @@ ProcessQuery(PlannedStmt *plan,
 	/*
 	 * Create the QueryDesc object
 	 */
-	queryDesc = CreateQueryDesc(plan, sourceText,
+	queryDesc = CreateQueryDesc(plan, execlockrelsinfo, sourceText,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, params, queryEnv, 0);
 
@@ -490,6 +494,7 @@ PortalStart(Portal portal, ParamListInfo params,
 				 * the destination to DestNone.
 				 */
 				queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+											linitial_node(ExecLockRelsInfo, portal->execlockrelsinfos),
 											portal->sourceText,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
@@ -1190,7 +1195,8 @@ PortalRunMulti(Portal portal,
 			   QueryCompletion *qc)
 {
 	bool		active_snapshot_set = false;
-	ListCell   *stmtlist_item;
+	ListCell   *stmtlist_item,
+			   *execlockrelsinfolist_item;
 
 	/*
 	 * If the destination is DestRemoteExecute, change to DestNone.  The
@@ -1211,9 +1217,12 @@ PortalRunMulti(Portal portal,
 	 * Loop to handle the individual queries generated from a single parsetree
 	 * by analysis and rewrite.
 	 */
-	foreach(stmtlist_item, portal->stmts)
+	forboth(stmtlist_item, portal->stmts,
+			execlockrelsinfolist_item, portal->execlockrelsinfos)
 	{
 		PlannedStmt *pstmt = lfirst_node(PlannedStmt, stmtlist_item);
+		ExecLockRelsInfo *execlockrelsinfo = lfirst_node(ExecLockRelsInfo,
+											   execlockrelsinfolist_item);
 
 		/*
 		 * If we got a cancel signal in prior command, quit
@@ -1271,7 +1280,7 @@ PortalRunMulti(Portal portal,
 			if (pstmt->canSetTag)
 			{
 				/* statement can set tag string */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, execlockrelsinfo,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
@@ -1280,7 +1289,7 @@ PortalRunMulti(Portal portal,
 			else
 			{
 				/* stmt added by rewrite cannot set tag */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, execlockrelsinfo,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 4cf6db504f..9f5a40a0a6 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -99,14 +99,16 @@ static dlist_head cached_expression_list = DLIST_STATIC_INIT(cached_expression_l
 static void ReleaseGenericPlan(CachedPlanSource *plansource);
 static List *RevalidateCachedQuery(CachedPlanSource *plansource,
 								   QueryEnvironment *queryEnv);
-static bool CheckCachedPlan(CachedPlanSource *plansource);
+static bool CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams);
 static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 								   ParamListInfo boundParams, QueryEnvironment *queryEnv);
+static void CachedPlanSaveExecLockRelsInfos(CachedPlan *plan, List *execlockrelsinfo_list);
 static bool choose_custom_plan(CachedPlanSource *plansource,
 							   ParamListInfo boundParams);
 static double cached_plan_cost(CachedPlan *plan, bool include_planner);
 static Query *QueryListGetPrimaryStmt(List *stmts);
-static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static List *AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams);
+static void ReleaseExecutorLocks(List *stmt_list, List *execlockrelsinfo_list);
 static void AcquirePlannerLocks(List *stmt_list, bool acquire);
 static void ScanQueryForLocks(Query *parsetree, bool acquire);
 static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -790,9 +792,21 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
  *
  * On a "true" return, we have acquired the locks needed to run the plan.
  * (We must do this for the "true" result to be race-condition-free.)
+ *
+ * If the CachedPlan is valid, AcquireExecutorLocks() may, instead of just
+ * scanning the range table of each PlannedStmt contained in it, call
+ * ExecutorGetLockRels() on it to determine the set of relations to lock, so
+ * that relations scanned only by nodes removed by initial partition pruning
+ * are left unlocked.  Resulting
+ * ExecLockRelsInfo nodes containing the result of such pruning, allocated in
+ * a child context of the context containing the plan itself, are added into
+ * plan->execlockrelsinfo_list.  The previous contents of the list from the
+ * last invocation on the same CachedPlan are deleted, because they would no
+ * longer be valid given the fresh set of parameter values which may be used
+ * as pruning parameters.
  */
 static bool
-CheckCachedPlan(CachedPlanSource *plansource)
+CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams)
 {
 	CachedPlan *plan = plansource->gplan;
 
@@ -820,13 +834,25 @@ CheckCachedPlan(CachedPlanSource *plansource)
 	 */
 	if (plan->is_valid)
 	{
+		List   *execlockrelsinfo_list;
+
 		/*
 		 * Plan must have positive refcount because it is referenced by
 		 * plansource; so no need to fear it disappears under us here.
 		 */
 		Assert(plan->refcount > 0);
 
-		AcquireExecutorLocks(plan->stmt_list, true);
+		/*
+		 * Lock relations scanned by the plan.  If ExecutorGetLockRels() omits
+		 * some relations because the plan nodes that scan them were pruned,
+		 * the executor must also skip those plan nodes so that it doesn't
+		 * accidentally try to execute them.  It learns which nodes were
+		 * pruned from the ExecLockRelsInfo nodes collected in the returned
+		 * list, which is passed to it along with the list of
+		 * PlannedStmts.
+		 */
+		execlockrelsinfo_list = AcquireExecutorLocks(plan->stmt_list,
+													 boundParams);
 
 		/*
 		 * If plan was transient, check to see if TransactionXmin has
@@ -844,11 +870,14 @@ CheckCachedPlan(CachedPlanSource *plansource)
 		if (plan->is_valid)
 		{
 			/* Successfully revalidated and locked the query. */
+
+			/* Remember ExecLockRelsInfos in the CachedPlan. */
+			CachedPlanSaveExecLockRelsInfos(plan, execlockrelsinfo_list);
 			return true;
 		}
 
 		/* Oops, the race case happened.  Release useless locks. */
-		AcquireExecutorLocks(plan->stmt_list, false);
+		ReleaseExecutorLocks(plan->stmt_list, execlockrelsinfo_list);
 	}
 
 	/*
@@ -880,7 +909,8 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 				ParamListInfo boundParams, QueryEnvironment *queryEnv)
 {
 	CachedPlan *plan;
-	List	   *plist;
+	List	   *plist,
+			   *execlockrelsinfo_list;
 	bool		snapshot_set;
 	bool		is_transient;
 	MemoryContext plan_context;
@@ -933,7 +963,8 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 	 * Generate the plan.
 	 */
 	plist = pg_plan_queries(qlist, plansource->query_string,
-							plansource->cursor_options, boundParams);
+							plansource->cursor_options, boundParams,
+							&execlockrelsinfo_list);
 
 	/* Release snapshot if we got one */
 	if (snapshot_set)
@@ -1002,6 +1033,16 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 	plan->is_saved = false;
 	plan->is_valid = true;
 
+	/*
+	 * Save the dummy ExecLockRelsInfo list, that is, a list containing NULLs
+	 * as elements.  We must do this because users of the CachedPlan expect
+	 * one to go with the list of PlannedStmts.
+	 * XXX maybe get rid of that contract.
+	 */
+	plan->execlockrelsinfo_context = NULL;
+	CachedPlanSaveExecLockRelsInfos(plan, execlockrelsinfo_list);
+	Assert(MemoryContextIsValid(plan->execlockrelsinfo_context));
+
 	/* assign generation number to new plan */
 	plan->generation = ++(plansource->generation);
 
@@ -1160,7 +1201,7 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 
 	if (!customplan)
 	{
-		if (CheckCachedPlan(plansource))
+		if (CheckCachedPlan(plansource, boundParams))
 		{
 			/* We want a generic plan, and we already have a valid one */
 			plan = plansource->gplan;
@@ -1586,6 +1627,49 @@ CopyCachedPlan(CachedPlanSource *plansource)
 	return newsource;
 }
 
+/*
+ * CachedPlanSaveExecLockRelsInfos
+ *		Save the list containing ExecLockRelsInfo nodes into the given
+ *		CachedPlan
+ *
+ * The provided list is copied into a dedicated context that is a child of
+ * plan->context.  If the child context already exists, it is emptied, because
+ * any ExecLockRelsInfo contained therein would no longer be useful.
+ */
+static void
+CachedPlanSaveExecLockRelsInfos(CachedPlan *plan, List *execlockrelsinfo_list)
+{
+	MemoryContext	execlockrelsinfo_context = plan->execlockrelsinfo_context,
+					oldcontext = CurrentMemoryContext;
+	List		   *execlockrelsinfo_list_copy;
+
+	/*
+	 * Set up the dedicated context if not already done, saving it as a child
+	 * of the CachedPlan's context.
+	 */
+	if (execlockrelsinfo_context == NULL)
+	{
+		execlockrelsinfo_context = AllocSetContextCreate(CurrentMemoryContext,
+												 "CachedPlan execlockrelsinfo list",
+												 ALLOCSET_START_SMALL_SIZES);
+		MemoryContextSetParent(execlockrelsinfo_context, plan->context);
+		MemoryContextSetIdentifier(execlockrelsinfo_context, plan->context->ident);
+		plan->execlockrelsinfo_context = execlockrelsinfo_context;
+	}
+	else
+	{
+		/* Just clear existing contents by resetting the context. */
+		Assert(MemoryContextIsValid(execlockrelsinfo_context));
+		MemoryContextReset(execlockrelsinfo_context);
+	}
+
+	MemoryContextSwitchTo(execlockrelsinfo_context);
+	execlockrelsinfo_list_copy = copyObject(execlockrelsinfo_list);
+	MemoryContextSwitchTo(oldcontext);
+
+	plan->execlockrelsinfo_list = execlockrelsinfo_list_copy;
+}
+
 /*
  * CachedPlanIsValid: test whether the rewritten querytree within a
  * CachedPlanSource is currently valid (that is, not marked as being in need
@@ -1737,17 +1821,21 @@ QueryListGetPrimaryStmt(List *stmts)
 
 /*
  * AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
- * or release them if acquire is false.
+ *
+ * Returns a list with one element for each PlannedStmt in stmt_list: an
+ * ExecLockRelsInfo node, or NULL if the PlannedStmt is a utility statement
+ * or its containsInitialPruning is false.
  */
-static void
-AcquireExecutorLocks(List *stmt_list, bool acquire)
+static List *
+AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams)
 {
 	ListCell   *lc1;
+	List	   *execlockrelsinfo_list = NIL;
 
 	foreach(lc1, stmt_list)
 	{
 		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
-		ListCell   *lc2;
+		ExecLockRelsInfo *execlockrelsinfo = NULL;
 
 		if (plannedstmt->commandType == CMD_UTILITY)
 		{
@@ -1761,27 +1849,139 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 			Query	   *query = UtilityContainsQuery(plannedstmt->utilityStmt);
 
 			if (query)
-				ScanQueryForLocks(query, acquire);
-			continue;
+				ScanQueryForLocks(query, true);
 		}
-
-		foreach(lc2, plannedstmt->rtable)
+		else
 		{
-			RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
+			/*
+			 * Figure out the set of relations that would need to be locked
+			 * before executing the plan.
+			 */
+			if (!plannedstmt->containsInitialPruning)
+			{
+				/*
+				 * If the plan contains no initial pruning steps, just lock
+				 * all the relations found in the range table.
+				 */
+				ListCell *lc;
 
-			if (rte->rtekind != RTE_RELATION)
-				continue;
+				foreach(lc, plannedstmt->rtable)
+				{
+					RangeTblEntry *rte = lfirst(lc);
+
+					if (rte->rtekind != RTE_RELATION)
+						continue;
+
+					/*
+					 * Acquire the appropriate type of lock on each relation
+					 * OID. Note that we don't actually try to open the rel,
+					 * and hence will not fail if it's been dropped entirely
+					 * --- we'll just transiently acquire a non-conflicting
+					 *  lock.
+					 */
+					LockRelationOid(rte->relid, rte->rellockmode);
+				}
+			}
+			else
+			{
+				int			rti;
+				Bitmapset  *lockrels;
+
+				/*
+				 * Walk the plan tree to find only the minimal set of
+				 * relations to be locked, considering the effect of performing
+				 * initial partition pruning.
+				 */
+				execlockrelsinfo = ExecutorGetLockRels(plannedstmt, boundParams);
+				lockrels = execlockrelsinfo->lockrels;
+
+				rti = -1;
+				while ((rti = bms_next_member(lockrels, rti)) >= 0)
+				{
+					RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
 
+					Assert(rte->rtekind == RTE_RELATION);
+
+					/* See the comment above. */
+					LockRelationOid(rte->relid, rte->rellockmode);
+				}
+			}
+		}
+
+		/*
+		 * Remember ExecLockRelsInfo for later adding to the QueryDesc that
+		 * will be passed to the executor when executing this plan.  May be
+		 * NULL, but must keep the list the same length as stmt_list.
+		 */
+		execlockrelsinfo_list = lappend(execlockrelsinfo_list,
+										execlockrelsinfo);
+	}
+
+	return execlockrelsinfo_list;
+}
+
+/*
+ * ReleaseExecutorLocks
+ * 		Release locks that would've been acquired by an earlier call to
+ * 		AcquireExecutorLocks()
+ */
+static void
+ReleaseExecutorLocks(List *stmt_list, List *execlockrelsinfo_list)
+{
+	ListCell   *lc1,
+			   *lc2;
+
+	forboth(lc1, stmt_list, lc2, execlockrelsinfo_list)
+	{
+		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
+		ExecLockRelsInfo *execlockrelsinfo = lfirst_node(ExecLockRelsInfo, lc2);
+
+		if (plannedstmt->commandType == CMD_UTILITY)
+		{
 			/*
-			 * Acquire the appropriate type of lock on each relation OID. Note
-			 * that we don't actually try to open the rel, and hence will not
-			 * fail if it's been dropped entirely --- we'll just transiently
-			 * acquire a non-conflicting lock.
+			 * Ignore utility statements, except those (such as EXPLAIN) that
+			 * contain a parsed-but-not-planned query.  Note: it's okay to use
+			 * ScanQueryForLocks, even though the query hasn't been through
+			 * rule rewriting, because rewriting doesn't change the query
+			 * representation.
 			 */
-			if (acquire)
-				LockRelationOid(rte->relid, rte->rellockmode);
+			Query	   *query = UtilityContainsQuery(plannedstmt->utilityStmt);
+
+			if (query)
+				ScanQueryForLocks(query, false);
+		}
+		else
+		{
+			if (execlockrelsinfo == NULL)
+			{
+				ListCell *lc;
+
+				foreach(lc, plannedstmt->rtable)
+				{
+					RangeTblEntry *rte = lfirst(lc);
+
+					if (rte->rtekind != RTE_RELATION)
+						continue;
+
+					UnlockRelationOid(rte->relid, rte->rellockmode);
+				}
+			}
 			else
-				UnlockRelationOid(rte->relid, rte->rellockmode);
+			{
+				int			rti;
+				Bitmapset  *lockrels;
+
+				lockrels = execlockrelsinfo->lockrels;
+				rti = -1;
+				while ((rti = bms_next_member(lockrels, rti)) >= 0)
+				{
+					RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
+
+					Assert(rte->rtekind == RTE_RELATION);
+
+					UnlockRelationOid(rte->relid, rte->rellockmode);
+				}
+			}
 		}
 	}
 }
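The pairing of AcquireExecutorLocks() and ReleaseExecutorLocks() in the
plancache.c changes above can be modeled in isolation.  What follows is a
minimal standalone sketch, not PostgreSQL code: LockTable, RelSet,
acquire_locks(), release_locks() and locks_held() are hypothetical names, and
a 64-bit mask stands in for the lockrels Bitmapset.  It is only meant to show
that the release path has to walk the same pruned set that the acquire path
walked and undo each lock taken there.

```c
#include <assert.h>

#define MAX_RELS 64

/* Hypothetical stand-in for the lock manager: one hold count per relation. */
typedef struct LockTable
{
	int			held[MAX_RELS];
} LockTable;

/* Hypothetical stand-in for ExecLockRelsInfo.lockrels (bit N = rel N). */
typedef unsigned long long RelSet;

static void
lock_rel(LockTable *lt, int rti)
{
	lt->held[rti]++;
}

static void
unlock_rel(LockTable *lt, int rti)
{
	/* Releasing a lock that was never taken would be a bug. */
	assert(lt->held[rti] > 0);
	lt->held[rti]--;
}

/* Lock only the relations that survived initial pruning. */
static void
acquire_locks(LockTable *lt, RelSet survivors)
{
	for (int rti = 0; rti < MAX_RELS; rti++)
	{
		if (survivors & (1ULL << rti))
			lock_rel(lt, rti);
	}
}

/* Undo acquire_locks(): same set, but unlocking instead of locking. */
static void
release_locks(LockTable *lt, RelSet survivors)
{
	for (int rti = 0; rti < MAX_RELS; rti++)
	{
		if (survivors & (1ULL << rti))
			unlock_rel(lt, rti);
	}
}

/* Total number of locks currently held, for checking symmetry. */
static int
locks_held(const LockTable *lt)
{
	int			total = 0;

	for (int rti = 0; rti < MAX_RELS; rti++)
		total += lt->held[rti];
	return total;
}
```

In this model, calling acquire_locks() and then release_locks() with the same
set should bring locks_held() back to its starting value.  If the release
path locked instead of unlocking, the count would grow on every failed
revalidation.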
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index d549f66d4a..896f51be08 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -285,6 +285,7 @@ PortalDefineQuery(Portal portal,
 				  const char *sourceText,
 				  CommandTag commandTag,
 				  List *stmts,
+				  List *execlockrelsinfos,
 				  CachedPlan *cplan)
 {
 	AssertArg(PortalIsValid(portal));
@@ -299,6 +300,7 @@ PortalDefineQuery(Portal portal,
 	portal->qc.nprocessed = 0;
 	portal->commandTag = commandTag;
 	portal->stmts = stmts;
+	portal->execlockrelsinfos = execlockrelsinfos;
 	portal->cplan = cplan;
 	portal->status = PORTAL_DEFINED;
 }
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 666977fb1f..fef75ba147 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -87,7 +87,8 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
 							  ExplainState *es, const char *queryString,
 							  ParamListInfo params, QueryEnvironment *queryEnv);
 
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt, ExecLockRelsInfo *execlockrelsinfo,
+						   IntoClause *into,
 						   ExplainState *es, const char *queryString,
 						   ParamListInfo params, QueryEnvironment *queryEnv,
 						   const instr_time *planduration,
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index fd5735a946..ded19b8cbb 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -124,4 +124,6 @@ extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
 						 PartitionPruneInfo *pruneinfo,
 						 Bitmapset **initially_valid_subplans);
 extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate);
+extern Bitmapset *ExecGetLockRelsDoInitialPruning(Plan *plan, ExecGetLockRelsContext *context,
+								PartitionPruneInfo *pruneinfo);
 #endif							/* EXECPARTITION_H */
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index e79e2c001f..4338463479 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -35,6 +35,7 @@ typedef struct QueryDesc
 	/* These fields are provided by CreateQueryDesc */
 	CmdType		operation;		/* CMD_SELECT, CMD_UPDATE, etc. */
 	PlannedStmt *plannedstmt;	/* planner's output (could be utility, too) */
+	ExecLockRelsInfo *execlockrelsinfo;	/* ExecutorGetLockRels()'s output given plannedstmt */
 	const char *sourceText;		/* source text of the query */
 	Snapshot	snapshot;		/* snapshot to use for query */
 	Snapshot	crosscheck_snapshot;	/* crosscheck for RI update/delete */
@@ -57,6 +58,7 @@ typedef struct QueryDesc
 
 /* in pquery.c */
 extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
+								  ExecLockRelsInfo *execlockrelsinfo,
 								  const char *sourceText,
 								  Snapshot snapshot,
 								  Snapshot crosscheck_snapshot,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index 82925b4b63..5cf414cc11 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -185,6 +185,8 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
 /*
  * prototypes from functions in execMain.c
  */
+extern ExecLockRelsInfo *ExecutorGetLockRels(PlannedStmt *plannedstmt, ParamListInfo params);
+extern bool ExecGetLockRels(Plan *node, ExecGetLockRelsContext *context);
 extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
 extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
 extern void ExecutorRun(QueryDesc *queryDesc,
diff --git a/src/include/executor/nodeAppend.h b/src/include/executor/nodeAppend.h
index 4cb78ee5b6..b53535c2a4 100644
--- a/src/include/executor/nodeAppend.h
+++ b/src/include/executor/nodeAppend.h
@@ -17,6 +17,7 @@
 #include "access/parallel.h"
 #include "nodes/execnodes.h"
 
+extern bool ExecGetAppendLockRels(Append *node, ExecGetLockRelsContext *context);
 extern AppendState *ExecInitAppend(Append *node, EState *estate, int eflags);
 extern void ExecEndAppend(AppendState *node);
 extern void ExecReScanAppend(AppendState *node);
diff --git a/src/include/executor/nodeMergeAppend.h b/src/include/executor/nodeMergeAppend.h
index 97fe3b0665..8eb4e9df93 100644
--- a/src/include/executor/nodeMergeAppend.h
+++ b/src/include/executor/nodeMergeAppend.h
@@ -16,6 +16,7 @@
 
 #include "nodes/execnodes.h"
 
+extern bool ExecGetMergeAppendLockRels(MergeAppend *node, ExecGetLockRelsContext *context);
 extern MergeAppendState *ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags);
 extern void ExecEndMergeAppend(MergeAppendState *node);
 extern void ExecReScanMergeAppend(MergeAppendState *node);
diff --git a/src/include/executor/nodeModifyTable.h b/src/include/executor/nodeModifyTable.h
index 1d225bc88d..5006499088 100644
--- a/src/include/executor/nodeModifyTable.h
+++ b/src/include/executor/nodeModifyTable.h
@@ -19,6 +19,7 @@ extern void ExecComputeStoredGenerated(ResultRelInfo *resultRelInfo,
 									   EState *estate, TupleTableSlot *slot,
 									   CmdType cmdtype);
 
+extern bool ExecGetModifyTableLockRels(ModifyTable *plan, ExecGetLockRelsContext *context);
 extern ModifyTableState *ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags);
 extern void ExecEndModifyTable(ModifyTableState *node);
 extern void ExecReScanModifyTable(ModifyTableState *node);
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 44dd73fc80..1253fdb0ed 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -576,6 +576,7 @@ typedef struct EState
 	struct ExecRowMark **es_rowmarks;	/* Array of per-range-table-entry
 										 * ExecRowMarks, or NULL if none */
 	PlannedStmt *es_plannedstmt;	/* link to top of plan tree */
+	struct ExecLockRelsInfo *es_execlockrelsinfo; /* QueryDesc.execlockrelsinfo */
 	const char *es_sourceText;	/* Source text from QueryDesc */
 
 	JunkFilter *es_junkFilter;	/* top-level junk filter, if any */
@@ -964,6 +965,101 @@ typedef struct DomainConstraintState
  */
 typedef TupleTableSlot *(*ExecProcNodeMtd) (struct PlanState *pstate);
 
+/*----------------
+ * ExecLockRelsInfo
+ *
+ * Result of performing ExecutorGetLockRels() for a given PlannedStmt
+ */
+typedef struct ExecLockRelsInfo
+{
+	NodeTag		type;
+
+	/*
+	 * Relations that must be locked to execute the plan tree contained in
+	 * the PlannedStmt.
+	 */
+	Bitmapset  *lockrels;
+
+	/* PlannedStmt.numPlanNodes */
+	int			numPlanNodes;
+
+	/*
+	 * List of PlanInitPruningOutput, each representing the output of
+	 * performing initial pruning on a given plan node, for all nodes in the
+	 * plan tree that have been marked as needing initial pruning.
+	 *
+	 * 'ipoIndexes' is an array of 'numPlanNodes' elements, indexed with
+	 * plan_node_id of the individual nodes in the plan tree, each a 1-based
+	 * index into 'initPruningOutputs' list for a given plan node.  0 means
+	 * that a given plan node has no entry in the list because of not needing
+	 * any initial pruning done on it.
+	 */
+	List	   *initPruningOutputs;
+	int		   *ipoIndexes;
+} ExecLockRelsInfo;
+
+/*----------------
+ * ExecGetLockRelsContext
+ *
+ * Information pertaining to ExecutorGetLockRels() invocation for a given
+ * plan.
+ */
+typedef struct ExecGetLockRelsContext
+{
+	NodeTag		type;
+
+	PlannedStmt	   *stmt;		/* target plan */
+	ParamListInfo	params;		/* EXTERN parameters available for pruning */
+
+	/* Output parameters for ExecGetLockRels and its subroutines. */
+	Bitmapset	   *lockrels;
+
+	/* See the comment in the definition of the ExecLockRelsInfo struct. */
+	List		   *initPruningOutputs;
+	int			   *ipoIndexes;
+} ExecGetLockRelsContext;
+
+/*
+ * Appends the provided PlanInitPruningOutput to
+ * ExecGetLockRelsContext.initPruningOutputs
+ */
+#define ExecStorePlanInitPruningOutput(cxt, initPruningOutput, plannode) \
+	do { \
+		(cxt)->initPruningOutputs = lappend((cxt)->initPruningOutputs, initPruningOutput); \
+		(cxt)->ipoIndexes[(plannode)->plan_node_id] = list_length((cxt)->initPruningOutputs); \
+	} while (0)
+
+/*
+ * Finds the PlanInitPruningOutput for a given Plan node in
+ * ExecLockRelsInfo.initPruningOutputs.
+ */
+#define ExecFetchPlanInitPruningOutput(execlockrelsinfo, plannode) \
+		(((execlockrelsinfo) != NULL && (execlockrelsinfo)->initPruningOutputs != NIL) ? \
+		 list_nth((execlockrelsinfo)->initPruningOutputs, \
+				  (execlockrelsinfo)->ipoIndexes[(plannode)->plan_node_id] - 1) : NULL)
+
+/* ---------------
+ * PlanInitPruningOutput
+ *
+ * Node to remember the result of performing initial partition pruning steps
+ * during ExecutorGetLockRels() on nodes that support pruning.
+ *
+ * ExecGetLockRelsDoInitialPruning(), which runs during ExecutorGetLockRels(),
+ * creates it and stores it in the corresponding ExecLockRelsInfo.
+ *
+ * ExecInitPartitionPruning(), which runs during ExecutorStart(), fetches it
+ * from the EState's ExecLockRelsInfo (if any) and uses the value of
+ * initially_valid_subplans contained in it as-is to select the subplans to be
+ * initialized for execution, instead of re-evaluating that by performing
+ * initial pruning again.
+ */
+typedef struct PlanInitPruningOutput
+{
+	NodeTag		type;
+
+	Bitmapset  *initially_valid_subplans;
+} PlanInitPruningOutput;
+
 /* ----------------
  *		PlanState node
  *
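The 1-based ipoIndexes scheme described in the ExecLockRelsInfo comment above
can be illustrated with a small standalone model.  This is a sketch under
stated assumptions, not the patch's code: PruningOutput, PruningOutputTable,
store_output() and fetch_output() are hypothetical names, and a fixed-size
array replaces the List.  The sketch also returns NULL when a node's index is
0; that is an assumption about the intended behavior, since the
ExecFetchPlanInitPruningOutput macro as posted passes the index minus one
straight to list_nth().

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NODES 16

/* Hypothetical stand-in for PlanInitPruningOutput. */
typedef struct PruningOutput
{
	int			nsurvivors;
} PruningOutput;

/* Hypothetical stand-in for the initPruningOutputs/ipoIndexes pair. */
typedef struct PruningOutputTable
{
	int			numPlanNodes;	/* valid plan_node_ids are 0..numPlanNodes-1 */
	int			noutputs;		/* number of entries stored so far */
	const PruningOutput *outputs[MAX_NODES];
	int			ipoIndexes[MAX_NODES];	/* 1-based; 0 means "no entry" */
} PruningOutputTable;

/* Append an output and remember its 1-based position for the plan node. */
static void
store_output(PruningOutputTable *t, int plan_node_id, const PruningOutput *out)
{
	assert(plan_node_id >= 0 && plan_node_id < t->numPlanNodes);
	assert(t->noutputs < MAX_NODES);
	t->outputs[t->noutputs++] = out;
	t->ipoIndexes[plan_node_id] = t->noutputs;
}

/* Look up the output for a plan node; NULL if none was stored for it. */
static const PruningOutput *
fetch_output(const PruningOutputTable *t, int plan_node_id)
{
	int			idx;

	if (t == NULL || t->noutputs == 0)
		return NULL;
	assert(plan_node_id >= 0 && plan_node_id < t->numPlanNodes);
	idx = t->ipoIndexes[plan_node_id];
	/* Guard the 0 case so we never index at position -1. */
	return (idx > 0) ? t->outputs[idx - 1] : NULL;
}
```

Storing an output for one plan node and then fetching a different node's
entry exercises the 0 case, which is the one the 1-based encoding exists to
represent.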
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 5d075f0c34..d365fc4402 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -96,6 +96,11 @@ typedef enum NodeTag
 	T_PartitionPruneStepCombine,
 	T_PlanInvalItem,
 
+	/* TAGS FOR EXECUTOR PREP NODES (execnodes.h) */
+	T_ExecGetLockRelsContext,
+	T_ExecLockRelsInfo,
+	T_PlanInitPruningOutput,
+
 	/*
 	 * TAGS FOR PLAN STATE NODES (execnodes.h)
 	 *
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 5327d9ba8b..019719c1a4 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -129,6 +129,10 @@ typedef struct PlannerGlobal
 
 	char		maxParallelHazard;	/* worst PROPARALLEL hazard level */
 
+	bool		containsInitialPruning;	/* Do some Plan nodes in the tree
+										 * have initial (pre-exec) pruning
+										 * steps? */
+
 	PartitionDirectory partition_directory; /* partition descriptors */
 
 	Bitmapset *elidedAppendPartedRels;	/* Combined partitioned_rels of all
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index bd87c35d6c..bfdb5bbf28 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -59,10 +59,16 @@ typedef struct PlannedStmt
 
 	bool		parallelModeNeeded; /* parallel mode required to execute? */
 
+	bool		containsInitialPruning;	/* Do some Plan nodes in the tree
+										 * have initial (pre-exec) pruning
+										 * steps? */
+
 	int			jitFlags;		/* which forms of JIT should be performed */
 
 	struct Plan *planTree;		/* tree of Plan nodes */
 
+	int			numPlanNodes;	/* number of nodes in planTree */
+
 	List	   *rtable;			/* list of RangeTblEntry nodes */
 
 	/* rtable indexes of target relations for INSERT/UPDATE/DELETE */
@@ -1189,6 +1195,13 @@ typedef struct PlanRowMark
  * prune_infos			List of Lists containing PartitionedRelPruneInfo nodes,
  *						one sublist per run-time-prunable partition hierarchy
  *						appearing in the parent plan node's subplans.
+ *
+ * needs_init_pruning	Does any of the PartitionedRelPruneInfos in
+ *						prune_infos have its initial_pruning_steps set?
+ *
+ * needs_exec_pruning	Does any of the PartitionedRelPruneInfos in
+ *						prune_infos have its exec_pruning_steps set?
+ *
  * other_subplans		Indexes of any subplans that are not accounted for
  *						by any of the PartitionedRelPruneInfo nodes in
  *						"prune_infos".  These subplans must not be pruned.
@@ -1197,6 +1210,8 @@ typedef struct PartitionPruneInfo
 {
 	NodeTag		type;
 	List	   *prune_infos;
+	bool		needs_init_pruning;
+	bool		needs_exec_pruning;
 	Bitmapset  *other_subplans;
 } PartitionPruneInfo;
 
diff --git a/src/include/tcop/tcopprot.h b/src/include/tcop/tcopprot.h
index 92291a750d..bf80c53bed 100644
--- a/src/include/tcop/tcopprot.h
+++ b/src/include/tcop/tcopprot.h
@@ -64,7 +64,7 @@ extern PlannedStmt *pg_plan_query(Query *querytree, const char *query_string,
 								  ParamListInfo boundParams);
 extern List *pg_plan_queries(List *querytrees, const char *query_string,
 							 int cursorOptions,
-							 ParamListInfo boundParams);
+							 ParamListInfo boundParams, List **execlockrelsinfo_list);
 
 extern bool check_max_stack_depth(int *newval, void **extra, GucSource source);
 extern void assign_max_stack_depth(int newval, void *extra);
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 95b99e3d25..56b0dcc6bd 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -148,6 +148,9 @@ typedef struct CachedPlan
 {
 	int			magic;			/* should equal CACHEDPLAN_MAGIC */
 	List	   *stmt_list;		/* list of PlannedStmts */
+	List	   *execlockrelsinfo_list;	/* list of ExecLockRelsInfo nodes with one
+									 * element for each of stmt_list; NIL
+									 * if not a generic plan */
 	bool		is_oneshot;		/* is it a "oneshot" plan? */
 	bool		is_saved;		/* is CachedPlan in a long-lived context? */
 	bool		is_valid;		/* is the stmt_list currently valid? */
@@ -158,6 +161,9 @@ typedef struct CachedPlan
 	int			generation;		/* parent's generation number for this plan */
 	int			refcount;		/* count of live references to this struct */
 	MemoryContext context;		/* context containing this CachedPlan */
+	MemoryContext execlockrelsinfo_context;	/* context containing
+											 * execlockrelsinfo_list,
+											 * a child of the above context */
 } CachedPlan;
 
 /*
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index aeddbdafe5..9abace6734 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -137,6 +137,10 @@ typedef struct PortalData
 	CommandTag	commandTag;		/* command tag for original query */
 	QueryCompletion qc;			/* command completion data for executed query */
 	List	   *stmts;			/* list of PlannedStmts */
+	List	   *execlockrelsinfos;	/* list of ExecLockRelsInfo nodes with one element
+								 * for each of 'stmts'; same as
+								 * cplan->execlockrelsinfo_list if cplan is
+								 * not NULL */
 	CachedPlan *cplan;			/* CachedPlan, if stmts are from one */
 
 	ParamListInfo portalParams; /* params to pass to query */
@@ -241,6 +245,7 @@ extern void PortalDefineQuery(Portal portal,
 							  const char *sourceText,
 							  CommandTag commandTag,
 							  List *stmts,
+							  List *execlockrelsinfos,
 							  CachedPlan *cplan);
 extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
 extern void PortalCreateHoldStore(Portal portal);
-- 
2.24.1
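
The PlanInitPruningOutput comment in the patch above describes a
hand-off: the lock-collection phase performs initial pruning once and
stores the result, and executor startup fetches it rather than pruning
again.  A minimal standalone sketch of that lookup follows.  It assumes,
as the macro fragment using ipoIndexes[plan_node_id] - 1 suggests, a
per-plan-node array of 1-based indexes in which 0 means "nothing
recorded".  The Toy* types and toy_* functions are hypothetical and are
not part of the patch.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_NODES 8

/* Hypothetical stand-in for PlanInitPruningOutput: a bitmask of subplans. */
typedef struct ToyPruningOutput
{
	unsigned int initially_valid_subplans;
} ToyPruningOutput;

/* Hypothetical stand-in for ExecLockRelsInfo. */
typedef struct ToyLockRelsInfo
{
	ToyPruningOutput outputs[TOY_MAX_NODES];	/* stored results */
	int			noutputs;
	int			ipoIndexes[TOY_MAX_NODES];	/* 1-based; 0 = none */
} ToyLockRelsInfo;

/* Lock-collection phase: remember the pruning result for a plan node. */
void
toy_store_output(ToyLockRelsInfo *info, int plan_node_id, unsigned int valid)
{
	assert(plan_node_id >= 0 && plan_node_id < TOY_MAX_NODES);
	assert(info->noutputs < TOY_MAX_NODES);
	info->outputs[info->noutputs].initially_valid_subplans = valid;
	info->noutputs++;
	info->ipoIndexes[plan_node_id] = info->noutputs;	/* 1-based */
}

/* Executor startup: fetch the stored result, or NULL if there is none. */
const ToyPruningOutput *
toy_get_output(const ToyLockRelsInfo *info, int plan_node_id)
{
	int			idx;

	if (info == NULL)
		return NULL;
	assert(plan_node_id >= 0 && plan_node_id < TOY_MAX_NODES);
	idx = info->ipoIndexes[plan_node_id];
	return (idx > 0) ? &info->outputs[idx - 1] : NULL;
}
```

A NULL return is the cue for the caller to fall back to performing
initial pruning itself, which is what happens when no ExecLockRelsInfo
is available.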



  [application/x-patch] v6-0001-Some-refactoring-of-runtime-pruning-code.patch (26.5K, 5-v6-0001-Some-refactoring-of-runtime-pruning-code.patch)
From df8186c0e4a76f31c1f803a953f2c98ac88f9dc8 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Wed, 2 Mar 2022 15:17:55 +0900
Subject: [PATCH v6 1/4] Some refactoring of runtime pruning code

This does two things mainly:

* Move the execution pruning initialization steps that are common
between both ExecInitAppend() and ExecInitMergeAppend() into a new
function ExecInitPartitionPruning() defined in execPartition.c.
Thus, ExecCreatePartitionPruneState() and
ExecFindInitialMatchingSubPlans() need not be exported.

* Add an ExprContext field to PartitionPruneContext to remove the
implicit assumption in the runtime pruning code that the ExprContext
to use to compute pruning expressions that need one can always rely
on the PlanState providing it.  A future patch will allow runtime
pruning (at least the initial pruning steps) to be performed without
the corresponding PlanState yet having been created, so this will
help.
---
 src/backend/executor/execPartition.c   | 340 ++++++++++++++++---------
 src/backend/executor/nodeAppend.c      |  33 +--
 src/backend/executor/nodeMergeAppend.c |  32 +--
 src/backend/partitioning/partprune.c   |  20 +-
 src/include/executor/execPartition.h   |   9 +-
 src/include/partitioning/partprune.h   |   2 +
 6 files changed, 252 insertions(+), 184 deletions(-)
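
The second point of the commit message can be illustrated in isolation.
This is only a sketch under simplified assumptions: the Toy* structs and
toy_* functions are hypothetical stand-ins, not PostgreSQL types, and
"evaluating an expression" is reduced to reading one integer.  It shows
the pruning context carrying its own ExprContext pointer, so that
evaluation works whether or not a PlanState exists yet.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are the real PostgreSQL types. */
typedef struct ToyExprContext
{
	int			param_value;	/* pretend externally supplied parameter */
} ToyExprContext;

typedef struct ToyPlanState
{
	ToyExprContext *ps_ExprContext;
} ToyPlanState;

typedef struct ToyPruneContext
{
	ToyPlanState *planstate;	/* may be NULL before executor startup */
	ToyExprContext *exprcontext;	/* always used for evaluation */
} ToyPruneContext;

/* Set up the context; econtext is passed explicitly by the caller. */
void
toy_init_pruning_context(ToyPruneContext *context,
						 ToyPlanState *planstate,
						 ToyExprContext *econtext)
{
	context->planstate = planstate;
	context->exprcontext = econtext;
}

/* Evaluate a "pruning expression" using only context->exprcontext. */
int
toy_eval_pruning_expr(const ToyPruneContext *context)
{
	/* Some ExprContext must have been provided by the caller. */
	assert(context->planstate != NULL || context->exprcontext != NULL);
	/* If a PlanState exists, the two must agree. */
	assert(context->planstate == NULL ||
		   context->exprcontext == context->planstate->ps_ExprContext);

	return context->exprcontext->param_value;
}
```

The two assertions mirror the ones the patch adds to
partkey_datum_from_expr(); the rest is scaffolding for the sketch.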

diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 90ed1485d1..7ff5a95f05 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -182,11 +182,18 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
 												  bool *isnull,
 												  int maxfieldlen);
 static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
+static PartitionPruneState *ExecCreatePartitionPruneState(PlanState *planstate,
+							  PartitionPruneInfo *partitionpruneinfo);
+static Bitmapset *ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate);
 static void ExecInitPruningContext(PartitionPruneContext *context,
 								   List *pruning_steps,
 								   PartitionDesc partdesc,
 								   PartitionKey partkey,
-								   PlanState *planstate);
+								   PlanState *planstate,
+								   ExprContext *econtext);
+static void PartitionPruneStateFixSubPlanMap(PartitionPruneState *prunestate,
+								 Bitmapset *initially_valid_subplans,
+								 int n_total_subplans);
 static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
 										   PartitionedRelPruningData *pprune,
 										   bool initial_prune,
@@ -1485,30 +1492,86 @@ adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri)
  *
  * Functions:
  *
- * ExecCreatePartitionPruneState:
+ * ExecInitPartitionPruning:
  *		Creates the PartitionPruneState required by each of the two pruning
  *		functions.  Details stored include how to map the partition index
- *		returned by the partition pruning code into subplan indexes.
- *
- * ExecFindInitialMatchingSubPlans:
- *		Returns indexes of matching subplans.  Partition pruning is attempted
- *		without any evaluation of expressions containing PARAM_EXEC Params.
- *		This function must be called during executor startup for the parent
- *		plan before the subplans themselves are initialized.  Subplans which
- *		are found not to match by this function must be removed from the
- *		plan's list of subplans during execution, as this function performs a
- *		remap of the partition index to subplan index map and the newly
- *		created map provides indexes only for subplans which remain after
- *		calling this function.
+ *		returned by the partition pruning code into subplan indexes.  Also
+ *		determines the set of initially valid subplans by performing initial
+ *		pruning steps; only those subplans need be initialized by the caller
+ *		(e.g. ExecInitAppend).  Maps in PartitionPruneState are updated to
+ *		account for initial pruning having eliminated some subplans, if any.
  *
  * ExecFindMatchingSubPlans:
  *		Returns indexes of matching subplans after evaluating all available
- *		expressions.  This function can only be called during execution and
- *		must be called again each time the value of a Param listed in
- *		PartitionPruneState's 'execparamids' changes.
+ *		expressions, that is, using execution pruning steps.  This function can
+ *		only be called during execution and must be called again each time
+ *		the value of a Param listed in PartitionPruneState's 'execparamids'
+ *		changes.
  *-------------------------------------------------------------------------
  */
 
+/*
+ * ExecInitPartitionPruning
+ *		Initialize data structure needed for run-time partition pruning
+ *
+ * Initial pruning can be done immediately, so it is done here if needed and
+ * the set of indexes of surviving partition subplans is stored in the output
+ * parameter *initially_valid_subplans.
+ *
+ * If subplans are indeed pruned, subplan_map arrays contained in the returned
+ * PartitionPruneState are re-sequenced to not count those, though only if the
+ * maps will be needed for subsequent execution pruning passes.
+ */
+PartitionPruneState *
+ExecInitPartitionPruning(PlanState *planstate,
+						 int n_total_subplans,
+						 PartitionPruneInfo *pruneinfo,
+						 Bitmapset **initially_valid_subplans)
+{
+	PartitionPruneState *prunestate;
+	EState *estate = planstate->state;
+
+	/* We may need an expression context to evaluate partition exprs */
+	ExecAssignExprContext(estate, planstate);
+
+	/*
+	 * Create the working data structure for pruning.
+	 */
+	prunestate = ExecCreatePartitionPruneState(planstate, pruneinfo);
+
+	/*
+	 * Perform an initial partition prune, if required.
+	 */
+	if (prunestate->do_initial_prune)
+	{
+		/* Determine which subplans survive initial pruning */
+		*initially_valid_subplans = ExecFindInitialMatchingSubPlans(prunestate);
+	}
+	else
+	{
+		/* We'll need to initialize all subplans */
+		Assert(n_total_subplans > 0);
+		*initially_valid_subplans = bms_add_range(NULL, 0,
+												  n_total_subplans - 1);
+	}
+
+	/*
+	 * Re-sequence subplan indexes contained in prunestate to account for any
+	 * that were removed above due to initial pruning.
+	 *
+	 * We can safely skip this when !do_exec_prune, even though that leaves
+	 * invalid data in prunestate, because that data won't be consulted again
+	 * (cf initial Assert in ExecFindMatchingSubPlans).
+	 */
+	if (prunestate->do_exec_prune &&
+		bms_num_members(*initially_valid_subplans) < n_total_subplans)
+		PartitionPruneStateFixSubPlanMap(prunestate,
+										 *initially_valid_subplans,
+										 n_total_subplans);
+
+	return prunestate;
+}
+
 /*
  * ExecCreatePartitionPruneState
  *		Build the data structure required for calling
@@ -1527,7 +1590,7 @@ adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri)
  * re-used each time we re-evaluate which partitions match the pruning steps
  * provided in each PartitionedRelPruneInfo.
  */
-PartitionPruneState *
+static PartitionPruneState *
 ExecCreatePartitionPruneState(PlanState *planstate,
 							  PartitionPruneInfo *partitionpruneinfo)
 {
@@ -1536,6 +1599,7 @@ ExecCreatePartitionPruneState(PlanState *planstate,
 	int			n_part_hierarchies;
 	ListCell   *lc;
 	int			i;
+	ExprContext	*econtext = planstate->ps_ExprContext;
 
 	/* For data reading, executor always omits detached partitions */
 	if (estate->es_partition_directory == NULL)
@@ -1709,7 +1773,8 @@ ExecCreatePartitionPruneState(PlanState *planstate,
 			{
 				ExecInitPruningContext(&pprune->initial_context,
 									   pinfo->initial_pruning_steps,
-									   partdesc, partkey, planstate);
+									   partdesc, partkey, planstate,
+									   econtext);
 				/* Record whether initial pruning is needed at any level */
 				prunestate->do_initial_prune = true;
 			}
@@ -1718,7 +1783,8 @@ ExecCreatePartitionPruneState(PlanState *planstate,
 			{
 				ExecInitPruningContext(&pprune->exec_context,
 									   pinfo->exec_pruning_steps,
-									   partdesc, partkey, planstate);
+									   partdesc, partkey, planstate,
+									   econtext);
 				/* Record whether exec pruning is needed at any level */
 				prunestate->do_exec_prune = true;
 			}
@@ -1746,7 +1812,8 @@ ExecInitPruningContext(PartitionPruneContext *context,
 					   List *pruning_steps,
 					   PartitionDesc partdesc,
 					   PartitionKey partkey,
-					   PlanState *planstate)
+					   PlanState *planstate,
+					   ExprContext *econtext)
 {
 	int			n_steps;
 	int			partnatts;
@@ -1767,6 +1834,7 @@ ExecInitPruningContext(PartitionPruneContext *context,
 
 	context->ppccontext = CurrentMemoryContext;
 	context->planstate = planstate;
+	context->exprcontext = econtext;
 
 	/* Initialize expression state for each expression we need */
 	context->exprstates = (ExprState **)
@@ -1795,8 +1863,20 @@ ExecInitPruningContext(PartitionPruneContext *context,
 														step->step.step_id,
 														keyno);
 
-				context->exprstates[stateidx] =
-					ExecInitExpr(expr, context->planstate);
+				/*
+				 * When planstate is NULL, pruning_steps is known not to
+				 * contain any expressions that depend on the parent plan.
+				 * Information of any available EXTERN parameters must be
+				 * passed explicitly in that case, which the caller must
+				 * have made available via econtext.
+				 */
+				if (planstate == NULL)
+					context->exprstates[stateidx] =
+						ExecInitExprWithParams(expr,
+											   econtext->ecxt_param_list_info);
+				else
+					context->exprstates[stateidx] =
+						ExecInitExpr(expr, context->planstate);
 			}
 			keyno++;
 		}
@@ -1809,18 +1889,11 @@ ExecInitPruningContext(PartitionPruneContext *context,
  *		pruning, disregarding any pruning constraints involving PARAM_EXEC
  *		Params.
  *
- * If additional pruning passes will be required (because of PARAM_EXEC
- * Params), we must also update the translation data that allows conversion
- * of partition indexes into subplan indexes to account for the unneeded
- * subplans having been removed.
- *
  * Must only be called once per 'prunestate', and only if initial pruning
  * is required.
- *
- * 'nsubplans' must be passed as the total number of unpruned subplans.
  */
-Bitmapset *
-ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate, int nsubplans)
+static Bitmapset *
+ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate)
 {
 	Bitmapset  *result = NULL;
 	MemoryContext oldcontext;
@@ -1845,14 +1918,20 @@ ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate, int nsubplans)
 		PartitionedRelPruningData *pprune;
 
 		prunedata = prunestate->partprunedata[i];
+
+		/*
+		 * We pass the 1st item belonging to the root table of the hierarchy
+		 * and find_matching_subplans_recurse() takes care of recursing to
+		 * other (lower-level) parents as needed.
+		 */
 		pprune = &prunedata->partrelprunedata[0];
 
 		/* Perform pruning without using PARAM_EXEC Params */
 		find_matching_subplans_recurse(prunedata, pprune, true, &result);
 
-		/* Expression eval may have used space in node's ps_ExprContext too */
+		/* Expression eval may have used space in ExprContext too */
 		if (pprune->initial_pruning_steps)
-			ResetExprContext(pprune->initial_context.planstate->ps_ExprContext);
+			ResetExprContext(pprune->initial_context.exprcontext);
 	}
 
 	/* Add in any subplans that partition pruning didn't account for */
@@ -1865,118 +1944,120 @@ ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate, int nsubplans)
 
 	MemoryContextReset(prunestate->prune_context);
 
+	return result;
+}
+
+/*
+ * PartitionPruneStateFixSubPlanMap
+ *		Fix mapping of partition indexes to subplan indexes contained in
+ *		prunestate by considering the new list of subplans that survived
+ *		initial pruning
+ *
+ * Subplans that were previously indexed 0..(n_total_subplans - 1) are
+ * re-indexed to the range 0..(num(initially_valid_subplans) - 1).
+ */
+static void
+PartitionPruneStateFixSubPlanMap(PartitionPruneState *prunestate,
+								 Bitmapset *initially_valid_subplans,
+								 int n_total_subplans)
+{
+	int		   *new_subplan_indexes;
+	Bitmapset  *new_other_subplans;
+	int			i;
+	int			newidx;
+
 	/*
-	 * If exec-time pruning is required and we pruned subplans above, then we
-	 * must re-sequence the subplan indexes so that ExecFindMatchingSubPlans
-	 * properly returns the indexes from the subplans which will remain after
-	 * execution of this function.
-	 *
-	 * We can safely skip this when !do_exec_prune, even though that leaves
-	 * invalid data in prunestate, because that data won't be consulted again
-	 * (cf initial Assert in ExecFindMatchingSubPlans).
+	 * First we must build a temporary array which maps old subplan
+	 * indexes to new ones.  For convenience of initialization, we use
+	 * 1-based indexes in this array and leave pruned items as 0.
 	 */
-	if (prunestate->do_exec_prune && bms_num_members(result) < nsubplans)
+	new_subplan_indexes = (int *) palloc0(sizeof(int) * n_total_subplans);
+	newidx = 1;
+	i = -1;
+	while ((i = bms_next_member(initially_valid_subplans, i)) >= 0)
 	{
-		int		   *new_subplan_indexes;
-		Bitmapset  *new_other_subplans;
-		int			i;
-		int			newidx;
+		Assert(i < n_total_subplans);
+		new_subplan_indexes[i] = newidx++;
+	}
 
-		/*
-		 * First we must build a temporary array which maps old subplan
-		 * indexes to new ones.  For convenience of initialization, we use
-		 * 1-based indexes in this array and leave pruned items as 0.
-		 */
-		new_subplan_indexes = (int *) palloc0(sizeof(int) * nsubplans);
-		newidx = 1;
-		i = -1;
-		while ((i = bms_next_member(result, i)) >= 0)
-		{
-			Assert(i < nsubplans);
-			new_subplan_indexes[i] = newidx++;
-		}
+	/*
+	 * Now we can update each PartitionedRelPruneInfo's subplan_map with
+	 * new subplan indexes.  We must also recompute its present_parts
+	 * bitmap.
+	 */
+	for (i = 0; i < prunestate->num_partprunedata; i++)
+	{
+		PartitionPruningData *prunedata = prunestate->partprunedata[i];
+		int			j;
 
 		/*
-		 * Now we can update each PartitionedRelPruneInfo's subplan_map with
-		 * new subplan indexes.  We must also recompute its present_parts
-		 * bitmap.
+		 * Within each hierarchy, we perform this loop in back-to-front
+		 * order so that we determine present_parts for the lowest-level
+		 * partitioned tables first.  This way we can tell whether a
+		 * sub-partitioned table's partitions were entirely pruned so we
+		 * can exclude it from the current level's present_parts.
 		 */
-		for (i = 0; i < prunestate->num_partprunedata; i++)
+		for (j = prunedata->num_partrelprunedata - 1; j >= 0; j--)
 		{
-			PartitionPruningData *prunedata = prunestate->partprunedata[i];
-			int			j;
+			PartitionedRelPruningData *pprune = &prunedata->partrelprunedata[j];
+			int			nparts = pprune->nparts;
+			int			k;
 
-			/*
-			 * Within each hierarchy, we perform this loop in back-to-front
-			 * order so that we determine present_parts for the lowest-level
-			 * partitioned tables first.  This way we can tell whether a
-			 * sub-partitioned table's partitions were entirely pruned so we
-			 * can exclude it from the current level's present_parts.
-			 */
-			for (j = prunedata->num_partrelprunedata - 1; j >= 0; j--)
-			{
-				PartitionedRelPruningData *pprune = &prunedata->partrelprunedata[j];
-				int			nparts = pprune->nparts;
-				int			k;
+			/* We just rebuild present_parts from scratch */
+			bms_free(pprune->present_parts);
+			pprune->present_parts = NULL;
 
-				/* We just rebuild present_parts from scratch */
-				bms_free(pprune->present_parts);
-				pprune->present_parts = NULL;
+			for (k = 0; k < nparts; k++)
+			{
+				int			oldidx = pprune->subplan_map[k];
+				int			subidx;
 
-				for (k = 0; k < nparts; k++)
+				/*
+				 * If this partition existed as a subplan then change the
+				 * old subplan index to the new subplan index.  The new
+				 * index may become -1 if the partition was pruned above,
+				 * or it may just come earlier in the subplan list due to
+				 * some subplans being removed earlier in the list.  If
+				 * it's a subpartition, add it to present_parts unless
+				 * it's entirely pruned.
+				 */
+				if (oldidx >= 0)
 				{
-					int			oldidx = pprune->subplan_map[k];
-					int			subidx;
-
-					/*
-					 * If this partition existed as a subplan then change the
-					 * old subplan index to the new subplan index.  The new
-					 * index may become -1 if the partition was pruned above,
-					 * or it may just come earlier in the subplan list due to
-					 * some subplans being removed earlier in the list.  If
-					 * it's a subpartition, add it to present_parts unless
-					 * it's entirely pruned.
-					 */
-					if (oldidx >= 0)
-					{
-						Assert(oldidx < nsubplans);
-						pprune->subplan_map[k] = new_subplan_indexes[oldidx] - 1;
+					Assert(oldidx < n_total_subplans);
+					pprune->subplan_map[k] = new_subplan_indexes[oldidx] - 1;
 
-						if (new_subplan_indexes[oldidx] > 0)
-							pprune->present_parts =
-								bms_add_member(pprune->present_parts, k);
-					}
-					else if ((subidx = pprune->subpart_map[k]) >= 0)
-					{
-						PartitionedRelPruningData *subprune;
+					if (new_subplan_indexes[oldidx] > 0)
+						pprune->present_parts =
+							bms_add_member(pprune->present_parts, k);
+				}
+				else if ((subidx = pprune->subpart_map[k]) >= 0)
+				{
+					PartitionedRelPruningData *subprune;
 
-						subprune = &prunedata->partrelprunedata[subidx];
+					subprune = &prunedata->partrelprunedata[subidx];
 
-						if (!bms_is_empty(subprune->present_parts))
-							pprune->present_parts =
-								bms_add_member(pprune->present_parts, k);
-					}
+					if (!bms_is_empty(subprune->present_parts))
+						pprune->present_parts =
+							bms_add_member(pprune->present_parts, k);
 				}
 			}
 		}
+	}
 
-		/*
-		 * We must also recompute the other_subplans set, since indexes in it
-		 * may change.
-		 */
-		new_other_subplans = NULL;
-		i = -1;
-		while ((i = bms_next_member(prunestate->other_subplans, i)) >= 0)
-			new_other_subplans = bms_add_member(new_other_subplans,
-												new_subplan_indexes[i] - 1);
-
-		bms_free(prunestate->other_subplans);
-		prunestate->other_subplans = new_other_subplans;
+	/*
+	 * We must also recompute the other_subplans set, since indexes in it
+	 * may change.
+	 */
+	new_other_subplans = NULL;
+	i = -1;
+	while ((i = bms_next_member(prunestate->other_subplans, i)) >= 0)
+		new_other_subplans = bms_add_member(new_other_subplans,
+											new_subplan_indexes[i] - 1);
 
-		pfree(new_subplan_indexes);
-	}
+	bms_free(prunestate->other_subplans);
+	prunestate->other_subplans = new_other_subplans;
 
-	return result;
+	pfree(new_subplan_indexes);
 }
 
 /*
@@ -2018,11 +2099,16 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate)
 		prunedata = prunestate->partprunedata[i];
 		pprune = &prunedata->partrelprunedata[0];
 
+		/*
+		 * We pass the 1st item belonging to the root table of the hierarchy
+		 * and find_matching_subplans_recurse() takes care of recursing to
+		 * other (lower-level) parents as needed.
+		 */
 		find_matching_subplans_recurse(prunedata, pprune, false, &result);
 
-		/* Expression eval may have used space in node's ps_ExprContext too */
+		/* Expression eval may have used space in ExprContext too */
 		if (pprune->exec_pruning_steps)
-			ResetExprContext(pprune->exec_context.planstate->ps_ExprContext);
+			ResetExprContext(pprune->exec_context.exprcontext);
 	}
 
 	/* Add in any subplans that partition pruning didn't account for */
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 7937f1c88f..5b6d3eb23b 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -138,30 +138,17 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 	{
 		PartitionPruneState *prunestate;
 
-		/* We may need an expression context to evaluate partition exprs */
-		ExecAssignExprContext(estate, &appendstate->ps);
-
-		/* Create the working data structure for pruning. */
-		prunestate = ExecCreatePartitionPruneState(&appendstate->ps,
-												   node->part_prune_info);
+		/*
+		 * Set up pruning data structure.  Initial pruning steps, if any, are
+		 * performed as part of the setup, adding the set of indexes of
+		 * surviving subplans to 'validsubplans'.
+		 */
+		prunestate = ExecInitPartitionPruning(&appendstate->ps,
+											  list_length(node->appendplans),
+											  node->part_prune_info,
+											  &validsubplans);
 		appendstate->as_prune_state = prunestate;
-
-		/* Perform an initial partition prune, if required. */
-		if (prunestate->do_initial_prune)
-		{
-			/* Determine which subplans survive initial pruning */
-			validsubplans = ExecFindInitialMatchingSubPlans(prunestate,
-															list_length(node->appendplans));
-
-			nplans = bms_num_members(validsubplans);
-		}
-		else
-		{
-			/* We'll need to initialize all subplans */
-			nplans = list_length(node->appendplans);
-			Assert(nplans > 0);
-			validsubplans = bms_add_range(NULL, 0, nplans - 1);
-		}
+		nplans = bms_num_members(validsubplans);
 
 		/*
 		 * When no run-time pruning is required and there's at least one
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index 418f89dea8..9a9f29e845 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -86,29 +86,17 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
 	{
 		PartitionPruneState *prunestate;
 
-		/* We may need an expression context to evaluate partition exprs */
-		ExecAssignExprContext(estate, &mergestate->ps);
-
-		prunestate = ExecCreatePartitionPruneState(&mergestate->ps,
-												   node->part_prune_info);
+		/*
+		 * Set up pruning data structure.  Initial pruning steps, if any, are
+		 * performed as part of the setup, adding the set of indexes of
+		 * surviving subplans to 'validsubplans'.
+		 */
+		prunestate = ExecInitPartitionPruning(&mergestate->ps,
+											  list_length(node->mergeplans),
+											  node->part_prune_info,
+											  &validsubplans);
 		mergestate->ms_prune_state = prunestate;
-
-		/* Perform an initial partition prune, if required. */
-		if (prunestate->do_initial_prune)
-		{
-			/* Determine which subplans survive initial pruning */
-			validsubplans = ExecFindInitialMatchingSubPlans(prunestate,
-															list_length(node->mergeplans));
-
-			nplans = bms_num_members(validsubplans);
-		}
-		else
-		{
-			/* We'll need to initialize all subplans */
-			nplans = list_length(node->mergeplans);
-			Assert(nplans > 0);
-			validsubplans = bms_add_range(NULL, 0, nplans - 1);
-		}
+		nplans = bms_num_members(validsubplans);
 
 		/*
 		 * When no run-time pruning is required and there's at least one
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 1bc00826c1..7080cb25d9 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -798,6 +798,7 @@ prune_append_rel_partitions(RelOptInfo *rel)
 
 	/* These are not valid when being called from the planner */
 	context.planstate = NULL;
+	context.exprcontext = NULL;
 	context.exprstates = NULL;
 
 	/* Actual pruning happens here. */
@@ -808,8 +809,8 @@ prune_append_rel_partitions(RelOptInfo *rel)
  * get_matching_partitions
  *		Determine partitions that survive partition pruning
  *
- * Note: context->planstate must be set to a valid PlanState when the
- * pruning_steps were generated with a target other than PARTTARGET_PLANNER.
+ * Note: context->exprcontext must be valid when the pruning_steps were
+ * generated with a target other than PARTTARGET_PLANNER.
  *
  * Returns a Bitmapset of the RelOptInfo->part_rels indexes of the surviving
  * partitions.
@@ -3654,7 +3655,7 @@ match_boolean_partition_clause(Oid partopfamily, Expr *clause, Expr *partkey,
  * exprstate array.
  *
  * Note that the evaluated result may be in the per-tuple memory context of
- * context->planstate->ps_ExprContext, and we may have leaked other memory
+ * context->exprcontext, and we may have leaked other memory
  * there too.  This memory must be recovered by resetting that ExprContext
  * after we're done with the pruning operation (see execPartition.c).
  */
@@ -3677,13 +3678,18 @@ partkey_datum_from_expr(PartitionPruneContext *context,
 		ExprContext *ectx;
 
 		/*
-		 * We should never see a non-Const in a step unless we're running in
-		 * the executor.
+		 * We should never see a non-Const in a step unless the caller has
+		 * passed a valid ExprContext.
+		 *
+		 * When context->planstate is valid, context->exprcontext is the same
+		 * as context->planstate->ps_ExprContext.
 		 */
-		Assert(context->planstate != NULL);
+		Assert(context->planstate != NULL || context->exprcontext != NULL);
+		Assert(context->planstate == NULL ||
+			   (context->exprcontext == context->planstate->ps_ExprContext));
 
 		exprstate = context->exprstates[stateidx];
-		ectx = context->planstate->ps_ExprContext;
+		ectx = context->exprcontext;
 		*value = ExecEvalExprSwitchContext(exprstate, ectx, isnull);
 	}
 }
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 603d8becc4..fd5735a946 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -119,10 +119,9 @@ extern ResultRelInfo *ExecFindPartition(ModifyTableState *mtstate,
 										EState *estate);
 extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
 									PartitionTupleRouting *proute);
-extern PartitionPruneState *ExecCreatePartitionPruneState(PlanState *planstate,
-														  PartitionPruneInfo *partitionpruneinfo);
+extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
+						 int n_total_subplans,
+						 PartitionPruneInfo *pruneinfo,
+						 Bitmapset **initially_valid_subplans);
 extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate);
-extern Bitmapset *ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate,
-												  int nsubplans);
-
 #endif							/* EXECPARTITION_H */
diff --git a/src/include/partitioning/partprune.h b/src/include/partitioning/partprune.h
index ee11b6feae..90684efa25 100644
--- a/src/include/partitioning/partprune.h
+++ b/src/include/partitioning/partprune.h
@@ -41,6 +41,7 @@ struct RelOptInfo;
  *					subsidiary data, such as the FmgrInfos.
  * planstate		Points to the parent plan node's PlanState when called
  *					during execution; NULL when called from the planner.
+ * exprcontext		ExprContext to use when evaluating pruning expressions
  * exprstates		Array of ExprStates, indexed as per PruneCxtStateIdx; one
  *					for each partition key in each pruning step.  Allocated if
  *					planstate is non-NULL, otherwise NULL.
@@ -56,6 +57,7 @@ typedef struct PartitionPruneContext
 	FmgrInfo   *stepcmpfuncs;
 	MemoryContext ppccontext;
 	PlanState  *planstate;
+	ExprContext *exprcontext;
 	ExprState **exprstates;
 } PartitionPruneContext;
 
-- 
2.24.1
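
For readers following PartitionPruneStateFixSubPlanMap() in the patch
above, here is a reduced, standalone sketch of its re-sequencing step.
It makes simplifying assumptions: a plain unsigned bitmask stands in for
the Bitmapset of surviving subplans, only a single subplan_map array is
handled, and the present_parts and other_subplans updates are left out.
The function name toy_fix_subplan_map is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Re-sequence subplan_map after initial pruning.  'valid' is a bitmask of
 * surviving subplan indexes.  Entries of subplan_map that are -1 stay -1,
 * entries naming a pruned subplan become -1, and the rest are renumbered
 * to count only the surviving subplans.
 */
void
toy_fix_subplan_map(int *subplan_map, int nparts,
					unsigned int valid, int n_total_subplans)
{
	int		   *new_subplan_indexes;
	int			newidx = 1;
	int			i;

	assert(n_total_subplans > 0 && n_total_subplans <= 32);

	/* 1-based temporary map; calloc leaves pruned entries as 0 */
	new_subplan_indexes = calloc((size_t) n_total_subplans, sizeof(int));
	assert(new_subplan_indexes != NULL);
	for (i = 0; i < n_total_subplans; i++)
	{
		if (valid & (1u << i))
			new_subplan_indexes[i] = newidx++;
	}

	/* Translate old subplan indexes to new ones; pruned ones become -1 */
	for (i = 0; i < nparts; i++)
	{
		int			oldidx = subplan_map[i];

		if (oldidx >= 0)
		{
			assert(oldidx < n_total_subplans);
			subplan_map[i] = new_subplan_indexes[oldidx] - 1;
		}
	}

	free(new_subplan_indexes);
}
```

With four subplans of which only 1 and 3 survive (valid = 0xA), a map
of {0, 1, -1, 2, 3} becomes {-1, 0, -1, -1, 1}.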




* Re: generic plans and "initial" pruning
@ 2022-03-28 07:28  Amit Langote <[email protected]>
  parent: Amit Langote <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: Amit Langote @ 2022-03-28 07:28 UTC (permalink / raw)
  To: Robert Haas <[email protected]>; +Cc: Tom Lane <[email protected]>; pgsql-hackers; David Rowley *EXTERN* <[email protected]>

On Mon, Mar 28, 2022 at 4:17 PM Amit Langote <[email protected]> wrote:
> Other than the changes mentioned above, the updated patch now contains
> a bit more commentary than earlier versions, mostly around
> AcquireExecutorLocks()'s new way of determining the set of relations
> to lock and the significantly redesigned working of the "initial"
> execution pruning.

I forgot to rebase over the latest HEAD, so here's v7.  I also fixed
the _out and _read functions for PlanInitPruningOutput, which were
using an obsolete node label.

-- 
Amit Langote
EDB: http://www.enterprisedb.com


Attachments:

  [application/octet-stream] v7-0002-Add-Merge-Append.partitioned_rels.patch (17.4K, 2-v7-0002-Add-Merge-Append.partitioned_rels.patch)
From b43aac217ba51854c5a22636f94f14e81bae3991 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Thu, 24 Mar 2022 22:47:03 +0900
Subject: [PATCH v7 2/4] Add [Merge]Append.partitioned_rels

This field records the RT indexes of all partitioned ancestors leading
up to the leaf partitions that are appended by the node.

If a given [Merge]Append node is left out of the plan because its list
of child subplans has only one element, its partitioned_rels set is
added to PlannerGlobal.elidedAppendPartedRels, which is passed down to
the executor through PlannedStmt.

There are no users of partitioned_rels and elidedAppendPartedRels as
of this commit.  A later commit will need to extract the set of
relations that must be locked to make a plan tree safe for execution
by walking the plan tree itself, so having the partitioned tables also
present in the plan tree will be helpful.  Note that the executor
currently relies on the fact that the set of relations to be locked
can be obtained by simply scanning the range table that's made
available in PlannedStmt along with the plan tree.
---
 src/backend/nodes/copyfuncs.c           |  3 +++
 src/backend/nodes/outfuncs.c            |  5 +++++
 src/backend/nodes/readfuncs.c           |  3 +++
 src/backend/optimizer/path/joinrels.c   |  9 ++++++++
 src/backend/optimizer/plan/createplan.c | 18 +++++++++++++++-
 src/backend/optimizer/plan/planner.c    |  8 +++++++
 src/backend/optimizer/plan/setrefs.c    | 28 +++++++++++++++++++++++++
 src/backend/optimizer/util/inherit.c    | 16 ++++++++++++++
 src/backend/optimizer/util/relnode.c    | 20 ++++++++++++++++++
 src/include/nodes/pathnodes.h           | 22 +++++++++++++++++++
 src/include/nodes/plannodes.h           | 17 +++++++++++++++
 11 files changed, 148 insertions(+), 1 deletion(-)

diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 2cbd8aa0df..d4b5cc7e59 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -106,6 +106,7 @@ _copyPlannedStmt(const PlannedStmt *from)
 	COPY_NODE_FIELD(invalItems);
 	COPY_NODE_FIELD(paramExecTypes);
 	COPY_NODE_FIELD(utilityStmt);
+	COPY_BITMAPSET_FIELD(elidedAppendPartedRels);
 	COPY_LOCATION_FIELD(stmt_location);
 	COPY_SCALAR_FIELD(stmt_len);
 
@@ -253,6 +254,7 @@ _copyAppend(const Append *from)
 	COPY_SCALAR_FIELD(nasyncplans);
 	COPY_SCALAR_FIELD(first_partial_plan);
 	COPY_NODE_FIELD(part_prune_info);
+	COPY_BITMAPSET_FIELD(partitioned_rels);
 
 	return newnode;
 }
@@ -281,6 +283,7 @@ _copyMergeAppend(const MergeAppend *from)
 	COPY_POINTER_FIELD(collations, from->numCols * sizeof(Oid));
 	COPY_POINTER_FIELD(nullsFirst, from->numCols * sizeof(bool));
 	COPY_NODE_FIELD(part_prune_info);
+	COPY_BITMAPSET_FIELD(partitioned_rels);
 
 	return newnode;
 }
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index c25f0bd684..99056272f3 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -324,6 +324,7 @@ _outPlannedStmt(StringInfo str, const PlannedStmt *node)
 	WRITE_NODE_FIELD(invalItems);
 	WRITE_NODE_FIELD(paramExecTypes);
 	WRITE_NODE_FIELD(utilityStmt);
+	WRITE_BITMAPSET_FIELD(elidedAppendPartedRels);
 	WRITE_LOCATION_FIELD(stmt_location);
 	WRITE_INT_FIELD(stmt_len);
 }
@@ -443,6 +444,7 @@ _outAppend(StringInfo str, const Append *node)
 	WRITE_INT_FIELD(nasyncplans);
 	WRITE_INT_FIELD(first_partial_plan);
 	WRITE_NODE_FIELD(part_prune_info);
+	WRITE_BITMAPSET_FIELD(partitioned_rels);
 }
 
 static void
@@ -460,6 +462,7 @@ _outMergeAppend(StringInfo str, const MergeAppend *node)
 	WRITE_OID_ARRAY(collations, node->numCols);
 	WRITE_BOOL_ARRAY(nullsFirst, node->numCols);
 	WRITE_NODE_FIELD(part_prune_info);
+	WRITE_BITMAPSET_FIELD(partitioned_rels);
 }
 
 static void
@@ -2333,6 +2336,7 @@ _outPlannerGlobal(StringInfo str, const PlannerGlobal *node)
 	WRITE_BOOL_FIELD(parallelModeOK);
 	WRITE_BOOL_FIELD(parallelModeNeeded);
 	WRITE_CHAR_FIELD(maxParallelHazard);
+	WRITE_BITMAPSET_FIELD(elidedAppendPartedRels);
 }
 
 static void
@@ -2444,6 +2448,7 @@ _outRelOptInfo(StringInfo str, const RelOptInfo *node)
 	WRITE_BOOL_FIELD(partbounds_merged);
 	WRITE_BITMAPSET_FIELD(live_parts);
 	WRITE_BITMAPSET_FIELD(all_partrels);
+	WRITE_BITMAPSET_FIELD(partitioned_rels);
 }
 
 static void
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index e0b3ad1ed2..7536f216bd 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -1662,6 +1662,7 @@ _readPlannedStmt(void)
 	READ_NODE_FIELD(invalItems);
 	READ_NODE_FIELD(paramExecTypes);
 	READ_NODE_FIELD(utilityStmt);
+	READ_BITMAPSET_FIELD(elidedAppendPartedRels);
 	READ_LOCATION_FIELD(stmt_location);
 	READ_INT_FIELD(stmt_len);
 
@@ -1784,6 +1785,7 @@ _readAppend(void)
 	READ_INT_FIELD(nasyncplans);
 	READ_INT_FIELD(first_partial_plan);
 	READ_NODE_FIELD(part_prune_info);
+	READ_BITMAPSET_FIELD(partitioned_rels);
 
 	READ_DONE();
 }
@@ -1806,6 +1808,7 @@ _readMergeAppend(void)
 	READ_OID_ARRAY(collations, local_node->numCols);
 	READ_BOOL_ARRAY(nullsFirst, local_node->numCols);
 	READ_NODE_FIELD(part_prune_info);
+	READ_BITMAPSET_FIELD(partitioned_rels);
 
 	READ_DONE();
 }
diff --git a/src/backend/optimizer/path/joinrels.c b/src/backend/optimizer/path/joinrels.c
index 9da3ff2f9a..e74d40fee3 100644
--- a/src/backend/optimizer/path/joinrels.c
+++ b/src/backend/optimizer/path/joinrels.c
@@ -1549,6 +1549,15 @@ try_partitionwise_join(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
 		populate_joinrel_with_paths(root, child_rel1, child_rel2,
 									child_joinrel, child_sjinfo,
 									child_restrictlist);
+
+		/*
+		 * A parent relation's partitioned_rels must be a superset of the sets
+		 * of all its children, direct or indirect, so bubble up the child
+		 * joinrel's set.
+		 */
+		joinrel->partitioned_rels =
+			bms_add_members(joinrel->partitioned_rels,
+							child_joinrel->partitioned_rels);
 	}
 }
 
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index fa069a217c..0026086591 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -26,10 +26,12 @@
 #include "nodes/extensible.h"
 #include "nodes/makefuncs.h"
 #include "nodes/nodeFuncs.h"
+#include "optimizer/appendinfo.h"
 #include "optimizer/clauses.h"
 #include "optimizer/cost.h"
 #include "optimizer/optimizer.h"
 #include "optimizer/paramassign.h"
+#include "optimizer/pathnode.h"
 #include "optimizer/paths.h"
 #include "optimizer/placeholder.h"
 #include "optimizer/plancat.h"
@@ -1331,11 +1333,11 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
 										 best_path->subpaths,
 										 prunequal);
 	}
-
 	plan->appendplans = subplans;
 	plan->nasyncplans = nasyncplans;
 	plan->first_partial_plan = best_path->first_partial_path;
 	plan->part_prune_info = partpruneinfo;
+	plan->partitioned_rels = bms_copy(rel->partitioned_rels);
 
 	copy_generic_path_info(&plan->plan, (Path *) best_path);
 
@@ -1499,6 +1501,20 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
 	node->mergeplans = subplans;
 	node->part_prune_info = partpruneinfo;
 
+	/*
+	 * We need to explicitly add to the plan node the RT indexes of any
+	 * partitioned tables whose partitions will be scanned by the nodes in
+	 * 'subplans'.  There can be multiple RT indexes in the set due to the
+	 * partition tree being multi-level and/or this being a plan for UNION ALL
+	 * over multiple partition trees.  Along with scanrelids of leaf-level Scan
+	 * nodes, this allows the executor to lock the full set of relations being
+	 * scanned by this node.
+	 *
+	 * Note that 'apprelids' only contains the top-level base relation(s), so
+	 * is not sufficient for the purpose.
+	 */
+	node->partitioned_rels = bms_copy(rel->partitioned_rels);
+
 	/*
 	 * If prepare_sort_from_pathkeys added sort columns, but we were told to
 	 * produce either the exact tlist or a narrow tlist, we should get rid of
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index bd09f85aea..374a9d9753 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -529,6 +529,7 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 	result->paramExecTypes = glob->paramExecTypes;
 	/* utilityStmt should be null, but we might as well copy it */
 	result->utilityStmt = parse->utilityStmt;
+	result->elidedAppendPartedRels = glob->elidedAppendPartedRels;
 	result->stmt_location = parse->stmt_location;
 	result->stmt_len = parse->stmt_len;
 
@@ -7365,6 +7366,13 @@ create_partitionwise_grouping_paths(PlannerInfo *root,
 
 		add_paths_to_append_rel(root, grouped_rel, grouped_live_children);
 	}
+
+	/*
+	 * Input rel might be a partitioned appendrel, though grouped_rel has at
+	 * this point taken its role as the appendrel owning the former's
+	 * children, so copy the former's partitioned_rels set into the latter.
+	 */
+	grouped_rel->partitioned_rels = bms_copy(input_rel->partitioned_rels);
 }
 
 /*
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index a7b11b7f03..dbdeb8ec9d 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -1512,6 +1512,10 @@ set_append_references(PlannerInfo *root,
 		lfirst(l) = set_plan_refs(root, (Plan *) lfirst(l), rtoffset);
 	}
 
+	/* Fix up partitioned_rels before possibly removing the Append below. */
+	aplan->partitioned_rels = offset_relid_set(aplan->partitioned_rels,
+											   rtoffset);
+
 	/*
 	 * See if it's safe to get rid of the Append entirely.  For this to be
 	 * safe, there must be only one child plan and that child plan's parallel
@@ -1522,8 +1526,17 @@ set_append_references(PlannerInfo *root,
 	 */
 	if (list_length(aplan->appendplans) == 1 &&
 		((Plan *) linitial(aplan->appendplans))->parallel_aware == aplan->plan.parallel_aware)
+	{
+		/*
+		 * Partitioned tables involved, if any, must be made known to the
+		 * executor.
+		 */
+		root->glob->elidedAppendPartedRels =
+			bms_add_members(root->glob->elidedAppendPartedRels,
+							aplan->partitioned_rels);
 		return clean_up_removed_plan_level((Plan *) aplan,
 										   (Plan *) linitial(aplan->appendplans));
+	}
 
 	/*
 	 * Otherwise, clean up the Append as needed.  It's okay to do this after
@@ -1584,6 +1597,12 @@ set_mergeappend_references(PlannerInfo *root,
 		lfirst(l) = set_plan_refs(root, (Plan *) lfirst(l), rtoffset);
 	}
 
+	/*
+	 * Fix up partitioned_rels before possibly removing the MergeAppend below.
+	 */
+	mplan->partitioned_rels = offset_relid_set(mplan->partitioned_rels,
+											   rtoffset);
+
 	/*
 	 * See if it's safe to get rid of the MergeAppend entirely.  For this to
 	 * be safe, there must be only one child plan and that child plan's
@@ -1594,8 +1613,17 @@ set_mergeappend_references(PlannerInfo *root,
 	 */
 	if (list_length(mplan->mergeplans) == 1 &&
 		((Plan *) linitial(mplan->mergeplans))->parallel_aware == mplan->plan.parallel_aware)
+	{
+		/*
+		 * Partitioned tables involved, if any, must be made known to the
+		 * executor.
+		 */
+		root->glob->elidedAppendPartedRels =
+			bms_add_members(root->glob->elidedAppendPartedRels,
+							mplan->partitioned_rels);
 		return clean_up_removed_plan_level((Plan *) mplan,
 										   (Plan *) linitial(mplan->mergeplans));
+	}
 
 	/*
 	 * Otherwise, clean up the MergeAppend as needed.  It's okay to do this
diff --git a/src/backend/optimizer/util/inherit.c b/src/backend/optimizer/util/inherit.c
index 7e134822f3..56912e4101 100644
--- a/src/backend/optimizer/util/inherit.c
+++ b/src/backend/optimizer/util/inherit.c
@@ -406,6 +406,14 @@ expand_partitioned_rtentry(PlannerInfo *root, RelOptInfo *relinfo,
 									   childrte, childRTindex,
 									   childrel, top_parentrc, lockmode);
 
+		/*
+		 * A parent relation's partitioned_rels must be a superset of the sets
+		 * of all its children, direct or indirect, so bubble up the child
+		 * rel's set.
+		 */
+		relinfo->partitioned_rels = bms_add_members(relinfo->partitioned_rels,
+													childrelinfo->partitioned_rels);
+
 		/* Close child relation, but keep locks */
 		table_close(childrel, NoLock);
 	}
@@ -737,6 +745,14 @@ expand_appendrel_subquery(PlannerInfo *root, RelOptInfo *rel,
 		/* Child may itself be an inherited rel, either table or subquery. */
 		if (childrte->inh)
 			expand_inherited_rtentry(root, childrel, childrte, childRTindex);
+
+		/*
+		 * A parent relation's partitioned_rels must be a superset of the sets
+		 * of all its children, direct or indirect, so bubble up the child
+		 * rel's set.
+		 */
+		rel->partitioned_rels = bms_add_members(rel->partitioned_rels,
+												childrel->partitioned_rels);
 	}
 }
 
diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c
index 520409f4ba..1d082a8fdd 100644
--- a/src/backend/optimizer/util/relnode.c
+++ b/src/backend/optimizer/util/relnode.c
@@ -361,6 +361,10 @@ build_simple_rel(PlannerInfo *root, int relid, RelOptInfo *parent)
 		}
 	}
 
+	/* A partitioned appendrel. */
+	if (rel->part_scheme != NULL)
+		rel->partitioned_rels = bms_copy(rel->relids);
+
 	/* Save the finished struct in the query's simple_rel_array */
 	root->simple_rel_array[relid] = rel;
 
@@ -729,6 +733,14 @@ build_join_rel(PlannerInfo *root,
 	set_joinrel_size_estimates(root, joinrel, outer_rel, inner_rel,
 							   sjinfo, restrictlist);
 
+	/*
+	 * The joinrel may get processed as an appendrel via partitionwise join
+	 * if both outer and inner rels are partitioned, so set partitioned_rels
+	 * appropriately.
+	 */
+	joinrel->partitioned_rels = bms_union(outer_rel->partitioned_rels,
+										  inner_rel->partitioned_rels);
+
 	/*
 	 * Set the consider_parallel flag if this joinrel could potentially be
 	 * scanned within a parallel worker.  If this flag is false for either
@@ -897,6 +909,14 @@ build_child_join_rel(PlannerInfo *root, RelOptInfo *outer_rel,
 	set_joinrel_size_estimates(root, joinrel, outer_rel, inner_rel,
 							   sjinfo, restrictlist);
 
+	/*
+	 * The joinrel may get processed as an appendrel via partitionwise join
+	 * if both outer and inner rels are partitioned, so set partitioned_rels
+	 * appropriately.
+	 */
+	joinrel->partitioned_rels = bms_union(outer_rel->partitioned_rels,
+										  inner_rel->partitioned_rels);
+
 	/* We build the join only once. */
 	Assert(!find_join_rel(root, joinrel->relids));
 
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 1f3845b3fe..5327d9ba8b 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -130,6 +130,11 @@ typedef struct PlannerGlobal
 	char		maxParallelHazard;	/* worst PROPARALLEL hazard level */
 
 	PartitionDirectory partition_directory; /* partition descriptors */
+
+	Bitmapset *elidedAppendPartedRels;	/* Combined partitioned_rels of all
+										 * single-subplan [Merge]Append nodes
+										 * that have been removed from the
+										 * various plan trees. */
 } PlannerGlobal;
 
 /* macro for fetching the Plan associated with a SubPlan node */
@@ -773,6 +778,23 @@ typedef struct RelOptInfo
 	Relids		all_partrels;	/* Relids set of all partition relids */
 	List	  **partexprs;		/* Non-nullable partition key expressions */
 	List	  **nullable_partexprs; /* Nullable partition key expressions */
+
+	/*
+	 * For an appendrel parent relation (base, join, or upper) that is
+	 * partitioned, this stores the RT indexes of all the partitioned ancestors
+	 * including itself that lead up to the individual leaf partitions that
+	 * will be scanned to produce this relation's output rows.  The relid set
+	 * is copied into the resulting Append or MergeAppend plan node to
+	 * allow the executor to take appropriate locks on those relations,
+	 * unless the node is deemed useless in setrefs.c due to having a single
+	 * leaf subplan and thus elided from the final plan, in which case, the set
+	 * is added into PlannerGlobal.elidedAppendPartedRels.
+	 *
+	 * Note that 'apprelids' of those nodes only contains the top-level base
+	 * relation(s), so is not sufficient for said purpose.
+	 */
+
+	Bitmapset  *partitioned_rels;
 } RelOptInfo;
 
 /*
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 0b518ce6b2..bd87c35d6c 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -85,6 +85,11 @@ typedef struct PlannedStmt
 
 	Node	   *utilityStmt;	/* non-null if this is utility stmt */
 
+	Bitmapset *elidedAppendPartedRels;	/* Combined partitioned_rels of all
+										 * single-subplan [Merge]Append nodes
+										 * that have been removed from the
+										 * various plan trees. */
+
 	/* statement location in source string (copied from Query) */
 	int			stmt_location;	/* start location, or -1 if unknown */
 	int			stmt_len;		/* length in bytes; 0 means "rest of string" */
@@ -261,6 +266,12 @@ typedef struct Append
 
 	/* Info for run-time subplan pruning; NULL if we're not doing that */
 	struct PartitionPruneInfo *part_prune_info;
+
+	/*
+	 * RT indexes of all partitioned parents whose partitions' plans are
+	 * present in appendplans.
+	 */
+	Bitmapset  *partitioned_rels;
 } Append;
 
 /* ----------------
@@ -281,6 +292,12 @@ typedef struct MergeAppend
 	bool	   *nullsFirst;		/* NULLS FIRST/LAST directions */
 	/* Info for run-time subplan pruning; NULL if we're not doing that */
 	struct PartitionPruneInfo *part_prune_info;
+
+	/*
+	 * RT indexes of all partitioned parents whose partitions' plans are
+	 * present in appendplans.
+	 */
+	Bitmapset  *partitioned_rels;
 } MergeAppend;
 
 /* ----------------
-- 
2.24.1



  [application/octet-stream] v7-0003-Add-a-plan_tree_walker.patch (3.9K, 3-v7-0003-Add-a-plan_tree_walker.patch)
From 761e6c2583b37eb9d45d64de954d65d953277040 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Thu, 3 Mar 2022 16:04:13 +0900
Subject: [PATCH v7 3/4] Add a plan_tree_walker()

Like planstate_tree_walker() but for uninitialized plan trees.
---
 src/backend/nodes/nodeFuncs.c | 116 ++++++++++++++++++++++++++++++++++
 src/include/nodes/nodeFuncs.h |   3 +
 2 files changed, 119 insertions(+)

diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c
index 25cf282aab..5e5158ea0e 100644
--- a/src/backend/nodes/nodeFuncs.c
+++ b/src/backend/nodes/nodeFuncs.c
@@ -31,6 +31,10 @@ static bool planstate_walk_subplans(List *plans, bool (*walker) (),
 									void *context);
 static bool planstate_walk_members(PlanState **planstates, int nplans,
 								   bool (*walker) (), void *context);
+static bool plan_walk_subplans(List *plans,
+				   bool (*walker) (),
+				   void *context);
+static bool plan_walk_members(List *plans, bool (*walker) (), void *context);
 
 
 /*
@@ -4368,3 +4372,115 @@ planstate_walk_members(PlanState **planstates, int nplans,
 
 	return false;
 }
+
+/*
+ * plan_tree_walker --- walk plantrees
+ *
+ * The walker has already visited the current node, and so we need only
+ * recurse into any sub-nodes it has.
+ */
+bool
+plan_tree_walker(Plan *plan,
+				 bool (*walker) (),
+				 void *context)
+{
+	/* Guard against stack overflow due to overly complex plan trees */
+	check_stack_depth();
+
+	/* initPlan-s */
+	if (plan_walk_subplans(plan->initPlan, walker, context))
+		return true;
+
+	/* lefttree */
+	if (outerPlan(plan))
+	{
+		if (walker(outerPlan(plan), context))
+			return true;
+	}
+
+	/* righttree */
+	if (innerPlan(plan))
+	{
+		if (walker(innerPlan(plan), context))
+			return true;
+	}
+
+	/* special child plans */
+	switch (nodeTag(plan))
+	{
+		case T_Append:
+			if (plan_walk_members(((Append *) plan)->appendplans,
+								  walker, context))
+				return true;
+			break;
+		case T_MergeAppend:
+			if (plan_walk_members(((MergeAppend *) plan)->mergeplans,
+								  walker, context))
+				return true;
+			break;
+		case T_BitmapAnd:
+			if (plan_walk_members(((BitmapAnd *) plan)->bitmapplans,
+								  walker, context))
+				return true;
+			break;
+		case T_BitmapOr:
+			if (plan_walk_members(((BitmapOr *) plan)->bitmapplans,
+								  walker, context))
+				return true;
+			break;
+		case T_CustomScan:
+			if (plan_walk_members(((CustomScan *) plan)->custom_plans,
+								  walker, context))
+				return true;
+			break;
+		case T_SubqueryScan:
+			if (walker(((SubqueryScan *) plan)->subplan, context))
+				return true;
+			break;
+		default:
+			break;
+	}
+
+	return false;
+}
+
+/*
+ * Walk a list of SubPlans (or initPlans, which also use SubPlan nodes).
+ */
+static bool
+plan_walk_subplans(List *plans,
+				   bool (*walker) (),
+				   void *context)
+{
+	ListCell   *lc;
+	PlannedStmt *plannedstmt = (PlannedStmt *) context;
+
+	foreach(lc, plans)
+	{
+		SubPlan *sp = lfirst_node(SubPlan, lc);
+		Plan *p = list_nth(plannedstmt->subplans, sp->plan_id - 1);
+
+		if (walker(p, context))
+			return true;
+	}
+
+	return false;
+}
+
+/*
+ * Walk the constituent plans of an Append, MergeAppend, BitmapAnd,
+ * BitmapOr, or CustomScan node.
+ */
+static bool
+plan_walk_members(List *plans, bool (*walker) (), void *context)
+{
+	ListCell *lc;
+
+	foreach(lc, plans)
+	{
+		if (walker(lfirst(lc), context))
+			return true;
+	}
+
+	return false;
+}
diff --git a/src/include/nodes/nodeFuncs.h b/src/include/nodes/nodeFuncs.h
index 93c60bde66..fca107ad65 100644
--- a/src/include/nodes/nodeFuncs.h
+++ b/src/include/nodes/nodeFuncs.h
@@ -158,5 +158,8 @@ extern bool raw_expression_tree_walker(Node *node, bool (*walker) (),
 struct PlanState;
 extern bool planstate_tree_walker(struct PlanState *planstate, bool (*walker) (),
 								  void *context);
+struct Plan;
+extern bool plan_tree_walker(struct Plan *plan, bool (*walker) (),
+				 void *context);
 
 #endif							/* NODEFUNCS_H */
-- 
2.24.1



  [application/octet-stream] v7-0001-Some-refactoring-of-runtime-pruning-code.patch (26.5K, 4-v7-0001-Some-refactoring-of-runtime-pruning-code.patch)
From 60ec0ebb911a2c7c8cc13ea9f96e1fb2038842a0 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Wed, 2 Mar 2022 15:17:55 +0900
Subject: [PATCH v7 1/4] Some refactoring of runtime pruning code

This does two things mainly:

* Move the execution pruning initialization steps that are common to
ExecInitAppend() and ExecInitMergeAppend() into a new function,
ExecInitPartitionPruning(), defined in execPartition.c.  As a result,
ExecCreatePartitionPruneState() and ExecFindInitialMatchingSubPlans()
no longer need to be exported.

* Add an ExprContext field to PartitionPruneContext, so that the
runtime pruning code no longer implicitly assumes that the ExprContext
needed to compute pruning expressions can always be obtained from the
PlanState.  A future patch will allow runtime pruning (at least the
initial pruning steps) to be performed before the corresponding
PlanState has been created, so this will help.
---
 src/backend/executor/execPartition.c   | 340 ++++++++++++++++---------
 src/backend/executor/nodeAppend.c      |  33 +--
 src/backend/executor/nodeMergeAppend.c |  32 +--
 src/backend/partitioning/partprune.c   |  20 +-
 src/include/executor/execPartition.h   |   9 +-
 src/include/partitioning/partprune.h   |   2 +
 6 files changed, 252 insertions(+), 184 deletions(-)

diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 90ed1485d1..7ff5a95f05 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -182,11 +182,18 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
 												  bool *isnull,
 												  int maxfieldlen);
 static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
+static PartitionPruneState *ExecCreatePartitionPruneState(PlanState *planstate,
+							  PartitionPruneInfo *partitionpruneinfo);
+static Bitmapset *ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate);
 static void ExecInitPruningContext(PartitionPruneContext *context,
 								   List *pruning_steps,
 								   PartitionDesc partdesc,
 								   PartitionKey partkey,
-								   PlanState *planstate);
+								   PlanState *planstate,
+								   ExprContext *econtext);
+static void PartitionPruneStateFixSubPlanMap(PartitionPruneState *prunestate,
+								 Bitmapset *initially_valid_subplans,
+								 int n_total_subplans);
 static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
 										   PartitionedRelPruningData *pprune,
 										   bool initial_prune,
@@ -1485,30 +1492,86 @@ adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri)
  *
  * Functions:
  *
- * ExecCreatePartitionPruneState:
+ * ExecInitPartitionPruning:
  *		Creates the PartitionPruneState required by each of the two pruning
  *		functions.  Details stored include how to map the partition index
- *		returned by the partition pruning code into subplan indexes.
- *
- * ExecFindInitialMatchingSubPlans:
- *		Returns indexes of matching subplans.  Partition pruning is attempted
- *		without any evaluation of expressions containing PARAM_EXEC Params.
- *		This function must be called during executor startup for the parent
- *		plan before the subplans themselves are initialized.  Subplans which
- *		are found not to match by this function must be removed from the
- *		plan's list of subplans during execution, as this function performs a
- *		remap of the partition index to subplan index map and the newly
- *		created map provides indexes only for subplans which remain after
- *		calling this function.
+ *		returned by the partition pruning code into subplan indexes.  Also
+ *		determines the set of initially valid subplans by performing initial
+ *		pruning steps; only those subplans need be initialized by the caller,
+ *		such as ExecInitAppend.  Maps in PartitionPruneState are updated to
+ *		account for initial pruning having eliminated some subplans, if any.
  *
  * ExecFindMatchingSubPlans:
  *		Returns indexes of matching subplans after evaluating all available
- *		expressions.  This function can only be called during execution and
- *		must be called again each time the value of a Param listed in
- *		PartitionPruneState's 'execparamids' changes.
+ *		expressions, that is, using execution pruning steps.  This function
+ *		can only be called during execution and must be called again each time
+ *		the value of a Param listed in PartitionPruneState's 'execparamids'
+ *		changes.
  *-------------------------------------------------------------------------
  */
 
+/*
+ * ExecInitPartitionPruning
+ * 		Initialize data structure needed for run-time partition pruning
+ *
+ * Initial pruning can be done immediately, so it is done here if needed and
+ * the set of indexes of surviving partition subplans is returned in the
+ * output parameter *initially_valid_subplans.
+ *
+ * If subplans are indeed pruned, subplan_map arrays contained in the returned
+ * PartitionPruneState are re-sequenced to not count those, though only if the
+ * maps will be needed for subsequent execution pruning passes.
+ */
+PartitionPruneState *
+ExecInitPartitionPruning(PlanState *planstate,
+						 int n_total_subplans,
+						 PartitionPruneInfo *pruneinfo,
+						 Bitmapset **initially_valid_subplans)
+{
+	PartitionPruneState *prunestate;
+	EState *estate = planstate->state;
+
+	/* We may need an expression context to evaluate partition exprs */
+	ExecAssignExprContext(estate, planstate);
+
+	/*
+	 * Create the working data structure for pruning.
+	 */
+	prunestate = ExecCreatePartitionPruneState(planstate, pruneinfo);
+
+	/*
+	 * Perform an initial partition prune, if required.
+	 */
+	if (prunestate->do_initial_prune)
+	{
+		/* Determine which subplans survive initial pruning */
+		*initially_valid_subplans = ExecFindInitialMatchingSubPlans(prunestate);
+	}
+	else
+	{
+		/* We'll need to initialize all subplans */
+		Assert(n_total_subplans > 0);
+		*initially_valid_subplans = bms_add_range(NULL, 0,
+												  n_total_subplans - 1);
+	}
+
+	/*
+	 * Re-sequence subplan indexes contained in prunestate to account for any
+	 * that were removed above due to initial pruning.
+	 *
+	 * We can safely skip this when !do_exec_prune, even though that leaves
+	 * invalid data in prunestate, because that data won't be consulted again
+	 * (cf initial Assert in ExecFindMatchingSubPlans).
+	 */
+	if (prunestate->do_exec_prune &&
+		bms_num_members(*initially_valid_subplans) < n_total_subplans)
+		PartitionPruneStateFixSubPlanMap(prunestate,
+										 *initially_valid_subplans,
+										 n_total_subplans);
+
+	return prunestate;
+}
+
 /*
  * ExecCreatePartitionPruneState
  *		Build the data structure required for calling
@@ -1527,7 +1590,7 @@ adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri)
  * re-used each time we re-evaluate which partitions match the pruning steps
  * provided in each PartitionedRelPruneInfo.
  */
-PartitionPruneState *
+static PartitionPruneState *
 ExecCreatePartitionPruneState(PlanState *planstate,
 							  PartitionPruneInfo *partitionpruneinfo)
 {
@@ -1536,6 +1599,7 @@ ExecCreatePartitionPruneState(PlanState *planstate,
 	int			n_part_hierarchies;
 	ListCell   *lc;
 	int			i;
+	ExprContext	*econtext = planstate->ps_ExprContext;
 
 	/* For data reading, executor always omits detached partitions */
 	if (estate->es_partition_directory == NULL)
@@ -1709,7 +1773,8 @@ ExecCreatePartitionPruneState(PlanState *planstate,
 			{
 				ExecInitPruningContext(&pprune->initial_context,
 									   pinfo->initial_pruning_steps,
-									   partdesc, partkey, planstate);
+									   partdesc, partkey, planstate,
+									   econtext);
 				/* Record whether initial pruning is needed at any level */
 				prunestate->do_initial_prune = true;
 			}
@@ -1718,7 +1783,8 @@ ExecCreatePartitionPruneState(PlanState *planstate,
 			{
 				ExecInitPruningContext(&pprune->exec_context,
 									   pinfo->exec_pruning_steps,
-									   partdesc, partkey, planstate);
+									   partdesc, partkey, planstate,
+									   econtext);
 				/* Record whether exec pruning is needed at any level */
 				prunestate->do_exec_prune = true;
 			}
@@ -1746,7 +1812,8 @@ ExecInitPruningContext(PartitionPruneContext *context,
 					   List *pruning_steps,
 					   PartitionDesc partdesc,
 					   PartitionKey partkey,
-					   PlanState *planstate)
+					   PlanState *planstate,
+					   ExprContext *econtext)
 {
 	int			n_steps;
 	int			partnatts;
@@ -1767,6 +1834,7 @@ ExecInitPruningContext(PartitionPruneContext *context,
 
 	context->ppccontext = CurrentMemoryContext;
 	context->planstate = planstate;
+	context->exprcontext = econtext;
 
 	/* Initialize expression state for each expression we need */
 	context->exprstates = (ExprState **)
@@ -1795,8 +1863,20 @@ ExecInitPruningContext(PartitionPruneContext *context,
 														step->step.step_id,
 														keyno);
 
-				context->exprstates[stateidx] =
-					ExecInitExpr(expr, context->planstate);
+				/*
+				 * When planstate is NULL, pruning_steps is known not to
+				 * contain any expressions that depend on the parent plan.
+				 * Information about any available EXTERN parameters must be
+				 * passed explicitly in that case; the caller must have made
+				 * it available via econtext.
+				 */
+				if (planstate == NULL)
+					context->exprstates[stateidx] =
+						ExecInitExprWithParams(expr,
+											   econtext->ecxt_param_list_info);
+				else
+					context->exprstates[stateidx] =
+						ExecInitExpr(expr, context->planstate);
 			}
 			keyno++;
 		}
@@ -1809,18 +1889,11 @@ ExecInitPruningContext(PartitionPruneContext *context,
  *		pruning, disregarding any pruning constraints involving PARAM_EXEC
  *		Params.
  *
- * If additional pruning passes will be required (because of PARAM_EXEC
- * Params), we must also update the translation data that allows conversion
- * of partition indexes into subplan indexes to account for the unneeded
- * subplans having been removed.
- *
  * Must only be called once per 'prunestate', and only if initial pruning
  * is required.
- *
- * 'nsubplans' must be passed as the total number of unpruned subplans.
  */
-Bitmapset *
-ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate, int nsubplans)
+static Bitmapset *
+ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate)
 {
 	Bitmapset  *result = NULL;
 	MemoryContext oldcontext;
@@ -1845,14 +1918,20 @@ ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate, int nsubplans)
 		PartitionedRelPruningData *pprune;
 
 		prunedata = prunestate->partprunedata[i];
+
+		/*
+		 * We pass the 1st item belonging to the root table of the hierarchy
+		 * and find_matching_subplans_recurse() takes care of recursing to
+		 * other (lower-level) parents as needed.
+		 */
 		pprune = &prunedata->partrelprunedata[0];
 
 		/* Perform pruning without using PARAM_EXEC Params */
 		find_matching_subplans_recurse(prunedata, pprune, true, &result);
 
-		/* Expression eval may have used space in node's ps_ExprContext too */
+		/* Expression eval may have used space in ExprContext too */
 		if (pprune->initial_pruning_steps)
-			ResetExprContext(pprune->initial_context.planstate->ps_ExprContext);
+			ResetExprContext(pprune->initial_context.exprcontext);
 	}
 
 	/* Add in any subplans that partition pruning didn't account for */
@@ -1865,118 +1944,120 @@ ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate, int nsubplans)
 
 	MemoryContextReset(prunestate->prune_context);
 
+	return result;
+}
+
+/*
+ * PartitionPruneStateFixSubPlanMap
+ *		Fix mapping of partition indexes to subplan indexes contained in
+ *		prunestate by considering the new list of subplans that survived
+ *		initial pruning
+ *
+ * Subplans that were previously indexed 0..(n_total_subplans - 1) are
+ * renumbered into the range 0..(num(initially_valid_subplans) - 1).
+ */
+static void
+PartitionPruneStateFixSubPlanMap(PartitionPruneState *prunestate,
+								 Bitmapset *initially_valid_subplans,
+								 int n_total_subplans)
+{
+	int		   *new_subplan_indexes;
+	Bitmapset  *new_other_subplans;
+	int			i;
+	int			newidx;
+
 	/*
-	 * If exec-time pruning is required and we pruned subplans above, then we
-	 * must re-sequence the subplan indexes so that ExecFindMatchingSubPlans
-	 * properly returns the indexes from the subplans which will remain after
-	 * execution of this function.
-	 *
-	 * We can safely skip this when !do_exec_prune, even though that leaves
-	 * invalid data in prunestate, because that data won't be consulted again
-	 * (cf initial Assert in ExecFindMatchingSubPlans).
+	 * First we must build a temporary array which maps old subplan
+	 * indexes to new ones.  For convenience of initialization, we use
+	 * 1-based indexes in this array and leave pruned items as 0.
 	 */
-	if (prunestate->do_exec_prune && bms_num_members(result) < nsubplans)
+	new_subplan_indexes = (int *) palloc0(sizeof(int) * n_total_subplans);
+	newidx = 1;
+	i = -1;
+	while ((i = bms_next_member(initially_valid_subplans, i)) >= 0)
 	{
-		int		   *new_subplan_indexes;
-		Bitmapset  *new_other_subplans;
-		int			i;
-		int			newidx;
+		Assert(i < n_total_subplans);
+		new_subplan_indexes[i] = newidx++;
+	}
 
-		/*
-		 * First we must build a temporary array which maps old subplan
-		 * indexes to new ones.  For convenience of initialization, we use
-		 * 1-based indexes in this array and leave pruned items as 0.
-		 */
-		new_subplan_indexes = (int *) palloc0(sizeof(int) * nsubplans);
-		newidx = 1;
-		i = -1;
-		while ((i = bms_next_member(result, i)) >= 0)
-		{
-			Assert(i < nsubplans);
-			new_subplan_indexes[i] = newidx++;
-		}
+	/*
+	 * Now we can update each PartitionedRelPruneInfo's subplan_map with
+	 * new subplan indexes.  We must also recompute its present_parts
+	 * bitmap.
+	 */
+	for (i = 0; i < prunestate->num_partprunedata; i++)
+	{
+		PartitionPruningData *prunedata = prunestate->partprunedata[i];
+		int			j;
 
 		/*
-		 * Now we can update each PartitionedRelPruneInfo's subplan_map with
-		 * new subplan indexes.  We must also recompute its present_parts
-		 * bitmap.
+		 * Within each hierarchy, we perform this loop in back-to-front
+		 * order so that we determine present_parts for the lowest-level
+		 * partitioned tables first.  This way we can tell whether a
+		 * sub-partitioned table's partitions were entirely pruned so we
+		 * can exclude it from the current level's present_parts.
 		 */
-		for (i = 0; i < prunestate->num_partprunedata; i++)
+		for (j = prunedata->num_partrelprunedata - 1; j >= 0; j--)
 		{
-			PartitionPruningData *prunedata = prunestate->partprunedata[i];
-			int			j;
+			PartitionedRelPruningData *pprune = &prunedata->partrelprunedata[j];
+			int			nparts = pprune->nparts;
+			int			k;
 
-			/*
-			 * Within each hierarchy, we perform this loop in back-to-front
-			 * order so that we determine present_parts for the lowest-level
-			 * partitioned tables first.  This way we can tell whether a
-			 * sub-partitioned table's partitions were entirely pruned so we
-			 * can exclude it from the current level's present_parts.
-			 */
-			for (j = prunedata->num_partrelprunedata - 1; j >= 0; j--)
-			{
-				PartitionedRelPruningData *pprune = &prunedata->partrelprunedata[j];
-				int			nparts = pprune->nparts;
-				int			k;
+			/* We just rebuild present_parts from scratch */
+			bms_free(pprune->present_parts);
+			pprune->present_parts = NULL;
 
-				/* We just rebuild present_parts from scratch */
-				bms_free(pprune->present_parts);
-				pprune->present_parts = NULL;
+			for (k = 0; k < nparts; k++)
+			{
+				int			oldidx = pprune->subplan_map[k];
+				int			subidx;
 
-				for (k = 0; k < nparts; k++)
+				/*
+				 * If this partition existed as a subplan then change the
+				 * old subplan index to the new subplan index.  The new
+				 * index may become -1 if the partition was pruned above,
+				 * or it may just come earlier in the subplan list due to
+				 * some subplans being removed earlier in the list.  If
+				 * it's a subpartition, add it to present_parts unless
+				 * it's entirely pruned.
+				 */
+				if (oldidx >= 0)
 				{
-					int			oldidx = pprune->subplan_map[k];
-					int			subidx;
-
-					/*
-					 * If this partition existed as a subplan then change the
-					 * old subplan index to the new subplan index.  The new
-					 * index may become -1 if the partition was pruned above,
-					 * or it may just come earlier in the subplan list due to
-					 * some subplans being removed earlier in the list.  If
-					 * it's a subpartition, add it to present_parts unless
-					 * it's entirely pruned.
-					 */
-					if (oldidx >= 0)
-					{
-						Assert(oldidx < nsubplans);
-						pprune->subplan_map[k] = new_subplan_indexes[oldidx] - 1;
+					Assert(oldidx < n_total_subplans);
+					pprune->subplan_map[k] = new_subplan_indexes[oldidx] - 1;
 
-						if (new_subplan_indexes[oldidx] > 0)
-							pprune->present_parts =
-								bms_add_member(pprune->present_parts, k);
-					}
-					else if ((subidx = pprune->subpart_map[k]) >= 0)
-					{
-						PartitionedRelPruningData *subprune;
+					if (new_subplan_indexes[oldidx] > 0)
+						pprune->present_parts =
+							bms_add_member(pprune->present_parts, k);
+				}
+				else if ((subidx = pprune->subpart_map[k]) >= 0)
+				{
+					PartitionedRelPruningData *subprune;
 
-						subprune = &prunedata->partrelprunedata[subidx];
+					subprune = &prunedata->partrelprunedata[subidx];
 
-						if (!bms_is_empty(subprune->present_parts))
-							pprune->present_parts =
-								bms_add_member(pprune->present_parts, k);
-					}
+					if (!bms_is_empty(subprune->present_parts))
+						pprune->present_parts =
+							bms_add_member(pprune->present_parts, k);
 				}
 			}
 		}
+	}
 
-		/*
-		 * We must also recompute the other_subplans set, since indexes in it
-		 * may change.
-		 */
-		new_other_subplans = NULL;
-		i = -1;
-		while ((i = bms_next_member(prunestate->other_subplans, i)) >= 0)
-			new_other_subplans = bms_add_member(new_other_subplans,
-												new_subplan_indexes[i] - 1);
-
-		bms_free(prunestate->other_subplans);
-		prunestate->other_subplans = new_other_subplans;
+	/*
+	 * We must also recompute the other_subplans set, since indexes in it
+	 * may change.
+	 */
+	new_other_subplans = NULL;
+	i = -1;
+	while ((i = bms_next_member(prunestate->other_subplans, i)) >= 0)
+		new_other_subplans = bms_add_member(new_other_subplans,
+											new_subplan_indexes[i] - 1);
 
-		pfree(new_subplan_indexes);
-	}
+	bms_free(prunestate->other_subplans);
+	prunestate->other_subplans = new_other_subplans;
 
-	return result;
+	pfree(new_subplan_indexes);
 }
 
 /*
@@ -2018,11 +2099,16 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate)
 		prunedata = prunestate->partprunedata[i];
 		pprune = &prunedata->partrelprunedata[0];
 
+		/*
+		 * We pass the 1st item belonging to the root table of the hierarchy
+		 * and find_matching_subplans_recurse() takes care of recursing to
+		 * other (lower-level) parents as needed.
+		 */
 		find_matching_subplans_recurse(prunedata, pprune, false, &result);
 
-		/* Expression eval may have used space in node's ps_ExprContext too */
+		/* Expression eval may have used space in ExprContext too */
 		if (pprune->exec_pruning_steps)
-			ResetExprContext(pprune->exec_context.planstate->ps_ExprContext);
+			ResetExprContext(pprune->exec_context.exprcontext);
 	}
 
 	/* Add in any subplans that partition pruning didn't account for */
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 7937f1c88f..5b6d3eb23b 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -138,30 +138,17 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 	{
 		PartitionPruneState *prunestate;
 
-		/* We may need an expression context to evaluate partition exprs */
-		ExecAssignExprContext(estate, &appendstate->ps);
-
-		/* Create the working data structure for pruning. */
-		prunestate = ExecCreatePartitionPruneState(&appendstate->ps,
-												   node->part_prune_info);
+		/*
+		 * Set up pruning data structure.  Initial pruning steps, if any, are
+		 * performed as part of the setup, adding the set of indexes of
+		 * surviving subplans to 'validsubplans'.
+		 */
+		prunestate = ExecInitPartitionPruning(&appendstate->ps,
+											  list_length(node->appendplans),
+											  node->part_prune_info,
+											  &validsubplans);
 		appendstate->as_prune_state = prunestate;
-
-		/* Perform an initial partition prune, if required. */
-		if (prunestate->do_initial_prune)
-		{
-			/* Determine which subplans survive initial pruning */
-			validsubplans = ExecFindInitialMatchingSubPlans(prunestate,
-															list_length(node->appendplans));
-
-			nplans = bms_num_members(validsubplans);
-		}
-		else
-		{
-			/* We'll need to initialize all subplans */
-			nplans = list_length(node->appendplans);
-			Assert(nplans > 0);
-			validsubplans = bms_add_range(NULL, 0, nplans - 1);
-		}
+		nplans = bms_num_members(validsubplans);
 
 		/*
 		 * When no run-time pruning is required and there's at least one
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index 418f89dea8..9a9f29e845 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -86,29 +86,17 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
 	{
 		PartitionPruneState *prunestate;
 
-		/* We may need an expression context to evaluate partition exprs */
-		ExecAssignExprContext(estate, &mergestate->ps);
-
-		prunestate = ExecCreatePartitionPruneState(&mergestate->ps,
-												   node->part_prune_info);
+		/*
+		 * Set up pruning data structure.  Initial pruning steps, if any, are
+		 * performed as part of the setup, adding the set of indexes of
+		 * surviving subplans to 'validsubplans'.
+		 */
+		prunestate = ExecInitPartitionPruning(&mergestate->ps,
+											  list_length(node->mergeplans),
+											  node->part_prune_info,
+											  &validsubplans);
 		mergestate->ms_prune_state = prunestate;
-
-		/* Perform an initial partition prune, if required. */
-		if (prunestate->do_initial_prune)
-		{
-			/* Determine which subplans survive initial pruning */
-			validsubplans = ExecFindInitialMatchingSubPlans(prunestate,
-															list_length(node->mergeplans));
-
-			nplans = bms_num_members(validsubplans);
-		}
-		else
-		{
-			/* We'll need to initialize all subplans */
-			nplans = list_length(node->mergeplans);
-			Assert(nplans > 0);
-			validsubplans = bms_add_range(NULL, 0, nplans - 1);
-		}
+		nplans = bms_num_members(validsubplans);
 
 		/*
 		 * When no run-time pruning is required and there's at least one
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 1bc00826c1..7080cb25d9 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -798,6 +798,7 @@ prune_append_rel_partitions(RelOptInfo *rel)
 
 	/* These are not valid when being called from the planner */
 	context.planstate = NULL;
+	context.exprcontext = NULL;
 	context.exprstates = NULL;
 
 	/* Actual pruning happens here. */
@@ -808,8 +809,8 @@ prune_append_rel_partitions(RelOptInfo *rel)
  * get_matching_partitions
  *		Determine partitions that survive partition pruning
  *
- * Note: context->planstate must be set to a valid PlanState when the
- * pruning_steps were generated with a target other than PARTTARGET_PLANNER.
+ * Note: context->exprcontext must be valid when the pruning_steps were
+ * generated with a target other than PARTTARGET_PLANNER.
  *
  * Returns a Bitmapset of the RelOptInfo->part_rels indexes of the surviving
  * partitions.
@@ -3654,7 +3655,7 @@ match_boolean_partition_clause(Oid partopfamily, Expr *clause, Expr *partkey,
  * exprstate array.
  *
  * Note that the evaluated result may be in the per-tuple memory context of
- * context->planstate->ps_ExprContext, and we may have leaked other memory
+ * context->exprcontext, and we may have leaked other memory
  * there too.  This memory must be recovered by resetting that ExprContext
  * after we're done with the pruning operation (see execPartition.c).
  */
@@ -3677,13 +3678,18 @@ partkey_datum_from_expr(PartitionPruneContext *context,
 		ExprContext *ectx;
 
 		/*
-		 * We should never see a non-Const in a step unless we're running in
-		 * the executor.
+		 * We should never see a non-Const in a step unless the caller has
+		 * passed a valid ExprContext.
+		 *
+		 * When context->planstate is valid, context->exprcontext is the
+		 * same as context->planstate->ps_ExprContext.
 		 */
-		Assert(context->planstate != NULL);
+		Assert(context->planstate != NULL || context->exprcontext != NULL);
+		Assert(context->planstate == NULL ||
+			   (context->exprcontext == context->planstate->ps_ExprContext));
 
 		exprstate = context->exprstates[stateidx];
-		ectx = context->planstate->ps_ExprContext;
+		ectx = context->exprcontext;
 		*value = ExecEvalExprSwitchContext(exprstate, ectx, isnull);
 	}
 }
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 603d8becc4..fd5735a946 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -119,10 +119,9 @@ extern ResultRelInfo *ExecFindPartition(ModifyTableState *mtstate,
 										EState *estate);
 extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
 									PartitionTupleRouting *proute);
-extern PartitionPruneState *ExecCreatePartitionPruneState(PlanState *planstate,
-														  PartitionPruneInfo *partitionpruneinfo);
+extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
+						 int n_total_subplans,
+						 PartitionPruneInfo *pruneinfo,
+						 Bitmapset **initially_valid_subplans);
 extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate);
-extern Bitmapset *ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate,
-												  int nsubplans);
-
 #endif							/* EXECPARTITION_H */
diff --git a/src/include/partitioning/partprune.h b/src/include/partitioning/partprune.h
index ee11b6feae..90684efa25 100644
--- a/src/include/partitioning/partprune.h
+++ b/src/include/partitioning/partprune.h
@@ -41,6 +41,7 @@ struct RelOptInfo;
  *					subsidiary data, such as the FmgrInfos.
  * planstate		Points to the parent plan node's PlanState when called
  *					during execution; NULL when called from the planner.
+ * exprcontext		ExprContext to use when evaluating pruning expressions
  * exprstates		Array of ExprStates, indexed as per PruneCxtStateIdx; one
  *					for each partition key in each pruning step.  Allocated if
  *					planstate is non-NULL, otherwise NULL.
@@ -56,6 +57,7 @@ typedef struct PartitionPruneContext
 	FmgrInfo   *stepcmpfuncs;
 	MemoryContext ppccontext;
 	PlanState  *planstate;
+	ExprContext *exprcontext;
 	ExprState **exprstates;
 } PartitionPruneContext;
 
-- 
2.24.1
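
The subplan index renumbering that PartitionPruneStateFixSubPlanMap() performs in the patch above can be illustrated outside the executor. The following is only a minimal sketch under stated assumptions: a plain bool array stands in for the Bitmapset of surviving subplans, and remap_subplan_indexes() is a hypothetical name that does not appear in the patch. It uses the same trick as the patch, a temporary 1-based array in which 0 means "pruned".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical stand-alone illustration, not code from the patch.
 *
 * survived[i] is true if old subplan i survived initial pruning.
 * subplan_map[k] holds the old subplan index of partition k, or -1 if the
 * partition never had a subplan.  On return, subplan_map[k] holds the new
 * index, or -1 if the subplan was pruned.
 *
 * Returns the number of surviving subplans, or -1 on allocation failure.
 */
static int
remap_subplan_indexes(const bool *survived, int n_total_subplans,
					  int *subplan_map, int nparts)
{
	int		   *new_indexes;
	int			newidx = 1;

	/* 1-based temporary map; zero-initialized entries mean "pruned" */
	new_indexes = calloc((size_t) n_total_subplans + 1, sizeof(int));
	if (new_indexes == NULL)
		return -1;

	for (int i = 0; i < n_total_subplans; i++)
	{
		if (survived[i])
			new_indexes[i] = newidx++;
	}

	for (int k = 0; k < nparts; k++)
	{
		int			oldidx = subplan_map[k];

		if (oldidx >= 0)
		{
			assert(oldidx < n_total_subplans);
			/* Subtracting 1 turns "pruned" (0) into -1 */
			subplan_map[k] = new_indexes[oldidx] - 1;
		}
	}

	free(new_indexes);
	return newidx - 1;
}
```

With four subplans of which only the second and fourth survive, a map of {0, 1, -1, 3, 2} becomes {-1, 0, -1, 1, -1}. The real function additionally rebuilds present_parts and other_subplans, which this sketch leaves out.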



  [application/octet-stream] v7-0004-Optimize-AcquireExecutorLocks-to-skip-pruned-part.patch (94.2K, 5-v7-0004-Optimize-AcquireExecutorLocks-to-skip-pruned-part.patch)
From 14d951ca644860eec6d72ac03e3a95b12373938b Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Wed, 22 Dec 2021 16:55:17 +0900
Subject: [PATCH v7 4/4] Optimize AcquireExecutorLocks() to skip pruned
 partitions

Instead of locking all relations listed in the range table, in the
cases where the PlannedStmt indicates that some nodes in the plan
tree can do partition pruning without depending on execution having
started (so-called "initial" pruning), AcquireExecutorLocks() now
calls the new executor function ExecutorGetLockRels(), which returns
the set of relations (their RT indexes) to be locked, not including
those scanned by the subplans that were pruned.

The result of pruning done this way must be remembered and reused
during actual execution of the plan.  This is done by creating a
PlanInitPruningOutput node for each plan node that undergoes
pruning; the set of those nodes for the whole plan tree is added to
an ExecLockRelsInfo, which also stores the bitmapset of RT indexes of
relations that are actually locked by AcquireExecutorLocks().
ExecLockRelsInfos are passed down the executor alongside the
PlannedStmts.  This arrangement ensures that the executor doesn't
accidentally try to process plan tree subnodes that have been
deemed pruned by AcquireExecutorLocks().
---
 src/backend/commands/copyto.c          |   2 +-
 src/backend/commands/createas.c        |   2 +-
 src/backend/commands/explain.c         |   7 +-
 src/backend/commands/extension.c       |  13 +-
 src/backend/commands/matview.c         |   2 +-
 src/backend/commands/portalcmds.c      |   1 +
 src/backend/commands/prepare.c         |  17 +-
 src/backend/executor/README            |  24 +++
 src/backend/executor/execMain.c        | 202 ++++++++++++++++++++
 src/backend/executor/execParallel.c    |  26 ++-
 src/backend/executor/execPartition.c   | 224 ++++++++++++++++++----
 src/backend/executor/execUtils.c       |   8 +
 src/backend/executor/functions.c       |   2 +-
 src/backend/executor/nodeAppend.c      |  52 ++++-
 src/backend/executor/nodeMergeAppend.c |  52 ++++-
 src/backend/executor/nodeModifyTable.c |  25 +++
 src/backend/executor/spi.c             |  14 +-
 src/backend/nodes/copyfuncs.c          |  49 ++++-
 src/backend/nodes/outfuncs.c           |  39 ++++
 src/backend/nodes/readfuncs.c          |  37 ++++
 src/backend/optimizer/plan/planner.c   |   2 +
 src/backend/optimizer/plan/setrefs.c   |   6 +
 src/backend/partitioning/partprune.c   |  37 +++-
 src/backend/tcop/postgres.c            |  15 +-
 src/backend/tcop/pquery.c              |  21 ++-
 src/backend/utils/cache/plancache.c    | 252 ++++++++++++++++++++++---
 src/backend/utils/mmgr/portalmem.c     |   2 +
 src/include/commands/explain.h         |   3 +-
 src/include/executor/execPartition.h   |   2 +
 src/include/executor/execdesc.h        |   2 +
 src/include/executor/executor.h        |   2 +
 src/include/executor/nodeAppend.h      |   1 +
 src/include/executor/nodeMergeAppend.h |   1 +
 src/include/executor/nodeModifyTable.h |   1 +
 src/include/nodes/execnodes.h          |  96 ++++++++++
 src/include/nodes/nodes.h              |   5 +
 src/include/nodes/pathnodes.h          |   4 +
 src/include/nodes/plannodes.h          |  15 ++
 src/include/tcop/tcopprot.h            |   2 +-
 src/include/utils/plancache.h          |   6 +
 src/include/utils/portal.h             |   5 +
 41 files changed, 1174 insertions(+), 104 deletions(-)
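
The idea in the commit message, locking only the relations scanned by subplans that survive initial pruning, can be sketched in isolation as follows. This is an illustration under assumptions rather than the patch's implementation: ChildSubplan and collect_lock_rels() are hypothetical names, each child is assumed to scan exactly one relation, and a bool array stands in for the bitmapset of surviving subplans.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical: one entry per child subplan of an Append-like node */
typedef struct ChildSubplan
{
	int			rt_index;		/* RT index of the relation it scans */
} ChildSubplan;

/*
 * Hypothetical helper, not part of the patch.  Copies into 'out' the RT
 * indexes of the relations scanned by surviving children only, and
 * returns how many were written.  'out' must have room for nchildren
 * entries.  Pruned children stay in the array; they are simply skipped.
 */
static size_t
collect_lock_rels(const ChildSubplan *children, const bool *survived,
				  size_t nchildren, int *out)
{
	size_t		nlock = 0;

	for (size_t i = 0; i < nchildren; i++)
	{
		if (survived[i])
			out[nlock++] = children[i].rt_index;
	}

	return nlock;
}
```

As in the patch, the pruned children are not removed from the input, so any later consumer has to consult the same survival information before touching a child.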

diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index 55c38b04c4..d403eb2309 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -542,7 +542,7 @@ BeginCopyTo(ParseState *pstate,
 		((DR_copy *) dest)->cstate = cstate;
 
 		/* Create a QueryDesc requesting no output */
-		cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+		cstate->queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
 											dest, NULL, NULL, 0);
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 9abbb6b555..f6607f2454 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -325,7 +325,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
 		UpdateActiveSnapshotCommandId();
 
 		/* Create a QueryDesc, redirecting output to our tuple receiver */
-		queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+		queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
 									GetActiveSnapshot(), InvalidSnapshot,
 									dest, params, queryEnv, 0);
 
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 9f632285b6..1f1a44b9bb 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -407,7 +407,7 @@ ExplainOneQuery(Query *query, int cursorOptions,
 		}
 
 		/* run it (if needed) and produce output */
-		ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+		ExplainOnePlan(plan, NULL, into, es, queryString, params, queryEnv,
 					   &planduration, (es->buffers ? &bufusage : NULL));
 	}
 }
@@ -515,7 +515,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
  * to call it.
  */
 void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, ExecLockRelsInfo *execlockrelsinfo,
+			   IntoClause *into, ExplainState *es,
 			   const char *queryString, ParamListInfo params,
 			   QueryEnvironment *queryEnv, const instr_time *planduration,
 			   const BufferUsage *bufusage)
@@ -563,7 +564,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
 		dest = None_Receiver;
 
 	/* Create a QueryDesc for the query */
-	queryDesc = CreateQueryDesc(plannedstmt, queryString,
+	queryDesc = CreateQueryDesc(plannedstmt, execlockrelsinfo, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, params, queryEnv, instrument_option);
 
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index 1013790dbb..008b8ce0e9 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -741,8 +741,10 @@ execute_sql_string(const char *sql)
 		RawStmt    *parsetree = lfirst_node(RawStmt, lc1);
 		MemoryContext per_parsetree_context,
 					oldcontext;
-		List	   *stmt_list;
-		ListCell   *lc2;
+		List	   *stmt_list,
+				   *execlockrelsinfo_list;
+		ListCell   *lc2,
+				   *lc3;
 
 		/*
 		 * We do the work for each parsetree in a short-lived context, to
@@ -762,11 +764,13 @@ execute_sql_string(const char *sql)
 										   NULL,
 										   0,
 										   NULL);
-		stmt_list = pg_plan_queries(stmt_list, sql, CURSOR_OPT_PARALLEL_OK, NULL);
+		stmt_list = pg_plan_queries(stmt_list, sql, CURSOR_OPT_PARALLEL_OK, NULL,
+									&execlockrelsinfo_list);
 
-		foreach(lc2, stmt_list)
+		forboth(lc2, stmt_list, lc3, execlockrelsinfo_list)
 		{
 			PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+			ExecLockRelsInfo *execlockrelsinfo = lfirst_node(ExecLockRelsInfo, lc3);
 
 			CommandCounterIncrement();
 
@@ -777,6 +781,7 @@ execute_sql_string(const char *sql)
 				QueryDesc  *qdesc;
 
 				qdesc = CreateQueryDesc(stmt,
+										execlockrelsinfo,
 										sql,
 										GetActiveSnapshot(), NULL,
 										dest, NULL, NULL, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 05e7b60059..4ef44aaf23 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -416,7 +416,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
 	UpdateActiveSnapshotCommandId();
 
 	/* Create a QueryDesc, redirecting output to our tuple receiver */
-	queryDesc = CreateQueryDesc(plan, queryString,
+	queryDesc = CreateQueryDesc(plan, NULL, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, NULL, NULL, 0);
 
diff --git a/src/backend/commands/portalcmds.c b/src/backend/commands/portalcmds.c
index 9902c5c566..85e73ddded 100644
--- a/src/backend/commands/portalcmds.c
+++ b/src/backend/commands/portalcmds.c
@@ -107,6 +107,7 @@ PerformCursorOpen(ParseState *pstate, DeclareCursorStmt *cstmt, ParamListInfo pa
 					  queryString,
 					  CMDTAG_SELECT,	/* cursor's query is always a SELECT */
 					  list_make1(plan),
+					  list_make1(NULL),	/* no ExecLockRelsInfo to pass */
 					  NULL);
 
 	/*----------
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 80738547ed..bbbf8bbcbd 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -155,6 +155,7 @@ ExecuteQuery(ParseState *pstate,
 	PreparedStatement *entry;
 	CachedPlan *cplan;
 	List	   *plan_list;
+	List	   *plan_execlockrelsinfo_list;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
 	Portal		portal;
@@ -195,6 +196,7 @@ ExecuteQuery(ParseState *pstate,
 	/* Replan if needed, and increment plan refcount for portal */
 	cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL);
 	plan_list = cplan->stmt_list;
+	plan_execlockrelsinfo_list = cplan->execlockrelsinfo_list;
 
 	/*
 	 * DO NOT add any logic that could possibly throw an error between
@@ -204,7 +206,7 @@ ExecuteQuery(ParseState *pstate,
 					  NULL,
 					  query_string,
 					  entry->plansource->commandTag,
-					  plan_list,
+					  plan_list, plan_execlockrelsinfo_list,
 					  cplan);
 
 	/*
@@ -576,7 +578,9 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	const char *query_string;
 	CachedPlan *cplan;
 	List	   *plan_list;
-	ListCell   *p;
+	List	   *plan_execlockrelsinfo_list;
+	ListCell   *p,
+			   *pe;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
 	instr_time	planstart;
@@ -632,15 +636,18 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	}
 
 	plan_list = cplan->stmt_list;
+	plan_execlockrelsinfo_list = cplan->execlockrelsinfo_list;
 
 	/* Explain each query */
-	foreach(p, plan_list)
+	forboth(p, plan_list, pe, plan_execlockrelsinfo_list)
 	{
 		PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
+		ExecLockRelsInfo *execlockrelsinfo = lfirst_node(ExecLockRelsInfo, pe);
 
 		if (pstmt->commandType != CMD_UTILITY)
-			ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv,
-						   &planduration, (es->buffers ? &bufusage : NULL));
+			ExplainOnePlan(pstmt, execlockrelsinfo, into, es, query_string,
+						   paramLI, queryEnv, &planduration,
+						   (es->buffers ? &bufusage : NULL));
 		else
 			ExplainOneUtility(pstmt->utilityStmt, into, es, query_string,
 							  paramLI, queryEnv);
diff --git a/src/backend/executor/README b/src/backend/executor/README
index bf5e70860d..9720d0ac2c 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -65,6 +65,27 @@ found there.  This currently only occurs for Append and MergeAppend nodes.  In
 this case the non-required subplans are ignored and the executor state's
 subnode array will become out of sequence to the plan's subplan list.
 
+Actually, the so-called execution time pruning may also occur even before the
+execution has started.  One case where that occurs is when a cached generic
+plan is being validated for execution by plancache.c: GetCachedPlan(), which
+proceeds by locking all the relations that will be scanned by that plan.  If
+the generic plan has nodes that contain so-called initial pruning steps (a
+subset of execution pruning steps that do not depend on full-fledged execution
+having started), they are performed at this point to figure out the minimal
+set of child subplans that satisfy those pruning instructions and the result
+of performing that pruning is saved in a data structure that gets passed to
+the executor alongside the plan tree.  Relations scanned by only those
+surviving subplans are then locked while those scanned by the pruned subplans
+are not, even though the pruned subplans themselves are not removed from the
+plan tree. So, it is imperative that the executor and any third party code
+invoked by it that gets passed the plan tree look at the initial pruning result
+made available via the aforementioned data structure to determine whether or
+not a particular subplan is valid.  (The data structure basically consists of
+an array of PlanInitPruningOutput nodes containing one element for each node
+of the plan tree indexable using plan_node_id of the individual plan nodes,
+where each node contains a bitmapset of indexes of unpruned child subplans of
+a given node.)
+
 Each Plan node may have expression trees associated with it, to represent
 its target list, qualification conditions, etc.  These trees are also
 read-only to the executor, but the executor state for expression evaluation
@@ -247,6 +268,9 @@ Query Processing Control Flow
 
 This is a sketch of control flow for full query processing:
 
+  [ ExecutorGetLockRels ] --- an optional step to walk over the plan tree
+		to produce an ExecLockRelsInfo to be passed to CreateQueryDesc
+
 	CreateQueryDesc
 
 	ExecutorStart
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 473d2e00a2..1ddd1dfb83 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -49,11 +49,15 @@
 #include "commands/matview.h"
 #include "commands/trigger.h"
 #include "executor/execdebug.h"
+#include "executor/nodeAppend.h"
+#include "executor/nodeMergeAppend.h"
+#include "executor/nodeModifyTable.h"
 #include "executor/nodeSubplan.h"
 #include "foreign/fdwapi.h"
 #include "jit/jit.h"
 #include "mb/pg_wchar.h"
 #include "miscadmin.h"
+#include "nodes/nodeFuncs.h"
 #include "parser/parsetree.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
@@ -101,9 +105,205 @@ static char *ExecBuildSlotValueDescription(Oid reloid,
 										   Bitmapset *modifiedCols,
 										   int maxfieldlen);
 static void EvalPlanQualStart(EPQState *epqstate, Plan *planTree);
+static bool ExecGetScanLockRels(Scan *scan, ExecGetLockRelsContext *context);
 
 /* end of local decls */
 
+/* ----------------------------------------------------------------
+ *		ExecutorGetLockRels
+ *
+ *		Figure out the minimal set of relations to lock to be able to safely
+ *		execute a given plan
+ *
+ * This ignores the relations scanned by child subplans that are pruned away
+ * after performing initial pruning steps present in the plan using the
+ * provided set of EXTERN parameters.
+ *
+ * Along with the set of RT indexes of relations that must be locked, the
+ * returned struct also contains an array of PlanInitPruningOutput nodes each
+ * of which contains the result of initial pruning for a given plan node, which
+ * is basically a bitmapset of the indexes of surviving child subplans.  Each
+ * plan node in the tree that undergoes pruning will have an element in the
+ * array.
+ *
+ * Note that while relations scanned by the subplans that are pruned will not
+ * be locked, the subplans themselves are left as-is in the plan tree, assuming
+ * anything that reads the plan tree during execution knows to ignore them by
+ * looking at the PlanInitPruningOutput's list of valid subplans.
+ *
+ * Partitioned tables mentioned in PartitionedRelPruneInfo nodes that drive
+ * the pruning will be locked before doing the pruning and also added to the
+ * returned set.
+ */
+ExecLockRelsInfo *
+ExecutorGetLockRels(PlannedStmt *plannedstmt, ParamListInfo params)
+{
+	int		numPlanNodes = plannedstmt->numPlanNodes;
+	ExecGetLockRelsContext context;
+	ExecLockRelsInfo *result;
+	ListCell *lc;
+
+	/* Only get here if there is any pruning to do. */
+	Assert(plannedstmt->containsInitialPruning);
+
+	context.stmt = plannedstmt;
+	context.params = params;
+
+	/*
+	 * Go walk all the plan tree(s) present in the PlannedStmt, filling
+	 * context.lockrels with only the relations from plan nodes that
+	 * survive initial pruning and also the tables mentioned in
+	 * partitioned_rels sets found in the plan.
+	 */
+	context.lockrels = NULL;
+	context.initPruningOutputs = NIL;
+	context.ipoIndexes = palloc0(sizeof(int) * numPlanNodes);
+
+	/* All the subplans. */
+	foreach(lc, plannedstmt->subplans)
+	{
+		Plan *subplan = lfirst(lc);
+
+		(void) ExecGetLockRels(subplan, &context);
+	}
+
+	/* And the main tree. */
+	(void) ExecGetLockRels(plannedstmt->planTree, &context);
+
+	/*
+	 * Also be sure to lock partitioned relations from any [Merge]Append nodes
+	 * that were originally present but were ultimately left out from the plan
+	 * due to being deemed no-op nodes.
+	 */
+	context.lockrels = bms_add_members(context.lockrels,
+									   plannedstmt->elidedAppendPartedRels);
+
+	result = makeNode(ExecLockRelsInfo);
+	result->lockrels = context.lockrels;
+	result->numPlanNodes = numPlanNodes;
+	result->initPruningOutputs = context.initPruningOutputs;
+	result->ipoIndexes = context.ipoIndexes;
+
+	return result;
+}
+
+/* ------------------------------------------------------------------------
+ * ExecGetLockRels
+ *		Adds all the relations that will be scanned by 'node' and its child
+ *		plans to context->lockrels after taking into account the effect
+ *		of performing initial pruning if any
+ *
+ * context->stmt gives the PlannedStmt being inspected to access the plan's
+ * range table if needed and context->params the set of EXTERN parameters
+ * available to evaluate pruning parameters.
+ *
+ * If initial pruning is done, a PlanInitPruningOutput node containing the
+ * result of pruning will be stored in context->initPruningOutputs that will
+ * be made available to the executor to reuse.
+ * ------------------------------------------------------------------------
+ */
+bool
+ExecGetLockRels(Plan *node, ExecGetLockRelsContext *context)
+{
+	/* Do nothing when we get past a leaf of the plan tree. */
+	if (node == NULL)
+		return true;
+
+	/* Make sure there's enough stack available. */
+	check_stack_depth();
+
+	switch (nodeTag(node))
+	{
+		/* Currently, only these two nodes have prunable child subplans. */
+		case T_Append:
+			if (ExecGetAppendLockRels((Append *) node, context))
+				return true;
+			break;
+		case T_MergeAppend:
+			if (ExecGetMergeAppendLockRels((MergeAppend *) node,
+												context))
+				return true;
+			break;
+
+		/*
+		 * These manipulate relations that must be added to context->lockrels.
+		 */
+		case T_SeqScan:
+		case T_SampleScan:
+		case T_IndexScan:
+		case T_IndexOnlyScan:
+		case T_BitmapIndexScan:
+		case T_BitmapHeapScan:
+		case T_TidScan:
+		case T_TidRangeScan:
+		case T_ForeignScan:
+		case T_SubqueryScan:
+		case T_CustomScan:
+			if (ExecGetScanLockRels((Scan *) node, context))
+				return true;
+			break;
+		case T_ModifyTable:
+			if (ExecGetModifyTableLockRels((ModifyTable *) node, context))
+				return true;
+			/* plan_tree_walker() will visit the subplan (outerNode) */
+			break;
+
+		default:
+			break;
+	}
+
+	/* Recurse to subnodes. */
+	return plan_tree_walker(node, ExecGetLockRels, (void *) context);
+}
+
+/*
+ * ExecGetScanLockRels
+ * 		Do ExecGetLockRels()'s work for a leaf Scan node
+ */
+static bool
+ExecGetScanLockRels(Scan *scan, ExecGetLockRelsContext *context)
+{
+	switch (nodeTag(scan))
+	{
+		case T_ForeignScan:
+			{
+				ForeignScan *fscan = (ForeignScan *) scan;
+
+				context->lockrels = bms_add_members(context->lockrels,
+													fscan->fs_relids);
+			}
+			break;
+
+		case T_SubqueryScan:
+			{
+				SubqueryScan *sscan = (SubqueryScan *) scan;
+
+				(void) ExecGetLockRels((Plan *) sscan->subplan, context);
+			}
+			break;
+
+		case T_CustomScan:
+			{
+				CustomScan *cscan = (CustomScan *) scan;
+				ListCell *lc;
+
+				context->lockrels = bms_add_members(context->lockrels,
+													cscan->custom_relids);
+				foreach(lc, cscan->custom_plans)
+				{
+					(void) ExecGetLockRels((Plan *) lfirst(lc), context);
+				}
+			}
+			break;
+
+		default:
+			context->lockrels = bms_add_member(context->lockrels,
+											   scan->scanrelid);
+			break;
+	}
+
+	return true;
+}
 
 /* ----------------------------------------------------------------
  *		ExecutorStart
@@ -805,6 +1005,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 {
 	CmdType		operation = queryDesc->operation;
 	PlannedStmt *plannedstmt = queryDesc->plannedstmt;
+	ExecLockRelsInfo *execlockrelsinfo = queryDesc->execlockrelsinfo;
 	Plan	   *plan = plannedstmt->planTree;
 	List	   *rangeTable = plannedstmt->rtable;
 	EState	   *estate = queryDesc->estate;
@@ -824,6 +1025,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 	ExecInitRangeTable(estate, rangeTable);
 
 	estate->es_plannedstmt = plannedstmt;
+	estate->es_execlockrelsinfo = execlockrelsinfo;
 
 	/*
 	 * Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index 9a0d5d59ef..fb6dbd298a 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -66,6 +66,7 @@
 #define PARALLEL_KEY_QUERY_TEXT		UINT64CONST(0xE000000000000008)
 #define PARALLEL_KEY_JIT_INSTRUMENTATION UINT64CONST(0xE000000000000009)
 #define PARALLEL_KEY_WAL_USAGE			UINT64CONST(0xE00000000000000A)
+#define PARALLEL_KEY_EXECLOCKRELSINFO	UINT64CONST(0xE00000000000000B)
 
 #define PARALLEL_TUPLE_QUEUE_SIZE		65536
 
@@ -182,6 +183,7 @@ ExecSerializePlan(Plan *plan, EState *estate)
 	pstmt->transientPlan = false;
 	pstmt->dependsOnRole = false;
 	pstmt->parallelModeNeeded = false;
+	pstmt->containsInitialPruning = false;
 	pstmt->planTree = plan;
 	pstmt->rtable = estate->es_range_table;
 	pstmt->resultRelations = NIL;
@@ -596,12 +598,15 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	FixedParallelExecutorState *fpes;
 	char	   *pstmt_data;
 	char	   *pstmt_space;
+	char	   *execlockrelsinfo_data;
+	char	   *execlockrelsinfo_space;
 	char	   *paramlistinfo_space;
 	BufferUsage *bufusage_space;
 	WalUsage   *walusage_space;
 	SharedExecutorInstrumentation *instrumentation = NULL;
 	SharedJitInstrumentation *jit_instrumentation = NULL;
 	int			pstmt_len;
+	int			execlockrelsinfo_len;
 	int			paramlistinfo_len;
 	int			instrumentation_len = 0;
 	int			jit_instrumentation_len = 0;
@@ -630,6 +635,7 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 
 	/* Fix up and serialize plan to be sent to workers. */
 	pstmt_data = ExecSerializePlan(planstate->plan, estate);
+	execlockrelsinfo_data = nodeToString(estate->es_execlockrelsinfo);
 
 	/* Create a parallel context. */
 	pcxt = CreateParallelContext("postgres", "ParallelQueryMain", nworkers);
@@ -656,6 +662,11 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	shm_toc_estimate_chunk(&pcxt->estimator, pstmt_len);
 	shm_toc_estimate_keys(&pcxt->estimator, 1);
 
+	/* Estimate space for serialized ExecLockRelsInfo. */
+	execlockrelsinfo_len = strlen(execlockrelsinfo_data) + 1;
+	shm_toc_estimate_chunk(&pcxt->estimator, execlockrelsinfo_len);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
 	/* Estimate space for serialized ParamListInfo. */
 	paramlistinfo_len = EstimateParamListSpace(estate->es_param_list_info);
 	shm_toc_estimate_chunk(&pcxt->estimator, paramlistinfo_len);
@@ -750,6 +761,12 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	memcpy(pstmt_space, pstmt_data, pstmt_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PLANNEDSTMT, pstmt_space);
 
+	/* Store serialized ExecLockRelsInfo */
+	execlockrelsinfo_space = shm_toc_allocate(pcxt->toc, execlockrelsinfo_len);
+	memcpy(execlockrelsinfo_space, execlockrelsinfo_data, execlockrelsinfo_len);
+	shm_toc_insert(pcxt->toc, PARALLEL_KEY_EXECLOCKRELSINFO,
+				   execlockrelsinfo_space);
+
 	/* Store serialized ParamListInfo. */
 	paramlistinfo_space = shm_toc_allocate(pcxt->toc, paramlistinfo_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARAMLISTINFO, paramlistinfo_space);
@@ -1231,8 +1248,10 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 						 int instrument_options)
 {
 	char	   *pstmtspace;
+	char	   *execlockrelsinfospace;
 	char	   *paramspace;
 	PlannedStmt *pstmt;
+	ExecLockRelsInfo *execlockrelsinfo;
 	ParamListInfo paramLI;
 	char	   *queryString;
 
@@ -1243,12 +1262,17 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 	pstmtspace = shm_toc_lookup(toc, PARALLEL_KEY_PLANNEDSTMT, false);
 	pstmt = (PlannedStmt *) stringToNode(pstmtspace);
 
+	/* Reconstruct leader-supplied ExecLockRelsInfo. */
+	execlockrelsinfospace = shm_toc_lookup(toc, PARALLEL_KEY_EXECLOCKRELSINFO,
+										  false);
+	execlockrelsinfo = (ExecLockRelsInfo *) stringToNode(execlockrelsinfospace);
+
 	/* Reconstruct ParamListInfo. */
 	paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
 	paramLI = RestoreParamList(&paramspace);
 
 	/* Create a QueryDesc for the query. */
-	return CreateQueryDesc(pstmt,
+	return CreateQueryDesc(pstmt, execlockrelsinfo,
 						   queryString,
 						   GetActiveSnapshot(), InvalidSnapshot,
 						   receiver, paramLI, NULL, instrument_options);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 7ff5a95f05..fddc97280e 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -24,6 +24,7 @@
 #include "mb/pg_wchar.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
+#include "parser/parsetree.h"
 #include "partitioning/partbounds.h"
 #include "partitioning/partdesc.h"
 #include "partitioning/partprune.h"
@@ -183,8 +184,13 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
 												  int maxfieldlen);
 static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
 static PartitionPruneState *ExecCreatePartitionPruneState(PlanState *planstate,
-							  PartitionPruneInfo *partitionpruneinfo);
-static Bitmapset *ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate);
+							  PartitionPruneInfo *partitionpruneinfo,
+							  bool consider_initial_steps,
+							  bool consider_exec_steps,
+							  List *rtable, ExprContext *econtext,
+							  PartitionDirectory partdir);
+static Bitmapset *ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate,
+								PartitionPruneInfo *pruneinfo);
 static void ExecInitPruningContext(PartitionPruneContext *context,
 								   List *pruning_steps,
 								   PartitionDesc partdesc,
@@ -1483,8 +1489,9 @@ adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri)
  * considered to be a stable expression, it can change value from one plan
  * node scan to the next during query execution.  Stable comparison
  * expressions that don't involve such Params allow partition pruning to be
- * done once during executor startup.  Expressions that do involve such Params
- * require us to prune separately for each scan of the parent plan node.
+ * done once during executor startup, or even earlier in ExecutorGetLockRels().
+ * Expressions that do involve such Params require us to prune separately for
+ * each scan of the parent plan node.
  *
  * Note that pruning away unneeded subplans during executor startup has the
  * added benefit of not having to initialize the unneeded subplans at all.
@@ -1496,10 +1503,17 @@ adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri)
  *		Creates the PartitionPruneState required by each of the two pruning
  *		functions.  Details stored include how to map the partition index
  *		returned by the partition pruning code into subplan indexes.  Also
- *		determines the set of initially valid subplans by performing initial
- *		pruning steps, only which need be initialized by the caller such as
- *		ExecInitAppend.  Maps in PartitionPruneState are updated to account
- *		for initial pruning having eliminated some of the subplans, if any.
+ *		determines the set of initially valid subplans, either by looking it
+ *		up in the plan node's PlanInitPruningOutput, if one is found in
+ *		EState.es_execlockrelsinfo, or by performing initial pruning steps.
+ *		Only the subplans included in that need be initialized by the caller
+ *		such as ExecInitAppend.  Maps in PartitionPruneState are updated to
+ *		account for initial pruning having eliminated some of the subplans,
+ *		if any.
+ *
+ * ExecGetLockRelsDoInitialPruning:
+ *		Do initial pruning as part of ExecGetLockRels() on the parent plan
+ *		node
  *
  * ExecFindMatchingSubPlans:
  *		Returns indexes of matching subplans after evaluating all available
@@ -1514,9 +1528,10 @@ adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri)
  * ExecInitPartitionPruning
  * 		Initialize data structure needed for run-time partition pruning
  *
- * Initial pruning can be done immediately, so it is done here if needed and
- * the set of surviving partition subplans' indexes are added to the output
- * parameter *initially_valid_subplans.
+ * Initial pruning can be done immediately, so it is done here unless it has
+ * already been done by ExecGetLockRelsDoInitialPruning(), and the set of
+ * surviving partition subplans' indexes are added to the output parameter
+ * *initially_valid_subplans.
  *
  * If subplans are indeed pruned, subplan_map arrays contained in the returned
  * PartitionPruneState are re-sequenced to not count those, though only if the
@@ -1530,22 +1545,57 @@ ExecInitPartitionPruning(PlanState *planstate,
 {
 	PartitionPruneState *prunestate;
 	EState *estate = planstate->state;
+	Plan   *plan = planstate->plan;
+	PlanInitPruningOutput *initPruningOutput = NULL;
+	bool	do_pruning = (pruneinfo->needs_init_pruning ||
+						  pruneinfo->needs_exec_pruning);
 
-	/* We may need an expression context to evaluate partition exprs */
-	ExecAssignExprContext(estate, planstate);
+	/* Retrieve the parent plan's PlanInitPruningOutput, if any. */
+	if (estate->es_execlockrelsinfo)
+	{
+		initPruningOutput = (PlanInitPruningOutput *)
+			ExecFetchPlanInitPruningOutput(estate->es_execlockrelsinfo, plan);
 
-	/*
-	 * Create the working data structure for pruning.
-	 */
-	prunestate = ExecCreatePartitionPruneState(planstate, pruneinfo);
+		Assert(initPruningOutput != NULL &&
+			   IsA(initPruningOutput, PlanInitPruningOutput));
+		/* No need to do initial pruning again, only exec pruning. */
+		do_pruning = pruneinfo->needs_exec_pruning;
+	}
+
+	prunestate = NULL;
+	if (do_pruning)
+	{
+		/* We may need an expression context to evaluate partition exprs */
+		ExecAssignExprContext(estate, planstate);
+
+		/* For data reading, executor always omits detached partitions */
+		if (estate->es_partition_directory == NULL)
+			estate->es_partition_directory =
+				CreatePartitionDirectory(estate->es_query_cxt, false);
+
+		/*
+		 * Create the working data structure for pruning.  No need to consider
+		 * initial pruning steps if we have a PlanInitPruningOutput.
+		 */
+		prunestate = ExecCreatePartitionPruneState(planstate, pruneinfo,
+												   initPruningOutput == NULL, true,
+												   NIL, planstate->ps_ExprContext,
+												   estate->es_partition_directory);
+	}
 
 	/*
 	 * Perform an initial partition prune, if required.
 	 */
-	if (prunestate->do_initial_prune)
+	if (initPruningOutput)
+	{
+		/* ExecGetLockRelsDoInitialPruning() already did it for us! */
+		*initially_valid_subplans = initPruningOutput->initially_valid_subplans;
+	}
+	else if (prunestate && prunestate->do_initial_prune)
 	{
 		/* Determine which subplans survive initial pruning */
-		*initially_valid_subplans = ExecFindInitialMatchingSubPlans(prunestate);
+		*initially_valid_subplans = ExecFindInitialMatchingSubPlans(prunestate,
+																	pruneinfo);
 	}
 	else
 	{
@@ -1563,7 +1613,7 @@ ExecInitPartitionPruning(PlanState *planstate,
 	 * invalid data in prunestate, because that data won't be consulted again
 	 * (cf initial Assert in ExecFindMatchingSubPlans).
 	 */
-	if (prunestate->do_exec_prune &&
+	if (prunestate && prunestate->do_exec_prune &&
 		bms_num_members(*initially_valid_subplans) < n_total_subplans)
 		PartitionPruneStateFixSubPlanMap(prunestate,
 										 *initially_valid_subplans,
@@ -1572,12 +1622,75 @@ ExecInitPartitionPruning(PlanState *planstate,
 	return prunestate;
 }
 
+/*
+ * ExecGetLockRelsDoInitialPruning
+ *		Perform initial pruning as part of doing ExecGetLockRels() on the parent
+ *		plan node
+ */
+Bitmapset *
+ExecGetLockRelsDoInitialPruning(Plan *plan, ExecGetLockRelsContext *context,
+								PartitionPruneInfo *pruneinfo)
+{
+	List		 *rtable = context->stmt->rtable;
+	ParamListInfo params = context->params;
+	ExprContext	 *econtext;
+	PartitionDirectory pdir;
+	MemoryContext oldcontext,
+				  tmpcontext;
+	PartitionPruneState *prunestate;
+	PlanInitPruningOutput *initPruningOutput;
+
+	/*
+	 * A temporary context to allocate stuff needed to run the pruning steps.
+	 */
+	tmpcontext = AllocSetContextCreate(CurrentMemoryContext,
+									   "initial pruning working data",
+									   ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(tmpcontext);
+
+	/*
+	 * PartitionDirectory to look up partition descriptors, which omits
+	 * detached partitions, just like in the executor proper.
+	 */
+	pdir = CreatePartitionDirectory(CurrentMemoryContext, false);
+
+	/*
+	 * We don't yet have a PlanState for the parent plan node, so must create
+	 * a standalone ExprContext to evaluate pruning expressions, equipped with
+	 * the information about the EXTERN parameters that the caller passed us.
+	 * Note that that's okay because the initial pruning steps do not contain
+	 * anything that requires the execution to have started.
+	 */
+	econtext = CreateStandaloneExprContext();
+	econtext->ecxt_param_list_info = params;
+	prunestate = ExecCreatePartitionPruneState(NULL, pruneinfo,
+											   true, false,
+											   rtable, econtext,
+											   pdir);
+	MemoryContextSwitchTo(oldcontext);
+
+	/* Do the pruning and populate a PlanInitPruningOutput for this node. */
+	initPruningOutput = makeNode(PlanInitPruningOutput);
+	initPruningOutput->initially_valid_subplans =
+		ExecFindInitialMatchingSubPlans(prunestate, pruneinfo);
+	ExecStorePlanInitPruningOutput(context, initPruningOutput, plan);
+
+	FreeExprContext(econtext, true);
+	DestroyPartitionDirectory(pdir);
+	MemoryContextDelete(tmpcontext);
+
+	return initPruningOutput->initially_valid_subplans;
+}
+
 /*
  * ExecCreatePartitionPruneState
  *		Build the data structure required for calling
  *		ExecFindInitialMatchingSubPlans and ExecFindMatchingSubPlans.
  *
- * 'planstate' is the parent plan node's execution state.
+ * 'planstate', if not NULL, is the parent plan node's execution state.  It
+ * can be NULL if being called before ExecutorStart(), in which case,
+ * 'rtable' (range table), 'econtext', and 'partdir' must be explicitly
+ * provided.
  *
  * 'partitionpruneinfo' is a PartitionPruneInfo as generated by
  * make_partition_pruneinfo.  Here we build a PartitionPruneState containing a
@@ -1592,19 +1705,20 @@ ExecInitPartitionPruning(PlanState *planstate,
  */
 static PartitionPruneState *
 ExecCreatePartitionPruneState(PlanState *planstate,
-							  PartitionPruneInfo *partitionpruneinfo)
+							  PartitionPruneInfo *partitionpruneinfo,
+							  bool consider_initial_steps,
+							  bool consider_exec_steps,
+							  List *rtable, ExprContext *econtext,
+							  PartitionDirectory partdir)
 {
-	EState	   *estate = planstate->state;
+	EState	   *estate = planstate ? planstate->state : NULL;
 	PartitionPruneState *prunestate;
 	int			n_part_hierarchies;
 	ListCell   *lc;
 	int			i;
-	ExprContext	*econtext = planstate->ps_ExprContext;
 
-	/* For data reading, executor always omits detached partitions */
-	if (estate->es_partition_directory == NULL)
-		estate->es_partition_directory =
-			CreatePartitionDirectory(estate->es_query_cxt, false);
+	Assert((estate != NULL) ||
+			(partdir != NULL && econtext != NULL && rtable != NIL));
 
 	n_part_hierarchies = list_length(partitionpruneinfo->prune_infos);
 	Assert(n_part_hierarchies > 0);
@@ -1655,19 +1769,48 @@ ExecCreatePartitionPruneState(PlanState *planstate,
 			PartitionedRelPruneInfo *pinfo = lfirst_node(PartitionedRelPruneInfo, lc2);
 			PartitionedRelPruningData *pprune = &prunedata->partrelprunedata[j];
 			Relation	partrel;
+			bool		close_partrel = false;
 			PartitionDesc partdesc;
 			PartitionKey partkey;
 
 			/*
-			 * We can rely on the copies of the partitioned table's partition
-			 * key and partition descriptor appearing in its relcache entry,
-			 * because that entry will be held open and locked for the
-			 * duration of this executor run.
+			 * Must open the relation by ourselves when called before the
+			 * execution has started, such as when called during
+			 * ExecutorGetLockRels() on a cached plan.  In that case,
+			 * sub-partitions must be locked, because AcquirePlannerLocks()
+			 * would not have seen them. (1st relation in a partrelpruneinfos
+			 * list is always the root partitioned table appearing in the
+			 * query, which AcquirePlannerLocks() would have locked; the
+			 * Assert in relation_open() guards that assumption.)
+			 */
+			if (estate == NULL)
+			{
+				RangeTblEntry *rte = rt_fetch(pinfo->rtindex, rtable);
+				int		lockmode = (j == 0) ? NoLock : rte->rellockmode;
+
+				partrel = table_open(rte->relid, lockmode);
+				close_partrel = true;
+			}
+			else
+				partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
+
+			/*
+			 * We can rely on the copy of the partitioned table's partition
+			 * key in its relcache entry, because it can't change (or
+			 * get destroyed) as long as the relation is locked.  Partition
+			 * descriptor is taken from the PartitionDirectory associated with
+			 * the table that is held open long enough for the descriptor to
+			 * remain valid while it's used to perform the pruning steps.
 			 */
-			partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
 			partkey = RelationGetPartitionKey(partrel);
-			partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
-												partrel);
+			partdesc = PartitionDirectoryLookup(partdir, partrel);
+
+			/*
+			 * Must close partrel, keeping the lock taken, if we're not using
+			 * EState's entry.
+			 */
+			if (close_partrel)
+				table_close(partrel, NoLock);
 
 			/*
 			 * Initialize the subplan_map and subpart_map.
@@ -1769,7 +1912,7 @@ ExecCreatePartitionPruneState(PlanState *planstate,
 			 * Initialize pruning contexts as needed.
 			 */
 			pprune->initial_pruning_steps = pinfo->initial_pruning_steps;
-			if (pinfo->initial_pruning_steps)
+			if (consider_initial_steps && pinfo->initial_pruning_steps)
 			{
 				ExecInitPruningContext(&pprune->initial_context,
 									   pinfo->initial_pruning_steps,
@@ -1779,7 +1922,7 @@ ExecCreatePartitionPruneState(PlanState *planstate,
 				prunestate->do_initial_prune = true;
 			}
 			pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
-			if (pinfo->exec_pruning_steps)
+			if (consider_exec_steps && pinfo->exec_pruning_steps)
 			{
 				ExecInitPruningContext(&pprune->exec_context,
 									   pinfo->exec_pruning_steps,
@@ -1893,7 +2036,8 @@ ExecInitPruningContext(PartitionPruneContext *context,
  * is required.
  */
 static Bitmapset *
-ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate)
+ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate,
+								PartitionPruneInfo *pruneinfo)
 {
 	Bitmapset  *result = NULL;
 	MemoryContext oldcontext;
@@ -1903,8 +2047,8 @@ ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate)
 	Assert(prunestate->do_initial_prune);
 
 	/*
-	 * Switch to a temp context to avoid leaking memory in the executor's
-	 * query-lifespan memory context.
+	 * Switch to a temp context to avoid leaking memory in the longer-term
+	 * memory context.
 	 */
 	oldcontext = MemoryContextSwitchTo(prunestate->prune_context);
 
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 9df1f81ea8..7246f9175f 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -119,6 +119,7 @@ CreateExecutorState(void)
 	estate->es_relations = NULL;
 	estate->es_rowmarks = NULL;
 	estate->es_plannedstmt = NULL;
+	estate->es_execlockrelsinfo = NULL;
 
 	estate->es_junkFilter = NULL;
 
@@ -785,6 +786,13 @@ ExecGetRangeTableRelation(EState *estate, Index rti)
 
 	Assert(rti > 0 && rti <= estate->es_range_table_size);
 
+	/*
+	 * Cross-check that AcquireExecutorLocks() hasn't missed any relation that
+	 * it should have locked.
+	 */
+	Assert(estate->es_execlockrelsinfo == NULL ||
+		   bms_is_member(rti, estate->es_execlockrelsinfo->lockrels));
+
 	rel = estate->es_relations[rti - 1];
 	if (rel == NULL)
 	{
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index f9460ae506..a2182a6b1f 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -844,7 +844,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
 	else
 		dest = None_Receiver;
 
-	es->qd = CreateQueryDesc(es->stmt,
+	es->qd = CreateQueryDesc(es->stmt, NULL,
 							 fcache->src,
 							 GetActiveSnapshot(),
 							 InvalidSnapshot,
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 5b6d3eb23b..9c6f907687 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -94,6 +94,55 @@ static bool ExecAppendAsyncRequest(AppendState *node, TupleTableSlot **result);
 static void ExecAppendAsyncEventWait(AppendState *node);
 static void classify_matching_subplans(AppendState *node);
 
+/* ----------------------------------------------------------------
+ *		ExecGetAppendLockRels
+ *			Do ExecGetLockRels()'s work for an Append plan
+ * ----------------------------------------------------------------
+ */
+bool
+ExecGetAppendLockRels(Append *node, ExecGetLockRelsContext *context)
+{
+	PartitionPruneInfo *pruneinfo = node->part_prune_info;
+
+	/*
+	 * Must always lock all the partitioned tables whose direct and indirect
+	 * partitions will be scanned by this Append.
+	 */
+	context->lockrels = bms_add_members(context->lockrels,
+										node->partitioned_rels);
+
+	/*
+	 * Now recurse to subplans to add relations scanned therein.
+	 *
+	 * If initial pruning can be done, do that now and only recurse to the
+	 * surviving subplans.
+	 */
+	if (pruneinfo && pruneinfo->needs_init_pruning)
+	{
+		List	   *subplans = node->appendplans;
+		Bitmapset  *validsubplans;
+		int			i;
+
+		validsubplans = ExecGetLockRelsDoInitialPruning((Plan *) node,
+														context, pruneinfo);
+
+		/* Recurse to surviving subplans. */
+		i = -1;
+		while ((i = bms_next_member(validsubplans, i)) >= 0)
+		{
+			Plan   *subplan = list_nth(subplans, i);
+
+			(void) ExecGetLockRels(subplan, context);
+		}
+
+		/* done with this node */
+		return true;
+	}
+
+	/* Tell the caller to recurse to *all* the subplans. */
+	return false;
+}
+
 /* ----------------------------------------------------------------
  *		ExecInitAppend
  *
@@ -155,7 +204,8 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 		 * subplan, we can fill as_valid_subplans immediately, preventing
 		 * later calls to ExecFindMatchingSubPlans.
 		 */
-		if (!prunestate->do_exec_prune && nplans > 0)
+		if (appendstate->as_prune_state == NULL ||
+			(!appendstate->as_prune_state->do_exec_prune && nplans > 0))
 			appendstate->as_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
 	}
 	else
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index 9a9f29e845..4b04fcdbc2 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -54,6 +54,55 @@ typedef int32 SlotNumber;
 static TupleTableSlot *ExecMergeAppend(PlanState *pstate);
 static int	heap_compare_slots(Datum a, Datum b, void *arg);
 
+/* ----------------------------------------------------------------
+ *		ExecGetMergeAppendLockRels
+ *			Do ExecGetLockRels()'s work for a MergeAppend plan
+ * ----------------------------------------------------------------
+ */
+bool
+ExecGetMergeAppendLockRels(MergeAppend *node, ExecGetLockRelsContext *context)
+{
+	PartitionPruneInfo *pruneinfo = node->part_prune_info;
+
+	/*
+	 * Must always lock all the partitioned tables whose direct and indirect
+	 * partitions will be scanned by this MergeAppend.
+	 */
+	context->lockrels = bms_add_members(context->lockrels,
+										node->partitioned_rels);
+
+	/*
+	 * Now recurse to subplans to add relations scanned therein.
+	 *
+	 * If initial pruning can be done, do that now and only recurse to the
+	 * surviving subplans.
+	 */
+	if (pruneinfo && pruneinfo->needs_init_pruning)
+	{
+		List	   *subplans = node->mergeplans;
+		Bitmapset  *validsubplans;
+		int			i;
+
+		validsubplans = ExecGetLockRelsDoInitialPruning((Plan *) node,
+														context, pruneinfo);
+
+		/* Recurse to surviving subplans. */
+		i = -1;
+		while ((i = bms_next_member(validsubplans, i)) >= 0)
+		{
+			Plan   *subplan = list_nth(subplans, i);
+
+			(void) ExecGetLockRels(subplan, context);
+		}
+
+		/* done with this node */
+		return true;
+	}
+
+	/* Tell the caller to recurse to *all* the subplans. */
+	return false;
+}
+
 
 /* ----------------------------------------------------------------
  *		ExecInitMergeAppend
@@ -103,7 +152,8 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
 		 * subplan, we can fill as_valid_subplans immediately, preventing
 		 * later calls to ExecFindMatchingSubPlans.
 		 */
-		if (!prunestate->do_exec_prune && nplans > 0)
+		if (mergestate->ms_prune_state == NULL ||
+			(!mergestate->ms_prune_state->do_exec_prune && nplans > 0))
 			mergestate->ms_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
 	}
 	else
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 701fe05296..23df3efef0 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -3008,6 +3008,31 @@ ExecLookupResultRelByOid(ModifyTableState *node, Oid resultoid,
 	return NULL;
 }
 
+/*
+ * ExecGetModifyTableLockRels
+ * 		Do ExecGetLockRels()'s work for a ModifyTable plan
+ */
+bool
+ExecGetModifyTableLockRels(ModifyTable *plan, ExecGetLockRelsContext *context)
+{
+	ListCell *lc;
+
+	/* First add the result relation RTIs mentioned in the node. */
+	if (plan->rootRelation > 0)
+		context->lockrels = bms_add_member(context->lockrels,
+										   plan->rootRelation);
+	context->lockrels = bms_add_member(context->lockrels,
+									   plan->nominalRelation);
+	foreach(lc, plan->resultRelations)
+	{
+		context->lockrels = bms_add_member(context->lockrels,
+										   lfirst_int(lc));
+	}
+
+	/* Tell the caller to recurse to the subplan (outerPlan(plan)). */
+	return false;
+}
+
 /* ----------------------------------------------------------------
  *		ExecInitModifyTable
  * ----------------------------------------------------------------
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index a82e986667..2107009591 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1578,6 +1578,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 	CachedPlanSource *plansource;
 	CachedPlan *cplan;
 	List	   *stmt_list;
+	List	   *execlockrelsinfo_list;
 	char	   *query_string;
 	Snapshot	snapshot;
 	MemoryContext oldcontext;
@@ -1659,6 +1660,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 	/* Replan if needed, and increment plan refcount for portal */
 	cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv);
 	stmt_list = cplan->stmt_list;
+	execlockrelsinfo_list = cplan->execlockrelsinfo_list;
 
 	if (!plan->saved)
 	{
@@ -1670,6 +1672,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 		 */
 		oldcontext = MemoryContextSwitchTo(portal->portalContext);
 		stmt_list = copyObject(stmt_list);
+		execlockrelsinfo_list = copyObject(execlockrelsinfo_list);
 		MemoryContextSwitchTo(oldcontext);
 		ReleaseCachedPlan(cplan, NULL);
 		cplan = NULL;			/* portal shouldn't depend on cplan */
@@ -1683,6 +1686,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 					  query_string,
 					  plansource->commandTag,
 					  stmt_list,
+					  execlockrelsinfo_list,
 					  cplan);
 
 	/*
@@ -2473,7 +2477,9 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 	{
 		CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
 		List	   *stmt_list;
-		ListCell   *lc2;
+		List	   *execlockrelsinfo_list;
+		ListCell   *lc2,
+				   *lc3;
 
 		spicallbackarg.query = plansource->query_string;
 
@@ -2552,6 +2558,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 							  plan_owner, _SPI_current->queryEnv);
 
 		stmt_list = cplan->stmt_list;
+		execlockrelsinfo_list = cplan->execlockrelsinfo_list;
 
 		/*
 		 * If we weren't given a specific snapshot to use, and the statement
@@ -2589,9 +2596,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 			}
 		}
 
-		foreach(lc2, stmt_list)
+		forboth(lc2, stmt_list, lc3, execlockrelsinfo_list)
 		{
 			PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+			ExecLockRelsInfo *execlockrelsinfo = lfirst_node(ExecLockRelsInfo, lc3);
 			bool		canSetTag = stmt->canSetTag;
 			DestReceiver *dest;
 
@@ -2663,7 +2671,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 				else
 					snap = InvalidSnapshot;
 
-				qdesc = CreateQueryDesc(stmt,
+				qdesc = CreateQueryDesc(stmt, execlockrelsinfo,
 										plansource->query_string,
 										snap, crosscheck_snapshot,
 										dest,
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index d4b5cc7e59..631727d310 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -68,6 +68,13 @@
 		} \
 	} while (0)
 
+/* Copy a field that is an array with numElem ints */
+#define COPY_INT_ARRAY(fldname, numElem) \
+	do { \
+		newnode->fldname = (numElem) > 0 ? palloc((numElem) * sizeof(int)) : NULL; \
+		if ((numElem) > 0) memcpy(newnode->fldname, from->fldname, sizeof(int) * (numElem)); \
+	} while (0)
+
 /* Copy a parse location field (for Copy, this is same as scalar case) */
 #define COPY_LOCATION_FIELD(fldname) \
 	(newnode->fldname = from->fldname)
@@ -94,8 +101,10 @@ _copyPlannedStmt(const PlannedStmt *from)
 	COPY_SCALAR_FIELD(transientPlan);
 	COPY_SCALAR_FIELD(dependsOnRole);
 	COPY_SCALAR_FIELD(parallelModeNeeded);
+	COPY_SCALAR_FIELD(containsInitialPruning);
 	COPY_SCALAR_FIELD(jitFlags);
 	COPY_NODE_FIELD(planTree);
+	COPY_SCALAR_FIELD(numPlanNodes);
 	COPY_NODE_FIELD(rtable);
 	COPY_NODE_FIELD(resultRelations);
 	COPY_NODE_FIELD(appendRelations);
@@ -1281,6 +1290,8 @@ _copyPartitionPruneInfo(const PartitionPruneInfo *from)
 	PartitionPruneInfo *newnode = makeNode(PartitionPruneInfo);
 
 	COPY_NODE_FIELD(prune_infos);
+	COPY_SCALAR_FIELD(needs_init_pruning);
+	COPY_SCALAR_FIELD(needs_exec_pruning);
 	COPY_BITMAPSET_FIELD(other_subplans);
 
 	return newnode;
@@ -5137,6 +5148,33 @@ _copyExtensibleNode(const ExtensibleNode *from)
 	return newnode;
 }
 
+/* ****************************************************************
+ *					execnodes.h copy functions
+ * ****************************************************************
+ */
+static ExecLockRelsInfo *
+_copyExecLockRelsInfo(const ExecLockRelsInfo *from)
+{
+	ExecLockRelsInfo *newnode = makeNode(ExecLockRelsInfo);
+
+	COPY_BITMAPSET_FIELD(lockrels);
+	COPY_SCALAR_FIELD(numPlanNodes);
+	COPY_NODE_FIELD(initPruningOutputs);
+	COPY_INT_ARRAY(ipoIndexes, from->numPlanNodes);
+
+	return newnode;
+}
+
+static PlanInitPruningOutput *
+_copyPlanInitPruningOutput(const PlanInitPruningOutput *from)
+{
+	PlanInitPruningOutput *newnode = makeNode(PlanInitPruningOutput);
+
+	COPY_BITMAPSET_FIELD(initially_valid_subplans);
+
+	return newnode;
+}
+
 /* ****************************************************************
  *					value.h copy functions
  * ****************************************************************
@@ -5191,7 +5229,6 @@ _copyBitString(const BitString *from)
 	return newnode;
 }
 
-
 static ForeignKeyCacheInfo *
 _copyForeignKeyCacheInfo(const ForeignKeyCacheInfo *from)
 {
@@ -6176,6 +6213,16 @@ copyObjectImpl(const void *from)
 			retval = _copyPublicationTable(from);
 			break;
 
+			/*
+			 * EXECUTION NODES
+			 */
+		case T_ExecLockRelsInfo:
+			retval = _copyExecLockRelsInfo(from);
+			break;
+		case T_PlanInitPruningOutput:
+			retval = _copyPlanInitPruningOutput(from);
+			break;
+
 			/*
 			 * MISCELLANEOUS NODES
 			 */
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 99056272f3..f361d2e2bc 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -312,8 +312,10 @@ _outPlannedStmt(StringInfo str, const PlannedStmt *node)
 	WRITE_BOOL_FIELD(transientPlan);
 	WRITE_BOOL_FIELD(dependsOnRole);
 	WRITE_BOOL_FIELD(parallelModeNeeded);
+	WRITE_BOOL_FIELD(containsInitialPruning);
 	WRITE_INT_FIELD(jitFlags);
 	WRITE_NODE_FIELD(planTree);
+	WRITE_INT_FIELD(numPlanNodes);
 	WRITE_NODE_FIELD(rtable);
 	WRITE_NODE_FIELD(resultRelations);
 	WRITE_NODE_FIELD(appendRelations);
@@ -1007,6 +1009,8 @@ _outPartitionPruneInfo(StringInfo str, const PartitionPruneInfo *node)
 	WRITE_NODE_TYPE("PARTITIONPRUNEINFO");
 
 	WRITE_NODE_FIELD(prune_infos);
+	WRITE_BOOL_FIELD(needs_init_pruning);
+	WRITE_BOOL_FIELD(needs_exec_pruning);
 	WRITE_BITMAPSET_FIELD(other_subplans);
 }
 
@@ -2747,6 +2751,31 @@ _outExtensibleNode(StringInfo str, const ExtensibleNode *node)
 	methods->nodeOut(str, node);
 }
 
+/*****************************************************************************
+ *
+ *	Stuff from execnodes.h
+ *
+ *****************************************************************************/
+
+static void
+_outExecLockRelsInfo(StringInfo str, const ExecLockRelsInfo *node)
+{
+	WRITE_NODE_TYPE("EXECLOCKRELSINFO");
+
+	WRITE_BITMAPSET_FIELD(lockrels);
+	WRITE_INT_FIELD(numPlanNodes);
+	WRITE_NODE_FIELD(initPruningOutputs);
+	WRITE_INT_ARRAY(ipoIndexes, node->numPlanNodes);
+}
+
+static void
+_outPlanInitPruningOutput(StringInfo str, const PlanInitPruningOutput *node)
+{
+	WRITE_NODE_TYPE("PLANINITPRUNINGOUTPUT");
+
+	WRITE_BITMAPSET_FIELD(initially_valid_subplans);
+}
+
 /*****************************************************************************
  *
  *	Stuff from parsenodes.h.
@@ -4600,6 +4629,16 @@ outNode(StringInfo str, const void *obj)
 				_outJsonConstructorExpr(str, obj);
 				break;
 
+				/*
+				 * EXECUTION NODES
+				 */
+			case T_ExecLockRelsInfo:
+				_outExecLockRelsInfo(str, obj);
+				break;
+			case T_PlanInitPruningOutput:
+				_outPlanInitPruningOutput(str, obj);
+				break;
+
 			default:
 
 				/*
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 7536f216bd..41fc710999 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -1650,8 +1650,10 @@ _readPlannedStmt(void)
 	READ_BOOL_FIELD(transientPlan);
 	READ_BOOL_FIELD(dependsOnRole);
 	READ_BOOL_FIELD(parallelModeNeeded);
+	READ_BOOL_FIELD(containsInitialPruning);
 	READ_INT_FIELD(jitFlags);
 	READ_NODE_FIELD(planTree);
+	READ_INT_FIELD(numPlanNodes);
 	READ_NODE_FIELD(rtable);
 	READ_NODE_FIELD(resultRelations);
 	READ_NODE_FIELD(appendRelations);
@@ -2602,6 +2604,8 @@ _readPartitionPruneInfo(void)
 	READ_LOCALS(PartitionPruneInfo);
 
 	READ_NODE_FIELD(prune_infos);
+	READ_BOOL_FIELD(needs_init_pruning);
+	READ_BOOL_FIELD(needs_exec_pruning);
 	READ_BITMAPSET_FIELD(other_subplans);
 
 	READ_DONE();
@@ -2771,6 +2775,35 @@ _readPartitionRangeDatum(void)
 	READ_DONE();
 }
 
+/*
+ * _readExecLockRelsInfo
+ */
+static ExecLockRelsInfo *
+_readExecLockRelsInfo(void)
+{
+	READ_LOCALS(ExecLockRelsInfo);
+
+	READ_BITMAPSET_FIELD(lockrels);
+	READ_INT_FIELD(numPlanNodes);
+	READ_NODE_FIELD(initPruningOutputs);
+	READ_INT_ARRAY(ipoIndexes, local_node->numPlanNodes);
+
+	READ_DONE();
+}
+
+/*
+ * _readPlanInitPruningOutput
+ */
+static PlanInitPruningOutput *
+_readPlanInitPruningOutput(void)
+{
+	READ_LOCALS(PlanInitPruningOutput);
+
+	READ_BITMAPSET_FIELD(initially_valid_subplans);
+
+	READ_DONE();
+}
+
 /*
  * parseNodeString
  *
@@ -3050,6 +3083,10 @@ parseNodeString(void)
 		return_value = _readJsonValueExpr();
 	else if (MATCH("JSONCTOREXPR", 12))
 		return_value = _readJsonConstructorExpr();
+	else if (MATCH("EXECLOCKRELSINFO", 16))
+		return_value = _readExecLockRelsInfo();
+	else if (MATCH("PLANINITPRUNINGOUTPUT", 21))
+		return_value = _readPlanInitPruningOutput();
 	else
 	{
 		elog(ERROR, "badly formatted node string \"%.32s\"...", token);
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 374a9d9753..329fb9d6e7 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -517,7 +517,9 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 	result->transientPlan = glob->transientPlan;
 	result->dependsOnRole = glob->dependsOnRole;
 	result->parallelModeNeeded = glob->parallelModeNeeded;
+	result->containsInitialPruning = glob->containsInitialPruning;
 	result->planTree = top_plan;
+	result->numPlanNodes = glob->lastPlanNodeId;
 	result->rtable = glob->finalrtable;
 	result->resultRelations = glob->resultRelations;
 	result->appendRelations = glob->appendRelations;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index dbdeb8ec9d..ac795ae9d9 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -1561,6 +1561,9 @@ set_append_references(PlannerInfo *root,
 				pinfo->rtindex += rtoffset;
 			}
 		}
+
+		if (aplan->part_prune_info->needs_init_pruning)
+			root->glob->containsInitialPruning = true;
 	}
 
 	/* We don't need to recurse to lefttree or righttree ... */
@@ -1648,6 +1651,9 @@ set_mergeappend_references(PlannerInfo *root,
 				pinfo->rtindex += rtoffset;
 			}
 		}
+
+		if (mplan->part_prune_info->needs_init_pruning)
+			root->glob->containsInitialPruning = true;
 	}
 
 	/* We don't need to recurse to lefttree or righttree ... */
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 7080cb25d9..3322dc79f2 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -144,7 +144,9 @@ static List *make_partitionedrel_pruneinfo(PlannerInfo *root,
 										   List *prunequal,
 										   Bitmapset *partrelids,
 										   int *relid_subplan_map,
-										   Bitmapset **matchedsubplans);
+										   Bitmapset **matchedsubplans,
+										   bool *needs_init_pruning,
+										   bool *needs_exec_pruning);
 static void gen_partprune_steps(RelOptInfo *rel, List *clauses,
 								PartClauseTarget target,
 								GeneratePruningStepsContext *context);
@@ -230,6 +232,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	int		   *relid_subplan_map;
 	ListCell   *lc;
 	int			i;
+	bool		needs_init_pruning = false;
+	bool		needs_exec_pruning = false;
 
 	/*
 	 * Scan the subpaths to see which ones are scans of partition child
@@ -309,12 +313,16 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		Bitmapset  *partrelids = (Bitmapset *) lfirst(lc);
 		List	   *pinfolist;
 		Bitmapset  *matchedsubplans = NULL;
+		bool		partrel_needs_init_pruning;
+		bool		partrel_needs_exec_pruning;
 
 		pinfolist = make_partitionedrel_pruneinfo(root, parentrel,
 												  prunequal,
 												  partrelids,
 												  relid_subplan_map,
-												  &matchedsubplans);
+												  &matchedsubplans,
+												  &partrel_needs_init_pruning,
+												  &partrel_needs_exec_pruning);
 
 		/* When pruning is possible, record the matched subplans */
 		if (pinfolist != NIL)
@@ -323,6 +331,10 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 			allmatchedsubplans = bms_join(matchedsubplans,
 										  allmatchedsubplans);
 		}
+		if (!needs_init_pruning)
+			needs_init_pruning = partrel_needs_init_pruning;
+		if (!needs_exec_pruning)
+			needs_exec_pruning = partrel_needs_exec_pruning;
 	}
 
 	pfree(relid_subplan_map);
@@ -337,6 +349,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	/* Else build the result data structure */
 	pruneinfo = makeNode(PartitionPruneInfo);
 	pruneinfo->prune_infos = prunerelinfos;
+	pruneinfo->needs_init_pruning = needs_init_pruning;
+	pruneinfo->needs_exec_pruning = needs_exec_pruning;
 
 	/*
 	 * Some subplans may not belong to any of the identified partitioned rels.
@@ -435,13 +449,18 @@ add_part_relids(List *allpartrelids, Bitmapset *partrelids)
  * If we cannot find any useful run-time pruning steps, return NIL.
  * However, on success, each rel identified in partrelids will have
  * an element in the result list, even if some of them are useless.
+ * *needs_init_pruning and *needs_exec_pruning are set to indicate that the
+ * returned PartitionedRelPruneInfos contain pruning steps that can be
+ * performed before and after execution begins, respectively.
  */
 static List *
 make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 							  List *prunequal,
 							  Bitmapset *partrelids,
 							  int *relid_subplan_map,
-							  Bitmapset **matchedsubplans)
+							  Bitmapset **matchedsubplans,
+							  bool *needs_init_pruning,
+							  bool *needs_exec_pruning)
 {
 	RelOptInfo *targetpart = NULL;
 	List	   *pinfolist = NIL;
@@ -452,6 +471,10 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	int			rti;
 	int			i;
 
+	/* Will find out below. */
+	*needs_init_pruning = false;
+	*needs_exec_pruning = false;
+
 	/*
 	 * Examine each partitioned rel, constructing a temporary array to map
 	 * from planner relids to index of the partitioned rel, and building a
@@ -539,6 +562,9 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		 * executor per-scan pruning steps.  This first pass creates startup
 		 * pruning steps and detects whether there's any possibly-useful quals
 		 * that would require per-scan pruning.
+		 *
+		 * In the first pass, we also determine whether the 2nd pass is
+		 * necessary by checking for the presence of EXEC parameters.
 		 */
 		gen_partprune_steps(subpart, partprunequal, PARTTARGET_INITIAL,
 							&context);
@@ -613,6 +639,11 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		pinfo->execparamids = execparamids;
 		/* Remaining fields will be filled in the next loop */
 
+		if (!*needs_init_pruning)
+			*needs_init_pruning = (initial_pruning_steps != NIL);
+		if (!*needs_exec_pruning)
+			*needs_exec_pruning = (exec_pruning_steps != NIL);
+
 		pinfolist = lappend(pinfolist, pinfo);
 	}
 
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index ba2fcfeb4a..085eb3f209 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -945,15 +945,17 @@ pg_plan_query(Query *querytree, const char *query_string, int cursorOptions,
  * For normal optimizable statements, invoke the planner.  For utility
  * statements, just make a wrapper PlannedStmt node.
  *
- * The result is a list of PlannedStmt nodes.
+ * The result is a list of PlannedStmt nodes.  Also, a NULL is appended to
+ * *execlockrelsinfo_list for each PlannedStmt added to the returned list.
  */
 List *
 pg_plan_queries(List *querytrees, const char *query_string, int cursorOptions,
-				ParamListInfo boundParams)
+				ParamListInfo boundParams, List **execlockrelsinfo_list)
 {
 	List	   *stmt_list = NIL;
 	ListCell   *query_list;
 
+	*execlockrelsinfo_list = NIL;
 	foreach(query_list, querytrees)
 	{
 		Query	   *query = lfirst_node(Query, query_list);
@@ -977,6 +979,7 @@ pg_plan_queries(List *querytrees, const char *query_string, int cursorOptions,
 		}
 
 		stmt_list = lappend(stmt_list, stmt);
+		*execlockrelsinfo_list = lappend(*execlockrelsinfo_list, NULL);
 	}
 
 	return stmt_list;
@@ -1080,7 +1083,8 @@ exec_simple_query(const char *query_string)
 		QueryCompletion qc;
 		MemoryContext per_parsetree_context = NULL;
 		List	   *querytree_list,
-				   *plantree_list;
+				   *plantree_list,
+				   *plantree_execlockrelsinfo_list;
 		Portal		portal;
 		DestReceiver *receiver;
 		int16		format;
@@ -1167,7 +1171,8 @@ exec_simple_query(const char *query_string)
 												NULL, 0, NULL);
 
 		plantree_list = pg_plan_queries(querytree_list, query_string,
-										CURSOR_OPT_PARALLEL_OK, NULL);
+										CURSOR_OPT_PARALLEL_OK, NULL,
+										&plantree_execlockrelsinfo_list);
 
 		/*
 		 * Done with the snapshot used for parsing/planning.
@@ -1203,6 +1208,7 @@ exec_simple_query(const char *query_string)
 						  query_string,
 						  commandTag,
 						  plantree_list,
+						  plantree_execlockrelsinfo_list,
 						  NULL);
 
 		/*
@@ -1991,6 +1997,7 @@ exec_bind_message(StringInfo input_message)
 					  query_string,
 					  psrc->commandTag,
 					  cplan->stmt_list,
+					  cplan->execlockrelsinfo_list,
 					  cplan);
 
 	/* Done with the snapshot used for parameter I/O and parsing/planning */
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 5f907831a3..972ddc014e 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -35,7 +35,7 @@
 Portal		ActivePortal = NULL;
 
 
-static void ProcessQuery(PlannedStmt *plan,
+static void ProcessQuery(PlannedStmt *plan, ExecLockRelsInfo *execlockrelsinfo,
 						 const char *sourceText,
 						 ParamListInfo params,
 						 QueryEnvironment *queryEnv,
@@ -65,6 +65,7 @@ static void DoPortalRewind(Portal portal);
  */
 QueryDesc *
 CreateQueryDesc(PlannedStmt *plannedstmt,
+				ExecLockRelsInfo *execlockrelsinfo,
 				const char *sourceText,
 				Snapshot snapshot,
 				Snapshot crosscheck_snapshot,
@@ -77,6 +78,7 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
 
 	qd->operation = plannedstmt->commandType;	/* operation */
 	qd->plannedstmt = plannedstmt;	/* plan */
+	qd->execlockrelsinfo = execlockrelsinfo;		/* ExecutorGetLockRels() output for plan */
 	qd->sourceText = sourceText;	/* query text */
 	qd->snapshot = RegisterSnapshot(snapshot);	/* snapshot */
 	/* RI check snapshot */
@@ -122,6 +124,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  *		PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
  *
  *	plan: the plan tree for the query
+ *	execlockrelsinfo: ExecutorGetLockRels() output for the plan tree
  *	sourceText: the source text of the query
  *	params: any parameters needed
  *	dest: where to send results
@@ -134,6 +137,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  */
 static void
 ProcessQuery(PlannedStmt *plan,
+			 ExecLockRelsInfo *execlockrelsinfo,
 			 const char *sourceText,
 			 ParamListInfo params,
 			 QueryEnvironment *queryEnv,
@@ -145,7 +149,7 @@ ProcessQuery(PlannedStmt *plan,
 	/*
 	 * Create the QueryDesc object
 	 */
-	queryDesc = CreateQueryDesc(plan, sourceText,
+	queryDesc = CreateQueryDesc(plan, execlockrelsinfo, sourceText,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, params, queryEnv, 0);
 
@@ -490,6 +494,7 @@ PortalStart(Portal portal, ParamListInfo params,
 				 * the destination to DestNone.
 				 */
 				queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+											linitial_node(ExecLockRelsInfo, portal->execlockrelsinfos),
 											portal->sourceText,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
@@ -1190,7 +1195,8 @@ PortalRunMulti(Portal portal,
 			   QueryCompletion *qc)
 {
 	bool		active_snapshot_set = false;
-	ListCell   *stmtlist_item;
+	ListCell   *stmtlist_item,
+			   *execlockrelsinfolist_item;
 
 	/*
 	 * If the destination is DestRemoteExecute, change to DestNone.  The
@@ -1211,9 +1217,12 @@ PortalRunMulti(Portal portal,
 	 * Loop to handle the individual queries generated from a single parsetree
 	 * by analysis and rewrite.
 	 */
-	foreach(stmtlist_item, portal->stmts)
+	forboth(stmtlist_item, portal->stmts,
+			execlockrelsinfolist_item, portal->execlockrelsinfos)
 	{
 		PlannedStmt *pstmt = lfirst_node(PlannedStmt, stmtlist_item);
+		ExecLockRelsInfo *execlockrelsinfo = lfirst_node(ExecLockRelsInfo,
+											   execlockrelsinfolist_item);
 
 		/*
 		 * If we got a cancel signal in prior command, quit
@@ -1271,7 +1280,7 @@ PortalRunMulti(Portal portal,
 			if (pstmt->canSetTag)
 			{
 				/* statement can set tag string */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, execlockrelsinfo,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
@@ -1280,7 +1289,7 @@ PortalRunMulti(Portal portal,
 			else
 			{
 				/* stmt added by rewrite cannot set tag */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, execlockrelsinfo,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 4cf6db504f..9f5a40a0a6 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -99,14 +99,16 @@ static dlist_head cached_expression_list = DLIST_STATIC_INIT(cached_expression_l
 static void ReleaseGenericPlan(CachedPlanSource *plansource);
 static List *RevalidateCachedQuery(CachedPlanSource *plansource,
 								   QueryEnvironment *queryEnv);
-static bool CheckCachedPlan(CachedPlanSource *plansource);
+static bool CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams);
 static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 								   ParamListInfo boundParams, QueryEnvironment *queryEnv);
+static void CachedPlanSaveExecLockRelsInfos(CachedPlan *plan, List *execlockrelsinfo_list);
 static bool choose_custom_plan(CachedPlanSource *plansource,
 							   ParamListInfo boundParams);
 static double cached_plan_cost(CachedPlan *plan, bool include_planner);
 static Query *QueryListGetPrimaryStmt(List *stmts);
-static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static List *AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams);
+static void ReleaseExecutorLocks(List *stmt_list, List *execlockrelsinfo_list);
 static void AcquirePlannerLocks(List *stmt_list, bool acquire);
 static void ScanQueryForLocks(Query *parsetree, bool acquire);
 static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -790,9 +792,21 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
  *
  * On a "true" return, we have acquired the locks needed to run the plan.
  * (We must do this for the "true" result to be race-condition-free.)
+ *
+ * If the CachedPlan is valid, this may in some cases call ExecutorGetLockRels()
+ * on each PlannedStmt contained in it to determine the set of relations that
+ * AcquireExecutorLocks() must lock, instead of just scanning its range table.
+ * That allows skipping the relations scanned by plan nodes that need not be
+ * executed based on the result of initial partition pruning.  The resulting
+ * ExecLockRelsInfo nodes containing the result of such pruning, allocated in
+ * a child context of the context containing the plan itself, are saved in
+ * plan->execlockrelsinfo_list.  The previous contents of the list from the
+ * last invocation on the same CachedPlan are deleted, because they would no
+ * longer be valid given the fresh set of parameter values, which may be used
+ * as pruning parameters.
  */
 static bool
-CheckCachedPlan(CachedPlanSource *plansource)
+CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams)
 {
 	CachedPlan *plan = plansource->gplan;
 
@@ -820,13 +834,25 @@ CheckCachedPlan(CachedPlanSource *plansource)
 	 */
 	if (plan->is_valid)
 	{
+		List   *execlockrelsinfo_list;
+
 		/*
 		 * Plan must have positive refcount because it is referenced by
 		 * plansource; so no need to fear it disappears under us here.
 		 */
 		Assert(plan->refcount > 0);
 
-		AcquireExecutorLocks(plan->stmt_list, true);
+		/*
+		 * Lock relations scanned by the plan.  If ExecutorGetLockRels() omits
+		 * some relations because the plan nodes that scan them were found to
+		 * be pruned, the executor must be told which plan nodes were pruned
+		 * so that it doesn't accidentally try to execute them.  That is done
+		 * via the ExecLockRelsInfo nodes collected in the returned list,
+		 * which is passed to the executor along with the list of
+		 * PlannedStmts.
+		 */
+		execlockrelsinfo_list = AcquireExecutorLocks(plan->stmt_list,
+													 boundParams);
 
 		/*
 		 * If plan was transient, check to see if TransactionXmin has
@@ -844,11 +870,14 @@ CheckCachedPlan(CachedPlanSource *plansource)
 		if (plan->is_valid)
 		{
 			/* Successfully revalidated and locked the query. */
+
+			/* Remember ExecLockRelsInfos in the CachedPlan. */
+			CachedPlanSaveExecLockRelsInfos(plan, execlockrelsinfo_list);
 			return true;
 		}
 
 		/* Oops, the race case happened.  Release useless locks. */
-		AcquireExecutorLocks(plan->stmt_list, false);
+		ReleaseExecutorLocks(plan->stmt_list, execlockrelsinfo_list);
 	}
 
 	/*
@@ -880,7 +909,8 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 				ParamListInfo boundParams, QueryEnvironment *queryEnv)
 {
 	CachedPlan *plan;
-	List	   *plist;
+	List	   *plist,
+			   *execlockrelsinfo_list;
 	bool		snapshot_set;
 	bool		is_transient;
 	MemoryContext plan_context;
@@ -933,7 +963,8 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 	 * Generate the plan.
 	 */
 	plist = pg_plan_queries(qlist, plansource->query_string,
-							plansource->cursor_options, boundParams);
+							plansource->cursor_options, boundParams,
+							&execlockrelsinfo_list);
 
 	/* Release snapshot if we got one */
 	if (snapshot_set)
@@ -1002,6 +1033,16 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 	plan->is_saved = false;
 	plan->is_valid = true;
 
+	/*
+	 * Save the dummy ExecLockRelsInfo list, that is, a list containing NULLs
+	 * as elements.  We must do this because users of the CachedPlan expect
+	 * one to go with the list of PlannedStmts.
+	 * XXX maybe get rid of that contract.
+	 */
+	plan->execlockrelsinfo_context = NULL;
+	CachedPlanSaveExecLockRelsInfos(plan, execlockrelsinfo_list);
+	Assert(MemoryContextIsValid(plan->execlockrelsinfo_context));
+
 	/* assign generation number to new plan */
 	plan->generation = ++(plansource->generation);
 
@@ -1160,7 +1201,7 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 
 	if (!customplan)
 	{
-		if (CheckCachedPlan(plansource))
+		if (CheckCachedPlan(plansource, boundParams))
 		{
 			/* We want a generic plan, and we already have a valid one */
 			plan = plansource->gplan;
@@ -1586,6 +1627,49 @@ CopyCachedPlan(CachedPlanSource *plansource)
 	return newsource;
 }
 
+/*
+ * CachedPlanSaveExecLockRelsInfos
+ *		Save the list containing ExecLockRelsInfo nodes into the given
+ *		CachedPlan
+ *
+ * The provided list is copied into a dedicated context that is a child of
+ * plan->context.  If the child context already exists, it is emptied, because
+ * any ExecLockRelsInfo contained therein would no longer be useful.
+ */
+static void
+CachedPlanSaveExecLockRelsInfos(CachedPlan *plan, List *execlockrelsinfo_list)
+{
+	MemoryContext	execlockrelsinfo_context = plan->execlockrelsinfo_context,
+					oldcontext = CurrentMemoryContext;
+	List		   *execlockrelsinfo_list_copy;
+
+	/*
+	 * Set up the dedicated context if not already done, saving it as a child
+	 * of the CachedPlan's context.
+	 */
+	if (execlockrelsinfo_context == NULL)
+	{
+		execlockrelsinfo_context = AllocSetContextCreate(CurrentMemoryContext,
+												 "CachedPlan execlockrelsinfo list",
+												 ALLOCSET_START_SMALL_SIZES);
+		MemoryContextSetParent(execlockrelsinfo_context, plan->context);
+		MemoryContextSetIdentifier(execlockrelsinfo_context, plan->context->ident);
+		plan->execlockrelsinfo_context = execlockrelsinfo_context;
+	}
+	else
+	{
+		/* Just clear existing contents by resetting the context. */
+		Assert(MemoryContextIsValid(execlockrelsinfo_context));
+		MemoryContextReset(execlockrelsinfo_context);
+	}
+
+	MemoryContextSwitchTo(execlockrelsinfo_context);
+	execlockrelsinfo_list_copy = copyObject(execlockrelsinfo_list);
+	MemoryContextSwitchTo(oldcontext);
+
+	plan->execlockrelsinfo_list = execlockrelsinfo_list_copy;
+}
+
 /*
  * CachedPlanIsValid: test whether the rewritten querytree within a
  * CachedPlanSource is currently valid (that is, not marked as being in need
@@ -1737,17 +1821,21 @@ QueryListGetPrimaryStmt(List *stmts)
 
 /*
  * AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
- * or release them if acquire is false.
+ *
+ * Returns a list containing one element for each PlannedStmt in stmt_list:
+ * an ExecLockRelsInfo node, or NULL if the PlannedStmt is a utility statement
+ * or its containsInitialPruning is false.
  */
-static void
-AcquireExecutorLocks(List *stmt_list, bool acquire)
+static List *
+AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams)
 {
 	ListCell   *lc1;
+	List	   *execlockrelsinfo_list = NIL;
 
 	foreach(lc1, stmt_list)
 	{
 		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
-		ListCell   *lc2;
+		ExecLockRelsInfo *execlockrelsinfo = NULL;
 
 		if (plannedstmt->commandType == CMD_UTILITY)
 		{
@@ -1761,27 +1849,139 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 			Query	   *query = UtilityContainsQuery(plannedstmt->utilityStmt);
 
 			if (query)
-				ScanQueryForLocks(query, acquire);
-			continue;
+				ScanQueryForLocks(query, true);
 		}
-
-		foreach(lc2, plannedstmt->rtable)
+		else
 		{
-			RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
+			/*
+			 * Figure out the set of relations that would need to be locked
+			 * before executing the plan.
+			 */
+			if (!plannedstmt->containsInitialPruning)
+			{
+				/*
+				 * If the plan contains no initial pruning steps, just lock
+				 * all the relations found in the range table.
+				 */
+				ListCell *lc;
 
-			if (rte->rtekind != RTE_RELATION)
-				continue;
+				foreach(lc, plannedstmt->rtable)
+				{
+					RangeTblEntry *rte = lfirst(lc);
+
+					if (rte->rtekind != RTE_RELATION)
+						continue;
+
+					/*
+					 * Acquire the appropriate type of lock on each relation
+					 * OID. Note that we don't actually try to open the rel,
+					 * and hence will not fail if it's been dropped entirely
+					 * --- we'll just transiently acquire a non-conflicting
+					 *  lock.
+					 */
+					LockRelationOid(rte->relid, rte->rellockmode);
+				}
+			}
+			else
+			{
+				int			rti;
+				Bitmapset  *lockrels;
+
+				/*
+				 * Walk the plan tree to find only the minimal set of
+				 * relations to be locked, considering the effect of performing
+				 * initial partition pruning.
+				 */
+				execlockrelsinfo = ExecutorGetLockRels(plannedstmt, boundParams);
+				lockrels = execlockrelsinfo->lockrels;
+
+				rti = -1;
+				while ((rti = bms_next_member(lockrels, rti)) >= 0)
+				{
+					RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
 
+					Assert(rte->rtekind == RTE_RELATION);
+
+					/* See the comment above. */
+					LockRelationOid(rte->relid, rte->rellockmode);
+				}
+			}
+		}
+
+		/*
+		 * Remember ExecLockRelsInfo for later adding to the QueryDesc that
+		 * will be passed to the executor when executing this plan.  May be
+		 * NULL, but must keep the list the same length as stmt_list.
+		 */
+		execlockrelsinfo_list = lappend(execlockrelsinfo_list,
+										execlockrelsinfo);
+	}
+
+	return execlockrelsinfo_list;
+}
+
+/*
+ * ReleaseExecutorLocks
+ * 		Release locks that would've been acquired by an earlier call to
+ * 		AcquireExecutorLocks()
+ */
+static void
+ReleaseExecutorLocks(List *stmt_list, List *execlockrelsinfo_list)
+{
+	ListCell   *lc1,
+			   *lc2;
+
+	forboth(lc1, stmt_list, lc2, execlockrelsinfo_list)
+	{
+		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
+		ExecLockRelsInfo *execlockrelsinfo = lfirst_node(ExecLockRelsInfo, lc2);
+
+		if (plannedstmt->commandType == CMD_UTILITY)
+		{
 			/*
-			 * Acquire the appropriate type of lock on each relation OID. Note
-			 * that we don't actually try to open the rel, and hence will not
-			 * fail if it's been dropped entirely --- we'll just transiently
-			 * acquire a non-conflicting lock.
+			 * Ignore utility statements, except those (such as EXPLAIN) that
+			 * contain a parsed-but-not-planned query.  Note: it's okay to use
+			 * ScanQueryForLocks, even though the query hasn't been through
+			 * rule rewriting, because rewriting doesn't change the query
+			 * representation.
 			 */
-			if (acquire)
-				LockRelationOid(rte->relid, rte->rellockmode);
+			Query	   *query = UtilityContainsQuery(plannedstmt->utilityStmt);
+
+			if (query)
+				ScanQueryForLocks(query, false);
+		}
+		else
+		{
+			if (execlockrelsinfo == NULL)
+			{
+				ListCell *lc;
+
+				foreach(lc, plannedstmt->rtable)
+				{
+					RangeTblEntry *rte = lfirst(lc);
+
+					if (rte->rtekind != RTE_RELATION)
+						continue;
+
+					UnlockRelationOid(rte->relid, rte->rellockmode);
+				}
+			}
 			else
-				UnlockRelationOid(rte->relid, rte->rellockmode);
+			{
+				int			rti;
+				Bitmapset  *lockrels;
+
+				lockrels = execlockrelsinfo->lockrels;
+				rti = -1;
+				while ((rti = bms_next_member(lockrels, rti)) >= 0)
+				{
+					RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
+
+					Assert(rte->rtekind == RTE_RELATION);
+
+					UnlockRelationOid(rte->relid, rte->rellockmode);
+				}
+			}
 		}
 	}
 }
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index d549f66d4a..896f51be08 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -285,6 +285,7 @@ PortalDefineQuery(Portal portal,
 				  const char *sourceText,
 				  CommandTag commandTag,
 				  List *stmts,
+				  List *execlockrelsinfos,
 				  CachedPlan *cplan)
 {
 	AssertArg(PortalIsValid(portal));
@@ -299,6 +300,7 @@ PortalDefineQuery(Portal portal,
 	portal->qc.nprocessed = 0;
 	portal->commandTag = commandTag;
 	portal->stmts = stmts;
+	portal->execlockrelsinfos = execlockrelsinfos;
 	portal->cplan = cplan;
 	portal->status = PORTAL_DEFINED;
 }
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 666977fb1f..fef75ba147 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -87,7 +87,8 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
 							  ExplainState *es, const char *queryString,
 							  ParamListInfo params, QueryEnvironment *queryEnv);
 
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt, ExecLockRelsInfo *execlockrelsinfo,
+						   IntoClause *into,
 						   ExplainState *es, const char *queryString,
 						   ParamListInfo params, QueryEnvironment *queryEnv,
 						   const instr_time *planduration,
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index fd5735a946..ded19b8cbb 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -124,4 +124,6 @@ extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
 						 PartitionPruneInfo *pruneinfo,
 						 Bitmapset **initially_valid_subplans);
 extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate);
+extern Bitmapset *ExecGetLockRelsDoInitialPruning(Plan *plan, ExecGetLockRelsContext *context,
+								PartitionPruneInfo *pruneinfo);
 #endif							/* EXECPARTITION_H */
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index e79e2c001f..4338463479 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -35,6 +35,7 @@ typedef struct QueryDesc
 	/* These fields are provided by CreateQueryDesc */
 	CmdType		operation;		/* CMD_SELECT, CMD_UPDATE, etc. */
 	PlannedStmt *plannedstmt;	/* planner's output (could be utility, too) */
+	ExecLockRelsInfo *execlockrelsinfo;	/* ExecutorGetLockRels()'s output given plannedstmt */
 	const char *sourceText;		/* source text of the query */
 	Snapshot	snapshot;		/* snapshot to use for query */
 	Snapshot	crosscheck_snapshot;	/* crosscheck for RI update/delete */
@@ -57,6 +58,7 @@ typedef struct QueryDesc
 
 /* in pquery.c */
 extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
+								  ExecLockRelsInfo *execlockrelsinfo,
 								  const char *sourceText,
 								  Snapshot snapshot,
 								  Snapshot crosscheck_snapshot,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index 82925b4b63..5cf414cc11 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -185,6 +185,8 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
 /*
  * prototypes from functions in execMain.c
  */
+extern ExecLockRelsInfo *ExecutorGetLockRels(PlannedStmt *plannedstmt, ParamListInfo params);
+extern bool ExecGetLockRels(Plan *node, ExecGetLockRelsContext *context);
 extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
 extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
 extern void ExecutorRun(QueryDesc *queryDesc,
diff --git a/src/include/executor/nodeAppend.h b/src/include/executor/nodeAppend.h
index 4cb78ee5b6..b53535c2a4 100644
--- a/src/include/executor/nodeAppend.h
+++ b/src/include/executor/nodeAppend.h
@@ -17,6 +17,7 @@
 #include "access/parallel.h"
 #include "nodes/execnodes.h"
 
+extern bool ExecGetAppendLockRels(Append *node, ExecGetLockRelsContext *context);
 extern AppendState *ExecInitAppend(Append *node, EState *estate, int eflags);
 extern void ExecEndAppend(AppendState *node);
 extern void ExecReScanAppend(AppendState *node);
diff --git a/src/include/executor/nodeMergeAppend.h b/src/include/executor/nodeMergeAppend.h
index 97fe3b0665..8eb4e9df93 100644
--- a/src/include/executor/nodeMergeAppend.h
+++ b/src/include/executor/nodeMergeAppend.h
@@ -16,6 +16,7 @@
 
 #include "nodes/execnodes.h"
 
+extern bool ExecGetMergeAppendLockRels(MergeAppend *node, ExecGetLockRelsContext *context);
 extern MergeAppendState *ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags);
 extern void ExecEndMergeAppend(MergeAppendState *node);
 extern void ExecReScanMergeAppend(MergeAppendState *node);
diff --git a/src/include/executor/nodeModifyTable.h b/src/include/executor/nodeModifyTable.h
index 1d225bc88d..5006499088 100644
--- a/src/include/executor/nodeModifyTable.h
+++ b/src/include/executor/nodeModifyTable.h
@@ -19,6 +19,7 @@ extern void ExecComputeStoredGenerated(ResultRelInfo *resultRelInfo,
 									   EState *estate, TupleTableSlot *slot,
 									   CmdType cmdtype);
 
+extern bool ExecGetModifyTableLockRels(ModifyTable *plan, ExecGetLockRelsContext *context);
 extern ModifyTableState *ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags);
 extern void ExecEndModifyTable(ModifyTableState *node);
 extern void ExecReScanModifyTable(ModifyTableState *node);
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 44dd73fc80..1253fdb0ed 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -576,6 +576,7 @@ typedef struct EState
 	struct ExecRowMark **es_rowmarks;	/* Array of per-range-table-entry
 										 * ExecRowMarks, or NULL if none */
 	PlannedStmt *es_plannedstmt;	/* link to top of plan tree */
+	struct ExecLockRelsInfo *es_execlockrelsinfo; /* QueryDesc.execlockrelsinfo */
 	const char *es_sourceText;	/* Source text from QueryDesc */
 
 	JunkFilter *es_junkFilter;	/* top-level junk filter, if any */
@@ -964,6 +965,101 @@ typedef struct DomainConstraintState
  */
 typedef TupleTableSlot *(*ExecProcNodeMtd) (struct PlanState *pstate);
 
+/*----------------
+ * ExecLockRelsInfo
+ *
+ * Result of performing ExecutorGetLockRels() for a given PlannedStmt
+ */
+typedef struct ExecLockRelsInfo
+{
+	NodeTag		type;
+
+	/*
+	 * Relations that must be locked to execute the plan tree contained in
+	 * the PlannedStmt.
+	 */
+	Bitmapset  *lockrels;
+
+	/* PlannedStmt.numPlanNodes */
+	int			numPlanNodes;
+
+	/*
+	 * List of PlanInitPruningOutput, each representing the output of
+	 * performing initial pruning on a given plan node, for all nodes in the
+	 * plan tree that have been marked as needing initial pruning.
+	 *
+	 * 'ipoIndexes' is an array of 'numPlanNodes' elements, indexed with
+	 * plan_node_id of the individual nodes in the plan tree, each a 1-based
+	 * index into 'initPruningOutputs' list for a given plan node.  0 means
+	 * that a given plan node has no entry in the list because of not needing
+	 * any initial pruning done on it.
+	 */
+	List	   *initPruningOutputs;
+	int		   *ipoIndexes;
+} ExecLockRelsInfo;
+
+/*----------------
+ * ExecGetLockRelsContext
+ *
+ * Information pertaining to ExecutorGetLockRels() invocation for a given
+ * plan.
+ */
+typedef struct ExecGetLockRelsContext
+{
+	NodeTag		type;
+
+	PlannedStmt	   *stmt;		/* target plan */
+	ParamListInfo	params;		/* EXTERN parameters available for pruning */
+
+	/* Output parameters for ExecGetLockRels and its subroutines. */
+	Bitmapset	   *lockrels;
+
+	/* See the comment in the definition of ExecLockRelsInfo struct. */
+	List		   *initPruningOutputs;
+	int			   *ipoIndexes;
+} ExecGetLockRelsContext;
+
+/*
+ * Appends the provided PlanInitPruningOutput to
+ * ExecGetLockRelsContext.initPruningOutputs
+ */
+#define ExecStorePlanInitPruningOutput(cxt, initPruningOutput, plannode) \
+	do { \
+		(cxt)->initPruningOutputs = lappend((cxt)->initPruningOutputs, initPruningOutput); \
+		(cxt)->ipoIndexes[(plannode)->plan_node_id] = list_length((cxt)->initPruningOutputs); \
+	} while (0)
+
+/*
+ * Finds the PlanInitPruningOutput for a given Plan node in
+ * ExecLockRelsInfo.initPruningOutputs.
+ */
+#define ExecFetchPlanInitPruningOutput(execlockrelsinfo, plannode) \
+		(((execlockrelsinfo) != NULL && (execlockrelsinfo)->initPruningOutputs != NIL) ? \
+		 list_nth((execlockrelsinfo)->initPruningOutputs, \
+				  (execlockrelsinfo)->ipoIndexes[(plannode)->plan_node_id] - 1) : NULL)
+
+/* ---------------
+ * PlanInitPruningOutput
+ *
+ * Node to remember the result of performing initial partition pruning steps
+ * during ExecutorGetLockRels() on nodes that support pruning.
+ *
+ * ExecGetLockRelsDoInitialPruning(), which runs during ExecutorGetLockRels(),
+ * creates it and stores it in the corresponding ExecLockRelsInfo.
+ *
+ * ExecInitPartitionPruning(), which runs during ExecutorStart(), fetches it
+ * from the EState's ExecLockRelsInfo (if any) and uses the value of
+ * initially_valid_subplans contained in it as-is to select the subplans to be
+ * initialized for execution, instead of re-evaluating that by performing
+ * initial pruning again.
+ */
+typedef struct PlanInitPruningOutput
+{
+	NodeTag		type;
+
+	Bitmapset  *initially_valid_subplans;
+} PlanInitPruningOutput;
+
 /* ----------------
  *		PlanState node
  *
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 05f0b79e82..00c4d8293e 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -96,6 +96,11 @@ typedef enum NodeTag
 	T_PartitionPruneStepCombine,
 	T_PlanInvalItem,
 
+	/* TAGS FOR EXECUTOR PREP NODES (execnodes.h) */
+	T_ExecGetLockRelsContext,
+	T_ExecLockRelsInfo,
+	T_PlanInitPruningOutput,
+
 	/*
 	 * TAGS FOR PLAN STATE NODES (execnodes.h)
 	 *
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 5327d9ba8b..019719c1a4 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -129,6 +129,10 @@ typedef struct PlannerGlobal
 
 	char		maxParallelHazard;	/* worst PROPARALLEL hazard level */
 
+	bool		containsInitialPruning;	/* Do some Plan nodes in the tree
+										 * have initial (pre-exec) pruning
+										 * steps? */
+
 	PartitionDirectory partition_directory; /* partition descriptors */
 
 	Bitmapset *elidedAppendPartedRels;	/* Combined partitioned_rels of all
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index bd87c35d6c..bfdb5bbf28 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -59,10 +59,16 @@ typedef struct PlannedStmt
 
 	bool		parallelModeNeeded; /* parallel mode required to execute? */
 
+	bool		containsInitialPruning;	/* Do some Plan nodes in the tree
+										 * have initial (pre-exec) pruning
+										 * steps? */
+
 	int			jitFlags;		/* which forms of JIT should be performed */
 
 	struct Plan *planTree;		/* tree of Plan nodes */
 
+	int			numPlanNodes;	/* number of nodes in planTree */
+
 	List	   *rtable;			/* list of RangeTblEntry nodes */
 
 	/* rtable indexes of target relations for INSERT/UPDATE/DELETE */
@@ -1189,6 +1195,13 @@ typedef struct PlanRowMark
  * prune_infos			List of Lists containing PartitionedRelPruneInfo nodes,
  *						one sublist per run-time-prunable partition hierarchy
  *						appearing in the parent plan node's subplans.
+ *
+ * needs_init_pruning	Does any of the PartitionedRelPruneInfos in
+ *						prune_infos have its initial_pruning_steps set?
+ *
+ * needs_exec_pruning	Does any of the PartitionedRelPruneInfos in
+ *						prune_infos have its exec_pruning_steps set?
+ *
  * other_subplans		Indexes of any subplans that are not accounted for
  *						by any of the PartitionedRelPruneInfo nodes in
  *						"prune_infos".  These subplans must not be pruned.
@@ -1197,6 +1210,8 @@ typedef struct PartitionPruneInfo
 {
 	NodeTag		type;
 	List	   *prune_infos;
+	bool		needs_init_pruning;
+	bool		needs_exec_pruning;
 	Bitmapset  *other_subplans;
 } PartitionPruneInfo;
 
diff --git a/src/include/tcop/tcopprot.h b/src/include/tcop/tcopprot.h
index 92291a750d..bf80c53bed 100644
--- a/src/include/tcop/tcopprot.h
+++ b/src/include/tcop/tcopprot.h
@@ -64,7 +64,7 @@ extern PlannedStmt *pg_plan_query(Query *querytree, const char *query_string,
 								  ParamListInfo boundParams);
 extern List *pg_plan_queries(List *querytrees, const char *query_string,
 							 int cursorOptions,
-							 ParamListInfo boundParams);
+							 ParamListInfo boundParams, List **execlockrelsinfo_list);
 
 extern bool check_max_stack_depth(int *newval, void **extra, GucSource source);
 extern void assign_max_stack_depth(int newval, void *extra);
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 95b99e3d25..56b0dcc6bd 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -148,6 +148,9 @@ typedef struct CachedPlan
 {
 	int			magic;			/* should equal CACHEDPLAN_MAGIC */
 	List	   *stmt_list;		/* list of PlannedStmts */
+	List	   *execlockrelsinfo_list;	/* list of ExecLockRelsInfo nodes with one
+									 * element for each of stmt_list; NIL
+									 * if not a generic plan */
 	bool		is_oneshot;		/* is it a "oneshot" plan? */
 	bool		is_saved;		/* is CachedPlan in a long-lived context? */
 	bool		is_valid;		/* is the stmt_list currently valid? */
@@ -158,6 +161,9 @@ typedef struct CachedPlan
 	int			generation;		/* parent's generation number for this plan */
 	int			refcount;		/* count of live references to this struct */
 	MemoryContext context;		/* context containing this CachedPlan */
+	MemoryContext execlockrelsinfo_context;	/* context containing
+											 * execlockrelsinfo_list,
+											 * a child of the above context */
 } CachedPlan;
 
 /*
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index aeddbdafe5..9abace6734 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -137,6 +137,10 @@ typedef struct PortalData
 	CommandTag	commandTag;		/* command tag for original query */
 	QueryCompletion qc;			/* command completion data for executed query */
 	List	   *stmts;			/* list of PlannedStmts */
+	List	   *execlockrelsinfos;	/* list of ExecLockRelsInfo nodes with one element
+								 * for each of 'stmts'; same as
+								 * cplan->execlockrelsinfo_list if cplan is
+								 * not NULL */
 	CachedPlan *cplan;			/* CachedPlan, if stmts are from one */
 
 	ParamListInfo portalParams; /* params to pass to query */
@@ -241,6 +245,7 @@ extern void PortalDefineQuery(Portal portal,
 							  const char *sourceText,
 							  CommandTag commandTag,
 							  List *stmts,
+							  List *execlockrelsinfos,
 							  CachedPlan *cplan);
 extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
 extern void PortalCreateHoldStore(Portal portal);
-- 
2.24.1




* Re: generic plans and "initial" pruning
@ 2022-03-31 03:25  Amit Langote <[email protected]>
  parent: Amit Langote <[email protected]>
  0 siblings, 2 replies; 108+ messages in thread

From: Amit Langote @ 2022-03-31 03:25 UTC (permalink / raw)
  To: Robert Haas <[email protected]>; +Cc: Tom Lane <[email protected]>; pgsql-hackers; David Rowley *EXTERN* <[email protected]>

On Mon, Mar 28, 2022 at 4:28 PM Amit Langote <[email protected]> wrote:
> On Mon, Mar 28, 2022 at 4:17 PM Amit Langote <[email protected]> wrote:
> > Other than the changes mentioned above, the updated patch now contains
> > a bit more commentary than earlier versions, mostly around
> > AcquireExecutorLocks()'s new way of determining the set of relations
> > to lock and the significantly redesigned working of the "initial"
> > execution pruning.
>
> Forgot to rebase over the latest HEAD, so here's v7.  Also fixed that
> _out and _read functions for PlanInitPruningOutput were using an
> obsolete node label.

Rebased.
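
For readers following along, the lock-set computation mentioned in the
quoted text can be pictured with the standalone sketch below.  It is not
code from the patch: plain unsigned bitmasks stand in for Bitmapset, each
subplan is assumed to scan exactly one relation, and the names
next_member() and collect_lock_rels() are hypothetical.

```c
#include <assert.h>

/*
 * Hypothetical stand-in for bms_next_member(): return the smallest member
 * of 'set' greater than 'prev', or -1 if there is none.
 */
static int
next_member(unsigned set, int prev)
{
	for (int i = prev + 1; i < 32; i++)
	{
		if (set & (1u << i))
			return i;
	}
	return -1;
}

/*
 * Given the RT index scanned by each subplan and the bitmask of subplans
 * that survived initial pruning, return the bitmask of RT indexes to lock.
 * Assumption for this sketch: one relation per subplan, RT indexes in 1..31.
 */
static unsigned
collect_lock_rels(const int *subplan_rti, int nsubplans,
				  unsigned valid_subplans)
{
	unsigned	lockrels = 0;
	int			i = -1;

	while ((i = next_member(valid_subplans, i)) >= 0)
	{
		assert(i < nsubplans);
		assert(subplan_rti[i] > 0 && subplan_rti[i] < 32);
		lockrels |= 1u << subplan_rti[i];
	}
	return lockrels;
}
```

With three subplans scanning RT indexes 2, 3 and 4, and only subplans 0
and 2 surviving, the resulting set contains RT indexes 2 and 4; the
relation under the pruned subplan is never locked.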

-- 
Amit Langote
EDB: http://www.enterprisedb.com


Attachments:

  [application/octet-stream] v8-0004-Optimize-AcquireExecutorLocks-to-skip-pruned-part.patch (94.3K, 2-v8-0004-Optimize-AcquireExecutorLocks-to-skip-pruned-part.patch)
  download | inline diff:
From 9e0ae8887a9f3d75feb4df969dde504a21d3700d Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Wed, 22 Dec 2021 16:55:17 +0900
Subject: [PATCH v8 4/4] Optimize AcquireExecutorLocks() to skip pruned
 partitions

When the PlannedStmt indicates that some nodes in the plan tree can do
partition pruning without depending on execution having started
(so-called "initial" pruning), AcquireExecutorLocks() no longer locks
all relations listed in the range table.  Instead, it calls the new
executor function ExecutorGetLockRels(), which returns the set of
relations (their RT indexes) to be locked, excluding those scanned by
the subplans that were pruned.

The result of pruning done this way must be remembered and reused
during actual execution of the plan.  This is done by creating a
PlanInitPruningOutput node for each plan node that undergoes pruning;
the set of those for the whole plan tree is added to an
ExecLockRelsInfo, which also stores the bitmapset of RT indexes of
relations that are actually locked by AcquireExecutorLocks().
ExecLockRelsInfos are passed down the executor alongside the
PlannedStmts.  This arrangement ensures that the executor doesn't
accidentally try to process plan tree subnodes that have been
deemed pruned by AcquireExecutorLocks().
---
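
Note on the lookup scheme (illustration only, not part of the patch): the
sketch below models how a per-node pruning result might be stored and
fetched through a 1-based index array keyed by plan_node_id, where 0 means
"no entry".  Fixed-size arrays replace the List, and all names here
(LockRelsInfo, store_output, fetch_output) are hypothetical simplifications
of ExecLockRelsInfo and its two macros.  Unlike the macro in the patch,
the fetch function here checks explicitly for a 0 index.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PLAN_NODES 16

/* Simplified stand-in for PlanInitPruningOutput. */
typedef struct PruningOutput
{
	unsigned	valid_subplans; /* bitmask of surviving subplan indexes */
} PruningOutput;

/* Simplified stand-in for ExecLockRelsInfo. */
typedef struct LockRelsInfo
{
	int			num_plan_nodes;
	int			num_outputs;
	PruningOutput outputs[MAX_PLAN_NODES];
	int			ipo_indexes[MAX_PLAN_NODES];	/* 1-based; 0 = no entry */
} LockRelsInfo;

/* Append a pruning result and remember its 1-based position for the node. */
static void
store_output(LockRelsInfo *info, int plan_node_id, unsigned valid_subplans)
{
	assert(info->num_plan_nodes <= MAX_PLAN_NODES);
	assert(plan_node_id >= 0 && plan_node_id < info->num_plan_nodes);
	assert(info->num_outputs < MAX_PLAN_NODES);

	info->outputs[info->num_outputs].valid_subplans = valid_subplans;
	info->num_outputs++;
	info->ipo_indexes[plan_node_id] = info->num_outputs;
}

/* Return the node's pruning result, or NULL if none was recorded. */
static const PruningOutput *
fetch_output(const LockRelsInfo *info, int plan_node_id)
{
	int			idx;

	if (info == NULL || info->num_outputs == 0)
		return NULL;
	assert(plan_node_id >= 0 && plan_node_id < info->num_plan_nodes);

	idx = info->ipo_indexes[plan_node_id];
	return (idx > 0) ? &info->outputs[idx - 1] : NULL;
}
```

Keeping the index array 1-based lets a zero-initialized array mean "no
node needed initial pruning" without a separate validity flag.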
 src/backend/commands/copyto.c          |   2 +-
 src/backend/commands/createas.c        |   2 +-
 src/backend/commands/explain.c         |   7 +-
 src/backend/commands/extension.c       |  13 +-
 src/backend/commands/matview.c         |   2 +-
 src/backend/commands/portalcmds.c      |   1 +
 src/backend/commands/prepare.c         |  17 +-
 src/backend/executor/README            |  24 +++
 src/backend/executor/execMain.c        | 202 ++++++++++++++++++++
 src/backend/executor/execParallel.c    |  26 ++-
 src/backend/executor/execPartition.c   | 224 ++++++++++++++++++----
 src/backend/executor/execUtils.c       |   8 +
 src/backend/executor/functions.c       |   2 +-
 src/backend/executor/nodeAppend.c      |  52 ++++-
 src/backend/executor/nodeMergeAppend.c |  52 ++++-
 src/backend/executor/nodeModifyTable.c |  25 +++
 src/backend/executor/spi.c             |  14 +-
 src/backend/nodes/copyfuncs.c          |  49 ++++-
 src/backend/nodes/outfuncs.c           |  39 ++++
 src/backend/nodes/readfuncs.c          |  37 ++++
 src/backend/optimizer/plan/planner.c   |   2 +
 src/backend/optimizer/plan/setrefs.c   |   6 +
 src/backend/partitioning/partprune.c   |  37 +++-
 src/backend/tcop/postgres.c            |  15 +-
 src/backend/tcop/pquery.c              |  21 ++-
 src/backend/utils/cache/plancache.c    | 252 ++++++++++++++++++++++---
 src/backend/utils/mmgr/portalmem.c     |   2 +
 src/include/commands/explain.h         |   3 +-
 src/include/executor/execPartition.h   |   2 +
 src/include/executor/execdesc.h        |   2 +
 src/include/executor/executor.h        |   2 +
 src/include/executor/nodeAppend.h      |   1 +
 src/include/executor/nodeMergeAppend.h |   1 +
 src/include/executor/nodeModifyTable.h |   1 +
 src/include/nodes/execnodes.h          |  96 ++++++++++
 src/include/nodes/nodes.h              |   5 +
 src/include/nodes/pathnodes.h          |   4 +
 src/include/nodes/plannodes.h          |  15 ++
 src/include/tcop/tcopprot.h            |   2 +-
 src/include/utils/plancache.h          |   6 +
 src/include/utils/portal.h             |   5 +
 41 files changed, 1174 insertions(+), 104 deletions(-)

diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index 55c38b04c4..d403eb2309 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -542,7 +542,7 @@ BeginCopyTo(ParseState *pstate,
 		((DR_copy *) dest)->cstate = cstate;
 
 		/* Create a QueryDesc requesting no output */
-		cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+		cstate->queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
 											dest, NULL, NULL, 0);
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 9abbb6b555..f6607f2454 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -325,7 +325,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
 		UpdateActiveSnapshotCommandId();
 
 		/* Create a QueryDesc, redirecting output to our tuple receiver */
-		queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+		queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
 									GetActiveSnapshot(), InvalidSnapshot,
 									dest, params, queryEnv, 0);
 
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index cb13227db1..e5dff2bc25 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -407,7 +407,7 @@ ExplainOneQuery(Query *query, int cursorOptions,
 		}
 
 		/* run it (if needed) and produce output */
-		ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+		ExplainOnePlan(plan, NULL, into, es, queryString, params, queryEnv,
 					   &planduration, (es->buffers ? &bufusage : NULL));
 	}
 }
@@ -515,7 +515,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
  * to call it.
  */
 void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, ExecLockRelsInfo *execlockrelsinfo,
+			   IntoClause *into, ExplainState *es,
 			   const char *queryString, ParamListInfo params,
 			   QueryEnvironment *queryEnv, const instr_time *planduration,
 			   const BufferUsage *bufusage)
@@ -563,7 +564,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
 		dest = None_Receiver;
 
 	/* Create a QueryDesc for the query */
-	queryDesc = CreateQueryDesc(plannedstmt, queryString,
+	queryDesc = CreateQueryDesc(plannedstmt, execlockrelsinfo, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, params, queryEnv, instrument_option);
 
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index 1013790dbb..008b8ce0e9 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -741,8 +741,10 @@ execute_sql_string(const char *sql)
 		RawStmt    *parsetree = lfirst_node(RawStmt, lc1);
 		MemoryContext per_parsetree_context,
 					oldcontext;
-		List	   *stmt_list;
-		ListCell   *lc2;
+		List	   *stmt_list,
+				   *execlockrelsinfo_list;
+		ListCell   *lc2,
+				   *lc3;
 
 		/*
 		 * We do the work for each parsetree in a short-lived context, to
@@ -762,11 +764,13 @@ execute_sql_string(const char *sql)
 										   NULL,
 										   0,
 										   NULL);
-		stmt_list = pg_plan_queries(stmt_list, sql, CURSOR_OPT_PARALLEL_OK, NULL);
+		stmt_list = pg_plan_queries(stmt_list, sql, CURSOR_OPT_PARALLEL_OK, NULL,
+									&execlockrelsinfo_list);
 
-		foreach(lc2, stmt_list)
+		forboth(lc2, stmt_list, lc3, execlockrelsinfo_list)
 		{
 			PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+			ExecLockRelsInfo *execlockrelsinfo = lfirst_node(ExecLockRelsInfo, lc3);
 
 			CommandCounterIncrement();
 
@@ -777,6 +781,7 @@ execute_sql_string(const char *sql)
 				QueryDesc  *qdesc;
 
 				qdesc = CreateQueryDesc(stmt,
+										execlockrelsinfo,
 										sql,
 										GetActiveSnapshot(), NULL,
 										dest, NULL, NULL, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 05e7b60059..4ef44aaf23 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -416,7 +416,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
 	UpdateActiveSnapshotCommandId();
 
 	/* Create a QueryDesc, redirecting output to our tuple receiver */
-	queryDesc = CreateQueryDesc(plan, queryString,
+	queryDesc = CreateQueryDesc(plan, NULL, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, NULL, NULL, 0);
 
diff --git a/src/backend/commands/portalcmds.c b/src/backend/commands/portalcmds.c
index 9902c5c566..85e73ddded 100644
--- a/src/backend/commands/portalcmds.c
+++ b/src/backend/commands/portalcmds.c
@@ -107,6 +107,7 @@ PerformCursorOpen(ParseState *pstate, DeclareCursorStmt *cstmt, ParamListInfo pa
 					  queryString,
 					  CMDTAG_SELECT,	/* cursor's query is always a SELECT */
 					  list_make1(plan),
+					  list_make1(NULL),	/* no ExecLockRelsInfo to pass */
 					  NULL);
 
 	/*----------
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 80738547ed..bbbf8bbcbd 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -155,6 +155,7 @@ ExecuteQuery(ParseState *pstate,
 	PreparedStatement *entry;
 	CachedPlan *cplan;
 	List	   *plan_list;
+	List	   *plan_execlockrelsinfo_list;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
 	Portal		portal;
@@ -195,6 +196,7 @@ ExecuteQuery(ParseState *pstate,
 	/* Replan if needed, and increment plan refcount for portal */
 	cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL);
 	plan_list = cplan->stmt_list;
+	plan_execlockrelsinfo_list = cplan->execlockrelsinfo_list;
 
 	/*
 	 * DO NOT add any logic that could possibly throw an error between
@@ -204,7 +206,7 @@ ExecuteQuery(ParseState *pstate,
 					  NULL,
 					  query_string,
 					  entry->plansource->commandTag,
-					  plan_list,
+					  plan_list, plan_execlockrelsinfo_list,
 					  cplan);
 
 	/*
@@ -576,7 +578,9 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	const char *query_string;
 	CachedPlan *cplan;
 	List	   *plan_list;
-	ListCell   *p;
+	List	   *plan_execlockrelsinfo_list;
+	ListCell   *p,
+			   *pe;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
 	instr_time	planstart;
@@ -632,15 +636,18 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	}
 
 	plan_list = cplan->stmt_list;
+	plan_execlockrelsinfo_list = cplan->execlockrelsinfo_list;
 
 	/* Explain each query */
-	foreach(p, plan_list)
+	forboth(p, plan_list, pe, plan_execlockrelsinfo_list)
 	{
 		PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
+		ExecLockRelsInfo *execlockrelsinfo = lfirst_node(ExecLockRelsInfo, pe);
 
 		if (pstmt->commandType != CMD_UTILITY)
-			ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv,
-						   &planduration, (es->buffers ? &bufusage : NULL));
+			ExplainOnePlan(pstmt, execlockrelsinfo, into, es, query_string,
+						   paramLI, queryEnv, &planduration,
+						   (es->buffers ? &bufusage : NULL));
 		else
 			ExplainOneUtility(pstmt->utilityStmt, into, es, query_string,
 							  paramLI, queryEnv);
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 0b5183fc4a..b45ca508a8 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -65,6 +65,27 @@ found there.  This currently only occurs for Append and MergeAppend nodes.  In
 this case the non-required subplans are ignored and the executor state's
 subnode array will become out of sequence to the plan's subplan list.
 
+Actually, the so-called execution time pruning may also occur even before the
+execution has started.  One case where that occurs is when a cached generic
+plan is being validated for execution by plancache.c: GetCachedPlan(), which
+proceeds by locking all the relations that will be scanned by that plan.  If
+the generic plan has nodes that contain so-called initial pruning steps (a
+subset of execution pruning steps that do not depend on full-fledged execution
+having started), they are performed at this point to figure out the minimal
+set of child subplans that satisfy those pruning instructions and the result
+of performing that pruning is saved in a data structure that gets passed to
+the executor alongside the plan tree.  Relations scanned by only those
+surviving subplans are then locked while those scanned by the pruned subplans
+are not, even though the pruned subplans themselves are not removed from the
+plan tree.  So, it is imperative that the executor and any third-party code
+invoked by it that gets passed the plan tree look at the initial pruning result
+made available via the aforementioned data structure to determine whether or
+not a particular subplan is valid.  (The data structure basically consists of
+an array of PlanInitPruningOutput nodes containing one element for each node
+of the plan tree indexable using plan_node_id of the individual plan nodes,
+where each node contains a bitmapset of indexes of unpruned child subplans of
+a given node.)
+
 Each Plan node may have expression trees associated with it, to represent
 its target list, qualification conditions, etc.  These trees are also
 read-only to the executor, but the executor state for expression evaluation
@@ -286,6 +307,9 @@ Query Processing Control Flow
 
 This is a sketch of control flow for full query processing:
 
+  [ ExecutorGetLockRels ] --- an optional step to walk over the plan tree
+		to produce an ExecLockRelsInfo to be passed to CreateQueryDesc
+
 	CreateQueryDesc
 
 	ExecutorStart
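
Reviewer's sketch (not part of the patch): the README text above describes a
per-plan-node array of pruning results, indexable by plan_node_id, that
consumers of the plan tree must consult before treating a child subplan as
valid.  The standalone model below illustrates only that lookup.  Every name
in it (PruningOutputModel, LockRelsInfoModel, fetch_pruning_output,
subplan_is_valid) is hypothetical, a 64-bit mask stands in for a Bitmapset,
and the "1-based index, 0 means no entry" encoding of ipo_indexes is an
assumption made for illustration, not something the patch confirms.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for PlanInitPruningOutput: bit i set means child
 * subplan i survived initial pruning. */
typedef struct PruningOutputModel
{
	uint64_t	initially_valid_subplans;
} PruningOutputModel;

/* Hypothetical stand-in for ExecLockRelsInfo. */
typedef struct LockRelsInfoModel
{
	int			num_plan_nodes;
	const PruningOutputModel *outputs;	/* one entry per pruned plan node */
	const int  *ipo_indexes;	/* per plan_node_id: 1-based index into
								 * outputs, 0 = no entry (assumed encoding) */
} LockRelsInfoModel;

/* Return the pruning result for a plan node, or NULL if it has none. */
static const PruningOutputModel *
fetch_pruning_output(const LockRelsInfoModel *info, int plan_node_id)
{
	int			idx;

	if (plan_node_id < 0 || plan_node_id >= info->num_plan_nodes)
		return NULL;
	idx = info->ipo_indexes[plan_node_id];
	return (idx > 0) ? &info->outputs[idx - 1] : NULL;
}

/* Is child 'subplan_index' of the given plan node safe to look at? */
static int
subplan_is_valid(const LockRelsInfoModel *info, int plan_node_id,
				 int subplan_index)
{
	const PruningOutputModel *out = fetch_pruning_output(info, plan_node_id);

	if (out == NULL)
		return 1;				/* no initial pruning: every child is valid */
	if (subplan_index < 0 || subplan_index >= 64)
		return 0;				/* outside what this toy mask can represent */
	return (int) ((out->initially_valid_subplans >> subplan_index) & 1u);
}
```

In this model, code walking the plan tree would call subplan_is_valid() with
the parent's plan_node_id before descending into a child, which is the
discipline the README paragraph asks of the executor and extension code.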
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index ef2fd46092..56946c12dd 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -49,11 +49,15 @@
 #include "commands/matview.h"
 #include "commands/trigger.h"
 #include "executor/execdebug.h"
+#include "executor/nodeAppend.h"
+#include "executor/nodeMergeAppend.h"
+#include "executor/nodeModifyTable.h"
 #include "executor/nodeSubplan.h"
 #include "foreign/fdwapi.h"
 #include "jit/jit.h"
 #include "mb/pg_wchar.h"
 #include "miscadmin.h"
+#include "nodes/nodeFuncs.h"
 #include "parser/parsetree.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
@@ -101,9 +105,205 @@ static char *ExecBuildSlotValueDescription(Oid reloid,
 										   Bitmapset *modifiedCols,
 										   int maxfieldlen);
 static void EvalPlanQualStart(EPQState *epqstate, Plan *planTree);
+static bool ExecGetScanLockRels(Scan *scan, ExecGetLockRelsContext *context);
 
 /* end of local decls */
 
+/* ----------------------------------------------------------------
+ *		ExecutorGetLockRels
+ *
+ *		Figure out the minimal set of relations to lock to be able to safely
+ *		execute a given plan
+ *
+ * This ignores the relations scanned by child subplans that are pruned away
+ * after performing initial pruning steps present in the plan using the
+ * provided set of EXTERN parameters.
+ *
+ * Along with the set of RT indexes of relations that must be locked, the
+ * returned struct also contains an array of PlanInitPruningOutput nodes each
+ * of which contains the result of initial pruning for a given plan node, which
+ * is basically a bitmapset of the indexes of surviving child subplans.  Each
+ * plan node in the tree that undergoes pruning will have an element in the
+ * array.
+ *
+ * Note that while relations scanned by the subplans that are pruned will not
+ * be locked, the subplans themselves are left as-is in the plan tree, assuming
+ * anything that reads the plan tree during execution knows to ignore them by
+ * looking at the PlanInitPruningOutput's list of valid subplans.
+ *
+ * Partitioned tables mentioned in PartitionedRelPruneInfo nodes that drive
+ * the pruning will be locked before doing the pruning and also added to the
+ * returned set.
+ */
+ExecLockRelsInfo *
+ExecutorGetLockRels(PlannedStmt *plannedstmt, ParamListInfo params)
+{
+	int		numPlanNodes = plannedstmt->numPlanNodes;
+	ExecGetLockRelsContext context;
+	ExecLockRelsInfo *result;
+	ListCell *lc;
+
+	/* Only get here if there is any pruning to do. */
+	Assert(plannedstmt->containsInitialPruning);
+
+	context.stmt = plannedstmt;
+	context.params = params;
+
+	/*
+	 * Go walk all the plan tree(s) present in the PlannedStmt, filling
+	 * context.lockrels with only the relations from plan nodes that
+	 * survive initial pruning and also the tables mentioned in
+	 * partitioned_rels sets found in the plan.
+	 */
+	context.lockrels = NULL;
+	context.initPruningOutputs = NIL;
+	context.ipoIndexes = palloc0(sizeof(int) * numPlanNodes);
+
+	/* All the subplans. */
+	foreach(lc, plannedstmt->subplans)
+	{
+		Plan *subplan = lfirst(lc);
+
+		(void) ExecGetLockRels(subplan, &context);
+	}
+
+	/* And the main tree. */
+	(void) ExecGetLockRels(plannedstmt->planTree, &context);
+
+	/*
+	 * Also be sure to lock partitioned relations from any [Merge]Append nodes
+	 * that were originally present but were ultimately left out from the plan
+	 * due to being deemed no-op nodes.
+	 */
+	context.lockrels = bms_add_members(context.lockrels,
+									   plannedstmt->elidedAppendPartedRels);
+
+	result = makeNode(ExecLockRelsInfo);
+	result->lockrels = context.lockrels;
+	result->numPlanNodes = numPlanNodes;
+	result->initPruningOutputs = context.initPruningOutputs;
+	result->ipoIndexes = context.ipoIndexes;
+
+	return result;
+}
+
+/* ------------------------------------------------------------------------
+ * ExecGetLockRels
+ *		Adds all the relations that will be scanned by 'node' and its child
+ *		plans to context->lockrels after taking into account the effect
+ *		of performing initial pruning if any
+ *
+ * context->stmt gives the PlannedStmt being inspected to access the plan's
+ * range table if needed and context->params the set of EXTERN parameters
+ * available to evaluate pruning parameters.
+ *
+ * If initial pruning is done, a PlanInitPruningOutput node containing the
+ * result of pruning will be stored in context->initPruningOutputs that will
+ * be made available to the executor to reuse.
+ * ------------------------------------------------------------------------
+ */
+bool
+ExecGetLockRels(Plan *node, ExecGetLockRelsContext *context)
+{
+	/* Do nothing when we get past a leaf of the tree. */
+	if (node == NULL)
+		return true;
+
+	/* Make sure there's enough stack available. */
+	check_stack_depth();
+
+	switch (nodeTag(node))
+	{
+		/* Currently, only these two nodes have prunable child subplans. */
+		case T_Append:
+			if (ExecGetAppendLockRels((Append *) node, context))
+				return true;
+			break;
+		case T_MergeAppend:
+			if (ExecGetMergeAppendLockRels((MergeAppend *) node,
+												context))
+				return true;
+			break;
+
+		/*
+		 * And these scan relations that must be added to context->lockrels.
+		 */
+		case T_SeqScan:
+		case T_SampleScan:
+		case T_IndexScan:
+		case T_IndexOnlyScan:
+		case T_BitmapIndexScan:
+		case T_BitmapHeapScan:
+		case T_TidScan:
+		case T_TidRangeScan:
+		case T_ForeignScan:
+		case T_SubqueryScan:
+		case T_CustomScan:
+			if (ExecGetScanLockRels((Scan *) node, context))
+				return true;
+			break;
+		case T_ModifyTable:
+			if (ExecGetModifyTableLockRels((ModifyTable *) node, context))
+				return true;
+			/* plan_tree_walker() will visit the subplan (outerNode) */
+			break;
+
+		default:
+			break;
+	}
+
+	/* Recurse to subnodes. */
+	return plan_tree_walker(node, ExecGetLockRels, (void *) context);
+}
+
+/*
+ * ExecGetScanLockRels
+ * 		Do ExecGetLockRels()'s work for a leaf Scan node
+ */
+static bool
+ExecGetScanLockRels(Scan *scan, ExecGetLockRelsContext *context)
+{
+	switch (nodeTag(scan))
+	{
+		case T_ForeignScan:
+			{
+				ForeignScan *fscan = (ForeignScan *) scan;
+
+				context->lockrels = bms_add_members(context->lockrels,
+													fscan->fs_relids);
+			}
+			break;
+
+		case T_SubqueryScan:
+			{
+				SubqueryScan *sscan = (SubqueryScan *) scan;
+
+				(void) ExecGetLockRels((Plan *) sscan->subplan, context);
+			}
+			break;
+
+		case T_CustomScan:
+			{
+				CustomScan *cscan = (CustomScan *) scan;
+				ListCell *lc;
+
+				context->lockrels = bms_add_members(context->lockrels,
+													cscan->custom_relids);
+				foreach(lc, cscan->custom_plans)
+				{
+					(void) ExecGetLockRels((Plan *) lfirst(lc), context);
+				}
+			}
+			break;
+
+		default:
+			context->lockrels = bms_add_member(context->lockrels,
+											   scan->scanrelid);
+			break;
+	}
+
+	return true;
+}
 
 /* ----------------------------------------------------------------
  *		ExecutorStart
@@ -806,6 +1006,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 {
 	CmdType		operation = queryDesc->operation;
 	PlannedStmt *plannedstmt = queryDesc->plannedstmt;
+	ExecLockRelsInfo *execlockrelsinfo = queryDesc->execlockrelsinfo;
 	Plan	   *plan = plannedstmt->planTree;
 	List	   *rangeTable = plannedstmt->rtable;
 	EState	   *estate = queryDesc->estate;
@@ -825,6 +1026,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 	ExecInitRangeTable(estate, rangeTable);
 
 	estate->es_plannedstmt = plannedstmt;
+	estate->es_execlockrelsinfo = execlockrelsinfo;
 
 	/*
 	 * Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index 9a0d5d59ef..fb6dbd298a 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -66,6 +66,7 @@
 #define PARALLEL_KEY_QUERY_TEXT		UINT64CONST(0xE000000000000008)
 #define PARALLEL_KEY_JIT_INSTRUMENTATION UINT64CONST(0xE000000000000009)
 #define PARALLEL_KEY_WAL_USAGE			UINT64CONST(0xE00000000000000A)
+#define PARALLEL_KEY_EXECLOCKRELSINFO	UINT64CONST(0xE00000000000000B)
 
 #define PARALLEL_TUPLE_QUEUE_SIZE		65536
 
@@ -182,6 +183,7 @@ ExecSerializePlan(Plan *plan, EState *estate)
 	pstmt->transientPlan = false;
 	pstmt->dependsOnRole = false;
 	pstmt->parallelModeNeeded = false;
+	pstmt->containsInitialPruning = false;
 	pstmt->planTree = plan;
 	pstmt->rtable = estate->es_range_table;
 	pstmt->resultRelations = NIL;
@@ -596,12 +598,15 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	FixedParallelExecutorState *fpes;
 	char	   *pstmt_data;
 	char	   *pstmt_space;
+	char	   *execlockrelsinfo_data;
+	char	   *execlockrelsinfo_space;
 	char	   *paramlistinfo_space;
 	BufferUsage *bufusage_space;
 	WalUsage   *walusage_space;
 	SharedExecutorInstrumentation *instrumentation = NULL;
 	SharedJitInstrumentation *jit_instrumentation = NULL;
 	int			pstmt_len;
+	int			execlockrelsinfo_len;
 	int			paramlistinfo_len;
 	int			instrumentation_len = 0;
 	int			jit_instrumentation_len = 0;
@@ -630,6 +635,7 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 
 	/* Fix up and serialize plan to be sent to workers. */
 	pstmt_data = ExecSerializePlan(planstate->plan, estate);
+	execlockrelsinfo_data = nodeToString(estate->es_execlockrelsinfo);
 
 	/* Create a parallel context. */
 	pcxt = CreateParallelContext("postgres", "ParallelQueryMain", nworkers);
@@ -656,6 +662,11 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	shm_toc_estimate_chunk(&pcxt->estimator, pstmt_len);
 	shm_toc_estimate_keys(&pcxt->estimator, 1);
 
+	/* Estimate space for serialized ExecLockRelsInfo. */
+	execlockrelsinfo_len = strlen(execlockrelsinfo_data) + 1;
+	shm_toc_estimate_chunk(&pcxt->estimator, execlockrelsinfo_len);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
 	/* Estimate space for serialized ParamListInfo. */
 	paramlistinfo_len = EstimateParamListSpace(estate->es_param_list_info);
 	shm_toc_estimate_chunk(&pcxt->estimator, paramlistinfo_len);
@@ -750,6 +761,12 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	memcpy(pstmt_space, pstmt_data, pstmt_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PLANNEDSTMT, pstmt_space);
 
+	/* Store serialized ExecLockRelsInfo */
+	execlockrelsinfo_space = shm_toc_allocate(pcxt->toc, execlockrelsinfo_len);
+	memcpy(execlockrelsinfo_space, execlockrelsinfo_data, execlockrelsinfo_len);
+	shm_toc_insert(pcxt->toc, PARALLEL_KEY_EXECLOCKRELSINFO,
+				   execlockrelsinfo_space);
+
 	/* Store serialized ParamListInfo. */
 	paramlistinfo_space = shm_toc_allocate(pcxt->toc, paramlistinfo_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARAMLISTINFO, paramlistinfo_space);
@@ -1231,8 +1248,10 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 						 int instrument_options)
 {
 	char	   *pstmtspace;
+	char	   *execlockrelsinfospace;
 	char	   *paramspace;
 	PlannedStmt *pstmt;
+	ExecLockRelsInfo *execlockrelsinfo;
 	ParamListInfo paramLI;
 	char	   *queryString;
 
@@ -1243,12 +1262,17 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 	pstmtspace = shm_toc_lookup(toc, PARALLEL_KEY_PLANNEDSTMT, false);
 	pstmt = (PlannedStmt *) stringToNode(pstmtspace);
 
+	/* Reconstruct leader-supplied ExecLockRelsInfo. */
+	execlockrelsinfospace = shm_toc_lookup(toc, PARALLEL_KEY_EXECLOCKRELSINFO,
+										  false);
+	execlockrelsinfo = (ExecLockRelsInfo *) stringToNode(execlockrelsinfospace);
+
 	/* Reconstruct ParamListInfo. */
 	paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
 	paramLI = RestoreParamList(&paramspace);
 
 	/* Create a QueryDesc for the query. */
-	return CreateQueryDesc(pstmt,
+	return CreateQueryDesc(pstmt, execlockrelsinfo,
 						   queryString,
 						   GetActiveSnapshot(), InvalidSnapshot,
 						   receiver, paramLI, NULL, instrument_options);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 84b4e4b3d6..e79ada16f0 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -25,6 +25,7 @@
 #include "mb/pg_wchar.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
+#include "parser/parsetree.h"
 #include "partitioning/partbounds.h"
 #include "partitioning/partdesc.h"
 #include "partitioning/partprune.h"
@@ -185,8 +186,13 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
 static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
 static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
 static PartitionPruneState *ExecCreatePartitionPruneState(PlanState *planstate,
-							  PartitionPruneInfo *partitionpruneinfo);
-static Bitmapset *ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate);
+							  PartitionPruneInfo *partitionpruneinfo,
+							  bool consider_initial_steps,
+							  bool consider_exec_steps,
+							  List *rtable, ExprContext *econtext,
+							  PartitionDirectory partdir);
+static Bitmapset *ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate,
+								PartitionPruneInfo *pruneinfo);
 static void ExecInitPruningContext(PartitionPruneContext *context,
 								   List *pruning_steps,
 								   PartitionDesc partdesc,
@@ -1588,8 +1594,9 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  * considered to be a stable expression, it can change value from one plan
  * node scan to the next during query execution.  Stable comparison
  * expressions that don't involve such Params allow partition pruning to be
- * done once during executor startup.  Expressions that do involve such Params
- * require us to prune separately for each scan of the parent plan node.
+ * done once during executor startup or even earlier in ExecutorGetLockRels().
+ * Expressions that do involve such Params require us to prune separately for
+ * each scan of the parent plan node.
  *
  * Note that pruning away unneeded subplans during executor startup has the
  * added benefit of not having to initialize the unneeded subplans at all.
@@ -1601,10 +1608,17 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *		Creates the PartitionPruneState required by each of the two pruning
  *		functions.  Details stored include how to map the partition index
  *		returned by the partition pruning code into subplan indexes.  Also
- *		determines the set of initially valid subplans by performing initial
- *		pruning steps, only which need be initialized by the caller such as
- *		ExecInitAppend.  Maps in PartitionPruneState are updated to account
- *		for initial pruning having eliminated some of the subplans, if any.
+ *		determines the set of initially valid subplans by either looking that
+ *		up in the plan node's PlanInitPruningOutput, if one is found in
+ *		EState.es_execlockrelsinfo, or by performing initial pruning steps.
+ *		Only the subplans included in that need be initialized by the caller
+ *		such as ExecInitAppend.  Maps in PartitionPruneState are updated to
+ *		account for initial pruning having eliminated some of the subplans,
+ *		if any.
+ *
+ * ExecGetLockRelsDoInitialPruning:
+ *		Do initial pruning as part of ExecGetLockRels() on the parent plan
+ *		node
  *
  * ExecFindMatchingSubPlans:
  *		Returns indexes of matching subplans after evaluating all available
@@ -1619,9 +1633,10 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  * ExecInitPartitionPruning
  * 		Initialize data structure needed for run-time partition pruning
  *
- * Initial pruning can be done immediately, so it is done here if needed and
- * the set of surviving partition subplans' indexes are added to the output
- * parameter *initially_valid_subplans.
+ * Initial pruning can be done immediately, so it is done here unless it has
+ * already been done by ExecGetLockRelsDoInitialPruning(), and the set of
+ * surviving partition subplans' indexes are added to the output parameter
+ * *initially_valid_subplans.
  *
  * If subplans are indeed pruned, subplan_map arrays contained in the returned
  * PartitionPruneState are re-sequenced to not count those, though only if the
@@ -1635,22 +1650,57 @@ ExecInitPartitionPruning(PlanState *planstate,
 {
 	PartitionPruneState *prunestate;
 	EState *estate = planstate->state;
+	Plan   *plan = planstate->plan;
+	PlanInitPruningOutput *initPruningOutput = NULL;
+	bool	do_pruning = (pruneinfo->needs_init_pruning ||
+						  pruneinfo->needs_exec_pruning);
 
-	/* We may need an expression context to evaluate partition exprs */
-	ExecAssignExprContext(estate, planstate);
+	/* Retrieve the parent plan's PlanInitPruningOutput, if any. */
+	if (estate->es_execlockrelsinfo)
+	{
+		initPruningOutput = (PlanInitPruningOutput *)
+			ExecFetchPlanInitPruningOutput(estate->es_execlockrelsinfo, plan);
 
-	/*
-	 * Create the working data structure for pruning.
-	 */
-	prunestate = ExecCreatePartitionPruneState(planstate, pruneinfo);
+		Assert(initPruningOutput != NULL &&
+			   IsA(initPruningOutput, PlanInitPruningOutput));
+		/* No need to do initial pruning again, only exec pruning. */
+		do_pruning = pruneinfo->needs_exec_pruning;
+	}
+
+	prunestate = NULL;
+	if (do_pruning)
+	{
+		/* We may need an expression context to evaluate partition exprs */
+		ExecAssignExprContext(estate, planstate);
+
+		/* For data reading, executor always omits detached partitions */
+		if (estate->es_partition_directory == NULL)
+			estate->es_partition_directory =
+				CreatePartitionDirectory(estate->es_query_cxt, false);
+
+		/*
+		 * Create the working data structure for pruning.  No need to consider
+		 * initial pruning steps if we have a PlanInitPruningOutput.
+		 */
+		prunestate = ExecCreatePartitionPruneState(planstate, pruneinfo,
+												   initPruningOutput == NULL, true,
+												   NIL, planstate->ps_ExprContext,
+												   estate->es_partition_directory);
+	}
 
 	/*
 	 * Perform an initial partition prune, if required.
 	 */
-	if (prunestate->do_initial_prune)
+	if (initPruningOutput)
+	{
+		/* ExecGetLockRelsDoInitialPruning() already did it for us! */
+		*initially_valid_subplans = initPruningOutput->initially_valid_subplans;
+	}
+	else if (prunestate && prunestate->do_initial_prune)
 	{
 		/* Determine which subplans survive initial pruning */
-		*initially_valid_subplans = ExecFindInitialMatchingSubPlans(prunestate);
+		*initially_valid_subplans = ExecFindInitialMatchingSubPlans(prunestate,
+																	pruneinfo);
 	}
 	else
 	{
@@ -1668,7 +1718,7 @@ ExecInitPartitionPruning(PlanState *planstate,
 	 * invalid data in prunestate, because that data won't be consulted again
 	 * (cf initial Assert in ExecFindMatchingSubPlans).
 	 */
-	if (prunestate->do_exec_prune &&
+	if (prunestate && prunestate->do_exec_prune &&
 		bms_num_members(*initially_valid_subplans) < n_total_subplans)
 		PartitionPruneStateFixSubPlanMap(prunestate,
 										 *initially_valid_subplans,
@@ -1677,12 +1727,75 @@ ExecInitPartitionPruning(PlanState *planstate,
 	return prunestate;
 }
 
+/*
+ * ExecGetLockRelsDoInitialPruning
+ *		Perform initial pruning as part of doing ExecGetLockRels() on the parent
+ *		plan node
+ */
+Bitmapset *
+ExecGetLockRelsDoInitialPruning(Plan *plan, ExecGetLockRelsContext *context,
+								PartitionPruneInfo *pruneinfo)
+{
+	List		 *rtable = context->stmt->rtable;
+	ParamListInfo params = context->params;
+	ExprContext	 *econtext;
+	PartitionDirectory pdir;
+	MemoryContext oldcontext,
+				  tmpcontext;
+	PartitionPruneState *prunestate;
+	PlanInitPruningOutput *initPruningOutput;
+
+	/*
+	 * A temporary context to allocate stuff needed to run the pruning steps.
+	 */
+	tmpcontext = AllocSetContextCreate(CurrentMemoryContext,
+									   "initial pruning working data",
+									   ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(tmpcontext);
+
+	/*
+	 * PartitionDirectory to look up partition descriptors, which omits
+	 * detached partitions, just like in the executor proper.
+	 */
+	pdir = CreatePartitionDirectory(CurrentMemoryContext, false);
+
+	/*
+	 * We don't yet have a PlanState for the parent plan node, so must create
+	 * a standalone ExprContext to evaluate pruning expressions, equipped with
+	 * the information about the EXTERN parameters that the caller passed us.
+	 * Note that that's okay because the initial pruning steps do not contain
+	 * anything that requires the execution to have started.
+	 */
+	econtext = CreateStandaloneExprContext();
+	econtext->ecxt_param_list_info = params;
+	prunestate = ExecCreatePartitionPruneState(NULL, pruneinfo,
+											   true, false,
+											   rtable, econtext,
+											   pdir);
+	MemoryContextSwitchTo(oldcontext);
+
+	/* Do the pruning and populate a PlanInitPruningOutput for this node. */
+	initPruningOutput = makeNode(PlanInitPruningOutput);
+	initPruningOutput->initially_valid_subplans =
+		ExecFindInitialMatchingSubPlans(prunestate, pruneinfo);
+	ExecStorePlanInitPruningOutput(context, initPruningOutput, plan);
+
+	FreeExprContext(econtext, true);
+	DestroyPartitionDirectory(pdir);
+	MemoryContextDelete(tmpcontext);
+
+	return initPruningOutput->initially_valid_subplans;
+}
+
 /*
  * ExecCreatePartitionPruneState
  *		Build the data structure required for calling
  *		ExecFindInitialMatchingSubPlans and ExecFindMatchingSubPlans.
  *
- * 'planstate' is the parent plan node's execution state.
+ * 'planstate', if not NULL, is the parent plan node's execution state.  It
+ * can be NULL if being called before ExecutorStart(), in which case,
+ * 'rtable' (range table), 'econtext', and 'partdir' must be explicitly
+ * provided.
  *
  * 'partitionpruneinfo' is a PartitionPruneInfo as generated by
  * make_partition_pruneinfo.  Here we build a PartitionPruneState containing a
@@ -1697,19 +1810,20 @@ ExecInitPartitionPruning(PlanState *planstate,
  */
 static PartitionPruneState *
 ExecCreatePartitionPruneState(PlanState *planstate,
-							  PartitionPruneInfo *partitionpruneinfo)
+							  PartitionPruneInfo *partitionpruneinfo,
+							  bool consider_initial_steps,
+							  bool consider_exec_steps,
+							  List *rtable, ExprContext *econtext,
+							  PartitionDirectory partdir)
 {
-	EState	   *estate = planstate->state;
+	EState	   *estate = planstate ? planstate->state : NULL;
 	PartitionPruneState *prunestate;
 	int			n_part_hierarchies;
 	ListCell   *lc;
 	int			i;
-	ExprContext	*econtext = planstate->ps_ExprContext;
 
-	/* For data reading, executor always omits detached partitions */
-	if (estate->es_partition_directory == NULL)
-		estate->es_partition_directory =
-			CreatePartitionDirectory(estate->es_query_cxt, false);
+	Assert((estate != NULL) ||
+			(partdir != NULL && econtext != NULL && rtable != NIL));
 
 	n_part_hierarchies = list_length(partitionpruneinfo->prune_infos);
 	Assert(n_part_hierarchies > 0);
@@ -1760,19 +1874,48 @@ ExecCreatePartitionPruneState(PlanState *planstate,
 			PartitionedRelPruneInfo *pinfo = lfirst_node(PartitionedRelPruneInfo, lc2);
 			PartitionedRelPruningData *pprune = &prunedata->partrelprunedata[j];
 			Relation	partrel;
+			bool		close_partrel = false;
 			PartitionDesc partdesc;
 			PartitionKey partkey;
 
 			/*
-			 * We can rely on the copies of the partitioned table's partition
-			 * key and partition descriptor appearing in its relcache entry,
-			 * because that entry will be held open and locked for the
-			 * duration of this executor run.
+			 * Must open the relation by ourselves when called before the
+			 * execution has started, such as when called during
+			 * ExecutorGetLockRels() on a cached plan.  In that case,
+			 * sub-partitions must be locked, because AcquirePlannerLocks()
+			 * would not have seen them. (1st relation in a partrelpruneinfos
+			 * list is always the root partitioned table appearing in the
+			 * query, which AcquirePlannerLocks() would have locked; the
+			 * Assert in relation_open() guards that assumption.)
+			 */
+			if (estate == NULL)
+			{
+				RangeTblEntry *rte = rt_fetch(pinfo->rtindex, rtable);
+				int		lockmode = (j == 0) ? NoLock : rte->rellockmode;
+
+				partrel = table_open(rte->relid, lockmode);
+				close_partrel = true;
+			}
+			else
+				partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
+
+			/*
+			 * We can rely on the copy of the partitioned table's partition
+			 * key in its relcache entry, because it can't change (or
+			 * get destroyed) as long as the relation is locked.  Partition
+			 * descriptor is taken from the PartitionDirectory associated with
+			 * the table that is held open long enough for the descriptor to
+			 * remain valid while it's used to perform the pruning steps.
 			 */
-			partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
 			partkey = RelationGetPartitionKey(partrel);
-			partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
-												partrel);
+			partdesc = PartitionDirectoryLookup(partdir, partrel);
+
+			/*
+			 * Must close partrel, keeping the lock taken, if we're not using
+			 * EState's entry.
+			 */
+			if (close_partrel)
+				table_close(partrel, NoLock);
 
 			/*
 			 * Initialize the subplan_map and subpart_map.
@@ -1874,7 +2017,7 @@ ExecCreatePartitionPruneState(PlanState *planstate,
 			 * Initialize pruning contexts as needed.
 			 */
 			pprune->initial_pruning_steps = pinfo->initial_pruning_steps;
-			if (pinfo->initial_pruning_steps)
+			if (consider_initial_steps && pinfo->initial_pruning_steps)
 			{
 				ExecInitPruningContext(&pprune->initial_context,
 									   pinfo->initial_pruning_steps,
@@ -1884,7 +2027,7 @@ ExecCreatePartitionPruneState(PlanState *planstate,
 				prunestate->do_initial_prune = true;
 			}
 			pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
-			if (pinfo->exec_pruning_steps)
+			if (consider_exec_steps && pinfo->exec_pruning_steps)
 			{
 				ExecInitPruningContext(&pprune->exec_context,
 									   pinfo->exec_pruning_steps,
@@ -1998,7 +2141,8 @@ ExecInitPruningContext(PartitionPruneContext *context,
  * is required.
  */
 static Bitmapset *
-ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate)
+ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate,
+								PartitionPruneInfo *pruneinfo)
 {
 	Bitmapset  *result = NULL;
 	MemoryContext oldcontext;
@@ -2008,8 +2152,8 @@ ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate)
 	Assert(prunestate->do_initial_prune);
 
 	/*
-	 * Switch to a temp context to avoid leaking memory in the executor's
-	 * query-lifespan memory context.
+	 * Switch to a temp context to avoid leaking memory in the longer-term
+	 * memory context.
 	 */
 	oldcontext = MemoryContextSwitchTo(prunestate->prune_context);
 
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 9df1f81ea8..7246f9175f 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -119,6 +119,7 @@ CreateExecutorState(void)
 	estate->es_relations = NULL;
 	estate->es_rowmarks = NULL;
 	estate->es_plannedstmt = NULL;
+	estate->es_execlockrelsinfo = NULL;
 
 	estate->es_junkFilter = NULL;
 
@@ -785,6 +786,13 @@ ExecGetRangeTableRelation(EState *estate, Index rti)
 
 	Assert(rti > 0 && rti <= estate->es_range_table_size);
 
+	/*
+	 * A cross-check that AcquireExecutorLocks() hasn't skipped locking a
+	 * relation that the executor goes on to open.
+	 */
+	Assert(estate->es_execlockrelsinfo == NULL ||
+		   bms_is_member(rti, estate->es_execlockrelsinfo->lockrels));
+
 	rel = estate->es_relations[rti - 1];
 	if (rel == NULL)
 	{
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index f9460ae506..a2182a6b1f 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -844,7 +844,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
 	else
 		dest = None_Receiver;
 
-	es->qd = CreateQueryDesc(es->stmt,
+	es->qd = CreateQueryDesc(es->stmt, NULL,
 							 fcache->src,
 							 GetActiveSnapshot(),
 							 InvalidSnapshot,
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 5b6d3eb23b..9c6f907687 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -94,6 +94,55 @@ static bool ExecAppendAsyncRequest(AppendState *node, TupleTableSlot **result);
 static void ExecAppendAsyncEventWait(AppendState *node);
 static void classify_matching_subplans(AppendState *node);
 
+/* ----------------------------------------------------------------
+ *		ExecGetAppendLockRels
+ *			Do ExecGetLockRels()'s work for an Append plan
+ * ----------------------------------------------------------------
+ */
+bool
+ExecGetAppendLockRels(Append *node, ExecGetLockRelsContext *context)
+{
+	PartitionPruneInfo *pruneinfo = node->part_prune_info;
+
+	/*
+	 * Must always lock all the partitioned tables whose direct and indirect
+	 * partitions will be scanned by this Append.
+	 */
+	context->lockrels = bms_add_members(context->lockrels,
+										node->partitioned_rels);
+
+	/*
+	 * Now recurse to subplans to add relations scanned therein.
+	 *
+	 * If initial pruning can be done, do that now and only recurse to the
+	 * surviving subplans.
+	 */
+	if (pruneinfo && pruneinfo->needs_init_pruning)
+	{
+		List	   *subplans = node->appendplans;
+		Bitmapset  *validsubplans;
+		int			i;
+
+		validsubplans = ExecGetLockRelsDoInitialPruning((Plan *) node,
+														context, pruneinfo);
+
+		/* Recurse to surviving subplans. */
+		i = -1;
+		while ((i = bms_next_member(validsubplans, i)) >= 0)
+		{
+			Plan   *subplan = list_nth(subplans, i);
+
+			(void) ExecGetLockRels(subplan, context);
+		}
+
+		/* done with this node */
+		return true;
+	}
+
+	/* Tell the caller to recurse to *all* the subplans. */
+	return false;
+}
+
 /* ----------------------------------------------------------------
  *		ExecInitAppend
  *
@@ -155,7 +204,8 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 		 * subplan, we can fill as_valid_subplans immediately, preventing
 		 * later calls to ExecFindMatchingSubPlans.
 		 */
-		if (!prunestate->do_exec_prune && nplans > 0)
+		if (appendstate->as_prune_state == NULL ||
+			(!appendstate->as_prune_state->do_exec_prune && nplans > 0))
 			appendstate->as_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
 	}
 	else
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index 9a9f29e845..4b04fcdbc2 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -54,6 +54,55 @@ typedef int32 SlotNumber;
 static TupleTableSlot *ExecMergeAppend(PlanState *pstate);
 static int	heap_compare_slots(Datum a, Datum b, void *arg);
 
+/* ----------------------------------------------------------------
+ *		ExecGetMergeAppendLockRels
+ *			Do ExecGetLockRels()'s work for a MergeAppend plan
+ * ----------------------------------------------------------------
+ */
+bool
+ExecGetMergeAppendLockRels(MergeAppend *node, ExecGetLockRelsContext *context)
+{
+	PartitionPruneInfo *pruneinfo = node->part_prune_info;
+
+	/*
+	 * Must always lock all the partitioned tables whose direct and indirect
+	 * partitions will be scanned by this MergeAppend.
+	 */
+	context->lockrels = bms_add_members(context->lockrels,
+										node->partitioned_rels);
+
+	/*
+	 * Now recurse to subplans to add relations scanned therein.
+	 *
+	 * If initial pruning can be done, do that now and only recurse to the
+	 * surviving subplans.
+	 */
+	if (pruneinfo && pruneinfo->needs_init_pruning)
+	{
+		List	   *subplans = node->mergeplans;
+		Bitmapset  *validsubplans;
+		int			i;
+
+		validsubplans = ExecGetLockRelsDoInitialPruning((Plan *) node,
+														context, pruneinfo);
+
+		/* Recurse to surviving subplans. */
+		i = -1;
+		while ((i = bms_next_member(validsubplans, i)) >= 0)
+		{
+			Plan   *subplan = list_nth(subplans, i);
+
+			(void) ExecGetLockRels(subplan, context);
+		}
+
+		/* done with this node */
+		return true;
+	}
+
+	/* Tell the caller to recurse to *all* the subplans. */
+	return false;
+}
+
 
 /* ----------------------------------------------------------------
  *		ExecInitMergeAppend
@@ -103,7 +152,8 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
 		 * subplan, we can fill as_valid_subplans immediately, preventing
 		 * later calls to ExecFindMatchingSubPlans.
 		 */
-		if (!prunestate->do_exec_prune && nplans > 0)
+		if (mergestate->ms_prune_state == NULL ||
+			(!mergestate->ms_prune_state->do_exec_prune && nplans > 0))
 			mergestate->ms_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
 	}
 	else
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 171575cd73..f17bede367 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -3853,6 +3853,31 @@ ExecLookupResultRelByOid(ModifyTableState *node, Oid resultoid,
 	return NULL;
 }
 
+/*
+ * ExecGetModifyTableLockRels
+ * 		Do ExecGetLockRels()'s work for a ModifyTable plan
+ */
+bool
+ExecGetModifyTableLockRels(ModifyTable *plan, ExecGetLockRelsContext *context)
+{
+	ListCell *lc;
+
+	/* First add the result relation RTIs mentioned in the node. */
+	if (plan->rootRelation > 0)
+		context->lockrels = bms_add_member(context->lockrels,
+										   plan->rootRelation);
+	context->lockrels = bms_add_member(context->lockrels,
+									   plan->nominalRelation);
+	foreach(lc, plan->resultRelations)
+	{
+		context->lockrels = bms_add_member(context->lockrels,
+										   lfirst_int(lc));
+	}
+
+	/* Tell the caller to recurse to the subplan (outerPlan(plan)). */
+	return false;
+}
+
 /* ----------------------------------------------------------------
  *		ExecInitModifyTable
  * ----------------------------------------------------------------
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 042a5f8b0a..64ebbfb31e 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1578,6 +1578,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 	CachedPlanSource *plansource;
 	CachedPlan *cplan;
 	List	   *stmt_list;
+	List	   *execlockrelsinfo_list;
 	char	   *query_string;
 	Snapshot	snapshot;
 	MemoryContext oldcontext;
@@ -1659,6 +1660,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 	/* Replan if needed, and increment plan refcount for portal */
 	cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv);
 	stmt_list = cplan->stmt_list;
+	execlockrelsinfo_list = cplan->execlockrelsinfo_list;
 
 	if (!plan->saved)
 	{
@@ -1670,6 +1672,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 		 */
 		oldcontext = MemoryContextSwitchTo(portal->portalContext);
 		stmt_list = copyObject(stmt_list);
+		execlockrelsinfo_list = copyObject(execlockrelsinfo_list);
 		MemoryContextSwitchTo(oldcontext);
 		ReleaseCachedPlan(cplan, NULL);
 		cplan = NULL;			/* portal shouldn't depend on cplan */
@@ -1683,6 +1686,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 					  query_string,
 					  plansource->commandTag,
 					  stmt_list,
+					  execlockrelsinfo_list,
 					  cplan);
 
 	/*
@@ -2473,7 +2477,9 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 	{
 		CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
 		List	   *stmt_list;
-		ListCell   *lc2;
+		List	   *execlockrelsinfo_list;
+		ListCell   *lc2,
+				   *lc3;
 
 		spicallbackarg.query = plansource->query_string;
 
@@ -2552,6 +2558,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 							  plan_owner, _SPI_current->queryEnv);
 
 		stmt_list = cplan->stmt_list;
+		execlockrelsinfo_list = cplan->execlockrelsinfo_list;
 
 		/*
 		 * If we weren't given a specific snapshot to use, and the statement
@@ -2589,9 +2596,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 			}
 		}
 
-		foreach(lc2, stmt_list)
+		forboth(lc2, stmt_list, lc3, execlockrelsinfo_list)
 		{
 			PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+			ExecLockRelsInfo *execlockrelsinfo = lfirst_node(ExecLockRelsInfo, lc3);
 			bool		canSetTag = stmt->canSetTag;
 			DestReceiver *dest;
 
@@ -2663,7 +2671,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 				else
 					snap = InvalidSnapshot;
 
-				qdesc = CreateQueryDesc(stmt,
+				qdesc = CreateQueryDesc(stmt, execlockrelsinfo,
 										plansource->query_string,
 										snap, crosscheck_snapshot,
 										dest,
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 29c515d7db..afffabbea0 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -68,6 +68,13 @@
 		} \
 	} while (0)
 
+/* Copy a field that is an array with numElem ints */
+#define COPY_INT_ARRAY(fldname, numElem) \
+	do { \
+		newnode->fldname = (numElem) > 0 ? palloc((numElem) * sizeof(int)) : NULL; \
+		memcpy(newnode->fldname, from->fldname, sizeof(int) * (numElem)); \
+	} while (0)
+
 /* Copy a parse location field (for Copy, this is same as scalar case) */
 #define COPY_LOCATION_FIELD(fldname) \
 	(newnode->fldname = from->fldname)
@@ -94,8 +101,10 @@ _copyPlannedStmt(const PlannedStmt *from)
 	COPY_SCALAR_FIELD(transientPlan);
 	COPY_SCALAR_FIELD(dependsOnRole);
 	COPY_SCALAR_FIELD(parallelModeNeeded);
+	COPY_SCALAR_FIELD(containsInitialPruning);
 	COPY_SCALAR_FIELD(jitFlags);
 	COPY_NODE_FIELD(planTree);
+	COPY_SCALAR_FIELD(numPlanNodes);
 	COPY_NODE_FIELD(rtable);
 	COPY_NODE_FIELD(resultRelations);
 	COPY_NODE_FIELD(appendRelations);
@@ -1282,6 +1291,8 @@ _copyPartitionPruneInfo(const PartitionPruneInfo *from)
 	PartitionPruneInfo *newnode = makeNode(PartitionPruneInfo);
 
 	COPY_NODE_FIELD(prune_infos);
+	COPY_SCALAR_FIELD(needs_init_pruning);
+	COPY_SCALAR_FIELD(needs_exec_pruning);
 	COPY_BITMAPSET_FIELD(other_subplans);
 
 	return newnode;
@@ -5373,6 +5384,33 @@ _copyExtensibleNode(const ExtensibleNode *from)
 	return newnode;
 }
 
+/* ****************************************************************
+ *					execnodes.h copy functions
+ * ****************************************************************
+ */
+static ExecLockRelsInfo *
+_copyExecLockRelsInfo(const ExecLockRelsInfo *from)
+{
+	ExecLockRelsInfo *newnode = makeNode(ExecLockRelsInfo);
+
+	COPY_BITMAPSET_FIELD(lockrels);
+	COPY_SCALAR_FIELD(numPlanNodes);
+	COPY_NODE_FIELD(initPruningOutputs);
+	COPY_INT_ARRAY(ipoIndexes, from->numPlanNodes);
+
+	return newnode;
+}
+
+static PlanInitPruningOutput *
+_copyPlanInitPruningOutput(const PlanInitPruningOutput *from)
+{
+	PlanInitPruningOutput *newnode = makeNode(PlanInitPruningOutput);
+
+	COPY_BITMAPSET_FIELD(initially_valid_subplans);
+
+	return newnode;
+}
+
 /* ****************************************************************
  *					value.h copy functions
  * ****************************************************************
@@ -5427,7 +5465,6 @@ _copyBitString(const BitString *from)
 	return newnode;
 }
 
-
 static ForeignKeyCacheInfo *
 _copyForeignKeyCacheInfo(const ForeignKeyCacheInfo *from)
 {
@@ -6454,6 +6491,16 @@ copyObjectImpl(const void *from)
 			retval = _copyPublicationTable(from);
 			break;
 
+			/*
+			 * EXECUTION NODES
+			 */
+		case T_ExecLockRelsInfo:
+			retval = _copyExecLockRelsInfo(from);
+			break;
+		case T_PlanInitPruningOutput:
+			retval = _copyPlanInitPruningOutput(from);
+			break;
+
 			/*
 			 * MISCELLANEOUS NODES
 			 */
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 108ede9af9..e2d7e6bcac 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -312,8 +312,10 @@ _outPlannedStmt(StringInfo str, const PlannedStmt *node)
 	WRITE_BOOL_FIELD(transientPlan);
 	WRITE_BOOL_FIELD(dependsOnRole);
 	WRITE_BOOL_FIELD(parallelModeNeeded);
+	WRITE_BOOL_FIELD(containsInitialPruning);
 	WRITE_INT_FIELD(jitFlags);
 	WRITE_NODE_FIELD(planTree);
+	WRITE_INT_FIELD(numPlanNodes);
 	WRITE_NODE_FIELD(rtable);
 	WRITE_NODE_FIELD(resultRelations);
 	WRITE_NODE_FIELD(appendRelations);
@@ -1008,6 +1010,8 @@ _outPartitionPruneInfo(StringInfo str, const PartitionPruneInfo *node)
 	WRITE_NODE_TYPE("PARTITIONPRUNEINFO");
 
 	WRITE_NODE_FIELD(prune_infos);
+	WRITE_BOOL_FIELD(needs_init_pruning);
+	WRITE_BOOL_FIELD(needs_exec_pruning);
 	WRITE_BITMAPSET_FIELD(other_subplans);
 }
 
@@ -2818,6 +2822,31 @@ _outExtensibleNode(StringInfo str, const ExtensibleNode *node)
 	methods->nodeOut(str, node);
 }
 
+/*****************************************************************************
+ *
+ *	Stuff from execnodes.h
+ *
+ *****************************************************************************/
+
+static void
+_outExecLockRelsInfo(StringInfo str, const ExecLockRelsInfo *node)
+{
+	WRITE_NODE_TYPE("EXECLOCKRELSINFO");
+
+	WRITE_BITMAPSET_FIELD(lockrels);
+	WRITE_INT_FIELD(numPlanNodes);
+	WRITE_NODE_FIELD(initPruningOutputs);
+	WRITE_INT_ARRAY(ipoIndexes, node->numPlanNodes);
+}
+
+static void
+_outPlanInitPruningOutput(StringInfo str, const PlanInitPruningOutput *node)
+{
+	WRITE_NODE_TYPE("PLANINITPRUNINGOUTPUT");
+
+	WRITE_BITMAPSET_FIELD(initially_valid_subplans);
+}
+
 /*****************************************************************************
  *
  *	Stuff from parsenodes.h.
@@ -4720,6 +4749,16 @@ outNode(StringInfo str, const void *obj)
 				_outJsonItemCoercions(str, obj);
 				break;
 
+				/*
+				 * EXECUTION NODES
+				 */
+			case T_ExecLockRelsInfo:
+				_outExecLockRelsInfo(str, obj);
+				break;
+			case T_PlanInitPruningOutput:
+				_outPlanInitPruningOutput(str, obj);
+				break;
+
 			default:
 
 				/*
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index ce146dd45e..88173f70a1 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -1782,8 +1782,10 @@ _readPlannedStmt(void)
 	READ_BOOL_FIELD(transientPlan);
 	READ_BOOL_FIELD(dependsOnRole);
 	READ_BOOL_FIELD(parallelModeNeeded);
+	READ_BOOL_FIELD(containsInitialPruning);
 	READ_INT_FIELD(jitFlags);
 	READ_NODE_FIELD(planTree);
+	READ_INT_FIELD(numPlanNodes);
 	READ_NODE_FIELD(rtable);
 	READ_NODE_FIELD(resultRelations);
 	READ_NODE_FIELD(appendRelations);
@@ -2735,6 +2737,8 @@ _readPartitionPruneInfo(void)
 	READ_LOCALS(PartitionPruneInfo);
 
 	READ_NODE_FIELD(prune_infos);
+	READ_BOOL_FIELD(needs_init_pruning);
+	READ_BOOL_FIELD(needs_exec_pruning);
 	READ_BITMAPSET_FIELD(other_subplans);
 
 	READ_DONE();
@@ -2904,6 +2908,35 @@ _readPartitionRangeDatum(void)
 	READ_DONE();
 }
 
+/*
+ * _readExecLockRelsInfo
+ */
+static ExecLockRelsInfo *
+_readExecLockRelsInfo(void)
+{
+	READ_LOCALS(ExecLockRelsInfo);
+
+	READ_BITMAPSET_FIELD(lockrels);
+	READ_INT_FIELD(numPlanNodes);
+	READ_NODE_FIELD(initPruningOutputs);
+	READ_INT_ARRAY(ipoIndexes, local_node->numPlanNodes);
+
+	READ_DONE();
+}
+
+/*
+ * _readPlanInitPruningOutput
+ */
+static PlanInitPruningOutput *
+_readPlanInitPruningOutput(void)
+{
+	READ_LOCALS(PlanInitPruningOutput);
+
+	READ_BITMAPSET_FIELD(initially_valid_subplans);
+
+	READ_DONE();
+}
+
 /*
  * parseNodeString
  *
@@ -3197,6 +3230,10 @@ parseNodeString(void)
 		return_value = _readJsonCoercion();
 	else if (MATCH("JSONITEMCOERCIONS", 17))
 		return_value = _readJsonItemCoercions();
+	else if (MATCH("EXECLOCKRELSINFO", 16))
+		return_value = _readExecLockRelsInfo();
+	else if (MATCH("PLANINITPRUNINGOUTPUT", 21))
+		return_value = _readPlanInitPruningOutput();
 	else
 	{
 		elog(ERROR, "badly formatted node string \"%.32s\"...", token);
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index c769b4b4b9..4c586ac1ec 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -517,7 +517,9 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 	result->transientPlan = glob->transientPlan;
 	result->dependsOnRole = glob->dependsOnRole;
 	result->parallelModeNeeded = glob->parallelModeNeeded;
+	result->containsInitialPruning = glob->containsInitialPruning;
 	result->planTree = top_plan;
+	result->numPlanNodes = glob->lastPlanNodeId;
 	result->rtable = glob->finalrtable;
 	result->resultRelations = glob->resultRelations;
 	result->appendRelations = glob->appendRelations;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 8214edec54..a1c6c3caa2 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -1623,6 +1623,9 @@ set_append_references(PlannerInfo *root,
 				pinfo->rtindex += rtoffset;
 			}
 		}
+
+		if (aplan->part_prune_info->needs_init_pruning)
+			root->glob->containsInitialPruning = true;
 	}
 
 	/* We don't need to recurse to lefttree or righttree ... */
@@ -1710,6 +1713,9 @@ set_mergeappend_references(PlannerInfo *root,
 				pinfo->rtindex += rtoffset;
 			}
 		}
+
+		if (mplan->part_prune_info->needs_init_pruning)
+			root->glob->containsInitialPruning = true;
 	}
 
 	/* We don't need to recurse to lefttree or righttree ... */
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 7080cb25d9..3322dc79f2 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -144,7 +144,9 @@ static List *make_partitionedrel_pruneinfo(PlannerInfo *root,
 										   List *prunequal,
 										   Bitmapset *partrelids,
 										   int *relid_subplan_map,
-										   Bitmapset **matchedsubplans);
+										   Bitmapset **matchedsubplans,
+										   bool *needs_init_pruning,
+										   bool *needs_exec_pruning);
 static void gen_partprune_steps(RelOptInfo *rel, List *clauses,
 								PartClauseTarget target,
 								GeneratePruningStepsContext *context);
@@ -230,6 +232,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	int		   *relid_subplan_map;
 	ListCell   *lc;
 	int			i;
+	bool		needs_init_pruning = false;
+	bool		needs_exec_pruning = false;
 
 	/*
 	 * Scan the subpaths to see which ones are scans of partition child
@@ -309,12 +313,16 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		Bitmapset  *partrelids = (Bitmapset *) lfirst(lc);
 		List	   *pinfolist;
 		Bitmapset  *matchedsubplans = NULL;
+		bool		partrel_needs_init_pruning;
+		bool		partrel_needs_exec_pruning;
 
 		pinfolist = make_partitionedrel_pruneinfo(root, parentrel,
 												  prunequal,
 												  partrelids,
 												  relid_subplan_map,
-												  &matchedsubplans);
+												  &matchedsubplans,
+												  &partrel_needs_init_pruning,
+												  &partrel_needs_exec_pruning);
 
 		/* When pruning is possible, record the matched subplans */
 		if (pinfolist != NIL)
@@ -323,6 +331,10 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 			allmatchedsubplans = bms_join(matchedsubplans,
 										  allmatchedsubplans);
 		}
+		if (!needs_init_pruning)
+			needs_init_pruning = partrel_needs_init_pruning;
+		if (!needs_exec_pruning)
+			needs_exec_pruning = partrel_needs_exec_pruning;
 	}
 
 	pfree(relid_subplan_map);
@@ -337,6 +349,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	/* Else build the result data structure */
 	pruneinfo = makeNode(PartitionPruneInfo);
 	pruneinfo->prune_infos = prunerelinfos;
+	pruneinfo->needs_init_pruning = needs_init_pruning;
+	pruneinfo->needs_exec_pruning = needs_exec_pruning;
 
 	/*
 	 * Some subplans may not belong to any of the identified partitioned rels.
@@ -435,13 +449,18 @@ add_part_relids(List *allpartrelids, Bitmapset *partrelids)
  * If we cannot find any useful run-time pruning steps, return NIL.
  * However, on success, each rel identified in partrelids will have
  * an element in the result list, even if some of them are useless.
+ * *needs_init_pruning and *needs_exec_pruning are set to indicate that the
+ * returned PartitionedRelPruneInfos contain pruning steps that can be
+ * performed before and after execution begins, respectively.
  */
 static List *
 make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 							  List *prunequal,
 							  Bitmapset *partrelids,
 							  int *relid_subplan_map,
-							  Bitmapset **matchedsubplans)
+							  Bitmapset **matchedsubplans,
+							  bool *needs_init_pruning,
+							  bool *needs_exec_pruning)
 {
 	RelOptInfo *targetpart = NULL;
 	List	   *pinfolist = NIL;
@@ -452,6 +471,10 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	int			rti;
 	int			i;
 
+	/* Will find out below. */
+	*needs_init_pruning = false;
+	*needs_exec_pruning = false;
+
 	/*
 	 * Examine each partitioned rel, constructing a temporary array to map
 	 * from planner relids to index of the partitioned rel, and building a
@@ -539,6 +562,9 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		 * executor per-scan pruning steps.  This first pass creates startup
 		 * pruning steps and detects whether there's any possibly-useful quals
 		 * that would require per-scan pruning.
+		 *
+		 * In the first pass, we note whether the 2nd pass is necessary by
+		 * noting the presence of EXEC parameters.
 		 */
 		gen_partprune_steps(subpart, partprunequal, PARTTARGET_INITIAL,
 							&context);
@@ -613,6 +639,11 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		pinfo->execparamids = execparamids;
 		/* Remaining fields will be filled in the next loop */
 
+		if (!*needs_init_pruning)
+			*needs_init_pruning = (initial_pruning_steps != NIL);
+		if (!*needs_exec_pruning)
+			*needs_exec_pruning = (exec_pruning_steps != NIL);
+
 		pinfolist = lappend(pinfolist, pinfo);
 	}
 
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index ba2fcfeb4a..085eb3f209 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -945,15 +945,17 @@ pg_plan_query(Query *querytree, const char *query_string, int cursorOptions,
  * For normal optimizable statements, invoke the planner.  For utility
  * statements, just make a wrapper PlannedStmt node.
  *
- * The result is a list of PlannedStmt nodes.
+ * The result is a list of PlannedStmt nodes.  Also, a NULL is appended to
+ * *execlockrelsinfo_list for each PlannedStmt added to the returned list.
  */
 List *
 pg_plan_queries(List *querytrees, const char *query_string, int cursorOptions,
-				ParamListInfo boundParams)
+				ParamListInfo boundParams, List **execlockrelsinfo_list)
 {
 	List	   *stmt_list = NIL;
 	ListCell   *query_list;
 
+	*execlockrelsinfo_list = NIL;
 	foreach(query_list, querytrees)
 	{
 		Query	   *query = lfirst_node(Query, query_list);
@@ -977,6 +979,7 @@ pg_plan_queries(List *querytrees, const char *query_string, int cursorOptions,
 		}
 
 		stmt_list = lappend(stmt_list, stmt);
+		*execlockrelsinfo_list = lappend(*execlockrelsinfo_list, NULL);
 	}
 
 	return stmt_list;
@@ -1080,7 +1083,8 @@ exec_simple_query(const char *query_string)
 		QueryCompletion qc;
 		MemoryContext per_parsetree_context = NULL;
 		List	   *querytree_list,
-				   *plantree_list;
+				   *plantree_list,
+				   *plantree_execlockrelsinfo_list;
 		Portal		portal;
 		DestReceiver *receiver;
 		int16		format;
@@ -1167,7 +1171,8 @@ exec_simple_query(const char *query_string)
 												NULL, 0, NULL);
 
 		plantree_list = pg_plan_queries(querytree_list, query_string,
-										CURSOR_OPT_PARALLEL_OK, NULL);
+										CURSOR_OPT_PARALLEL_OK, NULL,
+										&plantree_execlockrelsinfo_list);
 
 		/*
 		 * Done with the snapshot used for parsing/planning.
@@ -1203,6 +1208,7 @@ exec_simple_query(const char *query_string)
 						  query_string,
 						  commandTag,
 						  plantree_list,
+						  plantree_execlockrelsinfo_list,
 						  NULL);
 
 		/*
@@ -1991,6 +1997,7 @@ exec_bind_message(StringInfo input_message)
 					  query_string,
 					  psrc->commandTag,
 					  cplan->stmt_list,
+					  cplan->execlockrelsinfo_list,
 					  cplan);
 
 	/* Done with the snapshot used for parameter I/O and parsing/planning */
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 5aa5a350f3..0fd8c65de7 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -35,7 +35,7 @@
 Portal		ActivePortal = NULL;
 
 
-static void ProcessQuery(PlannedStmt *plan,
+static void ProcessQuery(PlannedStmt *plan, ExecLockRelsInfo *execlockrelsinfo,
 						 const char *sourceText,
 						 ParamListInfo params,
 						 QueryEnvironment *queryEnv,
@@ -65,6 +65,7 @@ static void DoPortalRewind(Portal portal);
  */
 QueryDesc *
 CreateQueryDesc(PlannedStmt *plannedstmt,
+				ExecLockRelsInfo *execlockrelsinfo,
 				const char *sourceText,
 				Snapshot snapshot,
 				Snapshot crosscheck_snapshot,
@@ -77,6 +78,7 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
 
 	qd->operation = plannedstmt->commandType;	/* operation */
 	qd->plannedstmt = plannedstmt;	/* plan */
+	qd->execlockrelsinfo = execlockrelsinfo;		/* ExecutorGetLockRels() output for plan */
 	qd->sourceText = sourceText;	/* query text */
 	qd->snapshot = RegisterSnapshot(snapshot);	/* snapshot */
 	/* RI check snapshot */
@@ -122,6 +124,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  *		PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
  *
  *	plan: the plan tree for the query
+ *	execlockrelsinfo: ExecutorGetLockRels() output for the plan tree
  *	sourceText: the source text of the query
  *	params: any parameters needed
  *	dest: where to send results
@@ -134,6 +137,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  */
 static void
 ProcessQuery(PlannedStmt *plan,
+			 ExecLockRelsInfo *execlockrelsinfo,
 			 const char *sourceText,
 			 ParamListInfo params,
 			 QueryEnvironment *queryEnv,
@@ -145,7 +149,7 @@ ProcessQuery(PlannedStmt *plan,
 	/*
 	 * Create the QueryDesc object
 	 */
-	queryDesc = CreateQueryDesc(plan, sourceText,
+	queryDesc = CreateQueryDesc(plan, execlockrelsinfo, sourceText,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, params, queryEnv, 0);
 
@@ -493,6 +497,7 @@ PortalStart(Portal portal, ParamListInfo params,
 				 * the destination to DestNone.
 				 */
 				queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+											linitial_node(ExecLockRelsInfo, portal->execlockrelsinfos),
 											portal->sourceText,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
@@ -1193,7 +1198,8 @@ PortalRunMulti(Portal portal,
 			   QueryCompletion *qc)
 {
 	bool		active_snapshot_set = false;
-	ListCell   *stmtlist_item;
+	ListCell   *stmtlist_item,
+			   *execlockrelsinfolist_item;
 
 	/*
 	 * If the destination is DestRemoteExecute, change to DestNone.  The
@@ -1214,9 +1220,12 @@ PortalRunMulti(Portal portal,
 	 * Loop to handle the individual queries generated from a single parsetree
 	 * by analysis and rewrite.
 	 */
-	foreach(stmtlist_item, portal->stmts)
+	forboth(stmtlist_item, portal->stmts,
+			execlockrelsinfolist_item, portal->execlockrelsinfos)
 	{
 		PlannedStmt *pstmt = lfirst_node(PlannedStmt, stmtlist_item);
+		ExecLockRelsInfo *execlockrelsinfo = lfirst_node(ExecLockRelsInfo,
+											   execlockrelsinfolist_item);
 
 		/*
 		 * If we got a cancel signal in prior command, quit
@@ -1274,7 +1283,7 @@ PortalRunMulti(Portal portal,
 			if (pstmt->canSetTag)
 			{
 				/* statement can set tag string */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, execlockrelsinfo,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
@@ -1283,7 +1292,7 @@ PortalRunMulti(Portal portal,
 			else
 			{
 				/* stmt added by rewrite cannot set tag */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, execlockrelsinfo,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 4cf6db504f..9f5a40a0a6 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -99,14 +99,16 @@ static dlist_head cached_expression_list = DLIST_STATIC_INIT(cached_expression_l
 static void ReleaseGenericPlan(CachedPlanSource *plansource);
 static List *RevalidateCachedQuery(CachedPlanSource *plansource,
 								   QueryEnvironment *queryEnv);
-static bool CheckCachedPlan(CachedPlanSource *plansource);
+static bool CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams);
 static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 								   ParamListInfo boundParams, QueryEnvironment *queryEnv);
+static void CachedPlanSaveExecLockRelsInfos(CachedPlan *plan, List *execlockrelsinfo_list);
 static bool choose_custom_plan(CachedPlanSource *plansource,
 							   ParamListInfo boundParams);
 static double cached_plan_cost(CachedPlan *plan, bool include_planner);
 static Query *QueryListGetPrimaryStmt(List *stmts);
-static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static List *AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams);
+static void ReleaseExecutorLocks(List *stmt_list, List *execlockrelsinfo_list);
 static void AcquirePlannerLocks(List *stmt_list, bool acquire);
 static void ScanQueryForLocks(Query *parsetree, bool acquire);
 static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -790,9 +792,21 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
  *
  * On a "true" return, we have acquired the locks needed to run the plan.
  * (We must do this for the "true" result to be race-condition-free.)
+ *
+ * If the CachedPlan is valid, this may in some cases call ExecutorGetLockRels
+ * on each PlannedStmt contained in it to determine the set of relations to be
+ * locked by AcquireExecutorLocks(), instead of just scanning its range table.
+ * That is done to prune away any nodes in the tree that need not be executed
+ * based on the result of initial partition pruning.  The resulting
+ * ExecLockRelsInfo nodes containing the result of such pruning, allocated in
+ * a child context of the context containing the plan itself, are added into
+ * plan->execlockrelsinfo_list.  The previous contents of the list from the
+ * last invocation on the same CachedPlan are deleted, because they would no
+ * longer be valid given the fresh set of parameter values which may be used
+ * as pruning parameters.
  */
 static bool
-CheckCachedPlan(CachedPlanSource *plansource)
+CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams)
 {
 	CachedPlan *plan = plansource->gplan;
 
@@ -820,13 +834,25 @@ CheckCachedPlan(CachedPlanSource *plansource)
 	 */
 	if (plan->is_valid)
 	{
+		List   *execlockrelsinfo_list;
+
 		/*
 		 * Plan must have positive refcount because it is referenced by
 		 * plansource; so no need to fear it disappears under us here.
 		 */
 		Assert(plan->refcount > 0);
 
-		AcquireExecutorLocks(plan->stmt_list, true);
+		/*
+		 * Lock relations scanned by the plan.  If ExecutorGetLockRels() asked
+		 * to omit some relations because the plan nodes that scan them were
+		 * found to be pruned, the executor is informed of the omission of
+		 * those plan nodes via the ExecLockRelsInfo nodes collected in the
+		 * returned list, which is passed to it along with the list of
+		 * PlannedStmts, so that it doesn't accidentally try to execute those
+		 * nodes.
+		 */
+		execlockrelsinfo_list = AcquireExecutorLocks(plan->stmt_list,
+													 boundParams);
 
 		/*
 		 * If plan was transient, check to see if TransactionXmin has
@@ -844,11 +870,14 @@ CheckCachedPlan(CachedPlanSource *plansource)
 		if (plan->is_valid)
 		{
 			/* Successfully revalidated and locked the query. */
+
+			/* Remember ExecLockRelsInfos in the CachedPlan. */
+			CachedPlanSaveExecLockRelsInfos(plan, execlockrelsinfo_list);
 			return true;
 		}
 
 		/* Oops, the race case happened.  Release useless locks. */
-		AcquireExecutorLocks(plan->stmt_list, false);
+		ReleaseExecutorLocks(plan->stmt_list, execlockrelsinfo_list);
 	}
 
 	/*
@@ -880,7 +909,8 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 				ParamListInfo boundParams, QueryEnvironment *queryEnv)
 {
 	CachedPlan *plan;
-	List	   *plist;
+	List	   *plist,
+			   *execlockrelsinfo_list;
 	bool		snapshot_set;
 	bool		is_transient;
 	MemoryContext plan_context;
@@ -933,7 +963,8 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 	 * Generate the plan.
 	 */
 	plist = pg_plan_queries(qlist, plansource->query_string,
-							plansource->cursor_options, boundParams);
+							plansource->cursor_options, boundParams,
+							&execlockrelsinfo_list);
 
 	/* Release snapshot if we got one */
 	if (snapshot_set)
@@ -1002,6 +1033,16 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 	plan->is_saved = false;
 	plan->is_valid = true;
 
+	/*
+	 * Save the dummy ExecLockRelsInfo list, that is, a list containing NULLs
+	 * as elements.  We must do this, because users of the CachedPlan expect
+	 * one to go with the list of PlannedStmts.
+	 * XXX maybe get rid of that contract.
+	 */
+	plan->execlockrelsinfo_context = NULL;
+	CachedPlanSaveExecLockRelsInfos(plan, execlockrelsinfo_list);
+	Assert(MemoryContextIsValid(plan->execlockrelsinfo_context));
+
 	/* assign generation number to new plan */
 	plan->generation = ++(plansource->generation);
 
@@ -1160,7 +1201,7 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 
 	if (!customplan)
 	{
-		if (CheckCachedPlan(plansource))
+		if (CheckCachedPlan(plansource, boundParams))
 		{
 			/* We want a generic plan, and we already have a valid one */
 			plan = plansource->gplan;
@@ -1586,6 +1627,49 @@ CopyCachedPlan(CachedPlanSource *plansource)
 	return newsource;
 }
 
+/*
+ * CachedPlanSaveExecLockRelsInfos
+ *		Save the list containing ExecLockRelsInfo nodes into the given
+ *		CachedPlan
+ *
+ * The provided list is copied into a dedicated context that is a child of
+ * plan->context.  If the child context already exists, it is emptied, because
+ * any ExecLockRelsInfo contained therein would no longer be useful.
+ */
+static void
+CachedPlanSaveExecLockRelsInfos(CachedPlan *plan, List *execlockrelsinfo_list)
+{
+	MemoryContext	execlockrelsinfo_context = plan->execlockrelsinfo_context,
+					oldcontext = CurrentMemoryContext;
+	List		   *execlockrelsinfo_list_copy;
+
+	/*
+	 * Set up the dedicated context if not already done, saving it as a child
+	 * of the CachedPlan's context.
+	 */
+	if (execlockrelsinfo_context == NULL)
+	{
+		execlockrelsinfo_context = AllocSetContextCreate(CurrentMemoryContext,
+												 "CachedPlan execlockrelsinfo list",
+												 ALLOCSET_START_SMALL_SIZES);
+		MemoryContextSetParent(execlockrelsinfo_context, plan->context);
+		MemoryContextSetIdentifier(execlockrelsinfo_context, plan->context->ident);
+		plan->execlockrelsinfo_context = execlockrelsinfo_context;
+	}
+	else
+	{
+		/* Just clear existing contents by resetting the context. */
+		Assert(MemoryContextIsValid(execlockrelsinfo_context));
+		MemoryContextReset(execlockrelsinfo_context);
+	}
+
+	MemoryContextSwitchTo(execlockrelsinfo_context);
+	execlockrelsinfo_list_copy = copyObject(execlockrelsinfo_list);
+	MemoryContextSwitchTo(oldcontext);
+
+	plan->execlockrelsinfo_list = execlockrelsinfo_list_copy;
+}
+
 /*
  * CachedPlanIsValid: test whether the rewritten querytree within a
  * CachedPlanSource is currently valid (that is, not marked as being in need
@@ -1737,17 +1821,21 @@ QueryListGetPrimaryStmt(List *stmts)
 
 /*
  * AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
- * or release them if acquire is false.
+ *
+ * Returns a list of ExecLockRelsInfo nodes containing one element for each
+ * PlannedStmt in stmt_list; an element is NULL if its PlannedStmt is a
+ * utility statement or has containsInitialPruning set to false.
  */
-static void
-AcquireExecutorLocks(List *stmt_list, bool acquire)
+static List *
+AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams)
 {
 	ListCell   *lc1;
+	List	   *execlockrelsinfo_list = NIL;
 
 	foreach(lc1, stmt_list)
 	{
 		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
-		ListCell   *lc2;
+		ExecLockRelsInfo *execlockrelsinfo = NULL;
 
 		if (plannedstmt->commandType == CMD_UTILITY)
 		{
@@ -1761,27 +1849,139 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 			Query	   *query = UtilityContainsQuery(plannedstmt->utilityStmt);
 
 			if (query)
-				ScanQueryForLocks(query, acquire);
-			continue;
+				ScanQueryForLocks(query, true);
 		}
-
-		foreach(lc2, plannedstmt->rtable)
+		else
 		{
-			RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
+			/*
+			 * Figure out the set of relations that would need to be locked
+			 * before executing the plan.
+			 */
+			if (!plannedstmt->containsInitialPruning)
+			{
+				/*
+				 * If the plan contains no initial pruning steps, just lock
+				 * all the relations found in the range table.
+				 */
+				ListCell *lc;
 
-			if (rte->rtekind != RTE_RELATION)
-				continue;
+				foreach(lc, plannedstmt->rtable)
+				{
+					RangeTblEntry *rte = lfirst(lc);
+
+					if (rte->rtekind != RTE_RELATION)
+						continue;
+
+					/*
+					 * Acquire the appropriate type of lock on each relation
+					 * OID. Note that we don't actually try to open the rel,
+					 * and hence will not fail if it's been dropped entirely
+					 * --- we'll just transiently acquire a non-conflicting
+					 *  lock.
+					 */
+					LockRelationOid(rte->relid, rte->rellockmode);
+				}
+			}
+			else
+			{
+				int			rti;
+				Bitmapset  *lockrels;
+
+				/*
+				 * Walk the plan tree to find only the minimal set of
+				 * relations to be locked, considering the effect of performing
+				 * initial partition pruning.
+				 */
+				execlockrelsinfo = ExecutorGetLockRels(plannedstmt, boundParams);
+				lockrels = execlockrelsinfo->lockrels;
+
+				rti = -1;
+				while ((rti = bms_next_member(lockrels, rti)) >= 0)
+				{
+					RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
 
+					Assert(rte->rtekind == RTE_RELATION);
+
+					/* See the comment above. */
+					LockRelationOid(rte->relid, rte->rellockmode);
+				}
+			}
+		}
+
+		/*
+		 * Remember ExecLockRelsInfo for later adding to the QueryDesc that
+		 * will be passed to the executor when executing this plan.  May be
+		 * NULL, but must keep the list the same length as stmt_list.
+		 */
+		execlockrelsinfo_list = lappend(execlockrelsinfo_list,
+										execlockrelsinfo);
+	}
+
+	return execlockrelsinfo_list;
+}
+
+/*
+ * ReleaseExecutorLocks
+ * 		Release locks that would've been acquired by an earlier call to
+ * 		AcquireExecutorLocks()
+ */
+static void
+ReleaseExecutorLocks(List *stmt_list, List *execlockrelsinfo_list)
+{
+	ListCell   *lc1,
+			   *lc2;
+
+	forboth(lc1, stmt_list, lc2, execlockrelsinfo_list)
+	{
+		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
+		ExecLockRelsInfo *execlockrelsinfo = lfirst_node(ExecLockRelsInfo, lc2);
+
+		if (plannedstmt->commandType == CMD_UTILITY)
+		{
 			/*
-			 * Acquire the appropriate type of lock on each relation OID. Note
-			 * that we don't actually try to open the rel, and hence will not
-			 * fail if it's been dropped entirely --- we'll just transiently
-			 * acquire a non-conflicting lock.
+			 * Ignore utility statements, except those (such as EXPLAIN) that
+			 * contain a parsed-but-not-planned query.  Note: it's okay to use
+			 * ScanQueryForLocks, even though the query hasn't been through
+			 * rule rewriting, because rewriting doesn't change the query
+			 * representation.
 			 */
-			if (acquire)
-				LockRelationOid(rte->relid, rte->rellockmode);
+			Query	   *query = UtilityContainsQuery(plannedstmt->utilityStmt);
+
+			if (query)
+				ScanQueryForLocks(query, false);
+		}
+		else
+		{
+			if (execlockrelsinfo == NULL)
+			{
+				ListCell *lc;
+
+				foreach(lc, plannedstmt->rtable)
+				{
+					RangeTblEntry *rte = lfirst(lc);
+
+					if (rte->rtekind != RTE_RELATION)
+						continue;
+
+					UnlockRelationOid(rte->relid, rte->rellockmode);
+				}
+			}
 			else
-				UnlockRelationOid(rte->relid, rte->rellockmode);
+			{
+				int			rti;
+				Bitmapset  *lockrels;
+
+				lockrels = execlockrelsinfo->lockrels;
+				rti = -1;
+				while ((rti = bms_next_member(lockrels, rti)) >= 0)
+				{
+					RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
+
+					Assert(rte->rtekind == RTE_RELATION);
+
+					UnlockRelationOid(rte->relid, rte->rellockmode);
+				}
+			}
 		}
 	}
 }
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index d549f66d4a..896f51be08 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -285,6 +285,7 @@ PortalDefineQuery(Portal portal,
 				  const char *sourceText,
 				  CommandTag commandTag,
 				  List *stmts,
+				  List *execlockrelsinfos,
 				  CachedPlan *cplan)
 {
 	AssertArg(PortalIsValid(portal));
@@ -299,6 +300,7 @@ PortalDefineQuery(Portal portal,
 	portal->qc.nprocessed = 0;
 	portal->commandTag = commandTag;
 	portal->stmts = stmts;
+	portal->execlockrelsinfos = execlockrelsinfos;
 	portal->cplan = cplan;
 	portal->status = PORTAL_DEFINED;
 }
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 666977fb1f..fef75ba147 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -87,7 +87,8 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
 							  ExplainState *es, const char *queryString,
 							  ParamListInfo params, QueryEnvironment *queryEnv);
 
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt, ExecLockRelsInfo *execlockrelsinfo,
+						   IntoClause *into,
 						   ExplainState *es, const char *queryString,
 						   ParamListInfo params, QueryEnvironment *queryEnv,
 						   const instr_time *planduration,
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index fd5735a946..ded19b8cbb 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -124,4 +124,6 @@ extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
 						 PartitionPruneInfo *pruneinfo,
 						 Bitmapset **initially_valid_subplans);
 extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate);
+extern Bitmapset *ExecGetLockRelsDoInitialPruning(Plan *plan, ExecGetLockRelsContext *context,
+								PartitionPruneInfo *pruneinfo);
 #endif							/* EXECPARTITION_H */
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index e79e2c001f..4338463479 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -35,6 +35,7 @@ typedef struct QueryDesc
 	/* These fields are provided by CreateQueryDesc */
 	CmdType		operation;		/* CMD_SELECT, CMD_UPDATE, etc. */
 	PlannedStmt *plannedstmt;	/* planner's output (could be utility, too) */
+	ExecLockRelsInfo *execlockrelsinfo;	/* ExecutorGetLockRels()'s output given plannedstmt */
 	const char *sourceText;		/* source text of the query */
 	Snapshot	snapshot;		/* snapshot to use for query */
 	Snapshot	crosscheck_snapshot;	/* crosscheck for RI update/delete */
@@ -57,6 +58,7 @@ typedef struct QueryDesc
 
 /* in pquery.c */
 extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
+								  ExecLockRelsInfo *execlockrelsinfo,
 								  const char *sourceText,
 								  Snapshot snapshot,
 								  Snapshot crosscheck_snapshot,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index 873772f188..d03bd5a026 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -185,6 +185,8 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
 /*
  * prototypes from functions in execMain.c
  */
+extern ExecLockRelsInfo *ExecutorGetLockRels(PlannedStmt *plannedstmt, ParamListInfo params);
+extern bool ExecGetLockRels(Plan *node, ExecGetLockRelsContext *context);
 extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
 extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
 extern void ExecutorRun(QueryDesc *queryDesc,
diff --git a/src/include/executor/nodeAppend.h b/src/include/executor/nodeAppend.h
index 4cb78ee5b6..b53535c2a4 100644
--- a/src/include/executor/nodeAppend.h
+++ b/src/include/executor/nodeAppend.h
@@ -17,6 +17,7 @@
 #include "access/parallel.h"
 #include "nodes/execnodes.h"
 
+extern bool ExecGetAppendLockRels(Append *node, ExecGetLockRelsContext *context);
 extern AppendState *ExecInitAppend(Append *node, EState *estate, int eflags);
 extern void ExecEndAppend(AppendState *node);
 extern void ExecReScanAppend(AppendState *node);
diff --git a/src/include/executor/nodeMergeAppend.h b/src/include/executor/nodeMergeAppend.h
index 97fe3b0665..8eb4e9df93 100644
--- a/src/include/executor/nodeMergeAppend.h
+++ b/src/include/executor/nodeMergeAppend.h
@@ -16,6 +16,7 @@
 
 #include "nodes/execnodes.h"
 
+extern bool ExecGetMergeAppendLockRels(MergeAppend *node, ExecGetLockRelsContext *context);
 extern MergeAppendState *ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags);
 extern void ExecEndMergeAppend(MergeAppendState *node);
 extern void ExecReScanMergeAppend(MergeAppendState *node);
diff --git a/src/include/executor/nodeModifyTable.h b/src/include/executor/nodeModifyTable.h
index c318681b9a..287baf6257 100644
--- a/src/include/executor/nodeModifyTable.h
+++ b/src/include/executor/nodeModifyTable.h
@@ -19,6 +19,7 @@ extern void ExecComputeStoredGenerated(ResultRelInfo *resultRelInfo,
 									   EState *estate, TupleTableSlot *slot,
 									   CmdType cmdtype);
 
+extern bool ExecGetModifyTableLockRels(ModifyTable *plan, ExecGetLockRelsContext *context);
 extern ModifyTableState *ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags);
 extern void ExecEndModifyTable(ModifyTableState *node);
 extern void ExecReScanModifyTable(ModifyTableState *node);
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index cbbcff81d2..ee0c73e9a4 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -596,6 +596,7 @@ typedef struct EState
 	struct ExecRowMark **es_rowmarks;	/* Array of per-range-table-entry
 										 * ExecRowMarks, or NULL if none */
 	PlannedStmt *es_plannedstmt;	/* link to top of plan tree */
+	struct ExecLockRelsInfo *es_execlockrelsinfo; /* QueryDesc.execlockrelsinfo */
 	const char *es_sourceText;	/* Source text from QueryDesc */
 
 	JunkFilter *es_junkFilter;	/* top-level junk filter, if any */
@@ -984,6 +985,101 @@ typedef struct DomainConstraintState
  */
 typedef TupleTableSlot *(*ExecProcNodeMtd) (struct PlanState *pstate);
 
+/*----------------
+ * ExecLockRelsInfo
+ *
+ * Result of performing ExecutorGetLockRels() for a given PlannedStmt
+ */
+typedef struct ExecLockRelsInfo
+{
+	NodeTag		type;
+
+	/*
+	 * Relations that must be locked to execute the plan tree contained in
+	 * the PlannedStmt.
+	 */
+	Bitmapset  *lockrels;
+
+	/* PlannedStmt.numPlanNodes */
+	int			numPlanNodes;
+
+	/*
+	 * List of PlanInitPruningOutput, each representing the output of
+	 * performing initial pruning on a given plan node, for all nodes in the
+	 * plan tree that have been marked as needing initial pruning.
+	 *
+	 * 'ipoIndexes' is an array of 'numPlanNodes' elements, indexed with
+	 * plan_node_id of the individual nodes in the plan tree, each a 1-based
+	 * index into 'initPruningOutputs' list for a given plan node.  0 means
+	 * that a given plan node has no entry in the list because of not needing
+	 * any initial pruning done on it.
+	 */
+	List	   *initPruningOutputs;
+	int		   *ipoIndexes;
+} ExecLockRelsInfo;
+
+/*----------------
+ * ExecGetLockRelsContext
+ *
+ * Information pertaining to ExecutorGetLockRels() invocation for a given
+ * plan.
+ */
+typedef struct ExecGetLockRelsContext
+{
+	NodeTag		type;
+
+	PlannedStmt	   *stmt;		/* target plan */
+	ParamListInfo	params;		/* EXTERN parameters available for pruning */
+
+	/* Output parameters for ExecGetLockRels and its subroutines. */
+	Bitmapset	   *lockrels;
+
+	/* See the comment in the definition of ExecLockRelsInfo struct. */
+	List		   *initPruningOutputs;
+	int			   *ipoIndexes;
+} ExecGetLockRelsContext;
+
+/*
+ * Appends the provided PlanInitPruningOutput to the context's
+ * initPruningOutputs and records its 1-based position in ipoIndexes.
+ */
+#define ExecStorePlanInitPruningOutput(cxt, initPruningOutput, plannode) \
+	do { \
+		(cxt)->initPruningOutputs = lappend((cxt)->initPruningOutputs, initPruningOutput); \
+		(cxt)->ipoIndexes[(plannode)->plan_node_id] = list_length((cxt)->initPruningOutputs); \
+	} while (0)
+
+/*
+ * Finds the PlanInitPruningOutput for a given Plan node in
+ * ExecLockRelsInfo.initPruningOutputs.
+ */
+#define ExecFetchPlanInitPruningOutput(execlockrelsinfo, plannode) \
+		(((execlockrelsinfo) != NULL && (execlockrelsinfo)->initPruningOutputs != NIL) ? \
+		 list_nth((execlockrelsinfo)->initPruningOutputs, \
+				  (execlockrelsinfo)->ipoIndexes[(plannode)->plan_node_id] - 1) : NULL)
+
+/* ---------------
+ * PlanInitPruningOutput
+ *
+ * Node to remember the result of performing initial partition pruning steps
+ * during ExecutorGetLockRels() on nodes that support pruning.
+ *
+ * ExecGetLockRelsDoInitialPruning(), which runs during ExecutorGetLockRels(),
+ * creates it and stores it in the corresponding ExecLockRelsInfo.
+ *
+ * ExecInitPartitionPruning(), which runs during ExecutorStart(), fetches it
+ * from the EState's ExecLockRelsInfo (if any) and uses the value of
+ * initially_valid_subplans contained in it as-is to select the subplans to be
+ * initialized for execution, instead of re-evaluating that by performing
+ * initial pruning again.
+ */
+typedef struct PlanInitPruningOutput
+{
+	NodeTag		type;
+
+	Bitmapset  *initially_valid_subplans;
+} PlanInitPruningOutput;
+
 /* ----------------
  *		PlanState node
  *
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 53f6b05a3f..928a30c7c6 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -97,6 +97,11 @@ typedef enum NodeTag
 	T_PartitionPruneStepCombine,
 	T_PlanInvalItem,
 
+	/* TAGS FOR EXECUTOR PREP NODES (execnodes.h) */
+	T_ExecGetLockRelsContext,
+	T_ExecLockRelsInfo,
+	T_PlanInitPruningOutput,
+
 	/*
 	 * TAGS FOR PLAN STATE NODES (execnodes.h)
 	 *
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index ef9b54739a..0ed171d3f5 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -129,6 +129,10 @@ typedef struct PlannerGlobal
 
 	char		maxParallelHazard;	/* worst PROPARALLEL hazard level */
 
+	bool		containsInitialPruning;	/* Do some Plan nodes in the tree
+										 * have initial (pre-exec) pruning
+										 * steps? */
+
 	PartitionDirectory partition_directory; /* partition descriptors */
 
 	Bitmapset *elidedAppendPartedRels;	/* Combined partitioned_rels of all
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index a823c7c20d..4fcba0e55c 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -60,10 +60,16 @@ typedef struct PlannedStmt
 
 	bool		parallelModeNeeded; /* parallel mode required to execute? */
 
+	bool		containsInitialPruning;	/* Do some Plan nodes in the tree
+										 * have initial (pre-exec) pruning
+										 * steps? */
+
 	int			jitFlags;		/* which forms of JIT should be performed */
 
 	struct Plan *planTree;		/* tree of Plan nodes */
 
+	int			numPlanNodes;	/* number of nodes in planTree */
+
 	List	   *rtable;			/* list of RangeTblEntry nodes */
 
 	/* rtable indexes of target relations for INSERT/UPDATE/DELETE */
@@ -1192,6 +1198,13 @@ typedef struct PlanRowMark
  * prune_infos			List of Lists containing PartitionedRelPruneInfo nodes,
  *						one sublist per run-time-prunable partition hierarchy
  *						appearing in the parent plan node's subplans.
+ *
+ * needs_init_pruning	Does any of the PartitionedRelPruneInfos in
+ *						prune_infos have its initial_pruning_steps set?
+ *
+ * needs_exec_pruning	Does any of the PartitionedRelPruneInfos in
+ *						prune_infos have its exec_pruning_steps set?
+ *
  * other_subplans		Indexes of any subplans that are not accounted for
  *						by any of the PartitionedRelPruneInfo nodes in
  *						"prune_infos".  These subplans must not be pruned.
@@ -1200,6 +1213,8 @@ typedef struct PartitionPruneInfo
 {
 	NodeTag		type;
 	List	   *prune_infos;
+	bool		needs_init_pruning;
+	bool		needs_exec_pruning;
 	Bitmapset  *other_subplans;
 } PartitionPruneInfo;
 
diff --git a/src/include/tcop/tcopprot.h b/src/include/tcop/tcopprot.h
index 92291a750d..bf80c53bed 100644
--- a/src/include/tcop/tcopprot.h
+++ b/src/include/tcop/tcopprot.h
@@ -64,7 +64,7 @@ extern PlannedStmt *pg_plan_query(Query *querytree, const char *query_string,
 								  ParamListInfo boundParams);
 extern List *pg_plan_queries(List *querytrees, const char *query_string,
 							 int cursorOptions,
-							 ParamListInfo boundParams);
+							 ParamListInfo boundParams, List **execlockrelsinfo_list);
 
 extern bool check_max_stack_depth(int *newval, void **extra, GucSource source);
 extern void assign_max_stack_depth(int newval, void *extra);
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 95b99e3d25..56b0dcc6bd 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -148,6 +148,9 @@ typedef struct CachedPlan
 {
 	int			magic;			/* should equal CACHEDPLAN_MAGIC */
 	List	   *stmt_list;		/* list of PlannedStmts */
+	List	   *execlockrelsinfo_list;	/* list of ExecLockRelsInfo nodes with one
+									 * element for each of stmt_list; NIL
+									 * if not a generic plan */
 	bool		is_oneshot;		/* is it a "oneshot" plan? */
 	bool		is_saved;		/* is CachedPlan in a long-lived context? */
 	bool		is_valid;		/* is the stmt_list currently valid? */
@@ -158,6 +161,9 @@ typedef struct CachedPlan
 	int			generation;		/* parent's generation number for this plan */
 	int			refcount;		/* count of live references to this struct */
 	MemoryContext context;		/* context containing this CachedPlan */
+	MemoryContext execlockrelsinfo_context;	/* context containing
+											 * execlockrelsinfo_list,
+											 * a child of the above context */
 } CachedPlan;
 
 /*
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index aeddbdafe5..9abace6734 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -137,6 +137,10 @@ typedef struct PortalData
 	CommandTag	commandTag;		/* command tag for original query */
 	QueryCompletion qc;			/* command completion data for executed query */
 	List	   *stmts;			/* list of PlannedStmts */
+	List	   *execlockrelsinfos;	/* list of ExecLockRelsInfo nodes with one element
+								 * for each of 'stmts'; same as
+								 * cplan->execlockrelsinfo_list if cplan is
+								 * not NULL */
 	CachedPlan *cplan;			/* CachedPlan, if stmts are from one */
 
 	ParamListInfo portalParams; /* params to pass to query */
@@ -241,6 +245,7 @@ extern void PortalDefineQuery(Portal portal,
 							  const char *sourceText,
 							  CommandTag commandTag,
 							  List *stmts,
+							  List *execlockrelsinfos,
 							  CachedPlan *cplan);
 extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
 extern void PortalCreateHoldStore(Portal portal);
-- 
2.24.1
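
A note on the ipoIndexes scheme used by ExecLockRelsInfo in the patch above:
the array is indexed by plan_node_id and holds a 1-based position into
initPruningOutputs, with 0 meaning "no entry".  The standalone C sketch below
mimics ExecStorePlanInitPruningOutput() and ExecFetchPlanInitPruningOutput()
under some assumptions: every Sketch*/sketch_* name is a hypothetical
stand-in rather than a PostgreSQL type or function, a plain array replaces
the List, and a bitmask replaces the Bitmapset.  Unlike the macro in the
patch, the fetch here also returns NULL when a node's index is 0; that is an
illustrative choice, not something the patch does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for PlanInitPruningOutput. */
typedef struct SketchPruningOutput
{
	unsigned	valid_subplans;	/* toy bitmask in place of a Bitmapset */
} SketchPruningOutput;

/* Hypothetical stand-in for the lookup fields of ExecLockRelsInfo. */
typedef struct SketchLockRelsInfo
{
	int			numPlanNodes;
	int			numOutputs;
	SketchPruningOutput **outputs;	/* plays the role of initPruningOutputs */
	int		   *ipoIndexes;		/* 1-based index into outputs; 0 = none */
} SketchLockRelsInfo;

static SketchLockRelsInfo *
sketch_create(int numPlanNodes)
{
	SketchLockRelsInfo *info = malloc(sizeof(*info));

	if (info == NULL)
		abort();
	info->numPlanNodes = numPlanNodes;
	info->numOutputs = 0;
	/* calloc zero-fills, so every plan node starts out with "no entry" */
	info->outputs = calloc((size_t) numPlanNodes, sizeof(*info->outputs));
	info->ipoIndexes = calloc((size_t) numPlanNodes, sizeof(*info->ipoIndexes));
	if (info->outputs == NULL || info->ipoIndexes == NULL)
		abort();
	return info;
}

/* Append an output and remember its 1-based position for the plan node. */
static void
sketch_store(SketchLockRelsInfo *info, int plan_node_id,
			 SketchPruningOutput *output)
{
	assert(plan_node_id >= 0 && plan_node_id < info->numPlanNodes);
	assert(info->numOutputs < info->numPlanNodes);
	info->outputs[info->numOutputs++] = output;
	info->ipoIndexes[plan_node_id] = info->numOutputs;
}

/* Look up the output for a plan node; NULL if it has no entry. */
static SketchPruningOutput *
sketch_fetch(const SketchLockRelsInfo *info, int plan_node_id)
{
	int			idx;

	if (info == NULL || info->numOutputs == 0)
		return NULL;
	idx = info->ipoIndexes[plan_node_id];
	return (idx > 0) ? info->outputs[idx - 1] : NULL;
}
```

The 1-based convention lets a zero-filled array double as the "nothing
stored" marker, which is why no separate initialization pass is needed.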



  [application/octet-stream] v8-0001-Some-refactoring-of-runtime-pruning-code.patch (26.5K, 3-v8-0001-Some-refactoring-of-runtime-pruning-code.patch)
  download | inline diff:
From ce2041b254a7fee3097012f11685b635d58fb9b2 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Wed, 2 Mar 2022 15:17:55 +0900
Subject: [PATCH v8 1/4] Some refactoring of runtime pruning code

This does two things mainly:

* Move the execution pruning initialization steps that are common
between both ExecInitAppend() and ExecInitMergeAppend() into a new
function ExecInitPartitionPruning() defined in execPartition.c.
Thus, ExecCreatePartitionPruneState() and
ExecFindInitialMatchingSubPlans() need not be exported.

* Add an ExprContext field to PartitionPruneContext to remove the
implicit assumption in the runtime pruning code that the ExprContext
to use to compute pruning expressions that need one can always rely
on the PlanState providing it.  A future patch will allow runtime
pruning (at least the initial pruning steps) to be performed without
the corresponding PlanState yet having been created, so this will
help.
---
 src/backend/executor/execPartition.c   | 340 ++++++++++++++++---------
 src/backend/executor/nodeAppend.c      |  33 +--
 src/backend/executor/nodeMergeAppend.c |  32 +--
 src/backend/partitioning/partprune.c   |  20 +-
 src/include/executor/execPartition.h   |   9 +-
 src/include/partitioning/partprune.h   |   2 +
 6 files changed, 252 insertions(+), 184 deletions(-)
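
For reference, the subplan re-sequencing that this patch moves into
PartitionPruneStateFixSubPlanMap() can be pictured with a small standalone C
sketch.  The sketch_* names are hypothetical, the surviving subplans are
passed as a sorted int array rather than a Bitmapset, and the final
translation step (new index minus one, with -1 meaning "no subplan") is an
assumption that follows from the 1-based temporary array described in the
function's comment.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Build a map from old subplan index to new subplan index.  'valid' lists
 * the old indexes that survived initial pruning, in ascending order.  The
 * result uses 1-based new indexes and leaves pruned entries as 0, so a
 * zero-filled allocation needs no further initialization.
 */
static int *
sketch_build_subplan_map(const int *valid, int nvalid, int n_total_subplans)
{
	int		   *map = calloc((size_t) n_total_subplans, sizeof(int));
	int			newidx = 1;
	int			k;

	if (map == NULL)
		abort();
	for (k = 0; k < nvalid; k++)
	{
		assert(valid[k] >= 0 && valid[k] < n_total_subplans);
		map[valid[k]] = newidx++;
	}
	return map;
}

/*
 * Translate one old subplan index into its new 0-based index.  Returns -1
 * if the input was already "no subplan" or if the subplan was pruned.
 */
static int
sketch_remap(const int *map, int old_subplan_idx)
{
	if (old_subplan_idx < 0)
		return -1;
	return map[old_subplan_idx] - 1;
}
```

With five subplans of which 1, 3 and 4 survive, the map is {0, 1, 0, 2, 3},
so old subplan 3 becomes new subplan 1 and old subplan 0 maps to -1.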

diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index aca42ca5b8..84b4e4b3d6 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -184,11 +184,18 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
 												  int maxfieldlen);
 static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
 static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
+static PartitionPruneState *ExecCreatePartitionPruneState(PlanState *planstate,
+							  PartitionPruneInfo *partitionpruneinfo);
+static Bitmapset *ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate);
 static void ExecInitPruningContext(PartitionPruneContext *context,
 								   List *pruning_steps,
 								   PartitionDesc partdesc,
 								   PartitionKey partkey,
-								   PlanState *planstate);
+								   PlanState *planstate,
+								   ExprContext *econtext);
+static void PartitionPruneStateFixSubPlanMap(PartitionPruneState *prunestate,
+								 Bitmapset *initially_valid_subplans,
+								 int n_total_subplans);
 static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
 										   PartitionedRelPruningData *pprune,
 										   bool initial_prune,
@@ -1590,30 +1597,86 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *
  * Functions:
  *
- * ExecCreatePartitionPruneState:
+ * ExecInitPartitionPruning:
  *		Creates the PartitionPruneState required by each of the two pruning
  *		functions.  Details stored include how to map the partition index
- *		returned by the partition pruning code into subplan indexes.
- *
- * ExecFindInitialMatchingSubPlans:
- *		Returns indexes of matching subplans.  Partition pruning is attempted
- *		without any evaluation of expressions containing PARAM_EXEC Params.
- *		This function must be called during executor startup for the parent
- *		plan before the subplans themselves are initialized.  Subplans which
- *		are found not to match by this function must be removed from the
- *		plan's list of subplans during execution, as this function performs a
- *		remap of the partition index to subplan index map and the newly
- *		created map provides indexes only for subplans which remain after
- *		calling this function.
+ *		returned by the partition pruning code into subplan indexes.  Also
+ *		determines the set of initially valid subplans by performing initial
+ *		pruning steps; only those subplans need be initialized by the caller,
+ *		such as ExecInitAppend.  Maps in PartitionPruneState are updated to
+ *		account for initial pruning having eliminated some subplans, if any.
  *
  * ExecFindMatchingSubPlans:
  *		Returns indexes of matching subplans after evaluating all available
- *		expressions.  This function can only be called during execution and
- *		must be called again each time the value of a Param listed in
- *		PartitionPruneState's 'execparamids' changes.
+ *		expressions, that is, using execution pruning steps.  This function
+ *		can only be called during execution and must be called again each time
+ *		the value of a Param listed in PartitionPruneState's 'execparamids'
+ *		changes.
  *-------------------------------------------------------------------------
  */
 
+/*
+ * ExecInitPartitionPruning
+ * 		Initialize data structure needed for run-time partition pruning
+ *
+ * Initial pruning can be done immediately, so it is done here if needed and
+ * the set of indexes of the surviving partition subplans is returned in the
+ * output parameter *initially_valid_subplans.
+ *
+ * If subplans are indeed pruned, subplan_map arrays contained in the returned
+ * PartitionPruneState are re-sequenced to not count those, though only if the
+ * maps will be needed for subsequent execution pruning passes.
+ */
+PartitionPruneState *
+ExecInitPartitionPruning(PlanState *planstate,
+						 int n_total_subplans,
+						 PartitionPruneInfo *pruneinfo,
+						 Bitmapset **initially_valid_subplans)
+{
+	PartitionPruneState *prunestate;
+	EState *estate = planstate->state;
+
+	/* We may need an expression context to evaluate partition exprs */
+	ExecAssignExprContext(estate, planstate);
+
+	/*
+	 * Create the working data structure for pruning.
+	 */
+	prunestate = ExecCreatePartitionPruneState(planstate, pruneinfo);
+
+	/*
+	 * Perform an initial partition prune, if required.
+	 */
+	if (prunestate->do_initial_prune)
+	{
+		/* Determine which subplans survive initial pruning */
+		*initially_valid_subplans = ExecFindInitialMatchingSubPlans(prunestate);
+	}
+	else
+	{
+		/* We'll need to initialize all subplans */
+		Assert(n_total_subplans > 0);
+		*initially_valid_subplans = bms_add_range(NULL, 0,
+												  n_total_subplans - 1);
+	}
+
+	/*
+	 * Re-sequence subplan indexes contained in prunestate to account for any
+	 * that were removed above due to initial pruning.
+	 *
+	 * We can safely skip this when !do_exec_prune, even though that leaves
+	 * invalid data in prunestate, because that data won't be consulted again
+	 * (cf initial Assert in ExecFindMatchingSubPlans).
+	 */
+	if (prunestate->do_exec_prune &&
+		bms_num_members(*initially_valid_subplans) < n_total_subplans)
+		PartitionPruneStateFixSubPlanMap(prunestate,
+										 *initially_valid_subplans,
+										 n_total_subplans);
+
+	return prunestate;
+}
+
 /*
  * ExecCreatePartitionPruneState
  *		Build the data structure required for calling
@@ -1632,7 +1695,7 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  * re-used each time we re-evaluate which partitions match the pruning steps
  * provided in each PartitionedRelPruneInfo.
  */
-PartitionPruneState *
+static PartitionPruneState *
 ExecCreatePartitionPruneState(PlanState *planstate,
 							  PartitionPruneInfo *partitionpruneinfo)
 {
@@ -1641,6 +1704,7 @@ ExecCreatePartitionPruneState(PlanState *planstate,
 	int			n_part_hierarchies;
 	ListCell   *lc;
 	int			i;
+	ExprContext	*econtext = planstate->ps_ExprContext;
 
 	/* For data reading, executor always omits detached partitions */
 	if (estate->es_partition_directory == NULL)
@@ -1814,7 +1878,8 @@ ExecCreatePartitionPruneState(PlanState *planstate,
 			{
 				ExecInitPruningContext(&pprune->initial_context,
 									   pinfo->initial_pruning_steps,
-									   partdesc, partkey, planstate);
+									   partdesc, partkey, planstate,
+									   econtext);
 				/* Record whether initial pruning is needed at any level */
 				prunestate->do_initial_prune = true;
 			}
@@ -1823,7 +1888,8 @@ ExecCreatePartitionPruneState(PlanState *planstate,
 			{
 				ExecInitPruningContext(&pprune->exec_context,
 									   pinfo->exec_pruning_steps,
-									   partdesc, partkey, planstate);
+									   partdesc, partkey, planstate,
+									   econtext);
 				/* Record whether exec pruning is needed at any level */
 				prunestate->do_exec_prune = true;
 			}
@@ -1851,7 +1917,8 @@ ExecInitPruningContext(PartitionPruneContext *context,
 					   List *pruning_steps,
 					   PartitionDesc partdesc,
 					   PartitionKey partkey,
-					   PlanState *planstate)
+					   PlanState *planstate,
+					   ExprContext *econtext)
 {
 	int			n_steps;
 	int			partnatts;
@@ -1872,6 +1939,7 @@ ExecInitPruningContext(PartitionPruneContext *context,
 
 	context->ppccontext = CurrentMemoryContext;
 	context->planstate = planstate;
+	context->exprcontext = econtext;
 
 	/* Initialize expression state for each expression we need */
 	context->exprstates = (ExprState **)
@@ -1900,8 +1968,20 @@ ExecInitPruningContext(PartitionPruneContext *context,
 														step->step.step_id,
 														keyno);
 
-				context->exprstates[stateidx] =
-					ExecInitExpr(expr, context->planstate);
+				/*
+				 * When planstate is NULL, pruning_steps is known not to
+				 * contain any expressions that depend on the parent plan.
+				 * Information of any available EXTERN parameters must be
+				 * passed explicitly in that case, which the caller must
+				 * have made available via econtext.
+				 */
+				if (planstate == NULL)
+					context->exprstates[stateidx] =
+						ExecInitExprWithParams(expr,
+											   econtext->ecxt_param_list_info);
+				else
+					context->exprstates[stateidx] =
+						ExecInitExpr(expr, context->planstate);
 			}
 			keyno++;
 		}
@@ -1914,18 +1994,11 @@ ExecInitPruningContext(PartitionPruneContext *context,
  *		pruning, disregarding any pruning constraints involving PARAM_EXEC
  *		Params.
  *
- * If additional pruning passes will be required (because of PARAM_EXEC
- * Params), we must also update the translation data that allows conversion
- * of partition indexes into subplan indexes to account for the unneeded
- * subplans having been removed.
- *
  * Must only be called once per 'prunestate', and only if initial pruning
  * is required.
- *
- * 'nsubplans' must be passed as the total number of unpruned subplans.
  */
-Bitmapset *
-ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate, int nsubplans)
+static Bitmapset *
+ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate)
 {
 	Bitmapset  *result = NULL;
 	MemoryContext oldcontext;
@@ -1950,14 +2023,20 @@ ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate, int nsubplans)
 		PartitionedRelPruningData *pprune;
 
 		prunedata = prunestate->partprunedata[i];
+
+		/*
+		 * We pass the 1st item belonging to the root table of the hierarchy
+		 * and find_matching_subplans_recurse() takes care of recursing to
+		 * other (lower-level) parents as needed.
+		 */
 		pprune = &prunedata->partrelprunedata[0];
 
 		/* Perform pruning without using PARAM_EXEC Params */
 		find_matching_subplans_recurse(prunedata, pprune, true, &result);
 
-		/* Expression eval may have used space in node's ps_ExprContext too */
+		/* Expression eval may have used space in ExprContext too */
 		if (pprune->initial_pruning_steps)
-			ResetExprContext(pprune->initial_context.planstate->ps_ExprContext);
+			ResetExprContext(pprune->initial_context.exprcontext);
 	}
 
 	/* Add in any subplans that partition pruning didn't account for */
@@ -1970,118 +2049,120 @@ ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate, int nsubplans)
 
 	MemoryContextReset(prunestate->prune_context);
 
+	return result;
+}
+
+/*
+ * PartitionPruneStateFixSubPlanMap
+ *		Fix mapping of partition indexes to subplan indexes contained in
+ *		prunestate by considering the new list of subplans that survived
+ *		initial pruning
+ *
+ * Subplans previously indexed 0..(n_total_subplans - 1) are renumbered to
+ * the index range 0..(num(initially_valid_subplans) - 1).
+ */
+static void
+PartitionPruneStateFixSubPlanMap(PartitionPruneState *prunestate,
+								 Bitmapset *initially_valid_subplans,
+								 int n_total_subplans)
+{
+	int		   *new_subplan_indexes;
+	Bitmapset  *new_other_subplans;
+	int			i;
+	int			newidx;
+
 	/*
-	 * If exec-time pruning is required and we pruned subplans above, then we
-	 * must re-sequence the subplan indexes so that ExecFindMatchingSubPlans
-	 * properly returns the indexes from the subplans which will remain after
-	 * execution of this function.
-	 *
-	 * We can safely skip this when !do_exec_prune, even though that leaves
-	 * invalid data in prunestate, because that data won't be consulted again
-	 * (cf initial Assert in ExecFindMatchingSubPlans).
+	 * First we must build a temporary array which maps old subplan
+	 * indexes to new ones.  For convenience of initialization, we use
+	 * 1-based indexes in this array and leave pruned items as 0.
 	 */
-	if (prunestate->do_exec_prune && bms_num_members(result) < nsubplans)
+	new_subplan_indexes = (int *) palloc0(sizeof(int) * n_total_subplans);
+	newidx = 1;
+	i = -1;
+	while ((i = bms_next_member(initially_valid_subplans, i)) >= 0)
 	{
-		int		   *new_subplan_indexes;
-		Bitmapset  *new_other_subplans;
-		int			i;
-		int			newidx;
+		Assert(i < n_total_subplans);
+		new_subplan_indexes[i] = newidx++;
+	}
 
-		/*
-		 * First we must build a temporary array which maps old subplan
-		 * indexes to new ones.  For convenience of initialization, we use
-		 * 1-based indexes in this array and leave pruned items as 0.
-		 */
-		new_subplan_indexes = (int *) palloc0(sizeof(int) * nsubplans);
-		newidx = 1;
-		i = -1;
-		while ((i = bms_next_member(result, i)) >= 0)
-		{
-			Assert(i < nsubplans);
-			new_subplan_indexes[i] = newidx++;
-		}
+	/*
+	 * Now we can update each PartitionedRelPruneInfo's subplan_map with
+	 * new subplan indexes.  We must also recompute its present_parts
+	 * bitmap.
+	 */
+	for (i = 0; i < prunestate->num_partprunedata; i++)
+	{
+		PartitionPruningData *prunedata = prunestate->partprunedata[i];
+		int			j;
 
 		/*
-		 * Now we can update each PartitionedRelPruneInfo's subplan_map with
-		 * new subplan indexes.  We must also recompute its present_parts
-		 * bitmap.
+		 * Within each hierarchy, we perform this loop in back-to-front
+		 * order so that we determine present_parts for the lowest-level
+		 * partitioned tables first.  This way we can tell whether a
+		 * sub-partitioned table's partitions were entirely pruned so we
+		 * can exclude it from the current level's present_parts.
 		 */
-		for (i = 0; i < prunestate->num_partprunedata; i++)
+		for (j = prunedata->num_partrelprunedata - 1; j >= 0; j--)
 		{
-			PartitionPruningData *prunedata = prunestate->partprunedata[i];
-			int			j;
+			PartitionedRelPruningData *pprune = &prunedata->partrelprunedata[j];
+			int			nparts = pprune->nparts;
+			int			k;
 
-			/*
-			 * Within each hierarchy, we perform this loop in back-to-front
-			 * order so that we determine present_parts for the lowest-level
-			 * partitioned tables first.  This way we can tell whether a
-			 * sub-partitioned table's partitions were entirely pruned so we
-			 * can exclude it from the current level's present_parts.
-			 */
-			for (j = prunedata->num_partrelprunedata - 1; j >= 0; j--)
-			{
-				PartitionedRelPruningData *pprune = &prunedata->partrelprunedata[j];
-				int			nparts = pprune->nparts;
-				int			k;
+			/* We just rebuild present_parts from scratch */
+			bms_free(pprune->present_parts);
+			pprune->present_parts = NULL;
 
-				/* We just rebuild present_parts from scratch */
-				bms_free(pprune->present_parts);
-				pprune->present_parts = NULL;
+			for (k = 0; k < nparts; k++)
+			{
+				int			oldidx = pprune->subplan_map[k];
+				int			subidx;
 
-				for (k = 0; k < nparts; k++)
+				/*
+				 * If this partition existed as a subplan then change the
+				 * old subplan index to the new subplan index.  The new
+				 * index may become -1 if the partition was pruned above,
+				 * or it may just come earlier in the subplan list due to
+				 * some subplans being removed earlier in the list.  If
+				 * it's a subpartition, add it to present_parts unless
+				 * it's entirely pruned.
+				 */
+				if (oldidx >= 0)
 				{
-					int			oldidx = pprune->subplan_map[k];
-					int			subidx;
-
-					/*
-					 * If this partition existed as a subplan then change the
-					 * old subplan index to the new subplan index.  The new
-					 * index may become -1 if the partition was pruned above,
-					 * or it may just come earlier in the subplan list due to
-					 * some subplans being removed earlier in the list.  If
-					 * it's a subpartition, add it to present_parts unless
-					 * it's entirely pruned.
-					 */
-					if (oldidx >= 0)
-					{
-						Assert(oldidx < nsubplans);
-						pprune->subplan_map[k] = new_subplan_indexes[oldidx] - 1;
+					Assert(oldidx < n_total_subplans);
+					pprune->subplan_map[k] = new_subplan_indexes[oldidx] - 1;
 
-						if (new_subplan_indexes[oldidx] > 0)
-							pprune->present_parts =
-								bms_add_member(pprune->present_parts, k);
-					}
-					else if ((subidx = pprune->subpart_map[k]) >= 0)
-					{
-						PartitionedRelPruningData *subprune;
+					if (new_subplan_indexes[oldidx] > 0)
+						pprune->present_parts =
+							bms_add_member(pprune->present_parts, k);
+				}
+				else if ((subidx = pprune->subpart_map[k]) >= 0)
+				{
+					PartitionedRelPruningData *subprune;
 
-						subprune = &prunedata->partrelprunedata[subidx];
+					subprune = &prunedata->partrelprunedata[subidx];
 
-						if (!bms_is_empty(subprune->present_parts))
-							pprune->present_parts =
-								bms_add_member(pprune->present_parts, k);
-					}
+					if (!bms_is_empty(subprune->present_parts))
+						pprune->present_parts =
+							bms_add_member(pprune->present_parts, k);
 				}
 			}
 		}
+	}
 
-		/*
-		 * We must also recompute the other_subplans set, since indexes in it
-		 * may change.
-		 */
-		new_other_subplans = NULL;
-		i = -1;
-		while ((i = bms_next_member(prunestate->other_subplans, i)) >= 0)
-			new_other_subplans = bms_add_member(new_other_subplans,
-												new_subplan_indexes[i] - 1);
-
-		bms_free(prunestate->other_subplans);
-		prunestate->other_subplans = new_other_subplans;
+	/*
+	 * We must also recompute the other_subplans set, since indexes in it
+	 * may change.
+	 */
+	new_other_subplans = NULL;
+	i = -1;
+	while ((i = bms_next_member(prunestate->other_subplans, i)) >= 0)
+		new_other_subplans = bms_add_member(new_other_subplans,
+											new_subplan_indexes[i] - 1);
 
-		pfree(new_subplan_indexes);
-	}
+	bms_free(prunestate->other_subplans);
+	prunestate->other_subplans = new_other_subplans;
 
-	return result;
+	pfree(new_subplan_indexes);
 }
 
 /*
@@ -2123,11 +2204,16 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate)
 		prunedata = prunestate->partprunedata[i];
 		pprune = &prunedata->partrelprunedata[0];
 
+		/*
+		 * We pass the 1st item belonging to the root table of the hierarchy
+		 * and find_matching_subplans_recurse() takes care of recursing to
+		 * other (lower-level) parents as needed.
+		 */
 		find_matching_subplans_recurse(prunedata, pprune, false, &result);
 
-		/* Expression eval may have used space in node's ps_ExprContext too */
+		/* Expression eval may have used space in ExprContext too */
 		if (pprune->exec_pruning_steps)
-			ResetExprContext(pprune->exec_context.planstate->ps_ExprContext);
+			ResetExprContext(pprune->exec_context.exprcontext);
 	}
 
 	/* Add in any subplans that partition pruning didn't account for */
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 7937f1c88f..5b6d3eb23b 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -138,30 +138,17 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 	{
 		PartitionPruneState *prunestate;
 
-		/* We may need an expression context to evaluate partition exprs */
-		ExecAssignExprContext(estate, &appendstate->ps);
-
-		/* Create the working data structure for pruning. */
-		prunestate = ExecCreatePartitionPruneState(&appendstate->ps,
-												   node->part_prune_info);
+		/*
+		 * Set up pruning data structure.  Initial pruning steps, if any, are
+		 * performed as part of the setup, adding the set of indexes of
+		 * surviving subplans to 'validsubplans'.
+		 */
+		prunestate = ExecInitPartitionPruning(&appendstate->ps,
+											  list_length(node->appendplans),
+											  node->part_prune_info,
+											  &validsubplans);
 		appendstate->as_prune_state = prunestate;
-
-		/* Perform an initial partition prune, if required. */
-		if (prunestate->do_initial_prune)
-		{
-			/* Determine which subplans survive initial pruning */
-			validsubplans = ExecFindInitialMatchingSubPlans(prunestate,
-															list_length(node->appendplans));
-
-			nplans = bms_num_members(validsubplans);
-		}
-		else
-		{
-			/* We'll need to initialize all subplans */
-			nplans = list_length(node->appendplans);
-			Assert(nplans > 0);
-			validsubplans = bms_add_range(NULL, 0, nplans - 1);
-		}
+		nplans = bms_num_members(validsubplans);
 
 		/*
 		 * When no run-time pruning is required and there's at least one
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index 418f89dea8..9a9f29e845 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -86,29 +86,17 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
 	{
 		PartitionPruneState *prunestate;
 
-		/* We may need an expression context to evaluate partition exprs */
-		ExecAssignExprContext(estate, &mergestate->ps);
-
-		prunestate = ExecCreatePartitionPruneState(&mergestate->ps,
-												   node->part_prune_info);
+		/*
+		 * Set up pruning data structure.  Initial pruning steps, if any, are
+		 * performed as part of the setup, adding the set of indexes of
+		 * surviving subplans to 'validsubplans'.
+		 */
+		prunestate = ExecInitPartitionPruning(&mergestate->ps,
+											  list_length(node->mergeplans),
+											  node->part_prune_info,
+											  &validsubplans);
 		mergestate->ms_prune_state = prunestate;
-
-		/* Perform an initial partition prune, if required. */
-		if (prunestate->do_initial_prune)
-		{
-			/* Determine which subplans survive initial pruning */
-			validsubplans = ExecFindInitialMatchingSubPlans(prunestate,
-															list_length(node->mergeplans));
-
-			nplans = bms_num_members(validsubplans);
-		}
-		else
-		{
-			/* We'll need to initialize all subplans */
-			nplans = list_length(node->mergeplans);
-			Assert(nplans > 0);
-			validsubplans = bms_add_range(NULL, 0, nplans - 1);
-		}
+		nplans = bms_num_members(validsubplans);
 
 		/*
 		 * When no run-time pruning is required and there's at least one
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 1bc00826c1..7080cb25d9 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -798,6 +798,7 @@ prune_append_rel_partitions(RelOptInfo *rel)
 
 	/* These are not valid when being called from the planner */
 	context.planstate = NULL;
+	context.exprcontext = NULL;
 	context.exprstates = NULL;
 
 	/* Actual pruning happens here. */
@@ -808,8 +809,8 @@ prune_append_rel_partitions(RelOptInfo *rel)
  * get_matching_partitions
  *		Determine partitions that survive partition pruning
  *
- * Note: context->planstate must be set to a valid PlanState when the
- * pruning_steps were generated with a target other than PARTTARGET_PLANNER.
+ * Note: context->exprcontext must be valid when the pruning_steps were
+ * generated with a target other than PARTTARGET_PLANNER.
  *
  * Returns a Bitmapset of the RelOptInfo->part_rels indexes of the surviving
  * partitions.
@@ -3654,7 +3655,7 @@ match_boolean_partition_clause(Oid partopfamily, Expr *clause, Expr *partkey,
  * exprstate array.
  *
  * Note that the evaluated result may be in the per-tuple memory context of
- * context->planstate->ps_ExprContext, and we may have leaked other memory
+ * context->exprcontext, and we may have leaked other memory
  * there too.  This memory must be recovered by resetting that ExprContext
  * after we're done with the pruning operation (see execPartition.c).
  */
@@ -3677,13 +3678,18 @@ partkey_datum_from_expr(PartitionPruneContext *context,
 		ExprContext *ectx;
 
 		/*
-		 * We should never see a non-Const in a step unless we're running in
-		 * the executor.
+		 * We should never see a non-Const in a step unless the caller has
+		 * passed a valid ExprContext.
+		 *
+		 * When context->planstate is valid, context->exprcontext is the
+		 * same as context->planstate->ps_ExprContext.
 		 */
-		Assert(context->planstate != NULL);
+		Assert(context->planstate != NULL || context->exprcontext != NULL);
+		Assert(context->planstate == NULL ||
+			   (context->exprcontext == context->planstate->ps_ExprContext));
 
 		exprstate = context->exprstates[stateidx];
-		ectx = context->planstate->ps_ExprContext;
+		ectx = context->exprcontext;
 		*value = ExecEvalExprSwitchContext(exprstate, ectx, isnull);
 	}
 }
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 603d8becc4..fd5735a946 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -119,10 +119,9 @@ extern ResultRelInfo *ExecFindPartition(ModifyTableState *mtstate,
 										EState *estate);
 extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
 									PartitionTupleRouting *proute);
-extern PartitionPruneState *ExecCreatePartitionPruneState(PlanState *planstate,
-														  PartitionPruneInfo *partitionpruneinfo);
+extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
+						 int n_total_subplans,
+						 PartitionPruneInfo *pruneinfo,
+						 Bitmapset **initially_valid_subplans);
 extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate);
-extern Bitmapset *ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate,
-												  int nsubplans);
-
 #endif							/* EXECPARTITION_H */
diff --git a/src/include/partitioning/partprune.h b/src/include/partitioning/partprune.h
index ee11b6feae..90684efa25 100644
--- a/src/include/partitioning/partprune.h
+++ b/src/include/partitioning/partprune.h
@@ -41,6 +41,7 @@ struct RelOptInfo;
  *					subsidiary data, such as the FmgrInfos.
  * planstate		Points to the parent plan node's PlanState when called
  *					during execution; NULL when called from the planner.
+ * exprcontext		ExprContext to use when evaluating pruning expressions
  * exprstates		Array of ExprStates, indexed as per PruneCxtStateIdx; one
  *					for each partition key in each pruning step.  Allocated if
  *					planstate is non-NULL, otherwise NULL.
@@ -56,6 +57,7 @@ typedef struct PartitionPruneContext
 	FmgrInfo   *stepcmpfuncs;
 	MemoryContext ppccontext;
 	PlanState  *planstate;
+	ExprContext *exprcontext;
 	ExprState **exprstates;
 } PartitionPruneContext;
 
-- 
2.24.1
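
The index fix-up that this patch moves into PartitionPruneStateFixSubPlanMap() can be illustrated outside the executor. The following is only a minimal sketch under stated assumptions: remap_subplan_indexes() is a hypothetical helper, a plain bool array stands in for the Bitmapset of surviving subplans, and calloc()/free() stand in for palloc0()/pfree(). It is not the patch's code, but it follows the same two steps: build a 1-based old-to-new array in which pruned entries stay 0, then rewrite each subplan_map entry as (new index - 1), so pruned subplans become -1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for the subplan index fix-up.
 *
 * survived[i] tells whether old subplan i survived initial pruning.
 * subplan_map[k] holds the old subplan index of partition k, or -1 if the
 * partition never had a subplan; it is rewritten in place.
 *
 * Returns the number of surviving subplans, or -1 if allocation fails.
 */
static int
remap_subplan_indexes(const bool *survived, int n_total_subplans,
					  int *subplan_map, int nparts)
{
	int		   *new_subplan_indexes;
	int			newidx = 1;
	int			i;

	if (n_total_subplans <= 0)
		return 0;

	/* Zero-initialized, so pruned subplans keep the value 0. */
	new_subplan_indexes = calloc((size_t) n_total_subplans, sizeof(int));
	if (new_subplan_indexes == NULL)
		return -1;

	/* Step 1: assign 1-based new indexes to the survivors, in order. */
	for (i = 0; i < n_total_subplans; i++)
	{
		if (survived[i])
			new_subplan_indexes[i] = newidx++;
	}

	/* Step 2: rewrite the map; a pruned subplan becomes 0 - 1 = -1. */
	for (i = 0; i < nparts; i++)
	{
		int			oldidx = subplan_map[i];

		if (oldidx >= 0)
		{
			assert(oldidx < n_total_subplans);
			subplan_map[i] = new_subplan_indexes[oldidx] - 1;
		}
	}

	free(new_subplan_indexes);
	return newidx - 1;
}
```

The 1-based temporary array is what lets zero-initialization double as the "pruned" marker, which is the convenience the patch's comment refers to.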



  [application/octet-stream] v8-0003-Add-a-plan_tree_walker.patch (3.9K, 4-v8-0003-Add-a-plan_tree_walker.patch)
From 3f3bfe578401c43e578196f46f2bad7d3071411a Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Thu, 3 Mar 2022 16:04:13 +0900
Subject: [PATCH v8 3/4] Add a plan_tree_walker()

Like planstate_tree_walker() but for uninitialized plan trees.
---
 src/backend/nodes/nodeFuncs.c | 116 ++++++++++++++++++++++++++++++++++
 src/include/nodes/nodeFuncs.h |   3 +
 2 files changed, 119 insertions(+)

diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c
index 4789ba6911..51cac40a3e 100644
--- a/src/backend/nodes/nodeFuncs.c
+++ b/src/backend/nodes/nodeFuncs.c
@@ -31,6 +31,10 @@ static bool planstate_walk_subplans(List *plans, bool (*walker) (),
 									void *context);
 static bool planstate_walk_members(PlanState **planstates, int nplans,
 								   bool (*walker) (), void *context);
+static bool plan_walk_subplans(List *plans,
+				   bool (*walker) (),
+				   void *context);
+static bool plan_walk_members(List *plans, bool (*walker) (), void *context);
 
 
 /*
@@ -4645,3 +4649,115 @@ planstate_walk_members(PlanState **planstates, int nplans,
 
 	return false;
 }
+
+/*
+ * plan_tree_walker --- walk plan trees
+ *
+ * The walker has already visited the current node, and so we need only
+ * recurse into any sub-nodes it has.
+ */
+bool
+plan_tree_walker(Plan *plan,
+				 bool (*walker) (),
+				 void *context)
+{
+	/* Guard against stack overflow due to overly complex plan trees */
+	check_stack_depth();
+
+	/* initPlan-s */
+	if (plan_walk_subplans(plan->initPlan, walker, context))
+		return true;
+
+	/* lefttree */
+	if (outerPlan(plan))
+	{
+		if (walker(outerPlan(plan), context))
+			return true;
+	}
+
+	/* righttree */
+	if (innerPlan(plan))
+	{
+		if (walker(innerPlan(plan), context))
+			return true;
+	}
+
+	/* special child plans */
+	switch (nodeTag(plan))
+	{
+		case T_Append:
+			if (plan_walk_members(((Append *) plan)->appendplans,
+								  walker, context))
+				return true;
+			break;
+		case T_MergeAppend:
+			if (plan_walk_members(((MergeAppend *) plan)->mergeplans,
+								  walker, context))
+				return true;
+			break;
+		case T_BitmapAnd:
+			if (plan_walk_members(((BitmapAnd *) plan)->bitmapplans,
+								  walker, context))
+				return true;
+			break;
+		case T_BitmapOr:
+			if (plan_walk_members(((BitmapOr *) plan)->bitmapplans,
+								  walker, context))
+				return true;
+			break;
+		case T_CustomScan:
+			if (plan_walk_members(((CustomScan *) plan)->custom_plans,
+								  walker, context))
+				return true;
+			break;
+		case T_SubqueryScan:
+			if (walker(((SubqueryScan *) plan)->subplan, context))
+				return true;
+			break;
+		default:
+			break;
+	}
+
+	return false;
+}
+
+/*
+ * Walk a list of SubPlans (or initPlans, which also use SubPlan nodes).
+ */
+static bool
+plan_walk_subplans(List *plans,
+				   bool (*walker) (),
+				   void *context)
+{
+	ListCell   *lc;
+	PlannedStmt *plannedstmt = (PlannedStmt *) context;
+
+	foreach(lc, plans)
+	{
+		SubPlan *sp = lfirst_node(SubPlan, lc);
+		Plan *p = list_nth(plannedstmt->subplans, sp->plan_id - 1);
+
+		if (walker(p, context))
+			return true;
+	}
+
+	return false;
+}
+
+/*
+ * Walk the constituent plans of an Append, MergeAppend, BitmapAnd,
+ * BitmapOr, or CustomScan node.
+ */
+static bool
+plan_walk_members(List *plans, bool (*walker) (), void *context)
+{
+	ListCell *lc;
+
+	foreach(lc, plans)
+	{
+		if (walker(lfirst(lc), context))
+			return true;
+	}
+
+	return false;
+}
diff --git a/src/include/nodes/nodeFuncs.h b/src/include/nodes/nodeFuncs.h
index 93c60bde66..fca107ad65 100644
--- a/src/include/nodes/nodeFuncs.h
+++ b/src/include/nodes/nodeFuncs.h
@@ -158,5 +158,8 @@ extern bool raw_expression_tree_walker(Node *node, bool (*walker) (),
 struct PlanState;
 extern bool planstate_tree_walker(struct PlanState *planstate, bool (*walker) (),
 								  void *context);
+struct Plan;
+extern bool plan_tree_walker(struct Plan *plan, bool (*walker) (),
+				 void *context);
 
 #endif							/* NODEFUNCS_H */
-- 
2.24.1
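
The calling convention of plan_tree_walker() mirrors planstate_tree_walker(): the caller's walker function processes the current node itself and then hands the node back to the tree walker, which only recurses into children. Below is a minimal, self-contained sketch of that convention on a toy binary tree. All names here (ToyPlan, toy_plan_tree_walker, CollectContext, collect_relids_walker) are hypothetical and are not part of the patch. A walker returning true aborts the walk, as in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a Plan node: a scanned relid (0 if none) and two children. */
typedef struct ToyPlan
{
	int			relid;
	struct ToyPlan *left;
	struct ToyPlan *right;
} ToyPlan;

typedef bool (*toy_walker_fn) (ToyPlan *plan, void *context);

/*
 * The walker has already visited 'plan', so only recurse into its children.
 * Returning true from the walker stops the traversal immediately.
 */
static bool
toy_plan_tree_walker(ToyPlan *plan, toy_walker_fn walker, void *context)
{
	if (plan->left != NULL && walker(plan->left, context))
		return true;
	if (plan->right != NULL && walker(plan->right, context))
		return true;
	return false;
}

/* Context collecting the relids seen during the walk (capacity is arbitrary). */
typedef struct CollectContext
{
	int			count;
	int			relids[16];
} CollectContext;

/* Example walker: record this node's relid, then recurse via the tree walker. */
static bool
collect_relids_walker(ToyPlan *plan, void *context)
{
	CollectContext *cxt = (CollectContext *) context;

	if (plan == NULL)
		return false;

	if (plan->relid > 0 && cxt->count < 16)
		cxt->relids[cxt->count++] = plan->relid;

	return toy_plan_tree_walker(plan, collect_relids_walker, context);
}
```

A caller starts the traversal by invoking the walker, not the tree walker, on the root node, for example collect_relids_walker(&root, &cxt).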



  [application/octet-stream] v8-0002-Add-Merge-Append.partitioned_rels.patch (17.4K, 5-v8-0002-Add-Merge-Append.partitioned_rels.patch)
From 8b99146c9b8c4826e1434d3f006597681c24cd45 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Thu, 24 Mar 2022 22:47:03 +0900
Subject: [PATCH v8 2/4] Add [Merge]Append.partitioned_rels

To record the RT indexes of all partitioned ancestors leading up to
leaf partitions that are appended by the node.

If a given [Merge]Append node is left out from the plan due to there
being only one element in its list of child subplans, then its
partitioned_rels set is added to PlannerGlobal.elidedAppendPartedRels
that is passed down to the executor through PlannedStmt.

There are no users for partitioned_rels and elidedAppendPartedRels
as of this commit, though a later commit will require the ability
to extract the set of relations that must be locked to make a plan
tree safe for execution by walking the plan tree itself, so having
the partitioned tables also be present in the plan tree will be
helpful.  Note that currently the executor relies on the fact that
the set of relations to be locked can be obtained by simply scanning
the range table that's made available in PlannedStmt along with the
plan tree.
---
 src/backend/nodes/copyfuncs.c           |  3 +++
 src/backend/nodes/outfuncs.c            |  5 +++++
 src/backend/nodes/readfuncs.c           |  3 +++
 src/backend/optimizer/path/joinrels.c   |  9 ++++++++
 src/backend/optimizer/plan/createplan.c | 18 +++++++++++++++-
 src/backend/optimizer/plan/planner.c    |  8 +++++++
 src/backend/optimizer/plan/setrefs.c    | 28 +++++++++++++++++++++++++
 src/backend/optimizer/util/inherit.c    | 16 ++++++++++++++
 src/backend/optimizer/util/relnode.c    | 20 ++++++++++++++++++
 src/include/nodes/pathnodes.h           | 22 +++++++++++++++++++
 src/include/nodes/plannodes.h           | 17 +++++++++++++++
 11 files changed, 148 insertions(+), 1 deletion(-)

diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 56505557bf..29c515d7db 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -106,6 +106,7 @@ _copyPlannedStmt(const PlannedStmt *from)
 	COPY_NODE_FIELD(invalItems);
 	COPY_NODE_FIELD(paramExecTypes);
 	COPY_NODE_FIELD(utilityStmt);
+	COPY_BITMAPSET_FIELD(elidedAppendPartedRels);
 	COPY_LOCATION_FIELD(stmt_location);
 	COPY_SCALAR_FIELD(stmt_len);
 
@@ -254,6 +255,7 @@ _copyAppend(const Append *from)
 	COPY_SCALAR_FIELD(nasyncplans);
 	COPY_SCALAR_FIELD(first_partial_plan);
 	COPY_NODE_FIELD(part_prune_info);
+	COPY_BITMAPSET_FIELD(partitioned_rels);
 
 	return newnode;
 }
@@ -282,6 +284,7 @@ _copyMergeAppend(const MergeAppend *from)
 	COPY_POINTER_FIELD(collations, from->numCols * sizeof(Oid));
 	COPY_POINTER_FIELD(nullsFirst, from->numCols * sizeof(bool));
 	COPY_NODE_FIELD(part_prune_info);
+	COPY_BITMAPSET_FIELD(partitioned_rels);
 
 	return newnode;
 }
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 6e39590730..108ede9af9 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -324,6 +324,7 @@ _outPlannedStmt(StringInfo str, const PlannedStmt *node)
 	WRITE_NODE_FIELD(invalItems);
 	WRITE_NODE_FIELD(paramExecTypes);
 	WRITE_NODE_FIELD(utilityStmt);
+	WRITE_BITMAPSET_FIELD(elidedAppendPartedRels);
 	WRITE_LOCATION_FIELD(stmt_location);
 	WRITE_INT_FIELD(stmt_len);
 }
@@ -444,6 +445,7 @@ _outAppend(StringInfo str, const Append *node)
 	WRITE_INT_FIELD(nasyncplans);
 	WRITE_INT_FIELD(first_partial_plan);
 	WRITE_NODE_FIELD(part_prune_info);
+	WRITE_BITMAPSET_FIELD(partitioned_rels);
 }
 
 static void
@@ -461,6 +463,7 @@ _outMergeAppend(StringInfo str, const MergeAppend *node)
 	WRITE_OID_ARRAY(collations, node->numCols);
 	WRITE_BOOL_ARRAY(nullsFirst, node->numCols);
 	WRITE_NODE_FIELD(part_prune_info);
+	WRITE_BITMAPSET_FIELD(partitioned_rels);
 }
 
 static void
@@ -2404,6 +2407,7 @@ _outPlannerGlobal(StringInfo str, const PlannerGlobal *node)
 	WRITE_BOOL_FIELD(parallelModeOK);
 	WRITE_BOOL_FIELD(parallelModeNeeded);
 	WRITE_CHAR_FIELD(maxParallelHazard);
+	WRITE_BITMAPSET_FIELD(elidedAppendPartedRels);
 }
 
 static void
@@ -2515,6 +2519,7 @@ _outRelOptInfo(StringInfo str, const RelOptInfo *node)
 	WRITE_BOOL_FIELD(partbounds_merged);
 	WRITE_BITMAPSET_FIELD(live_parts);
 	WRITE_BITMAPSET_FIELD(all_partrels);
+	WRITE_BITMAPSET_FIELD(partitioned_rels);
 }
 
 static void
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index c94b2561f0..ce146dd45e 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -1794,6 +1794,7 @@ _readPlannedStmt(void)
 	READ_NODE_FIELD(invalItems);
 	READ_NODE_FIELD(paramExecTypes);
 	READ_NODE_FIELD(utilityStmt);
+	READ_BITMAPSET_FIELD(elidedAppendPartedRels);
 	READ_LOCATION_FIELD(stmt_location);
 	READ_INT_FIELD(stmt_len);
 
@@ -1917,6 +1918,7 @@ _readAppend(void)
 	READ_INT_FIELD(nasyncplans);
 	READ_INT_FIELD(first_partial_plan);
 	READ_NODE_FIELD(part_prune_info);
+	READ_BITMAPSET_FIELD(partitioned_rels);
 
 	READ_DONE();
 }
@@ -1939,6 +1941,7 @@ _readMergeAppend(void)
 	READ_OID_ARRAY(collations, local_node->numCols);
 	READ_BOOL_ARRAY(nullsFirst, local_node->numCols);
 	READ_NODE_FIELD(part_prune_info);
+	READ_BITMAPSET_FIELD(partitioned_rels);
 
 	READ_DONE();
 }
diff --git a/src/backend/optimizer/path/joinrels.c b/src/backend/optimizer/path/joinrels.c
index 9da3ff2f9a..e74d40fee3 100644
--- a/src/backend/optimizer/path/joinrels.c
+++ b/src/backend/optimizer/path/joinrels.c
@@ -1549,6 +1549,15 @@ try_partitionwise_join(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
 		populate_joinrel_with_paths(root, child_rel1, child_rel2,
 									child_joinrel, child_sjinfo,
 									child_restrictlist);
+
+		/*
+		 * A parent relation's partitioned_rels must be a superset of the sets
+		 * of all its children, direct or indirect, so bubble up the child
+		 * joinrel's set.
+		 */
+		joinrel->partitioned_rels =
+			bms_add_members(joinrel->partitioned_rels,
+							child_joinrel->partitioned_rels);
 	}
 }
 
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index 179c87c671..99868a1a79 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -26,10 +26,12 @@
 #include "nodes/extensible.h"
 #include "nodes/makefuncs.h"
 #include "nodes/nodeFuncs.h"
+#include "optimizer/appendinfo.h"
 #include "optimizer/clauses.h"
 #include "optimizer/cost.h"
 #include "optimizer/optimizer.h"
 #include "optimizer/paramassign.h"
+#include "optimizer/pathnode.h"
 #include "optimizer/paths.h"
 #include "optimizer/placeholder.h"
 #include "optimizer/plancat.h"
@@ -1332,11 +1334,11 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
 										 best_path->subpaths,
 										 prunequal);
 	}
-
 	plan->appendplans = subplans;
 	plan->nasyncplans = nasyncplans;
 	plan->first_partial_plan = best_path->first_partial_path;
 	plan->part_prune_info = partpruneinfo;
+	plan->partitioned_rels = bms_copy(rel->partitioned_rels);
 
 	copy_generic_path_info(&plan->plan, (Path *) best_path);
 
@@ -1500,6 +1502,20 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
 	node->mergeplans = subplans;
 	node->part_prune_info = partpruneinfo;
 
+	/*
+	 * We need to explicitly add to the plan node the RT indexes of any
+	 * partitioned tables whose partitions will be scanned by the nodes in
+	 * 'subplans'.  There can be multiple RT indexes in the set due to the
+	 * partition tree being multi-level and/or this being a plan for UNION ALL
+	 * over multiple partition trees.  Along with scanrelids of leaf-level Scan
+	 * nodes, this allows the executor to lock the full set of relations being
+	 * scanned by this node.
+	 *
+	 * Note that 'apprelids' only contains the top-level base relation(s), so
+	 * is not sufficient for the purpose.
+	 */
+	node->partitioned_rels = bms_copy(rel->partitioned_rels);
+
 	/*
 	 * If prepare_sort_from_pathkeys added sort columns, but we were told to
 	 * produce either the exact tlist or a narrow tlist, we should get rid of
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index b2569c5d0c..c769b4b4b9 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -529,6 +529,7 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 	result->paramExecTypes = glob->paramExecTypes;
 	/* utilityStmt should be null, but we might as well copy it */
 	result->utilityStmt = parse->utilityStmt;
+	result->elidedAppendPartedRels = glob->elidedAppendPartedRels;
 	result->stmt_location = parse->stmt_location;
 	result->stmt_len = parse->stmt_len;
 
@@ -7534,6 +7535,13 @@ create_partitionwise_grouping_paths(PlannerInfo *root,
 
 		add_paths_to_append_rel(root, grouped_rel, grouped_live_children);
 	}
+
+	/*
+	 * Input rel might be a partitioned appendrel, though grouped_rel has at
+	 * this point taken its role as the appendrel owning the former's
+	 * children, so copy the former's partitioned_rels set into the latter.
+	 */
+	grouped_rel->partitioned_rels = bms_copy(input_rel->partitioned_rels);
 }
 
 /*
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index bf4c722c02..8214edec54 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -1574,6 +1574,10 @@ set_append_references(PlannerInfo *root,
 		lfirst(l) = set_plan_refs(root, (Plan *) lfirst(l), rtoffset);
 	}
 
+	/* Fix up partitioned_rels before possibly removing the Append below. */
+	aplan->partitioned_rels = offset_relid_set(aplan->partitioned_rels,
+											   rtoffset);
+
 	/*
 	 * See if it's safe to get rid of the Append entirely.  For this to be
 	 * safe, there must be only one child plan and that child plan's parallel
@@ -1584,8 +1588,17 @@ set_append_references(PlannerInfo *root,
 	 */
 	if (list_length(aplan->appendplans) == 1 &&
 		((Plan *) linitial(aplan->appendplans))->parallel_aware == aplan->plan.parallel_aware)
+	{
+		/*
+		 * Partitioned tables involved, if any, must be made known to the
+		 * executor.
+		 */
+		root->glob->elidedAppendPartedRels =
+			bms_add_members(root->glob->elidedAppendPartedRels,
+							aplan->partitioned_rels);
 		return clean_up_removed_plan_level((Plan *) aplan,
 										   (Plan *) linitial(aplan->appendplans));
+	}
 
 	/*
 	 * Otherwise, clean up the Append as needed.  It's okay to do this after
@@ -1646,6 +1659,12 @@ set_mergeappend_references(PlannerInfo *root,
 		lfirst(l) = set_plan_refs(root, (Plan *) lfirst(l), rtoffset);
 	}
 
+	/*
+	 * Fix up partitioned_rels before possibly removing the MergeAppend below.
+	 */
+	mplan->partitioned_rels = offset_relid_set(mplan->partitioned_rels,
+											   rtoffset);
+
 	/*
 	 * See if it's safe to get rid of the MergeAppend entirely.  For this to
 	 * be safe, there must be only one child plan and that child plan's
@@ -1656,8 +1675,17 @@ set_mergeappend_references(PlannerInfo *root,
 	 */
 	if (list_length(mplan->mergeplans) == 1 &&
 		((Plan *) linitial(mplan->mergeplans))->parallel_aware == mplan->plan.parallel_aware)
+	{
+		/*
+		 * Partitioned tables involved, if any, must be made known to the
+		 * executor.
+		 */
+		root->glob->elidedAppendPartedRels =
+			bms_add_members(root->glob->elidedAppendPartedRels,
+							mplan->partitioned_rels);
 		return clean_up_removed_plan_level((Plan *) mplan,
 										   (Plan *) linitial(mplan->mergeplans));
+	}
 
 	/*
 	 * Otherwise, clean up the MergeAppend as needed.  It's okay to do this
diff --git a/src/backend/optimizer/util/inherit.c b/src/backend/optimizer/util/inherit.c
index 7e134822f3..56912e4101 100644
--- a/src/backend/optimizer/util/inherit.c
+++ b/src/backend/optimizer/util/inherit.c
@@ -406,6 +406,14 @@ expand_partitioned_rtentry(PlannerInfo *root, RelOptInfo *relinfo,
 									   childrte, childRTindex,
 									   childrel, top_parentrc, lockmode);
 
+		/*
+		 * A parent relation's partitioned_rels must be a superset of the sets
+		 * of all its children, direct or indirect, so bubble up the child
+		 * rel's set.
+		 */
+		relinfo->partitioned_rels = bms_add_members(relinfo->partitioned_rels,
+													childrelinfo->partitioned_rels);
+
 		/* Close child relation, but keep locks */
 		table_close(childrel, NoLock);
 	}
@@ -737,6 +745,14 @@ expand_appendrel_subquery(PlannerInfo *root, RelOptInfo *rel,
 		/* Child may itself be an inherited rel, either table or subquery. */
 		if (childrte->inh)
 			expand_inherited_rtentry(root, childrel, childrte, childRTindex);
+
+		/*
+		 * A parent relation's partitioned_rels must be a superset of the sets
+		 * of all its children, direct or indirect, so bubble up the child
+		 * rel's set.
+		 */
+		rel->partitioned_rels = bms_add_members(rel->partitioned_rels,
+												childrel->partitioned_rels);
 	}
 }
 
diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c
index 520409f4ba..1d082a8fdd 100644
--- a/src/backend/optimizer/util/relnode.c
+++ b/src/backend/optimizer/util/relnode.c
@@ -361,6 +361,10 @@ build_simple_rel(PlannerInfo *root, int relid, RelOptInfo *parent)
 		}
 	}
 
+	/* A partitioned appendrel. */
+	if (rel->part_scheme != NULL)
+		rel->partitioned_rels = bms_copy(rel->relids);
+
 	/* Save the finished struct in the query's simple_rel_array */
 	root->simple_rel_array[relid] = rel;
 
@@ -729,6 +733,14 @@ build_join_rel(PlannerInfo *root,
 	set_joinrel_size_estimates(root, joinrel, outer_rel, inner_rel,
 							   sjinfo, restrictlist);
 
+	/*
+	 * The joinrel may get processed as an appendrel via partitionwise join
+	 * if both outer and inner rels are partitioned, so set partitioned_rels
+	 * appropriately.
+	 */
+	joinrel->partitioned_rels = bms_union(outer_rel->partitioned_rels,
+										  inner_rel->partitioned_rels);
+
 	/*
 	 * Set the consider_parallel flag if this joinrel could potentially be
 	 * scanned within a parallel worker.  If this flag is false for either
@@ -897,6 +909,14 @@ build_child_join_rel(PlannerInfo *root, RelOptInfo *outer_rel,
 	set_joinrel_size_estimates(root, joinrel, outer_rel, inner_rel,
 							   sjinfo, restrictlist);
 
+	/*
+	 * The joinrel may get processed as an appendrel via partitionwise join
+	 * if both outer and inner rels are partitioned, so set partitioned_rels
+	 * appropriately.
+	 */
+	joinrel->partitioned_rels = bms_union(outer_rel->partitioned_rels,
+										  inner_rel->partitioned_rels);
+
 	/* We build the join only once. */
 	Assert(!find_join_rel(root, joinrel->relids));
 
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 6cbcb67bdf..ef9b54739a 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -130,6 +130,11 @@ typedef struct PlannerGlobal
 	char		maxParallelHazard;	/* worst PROPARALLEL hazard level */
 
 	PartitionDirectory partition_directory; /* partition descriptors */
+
+	Bitmapset *elidedAppendPartedRels;	/* Combined partitioned_rels of all
+										 * single-subplan [Merge]Append nodes
+										 * that have been removed from the
+										 * various plan trees. */
 } PlannerGlobal;
 
 /* macro for fetching the Plan associated with a SubPlan node */
@@ -773,6 +778,23 @@ typedef struct RelOptInfo
 	Relids		all_partrels;	/* Relids set of all partition relids */
 	List	  **partexprs;		/* Non-nullable partition key expressions */
 	List	  **nullable_partexprs; /* Nullable partition key expressions */
+
+	/*
+	 * For an appendrel parent relation (base, join, or upper) that is
+	 * partitioned, this stores the RT indexes of all the partitioned ancestors
+	 * including itself that lead up to the individual leaf partitions that
+	 * will be scanned to produce this relation's output rows.  The relid set
+	 * is copied into the resulting Append or MergeAppend plan node for
+	 * allowing the executor to take appropriate locks on those relations,
+	 * unless the node is deemed useless in setrefs.c due to having a single
+	 * leaf subplan and thus elided from the final plan, in which case, the set
+	 * is added into PlannerGlobal.elidedAppendPartedRels.
+	 *
+	 * Note that 'apprelids' of those nodes only contains the top-level base
+	 * relation(s), so is not sufficient for said purpose.
+	 */
+
+	Bitmapset  *partitioned_rels;
 } RelOptInfo;
 
 /*
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 50ef3dda05..a823c7c20d 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -86,6 +86,11 @@ typedef struct PlannedStmt
 
 	Node	   *utilityStmt;	/* non-null if this is utility stmt */
 
+	Bitmapset *elidedAppendPartedRels;	/* Combined partitioned_rels of all
+										 * single-subplan [Merge]Append nodes
+										 * that have been removed from the
+										 * various plan trees. */
+
 	/* statement location in source string (copied from Query) */
 	int			stmt_location;	/* start location, or -1 if unknown */
 	int			stmt_len;		/* length in bytes; 0 means "rest of string" */
@@ -264,6 +269,12 @@ typedef struct Append
 
 	/* Info for run-time subplan pruning; NULL if we're not doing that */
 	struct PartitionPruneInfo *part_prune_info;
+
+	/*
+	 * RT indexes of all partitioned parents whose partitions' plans are
+	 * present in appendplans.
+	 */
+	Bitmapset  *partitioned_rels;
 } Append;
 
 /* ----------------
@@ -284,6 +295,12 @@ typedef struct MergeAppend
 	bool	   *nullsFirst;		/* NULLS FIRST/LAST directions */
 	/* Info for run-time subplan pruning; NULL if we're not doing that */
 	struct PartitionPruneInfo *part_prune_info;
+
+	/*
+	 * RT indexes of all partitioned parents whose partitions' plans are
+	 * present in mergeplans.
+	 */
+	Bitmapset  *partitioned_rels;
 } MergeAppend;
 
 /* ----------------
-- 
2.24.1



* Re: generic plans and "initial" pruning
@ 2022-03-31 09:56  Alvaro Herrera <[email protected]>
  parent: Amit Langote <[email protected]>
  1 sibling, 1 reply; 108+ messages in thread

From: Alvaro Herrera @ 2022-03-31 09:56 UTC (permalink / raw)
  To: Amit Langote <[email protected]>; +Cc: Robert Haas <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers; David Rowley <[email protected]>

I'm looking at 0001 here with intention to commit later.  I see that
there is some resistance to 0004, but I think a final verdict on that
one doesn't materially affect 0001.

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"El destino baraja y nosotros jugamos" (A. Schopenhauer)





* Re: generic plans and "initial" pruning
@ 2022-03-31 11:11  Amit Langote <[email protected]>
  parent: Alvaro Herrera <[email protected]>
  0 siblings, 0 replies; 108+ messages in thread

From: Amit Langote @ 2022-03-31 11:11 UTC (permalink / raw)
  To: Alvaro Herrera <[email protected]>; +Cc: Robert Haas <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers; David Rowley <[email protected]>

On Thu, Mar 31, 2022 at 6:55 PM Alvaro Herrera <[email protected]> wrote:
> I'm looking at 0001 here with intention to commit later.  I see that
> there is some resistance to 0004, but I think a final verdict on that
> one doesn't materially affect 0001.

Thanks.

While the main goal of the refactoring patch is to make it easier to
review the more complex changes that 0004 makes to execPartition.c, I
agree it has merit on its own.  Although, one may say that the bit
about providing a PlanState-independent ExprContext is more closely
tied with 0004's requirements...

-- 
Amit Langote
EDB: http://www.enterprisedb.com





* Re: generic plans and "initial" pruning
@ 2022-04-01 01:31  David Rowley <[email protected]>
  parent: Amit Langote <[email protected]>
  1 sibling, 1 reply; 108+ messages in thread

From: David Rowley @ 2022-04-01 01:31 UTC (permalink / raw)
  To: Amit Langote <[email protected]>; +Cc: Robert Haas <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers

On Thu, 31 Mar 2022 at 16:25, Amit Langote <[email protected]> wrote:
> Rebased.

I've been looking over the v8 patch and I'd like to propose semi-baked
ideas to improve things.  I'd need to go and write them myself to
fully know if they'd actually work ok.

1. You've changed the signature of various functions by adding
ExecLockRelsInfo *execlockrelsinfo.  I'm wondering why you didn't just
put the ExecLockRelsInfo as a new field in PlannedStmt?

I think the above gets around messing with the signatures of
CreateQueryDesc(), ExplainOnePlan(), pg_plan_queries(),
PortalDefineQuery(), and ProcessQuery().  It would get rid of your
change of foreach to forboth in execute_sql_string() / PortalRunMulti()
and of a number of places where you're carrying around a variable
named execlockrelsinfo_list.  It would also make the patch
significantly easier to review as you'd be touching far fewer files.

2. I don't really like the way you've gone about most of the patch...

The way I imagine this working is that during create_plan() we visit
all nodes that have run-time pruning, then inside create_append_plan()
and create_merge_append_plan() we'd tag those onto a new field in
PlannerGlobal.  That way you can store the PartitionPruneInfos in the
new PlannedStmt field in standard_planner() after the
makeNode(PlannedStmt).

Instead of storing the PartitionPruneInfo in the Append / MergeAppend
struct, you'd just add a new index field to those structs. The index
would start with 0 for the 0th PartitionPruneInfo. You'd basically
just know the index by assigning
list_length(root->glob->partitionpruneinfos).

You'd then assign the root->glob->partitionpruneinfos to
PlannedStmt.partitionpruneinfos and anytime you needed to do run-time
pruning during execution, you'd need to use the Append / MergeAppend's
partition_prune_info_idx to lookup the PartitionPruneInfo in some new
field you add to EState to store those.  You'd leave that index as -1
if there's no PartitionPruneInfo for the Append / MergeAppend node.

When you do AcquireExecutorLocks(), you'd iterate over the
PlannedStmt's PartitionPruneInfo to figure out which subplans to
prune. You'd then have an array sized
list_length(plannedstmt->runtimepruneinfos) where you'd store the
result.  When the Append/MergeAppend node starts up you just check if
the part_prune_info_idx >= 0 and if there's a non-NULL result stored
then use that result.  That's how you'd ensure you always got the same
run-time prune result between locking and plan startup.
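
As a rough illustration of the indexing scheme described above, here is
a minimal standalone sketch.  It is not code from the patch: every type
and function name in it (PruneInfoSketch, attach_prune_info(), and so
on) is a hypothetical stand-in, and "pruning" is reduced to matching one
integer parameter against per-subplan value ranges.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_PRUNE_INFOS 16
#define MAX_SUBPLANS 8

/* Hypothetical stand-in for PartitionPruneInfo: one value range per subplan. */
typedef struct PruneInfoSketch
{
	int			nsubplans;
	int			lo[MAX_SUBPLANS];	/* inclusive lower bound */
	int			hi[MAX_SUBPLANS];	/* exclusive upper bound */
} PruneInfoSketch;

/* Hypothetical stand-in for PlannedStmt.partitionpruneinfos. */
typedef struct PlannedStmtSketch
{
	int			npruneinfos;
	PruneInfoSketch pruneinfos[MAX_PRUNE_INFOS];
} PlannedStmtSketch;

/* Hypothetical stand-in for an Append/MergeAppend node. */
typedef struct AppendSketch
{
	int			nsubplans;
	int			part_prune_info_idx;	/* -1 means no PartitionPruneInfo */
} AppendSketch;

/* Cached outcome of initial pruning for one PartitionPruneInfo. */
typedef struct PruneResultSketch
{
	bool		valid;
	unsigned int surviving;		/* bit i set => subplan i survives */
} PruneResultSketch;

/* Plan time: the index is simply the list length before appending. */
static void
attach_prune_info(PlannedStmtSketch *stmt, AppendSketch *node,
				  const PruneInfoSketch *info)
{
	assert(stmt->npruneinfos < MAX_PRUNE_INFOS);
	node->part_prune_info_idx = stmt->npruneinfos;
	stmt->pruneinfos[stmt->npruneinfos++] = *info;
}

/* Lock time: prune once per info; results has stmt->npruneinfos entries. */
static void
run_initial_pruning(const PlannedStmtSketch *stmt, int param,
					PruneResultSketch *results)
{
	for (int i = 0; i < stmt->npruneinfos; i++)
	{
		const PruneInfoSketch *info = &stmt->pruneinfos[i];

		results[i].valid = true;
		results[i].surviving = 0;
		for (int j = 0; j < info->nsubplans; j++)
		{
			if (param >= info->lo[j] && param < info->hi[j])
				results[i].surviving |= 1u << j;
		}
	}
}

/* Node startup: reuse the cached result rather than pruning again. */
static unsigned int
subplans_to_init(const AppendSketch *node, const PruneResultSketch *results)
{
	int			idx = node->part_prune_info_idx;

	if (idx >= 0 && results[idx].valid)
		return results[idx].surviving;
	return (1u << node->nsubplans) - 1;	/* no pruning: all subplans */
}
```

Because node startup only reads the cached result, locking and plan
startup cannot disagree about which subplans survived, which is the
property the paragraph above is after.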

3. Also, looking at ExecGetLockRels(), shouldn't it be the planner's
job to determine the minimum set of relations which must be locked?  I
think the plan tree traversal during execution is not great.  Seems the
whole point of this patch is to reduce overhead during execution. A
full additional plan traversal aside from the 3 that we already do for
start/run/end of execution seems not great.

I think this means that during AcquireExecutorLocks() you'd start with
the minimum set of RTEs that need to be locked as determined during
create_plan() and stored in some Bitmapset field in PlannedStmt. This
minimal set would exclude only those RTIs that could possibly be
used only via a PartitionPruneInfo with initial pruning steps, i.e. it
would include RTIs from PartitionPruneInfos with no init pruning steps
(you can't skip any locks for those).  All you need to do to determine
the RTEs to lock is to take the minimal set and execute each
PartitionPruneInfo in the PlannedStmt that has init steps.
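
A minimal standalone sketch of that lock-set computation follows.  It
assumes a 64-bit mask in place of a Bitmapset and a toy pruning rule
(a leaf survives only if its key equals the parameter); RtiSet,
InitPruneSketch and rtis_to_lock() are hypothetical names, not anything
from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_LEAVES 8

/* Bit i set => RT index i is a member (hypothetical Bitmapset stand-in). */
typedef uint64_t RtiSet;

/* Hypothetical stand-in for a PartitionPruneInfo that has init steps. */
typedef struct InitPruneSketch
{
	int			nleaves;
	int			leaf_rti[MAX_LEAVES];	/* RT index of each leaf partition */
	int			leaf_key[MAX_LEAVES];	/* leaf accepts exactly this key */
} InitPruneSketch;

/*
 * Start from the planner-computed minimal set and add the RT indexes of
 * the leaf partitions that survive initial pruning for 'param'.
 */
static RtiSet
rtis_to_lock(RtiSet minimal, const InitPruneSketch *infos, int ninfos,
			 int param)
{
	RtiSet		result = minimal;

	for (int i = 0; i < ninfos; i++)
	{
		const InitPruneSketch *info = &infos[i];

		for (int j = 0; j < info->nleaves; j++)
		{
			assert(info->leaf_rti[j] > 0 && info->leaf_rti[j] < 64);
			if (info->leaf_key[j] == param)
				result |= (RtiSet) 1 << info->leaf_rti[j];
		}
	}
	return result;
}
```

With no PartitionPruneInfos carrying init steps, the loop does nothing
and the minimal set is returned unchanged.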

4. It's a bit disappointing to see RelOptInfo.partitioned_rels getting
revived here.  Why don't you just add a partitioned_relids to
PartitionPruneInfo and just have make_partitionedrel_pruneinfo build
you a Relids of them. PartitionedRelPruneInfo already has an rtindex
field, so you just need to bms_add_member whatever that rtindex is.

It's a fairly high-level review at this stage. I can look in more
detail if the above points get looked at.  You may find or know of
some reason why it can't be done like I mention above.

David






* Re: generic plans and "initial" pruning
@ 2022-04-01 03:09  Amit Langote <[email protected]>
  parent: David Rowley <[email protected]>
  0 siblings, 2 replies; 108+ messages in thread

From: Amit Langote @ 2022-04-01 03:09 UTC (permalink / raw)
  To: David Rowley <[email protected]>; +Cc: Robert Haas <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers

Thanks a lot for looking into this.

On Fri, Apr 1, 2022 at 10:32 AM David Rowley <[email protected]> wrote:
> I've been looking over the v8 patch and I'd like to propose semi-baked
> ideas to improve things.  I'd need to go and write them myself to
> fully know if they'd actually work ok.
>
> 1. You've changed the signature of various functions by adding
> ExecLockRelsInfo *execlockrelsinfo.  I'm wondering why you didn't just
> put the ExecLockRelsInfo as a new field in PlannedStmt?
>
> I think the above gets around messing with the signatures of
> CreateQueryDesc(), ExplainOnePlan(), pg_plan_queries(),
> PortalDefineQuery(), and ProcessQuery().  It would get rid of your
> change of foreach to forboth in execute_sql_string() / PortalRunMulti()
> and of a number of places where you're carrying around a variable
> named execlockrelsinfo_list.  It would also make the patch
> significantly easier to review as you'd be touching far fewer files.

I'm worried about that churn myself and did consider this idea, though
I couldn't shake the feeling that it's maybe wrong to put something in
PlannedStmt that the planner itself doesn't produce.  I mean the
definition of PlannedStmt says this:

/* ----------------
 *      PlannedStmt node
 *
 * The output of the planner

With the ideas that you've outlined below, perhaps we can frame most
of the things that the patch wants to do as the planner and the
plancache changes.  If we twist the above definition a bit to say what
the plancache does in this regard is part of planning, maybe it makes
sense to add the initial pruning related fields (nodes, outputs) into
PlannedStmt.

> 2. I don't really like the way you've gone about most of the patch...
>
> The way I imagine this working is that during create_plan() we visit
> all nodes that have run-time pruning, then inside create_append_plan()
> and create_merge_append_plan() we'd tag those onto a new field in
> PlannerGlobal.  That way you can store the PartitionPruneInfos in the
> new PlannedStmt field in standard_planner() after the
> makeNode(PlannedStmt).
>
> Instead of storing the PartitionPruneInfo in the Append / MergeAppend
> struct, you'd just add a new index field to those structs. The index
> would start with 0 for the 0th PartitionPruneInfo. You'd basically
> just know the index by assigning
> list_length(root->glob->partitionpruneinfos).
>
> You'd then assign the root->glob->partitionpruneinfos to
> PlannedStmt.partitionpruneinfos and anytime you needed to do run-time
> pruning during execution, you'd need to use the Append / MergeAppend's
> partition_prune_info_idx to lookup the PartitionPruneInfo in some new
> field you add to EState to store those.  You'd leave that index as -1
> if there's no PartitionPruneInfo for the Append / MergeAppend node.
>
> When you do AcquireExecutorLocks(), you'd iterate over the
> PlannedStmt's PartitionPruneInfo to figure out which subplans to
> prune. You'd then have an array sized
> list_length(plannedstmt->runtimepruneinfos) where you'd store the
> result.  When the Append/MergeAppend node starts up you just check if
> the part_prune_info_idx >= 0 and if there's a non-NULL result stored
> then use that result.  That's how you'd ensure you always got the same
> run-time prune result between locking and plan startup.

Actually, Robert too suggested such an idea to me off-list and I think
it's worth trying.  I was not sure about the implementation, because
then we'd be passing around lists of initial pruning nodes/results
across many function/module boundaries that you mentioned in your
comment 1, but if we agree that PlannedStmt is an acceptable place for
those things to be stored, then I agree it's an attractive idea.

> 3. Also, looking at ExecGetLockRels(), shouldn't it be the planner's
> job to determine the minimum set of relations which must be locked?  I
> think the plan tree traversal during execution is not great.  Seems the
> whole point of this patch is to reduce overhead during execution. A
> full additional plan traversal aside from the 3 that we already do for
> start/run/end of execution seems not great.
>
> I think this means that during AcquireExecutorLocks() you'd start with
> the minimum set of RTEs that need to be locked as determined during
> create_plan() and stored in some Bitmapset field in PlannedStmt.

The patch did have a PlannedStmt.lockrels till v6.  Though, it wasn't
the same thing as you are describing it...

> This
> minimal set would also only exclude RTIs that would only possibly be
> used due to a PartitionPruneInfo with initial pruning steps, i.e.
> include RTIs from PartitionPruneInfo with no init pruning steps (you
> can't skip any locks for those).  All you need to do to determine the
> RTEs to lock is to take the minimal set and execute each
> PartitionPruneInfo in the PlannedStmt that has init steps.

So just thinking about an Append/MergeAppend, the minimum set must
include the RT indexes of all the partitioned tables whose direct and
indirect children's plans will be in 'subplans' and also of the
children if the PartitionPruneInfo doesn't contain initial steps or if
there is no PartitionPruneInfo to begin with.

One question is whether the planner should always pay the overhead of
initializing this bitmapset?  I mean it's only worthwhile if
AcquireExecutorLocks() is going to be involved, that is, the plan will
be cached and reused.

> 4. It's a bit disappointing to see RelOptInfo.partitioned_rels getting
> revived here.  Why don't you just add a partitioned_relids to
> PartitionPruneInfo and just have make_partitionedrel_pruneinfo build
> you a Relids of them. PartitionedRelPruneInfo already has an rtindex
> field, so you just need to bms_add_member whatever that rtindex is.

Hmm, not all Append/MergeAppend nodes in the plan tree may have
make_partition_pruneinfo() called on them though.

If we don't use the proposed RelOptInfo.partitioned_rels, which is
populated in the early planning stages, the only reliable way to get
all the partitioned tables involved in Appends/MergeAppends at the
create_plan() stage seems to be to make a function out of the stanza at
the top of make_partition_pruneinfo(), which collects them by scanning
the leaf paths and tracing each path's relation's parents up to the
root partitioned parent, and to call it from
create_{merge_}append_plan() if make_partition_pruneinfo() was not
called.  I did try to implement that and found it a bit complex and
expensive (the scanning-the-leaf-paths part).

> It's a fairly high-level review at this stage. I can look in more
> detail if the above points get looked at.  You may find or know of
> some reason why it can't be done like I mention above.

I'll try to write a version with the above points addressed, while
keeping RelOptInfo.partitioned_rels around for now.

-- 
Amit Langote
EDB: http://www.enterprisedb.com

[1] https://www.postgresql.org/message-id/CA%2BHiwqH9-fAvpG-w9qYCcDWzK3vGPCMyw4f9nHzqkxXVuD1pxw%40mail.g...






* Re: generic plans and "initial" pruning
@ 2022-04-01 03:45  Tom Lane <[email protected]>
  parent: Amit Langote <[email protected]>
  1 sibling, 1 reply; 108+ messages in thread

From: Tom Lane @ 2022-04-01 03:45 UTC (permalink / raw)
  To: Amit Langote <[email protected]>; +Cc: David Rowley <[email protected]>; Robert Haas <[email protected]>; pgsql-hackers

Amit Langote <[email protected]> writes:
> On Fri, Apr 1, 2022 at 10:32 AM David Rowley <[email protected]> wrote:
>> 1. You've changed the signature of various functions by adding
>> ExecLockRelsInfo *execlockrelsinfo.  I'm wondering why you didn't just
>> put the ExecLockRelsInfo as a new field in PlannedStmt?

> I'm worried about that churn myself and did consider this idea, though
> I couldn't shake the feeling that it's maybe wrong to put something in
> PlannedStmt that the planner itself doesn't produce.

PlannedStmt is part of the plan tree, which MUST be read-only to
the executor.  This is not negotiable.  However, there are other
places that this data could be put, such as QueryDesc.
Or for that matter, couldn't the data structure be created by
the planner?  (It looks like David is proposing exactly that
further down.)

			regards, tom lane






* Re: generic plans and "initial" pruning
@ 2022-04-01 04:08  David Rowley <[email protected]>
  parent: Amit Langote <[email protected]>
  1 sibling, 1 reply; 108+ messages in thread

From: David Rowley @ 2022-04-01 04:08 UTC (permalink / raw)
  To: Amit Langote <[email protected]>; +Cc: Robert Haas <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers

On Fri, 1 Apr 2022 at 16:09, Amit Langote <[email protected]> wrote:
> definition of PlannedStmt says this:
>
> /* ----------------
>  *      PlannedStmt node
>  *
>  * The output of the planner
>
> With the ideas that you've outlined below, perhaps we can frame most
> of the things that the patch wants to do as the planner and the
> plancache changes.  If we twist the above definition a bit to say what
> the plancache does in this regard is part of planning, maybe it makes
> sense to add the initial pruning related fields (nodes, outputs) into
> PlannedStmt.

How about the PartitionPruneInfos go into PlannedStmt as a List
indexed in the way I mentioned and the cache of the results of pruning
in EState?

I think that leaves you adding  List *partpruneinfos,  Bitmapset
*minimumlockrtis to PlannedStmt and the thing you have to cache the
pruning results into EState.   I'm not very clear on where you should
stash the results of run-time pruning in the meantime before you can
put them in EState.  You might need to invent some intermediate struct
that gets passed around that you can scribble down some details you're
going to need during execution.

> One question is whether the planner should always pay the overhead of
> initializing this bitmapset?  I mean it's only worthwhile if
> AcquireExecutorLocks() is going to be involved, that is, the plan will
> be cached and reused.

Maybe the Bitmapset for the minimal locks needs to be built with
bms_add_range(NULL, 0, list_length(rtable));  then do
bms_del_members() on the relevant RTIs you find in the listed
PartitionPruneInfos.  That way it's very simple and cheap to do when
there are no PartitionPruneInfos.
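
To make the add-then-delete idea concrete, here is a small standalone
sketch.  It uses a 64-bit mask as a hypothetical stand-in for a
Bitmapset, and rtiset_add_range() / rtiset_del_members() only mimic the
behaviour of bms_add_range() and bms_del_members(); they are not the
real functions.  Since RT indexes are 1-based, the sketch fills the
range 1..rtable_length.

```c
#include <assert.h>
#include <stdint.h>

/* Bit i set => RT index i is a member (hypothetical Bitmapset stand-in). */
typedef uint64_t RtiSet;

/* Add every member in [lower, upper], both ends inclusive. */
static RtiSet
rtiset_add_range(RtiSet set, int lower, int upper)
{
	assert(lower >= 0 && upper < 64 && lower <= upper);
	for (int i = lower; i <= upper; i++)
		set |= (RtiSet) 1 << i;
	return set;
}

/* Remove every member of 'del' from 'set'. */
static RtiSet
rtiset_del_members(RtiSet set, RtiSet del)
{
	return set & ~del;
}

/*
 * Start with every RT index in the range table, then remove the ones
 * that initial pruning might eliminate.  With nprunable == 0 this costs
 * a single range fill.
 */
static RtiSet
build_minimal_lock_set(int rtable_length, const RtiSet *prunable,
					   int nprunable)
{
	RtiSet		set = 0;

	if (rtable_length == 0)
		return set;
	set = rtiset_add_range(set, 1, rtable_length);
	for (int i = 0; i < nprunable; i++)
		set = rtiset_del_members(set, prunable[i]);
	return set;
}
```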

> > 4. It's a bit disappointing to see RelOptInfo.partitioned_rels getting
> > revived here.  Why don't you just add a partitioned_relids to
> > PartitionPruneInfo and just have make_partitionedrel_pruneinfo build
> > you a Relids of them. PartitionedRelPruneInfo already has an rtindex
> > field, so you just need to bms_add_member whatever that rtindex is.
>
> Hmm, not all Append/MergeAppend nodes in the plan tree may have
> make_partition_pruneinfo() called on them though.

For Append/MergeAppends without run-time pruning you'll want to add
the RTIs to the minimal locking set of RTIs to go into PlannedStmt.
The only things you want to leave out of that are RTIs for the RTEs
that you might run-time prune away during AcquireExecutorLocks().

David






* Re: generic plans and "initial" pruning
@ 2022-04-01 06:58  Amit Langote <[email protected]>
  parent: David Rowley <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: Amit Langote @ 2022-04-01 06:58 UTC (permalink / raw)
  To: David Rowley <[email protected]>; +Cc: Robert Haas <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers

On Fri, Apr 1, 2022 at 1:08 PM David Rowley <[email protected]> wrote:
> On Fri, 1 Apr 2022 at 16:09, Amit Langote <[email protected]> wrote:
> > definition of PlannedStmt says this:
> >
> > /* ----------------
> >  *      PlannedStmt node
> >  *
> >  * The output of the planner
> >
> > With the ideas that you've outlined below, perhaps we can frame most
> > of the things that the patch wants to do as the planner and the
> > plancache changes.  If we twist the above definition a bit to say what
> > the plancache does in this regard is part of planning, maybe it makes
> > sense to add the initial pruning related fields (nodes, outputs) into
> > PlannedStmt.
>
> How about the PartitionPruneInfos go into PlannedStmt as a List
> indexed in the way I mentioned and the cache of the results of pruning
> in EState?
>
> I think that leaves you adding  List *partpruneinfos,  Bitmapset
> *minimumlockrtis to PlannedStmt and the thing you have to cache the
> pruning results into EState.   I'm not very clear on where you should
> stash the results of run-time pruning in the meantime before you can
> put them in EState.  You might need to invent some intermediate struct
> that gets passed around that you can scribble down some details you're
> going to need during execution.

Yes, the ExecLockRelsInfo node in the current patch, that first gets
added to the QueryDesc and subsequently to the EState of the query,
serves as that stashing place.  Not sure if you've looked at
ExecLockRelsInfo in detail in your review of the patch so far, but it
carries the initial pruning result in what are called
PlanInitPruningOutput nodes, which are stored in a list in
ExecLockRelsInfo and their offsets in the list are in turn stored in
an adjacent array that contains an element for every plan node in the
tree.  If we go with a PlannedStmt.partpruneinfos list, then maybe we
don't need to have that array, because the Append/MergeAppend nodes
would be carrying those offsets by themselves.
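
For readers who have not looked at that part of the patch, the layout
described above might be pictured roughly as follows.  This is a
simplified standalone sketch, not the patch's actual definitions: the
struct and function names are hypothetical, the pruning output is
reduced to a bitmask, and the use of -1 as the "no output" marker is an
assumption.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PLAN_NODES 64
#define MAX_OUTPUTS 16

/* Hypothetical stand-in for PlanInitPruningOutput. */
typedef struct InitPruningOutputSketch
{
	unsigned int surviving;		/* bit i set => subplan i survives */
} InitPruningOutputSketch;

/* Hypothetical stand-in for ExecLockRelsInfo. */
typedef struct ExecLockRelsInfoSketch
{
	int			noutputs;
	InitPruningOutputSketch outputs[MAX_OUTPUTS];	/* the "list" */
	int			nplannodes;
	int			offsets[MAX_PLAN_NODES];	/* one slot per plan node;
											 * -1 (assumed) = no output */
} ExecLockRelsInfoSketch;

static void
lockrels_init(ExecLockRelsInfoSketch *info, int nplannodes)
{
	assert(nplannodes >= 0 && nplannodes <= MAX_PLAN_NODES);
	info->noutputs = 0;
	info->nplannodes = nplannodes;
	for (int i = 0; i < nplannodes; i++)
		info->offsets[i] = -1;
}

/* Record the pruning output for the plan node with the given ID. */
static void
lockrels_add_output(ExecLockRelsInfoSketch *info, int plan_node_id,
					unsigned int surviving)
{
	assert(plan_node_id >= 0 && plan_node_id < info->nplannodes);
	assert(info->noutputs < MAX_OUTPUTS);
	info->offsets[plan_node_id] = info->noutputs;
	info->outputs[info->noutputs++].surviving = surviving;
}

/* Look up a node's pruning output; NULL if it has none. */
static const InitPruningOutputSketch *
lockrels_lookup(const ExecLockRelsInfoSketch *info, int plan_node_id)
{
	int			off;

	assert(plan_node_id >= 0 && plan_node_id < info->nplannodes);
	off = info->offsets[plan_node_id];
	return off >= 0 ? &info->outputs[off] : NULL;
}
```

The offsets array has one slot per plan node even when only a few nodes
have pruning output, which is the part that becomes unnecessary if each
Append/MergeAppend carries its own index.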

Maybe a different name for ExecLockRelsInfo would be better?

Also, given Tom's apparent dislike for carrying that in PlannedStmt,
maybe the way I have it now is fine?

> > One question is whether the planner should always pay the overhead of
> > initializing this bitmapset?  I mean it's only worthwhile if
> > AcquireExecutorLocks() is going to be involved, that is, the plan will
> > be cached and reused.
>
> Maybe the Bitmapset for the minimal locks needs to be built with
> bms_add_range(NULL, 0, list_length(rtable));  then do
> bms_del_members() on the relevant RTIs you find in the listed
> PartitionPruneInfos.  That way it's very simple and cheap to do when
> there are no PartitionPruneInfos.

Ah, okay.  Looking at make_partition_pruneinfo(), I think I see a way
to delete the RTIs of prunable relations -- construct an
all_matched_leaf_part_relids in parallel to allmatchedsubplans and
delete those from the initial set.

> > > 4. It's a bit disappointing to see RelOptInfo.partitioned_rels getting
> > > revived here.  Why don't you just add a partitioned_relids to
> > > PartitionPruneInfo and just have make_partitionedrel_pruneinfo build
> > > you a Relids of them. PartitionedRelPruneInfo already has an rtindex
> > > field, so you just need to bms_add_member whatever that rtindex is.
> >
> > Hmm, not all Append/MergeAppend nodes in the plan tree may have
> > make_partition_pruneinfo() called on them though.
>
> For Append/MergeAppends without run-time pruning you'll want to add
> the RTIs to the minimal locking set of RTIs to go into PlannedStmt.
> The only things you want to leave out of that are RTIs for the RTEs
> that you might run-time prune away during AcquireExecutorLocks().

Yeah, I see it now.

Thanks.

-- 
Amit Langote
EDB: http://www.enterprisedb.com






* Re: generic plans and "initial" pruning
@ 2022-04-01 07:01  Amit Langote <[email protected]>
  parent: Tom Lane <[email protected]>
  0 siblings, 0 replies; 108+ messages in thread

From: Amit Langote @ 2022-04-01 07:01 UTC (permalink / raw)
  To: Tom Lane <[email protected]>; +Cc: David Rowley <[email protected]>; Robert Haas <[email protected]>; pgsql-hackers

On Fri, Apr 1, 2022 at 12:45 PM Tom Lane <[email protected]> wrote:
> Amit Langote <[email protected]> writes:
> > On Fri, Apr 1, 2022 at 10:32 AM David Rowley <[email protected]> wrote:
> >> 1. You've changed the signature of various functions by adding
> >> ExecLockRelsInfo *execlockrelsinfo.  I'm wondering why you didn't just
> >> put the ExecLockRelsInfo as a new field in PlannedStmt?
>
> > I'm worried about that churn myself and did consider this idea, though
> > I couldn't shake the feeling that it's maybe wrong to put something in
> > PlannedStmt that the planner itself doesn't produce.
>
> PlannedStmt is part of the plan tree, which MUST be read-only to
> the executor.  This is not negotiable.  However, there are other
> places that this data could be put, such as QueryDesc.
> Or for that matter, couldn't the data structure be created by
> the planner?  (It looks like David is proposing exactly that
> further down.)

The data structure in question is for storing the results of
performing initial partition pruning on a generic plan, which the
patch proposes to do in plancache.c -- inside the body of
AcquireExecutorLocks()'s loop over PlannedStmts -- so, it's hard to
see it as a product of the planner. :-(

-- 
Amit Langote
EDB: http://www.enterprisedb.com






* Re: generic plans and "initial" pruning
@ 2022-04-01 08:19  David Rowley <[email protected]>
  parent: Amit Langote <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: David Rowley @ 2022-04-01 08:19 UTC (permalink / raw)
  To: Amit Langote <[email protected]>; +Cc: Robert Haas <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers

On Fri, 1 Apr 2022 at 19:58, Amit Langote <[email protected]> wrote:
> Yes, the ExecLockRelsInfo node in the current patch, that first gets
> added to the QueryDesc and subsequently to the EState of the query,
> serves as that stashing place.  Not sure if you've looked at
> ExecLockRelsInfo in detail in your review of the patch so far, but it
> carries the initial pruning result in what are called
> PlanInitPruningOutput nodes, which are stored in a list in
> ExecLockRelsInfo and their offsets in the list are in turn stored in
> an adjacent array that contains an element for every plan node in the
> tree.  If we go with a PlannedStmt.partpruneinfos list, then maybe we
> don't need to have that array, because the Append/MergeAppend nodes
> would be carrying those offsets by themselves.

I saw it, just not in great detail. I saw that you had an array that
was indexed by the plan node's ID.  I thought that wouldn't be so good
with large complex plans that we often get with partitioning
workloads.  That's why I mentioned using another index that you store
in Append/MergeAppend that starts at 0 and increments by 1 for each
node that has a PartitionPruneInfo made for it during create_plan.

> Maybe a different name for ExecLockRelsInfo would be better?
>
> Also, given Tom's apparent dislike for carrying that in PlannedStmt,
> maybe the way I have it now is fine?

I think if you change how it's indexed and the other stuff then we can
have another look.  I think the patch will be much easier to review
once the PartitionPruneInfos are moved into PlannedStmt.

David






* Re: generic plans and "initial" pruning
@ 2022-04-01 08:36  Amit Langote <[email protected]>
  parent: David Rowley <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: Amit Langote @ 2022-04-01 08:36 UTC (permalink / raw)
  To: David Rowley <[email protected]>; +Cc: Robert Haas <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers

On Fri, Apr 1, 2022 at 5:20 PM David Rowley <[email protected]> wrote:
> On Fri, 1 Apr 2022 at 19:58, Amit Langote <[email protected]> wrote:
> > Yes, the ExecLockRelsInfo node in the current patch, that first gets
> > added to the QueryDesc and subsequently to the EState of the query,
> > serves as that stashing place.  Not sure if you've looked at
> > ExecLockRelsInfo in detail in your review of the patch so far, but it
> > carries the initial pruning result in what are called
> > PlanInitPruningOutput nodes, which are stored in a list in
> > ExecLockRelsInfo and their offsets in the list are in turn stored in
> > an adjacent array that contains an element for every plan node in the
> > tree.  If we go with a PlannedStmt.partpruneinfos list, then maybe we
> > don't need to have that array, because the Append/MergeAppend nodes
> > would be carrying those offsets by themselves.
>
> I saw it, just not in great detail. I saw that you had an array that
> was indexed by the plan node's ID.  I thought that wouldn't be so good
> with large complex plans that we often get with partitioning
> workloads.  That's why I mentioned using another index that you store
> in Append/MergeAppend that starts at 0 and increments by 1 for each
> node that has a PartitionPruneInfo made for it during create_plan.
>
> > Maybe a different name for ExecLockRelsInfo would be better?
> >
> > Also, given Tom's apparent dislike for carrying that in PlannedStmt,
> > maybe the way I have it now is fine?
>
> I think if you change how it's indexed and the other stuff then we can
> have another look.  I think the patch will be much easier to review
> once the PartitionPruneInfos are moved into PlannedStmt.

Will do, thanks.

-- 
Amit Langote
EDB: http://www.enterprisedb.com







* Re: generic plans and "initial" pruning
@ 2022-04-06 07:20  Amit Langote <[email protected]>
  parent: Amit Langote <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: Amit Langote @ 2022-04-06 07:20 UTC (permalink / raw)
  To: David Rowley <[email protected]>; +Cc: Robert Haas <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers

On Fri, Apr 1, 2022 at 5:36 PM Amit Langote <[email protected]> wrote:
> On Fri, Apr 1, 2022 at 5:20 PM David Rowley <[email protected]> wrote:
> > On Fri, 1 Apr 2022 at 19:58, Amit Langote <[email protected]> wrote:
> > > Yes, the ExecLockRelsInfo node in the current patch, that first gets
> > > added to the QueryDesc and subsequently to the EState of the query,
> > > serves as that stashing place.  Not sure if you've looked at
> > > ExecLockRelsInfo in detail in your review of the patch so far, but it
> > > carries the initial pruning result in what are called
> > > PlanInitPruningOutput nodes, which are stored in a list in
> > > ExecLockRelsInfo and their offsets in the list are in turn stored in
> > > an adjacent array that contains an element for every plan node in the
> > > tree.  If we go with a PlannedStmt.partpruneinfos list, then maybe we
> > > don't need to have that array, because the Append/MergeAppend nodes
> > > would be carrying those offsets by themselves.
> >
> > I saw it, just not in great detail. I saw that you had an array that
> > was indexed by the plan node's ID.  I thought that wouldn't be so good
> > with large complex plans that we often get with partitioning
> > workloads.  That's why I mentioned using another index that you store
> > in Append/MergeAppend that starts at 0 and increments by 1 for each
> > node that has a PartitionPruneInfo made for it during create_plan.
> >
> > > Maybe a different name for ExecLockRelsInfo would be better?
> > >
> > > Also, given Tom's apparent dislike for carrying that in PlannedStmt,
> > > maybe the way I have it now is fine?
> >
> > I think if you change how it's indexed and the other stuff then we can
> > have another look.  I think the patch will be much easier to review
> > once the PartitionPruneInfos are moved into PlannedStmt.
>
> Will do, thanks.

Here is a version along those lines that passes make check-world.  It is
still a WIP, as I think the comments could use more editing.

Here's how the new implementation works:

AcquireExecutorLocks() calls ExecutorDoInitialPruning(), which in turn
iterates over a list of PartitionPruneInfos in a given PlannedStmt
coming from a CachedPlan.  For each PartitionPruneInfo,
ExecPartitionDoInitialPruning() is called, which sets up
PartitionPruneState and performs initial pruning steps present in the
PartitionPruneInfo.  The resulting bitmapsets of valid subplans, one
for each PartitionPruneInfo, are collected in a list and added to a
result node called PartitionPruneResult.  It represents the result of
performing initial pruning on all PartitionPruneInfos found in a plan.
A list of PartitionPruneResults, one per PlannedStmt, is passed along
with the PlannedStmts to the executor, which consults the matching
result when initializing Append/MergeAppend nodes.
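
The flow described above can be sketched in a few lines of standalone C.
This is only a toy model under stated assumptions: a uint64_t stands in
for a Bitmapset, pruning is reduced to a range check against a single
integer parameter, and every toy_* name is hypothetical rather than
taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy pruning info: subplan i covers the range [bounds[i], bounds[i + 1]). */
typedef struct ToyPruneInfo
{
	int			nsubplans;	/* number of child subplans, at most 64 */
	const int  *bounds;		/* nsubplans + 1 ascending range bounds */
} ToyPruneInfo;

/* Return a bitmask of the subplans that survive pruning for "param". */
static uint64_t
toy_do_initial_pruning(const ToyPruneInfo *info, int param)
{
	uint64_t	valid = 0;

	assert(info->nsubplans >= 0 && info->nsubplans <= 64);

	for (int i = 0; i < info->nsubplans; i++)
	{
		if (param >= info->bounds[i] && param < info->bounds[i + 1])
			valid |= UINT64_C(1) << i;
	}

	return valid;
}

/* Run pruning once per pruning info; results[i] belongs to infos[i]. */
static void
toy_executor_do_initial_pruning(const ToyPruneInfo *infos, size_t ninfos,
								int param, uint64_t *results)
{
	for (size_t i = 0; i < ninfos; i++)
		results[i] = toy_do_initial_pruning(&infos[i], param);
}

/* Node initialization looks up its own result by its dense index. */
static int
toy_subplan_is_valid(const uint64_t *results, int part_prune_index,
					 int subplan)
{
	assert(subplan >= 0 && subplan < 64);
	return (int) ((results[part_prune_index] >> subplan) & 1);
}
```

The point of the sketch is that pruning runs once, before execution, and
node initialization only reads the stored result instead of pruning
again.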

PlannedStmt.minLockRelids, computed by the planner, contains the RT
indexes of all the entries in the range table minus those of the leaf
partitions whose subplans are subject to removal due to initial
pruning.  AcquireExecutorLocks() adds back the RT indexes of only those
leaf partitions whose subplans survive ExecutorDoInitialPruning().  To
get the leaf partition RT indexes from the PartitionPruneInfo, a new
rti_map array is added to PartitionedRelPruneInfo.
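
As a rough illustration of how the set of relations to lock is put
together, here is a toy version in standalone C.  It assumes RT indexes
below 64 so that a uint64_t can stand in for a Bitmapset, and
toy_relids_to_lock() is a hypothetical helper, not a function from the
patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Start from the always-locked set and add back the RT indexes of the leaf
 * partitions that survived initial pruning.  rti_map[i] is the RT index of
 * partition i; bit i of surviving_parts is set if partition i survived.
 */
static uint64_t
toy_relids_to_lock(uint64_t min_lock_relids, const int *rti_map,
				   int nparts, uint64_t surviving_parts)
{
	uint64_t	lock_relids = min_lock_relids;

	assert(nparts >= 0 && nparts <= 64);

	for (int i = 0; i < nparts; i++)
	{
		if ((surviving_parts >> i) & 1)
		{
			assert(rti_map[i] > 0 && rti_map[i] < 64);
			lock_relids |= UINT64_C(1) << rti_map[i];
		}
	}

	return lock_relids;
}
```

For a range table with a partitioned parent at RT index 1, leaf
partitions at 2, 3 and 4, and an unrelated table at 5, the minimal set
is {1, 5}; if only the second partition survives pruning, the set to
lock becomes {1, 3, 5}.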

There's only one patch this time.  Patches that added partitioned_rels
and plan_tree_walker() are no longer necessary.

-- 
Amit Langote
EDB: http://www.enterprisedb.com


Attachments:

  [application/octet-stream] v11-0001-Optimize-AcquireExecutorLocks-to-skip-pruned-par.patch (97.8K, 2-v11-0001-Optimize-AcquireExecutorLocks-to-skip-pruned-par.patch)
From b0c8f18835ea2f455ea503a7c1702195be989df8 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Wed, 22 Dec 2021 16:55:17 +0900
Subject: [PATCH v11] Optimize AcquireExecutorLocks() to skip pruned partitions

---
 src/backend/commands/copyto.c           |   2 +-
 src/backend/commands/createas.c         |   2 +-
 src/backend/commands/explain.c          |   7 +-
 src/backend/commands/extension.c        |  13 +-
 src/backend/commands/matview.c          |   2 +-
 src/backend/commands/portalcmds.c       |   1 +
 src/backend/commands/prepare.c          |  17 +-
 src/backend/executor/README             |  28 +++
 src/backend/executor/execMain.c         |  46 +++++
 src/backend/executor/execParallel.c     |  28 ++-
 src/backend/executor/execPartition.c    | 238 ++++++++++++++++++++----
 src/backend/executor/execUtils.c        |   1 +
 src/backend/executor/functions.c        |   2 +-
 src/backend/executor/nodeAppend.c       |  16 +-
 src/backend/executor/nodeMergeAppend.c  |   9 +-
 src/backend/executor/spi.c              |  14 +-
 src/backend/nodes/copyfuncs.c           |  33 +++-
 src/backend/nodes/outfuncs.c            |  36 +++-
 src/backend/nodes/readfuncs.c           |  56 +++++-
 src/backend/optimizer/plan/createplan.c |  20 +-
 src/backend/optimizer/plan/planner.c    |   3 +
 src/backend/optimizer/plan/setrefs.c    | 104 ++++++++---
 src/backend/partitioning/partprune.c    |  41 +++-
 src/backend/tcop/postgres.c             |  15 +-
 src/backend/tcop/pquery.c               |  22 ++-
 src/backend/utils/cache/plancache.c     | 232 ++++++++++++++++++++---
 src/backend/utils/mmgr/portalmem.c      |   2 +
 src/include/commands/explain.h          |   3 +-
 src/include/executor/execPartition.h    |  12 +-
 src/include/executor/execdesc.h         |   3 +
 src/include/executor/executor.h         |   2 +
 src/include/nodes/execnodes.h           |  15 ++
 src/include/nodes/nodes.h               |   4 +
 src/include/nodes/pathnodes.h           |  15 ++
 src/include/nodes/plannodes.h           |  39 +++-
 src/include/tcop/tcopprot.h             |   2 +-
 src/include/utils/plancache.h           |   7 +
 src/include/utils/portal.h              |   5 +
 38 files changed, 942 insertions(+), 155 deletions(-)

diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index 55c38b04c4..d403eb2309 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -542,7 +542,7 @@ BeginCopyTo(ParseState *pstate,
 		((DR_copy *) dest)->cstate = cstate;
 
 		/* Create a QueryDesc requesting no output */
-		cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+		cstate->queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
 											dest, NULL, NULL, 0);
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 9abbb6b555..f6607f2454 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -325,7 +325,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
 		UpdateActiveSnapshotCommandId();
 
 		/* Create a QueryDesc, redirecting output to our tuple receiver */
-		queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+		queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
 									GetActiveSnapshot(), InvalidSnapshot,
 									dest, params, queryEnv, 0);
 
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 1e5701b8eb..7ba9852e51 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -407,7 +407,7 @@ ExplainOneQuery(Query *query, int cursorOptions,
 		}
 
 		/* run it (if needed) and produce output */
-		ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+		ExplainOnePlan(plan, NULL, into, es, queryString, params, queryEnv,
 					   &planduration, (es->buffers ? &bufusage : NULL));
 	}
 }
@@ -515,7 +515,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
  * to call it.
  */
 void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, PartitionPruneResult *part_prune_result,
+			   IntoClause *into, ExplainState *es,
 			   const char *queryString, ParamListInfo params,
 			   QueryEnvironment *queryEnv, const instr_time *planduration,
 			   const BufferUsage *bufusage)
@@ -563,7 +564,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
 		dest = None_Receiver;
 
 	/* Create a QueryDesc for the query */
-	queryDesc = CreateQueryDesc(plannedstmt, queryString,
+	queryDesc = CreateQueryDesc(plannedstmt, part_prune_result, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, params, queryEnv, instrument_option);
 
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index 1013790dbb..1151d95e1f 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -741,8 +741,10 @@ execute_sql_string(const char *sql)
 		RawStmt    *parsetree = lfirst_node(RawStmt, lc1);
 		MemoryContext per_parsetree_context,
 					oldcontext;
-		List	   *stmt_list;
-		ListCell   *lc2;
+		List	   *stmt_list,
+				   *part_prune_result_list;
+		ListCell   *lc2,
+				   *lc3;
 
 		/*
 		 * We do the work for each parsetree in a short-lived context, to
@@ -762,11 +764,13 @@ execute_sql_string(const char *sql)
 										   NULL,
 										   0,
 										   NULL);
-		stmt_list = pg_plan_queries(stmt_list, sql, CURSOR_OPT_PARALLEL_OK, NULL);
+		stmt_list = pg_plan_queries(stmt_list, sql, CURSOR_OPT_PARALLEL_OK, NULL,
+									&part_prune_result_list);
 
-		foreach(lc2, stmt_list)
+		forboth(lc2, stmt_list, lc3, part_prune_result_list)
 		{
 			PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+			PartitionPruneResult *part_prune_result = lfirst_node(PartitionPruneResult, lc3);
 
 			CommandCounterIncrement();
 
@@ -777,6 +781,7 @@ execute_sql_string(const char *sql)
 				QueryDesc  *qdesc;
 
 				qdesc = CreateQueryDesc(stmt,
+										part_prune_result,
 										sql,
 										GetActiveSnapshot(), NULL,
 										dest, NULL, NULL, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 05e7b60059..4ef44aaf23 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -416,7 +416,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
 	UpdateActiveSnapshotCommandId();
 
 	/* Create a QueryDesc, redirecting output to our tuple receiver */
-	queryDesc = CreateQueryDesc(plan, queryString,
+	queryDesc = CreateQueryDesc(plan, NULL, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, NULL, NULL, 0);
 
diff --git a/src/backend/commands/portalcmds.c b/src/backend/commands/portalcmds.c
index 9902c5c566..cac653f535 100644
--- a/src/backend/commands/portalcmds.c
+++ b/src/backend/commands/portalcmds.c
@@ -107,6 +107,7 @@ PerformCursorOpen(ParseState *pstate, DeclareCursorStmt *cstmt, ParamListInfo pa
 					  queryString,
 					  CMDTAG_SELECT,	/* cursor's query is always a SELECT */
 					  list_make1(plan),
+					  list_make1(NULL),	/* no PartitionPruneResult to pass */
 					  NULL);
 
 	/*----------
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 80738547ed..8b15159374 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -155,6 +155,7 @@ ExecuteQuery(ParseState *pstate,
 	PreparedStatement *entry;
 	CachedPlan *cplan;
 	List	   *plan_list;
+	List	   *plan_part_prune_result_list;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
 	Portal		portal;
@@ -195,6 +196,7 @@ ExecuteQuery(ParseState *pstate,
 	/* Replan if needed, and increment plan refcount for portal */
 	cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL);
 	plan_list = cplan->stmt_list;
+	plan_part_prune_result_list = cplan->part_prune_result_list;
 
 	/*
 	 * DO NOT add any logic that could possibly throw an error between
@@ -204,7 +206,7 @@ ExecuteQuery(ParseState *pstate,
 					  NULL,
 					  query_string,
 					  entry->plansource->commandTag,
-					  plan_list,
+					  plan_list, plan_part_prune_result_list,
 					  cplan);
 
 	/*
@@ -576,7 +578,9 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	const char *query_string;
 	CachedPlan *cplan;
 	List	   *plan_list;
-	ListCell   *p;
+	List	   *plan_part_prune_result_list;
+	ListCell   *p,
+			   *pp;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
 	instr_time	planstart;
@@ -632,15 +636,18 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	}
 
 	plan_list = cplan->stmt_list;
+	plan_part_prune_result_list = cplan->part_prune_result_list;
 
 	/* Explain each query */
-	foreach(p, plan_list)
+	forboth(p, plan_list, pp, plan_part_prune_result_list)
 	{
 		PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
+		PartitionPruneResult *part_prune_result = lfirst_node(PartitionPruneResult, pp);
 
 		if (pstmt->commandType != CMD_UTILITY)
-			ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv,
-						   &planduration, (es->buffers ? &bufusage : NULL));
+			ExplainOnePlan(pstmt, part_prune_result, into, es, query_string,
+						   paramLI, queryEnv, &planduration,
+						   (es->buffers ? &bufusage : NULL));
 		else
 			ExplainOneUtility(pstmt->utilityStmt, into, es, query_string,
 							  paramLI, queryEnv);
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 0b5183fc4a..8418e758da 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -65,6 +65,30 @@ found there.  This currently only occurs for Append and MergeAppend nodes.  In
 this case the non-required subplans are ignored and the executor state's
 subnode array will become out of sequence to the plan's subplan list.
 
+Actually, the so-called execution time pruning may also occur even before the
+execution has started.  One case where that occurs is when a cached generic
+plan is being validated for execution by plancache.c: GetCachedPlan(), which
+proceeds by locking all the relations that will be scanned by that plan.  If
+the generic plan has nodes that contain so-called initial pruning steps (a
+subset of execution pruning steps that do not depend on full-fledged execution
+having started), they are performed at this point to figure out the minimal
+set of child subplans that satisfy those pruning instructions and the result
+of performing that pruning is saved in a data structure that gets passed to
+the executor alongside the plan tree.  Relations scanned by only those
+surviving subplans are then locked while those scanned by the pruned subplans
+are not, even though the pruned subplans themselves are not removed from the
+plan tree. So, it is imperative that the executor and any third party code
+invoked by it that gets passed the plan tree look at the initial pruning result
+made available via the aforementioned data structure to determine whether or
+not a particular subplan is valid.  The data structure basically consists of
+a PartitionPruneResult node passed through the QueryDesc (subsequently added
+to EState) containing a list of bitmapsets with one element for every
+PartitionPruneInfo found in PlannedStmt.partPruneInfos.  The list is indexed
+with part_prune_index of the individual PartitionPruneInfos that's stored in
+the parent plan nodes to which a given PartitionPruneInfo belongs.  Each
+bitmapset contains the indexes of the child subplans of the given parent plan
+node that survive initial partition pruning.
+
 Each Plan node may have expression trees associated with it, to represent
 its target list, qualification conditions, etc.  These trees are also
 read-only to the executor, but the executor state for expression evaluation
@@ -286,6 +310,10 @@ Query Processing Control Flow
 
 This is a sketch of control flow for full query processing:
 
+  [ ExecutorDoInitialPruning ] --- an optional step to perform initial
+		partition pruning on the plan tree the result of which is passed
+		to the executor via QueryDesc
+
 	CreateQueryDesc
 
 	ExecutorStart
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index ef2fd46092..05cc99df8f 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -49,11 +49,13 @@
 #include "commands/matview.h"
 #include "commands/trigger.h"
 #include "executor/execdebug.h"
+#include "executor/execPartition.h"
 #include "executor/nodeSubplan.h"
 #include "foreign/fdwapi.h"
 #include "jit/jit.h"
 #include "mb/pg_wchar.h"
 #include "miscadmin.h"
+#include "nodes/nodeFuncs.h"
 #include "parser/parsetree.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
@@ -104,6 +106,47 @@ static void EvalPlanQualStart(EPQState *epqstate, Plan *planTree);
 
 /* end of local decls */
 
+/* ----------------------------------------------------------------
+ *		ExecutorDoInitialPruning
+ *
+ *		Performs initial partition pruning to figure out the minimal set of
+ *		subplans to be executed and the set of RT indexes of the corresponding
+ *		leaf partitions
+ *
+ * Returned PartitionPruneResult must be subsequently passed to the executor
+ * so that it can reuse the result of pruning.  It's important that the executor
+ * has the same view of which partitions are initially pruned (by not doing
+ * the pruning again itself) or otherwise it risks initializing subplans whose
+ * partitions would not have been locked.
+ *
+ * Note: Partitioned tables mentioned in PartitionedRelPruneInfo nodes that
+ * drive the pruning will be locked before doing the pruning.
+ */
+PartitionPruneResult *
+ExecutorDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params)
+{
+	PartitionPruneResult *result;
+	ListCell *lc;
+
+	/* Only get here if there is any pruning to do. */
+	Assert(plannedstmt->containsInitialPruning);
+
+	result = makeNode(PartitionPruneResult);
+	foreach(lc, plannedstmt->partPruneInfos)
+	{
+		PartitionPruneInfo *pruneinfo = lfirst(lc);
+		Bitmapset *valid_subplan_offs;
+
+		valid_subplan_offs =
+			ExecPartitionDoInitialPruning(plannedstmt, params, pruneinfo,
+										  &result->scan_leafpart_rtis);
+		result->valid_subplan_offs_list =
+			lappend(result->valid_subplan_offs_list,
+					valid_subplan_offs);
+	}
+
+	return result;
+}
 
 /* ----------------------------------------------------------------
  *		ExecutorStart
@@ -806,6 +849,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 {
 	CmdType		operation = queryDesc->operation;
 	PlannedStmt *plannedstmt = queryDesc->plannedstmt;
+	PartitionPruneResult *part_prune_result = queryDesc->part_prune_result;
 	Plan	   *plan = plannedstmt->planTree;
 	List	   *rangeTable = plannedstmt->rtable;
 	EState	   *estate = queryDesc->estate;
@@ -825,6 +869,8 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 	ExecInitRangeTable(estate, rangeTable);
 
 	estate->es_plannedstmt = plannedstmt;
+	estate->es_part_prune_infos = plannedstmt->partPruneInfos;
+	estate->es_part_prune_result = part_prune_result;
 
 	/*
 	 * Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index 9a0d5d59ef..805f86c503 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -66,6 +66,7 @@
 #define PARALLEL_KEY_QUERY_TEXT		UINT64CONST(0xE000000000000008)
 #define PARALLEL_KEY_JIT_INSTRUMENTATION UINT64CONST(0xE000000000000009)
 #define PARALLEL_KEY_WAL_USAGE			UINT64CONST(0xE00000000000000A)
+#define PARALLEL_KEY_PARTITIONPRUNERESULT	UINT64CONST(0xE00000000000000B)
 
 #define PARALLEL_TUPLE_QUEUE_SIZE		65536
 
@@ -182,7 +183,9 @@ ExecSerializePlan(Plan *plan, EState *estate)
 	pstmt->transientPlan = false;
 	pstmt->dependsOnRole = false;
 	pstmt->parallelModeNeeded = false;
+	pstmt->containsInitialPruning = false;
 	pstmt->planTree = plan;
+	pstmt->partPruneInfos = estate->es_part_prune_infos;
 	pstmt->rtable = estate->es_range_table;
 	pstmt->resultRelations = NIL;
 	pstmt->appendRelations = NIL;
@@ -596,12 +599,15 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	FixedParallelExecutorState *fpes;
 	char	   *pstmt_data;
 	char	   *pstmt_space;
+	char	   *part_prune_result_data;
+	char	   *part_prune_result_space;
 	char	   *paramlistinfo_space;
 	BufferUsage *bufusage_space;
 	WalUsage   *walusage_space;
 	SharedExecutorInstrumentation *instrumentation = NULL;
 	SharedJitInstrumentation *jit_instrumentation = NULL;
 	int			pstmt_len;
+	int			part_prune_result_len;
 	int			paramlistinfo_len;
 	int			instrumentation_len = 0;
 	int			jit_instrumentation_len = 0;
@@ -630,6 +636,7 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 
 	/* Fix up and serialize plan to be sent to workers. */
 	pstmt_data = ExecSerializePlan(planstate->plan, estate);
+	part_prune_result_data = nodeToString(estate->es_part_prune_result);
 
 	/* Create a parallel context. */
 	pcxt = CreateParallelContext("postgres", "ParallelQueryMain", nworkers);
@@ -656,6 +663,11 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	shm_toc_estimate_chunk(&pcxt->estimator, pstmt_len);
 	shm_toc_estimate_keys(&pcxt->estimator, 1);
 
+	/* Estimate space for serialized PartitionPruneResult. */
+	part_prune_result_len = strlen(part_prune_result_data) + 1;
+	shm_toc_estimate_chunk(&pcxt->estimator, part_prune_result_len);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
 	/* Estimate space for serialized ParamListInfo. */
 	paramlistinfo_len = EstimateParamListSpace(estate->es_param_list_info);
 	shm_toc_estimate_chunk(&pcxt->estimator, paramlistinfo_len);
@@ -750,6 +762,12 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	memcpy(pstmt_space, pstmt_data, pstmt_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PLANNEDSTMT, pstmt_space);
 
+	/* Store serialized PartitionPruneResult */
+	part_prune_result_space = shm_toc_allocate(pcxt->toc, part_prune_result_len);
+	memcpy(part_prune_result_space, part_prune_result_data, part_prune_result_len);
+	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARTITIONPRUNERESULT,
+				   part_prune_result_space);
+
 	/* Store serialized ParamListInfo. */
 	paramlistinfo_space = shm_toc_allocate(pcxt->toc, paramlistinfo_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARAMLISTINFO, paramlistinfo_space);
@@ -1231,8 +1249,10 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 						 int instrument_options)
 {
 	char	   *pstmtspace;
+	char	   *part_prune_result_space;
 	char	   *paramspace;
 	PlannedStmt *pstmt;
+	PartitionPruneResult *part_prune_result;
 	ParamListInfo paramLI;
 	char	   *queryString;
 
@@ -1243,12 +1263,18 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 	pstmtspace = shm_toc_lookup(toc, PARALLEL_KEY_PLANNEDSTMT, false);
 	pstmt = (PlannedStmt *) stringToNode(pstmtspace);
 
+	/* Reconstruct leader-supplied PartitionPruneResult. */
+	part_prune_result_space =
+		shm_toc_lookup(toc, PARALLEL_KEY_PARTITIONPRUNERESULT, false);
+	part_prune_result = (PartitionPruneResult *)
+		stringToNode(part_prune_result_space);
+
 	/* Reconstruct ParamListInfo. */
 	paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
 	paramLI = RestoreParamList(&paramspace);
 
 	/* Create a QueryDesc for the query. */
-	return CreateQueryDesc(pstmt,
+	return CreateQueryDesc(pstmt, part_prune_result,
 						   queryString,
 						   GetActiveSnapshot(), InvalidSnapshot,
 						   receiver, paramLI, NULL, instrument_options);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 615bd80973..3037742b8d 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -25,6 +25,7 @@
 #include "mb/pg_wchar.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
+#include "parser/parsetree.h"
 #include "partitioning/partbounds.h"
 #include "partitioning/partdesc.h"
 #include "partitioning/partprune.h"
@@ -185,7 +186,11 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
 static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
 static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
 static PartitionPruneState *CreatePartitionPruneState(PlanState *planstate,
-													  PartitionPruneInfo *pruneinfo);
+													  PartitionPruneInfo *pruneinfo,
+													  bool consider_initial_steps,
+													  bool consider_exec_steps,
+													  List *rtable, ExprContext *econtext,
+													  PartitionDirectory partdir);
 static void InitPartitionPruneContext(PartitionPruneContext *context,
 									  List *pruning_steps,
 									  PartitionDesc partdesc,
@@ -198,7 +203,8 @@ static void PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
 static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
 										   PartitionedRelPruningData *pprune,
 										   bool initial_prune,
-										   Bitmapset **validsubplans);
+										   Bitmapset **validsubplans,
+										   Bitmapset **scan_leafpart_rtis);
 
 
 /*
@@ -1587,8 +1593,10 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  * considered to be a stable expression, it can change value from one plan
  * node scan to the next during query execution.  Stable comparison
  * expressions that don't involve such Params allow partition pruning to be
- * done once during executor startup.  Expressions that do involve such Params
- * require us to prune separately for each scan of the parent plan node.
+ * done once during executor startup or during ExecutorDoInitialPruning() that
+ * runs as part of performing AcquireExecutorLocks() on a given plan tree.
+ * Expressions that do involve such Params require us to prune separately for
+ * each scan of the parent plan node.
  *
  * Note that pruning away unneeded subplans during executor startup has the
  * added benefit of not having to initialize the unneeded subplans at all.
@@ -1605,6 +1613,13 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *		account for initial pruning possibly having eliminated some of the
  *		subplans.
  *
+ * ExecPartitionDoInitialPruning:
+ *		Do initial pruning with the information contained in a given
+ *		PartitionPruneInfo to determine the minimal set of child subplans
+ *		to be executed of the parent plan node to which the PartitionPruneInfo
+ *		belongs and also the set of the RT indexes of leaf partitions that will
+ *		be scanned with those subplans.
+ *
  * ExecFindMatchingSubPlans:
  *		Returns indexes of matching subplans after evaluating the expressions
  *		that are safe to evaluate at a given point.  This function is first
@@ -1622,8 +1637,9 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *
  * On return, *initially_valid_subplans is assigned the set of indexes of
  * child subplans that must be initialized along with the parent plan node.
- * Initial pruning is performed here if needed and in that case only the
- * surviving subplans' indexes are added.
+ * Initial pruning is performed here if needed (unless it has already been done
+ * by ExecutorDoInitialPruning()), and in that case only the surviving subplans'
+ * indexes are added.
  *
  * If subplans are indeed pruned, subplan_map arrays contained in the returned
  * PartitionPruneState are re-sequenced to not count those, though only if the
@@ -1632,23 +1648,59 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
 PartitionPruneState *
 ExecInitPartitionPruning(PlanState *planstate,
 						 int n_total_subplans,
-						 PartitionPruneInfo *pruneinfo,
+						 int part_prune_index,
 						 Bitmapset **initially_valid_subplans)
 {
-	PartitionPruneState *prunestate;
 	EState	   *estate = planstate->state;
+	PartitionPruneInfo  *pruneinfo = list_nth(estate->es_part_prune_infos,
+											  part_prune_index);
+	PartitionPruneResult *pruneresult = estate->es_part_prune_result;
+	PartitionPruneState *prunestate;
+	bool	do_pruning = (pruneinfo->needs_init_pruning ||
+						  pruneinfo->needs_exec_pruning);
 
-	/* We may need an expression context to evaluate partition exprs */
-	ExecAssignExprContext(estate, planstate);
+	/*
+	 * No need to do initial pruning if it was done already by
+	 * ExecutorDoInitialPruning(), which it would be if es_part_prune_result
+	 * has been set.
+	 */
+	if (pruneresult)
+		do_pruning = pruneinfo->needs_exec_pruning;
+
+	prunestate = NULL;
+	if (do_pruning)
+	{
+		/* We may need an expression context to evaluate partition exprs */
+		ExecAssignExprContext(estate, planstate);
 
-	/* Create the working data structure for pruning */
-	prunestate = CreatePartitionPruneState(planstate, pruneinfo);
+		/* For data reading, executor always omits detached partitions */
+		if (estate->es_partition_directory == NULL)
+			estate->es_partition_directory =
+				CreatePartitionDirectory(estate->es_query_cxt, false);
+
+		/*
+		 * Create the working data structure for pruning.  No need to consider
+		 * initial pruning steps if we have a PartitionPruneResult.
+		 */
+		prunestate = CreatePartitionPruneState(planstate, pruneinfo,
+											   pruneresult == NULL, true,
+											   NIL, planstate->ps_ExprContext,
+											   estate->es_partition_directory);
+	}
 
 	/*
 	 * Perform an initial partition prune pass, if required.
 	 */
-	if (prunestate->do_initial_prune)
-		*initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true);
+	if (pruneresult)
+	{
+		*initially_valid_subplans =
+			list_nth(pruneresult->valid_subplan_offs_list, part_prune_index);
+	}
+	else if (prunestate && prunestate->do_initial_prune)
+	{
+		*initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true,
+															 NULL);
+	}
 	else
 	{
 		/* No pruning, so we'll need to initialize all subplans */
@@ -1669,7 +1721,7 @@ ExecInitPartitionPruning(PlanState *planstate,
 		 * leaves invalid data in prunestate, because that data won't be
 		 * consulted again (cf initial Assert in ExecFindMatchingSubPlans).
 		 */
-		if (prunestate->do_exec_prune)
+		if (prunestate && prunestate->do_exec_prune)
 			PartitionPruneFixSubPlanMap(prunestate,
 										*initially_valid_subplans,
 										n_total_subplans);
@@ -1678,11 +1730,72 @@ ExecInitPartitionPruning(PlanState *planstate,
 	return prunestate;
 }
 
+/*
+ * ExecPartitionDoInitialPruning
+ *		Perform initial pruning using given PartitionPruneInfo to determine
+ *		the minimal set of child subplans to be executed of the parent plan
+ *		node to which the PartitionPruneInfo belongs and also the set of RT
+	 *		indexes of leaf partitions that will be scanned with those subplans.
+ */
+Bitmapset *
+ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
+							  PartitionPruneInfo *pruneinfo,
+							  Bitmapset **scan_leafpart_rtis)
+{
+	List		 *rtable = plannedstmt->rtable;
+	ExprContext	 *econtext;
+	PartitionDirectory pdir;
+	MemoryContext oldcontext,
+				  tmpcontext;
+	PartitionPruneState *prunestate;
+	Bitmapset	 *valid_subplan_offs;
+
+	/*
+	 * A temporary context to allocate stuff needed to run the pruning steps.
+	 */
+	tmpcontext = AllocSetContextCreate(CurrentMemoryContext,
+									   "initial pruning working data",
+									   ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(tmpcontext);
+
+	/*
+	 * PartitionDirectory to look up partition descriptors, which omits
+	 * detached partitions, just like in the executor proper.
+	 */
+	pdir = CreatePartitionDirectory(CurrentMemoryContext, false);
+
+	/*
+	 * We don't yet have a PlanState for the parent plan node, so must create
+	 * a standalone ExprContext to evaluate pruning expressions, equipped with
+	 * the information about the EXTERN parameters that the caller passed us.
+	 * Note that that's okay because the initial pruning steps do not contain
+	 * anything that requires the execution to have started.
+	 */
+	econtext = CreateStandaloneExprContext();
+	econtext->ecxt_param_list_info = params;
+	prunestate = CreatePartitionPruneState(NULL, pruneinfo, true, false,
+										   rtable, econtext, pdir);
+	MemoryContextSwitchTo(oldcontext);
+
+	/* Do the initial pruning. */
+	valid_subplan_offs = ExecFindMatchingSubPlans(prunestate, true,
+												  scan_leafpart_rtis);
+
+	FreeExprContext(econtext, true);
+	DestroyPartitionDirectory(pdir);
+	MemoryContextDelete(tmpcontext);
+
+	return valid_subplan_offs;
+}
+
 /*
  * CreatePartitionPruneState
  *		Build the data structure required for calling ExecFindMatchingSubPlans
  *
- * 'planstate' is the parent plan node's execution state.
+ * 'planstate', if not NULL, is the parent plan node's execution state.  It
+ * can be NULL if being called before ExecutorStart(), in which case,
+ * 'rtable' (range table), 'econtext', and 'partdir' must be explicitly
+ * provided.
  *
  * 'pruneinfo' is a PartitionPruneInfo as generated by
  * make_partition_pruneinfo.  Here we build a PartitionPruneState containing a
@@ -1696,19 +1809,21 @@ ExecInitPartitionPruning(PlanState *planstate,
  * PartitionedRelPruneInfo.
  */
 static PartitionPruneState *
-CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
+CreatePartitionPruneState(PlanState *planstate,
+						  PartitionPruneInfo *pruneinfo,
+						  bool consider_initial_steps,
+						  bool consider_exec_steps,
+						  List *rtable, ExprContext *econtext,
+						  PartitionDirectory partdir)
 {
-	EState	   *estate = planstate->state;
+	EState	   *estate = planstate ? planstate->state : NULL;
 	PartitionPruneState *prunestate;
 	int			n_part_hierarchies;
 	ListCell   *lc;
 	int			i;
-	ExprContext *econtext = planstate->ps_ExprContext;
 
-	/* For data reading, executor always omits detached partitions */
-	if (estate->es_partition_directory == NULL)
-		estate->es_partition_directory =
-			CreatePartitionDirectory(estate->es_query_cxt, false);
+	Assert((estate != NULL) ||
+			(partdir != NULL && econtext != NULL && rtable != NIL));
 
 	n_part_hierarchies = list_length(pruneinfo->prune_infos);
 	Assert(n_part_hierarchies > 0);
@@ -1759,19 +1874,48 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 			PartitionedRelPruneInfo *pinfo = lfirst_node(PartitionedRelPruneInfo, lc2);
 			PartitionedRelPruningData *pprune = &prunedata->partrelprunedata[j];
 			Relation	partrel;
+			bool		close_partrel = false;
 			PartitionDesc partdesc;
 			PartitionKey partkey;
 
 			/*
-			 * We can rely on the copies of the partitioned table's partition
-			 * key and partition descriptor appearing in its relcache entry,
-			 * because that entry will be held open and locked for the
-			 * duration of this executor run.
+			 * Must open the relation ourselves when called before execution
+			 * has started, such as when called during
+			 * ExecutorDoInitialPruning() on a cached plan.  In that case,
+			 * sub-partitions must be locked, because AcquirePlannerLocks()
+			 * would not have seen them. (1st relation in a partrelpruneinfos
+			 * list is always the root partitioned table appearing in the
+			 * query, which AcquirePlannerLocks() would have locked; the
+			 * Assert in relation_open() guards that assumption.)
+			 */
+			if (estate == NULL)
+			{
+				RangeTblEntry *rte = rt_fetch(pinfo->rtindex, rtable);
+				int		lockmode = (j == 0) ? NoLock : rte->rellockmode;
+
+				partrel = table_open(rte->relid, lockmode);
+				close_partrel = true;
+			}
+			else
+				partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
+
+			/*
+			 * We can rely on the copy of the partitioned table's partition
+			 * key in its relcache entry, because it can't change (or
+			 * get destroyed) as long as the relation is locked.  Partition
+			 * descriptor is taken from the PartitionDirectory associated with
+			 * the table that is held open long enough for the descriptor to
+			 * remain valid while it's used to perform the pruning steps.
 			 */
-			partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
 			partkey = RelationGetPartitionKey(partrel);
-			partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
-												partrel);
+			partdesc = PartitionDirectoryLookup(partdir, partrel);
+
+			/*
+			 * Must close partrel, keeping the lock taken, if we're not using
+			 * EState's entry.
+			 */
+			if (close_partrel)
+				table_close(partrel, NoLock);
 
 			/*
 			 * Initialize the subplan_map and subpart_map.
@@ -1785,6 +1929,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 			Assert(partdesc->nparts >= pinfo->nparts);
 			pprune->nparts = partdesc->nparts;
 			pprune->subplan_map = palloc(sizeof(int) * partdesc->nparts);
+			pprune->rti_map = palloc(sizeof(Index) * partdesc->nparts);
 			if (partdesc->nparts == pinfo->nparts)
 			{
 				/*
@@ -1795,6 +1940,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 				pprune->subpart_map = pinfo->subpart_map;
 				memcpy(pprune->subplan_map, pinfo->subplan_map,
 					   sizeof(int) * pinfo->nparts);
+				memcpy(pprune->rti_map, pinfo->rti_map,
+					   sizeof(Index) * pinfo->nparts);
 
 				/*
 				 * Double-check that the list of unpruned relations has not
@@ -1845,6 +1992,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 							pinfo->subplan_map[pd_idx];
 						pprune->subpart_map[pp_idx] =
 							pinfo->subpart_map[pd_idx];
+						pprune->rti_map[pp_idx] =
+							pinfo->rti_map[pd_idx];
 						pd_idx++;
 					}
 					else
@@ -1852,6 +2001,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 						/* this partdesc entry is not in the plan */
 						pprune->subplan_map[pp_idx] = -1;
 						pprune->subpart_map[pp_idx] = -1;
+						pprune->rti_map[pp_idx] = 0;
 					}
 				}
 
@@ -1873,7 +2023,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 			 * Initialize pruning contexts as needed.
 			 */
 			pprune->initial_pruning_steps = pinfo->initial_pruning_steps;
-			if (pinfo->initial_pruning_steps)
+			if (consider_initial_steps && pinfo->initial_pruning_steps)
 			{
 				InitPartitionPruneContext(&pprune->initial_context,
 										  pinfo->initial_pruning_steps,
@@ -1883,7 +2033,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 				prunestate->do_initial_prune = true;
 			}
 			pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
-			if (pinfo->exec_pruning_steps)
+			if (consider_exec_steps && pinfo->exec_pruning_steps)
 			{
 				InitPartitionPruneContext(&pprune->exec_context,
 										  pinfo->exec_pruning_steps,
@@ -2111,10 +2261,14 @@ PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
  * Pass initial_prune if PARAM_EXEC Params cannot yet be evaluated.  This
  * differentiates the initial executor-time pruning step from later
  * runtime pruning.
+ *
+ * RT indexes of leaf partitions scanned by the chosen subplans are added to
+ * *scan_leafpart_rtis if the pointer is non-NULL.
  */
 Bitmapset *
 ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
-						 bool initial_prune)
+						 bool initial_prune,
+						 Bitmapset **scan_leafpart_rtis)
 {
 	Bitmapset  *result = NULL;
 	MemoryContext oldcontext;
@@ -2149,7 +2303,7 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
 		 */
 		pprune = &prunedata->partrelprunedata[0];
 		find_matching_subplans_recurse(prunedata, pprune, initial_prune,
-									   &result);
+									   &result, scan_leafpart_rtis);
 
 		/* Expression eval may have used space in ExprContext too */
 		if (pprune->exec_pruning_steps)
@@ -2163,6 +2317,8 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
 
 	/* Copy result out of the temp context before we reset it */
 	result = bms_copy(result);
+	if (scan_leafpart_rtis)
+		*scan_leafpart_rtis = bms_copy(*scan_leafpart_rtis);
 
 	MemoryContextReset(prunestate->prune_context);
 
@@ -2173,13 +2329,15 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
  * find_matching_subplans_recurse
  *		Recursive worker function for ExecFindMatchingSubPlans
  *
- * Adds valid (non-prunable) subplan IDs to *validsubplans
+ * Adds valid (non-prunable) subplan IDs to *validsubplans and RT indexes of
+ * the corresponding leaf partitions to *scan_leafpart_rtis (if asked for).
  */
 static void
 find_matching_subplans_recurse(PartitionPruningData *prunedata,
 							   PartitionedRelPruningData *pprune,
 							   bool initial_prune,
-							   Bitmapset **validsubplans)
+							   Bitmapset **validsubplans,
+							   Bitmapset **scan_leafpart_rtis)
 {
 	Bitmapset  *partset;
 	int			i;
@@ -2206,8 +2364,13 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
 	while ((i = bms_next_member(partset, i)) >= 0)
 	{
 		if (pprune->subplan_map[i] >= 0)
+		{
 			*validsubplans = bms_add_member(*validsubplans,
 											pprune->subplan_map[i]);
+			if (scan_leafpart_rtis && pprune->rti_map[i] > 0)
+				*scan_leafpart_rtis = bms_add_member(*scan_leafpart_rtis,
+													 pprune->rti_map[i]);
+		}
 		else
 		{
 			int			partidx = pprune->subpart_map[i];
@@ -2215,7 +2378,8 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
 			if (partidx >= 0)
 				find_matching_subplans_recurse(prunedata,
 											   &prunedata->partrelprunedata[partidx],
-											   initial_prune, validsubplans);
+											   initial_prune, validsubplans,
+											   scan_leafpart_rtis);
 			else
 			{
 				/*
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 9df1f81ea8..639145abe9 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -119,6 +119,7 @@ CreateExecutorState(void)
 	estate->es_relations = NULL;
 	estate->es_rowmarks = NULL;
 	estate->es_plannedstmt = NULL;
+	estate->es_part_prune_result = NULL;
 
 	estate->es_junkFilter = NULL;
 
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index f9460ae506..a2182a6b1f 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -844,7 +844,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
 	else
 		dest = None_Receiver;
 
-	es->qd = CreateQueryDesc(es->stmt,
+	es->qd = CreateQueryDesc(es->stmt, NULL,
 							 fcache->src,
 							 GetActiveSnapshot(),
 							 InvalidSnapshot,
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 357e10a1d7..09f26658e2 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -94,6 +94,7 @@ static bool ExecAppendAsyncRequest(AppendState *node, TupleTableSlot **result);
 static void ExecAppendAsyncEventWait(AppendState *node);
 static void classify_matching_subplans(AppendState *node);
 
+
 /* ----------------------------------------------------------------
  *		ExecInitAppend
  *
@@ -134,7 +135,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 	appendstate->as_begun = false;
 
 	/* If run-time partition pruning is enabled, then set that up now */
-	if (node->part_prune_info != NULL)
+	if (node->part_prune_index >= 0)
 	{
 		PartitionPruneState *prunestate;
 
@@ -145,7 +146,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 		 */
 		prunestate = ExecInitPartitionPruning(&appendstate->ps,
 											  list_length(node->appendplans),
-											  node->part_prune_info,
+											  node->part_prune_index,
 											  &validsubplans);
 		appendstate->as_prune_state = prunestate;
 		nplans = bms_num_members(validsubplans);
@@ -155,7 +156,8 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 		 * subplan, we can fill as_valid_subplans immediately, preventing
 		 * later calls to ExecFindMatchingSubPlans.
 		 */
-		if (!prunestate->do_exec_prune && nplans > 0)
+		if (appendstate->as_prune_state == NULL ||
+			(!appendstate->as_prune_state->do_exec_prune && nplans > 0))
 			appendstate->as_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
 	}
 	else
@@ -577,7 +579,7 @@ choose_next_subplan_locally(AppendState *node)
 		}
 		else if (node->as_valid_subplans == NULL)
 			node->as_valid_subplans =
-				ExecFindMatchingSubPlans(node->as_prune_state, false);
+				ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 
 		whichplan = -1;
 	}
@@ -642,7 +644,7 @@ choose_next_subplan_for_leader(AppendState *node)
 		if (node->as_valid_subplans == NULL)
 		{
 			node->as_valid_subplans =
-				ExecFindMatchingSubPlans(node->as_prune_state, false);
+				ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 
 			/*
 			 * Mark each invalid plan as finished to allow the loop below to
@@ -717,7 +719,7 @@ choose_next_subplan_for_worker(AppendState *node)
 	else if (node->as_valid_subplans == NULL)
 	{
 		node->as_valid_subplans =
-			ExecFindMatchingSubPlans(node->as_prune_state, false);
+			ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 		mark_invalid_subplans_as_finished(node);
 	}
 
@@ -868,7 +870,7 @@ ExecAppendAsyncBegin(AppendState *node)
 	if (node->as_valid_subplans == NULL)
 	{
 		node->as_valid_subplans =
-			ExecFindMatchingSubPlans(node->as_prune_state, false);
+			ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 
 		classify_matching_subplans(node);
 	}
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index ecf9052e03..7708cfffda 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -82,7 +82,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
 	mergestate->ps.ExecProcNode = ExecMergeAppend;
 
 	/* If run-time partition pruning is enabled, then set that up now */
-	if (node->part_prune_info != NULL)
+	if (node->part_prune_index >= 0)
 	{
 		PartitionPruneState *prunestate;
 
@@ -93,7 +93,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
 		 */
 		prunestate = ExecInitPartitionPruning(&mergestate->ps,
 											  list_length(node->mergeplans),
-											  node->part_prune_info,
+											  node->part_prune_index,
 											  &validsubplans);
 		mergestate->ms_prune_state = prunestate;
 		nplans = bms_num_members(validsubplans);
@@ -103,7 +103,8 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
 		 * subplan, we can fill as_valid_subplans immediately, preventing
 		 * later calls to ExecFindMatchingSubPlans.
 		 */
-		if (!prunestate->do_exec_prune && nplans > 0)
+		if (mergestate->ms_prune_state == NULL ||
+			(!mergestate->ms_prune_state->do_exec_prune && nplans > 0))
 			mergestate->ms_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
 	}
 	else
@@ -218,7 +219,7 @@ ExecMergeAppend(PlanState *pstate)
 		 */
 		if (node->ms_valid_subplans == NULL)
 			node->ms_valid_subplans =
-				ExecFindMatchingSubPlans(node->ms_prune_state, false);
+				ExecFindMatchingSubPlans(node->ms_prune_state, false, NULL);
 
 		/*
 		 * First time through: pull the first tuple from each valid subplan,
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 042a5f8b0a..d2ea2a8914 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1578,6 +1578,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 	CachedPlanSource *plansource;
 	CachedPlan *cplan;
 	List	   *stmt_list;
+	List	   *part_prune_result_list;
 	char	   *query_string;
 	Snapshot	snapshot;
 	MemoryContext oldcontext;
@@ -1659,6 +1660,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 	/* Replan if needed, and increment plan refcount for portal */
 	cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv);
 	stmt_list = cplan->stmt_list;
+	part_prune_result_list = cplan->part_prune_result_list;
 
 	if (!plan->saved)
 	{
@@ -1670,6 +1672,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 		 */
 		oldcontext = MemoryContextSwitchTo(portal->portalContext);
 		stmt_list = copyObject(stmt_list);
+		part_prune_result_list = copyObject(part_prune_result_list);
 		MemoryContextSwitchTo(oldcontext);
 		ReleaseCachedPlan(cplan, NULL);
 		cplan = NULL;			/* portal shouldn't depend on cplan */
@@ -1683,6 +1686,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 					  query_string,
 					  plansource->commandTag,
 					  stmt_list,
+					  part_prune_result_list,
 					  cplan);
 
 	/*
@@ -2473,7 +2477,9 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 	{
 		CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
 		List	   *stmt_list;
-		ListCell   *lc2;
+		List	   *part_prune_result_list;
+		ListCell   *lc2,
+				   *lc3;
 
 		spicallbackarg.query = plansource->query_string;
 
@@ -2552,6 +2558,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 							  plan_owner, _SPI_current->queryEnv);
 
 		stmt_list = cplan->stmt_list;
+		part_prune_result_list = cplan->part_prune_result_list;
 
 		/*
 		 * If we weren't given a specific snapshot to use, and the statement
@@ -2589,9 +2596,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 			}
 		}
 
-		foreach(lc2, stmt_list)
+		forboth(lc2, stmt_list, lc3, part_prune_result_list)
 		{
 			PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+			PartitionPruneResult *part_prune_result = lfirst_node(PartitionPruneResult, lc3);
 			bool		canSetTag = stmt->canSetTag;
 			DestReceiver *dest;
 
@@ -2663,7 +2671,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 				else
 					snap = InvalidSnapshot;
 
-				qdesc = CreateQueryDesc(stmt,
+				qdesc = CreateQueryDesc(stmt, part_prune_result,
 										plansource->query_string,
 										snap, crosscheck_snapshot,
 										dest,
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index d5760b1006..d2d86c9841 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -96,7 +96,10 @@ _copyPlannedStmt(const PlannedStmt *from)
 	COPY_SCALAR_FIELD(parallelModeNeeded);
 	COPY_SCALAR_FIELD(jitFlags);
 	COPY_NODE_FIELD(planTree);
+	COPY_NODE_FIELD(partPruneInfos);
+	COPY_SCALAR_FIELD(containsInitialPruning);
 	COPY_NODE_FIELD(rtable);
+	COPY_BITMAPSET_FIELD(minLockRelids);
 	COPY_NODE_FIELD(resultRelations);
 	COPY_NODE_FIELD(appendRelations);
 	COPY_NODE_FIELD(subplans);
@@ -253,7 +256,7 @@ _copyAppend(const Append *from)
 	COPY_NODE_FIELD(appendplans);
 	COPY_SCALAR_FIELD(nasyncplans);
 	COPY_SCALAR_FIELD(first_partial_plan);
-	COPY_NODE_FIELD(part_prune_info);
+	COPY_SCALAR_FIELD(part_prune_index);
 
 	return newnode;
 }
@@ -281,7 +284,7 @@ _copyMergeAppend(const MergeAppend *from)
 	COPY_POINTER_FIELD(sortOperators, from->numCols * sizeof(Oid));
 	COPY_POINTER_FIELD(collations, from->numCols * sizeof(Oid));
 	COPY_POINTER_FIELD(nullsFirst, from->numCols * sizeof(bool));
-	COPY_NODE_FIELD(part_prune_info);
+	COPY_SCALAR_FIELD(part_prune_index);
 
 	return newnode;
 }
@@ -1279,6 +1282,8 @@ _copyPartitionPruneInfo(const PartitionPruneInfo *from)
 	PartitionPruneInfo *newnode = makeNode(PartitionPruneInfo);
 
 	COPY_NODE_FIELD(prune_infos);
+	COPY_SCALAR_FIELD(needs_init_pruning);
+	COPY_SCALAR_FIELD(needs_exec_pruning);
 	COPY_BITMAPSET_FIELD(other_subplans);
 
 	return newnode;
@@ -1295,6 +1300,7 @@ _copyPartitionedRelPruneInfo(const PartitionedRelPruneInfo *from)
 	COPY_POINTER_FIELD(subplan_map, from->nparts * sizeof(int));
 	COPY_POINTER_FIELD(subpart_map, from->nparts * sizeof(int));
 	COPY_POINTER_FIELD(relid_map, from->nparts * sizeof(Oid));
+	COPY_POINTER_FIELD(rti_map, from->nparts * sizeof(Index));
 	COPY_NODE_FIELD(initial_pruning_steps);
 	COPY_NODE_FIELD(exec_pruning_steps);
 	COPY_BITMAPSET_FIELD(execparamids);
@@ -5468,6 +5474,21 @@ _copyExtensibleNode(const ExtensibleNode *from)
 	return newnode;
 }
 
+/* ****************************************************************
+ *					execnodes.h copy functions
+ * ****************************************************************
+ */
+static PartitionPruneResult *
+_copyPartitionPruneResult(const PartitionPruneResult *from)
+{
+	PartitionPruneResult *newnode = makeNode(PartitionPruneResult);
+
+	COPY_BITMAPSET_FIELD(scan_leafpart_rtis);
+	COPY_NODE_FIELD(valid_subplan_offs_list);
+
+	return newnode;
+}
+
 /* ****************************************************************
  *					value.h copy functions
  * ****************************************************************
@@ -5522,7 +5543,6 @@ _copyBitString(const BitString *from)
 	return newnode;
 }
 
-
 static ForeignKeyCacheInfo *
 _copyForeignKeyCacheInfo(const ForeignKeyCacheInfo *from)
 {
@@ -6564,6 +6584,13 @@ copyObjectImpl(const void *from)
 			retval = _copyPublicationTable(from);
 			break;
 
+			/*
+			 * EXECUTION NODES
+			 */
+		case T_PartitionPruneResult:
+			retval = _copyPartitionPruneResult(from);
+			break;
+
 			/*
 			 * MISCELLANEOUS NODES
 			 */
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index abb1f787ef..96d305102d 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -314,7 +314,10 @@ _outPlannedStmt(StringInfo str, const PlannedStmt *node)
 	WRITE_BOOL_FIELD(parallelModeNeeded);
 	WRITE_INT_FIELD(jitFlags);
 	WRITE_NODE_FIELD(planTree);
+	WRITE_NODE_FIELD(partPruneInfos);
+	WRITE_BOOL_FIELD(containsInitialPruning);
 	WRITE_NODE_FIELD(rtable);
+	WRITE_BITMAPSET_FIELD(minLockRelids);
 	WRITE_NODE_FIELD(resultRelations);
 	WRITE_NODE_FIELD(appendRelations);
 	WRITE_NODE_FIELD(subplans);
@@ -443,7 +446,7 @@ _outAppend(StringInfo str, const Append *node)
 	WRITE_NODE_FIELD(appendplans);
 	WRITE_INT_FIELD(nasyncplans);
 	WRITE_INT_FIELD(first_partial_plan);
-	WRITE_NODE_FIELD(part_prune_info);
+	WRITE_INT_FIELD(part_prune_index);
 }
 
 static void
@@ -460,7 +463,7 @@ _outMergeAppend(StringInfo str, const MergeAppend *node)
 	WRITE_OID_ARRAY(sortOperators, node->numCols);
 	WRITE_OID_ARRAY(collations, node->numCols);
 	WRITE_BOOL_ARRAY(nullsFirst, node->numCols);
-	WRITE_NODE_FIELD(part_prune_info);
+	WRITE_INT_FIELD(part_prune_index);
 }
 
 static void
@@ -1005,6 +1008,8 @@ _outPartitionPruneInfo(StringInfo str, const PartitionPruneInfo *node)
 	WRITE_NODE_TYPE("PARTITIONPRUNEINFO");
 
 	WRITE_NODE_FIELD(prune_infos);
+	WRITE_BOOL_FIELD(needs_init_pruning);
+	WRITE_BOOL_FIELD(needs_exec_pruning);
 	WRITE_BITMAPSET_FIELD(other_subplans);
 }
 
@@ -1019,6 +1024,7 @@ _outPartitionedRelPruneInfo(StringInfo str, const PartitionedRelPruneInfo *node)
 	WRITE_INT_ARRAY(subplan_map, node->nparts);
 	WRITE_INT_ARRAY(subpart_map, node->nparts);
 	WRITE_OID_ARRAY(relid_map, node->nparts);
+	WRITE_INDEX_ARRAY(rti_map, node->nparts);
 	WRITE_NODE_FIELD(initial_pruning_steps);
 	WRITE_NODE_FIELD(exec_pruning_steps);
 	WRITE_BITMAPSET_FIELD(execparamids);
@@ -2419,6 +2425,9 @@ _outPlannerGlobal(StringInfo str, const PlannerGlobal *node)
 	WRITE_NODE_FIELD(finalrowmarks);
 	WRITE_NODE_FIELD(resultRelations);
 	WRITE_NODE_FIELD(appendRelations);
+	WRITE_NODE_FIELD(partPruneInfos);
+	WRITE_BOOL_FIELD(containsInitialPruning);
+	WRITE_BITMAPSET_FIELD(minLockRelids);
 	WRITE_NODE_FIELD(relationOids);
 	WRITE_NODE_FIELD(invalItems);
 	WRITE_NODE_FIELD(paramExecTypes);
@@ -2486,6 +2495,7 @@ _outPlannerInfo(StringInfo str, const PlannerInfo *node)
 	WRITE_BITMAPSET_FIELD(curOuterRels);
 	WRITE_NODE_FIELD(curOuterParams);
 	WRITE_BOOL_FIELD(partColsUpdated);
+	WRITE_NODE_FIELD(partPruneInfos);
 }
 
 static void
@@ -2839,6 +2849,21 @@ _outExtensibleNode(StringInfo str, const ExtensibleNode *node)
 	methods->nodeOut(str, node);
 }
 
+/*****************************************************************************
+ *
+ *	Stuff from execnodes.h
+ *
+ *****************************************************************************/
+
+static void
+_outPartitionPruneResult(StringInfo str, const PartitionPruneResult *node)
+{
+	WRITE_NODE_TYPE("PARTITIONPRUNERESULT");
+
+	WRITE_BITMAPSET_FIELD(scan_leafpart_rtis);
+	WRITE_NODE_FIELD(valid_subplan_offs_list);
+}
+
 /*****************************************************************************
  *
  *	Stuff from parsenodes.h.
@@ -4747,6 +4772,13 @@ outNode(StringInfo str, const void *obj)
 				_outJsonTableSibling(str, obj);
 				break;
 
+				/*
+				 * EXECUTION NODES
+				 */
+			case T_PartitionPruneResult:
+				_outPartitionPruneResult(str, obj);
+				break;
+
 			default:
 
 				/*
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index e7d008b2c5..677ec055d6 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -164,6 +164,11 @@
 	token = pg_strtok(&length);		/* skip :fldname */ \
 	local_node->fldname = readIntCols(len)
 
+/* Read an Index array */
+#define READ_INDEX_ARRAY(fldname, len) \
+	token = pg_strtok(&length);		/* skip :fldname */ \
+	local_node->fldname = readIndexCols(len)
+
 /* Read a bool array */
 #define READ_BOOL_ARRAY(fldname, len) \
 	token = pg_strtok(&length);		/* skip :fldname */ \
@@ -1814,7 +1819,10 @@ _readPlannedStmt(void)
 	READ_BOOL_FIELD(parallelModeNeeded);
 	READ_INT_FIELD(jitFlags);
 	READ_NODE_FIELD(planTree);
+	READ_NODE_FIELD(partPruneInfos);
+	READ_BOOL_FIELD(containsInitialPruning);
 	READ_NODE_FIELD(rtable);
+	READ_BITMAPSET_FIELD(minLockRelids);
 	READ_NODE_FIELD(resultRelations);
 	READ_NODE_FIELD(appendRelations);
 	READ_NODE_FIELD(subplans);
@@ -1946,7 +1954,7 @@ _readAppend(void)
 	READ_NODE_FIELD(appendplans);
 	READ_INT_FIELD(nasyncplans);
 	READ_INT_FIELD(first_partial_plan);
-	READ_NODE_FIELD(part_prune_info);
+	READ_INT_FIELD(part_prune_index);
 
 	READ_DONE();
 }
@@ -1968,7 +1976,7 @@ _readMergeAppend(void)
 	READ_OID_ARRAY(sortOperators, local_node->numCols);
 	READ_OID_ARRAY(collations, local_node->numCols);
 	READ_BOOL_ARRAY(nullsFirst, local_node->numCols);
-	READ_NODE_FIELD(part_prune_info);
+	READ_INT_FIELD(part_prune_index);
 
 	READ_DONE();
 }
@@ -2762,6 +2770,8 @@ _readPartitionPruneInfo(void)
 	READ_LOCALS(PartitionPruneInfo);
 
 	READ_NODE_FIELD(prune_infos);
+	READ_BOOL_FIELD(needs_init_pruning);
+	READ_BOOL_FIELD(needs_exec_pruning);
 	READ_BITMAPSET_FIELD(other_subplans);
 
 	READ_DONE();
@@ -2778,6 +2788,7 @@ _readPartitionedRelPruneInfo(void)
 	READ_INT_ARRAY(subplan_map, local_node->nparts);
 	READ_INT_ARRAY(subpart_map, local_node->nparts);
 	READ_OID_ARRAY(relid_map, local_node->nparts);
+	READ_INDEX_ARRAY(rti_map, local_node->nparts);
 	READ_NODE_FIELD(initial_pruning_steps);
 	READ_NODE_FIELD(exec_pruning_steps);
 	READ_BITMAPSET_FIELD(execparamids);
@@ -2931,6 +2942,21 @@ _readPartitionRangeDatum(void)
 	READ_DONE();
 }
 
+
+/*
+ * _readPartitionPruneResult
+ */
+static PartitionPruneResult *
+_readPartitionPruneResult(void)
+{
+	READ_LOCALS(PartitionPruneResult);
+
+	READ_BITMAPSET_FIELD(scan_leafpart_rtis);
+	READ_NODE_FIELD(valid_subplan_offs_list);
+
+	READ_DONE();
+}
+
 /*
  * parseNodeString
  *
@@ -3228,6 +3254,8 @@ parseNodeString(void)
 		return_value = _readJsonTableParent();
 	else if (MATCH("JSONTABSNODE", 12))
 		return_value = _readJsonTableSibling();
+	else if (MATCH("PARTITIONPRUNERESULT", 20))
+		return_value = _readPartitionPruneResult();
 	else
 	{
 		elog(ERROR, "badly formatted node string \"%.32s\"...", token);
@@ -3371,6 +3399,30 @@ readIntCols(int numCols)
 	return int_vals;
 }
 
+/*
+ * readIndexCols
+ */
+Index *
+readIndexCols(int numCols)
+{
+	int			tokenLength,
+				i;
+	const char *token;
+	Index	   *index_vals;
+
+	if (numCols <= 0)
+		return NULL;
+
+	index_vals = (Index *) palloc(numCols * sizeof(Index));
+	for (i = 0; i < numCols; i++)
+	{
+		token = pg_strtok(&tokenLength);
+		index_vals[i] = atoui(token);
+	}
+
+	return index_vals;
+}
+
 /*
  * readBoolCols
  */
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index 179c87c671..2f9260abed 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -1336,7 +1336,15 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
 	plan->appendplans = subplans;
 	plan->nasyncplans = nasyncplans;
 	plan->first_partial_plan = best_path->first_partial_path;
-	plan->part_prune_info = partpruneinfo;
+
+	if (partpruneinfo)
+	{
+		root->partPruneInfos = lappend(root->partPruneInfos, partpruneinfo);
+		/* Will be updated later in set_plan_references(). */
+		plan->part_prune_index = list_length(root->partPruneInfos) - 1;
+	}
+	else
+		plan->part_prune_index = -1;
 
 	copy_generic_path_info(&plan->plan, (Path *) best_path);
 
@@ -1498,7 +1506,15 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
 	}
 
 	node->mergeplans = subplans;
-	node->part_prune_info = partpruneinfo;
+
+	if (partpruneinfo)
+	{
+		root->partPruneInfos = lappend(root->partPruneInfos, partpruneinfo);
+		/* Will be updated later in set_plan_references(). */
+		node->part_prune_index = list_length(root->partPruneInfos) - 1;
+	}
+	else
+		node->part_prune_index = -1;
 
 	/*
 	 * If prepare_sort_from_pathkeys added sort columns, but we were told to
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index b2569c5d0c..2aa051d862 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -518,7 +518,10 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 	result->dependsOnRole = glob->dependsOnRole;
 	result->parallelModeNeeded = glob->parallelModeNeeded;
 	result->planTree = top_plan;
+	result->partPruneInfos = glob->partPruneInfos;
+	result->containsInitialPruning = glob->containsInitialPruning;
 	result->rtable = glob->finalrtable;
+	result->minLockRelids = glob->minLockRelids;
 	result->resultRelations = glob->resultRelations;
 	result->appendRelations = glob->appendRelations;
 	result->subplans = glob->subplans;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index bf4c722c02..8d9ab2c74d 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -252,7 +252,7 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 	Plan	   *result;
 	PlannerGlobal *glob = root->glob;
 	int			rtoffset = list_length(glob->finalrtable);
 	ListCell   *lc;
 
 	/*
 	 * Add all the query's RTEs to the flattened rangetable.  The live ones
@@ -261,6 +261,16 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 	 */
 	add_rtes_to_flat_rtable(root, false);
 
+	/*
+	 * Add the query's adjusted range of RT indexes to glob->minLockRelids.
+	 * The adjusted RT indexes of prunable relations will be deleted from the
+	 * set below where PartitionPruneInfos are processed.
+	 */
+	glob->minLockRelids =
+		bms_add_range(glob->minLockRelids,
+					  rtoffset + 1,
+					  rtoffset + list_length(root->parse->rtable));
+
 	/*
 	 * Adjust RT indexes of PlanRowMarks and add to final rowmarks list
 	 */
@@ -339,6 +349,56 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 		}
 	}
 
+	/* Also fix up the information in PartitionPruneInfos. */
+	foreach(lc, root->partPruneInfos)
+	{
+		PartitionPruneInfo *pruneinfo = lfirst(lc);
+		Bitmapset *leafpart_rtis = NULL;
+		ListCell  *l;
+
+		foreach(l, pruneinfo->prune_infos)
+		{
+			List	   *prune_infos = lfirst(l);
+			ListCell   *l2;
+
+			foreach(l2, prune_infos)
+			{
+				PartitionedRelPruneInfo *pinfo = lfirst(l2);
+				int		i;
+
+				/* RT index of the partitioned table. */
+				pinfo->rtindex += rtoffset;
+
+				/* And also those of the leaf partitions. */
+				for (i = 0; i < pinfo->nparts; i++)
+				{
+					if (pinfo->rti_map[i] > 0)
+					{
+						pinfo->rti_map[i] += rtoffset;
+						leafpart_rtis = bms_add_member(leafpart_rtis,
+													   pinfo->rti_map[i]);
+					}
+				}
+			}
+		}
+
+		if (pruneinfo->needs_init_pruning)
+		{
+			glob->containsInitialPruning = true;
+
+			/*
+			 * Delete the leaf partition RTIs from the global set of relations
+			 * to be locked before executing the plan.  AcquireExecutorLocks()
+			 * will find the ones to add to the set after performing initial
+			 * pruning.
+			 */
+			glob->minLockRelids = bms_del_members(glob->minLockRelids,
+												  leafpart_rtis);
+		}
+
+		glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
+	}
+
 	return result;
 }
 
@@ -1596,21 +1656,12 @@ set_append_references(PlannerInfo *root,
 
 	aplan->apprelids = offset_relid_set(aplan->apprelids, rtoffset);
 
-	if (aplan->part_prune_info)
-	{
-		foreach(l, aplan->part_prune_info->prune_infos)
-		{
-			List	   *prune_infos = lfirst(l);
-			ListCell   *l2;
-
-			foreach(l2, prune_infos)
-			{
-				PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
-				pinfo->rtindex += rtoffset;
-			}
-		}
-	}
+	/*
+	 * PartitionPruneInfos will be added to a list in PlannerGlobal, so update
+	 * the index.
+	 */
+	if (aplan->part_prune_index >= 0)
+		aplan->part_prune_index += list_length(root->glob->partPruneInfos);
 
 	/* We don't need to recurse to lefttree or righttree ... */
 	Assert(aplan->plan.lefttree == NULL);
@@ -1668,21 +1719,12 @@ set_mergeappend_references(PlannerInfo *root,
 
 	mplan->apprelids = offset_relid_set(mplan->apprelids, rtoffset);
 
-	if (mplan->part_prune_info)
-	{
-		foreach(l, mplan->part_prune_info->prune_infos)
-		{
-			List	   *prune_infos = lfirst(l);
-			ListCell   *l2;
-
-			foreach(l2, prune_infos)
-			{
-				PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
-				pinfo->rtindex += rtoffset;
-			}
-		}
-	}
+	/*
+	 * PartitionPruneInfos will be added to a list in PlannerGlobal, so update
+	 * the index.
+	 */
+	if (mplan->part_prune_index >= 0)
+		mplan->part_prune_index += list_length(root->glob->partPruneInfos);
 
 	/* We don't need to recurse to lefttree or righttree ... */
 	Assert(mplan->plan.lefttree == NULL);
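
To make the minLockRelids bookkeeping in set_plan_references() above easier to follow, here is a minimal standalone sketch; it is not part of the patch. It assumes a 64-bit mask (RtiSet) as a stand-in for Bitmapset, and rtiset_add_range() and compute_min_lock_relids() are hypothetical names that only mirror the bms_add_range() and bms_del_members() calls in the hunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for Bitmapset: bit i set means RT index i is a member. */
typedef uint64_t RtiSet;

/* Add RT indexes lo..hi (inclusive), loosely mirroring bms_add_range(). */
static RtiSet
rtiset_add_range(RtiSet s, int lo, int hi)
{
	for (int i = lo; i <= hi; i++)
		s |= (UINT64_C(1) << i);
	return s;
}

/*
 * Hypothetical model of the per-subquery step: start with every RT index of
 * the subquery (after applying rtoffset), then remove the leaf partitions
 * that sit under a node carrying "initial" pruning steps.  leafpart_rtis is
 * assumed to hold already-offset RT indexes.
 */
static RtiSet
compute_min_lock_relids(RtiSet min_lock_relids, int rtoffset, int nrtes,
						RtiSet leafpart_rtis, bool needs_init_pruning)
{
	min_lock_relids = rtiset_add_range(min_lock_relids,
									   rtoffset + 1, rtoffset + nrtes);
	if (needs_init_pruning)
		min_lock_relids &= ~leafpart_rtis;
	return min_lock_relids;
}
```

For example, with rtoffset = 0 and four RTEs (RT index 1 being the partitioned parent and 2..4 its leaf partitions), the set shrinks to just the parent when initial pruning steps exist, so which leaves get locked is decided only at AcquireExecutorLocks() time.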
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 9d3c05aed3..0eaff15ed0 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -144,7 +144,9 @@ static List *make_partitionedrel_pruneinfo(PlannerInfo *root,
 										   List *prunequal,
 										   Bitmapset *partrelids,
 										   int *relid_subplan_map,
-										   Bitmapset **matchedsubplans);
+										   Bitmapset **matchedsubplans,
+										   bool *needs_init_pruning,
+										   bool *needs_exec_pruning);
 static void gen_partprune_steps(RelOptInfo *rel, List *clauses,
 								PartClauseTarget target,
 								GeneratePruningStepsContext *context);
@@ -230,6 +232,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	int		   *relid_subplan_map;
 	ListCell   *lc;
 	int			i;
+	bool		needs_init_pruning = false;
+	bool		needs_exec_pruning = false;
 
 	/*
 	 * Scan the subpaths to see which ones are scans of partition child
@@ -309,12 +313,16 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		Bitmapset  *partrelids = (Bitmapset *) lfirst(lc);
 		List	   *pinfolist;
 		Bitmapset  *matchedsubplans = NULL;
+		bool		partrel_needs_init_pruning;
+		bool		partrel_needs_exec_pruning;
 
 		pinfolist = make_partitionedrel_pruneinfo(root, parentrel,
 												  prunequal,
 												  partrelids,
 												  relid_subplan_map,
-												  &matchedsubplans);
+												  &matchedsubplans,
+												  &partrel_needs_init_pruning,
+												  &partrel_needs_exec_pruning);
 
 		/* When pruning is possible, record the matched subplans */
 		if (pinfolist != NIL)
@@ -323,6 +331,10 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 			allmatchedsubplans = bms_join(matchedsubplans,
 										  allmatchedsubplans);
 		}
+		if (!needs_init_pruning)
+			needs_init_pruning = partrel_needs_init_pruning;
+		if (!needs_exec_pruning)
+			needs_exec_pruning = partrel_needs_exec_pruning;
 	}
 
 	pfree(relid_subplan_map);
@@ -337,6 +349,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	/* Else build the result data structure */
 	pruneinfo = makeNode(PartitionPruneInfo);
 	pruneinfo->prune_infos = prunerelinfos;
+	pruneinfo->needs_init_pruning = needs_init_pruning;
+	pruneinfo->needs_exec_pruning = needs_exec_pruning;
 
 	/*
 	 * Some subplans may not belong to any of the identified partitioned rels.
@@ -435,13 +449,18 @@ add_part_relids(List *allpartrelids, Bitmapset *partrelids)
  * If we cannot find any useful run-time pruning steps, return NIL.
  * However, on success, each rel identified in partrelids will have
  * an element in the result list, even if some of them are useless.
+ * *needs_init_pruning and *needs_exec_pruning are set to indicate that the
+ * returned PartitionedRelPruneInfos contain pruning steps that can be
+ * performed before and after execution begins, respectively.
  */
 static List *
 make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 							  List *prunequal,
 							  Bitmapset *partrelids,
 							  int *relid_subplan_map,
-							  Bitmapset **matchedsubplans)
+							  Bitmapset **matchedsubplans,
+							  bool *needs_init_pruning,
+							  bool *needs_exec_pruning)
 {
 	RelOptInfo *targetpart = NULL;
 	List	   *pinfolist = NIL;
@@ -452,6 +471,10 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	int			rti;
 	int			i;
 
+	/* Will find out below. */
+	*needs_init_pruning = false;
+	*needs_exec_pruning = false;
+
 	/*
 	 * Examine each partitioned rel, constructing a temporary array to map
 	 * from planner relids to index of the partitioned rel, and building a
@@ -539,6 +562,9 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		 * executor per-scan pruning steps.  This first pass creates startup
 		 * pruning steps and detects whether there's any possibly-useful quals
 		 * that would require per-scan pruning.
+		 *
+		 * In the first pass, we note whether the 2nd pass is necessary by
+		 * checking for the presence of EXEC parameters.
 		 */
 		gen_partprune_steps(subpart, partprunequal, PARTTARGET_INITIAL,
 							&context);
@@ -613,6 +639,11 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		pinfo->execparamids = execparamids;
 		/* Remaining fields will be filled in the next loop */
 
+		if (!*needs_init_pruning)
+			*needs_init_pruning = (initial_pruning_steps != NIL);
+		if (!*needs_exec_pruning)
+			*needs_exec_pruning = (exec_pruning_steps != NIL);
+
 		pinfolist = lappend(pinfolist, pinfo);
 	}
 
@@ -640,6 +671,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		int		   *subplan_map;
 		int		   *subpart_map;
 		Oid		   *relid_map;
+		Index	   *rti_map;
 
 		/*
 		 * Construct the subplan and subpart maps for this partitioning level.
@@ -652,6 +684,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		subpart_map = (int *) palloc(nparts * sizeof(int));
 		memset(subpart_map, -1, nparts * sizeof(int));
 		relid_map = (Oid *) palloc0(nparts * sizeof(Oid));
+		rti_map = (Index *) palloc0(nparts * sizeof(Index));
 		present_parts = NULL;
 
 		i = -1;
@@ -666,6 +699,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 			subplan_map[i] = subplanidx = relid_subplan_map[partrel->relid] - 1;
 			subpart_map[i] = subpartidx = relid_subpart_map[partrel->relid] - 1;
 			relid_map[i] = planner_rt_fetch(partrel->relid, root)->relid;
+			rti_map[i] = partrel->relid;
 			if (subplanidx >= 0)
 			{
 				present_parts = bms_add_member(present_parts, i);
@@ -690,6 +724,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		pinfo->subplan_map = subplan_map;
 		pinfo->subpart_map = subpart_map;
 		pinfo->relid_map = relid_map;
+		pinfo->rti_map = rti_map;
 	}
 
 	pfree(relid_subpart_map);
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index ba2fcfeb4a..fecffdba65 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -945,15 +945,17 @@ pg_plan_query(Query *querytree, const char *query_string, int cursorOptions,
  * For normal optimizable statements, invoke the planner.  For utility
  * statements, just make a wrapper PlannedStmt node.
  *
- * The result is a list of PlannedStmt nodes.
+ * The result is a list of PlannedStmt nodes.  Also, a NULL is appended to
+ * *part_prune_result_list for each PlannedStmt added to the returned list.
  */
 List *
 pg_plan_queries(List *querytrees, const char *query_string, int cursorOptions,
-				ParamListInfo boundParams)
+				ParamListInfo boundParams, List **part_prune_result_list)
 {
 	List	   *stmt_list = NIL;
 	ListCell   *query_list;
 
+	*part_prune_result_list = NIL;
 	foreach(query_list, querytrees)
 	{
 		Query	   *query = lfirst_node(Query, query_list);
@@ -977,6 +979,7 @@ pg_plan_queries(List *querytrees, const char *query_string, int cursorOptions,
 		}
 
 		stmt_list = lappend(stmt_list, stmt);
+		*part_prune_result_list = lappend(*part_prune_result_list, NULL);
 	}
 
 	return stmt_list;
@@ -1080,7 +1083,8 @@ exec_simple_query(const char *query_string)
 		QueryCompletion qc;
 		MemoryContext per_parsetree_context = NULL;
 		List	   *querytree_list,
-				   *plantree_list;
+				   *plantree_list,
+				   *plantree_part_prune_result_list;
 		Portal		portal;
 		DestReceiver *receiver;
 		int16		format;
@@ -1167,7 +1171,8 @@ exec_simple_query(const char *query_string)
 												NULL, 0, NULL);
 
 		plantree_list = pg_plan_queries(querytree_list, query_string,
-										CURSOR_OPT_PARALLEL_OK, NULL);
+										CURSOR_OPT_PARALLEL_OK, NULL,
+										&plantree_part_prune_result_list);
 
 		/*
 		 * Done with the snapshot used for parsing/planning.
@@ -1203,6 +1208,7 @@ exec_simple_query(const char *query_string)
 						  query_string,
 						  commandTag,
 						  plantree_list,
+						  plantree_part_prune_result_list,
 						  NULL);
 
 		/*
@@ -1991,6 +1997,7 @@ exec_bind_message(StringInfo input_message)
 					  query_string,
 					  psrc->commandTag,
 					  cplan->stmt_list,
+					  cplan->part_prune_result_list,
 					  cplan);
 
 	/* Done with the snapshot used for parameter I/O and parsing/planning */
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 5aa5a350f3..fcba303b53 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -35,7 +35,7 @@
 Portal		ActivePortal = NULL;
 
 
-static void ProcessQuery(PlannedStmt *plan,
+static void ProcessQuery(PlannedStmt *plan, PartitionPruneResult *part_prune_result,
 						 const char *sourceText,
 						 ParamListInfo params,
 						 QueryEnvironment *queryEnv,
@@ -65,6 +65,7 @@ static void DoPortalRewind(Portal portal);
  */
 QueryDesc *
 CreateQueryDesc(PlannedStmt *plannedstmt,
+				PartitionPruneResult *part_prune_result,
 				const char *sourceText,
 				Snapshot snapshot,
 				Snapshot crosscheck_snapshot,
@@ -77,6 +78,8 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
 
 	qd->operation = plannedstmt->commandType;	/* operation */
 	qd->plannedstmt = plannedstmt;	/* plan */
+	qd->part_prune_result = part_prune_result;	/* ExecutorDoInitialPruning()
+												 * output for plan */
 	qd->sourceText = sourceText;	/* query text */
 	qd->snapshot = RegisterSnapshot(snapshot);	/* snapshot */
 	/* RI check snapshot */
@@ -122,6 +125,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  *		PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
  *
  *	plan: the plan tree for the query
+ *	part_prune_result: ExecutorDoInitialPruning() output for the plan tree
  *	sourceText: the source text of the query
  *	params: any parameters needed
  *	dest: where to send results
@@ -134,6 +138,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  */
 static void
 ProcessQuery(PlannedStmt *plan,
+			 PartitionPruneResult *part_prune_result,
 			 const char *sourceText,
 			 ParamListInfo params,
 			 QueryEnvironment *queryEnv,
@@ -145,7 +150,7 @@ ProcessQuery(PlannedStmt *plan,
 	/*
 	 * Create the QueryDesc object
 	 */
-	queryDesc = CreateQueryDesc(plan, sourceText,
+	queryDesc = CreateQueryDesc(plan, part_prune_result, sourceText,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, params, queryEnv, 0);
 
@@ -493,6 +498,7 @@ PortalStart(Portal portal, ParamListInfo params,
 				 * the destination to DestNone.
 				 */
 				queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+											linitial_node(PartitionPruneResult, portal->part_prune_results),
 											portal->sourceText,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
@@ -1193,7 +1199,8 @@ PortalRunMulti(Portal portal,
 			   QueryCompletion *qc)
 {
 	bool		active_snapshot_set = false;
-	ListCell   *stmtlist_item;
+	ListCell   *stmtlist_item,
+			   *part_prune_results_item;
 
 	/*
 	 * If the destination is DestRemoteExecute, change to DestNone.  The
@@ -1214,9 +1221,12 @@ PortalRunMulti(Portal portal,
 	 * Loop to handle the individual queries generated from a single parsetree
 	 * by analysis and rewrite.
 	 */
-	foreach(stmtlist_item, portal->stmts)
+	forboth(stmtlist_item, portal->stmts,
+			part_prune_results_item, portal->part_prune_results)
 	{
 		PlannedStmt *pstmt = lfirst_node(PlannedStmt, stmtlist_item);
+		PartitionPruneResult *part_prune_result = lfirst_node(PartitionPruneResult,
+															  part_prune_results_item);
 
 		/*
 		 * If we got a cancel signal in prior command, quit
@@ -1274,7 +1284,7 @@ PortalRunMulti(Portal portal,
 			if (pstmt->canSetTag)
 			{
 				/* statement can set tag string */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, part_prune_result,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
@@ -1283,7 +1293,7 @@ PortalRunMulti(Portal portal,
 			else
 			{
 				/* stmt added by rewrite cannot set tag */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, part_prune_result,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 4cf6db504f..80564dd874 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -99,14 +99,16 @@ static dlist_head cached_expression_list = DLIST_STATIC_INIT(cached_expression_l
 static void ReleaseGenericPlan(CachedPlanSource *plansource);
 static List *RevalidateCachedQuery(CachedPlanSource *plansource,
 								   QueryEnvironment *queryEnv);
-static bool CheckCachedPlan(CachedPlanSource *plansource);
+static bool CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams);
 static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 								   ParamListInfo boundParams, QueryEnvironment *queryEnv);
+static void CachedPlanSavePartitionPruneResults(CachedPlan *plan, List *part_prune_result_list);
 static bool choose_custom_plan(CachedPlanSource *plansource,
 							   ParamListInfo boundParams);
 static double cached_plan_cost(CachedPlan *plan, bool include_planner);
 static Query *QueryListGetPrimaryStmt(List *stmts);
-static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static List *AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams);
+static void ReleaseExecutorLocks(List *stmt_list, List *part_prune_result_list);
 static void AcquirePlannerLocks(List *stmt_list, bool acquire);
 static void ScanQueryForLocks(Query *parsetree, bool acquire);
 static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -790,9 +792,21 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
  *
  * On a "true" return, we have acquired the locks needed to run the plan.
  * (We must do this for the "true" result to be race-condition-free.)
+ *
+ * If the CachedPlan is valid, this may in some cases call
+ * ExecutorDoInitialPruning() on each PlannedStmt contained in it, instead of
+ * just scanning its range table, to determine the set of relations to be
+ * locked by AcquireExecutorLocks().  That allows skipping the locking of
+ * partitions whose plan nodes are pruned by initial partition pruning.
+ * The result of pruning, a List of PartitionPruneResult nodes allocated in a
+ * child context of the context containing the plan itself, is saved in
+ * plan->part_prune_result_list.  The previous contents of the list from the
+ * last invocation on the same CachedPlan are deleted, because they would no
+ * longer be valid given the fresh set of parameter values which may be used
+ * as pruning parameters.
  */
 static bool
-CheckCachedPlan(CachedPlanSource *plansource)
+CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams)
 {
 	CachedPlan *plan = plansource->gplan;
 
@@ -820,13 +834,24 @@ CheckCachedPlan(CachedPlanSource *plansource)
 	 */
 	if (plan->is_valid)
 	{
+		List   *part_prune_result_list;
+
 		/*
 		 * Plan must have positive refcount because it is referenced by
 		 * plansource; so no need to fear it disappears under us here.
 		 */
 		Assert(plan->refcount > 0);
 
-		AcquireExecutorLocks(plan->stmt_list, true);
+		/*
+		 * Lock relations scanned by the plan.  If ExecutorDoInitialPruning()
+		 * asked to omit some relations because the plan nodes that scan them
+		 * were found to be pruned, the executor will be informed of the
+		 * omission of the plan nodes themselves via part_prune_result_list
+		 * that is passed to it along with the list of PlannedStmts, so that
+		 * it doesn't accidentally try to execute those nodes.
+		 */
+		part_prune_result_list = AcquireExecutorLocks(plan->stmt_list,
+													   boundParams);
 
 		/*
 		 * If plan was transient, check to see if TransactionXmin has
@@ -844,11 +869,14 @@ CheckCachedPlan(CachedPlanSource *plansource)
 		if (plan->is_valid)
 		{
 			/* Successfully revalidated and locked the query. */
+
+			/* Remember pruning results in the CachedPlan. */
+			CachedPlanSavePartitionPruneResults(plan, part_prune_result_list);
 			return true;
 		}
 
 		/* Oops, the race case happened.  Release useless locks. */
-		AcquireExecutorLocks(plan->stmt_list, false);
+		ReleaseExecutorLocks(plan->stmt_list, part_prune_result_list);
 	}
 
 	/*
@@ -880,7 +908,8 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 				ParamListInfo boundParams, QueryEnvironment *queryEnv)
 {
 	CachedPlan *plan;
-	List	   *plist;
+	List	   *plist,
+			   *part_prune_result_list;
 	bool		snapshot_set;
 	bool		is_transient;
 	MemoryContext plan_context;
@@ -933,7 +962,8 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 	 * Generate the plan.
 	 */
 	plist = pg_plan_queries(qlist, plansource->query_string,
-							plansource->cursor_options, boundParams);
+							plansource->cursor_options, boundParams,
+							&part_prune_result_list);
 
 	/* Release snapshot if we got one */
 	if (snapshot_set)
@@ -1002,6 +1032,16 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 	plan->is_saved = false;
 	plan->is_valid = true;
 
+	/*
+	 * Save a dummy part_prune_result_list, that is a list containing NULLs
+	 * as elements.  We must do this, because users of the CachedPlan expect
+	 * one to go with the list of PlannedStmts.
+	 * XXX maybe get rid of that contract.
+	 */
+	plan->part_prune_result_list_context = NULL;
+	CachedPlanSavePartitionPruneResults(plan, part_prune_result_list);
+	Assert(MemoryContextIsValid(plan->part_prune_result_list_context));
+
 	/* assign generation number to new plan */
 	plan->generation = ++(plansource->generation);
 
@@ -1160,7 +1200,7 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 
 	if (!customplan)
 	{
-		if (CheckCachedPlan(plansource))
+		if (CheckCachedPlan(plansource, boundParams))
 		{
 			/* We want a generic plan, and we already have a valid one */
 			plan = plansource->gplan;
@@ -1586,6 +1626,49 @@ CopyCachedPlan(CachedPlanSource *plansource)
 	return newsource;
 }
 
+/*
+ * CachedPlanSavePartitionPruneResults
+ *		Save the list containing PartitionPruneResult nodes into the given
+ *		CachedPlan
+ *
+ * The provided list is copied into a dedicated context that is a child of
+ * plan->context.  If the child context already exists, it is emptied, because
+ * any PartitionPruneResult contained therein would no longer be useful.
+ */
+static void
+CachedPlanSavePartitionPruneResults(CachedPlan *plan, List *part_prune_result_list)
+{
+	MemoryContext	part_prune_result_list_context = plan->part_prune_result_list_context,
+					oldcontext = CurrentMemoryContext;
+	List		   *part_prune_result_list_copy;
+
+	/*
+	 * Set up the dedicated context if not already done, saving it as a child
+	 * of the CachedPlan's context.
+	 */
+	if (part_prune_result_list_context == NULL)
+	{
+		part_prune_result_list_context = AllocSetContextCreate(CurrentMemoryContext,
+												 "CachedPlan part_prune_results list",
+												 ALLOCSET_START_SMALL_SIZES);
+		MemoryContextSetParent(part_prune_result_list_context, plan->context);
+		MemoryContextSetIdentifier(part_prune_result_list_context, plan->context->ident);
+		plan->part_prune_result_list_context = part_prune_result_list_context;
+	}
+	else
+	{
+		/* Just clear existing contents by resetting the context. */
+		Assert(MemoryContextIsValid(part_prune_result_list_context));
+		MemoryContextReset(part_prune_result_list_context);
+	}
+
+	MemoryContextSwitchTo(part_prune_result_list_context);
+	part_prune_result_list_copy = copyObject(part_prune_result_list);
+	MemoryContextSwitchTo(oldcontext);
+
+	plan->part_prune_result_list = part_prune_result_list_copy;
+}
+
 /*
  * CachedPlanIsValid: test whether the rewritten querytree within a
  * CachedPlanSource is currently valid (that is, not marked as being in need
@@ -1737,17 +1820,21 @@ QueryListGetPrimaryStmt(List *stmts)
 
 /*
  * AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
- * or release them if acquire is false.
+ *
+ * Returns a list of PartitionPruneResult nodes containing one element for each
+ * PlannedStmt in stmt_list, or NULL if the latter is a utility statement or
+ * its containsInitialPruning is false.
  */
-static void
-AcquireExecutorLocks(List *stmt_list, bool acquire)
+static List *
+AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams)
 {
 	ListCell   *lc1;
+	List	   *part_prune_result_list = NIL;
 
 	foreach(lc1, stmt_list)
 	{
 		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
-		ListCell   *lc2;
+		PartitionPruneResult *part_prune_result = NULL;
 
 		if (plannedstmt->commandType == CMD_UTILITY)
 		{
@@ -1761,27 +1848,122 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 			Query	   *query = UtilityContainsQuery(plannedstmt->utilityStmt);
 
 			if (query)
-				ScanQueryForLocks(query, acquire);
-			continue;
+				ScanQueryForLocks(query, true);
 		}
-
-		foreach(lc2, plannedstmt->rtable)
+		else
 		{
-			RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
-
-			if (rte->rtekind != RTE_RELATION)
-				continue;
+			Bitmapset  *lockRelids;
+			int			rti;
 
 			/*
-			 * Acquire the appropriate type of lock on each relation OID. Note
-			 * that we don't actually try to open the rel, and hence will not
-			 * fail if it's been dropped entirely --- we'll just transiently
-			 * acquire a non-conflicting lock.
+			 * Figure out the set of relations that would need to be locked
+			 * before executing the plan.
 			 */
-			if (acquire)
+			if (plannedstmt->containsInitialPruning)
+			{
+				/*
+				 * Obtain the set of partitions to be locked from the
+				 * PartitionPruneInfos by considering the result of performing
+				 * initial partition pruning.
+				 */
+				part_prune_result =
+					ExecutorDoInitialPruning(plannedstmt, boundParams);
+
+				lockRelids = bms_union(plannedstmt->minLockRelids,
+									   part_prune_result->scan_leafpart_rtis);
+			}
+			else
+				lockRelids = plannedstmt->minLockRelids;
+
+			rti = -1;
+			while ((rti = bms_next_member(lockRelids, rti)) >= 0)
+			{
+				RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
+
+				if (rte->rtekind != RTE_RELATION)
+					continue;
+
+				/*
+				 * Acquire the appropriate type of lock on each relation OID.
+				 * Note that we don't actually try to open the rel, and hence
+				 * will not fail if it's been dropped entirely --- we'll just
+				 * transiently acquire a non-conflicting lock.
+				 */
 				LockRelationOid(rte->relid, rte->rellockmode);
+			}
+		}
+
+		/*
+		 * Remember PartitionPruneResult for later adding to the QueryDesc that
+		 * will be passed to the executor when executing this plan.  May be
+		 * NULL, but must keep the list the same length as stmt_list.
+		 */
+		part_prune_result_list = lappend(part_prune_result_list,
+										 part_prune_result);
+	}
+
+	return part_prune_result_list;
+}
+
+/*
+ * ReleaseExecutorLocks
+ * 		Release locks that would've been acquired by an earlier call to
+ * 		AcquireExecutorLocks()
+ */
+static void
+ReleaseExecutorLocks(List *stmt_list, List *part_prune_result_list)
+{
+	ListCell   *lc1,
+			   *lc2;
+
+	forboth(lc1, stmt_list, lc2, part_prune_result_list)
+	{
+		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
+		PartitionPruneResult *part_prune_result = lfirst_node(PartitionPruneResult, lc2);
+
+		if (plannedstmt->commandType == CMD_UTILITY)
+		{
+			/*
+			 * Ignore utility statements, except those (such as EXPLAIN) that
+			 * contain a parsed-but-not-planned query.  Note: it's okay to use
+			 * ScanQueryForLocks, even though the query hasn't been through
+			 * rule rewriting, because rewriting doesn't change the query
+			 * representation.
+			 */
+			Query	   *query = UtilityContainsQuery(plannedstmt->utilityStmt);
+
+			if (query)
+				ScanQueryForLocks(query, false);
+		}
+		else
+		{
+			Bitmapset  *lockRelids;
+			int			rti;
+
+			if (part_prune_result == NULL)
+			{
+				Assert(!plannedstmt->containsInitialPruning);
+				lockRelids = plannedstmt->minLockRelids;
+			}
 			else
+			{
+				Assert(plannedstmt->containsInitialPruning);
+				lockRelids = bms_union(plannedstmt->minLockRelids,
+									   part_prune_result->scan_leafpart_rtis);
+			}
+
+			rti = -1;
+			while ((rti = bms_next_member(lockRelids, rti)) >= 0)
+			{
+				RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
+
+				if (rte->rtekind != RTE_RELATION)
+					continue;
+
+				/* See the comment in AcquireExecutorLocks(). */
 				UnlockRelationOid(rte->relid, rte->rellockmode);
+			}
+
 		}
 	}
 }
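
As a rough illustration of how AcquireExecutorLocks() and ReleaseExecutorLocks() above arrive at the same set of relations, here is a small standalone sketch; it is not part of the patch. RtiSet, rtiset_next_member() and collect_lock_rtis() are hypothetical names, with a 64-bit mask standing in for Bitmapset and a has_result flag standing in for a non-NULL PartitionPruneResult.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for Bitmapset: bit i set means RT index i is a member. */
typedef uint64_t RtiSet;

/*
 * Return the smallest member greater than prev, or -2 if there is none,
 * following the contract of bms_next_member().
 */
static int
rtiset_next_member(RtiSet s, int prev)
{
	for (int i = prev + 1; i < 64; i++)
	{
		if (s & (UINT64_C(1) << i))
			return i;
	}
	return -2;
}

/*
 * Collect into out[] the RT indexes whose relations would be locked (or
 * unlocked): always minLockRelids, plus the leaf partitions that survived
 * initial pruning when a pruning result is available.  Returns the count.
 */
static int
collect_lock_rtis(RtiSet min_lock_relids, bool has_result,
				  RtiSet scan_leafpart_rtis, int *out)
{
	RtiSet		lock_relids = min_lock_relids;
	int			n = 0;
	int			rti = -1;

	if (has_result)
		lock_relids |= scan_leafpart_rtis;

	while ((rti = rtiset_next_member(lock_relids, rti)) >= 0)
		out[n++] = rti;

	return n;
}
```

Because both the acquire and the release path derive the set from the same two inputs, the release path only stays symmetric if it is handed the PartitionPruneResult that was actually produced while acquiring.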
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index d549f66d4a..4705dc4097 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -285,6 +285,7 @@ PortalDefineQuery(Portal portal,
 				  const char *sourceText,
 				  CommandTag commandTag,
 				  List *stmts,
+				  List *part_prune_results,
 				  CachedPlan *cplan)
 {
 	AssertArg(PortalIsValid(portal));
@@ -299,6 +300,7 @@ PortalDefineQuery(Portal portal,
 	portal->qc.nprocessed = 0;
 	portal->commandTag = commandTag;
 	portal->stmts = stmts;
+	portal->part_prune_results = part_prune_results;
 	portal->cplan = cplan;
 	portal->status = PORTAL_DEFINED;
 }
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 666977fb1f..34975c69ee 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -87,7 +87,8 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
 							  ExplainState *es, const char *queryString,
 							  ParamListInfo params, QueryEnvironment *queryEnv);
 
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt, PartitionPruneResult *part_prune_result,
+						   IntoClause *into,
 						   ExplainState *es, const char *queryString,
 						   ParamListInfo params, QueryEnvironment *queryEnv,
 						   const instr_time *planduration,
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 708435e952..bd8776402e 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -45,6 +45,7 @@ extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
  * nparts						Length of subplan_map[] and subpart_map[].
  * subplan_map					Subplan index by partition index, or -1.
  * subpart_map					Subpart index by partition index, or -1.
+ * rti_map						Range table index by partition index, or 0.
  * present_parts				A Bitmapset of the partition indexes that we
  *								have subplans or subparts for.
  * initial_pruning_steps		List of PartitionPruneSteps used to
@@ -61,6 +62,7 @@ typedef struct PartitionedRelPruningData
 	int			nparts;
 	int		   *subplan_map;
 	int		   *subpart_map;
+	Index	   *rti_map;
 	Bitmapset  *present_parts;
 	List	   *initial_pruning_steps;
 	List	   *exec_pruning_steps;
@@ -123,9 +125,13 @@ typedef struct PartitionPruneState
 
 extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
 													 int n_total_subplans,
-													 PartitionPruneInfo *pruneinfo,
+													 int part_prune_index,
 													 Bitmapset **initially_valid_subplans);
 extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
-										   bool initial_prune);
-
+										   bool initial_prune,
+										   Bitmapset **scan_leafpart_rtis);
+extern Bitmapset *ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt,
+								ParamListInfo params,
+								PartitionPruneInfo *pruneinfo,
+								Bitmapset **scan_leafpart_rtis);
 #endif							/* EXECPARTITION_H */
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index e79e2c001f..60d5644908 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -35,6 +35,8 @@ typedef struct QueryDesc
 	/* These fields are provided by CreateQueryDesc */
 	CmdType		operation;		/* CMD_SELECT, CMD_UPDATE, etc. */
 	PlannedStmt *plannedstmt;	/* planner's output (could be utility, too) */
+	PartitionPruneResult *part_prune_result; /* ExecutorDoInitialPruning()'s
+											  * output for plannedstmt */
 	const char *sourceText;		/* source text of the query */
 	Snapshot	snapshot;		/* snapshot to use for query */
 	Snapshot	crosscheck_snapshot;	/* crosscheck for RI update/delete */
@@ -57,6 +59,7 @@ typedef struct QueryDesc
 
 /* in pquery.c */
 extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
+								  PartitionPruneResult *part_prune_result,
 								  const char *sourceText,
 								  Snapshot snapshot,
 								  Snapshot crosscheck_snapshot,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index 873772f188..57dc0e8077 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -185,6 +185,8 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
 /*
  * prototypes from functions in execMain.c
  */
+extern PartitionPruneResult *ExecutorDoInitialPruning(PlannedStmt *plannedstmt,
+													  ParamListInfo params);
 extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
 extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
 extern void ExecutorRun(QueryDesc *queryDesc,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index cbbcff81d2..b5a7fd7e16 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -596,6 +596,8 @@ typedef struct EState
 	struct ExecRowMark **es_rowmarks;	/* Array of per-range-table-entry
 										 * ExecRowMarks, or NULL if none */
 	PlannedStmt *es_plannedstmt;	/* link to top of plan tree */
+	List		*es_part_prune_infos;	/* PlannedStmt.partPruneInfos */
+	struct PartitionPruneResult *es_part_prune_result; /* QueryDesc.part_prune_result */
 	const char *es_sourceText;	/* Source text from QueryDesc */
 
 	JunkFilter *es_junkFilter;	/* top-level junk filter, if any */
@@ -984,6 +986,19 @@ typedef struct DomainConstraintState
  */
 typedef TupleTableSlot *(*ExecProcNodeMtd) (struct PlanState *pstate);
 
+/*----------------
+ * PartitionPruneResult
+ *
+ * Result of ExecutorDoInitialPruning() invocation on a given plan.
+ */
+typedef struct PartitionPruneResult
+{
+	NodeTag		type;
+
+	Bitmapset	   *scan_leafpart_rtis;
+	List		   *valid_subplan_offs_list;
+} PartitionPruneResult;
+
 /* ----------------
  *		PlanState node
  *
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 300824258e..de312b9215 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -97,6 +97,9 @@ typedef enum NodeTag
 	T_PartitionPruneStepCombine,
 	T_PlanInvalItem,
 
+	/* TAGS FOR EXECUTOR PREP NODES (execnodes.h) */
+	T_PartitionPruneResult,
+
 	/*
 	 * TAGS FOR PLAN STATE NODES (execnodes.h)
 	 *
@@ -673,6 +676,7 @@ extern struct Bitmapset *readBitmapset(void);
 extern uintptr_t readDatum(bool typbyval);
 extern bool *readBoolCols(int numCols);
 extern int *readIntCols(int numCols);
+extern Index *readIndexCols(int numCols);
 extern Oid *readOidCols(int numCols);
 extern int16 *readAttrNumberCols(int numCols);
 
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 6cbcb67bdf..f2039071c9 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -107,6 +107,18 @@ typedef struct PlannerGlobal
 
 	List	   *appendRelations;	/* "flat" list of AppendRelInfos */
 
+	List	   *partPruneInfos;		/* List of PartitionPruneInfo contained in
+									 * the plan */
+
+	bool		containsInitialPruning;	/* Do any of those PartitionPruneInfos
+										 * have initial (pre-exec) pruning
+										 * steps in them? */
+
+	Bitmapset  *minLockRelids;	/* RT indexes of RTE_RELATION entries that
+								 * must always be locked to execute the plan;
+								 * those scanned by initial-prunable plan
+								 * nodes are not included */
+
 	List	   *relationOids;	/* OIDs of relations the plan depends on */
 
 	List	   *invalItems;		/* other dependencies, as PlanInvalItems */
@@ -377,6 +389,9 @@ struct PlannerInfo
 
 	/* Does this query modify any partition key columns? */
 	bool		partColsUpdated;
+
+	/* PartitionPruneInfos added in this query's plan. */
+	List	   *partPruneInfos;
 };
 
 
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 50ef3dda05..0a144a1e92 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -64,8 +64,19 @@ typedef struct PlannedStmt
 
 	struct Plan *planTree;		/* tree of Plan nodes */
 
+	List	   *partPruneInfos;	/* List of PartitionPruneInfo contained in
+								 * the plan */
+
+	bool		containsInitialPruning;	/* Do any of those PartitionPruneInfos
+										 * have initial (pre-exec) pruning
+										 * steps in them? */
+
 	List	   *rtable;			/* list of RangeTblEntry nodes */
 
+	Bitmapset  *minLockRelids;	/* RT indexes of RTE_RELATION entries that
+								 * must be locked, except those scanned by
+								 * initial-prunable plan nodes */
+
 	/* rtable indexes of target relations for INSERT/UPDATE/DELETE */
 	List	   *resultRelations;	/* integer list of RT indexes, or NIL */
 
@@ -262,8 +273,12 @@ typedef struct Append
 	 */
 	int			first_partial_plan;
 
-	/* Info for run-time subplan pruning; NULL if we're not doing that */
-	struct PartitionPruneInfo *part_prune_info;
+	/*
+	 * Index of this plan's PartitionPruneInfo in PlannedStmt.partPruneInfos
+	 * to be used for run-time subplan pruning; -1 if run-time pruning is
+	 * not needed.
+	 */
+	int			part_prune_index;
 } Append;
 
 /* ----------------
@@ -282,8 +297,13 @@ typedef struct MergeAppend
 	Oid		   *sortOperators;	/* OIDs of operators to sort them by */
 	Oid		   *collations;		/* OIDs of collations */
 	bool	   *nullsFirst;		/* NULLS FIRST/LAST directions */
-	/* Info for run-time subplan pruning; NULL if we're not doing that */
-	struct PartitionPruneInfo *part_prune_info;
+
+	/*
+	 * Index of this plan's PartitionPruneInfo in PlannedStmt.partPruneInfos
+	 * to be used for run-time subplan pruning; -1 if run-time pruning is
+	 * not needed.
+	 */
+	int			part_prune_index;
 } MergeAppend;
 
 /* ----------------
@@ -1175,6 +1195,13 @@ typedef struct PlanRowMark
  * prune_infos			List of Lists containing PartitionedRelPruneInfo nodes,
  *						one sublist per run-time-prunable partition hierarchy
  *						appearing in the parent plan node's subplans.
+ *
+ * needs_init_pruning	Does any of the PartitionedRelPruneInfos in
+ *						prune_infos have its initial_pruning_steps set?
+ *
+ * needs_exec_pruning	Does any of the PartitionedRelPruneInfos in
+ *						prune_infos have its exec_pruning_steps set?
+ *
  * other_subplans		Indexes of any subplans that are not accounted for
  *						by any of the PartitionedRelPruneInfo nodes in
  *						"prune_infos".  These subplans must not be pruned.
@@ -1183,6 +1210,9 @@ typedef struct PartitionPruneInfo
 {
 	NodeTag		type;
 	List	   *prune_infos;
+	Bitmapset  *leafpart_rtis;
+	bool		needs_init_pruning;
+	bool		needs_exec_pruning;
 	Bitmapset  *other_subplans;
 } PartitionPruneInfo;
 
@@ -1213,6 +1243,7 @@ typedef struct PartitionedRelPruneInfo
 	int		   *subplan_map;	/* subplan index by partition index, or -1 */
 	int		   *subpart_map;	/* subpart index by partition index, or -1 */
 	Oid		   *relid_map;		/* relation OID by partition index, or 0 */
+	Index	   *rti_map;		/* Range table index by partition index, or 0. */
 
 	/*
 	 * initial_pruning_steps shows how to prune during executor startup (i.e.,
diff --git a/src/include/tcop/tcopprot.h b/src/include/tcop/tcopprot.h
index 92291a750d..119d4a1d10 100644
--- a/src/include/tcop/tcopprot.h
+++ b/src/include/tcop/tcopprot.h
@@ -64,7 +64,7 @@ extern PlannedStmt *pg_plan_query(Query *querytree, const char *query_string,
 								  ParamListInfo boundParams);
 extern List *pg_plan_queries(List *querytrees, const char *query_string,
 							 int cursorOptions,
-							 ParamListInfo boundParams);
+							 ParamListInfo boundParams, List **part_prune_result_list);
 
 extern bool check_max_stack_depth(int *newval, void **extra, GucSource source);
 extern void assign_max_stack_depth(int newval, void *extra);
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 95b99e3d25..f591b9df9c 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -148,6 +148,9 @@ typedef struct CachedPlan
 {
 	int			magic;			/* should equal CACHEDPLAN_MAGIC */
 	List	   *stmt_list;		/* list of PlannedStmts */
+	List	   *part_prune_result_list;	/* list of PartitionPruneResult with
+									 * one element for each of stmt_list; NIL
+									 * if not a generic plan */
 	bool		is_oneshot;		/* is it a "oneshot" plan? */
 	bool		is_saved;		/* is CachedPlan in a long-lived context? */
 	bool		is_valid;		/* is the stmt_list currently valid? */
@@ -158,6 +161,10 @@ typedef struct CachedPlan
 	int			generation;		/* parent's generation number for this plan */
 	int			refcount;		/* count of live references to this struct */
 	MemoryContext context;		/* context containing this CachedPlan */
+	MemoryContext part_prune_result_list_context; /* context containing
+												   * part_prune_result_list,
+												   * a child of the above
+												   * context */
 } CachedPlan;
 
 /*
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index aeddbdafe5..c1e304f9d7 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -137,6 +137,10 @@ typedef struct PortalData
 	CommandTag	commandTag;		/* command tag for original query */
 	QueryCompletion qc;			/* command completion data for executed query */
 	List	   *stmts;			/* list of PlannedStmts */
+	List	   *part_prune_results;	/* list of PartitionPruneResults with one element
+								 * for each of 'stmts'; same as
+								 * cplan->part_prune_result_list if cplan is
+								 * not NULL */
 	CachedPlan *cplan;			/* CachedPlan, if stmts are from one */
 
 	ParamListInfo portalParams; /* params to pass to query */
@@ -241,6 +245,7 @@ extern void PortalDefineQuery(Portal portal,
 							  const char *sourceText,
 							  CommandTag commandTag,
 							  List *stmts,
+							  List *part_prune_results,
 							  CachedPlan *cplan);
 extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
 extern void PortalCreateHoldStore(Portal portal);
-- 
2.24.1




* Re: generic plans and "initial" pruning
@ 2022-04-07 08:27  Amit Langote <[email protected]>
  parent: Amit Langote <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: Amit Langote @ 2022-04-07 08:27 UTC (permalink / raw)
  To: David Rowley <[email protected]>; +Cc: Robert Haas <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers

On Wed, Apr 6, 2022 at 4:20 PM Amit Langote <[email protected]> wrote:
> And here is a version like that that passes make check-world.  Maybe
> still a WIP as I think comments could use more editing.
>
> Here's how the new implementation works:
>
> AcquireExecutorLocks() calls ExecutorDoInitialPruning(), which in turn
> iterates over a list of PartitionPruneInfos in a given PlannedStmt
> coming from a CachedPlan.  For each PartitionPruneInfo,
> ExecPartitionDoInitialPruning() is called, which sets up
> PartitionPruneState and performs initial pruning steps present in the
> PartitionPruneInfo.  The resulting bitmapsets of valid subplans, one
> for each PartitionPruneInfo, are collected in a list and added to a
> result node called PartitionPruneResult.  It represents the result of
> performing initial pruning on all PartitionPruneInfos found in a plan.
> A list of PartitionPruneResults is passed along with the PlannedStmt
> to the executor, which is referenced when initializing
> Append/MergeAppend nodes.
>
> PlannedStmt.minLockRelids defined by the planner contains the RT
> indexes of all the entries in the range table minus those of the leaf
> partitions whose subplans are subject to removal due to initial
> pruning.  AcquireExecutorLocks() adds back the RT indexes of only those
> leaf partitions whose subplans survive ExecutorDoInitialPruning().  To
> get the leaf partition RT indexes from the PartitionPruneInfo, a new
> rti_map array is added to PartitionedRelPruneInfo.
>
> There's only one patch this time.  Patches that added partitioned_rels
> and plan_tree_walker() are no longer necessary.
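
To make the quoted description concrete, here is a toy model of the
locking scheme.  It is only a sketch, not code from the patch: RtiSet,
ToyPruneInfo, ToyPruneResult, and the toy_* functions are hypothetical
stand-ins, a plain 64-bit mask replaces Bitmapset, and "initial pruning"
is reduced to an equality test against a single integer parameter.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a Bitmapset: bit i set means member i. */
typedef uint64_t RtiSet;

#define TOY_MAX_SUBPLANS 8
#define TOY_MAX_INFOS    4

/* Toy model of one PartitionPruneInfo (names are illustrative only). */
typedef struct ToyPruneInfo
{
	int			nsubplans;
	int			part_key[TOY_MAX_SUBPLANS];	/* value accepted by subplan i */
	int			leaf_rti[TOY_MAX_SUBPLANS];	/* RT index scanned by subplan i */
} ToyPruneInfo;

/* Toy model of PartitionPruneResult. */
typedef struct ToyPruneResult
{
	RtiSet		scan_leafpart_rtis;		/* leaf RT indexes that survive */
	uint64_t	valid_subplan_offs[TOY_MAX_INFOS];	/* one mask per info */
	int			ninfos;
} ToyPruneResult;

/*
 * Prune one ToyPruneInfo: return the mask of surviving subplan offsets and
 * add the RT indexes of their leaf partitions to *scan_leafpart_rtis.
 */
uint64_t
toy_do_initial_pruning(const ToyPruneInfo *info, int param,
					   RtiSet *scan_leafpart_rtis)
{
	uint64_t	valid = 0;

	assert(info->nsubplans <= TOY_MAX_SUBPLANS);
	for (int i = 0; i < info->nsubplans; i++)
	{
		if (info->part_key[i] == param)
		{
			assert(info->leaf_rti[i] >= 0 && info->leaf_rti[i] < 64);
			valid |= UINT64_C(1) << i;
			*scan_leafpart_rtis |= UINT64_C(1) << info->leaf_rti[i];
		}
	}
	return valid;
}

/*
 * Compute the set of RT indexes to lock: the planner-provided minimum set
 * plus the leaf partitions whose subplans survive initial pruning.
 */
RtiSet
toy_relids_to_lock(RtiSet min_lock_relids, const ToyPruneInfo *infos,
				   int ninfos, int param, ToyPruneResult *result)
{
	assert(ninfos <= TOY_MAX_INFOS);
	result->scan_leafpart_rtis = 0;
	result->ninfos = ninfos;
	for (int i = 0; i < ninfos; i++)
		result->valid_subplan_offs[i] =
			toy_do_initial_pruning(&infos[i], param,
								   &result->scan_leafpart_rtis);

	return min_lock_relids | result->scan_leafpart_rtis;
}
```

In this model the per-info masks play the role of the list of valid
subplan bitmapsets, and the returned set corresponds to minLockRelids
with the surviving leaf partitions added back.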

Here's an updated version.  In particular, I removed the
part_prune_results list from PortalData; anything that needs to look
at the list can instead get it from the CachedPlan
(PortalData.cplan).  This makes things better in 2 ways:

* All the changes that were needed to produce the list to be passed to
PortalDefineQuery() are now unnecessary (especially ugly ones were
those made to pg_plan_queries()'s interface)

* The cases in which the PartitionPruneResult being added to a
QueryDesc can be assumed to be valid are more clearly defined now; it's
the cases where the portal's CachedPlan is also valid, that is, if the
accompanying PlannedStmt is a cached one.

-- 
Amit Langote
EDB: http://www.enterprisedb.com


Attachments:

  [application/octet-stream] v12-0001-Optimize-AcquireExecutorLocks-to-skip-pruned-par.patch (90.2K, 2-v12-0001-Optimize-AcquireExecutorLocks-to-skip-pruned-par.patch)
From f55a622383c90c3f300dede0d04247f7cf2d9e77 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Wed, 22 Dec 2021 16:55:17 +0900
Subject: [PATCH v12] Optimize AcquireExecutorLocks() to skip pruned partitions

---
 src/backend/commands/copyto.c           |   2 +-
 src/backend/commands/createas.c         |   2 +-
 src/backend/commands/explain.c          |   7 +-
 src/backend/commands/extension.c        |   2 +-
 src/backend/commands/matview.c          |   2 +-
 src/backend/commands/prepare.c          |  13 +-
 src/backend/executor/README             |  28 +++
 src/backend/executor/execMain.c         |  46 +++++
 src/backend/executor/execParallel.c     |  28 ++-
 src/backend/executor/execPartition.c    | 238 ++++++++++++++++++++----
 src/backend/executor/execUtils.c        |   1 +
 src/backend/executor/functions.c        |   2 +-
 src/backend/executor/nodeAppend.c       |  16 +-
 src/backend/executor/nodeMergeAppend.c  |   9 +-
 src/backend/executor/spi.c              |  10 +-
 src/backend/nodes/copyfuncs.c           |  33 +++-
 src/backend/nodes/outfuncs.c            |  36 +++-
 src/backend/nodes/readfuncs.c           |  56 +++++-
 src/backend/optimizer/plan/createplan.c |  20 +-
 src/backend/optimizer/plan/planner.c    |   3 +
 src/backend/optimizer/plan/setrefs.c    | 104 ++++++++---
 src/backend/partitioning/partprune.c    |  41 +++-
 src/backend/tcop/pquery.c               |  28 ++-
 src/backend/utils/cache/plancache.c     | 236 ++++++++++++++++++++---
 src/include/commands/explain.h          |   3 +-
 src/include/executor/execPartition.h    |  12 +-
 src/include/executor/execdesc.h         |   3 +
 src/include/executor/executor.h         |   2 +
 src/include/nodes/execnodes.h           |  15 ++
 src/include/nodes/nodes.h               |   4 +
 src/include/nodes/pathnodes.h           |  15 ++
 src/include/nodes/plannodes.h           |  39 +++-
 src/include/utils/plancache.h           |   7 +
 33 files changed, 919 insertions(+), 144 deletions(-)

diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index 55c38b04c4..d403eb2309 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -542,7 +542,7 @@ BeginCopyTo(ParseState *pstate,
 		((DR_copy *) dest)->cstate = cstate;
 
 		/* Create a QueryDesc requesting no output */
-		cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+		cstate->queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
 											dest, NULL, NULL, 0);
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 9abbb6b555..f6607f2454 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -325,7 +325,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
 		UpdateActiveSnapshotCommandId();
 
 		/* Create a QueryDesc, redirecting output to our tuple receiver */
-		queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+		queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
 									GetActiveSnapshot(), InvalidSnapshot,
 									dest, params, queryEnv, 0);
 
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 1e5701b8eb..7ba9852e51 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -407,7 +407,7 @@ ExplainOneQuery(Query *query, int cursorOptions,
 		}
 
 		/* run it (if needed) and produce output */
-		ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+		ExplainOnePlan(plan, NULL, into, es, queryString, params, queryEnv,
 					   &planduration, (es->buffers ? &bufusage : NULL));
 	}
 }
@@ -515,7 +515,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
  * to call it.
  */
 void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, PartitionPruneResult *part_prune_result,
+			   IntoClause *into, ExplainState *es,
 			   const char *queryString, ParamListInfo params,
 			   QueryEnvironment *queryEnv, const instr_time *planduration,
 			   const BufferUsage *bufusage)
@@ -563,7 +564,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
 		dest = None_Receiver;
 
 	/* Create a QueryDesc for the query */
-	queryDesc = CreateQueryDesc(plannedstmt, queryString,
+	queryDesc = CreateQueryDesc(plannedstmt, part_prune_result, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, params, queryEnv, instrument_option);
 
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index 1013790dbb..54734a3a93 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -776,7 +776,7 @@ execute_sql_string(const char *sql)
 			{
 				QueryDesc  *qdesc;
 
-				qdesc = CreateQueryDesc(stmt,
+				qdesc = CreateQueryDesc(stmt, NULL,
 										sql,
 										GetActiveSnapshot(), NULL,
 										dest, NULL, NULL, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 9ab248d25e..2be1782bc4 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -416,7 +416,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
 	UpdateActiveSnapshotCommandId();
 
 	/* Create a QueryDesc, redirecting output to our tuple receiver */
-	queryDesc = CreateQueryDesc(plan, queryString,
+	queryDesc = CreateQueryDesc(plan, NULL, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, NULL, NULL, 0);
 
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 80738547ed..45039e64be 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -576,7 +576,9 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	const char *query_string;
 	CachedPlan *cplan;
 	List	   *plan_list;
-	ListCell   *p;
+	List	   *plan_part_prune_result_list;
+	ListCell   *p,
+			   *pp;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
 	instr_time	planstart;
@@ -632,15 +634,18 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	}
 
 	plan_list = cplan->stmt_list;
+	plan_part_prune_result_list = cplan->part_prune_result_list;
 
 	/* Explain each query */
-	foreach(p, plan_list)
+	forboth(p, plan_list, pp, plan_part_prune_result_list)
 	{
 		PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
+		PartitionPruneResult *part_prune_result = lfirst_node(PartitionPruneResult, pp);
 
 		if (pstmt->commandType != CMD_UTILITY)
-			ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv,
-						   &planduration, (es->buffers ? &bufusage : NULL));
+			ExplainOnePlan(pstmt, part_prune_result, into, es, query_string,
+						   paramLI, queryEnv, &planduration,
+						   (es->buffers ? &bufusage : NULL));
 		else
 			ExplainOneUtility(pstmt->utilityStmt, into, es, query_string,
 							  paramLI, queryEnv);
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 0b5183fc4a..8418e758da 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -65,6 +65,30 @@ found there.  This currently only occurs for Append and MergeAppend nodes.  In
 this case the non-required subplans are ignored and the executor state's
 subnode array will become out of sequence to the plan's subplan list.
 
+Actually, the so-called execution time pruning may also occur even before the
+execution has started.  One case where that occurs is when a cached generic
+plan is being validated for execution by plancache.c: GetCachedPlan(), which
+proceeds by locking all the relations that will be scanned by that plan.  If
+the generic plan has nodes that contain so-called initial pruning steps (a
+subset of execution pruning steps that do not depend on full-fledged execution
+having started), they are performed at this point to figure out the minimal
+set of child subplans that satisfy those pruning instructions and the result
+of performing that pruning is saved in a data structure that gets passed to
+the executor alongside the plan tree.  Relations scanned by only those
+surviving subplans are then locked while those scanned by the pruned subplans
+are not, even though the pruned subplans themselves are not removed from the
+plan tree. So, it is imperative that the executor and any third party code
+invoked by it that gets passed the plan tree look at the initial pruning result
+made available via the aforementioned data structure to determine whether or
+not a particular subplan is valid.  The data structure basically consists of
+a PartitionPruneResult node passed through the QueryDesc (subsequently added
+to EState) containing a list of bitmapsets with one element for every
+PartitionPruneInfo found in PlannedStmt.partPruneInfos.  The list is indexed
+with part_prune_index of the individual PartitionPruneInfos that's stored in
+the parent plan nodes to which a given PartitionPruneInfo belongs.  Each
+bitmapset contains the indexes of the child subplans of the given parent plan
+node that survive initial partition pruning.
+
 Each Plan node may have expression trees associated with it, to represent
 its target list, qualification conditions, etc.  These trees are also
 read-only to the executor, but the executor state for expression evaluation
@@ -286,6 +310,10 @@ Query Processing Control Flow
 
 This is a sketch of control flow for full query processing:
 
+  [ ExecutorDoInitialPruning ] --- an optional step to perform initial
+		partition pruning on the plan tree the result of which is passed
+		to the executor via QueryDesc
+
 	CreateQueryDesc
 
 	ExecutorStart
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index ef2fd46092..05cc99df8f 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -49,11 +49,13 @@
 #include "commands/matview.h"
 #include "commands/trigger.h"
 #include "executor/execdebug.h"
+#include "executor/execPartition.h"
 #include "executor/nodeSubplan.h"
 #include "foreign/fdwapi.h"
 #include "jit/jit.h"
 #include "mb/pg_wchar.h"
 #include "miscadmin.h"
+#include "nodes/nodeFuncs.h"
 #include "parser/parsetree.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
@@ -104,6 +106,47 @@ static void EvalPlanQualStart(EPQState *epqstate, Plan *planTree);
 
 /* end of local decls */
 
+/* ----------------------------------------------------------------
+ *		ExecutorDoInitialPruning
+ *
+ *		Performs initial partition pruning to figure out the minimal set of
+ *		subplans to be executed and the set of RT indexes of the corresponding
+ *		leaf partitions
+ *
+ * Returned PartitionPruneResult must be subsequently passed to the executor
+ * so that it can reuse the result of pruning.  It's important that the
+ * executor has the same view of which partitions are initially pruned (by
+ * not doing the pruning again itself) or otherwise it risks initializing
+ * subplans whose partitions would not have been locked.
+ *
+ * Note: Partitioned tables mentioned in PartitionedRelPruneInfo nodes that
+ * drive the pruning will be locked before doing the pruning.
+ */
+PartitionPruneResult *
+ExecutorDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params)
+{
+	PartitionPruneResult *result;
+	ListCell *lc;
+
+	/* Only get here if there is any pruning to do. */
+	Assert(plannedstmt->containsInitialPruning);
+
+	result = makeNode(PartitionPruneResult);
+	foreach(lc, plannedstmt->partPruneInfos)
+	{
+		PartitionPruneInfo *pruneinfo = lfirst(lc);
+		Bitmapset *valid_subplan_offs;
+
+		valid_subplan_offs =
+			ExecPartitionDoInitialPruning(plannedstmt, params, pruneinfo,
+										  &result->scan_leafpart_rtis);
+		result->valid_subplan_offs_list =
+			lappend(result->valid_subplan_offs_list,
+					valid_subplan_offs);
+	}
+
+	return result;
+}
 
 /* ----------------------------------------------------------------
  *		ExecutorStart
@@ -806,6 +849,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 {
 	CmdType		operation = queryDesc->operation;
 	PlannedStmt *plannedstmt = queryDesc->plannedstmt;
+	PartitionPruneResult *part_prune_result = queryDesc->part_prune_result;
 	Plan	   *plan = plannedstmt->planTree;
 	List	   *rangeTable = plannedstmt->rtable;
 	EState	   *estate = queryDesc->estate;
@@ -825,6 +869,8 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 	ExecInitRangeTable(estate, rangeTable);
 
 	estate->es_plannedstmt = plannedstmt;
+	estate->es_part_prune_infos = plannedstmt->partPruneInfos;
+	estate->es_part_prune_result = part_prune_result;
 
 	/*
 	 * Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index 9a0d5d59ef..805f86c503 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -66,6 +66,7 @@
 #define PARALLEL_KEY_QUERY_TEXT		UINT64CONST(0xE000000000000008)
 #define PARALLEL_KEY_JIT_INSTRUMENTATION UINT64CONST(0xE000000000000009)
 #define PARALLEL_KEY_WAL_USAGE			UINT64CONST(0xE00000000000000A)
+#define PARALLEL_KEY_PARTITIONPRUNERESULT	UINT64CONST(0xE00000000000000B)
 
 #define PARALLEL_TUPLE_QUEUE_SIZE		65536
 
@@ -182,7 +183,9 @@ ExecSerializePlan(Plan *plan, EState *estate)
 	pstmt->transientPlan = false;
 	pstmt->dependsOnRole = false;
 	pstmt->parallelModeNeeded = false;
+	pstmt->containsInitialPruning = false;
 	pstmt->planTree = plan;
+	pstmt->partPruneInfos = estate->es_part_prune_infos;
 	pstmt->rtable = estate->es_range_table;
 	pstmt->resultRelations = NIL;
 	pstmt->appendRelations = NIL;
@@ -596,12 +599,15 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	FixedParallelExecutorState *fpes;
 	char	   *pstmt_data;
 	char	   *pstmt_space;
+	char	   *part_prune_result_data;
+	char	   *part_prune_result_space;
 	char	   *paramlistinfo_space;
 	BufferUsage *bufusage_space;
 	WalUsage   *walusage_space;
 	SharedExecutorInstrumentation *instrumentation = NULL;
 	SharedJitInstrumentation *jit_instrumentation = NULL;
 	int			pstmt_len;
+	int			part_prune_result_len;
 	int			paramlistinfo_len;
 	int			instrumentation_len = 0;
 	int			jit_instrumentation_len = 0;
@@ -630,6 +636,7 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 
 	/* Fix up and serialize plan to be sent to workers. */
 	pstmt_data = ExecSerializePlan(planstate->plan, estate);
+	part_prune_result_data = nodeToString(estate->es_part_prune_result);
 
 	/* Create a parallel context. */
 	pcxt = CreateParallelContext("postgres", "ParallelQueryMain", nworkers);
@@ -656,6 +663,11 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	shm_toc_estimate_chunk(&pcxt->estimator, pstmt_len);
 	shm_toc_estimate_keys(&pcxt->estimator, 1);
 
+	/* Estimate space for serialized PartitionPruneResult. */
+	part_prune_result_len = strlen(part_prune_result_data) + 1;
+	shm_toc_estimate_chunk(&pcxt->estimator, part_prune_result_len);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
 	/* Estimate space for serialized ParamListInfo. */
 	paramlistinfo_len = EstimateParamListSpace(estate->es_param_list_info);
 	shm_toc_estimate_chunk(&pcxt->estimator, paramlistinfo_len);
@@ -750,6 +762,12 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	memcpy(pstmt_space, pstmt_data, pstmt_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PLANNEDSTMT, pstmt_space);
 
+	/* Store serialized PartitionPruneResult */
+	part_prune_result_space = shm_toc_allocate(pcxt->toc, part_prune_result_len);
+	memcpy(part_prune_result_space, part_prune_result_data, part_prune_result_len);
+	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARTITIONPRUNERESULT,
+				   part_prune_result_space);
+
 	/* Store serialized ParamListInfo. */
 	paramlistinfo_space = shm_toc_allocate(pcxt->toc, paramlistinfo_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARAMLISTINFO, paramlistinfo_space);
@@ -1231,8 +1249,10 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 						 int instrument_options)
 {
 	char	   *pstmtspace;
+	char	   *part_prune_result_space;
 	char	   *paramspace;
 	PlannedStmt *pstmt;
+	PartitionPruneResult *part_prune_result;
 	ParamListInfo paramLI;
 	char	   *queryString;
 
@@ -1243,12 +1263,18 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 	pstmtspace = shm_toc_lookup(toc, PARALLEL_KEY_PLANNEDSTMT, false);
 	pstmt = (PlannedStmt *) stringToNode(pstmtspace);
 
+	/* Reconstruct leader-supplied PartitionPruneResult. */
+	part_prune_result_space =
+		shm_toc_lookup(toc, PARALLEL_KEY_PARTITIONPRUNERESULT, false);
+	part_prune_result = (PartitionPruneResult *)
+		stringToNode(part_prune_result_space);
+
 	/* Reconstruct ParamListInfo. */
 	paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
 	paramLI = RestoreParamList(&paramspace);
 
 	/* Create a QueryDesc for the query. */
-	return CreateQueryDesc(pstmt,
+	return CreateQueryDesc(pstmt, part_prune_result,
 						   queryString,
 						   GetActiveSnapshot(), InvalidSnapshot,
 						   receiver, paramLI, NULL, instrument_options);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 615bd80973..3037742b8d 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -25,6 +25,7 @@
 #include "mb/pg_wchar.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
+#include "parser/parsetree.h"
 #include "partitioning/partbounds.h"
 #include "partitioning/partdesc.h"
 #include "partitioning/partprune.h"
@@ -185,7 +186,11 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
 static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
 static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
 static PartitionPruneState *CreatePartitionPruneState(PlanState *planstate,
-													  PartitionPruneInfo *pruneinfo);
+													  PartitionPruneInfo *pruneinfo,
+													  bool consider_initial_steps,
+													  bool consider_exec_steps,
+													  List *rtable, ExprContext *econtext,
+													  PartitionDirectory partdir);
 static void InitPartitionPruneContext(PartitionPruneContext *context,
 									  List *pruning_steps,
 									  PartitionDesc partdesc,
@@ -198,7 +203,8 @@ static void PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
 static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
 										   PartitionedRelPruningData *pprune,
 										   bool initial_prune,
-										   Bitmapset **validsubplans);
+										   Bitmapset **validsubplans,
+										   Bitmapset **scan_leafpart_rtis);
 
 
 /*
@@ -1587,8 +1593,10 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  * considered to be a stable expression, it can change value from one plan
  * node scan to the next during query execution.  Stable comparison
  * expressions that don't involve such Params allow partition pruning to be
- * done once during executor startup.  Expressions that do involve such Params
- * require us to prune separately for each scan of the parent plan node.
+ * done once during executor startup or during ExecutorDoInitialPruning() that
+ * runs as part of performing AcquireExecutorLocks() on a given plan tree.
+ * Expressions that do involve such Params require us to prune separately for
+ * each scan of the parent plan node.
  *
  * Note that pruning away unneeded subplans during executor startup has the
  * added benefit of not having to initialize the unneeded subplans at all.
@@ -1605,6 +1613,13 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *		account for initial pruning possibly having eliminated some of the
  *		subplans.
  *
+ * ExecPartitionDoInitialPruning:
+ *		Do initial pruning with the information contained in a given
+ *		PartitionPruneInfo to determine the minimal set of child subplans
+ *		of the parent plan node (the one that the PartitionPruneInfo belongs
+ *		to) that need to be executed, and also the set of RT indexes of the
+ *		leaf partitions that will be scanned with those subplans.
+ *
  * ExecFindMatchingSubPlans:
  *		Returns indexes of matching subplans after evaluating the expressions
  *		that are safe to evaluate at a given point.  This function is first
@@ -1622,8 +1637,9 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *
  * On return, *initially_valid_subplans is assigned the set of indexes of
  * child subplans that must be initialized along with the parent plan node.
- * Initial pruning is performed here if needed and in that case only the
- * surviving subplans' indexes are added.
+ * Initial pruning is performed here if needed (unless it has already been done
+ * by ExecutorDoInitialPruning()), and in that case only the surviving subplans'
+ * indexes are added.
  *
  * If subplans are indeed pruned, subplan_map arrays contained in the returned
  * PartitionPruneState are re-sequenced to not count those, though only if the
@@ -1632,23 +1648,59 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
 PartitionPruneState *
 ExecInitPartitionPruning(PlanState *planstate,
 						 int n_total_subplans,
-						 PartitionPruneInfo *pruneinfo,
+						 int part_prune_index,
 						 Bitmapset **initially_valid_subplans)
 {
-	PartitionPruneState *prunestate;
 	EState	   *estate = planstate->state;
+	PartitionPruneInfo  *pruneinfo = list_nth(estate->es_part_prune_infos,
+											  part_prune_index);
+	PartitionPruneResult *pruneresult = estate->es_part_prune_result;
+	PartitionPruneState *prunestate;
+	bool	do_pruning = (pruneinfo->needs_init_pruning ||
+						  pruneinfo->needs_exec_pruning);
 
-	/* We may need an expression context to evaluate partition exprs */
-	ExecAssignExprContext(estate, planstate);
+	/*
+	 * No need to do initial pruning if it was done already by
+	 * ExecutorDoInitialPruning(), which it would be if es_part_prune_result
+	 * has been set.
+	 */
+	if (pruneresult)
+		do_pruning = pruneinfo->needs_exec_pruning;
+
+	prunestate = NULL;
+	if (do_pruning)
+	{
+		/* We may need an expression context to evaluate partition exprs */
+		ExecAssignExprContext(estate, planstate);
 
-	/* Create the working data structure for pruning */
-	prunestate = CreatePartitionPruneState(planstate, pruneinfo);
+		/* For data reading, executor always omits detached partitions */
+		if (estate->es_partition_directory == NULL)
+			estate->es_partition_directory =
+				CreatePartitionDirectory(estate->es_query_cxt, false);
+
+		/*
+		 * Create the working data structure for pruning.  No need to consider
+		 * initial pruning steps if we have a PartitionPruneResult.
+		 */
+		prunestate = CreatePartitionPruneState(planstate, pruneinfo,
+											   pruneresult == NULL, true,
+											   NIL, planstate->ps_ExprContext,
+											   estate->es_partition_directory);
+	}
 
 	/*
 	 * Perform an initial partition prune pass, if required.
 	 */
-	if (prunestate->do_initial_prune)
-		*initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true);
+	if (pruneresult)
+	{
+		*initially_valid_subplans =
+			list_nth(pruneresult->valid_subplan_offs_list, part_prune_index);
+	}
+	else if (prunestate && prunestate->do_initial_prune)
+	{
+		*initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true,
+															 NULL);
+	}
 	else
 	{
 		/* No pruning, so we'll need to initialize all subplans */
@@ -1669,7 +1721,7 @@ ExecInitPartitionPruning(PlanState *planstate,
 		 * leaves invalid data in prunestate, because that data won't be
 		 * consulted again (cf initial Assert in ExecFindMatchingSubPlans).
 		 */
-		if (prunestate->do_exec_prune)
+		if (prunestate && prunestate->do_exec_prune)
 			PartitionPruneFixSubPlanMap(prunestate,
 										*initially_valid_subplans,
 										n_total_subplans);
@@ -1678,11 +1730,72 @@ ExecInitPartitionPruning(PlanState *planstate,
 	return prunestate;
 }
 
+/*
+ * ExecPartitionDoInitialPruning
+ *		Perform initial pruning using the given PartitionPruneInfo to find
+ *		the minimal set of child subplans of the parent plan node (the one
+ *		owning the PartitionPruneInfo) that need to be executed, and the set
+ *		of RT indexes of leaf partitions that will be scanned by them.
+ */
+Bitmapset *
+ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
+							  PartitionPruneInfo *pruneinfo,
+							  Bitmapset **scan_leafpart_rtis)
+{
+	List		 *rtable = plannedstmt->rtable;
+	ExprContext	 *econtext;
+	PartitionDirectory pdir;
+	MemoryContext oldcontext,
+				  tmpcontext;
+	PartitionPruneState *prunestate;
+	Bitmapset	 *valid_subplan_offs;
+
+	/*
+	 * A temporary context to allocate stuff needed to run the pruning steps.
+	 */
+	tmpcontext = AllocSetContextCreate(CurrentMemoryContext,
+									   "initial pruning working data",
+									   ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(tmpcontext);
+
+	/*
+	 * PartitionDirectory to look up partition descriptors, which omits
+	 * detached partitions, just like in the executor proper.
+	 */
+	pdir = CreatePartitionDirectory(CurrentMemoryContext, false);
+
+	/*
+	 * We don't yet have a PlanState for the parent plan node, so must create
+	 * a standalone ExprContext to evaluate pruning expressions, equipped with
+	 * the information about the EXTERN parameters that the caller passed us.
+	 * Note that that's okay because the initial pruning steps do not contain
+	 * anything that requires the execution to have started.
+	 */
+	econtext = CreateStandaloneExprContext();
+	econtext->ecxt_param_list_info = params;
+	prunestate = CreatePartitionPruneState(NULL, pruneinfo, true, false,
+										   rtable, econtext, pdir);
+	MemoryContextSwitchTo(oldcontext);
+
+	/* Do the initial pruning. */
+	valid_subplan_offs = ExecFindMatchingSubPlans(prunestate, true,
+												  scan_leafpart_rtis);
+
+	FreeExprContext(econtext, true);
+	DestroyPartitionDirectory(pdir);
+	MemoryContextDelete(tmpcontext);
+
+	return valid_subplan_offs;
+}
+
 /*
  * CreatePartitionPruneState
  *		Build the data structure required for calling ExecFindMatchingSubPlans
  *
- * 'planstate' is the parent plan node's execution state.
+ * 'planstate', if not NULL, is the parent plan node's execution state.  It
+ * can be NULL if being called before ExecutorStart(), in which case,
+ * 'rtable' (range table), 'econtext', and 'partdir' must be explicitly
+ * provided.
  *
  * 'pruneinfo' is a PartitionPruneInfo as generated by
  * make_partition_pruneinfo.  Here we build a PartitionPruneState containing a
@@ -1696,19 +1809,21 @@ ExecInitPartitionPruning(PlanState *planstate,
  * PartitionedRelPruneInfo.
  */
 static PartitionPruneState *
-CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
+CreatePartitionPruneState(PlanState *planstate,
+						  PartitionPruneInfo *pruneinfo,
+						  bool consider_initial_steps,
+						  bool consider_exec_steps,
+						  List *rtable, ExprContext *econtext,
+						  PartitionDirectory partdir)
 {
-	EState	   *estate = planstate->state;
+	EState	   *estate = planstate ? planstate->state : NULL;
 	PartitionPruneState *prunestate;
 	int			n_part_hierarchies;
 	ListCell   *lc;
 	int			i;
-	ExprContext *econtext = planstate->ps_ExprContext;
 
-	/* For data reading, executor always omits detached partitions */
-	if (estate->es_partition_directory == NULL)
-		estate->es_partition_directory =
-			CreatePartitionDirectory(estate->es_query_cxt, false);
+	Assert((estate != NULL) ||
+			(partdir != NULL && econtext != NULL && rtable != NIL));
 
 	n_part_hierarchies = list_length(pruneinfo->prune_infos);
 	Assert(n_part_hierarchies > 0);
@@ -1759,19 +1874,48 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 			PartitionedRelPruneInfo *pinfo = lfirst_node(PartitionedRelPruneInfo, lc2);
 			PartitionedRelPruningData *pprune = &prunedata->partrelprunedata[j];
 			Relation	partrel;
+			bool		close_partrel = false;
 			PartitionDesc partdesc;
 			PartitionKey partkey;
 
 			/*
-			 * We can rely on the copies of the partitioned table's partition
-			 * key and partition descriptor appearing in its relcache entry,
-			 * because that entry will be held open and locked for the
-			 * duration of this executor run.
+			 * Must open the relation by ourselves when called before the
+			 * execution has started, such as, when called during
+			 * ExecutorDoInitialPruning() on a cached plan.  In that case,
+			 * sub-partitions must be locked, because AcquirePlannerLocks()
+			 * would not have seen them. (1st relation in a partrelpruneinfos
+			 * list is always the root partitioned table appearing in the
+			 * query, which AcquirePlannerLocks() would have locked; the
+			 * Assert in relation_open() guards that assumption.)
+			 */
+			if (estate == NULL)
+			{
+				RangeTblEntry *rte = rt_fetch(pinfo->rtindex, rtable);
+				int		lockmode = (j == 0) ? NoLock : rte->rellockmode;
+
+				partrel = table_open(rte->relid, lockmode);
+				close_partrel = true;
+			}
+			else
+				partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
+
+			/*
+			 * We can rely on the copy of the partitioned table's partition
+			 * key in its relcache entry, because it can't change (or
+			 * get destroyed) as long as the relation is locked.  Partition
+			 * descriptor is taken from the PartitionDirectory associated with
+			 * the table that is held open long enough for the descriptor to
+			 * remain valid while it's used to perform the pruning steps.
 			 */
-			partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
 			partkey = RelationGetPartitionKey(partrel);
-			partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
-												partrel);
+			partdesc = PartitionDirectoryLookup(partdir, partrel);
+
+			/*
+			 * Must close partrel, keeping the lock taken, if we're not using
+			 * EState's entry.
+			 */
+			if (close_partrel)
+				table_close(partrel, NoLock);
 
 			/*
 			 * Initialize the subplan_map and subpart_map.
@@ -1785,6 +1929,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 			Assert(partdesc->nparts >= pinfo->nparts);
 			pprune->nparts = partdesc->nparts;
 			pprune->subplan_map = palloc(sizeof(int) * partdesc->nparts);
+			pprune->rti_map = palloc(sizeof(Index) * partdesc->nparts);
 			if (partdesc->nparts == pinfo->nparts)
 			{
 				/*
@@ -1795,6 +1940,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 				pprune->subpart_map = pinfo->subpart_map;
 				memcpy(pprune->subplan_map, pinfo->subplan_map,
 					   sizeof(int) * pinfo->nparts);
+				memcpy(pprune->rti_map, pinfo->rti_map,
+					   sizeof(Index) * pinfo->nparts);
 
 				/*
 				 * Double-check that the list of unpruned relations has not
@@ -1845,6 +1992,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 							pinfo->subplan_map[pd_idx];
 						pprune->subpart_map[pp_idx] =
 							pinfo->subpart_map[pd_idx];
+						pprune->rti_map[pp_idx] =
+							pinfo->rti_map[pd_idx];
 						pd_idx++;
 					}
 					else
@@ -1852,6 +2001,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 						/* this partdesc entry is not in the plan */
 						pprune->subplan_map[pp_idx] = -1;
 						pprune->subpart_map[pp_idx] = -1;
+						pprune->rti_map[pp_idx] = 0;
 					}
 				}
 
@@ -1873,7 +2023,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 			 * Initialize pruning contexts as needed.
 			 */
 			pprune->initial_pruning_steps = pinfo->initial_pruning_steps;
-			if (pinfo->initial_pruning_steps)
+			if (consider_initial_steps && pinfo->initial_pruning_steps)
 			{
 				InitPartitionPruneContext(&pprune->initial_context,
 										  pinfo->initial_pruning_steps,
@@ -1883,7 +2033,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 				prunestate->do_initial_prune = true;
 			}
 			pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
-			if (pinfo->exec_pruning_steps)
+			if (consider_exec_steps && pinfo->exec_pruning_steps)
 			{
 				InitPartitionPruneContext(&pprune->exec_context,
 										  pinfo->exec_pruning_steps,
@@ -2111,10 +2261,14 @@ PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
  * Pass initial_prune if PARAM_EXEC Params cannot yet be evaluated.  This
  * differentiates the initial executor-time pruning step from later
  * runtime pruning.
+ *
+ * RT indexes of leaf partitions scanned by the chosen subplans are added to
+ * *scan_leafpart_rtis if the pointer is non-NULL.
  */
 Bitmapset *
 ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
-						 bool initial_prune)
+						 bool initial_prune,
+						 Bitmapset **scan_leafpart_rtis)
 {
 	Bitmapset  *result = NULL;
 	MemoryContext oldcontext;
@@ -2149,7 +2303,7 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
 		 */
 		pprune = &prunedata->partrelprunedata[0];
 		find_matching_subplans_recurse(prunedata, pprune, initial_prune,
-									   &result);
+									   &result, scan_leafpart_rtis);
 
 		/* Expression eval may have used space in ExprContext too */
 		if (pprune->exec_pruning_steps)
@@ -2163,6 +2317,8 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
 
 	/* Copy result out of the temp context before we reset it */
 	result = bms_copy(result);
+	if (scan_leafpart_rtis)
+		*scan_leafpart_rtis = bms_copy(*scan_leafpart_rtis);
 
 	MemoryContextReset(prunestate->prune_context);
 
@@ -2173,13 +2329,15 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
  * find_matching_subplans_recurse
  *		Recursive worker function for ExecFindMatchingSubPlans
  *
- * Adds valid (non-prunable) subplan IDs to *validsubplans
+ * Adds valid (non-prunable) subplan IDs to *validsubplans and RT indexes of
+ * the corresponding leaf partitions to *scan_leafpart_rtis (if asked for).
  */
 static void
 find_matching_subplans_recurse(PartitionPruningData *prunedata,
 							   PartitionedRelPruningData *pprune,
 							   bool initial_prune,
-							   Bitmapset **validsubplans)
+							   Bitmapset **validsubplans,
+							   Bitmapset **scan_leafpart_rtis)
 {
 	Bitmapset  *partset;
 	int			i;
@@ -2206,8 +2364,13 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
 	while ((i = bms_next_member(partset, i)) >= 0)
 	{
 		if (pprune->subplan_map[i] >= 0)
+		{
 			*validsubplans = bms_add_member(*validsubplans,
 											pprune->subplan_map[i]);
+			if (scan_leafpart_rtis && pprune->rti_map[i] > 0)
+				*scan_leafpart_rtis = bms_add_member(*scan_leafpart_rtis,
+													 pprune->rti_map[i]);
+		}
 		else
 		{
 			int			partidx = pprune->subpart_map[i];
@@ -2215,7 +2378,8 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
 			if (partidx >= 0)
 				find_matching_subplans_recurse(prunedata,
 											   &prunedata->partrelprunedata[partidx],
-											   initial_prune, validsubplans);
+											   initial_prune, validsubplans,
+											   scan_leafpart_rtis);
 			else
 			{
 				/*
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 9df1f81ea8..639145abe9 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -119,6 +119,7 @@ CreateExecutorState(void)
 	estate->es_relations = NULL;
 	estate->es_rowmarks = NULL;
 	estate->es_plannedstmt = NULL;
+	estate->es_part_prune_result = NULL;
 
 	estate->es_junkFilter = NULL;
 
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index f9460ae506..a2182a6b1f 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -844,7 +844,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
 	else
 		dest = None_Receiver;
 
-	es->qd = CreateQueryDesc(es->stmt,
+	es->qd = CreateQueryDesc(es->stmt, NULL,
 							 fcache->src,
 							 GetActiveSnapshot(),
 							 InvalidSnapshot,
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 357e10a1d7..09f26658e2 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -94,6 +94,7 @@ static bool ExecAppendAsyncRequest(AppendState *node, TupleTableSlot **result);
 static void ExecAppendAsyncEventWait(AppendState *node);
 static void classify_matching_subplans(AppendState *node);
 
+
 /* ----------------------------------------------------------------
  *		ExecInitAppend
  *
@@ -134,7 +135,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 	appendstate->as_begun = false;
 
 	/* If run-time partition pruning is enabled, then set that up now */
-	if (node->part_prune_info != NULL)
+	if (node->part_prune_index >= 0)
 	{
 		PartitionPruneState *prunestate;
 
@@ -145,7 +146,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 		 */
 		prunestate = ExecInitPartitionPruning(&appendstate->ps,
 											  list_length(node->appendplans),
-											  node->part_prune_info,
+											  node->part_prune_index,
 											  &validsubplans);
 		appendstate->as_prune_state = prunestate;
 		nplans = bms_num_members(validsubplans);
@@ -155,7 +156,8 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 		 * subplan, we can fill as_valid_subplans immediately, preventing
 		 * later calls to ExecFindMatchingSubPlans.
 		 */
-		if (!prunestate->do_exec_prune && nplans > 0)
+		if (appendstate->as_prune_state == NULL ||
+			(!appendstate->as_prune_state->do_exec_prune && nplans > 0))
 			appendstate->as_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
 	}
 	else
@@ -577,7 +579,7 @@ choose_next_subplan_locally(AppendState *node)
 		}
 		else if (node->as_valid_subplans == NULL)
 			node->as_valid_subplans =
-				ExecFindMatchingSubPlans(node->as_prune_state, false);
+				ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 
 		whichplan = -1;
 	}
@@ -642,7 +644,7 @@ choose_next_subplan_for_leader(AppendState *node)
 		if (node->as_valid_subplans == NULL)
 		{
 			node->as_valid_subplans =
-				ExecFindMatchingSubPlans(node->as_prune_state, false);
+				ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 
 			/*
 			 * Mark each invalid plan as finished to allow the loop below to
@@ -717,7 +719,7 @@ choose_next_subplan_for_worker(AppendState *node)
 	else if (node->as_valid_subplans == NULL)
 	{
 		node->as_valid_subplans =
-			ExecFindMatchingSubPlans(node->as_prune_state, false);
+			ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 		mark_invalid_subplans_as_finished(node);
 	}
 
@@ -868,7 +870,7 @@ ExecAppendAsyncBegin(AppendState *node)
 	if (node->as_valid_subplans == NULL)
 	{
 		node->as_valid_subplans =
-			ExecFindMatchingSubPlans(node->as_prune_state, false);
+			ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 
 		classify_matching_subplans(node);
 	}
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index ecf9052e03..7708cfffda 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -82,7 +82,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
 	mergestate->ps.ExecProcNode = ExecMergeAppend;
 
 	/* If run-time partition pruning is enabled, then set that up now */
-	if (node->part_prune_info != NULL)
+	if (node->part_prune_index >= 0)
 	{
 		PartitionPruneState *prunestate;
 
@@ -93,7 +93,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
 		 */
 		prunestate = ExecInitPartitionPruning(&mergestate->ps,
 											  list_length(node->mergeplans),
-											  node->part_prune_info,
+											  node->part_prune_index,
 											  &validsubplans);
 		mergestate->ms_prune_state = prunestate;
 		nplans = bms_num_members(validsubplans);
@@ -103,7 +103,8 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
 		 * subplan, we can fill as_valid_subplans immediately, preventing
 		 * later calls to ExecFindMatchingSubPlans.
 		 */
-		if (!prunestate->do_exec_prune && nplans > 0)
+		if (mergestate->ms_prune_state == NULL ||
+			(!mergestate->ms_prune_state->do_exec_prune && nplans > 0))
 			mergestate->ms_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
 	}
 	else
@@ -218,7 +219,7 @@ ExecMergeAppend(PlanState *pstate)
 		 */
 		if (node->ms_valid_subplans == NULL)
 			node->ms_valid_subplans =
-				ExecFindMatchingSubPlans(node->ms_prune_state, false);
+				ExecFindMatchingSubPlans(node->ms_prune_state, false, NULL);
 
 		/*
 		 * First time through: pull the first tuple from each valid subplan,
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 042a5f8b0a..05db2e9de1 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -2473,7 +2473,9 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 	{
 		CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
 		List	   *stmt_list;
-		ListCell   *lc2;
+		List	   *part_prune_result_list;
+		ListCell   *lc2,
+				   *lc3;
 
 		spicallbackarg.query = plansource->query_string;
 
@@ -2552,6 +2554,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 							  plan_owner, _SPI_current->queryEnv);
 
 		stmt_list = cplan->stmt_list;
+		part_prune_result_list = cplan->part_prune_result_list;
 
 		/*
 		 * If we weren't given a specific snapshot to use, and the statement
@@ -2589,9 +2592,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 			}
 		}
 
-		foreach(lc2, stmt_list)
+		forboth(lc2, stmt_list, lc3, part_prune_result_list)
 		{
 			PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+			PartitionPruneResult *part_prune_result = lfirst_node(PartitionPruneResult, lc3);
 			bool		canSetTag = stmt->canSetTag;
 			DestReceiver *dest;
 
@@ -2663,7 +2667,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 				else
 					snap = InvalidSnapshot;
 
-				qdesc = CreateQueryDesc(stmt,
+				qdesc = CreateQueryDesc(stmt, part_prune_result,
 										plansource->query_string,
 										snap, crosscheck_snapshot,
 										dest,
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 46a1943d97..c5c70593de 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -96,7 +96,10 @@ _copyPlannedStmt(const PlannedStmt *from)
 	COPY_SCALAR_FIELD(parallelModeNeeded);
 	COPY_SCALAR_FIELD(jitFlags);
 	COPY_NODE_FIELD(planTree);
+	COPY_NODE_FIELD(partPruneInfos);
+	COPY_SCALAR_FIELD(containsInitialPruning);
 	COPY_NODE_FIELD(rtable);
+	COPY_BITMAPSET_FIELD(minLockRelids);
 	COPY_NODE_FIELD(resultRelations);
 	COPY_NODE_FIELD(appendRelations);
 	COPY_NODE_FIELD(subplans);
@@ -253,7 +256,7 @@ _copyAppend(const Append *from)
 	COPY_NODE_FIELD(appendplans);
 	COPY_SCALAR_FIELD(nasyncplans);
 	COPY_SCALAR_FIELD(first_partial_plan);
-	COPY_NODE_FIELD(part_prune_info);
+	COPY_SCALAR_FIELD(part_prune_index);
 
 	return newnode;
 }
@@ -281,7 +284,7 @@ _copyMergeAppend(const MergeAppend *from)
 	COPY_POINTER_FIELD(sortOperators, from->numCols * sizeof(Oid));
 	COPY_POINTER_FIELD(collations, from->numCols * sizeof(Oid));
 	COPY_POINTER_FIELD(nullsFirst, from->numCols * sizeof(bool));
-	COPY_NODE_FIELD(part_prune_info);
+	COPY_SCALAR_FIELD(part_prune_index);
 
 	return newnode;
 }
@@ -1280,6 +1283,8 @@ _copyPartitionPruneInfo(const PartitionPruneInfo *from)
 	PartitionPruneInfo *newnode = makeNode(PartitionPruneInfo);
 
 	COPY_NODE_FIELD(prune_infos);
+	COPY_SCALAR_FIELD(needs_init_pruning);
+	COPY_SCALAR_FIELD(needs_exec_pruning);
 	COPY_BITMAPSET_FIELD(other_subplans);
 
 	return newnode;
@@ -1296,6 +1301,7 @@ _copyPartitionedRelPruneInfo(const PartitionedRelPruneInfo *from)
 	COPY_POINTER_FIELD(subplan_map, from->nparts * sizeof(int));
 	COPY_POINTER_FIELD(subpart_map, from->nparts * sizeof(int));
 	COPY_POINTER_FIELD(relid_map, from->nparts * sizeof(Oid));
+	COPY_POINTER_FIELD(rti_map, from->nparts * sizeof(Index));
 	COPY_NODE_FIELD(initial_pruning_steps);
 	COPY_NODE_FIELD(exec_pruning_steps);
 	COPY_BITMAPSET_FIELD(execparamids);
@@ -5469,6 +5475,21 @@ _copyExtensibleNode(const ExtensibleNode *from)
 	return newnode;
 }
 
+/* ****************************************************************
+ *					execnodes.h copy functions
+ * ****************************************************************
+ */
+static PartitionPruneResult *
+_copyPartitionPruneResult(const PartitionPruneResult *from)
+{
+	PartitionPruneResult *newnode = makeNode(PartitionPruneResult);
+
+	COPY_BITMAPSET_FIELD(scan_leafpart_rtis);
+	COPY_NODE_FIELD(valid_subplan_offs_list);
+
+	return newnode;
+}
+
 /* ****************************************************************
  *					value.h copy functions
  * ****************************************************************
@@ -5523,7 +5544,6 @@ _copyBitString(const BitString *from)
 	return newnode;
 }
 
-
 static ForeignKeyCacheInfo *
 _copyForeignKeyCacheInfo(const ForeignKeyCacheInfo *from)
 {
@@ -6565,6 +6585,13 @@ copyObjectImpl(const void *from)
 			retval = _copyPublicationTable(from);
 			break;
 
+			/*
+			 * EXECUTION NODES
+			 */
+		case T_PartitionPruneResult:
+			retval = _copyPartitionPruneResult(from);
+			break;
+
 			/*
 			 * MISCELLANEOUS NODES
 			 */
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 13e1643530..ca54022fee 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -314,7 +314,10 @@ _outPlannedStmt(StringInfo str, const PlannedStmt *node)
 	WRITE_BOOL_FIELD(parallelModeNeeded);
 	WRITE_INT_FIELD(jitFlags);
 	WRITE_NODE_FIELD(planTree);
+	WRITE_NODE_FIELD(partPruneInfos);
+	WRITE_BOOL_FIELD(containsInitialPruning);
 	WRITE_NODE_FIELD(rtable);
+	WRITE_BITMAPSET_FIELD(minLockRelids);
 	WRITE_NODE_FIELD(resultRelations);
 	WRITE_NODE_FIELD(appendRelations);
 	WRITE_NODE_FIELD(subplans);
@@ -443,7 +446,7 @@ _outAppend(StringInfo str, const Append *node)
 	WRITE_NODE_FIELD(appendplans);
 	WRITE_INT_FIELD(nasyncplans);
 	WRITE_INT_FIELD(first_partial_plan);
-	WRITE_NODE_FIELD(part_prune_info);
+	WRITE_INT_FIELD(part_prune_index);
 }
 
 static void
@@ -460,7 +463,7 @@ _outMergeAppend(StringInfo str, const MergeAppend *node)
 	WRITE_OID_ARRAY(sortOperators, node->numCols);
 	WRITE_OID_ARRAY(collations, node->numCols);
 	WRITE_BOOL_ARRAY(nullsFirst, node->numCols);
-	WRITE_NODE_FIELD(part_prune_info);
+	WRITE_INT_FIELD(part_prune_index);
 }
 
 static void
@@ -1006,6 +1009,8 @@ _outPartitionPruneInfo(StringInfo str, const PartitionPruneInfo *node)
 	WRITE_NODE_TYPE("PARTITIONPRUNEINFO");
 
 	WRITE_NODE_FIELD(prune_infos);
+	WRITE_BOOL_FIELD(needs_init_pruning);
+	WRITE_BOOL_FIELD(needs_exec_pruning);
 	WRITE_BITMAPSET_FIELD(other_subplans);
 }
 
@@ -1020,6 +1025,7 @@ _outPartitionedRelPruneInfo(StringInfo str, const PartitionedRelPruneInfo *node)
 	WRITE_INT_ARRAY(subplan_map, node->nparts);
 	WRITE_INT_ARRAY(subpart_map, node->nparts);
 	WRITE_OID_ARRAY(relid_map, node->nparts);
+	WRITE_INDEX_ARRAY(rti_map, node->nparts);
 	WRITE_NODE_FIELD(initial_pruning_steps);
 	WRITE_NODE_FIELD(exec_pruning_steps);
 	WRITE_BITMAPSET_FIELD(execparamids);
@@ -2420,6 +2426,9 @@ _outPlannerGlobal(StringInfo str, const PlannerGlobal *node)
 	WRITE_NODE_FIELD(finalrowmarks);
 	WRITE_NODE_FIELD(resultRelations);
 	WRITE_NODE_FIELD(appendRelations);
+	WRITE_NODE_FIELD(partPruneInfos);
+	WRITE_BOOL_FIELD(containsInitialPruning);
+	WRITE_BITMAPSET_FIELD(minLockRelids);
 	WRITE_NODE_FIELD(relationOids);
 	WRITE_NODE_FIELD(invalItems);
 	WRITE_NODE_FIELD(paramExecTypes);
@@ -2487,6 +2496,7 @@ _outPlannerInfo(StringInfo str, const PlannerInfo *node)
 	WRITE_BITMAPSET_FIELD(curOuterRels);
 	WRITE_NODE_FIELD(curOuterParams);
 	WRITE_BOOL_FIELD(partColsUpdated);
+	WRITE_NODE_FIELD(partPruneInfos);
 }
 
 static void
@@ -2840,6 +2850,21 @@ _outExtensibleNode(StringInfo str, const ExtensibleNode *node)
 	methods->nodeOut(str, node);
 }
 
+/*****************************************************************************
+ *
+ *	Stuff from execnodes.h
+ *
+ *****************************************************************************/
+
+static void
+_outPartitionPruneResult(StringInfo str, const PartitionPruneResult *node)
+{
+	WRITE_NODE_TYPE("PARTITIONPRUNERESULT");
+
+	WRITE_BITMAPSET_FIELD(scan_leafpart_rtis);
+	WRITE_NODE_FIELD(valid_subplan_offs_list);
+}
+
 /*****************************************************************************
  *
  *	Stuff from parsenodes.h.
@@ -4748,6 +4773,13 @@ outNode(StringInfo str, const void *obj)
 				_outJsonTableSibling(str, obj);
 				break;
 
+				/*
+				 * EXECUTION NODES
+				 */
+			case T_PartitionPruneResult:
+				_outPartitionPruneResult(str, obj);
+				break;
+
 			default:
 
 				/*
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 48f7216c9e..acce5e29cc 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -164,6 +164,11 @@
 	token = pg_strtok(&length);		/* skip :fldname */ \
 	local_node->fldname = readIntCols(len)
 
+/* Read an Index array */
+#define READ_INDEX_ARRAY(fldname, len) \
+	token = pg_strtok(&length);		/* skip :fldname */ \
+	local_node->fldname = readIndexCols(len)
+
 /* Read a bool array */
 #define READ_BOOL_ARRAY(fldname, len) \
 	token = pg_strtok(&length);		/* skip :fldname */ \
@@ -1814,7 +1819,10 @@ _readPlannedStmt(void)
 	READ_BOOL_FIELD(parallelModeNeeded);
 	READ_INT_FIELD(jitFlags);
 	READ_NODE_FIELD(planTree);
+	READ_NODE_FIELD(partPruneInfos);
+	READ_BOOL_FIELD(containsInitialPruning);
 	READ_NODE_FIELD(rtable);
+	READ_BITMAPSET_FIELD(minLockRelids);
 	READ_NODE_FIELD(resultRelations);
 	READ_NODE_FIELD(appendRelations);
 	READ_NODE_FIELD(subplans);
@@ -1946,7 +1954,7 @@ _readAppend(void)
 	READ_NODE_FIELD(appendplans);
 	READ_INT_FIELD(nasyncplans);
 	READ_INT_FIELD(first_partial_plan);
-	READ_NODE_FIELD(part_prune_info);
+	READ_INT_FIELD(part_prune_index);
 
 	READ_DONE();
 }
@@ -1968,7 +1976,7 @@ _readMergeAppend(void)
 	READ_OID_ARRAY(sortOperators, local_node->numCols);
 	READ_OID_ARRAY(collations, local_node->numCols);
 	READ_BOOL_ARRAY(nullsFirst, local_node->numCols);
-	READ_NODE_FIELD(part_prune_info);
+	READ_INT_FIELD(part_prune_index);
 
 	READ_DONE();
 }
@@ -2763,6 +2771,8 @@ _readPartitionPruneInfo(void)
 	READ_LOCALS(PartitionPruneInfo);
 
 	READ_NODE_FIELD(prune_infos);
+	READ_BOOL_FIELD(needs_init_pruning);
+	READ_BOOL_FIELD(needs_exec_pruning);
 	READ_BITMAPSET_FIELD(other_subplans);
 
 	READ_DONE();
@@ -2779,6 +2789,7 @@ _readPartitionedRelPruneInfo(void)
 	READ_INT_ARRAY(subplan_map, local_node->nparts);
 	READ_INT_ARRAY(subpart_map, local_node->nparts);
 	READ_OID_ARRAY(relid_map, local_node->nparts);
+	READ_INDEX_ARRAY(rti_map, local_node->nparts);
 	READ_NODE_FIELD(initial_pruning_steps);
 	READ_NODE_FIELD(exec_pruning_steps);
 	READ_BITMAPSET_FIELD(execparamids);
@@ -2932,6 +2943,21 @@ _readPartitionRangeDatum(void)
 	READ_DONE();
 }
 
+
+/*
+ * _readPartitionPruneResult
+ */
+static PartitionPruneResult *
+_readPartitionPruneResult(void)
+{
+	READ_LOCALS(PartitionPruneResult);
+
+	READ_BITMAPSET_FIELD(scan_leafpart_rtis);
+	READ_NODE_FIELD(valid_subplan_offs_list);
+
+	READ_DONE();
+}
+
 /*
  * parseNodeString
  *
@@ -3229,6 +3255,8 @@ parseNodeString(void)
 		return_value = _readJsonTableParent();
 	else if (MATCH("JSONTABSNODE", 12))
 		return_value = _readJsonTableSibling();
+	else if (MATCH("PARTITIONPRUNERESULT", 20))
+		return_value = _readPartitionPruneResult();
 	else
 	{
 		elog(ERROR, "badly formatted node string \"%.32s\"...", token);
@@ -3372,6 +3400,30 @@ readIntCols(int numCols)
 	return int_vals;
 }
 
+/*
+ * readIndexCols
+ */
+Index *
+readIndexCols(int numCols)
+{
+	int			tokenLength,
+				i;
+	const char *token;
+	Index	   *index_vals;
+
+	if (numCols <= 0)
+		return NULL;
+
+	index_vals = (Index *) palloc(numCols * sizeof(Index));
+	for (i = 0; i < numCols; i++)
+	{
+		token = pg_strtok(&tokenLength);
+		index_vals[i] = atoui(token);
+	}
+
+	return index_vals;
+}
+
 /*
  * readBoolCols
  */
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index 51591bb812..453f720759 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -1366,7 +1366,15 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
 	plan->appendplans = subplans;
 	plan->nasyncplans = nasyncplans;
 	plan->first_partial_plan = best_path->first_partial_path;
-	plan->part_prune_info = partpruneinfo;
+
+	if (partpruneinfo)
+	{
+		root->partPruneInfos = lappend(root->partPruneInfos, partpruneinfo);
+		/* Will be updated later in set_plan_references(). */
+		plan->part_prune_index = list_length(root->partPruneInfos) - 1;
+	}
+	else
+		plan->part_prune_index = -1;
 
 	copy_generic_path_info(&plan->plan, (Path *) best_path);
 
@@ -1528,7 +1536,15 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
 	}
 
 	node->mergeplans = subplans;
-	node->part_prune_info = partpruneinfo;
+
+	if (partpruneinfo)
+	{
+		root->partPruneInfos = lappend(root->partPruneInfos, partpruneinfo);
+		/* Will be updated later in set_plan_references(). */
+		node->part_prune_index = list_length(root->partPruneInfos) - 1;
+	}
+	else
+		node->part_prune_index = -1;
 
 	/*
 	 * If prepare_sort_from_pathkeys added sort columns, but we were told to
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index b2569c5d0c..2aa051d862 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -518,7 +518,10 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 	result->dependsOnRole = glob->dependsOnRole;
 	result->parallelModeNeeded = glob->parallelModeNeeded;
 	result->planTree = top_plan;
+	result->partPruneInfos = glob->partPruneInfos;
+	result->containsInitialPruning = glob->containsInitialPruning;
 	result->rtable = glob->finalrtable;
+	result->minLockRelids = glob->minLockRelids;
 	result->resultRelations = glob->resultRelations;
 	result->appendRelations = glob->appendRelations;
 	result->subplans = glob->subplans;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 7519723081..fc66986e1c 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -251,7 +251,7 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 	Plan	   *result;
 	PlannerGlobal *glob = root->glob;
 	int			rtoffset = list_length(glob->finalrtable);
-	ListCell   *lc;
+	ListCell *lc;
 
 	/*
 	 * Add all the query's RTEs to the flattened rangetable.  The live ones
@@ -260,6 +260,16 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 	 */
 	add_rtes_to_flat_rtable(root, false);
 
+	/*
+	 * Add the query's adjusted range of RT indexes to glob->minLockRelids.
+	 * The adjusted RT indexes of prunable relations will be deleted from the
+	 * set below where PartitionPruneInfos are processed.
+	 */
+	glob->minLockRelids =
+		bms_add_range(glob->minLockRelids,
+					  rtoffset + 1,
+					  rtoffset + list_length(root->parse->rtable));
+
 	/*
 	 * Adjust RT indexes of PlanRowMarks and add to final rowmarks list
 	 */
@@ -338,6 +348,56 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 		}
 	}
 
+	/* Also fix up the information in PartitionPruneInfos. */
+	foreach(lc, root->partPruneInfos)
+	{
+		PartitionPruneInfo *pruneinfo = lfirst(lc);
+		Bitmapset *leafpart_rtis = NULL;
+		ListCell  *l;
+
+		foreach(l, pruneinfo->prune_infos)
+		{
+			List	   *prune_infos = lfirst(l);
+			ListCell   *l2;
+
+			foreach(l2, prune_infos)
+			{
+				PartitionedRelPruneInfo *pinfo = lfirst(l2);
+				int		i;
+
+				/* RT index of the partitioned table. */
+				pinfo->rtindex += rtoffset;
+
+				/* And also those of the leaf partitions. */
+				for (i = 0; i < pinfo->nparts; i++)
+				{
+					if (pinfo->rti_map[i] > 0)
+					{
+						pinfo->rti_map[i] += rtoffset;
+						leafpart_rtis = bms_add_member(leafpart_rtis,
+													   pinfo->rti_map[i]);
+					}
+				}
+			}
+		}
+
+		if (pruneinfo->needs_init_pruning)
+		{
+			glob->containsInitialPruning = true;
+
+			/*
+			 * Delete the leaf partition RTIs from the global set of relations
+			 * to be locked before executing the plan.  AcquireExecutorLocks()
+			 * will find the ones to add to the set after performing initial
+			 * pruning.
+			 */
+			glob->minLockRelids = bms_del_members(glob->minLockRelids,
+												  leafpart_rtis);
+		}
+
+		glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
+	}
+
 	return result;
 }
 
@@ -1610,21 +1670,12 @@ set_append_references(PlannerInfo *root,
 
 	aplan->apprelids = offset_relid_set(aplan->apprelids, rtoffset);
 
-	if (aplan->part_prune_info)
-	{
-		foreach(l, aplan->part_prune_info->prune_infos)
-		{
-			List	   *prune_infos = lfirst(l);
-			ListCell   *l2;
-
-			foreach(l2, prune_infos)
-			{
-				PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
-				pinfo->rtindex += rtoffset;
-			}
-		}
-	}
+	/*
+	 * PartitionPruneInfos will be added to a list in PlannerGlobal, so update
+	 * the index.
+	 */
+	if (aplan->part_prune_index >= 0)
+		aplan->part_prune_index += list_length(root->glob->partPruneInfos);
 
 	/* We don't need to recurse to lefttree or righttree ... */
 	Assert(aplan->plan.lefttree == NULL);
@@ -1682,21 +1733,12 @@ set_mergeappend_references(PlannerInfo *root,
 
 	mplan->apprelids = offset_relid_set(mplan->apprelids, rtoffset);
 
-	if (mplan->part_prune_info)
-	{
-		foreach(l, mplan->part_prune_info->prune_infos)
-		{
-			List	   *prune_infos = lfirst(l);
-			ListCell   *l2;
-
-			foreach(l2, prune_infos)
-			{
-				PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
-				pinfo->rtindex += rtoffset;
-			}
-		}
-	}
+	/*
+	 * PartitionPruneInfos will be added to a list in PlannerGlobal, so update
+	 * the index.
+	 */
+	if (mplan->part_prune_index >= 0)
+		mplan->part_prune_index += list_length(root->glob->partPruneInfos);
 
 	/* We don't need to recurse to lefttree or righttree ... */
 	Assert(mplan->plan.lefttree == NULL);
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 9d3c05aed3..0eaff15ed0 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -144,7 +144,9 @@ static List *make_partitionedrel_pruneinfo(PlannerInfo *root,
 										   List *prunequal,
 										   Bitmapset *partrelids,
 										   int *relid_subplan_map,
-										   Bitmapset **matchedsubplans);
+										   Bitmapset **matchedsubplans,
+										   bool *needs_init_pruning,
+										   bool *needs_exec_pruning);
 static void gen_partprune_steps(RelOptInfo *rel, List *clauses,
 								PartClauseTarget target,
 								GeneratePruningStepsContext *context);
@@ -230,6 +232,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	int		   *relid_subplan_map;
 	ListCell   *lc;
 	int			i;
+	bool		needs_init_pruning = false;
+	bool		needs_exec_pruning = false;
 
 	/*
 	 * Scan the subpaths to see which ones are scans of partition child
@@ -309,12 +313,16 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		Bitmapset  *partrelids = (Bitmapset *) lfirst(lc);
 		List	   *pinfolist;
 		Bitmapset  *matchedsubplans = NULL;
+		bool		partrel_needs_init_pruning;
+		bool		partrel_needs_exec_pruning;
 
 		pinfolist = make_partitionedrel_pruneinfo(root, parentrel,
 												  prunequal,
 												  partrelids,
 												  relid_subplan_map,
-												  &matchedsubplans);
+												  &matchedsubplans,
+												  &partrel_needs_init_pruning,
+												  &partrel_needs_exec_pruning);
 
 		/* When pruning is possible, record the matched subplans */
 		if (pinfolist != NIL)
@@ -323,6 +331,10 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 			allmatchedsubplans = bms_join(matchedsubplans,
 										  allmatchedsubplans);
 		}
+		if (!needs_init_pruning)
+			needs_init_pruning = partrel_needs_init_pruning;
+		if (!needs_exec_pruning)
+			needs_exec_pruning = partrel_needs_exec_pruning;
 	}
 
 	pfree(relid_subplan_map);
@@ -337,6 +349,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	/* Else build the result data structure */
 	pruneinfo = makeNode(PartitionPruneInfo);
 	pruneinfo->prune_infos = prunerelinfos;
+	pruneinfo->needs_init_pruning = needs_init_pruning;
+	pruneinfo->needs_exec_pruning = needs_exec_pruning;
 
 	/*
 	 * Some subplans may not belong to any of the identified partitioned rels.
@@ -435,13 +449,18 @@ add_part_relids(List *allpartrelids, Bitmapset *partrelids)
  * If we cannot find any useful run-time pruning steps, return NIL.
  * However, on success, each rel identified in partrelids will have
  * an element in the result list, even if some of them are useless.
+ * *needs_init_pruning and *needs_exec_pruning are set to indicate that the
+ * returned PartitionedRelPruneInfos contain pruning steps that can be
+ * performed before and after execution begins, respectively.
  */
 static List *
 make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 							  List *prunequal,
 							  Bitmapset *partrelids,
 							  int *relid_subplan_map,
-							  Bitmapset **matchedsubplans)
+							  Bitmapset **matchedsubplans,
+							  bool *needs_init_pruning,
+							  bool *needs_exec_pruning)
 {
 	RelOptInfo *targetpart = NULL;
 	List	   *pinfolist = NIL;
@@ -452,6 +471,10 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	int			rti;
 	int			i;
 
+	/* Will find out below. */
+	*needs_init_pruning = false;
+	*needs_exec_pruning = false;
+
 	/*
 	 * Examine each partitioned rel, constructing a temporary array to map
 	 * from planner relids to index of the partitioned rel, and building a
@@ -539,6 +562,9 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		 * executor per-scan pruning steps.  This first pass creates startup
 		 * pruning steps and detects whether there's any possibly-useful quals
 		 * that would require per-scan pruning.
+		 *
+		 * In the first pass, we determine whether the 2nd pass is necessary
+		 * by noting the presence of EXEC parameters.
 		 */
 		gen_partprune_steps(subpart, partprunequal, PARTTARGET_INITIAL,
 							&context);
@@ -613,6 +639,11 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		pinfo->execparamids = execparamids;
 		/* Remaining fields will be filled in the next loop */
 
+		if (!*needs_init_pruning)
+			*needs_init_pruning = (initial_pruning_steps != NIL);
+		if (!*needs_exec_pruning)
+			*needs_exec_pruning = (exec_pruning_steps != NIL);
+
 		pinfolist = lappend(pinfolist, pinfo);
 	}
 
@@ -640,6 +671,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		int		   *subplan_map;
 		int		   *subpart_map;
 		Oid		   *relid_map;
+		Index	   *rti_map;
 
 		/*
 		 * Construct the subplan and subpart maps for this partitioning level.
@@ -652,6 +684,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		subpart_map = (int *) palloc(nparts * sizeof(int));
 		memset(subpart_map, -1, nparts * sizeof(int));
 		relid_map = (Oid *) palloc0(nparts * sizeof(Oid));
+		rti_map = (Index *) palloc0(nparts * sizeof(Index));
 		present_parts = NULL;
 
 		i = -1;
@@ -666,6 +699,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 			subplan_map[i] = subplanidx = relid_subplan_map[partrel->relid] - 1;
 			subpart_map[i] = subpartidx = relid_subpart_map[partrel->relid] - 1;
 			relid_map[i] = planner_rt_fetch(partrel->relid, root)->relid;
+			rti_map[i] = partrel->relid;
 			if (subplanidx >= 0)
 			{
 				present_parts = bms_add_member(present_parts, i);
@@ -690,6 +724,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		pinfo->subplan_map = subplan_map;
 		pinfo->subpart_map = subpart_map;
 		pinfo->relid_map = relid_map;
+		pinfo->rti_map = rti_map;
 	}
 
 	pfree(relid_subpart_map);
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 5aa5a350f3..163ba956c4 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -35,7 +35,7 @@
 Portal		ActivePortal = NULL;
 
 
-static void ProcessQuery(PlannedStmt *plan,
+static void ProcessQuery(PlannedStmt *plan, PartitionPruneResult *part_prune_result,
 						 const char *sourceText,
 						 ParamListInfo params,
 						 QueryEnvironment *queryEnv,
@@ -65,6 +65,7 @@ static void DoPortalRewind(Portal portal);
  */
 QueryDesc *
 CreateQueryDesc(PlannedStmt *plannedstmt,
+				PartitionPruneResult *part_prune_result,
 				const char *sourceText,
 				Snapshot snapshot,
 				Snapshot crosscheck_snapshot,
@@ -77,6 +78,8 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
 
 	qd->operation = plannedstmt->commandType;	/* operation */
 	qd->plannedstmt = plannedstmt;	/* plan */
+	qd->part_prune_result = part_prune_result;	/* ExecutorDoInitialPruning()
+												 * output for plan */
 	qd->sourceText = sourceText;	/* query text */
 	qd->snapshot = RegisterSnapshot(snapshot);	/* snapshot */
 	/* RI check snapshot */
@@ -122,6 +125,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  *		PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
  *
  *	plan: the plan tree for the query
+ *	part_prune_result: ExecutorDoInitialPruning() output for the plan tree
  *	sourceText: the source text of the query
  *	params: any parameters needed
  *	dest: where to send results
@@ -134,6 +138,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  */
 static void
 ProcessQuery(PlannedStmt *plan,
+			 PartitionPruneResult *part_prune_result,
 			 const char *sourceText,
 			 ParamListInfo params,
 			 QueryEnvironment *queryEnv,
@@ -145,7 +150,7 @@ ProcessQuery(PlannedStmt *plan,
 	/*
 	 * Create the QueryDesc object
 	 */
-	queryDesc = CreateQueryDesc(plan, sourceText,
+	queryDesc = CreateQueryDesc(plan, part_prune_result, sourceText,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, params, queryEnv, 0);
 
@@ -491,8 +496,14 @@ PortalStart(Portal portal, ParamListInfo params,
 				/*
 				 * Create QueryDesc in portal's context; for the moment, set
 				 * the destination to DestNone.
+				 *
+				 * There is no PartitionPruneResult unless the PlannedStmt is
+				 * from a CachedPlan.
 				 */
 				queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+											portal->cplan == NULL ? NULL :
+											linitial_node(PartitionPruneResult,
+											portal->cplan->part_prune_result_list),
 											portal->sourceText,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
@@ -1194,6 +1205,9 @@ PortalRunMulti(Portal portal,
 {
 	bool		active_snapshot_set = false;
 	ListCell   *stmtlist_item;
+	int			i;
+	List	   *part_prune_results = portal->cplan == NULL ? NIL :
+									 portal->cplan->part_prune_result_list;
 
 	/*
 	 * If the destination is DestRemoteExecute, change to DestNone.  The
@@ -1214,9 +1228,15 @@ PortalRunMulti(Portal portal,
 	 * Loop to handle the individual queries generated from a single parsetree
 	 * by analysis and rewrite.
 	 */
+	i = 0;
 	foreach(stmtlist_item, portal->stmts)
 	{
 		PlannedStmt *pstmt = lfirst_node(PlannedStmt, stmtlist_item);
+		PartitionPruneResult *part_prune_result = part_prune_results ?
+												  list_nth(part_prune_results, i) :
+												  NULL;
+
+		i++;
 
 		/*
 		 * If we got a cancel signal in prior command, quit
@@ -1274,7 +1294,7 @@ PortalRunMulti(Portal portal,
 			if (pstmt->canSetTag)
 			{
 				/* statement can set tag string */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, part_prune_result,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
@@ -1283,7 +1303,7 @@ PortalRunMulti(Portal portal,
 			else
 			{
 				/* stmt added by rewrite cannot set tag */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, part_prune_result,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 4cf6db504f..216401bcfb 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -99,14 +99,16 @@ static dlist_head cached_expression_list = DLIST_STATIC_INIT(cached_expression_l
 static void ReleaseGenericPlan(CachedPlanSource *plansource);
 static List *RevalidateCachedQuery(CachedPlanSource *plansource,
 								   QueryEnvironment *queryEnv);
-static bool CheckCachedPlan(CachedPlanSource *plansource);
+static bool CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams);
 static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 								   ParamListInfo boundParams, QueryEnvironment *queryEnv);
+static void CachedPlanSavePartitionPruneResults(CachedPlan *plan, List *part_prune_result_list);
 static bool choose_custom_plan(CachedPlanSource *plansource,
 							   ParamListInfo boundParams);
 static double cached_plan_cost(CachedPlan *plan, bool include_planner);
 static Query *QueryListGetPrimaryStmt(List *stmts);
-static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static List *AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams);
+static void ReleaseExecutorLocks(List *stmt_list, List *part_prune_result_list);
 static void AcquirePlannerLocks(List *stmt_list, bool acquire);
 static void ScanQueryForLocks(Query *parsetree, bool acquire);
 static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -790,9 +792,21 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
  *
  * On a "true" return, we have acquired the locks needed to run the plan.
  * (We must do this for the "true" result to be race-condition-free.)
+ *
+ * If the CachedPlan is valid, this may in some cases call
+ * ExecutorDoInitialPruning() on each PlannedStmt contained in it to determine
+ * the set of relations to be locked by AcquireExecutorLocks(), instead of just
+ * scanning its range table.  That is done to avoid locking relations scanned
+ * only by plan nodes that initial partition pruning finds need not be executed.
+ * The pruning results, one PartitionPruneResult per PlannedStmt, are allocated
+ * in a child context of the context containing the plan itself and saved in
+ * plan->part_prune_result_list.  The previous contents of the list from the
+ * last invocation on the same CachedPlan are deleted, because they would no
+ * longer be valid given the fresh set of parameter values which may be used as
+ * pruning parameters.
  */
 static bool
-CheckCachedPlan(CachedPlanSource *plansource)
+CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams)
 {
 	CachedPlan *plan = plansource->gplan;
 
@@ -820,13 +834,24 @@ CheckCachedPlan(CachedPlanSource *plansource)
 	 */
 	if (plan->is_valid)
 	{
+		List   *part_prune_result_list;
+
 		/*
 		 * Plan must have positive refcount because it is referenced by
 		 * plansource; so no need to fear it disappears under us here.
 		 */
 		Assert(plan->refcount > 0);
 
-		AcquireExecutorLocks(plan->stmt_list, true);
+		/*
+		 * Lock relations scanned by the plan.  If ExecutorDoInitialPruning()
+		 * asked to omit some relations because the plan nodes that scan them
+		 * were found to be pruned, the executor will be informed of the
+		 * omission of the plan nodes themselves via part_prune_result_list
+		 * that is passed to it along with the list of PlannedStmts, so that
+		 * it doesn't accidentally try to execute those nodes.
+		 */
+		part_prune_result_list = AcquireExecutorLocks(plan->stmt_list,
+													   boundParams);
 
 		/*
 		 * If plan was transient, check to see if TransactionXmin has
@@ -844,11 +869,14 @@ CheckCachedPlan(CachedPlanSource *plansource)
 		if (plan->is_valid)
 		{
 			/* Successfully revalidated and locked the query. */
+
+			/* Remember pruning results in the CachedPlan. */
+			CachedPlanSavePartitionPruneResults(plan, part_prune_result_list);
 			return true;
 		}
 
 		/* Oops, the race case happened.  Release useless locks. */
-		AcquireExecutorLocks(plan->stmt_list, false);
+		ReleaseExecutorLocks(plan->stmt_list, part_prune_result_list);
 	}
 
 	/*
@@ -880,10 +908,12 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 				ParamListInfo boundParams, QueryEnvironment *queryEnv)
 {
 	CachedPlan *plan;
-	List	   *plist;
+	List	   *plist,
+			   *dummy_part_prune_result_list;
 	bool		snapshot_set;
 	bool		is_transient;
-	MemoryContext plan_context;
+	MemoryContext plan_context,
+				  part_prune_result_context;
 	MemoryContext oldcxt = CurrentMemoryContext;
 	ListCell   *lc;
 
@@ -962,6 +992,16 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 	else
 		plan_context = CurrentMemoryContext;
 
+	/*
+	 * Also create a dedicated context for part_prune_result_list, making it
+	 * a child of plan_context.
+	 */
+	part_prune_result_context = AllocSetContextCreate(CurrentMemoryContext,
+													  "CachedPlan part_prune_results list",
+													  ALLOCSET_START_SMALL_SIZES);
+	MemoryContextSetParent(part_prune_result_context, plan_context);
+	MemoryContextSetIdentifier(part_prune_result_context, plan_context->ident);
+
 	/*
 	 * Create and fill the CachedPlan struct within the new context.
 	 */
@@ -977,10 +1017,20 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 	plan->planRoleId = GetUserId();
 	plan->dependsOnRole = plansource->dependsOnRLS;
 	is_transient = false;
+	dummy_part_prune_result_list = NIL;
 	foreach(lc, plist)
 	{
 		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc);
 
+		/*
+		 * Real values will be added during subsequent CheckCachedPlan() calls
+		 * on this plan, but must add "something" for now, because users of
+		 * CachedPlan expect stmt_list and part_prune_result_list to have
+		 * the same number of elements.
+		 */
+		dummy_part_prune_result_list = lappend(dummy_part_prune_result_list,
+											   NULL);
+
 		if (plannedstmt->commandType == CMD_UTILITY)
 			continue;			/* Ignore utility statements */
 
@@ -1002,6 +1052,13 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 	plan->is_saved = false;
 	plan->is_valid = true;
 
+	/*
+	 * While still dummy, save the list so that it is discarded on next use of
+	 * the CachedPlan.
+	 */
+	plan->part_prune_result_context = part_prune_result_context;
+	CachedPlanSavePartitionPruneResults(plan, dummy_part_prune_result_list);
+
 	/* assign generation number to new plan */
 	plan->generation = ++(plansource->generation);
 
@@ -1160,7 +1217,7 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 
 	if (!customplan)
 	{
-		if (CheckCachedPlan(plansource))
+		if (CheckCachedPlan(plansource, boundParams))
 		{
 			/* We want a generic plan, and we already have a valid one */
 			plan = plansource->gplan;
@@ -1586,6 +1643,36 @@ CopyCachedPlan(CachedPlanSource *plansource)
 	return newsource;
 }
 
+/*
+ * CachedPlanSavePartitionPruneResults
+ *		Save the list containing PartitionPruneResult nodes into the given
+ *		CachedPlan
+ *
+ * They must be held on to for the duration of a given execution of the
+ * CachedPlan.  The provided list is copied into a dedicated context that is
+ * a child of plan->context after dropping the existing contents of the list,
+ * because any PartitionPruneResult contained therein would no longer be
+ * valid for the current execution.
+ */
+static void
+CachedPlanSavePartitionPruneResults(CachedPlan *plan,
+									List *part_prune_result_list)
+{
+	MemoryContext	part_prune_result_context = plan->part_prune_result_context,
+					oldcontext = CurrentMemoryContext;
+	List		   *part_prune_result_list_copy;
+
+	/* First clear the existing contents of the list. */
+	Assert(MemoryContextIsValid(part_prune_result_context));
+	MemoryContextReset(part_prune_result_context);
+
+	MemoryContextSwitchTo(part_prune_result_context);
+	part_prune_result_list_copy = copyObject(part_prune_result_list);
+	MemoryContextSwitchTo(oldcontext);
+
+	plan->part_prune_result_list = part_prune_result_list_copy;
+}
+
 /*
  * CachedPlanIsValid: test whether the rewritten querytree within a
  * CachedPlanSource is currently valid (that is, not marked as being in need
@@ -1737,17 +1824,21 @@ QueryListGetPrimaryStmt(List *stmts)
 
 /*
  * AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
- * or release them if acquire is false.
+ *
+ * Returns a list of PartitionPruneResult nodes containing one element for each
+ * PlannedStmt in stmt_list or NULL if the latter is a utility statement or its
+ * containsInitialPruning is false.
  */
-static void
-AcquireExecutorLocks(List *stmt_list, bool acquire)
+static List *
+AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams)
 {
 	ListCell   *lc1;
+	List	   *part_prune_result_list = NIL;
 
 	foreach(lc1, stmt_list)
 	{
 		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
-		ListCell   *lc2;
+		PartitionPruneResult *part_prune_result = NULL;
 
 		if (plannedstmt->commandType == CMD_UTILITY)
 		{
@@ -1761,27 +1852,122 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 			Query	   *query = UtilityContainsQuery(plannedstmt->utilityStmt);
 
 			if (query)
-				ScanQueryForLocks(query, acquire);
-			continue;
+				ScanQueryForLocks(query, true);
 		}
-
-		foreach(lc2, plannedstmt->rtable)
+		else
 		{
-			RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
-
-			if (rte->rtekind != RTE_RELATION)
-				continue;
+			Bitmapset  *lockRelids;
+			int			rti;
 
 			/*
-			 * Acquire the appropriate type of lock on each relation OID. Note
-			 * that we don't actually try to open the rel, and hence will not
-			 * fail if it's been dropped entirely --- we'll just transiently
-			 * acquire a non-conflicting lock.
+			 * Figure out the set of relations that would need to be locked
+			 * before executing the plan.
 			 */
-			if (acquire)
+			if (plannedstmt->containsInitialPruning)
+			{
+				/*
+				 * Obtain the set of partitions to be locked from the
+				 * PartitionPruneInfos by considering the result of performing
+				 * initial partition pruning.
+				 */
+				part_prune_result =
+					ExecutorDoInitialPruning(plannedstmt, boundParams);
+
+				lockRelids = bms_union(plannedstmt->minLockRelids,
+									   part_prune_result->scan_leafpart_rtis);
+			}
+			else
+				lockRelids = plannedstmt->minLockRelids;
+
+			rti = -1;
+			while ((rti = bms_next_member(lockRelids, rti)) >= 0)
+			{
+				RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
+
+				if (rte->rtekind != RTE_RELATION)
+					continue;
+
+				/*
+				 * Acquire the appropriate type of lock on each relation OID.
+				 * Note that we don't actually try to open the rel, and hence
+				 * will not fail if it's been dropped entirely --- we'll just
+				 * transiently acquire a non-conflicting lock.
+				 */
 				LockRelationOid(rte->relid, rte->rellockmode);
+			}
+		}
+
+		/*
+		 * Remember PartitionPruneResult for later adding to the QueryDesc that
+		 * will be passed to the executor when executing this plan.  May be
+		 * NULL, but must keep the list the same length as stmt_list.
+		 */
+		part_prune_result_list = lappend(part_prune_result_list,
+										 part_prune_result);
+	}
+
+	return part_prune_result_list;
+}
+
+/*
+ * ReleaseExecutorLocks
+ * 		Release locks that would've been acquired by an earlier call to
+ * 		AcquireExecutorLocks()
+ */
+static void
+ReleaseExecutorLocks(List *stmt_list, List *part_prune_result_list)
+{
+	ListCell   *lc1,
+			   *lc2;
+
+	forboth(lc1, stmt_list, lc2, part_prune_result_list)
+	{
+		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
+		PartitionPruneResult *part_prune_result = lfirst_node(PartitionPruneResult, lc2);
+
+		if (plannedstmt->commandType == CMD_UTILITY)
+		{
+			/*
+			 * Ignore utility statements, except those (such as EXPLAIN) that
+			 * contain a parsed-but-not-planned query.  Note: it's okay to use
+			 * ScanQueryForLocks, even though the query hasn't been through
+			 * rule rewriting, because rewriting doesn't change the query
+			 * representation.
+			 */
+			Query	   *query = UtilityContainsQuery(plannedstmt->utilityStmt);
+
+			if (query)
+				ScanQueryForLocks(query, false);
+		}
+		else
+		{
+			Bitmapset  *lockRelids;
+			int			rti;
+
+			if (part_prune_result == NULL)
+			{
+				Assert(!plannedstmt->containsInitialPruning);
+				lockRelids = plannedstmt->minLockRelids;
+			}
 			else
+			{
+				Assert(plannedstmt->containsInitialPruning);
+				lockRelids = bms_union(plannedstmt->minLockRelids,
+									   part_prune_result->scan_leafpart_rtis);
+			}
+
+			rti = -1;
+			while ((rti = bms_next_member(lockRelids, rti)) >= 0)
+			{
+				RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
+
+				if (rte->rtekind != RTE_RELATION)
+					continue;
+
+				/* See the comment in AcquireExecutorLocks(). */
 				UnlockRelationOid(rte->relid, rte->rellockmode);
+			}
+
 		}
 	}
 }
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 666977fb1f..34975c69ee 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -87,7 +87,8 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
 							  ExplainState *es, const char *queryString,
 							  ParamListInfo params, QueryEnvironment *queryEnv);
 
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt, PartitionPruneResult *part_prune_result,
+						   IntoClause *into,
 						   ExplainState *es, const char *queryString,
 						   ParamListInfo params, QueryEnvironment *queryEnv,
 						   const instr_time *planduration,
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 708435e952..bd8776402e 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -45,6 +45,7 @@ extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
  * nparts						Length of subplan_map[] and subpart_map[].
  * subplan_map					Subplan index by partition index, or -1.
  * subpart_map					Subpart index by partition index, or -1.
+ * rti_map						Range table index by partition index, or 0.
  * present_parts				A Bitmapset of the partition indexes that we
  *								have subplans or subparts for.
  * initial_pruning_steps		List of PartitionPruneSteps used to
@@ -61,6 +62,7 @@ typedef struct PartitionedRelPruningData
 	int			nparts;
 	int		   *subplan_map;
 	int		   *subpart_map;
+	Index	   *rti_map;
 	Bitmapset  *present_parts;
 	List	   *initial_pruning_steps;
 	List	   *exec_pruning_steps;
@@ -123,9 +125,13 @@ typedef struct PartitionPruneState
 
 extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
 													 int n_total_subplans,
-													 PartitionPruneInfo *pruneinfo,
+													 int part_prune_index,
 													 Bitmapset **initially_valid_subplans);
 extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
-										   bool initial_prune);
-
+										   bool initial_prune,
+										   Bitmapset **scan_leafpart_rtis);
+extern Bitmapset *ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt,
+								ParamListInfo params,
+								PartitionPruneInfo *pruneinfo,
+								Bitmapset **scan_leafpart_rtis);
 #endif							/* EXECPARTITION_H */
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index e79e2c001f..60d5644908 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -35,6 +35,8 @@ typedef struct QueryDesc
 	/* These fields are provided by CreateQueryDesc */
 	CmdType		operation;		/* CMD_SELECT, CMD_UPDATE, etc. */
 	PlannedStmt *plannedstmt;	/* planner's output (could be utility, too) */
+	PartitionPruneResult *part_prune_result; /* ExecutorDoInitialPruning()'s
+											  * output for plannedstmt */
 	const char *sourceText;		/* source text of the query */
 	Snapshot	snapshot;		/* snapshot to use for query */
 	Snapshot	crosscheck_snapshot;	/* crosscheck for RI update/delete */
@@ -57,6 +59,7 @@ typedef struct QueryDesc
 
 /* in pquery.c */
 extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
+								  PartitionPruneResult *part_prune_result,
 								  const char *sourceText,
 								  Snapshot snapshot,
 								  Snapshot crosscheck_snapshot,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index 873772f188..57dc0e8077 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -185,6 +185,8 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
 /*
  * prototypes from functions in execMain.c
  */
+extern PartitionPruneResult *ExecutorDoInitialPruning(PlannedStmt *plannedstmt,
+													  ParamListInfo params);
 extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
 extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
 extern void ExecutorRun(QueryDesc *queryDesc,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index cbbcff81d2..b5a7fd7e16 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -596,6 +596,8 @@ typedef struct EState
 	struct ExecRowMark **es_rowmarks;	/* Array of per-range-table-entry
 										 * ExecRowMarks, or NULL if none */
 	PlannedStmt *es_plannedstmt;	/* link to top of plan tree */
+	List		*es_part_prune_infos;	/* PlannedStmt.partPruneInfos */
+	struct PartitionPruneResult *es_part_prune_result; /* QueryDesc.part_prune_result */
 	const char *es_sourceText;	/* Source text from QueryDesc */
 
 	JunkFilter *es_junkFilter;	/* top-level junk filter, if any */
@@ -984,6 +986,19 @@ typedef struct DomainConstraintState
  */
 typedef TupleTableSlot *(*ExecProcNodeMtd) (struct PlanState *pstate);
 
+/*----------------
+ * PartitionPruneResult
+ *
+ * Result of ExecutorDoInitialPruning() invocation on a given plan.
+ */
+typedef struct PartitionPruneResult
+{
+	NodeTag		type;
+
+	Bitmapset	   *scan_leafpart_rtis;
+	List		   *valid_subplan_offs_list;
+} PartitionPruneResult;
+
 /* ----------------
  *		PlanState node
  *
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 300824258e..de312b9215 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -97,6 +97,9 @@ typedef enum NodeTag
 	T_PartitionPruneStepCombine,
 	T_PlanInvalItem,
 
+	/* TAGS FOR EXECUTOR PREP NODES (execnodes.h) */
+	T_PartitionPruneResult,
+
 	/*
 	 * TAGS FOR PLAN STATE NODES (execnodes.h)
 	 *
@@ -673,6 +676,7 @@ extern struct Bitmapset *readBitmapset(void);
 extern uintptr_t readDatum(bool typbyval);
 extern bool *readBoolCols(int numCols);
 extern int *readIntCols(int numCols);
+extern Index *readIndexCols(int numCols);
 extern Oid *readOidCols(int numCols);
 extern int16 *readAttrNumberCols(int numCols);
 
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 6cbcb67bdf..f2039071c9 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -107,6 +107,18 @@ typedef struct PlannerGlobal
 
 	List	   *appendRelations;	/* "flat" list of AppendRelInfos */
 
+	List	   *partPruneInfos;		/* List of PartitionPruneInfo contained in
+									 * the plan */
+
+	bool		containsInitialPruning;	/* Do any of those PartitionPruneInfos
+										 * have initial (pre-exec) pruning
+										 * steps in them? */
+
+	Bitmapset  *minLockRelids;	/* RT indexes of RTE_RELATION entries that
+								 * must always be locked to execute the plan;
+								 * those scanned by initial-prunable plan
+								 * nodes are not included */
+
 	List	   *relationOids;	/* OIDs of relations the plan depends on */
 
 	List	   *invalItems;		/* other dependencies, as PlanInvalItems */
@@ -377,6 +389,9 @@ struct PlannerInfo
 
 	/* Does this query modify any partition key columns? */
 	bool		partColsUpdated;
+
+	/* PartitionPruneInfos added in this query's plan. */
+	List	   *partPruneInfos;
 };
 
 
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 10dd35f011..ecdc950fde 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -64,8 +64,19 @@ typedef struct PlannedStmt
 
 	struct Plan *planTree;		/* tree of Plan nodes */
 
+	List	   *partPruneInfos;	/* List of PartitionPruneInfo contained in
+								 * the plan */
+
+	bool		containsInitialPruning;	/* Do any of those PartitionPruneInfos
+										 * have initial (pre-exec) pruning
+										 * steps in them? */
+
 	List	   *rtable;			/* list of RangeTblEntry nodes */
 
+	Bitmapset  *minLockRelids;	/* RT indexes of RTE_RELATION entries that
+								 * must be locked, except those scanned by
+								 * initial-prunable plan nodes */
+
 	/* rtable indexes of target relations for INSERT/UPDATE/DELETE */
 	List	   *resultRelations;	/* integer list of RT indexes, or NIL */
 
@@ -262,8 +273,12 @@ typedef struct Append
 	 */
 	int			first_partial_plan;
 
-	/* Info for run-time subplan pruning; NULL if we're not doing that */
-	struct PartitionPruneInfo *part_prune_info;
+	/*
+	 * Index of this plan's PartitionPruneInfo in PlannedStmt.partPruneInfos
+	 * to be used for run-time subplan pruning; -1 if run-time pruning is
+	 * not needed.
+	 */
+	int			part_prune_index;
 } Append;
 
 /* ----------------
@@ -282,8 +297,13 @@ typedef struct MergeAppend
 	Oid		   *sortOperators;	/* OIDs of operators to sort them by */
 	Oid		   *collations;		/* OIDs of collations */
 	bool	   *nullsFirst;		/* NULLS FIRST/LAST directions */
-	/* Info for run-time subplan pruning; NULL if we're not doing that */
-	struct PartitionPruneInfo *part_prune_info;
+
+	/*
+	 * Index of this plan's PartitionPruneInfo in PlannedStmt.partPruneInfos
+	 * to be used for run-time subplan pruning; -1 if run-time pruning is
+	 * not needed.
+	 */
+	int			part_prune_index;
 } MergeAppend;
 
 /* ----------------
@@ -1187,6 +1207,13 @@ typedef struct PlanRowMark
  * prune_infos			List of Lists containing PartitionedRelPruneInfo nodes,
  *						one sublist per run-time-prunable partition hierarchy
  *						appearing in the parent plan node's subplans.
+ *
+ * needs_init_pruning	Does any of the PartitionedRelPruneInfos in
+ *						prune_infos have its initial_pruning_steps set?
+ *
+ * needs_exec_pruning	Does any of the PartitionedRelPruneInfos in
+ *						prune_infos have its exec_pruning_steps set?
+ *
  * other_subplans		Indexes of any subplans that are not accounted for
  *						by any of the PartitionedRelPruneInfo nodes in
  *						"prune_infos".  These subplans must not be pruned.
@@ -1195,6 +1222,9 @@ typedef struct PartitionPruneInfo
 {
 	NodeTag		type;
 	List	   *prune_infos;
+	Bitmapset  *leafpart_rtis;
+	bool		needs_init_pruning;
+	bool		needs_exec_pruning;
 	Bitmapset  *other_subplans;
 } PartitionPruneInfo;
 
@@ -1225,6 +1255,7 @@ typedef struct PartitionedRelPruneInfo
 	int		   *subplan_map;	/* subplan index by partition index, or -1 */
 	int		   *subpart_map;	/* subpart index by partition index, or -1 */
 	Oid		   *relid_map;		/* relation OID by partition index, or 0 */
+	Index	   *rti_map;		/* range table index by partition index, or 0 */
 
 	/*
 	 * initial_pruning_steps shows how to prune during executor startup (i.e.,
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 95b99e3d25..fd7f129aea 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -148,6 +148,9 @@ typedef struct CachedPlan
 {
 	int			magic;			/* should equal CACHEDPLAN_MAGIC */
 	List	   *stmt_list;		/* list of PlannedStmts */
+	List	   *part_prune_result_list;	/* list of PartitionPruneResult with
+										 * one element for each of stmt_list;
+										 * NIL if not a generic plan */
 	bool		is_oneshot;		/* is it a "oneshot" plan? */
 	bool		is_saved;		/* is CachedPlan in a long-lived context? */
 	bool		is_valid;		/* is the stmt_list currently valid? */
@@ -158,6 +161,10 @@ typedef struct CachedPlan
 	int			generation;		/* parent's generation number for this plan */
 	int			refcount;		/* count of live references to this struct */
 	MemoryContext context;		/* context containing this CachedPlan */
+	MemoryContext part_prune_result_context; /* context containing
+											  * part_prune_result_list,
+											  * a child of the above
+											  * context */
 } CachedPlan;
 
 /*
-- 
2.24.1



^ permalink  raw  reply  [nested|flat] 108+ messages in thread

* Re: generic plans and "initial" pruning
@ 2022-04-07 12:41  David Rowley <[email protected]>
  parent: Amit Langote <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: David Rowley @ 2022-04-07 12:41 UTC (permalink / raw)
  To: Amit Langote <[email protected]>; +Cc: Robert Haas <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers

On Thu, 7 Apr 2022 at 20:28, Amit Langote <[email protected]> wrote:
> Here's an updated version.  In particular, I removed the
> part_prune_results list from PortalData; anything that
> needs to look at the list can instead get it from the CachedPlan
> (PortalData.cplan).  This makes things better in 2 ways:

Thanks for making those changes.

I'm not overly familiar with the data structures we use for passing
plans around between the planner and executor, but storing the pruning
results in CachedPlan seems pretty bad. I see you've stashed it in
there and invented a new memory context to stop leaks into the cache
memory.

Since I'm not overly familiar with these structures, I'm trying to
imagine why you made that choice and the best I can come up with was
that it was the most convenient thing you had to hand inside
CheckCachedPlan().

I don't really have any great ideas right now on how to make this
better. I wonder if GetCachedPlan() should be changed to return some
struct that wraps up the CachedPlan with some sort of executor prep
info struct that we can stash the list of PartitionPruneResults in,
and perhaps something else one day.

Some less important things that I think could be done better:

* Are you also able to put meaningful comments on the
PartitionPruneResult struct in execnodes.h?

* In create_append_plan() and create_merge_append_plan() you have the
same code to set the part_prune_index. Why not just move all that code
into make_partition_pruneinfo() and have make_partition_pruneinfo()
return the index and append to the PlannerInfo.partPruneInfos List?

* Why not forboth() here?

    i = 0;
    foreach(stmtlist_item, portal->stmts)
    {
        PlannedStmt *pstmt = lfirst_node(PlannedStmt, stmtlist_item);
        PartitionPruneResult *part_prune_result = part_prune_results ?
            list_nth(part_prune_results, i) :
            NULL;

        i++;

* It would be good if ReleaseExecutorLocks() already knew the RTIs
that were locked. Maybe the executor prep info struct I mentioned
above could also store the RTIs that have been locked already and
allow ReleaseExecutorLocks() to just iterate over those to release the
locks.

David






^ permalink  raw  reply  [nested|flat] 108+ messages in thread

* Re: generic plans and "initial" pruning
@ 2022-04-08 05:49  Amit Langote <[email protected]>
  parent: David Rowley <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: Amit Langote @ 2022-04-08 05:49 UTC (permalink / raw)
  To: David Rowley <[email protected]>; +Cc: Robert Haas <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers

On Thu, Apr 7, 2022 at 9:41 PM David Rowley <[email protected]> wrote:
> On Thu, 7 Apr 2022 at 20:28, Amit Langote <[email protected]> wrote:
> > Here's an updated version.  In particular, I removed the
> > part_prune_results list from PortalData; anything that
> > needs to look at the list can instead get it from the CachedPlan
> > (PortalData.cplan).  This makes things better in 2 ways:
>
> Thanks for making those changes.
>
> I'm not overly familiar with the data structures we use for passing
> plans around between the planner and executor, but storing the pruning
> results in CachedPlan seems pretty bad. I see you've stashed it in
> there and invented a new memory context to stop leaks into the cache
> memory.
>
> Since I'm not overly familiar with these structures, I'm trying to
> imagine why you made that choice and the best I can come up with was
> that it was the most convenient thing you had to hand inside
> CheckCachedPlan().

Yeah, it's that way because it felt convenient, though I have wondered
if a simpler scheme that doesn't require any changes to the CachedPlan
data structure might be better after all.  Your pointing it out has
made me think a bit harder on that.

> I don't really have any great ideas right now on how to make this
> better. I wonder if GetCachedPlan() should be changed to return some
> struct that wraps up the CachedPlan with some sort of executor prep
> info struct that we can stash the list of PartitionPruneResults in,
> and perhaps something else one day.

I think what might be better to do now is just add an output List
parameter to GetCachedPlan(), to which the PartitionPruneResult nodes
get added, instead of stashing them into CachedPlan as now.  IMHO, we should
leave inventing a new generic struct to the next project that will
make it necessary to return more information from GetCachedPlan() to
its users.  I find it hard to convincingly describe what the new
generic struct really is if we invent it *now*, when it's going to
carry a single list whose purpose is pretty narrow.

So, I've implemented this by making the callers of GetCachedPlan()
pass a list to add the PartitionPruneResults that may be produced.
Most callers can put that into the Portal for passing that to other
modules, so I have reinstated PortalData.part_prune_results.  As for
its memory management, the list and the PartitionPruneResults therein
will be allocated in a context that holds the Portal itself.
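
To make the out-parameter scheme concrete, here is a minimal stand-alone
sketch.  It is only a model: Plan, PruneResult, ResultList and
get_cached_plan() are hypothetical stand-ins, not the structs or the
GetCachedPlan() signature from the patch.  The point it shows is that the
cached object is left untouched while the per-execution results land in
storage owned by the caller.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_STMTS 8

/* Hypothetical stand-ins; these are not the PostgreSQL structs. */
typedef struct Plan { int nstmts; } Plan;
typedef struct PruneResult { int nvalid_subplans; } PruneResult;
typedef struct ResultList
{
    int         len;
    PruneResult items[MAX_STMTS];
} ResultList;

/* The long-lived cached plan; never modified by a lookup. */
static Plan cached_plan = { 2 };

/*
 * Return the cached plan and fill *results with one entry per statement.
 * The results live in caller-owned storage, so nothing per-execution is
 * stashed in the cached plan itself.
 */
static Plan *
get_cached_plan(ResultList *results)
{
    assert(results != NULL);
    assert(cached_plan.nstmts <= MAX_STMTS);

    results->len = 0;
    for (int i = 0; i < cached_plan.nstmts; i++)
        results->items[results->len++] =
            (PruneResult) { .nvalid_subplans = i + 1 };

    return &cached_plan;
}
```

A caller would declare a ResultList on its own stack (or in its own memory
context, in PostgreSQL terms) and pass its address, which mirrors the
list-length Assert that the callers in the patch perform after the call.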

> Some less important things that I think could be done better:
>
> * Are you also able to put meaningful comments on the
> PartitionPruneResult struct in execnodes.h?
>
> * In create_append_plan() and create_merge_append_plan() you have the
> same code to set the part_prune_index. Why not just move all that code
> into make_partition_pruneinfo() and have make_partition_pruneinfo()
> return the index and append to the PlannerInfo.partPruneInfos List?

That sounds better, so done.

> * Why not forboth() here?
>
>     i = 0;
>     foreach(stmtlist_item, portal->stmts)
>     {
>         PlannedStmt *pstmt = lfirst_node(PlannedStmt, stmtlist_item);
>         PartitionPruneResult *part_prune_result = part_prune_results ?
>             list_nth(part_prune_results, i) :
>             NULL;
>
>         i++;

Because the PartitionPruneResult list may not always be available.  To
wit, it's only available when it is GetCachedPlan() that gave the
portal its plan.  I know this is a bit ugly, but it seems better than
fixing all users of Portal to build a dummy list, not that it is
totally avoidable even in the current implementation.
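
The constraint described above (the second list may be absent altogether)
can be sketched in stand-alone C as follows.  This is a simplified model
using plain arrays; PruneResult, result_for_stmt() and count_paired() are
hypothetical names, not code from the patch.  A lockstep iteration over two
lists assumes both exist, whereas an index-based lookup can simply yield
NULL when there is no result array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for PartitionPruneResult. */
typedef struct PruneResult { int nvalid_subplans; } PruneResult;

/*
 * Return the pruning result for statement number i, or NULL when the
 * caller has no result array at all (in this model: the plan did not
 * come from the plan cache).
 */
static const PruneResult *
result_for_stmt(const PruneResult *results, int nstmts, int i)
{
    assert(i >= 0 && i < nstmts);
    return results ? &results[i] : NULL;
}

/* Count how many statements end up paired with a pruning result. */
static int
count_paired(const PruneResult *results, int nstmts)
{
    int paired = 0;

    for (int i = 0; i < nstmts; i++)
    {
        if (result_for_stmt(results, nstmts, i) != NULL)
            paired++;
    }
    return paired;
}
```

With a NULL array every statement gets a NULL result, so no dummy list has
to be built by callers that never had pruning results to begin with.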

> * It would be good if ReleaseExecutorLocks() already knew the RTIs
> that were locked. Maybe the executor prep info struct I mentioned
> above could also store the RTIs that have been locked already and
> allow ReleaseExecutorLocks() to just iterate over those to release the
> locks.

Rewrote this such that ReleaseExecutorLocks() just receives a list of
per-PlannedStmt bitmapsets containing the RT indexes of only the
locked entries in that plan.
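
As a rough illustration of that idea, the stand-alone sketch below records
the locked RT indexes in a bitmask at acquire time and has the release
step walk only that mask.  It is a model under stated assumptions: a
uint64_t stands in for a Bitmapset, lock_count[] is a toy lock table, and
acquire_locks()/release_locks() are hypothetical names, not the functions
in plancache.c.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_RTI 64

/* Toy lock table indexed by RT index (hypothetical, for illustration). */
static int lock_count[MAX_RTI];

/*
 * Lock the relations that must always be locked plus those scanned by
 * subplans that survived initial pruning.  Returns the set actually
 * locked so the caller can hand it back for release.
 */
static uint64_t
acquire_locks(uint64_t min_lock_rtis, uint64_t surviving_leafpart_rtis)
{
    uint64_t locked = min_lock_rtis | surviving_leafpart_rtis;

    for (int rti = 0; rti < MAX_RTI; rti++)
    {
        if (locked & (UINT64_C(1) << rti))
            lock_count[rti]++;
    }
    return locked;
}

/* Release exactly the locks recorded in "locked", and nothing else. */
static void
release_locks(uint64_t locked)
{
    for (int rti = 0; rti < MAX_RTI; rti++)
    {
        if (locked & (UINT64_C(1) << rti))
        {
            assert(lock_count[rti] > 0);
            lock_count[rti]--;
        }
    }
}
```

Because the release step consumes the set produced by the acquire step, it
does not need to redo pruning or rescan the range table to find out what
was locked.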

Attached updated patch with these changes.



--
Amit Langote
EDB: http://www.enterprisedb.com


Attachments:

  [application/octet-stream] v13-0001-Optimize-AcquireExecutorLocks-to-skip-pruned-par.patch (99.1K, 2-v13-0001-Optimize-AcquireExecutorLocks-to-skip-pruned-par.patch)
  download | inline diff:
From 3c0c7f9f5f8bdf89c6afd06e26ba6d5490af9221 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Wed, 22 Dec 2021 16:55:17 +0900
Subject: [PATCH v13] Optimize AcquireExecutorLocks() to skip pruned partitions

---
 src/backend/commands/copyto.c           |   2 +-
 src/backend/commands/createas.c         |   2 +-
 src/backend/commands/explain.c          |   7 +-
 src/backend/commands/extension.c        |   2 +-
 src/backend/commands/matview.c          |   2 +-
 src/backend/commands/prepare.c          |  26 ++-
 src/backend/executor/README             |  27 +++
 src/backend/executor/execMain.c         |  46 +++++
 src/backend/executor/execParallel.c     |  28 ++-
 src/backend/executor/execPartition.c    | 238 ++++++++++++++++++++----
 src/backend/executor/execUtils.c        |   1 +
 src/backend/executor/functions.c        |   2 +-
 src/backend/executor/nodeAppend.c       |  16 +-
 src/backend/executor/nodeMergeAppend.c  |   9 +-
 src/backend/executor/spi.c              |  27 ++-
 src/backend/nodes/copyfuncs.c           |  33 +++-
 src/backend/nodes/outfuncs.c            |  36 +++-
 src/backend/nodes/readfuncs.c           |  56 +++++-
 src/backend/optimizer/plan/createplan.c |  25 +--
 src/backend/optimizer/plan/planner.c    |   3 +
 src/backend/optimizer/plan/setrefs.c    | 104 ++++++++---
 src/backend/partitioning/partprune.c    |  59 +++++-
 src/backend/tcop/postgres.c             |   8 +-
 src/backend/tcop/pquery.c               |  25 ++-
 src/backend/utils/cache/plancache.c     | 184 +++++++++++++++---
 src/backend/utils/mmgr/portalmem.c      |  19 ++
 src/include/commands/explain.h          |   3 +-
 src/include/executor/execPartition.h    |  12 +-
 src/include/executor/execdesc.h         |   3 +
 src/include/executor/executor.h         |   2 +
 src/include/nodes/execnodes.h           |  30 +++
 src/include/nodes/nodes.h               |   4 +
 src/include/nodes/pathnodes.h           |  15 ++
 src/include/nodes/plannodes.h           |  39 +++-
 src/include/partitioning/partprune.h    |   8 +-
 src/include/utils/plancache.h           |   3 +-
 src/include/utils/portal.h              |   3 +
 37 files changed, 942 insertions(+), 167 deletions(-)

diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index 55c38b04c4..d403eb2309 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -542,7 +542,7 @@ BeginCopyTo(ParseState *pstate,
 		((DR_copy *) dest)->cstate = cstate;
 
 		/* Create a QueryDesc requesting no output */
-		cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+		cstate->queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
 											dest, NULL, NULL, 0);
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 9abbb6b555..f6607f2454 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -325,7 +325,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
 		UpdateActiveSnapshotCommandId();
 
 		/* Create a QueryDesc, redirecting output to our tuple receiver */
-		queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+		queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
 									GetActiveSnapshot(), InvalidSnapshot,
 									dest, params, queryEnv, 0);
 
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 1e5701b8eb..7ba9852e51 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -407,7 +407,7 @@ ExplainOneQuery(Query *query, int cursorOptions,
 		}
 
 		/* run it (if needed) and produce output */
-		ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+		ExplainOnePlan(plan, NULL, into, es, queryString, params, queryEnv,
 					   &planduration, (es->buffers ? &bufusage : NULL));
 	}
 }
@@ -515,7 +515,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
  * to call it.
  */
 void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, PartitionPruneResult *part_prune_result,
+			   IntoClause *into, ExplainState *es,
 			   const char *queryString, ParamListInfo params,
 			   QueryEnvironment *queryEnv, const instr_time *planduration,
 			   const BufferUsage *bufusage)
@@ -563,7 +564,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
 		dest = None_Receiver;
 
 	/* Create a QueryDesc for the query */
-	queryDesc = CreateQueryDesc(plannedstmt, queryString,
+	queryDesc = CreateQueryDesc(plannedstmt, part_prune_result, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, params, queryEnv, instrument_option);
 
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index 1013790dbb..54734a3a93 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -776,7 +776,7 @@ execute_sql_string(const char *sql)
 			{
 				QueryDesc  *qdesc;
 
-				qdesc = CreateQueryDesc(stmt,
+				qdesc = CreateQueryDesc(stmt, NULL,
 										sql,
 										GetActiveSnapshot(), NULL,
 										dest, NULL, NULL, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 9ab248d25e..2be1782bc4 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -416,7 +416,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
 	UpdateActiveSnapshotCommandId();
 
 	/* Create a QueryDesc, redirecting output to our tuple receiver */
-	queryDesc = CreateQueryDesc(plan, queryString,
+	queryDesc = CreateQueryDesc(plan, NULL, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, NULL, NULL, 0);
 
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 80738547ed..c7360712b1 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -155,6 +155,7 @@ ExecuteQuery(ParseState *pstate,
 	PreparedStatement *entry;
 	CachedPlan *cplan;
 	List	   *plan_list;
+	List	   *part_prune_result_list;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
 	Portal		portal;
@@ -193,7 +194,10 @@ ExecuteQuery(ParseState *pstate,
 									   entry->plansource->query_string);
 
 	/* Replan if needed, and increment plan refcount for portal */
-	cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL);
+	cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL,
+						  &part_prune_result_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_result_list));
 	plan_list = cplan->stmt_list;
 
 	/*
@@ -207,6 +211,9 @@ ExecuteQuery(ParseState *pstate,
 					  plan_list,
 					  cplan);
 
+	/* Copy PartitionPruneResults into the portal's context. */
+	PortalStorePartitionPruneResults(portal, part_prune_result_list);
+
 	/*
 	 * For CREATE TABLE ... AS EXECUTE, we must verify that the prepared
 	 * statement is one that produces tuples.  Currently we insist that it be
@@ -576,7 +583,9 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	const char *query_string;
 	CachedPlan *cplan;
 	List	   *plan_list;
-	ListCell   *p;
+	List	   *part_prune_result_list;
+	ListCell   *p,
+			   *pp;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
 	instr_time	planstart;
@@ -619,7 +628,10 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 
 	/* Replan if needed, and acquire a transient refcount */
 	cplan = GetCachedPlan(entry->plansource, paramLI,
-						  CurrentResourceOwner, queryEnv);
+						  CurrentResourceOwner, queryEnv,
+						  &part_prune_result_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_result_list));
 
 	INSTR_TIME_SET_CURRENT(planduration);
 	INSTR_TIME_SUBTRACT(planduration, planstart);
@@ -634,13 +646,15 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	plan_list = cplan->stmt_list;
 
 	/* Explain each query */
-	foreach(p, plan_list)
+	forboth(p, plan_list, pp, part_prune_result_list)
 	{
 		PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
+		PartitionPruneResult *part_prune_result = lfirst_node(PartitionPruneResult, pp);
 
 		if (pstmt->commandType != CMD_UTILITY)
-			ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv,
-						   &planduration, (es->buffers ? &bufusage : NULL));
+			ExplainOnePlan(pstmt, part_prune_result, into, es, query_string,
+						   paramLI, queryEnv, &planduration,
+						   (es->buffers ? &bufusage : NULL));
 		else
 			ExplainOneUtility(pstmt->utilityStmt, into, es, query_string,
 							  paramLI, queryEnv);
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 0b5183fc4a..e0802be723 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -65,6 +65,29 @@ found there.  This currently only occurs for Append and MergeAppend nodes.  In
 this case the non-required subplans are ignored and the executor state's
 subnode array will become out of sequence to the plan's subplan list.
 
+Actually, the so-called execution time pruning may also occur even before the
+execution has started.  One case where that occurs is when a cached generic
+plan is being validated for execution by plancache.c: GetCachedPlan(), which
+proceeds by locking all the relations that will be scanned by that plan.  If
+the generic plan contains nodes that can perform execution time partition
+pruning (that is, contain a PartitionPruneInfo), a subset of pruning steps
+contained in the PartitionPruneInfos that do not depend on execution actually
+having started (called "initial" pruning steps) are performed at this point
+to figure out the minimal set of child subplans that satisfy those pruning
+instructions.  AcquireExecutorLocks() looking at a particular plan will then
+lock only the relations scanned by those surviving subplans (along with those
+present in PlannedStmt.minLockRelids), and ignore those scanned by the pruned
+subplans, even though the pruned subplans themselves are not removed from the
+plan tree.  The result of pruning (that is, the set of indexes of surviving
+subplans in their parent's list of child subplans) is saved as a list of
+bitmapsets, with one element for every PartitionPruneInfo referenced in the
+plan (PlannedStmt.partPruneInfos).  The list is packaged into a
+PartitionPruneResult node, which is passed along with the PlannedStmt to the
+executor via the QueryDesc.  It is imperative that the executor and any third
+party code invoked by it that gets passed the plan tree look at the plan's
+PartitionPruneResult to determine whether a particular child subplan of a
+parent node that supports pruning is valid for a given execution.
+
 Each Plan node may have expression trees associated with it, to represent
 its target list, qualification conditions, etc.  These trees are also
 read-only to the executor, but the executor state for expression evaluation
@@ -286,6 +309,10 @@ Query Processing Control Flow
 
 This is a sketch of control flow for full query processing:
 
+  [ ExecutorDoInitialPruning ] --- an optional step to perform initial
+		partition pruning on the plan tree, the result of which is passed
+		to the executor via QueryDesc
+
 	CreateQueryDesc
 
 	ExecutorStart
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index ef2fd46092..05cc99df8f 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -49,11 +49,13 @@
 #include "commands/matview.h"
 #include "commands/trigger.h"
 #include "executor/execdebug.h"
+#include "executor/execPartition.h"
 #include "executor/nodeSubplan.h"
 #include "foreign/fdwapi.h"
 #include "jit/jit.h"
 #include "mb/pg_wchar.h"
 #include "miscadmin.h"
+#include "nodes/nodeFuncs.h"
 #include "parser/parsetree.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
@@ -104,6 +106,47 @@ static void EvalPlanQualStart(EPQState *epqstate, Plan *planTree);
 
 /* end of local decls */
 
+/* ----------------------------------------------------------------
+ *		ExecutorDoInitialPruning
+ *
+ *		Performs initial partition pruning to figure out the minimal set of
+ *		subplans to be executed and the set of RT indexes of the corresponding
+ *		leaf partitions
+ *
+ * Returned PartitionPruneResult must be subsequently passed to the executor
+ * so that it can reuse the result of pruning.  It's important that the
+ * executor has the same view of which partitions are initially pruned (by
+ * not doing the pruning again itself) or otherwise it risks initializing
+ * subplans whose partitions would not have been locked.
+ *
+ * Note: Partitioned tables mentioned in PartitionedRelPruneInfo nodes that
+ * drive the pruning will be locked before doing the pruning.
+ */
+PartitionPruneResult *
+ExecutorDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params)
+{
+	PartitionPruneResult *result;
+	ListCell *lc;
+
+	/* Only get here if there is any pruning to do. */
+	Assert(plannedstmt->containsInitialPruning);
+
+	result = makeNode(PartitionPruneResult);
+	foreach(lc, plannedstmt->partPruneInfos)
+	{
+		PartitionPruneInfo *pruneinfo = lfirst(lc);
+		Bitmapset *valid_subplan_offs;
+
+		valid_subplan_offs =
+			ExecPartitionDoInitialPruning(plannedstmt, params, pruneinfo,
+										  &result->scan_leafpart_rtis);
+		result->valid_subplan_offs_list =
+			lappend(result->valid_subplan_offs_list,
+					valid_subplan_offs);
+	}
+
+	return result;
+}
 
 /* ----------------------------------------------------------------
  *		ExecutorStart
@@ -806,6 +849,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 {
 	CmdType		operation = queryDesc->operation;
 	PlannedStmt *plannedstmt = queryDesc->plannedstmt;
+	PartitionPruneResult *part_prune_result = queryDesc->part_prune_result;
 	Plan	   *plan = plannedstmt->planTree;
 	List	   *rangeTable = plannedstmt->rtable;
 	EState	   *estate = queryDesc->estate;
@@ -825,6 +869,8 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 	ExecInitRangeTable(estate, rangeTable);
 
 	estate->es_plannedstmt = plannedstmt;
+	estate->es_part_prune_infos = plannedstmt->partPruneInfos;
+	estate->es_part_prune_result = part_prune_result;
 
 	/*
 	 * Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index 9a0d5d59ef..805f86c503 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -66,6 +66,7 @@
 #define PARALLEL_KEY_QUERY_TEXT		UINT64CONST(0xE000000000000008)
 #define PARALLEL_KEY_JIT_INSTRUMENTATION UINT64CONST(0xE000000000000009)
 #define PARALLEL_KEY_WAL_USAGE			UINT64CONST(0xE00000000000000A)
+#define PARALLEL_KEY_PARTITIONPRUNERESULT	UINT64CONST(0xE00000000000000B)
 
 #define PARALLEL_TUPLE_QUEUE_SIZE		65536
 
@@ -182,7 +183,9 @@ ExecSerializePlan(Plan *plan, EState *estate)
 	pstmt->transientPlan = false;
 	pstmt->dependsOnRole = false;
 	pstmt->parallelModeNeeded = false;
+	pstmt->containsInitialPruning = false;
 	pstmt->planTree = plan;
+	pstmt->partPruneInfos = estate->es_part_prune_infos;
 	pstmt->rtable = estate->es_range_table;
 	pstmt->resultRelations = NIL;
 	pstmt->appendRelations = NIL;
@@ -596,12 +599,15 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	FixedParallelExecutorState *fpes;
 	char	   *pstmt_data;
 	char	   *pstmt_space;
+	char	   *part_prune_result_data;
+	char	   *part_prune_result_space;
 	char	   *paramlistinfo_space;
 	BufferUsage *bufusage_space;
 	WalUsage   *walusage_space;
 	SharedExecutorInstrumentation *instrumentation = NULL;
 	SharedJitInstrumentation *jit_instrumentation = NULL;
 	int			pstmt_len;
+	int			part_prune_result_len;
 	int			paramlistinfo_len;
 	int			instrumentation_len = 0;
 	int			jit_instrumentation_len = 0;
@@ -630,6 +636,7 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 
 	/* Fix up and serialize plan to be sent to workers. */
 	pstmt_data = ExecSerializePlan(planstate->plan, estate);
+	part_prune_result_data = nodeToString(estate->es_part_prune_result);
 
 	/* Create a parallel context. */
 	pcxt = CreateParallelContext("postgres", "ParallelQueryMain", nworkers);
@@ -656,6 +663,11 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	shm_toc_estimate_chunk(&pcxt->estimator, pstmt_len);
 	shm_toc_estimate_keys(&pcxt->estimator, 1);
 
+	/* Estimate space for serialized PartitionPruneResult. */
+	part_prune_result_len = strlen(part_prune_result_data) + 1;
+	shm_toc_estimate_chunk(&pcxt->estimator, part_prune_result_len);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
 	/* Estimate space for serialized ParamListInfo. */
 	paramlistinfo_len = EstimateParamListSpace(estate->es_param_list_info);
 	shm_toc_estimate_chunk(&pcxt->estimator, paramlistinfo_len);
@@ -750,6 +762,12 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	memcpy(pstmt_space, pstmt_data, pstmt_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PLANNEDSTMT, pstmt_space);
 
+	/* Store serialized PartitionPruneResult. */
+	part_prune_result_space = shm_toc_allocate(pcxt->toc, part_prune_result_len);
+	memcpy(part_prune_result_space, part_prune_result_data, part_prune_result_len);
+	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARTITIONPRUNERESULT,
+				   part_prune_result_space);
+
 	/* Store serialized ParamListInfo. */
 	paramlistinfo_space = shm_toc_allocate(pcxt->toc, paramlistinfo_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARAMLISTINFO, paramlistinfo_space);
@@ -1231,8 +1249,10 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 						 int instrument_options)
 {
 	char	   *pstmtspace;
+	char	   *part_prune_result_space;
 	char	   *paramspace;
 	PlannedStmt *pstmt;
+	PartitionPruneResult *part_prune_result;
 	ParamListInfo paramLI;
 	char	   *queryString;
 
@@ -1243,12 +1263,18 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 	pstmtspace = shm_toc_lookup(toc, PARALLEL_KEY_PLANNEDSTMT, false);
 	pstmt = (PlannedStmt *) stringToNode(pstmtspace);
 
+	/* Reconstruct leader-supplied PartitionPruneResult. */
+	part_prune_result_space =
+		shm_toc_lookup(toc, PARALLEL_KEY_PARTITIONPRUNERESULT, false);
+	part_prune_result = (PartitionPruneResult *)
+		stringToNode(part_prune_result_space);
+
 	/* Reconstruct ParamListInfo. */
 	paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
 	paramLI = RestoreParamList(&paramspace);
 
 	/* Create a QueryDesc for the query. */
-	return CreateQueryDesc(pstmt,
+	return CreateQueryDesc(pstmt, part_prune_result,
 						   queryString,
 						   GetActiveSnapshot(), InvalidSnapshot,
 						   receiver, paramLI, NULL, instrument_options);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 615bd80973..3037742b8d 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -25,6 +25,7 @@
 #include "mb/pg_wchar.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
+#include "parser/parsetree.h"
 #include "partitioning/partbounds.h"
 #include "partitioning/partdesc.h"
 #include "partitioning/partprune.h"
@@ -185,7 +186,11 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
 static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
 static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
 static PartitionPruneState *CreatePartitionPruneState(PlanState *planstate,
-													  PartitionPruneInfo *pruneinfo);
+													  PartitionPruneInfo *pruneinfo,
+													  bool consider_initial_steps,
+													  bool consider_exec_steps,
+													  List *rtable, ExprContext *econtext,
+													  PartitionDirectory partdir);
 static void InitPartitionPruneContext(PartitionPruneContext *context,
 									  List *pruning_steps,
 									  PartitionDesc partdesc,
@@ -198,7 +203,8 @@ static void PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
 static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
 										   PartitionedRelPruningData *pprune,
 										   bool initial_prune,
-										   Bitmapset **validsubplans);
+										   Bitmapset **validsubplans,
+										   Bitmapset **scan_leafpart_rtis);
 
 
 /*
@@ -1587,8 +1593,10 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  * considered to be a stable expression, it can change value from one plan
  * node scan to the next during query execution.  Stable comparison
  * expressions that don't involve such Params allow partition pruning to be
- * done once during executor startup.  Expressions that do involve such Params
- * require us to prune separately for each scan of the parent plan node.
+ * done once during executor startup or during ExecutorDoInitialPruning() that
+ * runs as part of performing AcquireExecutorLocks() on a given plan tree.
+ * Expressions that do involve such Params require us to prune separately for
+ * each scan of the parent plan node.
  *
  * Note that pruning away unneeded subplans during executor startup has the
  * added benefit of not having to initialize the unneeded subplans at all.
@@ -1605,6 +1613,13 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *		account for initial pruning possibly having eliminated some of the
  *		subplans.
  *
+ * ExecPartitionDoInitialPruning:
+ *		Do initial pruning with the information contained in a given
+ *		PartitionPruneInfo to determine the minimal set of child subplans
+ *		of the parent plan node (the one to which the PartitionPruneInfo
+ *		belongs) that need to be executed, and also the set of RT indexes
+ *		of the leaf partitions that will be scanned by those subplans.
+ *
  * ExecFindMatchingSubPlans:
  *		Returns indexes of matching subplans after evaluating the expressions
  *		that are safe to evaluate at a given point.  This function is first
@@ -1622,8 +1637,9 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *
  * On return, *initially_valid_subplans is assigned the set of indexes of
  * child subplans that must be initialized along with the parent plan node.
- * Initial pruning is performed here if needed and in that case only the
- * surviving subplans' indexes are added.
+ * Initial pruning is performed here if needed (unless it has already been done
+ * by ExecutorDoInitialPruning()), and in that case only the surviving
+ * subplans' indexes are added.
  *
  * If subplans are indeed pruned, subplan_map arrays contained in the returned
  * PartitionPruneState are re-sequenced to not count those, though only if the
@@ -1632,23 +1648,59 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
 PartitionPruneState *
 ExecInitPartitionPruning(PlanState *planstate,
 						 int n_total_subplans,
-						 PartitionPruneInfo *pruneinfo,
+						 int part_prune_index,
 						 Bitmapset **initially_valid_subplans)
 {
-	PartitionPruneState *prunestate;
 	EState	   *estate = planstate->state;
+	PartitionPruneInfo  *pruneinfo = list_nth(estate->es_part_prune_infos,
+											  part_prune_index);
+	PartitionPruneResult *pruneresult = estate->es_part_prune_result;
+	PartitionPruneState *prunestate;
+	bool	do_pruning = (pruneinfo->needs_init_pruning ||
+						  pruneinfo->needs_exec_pruning);
 
-	/* We may need an expression context to evaluate partition exprs */
-	ExecAssignExprContext(estate, planstate);
+	/*
+	 * No need to do initial pruning if it was done already by
+	 * ExecutorDoInitialPruning(), which it would be if es_part_prune_result
+	 * has been set.
+	 */
+	if (pruneresult)
+		do_pruning = pruneinfo->needs_exec_pruning;
+
+	prunestate = NULL;
+	if (do_pruning)
+	{
+		/* We may need an expression context to evaluate partition exprs */
+		ExecAssignExprContext(estate, planstate);
 
-	/* Create the working data structure for pruning */
-	prunestate = CreatePartitionPruneState(planstate, pruneinfo);
+		/* For data reading, executor always omits detached partitions */
+		if (estate->es_partition_directory == NULL)
+			estate->es_partition_directory =
+				CreatePartitionDirectory(estate->es_query_cxt, false);
+
+		/*
+		 * Create the working data structure for pruning.  No need to consider
+		 * initial pruning steps if we have a PartitionPruneResult.
+		 */
+		prunestate = CreatePartitionPruneState(planstate, pruneinfo,
+											   pruneresult == NULL, true,
+											   NIL, planstate->ps_ExprContext,
+											   estate->es_partition_directory);
+	}
 
 	/*
 	 * Perform an initial partition prune pass, if required.
 	 */
-	if (prunestate->do_initial_prune)
-		*initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true);
+	if (pruneresult)
+	{
+		*initially_valid_subplans =
+			list_nth(pruneresult->valid_subplan_offs_list, part_prune_index);
+	}
+	else if (prunestate && prunestate->do_initial_prune)
+	{
+		*initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true,
+															 NULL);
+	}
 	else
 	{
 		/* No pruning, so we'll need to initialize all subplans */
@@ -1669,7 +1721,7 @@ ExecInitPartitionPruning(PlanState *planstate,
 		 * leaves invalid data in prunestate, because that data won't be
 		 * consulted again (cf initial Assert in ExecFindMatchingSubPlans).
 		 */
-		if (prunestate->do_exec_prune)
+		if (prunestate && prunestate->do_exec_prune)
 			PartitionPruneFixSubPlanMap(prunestate,
 										*initially_valid_subplans,
 										n_total_subplans);
@@ -1678,11 +1730,72 @@ ExecInitPartitionPruning(PlanState *planstate,
 	return prunestate;
 }
 
+/*
+ * ExecPartitionDoInitialPruning
+ *		Perform initial pruning using given PartitionPruneInfo to determine
+ *		the minimal set of child subplans to be executed of the parent plan
+ *		node to which the PartitionPruneInfo belongs and also the set of RT
+ *		indexes of leaf partitions that will be scanned with those subplans.
+ */
+Bitmapset *
+ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
+							  PartitionPruneInfo *pruneinfo,
+							  Bitmapset **scan_leafpart_rtis)
+{
+	List		 *rtable = plannedstmt->rtable;
+	ExprContext	 *econtext;
+	PartitionDirectory pdir;
+	MemoryContext oldcontext,
+				  tmpcontext;
+	PartitionPruneState *prunestate;
+	Bitmapset	 *valid_subplan_offs;
+
+	/*
+	 * A temporary context to allocate stuff needed to run the pruning steps.
+	 */
+	tmpcontext = AllocSetContextCreate(CurrentMemoryContext,
+									   "initial pruning working data",
+									   ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(tmpcontext);
+
+	/*
+	 * PartitionDirectory to look up partition descriptors, which omits
+	 * detached partitions, just like in the executor proper.
+	 */
+	pdir = CreatePartitionDirectory(CurrentMemoryContext, false);
+
+	/*
+	 * We don't yet have a PlanState for the parent plan node, so must create
+	 * a standalone ExprContext to evaluate pruning expressions, equipped with
+	 * the information about the EXTERN parameters that the caller passed us.
+	 * Note that that's okay because the initial pruning steps do not contain
+	 * anything that requires the execution to have started.
+	 */
+	econtext = CreateStandaloneExprContext();
+	econtext->ecxt_param_list_info = params;
+	prunestate = CreatePartitionPruneState(NULL, pruneinfo, true, false,
+										   rtable, econtext, pdir);
+	MemoryContextSwitchTo(oldcontext);
+
+	/* Do the initial pruning. */
+	valid_subplan_offs = ExecFindMatchingSubPlans(prunestate, true,
+												  scan_leafpart_rtis);
+
+	FreeExprContext(econtext, true);
+	DestroyPartitionDirectory(pdir);
+	MemoryContextDelete(tmpcontext);
+
+	return valid_subplan_offs;
+}
+
 /*
  * CreatePartitionPruneState
  *		Build the data structure required for calling ExecFindMatchingSubPlans
  *
- * 'planstate' is the parent plan node's execution state.
+ * 'planstate', if not NULL, is the parent plan node's execution state.  It
+ * can be NULL if being called before ExecutorStart(), in which case,
+ * 'rtable' (range table), 'econtext', and 'partdir' must be explicitly
+ * provided.
  *
  * 'pruneinfo' is a PartitionPruneInfo as generated by
  * make_partition_pruneinfo.  Here we build a PartitionPruneState containing a
@@ -1696,19 +1809,21 @@ ExecInitPartitionPruning(PlanState *planstate,
  * PartitionedRelPruneInfo.
  */
 static PartitionPruneState *
-CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
+CreatePartitionPruneState(PlanState *planstate,
+						  PartitionPruneInfo *pruneinfo,
+						  bool consider_initial_steps,
+						  bool consider_exec_steps,
+						  List *rtable, ExprContext *econtext,
+						  PartitionDirectory partdir)
 {
-	EState	   *estate = planstate->state;
+	EState	   *estate = planstate ? planstate->state : NULL;
 	PartitionPruneState *prunestate;
 	int			n_part_hierarchies;
 	ListCell   *lc;
 	int			i;
-	ExprContext *econtext = planstate->ps_ExprContext;
 
-	/* For data reading, executor always omits detached partitions */
-	if (estate->es_partition_directory == NULL)
-		estate->es_partition_directory =
-			CreatePartitionDirectory(estate->es_query_cxt, false);
+	Assert((estate != NULL) ||
+			(partdir != NULL && econtext != NULL && rtable != NIL));
 
 	n_part_hierarchies = list_length(pruneinfo->prune_infos);
 	Assert(n_part_hierarchies > 0);
@@ -1759,19 +1874,48 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 			PartitionedRelPruneInfo *pinfo = lfirst_node(PartitionedRelPruneInfo, lc2);
 			PartitionedRelPruningData *pprune = &prunedata->partrelprunedata[j];
 			Relation	partrel;
+			bool		close_partrel = false;
 			PartitionDesc partdesc;
 			PartitionKey partkey;
 
 			/*
-			 * We can rely on the copies of the partitioned table's partition
-			 * key and partition descriptor appearing in its relcache entry,
-			 * because that entry will be held open and locked for the
-			 * duration of this executor run.
+			 * Must open the relation ourselves when called before execution
+			 * has started, for example when called during
+			 * ExecutorDoInitialPruning() on a cached plan.  In that case,
+			 * sub-partitions must be locked, because AcquirePlannerLocks()
+			 * would not have seen them. (1st relation in a partrelpruneinfos
+			 * list is always the root partitioned table appearing in the
+			 * query, which AcquirePlannerLocks() would have locked; the
+			 * Assert in relation_open() guards that assumption.)
+			 */
+			if (estate == NULL)
+			{
+				RangeTblEntry *rte = rt_fetch(pinfo->rtindex, rtable);
+				int		lockmode = (j == 0) ? NoLock : rte->rellockmode;
+
+				partrel = table_open(rte->relid, lockmode);
+				close_partrel = true;
+			}
+			else
+				partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
+
+			/*
+			 * We can rely on the copy of the partitioned table's partition
+			 * key in its relcache entry, because it can't change (or
+			 * get destroyed) as long as the relation is locked.  Partition
+			 * descriptor is taken from the PartitionDirectory associated with
+			 * the table that is held open long enough for the descriptor to
+			 * remain valid while it's used to perform the pruning steps.
 			 */
-			partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
 			partkey = RelationGetPartitionKey(partrel);
-			partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
-												partrel);
+			partdesc = PartitionDirectoryLookup(partdir, partrel);
+
+			/*
+			 * Must close partrel, keeping the lock taken, if we're not using
+			 * EState's entry.
+			 */
+			if (close_partrel)
+				table_close(partrel, NoLock);
 
 			/*
 			 * Initialize the subplan_map and subpart_map.
@@ -1785,6 +1929,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 			Assert(partdesc->nparts >= pinfo->nparts);
 			pprune->nparts = partdesc->nparts;
 			pprune->subplan_map = palloc(sizeof(int) * partdesc->nparts);
+			pprune->rti_map = palloc(sizeof(Index) * partdesc->nparts);
 			if (partdesc->nparts == pinfo->nparts)
 			{
 				/*
@@ -1795,6 +1940,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 				pprune->subpart_map = pinfo->subpart_map;
 				memcpy(pprune->subplan_map, pinfo->subplan_map,
 					   sizeof(int) * pinfo->nparts);
+				memcpy(pprune->rti_map, pinfo->rti_map,
+					   sizeof(Index) * pinfo->nparts);
 
 				/*
 				 * Double-check that the list of unpruned relations has not
@@ -1845,6 +1992,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 							pinfo->subplan_map[pd_idx];
 						pprune->subpart_map[pp_idx] =
 							pinfo->subpart_map[pd_idx];
+						pprune->rti_map[pp_idx] =
+							pinfo->rti_map[pd_idx];
 						pd_idx++;
 					}
 					else
@@ -1852,6 +2001,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 						/* this partdesc entry is not in the plan */
 						pprune->subplan_map[pp_idx] = -1;
 						pprune->subpart_map[pp_idx] = -1;
+						pprune->rti_map[pp_idx] = 0;
 					}
 				}
 
@@ -1873,7 +2023,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 			 * Initialize pruning contexts as needed.
 			 */
 			pprune->initial_pruning_steps = pinfo->initial_pruning_steps;
-			if (pinfo->initial_pruning_steps)
+			if (consider_initial_steps && pinfo->initial_pruning_steps)
 			{
 				InitPartitionPruneContext(&pprune->initial_context,
 										  pinfo->initial_pruning_steps,
@@ -1883,7 +2033,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 				prunestate->do_initial_prune = true;
 			}
 			pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
-			if (pinfo->exec_pruning_steps)
+			if (consider_exec_steps && pinfo->exec_pruning_steps)
 			{
 				InitPartitionPruneContext(&pprune->exec_context,
 										  pinfo->exec_pruning_steps,
@@ -2111,10 +2261,14 @@ PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
  * Pass initial_prune if PARAM_EXEC Params cannot yet be evaluated.  This
  * differentiates the initial executor-time pruning step from later
  * runtime pruning.
+ *
+ * RT indexes of leaf partitions scanned by the chosen subplans are added to
+ * *scan_leafpart_rtis if the pointer is non-NULL.
  */
 Bitmapset *
 ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
-						 bool initial_prune)
+						 bool initial_prune,
+						 Bitmapset **scan_leafpart_rtis)
 {
 	Bitmapset  *result = NULL;
 	MemoryContext oldcontext;
@@ -2149,7 +2303,7 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
 		 */
 		pprune = &prunedata->partrelprunedata[0];
 		find_matching_subplans_recurse(prunedata, pprune, initial_prune,
-									   &result);
+									   &result, scan_leafpart_rtis);
 
 		/* Expression eval may have used space in ExprContext too */
 		if (pprune->exec_pruning_steps)
@@ -2163,6 +2317,8 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
 
 	/* Copy result out of the temp context before we reset it */
 	result = bms_copy(result);
+	if (scan_leafpart_rtis)
+		*scan_leafpart_rtis = bms_copy(*scan_leafpart_rtis);
 
 	MemoryContextReset(prunestate->prune_context);
 
@@ -2173,13 +2329,15 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
  * find_matching_subplans_recurse
  *		Recursive worker function for ExecFindMatchingSubPlans
  *
- * Adds valid (non-prunable) subplan IDs to *validsubplans
+ * Adds valid (non-prunable) subplan IDs to *validsubplans and RT indexes of
+ * the corresponding leaf partitions to *scan_leafpart_rtis (if asked for).
  */
 static void
 find_matching_subplans_recurse(PartitionPruningData *prunedata,
 							   PartitionedRelPruningData *pprune,
 							   bool initial_prune,
-							   Bitmapset **validsubplans)
+							   Bitmapset **validsubplans,
+							   Bitmapset **scan_leafpart_rtis)
 {
 	Bitmapset  *partset;
 	int			i;
@@ -2206,8 +2364,13 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
 	while ((i = bms_next_member(partset, i)) >= 0)
 	{
 		if (pprune->subplan_map[i] >= 0)
+		{
 			*validsubplans = bms_add_member(*validsubplans,
 											pprune->subplan_map[i]);
+			if (scan_leafpart_rtis && pprune->rti_map[i] > 0)
+				*scan_leafpart_rtis = bms_add_member(*scan_leafpart_rtis,
+													 pprune->rti_map[i]);
+		}
 		else
 		{
 			int			partidx = pprune->subpart_map[i];
@@ -2215,7 +2378,8 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
 			if (partidx >= 0)
 				find_matching_subplans_recurse(prunedata,
 											   &prunedata->partrelprunedata[partidx],
-											   initial_prune, validsubplans);
+											   initial_prune, validsubplans,
+											   scan_leafpart_rtis);
 			else
 			{
 				/*
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 9df1f81ea8..639145abe9 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -119,6 +119,7 @@ CreateExecutorState(void)
 	estate->es_relations = NULL;
 	estate->es_rowmarks = NULL;
 	estate->es_plannedstmt = NULL;
+	estate->es_part_prune_result = NULL;
 
 	estate->es_junkFilter = NULL;
 
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index f9460ae506..a2182a6b1f 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -844,7 +844,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
 	else
 		dest = None_Receiver;
 
-	es->qd = CreateQueryDesc(es->stmt,
+	es->qd = CreateQueryDesc(es->stmt, NULL,
 							 fcache->src,
 							 GetActiveSnapshot(),
 							 InvalidSnapshot,
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 357e10a1d7..09f26658e2 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -94,6 +94,7 @@ static bool ExecAppendAsyncRequest(AppendState *node, TupleTableSlot **result);
 static void ExecAppendAsyncEventWait(AppendState *node);
 static void classify_matching_subplans(AppendState *node);
 
+
 /* ----------------------------------------------------------------
  *		ExecInitAppend
  *
@@ -134,7 +135,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 	appendstate->as_begun = false;
 
 	/* If run-time partition pruning is enabled, then set that up now */
-	if (node->part_prune_info != NULL)
+	if (node->part_prune_index >= 0)
 	{
 		PartitionPruneState *prunestate;
 
@@ -145,7 +146,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 		 */
 		prunestate = ExecInitPartitionPruning(&appendstate->ps,
 											  list_length(node->appendplans),
-											  node->part_prune_info,
+											  node->part_prune_index,
 											  &validsubplans);
 		appendstate->as_prune_state = prunestate;
 		nplans = bms_num_members(validsubplans);
@@ -155,7 +156,8 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 		 * subplan, we can fill as_valid_subplans immediately, preventing
 		 * later calls to ExecFindMatchingSubPlans.
 		 */
-		if (!prunestate->do_exec_prune && nplans > 0)
+		if (appendstate->as_prune_state == NULL ||
+			(!appendstate->as_prune_state->do_exec_prune && nplans > 0))
 			appendstate->as_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
 	}
 	else
@@ -577,7 +579,7 @@ choose_next_subplan_locally(AppendState *node)
 		}
 		else if (node->as_valid_subplans == NULL)
 			node->as_valid_subplans =
-				ExecFindMatchingSubPlans(node->as_prune_state, false);
+				ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 
 		whichplan = -1;
 	}
@@ -642,7 +644,7 @@ choose_next_subplan_for_leader(AppendState *node)
 		if (node->as_valid_subplans == NULL)
 		{
 			node->as_valid_subplans =
-				ExecFindMatchingSubPlans(node->as_prune_state, false);
+				ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 
 			/*
 			 * Mark each invalid plan as finished to allow the loop below to
@@ -717,7 +719,7 @@ choose_next_subplan_for_worker(AppendState *node)
 	else if (node->as_valid_subplans == NULL)
 	{
 		node->as_valid_subplans =
-			ExecFindMatchingSubPlans(node->as_prune_state, false);
+			ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 		mark_invalid_subplans_as_finished(node);
 	}
 
@@ -868,7 +870,7 @@ ExecAppendAsyncBegin(AppendState *node)
 	if (node->as_valid_subplans == NULL)
 	{
 		node->as_valid_subplans =
-			ExecFindMatchingSubPlans(node->as_prune_state, false);
+			ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 
 		classify_matching_subplans(node);
 	}
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index ecf9052e03..7708cfffda 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -82,7 +82,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
 	mergestate->ps.ExecProcNode = ExecMergeAppend;
 
 	/* If run-time partition pruning is enabled, then set that up now */
-	if (node->part_prune_info != NULL)
+	if (node->part_prune_index >= 0)
 	{
 		PartitionPruneState *prunestate;
 
@@ -93,7 +93,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
 		 */
 		prunestate = ExecInitPartitionPruning(&mergestate->ps,
 											  list_length(node->mergeplans),
-											  node->part_prune_info,
+											  node->part_prune_index,
 											  &validsubplans);
 		mergestate->ms_prune_state = prunestate;
 		nplans = bms_num_members(validsubplans);
@@ -103,7 +103,8 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
 		 * subplan, we can fill as_valid_subplans immediately, preventing
 		 * later calls to ExecFindMatchingSubPlans.
 		 */
-		if (!prunestate->do_exec_prune && nplans > 0)
+		if (mergestate->ms_prune_state == NULL ||
+			(!mergestate->ms_prune_state->do_exec_prune && nplans > 0))
 			mergestate->ms_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
 	}
 	else
@@ -218,7 +219,7 @@ ExecMergeAppend(PlanState *pstate)
 		 */
 		if (node->ms_valid_subplans == NULL)
 			node->ms_valid_subplans =
-				ExecFindMatchingSubPlans(node->ms_prune_state, false);
+				ExecFindMatchingSubPlans(node->ms_prune_state, false, NULL);
 
 		/*
 		 * First time through: pull the first tuple from each valid subplan,
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 042a5f8b0a..729e2fd7b2 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1578,6 +1578,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 	CachedPlanSource *plansource;
 	CachedPlan *cplan;
 	List	   *stmt_list;
+	List	   *part_prune_result_list;
 	char	   *query_string;
 	Snapshot	snapshot;
 	MemoryContext oldcontext;
@@ -1657,7 +1658,10 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 	 */
 
 	/* Replan if needed, and increment plan refcount for portal */
-	cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv);
+	cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv,
+						  &part_prune_result_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_result_list));
 	stmt_list = cplan->stmt_list;
 
 	if (!plan->saved)
@@ -1685,6 +1689,9 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 					  stmt_list,
 					  cplan);
 
+	/* Copy PartitionPruneResults into the portal's context. */
+	PortalStorePartitionPruneResults(portal, part_prune_result_list);
+
 	/*
 	 * Set up options for portal.  Default SCROLL type is chosen the same way
 	 * as PerformCursorOpen does it.
@@ -2092,7 +2099,8 @@ SPI_plan_get_cached_plan(SPIPlanPtr plan)
 	/* Get the generic plan for the query */
 	cplan = GetCachedPlan(plansource, NULL,
 						  plan->saved ? CurrentResourceOwner : NULL,
-						  _SPI_current->queryEnv);
+						  _SPI_current->queryEnv,
+						  NULL /* Not interested in PartitionPruneResults */);
 	Assert(cplan == plansource->gplan);
 
 	/* Pop the error context stack */
@@ -2473,7 +2481,9 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 	{
 		CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
 		List	   *stmt_list;
-		ListCell   *lc2;
+		List	   *part_prune_result_list;
+		ListCell   *lc2,
+				   *lc3;
 
 		spicallbackarg.query = plansource->query_string;
 
@@ -2549,8 +2559,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 		 * plan, the refcount must be backed by the plan_owner.
 		 */
 		cplan = GetCachedPlan(plansource, options->params,
-							  plan_owner, _SPI_current->queryEnv);
-
+							  plan_owner, _SPI_current->queryEnv,
+							  &part_prune_result_list);
+		Assert(list_length(cplan->stmt_list) ==
+			   list_length(part_prune_result_list));
 		stmt_list = cplan->stmt_list;
 
 		/*
@@ -2589,9 +2601,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 			}
 		}
 
-		foreach(lc2, stmt_list)
+		forboth(lc2, stmt_list, lc3, part_prune_result_list)
 		{
 			PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+			PartitionPruneResult *part_prune_result = lfirst_node(PartitionPruneResult, lc3);
 			bool		canSetTag = stmt->canSetTag;
 			DestReceiver *dest;
 
@@ -2663,7 +2676,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 				else
 					snap = InvalidSnapshot;
 
-				qdesc = CreateQueryDesc(stmt,
+				qdesc = CreateQueryDesc(stmt, part_prune_result,
 										plansource->query_string,
 										snap, crosscheck_snapshot,
 										dest,
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 46a1943d97..9642e74ef1 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -96,7 +96,10 @@ _copyPlannedStmt(const PlannedStmt *from)
 	COPY_SCALAR_FIELD(parallelModeNeeded);
 	COPY_SCALAR_FIELD(jitFlags);
 	COPY_NODE_FIELD(planTree);
+	COPY_NODE_FIELD(partPruneInfos);
+	COPY_SCALAR_FIELD(containsInitialPruning);
 	COPY_NODE_FIELD(rtable);
+	COPY_BITMAPSET_FIELD(minLockRelids);
 	COPY_NODE_FIELD(resultRelations);
 	COPY_NODE_FIELD(appendRelations);
 	COPY_NODE_FIELD(subplans);
@@ -253,7 +256,7 @@ _copyAppend(const Append *from)
 	COPY_NODE_FIELD(appendplans);
 	COPY_SCALAR_FIELD(nasyncplans);
 	COPY_SCALAR_FIELD(first_partial_plan);
-	COPY_NODE_FIELD(part_prune_info);
+	COPY_SCALAR_FIELD(part_prune_index);
 
 	return newnode;
 }
@@ -281,7 +284,7 @@ _copyMergeAppend(const MergeAppend *from)
 	COPY_POINTER_FIELD(sortOperators, from->numCols * sizeof(Oid));
 	COPY_POINTER_FIELD(collations, from->numCols * sizeof(Oid));
 	COPY_POINTER_FIELD(nullsFirst, from->numCols * sizeof(bool));
-	COPY_NODE_FIELD(part_prune_info);
+	COPY_SCALAR_FIELD(part_prune_index);
 
 	return newnode;
 }
@@ -1280,6 +1283,8 @@ _copyPartitionPruneInfo(const PartitionPruneInfo *from)
 	PartitionPruneInfo *newnode = makeNode(PartitionPruneInfo);
 
 	COPY_NODE_FIELD(prune_infos);
+	COPY_SCALAR_FIELD(needs_init_pruning);
+	COPY_SCALAR_FIELD(needs_exec_pruning);
 	COPY_BITMAPSET_FIELD(other_subplans);
 
 	return newnode;
@@ -1296,6 +1301,7 @@ _copyPartitionedRelPruneInfo(const PartitionedRelPruneInfo *from)
 	COPY_POINTER_FIELD(subplan_map, from->nparts * sizeof(int));
 	COPY_POINTER_FIELD(subpart_map, from->nparts * sizeof(int));
 	COPY_POINTER_FIELD(relid_map, from->nparts * sizeof(Oid));
+	COPY_POINTER_FIELD(rti_map, from->nparts * sizeof(Index));
 	COPY_NODE_FIELD(initial_pruning_steps);
 	COPY_NODE_FIELD(exec_pruning_steps);
 	COPY_BITMAPSET_FIELD(execparamids);
@@ -5469,6 +5475,21 @@ _copyExtensibleNode(const ExtensibleNode *from)
 	return newnode;
 }
 
+/* ****************************************************************
+ *					execnodes.h copy functions
+ * ****************************************************************
+ */
+static PartitionPruneResult *
+_copyPartitionPruneResult(const PartitionPruneResult *from)
+{
+	PartitionPruneResult *newnode = makeNode(PartitionPruneResult);
+
+	COPY_NODE_FIELD(valid_subplan_offs_list);
+	COPY_BITMAPSET_FIELD(scan_leafpart_rtis);
+
+	return newnode;
+}
+
 /* ****************************************************************
  *					value.h copy functions
  * ****************************************************************
@@ -5523,7 +5544,6 @@ _copyBitString(const BitString *from)
 	return newnode;
 }
 
-
 static ForeignKeyCacheInfo *
 _copyForeignKeyCacheInfo(const ForeignKeyCacheInfo *from)
 {
@@ -6565,6 +6585,13 @@ copyObjectImpl(const void *from)
 			retval = _copyPublicationTable(from);
 			break;
 
+			/*
+			 * EXECUTION NODES
+			 */
+		case T_PartitionPruneResult:
+			retval = _copyPartitionPruneResult(from);
+			break;
+
 			/*
 			 * MISCELLANEOUS NODES
 			 */
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 13e1643530..0cbcbc8ed4 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -314,7 +314,10 @@ _outPlannedStmt(StringInfo str, const PlannedStmt *node)
 	WRITE_BOOL_FIELD(parallelModeNeeded);
 	WRITE_INT_FIELD(jitFlags);
 	WRITE_NODE_FIELD(planTree);
+	WRITE_NODE_FIELD(partPruneInfos);
+	WRITE_BOOL_FIELD(containsInitialPruning);
 	WRITE_NODE_FIELD(rtable);
+	WRITE_BITMAPSET_FIELD(minLockRelids);
 	WRITE_NODE_FIELD(resultRelations);
 	WRITE_NODE_FIELD(appendRelations);
 	WRITE_NODE_FIELD(subplans);
@@ -443,7 +446,7 @@ _outAppend(StringInfo str, const Append *node)
 	WRITE_NODE_FIELD(appendplans);
 	WRITE_INT_FIELD(nasyncplans);
 	WRITE_INT_FIELD(first_partial_plan);
-	WRITE_NODE_FIELD(part_prune_info);
+	WRITE_INT_FIELD(part_prune_index);
 }
 
 static void
@@ -460,7 +463,7 @@ _outMergeAppend(StringInfo str, const MergeAppend *node)
 	WRITE_OID_ARRAY(sortOperators, node->numCols);
 	WRITE_OID_ARRAY(collations, node->numCols);
 	WRITE_BOOL_ARRAY(nullsFirst, node->numCols);
-	WRITE_NODE_FIELD(part_prune_info);
+	WRITE_INT_FIELD(part_prune_index);
 }
 
 static void
@@ -1006,6 +1009,8 @@ _outPartitionPruneInfo(StringInfo str, const PartitionPruneInfo *node)
 	WRITE_NODE_TYPE("PARTITIONPRUNEINFO");
 
 	WRITE_NODE_FIELD(prune_infos);
+	WRITE_BOOL_FIELD(needs_init_pruning);
+	WRITE_BOOL_FIELD(needs_exec_pruning);
 	WRITE_BITMAPSET_FIELD(other_subplans);
 }
 
@@ -1020,6 +1025,7 @@ _outPartitionedRelPruneInfo(StringInfo str, const PartitionedRelPruneInfo *node)
 	WRITE_INT_ARRAY(subplan_map, node->nparts);
 	WRITE_INT_ARRAY(subpart_map, node->nparts);
 	WRITE_OID_ARRAY(relid_map, node->nparts);
+	WRITE_INDEX_ARRAY(rti_map, node->nparts);
 	WRITE_NODE_FIELD(initial_pruning_steps);
 	WRITE_NODE_FIELD(exec_pruning_steps);
 	WRITE_BITMAPSET_FIELD(execparamids);
@@ -2420,6 +2426,9 @@ _outPlannerGlobal(StringInfo str, const PlannerGlobal *node)
 	WRITE_NODE_FIELD(finalrowmarks);
 	WRITE_NODE_FIELD(resultRelations);
 	WRITE_NODE_FIELD(appendRelations);
+	WRITE_NODE_FIELD(partPruneInfos);
+	WRITE_BOOL_FIELD(containsInitialPruning);
+	WRITE_BITMAPSET_FIELD(minLockRelids);
 	WRITE_NODE_FIELD(relationOids);
 	WRITE_NODE_FIELD(invalItems);
 	WRITE_NODE_FIELD(paramExecTypes);
@@ -2487,6 +2496,7 @@ _outPlannerInfo(StringInfo str, const PlannerInfo *node)
 	WRITE_BITMAPSET_FIELD(curOuterRels);
 	WRITE_NODE_FIELD(curOuterParams);
 	WRITE_BOOL_FIELD(partColsUpdated);
+	WRITE_NODE_FIELD(partPruneInfos);
 }
 
 static void
@@ -2840,6 +2850,21 @@ _outExtensibleNode(StringInfo str, const ExtensibleNode *node)
 	methods->nodeOut(str, node);
 }
 
+/*****************************************************************************
+ *
+ *	Stuff from execnodes.h
+ *
+ *****************************************************************************/
+
+static void
+_outPartitionPruneResult(StringInfo str, const PartitionPruneResult *node)
+{
+	WRITE_NODE_TYPE("PARTITIONPRUNERESULT");
+
+	WRITE_NODE_FIELD(valid_subplan_offs_list);
+	WRITE_BITMAPSET_FIELD(scan_leafpart_rtis);
+}
+
 /*****************************************************************************
  *
  *	Stuff from parsenodes.h.
@@ -4748,6 +4773,13 @@ outNode(StringInfo str, const void *obj)
 				_outJsonTableSibling(str, obj);
 				break;
 
+				/*
+				 * EXECUTION NODES
+				 */
+			case T_PartitionPruneResult:
+				_outPartitionPruneResult(str, obj);
+				break;
+
 			default:
 
 				/*
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 48f7216c9e..25e1df7068 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -164,6 +164,11 @@
 	token = pg_strtok(&length);		/* skip :fldname */ \
 	local_node->fldname = readIntCols(len)
 
+/* Read an Index array */
+#define READ_INDEX_ARRAY(fldname, len) \
+	token = pg_strtok(&length);		/* skip :fldname */ \
+	local_node->fldname = readIndexCols(len)
+
 /* Read a bool array */
 #define READ_BOOL_ARRAY(fldname, len) \
 	token = pg_strtok(&length);		/* skip :fldname */ \
@@ -1814,7 +1819,10 @@ _readPlannedStmt(void)
 	READ_BOOL_FIELD(parallelModeNeeded);
 	READ_INT_FIELD(jitFlags);
 	READ_NODE_FIELD(planTree);
+	READ_NODE_FIELD(partPruneInfos);
+	READ_BOOL_FIELD(containsInitialPruning);
 	READ_NODE_FIELD(rtable);
+	READ_BITMAPSET_FIELD(minLockRelids);
 	READ_NODE_FIELD(resultRelations);
 	READ_NODE_FIELD(appendRelations);
 	READ_NODE_FIELD(subplans);
@@ -1946,7 +1954,7 @@ _readAppend(void)
 	READ_NODE_FIELD(appendplans);
 	READ_INT_FIELD(nasyncplans);
 	READ_INT_FIELD(first_partial_plan);
-	READ_NODE_FIELD(part_prune_info);
+	READ_INT_FIELD(part_prune_index);
 
 	READ_DONE();
 }
@@ -1968,7 +1976,7 @@ _readMergeAppend(void)
 	READ_OID_ARRAY(sortOperators, local_node->numCols);
 	READ_OID_ARRAY(collations, local_node->numCols);
 	READ_BOOL_ARRAY(nullsFirst, local_node->numCols);
-	READ_NODE_FIELD(part_prune_info);
+	READ_INT_FIELD(part_prune_index);
 
 	READ_DONE();
 }
@@ -2763,6 +2771,8 @@ _readPartitionPruneInfo(void)
 	READ_LOCALS(PartitionPruneInfo);
 
 	READ_NODE_FIELD(prune_infos);
+	READ_BOOL_FIELD(needs_init_pruning);
+	READ_BOOL_FIELD(needs_exec_pruning);
 	READ_BITMAPSET_FIELD(other_subplans);
 
 	READ_DONE();
@@ -2779,6 +2789,7 @@ _readPartitionedRelPruneInfo(void)
 	READ_INT_ARRAY(subplan_map, local_node->nparts);
 	READ_INT_ARRAY(subpart_map, local_node->nparts);
 	READ_OID_ARRAY(relid_map, local_node->nparts);
+	READ_INDEX_ARRAY(rti_map, local_node->nparts);
 	READ_NODE_FIELD(initial_pruning_steps);
 	READ_NODE_FIELD(exec_pruning_steps);
 	READ_BITMAPSET_FIELD(execparamids);
@@ -2932,6 +2943,21 @@ _readPartitionRangeDatum(void)
 	READ_DONE();
 }
 
+
+/*
+ * _readPartitionPruneResult
+ */
+static PartitionPruneResult *
+_readPartitionPruneResult(void)
+{
+	READ_LOCALS(PartitionPruneResult);
+
+	READ_NODE_FIELD(valid_subplan_offs_list);
+	READ_BITMAPSET_FIELD(scan_leafpart_rtis);
+
+	READ_DONE();
+}
+
 /*
  * parseNodeString
  *
@@ -3229,6 +3255,8 @@ parseNodeString(void)
 		return_value = _readJsonTableParent();
 	else if (MATCH("JSONTABSNODE", 12))
 		return_value = _readJsonTableSibling();
+	else if (MATCH("PARTITIONPRUNERESULT", 20))
+		return_value = _readPartitionPruneResult();
 	else
 	{
 		elog(ERROR, "badly formatted node string \"%.32s\"...", token);
@@ -3372,6 +3400,30 @@ readIntCols(int numCols)
 	return int_vals;
 }
 
+/*
+ * readIndexCols
+ */
+Index *
+readIndexCols(int numCols)
+{
+	int			tokenLength,
+				i;
+	const char *token;
+	Index	   *index_vals;
+
+	if (numCols <= 0)
+		return NULL;
+
+	index_vals = (Index *) palloc(numCols * sizeof(Index));
+	for (i = 0; i < numCols; i++)
+	{
+		token = pg_strtok(&tokenLength);
+		index_vals[i] = atoui(token);
+	}
+
+	return index_vals;
+}
+
 /*
  * readBoolCols
  */
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index 51591bb812..e7f977fb96 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -1183,7 +1183,7 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
 	ListCell   *subpaths;
 	int			nasyncplans = 0;
 	RelOptInfo *rel = best_path->path.parent;
-	PartitionPruneInfo *partpruneinfo = NULL;
+	int			part_prune_index = -1;
 	int			nodenumsortkeys = 0;
 	AttrNumber *nodeSortColIdx = NULL;
 	Oid		   *nodeSortOperators = NULL;
@@ -1357,16 +1357,17 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
 		}
 
 		if (prunequal != NIL)
-			partpruneinfo =
-				make_partition_pruneinfo(root, rel,
-										 best_path->subpaths,
-										 prunequal);
+			part_prune_index = make_partition_pruneinfo(root, rel,
+														best_path->subpaths,
+														prunequal);
 	}
 
 	plan->appendplans = subplans;
 	plan->nasyncplans = nasyncplans;
 	plan->first_partial_plan = best_path->first_partial_path;
-	plan->part_prune_info = partpruneinfo;
+
+	/* Will be updated later in set_plan_references(). */
+	plan->part_prune_index = part_prune_index;
 
 	copy_generic_path_info(&plan->plan, (Path *) best_path);
 
@@ -1406,7 +1407,7 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
 	List	   *subplans = NIL;
 	ListCell   *subpaths;
 	RelOptInfo *rel = best_path->path.parent;
-	PartitionPruneInfo *partpruneinfo = NULL;
+	int			part_prune_index = -1;
 
 	/*
 	 * We don't have the actual creation of the MergeAppend node split out
@@ -1522,13 +1523,15 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
 		}
 
 		if (prunequal != NIL)
-			partpruneinfo = make_partition_pruneinfo(root, rel,
-													 best_path->subpaths,
-													 prunequal);
+			part_prune_index = make_partition_pruneinfo(root, rel,
+														best_path->subpaths,
+														prunequal);
 	}
 
 	node->mergeplans = subplans;
-	node->part_prune_info = partpruneinfo;
+
+	/* Will be updated later in set_plan_references(). */
+	node->part_prune_index = part_prune_index;
 
 	/*
 	 * If prepare_sort_from_pathkeys added sort columns, but we were told to
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index b2569c5d0c..2aa051d862 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -518,7 +518,10 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 	result->dependsOnRole = glob->dependsOnRole;
 	result->parallelModeNeeded = glob->parallelModeNeeded;
 	result->planTree = top_plan;
+	result->partPruneInfos = glob->partPruneInfos;
+	result->containsInitialPruning = glob->containsInitialPruning;
 	result->rtable = glob->finalrtable;
+	result->minLockRelids = glob->minLockRelids;
 	result->resultRelations = glob->resultRelations;
 	result->appendRelations = glob->appendRelations;
 	result->subplans = glob->subplans;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 7519723081..fc66986e1c 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -251,7 +251,7 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 	Plan	   *result;
 	PlannerGlobal *glob = root->glob;
 	int			rtoffset = list_length(glob->finalrtable);
 	ListCell   *lc;
 
 	/*
 	 * Add all the query's RTEs to the flattened rangetable.  The live ones
@@ -260,6 +260,16 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 	 */
 	add_rtes_to_flat_rtable(root, false);
 
+	/*
+	 * Add the query's adjusted range of RT indexes to glob->minLockRelids.
+	 * The adjusted RT indexes of prunable relations will be deleted from the
+	 * set below where PartitionPruneInfos are processed.
+	 */
+	glob->minLockRelids =
+		bms_add_range(glob->minLockRelids,
+					  rtoffset + 1,
+					  rtoffset + list_length(root->parse->rtable));
+
 	/*
 	 * Adjust RT indexes of PlanRowMarks and add to final rowmarks list
 	 */
@@ -338,6 +348,56 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 		}
 	}
 
+	/* Also fix up the information in PartitionPruneInfos. */
+	foreach(lc, root->partPruneInfos)
+	{
+		PartitionPruneInfo *pruneinfo = lfirst(lc);
+		Bitmapset *leafpart_rtis = NULL;
+		ListCell  *l;
+
+		foreach(l, pruneinfo->prune_infos)
+		{
+			List	   *prune_infos = lfirst(l);
+			ListCell   *l2;
+
+			foreach(l2, prune_infos)
+			{
+				PartitionedRelPruneInfo *pinfo = lfirst(l2);
+				int		i;
+
+				/* RT index of the partitioned table. */
+				pinfo->rtindex += rtoffset;
+
+				/* And also those of the leaf partitions. */
+				for (i = 0; i < pinfo->nparts; i++)
+				{
+					if (pinfo->rti_map[i] > 0)
+					{
+						pinfo->rti_map[i] += rtoffset;
+						leafpart_rtis = bms_add_member(leafpart_rtis,
+													   pinfo->rti_map[i]);
+					}
+				}
+			}
+		}
+
+		if (pruneinfo->needs_init_pruning)
+		{
+			glob->containsInitialPruning = true;
+
+			/*
+			 * Delete the leaf partition RTIs from the global set of relations
+			 * to be locked before executing the plan.  AcquireExecutorLocks()
+			 * will find the ones to add to the set after performing initial
+			 * pruning.
+			 */
+			glob->minLockRelids = bms_del_members(glob->minLockRelids,
+												  leafpart_rtis);
+		}
+
+		glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
+	}
+
 	return result;
 }
 
@@ -1610,21 +1670,12 @@ set_append_references(PlannerInfo *root,
 
 	aplan->apprelids = offset_relid_set(aplan->apprelids, rtoffset);
 
-	if (aplan->part_prune_info)
-	{
-		foreach(l, aplan->part_prune_info->prune_infos)
-		{
-			List	   *prune_infos = lfirst(l);
-			ListCell   *l2;
-
-			foreach(l2, prune_infos)
-			{
-				PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
-				pinfo->rtindex += rtoffset;
-			}
-		}
-	}
+	/*
+	 * PartitionPruneInfos will be added to a list in PlannerGlobal, so update
+	 * the index.
+	 */
+	if (aplan->part_prune_index >= 0)
+		aplan->part_prune_index += list_length(root->glob->partPruneInfos);
 
 	/* We don't need to recurse to lefttree or righttree ... */
 	Assert(aplan->plan.lefttree == NULL);
@@ -1682,21 +1733,12 @@ set_mergeappend_references(PlannerInfo *root,
 
 	mplan->apprelids = offset_relid_set(mplan->apprelids, rtoffset);
 
-	if (mplan->part_prune_info)
-	{
-		foreach(l, mplan->part_prune_info->prune_infos)
-		{
-			List	   *prune_infos = lfirst(l);
-			ListCell   *l2;
-
-			foreach(l2, prune_infos)
-			{
-				PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
-				pinfo->rtindex += rtoffset;
-			}
-		}
-	}
+	/*
+	 * PartitionPruneInfos will be added to a list in PlannerGlobal, so update
+	 * the index.
+	 */
+	if (mplan->part_prune_index >= 0)
+		mplan->part_prune_index += list_length(root->glob->partPruneInfos);
 
 	/* We don't need to recurse to lefttree or righttree ... */
 	Assert(mplan->plan.lefttree == NULL);
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 9d3c05aed3..5a5f5dee46 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -144,7 +144,9 @@ static List *make_partitionedrel_pruneinfo(PlannerInfo *root,
 										   List *prunequal,
 										   Bitmapset *partrelids,
 										   int *relid_subplan_map,
-										   Bitmapset **matchedsubplans);
+										   Bitmapset **matchedsubplans,
+										   bool *needs_init_pruning,
+										   bool *needs_exec_pruning);
 static void gen_partprune_steps(RelOptInfo *rel, List *clauses,
 								PartClauseTarget target,
 								GeneratePruningStepsContext *context);
@@ -209,16 +211,20 @@ static void partkey_datum_from_expr(PartitionPruneContext *context,
 
 /*
  * make_partition_pruneinfo
- *		Builds a PartitionPruneInfo which can be used in the executor to allow
- *		additional partition pruning to take place.  Returns NULL when
- *		partition pruning would be useless.
+ *		Checks whether the given quals can be used to build pruning steps
+ *		that the executor can use to prune useless subplans from the given
+ *		set of child paths, and if so, builds a PartitionPruneInfo that lets
+ *		the executor do so and appends it to root->partPruneInfos.
+ *
+ * Returns the 0-based index of the added PartitionPruneInfo, or -1 if none
+ * was built.
  *
  * 'parentrel' is the RelOptInfo for an appendrel, and 'subpaths' is the list
  * of scan paths for its child rels.
  * 'prunequal' is a list of potential pruning quals (i.e., restriction
  * clauses that are applicable to the appendrel).
  */
-PartitionPruneInfo *
+int
 make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 						 List *subpaths,
 						 List *prunequal)
@@ -230,6 +236,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	int		   *relid_subplan_map;
 	ListCell   *lc;
 	int			i;
+	bool		needs_init_pruning = false;
+	bool		needs_exec_pruning = false;
 
 	/*
 	 * Scan the subpaths to see which ones are scans of partition child
@@ -309,12 +317,16 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		Bitmapset  *partrelids = (Bitmapset *) lfirst(lc);
 		List	   *pinfolist;
 		Bitmapset  *matchedsubplans = NULL;
+		bool		partrel_needs_init_pruning;
+		bool		partrel_needs_exec_pruning;
 
 		pinfolist = make_partitionedrel_pruneinfo(root, parentrel,
 												  prunequal,
 												  partrelids,
 												  relid_subplan_map,
-												  &matchedsubplans);
+												  &matchedsubplans,
+												  &partrel_needs_init_pruning,
+												  &partrel_needs_exec_pruning);
 
 		/* When pruning is possible, record the matched subplans */
 		if (pinfolist != NIL)
@@ -323,6 +335,10 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 			allmatchedsubplans = bms_join(matchedsubplans,
 										  allmatchedsubplans);
 		}
+		if (!needs_init_pruning)
+			needs_init_pruning = partrel_needs_init_pruning;
+		if (!needs_exec_pruning)
+			needs_exec_pruning = partrel_needs_exec_pruning;
 	}
 
 	pfree(relid_subplan_map);
@@ -332,11 +348,13 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	 * quals, then we can just not bother with run-time pruning.
 	 */
 	if (prunerelinfos == NIL)
-		return NULL;
+		return -1;
 
 	/* Else build the result data structure */
 	pruneinfo = makeNode(PartitionPruneInfo);
 	pruneinfo->prune_infos = prunerelinfos;
+	pruneinfo->needs_init_pruning = needs_init_pruning;
+	pruneinfo->needs_exec_pruning = needs_exec_pruning;
 
 	/*
 	 * Some subplans may not belong to any of the identified partitioned rels.
@@ -358,7 +376,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	else
 		pruneinfo->other_subplans = NULL;
 
-	return pruneinfo;
+	root->partPruneInfos = lappend(root->partPruneInfos, pruneinfo);
+
+	return list_length(root->partPruneInfos) - 1;
 }
 
 /*
@@ -435,13 +455,18 @@ add_part_relids(List *allpartrelids, Bitmapset *partrelids)
  * If we cannot find any useful run-time pruning steps, return NIL.
  * However, on success, each rel identified in partrelids will have
  * an element in the result list, even if some of them are useless.
+ * *needs_init_pruning and *needs_exec_pruning are set to indicate whether the
+ * returned PartitionedRelPruneInfos contain pruning steps that can be
+ * performed before and after execution begins, respectively.
  */
 static List *
 make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 							  List *prunequal,
 							  Bitmapset *partrelids,
 							  int *relid_subplan_map,
-							  Bitmapset **matchedsubplans)
+							  Bitmapset **matchedsubplans,
+							  bool *needs_init_pruning,
+							  bool *needs_exec_pruning)
 {
 	RelOptInfo *targetpart = NULL;
 	List	   *pinfolist = NIL;
@@ -452,6 +477,10 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	int			rti;
 	int			i;
 
+	/* Will find out below. */
+	*needs_init_pruning = false;
+	*needs_exec_pruning = false;
+
 	/*
 	 * Examine each partitioned rel, constructing a temporary array to map
 	 * from planner relids to index of the partitioned rel, and building a
@@ -539,6 +568,9 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		 * executor per-scan pruning steps.  This first pass creates startup
 		 * pruning steps and detects whether there's any possibly-useful quals
 		 * that would require per-scan pruning.
+		 *
+		 * In the first pass, we note whether the 2nd pass is necessary by
+		 * noting the presence of EXEC parameters.
 		 */
 		gen_partprune_steps(subpart, partprunequal, PARTTARGET_INITIAL,
 							&context);
@@ -613,6 +645,11 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		pinfo->execparamids = execparamids;
 		/* Remaining fields will be filled in the next loop */
 
+		if (!*needs_init_pruning)
+			*needs_init_pruning = (initial_pruning_steps != NIL);
+		if (!*needs_exec_pruning)
+			*needs_exec_pruning = (exec_pruning_steps != NIL);
+
 		pinfolist = lappend(pinfolist, pinfo);
 	}
 
@@ -640,6 +677,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		int		   *subplan_map;
 		int		   *subpart_map;
 		Oid		   *relid_map;
+		Index	   *rti_map;
 
 		/*
 		 * Construct the subplan and subpart maps for this partitioning level.
@@ -652,6 +690,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		subpart_map = (int *) palloc(nparts * sizeof(int));
 		memset(subpart_map, -1, nparts * sizeof(int));
 		relid_map = (Oid *) palloc0(nparts * sizeof(Oid));
+		rti_map = (Index *) palloc0(nparts * sizeof(Index));
 		present_parts = NULL;
 
 		i = -1;
@@ -666,6 +705,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 			subplan_map[i] = subplanidx = relid_subplan_map[partrel->relid] - 1;
 			subpart_map[i] = subpartidx = relid_subpart_map[partrel->relid] - 1;
 			relid_map[i] = planner_rt_fetch(partrel->relid, root)->relid;
+			rti_map[i] = partrel->relid;
 			if (subplanidx >= 0)
 			{
 				present_parts = bms_add_member(present_parts, i);
@@ -690,6 +730,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		pinfo->subplan_map = subplan_map;
 		pinfo->subpart_map = subpart_map;
 		pinfo->relid_map = relid_map;
+		pinfo->rti_map = rti_map;
 	}
 
 	pfree(relid_subpart_map);
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 95dc2e2c83..8dc52a158f 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1603,6 +1603,7 @@ exec_bind_message(StringInfo input_message)
 	int16	   *rformats = NULL;
 	CachedPlanSource *psrc;
 	CachedPlan *cplan;
+	List	   *part_prune_result_list;
 	Portal		portal;
 	char	   *query_string;
 	char	   *saved_stmt_name;
@@ -1978,7 +1979,9 @@ exec_bind_message(StringInfo input_message)
 	 * will be generated in MessageContext.  The plan refcount will be
 	 * assigned to the Portal, so it will be released at portal destruction.
 	 */
-	cplan = GetCachedPlan(psrc, params, NULL, NULL);
+	cplan = GetCachedPlan(psrc, params, NULL, NULL, &part_prune_result_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_result_list));
 
 	/*
 	 * Now we can define the portal.
@@ -1993,6 +1996,9 @@ exec_bind_message(StringInfo input_message)
 					  cplan->stmt_list,
 					  cplan);
 
+	/* Copy PartitionPruneResults into the portal's context. */
+	PortalStorePartitionPruneResults(portal, part_prune_result_list);
+
 	/* Done with the snapshot used for parameter I/O and parsing/planning */
 	if (snapshot_set)
 		PopActiveSnapshot();
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 5aa5a350f3..a627448a5a 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -35,7 +35,7 @@
 Portal		ActivePortal = NULL;
 
 
-static void ProcessQuery(PlannedStmt *plan,
+static void ProcessQuery(PlannedStmt *plan, PartitionPruneResult *part_prune_result,
 						 const char *sourceText,
 						 ParamListInfo params,
 						 QueryEnvironment *queryEnv,
@@ -65,6 +65,7 @@ static void DoPortalRewind(Portal portal);
  */
 QueryDesc *
 CreateQueryDesc(PlannedStmt *plannedstmt,
+				PartitionPruneResult *part_prune_result,
 				const char *sourceText,
 				Snapshot snapshot,
 				Snapshot crosscheck_snapshot,
@@ -77,6 +78,8 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
 
 	qd->operation = plannedstmt->commandType;	/* operation */
 	qd->plannedstmt = plannedstmt;	/* plan */
+	qd->part_prune_result = part_prune_result;	/* ExecutorDoInitialPruning()
+												 * output for plan */
 	qd->sourceText = sourceText;	/* query text */
 	qd->snapshot = RegisterSnapshot(snapshot);	/* snapshot */
 	/* RI check snapshot */
@@ -122,6 +125,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  *		PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
  *
  *	plan: the plan tree for the query
+ *	part_prune_result: ExecutorDoInitialPruning() output for the plan tree
  *	sourceText: the source text of the query
  *	params: any parameters needed
  *	dest: where to send results
@@ -134,6 +138,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  */
 static void
 ProcessQuery(PlannedStmt *plan,
+			 PartitionPruneResult *part_prune_result,
 			 const char *sourceText,
 			 ParamListInfo params,
 			 QueryEnvironment *queryEnv,
@@ -145,7 +150,7 @@ ProcessQuery(PlannedStmt *plan,
 	/*
 	 * Create the QueryDesc object
 	 */
-	queryDesc = CreateQueryDesc(plan, sourceText,
+	queryDesc = CreateQueryDesc(plan, part_prune_result, sourceText,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, params, queryEnv, 0);
 
@@ -491,8 +496,13 @@ PortalStart(Portal portal, ParamListInfo params,
 				/*
 				 * Create QueryDesc in portal's context; for the moment, set
 				 * the destination to DestNone.
+				 *
+				 * There is no PartitionPruneResult unless the PlannedStmt is
+				 * from a CachedPlan.
 				 */
 				queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+											portal->part_prune_results == NIL ? NULL :
+											linitial(portal->part_prune_results),
 											portal->sourceText,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
@@ -1194,6 +1204,7 @@ PortalRunMulti(Portal portal,
 {
 	bool		active_snapshot_set = false;
 	ListCell   *stmtlist_item;
+	int			i;
 
 	/*
 	 * If the destination is DestRemoteExecute, change to DestNone.  The
@@ -1214,9 +1225,15 @@ PortalRunMulti(Portal portal,
 	 * Loop to handle the individual queries generated from a single parsetree
 	 * by analysis and rewrite.
 	 */
+	i = 0;
 	foreach(stmtlist_item, portal->stmts)
 	{
 		PlannedStmt *pstmt = lfirst_node(PlannedStmt, stmtlist_item);
+		PartitionPruneResult *part_prune_result = portal->part_prune_results ?
+									  list_nth(portal->part_prune_results, i) :
+									  NULL;
+
+		i++;
 
 		/*
 		 * If we got a cancel signal in prior command, quit
@@ -1274,7 +1291,7 @@ PortalRunMulti(Portal portal,
 			if (pstmt->canSetTag)
 			{
 				/* statement can set tag string */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, part_prune_result,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
@@ -1283,7 +1300,7 @@ PortalRunMulti(Portal portal,
 			else
 			{
 				/* stmt added by rewrite cannot set tag */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, part_prune_result,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 4cf6db504f..6cb473f2f4 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -99,14 +99,19 @@ static dlist_head cached_expression_list = DLIST_STATIC_INIT(cached_expression_l
 static void ReleaseGenericPlan(CachedPlanSource *plansource);
 static List *RevalidateCachedQuery(CachedPlanSource *plansource,
 								   QueryEnvironment *queryEnv);
-static bool CheckCachedPlan(CachedPlanSource *plansource);
+static bool CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+							List **part_prune_result_list);
 static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
-								   ParamListInfo boundParams, QueryEnvironment *queryEnv);
+								   ParamListInfo boundParams, QueryEnvironment *queryEnv,
+								   List **part_prune_result_list);
 static bool choose_custom_plan(CachedPlanSource *plansource,
 							   ParamListInfo boundParams);
 static double cached_plan_cost(CachedPlan *plan, bool include_planner);
 static Query *QueryListGetPrimaryStmt(List *stmts);
-static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static void AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+								 List **part_prune_result_list,
+								 List **lockedRelids_per_stmt);
+static void ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt);
 static void AcquirePlannerLocks(List *stmt_list, bool acquire);
 static void ScanQueryForLocks(Query *parsetree, bool acquire);
 static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -790,15 +795,20 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
  *
  * On a "true" return, we have acquired the locks needed to run the plan.
  * (We must do this for the "true" result to be race-condition-free.)
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_result_list.
  */
 static bool
-CheckCachedPlan(CachedPlanSource *plansource)
+CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+				List **part_prune_result_list)
 {
 	CachedPlan *plan = plansource->gplan;
 
 	/* Assert that caller checked the querytree */
 	Assert(plansource->is_valid);
 
+	*part_prune_result_list = NIL;
+
 	/* If there's no generic plan, just say "false" */
 	if (!plan)
 		return false;
@@ -820,13 +830,21 @@ CheckCachedPlan(CachedPlanSource *plansource)
 	 */
 	if (plan->is_valid)
 	{
+		List   *lockedRelids_per_stmt;
+
 		/*
 		 * Plan must have positive refcount because it is referenced by
 		 * plansource; so no need to fear it disappears under us here.
 		 */
 		Assert(plan->refcount > 0);
 
-		AcquireExecutorLocks(plan->stmt_list, true);
+		/*
+		 * Lock relations scanned by the plan.  This is also where "initial"
+		 * pruning is performed, if needed.
+		 */
+		AcquireExecutorLocks(plan->stmt_list, boundParams,
+							 part_prune_result_list,
+							 &lockedRelids_per_stmt);
 
 		/*
 		 * If plan was transient, check to see if TransactionXmin has
@@ -848,7 +866,14 @@ CheckCachedPlan(CachedPlanSource *plansource)
 		}
 
 		/* Oops, the race case happened.  Release useless locks. */
-		AcquireExecutorLocks(plan->stmt_list, false);
+		ReleaseExecutorLocks(plan->stmt_list, lockedRelids_per_stmt);
+
+		/*
+		 * The output list and any objects therein were allocated in the
+		 * caller's hopefully short-lived context, so they will not stay leaked
+		 * for long; still, reset the list so it is not accidentally used.
+		 */
+		*part_prune_result_list = NIL;
 	}
 
 	/*
@@ -874,10 +899,15 @@ CheckCachedPlan(CachedPlanSource *plansource)
  * Planning work is done in the caller's memory context.  The finished plan
  * is in a child memory context, which typically should get reparented
  * (unless this is a one-shot plan, in which case we don't copy the plan).
+ *
+ * A list of NULLs is returned in *part_prune_result_list, meaning that no
+ * PartitionPruneResult nodes have yet been created for the plans in
+ * stmt_list.
  */
 static CachedPlan *
 BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
-				ParamListInfo boundParams, QueryEnvironment *queryEnv)
+				ParamListInfo boundParams, QueryEnvironment *queryEnv,
+				List **part_prune_result_list)
 {
 	CachedPlan *plan;
 	List	   *plist;
@@ -1007,6 +1037,17 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 
 	MemoryContextSwitchTo(oldcxt);
 
+	/*
+	 * There are no actual PartitionPruneResults to add yet, but the list
+	 * must still be initialized to have the same number of elements as the
+	 * list of PlannedStmts.
+	 */
+	*part_prune_result_list = NIL;
+	foreach(lc, plist)
+	{
+		*part_prune_result_list = lappend(*part_prune_result_list, NULL);
+	}
+
 	return plan;
 }
 
@@ -1126,6 +1167,17 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
  * plan or a custom plan for the given parameters: the caller does not know
  * which it will get.
  *
+ * For every PlannedStmt found in the returned CachedPlan, an element that
+ * is either a PartitionPruneResult or a NULL is added to
+ * *part_prune_result_list if needed.  The former if the PlannedStmt is from
+ * the existing CachedPlan that is otherwise valid and contains at least one
+ * PartitionPruneInfo that has "initial" pruning steps.  Those steps are
+ * performed by calling ExecutorDoInitialPruning() to determine only those
+ * leaf partitions that need to be locked by AcquireExecutorLocks() by pruning
+ * away subplans that don't match the pruning conditions.  The
+ * PartitionPruneResult contains a list of bitmapsets of the indexes of
+ * matching subplans, one for each PartitionPruneInfo.
+ *
  * On return, the plan is valid and we have sufficient locks to begin
  * execution.
  *
@@ -1139,11 +1191,13 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
  */
 CachedPlan *
 GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
-			  ResourceOwner owner, QueryEnvironment *queryEnv)
+			  ResourceOwner owner, QueryEnvironment *queryEnv,
+			  List **part_prune_result_list)
 {
 	CachedPlan *plan = NULL;
 	List	   *qlist;
 	bool		customplan;
+	List	   *my_part_prune_result_list;
 
 	/* Assert caller is doing things in a sane order */
 	Assert(plansource->magic == CACHEDPLANSOURCE_MAGIC);
@@ -1160,7 +1214,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 
 	if (!customplan)
 	{
-		if (CheckCachedPlan(plansource))
+		if (CheckCachedPlan(plansource, boundParams,
+							&my_part_prune_result_list))
 		{
 			/* We want a generic plan, and we already have a valid one */
 			plan = plansource->gplan;
@@ -1169,7 +1224,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 		else
 		{
 			/* Build a new generic plan */
-			plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv);
+			plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv,
+								   &my_part_prune_result_list);
 			/* Just make real sure plansource->gplan is clear */
 			ReleaseGenericPlan(plansource);
 			/* Link the new generic plan into the plansource */
@@ -1214,7 +1270,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 	if (customplan)
 	{
 		/* Build a custom plan */
-		plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv);
+		plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv,
+							   &my_part_prune_result_list);
 		/* Accumulate total costs of custom plans */
 		plansource->total_custom_cost += cached_plan_cost(plan, true);
 
@@ -1246,6 +1303,9 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 		plan->is_saved = true;
 	}
 
+	if (part_prune_result_list)
+		*part_prune_result_list = my_part_prune_result_list;
+
 	return plan;
 }
 
@@ -1737,17 +1797,29 @@ QueryListGetPrimaryStmt(List *stmts)
 
 /*
  * AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
- * or release them if acquire is false.
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_result_list.
+ *
+ * On return, *lockedRelids_per_stmt will contain a bitmapset for every
+ * PlannedStmt in stmt_list, containing the RT indexes of relation entries
+ * in its range table that were actually locked, or NULL if the PlannedStmt
+ * contains a utility statement.
  */
 static void
-AcquireExecutorLocks(List *stmt_list, bool acquire)
+AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+					 List **part_prune_result_list,
+					 List **lockedRelids_per_stmt)
 {
 	ListCell   *lc1;
 
+	*part_prune_result_list = *lockedRelids_per_stmt = NIL;
 	foreach(lc1, stmt_list)
 	{
 		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
-		ListCell   *lc2;
+		PartitionPruneResult *part_prune_result = NULL;
+		Bitmapset  *allLockRelids;
+		Bitmapset  *lockedRelids = NULL;
+		int			rti;
 
 		if (plannedstmt->commandType == CMD_UTILITY)
 		{
@@ -1761,13 +1833,35 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 			Query	   *query = UtilityContainsQuery(plannedstmt->utilityStmt);
 
 			if (query)
-				ScanQueryForLocks(query, acquire);
+				ScanQueryForLocks(query, true);
+			*part_prune_result_list = lappend(*part_prune_result_list, NULL);
 			continue;
 		}
 
-		foreach(lc2, plannedstmt->rtable)
+		/*
+		 * Figure out the set of relations that would need to be locked
+		 * before executing the plan.
+		 */
+		if (plannedstmt->containsInitialPruning)
 		{
-			RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
+			/*
+			 * Obtain the set of partitions to be locked from the
+			 * PartitionPruneInfos by considering the result of performing
+			 * initial partition pruning.
+			 */
+			part_prune_result = ExecutorDoInitialPruning(plannedstmt,
+														 boundParams);
+
+			allLockRelids = bms_union(plannedstmt->minLockRelids,
+									  part_prune_result->scan_leafpart_rtis);
+		}
+		else
+			allLockRelids = plannedstmt->minLockRelids;
+
+		rti = -1;
+		while ((rti = bms_next_member(allLockRelids, rti)) > 0)
+		{
+			RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
 
 			if (rte->rtekind != RTE_RELATION)
 				continue;
@@ -1778,10 +1872,58 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 			 * fail if it's been dropped entirely --- we'll just transiently
 			 * acquire a non-conflicting lock.
 			 */
-			if (acquire)
-				LockRelationOid(rte->relid, rte->rellockmode);
-			else
-				UnlockRelationOid(rte->relid, rte->rellockmode);
+			LockRelationOid(rte->relid, rte->rellockmode);
+			lockedRelids = bms_add_member(lockedRelids, rti);
+		}
+
+		*part_prune_result_list = lappend(*part_prune_result_list,
+										  part_prune_result);
+		*lockedRelids_per_stmt = lappend(*lockedRelids_per_stmt, lockedRelids);
+	}
+}
+
+/*
+ * ReleaseExecutorLocks
+ * 		Release locks that would've been acquired by an earlier call to
+ * 		AcquireExecutorLocks()
+ */
+static void
+ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt)
+{
+	ListCell   *lc1,
+			   *lc2;
+
+	forboth(lc1, stmt_list, lc2, lockedRelids_per_stmt)
+	{
+		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
+		Bitmapset  *lockedRelids = lfirst(lc2);
+		int			rti;
+
+		if (plannedstmt->commandType == CMD_UTILITY)
+		{
+			/*
+			 * Ignore utility statements, except those (such as EXPLAIN) that
+			 * contain a parsed-but-not-planned query.  Note: it's okay to use
+			 * ScanQueryForLocks, even though the query hasn't been through
+			 * rule rewriting, because rewriting doesn't change the query
+			 * representation.
+			 */
+			Query	   *query = UtilityContainsQuery(plannedstmt->utilityStmt);
+
+			if (query)
+				ScanQueryForLocks(query, false);
+			continue;
+		}
+
+		rti = -1;
+		while ((rti = bms_next_member(lockedRelids, rti)) >= 0)
+		{
+			RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
+
+			Assert(rte->rtekind == RTE_RELATION);
+
+			/* See the comment in AcquireExecutorLocks(). */
+			UnlockRelationOid(rte->relid, rte->rellockmode);
 		}
 	}
 }
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index d549f66d4a..1bbe6b704b 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -303,6 +303,25 @@ PortalDefineQuery(Portal portal,
 	portal->status = PORTAL_DEFINED;
 }
 
+/*
+ * PortalStorePartitionPruneResults
+ *		Copy the given list of PartitionPruneResults into the portal's
+ *		context
+ *
+ * This allows the caller to ensure that the list exists as long as the portal
+ * does.
+ */
+void
+PortalStorePartitionPruneResults(Portal portal, List *part_prune_results)
+{
+	MemoryContext	oldcxt;
+
+	AssertArg(PortalIsValid(portal));
+	oldcxt = MemoryContextSwitchTo(portal->portalContext);
+	portal->part_prune_results = copyObject(part_prune_results);
+	MemoryContextSwitchTo(oldcxt);
+}
+
 /*
  * PortalReleaseCachedPlan
  *		Release a portal's reference to its cached plan, if any.
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 666977fb1f..34975c69ee 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -87,7 +87,8 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
 							  ExplainState *es, const char *queryString,
 							  ParamListInfo params, QueryEnvironment *queryEnv);
 
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt, PartitionPruneResult *part_prune_result,
+						   IntoClause *into,
 						   ExplainState *es, const char *queryString,
 						   ParamListInfo params, QueryEnvironment *queryEnv,
 						   const instr_time *planduration,
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 708435e952..bd8776402e 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -45,6 +45,7 @@ extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
  * nparts						Length of subplan_map[] and subpart_map[].
  * subplan_map					Subplan index by partition index, or -1.
  * subpart_map					Subpart index by partition index, or -1.
+ * rti_map						Range table index by partition index, or 0.
  * present_parts				A Bitmapset of the partition indexes that we
  *								have subplans or subparts for.
  * initial_pruning_steps		List of PartitionPruneSteps used to
@@ -61,6 +62,7 @@ typedef struct PartitionedRelPruningData
 	int			nparts;
 	int		   *subplan_map;
 	int		   *subpart_map;
+	Index	   *rti_map;
 	Bitmapset  *present_parts;
 	List	   *initial_pruning_steps;
 	List	   *exec_pruning_steps;
@@ -123,9 +125,13 @@ typedef struct PartitionPruneState
 
 extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
 													 int n_total_subplans,
-													 PartitionPruneInfo *pruneinfo,
+													 int part_prune_index,
 													 Bitmapset **initially_valid_subplans);
 extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
-										   bool initial_prune);
-
+										   bool initial_prune,
+										   Bitmapset **scan_leafpart_rtis);
+extern Bitmapset *ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt,
+								ParamListInfo params,
+								PartitionPruneInfo *pruneinfo,
+								Bitmapset **scan_leafpart_rtis);
 #endif							/* EXECPARTITION_H */
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index e79e2c001f..60d5644908 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -35,6 +35,8 @@ typedef struct QueryDesc
 	/* These fields are provided by CreateQueryDesc */
 	CmdType		operation;		/* CMD_SELECT, CMD_UPDATE, etc. */
 	PlannedStmt *plannedstmt;	/* planner's output (could be utility, too) */
+	PartitionPruneResult *part_prune_result; /* ExecutorDoInitialPruning()'s
+											  * output for plannedstmt */
 	const char *sourceText;		/* source text of the query */
 	Snapshot	snapshot;		/* snapshot to use for query */
 	Snapshot	crosscheck_snapshot;	/* crosscheck for RI update/delete */
@@ -57,6 +59,7 @@ typedef struct QueryDesc
 
 /* in pquery.c */
 extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
+								  PartitionPruneResult *part_prune_result,
 								  const char *sourceText,
 								  Snapshot snapshot,
 								  Snapshot crosscheck_snapshot,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index 873772f188..57dc0e8077 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -185,6 +185,8 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
 /*
  * prototypes from functions in execMain.c
  */
+extern PartitionPruneResult *ExecutorDoInitialPruning(PlannedStmt *plannedstmt,
+													  ParamListInfo params);
 extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
 extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
 extern void ExecutorRun(QueryDesc *queryDesc,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index cbbcff81d2..3de4df1b05 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -596,6 +596,8 @@ typedef struct EState
 	struct ExecRowMark **es_rowmarks;	/* Array of per-range-table-entry
 										 * ExecRowMarks, or NULL if none */
 	PlannedStmt *es_plannedstmt;	/* link to top of plan tree */
+	List		*es_part_prune_infos;	/* PlannedStmt.partPruneInfos */
+	struct PartitionPruneResult *es_part_prune_result; /* QueryDesc.part_prune_result */
 	const char *es_sourceText;	/* Source text from QueryDesc */
 
 	JunkFilter *es_junkFilter;	/* top-level junk filter, if any */
@@ -984,6 +986,34 @@ typedef struct DomainConstraintState
  */
 typedef TupleTableSlot *(*ExecProcNodeMtd) (struct PlanState *pstate);
 
+/*----------------
+ * PartitionPruneResult
+ *
+ * The result of invoking ExecutorDoInitialPruning() on a given
+ * PlannedStmt.
+ *
+ * Contains a list of Bitmapsets of the indexes of the subplans remaining after
+ * performing initial pruning by calling ExecFindMatchingSubPlans() for every
+ * PartitionPruneInfo found in PlannedStmt.partPruneInfos.  RT indexes of the
+ * leaf partitions scanned by those subplans across all PartitionPruneInfos
+ * are added into scan_leafpart_rtis.
+ *
+ * This is used by GetCachedPlan() to inform its callers of the pruning
+ * decisions made when performing AcquireExecutorLocks() on a given cached
+ * PlannedStmt, which the callers then pass on to the executor.  The
+ * executor refers to this node, if one is available, when initializing the plan
+ * nodes to which those PartitionPruneInfos apply so that the same set of
+ * qualifying subplans are initialized, rather than deriving that set again by
+ * redoing initial pruning.
+ */
+typedef struct PartitionPruneResult
+{
+	NodeTag		type;
+
+	List		   *valid_subplan_offs_list;
+	Bitmapset	   *scan_leafpart_rtis;
+} PartitionPruneResult;
+
 /* ----------------
  *		PlanState node
  *
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 300824258e..de312b9215 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -97,6 +97,9 @@ typedef enum NodeTag
 	T_PartitionPruneStepCombine,
 	T_PlanInvalItem,
 
+	/* TAGS FOR EXECUTOR PREP NODES (execnodes.h) */
+	T_PartitionPruneResult,
+
 	/*
 	 * TAGS FOR PLAN STATE NODES (execnodes.h)
 	 *
@@ -673,6 +676,7 @@ extern struct Bitmapset *readBitmapset(void);
 extern uintptr_t readDatum(bool typbyval);
 extern bool *readBoolCols(int numCols);
 extern int *readIntCols(int numCols);
+extern Index *readIndexCols(int numCols);
 extern Oid *readOidCols(int numCols);
 extern int16 *readAttrNumberCols(int numCols);
 
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 6cbcb67bdf..d9c482e08b 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -107,6 +107,18 @@ typedef struct PlannerGlobal
 
 	List	   *appendRelations;	/* "flat" list of AppendRelInfos */
 
+	List	   *partPruneInfos;		/* List of PartitionPruneInfo contained in
+									 * the plan */
+
+	bool		containsInitialPruning;	/* Do any of those PartitionPruneInfos
+										 * have initial (pre-exec) pruning
+										 * steps in them? */
+
+	Bitmapset  *minLockRelids;	/* Indexes of all range table entries minus
+								 * indexes of range table entries of the leaf
+								 * partitions scanned by prunable subplans;
+								 * see AcquireExecutorLocks() */
+
 	List	   *relationOids;	/* OIDs of relations the plan depends on */
 
 	List	   *invalItems;		/* other dependencies, as PlanInvalItems */
@@ -377,6 +389,9 @@ struct PlannerInfo
 
 	/* Does this query modify any partition key columns? */
 	bool		partColsUpdated;
+
+	/* PartitionPruneInfos added in this query's plan. */
+	List	   *partPruneInfos;
 };
 
 
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 10dd35f011..44997d595d 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -64,8 +64,20 @@ typedef struct PlannedStmt
 
 	struct Plan *planTree;		/* tree of Plan nodes */
 
+	List	   *partPruneInfos;	/* List of PartitionPruneInfo contained in
+								 * the plan */
+
+	bool		containsInitialPruning;	/* Do any of those PartitionPruneInfos
+										 * have initial (pre-exec) pruning
+										 * steps in them? */
+
 	List	   *rtable;			/* list of RangeTblEntry nodes */
 
+	Bitmapset  *minLockRelids;	/* Indexes of all range table entries minus
+								 * indexes of range table entries of the leaf
+								 * partitions scanned by prunable subplans;
+								 * see AcquireExecutorLocks() */
+
 	/* rtable indexes of target relations for INSERT/UPDATE/DELETE */
 	List	   *resultRelations;	/* integer list of RT indexes, or NIL */
 
@@ -262,8 +274,12 @@ typedef struct Append
 	 */
 	int			first_partial_plan;
 
-	/* Info for run-time subplan pruning; NULL if we're not doing that */
-	struct PartitionPruneInfo *part_prune_info;
+	/*
+	 * Index of this plan's PartitionPruneInfo in PlannedStmt.partPruneInfos
+	 * to be used for run-time subplan pruning; -1 if run-time pruning is
+	 * not needed.
+	 */
+	int			part_prune_index;
 } Append;
 
 /* ----------------
@@ -282,8 +298,13 @@ typedef struct MergeAppend
 	Oid		   *sortOperators;	/* OIDs of operators to sort them by */
 	Oid		   *collations;		/* OIDs of collations */
 	bool	   *nullsFirst;		/* NULLS FIRST/LAST directions */
-	/* Info for run-time subplan pruning; NULL if we're not doing that */
-	struct PartitionPruneInfo *part_prune_info;
+
+	/*
+	 * Index of this plan's PartitionPruneInfo in PlannedStmt.partPruneInfos
+	 * to be used for run-time subplan pruning; -1 if run-time pruning is
+	 * not needed.
+	 */
+	int			part_prune_index;
 } MergeAppend;
 
 /* ----------------
@@ -1187,6 +1208,13 @@ typedef struct PlanRowMark
  * prune_infos			List of Lists containing PartitionedRelPruneInfo nodes,
  *						one sublist per run-time-prunable partition hierarchy
  *						appearing in the parent plan node's subplans.
+ *
+ * needs_init_pruning	Does any of the PartitionedRelPruneInfos in
+ *						prune_infos have its initial_pruning_steps set?
+ *
+ * needs_exec_pruning	Does any of the PartitionedRelPruneInfos in
+ *						prune_infos have its exec_pruning_steps set?
+ *
  * other_subplans		Indexes of any subplans that are not accounted for
  *						by any of the PartitionedRelPruneInfo nodes in
  *						"prune_infos".  These subplans must not be pruned.
@@ -1195,6 +1223,8 @@ typedef struct PartitionPruneInfo
 {
 	NodeTag		type;
 	List	   *prune_infos;
+	bool		needs_init_pruning;
+	bool		needs_exec_pruning;
 	Bitmapset  *other_subplans;
 } PartitionPruneInfo;
 
@@ -1225,6 +1255,7 @@ typedef struct PartitionedRelPruneInfo
 	int		   *subplan_map;	/* subplan index by partition index, or -1 */
 	int		   *subpart_map;	/* subpart index by partition index, or -1 */
 	Oid		   *relid_map;		/* relation OID by partition index, or 0 */
+	Index	   *rti_map;		/* Range table index by partition index, or 0 */
 
 	/*
 	 * initial_pruning_steps shows how to prune during executor startup (i.e.,
diff --git a/src/include/partitioning/partprune.h b/src/include/partitioning/partprune.h
index 90684efa25..ebf0dcff8c 100644
--- a/src/include/partitioning/partprune.h
+++ b/src/include/partitioning/partprune.h
@@ -70,10 +70,10 @@ typedef struct PartitionPruneContext
 #define PruneCxtStateIdx(partnatts, step_id, keyno) \
 	((partnatts) * (step_id) + (keyno))
 
-extern PartitionPruneInfo *make_partition_pruneinfo(struct PlannerInfo *root,
-													struct RelOptInfo *parentrel,
-													List *subpaths,
-													List *prunequal);
+extern int make_partition_pruneinfo(struct PlannerInfo *root,
+									struct RelOptInfo *parentrel,
+									List *subpaths,
+									List *prunequal);
 extern Bitmapset *prune_append_rel_partitions(struct RelOptInfo *rel);
 extern Bitmapset *get_matching_partitions(PartitionPruneContext *context,
 										  List *pruning_steps);
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 95b99e3d25..449200b949 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -220,7 +220,8 @@ extern List *CachedPlanGetTargetList(CachedPlanSource *plansource,
 extern CachedPlan *GetCachedPlan(CachedPlanSource *plansource,
 								 ParamListInfo boundParams,
 								 ResourceOwner owner,
-								 QueryEnvironment *queryEnv);
+								 QueryEnvironment *queryEnv,
+								 List **part_prune_result_list);
 extern void ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner);
 
 extern bool CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index aeddbdafe5..9f7727a837 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -138,6 +138,7 @@ typedef struct PortalData
 	QueryCompletion qc;			/* command completion data for executed query */
 	List	   *stmts;			/* list of PlannedStmts */
 	CachedPlan *cplan;			/* CachedPlan, if stmts are from one */
+	List	   *part_prune_results;	/* list of PartitionPruneResults */
 
 	ParamListInfo portalParams; /* params to pass to query */
 	QueryEnvironment *queryEnv; /* environment for query */
@@ -242,6 +243,8 @@ extern void PortalDefineQuery(Portal portal,
 							  CommandTag commandTag,
 							  List *stmts,
 							  CachedPlan *cplan);
+extern void PortalStorePartitionPruneResults(Portal portal,
+											 List *part_prune_result_list);
 extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
 extern void PortalCreateHoldStore(Portal portal);
 extern void PortalHashTableDeleteAll(void);
-- 
2.24.1
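
For readers skimming the patch above: the reworked AcquireExecutorLocks()
locks the relations in minLockRelids plus whatever leaf partitions survive
initial pruning, and records the RT indexes it actually locked so that
ReleaseExecutorLocks() can undo exactly that set.  A rough standalone sketch
of that bookkeeping follows.  It is not PostgreSQL code: the toy_* names, the
64-entry limit and the counter array standing in for the lock manager are all
assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MAX_RELS 64

/* Hypothetical stand-in for the lock manager: one hold count per RT index. */
int toy_lock_count[TOY_MAX_RELS];

/*
 * Lock every RT index present in either mask and return the mask of the
 * indexes that were actually locked.  Bit N of a mask represents RT index N;
 * index 0 is never used, mirroring the 1-based range table.
 */
uint64_t
toy_acquire_locks(uint64_t min_lock, uint64_t survivors)
{
	uint64_t	all = min_lock | survivors;
	uint64_t	locked = 0;

	for (int rti = 1; rti < TOY_MAX_RELS; rti++)
	{
		if (all & (UINT64_C(1) << rti))
		{
			toy_lock_count[rti]++;
			locked |= UINT64_C(1) << rti;
		}
	}
	return locked;
}

/* Release exactly the locks recorded by an earlier toy_acquire_locks(). */
void
toy_release_locks(uint64_t locked)
{
	for (int rti = 1; rti < TOY_MAX_RELS; rti++)
	{
		if (locked & (UINT64_C(1) << rti))
		{
			assert(toy_lock_count[rti] > 0);
			toy_lock_count[rti]--;
		}
	}
}
```

Calling toy_acquire_locks() with only RT index 1 in min_lock and only index 3
in survivors locks 1 and 3 and leaves a pruned partition such as 2 untouched;
passing the returned mask to toy_release_locks() drops both counts back to
zero.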




* Re: generic plans and "initial" pruning
@ 2022-04-08 11:15  David Rowley <[email protected]>
  parent: Amit Langote <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: David Rowley @ 2022-04-08 11:15 UTC (permalink / raw)
  To: Amit Langote <[email protected]>; +Cc: Robert Haas <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers

On Fri, 8 Apr 2022 at 17:49, Amit Langote <[email protected]> wrote:
> Attached updated patch with these changes.

Thanks for making the changes.  I started looking over this patch but
really feel like it needs quite a few more iterations of what we've
just been doing to get it into proper committable shape. There seems
to be only about 40 mins to go before the freeze, so it seems very
unrealistic that it could be made to work.

I started trying to take a serious look at it this evening, but I feel
like I just failed to get into it deep enough to make any meaningful
improvements.  I'd need more time to study the problem before I could
build up a proper opinion on how exactly I think it should work.

Anyway. I've attached a small patch that's just a few things I
adjusted or questioned while reading over your v13 patch.  Some of
these are just me questioning your code (See XXX comments) and some I
think are improvements. Feel free to take the hunks that you see fit
and drop anything you don't.
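
To make the Bitmapset question in the setrefs.c hunk concrete, here is a
rough standalone sketch of the worry.  It uses a toy word-array set, not
PostgreSQL's Bitmapset: ToySet, the toy_* functions and the 16-word limit are
all made up for illustration, and whether the real routines leave such a tail
behind is exactly what the XXX comment asks.  The toy only shows why a run of
all-zero trailing words would cost extra loop iterations when walking the set.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MAX_WORDS 16

/* Hypothetical set type: callers must zero-initialize it, e.g. "= {0}". */
typedef struct ToySet
{
	int			nwords;			/* number of words currently in use */
	uint64_t	words[TOY_MAX_WORDS];
} ToySet;

void
toy_add(ToySet *s, int member)
{
	int			w = member / 64;

	assert(member >= 0 && w < TOY_MAX_WORDS);
	if (w >= s->nwords)
		s->nwords = w + 1;
	s->words[w] |= UINT64_C(1) << (member % 64);
}

/* Clears the bit but deliberately leaves nwords alone. */
void
toy_del(ToySet *s, int member)
{
	int			w = member / 64;

	if (member >= 0 && w < s->nwords)
		s->words[w] &= ~(UINT64_C(1) << (member % 64));
}

/* Drop all-zero words from the end of the set. */
void
toy_trim(ToySet *s)
{
	while (s->nwords > 0 && s->words[s->nwords - 1] == 0)
		s->nwords--;
}

/* Next member greater than prev, or -2 if none; counts words inspected. */
int
toy_next_member(const ToySet *s, int prev, int *words_visited)
{
	int			start = prev + 1;

	for (int w = start / 64; w < s->nwords; w++)
	{
		uint64_t	word = s->words[w];

		(*words_visited)++;
		if (w == start / 64)
			word &= ~UINT64_C(0) << (start % 64);	/* ignore bits below start */
		for (int b = 0; b < 64; b++)
		{
			if (word & (UINT64_C(1) << b))
				return w * 64 + b;
		}
	}
	return -2;
}

/* Walk the whole set and report how many words had to be inspected. */
int
toy_scan_cost(const ToySet *s)
{
	int			visited = 0;
	int			m = -1;

	while ((m = toy_next_member(s, m, &visited)) >= 0)
		;
	return visited;
}
```

In this toy, adding members 1 and 1000 and then deleting 1000 leaves 16 words
in use, so a full walk inspects 17 words; after toy_trim() the same walk
inspects 2.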

David

diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 05cc99df8f..5ee978937d 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -121,6 +121,8 @@ static void EvalPlanQualStart(EPQState *epqstate, Plan *planTree);
  *
  * Note: Partitioned tables mentioned in PartitionedRelPruneInfo nodes that
  * drive the pruning will be locked before doing the pruning.
+ *
+ * ----------------------------------------------------------------
  */
 PartitionPruneResult *
 ExecutorDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params)
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 3037742b8d..e9ca6bc55f 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -1707,6 +1707,7 @@ ExecInitPartitionPruning(PlanState *planstate,
 		Assert(n_total_subplans > 0);
 		*initially_valid_subplans = bms_add_range(NULL, 0,
 												  n_total_subplans - 1);
+		return prunestate;
 	}
 
 	/*
@@ -1714,14 +1715,15 @@ ExecInitPartitionPruning(PlanState *planstate,
 	 * that were removed above due to initial pruning.  No need to do this if
 	 * no steps were removed.
 	 */
-	if (bms_num_members(*initially_valid_subplans) < n_total_subplans)
+	if (prunestate &&
+		bms_num_members(*initially_valid_subplans) < n_total_subplans)
 	{
 		/*
 		 * We can safely skip this when !do_exec_prune, even though that
 		 * leaves invalid data in prunestate, because that data won't be
 		 * consulted again (cf initial Assert in ExecFindMatchingSubPlans).
 		 */
-		if (prunestate && prunestate->do_exec_prune)
+		if (prunestate->do_exec_prune)
 			PartitionPruneFixSubPlanMap(prunestate,
 										*initially_valid_subplans,
 										n_total_subplans);
@@ -1751,7 +1753,8 @@ ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
 	Bitmapset	 *valid_subplan_offs;
 
 	/*
-	 * A temporary context to allocate stuff needded to run the pruning steps.
+	 * A temporary context for memory allocations required while executing
+	 * partition pruning steps.
 	 */
 	tmpcontext = AllocSetContextCreate(CurrentMemoryContext,
 									   "initial pruning working data",
@@ -1765,11 +1768,12 @@ ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
 	pdir = CreatePartitionDirectory(CurrentMemoryContext, false);
 
 	/*
-	 * We don't yet have a PlanState for the parent plan node, so must create
-	 * a standalone ExprContext to evaluate pruning expressions, equipped with
-	 * the information about the EXTERN parameters that the caller passed us.
-	 * Note that that's okay because the initial pruning steps do not contain
-	 * anything that requires the execution to have started.
+	 * We don't yet have a PlanState for the parent plan node, so we must
+	 * create a standalone ExprContext to evaluate pruning expressions,
+	 * equipped with the information about the EXTERN parameters that the
+	 * caller passed us.  Note that that's okay because the initial pruning
+	 * steps do not contain anything that requires the execution to have
+	 * started.
 	 */
 	econtext = CreateStandaloneExprContext();
 	econtext->ecxt_param_list_info = params;
@@ -1874,7 +1878,6 @@ CreatePartitionPruneState(PlanState *planstate,
 			PartitionedRelPruneInfo *pinfo = lfirst_node(PartitionedRelPruneInfo, lc2);
 			PartitionedRelPruningData *pprune = &prunedata->partrelprunedata[j];
 			Relation	partrel;
-			bool		close_partrel = false;
 			PartitionDesc partdesc;
 			PartitionKey partkey;
 
@@ -1894,7 +1897,6 @@ CreatePartitionPruneState(PlanState *planstate,
 				int		lockmode = (j == 0) ? NoLock : rte->rellockmode;
 
 				partrel = table_open(rte->relid, lockmode);
-				close_partrel = true;
 			}
 			else
 				partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
@@ -1914,7 +1916,7 @@ CreatePartitionPruneState(PlanState *planstate,
 			 * Must close partrel, keeping the lock taken, if we're not using
 			 * EState's entry.
 			 */
-			if (close_partrel)
+			if (estate == NULL)
 				table_close(partrel, NoLock);
 
 			/*
@@ -2367,6 +2369,8 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
 		{
 			*validsubplans = bms_add_member(*validsubplans,
 											pprune->subplan_map[i]);
+			/* XXX why would pprune->rti_map[i] ever be zero here??? */
+			Assert(pprune->rti_map[i] > 0);
 			if (scan_leafpart_rtis && pprune->rti_map[i] > 0)
 				*scan_leafpart_rtis = bms_add_member(*scan_leafpart_rtis,
 													 pprune->rti_map[i]);
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 639145abe9..f9c7976ff2 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -119,6 +119,7 @@ CreateExecutorState(void)
 	estate->es_relations = NULL;
 	estate->es_rowmarks = NULL;
 	estate->es_plannedstmt = NULL;
+	estate->es_part_prune_infos = NIL;
 	estate->es_part_prune_result = NULL;
 
 	estate->es_junkFilter = NULL;
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 09f26658e2..96880e122a 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -94,7 +94,6 @@ static bool ExecAppendAsyncRequest(AppendState *node, TupleTableSlot **result);
 static void ExecAppendAsyncEventWait(AppendState *node);
 static void classify_matching_subplans(AppendState *node);
 
-
 /* ----------------------------------------------------------------
  *		ExecInitAppend
  *
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index ec6b1f1fc0..fe0df2f1d1 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -1184,7 +1184,6 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
 	ListCell   *subpaths;
 	int			nasyncplans = 0;
 	RelOptInfo *rel = best_path->path.parent;
-	int			part_prune_index = -1;
 	int			nodenumsortkeys = 0;
 	AttrNumber *nodeSortColIdx = NULL;
 	Oid		   *nodeSortOperators = NULL;
@@ -1335,6 +1334,9 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
 		subplans = lappend(subplans, subplan);
 	}
 
+	/* Set below if we find quals that we can use to run-time prune */
+	plan->part_prune_index = -1;
+
 	/*
 	 * If any quals exist, they may be useful to perform further partition
 	 * pruning during execution.  Gather information needed by the executor to
@@ -1358,18 +1360,15 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
 		}
 
 		if (prunequal != NIL)
-			part_prune_index= make_partition_pruneinfo(root, rel,
-													   best_path->subpaths,
-													   prunequal);
+			plan->part_prune_index = make_partition_pruneinfo(root, rel,
+															  best_path->subpaths,
+															  prunequal);
 	}
 
 	plan->appendplans = subplans;
 	plan->nasyncplans = nasyncplans;
 	plan->first_partial_plan = best_path->first_partial_path;
 
-	/* Will be updated later in set_plan_references(). */
-	plan->part_prune_index = part_prune_index;
-
 	copy_generic_path_info(&plan->plan, (Path *) best_path);
 
 	/*
@@ -1408,7 +1407,6 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
 	List	   *subplans = NIL;
 	ListCell   *subpaths;
 	RelOptInfo *rel = best_path->path.parent;
-	int			part_prune_index = -1;
 
 	/*
 	 * We don't have the actual creation of the MergeAppend node split out
@@ -1501,6 +1499,9 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
 		subplans = lappend(subplans, subplan);
 	}
 
+	/* Set below if we find quals that we can use to run-time prune */
+	node->part_prune_index = -1;
+
 	/*
 	 * If any quals exist, they may be useful to perform further partition
 	 * pruning during execution.  Gather information needed by the executor to
@@ -1524,15 +1525,13 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
 		}
 
 		if (prunequal != NIL)
-			part_prune_index= make_partition_pruneinfo(root, rel,
-													   best_path->subpaths,
-													   prunequal);
+			node->part_prune_index = make_partition_pruneinfo(root, rel,
+															  best_path->subpaths,
+															  prunequal);
 	}
 
 	node->mergeplans = subplans;
 
-	/* Will be updated later in set_plan_references(). */
-	node->part_prune_index = part_prune_index;
 
 	/*
 	 * If prepare_sort_from_pathkeys added sort columns, but we were told to
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index c88e5bacac..63ec8a98fc 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -408,6 +408,13 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 		glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
 	}
 
+	/*
+	 * XXX is it worth doing a bms_copy() on glob->minLockRelids if
+	 * glob->containsInitialPruning is true?. I'm slighly worried that the
+	 * Bitmapset could have a very long empty tail resulting in excessive
+	 * looping during AcquireExecutorLocks().
+	 */
+
 	return result;
 }
 
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 5a5f5dee46..952c5b8327 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -212,12 +212,12 @@ static void partkey_datum_from_expr(PartitionPruneContext *context,
 /*
  * make_partition_pruneinfo
  *		Checks if the given set of quals can be used to build pruning steps
- *		that the executor will use to prune useless ones from given set of
- *		child paths, and if so builds a PartitionPruneInfo that will allow the
- *		executor to do do and append it to root->partPruneInfos.
+ *		that the executor can use to prune away unneeded partitions.  If
+ *		suitable quals are found then a PartitionPruneInfo is built and tagged
+ *		onto the PlannerInfo's partPruneInfos list.
  *
- * Return value is 0-based index of the added PartitionPruneInfo or -1 if one
- * was not built after all.
+ * The return value is the 0-based index of the item added to the
+ * partPruneInfos list or -1 if nothing was added.
  *
  * 'parentrel' is the RelOptInfo for an appendrel, and 'subpaths' is the list
  * of scan paths for its child rels.
@@ -335,10 +335,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 			allmatchedsubplans = bms_join(matchedsubplans,
 										  allmatchedsubplans);
 		}
-		if (!needs_init_pruning)
-			needs_init_pruning = partrel_needs_init_pruning;
-		if (!needs_exec_pruning)
-			needs_exec_pruning = partrel_needs_exec_pruning;
+
+		needs_init_pruning |= partrel_needs_init_pruning;
+		needs_exec_pruning |= partrel_needs_exec_pruning;
 	}
 
 	pfree(relid_subplan_map);
@@ -570,7 +569,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		 * that would require per-scan pruning.
 		 *
 		 * In the first pass, we note whether the 2nd pass is necessary by
-		 * by noting the presence of EXEC parameters.
+		 * noting the presence of EXEC parameters.
 		 */
 		gen_partprune_steps(subpart, partprunequal, PARTTARGET_INITIAL,
 							&context);
@@ -645,10 +644,11 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		pinfo->execparamids = execparamids;
 		/* Remaining fields will be filled in the next loop */
 
-		if (!*needs_init_pruning)
-			*needs_init_pruning = (initial_pruning_steps != NIL);
-		if (!*needs_exec_pruning)
-			*needs_exec_pruning = (exec_pruning_steps != NIL);
+		/* record which types of pruning steps we've seen so far */
+		if (initial_pruning_steps != NIL)
+			*needs_init_pruning = true;
+		if (exec_pruning_steps != NIL)
+			*needs_exec_pruning = true;
 
 		pinfolist = lappend(pinfolist, pinfo);
 	}
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index a627448a5a..8cc2e2162d 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -1204,7 +1204,6 @@ PortalRunMulti(Portal portal,
 {
 	bool		active_snapshot_set = false;
 	ListCell   *stmtlist_item;
-	int			i;
 
 	/*
 	 * If the destination is DestRemoteExecute, change to DestNone.  The
@@ -1225,15 +1224,9 @@ PortalRunMulti(Portal portal,
 	 * Loop to handle the individual queries generated from a single parsetree
 	 * by analysis and rewrite.
 	 */
-	i = 0;
 	foreach(stmtlist_item, portal->stmts)
 	{
 		PlannedStmt *pstmt = lfirst_node(PlannedStmt, stmtlist_item);
-		PartitionPruneResult *part_prune_result = portal->part_prune_results ?
-									  list_nth(portal->part_prune_results, i) :
-									  NULL;
-
-		i++;
 
 		/*
 		 * If we got a cancel signal in prior command, quit
@@ -1242,6 +1235,8 @@ PortalRunMulti(Portal portal,
 
 		if (pstmt->utilityStmt == NULL)
 		{
+			PartitionPruneResult *part_prune_result = NULL;
+
 			/*
 			 * process a plannable query.
 			 */
@@ -1288,6 +1283,14 @@ PortalRunMulti(Portal portal,
 			else
 				UpdateActiveSnapshotCommandId();
 
+			/*
+			 * Determine if there's a corresponding PartitionPruneResult for
+			 * this PlannedStmt.
+			 */
+			if (portal->part_prune_results != NIL)
+				part_prune_result = list_nth(portal->part_prune_results,
+											 foreach_current_index(stmtlist_item));
+
 			if (pstmt->canSetTag)
 			{
 				/* statement can set tag string */
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 34975c69ee..bbc8c42d88 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -87,7 +87,8 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
 							  ExplainState *es, const char *queryString,
 							  ParamListInfo params, QueryEnvironment *queryEnv);
 
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, PartitionPruneResult *part_prune_resul,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt,
+						   PartitionPruneResult *part_prune_result,
 						   IntoClause *into,
 						   ExplainState *es, const char *queryString,
 						   ParamListInfo params, QueryEnvironment *queryEnv,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 43bd293433..a8bf908d63 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -1000,11 +1000,11 @@ typedef TupleTableSlot *(*ExecProcNodeMtd) (struct PlanState *pstate);
  *
  * This is used by GetCachedPlan() to inform its callers of the pruning
  * decisions made when performing AcquireExecutorLocks() on a given cached
- * PlannedStmt, which the callers then pass that on to the executor.  The
- * executor refers to this node when made available when initializing the plan
- * nodes to which those PartitionPruneInfos apply so that the same set of
- * qualifying subplans are initialized, rather than deriving that set again by
- * redoing initial pruning.
+ * PlannedStmt, which the callers then pass onto the executor.  The executor
+ * refers to this node when made available when initializing the plan nodes to
+ * which those PartitionPruneInfos apply so that the same set of qualifying
+ * subplans are initialized, rather than deriving that set again by redoing
+ * initial pruning.
  */
 typedef struct PartitionPruneResult
 {
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 550308147d..f8f3971f44 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -274,11 +274,7 @@ typedef struct Append
 	 */
 	int			first_partial_plan;
 
-	/*
-	 * Index of this plan's PartitionPruneInfo in PlannedStmt.partPruneInfos
-	 * to be used for run-time subplan pruning; -1 if run-time pruning is
-	 * not needed.
-	 */
+	/* Index to PlannerInfo.partPruneInfos or -1 if no run-time pruning */
 	int			part_prune_index;
 } Append;
 
@@ -299,11 +295,7 @@ typedef struct MergeAppend
 	Oid		   *collations;		/* OIDs of collations */
 	bool	   *nullsFirst;		/* NULLS FIRST/LAST directions */
 
-	/*
-	 * Index of this plan's PartitionPruneInfo in PlannedStmt.partPruneInfos
-	 * to be used for run-time subplan pruning; -1 if run-time pruning is
-	 * not needed.
-	 */
+	/* Index to PlannerInfo.partPruneInfos or -1 if no run-time pruning */
 	int			part_prune_index;
 } MergeAppend;
 


Attachments:

  [text/plain] misc_fixes.patch.txt (15.8K, 2-misc_fixes.patch.txt)
  download | inline diff:
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 05cc99df8f..5ee978937d 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -121,6 +121,8 @@ static void EvalPlanQualStart(EPQState *epqstate, Plan *planTree);
  *
  * Note: Partitioned tables mentioned in PartitionedRelPruneInfo nodes that
  * drive the pruning will be locked before doing the pruning.
+ *
+ * ----------------------------------------------------------------
  */
 PartitionPruneResult *
 ExecutorDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params)
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 3037742b8d..e9ca6bc55f 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -1707,6 +1707,7 @@ ExecInitPartitionPruning(PlanState *planstate,
 		Assert(n_total_subplans > 0);
 		*initially_valid_subplans = bms_add_range(NULL, 0,
 												  n_total_subplans - 1);
+		return prunestate;
 	}
 
 	/*
@@ -1714,14 +1715,15 @@ ExecInitPartitionPruning(PlanState *planstate,
 	 * that were removed above due to initial pruning.  No need to do this if
 	 * no steps were removed.
 	 */
-	if (bms_num_members(*initially_valid_subplans) < n_total_subplans)
+	if (prunestate &&
+		bms_num_members(*initially_valid_subplans) < n_total_subplans)
 	{
 		/*
 		 * We can safely skip this when !do_exec_prune, even though that
 		 * leaves invalid data in prunestate, because that data won't be
 		 * consulted again (cf initial Assert in ExecFindMatchingSubPlans).
 		 */
-		if (prunestate && prunestate->do_exec_prune)
+		if (prunestate->do_exec_prune)
 			PartitionPruneFixSubPlanMap(prunestate,
 										*initially_valid_subplans,
 										n_total_subplans);
@@ -1751,7 +1753,8 @@ ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
 	Bitmapset	 *valid_subplan_offs;
 
 	/*
-	 * A temporary context to allocate stuff needded to run the pruning steps.
+	 * A temporary context to for memory allocations required while execution
+	 * partition pruning steps.
 	 */
 	tmpcontext = AllocSetContextCreate(CurrentMemoryContext,
 									   "initial pruning working data",
@@ -1765,11 +1768,12 @@ ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
 	pdir = CreatePartitionDirectory(CurrentMemoryContext, false);
 
 	/*
-	 * We don't yet have a PlanState for the parent plan node, so must create
-	 * a standalone ExprContext to evaluate pruning expressions, equipped with
-	 * the information about the EXTERN parameters that the caller passed us.
-	 * Note that that's okay because the initial pruning steps do not contain
-	 * anything that requires the execution to have started.
+	 * We don't yet have a PlanState for the parent plan node, so we must
+	 * create a standalone ExprContext to evaluate pruning expressions,
+	 * equipped with the information about the EXTERN parameters that the
+	 * caller passed us.  Note that that's okay because the initial pruning
+	 * steps do not contain anything that requires the execution to have
+	 * started.
 	 */
 	econtext = CreateStandaloneExprContext();
 	econtext->ecxt_param_list_info = params;
@@ -1874,7 +1878,6 @@ CreatePartitionPruneState(PlanState *planstate,
 			PartitionedRelPruneInfo *pinfo = lfirst_node(PartitionedRelPruneInfo, lc2);
 			PartitionedRelPruningData *pprune = &prunedata->partrelprunedata[j];
 			Relation	partrel;
-			bool		close_partrel = false;
 			PartitionDesc partdesc;
 			PartitionKey partkey;
 
@@ -1894,7 +1897,6 @@ CreatePartitionPruneState(PlanState *planstate,
 				int		lockmode = (j == 0) ? NoLock : rte->rellockmode;
 
 				partrel = table_open(rte->relid, lockmode);
-				close_partrel = true;
 			}
 			else
 				partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
@@ -1914,7 +1916,7 @@ CreatePartitionPruneState(PlanState *planstate,
 			 * Must close partrel, keeping the lock taken, if we're not using
 			 * EState's entry.
 			 */
-			if (close_partrel)
+			if (estate == NULL)
 				table_close(partrel, NoLock);
 
 			/*
@@ -2367,6 +2369,8 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
 		{
 			*validsubplans = bms_add_member(*validsubplans,
 											pprune->subplan_map[i]);
+			/* XXX why would pprune->rti_map[i] ever be zero here??? */
+			Assert(pprune->rti_map[i] > 0);
 			if (scan_leafpart_rtis && pprune->rti_map[i] > 0)
 				*scan_leafpart_rtis = bms_add_member(*scan_leafpart_rtis,
 													 pprune->rti_map[i]);
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 639145abe9..f9c7976ff2 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -119,6 +119,7 @@ CreateExecutorState(void)
 	estate->es_relations = NULL;
 	estate->es_rowmarks = NULL;
 	estate->es_plannedstmt = NULL;
+	estate->es_part_prune_infos = NIL;
 	estate->es_part_prune_result = NULL;
 
 	estate->es_junkFilter = NULL;
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 09f26658e2..96880e122a 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -94,7 +94,6 @@ static bool ExecAppendAsyncRequest(AppendState *node, TupleTableSlot **result);
 static void ExecAppendAsyncEventWait(AppendState *node);
 static void classify_matching_subplans(AppendState *node);
 
-
 /* ----------------------------------------------------------------
  *		ExecInitAppend
  *
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index ec6b1f1fc0..fe0df2f1d1 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -1184,7 +1184,6 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
 	ListCell   *subpaths;
 	int			nasyncplans = 0;
 	RelOptInfo *rel = best_path->path.parent;
-	int			part_prune_index = -1;
 	int			nodenumsortkeys = 0;
 	AttrNumber *nodeSortColIdx = NULL;
 	Oid		   *nodeSortOperators = NULL;
@@ -1335,6 +1334,9 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
 		subplans = lappend(subplans, subplan);
 	}
 
+	/* Set below if we find quals that we can use to run-time prune */
+	plan->part_prune_index = -1;
+
 	/*
 	 * If any quals exist, they may be useful to perform further partition
 	 * pruning during execution.  Gather information needed by the executor to
@@ -1358,18 +1360,15 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
 		}
 
 		if (prunequal != NIL)
-			part_prune_index= make_partition_pruneinfo(root, rel,
-													   best_path->subpaths,
-													   prunequal);
+			plan->part_prune_index = make_partition_pruneinfo(root, rel,
+															  best_path->subpaths,
+															  prunequal);
 	}
 
 	plan->appendplans = subplans;
 	plan->nasyncplans = nasyncplans;
 	plan->first_partial_plan = best_path->first_partial_path;
 
-	/* Will be updated later in set_plan_references(). */
-	plan->part_prune_index = part_prune_index;
-
 	copy_generic_path_info(&plan->plan, (Path *) best_path);
 
 	/*
@@ -1408,7 +1407,6 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
 	List	   *subplans = NIL;
 	ListCell   *subpaths;
 	RelOptInfo *rel = best_path->path.parent;
-	int			part_prune_index = -1;
 
 	/*
 	 * We don't have the actual creation of the MergeAppend node split out
@@ -1501,6 +1499,9 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
 		subplans = lappend(subplans, subplan);
 	}
 
+	/* Set below if we find quals that we can use to run-time prune */
+	node->part_prune_index = -1;
+
 	/*
 	 * If any quals exist, they may be useful to perform further partition
 	 * pruning during execution.  Gather information needed by the executor to
@@ -1524,15 +1525,13 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
 		}
 
 		if (prunequal != NIL)
-			part_prune_index= make_partition_pruneinfo(root, rel,
-													   best_path->subpaths,
-													   prunequal);
+			node->part_prune_index = make_partition_pruneinfo(root, rel,
+															  best_path->subpaths,
+															  prunequal);
 	}
 
 	node->mergeplans = subplans;
 
-	/* Will be updated later in set_plan_references(). */
-	node->part_prune_index = part_prune_index;
 
 	/*
 	 * If prepare_sort_from_pathkeys added sort columns, but we were told to
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index c88e5bacac..63ec8a98fc 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -408,6 +408,13 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 		glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
 	}
 
+	/*
+	 * XXX is it worth doing a bms_copy() on glob->minLockRelids if
+	 * glob->containsInitialPruning is true?. I'm slighly worried that the
+	 * Bitmapset could have a very long empty tail resulting in excessive
+	 * looping during AcquireExecutorLocks().
+	 */
+
 	return result;
 }
 
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 5a5f5dee46..952c5b8327 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -212,12 +212,12 @@ static void partkey_datum_from_expr(PartitionPruneContext *context,
 /*
  * make_partition_pruneinfo
  *		Checks if the given set of quals can be used to build pruning steps
- *		that the executor will use to prune useless ones from given set of
- *		child paths, and if so builds a PartitionPruneInfo that will allow the
- *		executor to do do and append it to root->partPruneInfos.
+ *		that the executor can use to prune away unneeded partitions.  If
+ *		suitable quals are found then a PartitionPruneInfo is built and tagged
+ *		onto the PlannerInfo's partPruneInfos list.
  *
- * Return value is 0-based index of the added PartitionPruneInfo or -1 if one
- * was not built after all.
+ * The return value is the 0-based index of the item added to the
+ * partPruneInfos list or -1 if nothing was added.
  *
  * 'parentrel' is the RelOptInfo for an appendrel, and 'subpaths' is the list
  * of scan paths for its child rels.
@@ -335,10 +335,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 			allmatchedsubplans = bms_join(matchedsubplans,
 										  allmatchedsubplans);
 		}
-		if (!needs_init_pruning)
-			needs_init_pruning = partrel_needs_init_pruning;
-		if (!needs_exec_pruning)
-			needs_exec_pruning = partrel_needs_exec_pruning;
+
+		needs_init_pruning |= partrel_needs_init_pruning;
+		needs_exec_pruning |= partrel_needs_exec_pruning;
 	}
 
 	pfree(relid_subplan_map);
@@ -570,7 +569,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		 * that would require per-scan pruning.
 		 *
 		 * In the first pass, we note whether the 2nd pass is necessary by
-		 * by noting the presence of EXEC parameters.
+		 * noting the presence of EXEC parameters.
 		 */
 		gen_partprune_steps(subpart, partprunequal, PARTTARGET_INITIAL,
 							&context);
@@ -645,10 +644,11 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		pinfo->execparamids = execparamids;
 		/* Remaining fields will be filled in the next loop */
 
-		if (!*needs_init_pruning)
-			*needs_init_pruning = (initial_pruning_steps != NIL);
-		if (!*needs_exec_pruning)
-			*needs_exec_pruning = (exec_pruning_steps != NIL);
+		/* record which types of pruning steps we've seen so far */
+		if (initial_pruning_steps != NIL)
+			*needs_init_pruning = true;
+		if (exec_pruning_steps != NIL)
+			*needs_exec_pruning = true;
 
 		pinfolist = lappend(pinfolist, pinfo);
 	}
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index a627448a5a..8cc2e2162d 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -1204,7 +1204,6 @@ PortalRunMulti(Portal portal,
 {
 	bool		active_snapshot_set = false;
 	ListCell   *stmtlist_item;
-	int			i;
 
 	/*
 	 * If the destination is DestRemoteExecute, change to DestNone.  The
@@ -1225,15 +1224,9 @@ PortalRunMulti(Portal portal,
 	 * Loop to handle the individual queries generated from a single parsetree
 	 * by analysis and rewrite.
 	 */
-	i = 0;
 	foreach(stmtlist_item, portal->stmts)
 	{
 		PlannedStmt *pstmt = lfirst_node(PlannedStmt, stmtlist_item);
-		PartitionPruneResult *part_prune_result = portal->part_prune_results ?
-									  list_nth(portal->part_prune_results, i) :
-									  NULL;
-
-		i++;
 
 		/*
 		 * If we got a cancel signal in prior command, quit
@@ -1242,6 +1235,8 @@ PortalRunMulti(Portal portal,
 
 		if (pstmt->utilityStmt == NULL)
 		{
+			PartitionPruneResult *part_prune_result = NULL;
+
 			/*
 			 * process a plannable query.
 			 */
@@ -1288,6 +1283,14 @@ PortalRunMulti(Portal portal,
 			else
 				UpdateActiveSnapshotCommandId();
 
+			/*
+			 * Determine if there's a corresponding PartitionPruneResult for
+			 * this PlannedStmt.
+			 */
+			if (portal->part_prune_results != NIL)
+				part_prune_result = list_nth(portal->part_prune_results,
+											 foreach_current_index(stmtlist_item));
+
 			if (pstmt->canSetTag)
 			{
 				/* statement can set tag string */
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 34975c69ee..bbc8c42d88 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -87,7 +87,8 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
 							  ExplainState *es, const char *queryString,
 							  ParamListInfo params, QueryEnvironment *queryEnv);
 
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, PartitionPruneResult *part_prune_resul,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt,
+						   PartitionPruneResult *part_prune_result,
 						   IntoClause *into,
 						   ExplainState *es, const char *queryString,
 						   ParamListInfo params, QueryEnvironment *queryEnv,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 43bd293433..a8bf908d63 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -1000,11 +1000,11 @@ typedef TupleTableSlot *(*ExecProcNodeMtd) (struct PlanState *pstate);
  *
  * This is used by GetCachedPlan() to inform its callers of the pruning
  * decisions made when performing AcquireExecutorLocks() on a given cached
- * PlannedStmt, which the callers then pass that on to the executor.  The
- * executor refers to this node when made available when initializing the plan
- * nodes to which those PartitionPruneInfos apply so that the same set of
- * qualifying subplans are initialized, rather than deriving that set again by
- * redoing initial pruning.
+ * PlannedStmt, which the callers then pass onto the executor.  The executor
+ * refers to this node when made available when initializing the plan nodes to
+ * which those PartitionPruneInfos apply so that the same set of qualifying
+ * subplans are initialized, rather than deriving that set again by redoing
+ * initial pruning.
  */
 typedef struct PartitionPruneResult
 {
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 550308147d..f8f3971f44 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -274,11 +274,7 @@ typedef struct Append
 	 */
 	int			first_partial_plan;
 
-	/*
-	 * Index of this plan's PartitionPruneInfo in PlannedStmt.partPruneInfos
-	 * to be used for run-time subplan pruning; -1 if run-time pruning is
-	 * not needed.
-	 */
+	/* Index to PlannerInfo.partPruneInfos or -1 if no run-time pruning */
 	int			part_prune_index;
 } Append;
 
@@ -299,11 +295,7 @@ typedef struct MergeAppend
 	Oid		   *collations;		/* OIDs of collations */
 	bool	   *nullsFirst;		/* NULLS FIRST/LAST directions */
 
-	/*
-	 * Index of this plan's PartitionPruneInfo in PlannedStmt.partPruneInfos
-	 * to be used for run-time subplan pruning; -1 if run-time pruning is
-	 * not needed.
-	 */
+	/* Index to PlannerInfo.partPruneInfos or -1 if no run-time pruning */
 	int			part_prune_index;
 } MergeAppend;
 


^ permalink  raw  reply  [nested|flat] 108+ messages in thread

* Re: generic plans and "initial" pruning
@ 2022-04-08 11:45  Amit Langote <[email protected]>
  parent: David Rowley <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: Amit Langote @ 2022-04-08 11:45 UTC (permalink / raw)
  To: David Rowley <[email protected]>; +Cc: Robert Haas <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers

Hi David,

On Fri, Apr 8, 2022 at 8:16 PM David Rowley <[email protected]> wrote:
> On Fri, 8 Apr 2022 at 17:49, Amit Langote <[email protected]> wrote:
> > Attached updated patch with these changes.
> Thanks for making the changes.  I started looking over this patch but
> really feel like it needs quite a few more iterations of what we've
> just been doing to get it into proper committable shape. There seems
> to be only about 40 mins to go before the freeze, so it seems very
> unrealistic that it could be made to work.

Yeah, totally understandable.

> I started trying to take a serious look at it this evening, but I feel
> like I just failed to get into it deep enough to make any meaningful
> improvements.  I'd need more time to study the problem before I could
> build up a proper opinion on how exactly I think it should work.
>
> Anyway. I've attached a small patch that's just a few things I
> adjusted or questions while reading over your v13 patch.  Some of
> these are just me questioning your code (See XXX comments) and some I
> think are improvements. Feel free to take the hunks that you see fit
> and drop anything you don't.

Thanks a lot for compiling those.

Most of the changes looked fine to me, except for a couple of typos, so
I've adopted them into the attached new version, even though I know it's
too late to try to apply it.  Re the XXX comments:

+ /* XXX why would pprune->rti_map[i] ever be zero here??? */

Yeah, no, it can't be; I was perhaps being overly paranoid.

+ * XXX is it worth doing a bms_copy() on glob->minLockRelids if
+ * glob->containsInitialPruning is true?. I'm slighly worried that the
+ * Bitmapset could have a very long empty tail resulting in excessive
+ * looping during AcquireExecutorLocks().
+ */

I guess I trust your instincts about bitmapset operation efficiency
and what you've written here makes sense.  It's typical for leaf
partitions to have been appended toward the tail end of rtable and I'd
imagine their indexes would be in the tail words of minLockRelids.  If
copying the bitmapset removes those useless words, I don't see why we
shouldn't do that.  So added:

+ /*
+ * It seems worth doing a bms_copy() on glob->minLockRelids if we deleted
+ * bits from it just above to prevent empty tail bits resulting in
+ * inefficient looping during AcquireExecutorLocks().
+ */
+ if (glob->containsInitialPruning)
+     glob->minLockRelids = bms_copy(glob->minLockRelids);

Not 100% sure about the comment I wrote.
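
To make the "empty tail words" concern concrete, here is a minimal standalone sketch.  It deliberately does not use PostgreSQL's Bitmapset: ToySet and every toy_* function below are hypothetical names invented for illustration.  It only shows that a scan over a word-array bitmap visits every allocated word, including trailing words left empty after members are deleted, and that a copy which drops those words shortens the scan.  Whether bms_copy() itself trims trailing words is not established in this message, so treat that part as an assumption to verify.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define TOY_BITS_PER_WORD 64

/* Hypothetical stand-in for a word-array bitmap; NOT PostgreSQL's Bitmapset. */
typedef struct ToySet
{
	int			nwords;			/* allocated words, including empty ones */
	uint64_t   *words;
} ToySet;

static ToySet *
toy_make(int nwords)
{
	ToySet	   *s = malloc(sizeof(ToySet));

	assert(s != NULL && nwords > 0);
	s->nwords = nwords;
	s->words = calloc((size_t) nwords, sizeof(uint64_t));
	assert(s->words != NULL);
	return s;
}

/* Set (on != 0) or clear (on == 0) a single bit. */
static void
toy_set_bit(ToySet *s, int bit, int on)
{
	int			w = bit / TOY_BITS_PER_WORD;
	uint64_t	mask = UINT64_C(1) << (bit % TOY_BITS_PER_WORD);

	assert(bit >= 0 && w < s->nwords);
	if (on)
		s->words[w] |= mask;
	else
		s->words[w] &= ~mask;
}

/* Count members by visiting every allocated word; report words visited. */
static int
toy_count_members(const ToySet *s, int *words_visited)
{
	int			count = 0;

	*words_visited = 0;
	for (int w = 0; w < s->nwords; w++)
	{
		uint64_t	word = s->words[w];

		(*words_visited)++;
		for (; word != 0; word &= word - 1)	/* clear lowest set bit */
			count++;
	}
	return count;
}

/* Copy that keeps only the words up to the last non-zero one. */
static ToySet *
toy_copy_trimmed(const ToySet *s)
{
	int			n = s->nwords;
	ToySet	   *copy;

	while (n > 1 && s->words[n - 1] == 0)
		n--;
	copy = toy_make(n);
	memcpy(copy->words, s->words, (size_t) n * sizeof(uint64_t));
	return copy;
}

static void
toy_free(ToySet *s)
{
	free(s->words);
	free(s);
}
```

For example, take a 16-word set holding bits 1..3, add bits 900..999, and then clear bits 900..999 again.  toy_count_members() still visits all 16 words, whereas the result of toy_copy_trimmed() has a single word to visit.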

-- 
Amit Langote
EDB: http://www.enterprisedb.com


Attachments:

  [application/octet-stream] v14-0001-Optimize-AcquireExecutorLocks-to-skip-pruned-par.patch (99.3K, 2-v14-0001-Optimize-AcquireExecutorLocks-to-skip-pruned-par.patch)
  download | inline diff:
From 552da9453f0c4896bcc8748719960db52b3ccad1 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Wed, 22 Dec 2021 16:55:17 +0900
Subject: [PATCH v14] Optimize AcquireExecutorLocks() to skip pruned partitions

---
 src/backend/commands/copyto.c           |   2 +-
 src/backend/commands/createas.c         |   2 +-
 src/backend/commands/explain.c          |   7 +-
 src/backend/commands/extension.c        |   2 +-
 src/backend/commands/matview.c          |   2 +-
 src/backend/commands/prepare.c          |  26 ++-
 src/backend/executor/README             |  27 +++
 src/backend/executor/execMain.c         |  48 +++++
 src/backend/executor/execParallel.c     |  28 ++-
 src/backend/executor/execPartition.c    | 241 ++++++++++++++++++++----
 src/backend/executor/execUtils.c        |   2 +
 src/backend/executor/functions.c        |   2 +-
 src/backend/executor/nodeAppend.c       |  15 +-
 src/backend/executor/nodeMergeAppend.c  |   9 +-
 src/backend/executor/spi.c              |  27 ++-
 src/backend/nodes/copyfuncs.c           |  33 +++-
 src/backend/nodes/outfuncs.c            |  36 +++-
 src/backend/nodes/readfuncs.c           |  56 +++++-
 src/backend/optimizer/plan/createplan.c |  24 +--
 src/backend/optimizer/plan/planner.c    |   3 +
 src/backend/optimizer/plan/setrefs.c    | 112 ++++++++---
 src/backend/partitioning/partprune.c    |  59 +++++-
 src/backend/tcop/postgres.c             |   8 +-
 src/backend/tcop/pquery.c               |  28 ++-
 src/backend/utils/cache/plancache.c     | 184 +++++++++++++++---
 src/backend/utils/mmgr/portalmem.c      |  19 ++
 src/include/commands/explain.h          |   4 +-
 src/include/executor/execPartition.h    |  12 +-
 src/include/executor/execdesc.h         |   3 +
 src/include/executor/executor.h         |   2 +
 src/include/nodes/execnodes.h           |  30 +++
 src/include/nodes/nodes.h               |   4 +
 src/include/nodes/pathnodes.h           |  15 ++
 src/include/nodes/plannodes.h           |  31 ++-
 src/include/partitioning/partprune.h    |   8 +-
 src/include/utils/plancache.h           |   3 +-
 src/include/utils/portal.h              |   3 +
 37 files changed, 950 insertions(+), 167 deletions(-)

diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index 55c38b04c4..d403eb2309 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -542,7 +542,7 @@ BeginCopyTo(ParseState *pstate,
 		((DR_copy *) dest)->cstate = cstate;
 
 		/* Create a QueryDesc requesting no output */
-		cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+		cstate->queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
 											dest, NULL, NULL, 0);
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 9abbb6b555..f6607f2454 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -325,7 +325,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
 		UpdateActiveSnapshotCommandId();
 
 		/* Create a QueryDesc, redirecting output to our tuple receiver */
-		queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+		queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
 									GetActiveSnapshot(), InvalidSnapshot,
 									dest, params, queryEnv, 0);
 
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index d2a2479822..35dd24adf8 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -407,7 +407,7 @@ ExplainOneQuery(Query *query, int cursorOptions,
 		}
 
 		/* run it (if needed) and produce output */
-		ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+		ExplainOnePlan(plan, NULL, into, es, queryString, params, queryEnv,
 					   &planduration, (es->buffers ? &bufusage : NULL));
 	}
 }
@@ -515,7 +515,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
  * to call it.
  */
 void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, PartitionPruneResult *part_prune_result,
+			   IntoClause *into, ExplainState *es,
 			   const char *queryString, ParamListInfo params,
 			   QueryEnvironment *queryEnv, const instr_time *planduration,
 			   const BufferUsage *bufusage)
@@ -563,7 +564,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
 		dest = None_Receiver;
 
 	/* Create a QueryDesc for the query */
-	queryDesc = CreateQueryDesc(plannedstmt, queryString,
+	queryDesc = CreateQueryDesc(plannedstmt, part_prune_result, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, params, queryEnv, instrument_option);
 
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index 1013790dbb..54734a3a93 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -776,7 +776,7 @@ execute_sql_string(const char *sql)
 			{
 				QueryDesc  *qdesc;
 
-				qdesc = CreateQueryDesc(stmt,
+				qdesc = CreateQueryDesc(stmt, NULL,
 										sql,
 										GetActiveSnapshot(), NULL,
 										dest, NULL, NULL, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 9ab248d25e..2be1782bc4 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -416,7 +416,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
 	UpdateActiveSnapshotCommandId();
 
 	/* Create a QueryDesc, redirecting output to our tuple receiver */
-	queryDesc = CreateQueryDesc(plan, queryString,
+	queryDesc = CreateQueryDesc(plan, NULL, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, NULL, NULL, 0);
 
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 80738547ed..c7360712b1 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -155,6 +155,7 @@ ExecuteQuery(ParseState *pstate,
 	PreparedStatement *entry;
 	CachedPlan *cplan;
 	List	   *plan_list;
+	List	   *part_prune_result_list;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
 	Portal		portal;
@@ -193,7 +194,10 @@ ExecuteQuery(ParseState *pstate,
 									   entry->plansource->query_string);
 
 	/* Replan if needed, and increment plan refcount for portal */
-	cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL);
+	cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL,
+						  &part_prune_result_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_result_list));
 	plan_list = cplan->stmt_list;
 
 	/*
@@ -207,6 +211,9 @@ ExecuteQuery(ParseState *pstate,
 					  plan_list,
 					  cplan);
 
+	/* Copy PartitionPruneResults into the portal's context. */
+	PortalStorePartitionPruneResults(portal, part_prune_result_list);
+
 	/*
 	 * For CREATE TABLE ... AS EXECUTE, we must verify that the prepared
 	 * statement is one that produces tuples.  Currently we insist that it be
@@ -576,7 +583,9 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	const char *query_string;
 	CachedPlan *cplan;
 	List	   *plan_list;
-	ListCell   *p;
+	List	   *part_prune_result_list;
+	ListCell   *p,
+			   *pp;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
 	instr_time	planstart;
@@ -619,7 +628,10 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 
 	/* Replan if needed, and acquire a transient refcount */
 	cplan = GetCachedPlan(entry->plansource, paramLI,
-						  CurrentResourceOwner, queryEnv);
+						  CurrentResourceOwner, queryEnv,
+						  &part_prune_result_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_result_list));
 
 	INSTR_TIME_SET_CURRENT(planduration);
 	INSTR_TIME_SUBTRACT(planduration, planstart);
@@ -634,13 +646,15 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	plan_list = cplan->stmt_list;
 
 	/* Explain each query */
-	foreach(p, plan_list)
+	forboth(p, plan_list, pp, part_prune_result_list)
 	{
 		PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
+		PartitionPruneResult *part_prune_result = lfirst_node(PartitionPruneResult, pp);
 
 		if (pstmt->commandType != CMD_UTILITY)
-			ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv,
-						   &planduration, (es->buffers ? &bufusage : NULL));
+			ExplainOnePlan(pstmt, part_prune_result, into, es, query_string,
+						   paramLI, queryEnv, &planduration,
+						   (es->buffers ? &bufusage : NULL));
 		else
 			ExplainOneUtility(pstmt->utilityStmt, into, es, query_string,
 							  paramLI, queryEnv);
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 0b5183fc4a..e0802be723 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -65,6 +65,29 @@ found there.  This currently only occurs for Append and MergeAppend nodes.  In
 this case the non-required subplans are ignored and the executor state's
 subnode array will become out of sequence to the plan's subplan list.
 
+Actually, so-called execution-time pruning may also occur before the
+execution has started.  One case where that occurs is when a cached generic
+plan is being validated for execution by plancache.c: GetCachedPlan(), which
+proceeds by locking all the relations that will be scanned by that plan.  If
+the generic plan contains nodes that can perform execution time partition
+pruning (that is, contain a PartitionPruneInfo), a subset of pruning steps
+contained in the PartitionPruneInfos that do not depend on execution actually
+having started (called "initial" pruning steps) are performed at this point
+to figure out the minimal set of child subplans that satisfy those pruning
+instructions.  AcquireExecutorLocks() looking at a particular plan will then
+lock only the relations scanned by those surviving subplans (along with those
+present in PlannedStmt.minLockRelids), and ignore those scanned by the pruned
+subplans, even though the pruned subplans themselves are not removed from the
+plan tree.  The result of pruning (that is, the set of indexes of surviving
+subplans in their parent's list of child subplans) is saved as a list of
+bitmapsets, with one element for every PartitionPruneInfo referenced in the
+plan (PlannedStmt.partPruneInfos).  The list is packaged into a
+PartitionPruneResult node, which is passed along with the PlannedStmt to the
+executor via the QueryDesc.  It is imperative that the executor and any third
+party code invoked by it that gets passed the plan tree look at the plan's
+PartitionPruneResult to determine whether a particular child subplan of a
+parent node that supports pruning is valid for a given execution.
+
 Each Plan node may have expression trees associated with it, to represent
 its target list, qualification conditions, etc.  These trees are also
 read-only to the executor, but the executor state for expression evaluation
@@ -286,6 +309,10 @@ Query Processing Control Flow
 
 This is a sketch of control flow for full query processing:
 
+  [ ExecutorDoInitialPruning ] --- an optional step to perform initial
+		partition pruning on the plan tree, the result of which is passed
+		to the executor via QueryDesc
+
 	CreateQueryDesc
 
 	ExecutorStart
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index ef2fd46092..5ee978937d 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -49,11 +49,13 @@
 #include "commands/matview.h"
 #include "commands/trigger.h"
 #include "executor/execdebug.h"
+#include "executor/execPartition.h"
 #include "executor/nodeSubplan.h"
 #include "foreign/fdwapi.h"
 #include "jit/jit.h"
 #include "mb/pg_wchar.h"
 #include "miscadmin.h"
+#include "nodes/nodeFuncs.h"
 #include "parser/parsetree.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
@@ -104,6 +106,49 @@ static void EvalPlanQualStart(EPQState *epqstate, Plan *planTree);
 
 /* end of local decls */
 
+/* ----------------------------------------------------------------
+ *		ExecutorDoInitialPruning
+ *
+ *		Performs initial partition pruning to figure out the minimal set of
+ *		subplans to be executed and the set of RT indexes of the corresponding
+ *		leaf partitions
+ *
+ * Returned PartitionPruneResult must be subsequently passed to the executor
+ * so that it can reuse the result of pruning.  It's important that the
+ * executor has the same view of which partitions are initially pruned (by
+ * not doing the pruning again itself), or otherwise it risks initializing
+ * subplans whose partitions would not have been locked.
+ *
+ * Note: Partitioned tables mentioned in PartitionedRelPruneInfo nodes that
+ * drive the pruning will be locked before doing the pruning.
+ *
+ * ----------------------------------------------------------------
+ */
+PartitionPruneResult *
+ExecutorDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params)
+{
+	PartitionPruneResult *result;
+	ListCell *lc;
+
+	/* Only get here if there is any pruning to do. */
+	Assert(plannedstmt->containsInitialPruning);
+
+	result = makeNode(PartitionPruneResult);
+	foreach(lc, plannedstmt->partPruneInfos)
+	{
+		PartitionPruneInfo *pruneinfo = lfirst(lc);
+		Bitmapset *valid_subplan_offs;
+
+		valid_subplan_offs =
+			ExecPartitionDoInitialPruning(plannedstmt, params, pruneinfo,
+										  &result->scan_leafpart_rtis);
+		result->valid_subplan_offs_list =
+			lappend(result->valid_subplan_offs_list,
+					valid_subplan_offs);
+	}
+
+	return result;
+}
 
 /* ----------------------------------------------------------------
  *		ExecutorStart
@@ -806,6 +851,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 {
 	CmdType		operation = queryDesc->operation;
 	PlannedStmt *plannedstmt = queryDesc->plannedstmt;
+	PartitionPruneResult *part_prune_result = queryDesc->part_prune_result;
 	Plan	   *plan = plannedstmt->planTree;
 	List	   *rangeTable = plannedstmt->rtable;
 	EState	   *estate = queryDesc->estate;
@@ -825,6 +871,8 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 	ExecInitRangeTable(estate, rangeTable);
 
 	estate->es_plannedstmt = plannedstmt;
+	estate->es_part_prune_infos = plannedstmt->partPruneInfos;
+	estate->es_part_prune_result = part_prune_result;
 
 	/*
 	 * Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index 9a0d5d59ef..805f86c503 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -66,6 +66,7 @@
 #define PARALLEL_KEY_QUERY_TEXT		UINT64CONST(0xE000000000000008)
 #define PARALLEL_KEY_JIT_INSTRUMENTATION UINT64CONST(0xE000000000000009)
 #define PARALLEL_KEY_WAL_USAGE			UINT64CONST(0xE00000000000000A)
+#define PARALLEL_KEY_PARTITIONPRUNERESULT	UINT64CONST(0xE00000000000000B)
 
 #define PARALLEL_TUPLE_QUEUE_SIZE		65536
 
@@ -182,7 +183,9 @@ ExecSerializePlan(Plan *plan, EState *estate)
 	pstmt->transientPlan = false;
 	pstmt->dependsOnRole = false;
 	pstmt->parallelModeNeeded = false;
+	pstmt->containsInitialPruning = false;
 	pstmt->planTree = plan;
+	pstmt->partPruneInfos = estate->es_part_prune_infos;
 	pstmt->rtable = estate->es_range_table;
 	pstmt->resultRelations = NIL;
 	pstmt->appendRelations = NIL;
@@ -596,12 +599,15 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	FixedParallelExecutorState *fpes;
 	char	   *pstmt_data;
 	char	   *pstmt_space;
+	char	   *part_prune_result_data;
+	char	   *part_prune_result_space;
 	char	   *paramlistinfo_space;
 	BufferUsage *bufusage_space;
 	WalUsage   *walusage_space;
 	SharedExecutorInstrumentation *instrumentation = NULL;
 	SharedJitInstrumentation *jit_instrumentation = NULL;
 	int			pstmt_len;
+	int			part_prune_result_len;
 	int			paramlistinfo_len;
 	int			instrumentation_len = 0;
 	int			jit_instrumentation_len = 0;
@@ -630,6 +636,7 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 
 	/* Fix up and serialize plan to be sent to workers. */
 	pstmt_data = ExecSerializePlan(planstate->plan, estate);
+	part_prune_result_data = nodeToString(estate->es_part_prune_result);
 
 	/* Create a parallel context. */
 	pcxt = CreateParallelContext("postgres", "ParallelQueryMain", nworkers);
@@ -656,6 +663,11 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	shm_toc_estimate_chunk(&pcxt->estimator, pstmt_len);
 	shm_toc_estimate_keys(&pcxt->estimator, 1);
 
+	/* Estimate space for serialized PartitionPruneResult. */
+	part_prune_result_len = strlen(part_prune_result_data) + 1;
+	shm_toc_estimate_chunk(&pcxt->estimator, part_prune_result_len);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
 	/* Estimate space for serialized ParamListInfo. */
 	paramlistinfo_len = EstimateParamListSpace(estate->es_param_list_info);
 	shm_toc_estimate_chunk(&pcxt->estimator, paramlistinfo_len);
@@ -750,6 +762,12 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	memcpy(pstmt_space, pstmt_data, pstmt_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PLANNEDSTMT, pstmt_space);
 
+	/* Store serialized PartitionPruneResult */
+	part_prune_result_space = shm_toc_allocate(pcxt->toc, part_prune_result_len);
+	memcpy(part_prune_result_space, part_prune_result_data, part_prune_result_len);
+	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARTITIONPRUNERESULT,
+				   part_prune_result_space);
+
 	/* Store serialized ParamListInfo. */
 	paramlistinfo_space = shm_toc_allocate(pcxt->toc, paramlistinfo_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARAMLISTINFO, paramlistinfo_space);
@@ -1231,8 +1249,10 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 						 int instrument_options)
 {
 	char	   *pstmtspace;
+	char	   *part_prune_result_space;
 	char	   *paramspace;
 	PlannedStmt *pstmt;
+	PartitionPruneResult *part_prune_result;
 	ParamListInfo paramLI;
 	char	   *queryString;
 
@@ -1243,12 +1263,18 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 	pstmtspace = shm_toc_lookup(toc, PARALLEL_KEY_PLANNEDSTMT, false);
 	pstmt = (PlannedStmt *) stringToNode(pstmtspace);
 
+	/* Reconstruct leader-supplied PartitionPruneResult. */
+	part_prune_result_space =
+		shm_toc_lookup(toc, PARALLEL_KEY_PARTITIONPRUNERESULT, false);
+	part_prune_result = (PartitionPruneResult *)
+		stringToNode(part_prune_result_space);
+
 	/* Reconstruct ParamListInfo. */
 	paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
 	paramLI = RestoreParamList(&paramspace);
 
 	/* Create a QueryDesc for the query. */
-	return CreateQueryDesc(pstmt,
+	return CreateQueryDesc(pstmt, part_prune_result,
 						   queryString,
 						   GetActiveSnapshot(), InvalidSnapshot,
 						   receiver, paramLI, NULL, instrument_options);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 615bd80973..af87b9197f 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -25,6 +25,7 @@
 #include "mb/pg_wchar.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
+#include "parser/parsetree.h"
 #include "partitioning/partbounds.h"
 #include "partitioning/partdesc.h"
 #include "partitioning/partprune.h"
@@ -185,7 +186,11 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
 static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
 static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
 static PartitionPruneState *CreatePartitionPruneState(PlanState *planstate,
-													  PartitionPruneInfo *pruneinfo);
+													  PartitionPruneInfo *pruneinfo,
+													  bool consider_initial_steps,
+													  bool consider_exec_steps,
+													  List *rtable, ExprContext *econtext,
+													  PartitionDirectory partdir);
 static void InitPartitionPruneContext(PartitionPruneContext *context,
 									  List *pruning_steps,
 									  PartitionDesc partdesc,
@@ -198,7 +203,8 @@ static void PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
 static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
 										   PartitionedRelPruningData *pprune,
 										   bool initial_prune,
-										   Bitmapset **validsubplans);
+										   Bitmapset **validsubplans,
+										   Bitmapset **scan_leafpart_rtis);
 
 
 /*
@@ -1587,8 +1593,10 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  * considered to be a stable expression, it can change value from one plan
  * node scan to the next during query execution.  Stable comparison
  * expressions that don't involve such Params allow partition pruning to be
- * done once during executor startup.  Expressions that do involve such Params
- * require us to prune separately for each scan of the parent plan node.
+ * done once during executor startup or during ExecutorDoInitialPruning(),
+ * which runs as part of AcquireExecutorLocks() on a given plan tree.
+ * Expressions that do involve such Params require us to prune separately for
+ * each scan of the parent plan node.
  *
  * Note that pruning away unneeded subplans during executor startup has the
  * added benefit of not having to initialize the unneeded subplans at all.
@@ -1605,6 +1613,13 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *		account for initial pruning possibly having eliminated some of the
  *		subplans.
  *
+ * ExecPartitionDoInitialPruning:
+ *		Do initial pruning with the information contained in a given
+ *		PartitionPruneInfo to determine the minimal set of child subplans
+ *		of the parent plan node (the node to which the PartitionPruneInfo
+ *		belongs) that must be executed, and also the set of RT indexes of
+ *		leaf partitions that will be scanned by those subplans.
+ *
  * ExecFindMatchingSubPlans:
  *		Returns indexes of matching subplans after evaluating the expressions
  *		that are safe to evaluate at a given point.  This function is first
@@ -1622,8 +1637,9 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *
  * On return, *initially_valid_subplans is assigned the set of indexes of
  * child subplans that must be initialized along with the parent plan node.
- * Initial pruning is performed here if needed and in that case only the
- * surviving subplans' indexes are added.
+ * Initial pruning is performed here if needed (unless it has already been
+ * done by ExecutorDoInitialPruning()), and in that case only the surviving
+ * subplans' indexes are added.
  *
  * If subplans are indeed pruned, subplan_map arrays contained in the returned
  * PartitionPruneState are re-sequenced to not count those, though only if the
@@ -1632,29 +1648,66 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
 PartitionPruneState *
 ExecInitPartitionPruning(PlanState *planstate,
 						 int n_total_subplans,
-						 PartitionPruneInfo *pruneinfo,
+						 int part_prune_index,
 						 Bitmapset **initially_valid_subplans)
 {
-	PartitionPruneState *prunestate;
 	EState	   *estate = planstate->state;
+	PartitionPruneInfo  *pruneinfo = list_nth(estate->es_part_prune_infos,
+											  part_prune_index);
+	PartitionPruneResult *pruneresult = estate->es_part_prune_result;
+	PartitionPruneState *prunestate;
+	bool	do_pruning = (pruneinfo->needs_init_pruning ||
+						  pruneinfo->needs_exec_pruning);
 
-	/* We may need an expression context to evaluate partition exprs */
-	ExecAssignExprContext(estate, planstate);
+	/*
+	 * No need to do initial pruning if it was done already by
+	 * ExecutorDoInitialPruning(), which it would be if es_part_prune_result
+	 * has been set.
+	 */
+	if (pruneresult)
+		do_pruning = pruneinfo->needs_exec_pruning;
+
+	prunestate = NULL;
+	if (do_pruning)
+	{
+		/* We may need an expression context to evaluate partition exprs */
+		ExecAssignExprContext(estate, planstate);
 
-	/* Create the working data structure for pruning */
-	prunestate = CreatePartitionPruneState(planstate, pruneinfo);
+		/* For data reading, executor always omits detached partitions */
+		if (estate->es_partition_directory == NULL)
+			estate->es_partition_directory =
+				CreatePartitionDirectory(estate->es_query_cxt, false);
+
+		/*
+		 * Create the working data structure for pruning.  No need to consider
+		 * initial pruning steps if we have a PartitionPruneResult.
+		 */
+		prunestate = CreatePartitionPruneState(planstate, pruneinfo,
+											   pruneresult == NULL, true,
+											   NIL, planstate->ps_ExprContext,
+											   estate->es_partition_directory);
+	}
 
 	/*
 	 * Perform an initial partition prune pass, if required.
 	 */
-	if (prunestate->do_initial_prune)
-		*initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true);
+	if (pruneresult)
+	{
+		*initially_valid_subplans =
+			list_nth(pruneresult->valid_subplan_offs_list, part_prune_index);
+	}
+	else if (prunestate && prunestate->do_initial_prune)
+	{
+		*initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true,
+															 NULL);
+	}
 	else
 	{
 		/* No pruning, so we'll need to initialize all subplans */
 		Assert(n_total_subplans > 0);
 		*initially_valid_subplans = bms_add_range(NULL, 0,
 												  n_total_subplans - 1);
+		return prunestate;
 	}
 
 	/*
@@ -1662,7 +1715,8 @@ ExecInitPartitionPruning(PlanState *planstate,
 	 * that were removed above due to initial pruning.  No need to do this if
 	 * no steps were removed.
 	 */
-	if (bms_num_members(*initially_valid_subplans) < n_total_subplans)
+	if (prunestate &&
+		bms_num_members(*initially_valid_subplans) < n_total_subplans)
 	{
 		/*
 		 * We can safely skip this when !do_exec_prune, even though that
@@ -1678,11 +1732,74 @@ ExecInitPartitionPruning(PlanState *planstate,
 	return prunestate;
 }
 
+/*
+ * ExecPartitionDoInitialPruning
+ *		Perform initial pruning using given PartitionPruneInfo to determine
+ *		the minimal set of child subplans to be executed of the parent plan
+ *		node to which the PartitionPruneInfo belongs and also the set of RT
+ *		indexes of leaf partitions that will be scanned with those subplans.
+ */
+Bitmapset *
+ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
+							  PartitionPruneInfo *pruneinfo,
+							  Bitmapset **scan_leafpart_rtis)
+{
+	List		 *rtable = plannedstmt->rtable;
+	ExprContext	 *econtext;
+	PartitionDirectory pdir;
+	MemoryContext oldcontext,
+				  tmpcontext;
+	PartitionPruneState *prunestate;
+	Bitmapset	 *valid_subplan_offs;
+
+	/*
+	 * A temporary context for memory allocations required while executing
+	 * partition pruning steps.
+	 */
+	tmpcontext = AllocSetContextCreate(CurrentMemoryContext,
+									   "initial pruning working data",
+									   ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(tmpcontext);
+
+	/*
+	 * PartitionDirectory to look up partition descriptors, which omits
+	 * detached partitions, just like in the executor proper.
+	 */
+	pdir = CreatePartitionDirectory(CurrentMemoryContext, false);
+
+	/*
+	 * We don't yet have a PlanState for the parent plan node, so we must
+	 * create a standalone ExprContext to evaluate pruning expressions,
+	 * equipped with the information about the EXTERN parameters that the
+	 * caller passed us.  Note that that's okay because the initial pruning
+	 * steps do not contain anything that requires the execution to have
+	 * started.
+	 */
+	econtext = CreateStandaloneExprContext();
+	econtext->ecxt_param_list_info = params;
+	prunestate = CreatePartitionPruneState(NULL, pruneinfo, true, false,
+										   rtable, econtext, pdir);
+	MemoryContextSwitchTo(oldcontext);
+
+	/* Do the initial pruning. */
+	valid_subplan_offs = ExecFindMatchingSubPlans(prunestate, true,
+												  scan_leafpart_rtis);
+
+	FreeExprContext(econtext, true);
+	DestroyPartitionDirectory(pdir);
+	MemoryContextDelete(tmpcontext);
+
+	return valid_subplan_offs;
+}
+
 /*
  * CreatePartitionPruneState
  *		Build the data structure required for calling ExecFindMatchingSubPlans
  *
- * 'planstate' is the parent plan node's execution state.
+ * 'planstate', if not NULL, is the parent plan node's execution state.  It
+ * can be NULL if being called before ExecutorStart(), in which case,
+ * 'rtable' (range table), 'econtext', and 'partdir' must be explicitly
+ * provided.
  *
  * 'pruneinfo' is a PartitionPruneInfo as generated by
  * make_partition_pruneinfo.  Here we build a PartitionPruneState containing a
@@ -1696,19 +1813,21 @@ ExecInitPartitionPruning(PlanState *planstate,
  * PartitionedRelPruneInfo.
  */
 static PartitionPruneState *
-CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
+CreatePartitionPruneState(PlanState *planstate,
+						  PartitionPruneInfo *pruneinfo,
+						  bool consider_initial_steps,
+						  bool consider_exec_steps,
+						  List *rtable, ExprContext *econtext,
+						  PartitionDirectory partdir)
 {
-	EState	   *estate = planstate->state;
+	EState	   *estate = planstate ? planstate->state : NULL;
 	PartitionPruneState *prunestate;
 	int			n_part_hierarchies;
 	ListCell   *lc;
 	int			i;
-	ExprContext *econtext = planstate->ps_ExprContext;
 
-	/* For data reading, executor always omits detached partitions */
-	if (estate->es_partition_directory == NULL)
-		estate->es_partition_directory =
-			CreatePartitionDirectory(estate->es_query_cxt, false);
+	Assert((estate != NULL) ||
+			(partdir != NULL && econtext != NULL && rtable != NIL));
 
 	n_part_hierarchies = list_length(pruneinfo->prune_infos);
 	Assert(n_part_hierarchies > 0);
@@ -1763,15 +1882,42 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 			PartitionKey partkey;
 
 			/*
-			 * We can rely on the copies of the partitioned table's partition
-			 * key and partition descriptor appearing in its relcache entry,
-			 * because that entry will be held open and locked for the
-			 * duration of this executor run.
+			 * Must open the relation by ourselves when called before the
+			 * execution has started, such as when called during
+			 * ExecutorDoInitialPruning() on a cached plan.  In that case,
+			 * sub-partitions must be locked, because AcquirePlannerLocks()
+			 * would not have seen them. (1st relation in a partrelpruneinfos
+			 * list is always the root partitioned table appearing in the
+			 * query, which AcquirePlannerLocks() would have locked; the
+			 * Assert in relation_open() guards that assumption.)
+			 */
+			if (estate == NULL)
+			{
+				RangeTblEntry *rte = rt_fetch(pinfo->rtindex, rtable);
+				int		lockmode = (j == 0) ? NoLock : rte->rellockmode;
+
+				partrel = table_open(rte->relid, lockmode);
+			}
+			else
+				partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
+
+			/*
+			 * We can rely on the copy of the partitioned table's partition
+			 * key from its relcache entry, because it can't change (or
+			 * get destroyed) as long as the relation is locked.  Partition
+			 * descriptor is taken from the PartitionDirectory associated with
+			 * the table that is held open long enough for the descriptor to
+			 * remain valid while it's used to perform the pruning steps.
 			 */
-			partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
 			partkey = RelationGetPartitionKey(partrel);
-			partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
-												partrel);
+			partdesc = PartitionDirectoryLookup(partdir, partrel);
+
+			/*
+			 * Must close partrel, keeping the lock taken, if we're not using
+			 * EState's entry.
+			 */
+			if (estate == NULL)
+				table_close(partrel, NoLock);
 
 			/*
 			 * Initialize the subplan_map and subpart_map.
@@ -1785,6 +1931,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 			Assert(partdesc->nparts >= pinfo->nparts);
 			pprune->nparts = partdesc->nparts;
 			pprune->subplan_map = palloc(sizeof(int) * partdesc->nparts);
+			pprune->rti_map = palloc(sizeof(Index) * partdesc->nparts);
 			if (partdesc->nparts == pinfo->nparts)
 			{
 				/*
@@ -1795,6 +1942,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 				pprune->subpart_map = pinfo->subpart_map;
 				memcpy(pprune->subplan_map, pinfo->subplan_map,
 					   sizeof(int) * pinfo->nparts);
+				memcpy(pprune->rti_map, pinfo->rti_map,
+					   sizeof(Index) * pinfo->nparts);
 
 				/*
 				 * Double-check that the list of unpruned relations has not
@@ -1845,6 +1994,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 							pinfo->subplan_map[pd_idx];
 						pprune->subpart_map[pp_idx] =
 							pinfo->subpart_map[pd_idx];
+						pprune->rti_map[pp_idx] =
+							pinfo->rti_map[pd_idx];
 						pd_idx++;
 					}
 					else
@@ -1852,6 +2003,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 						/* this partdesc entry is not in the plan */
 						pprune->subplan_map[pp_idx] = -1;
 						pprune->subpart_map[pp_idx] = -1;
+						pprune->rti_map[pp_idx] = 0;
 					}
 				}
 
@@ -1873,7 +2025,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 			 * Initialize pruning contexts as needed.
 			 */
 			pprune->initial_pruning_steps = pinfo->initial_pruning_steps;
-			if (pinfo->initial_pruning_steps)
+			if (consider_initial_steps && pinfo->initial_pruning_steps)
 			{
 				InitPartitionPruneContext(&pprune->initial_context,
 										  pinfo->initial_pruning_steps,
@@ -1883,7 +2035,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 				prunestate->do_initial_prune = true;
 			}
 			pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
-			if (pinfo->exec_pruning_steps)
+			if (consider_exec_steps && pinfo->exec_pruning_steps)
 			{
 				InitPartitionPruneContext(&pprune->exec_context,
 										  pinfo->exec_pruning_steps,
@@ -2111,10 +2263,14 @@ PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
  * Pass initial_prune if PARAM_EXEC Params cannot yet be evaluated.  This
  * differentiates the initial executor-time pruning step from later
  * runtime pruning.
+ *
+ * RT indexes of leaf partitions scanned by the chosen subplans are added to
+ * *scan_leafpart_rtis if the pointer is non-NULL.
  */
 Bitmapset *
 ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
-						 bool initial_prune)
+						 bool initial_prune,
+						 Bitmapset **scan_leafpart_rtis)
 {
 	Bitmapset  *result = NULL;
 	MemoryContext oldcontext;
@@ -2149,7 +2305,7 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
 		 */
 		pprune = &prunedata->partrelprunedata[0];
 		find_matching_subplans_recurse(prunedata, pprune, initial_prune,
-									   &result);
+									   &result, scan_leafpart_rtis);
 
 		/* Expression eval may have used space in ExprContext too */
 		if (pprune->exec_pruning_steps)
@@ -2163,6 +2319,8 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
 
 	/* Copy result out of the temp context before we reset it */
 	result = bms_copy(result);
+	if (scan_leafpart_rtis)
+		*scan_leafpart_rtis = bms_copy(*scan_leafpart_rtis);
 
 	MemoryContextReset(prunestate->prune_context);
 
@@ -2173,13 +2331,15 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
  * find_matching_subplans_recurse
  *		Recursive worker function for ExecFindMatchingSubPlans
  *
- * Adds valid (non-prunable) subplan IDs to *validsubplans
+ * Adds valid (non-prunable) subplan IDs to *validsubplans and RT indexes of
+ * the corresponding leaf partitions to *scan_leafpart_rtis (if asked for).
  */
 static void
 find_matching_subplans_recurse(PartitionPruningData *prunedata,
 							   PartitionedRelPruningData *pprune,
 							   bool initial_prune,
-							   Bitmapset **validsubplans)
+							   Bitmapset **validsubplans,
+							   Bitmapset **scan_leafpart_rtis)
 {
 	Bitmapset  *partset;
 	int			i;
@@ -2206,8 +2366,14 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
 	while ((i = bms_next_member(partset, i)) >= 0)
 	{
 		if (pprune->subplan_map[i] >= 0)
+		{
 			*validsubplans = bms_add_member(*validsubplans,
 											pprune->subplan_map[i]);
+			Assert(pprune->rti_map[i] > 0);
+			if (scan_leafpart_rtis)
+				*scan_leafpart_rtis = bms_add_member(*scan_leafpart_rtis,
+													 pprune->rti_map[i]);
+		}
 		else
 		{
 			int			partidx = pprune->subpart_map[i];
@@ -2215,7 +2381,8 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
 			if (partidx >= 0)
 				find_matching_subplans_recurse(prunedata,
 											   &prunedata->partrelprunedata[partidx],
-											   initial_prune, validsubplans);
+											   initial_prune, validsubplans,
+											   scan_leafpart_rtis);
 			else
 			{
 				/*
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 9df1f81ea8..f9c7976ff2 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -119,6 +119,8 @@ CreateExecutorState(void)
 	estate->es_relations = NULL;
 	estate->es_rowmarks = NULL;
 	estate->es_plannedstmt = NULL;
+	estate->es_part_prune_infos = NIL;
+	estate->es_part_prune_result = NULL;
 
 	estate->es_junkFilter = NULL;
 
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index f9460ae506..a2182a6b1f 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -844,7 +844,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
 	else
 		dest = None_Receiver;
 
-	es->qd = CreateQueryDesc(es->stmt,
+	es->qd = CreateQueryDesc(es->stmt, NULL,
 							 fcache->src,
 							 GetActiveSnapshot(),
 							 InvalidSnapshot,
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 357e10a1d7..96880e122a 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -134,7 +134,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 	appendstate->as_begun = false;
 
 	/* If run-time partition pruning is enabled, then set that up now */
-	if (node->part_prune_info != NULL)
+	if (node->part_prune_index >= 0)
 	{
 		PartitionPruneState *prunestate;
 
@@ -145,7 +145,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 		 */
 		prunestate = ExecInitPartitionPruning(&appendstate->ps,
 											  list_length(node->appendplans),
-											  node->part_prune_info,
+											  node->part_prune_index,
 											  &validsubplans);
 		appendstate->as_prune_state = prunestate;
 		nplans = bms_num_members(validsubplans);
@@ -155,7 +155,8 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 		 * subplan, we can fill as_valid_subplans immediately, preventing
 		 * later calls to ExecFindMatchingSubPlans.
 		 */
-		if (!prunestate->do_exec_prune && nplans > 0)
+		if (appendstate->as_prune_state == NULL ||
+			(!appendstate->as_prune_state->do_exec_prune && nplans > 0))
 			appendstate->as_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
 	}
 	else
@@ -577,7 +578,7 @@ choose_next_subplan_locally(AppendState *node)
 		}
 		else if (node->as_valid_subplans == NULL)
 			node->as_valid_subplans =
-				ExecFindMatchingSubPlans(node->as_prune_state, false);
+				ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 
 		whichplan = -1;
 	}
@@ -642,7 +643,7 @@ choose_next_subplan_for_leader(AppendState *node)
 		if (node->as_valid_subplans == NULL)
 		{
 			node->as_valid_subplans =
-				ExecFindMatchingSubPlans(node->as_prune_state, false);
+				ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 
 			/*
 			 * Mark each invalid plan as finished to allow the loop below to
@@ -717,7 +718,7 @@ choose_next_subplan_for_worker(AppendState *node)
 	else if (node->as_valid_subplans == NULL)
 	{
 		node->as_valid_subplans =
-			ExecFindMatchingSubPlans(node->as_prune_state, false);
+			ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 		mark_invalid_subplans_as_finished(node);
 	}
 
@@ -868,7 +869,7 @@ ExecAppendAsyncBegin(AppendState *node)
 	if (node->as_valid_subplans == NULL)
 	{
 		node->as_valid_subplans =
-			ExecFindMatchingSubPlans(node->as_prune_state, false);
+			ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 
 		classify_matching_subplans(node);
 	}
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index ecf9052e03..7708cfffda 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -82,7 +82,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
 	mergestate->ps.ExecProcNode = ExecMergeAppend;
 
 	/* If run-time partition pruning is enabled, then set that up now */
-	if (node->part_prune_info != NULL)
+	if (node->part_prune_index >= 0)
 	{
 		PartitionPruneState *prunestate;
 
@@ -93,7 +93,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
 		 */
 		prunestate = ExecInitPartitionPruning(&mergestate->ps,
 											  list_length(node->mergeplans),
-											  node->part_prune_info,
+											  node->part_prune_index,
 											  &validsubplans);
 		mergestate->ms_prune_state = prunestate;
 		nplans = bms_num_members(validsubplans);
@@ -103,7 +103,8 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
 		 * subplan, we can fill as_valid_subplans immediately, preventing
 		 * later calls to ExecFindMatchingSubPlans.
 		 */
-		if (!prunestate->do_exec_prune && nplans > 0)
+		if (mergestate->ms_prune_state == NULL ||
+			(!mergestate->ms_prune_state->do_exec_prune && nplans > 0))
 			mergestate->ms_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
 	}
 	else
@@ -218,7 +219,7 @@ ExecMergeAppend(PlanState *pstate)
 		 */
 		if (node->ms_valid_subplans == NULL)
 			node->ms_valid_subplans =
-				ExecFindMatchingSubPlans(node->ms_prune_state, false);
+				ExecFindMatchingSubPlans(node->ms_prune_state, false, NULL);
 
 		/*
 		 * First time through: pull the first tuple from each valid subplan,
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 042a5f8b0a..729e2fd7b2 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1578,6 +1578,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 	CachedPlanSource *plansource;
 	CachedPlan *cplan;
 	List	   *stmt_list;
+	List	   *part_prune_result_list;
 	char	   *query_string;
 	Snapshot	snapshot;
 	MemoryContext oldcontext;
@@ -1657,7 +1658,10 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 	 */
 
 	/* Replan if needed, and increment plan refcount for portal */
-	cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv);
+	cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv,
+						  &part_prune_result_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_result_list));
 	stmt_list = cplan->stmt_list;
 
 	if (!plan->saved)
@@ -1685,6 +1689,9 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 					  stmt_list,
 					  cplan);
 
+	/* Copy PartitionPruneResults into the portal's context. */
+	PortalStorePartitionPruneResults(portal, part_prune_result_list);
+
 	/*
 	 * Set up options for portal.  Default SCROLL type is chosen the same way
 	 * as PerformCursorOpen does it.
@@ -2092,7 +2099,8 @@ SPI_plan_get_cached_plan(SPIPlanPtr plan)
 	/* Get the generic plan for the query */
 	cplan = GetCachedPlan(plansource, NULL,
 						  plan->saved ? CurrentResourceOwner : NULL,
-						  _SPI_current->queryEnv);
+						  _SPI_current->queryEnv,
+						  NULL /* Not interested in PartitionPruneResults */);
 	Assert(cplan == plansource->gplan);
 
 	/* Pop the error context stack */
@@ -2473,7 +2481,9 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 	{
 		CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
 		List	   *stmt_list;
-		ListCell   *lc2;
+		List	   *part_prune_result_list;
+		ListCell   *lc2,
+				   *lc3;
 
 		spicallbackarg.query = plansource->query_string;
 
@@ -2549,8 +2559,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 		 * plan, the refcount must be backed by the plan_owner.
 		 */
 		cplan = GetCachedPlan(plansource, options->params,
-							  plan_owner, _SPI_current->queryEnv);
-
+							  plan_owner, _SPI_current->queryEnv,
+							  &part_prune_result_list);
+		Assert(list_length(cplan->stmt_list) ==
+			   list_length(part_prune_result_list));
 		stmt_list = cplan->stmt_list;
 
 		/*
@@ -2589,9 +2601,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 			}
 		}
 
-		foreach(lc2, stmt_list)
+		forboth(lc2, stmt_list, lc3, part_prune_result_list)
 		{
 			PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+			PartitionPruneResult *part_prune_result = lfirst_node(PartitionPruneResult, lc3);
 			bool		canSetTag = stmt->canSetTag;
 			DestReceiver *dest;
 
@@ -2663,7 +2676,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 				else
 					snap = InvalidSnapshot;
 
-				qdesc = CreateQueryDesc(stmt,
+				qdesc = CreateQueryDesc(stmt, part_prune_result,
 										plansource->query_string,
 										snap, crosscheck_snapshot,
 										dest,
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 836f427ea8..59a7054011 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -96,7 +96,10 @@ _copyPlannedStmt(const PlannedStmt *from)
 	COPY_SCALAR_FIELD(parallelModeNeeded);
 	COPY_SCALAR_FIELD(jitFlags);
 	COPY_NODE_FIELD(planTree);
+	COPY_NODE_FIELD(partPruneInfos);
+	COPY_SCALAR_FIELD(containsInitialPruning);
 	COPY_NODE_FIELD(rtable);
+	COPY_BITMAPSET_FIELD(minLockRelids);
 	COPY_NODE_FIELD(resultRelations);
 	COPY_NODE_FIELD(appendRelations);
 	COPY_NODE_FIELD(subplans);
@@ -253,7 +256,7 @@ _copyAppend(const Append *from)
 	COPY_NODE_FIELD(appendplans);
 	COPY_SCALAR_FIELD(nasyncplans);
 	COPY_SCALAR_FIELD(first_partial_plan);
-	COPY_NODE_FIELD(part_prune_info);
+	COPY_SCALAR_FIELD(part_prune_index);
 
 	return newnode;
 }
@@ -281,7 +284,7 @@ _copyMergeAppend(const MergeAppend *from)
 	COPY_POINTER_FIELD(sortOperators, from->numCols * sizeof(Oid));
 	COPY_POINTER_FIELD(collations, from->numCols * sizeof(Oid));
 	COPY_POINTER_FIELD(nullsFirst, from->numCols * sizeof(bool));
-	COPY_NODE_FIELD(part_prune_info);
+	COPY_SCALAR_FIELD(part_prune_index);
 
 	return newnode;
 }
@@ -1283,6 +1286,8 @@ _copyPartitionPruneInfo(const PartitionPruneInfo *from)
 	PartitionPruneInfo *newnode = makeNode(PartitionPruneInfo);
 
 	COPY_NODE_FIELD(prune_infos);
+	COPY_SCALAR_FIELD(needs_init_pruning);
+	COPY_SCALAR_FIELD(needs_exec_pruning);
 	COPY_BITMAPSET_FIELD(other_subplans);
 
 	return newnode;
@@ -1299,6 +1304,7 @@ _copyPartitionedRelPruneInfo(const PartitionedRelPruneInfo *from)
 	COPY_POINTER_FIELD(subplan_map, from->nparts * sizeof(int));
 	COPY_POINTER_FIELD(subpart_map, from->nparts * sizeof(int));
 	COPY_POINTER_FIELD(relid_map, from->nparts * sizeof(Oid));
+	COPY_POINTER_FIELD(rti_map, from->nparts * sizeof(Index));
 	COPY_NODE_FIELD(initial_pruning_steps);
 	COPY_NODE_FIELD(exec_pruning_steps);
 	COPY_BITMAPSET_FIELD(execparamids);
@@ -5473,6 +5479,21 @@ _copyExtensibleNode(const ExtensibleNode *from)
 	return newnode;
 }
 
+/* ****************************************************************
+ *					execnodes.h copy functions
+ * ****************************************************************
+ */
+static PartitionPruneResult *
+_copyPartitionPruneResult(const PartitionPruneResult *from)
+{
+	PartitionPruneResult *newnode = makeNode(PartitionPruneResult);
+
+	COPY_NODE_FIELD(valid_subplan_offs_list);
+	COPY_BITMAPSET_FIELD(scan_leafpart_rtis);
+
+	return newnode;
+}
+
 /* ****************************************************************
  *					value.h copy functions
  * ****************************************************************
@@ -5527,7 +5548,6 @@ _copyBitString(const BitString *from)
 	return newnode;
 }
 
-
 static ForeignKeyCacheInfo *
 _copyForeignKeyCacheInfo(const ForeignKeyCacheInfo *from)
 {
@@ -6569,6 +6589,13 @@ copyObjectImpl(const void *from)
 			retval = _copyPublicationTable(from);
 			break;
 
+			/*
+			 * EXECUTION NODES
+			 */
+		case T_PartitionPruneResult:
+			retval = _copyPartitionPruneResult(from);
+			break;
+
 			/*
 			 * MISCELLANEOUS NODES
 			 */
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index d5f5e76c55..3dada68291 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -314,7 +314,10 @@ _outPlannedStmt(StringInfo str, const PlannedStmt *node)
 	WRITE_BOOL_FIELD(parallelModeNeeded);
 	WRITE_INT_FIELD(jitFlags);
 	WRITE_NODE_FIELD(planTree);
+	WRITE_NODE_FIELD(partPruneInfos);
+	WRITE_BOOL_FIELD(containsInitialPruning);
 	WRITE_NODE_FIELD(rtable);
+	WRITE_BITMAPSET_FIELD(minLockRelids);
 	WRITE_NODE_FIELD(resultRelations);
 	WRITE_NODE_FIELD(appendRelations);
 	WRITE_NODE_FIELD(subplans);
@@ -443,7 +446,7 @@ _outAppend(StringInfo str, const Append *node)
 	WRITE_NODE_FIELD(appendplans);
 	WRITE_INT_FIELD(nasyncplans);
 	WRITE_INT_FIELD(first_partial_plan);
-	WRITE_NODE_FIELD(part_prune_info);
+	WRITE_INT_FIELD(part_prune_index);
 }
 
 static void
@@ -460,7 +463,7 @@ _outMergeAppend(StringInfo str, const MergeAppend *node)
 	WRITE_OID_ARRAY(sortOperators, node->numCols);
 	WRITE_OID_ARRAY(collations, node->numCols);
 	WRITE_BOOL_ARRAY(nullsFirst, node->numCols);
-	WRITE_NODE_FIELD(part_prune_info);
+	WRITE_INT_FIELD(part_prune_index);
 }
 
 static void
@@ -1009,6 +1012,8 @@ _outPartitionPruneInfo(StringInfo str, const PartitionPruneInfo *node)
 	WRITE_NODE_TYPE("PARTITIONPRUNEINFO");
 
 	WRITE_NODE_FIELD(prune_infos);
+	WRITE_BOOL_FIELD(needs_init_pruning);
+	WRITE_BOOL_FIELD(needs_exec_pruning);
 	WRITE_BITMAPSET_FIELD(other_subplans);
 }
 
@@ -1023,6 +1028,7 @@ _outPartitionedRelPruneInfo(StringInfo str, const PartitionedRelPruneInfo *node)
 	WRITE_INT_ARRAY(subplan_map, node->nparts);
 	WRITE_INT_ARRAY(subpart_map, node->nparts);
 	WRITE_OID_ARRAY(relid_map, node->nparts);
+	WRITE_INDEX_ARRAY(rti_map, node->nparts);
 	WRITE_NODE_FIELD(initial_pruning_steps);
 	WRITE_NODE_FIELD(exec_pruning_steps);
 	WRITE_BITMAPSET_FIELD(execparamids);
@@ -2425,6 +2431,9 @@ _outPlannerGlobal(StringInfo str, const PlannerGlobal *node)
 	WRITE_NODE_FIELD(finalrowmarks);
 	WRITE_NODE_FIELD(resultRelations);
 	WRITE_NODE_FIELD(appendRelations);
+	WRITE_NODE_FIELD(partPruneInfos);
+	WRITE_BOOL_FIELD(containsInitialPruning);
+	WRITE_BITMAPSET_FIELD(minLockRelids);
 	WRITE_NODE_FIELD(relationOids);
 	WRITE_NODE_FIELD(invalItems);
 	WRITE_NODE_FIELD(paramExecTypes);
@@ -2492,6 +2501,7 @@ _outPlannerInfo(StringInfo str, const PlannerInfo *node)
 	WRITE_BITMAPSET_FIELD(curOuterRels);
 	WRITE_NODE_FIELD(curOuterParams);
 	WRITE_BOOL_FIELD(partColsUpdated);
+	WRITE_NODE_FIELD(partPruneInfos);
 }
 
 static void
@@ -2845,6 +2855,21 @@ _outExtensibleNode(StringInfo str, const ExtensibleNode *node)
 	methods->nodeOut(str, node);
 }
 
+/*****************************************************************************
+ *
+ *	Stuff from execnodes.h
+ *
+ *****************************************************************************/
+
+static void
+_outPartitionPruneResult(StringInfo str, const PartitionPruneResult *node)
+{
+	WRITE_NODE_TYPE("PARTITIONPRUNERESULT");
+
+	WRITE_NODE_FIELD(valid_subplan_offs_list);
+	WRITE_BITMAPSET_FIELD(scan_leafpart_rtis);
+}
+
 /*****************************************************************************
  *
  *	Stuff from parsenodes.h.
@@ -4754,6 +4779,13 @@ outNode(StringInfo str, const void *obj)
 				_outJsonTableSibling(str, obj);
 				break;
 
+				/*
+				 * EXECUTION NODES
+				 */
+			case T_PartitionPruneResult:
+				_outPartitionPruneResult(str, obj);
+				break;
+
 			default:
 
 				/*
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 3d150cb25d..6a6fcec03b 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -164,6 +164,11 @@
 	token = pg_strtok(&length);		/* skip :fldname */ \
 	local_node->fldname = readIntCols(len)
 
+/* Read an Index array */
+#define READ_INDEX_ARRAY(fldname, len) \
+	token = pg_strtok(&length);		/* skip :fldname */ \
+	local_node->fldname = readIndexCols(len)
+
 /* Read a bool array */
 #define READ_BOOL_ARRAY(fldname, len) \
 	token = pg_strtok(&length);		/* skip :fldname */ \
@@ -1815,7 +1820,10 @@ _readPlannedStmt(void)
 	READ_BOOL_FIELD(parallelModeNeeded);
 	READ_INT_FIELD(jitFlags);
 	READ_NODE_FIELD(planTree);
+	READ_NODE_FIELD(partPruneInfos);
+	READ_BOOL_FIELD(containsInitialPruning);
 	READ_NODE_FIELD(rtable);
+	READ_BITMAPSET_FIELD(minLockRelids);
 	READ_NODE_FIELD(resultRelations);
 	READ_NODE_FIELD(appendRelations);
 	READ_NODE_FIELD(subplans);
@@ -1947,7 +1955,7 @@ _readAppend(void)
 	READ_NODE_FIELD(appendplans);
 	READ_INT_FIELD(nasyncplans);
 	READ_INT_FIELD(first_partial_plan);
-	READ_NODE_FIELD(part_prune_info);
+	READ_INT_FIELD(part_prune_index);
 
 	READ_DONE();
 }
@@ -1969,7 +1977,7 @@ _readMergeAppend(void)
 	READ_OID_ARRAY(sortOperators, local_node->numCols);
 	READ_OID_ARRAY(collations, local_node->numCols);
 	READ_BOOL_ARRAY(nullsFirst, local_node->numCols);
-	READ_NODE_FIELD(part_prune_info);
+	READ_INT_FIELD(part_prune_index);
 
 	READ_DONE();
 }
@@ -2767,6 +2775,8 @@ _readPartitionPruneInfo(void)
 	READ_LOCALS(PartitionPruneInfo);
 
 	READ_NODE_FIELD(prune_infos);
+	READ_BOOL_FIELD(needs_init_pruning);
+	READ_BOOL_FIELD(needs_exec_pruning);
 	READ_BITMAPSET_FIELD(other_subplans);
 
 	READ_DONE();
@@ -2783,6 +2793,7 @@ _readPartitionedRelPruneInfo(void)
 	READ_INT_ARRAY(subplan_map, local_node->nparts);
 	READ_INT_ARRAY(subpart_map, local_node->nparts);
 	READ_OID_ARRAY(relid_map, local_node->nparts);
+	READ_INDEX_ARRAY(rti_map, local_node->nparts);
 	READ_NODE_FIELD(initial_pruning_steps);
 	READ_NODE_FIELD(exec_pruning_steps);
 	READ_BITMAPSET_FIELD(execparamids);
@@ -2936,6 +2947,21 @@ _readPartitionRangeDatum(void)
 	READ_DONE();
 }
 
+
+/*
+ * _readPartitionPruneResult
+ */
+static PartitionPruneResult *
+_readPartitionPruneResult(void)
+{
+	READ_LOCALS(PartitionPruneResult);
+
+	READ_NODE_FIELD(valid_subplan_offs_list);
+	READ_BITMAPSET_FIELD(scan_leafpart_rtis);
+
+	READ_DONE();
+}
+
 /*
  * parseNodeString
  *
@@ -3233,6 +3259,8 @@ parseNodeString(void)
 		return_value = _readJsonTableParent();
 	else if (MATCH("JSONTABSNODE", 12))
 		return_value = _readJsonTableSibling();
+	else if (MATCH("PARTITIONPRUNERESULT", 20))
+		return_value = _readPartitionPruneResult();
 	else
 	{
 		elog(ERROR, "badly formatted node string \"%.32s\"...", token);
@@ -3376,6 +3404,30 @@ readIntCols(int numCols)
 	return int_vals;
 }
 
+/*
+ * readIndexCols
+ */
+Index *
+readIndexCols(int numCols)
+{
+	int			tokenLength,
+				i;
+	const char *token;
+	Index	   *index_vals;
+
+	if (numCols <= 0)
+		return NULL;
+
+	index_vals = (Index *) palloc(numCols * sizeof(Index));
+	for (i = 0; i < numCols; i++)
+	{
+		token = pg_strtok(&tokenLength);
+		index_vals[i] = atoui(token);
+	}
+
+	return index_vals;
+}
+
 /*
  * readBoolCols
  */
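[Editorial sketch, not part of the patch.] readIndexCols() above reads numCols unsigned values, one token at a time. A standalone approximation, assuming a plain whitespace-separated string in place of pg_strtok() and malloc() in place of palloc(), might look like the following. The function name toy_read_index_cols is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Parse numCols unsigned values from a whitespace-separated string.
 * Returns NULL when numCols <= 0, as readIndexCols() does, or when
 * allocation fails.  The caller frees the result.
 */
static unsigned int *
toy_read_index_cols(const char *str, int numCols)
{
	unsigned int *vals;
	char	   *end;

	assert(str != NULL);

	if (numCols <= 0)
		return NULL;

	vals = malloc((size_t) numCols * sizeof(unsigned int));
	if (vals == NULL)
		return NULL;

	for (int i = 0; i < numCols; i++)
	{
		/* strtoul() skips leading whitespace before each number. */
		vals[i] = (unsigned int) strtoul(str, &end, 10);
		str = end;
	}

	return vals;
}
```

The explicit unsigned parse is the point of having a separate reader for Index arrays rather than reusing the int reader.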
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index 95476ada0b..fe0df2f1d1 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -1184,7 +1184,6 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
 	ListCell   *subpaths;
 	int			nasyncplans = 0;
 	RelOptInfo *rel = best_path->path.parent;
-	PartitionPruneInfo *partpruneinfo = NULL;
 	int			nodenumsortkeys = 0;
 	AttrNumber *nodeSortColIdx = NULL;
 	Oid		   *nodeSortOperators = NULL;
@@ -1335,6 +1334,9 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
 		subplans = lappend(subplans, subplan);
 	}
 
+	/* Set below if we find quals that we can use to run-time prune */
+	plan->part_prune_index = -1;
+
 	/*
 	 * If any quals exist, they may be useful to perform further partition
 	 * pruning during execution.  Gather information needed by the executor to
@@ -1358,16 +1360,14 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
 		}
 
 		if (prunequal != NIL)
-			partpruneinfo =
-				make_partition_pruneinfo(root, rel,
-										 best_path->subpaths,
-										 prunequal);
+			plan->part_prune_index = make_partition_pruneinfo(root, rel,
+															  best_path->subpaths,
+															  prunequal);
 	}
 
 	plan->appendplans = subplans;
 	plan->nasyncplans = nasyncplans;
 	plan->first_partial_plan = best_path->first_partial_path;
-	plan->part_prune_info = partpruneinfo;
 
 	copy_generic_path_info(&plan->plan, (Path *) best_path);
 
@@ -1407,7 +1407,6 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
 	List	   *subplans = NIL;
 	ListCell   *subpaths;
 	RelOptInfo *rel = best_path->path.parent;
-	PartitionPruneInfo *partpruneinfo = NULL;
 
 	/*
 	 * We don't have the actual creation of the MergeAppend node split out
@@ -1500,6 +1499,9 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
 		subplans = lappend(subplans, subplan);
 	}
 
+	/* Set below if we find quals that we can use to run-time prune */
+	node->part_prune_index = -1;
+
 	/*
 	 * If any quals exist, they may be useful to perform further partition
 	 * pruning during execution.  Gather information needed by the executor to
@@ -1523,13 +1525,13 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
 		}
 
 		if (prunequal != NIL)
-			partpruneinfo = make_partition_pruneinfo(root, rel,
-													 best_path->subpaths,
-													 prunequal);
+			node->part_prune_index = make_partition_pruneinfo(root, rel,
+															  best_path->subpaths,
+															  prunequal);
 	}
 
 	node->mergeplans = subplans;
-	node->part_prune_info = partpruneinfo;
+
 
 	/*
 	 * If prepare_sort_from_pathkeys added sort columns, but we were told to
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index b090b087e9..f425362491 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -518,7 +518,10 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 	result->dependsOnRole = glob->dependsOnRole;
 	result->parallelModeNeeded = glob->parallelModeNeeded;
 	result->planTree = top_plan;
+	result->partPruneInfos = glob->partPruneInfos;
+	result->containsInitialPruning = glob->containsInitialPruning;
 	result->rtable = glob->finalrtable;
+	result->minLockRelids = glob->minLockRelids;
 	result->resultRelations = glob->resultRelations;
 	result->appendRelations = glob->appendRelations;
 	result->subplans = glob->subplans;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 6ea3505646..94d4ff0b9d 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -261,7 +261,7 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 	Plan	   *result;
 	PlannerGlobal *glob = root->glob;
 	int			rtoffset = list_length(glob->finalrtable);
-	ListCell   *lc;
+	ListCell *lc;
 
 	/*
 	 * Add all the query's RTEs to the flattened rangetable.  The live ones
@@ -270,6 +270,16 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 	 */
 	add_rtes_to_flat_rtable(root, false);
 
+	/*
+	 * Add the query's adjusted range of RT indexes to glob->minLockRelids.
+	 * The adjusted RT indexes of prunable relations will be deleted from the
+	 * set below where PartitionPruneInfos are processed.
+	 */
+	glob->minLockRelids =
+		bms_add_range(glob->minLockRelids,
+					  rtoffset + 1,
+					  rtoffset + list_length(root->parse->rtable));
+
 	/*
 	 * Adjust RT indexes of PlanRowMarks and add to final rowmarks list
 	 */
@@ -348,6 +358,64 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 		}
 	}
 
+	/* Also fix up the information in PartitionPruneInfos. */
+	foreach(lc, root->partPruneInfos)
+	{
+		PartitionPruneInfo *pruneinfo = lfirst(lc);
+		Bitmapset *leafpart_rtis = NULL;
+		ListCell  *l;
+
+		foreach(l, pruneinfo->prune_infos)
+		{
+			List	   *prune_infos = lfirst(l);
+			ListCell   *l2;
+
+			foreach(l2, prune_infos)
+			{
+				PartitionedRelPruneInfo *pinfo = lfirst(l2);
+				int		i;
+
+				/* RT index of the partitioned table. */
+				pinfo->rtindex += rtoffset;
+
+				/* And also those of the leaf partitions. */
+				for (i = 0; i < pinfo->nparts; i++)
+				{
+					if (pinfo->rti_map[i] > 0)
+					{
+						pinfo->rti_map[i] += rtoffset;
+						leafpart_rtis = bms_add_member(leafpart_rtis,
+													   pinfo->rti_map[i]);
+					}
+				}
+			}
+		}
+
+		if (pruneinfo->needs_init_pruning)
+		{
+			glob->containsInitialPruning = true;
+
+			/*
+			 * Delete the leaf partition RTIs from the global set of relations
+			 * to be locked before executing the plan.  AcquireExecutorLocks()
+			 * will find the ones to add to the set after performing initial
+			 * pruning.
+			 */
+			glob->minLockRelids = bms_del_members(glob->minLockRelids,
+												  leafpart_rtis);
+		}
+
+		glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
+	}
+
+	/*
+	 * It seems worth doing a bms_copy() on glob->minLockRelids if we deleted
+	 * bits from it just above, to prevent empty tail bits from causing
+	 * inefficient looping during AcquireExecutorLocks().
+	 */
+	if (glob->containsInitialPruning)
+		glob->minLockRelids = bms_copy(glob->minLockRelids);
+
 	return result;
 }
 
@@ -1640,21 +1708,12 @@ set_append_references(PlannerInfo *root,
 
 	aplan->apprelids = offset_relid_set(aplan->apprelids, rtoffset);
 
-	if (aplan->part_prune_info)
-	{
-		foreach(l, aplan->part_prune_info->prune_infos)
-		{
-			List	   *prune_infos = lfirst(l);
-			ListCell   *l2;
-
-			foreach(l2, prune_infos)
-			{
-				PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
-				pinfo->rtindex += rtoffset;
-			}
-		}
-	}
+	/*
+	 * PartitionPruneInfos will be added to a list in PlannerGlobal, so update
+	 * the index.
+	 */
+	if (aplan->part_prune_index >= 0)
+		aplan->part_prune_index += list_length(root->glob->partPruneInfos);
 
 	/* We don't need to recurse to lefttree or righttree ... */
 	Assert(aplan->plan.lefttree == NULL);
@@ -1712,21 +1771,12 @@ set_mergeappend_references(PlannerInfo *root,
 
 	mplan->apprelids = offset_relid_set(mplan->apprelids, rtoffset);
 
-	if (mplan->part_prune_info)
-	{
-		foreach(l, mplan->part_prune_info->prune_infos)
-		{
-			List	   *prune_infos = lfirst(l);
-			ListCell   *l2;
-
-			foreach(l2, prune_infos)
-			{
-				PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
-				pinfo->rtindex += rtoffset;
-			}
-		}
-	}
+	/*
+	 * PartitionPruneInfos will be added to a list in PlannerGlobal, so update
+	 * the index.
+	 */
+	if (mplan->part_prune_index >= 0)
+		mplan->part_prune_index += list_length(root->glob->partPruneInfos);
 
 	/* We don't need to recurse to lefttree or righttree ... */
 	Assert(mplan->plan.lefttree == NULL);
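[Editorial sketch, not part of the patch.] The minLockRelids bookkeeping in set_plan_references() above can be summarized as: add the whole adjusted RT index range, then remove the leaf partitions that initial pruning may eliminate. The standalone model below uses a uint64_t in place of Bitmapset, and the toy_* names are hypothetical. Unlike the patch, which offsets rti_map[] in place first, this toy applies rtoffset while deleting.

```c
#include <assert.h>
#include <stdint.h>

/* Set bits lo..hi inclusive; a stand-in for bms_add_range(). */
static uint64_t
toy_add_range(uint64_t set, int lo, int hi)
{
	for (int i = lo; i <= hi; i++)
		set |= UINT64_C(1) << i;
	return set;
}

/*
 * Compute the set of RT indexes that must be locked unconditionally.
 * leaf_rtis[] holds the unadjusted RT indexes of leaf partitions that
 * are subject to pruning.
 */
static uint64_t
toy_min_lock_relids(int rtoffset, int nrtes,
					const int *leaf_rtis, int nleaf,
					int needs_init_pruning)
{
	uint64_t	set;

	assert(rtoffset >= 0 && nrtes > 0 && rtoffset + nrtes < 64);

	/* Start with every RTE of this (sub)query. */
	set = toy_add_range(0, rtoffset + 1, rtoffset + nrtes);

	/* Leaf partitions are locked later, only if they survive pruning. */
	if (needs_init_pruning)
	{
		for (int i = 0; i < nleaf; i++)
			set &= ~(UINT64_C(1) << (leaf_rtis[i] + rtoffset));
	}

	return set;
}
```

With four RTEs where RTE 1 is the partitioned parent and RTEs 2 to 4 are its leaves, only the parent remains in the set when initial pruning steps exist. Without them, all four stay.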
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 9d3c05aed3..952c5b8327 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -144,7 +144,9 @@ static List *make_partitionedrel_pruneinfo(PlannerInfo *root,
 										   List *prunequal,
 										   Bitmapset *partrelids,
 										   int *relid_subplan_map,
-										   Bitmapset **matchedsubplans);
+										   Bitmapset **matchedsubplans,
+										   bool *needs_init_pruning,
+										   bool *needs_exec_pruning);
 static void gen_partprune_steps(RelOptInfo *rel, List *clauses,
 								PartClauseTarget target,
 								GeneratePruningStepsContext *context);
@@ -209,16 +211,20 @@ static void partkey_datum_from_expr(PartitionPruneContext *context,
 
 /*
  * make_partition_pruneinfo
- *		Builds a PartitionPruneInfo which can be used in the executor to allow
- *		additional partition pruning to take place.  Returns NULL when
- *		partition pruning would be useless.
+ *		Checks if the given set of quals can be used to build pruning steps
+ *		that the executor can use to prune away unneeded partitions.  If
+ *		suitable quals are found then a PartitionPruneInfo is built and tagged
+ *		onto the PlannerInfo's partPruneInfos list.
+ *
+ * The return value is the 0-based index of the item added to the
+ * partPruneInfos list or -1 if nothing was added.
  *
  * 'parentrel' is the RelOptInfo for an appendrel, and 'subpaths' is the list
  * of scan paths for its child rels.
  * 'prunequal' is a list of potential pruning quals (i.e., restriction
  * clauses that are applicable to the appendrel).
  */
-PartitionPruneInfo *
+int
 make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 						 List *subpaths,
 						 List *prunequal)
@@ -230,6 +236,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	int		   *relid_subplan_map;
 	ListCell   *lc;
 	int			i;
+	bool		needs_init_pruning = false;
+	bool		needs_exec_pruning = false;
 
 	/*
 	 * Scan the subpaths to see which ones are scans of partition child
@@ -309,12 +317,16 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		Bitmapset  *partrelids = (Bitmapset *) lfirst(lc);
 		List	   *pinfolist;
 		Bitmapset  *matchedsubplans = NULL;
+		bool		partrel_needs_init_pruning;
+		bool		partrel_needs_exec_pruning;
 
 		pinfolist = make_partitionedrel_pruneinfo(root, parentrel,
 												  prunequal,
 												  partrelids,
 												  relid_subplan_map,
-												  &matchedsubplans);
+												  &matchedsubplans,
+												  &partrel_needs_init_pruning,
+												  &partrel_needs_exec_pruning);
 
 		/* When pruning is possible, record the matched subplans */
 		if (pinfolist != NIL)
@@ -323,6 +335,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 			allmatchedsubplans = bms_join(matchedsubplans,
 										  allmatchedsubplans);
 		}
+
+		needs_init_pruning |= partrel_needs_init_pruning;
+		needs_exec_pruning |= partrel_needs_exec_pruning;
 	}
 
 	pfree(relid_subplan_map);
@@ -332,11 +347,13 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	 * quals, then we can just not bother with run-time pruning.
 	 */
 	if (prunerelinfos == NIL)
-		return NULL;
+		return -1;
 
 	/* Else build the result data structure */
 	pruneinfo = makeNode(PartitionPruneInfo);
 	pruneinfo->prune_infos = prunerelinfos;
+	pruneinfo->needs_init_pruning = needs_init_pruning;
+	pruneinfo->needs_exec_pruning = needs_exec_pruning;
 
 	/*
 	 * Some subplans may not belong to any of the identified partitioned rels.
@@ -358,7 +375,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	else
 		pruneinfo->other_subplans = NULL;
 
-	return pruneinfo;
+	root->partPruneInfos = lappend(root->partPruneInfos, pruneinfo);
+
+	return list_length(root->partPruneInfos) - 1;
 }
 
 /*
@@ -435,13 +454,18 @@ add_part_relids(List *allpartrelids, Bitmapset *partrelids)
  * If we cannot find any useful run-time pruning steps, return NIL.
  * However, on success, each rel identified in partrelids will have
  * an element in the result list, even if some of them are useless.
+ * *needs_init_pruning and *needs_exec_pruning are set to indicate that the
+ * returned PartitionedRelPruneInfos contain pruning steps that can be
+ * performed before and after execution begins, respectively.
  */
 static List *
 make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 							  List *prunequal,
 							  Bitmapset *partrelids,
 							  int *relid_subplan_map,
-							  Bitmapset **matchedsubplans)
+							  Bitmapset **matchedsubplans,
+							  bool *needs_init_pruning,
+							  bool *needs_exec_pruning)
 {
 	RelOptInfo *targetpart = NULL;
 	List	   *pinfolist = NIL;
@@ -452,6 +476,10 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	int			rti;
 	int			i;
 
+	/* Will find out below. */
+	*needs_init_pruning = false;
+	*needs_exec_pruning = false;
+
 	/*
 	 * Examine each partitioned rel, constructing a temporary array to map
 	 * from planner relids to index of the partitioned rel, and building a
@@ -539,6 +567,9 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		 * executor per-scan pruning steps.  This first pass creates startup
 		 * pruning steps and detects whether there's any possibly-useful quals
 		 * that would require per-scan pruning.
+		 *
+		 * The first pass also determines whether the 2nd pass is necessary by
+		 * checking for the presence of EXEC parameters.
 		 */
 		gen_partprune_steps(subpart, partprunequal, PARTTARGET_INITIAL,
 							&context);
@@ -613,6 +644,12 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		pinfo->execparamids = execparamids;
 		/* Remaining fields will be filled in the next loop */
 
+		/* record which types of pruning steps we've seen so far */
+		if (initial_pruning_steps != NIL)
+			*needs_init_pruning = true;
+		if (exec_pruning_steps != NIL)
+			*needs_exec_pruning = true;
+
 		pinfolist = lappend(pinfolist, pinfo);
 	}
 
@@ -640,6 +677,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		int		   *subplan_map;
 		int		   *subpart_map;
 		Oid		   *relid_map;
+		Index	   *rti_map;
 
 		/*
 		 * Construct the subplan and subpart maps for this partitioning level.
@@ -652,6 +690,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		subpart_map = (int *) palloc(nparts * sizeof(int));
 		memset(subpart_map, -1, nparts * sizeof(int));
 		relid_map = (Oid *) palloc0(nparts * sizeof(Oid));
+		rti_map = (Index *) palloc0(nparts * sizeof(Index));
 		present_parts = NULL;
 
 		i = -1;
@@ -666,6 +705,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 			subplan_map[i] = subplanidx = relid_subplan_map[partrel->relid] - 1;
 			subpart_map[i] = subpartidx = relid_subpart_map[partrel->relid] - 1;
 			relid_map[i] = planner_rt_fetch(partrel->relid, root)->relid;
+			rti_map[i] = partrel->relid;
 			if (subplanidx >= 0)
 			{
 				present_parts = bms_add_member(present_parts, i);
@@ -690,6 +730,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		pinfo->subplan_map = subplan_map;
 		pinfo->subpart_map = subpart_map;
 		pinfo->relid_map = relid_map;
+		pinfo->rti_map = rti_map;
 	}
 
 	pfree(relid_subpart_map);
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 95dc2e2c83..8dc52a158f 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1603,6 +1603,7 @@ exec_bind_message(StringInfo input_message)
 	int16	   *rformats = NULL;
 	CachedPlanSource *psrc;
 	CachedPlan *cplan;
+	List	   *part_prune_result_list;
 	Portal		portal;
 	char	   *query_string;
 	char	   *saved_stmt_name;
@@ -1978,7 +1979,9 @@ exec_bind_message(StringInfo input_message)
 	 * will be generated in MessageContext.  The plan refcount will be
 	 * assigned to the Portal, so it will be released at portal destruction.
 	 */
-	cplan = GetCachedPlan(psrc, params, NULL, NULL);
+	cplan = GetCachedPlan(psrc, params, NULL, NULL, &part_prune_result_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_result_list));
 
 	/*
 	 * Now we can define the portal.
@@ -1993,6 +1996,9 @@ exec_bind_message(StringInfo input_message)
 					  cplan->stmt_list,
 					  cplan);
 
+	/* Copy PartitionPruneResults into the portal's context. */
+	PortalStorePartitionPruneResults(portal, part_prune_result_list);
+
 	/* Done with the snapshot used for parameter I/O and parsing/planning */
 	if (snapshot_set)
 		PopActiveSnapshot();
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 5aa5a350f3..8cc2e2162d 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -35,7 +35,7 @@
 Portal		ActivePortal = NULL;
 
 
-static void ProcessQuery(PlannedStmt *plan,
+static void ProcessQuery(PlannedStmt *plan, PartitionPruneResult *part_prune_result,
 						 const char *sourceText,
 						 ParamListInfo params,
 						 QueryEnvironment *queryEnv,
@@ -65,6 +65,7 @@ static void DoPortalRewind(Portal portal);
  */
 QueryDesc *
 CreateQueryDesc(PlannedStmt *plannedstmt,
+				PartitionPruneResult *part_prune_result,
 				const char *sourceText,
 				Snapshot snapshot,
 				Snapshot crosscheck_snapshot,
@@ -77,6 +78,8 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
 
 	qd->operation = plannedstmt->commandType;	/* operation */
 	qd->plannedstmt = plannedstmt;	/* plan */
+	qd->part_prune_result = part_prune_result;	/* ExecutorDoInitialPruning()
+												 * output for plan */
 	qd->sourceText = sourceText;	/* query text */
 	qd->snapshot = RegisterSnapshot(snapshot);	/* snapshot */
 	/* RI check snapshot */
@@ -122,6 +125,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  *		PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
  *
  *	plan: the plan tree for the query
+ *	part_prune_result: ExecutorDoInitialPruning() output for the plan tree
  *	sourceText: the source text of the query
  *	params: any parameters needed
  *	dest: where to send results
@@ -134,6 +138,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  */
 static void
 ProcessQuery(PlannedStmt *plan,
+			 PartitionPruneResult *part_prune_result,
 			 const char *sourceText,
 			 ParamListInfo params,
 			 QueryEnvironment *queryEnv,
@@ -145,7 +150,7 @@ ProcessQuery(PlannedStmt *plan,
 	/*
 	 * Create the QueryDesc object
 	 */
-	queryDesc = CreateQueryDesc(plan, sourceText,
+	queryDesc = CreateQueryDesc(plan, part_prune_result, sourceText,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, params, queryEnv, 0);
 
@@ -491,8 +496,13 @@ PortalStart(Portal portal, ParamListInfo params,
 				/*
 				 * Create QueryDesc in portal's context; for the moment, set
 				 * the destination to DestNone.
+				 *
+				 * There is no PartitionPruneResult unless the PlannedStmt is
+				 * from a CachedPlan.
 				 */
 				queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+											portal->part_prune_results == NIL ? NULL :
+											linitial(portal->part_prune_results),
 											portal->sourceText,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
@@ -1225,6 +1235,8 @@ PortalRunMulti(Portal portal,
 
 		if (pstmt->utilityStmt == NULL)
 		{
+			PartitionPruneResult *part_prune_result = NULL;
+
 			/*
 			 * process a plannable query.
 			 */
@@ -1271,10 +1283,18 @@ PortalRunMulti(Portal portal,
 			else
 				UpdateActiveSnapshotCommandId();
 
+			/*
+			 * Determine if there's a corresponding PartitionPruneResult for
+			 * this PlannedStmt.
+			 */
+			if (portal->part_prune_results != NIL)
+				part_prune_result = list_nth(portal->part_prune_results,
+											 foreach_current_index(stmtlist_item));
+
 			if (pstmt->canSetTag)
 			{
 				/* statement can set tag string */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, part_prune_result,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
@@ -1283,7 +1303,7 @@ PortalRunMulti(Portal portal,
 			else
 			{
 				/* stmt added by rewrite cannot set tag */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, part_prune_result,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 4cf6db504f..6cb473f2f4 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -99,14 +99,19 @@ static dlist_head cached_expression_list = DLIST_STATIC_INIT(cached_expression_l
 static void ReleaseGenericPlan(CachedPlanSource *plansource);
 static List *RevalidateCachedQuery(CachedPlanSource *plansource,
 								   QueryEnvironment *queryEnv);
-static bool CheckCachedPlan(CachedPlanSource *plansource);
+static bool CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+							List **part_prune_result_list);
 static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
-								   ParamListInfo boundParams, QueryEnvironment *queryEnv);
+								   ParamListInfo boundParams, QueryEnvironment *queryEnv,
+								   List **part_prune_result_list);
 static bool choose_custom_plan(CachedPlanSource *plansource,
 							   ParamListInfo boundParams);
 static double cached_plan_cost(CachedPlan *plan, bool include_planner);
 static Query *QueryListGetPrimaryStmt(List *stmts);
-static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static void AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+								 List **part_prune_result_list,
+								 List **lockedRelids_per_stmt);
+static void ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt);
 static void AcquirePlannerLocks(List *stmt_list, bool acquire);
 static void ScanQueryForLocks(Query *parsetree, bool acquire);
 static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -790,15 +795,20 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
  *
  * On a "true" return, we have acquired the locks needed to run the plan.
  * (We must do this for the "true" result to be race-condition-free.)
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_result_list.
  */
 static bool
-CheckCachedPlan(CachedPlanSource *plansource)
+CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+				List **part_prune_result_list)
 {
 	CachedPlan *plan = plansource->gplan;
 
 	/* Assert that caller checked the querytree */
 	Assert(plansource->is_valid);
 
+	*part_prune_result_list = NIL;
+
 	/* If there's no generic plan, just say "false" */
 	if (!plan)
 		return false;
@@ -820,13 +830,21 @@ CheckCachedPlan(CachedPlanSource *plansource)
 	 */
 	if (plan->is_valid)
 	{
+		List   *lockedRelids_per_stmt;
+
 		/*
 		 * Plan must have positive refcount because it is referenced by
 		 * plansource; so no need to fear it disappears under us here.
 		 */
 		Assert(plan->refcount > 0);
 
-		AcquireExecutorLocks(plan->stmt_list, true);
+		/*
+		 * Lock relations scanned by the plan.  This is where the pruning
+		 * happens if needed.
+		 */
+		AcquireExecutorLocks(plan->stmt_list, boundParams,
+							 part_prune_result_list,
+							 &lockedRelids_per_stmt);
 
 		/*
 		 * If plan was transient, check to see if TransactionXmin has
@@ -848,7 +866,14 @@ CheckCachedPlan(CachedPlanSource *plansource)
 		}
 
 		/* Oops, the race case happened.  Release useless locks. */
-		AcquireExecutorLocks(plan->stmt_list, false);
+		ReleaseExecutorLocks(plan->stmt_list, lockedRelids_per_stmt);
+
+		/*
+		 * The output list and any objects therein have been allocated in the
+		 * caller's hopefully short-lived context, so they will not stay leaked
+		 * for long; still, reset the pointer so it isn't accidentally used.
+		 */
+		*part_prune_result_list = NIL;
 	}
 
 	/*
@@ -874,10 +899,15 @@ CheckCachedPlan(CachedPlanSource *plansource)
  * Planning work is done in the caller's memory context.  The finished plan
  * is in a child memory context, which typically should get reparented
  * (unless this is a one-shot plan, in which case we don't copy the plan).
+ *
+ * A list of NULLs is returned in *part_prune_result_list, meaning that no
+ * PartitionPruneResult nodes have yet been created for the plans in
+ * stmt_list.
  */
 static CachedPlan *
 BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
-				ParamListInfo boundParams, QueryEnvironment *queryEnv)
+				ParamListInfo boundParams, QueryEnvironment *queryEnv,
+				List **part_prune_result_list)
 {
 	CachedPlan *plan;
 	List	   *plist;
@@ -1007,6 +1037,17 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 
 	MemoryContextSwitchTo(oldcxt);
 
+	/*
+	 * No actual PartitionPruneResults yet to add, though must initialize
+	 * the list to have the same number of elements as the list of
+	 * PlannedStmts.
+	 */
+	*part_prune_result_list = NIL;
+	foreach(lc, plist)
+	{
+		*part_prune_result_list = lappend(*part_prune_result_list, NULL);
+	}
+
 	return plan;
 }
 
@@ -1126,6 +1167,17 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
  * plan or a custom plan for the given parameters: the caller does not know
  * which it will get.
  *
+ * For every PlannedStmt found in the returned CachedPlan, an element that
+ * is either a PartitionPruneResult or a NULL is added to
+ * *part_prune_result_list if needed.  The former if the PlannedStmt is from
+ * the existing CachedPlan that is otherwise valid and contains at least one
+ * PartitionPruneInfo that has "initial" pruning steps.  Those steps are
+ * performed by calling ExecutorDoInitialPruning() to determine only those
+ * leaf partitions that need to be locked by AcquireExecutorLocks() by pruning
+ * away subplans that don't match the pruning conditions.  The
+ * PartitionPruneResult contains a list of bitmapsets of the indexes of
+ * matching subplans, one for each PartitionPruneInfo.
+ *
  * On return, the plan is valid and we have sufficient locks to begin
  * execution.
  *
@@ -1139,11 +1191,13 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
  */
 CachedPlan *
 GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
-			  ResourceOwner owner, QueryEnvironment *queryEnv)
+			  ResourceOwner owner, QueryEnvironment *queryEnv,
+			  List **part_prune_result_list)
 {
 	CachedPlan *plan = NULL;
 	List	   *qlist;
 	bool		customplan;
+	List	   *my_part_prune_result_list;
 
 	/* Assert caller is doing things in a sane order */
 	Assert(plansource->magic == CACHEDPLANSOURCE_MAGIC);
@@ -1160,7 +1214,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 
 	if (!customplan)
 	{
-		if (CheckCachedPlan(plansource))
+		if (CheckCachedPlan(plansource, boundParams,
+							&my_part_prune_result_list))
 		{
 			/* We want a generic plan, and we already have a valid one */
 			plan = plansource->gplan;
@@ -1169,7 +1224,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 		else
 		{
 			/* Build a new generic plan */
-			plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv);
+			plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv,
+								   &my_part_prune_result_list);
 			/* Just make real sure plansource->gplan is clear */
 			ReleaseGenericPlan(plansource);
 			/* Link the new generic plan into the plansource */
@@ -1214,7 +1270,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 	if (customplan)
 	{
 		/* Build a custom plan */
-		plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv);
+		plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv,
+							   &my_part_prune_result_list);
 		/* Accumulate total costs of custom plans */
 		plansource->total_custom_cost += cached_plan_cost(plan, true);
 
@@ -1246,6 +1303,9 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 		plan->is_saved = true;
 	}
 
+	if (part_prune_result_list)
+		*part_prune_result_list = my_part_prune_result_list;
+
 	return plan;
 }
 
@@ -1737,17 +1797,29 @@ QueryListGetPrimaryStmt(List *stmts)
 
 /*
  * AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
- * or release them if acquire is false.
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_result_list.
+ *
+ * On return, *lockedRelids_per_stmt will contain a bitmapset for every
+ * PlannedStmt in stmt_list, containing the RT indexes of relation entries
+ * in its range table that were actually locked, or NULL if the PlannedStmt
+ * contains a utility statement.
  */
 static void
-AcquireExecutorLocks(List *stmt_list, bool acquire)
+AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+					 List **part_prune_result_list,
+					 List **lockedRelids_per_stmt)
 {
 	ListCell   *lc1;
 
+	*part_prune_result_list = *lockedRelids_per_stmt = NIL;
 	foreach(lc1, stmt_list)
 	{
 		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
-		ListCell   *lc2;
+		PartitionPruneResult *part_prune_result = NULL;
+		Bitmapset  *allLockRelids;
+		Bitmapset  *lockedRelids = NULL;
+		int			rti;
 
 		if (plannedstmt->commandType == CMD_UTILITY)
 		{
@@ -1761,13 +1833,35 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 			Query	   *query = UtilityContainsQuery(plannedstmt->utilityStmt);
 
 			if (query)
-				ScanQueryForLocks(query, acquire);
+				ScanQueryForLocks(query, true);
+			*part_prune_result_list = lappend(*part_prune_result_list, NULL);
 			continue;
 		}
 
-		foreach(lc2, plannedstmt->rtable)
+		/*
+		 * Figure out the set of relations that would need to be locked
+		 * before executing the plan.
+		 */
+		if (plannedstmt->containsInitialPruning)
 		{
-			RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
+			/*
+			 * Obtain the set of partitions to be locked from the
+			 * PartitionPruneInfos by considering the result of performing
+			 * initial partition pruning.
+			 */
+			part_prune_result =
+				ExecutorDoInitialPruning(plannedstmt, boundParams);
+
+			allLockRelids = bms_union(plannedstmt->minLockRelids,
+									  part_prune_result->scan_leafpart_rtis);
+		}
+		else
+			allLockRelids = plannedstmt->minLockRelids;
+
+		rti = -1;
+		while ((rti = bms_next_member(allLockRelids, rti)) > 0)
+		{
+			RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
 
 			if (rte->rtekind != RTE_RELATION)
 				continue;
@@ -1778,10 +1872,58 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 			 * fail if it's been dropped entirely --- we'll just transiently
 			 * acquire a non-conflicting lock.
 			 */
-			if (acquire)
-				LockRelationOid(rte->relid, rte->rellockmode);
-			else
-				UnlockRelationOid(rte->relid, rte->rellockmode);
+			LockRelationOid(rte->relid, rte->rellockmode);
+			lockedRelids = bms_add_member(lockedRelids, rti);
+		}
+
+		*part_prune_result_list = lappend(*part_prune_result_list,
+										  part_prune_result);
+		*lockedRelids_per_stmt = lappend(*lockedRelids_per_stmt, lockedRelids);
+	}
+}
+
+/*
+ * ReleaseExecutorLocks
+ * 		Release locks that would've been acquired by an earlier call to
+ * 		AcquireExecutorLocks()
+ */
+static void
+ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt)
+{
+	ListCell   *lc1,
+			   *lc2;
+
+	forboth(lc1, stmt_list, lc2, lockedRelids_per_stmt)
+	{
+		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
+		Bitmapset  *lockedRelids = lfirst(lc2);
+		int			rti;
+
+		if (plannedstmt->commandType == CMD_UTILITY)
+		{
+			/*
+			 * Ignore utility statements, except those (such as EXPLAIN) that
+			 * contain a parsed-but-not-planned query.  Note: it's okay to use
+			 * ScanQueryForLocks, even though the query hasn't been through
+			 * rule rewriting, because rewriting doesn't change the query
+			 * representation.
+			 */
+			Query	   *query = UtilityContainsQuery(plannedstmt->utilityStmt);
+
+			if (query)
+				ScanQueryForLocks(query, false);
+			continue;
+		}
+
+		rti = -1;
+		while ((rti = bms_next_member(lockedRelids, rti)) >= 0)
+		{
+			RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
+
+			Assert(rte->rtekind == RTE_RELATION);
+
+			/* See the comment in AcquireExecutorLocks(). */
+			UnlockRelationOid(rte->relid, rte->rellockmode);
 		}
 	}
 }
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index d549f66d4a..1bbe6b704b 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -303,6 +303,25 @@ PortalDefineQuery(Portal portal,
 	portal->status = PORTAL_DEFINED;
 }
 
+/*
+ * PortalStorePartitionPruneResults
+ *		Copy the given list of PartitionPruneResults into the portal's
+ *		context
+ *
+ * This allows the caller to ensure that the list exists as long as the portal
+ * does.
+ */
+void
+PortalStorePartitionPruneResults(Portal portal, List *part_prune_results)
+{
+	MemoryContext	oldcxt;
+
+	AssertArg(PortalIsValid(portal));
+	oldcxt = MemoryContextSwitchTo(portal->portalContext);
+	portal->part_prune_results = copyObject(part_prune_results);
+	MemoryContextSwitchTo(oldcxt);
+}
+
 /*
  * PortalReleaseCachedPlan
  *		Release a portal's reference to its cached plan, if any.
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 666977fb1f..bbc8c42d88 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -87,7 +87,9 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
 							  ExplainState *es, const char *queryString,
 							  ParamListInfo params, QueryEnvironment *queryEnv);
 
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt,
+						   PartitionPruneResult *part_prune_result,
+						   IntoClause *into,
 						   ExplainState *es, const char *queryString,
 						   ParamListInfo params, QueryEnvironment *queryEnv,
 						   const instr_time *planduration,
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 708435e952..bd8776402e 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -45,6 +45,7 @@ extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
  * nparts						Length of subplan_map[] and subpart_map[].
  * subplan_map					Subplan index by partition index, or -1.
  * subpart_map					Subpart index by partition index, or -1.
+ * rti_map						Range table index by partition index, or 0.
  * present_parts				A Bitmapset of the partition indexes that we
  *								have subplans or subparts for.
  * initial_pruning_steps		List of PartitionPruneSteps used to
@@ -61,6 +62,7 @@ typedef struct PartitionedRelPruningData
 	int			nparts;
 	int		   *subplan_map;
 	int		   *subpart_map;
+	Index	   *rti_map;
 	Bitmapset  *present_parts;
 	List	   *initial_pruning_steps;
 	List	   *exec_pruning_steps;
@@ -123,9 +125,13 @@ typedef struct PartitionPruneState
 
 extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
 													 int n_total_subplans,
-													 PartitionPruneInfo *pruneinfo,
+													 int part_prune_index,
 													 Bitmapset **initially_valid_subplans);
 extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
-										   bool initial_prune);
-
+										   bool initial_prune,
+										   Bitmapset **scan_leafpart_rtis);
+extern Bitmapset *ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt,
+								ParamListInfo params,
+								PartitionPruneInfo *pruneinfo,
+								Bitmapset **scan_leafpart_rtis);
 #endif							/* EXECPARTITION_H */
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index e79e2c001f..60d5644908 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -35,6 +35,8 @@ typedef struct QueryDesc
 	/* These fields are provided by CreateQueryDesc */
 	CmdType		operation;		/* CMD_SELECT, CMD_UPDATE, etc. */
 	PlannedStmt *plannedstmt;	/* planner's output (could be utility, too) */
+	PartitionPruneResult *part_prune_result; /* ExecutorDoInitialPruning()'s
+											  * output for plannedstmt */
 	const char *sourceText;		/* source text of the query */
 	Snapshot	snapshot;		/* snapshot to use for query */
 	Snapshot	crosscheck_snapshot;	/* crosscheck for RI update/delete */
@@ -57,6 +59,7 @@ typedef struct QueryDesc
 
 /* in pquery.c */
 extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
+								  PartitionPruneResult *part_prune_result,
 								  const char *sourceText,
 								  Snapshot snapshot,
 								  Snapshot crosscheck_snapshot,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index 873772f188..57dc0e8077 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -185,6 +185,8 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
 /*
  * prototypes from functions in execMain.c
  */
+extern PartitionPruneResult *ExecutorDoInitialPruning(PlannedStmt *plannedstmt,
+													  ParamListInfo params);
 extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
 extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
 extern void ExecutorRun(QueryDesc *queryDesc,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 94b191f8ae..a8bf908d63 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -596,6 +596,8 @@ typedef struct EState
 	struct ExecRowMark **es_rowmarks;	/* Array of per-range-table-entry
 										 * ExecRowMarks, or NULL if none */
 	PlannedStmt *es_plannedstmt;	/* link to top of plan tree */
+	List		*es_part_prune_infos;	/* PlannedStmt.partPruneInfos */
+	struct PartitionPruneResult *es_part_prune_result; /* QueryDesc.part_prune_result */
 	const char *es_sourceText;	/* Source text from QueryDesc */
 
 	JunkFilter *es_junkFilter;	/* top-level junk filter, if any */
@@ -984,6 +986,34 @@ typedef struct DomainConstraintState
  */
 typedef TupleTableSlot *(*ExecProcNodeMtd) (struct PlanState *pstate);
 
+/*----------------
+ * PartitionPruneResult
+ *
+ * The result of invoking ExecutorDoInitialPruning() on a given
+ * PlannedStmt.
+ *
+ * Contains a list of Bitmapsets of the indexes of the subplans remaining
+ * after performing initial pruning by calling ExecFindMatchingSubPlans() for
+ * every PartitionPruneInfo found in PlannedStmt.partPruneInfos.  RT indexes of
+ * the leaf partitions scanned by those subplans across all
+ * PartitionPruneInfos are added into scan_leafpart_rtis.
+ *
+ * This is used by GetCachedPlan() to inform its callers of the pruning
+ * decisions made when performing AcquireExecutorLocks() on a given cached
+ * PlannedStmt, which the callers then pass on to the executor.  If this node
+ * is available, the executor refers to it when initializing the plan nodes
+ * to which those PartitionPruneInfos apply, so that the same set of
+ * qualifying subplans is initialized rather than derived again by redoing
+ * initial pruning.
+ */
+typedef struct PartitionPruneResult
+{
+	NodeTag		type;
+
+	List		   *valid_subplan_offs_list;
+	Bitmapset	   *scan_leafpart_rtis;
+} PartitionPruneResult;
+
 /* ----------------
  *		PlanState node
  *
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 340d28f4e1..66416bce97 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -97,6 +97,9 @@ typedef enum NodeTag
 	T_PartitionPruneStepCombine,
 	T_PlanInvalItem,
 
+	/* TAGS FOR EXECUTOR PREP NODES (execnodes.h) */
+	T_PartitionPruneResult,
+
 	/*
 	 * TAGS FOR PLAN STATE NODES (execnodes.h)
 	 *
@@ -674,6 +677,7 @@ extern struct Bitmapset *readBitmapset(void);
 extern uintptr_t readDatum(bool typbyval);
 extern bool *readBoolCols(int numCols);
 extern int *readIntCols(int numCols);
+extern Index *readIndexCols(int numCols);
 extern Oid *readOidCols(int numCols);
 extern int16 *readAttrNumberCols(int numCols);
 
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index c5ab53e05c..11007cda25 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -107,6 +107,18 @@ typedef struct PlannerGlobal
 
 	List	   *appendRelations;	/* "flat" list of AppendRelInfos */
 
+	List	   *partPruneInfos;		/* List of PartitionPruneInfo contained in
+									 * the plan */
+
+	bool		containsInitialPruning;	/* Do any of those PartitionPruneInfos
+										 * have initial (pre-exec) pruning
+										 * steps in them? */
+
+	Bitmapset  *minLockRelids;	/* Indexes of all range table entries minus
+								 * indexes of range table entries of the leaf
+								 * partitions scanned by prunable subplans;
+								 * see AcquireExecutorLocks() */
+
 	List	   *relationOids;	/* OIDs of relations the plan depends on */
 
 	List	   *invalItems;		/* other dependencies, as PlanInvalItems */
@@ -377,6 +389,9 @@ struct PlannerInfo
 
 	/* Does this query modify any partition key columns? */
 	bool		partColsUpdated;
+
+	/* PartitionPruneInfos added in this query's plan. */
+	List	   *partPruneInfos;
 };
 
 
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index e43e360d9b..f8f3971f44 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -64,8 +64,20 @@ typedef struct PlannedStmt
 
 	struct Plan *planTree;		/* tree of Plan nodes */
 
+	List	   *partPruneInfos;	/* List of PartitionPruneInfo contained in
+								 * the plan */
+
+	bool		containsInitialPruning;	/* Do any of those PartitionPruneInfos
+										 * have initial (pre-exec) pruning
+										 * steps in them? */
+
 	List	   *rtable;			/* list of RangeTblEntry nodes */
 
+	Bitmapset  *minLockRelids;	/* Indexes of all range table entries minus
+								 * indexes of range table entries of the leaf
+								 * partitions scanned by prunable subplans;
+								 * see AcquireExecutorLocks() */
+
 	/* rtable indexes of target relations for INSERT/UPDATE/DELETE */
 	List	   *resultRelations;	/* integer list of RT indexes, or NIL */
 
@@ -262,8 +274,8 @@ typedef struct Append
 	 */
 	int			first_partial_plan;
 
-	/* Info for run-time subplan pruning; NULL if we're not doing that */
-	struct PartitionPruneInfo *part_prune_info;
+	/* Index to PlannerInfo.partPruneInfos or -1 if no run-time pruning */
+	int			part_prune_index;
 } Append;
 
 /* ----------------
@@ -282,8 +294,9 @@ typedef struct MergeAppend
 	Oid		   *sortOperators;	/* OIDs of operators to sort them by */
 	Oid		   *collations;		/* OIDs of collations */
 	bool	   *nullsFirst;		/* NULLS FIRST/LAST directions */
-	/* Info for run-time subplan pruning; NULL if we're not doing that */
-	struct PartitionPruneInfo *part_prune_info;
+
+	/* Index to PlannerInfo.partPruneInfos or -1 if no run-time pruning */
+	int			part_prune_index;
 } MergeAppend;
 
 /* ----------------
@@ -1191,6 +1204,13 @@ typedef struct PlanRowMark
  * prune_infos			List of Lists containing PartitionedRelPruneInfo nodes,
  *						one sublist per run-time-prunable partition hierarchy
  *						appearing in the parent plan node's subplans.
+ *
+ * needs_init_pruning	Does any of the PartitionedRelPruneInfos in
+ *						prune_infos have its initial_pruning_steps set?
+ *
+ * needs_exec_pruning	Does any of the PartitionedRelPruneInfos in
+ *						prune_infos have its exec_pruning_steps set?
+ *
  * other_subplans		Indexes of any subplans that are not accounted for
  *						by any of the PartitionedRelPruneInfo nodes in
  *						"prune_infos".  These subplans must not be pruned.
@@ -1199,6 +1219,8 @@ typedef struct PartitionPruneInfo
 {
 	NodeTag		type;
 	List	   *prune_infos;
+	bool		needs_init_pruning;
+	bool		needs_exec_pruning;
 	Bitmapset  *other_subplans;
 } PartitionPruneInfo;
 
@@ -1229,6 +1251,7 @@ typedef struct PartitionedRelPruneInfo
 	int		   *subplan_map;	/* subplan index by partition index, or -1 */
 	int		   *subpart_map;	/* subpart index by partition index, or -1 */
 	Oid		   *relid_map;		/* relation OID by partition index, or 0 */
+	Index	   *rti_map;		/* Range table index by partition index, or 0. */
 
 	/*
 	 * initial_pruning_steps shows how to prune during executor startup (i.e.,
diff --git a/src/include/partitioning/partprune.h b/src/include/partitioning/partprune.h
index 90684efa25..ebf0dcff8c 100644
--- a/src/include/partitioning/partprune.h
+++ b/src/include/partitioning/partprune.h
@@ -70,10 +70,10 @@ typedef struct PartitionPruneContext
 #define PruneCxtStateIdx(partnatts, step_id, keyno) \
 	((partnatts) * (step_id) + (keyno))
 
-extern PartitionPruneInfo *make_partition_pruneinfo(struct PlannerInfo *root,
-													struct RelOptInfo *parentrel,
-													List *subpaths,
-													List *prunequal);
+extern int make_partition_pruneinfo(struct PlannerInfo *root,
+									struct RelOptInfo *parentrel,
+									List *subpaths,
+									List *prunequal);
 extern Bitmapset *prune_append_rel_partitions(struct RelOptInfo *rel);
 extern Bitmapset *get_matching_partitions(PartitionPruneContext *context,
 										  List *pruning_steps);
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 95b99e3d25..449200b949 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -220,7 +220,8 @@ extern List *CachedPlanGetTargetList(CachedPlanSource *plansource,
 extern CachedPlan *GetCachedPlan(CachedPlanSource *plansource,
 								 ParamListInfo boundParams,
 								 ResourceOwner owner,
-								 QueryEnvironment *queryEnv);
+								 QueryEnvironment *queryEnv,
+								 List **part_prune_result_list);
 extern void ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner);
 
 extern bool CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index aeddbdafe5..9f7727a837 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -138,6 +138,7 @@ typedef struct PortalData
 	QueryCompletion qc;			/* command completion data for executed query */
 	List	   *stmts;			/* list of PlannedStmts */
 	CachedPlan *cplan;			/* CachedPlan, if stmts are from one */
+	List	   *part_prune_results;	/* list of PartitionPruneResults */
 
 	ParamListInfo portalParams; /* params to pass to query */
 	QueryEnvironment *queryEnv; /* environment for query */
@@ -242,6 +243,8 @@ extern void PortalDefineQuery(Portal portal,
 							  CommandTag commandTag,
 							  List *stmts,
 							  CachedPlan *cplan);
+extern void PortalStorePartitionPruneResults(Portal portal,
+											 List *part_prune_result_list);
 extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
 extern void PortalCreateHoldStore(Portal portal);
 extern void PortalHashTableDeleteAll(void);
-- 
2.24.1




* Re: generic plans and "initial" pruning
@ 2022-04-11 03:05  Amit Langote <[email protected]>
  parent: Amit Langote <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: Amit Langote @ 2022-04-11 03:05 UTC (permalink / raw)
  To: David Rowley <[email protected]>; +Cc: Robert Haas <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers

On Fri, Apr 8, 2022 at 8:45 PM Amit Langote <[email protected]> wrote:
> Most looked fine changes to me except a couple of typos, so I've
> adopted those into the attached new version, even though I know it's
> too late to try to apply it.
>
> + * XXX is it worth doing a bms_copy() on glob->minLockRelids if
> + * glob->containsInitialPruning is true?. I'm slighly worried that the
> + * Bitmapset could have a very long empty tail resulting in excessive
> + * looping during AcquireExecutorLocks().
> + */
>
> I guess I trust your instincts about bitmapset operation efficiency
> and what you've written here makes sense.  It's typical for leaf
> partitions to have been appended toward the tail end of rtable and I'd
> imagine their indexes would be in the tail words of minLockRelids.  If
> copying the bitmapset removes those useless words, I don't see why we
> shouldn't do that.  So added:
>
> + /*
> + * It seems worth doing a bms_copy() on glob->minLockRelids if we deleted
> + * bit from it just above to prevent empty tail bits resulting in
> + * inefficient looping during AcquireExecutorLocks().
> + */
> + if (glob->containsInitialPruning)
> + glob->minLockRelids = bms_copy(glob->minLockRelids)
>
> Not 100% about the comment I wrote.

And the quoted code change missed a semicolon in the v14 that I
hurriedly sent on Friday.   (Had apparently forgotten to `git add` the
hunk to fix that).

Sending v15 that fixes that to keep the cfbot green for now.
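
To illustrate the concern behind the quoted XXX comment, here is a minimal sketch of why a long empty tail makes a word-by-word scan do wasted work. It assumes a toy bitmap type: ToySet and every toyset_* helper are hypothetical names made up for this example and are not PostgreSQL's Bitmapset API. It also assumes a copy routine that drops trailing all-zero words; whether bms_copy() really behaves that way is exactly the point the quoted text leaves open ("If copying the bitmapset removes those useless words").

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define TOY_BITS_PER_WORD 64

/* Toy bitmap, for illustration only; NOT PostgreSQL's Bitmapset. */
typedef struct ToySet
{
	int			nwords;		/* number of allocated words */
	uint64_t	words[];	/* bit i lives in words[i / 64] */
} ToySet;

static ToySet *
toyset_make(int nwords)
{
	ToySet	   *s = calloc(1, sizeof(ToySet) + (size_t) nwords * sizeof(uint64_t));

	assert(s != NULL);
	s->nwords = nwords;
	return s;
}

static void
toyset_add(ToySet *s, int bit)
{
	assert(bit >= 0 && bit / TOY_BITS_PER_WORD < s->nwords);
	s->words[bit / TOY_BITS_PER_WORD] |= UINT64_C(1) << (bit % TOY_BITS_PER_WORD);
}

static void
toyset_del(ToySet *s, int bit)
{
	assert(bit >= 0 && bit / TOY_BITS_PER_WORD < s->nwords);
	s->words[bit / TOY_BITS_PER_WORD] &= ~(UINT64_C(1) << (bit % TOY_BITS_PER_WORD));
}

/* Hypothetical "trimming" copy: drops trailing words that are all zero. */
static ToySet *
toyset_copy_trimmed(const ToySet *src)
{
	int			n = src->nwords;
	ToySet	   *dst;

	while (n > 0 && src->words[n - 1] == 0)
		n--;
	dst = toyset_make(n);
	memcpy(dst->words, src->words, (size_t) n * sizeof(uint64_t));
	return dst;
}

/* Count members, reporting how many words the scan had to visit. */
static int
toyset_count(const ToySet *s, int *words_visited)
{
	int			count = 0;

	*words_visited = 0;
	for (int i = 0; i < s->nwords; i++)
	{
		uint64_t	w = s->words[i];

		(*words_visited)++;
		while (w != 0)
		{
			w &= w - 1;			/* clear the lowest set bit */
			count++;
		}
	}
	return count;
}

/*
 * Model "leaf partitions sit at the tail and are then deleted": returns how
 * many word visits the trimmed copy saves over the original set.
 */
static int
toyset_demo_words_saved(void)
{
	ToySet	   *all = toyset_make(16);	/* room for bits 0..1023 */
	ToySet	   *trimmed;
	int			before;
	int			after;

	toyset_add(all, 3);
	toyset_add(all, 70);
	for (int b = 900; b < 1000; b++)
		toyset_add(all, b);
	for (int b = 900; b < 1000; b++)
		toyset_del(all, b);

	trimmed = toyset_copy_trimmed(all);
	(void) toyset_count(all, &before);
	(void) toyset_count(trimmed, &after);
	free(all);
	free(trimmed);
	return before - after;
}
```

Calling toyset_demo_words_saved() from a test driver shows the difference between scanning all allocated words and scanning only the words up to the last set bit. The membership is identical in both sets; only the number of words the loop touches changes.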

-- 
Amit Langote
EDB: http://www.enterprisedb.com


Attachments:

  [application/octet-stream] v15-0001-Optimize-AcquireExecutorLocks-to-skip-pruned-par.patch (99.3K, 2-v15-0001-Optimize-AcquireExecutorLocks-to-skip-pruned-par.patch)
From e974c27abda9c53744b93f2c6e0f1083ddeedbba Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Wed, 22 Dec 2021 16:55:17 +0900
Subject: [PATCH v15] Optimize AcquireExecutorLocks() to skip pruned partitions

---
 src/backend/commands/copyto.c           |   2 +-
 src/backend/commands/createas.c         |   2 +-
 src/backend/commands/explain.c          |   7 +-
 src/backend/commands/extension.c        |   2 +-
 src/backend/commands/matview.c          |   2 +-
 src/backend/commands/prepare.c          |  26 ++-
 src/backend/executor/README             |  27 +++
 src/backend/executor/execMain.c         |  48 +++++
 src/backend/executor/execParallel.c     |  28 ++-
 src/backend/executor/execPartition.c    | 241 ++++++++++++++++++++----
 src/backend/executor/execUtils.c        |   2 +
 src/backend/executor/functions.c        |   2 +-
 src/backend/executor/nodeAppend.c       |  15 +-
 src/backend/executor/nodeMergeAppend.c  |   9 +-
 src/backend/executor/spi.c              |  27 ++-
 src/backend/nodes/copyfuncs.c           |  33 +++-
 src/backend/nodes/outfuncs.c            |  36 +++-
 src/backend/nodes/readfuncs.c           |  56 +++++-
 src/backend/optimizer/plan/createplan.c |  24 +--
 src/backend/optimizer/plan/planner.c    |   3 +
 src/backend/optimizer/plan/setrefs.c    | 112 ++++++++---
 src/backend/partitioning/partprune.c    |  59 +++++-
 src/backend/tcop/postgres.c             |   8 +-
 src/backend/tcop/pquery.c               |  28 ++-
 src/backend/utils/cache/plancache.c     | 184 +++++++++++++++---
 src/backend/utils/mmgr/portalmem.c      |  19 ++
 src/include/commands/explain.h          |   4 +-
 src/include/executor/execPartition.h    |  12 +-
 src/include/executor/execdesc.h         |   3 +
 src/include/executor/executor.h         |   2 +
 src/include/nodes/execnodes.h           |  30 +++
 src/include/nodes/nodes.h               |   4 +
 src/include/nodes/pathnodes.h           |  15 ++
 src/include/nodes/plannodes.h           |  31 ++-
 src/include/partitioning/partprune.h    |   8 +-
 src/include/utils/plancache.h           |   3 +-
 src/include/utils/portal.h              |   3 +
 37 files changed, 950 insertions(+), 167 deletions(-)

diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index 55c38b04c4..d403eb2309 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -542,7 +542,7 @@ BeginCopyTo(ParseState *pstate,
 		((DR_copy *) dest)->cstate = cstate;
 
 		/* Create a QueryDesc requesting no output */
-		cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+		cstate->queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
 											dest, NULL, NULL, 0);
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 9abbb6b555..f6607f2454 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -325,7 +325,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
 		UpdateActiveSnapshotCommandId();
 
 		/* Create a QueryDesc, redirecting output to our tuple receiver */
-		queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+		queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
 									GetActiveSnapshot(), InvalidSnapshot,
 									dest, params, queryEnv, 0);
 
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index d2a2479822..35dd24adf8 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -407,7 +407,7 @@ ExplainOneQuery(Query *query, int cursorOptions,
 		}
 
 		/* run it (if needed) and produce output */
-		ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+		ExplainOnePlan(plan, NULL, into, es, queryString, params, queryEnv,
 					   &planduration, (es->buffers ? &bufusage : NULL));
 	}
 }
@@ -515,7 +515,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
  * to call it.
  */
 void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, PartitionPruneResult *part_prune_result,
+			   IntoClause *into, ExplainState *es,
 			   const char *queryString, ParamListInfo params,
 			   QueryEnvironment *queryEnv, const instr_time *planduration,
 			   const BufferUsage *bufusage)
@@ -563,7 +564,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
 		dest = None_Receiver;
 
 	/* Create a QueryDesc for the query */
-	queryDesc = CreateQueryDesc(plannedstmt, queryString,
+	queryDesc = CreateQueryDesc(plannedstmt, part_prune_result, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, params, queryEnv, instrument_option);
 
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index 1013790dbb..54734a3a93 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -776,7 +776,7 @@ execute_sql_string(const char *sql)
 			{
 				QueryDesc  *qdesc;
 
-				qdesc = CreateQueryDesc(stmt,
+				qdesc = CreateQueryDesc(stmt, NULL,
 										sql,
 										GetActiveSnapshot(), NULL,
 										dest, NULL, NULL, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 9ab248d25e..2be1782bc4 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -416,7 +416,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
 	UpdateActiveSnapshotCommandId();
 
 	/* Create a QueryDesc, redirecting output to our tuple receiver */
-	queryDesc = CreateQueryDesc(plan, queryString,
+	queryDesc = CreateQueryDesc(plan, NULL, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, NULL, NULL, 0);
 
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 80738547ed..c7360712b1 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -155,6 +155,7 @@ ExecuteQuery(ParseState *pstate,
 	PreparedStatement *entry;
 	CachedPlan *cplan;
 	List	   *plan_list;
+	List	   *part_prune_result_list;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
 	Portal		portal;
@@ -193,7 +194,10 @@ ExecuteQuery(ParseState *pstate,
 									   entry->plansource->query_string);
 
 	/* Replan if needed, and increment plan refcount for portal */
-	cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL);
+	cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL,
+						  &part_prune_result_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_result_list));
 	plan_list = cplan->stmt_list;
 
 	/*
@@ -207,6 +211,9 @@ ExecuteQuery(ParseState *pstate,
 					  plan_list,
 					  cplan);
 
+	/* Copy PartitionPruneResults into the portal's context. */
+	PortalStorePartitionPruneResults(portal, part_prune_result_list);
+
 	/*
 	 * For CREATE TABLE ... AS EXECUTE, we must verify that the prepared
 	 * statement is one that produces tuples.  Currently we insist that it be
@@ -576,7 +583,9 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	const char *query_string;
 	CachedPlan *cplan;
 	List	   *plan_list;
-	ListCell   *p;
+	List	   *part_prune_result_list;
+	ListCell   *p,
+			   *pp;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
 	instr_time	planstart;
@@ -619,7 +628,10 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 
 	/* Replan if needed, and acquire a transient refcount */
 	cplan = GetCachedPlan(entry->plansource, paramLI,
-						  CurrentResourceOwner, queryEnv);
+						  CurrentResourceOwner, queryEnv,
+						  &part_prune_result_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_result_list));
 
 	INSTR_TIME_SET_CURRENT(planduration);
 	INSTR_TIME_SUBTRACT(planduration, planstart);
@@ -634,13 +646,15 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	plan_list = cplan->stmt_list;
 
 	/* Explain each query */
-	foreach(p, plan_list)
+	forboth(p, plan_list, pp, part_prune_result_list)
 	{
 		PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
+		PartitionPruneResult *part_prune_result = lfirst_node(PartitionPruneResult, pp);
 
 		if (pstmt->commandType != CMD_UTILITY)
-			ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv,
-						   &planduration, (es->buffers ? &bufusage : NULL));
+			ExplainOnePlan(pstmt, part_prune_result, into, es, query_string,
+						   paramLI, queryEnv, &planduration,
+						   (es->buffers ? &bufusage : NULL));
 		else
 			ExplainOneUtility(pstmt->utilityStmt, into, es, query_string,
 							  paramLI, queryEnv);
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 0b5183fc4a..e0802be723 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -65,6 +65,29 @@ found there.  This currently only occurs for Append and MergeAppend nodes.  In
 this case the non-required subplans are ignored and the executor state's
 subnode array will become out of sequence to the plan's subplan list.
 
+Actually, the so-called execution time pruning may also occur even before the
+execution has started.  One case where that occurs is when a cached generic
+plan is being validated for execution by plancache.c: GetCachedPlan(), which
+proceeds by locking all the relations that will be scanned by that plan.  If
+the generic plan contains nodes that can perform execution time partition
+pruning (that is, contain a PartitionPruneInfo), a subset of pruning steps
+contained in the PartitionPruneInfos that do not depend on execution actually
+having started (called "initial" pruning steps) are performed at this point
+to figure out the minimal set of child subplans that satisfy those pruning
+instructions.  AcquireExecutorLocks() looking at a particular plan will then
+lock only the relations scanned by those surviving subplans (along with those
+present in PlannedStmt.minLockRelids), and ignore those scanned by the pruned
+subplans, even though the pruned subplans themselves are not removed from the
+plan tree.  The result of pruning (that is, the set of indexes of surviving
+subplans in their parent's list of child subplans) is saved as a list of
+bitmapsets, with one element for every PartitionPruneInfo referenced in the
+plan (PlannedStmt.partPruneInfos).  The list is packaged into a
+PartitionPruneResult node, which is passed along with the PlannedStmt to the
+executor via the QueryDesc.  It is imperative that the executor and any third
+party code invoked by it that gets passed the plan tree look at the plan's
+PartitionPruneResult to determine whether a particular child subplan of a
+parent node that supports pruning is valid for a given execution.
+
 Each Plan node may have expression trees associated with it, to represent
 its target list, qualification conditions, etc.  These trees are also
 read-only to the executor, but the executor state for expression evaluation
@@ -286,6 +309,10 @@ Query Processing Control Flow
 
 This is a sketch of control flow for full query processing:
 
+  [ ExecutorDoInitialPruning ] --- an optional step to perform initial
+		partition pruning on the plan tree, the result of which is passed
+		to the executor via QueryDesc
+
 	CreateQueryDesc
 
 	ExecutorStart
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index ef2fd46092..5ee978937d 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -49,11 +49,13 @@
 #include "commands/matview.h"
 #include "commands/trigger.h"
 #include "executor/execdebug.h"
+#include "executor/execPartition.h"
 #include "executor/nodeSubplan.h"
 #include "foreign/fdwapi.h"
 #include "jit/jit.h"
 #include "mb/pg_wchar.h"
 #include "miscadmin.h"
+#include "nodes/nodeFuncs.h"
 #include "parser/parsetree.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
@@ -104,6 +106,49 @@ static void EvalPlanQualStart(EPQState *epqstate, Plan *planTree);
 
 /* end of local decls */
 
+/* ----------------------------------------------------------------
+ *		ExecutorDoInitialPruning
+ *
+ *		Performs initial partition pruning to figure out the minimal set of
+ *		subplans to be executed and the set of RT indexes of the corresponding
+ *		leaf partitions
+ *
+ * Returned PartitionPruneResult must be subsequently passed to the executor
+ * so that it can reuse the result of pruning.  It's important that the
+ * executor has the same view of which partitions are initially pruned (by
+ * not doing the pruning again itself); otherwise it risks initializing
+ * subplans whose partitions would not have been locked.
+ *
+ * Note: Partitioned tables mentioned in PartitionedRelPruneInfo nodes that
+ * drive the pruning will be locked before doing the pruning.
+ *
+ * ----------------------------------------------------------------
+ */
+PartitionPruneResult *
+ExecutorDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params)
+{
+	PartitionPruneResult *result;
+	ListCell *lc;
+
+	/* Only get here if there is any pruning to do. */
+	Assert(plannedstmt->containsInitialPruning);
+
+	result = makeNode(PartitionPruneResult);
+	foreach(lc, plannedstmt->partPruneInfos)
+	{
+		PartitionPruneInfo *pruneinfo = lfirst(lc);
+		Bitmapset *valid_subplan_offs;
+
+		valid_subplan_offs =
+			ExecPartitionDoInitialPruning(plannedstmt, params, pruneinfo,
+										  &result->scan_leafpart_rtis);
+		result->valid_subplan_offs_list =
+			lappend(result->valid_subplan_offs_list,
+					valid_subplan_offs);
+	}
+
+	return result;
+}
 
 /* ----------------------------------------------------------------
  *		ExecutorStart
@@ -806,6 +851,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 {
 	CmdType		operation = queryDesc->operation;
 	PlannedStmt *plannedstmt = queryDesc->plannedstmt;
+	PartitionPruneResult *part_prune_result = queryDesc->part_prune_result;
 	Plan	   *plan = plannedstmt->planTree;
 	List	   *rangeTable = plannedstmt->rtable;
 	EState	   *estate = queryDesc->estate;
@@ -825,6 +871,8 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 	ExecInitRangeTable(estate, rangeTable);
 
 	estate->es_plannedstmt = plannedstmt;
+	estate->es_part_prune_infos = plannedstmt->partPruneInfos;
+	estate->es_part_prune_result = part_prune_result;
 
 	/*
 	 * Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index 9a0d5d59ef..805f86c503 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -66,6 +66,7 @@
 #define PARALLEL_KEY_QUERY_TEXT		UINT64CONST(0xE000000000000008)
 #define PARALLEL_KEY_JIT_INSTRUMENTATION UINT64CONST(0xE000000000000009)
 #define PARALLEL_KEY_WAL_USAGE			UINT64CONST(0xE00000000000000A)
+#define PARALLEL_KEY_PARTITIONPRUNERESULT	UINT64CONST(0xE00000000000000B)
 
 #define PARALLEL_TUPLE_QUEUE_SIZE		65536
 
@@ -182,7 +183,9 @@ ExecSerializePlan(Plan *plan, EState *estate)
 	pstmt->transientPlan = false;
 	pstmt->dependsOnRole = false;
 	pstmt->parallelModeNeeded = false;
+	pstmt->containsInitialPruning = false;
 	pstmt->planTree = plan;
+	pstmt->partPruneInfos = estate->es_part_prune_infos;
 	pstmt->rtable = estate->es_range_table;
 	pstmt->resultRelations = NIL;
 	pstmt->appendRelations = NIL;
@@ -596,12 +599,15 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	FixedParallelExecutorState *fpes;
 	char	   *pstmt_data;
 	char	   *pstmt_space;
+	char	   *part_prune_result_data;
+	char	   *part_prune_result_space;
 	char	   *paramlistinfo_space;
 	BufferUsage *bufusage_space;
 	WalUsage   *walusage_space;
 	SharedExecutorInstrumentation *instrumentation = NULL;
 	SharedJitInstrumentation *jit_instrumentation = NULL;
 	int			pstmt_len;
+	int			part_prune_result_len;
 	int			paramlistinfo_len;
 	int			instrumentation_len = 0;
 	int			jit_instrumentation_len = 0;
@@ -630,6 +636,7 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 
 	/* Fix up and serialize plan to be sent to workers. */
 	pstmt_data = ExecSerializePlan(planstate->plan, estate);
+	part_prune_result_data = nodeToString(estate->es_part_prune_result);
 
 	/* Create a parallel context. */
 	pcxt = CreateParallelContext("postgres", "ParallelQueryMain", nworkers);
@@ -656,6 +663,11 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	shm_toc_estimate_chunk(&pcxt->estimator, pstmt_len);
 	shm_toc_estimate_keys(&pcxt->estimator, 1);
 
+	/* Estimate space for serialized PartitionPruneResult. */
+	part_prune_result_len = strlen(part_prune_result_data) + 1;
+	shm_toc_estimate_chunk(&pcxt->estimator, part_prune_result_len);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
 	/* Estimate space for serialized ParamListInfo. */
 	paramlistinfo_len = EstimateParamListSpace(estate->es_param_list_info);
 	shm_toc_estimate_chunk(&pcxt->estimator, paramlistinfo_len);
@@ -750,6 +762,12 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	memcpy(pstmt_space, pstmt_data, pstmt_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PLANNEDSTMT, pstmt_space);
 
+	/* Store serialized PartitionPruneResult */
+	part_prune_result_space = shm_toc_allocate(pcxt->toc, part_prune_result_len);
+	memcpy(part_prune_result_space, part_prune_result_data, part_prune_result_len);
+	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARTITIONPRUNERESULT,
+				   part_prune_result_space);
+
 	/* Store serialized ParamListInfo. */
 	paramlistinfo_space = shm_toc_allocate(pcxt->toc, paramlistinfo_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARAMLISTINFO, paramlistinfo_space);
@@ -1231,8 +1249,10 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 						 int instrument_options)
 {
 	char	   *pstmtspace;
+	char	   *part_prune_result_space;
 	char	   *paramspace;
 	PlannedStmt *pstmt;
+	PartitionPruneResult *part_prune_result;
 	ParamListInfo paramLI;
 	char	   *queryString;
 
@@ -1243,12 +1263,18 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 	pstmtspace = shm_toc_lookup(toc, PARALLEL_KEY_PLANNEDSTMT, false);
 	pstmt = (PlannedStmt *) stringToNode(pstmtspace);
 
+	/* Reconstruct leader-supplied PartitionPruneResult. */
+	part_prune_result_space =
+		shm_toc_lookup(toc, PARALLEL_KEY_PARTITIONPRUNERESULT, false);
+	part_prune_result = (PartitionPruneResult *)
+		stringToNode(part_prune_result_space);
+
 	/* Reconstruct ParamListInfo. */
 	paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
 	paramLI = RestoreParamList(&paramspace);
 
 	/* Create a QueryDesc for the query. */
-	return CreateQueryDesc(pstmt,
+	return CreateQueryDesc(pstmt, part_prune_result,
 						   queryString,
 						   GetActiveSnapshot(), InvalidSnapshot,
 						   receiver, paramLI, NULL, instrument_options);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 615bd80973..af87b9197f 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -25,6 +25,7 @@
 #include "mb/pg_wchar.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
+#include "parser/parsetree.h"
 #include "partitioning/partbounds.h"
 #include "partitioning/partdesc.h"
 #include "partitioning/partprune.h"
@@ -185,7 +186,11 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
 static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
 static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
 static PartitionPruneState *CreatePartitionPruneState(PlanState *planstate,
-													  PartitionPruneInfo *pruneinfo);
+													  PartitionPruneInfo *pruneinfo,
+													  bool consider_initial_steps,
+													  bool consider_exec_steps,
+													  List *rtable, ExprContext *econtext,
+													  PartitionDirectory partdir);
 static void InitPartitionPruneContext(PartitionPruneContext *context,
 									  List *pruning_steps,
 									  PartitionDesc partdesc,
@@ -198,7 +203,8 @@ static void PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
 static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
 										   PartitionedRelPruningData *pprune,
 										   bool initial_prune,
-										   Bitmapset **validsubplans);
+										   Bitmapset **validsubplans,
+										   Bitmapset **scan_leafpart_rtis);
 
 
 /*
@@ -1587,8 +1593,10 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  * considered to be a stable expression, it can change value from one plan
  * node scan to the next during query execution.  Stable comparison
  * expressions that don't involve such Params allow partition pruning to be
- * done once during executor startup.  Expressions that do involve such Params
- * require us to prune separately for each scan of the parent plan node.
+ * done once during executor startup or during ExecutorDoInitialPruning() that
+ * runs as part of performing AcquireExecutorLocks() on a given plan tree.
+ * Expressions that do involve such Params require us to prune separately for
+ * each scan of the parent plan node.
  *
  * Note that pruning away unneeded subplans during executor startup has the
  * added benefit of not having to initialize the unneeded subplans at all.
@@ -1605,6 +1613,13 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *		account for initial pruning possibly having eliminated some of the
  *		subplans.
  *
+ * ExecPartitionDoInitialPruning:
+ *		Do initial pruning with the information contained in a given
+ *		PartitionPruneInfo to determine the minimal set of child subplans
+ *		to be executed of the parent plan node to which the PartitionPruneInfo
+ *		belongs and also the set of the RT indexes of leaf partitions that will
+ *		be scanned with those subplans.
+ *
  * ExecFindMatchingSubPlans:
  *		Returns indexes of matching subplans after evaluating the expressions
  *		that are safe to evaluate at a given point.  This function is first
@@ -1622,8 +1637,9 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *
  * On return, *initially_valid_subplans is assigned the set of indexes of
  * child subplans that must be initialized along with the parent plan node.
- * Initial pruning is performed here if needed and in that case only the
- * surviving subplans' indexes are added.
+ * Initial pruning is performed here if needed (unless it has already been
+ * done by ExecutorDoInitialPruning()), and in that case only the surviving
+ * subplans' indexes are added.
  *
  * If subplans are indeed pruned, subplan_map arrays contained in the returned
  * PartitionPruneState are re-sequenced to not count those, though only if the
@@ -1632,29 +1648,66 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
 PartitionPruneState *
 ExecInitPartitionPruning(PlanState *planstate,
 						 int n_total_subplans,
-						 PartitionPruneInfo *pruneinfo,
+						 int part_prune_index,
 						 Bitmapset **initially_valid_subplans)
 {
-	PartitionPruneState *prunestate;
 	EState	   *estate = planstate->state;
+	PartitionPruneInfo  *pruneinfo = list_nth(estate->es_part_prune_infos,
+											  part_prune_index);
+	PartitionPruneResult *pruneresult = estate->es_part_prune_result;
+	PartitionPruneState *prunestate;
+	bool	do_pruning = (pruneinfo->needs_init_pruning ||
+						  pruneinfo->needs_exec_pruning);
 
-	/* We may need an expression context to evaluate partition exprs */
-	ExecAssignExprContext(estate, planstate);
+	/*
+	 * No need to do initial pruning if it was done already by
+	 * ExecutorDoInitialPruning(), which it would be if es_part_prune_result
+	 * has been set.
+	 */
+	if (pruneresult)
+		do_pruning = pruneinfo->needs_exec_pruning;
+
+	prunestate = NULL;
+	if (do_pruning)
+	{
+		/* We may need an expression context to evaluate partition exprs */
+		ExecAssignExprContext(estate, planstate);
 
-	/* Create the working data structure for pruning */
-	prunestate = CreatePartitionPruneState(planstate, pruneinfo);
+		/* For data reading, executor always omits detached partitions */
+		if (estate->es_partition_directory == NULL)
+			estate->es_partition_directory =
+				CreatePartitionDirectory(estate->es_query_cxt, false);
+
+		/*
+		 * Create the working data structure for pruning.  No need to consider
+		 * initial pruning steps if we have a PartitionPruneResult.
+		 */
+		prunestate = CreatePartitionPruneState(planstate, pruneinfo,
+											   pruneresult == NULL, true,
+											   NIL, planstate->ps_ExprContext,
+											   estate->es_partition_directory);
+	}
 
 	/*
 	 * Perform an initial partition prune pass, if required.
 	 */
-	if (prunestate->do_initial_prune)
-		*initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true);
+	if (pruneresult)
+	{
+		*initially_valid_subplans =
+			list_nth(pruneresult->valid_subplan_offs_list, part_prune_index);
+	}
+	else if (prunestate && prunestate->do_initial_prune)
+	{
+		*initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true,
+															 NULL);
+	}
 	else
 	{
 		/* No pruning, so we'll need to initialize all subplans */
 		Assert(n_total_subplans > 0);
 		*initially_valid_subplans = bms_add_range(NULL, 0,
 												  n_total_subplans - 1);
+		return prunestate;
 	}
 
 	/*
@@ -1662,7 +1715,8 @@ ExecInitPartitionPruning(PlanState *planstate,
 	 * that were removed above due to initial pruning.  No need to do this if
 	 * no steps were removed.
 	 */
-	if (bms_num_members(*initially_valid_subplans) < n_total_subplans)
+	if (prunestate &&
+		bms_num_members(*initially_valid_subplans) < n_total_subplans)
 	{
 		/*
 		 * We can safely skip this when !do_exec_prune, even though that
@@ -1678,11 +1732,74 @@ ExecInitPartitionPruning(PlanState *planstate,
 	return prunestate;
 }
 
+/*
+ * ExecPartitionDoInitialPruning
+ *		Perform initial pruning using given PartitionPruneInfo to determine
+ *		the minimal set of child subplans to be executed of the parent plan
+ *		node to which the PartitionPruneInfo belongs and also the set of RT
+ *		indexes of leaf partitions that will be scanned with those subplans.
+ */
+Bitmapset *
+ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
+							  PartitionPruneInfo *pruneinfo,
+							  Bitmapset **scan_leafpart_rtis)
+{
+	List		 *rtable = plannedstmt->rtable;
+	ExprContext	 *econtext;
+	PartitionDirectory pdir;
+	MemoryContext oldcontext,
+				  tmpcontext;
+	PartitionPruneState *prunestate;
+	Bitmapset	 *valid_subplan_offs;
+
+	/*
+	 * A temporary context for memory allocations required while executing
+	 * partition pruning steps.
+	 */
+	tmpcontext = AllocSetContextCreate(CurrentMemoryContext,
+									   "initial pruning working data",
+									   ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(tmpcontext);
+
+	/*
+	 * PartitionDirectory to look up partition descriptors, which omits
+	 * detached partitions, just like in the executor proper.
+	 */
+	pdir = CreatePartitionDirectory(CurrentMemoryContext, false);
+
+	/*
+	 * We don't yet have a PlanState for the parent plan node, so we must
+	 * create a standalone ExprContext to evaluate pruning expressions,
+	 * equipped with the information about the EXTERN parameters that the
+	 * caller passed us.  Note that that's okay because the initial pruning
+	 * steps do not contain anything that requires the execution to have
+	 * started.
+	 */
+	econtext = CreateStandaloneExprContext();
+	econtext->ecxt_param_list_info = params;
+	prunestate = CreatePartitionPruneState(NULL, pruneinfo, true, false,
+										   rtable, econtext, pdir);
+	MemoryContextSwitchTo(oldcontext);
+
+	/* Do the initial pruning. */
+	valid_subplan_offs = ExecFindMatchingSubPlans(prunestate, true,
+												  scan_leafpart_rtis);
+
+	FreeExprContext(econtext, true);
+	DestroyPartitionDirectory(pdir);
+	MemoryContextDelete(tmpcontext);
+
+	return valid_subplan_offs;
+}
+
 /*
  * CreatePartitionPruneState
  *		Build the data structure required for calling ExecFindMatchingSubPlans
  *
- * 'planstate' is the parent plan node's execution state.
+ * 'planstate', if not NULL, is the parent plan node's execution state.  It
+ * can be NULL if being called before ExecutorStart(), in which case,
+ * 'rtable' (range table), 'econtext', and 'partdir' must be explicitly
+ * provided.
  *
  * 'pruneinfo' is a PartitionPruneInfo as generated by
  * make_partition_pruneinfo.  Here we build a PartitionPruneState containing a
@@ -1696,19 +1813,21 @@ ExecInitPartitionPruning(PlanState *planstate,
  * PartitionedRelPruneInfo.
  */
 static PartitionPruneState *
-CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
+CreatePartitionPruneState(PlanState *planstate,
+						  PartitionPruneInfo *pruneinfo,
+						  bool consider_initial_steps,
+						  bool consider_exec_steps,
+						  List *rtable, ExprContext *econtext,
+						  PartitionDirectory partdir)
 {
-	EState	   *estate = planstate->state;
+	EState	   *estate = planstate ? planstate->state : NULL;
 	PartitionPruneState *prunestate;
 	int			n_part_hierarchies;
 	ListCell   *lc;
 	int			i;
-	ExprContext *econtext = planstate->ps_ExprContext;
 
-	/* For data reading, executor always omits detached partitions */
-	if (estate->es_partition_directory == NULL)
-		estate->es_partition_directory =
-			CreatePartitionDirectory(estate->es_query_cxt, false);
+	Assert((estate != NULL) ||
+			(partdir != NULL && econtext != NULL && rtable != NIL));
 
 	n_part_hierarchies = list_length(pruneinfo->prune_infos);
 	Assert(n_part_hierarchies > 0);
@@ -1763,15 +1882,42 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 			PartitionKey partkey;
 
 			/*
-			 * We can rely on the copies of the partitioned table's partition
-			 * key and partition descriptor appearing in its relcache entry,
-			 * because that entry will be held open and locked for the
-			 * duration of this executor run.
+			 * Must open the relation ourselves when called before
+			 * execution has started, such as when called during
+			 * ExecutorDoInitialPruning() on a cached plan.  In that case,
+			 * sub-partitions must be locked, because AcquirePlannerLocks()
+			 * would not have seen them. (1st relation in a partrelpruneinfos
+			 * list is always the root partitioned table appearing in the
+			 * query, which AcquirePlannerLocks() would have locked; the
+			 * Assert in relation_open() guards that assumption.)
+			 */
+			if (estate == NULL)
+			{
+				RangeTblEntry *rte = rt_fetch(pinfo->rtindex, rtable);
+				int		lockmode = (j == 0) ? NoLock : rte->rellockmode;
+
+				partrel = table_open(rte->relid, lockmode);
+			}
+			else
+				partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
+
+			/*
+			 * We can rely on the copy of the partitioned table's partition
+			 * key in its relcache entry, because it can't change (or
+			 * get destroyed) as long as the relation is locked.  The partition
+			 * descriptor is taken from the PartitionDirectory associated with
+			 * the table that is held open long enough for the descriptor to
+			 * remain valid while it's used to perform the pruning steps.
 			 */
-			partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
 			partkey = RelationGetPartitionKey(partrel);
-			partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
-												partrel);
+			partdesc = PartitionDirectoryLookup(partdir, partrel);
+
+			/*
+			 * Must close partrel, keeping the lock taken, if we're not using
+			 * EState's entry.
+			 */
+			if (estate == NULL)
+				table_close(partrel, NoLock);
 
 			/*
 			 * Initialize the subplan_map and subpart_map.
@@ -1785,6 +1931,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 			Assert(partdesc->nparts >= pinfo->nparts);
 			pprune->nparts = partdesc->nparts;
 			pprune->subplan_map = palloc(sizeof(int) * partdesc->nparts);
+			pprune->rti_map = palloc(sizeof(Index) * partdesc->nparts);
 			if (partdesc->nparts == pinfo->nparts)
 			{
 				/*
@@ -1795,6 +1942,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 				pprune->subpart_map = pinfo->subpart_map;
 				memcpy(pprune->subplan_map, pinfo->subplan_map,
 					   sizeof(int) * pinfo->nparts);
+				memcpy(pprune->rti_map, pinfo->rti_map,
+					   sizeof(Index) * pinfo->nparts);
 
 				/*
 				 * Double-check that the list of unpruned relations has not
@@ -1845,6 +1994,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 							pinfo->subplan_map[pd_idx];
 						pprune->subpart_map[pp_idx] =
 							pinfo->subpart_map[pd_idx];
+						pprune->rti_map[pp_idx] =
+							pinfo->rti_map[pd_idx];
 						pd_idx++;
 					}
 					else
@@ -1852,6 +2003,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 						/* this partdesc entry is not in the plan */
 						pprune->subplan_map[pp_idx] = -1;
 						pprune->subpart_map[pp_idx] = -1;
+						pprune->rti_map[pp_idx] = 0;
 					}
 				}
 
@@ -1873,7 +2025,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 			 * Initialize pruning contexts as needed.
 			 */
 			pprune->initial_pruning_steps = pinfo->initial_pruning_steps;
-			if (pinfo->initial_pruning_steps)
+			if (consider_initial_steps && pinfo->initial_pruning_steps)
 			{
 				InitPartitionPruneContext(&pprune->initial_context,
 										  pinfo->initial_pruning_steps,
@@ -1883,7 +2035,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 				prunestate->do_initial_prune = true;
 			}
 			pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
-			if (pinfo->exec_pruning_steps)
+			if (consider_exec_steps && pinfo->exec_pruning_steps)
 			{
 				InitPartitionPruneContext(&pprune->exec_context,
 										  pinfo->exec_pruning_steps,
@@ -2111,10 +2263,14 @@ PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
  * Pass initial_prune if PARAM_EXEC Params cannot yet be evaluated.  This
  * differentiates the initial executor-time pruning step from later
  * runtime pruning.
+ *
+ * RT indexes of leaf partitions scanned by the chosen subplans are added to
+ * *scan_leafpart_rtis if the pointer is non-NULL.
  */
 Bitmapset *
 ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
-						 bool initial_prune)
+						 bool initial_prune,
+						 Bitmapset **scan_leafpart_rtis)
 {
 	Bitmapset  *result = NULL;
 	MemoryContext oldcontext;
@@ -2149,7 +2305,7 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
 		 */
 		pprune = &prunedata->partrelprunedata[0];
 		find_matching_subplans_recurse(prunedata, pprune, initial_prune,
-									   &result);
+									   &result, scan_leafpart_rtis);
 
 		/* Expression eval may have used space in ExprContext too */
 		if (pprune->exec_pruning_steps)
@@ -2163,6 +2319,8 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
 
 	/* Copy result out of the temp context before we reset it */
 	result = bms_copy(result);
+	if (scan_leafpart_rtis)
+		*scan_leafpart_rtis = bms_copy(*scan_leafpart_rtis);
 
 	MemoryContextReset(prunestate->prune_context);
 
@@ -2173,13 +2331,15 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
  * find_matching_subplans_recurse
  *		Recursive worker function for ExecFindMatchingSubPlans
  *
- * Adds valid (non-prunable) subplan IDs to *validsubplans
+ * Adds valid (non-prunable) subplan IDs to *validsubplans and RT indexes of
+ * the corresponding leaf partitions to *scan_leafpart_rtis (if asked for).
  */
 static void
 find_matching_subplans_recurse(PartitionPruningData *prunedata,
 							   PartitionedRelPruningData *pprune,
 							   bool initial_prune,
-							   Bitmapset **validsubplans)
+							   Bitmapset **validsubplans,
+							   Bitmapset **scan_leafpart_rtis)
 {
 	Bitmapset  *partset;
 	int			i;
@@ -2206,8 +2366,14 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
 	while ((i = bms_next_member(partset, i)) >= 0)
 	{
 		if (pprune->subplan_map[i] >= 0)
+		{
 			*validsubplans = bms_add_member(*validsubplans,
 											pprune->subplan_map[i]);
+			Assert(pprune->rti_map[i] > 0);
+			if (scan_leafpart_rtis)
+				*scan_leafpart_rtis = bms_add_member(*scan_leafpart_rtis,
+													 pprune->rti_map[i]);
+		}
 		else
 		{
 			int			partidx = pprune->subpart_map[i];
@@ -2215,7 +2381,8 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
 			if (partidx >= 0)
 				find_matching_subplans_recurse(prunedata,
 											   &prunedata->partrelprunedata[partidx],
-											   initial_prune, validsubplans);
+											   initial_prune, validsubplans,
+											   scan_leafpart_rtis);
 			else
 			{
 				/*
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 9df1f81ea8..f9c7976ff2 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -119,6 +119,8 @@ CreateExecutorState(void)
 	estate->es_relations = NULL;
 	estate->es_rowmarks = NULL;
 	estate->es_plannedstmt = NULL;
+	estate->es_part_prune_infos = NIL;
+	estate->es_part_prune_result = NULL;
 
 	estate->es_junkFilter = NULL;
 
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index f9460ae506..a2182a6b1f 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -844,7 +844,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
 	else
 		dest = None_Receiver;
 
-	es->qd = CreateQueryDesc(es->stmt,
+	es->qd = CreateQueryDesc(es->stmt, NULL,
 							 fcache->src,
 							 GetActiveSnapshot(),
 							 InvalidSnapshot,
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 357e10a1d7..96880e122a 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -134,7 +134,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 	appendstate->as_begun = false;
 
 	/* If run-time partition pruning is enabled, then set that up now */
-	if (node->part_prune_info != NULL)
+	if (node->part_prune_index >= 0)
 	{
 		PartitionPruneState *prunestate;
 
@@ -145,7 +145,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 		 */
 		prunestate = ExecInitPartitionPruning(&appendstate->ps,
 											  list_length(node->appendplans),
-											  node->part_prune_info,
+											  node->part_prune_index,
 											  &validsubplans);
 		appendstate->as_prune_state = prunestate;
 		nplans = bms_num_members(validsubplans);
@@ -155,7 +155,8 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 		 * subplan, we can fill as_valid_subplans immediately, preventing
 		 * later calls to ExecFindMatchingSubPlans.
 		 */
-		if (!prunestate->do_exec_prune && nplans > 0)
+		if (appendstate->as_prune_state == NULL ||
+			(!appendstate->as_prune_state->do_exec_prune && nplans > 0))
 			appendstate->as_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
 	}
 	else
@@ -577,7 +578,7 @@ choose_next_subplan_locally(AppendState *node)
 		}
 		else if (node->as_valid_subplans == NULL)
 			node->as_valid_subplans =
-				ExecFindMatchingSubPlans(node->as_prune_state, false);
+				ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 
 		whichplan = -1;
 	}
@@ -642,7 +643,7 @@ choose_next_subplan_for_leader(AppendState *node)
 		if (node->as_valid_subplans == NULL)
 		{
 			node->as_valid_subplans =
-				ExecFindMatchingSubPlans(node->as_prune_state, false);
+				ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 
 			/*
 			 * Mark each invalid plan as finished to allow the loop below to
@@ -717,7 +718,7 @@ choose_next_subplan_for_worker(AppendState *node)
 	else if (node->as_valid_subplans == NULL)
 	{
 		node->as_valid_subplans =
-			ExecFindMatchingSubPlans(node->as_prune_state, false);
+			ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 		mark_invalid_subplans_as_finished(node);
 	}
 
@@ -868,7 +869,7 @@ ExecAppendAsyncBegin(AppendState *node)
 	if (node->as_valid_subplans == NULL)
 	{
 		node->as_valid_subplans =
-			ExecFindMatchingSubPlans(node->as_prune_state, false);
+			ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 
 		classify_matching_subplans(node);
 	}
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index ecf9052e03..7708cfffda 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -82,7 +82,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
 	mergestate->ps.ExecProcNode = ExecMergeAppend;
 
 	/* If run-time partition pruning is enabled, then set that up now */
-	if (node->part_prune_info != NULL)
+	if (node->part_prune_index >= 0)
 	{
 		PartitionPruneState *prunestate;
 
@@ -93,7 +93,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
 		 */
 		prunestate = ExecInitPartitionPruning(&mergestate->ps,
 											  list_length(node->mergeplans),
-											  node->part_prune_info,
+											  node->part_prune_index,
 											  &validsubplans);
 		mergestate->ms_prune_state = prunestate;
 		nplans = bms_num_members(validsubplans);
@@ -103,7 +103,8 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
 		 * subplan, we can fill as_valid_subplans immediately, preventing
 		 * later calls to ExecFindMatchingSubPlans.
 		 */
-		if (!prunestate->do_exec_prune && nplans > 0)
+		if (mergestate->ms_prune_state == NULL ||
+			(!mergestate->ms_prune_state->do_exec_prune && nplans > 0))
 			mergestate->ms_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
 	}
 	else
@@ -218,7 +219,7 @@ ExecMergeAppend(PlanState *pstate)
 		 */
 		if (node->ms_valid_subplans == NULL)
 			node->ms_valid_subplans =
-				ExecFindMatchingSubPlans(node->ms_prune_state, false);
+				ExecFindMatchingSubPlans(node->ms_prune_state, false, NULL);
 
 		/*
 		 * First time through: pull the first tuple from each valid subplan,
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 042a5f8b0a..729e2fd7b2 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1578,6 +1578,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 	CachedPlanSource *plansource;
 	CachedPlan *cplan;
 	List	   *stmt_list;
+	List	   *part_prune_result_list;
 	char	   *query_string;
 	Snapshot	snapshot;
 	MemoryContext oldcontext;
@@ -1657,7 +1658,10 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 	 */
 
 	/* Replan if needed, and increment plan refcount for portal */
-	cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv);
+	cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv,
+						  &part_prune_result_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_result_list));
 	stmt_list = cplan->stmt_list;
 
 	if (!plan->saved)
@@ -1685,6 +1689,9 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 					  stmt_list,
 					  cplan);
 
+	/* Copy PartitionPruneResults into the portal's context. */
+	PortalStorePartitionPruneResults(portal, part_prune_result_list);
+
 	/*
 	 * Set up options for portal.  Default SCROLL type is chosen the same way
 	 * as PerformCursorOpen does it.
@@ -2092,7 +2099,8 @@ SPI_plan_get_cached_plan(SPIPlanPtr plan)
 	/* Get the generic plan for the query */
 	cplan = GetCachedPlan(plansource, NULL,
 						  plan->saved ? CurrentResourceOwner : NULL,
-						  _SPI_current->queryEnv);
+						  _SPI_current->queryEnv,
+						  NULL /* Not interested in PartitionPruneResults */);
 	Assert(cplan == plansource->gplan);
 
 	/* Pop the error context stack */
@@ -2473,7 +2481,9 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 	{
 		CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
 		List	   *stmt_list;
-		ListCell   *lc2;
+		List	   *part_prune_result_list;
+		ListCell   *lc2,
+				   *lc3;
 
 		spicallbackarg.query = plansource->query_string;
 
@@ -2549,8 +2559,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 		 * plan, the refcount must be backed by the plan_owner.
 		 */
 		cplan = GetCachedPlan(plansource, options->params,
-							  plan_owner, _SPI_current->queryEnv);
-
+							  plan_owner, _SPI_current->queryEnv,
+							  &part_prune_result_list);
+		Assert(list_length(cplan->stmt_list) ==
+			   list_length(part_prune_result_list));
 		stmt_list = cplan->stmt_list;
 
 		/*
@@ -2589,9 +2601,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 			}
 		}
 
-		foreach(lc2, stmt_list)
+		forboth(lc2, stmt_list, lc3, part_prune_result_list)
 		{
 			PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+			PartitionPruneResult *part_prune_result = lfirst_node(PartitionPruneResult, lc3);
 			bool		canSetTag = stmt->canSetTag;
 			DestReceiver *dest;
 
@@ -2663,7 +2676,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 				else
 					snap = InvalidSnapshot;
 
-				qdesc = CreateQueryDesc(stmt,
+				qdesc = CreateQueryDesc(stmt, part_prune_result,
 										plansource->query_string,
 										snap, crosscheck_snapshot,
 										dest,
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 836f427ea8..59a7054011 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -96,7 +96,10 @@ _copyPlannedStmt(const PlannedStmt *from)
 	COPY_SCALAR_FIELD(parallelModeNeeded);
 	COPY_SCALAR_FIELD(jitFlags);
 	COPY_NODE_FIELD(planTree);
+	COPY_NODE_FIELD(partPruneInfos);
+	COPY_SCALAR_FIELD(containsInitialPruning);
 	COPY_NODE_FIELD(rtable);
+	COPY_BITMAPSET_FIELD(minLockRelids);
 	COPY_NODE_FIELD(resultRelations);
 	COPY_NODE_FIELD(appendRelations);
 	COPY_NODE_FIELD(subplans);
@@ -253,7 +256,7 @@ _copyAppend(const Append *from)
 	COPY_NODE_FIELD(appendplans);
 	COPY_SCALAR_FIELD(nasyncplans);
 	COPY_SCALAR_FIELD(first_partial_plan);
-	COPY_NODE_FIELD(part_prune_info);
+	COPY_SCALAR_FIELD(part_prune_index);
 
 	return newnode;
 }
@@ -281,7 +284,7 @@ _copyMergeAppend(const MergeAppend *from)
 	COPY_POINTER_FIELD(sortOperators, from->numCols * sizeof(Oid));
 	COPY_POINTER_FIELD(collations, from->numCols * sizeof(Oid));
 	COPY_POINTER_FIELD(nullsFirst, from->numCols * sizeof(bool));
-	COPY_NODE_FIELD(part_prune_info);
+	COPY_SCALAR_FIELD(part_prune_index);
 
 	return newnode;
 }
@@ -1283,6 +1286,8 @@ _copyPartitionPruneInfo(const PartitionPruneInfo *from)
 	PartitionPruneInfo *newnode = makeNode(PartitionPruneInfo);
 
 	COPY_NODE_FIELD(prune_infos);
+	COPY_SCALAR_FIELD(needs_init_pruning);
+	COPY_SCALAR_FIELD(needs_exec_pruning);
 	COPY_BITMAPSET_FIELD(other_subplans);
 
 	return newnode;
@@ -1299,6 +1304,7 @@ _copyPartitionedRelPruneInfo(const PartitionedRelPruneInfo *from)
 	COPY_POINTER_FIELD(subplan_map, from->nparts * sizeof(int));
 	COPY_POINTER_FIELD(subpart_map, from->nparts * sizeof(int));
 	COPY_POINTER_FIELD(relid_map, from->nparts * sizeof(Oid));
+	COPY_POINTER_FIELD(rti_map, from->nparts * sizeof(Index));
 	COPY_NODE_FIELD(initial_pruning_steps);
 	COPY_NODE_FIELD(exec_pruning_steps);
 	COPY_BITMAPSET_FIELD(execparamids);
@@ -5473,6 +5479,21 @@ _copyExtensibleNode(const ExtensibleNode *from)
 	return newnode;
 }
 
+/* ****************************************************************
+ *					execnodes.h copy functions
+ * ****************************************************************
+ */
+static PartitionPruneResult *
+_copyPartitionPruneResult(const PartitionPruneResult *from)
+{
+	PartitionPruneResult *newnode = makeNode(PartitionPruneResult);
+
+	COPY_NODE_FIELD(valid_subplan_offs_list);
+	COPY_BITMAPSET_FIELD(scan_leafpart_rtis);
+
+	return newnode;
+}
+
 /* ****************************************************************
  *					value.h copy functions
  * ****************************************************************
@@ -5527,7 +5548,6 @@ _copyBitString(const BitString *from)
 	return newnode;
 }
 
-
 static ForeignKeyCacheInfo *
 _copyForeignKeyCacheInfo(const ForeignKeyCacheInfo *from)
 {
@@ -6569,6 +6589,13 @@ copyObjectImpl(const void *from)
 			retval = _copyPublicationTable(from);
 			break;
 
+			/*
+			 * EXECUTION NODES
+			 */
+		case T_PartitionPruneResult:
+			retval = _copyPartitionPruneResult(from);
+			break;
+
 			/*
 			 * MISCELLANEOUS NODES
 			 */
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index d5f5e76c55..3dada68291 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -314,7 +314,10 @@ _outPlannedStmt(StringInfo str, const PlannedStmt *node)
 	WRITE_BOOL_FIELD(parallelModeNeeded);
 	WRITE_INT_FIELD(jitFlags);
 	WRITE_NODE_FIELD(planTree);
+	WRITE_NODE_FIELD(partPruneInfos);
+	WRITE_BOOL_FIELD(containsInitialPruning);
 	WRITE_NODE_FIELD(rtable);
+	WRITE_BITMAPSET_FIELD(minLockRelids);
 	WRITE_NODE_FIELD(resultRelations);
 	WRITE_NODE_FIELD(appendRelations);
 	WRITE_NODE_FIELD(subplans);
@@ -443,7 +446,7 @@ _outAppend(StringInfo str, const Append *node)
 	WRITE_NODE_FIELD(appendplans);
 	WRITE_INT_FIELD(nasyncplans);
 	WRITE_INT_FIELD(first_partial_plan);
-	WRITE_NODE_FIELD(part_prune_info);
+	WRITE_INT_FIELD(part_prune_index);
 }
 
 static void
@@ -460,7 +463,7 @@ _outMergeAppend(StringInfo str, const MergeAppend *node)
 	WRITE_OID_ARRAY(sortOperators, node->numCols);
 	WRITE_OID_ARRAY(collations, node->numCols);
 	WRITE_BOOL_ARRAY(nullsFirst, node->numCols);
-	WRITE_NODE_FIELD(part_prune_info);
+	WRITE_INT_FIELD(part_prune_index);
 }
 
 static void
@@ -1009,6 +1012,8 @@ _outPartitionPruneInfo(StringInfo str, const PartitionPruneInfo *node)
 	WRITE_NODE_TYPE("PARTITIONPRUNEINFO");
 
 	WRITE_NODE_FIELD(prune_infos);
+	WRITE_BOOL_FIELD(needs_init_pruning);
+	WRITE_BOOL_FIELD(needs_exec_pruning);
 	WRITE_BITMAPSET_FIELD(other_subplans);
 }
 
@@ -1023,6 +1028,7 @@ _outPartitionedRelPruneInfo(StringInfo str, const PartitionedRelPruneInfo *node)
 	WRITE_INT_ARRAY(subplan_map, node->nparts);
 	WRITE_INT_ARRAY(subpart_map, node->nparts);
 	WRITE_OID_ARRAY(relid_map, node->nparts);
+	WRITE_INDEX_ARRAY(rti_map, node->nparts);
 	WRITE_NODE_FIELD(initial_pruning_steps);
 	WRITE_NODE_FIELD(exec_pruning_steps);
 	WRITE_BITMAPSET_FIELD(execparamids);
@@ -2425,6 +2431,9 @@ _outPlannerGlobal(StringInfo str, const PlannerGlobal *node)
 	WRITE_NODE_FIELD(finalrowmarks);
 	WRITE_NODE_FIELD(resultRelations);
 	WRITE_NODE_FIELD(appendRelations);
+	WRITE_NODE_FIELD(partPruneInfos);
+	WRITE_BOOL_FIELD(containsInitialPruning);
+	WRITE_BITMAPSET_FIELD(minLockRelids);
 	WRITE_NODE_FIELD(relationOids);
 	WRITE_NODE_FIELD(invalItems);
 	WRITE_NODE_FIELD(paramExecTypes);
@@ -2492,6 +2501,7 @@ _outPlannerInfo(StringInfo str, const PlannerInfo *node)
 	WRITE_BITMAPSET_FIELD(curOuterRels);
 	WRITE_NODE_FIELD(curOuterParams);
 	WRITE_BOOL_FIELD(partColsUpdated);
+	WRITE_NODE_FIELD(partPruneInfos);
 }
 
 static void
@@ -2845,6 +2855,21 @@ _outExtensibleNode(StringInfo str, const ExtensibleNode *node)
 	methods->nodeOut(str, node);
 }
 
+/*****************************************************************************
+ *
+ *	Stuff from execnodes.h
+ *
+ *****************************************************************************/
+
+static void
+_outPartitionPruneResult(StringInfo str, const PartitionPruneResult *node)
+{
+	WRITE_NODE_TYPE("PARTITIONPRUNERESULT");
+
+	WRITE_NODE_FIELD(valid_subplan_offs_list);
+	WRITE_BITMAPSET_FIELD(scan_leafpart_rtis);
+}
+
 /*****************************************************************************
  *
  *	Stuff from parsenodes.h.
@@ -4754,6 +4779,13 @@ outNode(StringInfo str, const void *obj)
 				_outJsonTableSibling(str, obj);
 				break;
 
+				/*
+				 * EXECUTION NODES
+				 */
+			case T_PartitionPruneResult:
+				_outPartitionPruneResult(str, obj);
+				break;
+
 			default:
 
 				/*
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 3d150cb25d..6a6fcec03b 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -164,6 +164,11 @@
 	token = pg_strtok(&length);		/* skip :fldname */ \
 	local_node->fldname = readIntCols(len)
 
+/* Read an Index array */
+#define READ_INDEX_ARRAY(fldname, len) \
+	token = pg_strtok(&length);		/* skip :fldname */ \
+	local_node->fldname = readIndexCols(len)
+
 /* Read a bool array */
 #define READ_BOOL_ARRAY(fldname, len) \
 	token = pg_strtok(&length);		/* skip :fldname */ \
@@ -1815,7 +1820,10 @@ _readPlannedStmt(void)
 	READ_BOOL_FIELD(parallelModeNeeded);
 	READ_INT_FIELD(jitFlags);
 	READ_NODE_FIELD(planTree);
+	READ_NODE_FIELD(partPruneInfos);
+	READ_BOOL_FIELD(containsInitialPruning);
 	READ_NODE_FIELD(rtable);
+	READ_BITMAPSET_FIELD(minLockRelids);
 	READ_NODE_FIELD(resultRelations);
 	READ_NODE_FIELD(appendRelations);
 	READ_NODE_FIELD(subplans);
@@ -1947,7 +1955,7 @@ _readAppend(void)
 	READ_NODE_FIELD(appendplans);
 	READ_INT_FIELD(nasyncplans);
 	READ_INT_FIELD(first_partial_plan);
-	READ_NODE_FIELD(part_prune_info);
+	READ_INT_FIELD(part_prune_index);
 
 	READ_DONE();
 }
@@ -1969,7 +1977,7 @@ _readMergeAppend(void)
 	READ_OID_ARRAY(sortOperators, local_node->numCols);
 	READ_OID_ARRAY(collations, local_node->numCols);
 	READ_BOOL_ARRAY(nullsFirst, local_node->numCols);
-	READ_NODE_FIELD(part_prune_info);
+	READ_INT_FIELD(part_prune_index);
 
 	READ_DONE();
 }
@@ -2767,6 +2775,8 @@ _readPartitionPruneInfo(void)
 	READ_LOCALS(PartitionPruneInfo);
 
 	READ_NODE_FIELD(prune_infos);
+	READ_BOOL_FIELD(needs_init_pruning);
+	READ_BOOL_FIELD(needs_exec_pruning);
 	READ_BITMAPSET_FIELD(other_subplans);
 
 	READ_DONE();
@@ -2783,6 +2793,7 @@ _readPartitionedRelPruneInfo(void)
 	READ_INT_ARRAY(subplan_map, local_node->nparts);
 	READ_INT_ARRAY(subpart_map, local_node->nparts);
 	READ_OID_ARRAY(relid_map, local_node->nparts);
+	READ_INDEX_ARRAY(rti_map, local_node->nparts);
 	READ_NODE_FIELD(initial_pruning_steps);
 	READ_NODE_FIELD(exec_pruning_steps);
 	READ_BITMAPSET_FIELD(execparamids);
@@ -2936,6 +2947,21 @@ _readPartitionRangeDatum(void)
 	READ_DONE();
 }
 
+
+/*
+ * _readPartitionPruneResult
+ */
+static PartitionPruneResult *
+_readPartitionPruneResult(void)
+{
+	READ_LOCALS(PartitionPruneResult);
+
+	READ_NODE_FIELD(valid_subplan_offs_list);
+	READ_BITMAPSET_FIELD(scan_leafpart_rtis);
+
+	READ_DONE();
+}
+
 /*
  * parseNodeString
  *
@@ -3233,6 +3259,8 @@ parseNodeString(void)
 		return_value = _readJsonTableParent();
 	else if (MATCH("JSONTABSNODE", 12))
 		return_value = _readJsonTableSibling();
+	else if (MATCH("PARTITIONPRUNERESULT", 20))
+		return_value = _readPartitionPruneResult();
 	else
 	{
 		elog(ERROR, "badly formatted node string \"%.32s\"...", token);
@@ -3376,6 +3404,30 @@ readIntCols(int numCols)
 	return int_vals;
 }
 
+/*
+ * readIndexCols
+ */
+Index *
+readIndexCols(int numCols)
+{
+	int			tokenLength,
+				i;
+	const char *token;
+	Index	   *index_vals;
+
+	if (numCols <= 0)
+		return NULL;
+
+	index_vals = (Index *) palloc(numCols * sizeof(Index));
+	for (i = 0; i < numCols; i++)
+	{
+		token = pg_strtok(&tokenLength);
+		index_vals[i] = atoui(token);
+	}
+
+	return index_vals;
+}
+
 /*
  * readBoolCols
  */
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index 95476ada0b..fe0df2f1d1 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -1184,7 +1184,6 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
 	ListCell   *subpaths;
 	int			nasyncplans = 0;
 	RelOptInfo *rel = best_path->path.parent;
-	PartitionPruneInfo *partpruneinfo = NULL;
 	int			nodenumsortkeys = 0;
 	AttrNumber *nodeSortColIdx = NULL;
 	Oid		   *nodeSortOperators = NULL;
@@ -1335,6 +1334,9 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
 		subplans = lappend(subplans, subplan);
 	}
 
+	/* Set below if we find quals that we can use to run-time prune */
+	plan->part_prune_index = -1;
+
 	/*
 	 * If any quals exist, they may be useful to perform further partition
 	 * pruning during execution.  Gather information needed by the executor to
@@ -1358,16 +1360,14 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
 		}
 
 		if (prunequal != NIL)
-			partpruneinfo =
-				make_partition_pruneinfo(root, rel,
-										 best_path->subpaths,
-										 prunequal);
+			plan->part_prune_index = make_partition_pruneinfo(root, rel,
+															  best_path->subpaths,
+															  prunequal);
 	}
 
 	plan->appendplans = subplans;
 	plan->nasyncplans = nasyncplans;
 	plan->first_partial_plan = best_path->first_partial_path;
-	plan->part_prune_info = partpruneinfo;
 
 	copy_generic_path_info(&plan->plan, (Path *) best_path);
 
@@ -1407,7 +1407,6 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
 	List	   *subplans = NIL;
 	ListCell   *subpaths;
 	RelOptInfo *rel = best_path->path.parent;
-	PartitionPruneInfo *partpruneinfo = NULL;
 
 	/*
 	 * We don't have the actual creation of the MergeAppend node split out
@@ -1500,6 +1499,9 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
 		subplans = lappend(subplans, subplan);
 	}
 
+	/* Set below if we find quals that we can use to run-time prune */
+	node->part_prune_index = -1;
+
 	/*
 	 * If any quals exist, they may be useful to perform further partition
 	 * pruning during execution.  Gather information needed by the executor to
@@ -1523,13 +1525,13 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
 		}
 
 		if (prunequal != NIL)
-			partpruneinfo = make_partition_pruneinfo(root, rel,
-													 best_path->subpaths,
-													 prunequal);
+			node->part_prune_index = make_partition_pruneinfo(root, rel,
+															  best_path->subpaths,
+															  prunequal);
 	}
 
 	node->mergeplans = subplans;
-	node->part_prune_info = partpruneinfo;
+
 
 	/*
 	 * If prepare_sort_from_pathkeys added sort columns, but we were told to
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index b090b087e9..f425362491 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -518,7 +518,10 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 	result->dependsOnRole = glob->dependsOnRole;
 	result->parallelModeNeeded = glob->parallelModeNeeded;
 	result->planTree = top_plan;
+	result->partPruneInfos = glob->partPruneInfos;
+	result->containsInitialPruning = glob->containsInitialPruning;
 	result->rtable = glob->finalrtable;
+	result->minLockRelids = glob->minLockRelids;
 	result->resultRelations = glob->resultRelations;
 	result->appendRelations = glob->appendRelations;
 	result->subplans = glob->subplans;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 6ea3505646..c5549a19b4 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -261,7 +261,7 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 	Plan	   *result;
 	PlannerGlobal *glob = root->glob;
 	int			rtoffset = list_length(glob->finalrtable);
-	ListCell   *lc;
+	ListCell *lc;
 
 	/*
 	 * Add all the query's RTEs to the flattened rangetable.  The live ones
@@ -270,6 +270,16 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 	 */
 	add_rtes_to_flat_rtable(root, false);
 
+	/*
+	 * Add the query's adjusted range of RT indexes to glob->minLockRelids.
+	 * The adjusted RT indexes of prunable relations will be deleted from the
+	 * set below where PartitionPruneInfos are processed.
+	 */
+	glob->minLockRelids =
+		bms_add_range(glob->minLockRelids,
+					  rtoffset + 1,
+					  rtoffset + list_length(root->parse->rtable));
+
 	/*
 	 * Adjust RT indexes of PlanRowMarks and add to final rowmarks list
 	 */
@@ -348,6 +358,64 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 		}
 	}
 
+	/* Also fix up the information in PartitionPruneInfos. */
+	foreach(lc, root->partPruneInfos)
+	{
+		PartitionPruneInfo *pruneinfo = lfirst(lc);
+		Bitmapset *leafpart_rtis = NULL;
+		ListCell  *l;
+
+		foreach(l, pruneinfo->prune_infos)
+		{
+			List	   *prune_infos = lfirst(l);
+			ListCell   *l2;
+
+			foreach(l2, prune_infos)
+			{
+				PartitionedRelPruneInfo *pinfo = lfirst(l2);
+				int		i;
+
+				/* RT index of the partitioned table. */
+				pinfo->rtindex += rtoffset;
+
+				/* And also those of the leaf partitions. */
+				for (i = 0; i < pinfo->nparts; i++)
+				{
+					if (pinfo->rti_map[i] > 0)
+					{
+						pinfo->rti_map[i] += rtoffset;
+						leafpart_rtis = bms_add_member(leafpart_rtis,
+													   pinfo->rti_map[i]);
+					}
+				}
+			}
+		}
+
+		if (pruneinfo->needs_init_pruning)
+		{
+			glob->containsInitialPruning = true;
+
+			/*
+			 * Delete the leaf partition RTIs from the global set of relations
+			 * to be locked before executing the plan.  AcquireExecutorLocks()
+			 * will find the ones to add to the set after performing initial
+			 * pruning.
+			 */
+			glob->minLockRelids = bms_del_members(glob->minLockRelids,
+												  leafpart_rtis);
+		}
+
+		glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
+	}
+
+	/*
+	 * It seems worth doing a bms_copy() on glob->minLockRelids if we deleted
+	 * bits from it just above to prevent empty tail bits resulting in
+	 * inefficient looping during AcquireExecutorLocks().
+	 */
+	if (glob->containsInitialPruning)
+		glob->minLockRelids = bms_copy(glob->minLockRelids);
+
 	return result;
 }
 
@@ -1640,21 +1708,12 @@ set_append_references(PlannerInfo *root,
 
 	aplan->apprelids = offset_relid_set(aplan->apprelids, rtoffset);
 
-	if (aplan->part_prune_info)
-	{
-		foreach(l, aplan->part_prune_info->prune_infos)
-		{
-			List	   *prune_infos = lfirst(l);
-			ListCell   *l2;
-
-			foreach(l2, prune_infos)
-			{
-				PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
-				pinfo->rtindex += rtoffset;
-			}
-		}
-	}
+	/*
+	 * PartitionPruneInfos will be added to a list in PlannerGlobal, so update
+	 * the index.
+	 */
+	if (aplan->part_prune_index >= 0)
+		aplan->part_prune_index += list_length(root->glob->partPruneInfos);
 
 	/* We don't need to recurse to lefttree or righttree ... */
 	Assert(aplan->plan.lefttree == NULL);
@@ -1712,21 +1771,12 @@ set_mergeappend_references(PlannerInfo *root,
 
 	mplan->apprelids = offset_relid_set(mplan->apprelids, rtoffset);
 
-	if (mplan->part_prune_info)
-	{
-		foreach(l, mplan->part_prune_info->prune_infos)
-		{
-			List	   *prune_infos = lfirst(l);
-			ListCell   *l2;
-
-			foreach(l2, prune_infos)
-			{
-				PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
-				pinfo->rtindex += rtoffset;
-			}
-		}
-	}
+	/*
+	 * PartitionPruneInfos will be added to a list in PlannerGlobal, so update
+	 * the index.
+	 */
+	if (mplan->part_prune_index >= 0)
+		mplan->part_prune_index += list_length(root->glob->partPruneInfos);
 
 	/* We don't need to recurse to lefttree or righttree ... */
 	Assert(mplan->plan.lefttree == NULL);
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 9d3c05aed3..952c5b8327 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -144,7 +144,9 @@ static List *make_partitionedrel_pruneinfo(PlannerInfo *root,
 										   List *prunequal,
 										   Bitmapset *partrelids,
 										   int *relid_subplan_map,
-										   Bitmapset **matchedsubplans);
+										   Bitmapset **matchedsubplans,
+										   bool *needs_init_pruning,
+										   bool *needs_exec_pruning);
 static void gen_partprune_steps(RelOptInfo *rel, List *clauses,
 								PartClauseTarget target,
 								GeneratePruningStepsContext *context);
@@ -209,16 +211,20 @@ static void partkey_datum_from_expr(PartitionPruneContext *context,
 
 /*
  * make_partition_pruneinfo
- *		Builds a PartitionPruneInfo which can be used in the executor to allow
- *		additional partition pruning to take place.  Returns NULL when
- *		partition pruning would be useless.
+ *		Checks if the given set of quals can be used to build pruning steps
+ *		that the executor can use to prune away unneeded partitions.  If
+ *		suitable quals are found then a PartitionPruneInfo is built and tagged
+ *		onto the PlannerInfo's partPruneInfos list.
+ *
+ * The return value is the 0-based index of the item added to the
+ * partPruneInfos list or -1 if nothing was added.
  *
  * 'parentrel' is the RelOptInfo for an appendrel, and 'subpaths' is the list
  * of scan paths for its child rels.
  * 'prunequal' is a list of potential pruning quals (i.e., restriction
  * clauses that are applicable to the appendrel).
  */
-PartitionPruneInfo *
+int
 make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 						 List *subpaths,
 						 List *prunequal)
@@ -230,6 +236,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	int		   *relid_subplan_map;
 	ListCell   *lc;
 	int			i;
+	bool		needs_init_pruning = false;
+	bool		needs_exec_pruning = false;
 
 	/*
 	 * Scan the subpaths to see which ones are scans of partition child
@@ -309,12 +317,16 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		Bitmapset  *partrelids = (Bitmapset *) lfirst(lc);
 		List	   *pinfolist;
 		Bitmapset  *matchedsubplans = NULL;
+		bool		partrel_needs_init_pruning;
+		bool		partrel_needs_exec_pruning;
 
 		pinfolist = make_partitionedrel_pruneinfo(root, parentrel,
 												  prunequal,
 												  partrelids,
 												  relid_subplan_map,
-												  &matchedsubplans);
+												  &matchedsubplans,
+												  &partrel_needs_init_pruning,
+												  &partrel_needs_exec_pruning);
 
 		/* When pruning is possible, record the matched subplans */
 		if (pinfolist != NIL)
@@ -323,6 +335,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 			allmatchedsubplans = bms_join(matchedsubplans,
 										  allmatchedsubplans);
 		}
+
+		needs_init_pruning |= partrel_needs_init_pruning;
+		needs_exec_pruning |= partrel_needs_exec_pruning;
 	}
 
 	pfree(relid_subplan_map);
@@ -332,11 +347,13 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	 * quals, then we can just not bother with run-time pruning.
 	 */
 	if (prunerelinfos == NIL)
-		return NULL;
+		return -1;
 
 	/* Else build the result data structure */
 	pruneinfo = makeNode(PartitionPruneInfo);
 	pruneinfo->prune_infos = prunerelinfos;
+	pruneinfo->needs_init_pruning = needs_init_pruning;
+	pruneinfo->needs_exec_pruning = needs_exec_pruning;
 
 	/*
 	 * Some subplans may not belong to any of the identified partitioned rels.
@@ -358,7 +375,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	else
 		pruneinfo->other_subplans = NULL;
 
-	return pruneinfo;
+	root->partPruneInfos = lappend(root->partPruneInfos, pruneinfo);
+
+	return list_length(root->partPruneInfos) - 1;
 }
 
 /*
@@ -435,13 +454,18 @@ add_part_relids(List *allpartrelids, Bitmapset *partrelids)
  * If we cannot find any useful run-time pruning steps, return NIL.
  * However, on success, each rel identified in partrelids will have
  * an element in the result list, even if some of them are useless.
+ * *needs_init_pruning and *needs_exec_pruning are set to indicate whether
+ * the returned PartitionedRelPruneInfos contain pruning steps that can be
+ * performed before and after execution begins, respectively.
  */
 static List *
 make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 							  List *prunequal,
 							  Bitmapset *partrelids,
 							  int *relid_subplan_map,
-							  Bitmapset **matchedsubplans)
+							  Bitmapset **matchedsubplans,
+							  bool *needs_init_pruning,
+							  bool *needs_exec_pruning)
 {
 	RelOptInfo *targetpart = NULL;
 	List	   *pinfolist = NIL;
@@ -452,6 +476,10 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	int			rti;
 	int			i;
 
+	/* Will find out below. */
+	*needs_init_pruning = false;
+	*needs_exec_pruning = false;
+
 	/*
 	 * Examine each partitioned rel, constructing a temporary array to map
 	 * from planner relids to index of the partitioned rel, and building a
@@ -539,6 +567,9 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		 * executor per-scan pruning steps.  This first pass creates startup
 		 * pruning steps and detects whether there's any possibly-useful quals
 		 * that would require per-scan pruning.
+		 *
+		 * In the first pass, we determine whether the 2nd pass is necessary
+		 * by checking for the presence of EXEC parameters.
 		 */
 		gen_partprune_steps(subpart, partprunequal, PARTTARGET_INITIAL,
 							&context);
@@ -613,6 +644,12 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		pinfo->execparamids = execparamids;
 		/* Remaining fields will be filled in the next loop */
 
+		/* record which types of pruning steps we've seen so far */
+		if (initial_pruning_steps != NIL)
+			*needs_init_pruning = true;
+		if (exec_pruning_steps != NIL)
+			*needs_exec_pruning = true;
+
 		pinfolist = lappend(pinfolist, pinfo);
 	}
 
@@ -640,6 +677,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		int		   *subplan_map;
 		int		   *subpart_map;
 		Oid		   *relid_map;
+		Index	   *rti_map;
 
 		/*
 		 * Construct the subplan and subpart maps for this partitioning level.
@@ -652,6 +690,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		subpart_map = (int *) palloc(nparts * sizeof(int));
 		memset(subpart_map, -1, nparts * sizeof(int));
 		relid_map = (Oid *) palloc0(nparts * sizeof(Oid));
+		rti_map = (Index *) palloc0(nparts * sizeof(Index));
 		present_parts = NULL;
 
 		i = -1;
@@ -666,6 +705,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 			subplan_map[i] = subplanidx = relid_subplan_map[partrel->relid] - 1;
 			subpart_map[i] = subpartidx = relid_subpart_map[partrel->relid] - 1;
 			relid_map[i] = planner_rt_fetch(partrel->relid, root)->relid;
+			rti_map[i] = partrel->relid;
 			if (subplanidx >= 0)
 			{
 				present_parts = bms_add_member(present_parts, i);
@@ -690,6 +730,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		pinfo->subplan_map = subplan_map;
 		pinfo->subpart_map = subpart_map;
 		pinfo->relid_map = relid_map;
+		pinfo->rti_map = rti_map;
 	}
 
 	pfree(relid_subpart_map);
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 95dc2e2c83..8dc52a158f 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1603,6 +1603,7 @@ exec_bind_message(StringInfo input_message)
 	int16	   *rformats = NULL;
 	CachedPlanSource *psrc;
 	CachedPlan *cplan;
+	List	   *part_prune_result_list;
 	Portal		portal;
 	char	   *query_string;
 	char	   *saved_stmt_name;
@@ -1978,7 +1979,9 @@ exec_bind_message(StringInfo input_message)
 	 * will be generated in MessageContext.  The plan refcount will be
 	 * assigned to the Portal, so it will be released at portal destruction.
 	 */
-	cplan = GetCachedPlan(psrc, params, NULL, NULL);
+	cplan = GetCachedPlan(psrc, params, NULL, NULL, &part_prune_result_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_result_list));
 
 	/*
 	 * Now we can define the portal.
@@ -1993,6 +1996,9 @@ exec_bind_message(StringInfo input_message)
 					  cplan->stmt_list,
 					  cplan);
 
+	/* Copy PartitionPruneResults into the portal's context. */
+	PortalStorePartitionPruneResults(portal, part_prune_result_list);
+
 	/* Done with the snapshot used for parameter I/O and parsing/planning */
 	if (snapshot_set)
 		PopActiveSnapshot();
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 5aa5a350f3..8cc2e2162d 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -35,7 +35,7 @@
 Portal		ActivePortal = NULL;
 
 
-static void ProcessQuery(PlannedStmt *plan,
+static void ProcessQuery(PlannedStmt *plan, PartitionPruneResult *part_prune_result,
 						 const char *sourceText,
 						 ParamListInfo params,
 						 QueryEnvironment *queryEnv,
@@ -65,6 +65,7 @@ static void DoPortalRewind(Portal portal);
  */
 QueryDesc *
 CreateQueryDesc(PlannedStmt *plannedstmt,
+				PartitionPruneResult *part_prune_result,
 				const char *sourceText,
 				Snapshot snapshot,
 				Snapshot crosscheck_snapshot,
@@ -77,6 +78,8 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
 
 	qd->operation = plannedstmt->commandType;	/* operation */
 	qd->plannedstmt = plannedstmt;	/* plan */
+	qd->part_prune_result = part_prune_result;	/* ExecutorDoInitialPruning()
+												 * output for plan */
 	qd->sourceText = sourceText;	/* query text */
 	qd->snapshot = RegisterSnapshot(snapshot);	/* snapshot */
 	/* RI check snapshot */
@@ -122,6 +125,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  *		PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
  *
  *	plan: the plan tree for the query
+ *	part_prune_result: ExecutorDoInitialPruning() output for the plan tree
  *	sourceText: the source text of the query
  *	params: any parameters needed
  *	dest: where to send results
@@ -134,6 +138,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  */
 static void
 ProcessQuery(PlannedStmt *plan,
+			 PartitionPruneResult *part_prune_result,
 			 const char *sourceText,
 			 ParamListInfo params,
 			 QueryEnvironment *queryEnv,
@@ -145,7 +150,7 @@ ProcessQuery(PlannedStmt *plan,
 	/*
 	 * Create the QueryDesc object
 	 */
-	queryDesc = CreateQueryDesc(plan, sourceText,
+	queryDesc = CreateQueryDesc(plan, part_prune_result, sourceText,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, params, queryEnv, 0);
 
@@ -491,8 +496,13 @@ PortalStart(Portal portal, ParamListInfo params,
 				/*
 				 * Create QueryDesc in portal's context; for the moment, set
 				 * the destination to DestNone.
+				 *
+				 * There is no PartitionPruneResult unless the PlannedStmt is
+				 * from a CachedPlan.
 				 */
 				queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+											portal->part_prune_results == NIL ? NULL :
+											linitial(portal->part_prune_results),
 											portal->sourceText,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
@@ -1225,6 +1235,8 @@ PortalRunMulti(Portal portal,
 
 		if (pstmt->utilityStmt == NULL)
 		{
+			PartitionPruneResult *part_prune_result = NULL;
+
 			/*
 			 * process a plannable query.
 			 */
@@ -1271,10 +1283,18 @@ PortalRunMulti(Portal portal,
 			else
 				UpdateActiveSnapshotCommandId();
 
+			/*
+			 * Determine if there's a corresponding PartitionPruneResult for
+			 * this PlannedStmt.
+			 */
+			if (portal->part_prune_results != NIL)
+				part_prune_result = list_nth(portal->part_prune_results,
+											 foreach_current_index(stmtlist_item));
+
 			if (pstmt->canSetTag)
 			{
 				/* statement can set tag string */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, part_prune_result,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
@@ -1283,7 +1303,7 @@ PortalRunMulti(Portal portal,
 			else
 			{
 				/* stmt added by rewrite cannot set tag */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, part_prune_result,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 4cf6db504f..6cb473f2f4 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -99,14 +99,19 @@ static dlist_head cached_expression_list = DLIST_STATIC_INIT(cached_expression_l
 static void ReleaseGenericPlan(CachedPlanSource *plansource);
 static List *RevalidateCachedQuery(CachedPlanSource *plansource,
 								   QueryEnvironment *queryEnv);
-static bool CheckCachedPlan(CachedPlanSource *plansource);
+static bool CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+							List **part_prune_result_list);
 static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
-								   ParamListInfo boundParams, QueryEnvironment *queryEnv);
+								   ParamListInfo boundParams, QueryEnvironment *queryEnv,
+								   List **part_prune_result_list);
 static bool choose_custom_plan(CachedPlanSource *plansource,
 							   ParamListInfo boundParams);
 static double cached_plan_cost(CachedPlan *plan, bool include_planner);
 static Query *QueryListGetPrimaryStmt(List *stmts);
-static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static void AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+								 List **part_prune_result_list,
+								 List **lockedRelids_per_stmt);
+static void ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt);
 static void AcquirePlannerLocks(List *stmt_list, bool acquire);
 static void ScanQueryForLocks(Query *parsetree, bool acquire);
 static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -790,15 +795,20 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
  *
  * On a "true" return, we have acquired the locks needed to run the plan.
  * (We must do this for the "true" result to be race-condition-free.)
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_result_list.
  */
 static bool
-CheckCachedPlan(CachedPlanSource *plansource)
+CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+				List **part_prune_result_list)
 {
 	CachedPlan *plan = plansource->gplan;
 
 	/* Assert that caller checked the querytree */
 	Assert(plansource->is_valid);
 
+	*part_prune_result_list = NIL;
+
 	/* If there's no generic plan, just say "false" */
 	if (!plan)
 		return false;
@@ -820,13 +830,21 @@ CheckCachedPlan(CachedPlanSource *plansource)
 	 */
 	if (plan->is_valid)
 	{
+		List   *lockedRelids_per_stmt;
+
 		/*
 		 * Plan must have positive refcount because it is referenced by
 		 * plansource; so no need to fear it disappears under us here.
 		 */
 		Assert(plan->refcount > 0);
 
-		AcquireExecutorLocks(plan->stmt_list, true);
+		/*
+		 * Lock relations scanned by the plan.  This is where the pruning
+		 * happens if needed.
+		 */
+		AcquireExecutorLocks(plan->stmt_list, boundParams,
+							 part_prune_result_list,
+							 &lockedRelids_per_stmt);
 
 		/*
 		 * If plan was transient, check to see if TransactionXmin has
@@ -848,7 +866,14 @@ CheckCachedPlan(CachedPlanSource *plansource)
 		}
 
 		/* Oops, the race case happened.  Release useless locks. */
-		AcquireExecutorLocks(plan->stmt_list, false);
+		ReleaseExecutorLocks(plan->stmt_list, lockedRelids_per_stmt);
+
+		/*
+		 * The output list and any objects therein have been allocated in the
+		 * caller's hopefully short-lived context, so they will not stay leaked
+		 * for long; still, reset the pointer so it isn't accidentally used.
+		 */
+		*part_prune_result_list = NIL;
 	}
 
 	/*
@@ -874,10 +899,15 @@ CheckCachedPlan(CachedPlanSource *plansource)
  * Planning work is done in the caller's memory context.  The finished plan
  * is in a child memory context, which typically should get reparented
  * (unless this is a one-shot plan, in which case we don't copy the plan).
+ *
+ * A list of NULLs is returned in *part_prune_result_list, meaning that no
+ * PartitionPruneResult nodes have yet been created for the plans in
+ * stmt_list.
  */
 static CachedPlan *
 BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
-				ParamListInfo boundParams, QueryEnvironment *queryEnv)
+				ParamListInfo boundParams, QueryEnvironment *queryEnv,
+				List **part_prune_result_list)
 {
 	CachedPlan *plan;
 	List	   *plist;
@@ -1007,6 +1037,17 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 
 	MemoryContextSwitchTo(oldcxt);
 
+	/*
+	 * No actual PartitionPruneResults yet to add, though must initialize
+	 * the list to have the same number of elements as the list of
+	 * PlannedStmts.
+	 */
+	*part_prune_result_list = NIL;
+	foreach(lc, plist)
+	{
+		*part_prune_result_list = lappend(*part_prune_result_list, NULL);
+	}
+
 	return plan;
 }
 
@@ -1126,6 +1167,17 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
  * plan or a custom plan for the given parameters: the caller does not know
  * which it will get.
  *
+ * If part_prune_result_list is not NULL, *part_prune_result_list is set to a
+ * list with one element for every PlannedStmt in the returned CachedPlan,
+ * each either a PartitionPruneResult or NULL.  It is the former if the
+ * PlannedStmt is from an existing CachedPlan that is otherwise valid and
+ * contains at least one PartitionPruneInfo that has "initial" pruning steps.
+ * Those steps are performed by calling ExecutorDoInitialPruning(), which
+ * prunes away subplans that don't match the pruning conditions, so that
+ * AcquireExecutorLocks() locks only the surviving leaf partitions.  The
+ * PartitionPruneResult contains a list of bitmapsets of the indexes of
+ * matching subplans, one for each PartitionPruneInfo.
+ *
  * On return, the plan is valid and we have sufficient locks to begin
  * execution.
  *
@@ -1139,11 +1191,13 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
  */
 CachedPlan *
 GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
-			  ResourceOwner owner, QueryEnvironment *queryEnv)
+			  ResourceOwner owner, QueryEnvironment *queryEnv,
+			  List **part_prune_result_list)
 {
 	CachedPlan *plan = NULL;
 	List	   *qlist;
 	bool		customplan;
+	List	   *my_part_prune_result_list;
 
 	/* Assert caller is doing things in a sane order */
 	Assert(plansource->magic == CACHEDPLANSOURCE_MAGIC);
@@ -1160,7 +1214,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 
 	if (!customplan)
 	{
-		if (CheckCachedPlan(plansource))
+		if (CheckCachedPlan(plansource, boundParams,
+							&my_part_prune_result_list))
 		{
 			/* We want a generic plan, and we already have a valid one */
 			plan = plansource->gplan;
@@ -1169,7 +1224,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 		else
 		{
 			/* Build a new generic plan */
-			plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv);
+			plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv,
+								   &my_part_prune_result_list);
 			/* Just make real sure plansource->gplan is clear */
 			ReleaseGenericPlan(plansource);
 			/* Link the new generic plan into the plansource */
@@ -1214,7 +1270,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 	if (customplan)
 	{
 		/* Build a custom plan */
-		plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv);
+		plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv,
+							   &my_part_prune_result_list);
 		/* Accumulate total costs of custom plans */
 		plansource->total_custom_cost += cached_plan_cost(plan, true);
 
@@ -1246,6 +1303,9 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 		plan->is_saved = true;
 	}
 
+	if (part_prune_result_list)
+		*part_prune_result_list = my_part_prune_result_list;
+
 	return plan;
 }
 
@@ -1737,17 +1797,29 @@ QueryListGetPrimaryStmt(List *stmts)
 
 /*
  * AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
- * or release them if acquire is false.
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_result_list.
+ *
+ * On return, *lockedRelids_per_stmt will contain a bitmapset for every
+ * PlannedStmt in stmt_list, containing the RT indexes of relation entries
+ * in its range table that were actually locked, or NULL if the PlannedStmt
+ * contains a utility statement.
  */
 static void
-AcquireExecutorLocks(List *stmt_list, bool acquire)
+AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+					 List **part_prune_result_list,
+					 List **lockedRelids_per_stmt)
 {
 	ListCell   *lc1;
 
+	*part_prune_result_list = *lockedRelids_per_stmt = NIL;
 	foreach(lc1, stmt_list)
 	{
 		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
-		ListCell   *lc2;
+		PartitionPruneResult *part_prune_result = NULL;
+		Bitmapset  *allLockRelids;
+		Bitmapset  *lockedRelids = NULL;
+		int			rti;
 
 		if (plannedstmt->commandType == CMD_UTILITY)
 		{
@@ -1761,13 +1833,35 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 			Query	   *query = UtilityContainsQuery(plannedstmt->utilityStmt);
 
 			if (query)
-				ScanQueryForLocks(query, acquire);
+				ScanQueryForLocks(query, true);
+			*part_prune_result_list = lappend(*part_prune_result_list, NULL);
 			continue;
 		}
 
-		foreach(lc2, plannedstmt->rtable)
+		/*
+		 * Figure out the set of relations that would need to be locked
+		 * before executing the plan.
+		 */
+		if (plannedstmt->containsInitialPruning)
 		{
-			RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
+			/*
+			 * Obtain the set of partitions to be locked from the
+			 * PartitionPruneInfos by considering the result of performing
+			 * initial partition pruning.
+			 */
+			part_prune_result =
+				ExecutorDoInitialPruning(plannedstmt, boundParams);
+
+			allLockRelids = bms_union(plannedstmt->minLockRelids,
+									  part_prune_result->scan_leafpart_rtis);
+		}
+		else
+			allLockRelids = plannedstmt->minLockRelids;
+
+		rti = -1;
+		while ((rti = bms_next_member(allLockRelids, rti)) > 0)
+		{
+			RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
 
 			if (rte->rtekind != RTE_RELATION)
 				continue;
@@ -1778,10 +1872,58 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 			 * fail if it's been dropped entirely --- we'll just transiently
 			 * acquire a non-conflicting lock.
 			 */
-			if (acquire)
-				LockRelationOid(rte->relid, rte->rellockmode);
-			else
-				UnlockRelationOid(rte->relid, rte->rellockmode);
+			LockRelationOid(rte->relid, rte->rellockmode);
+			lockedRelids = bms_add_member(lockedRelids, rti);
+		}
+
+		*part_prune_result_list = lappend(*part_prune_result_list,
+										  part_prune_result);
+		*lockedRelids_per_stmt = lappend(*lockedRelids_per_stmt, lockedRelids);
+	}
+}
+
+/*
+ * ReleaseExecutorLocks
+ * 		Release locks that would've been acquired by an earlier call to
+ * 		AcquireExecutorLocks()
+ */
+static void
+ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt)
+{
+	ListCell   *lc1,
+			   *lc2;
+
+	forboth(lc1, stmt_list, lc2, lockedRelids_per_stmt)
+	{
+		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
+		Bitmapset  *lockedRelids = lfirst(lc2);
+		int			rti;
+
+		if (plannedstmt->commandType == CMD_UTILITY)
+		{
+			/*
+			 * Ignore utility statements, except those (such as EXPLAIN) that
+			 * contain a parsed-but-not-planned query.  Note: it's okay to use
+			 * ScanQueryForLocks, even though the query hasn't been through
+			 * rule rewriting, because rewriting doesn't change the query
+			 * representation.
+			 */
+			Query	   *query = UtilityContainsQuery(plannedstmt->utilityStmt);
+
+			if (query)
+				ScanQueryForLocks(query, false);
+			continue;
+		}
+
+		rti = -1;
+		while ((rti = bms_next_member(lockedRelids, rti)) >= 0)
+		{
+			RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
+
+			Assert(rte->rtekind == RTE_RELATION);
+
+			/* See the comment in AcquireExecutorLocks(). */
+			UnlockRelationOid(rte->relid, rte->rellockmode);
 		}
 	}
 }
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index d549f66d4a..1bbe6b704b 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -303,6 +303,25 @@ PortalDefineQuery(Portal portal,
 	portal->status = PORTAL_DEFINED;
 }
 
+/*
+ * PortalStorePartitionPruneResults
+ *		Copy the given list of PartitionPruneResults into the portal's
+ *		context
+ *
+ * This allows the caller to ensure that the list exists as long as the portal
+ * does.
+ */
+void
+PortalStorePartitionPruneResults(Portal portal, List *part_prune_results)
+{
+	MemoryContext	oldcxt;
+
+	AssertArg(PortalIsValid(portal));
+	oldcxt = MemoryContextSwitchTo(portal->portalContext);
+	portal->part_prune_results = copyObject(part_prune_results);
+	MemoryContextSwitchTo(oldcxt);
+}
+
 /*
  * PortalReleaseCachedPlan
  *		Release a portal's reference to its cached plan, if any.
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 666977fb1f..bbc8c42d88 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -87,7 +87,9 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
 							  ExplainState *es, const char *queryString,
 							  ParamListInfo params, QueryEnvironment *queryEnv);
 
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt,
+						   PartitionPruneResult *part_prune_result,
+						   IntoClause *into,
 						   ExplainState *es, const char *queryString,
 						   ParamListInfo params, QueryEnvironment *queryEnv,
 						   const instr_time *planduration,
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 708435e952..bd8776402e 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -45,6 +45,7 @@ extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
  * nparts						Length of subplan_map[] and subpart_map[].
  * subplan_map					Subplan index by partition index, or -1.
  * subpart_map					Subpart index by partition index, or -1.
+ * rti_map						Range table index by partition index, or 0.
  * present_parts				A Bitmapset of the partition indexes that we
  *								have subplans or subparts for.
  * initial_pruning_steps		List of PartitionPruneSteps used to
@@ -61,6 +62,7 @@ typedef struct PartitionedRelPruningData
 	int			nparts;
 	int		   *subplan_map;
 	int		   *subpart_map;
+	Index	   *rti_map;
 	Bitmapset  *present_parts;
 	List	   *initial_pruning_steps;
 	List	   *exec_pruning_steps;
@@ -123,9 +125,13 @@ typedef struct PartitionPruneState
 
 extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
 													 int n_total_subplans,
-													 PartitionPruneInfo *pruneinfo,
+													 int part_prune_index,
 													 Bitmapset **initially_valid_subplans);
 extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
-										   bool initial_prune);
-
+										   bool initial_prune,
+										   Bitmapset **scan_leafpart_rtis);
+extern Bitmapset *ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt,
+								ParamListInfo params,
+								PartitionPruneInfo *pruneinfo,
+								Bitmapset **scan_leafpart_rtis);
 #endif							/* EXECPARTITION_H */
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index e79e2c001f..60d5644908 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -35,6 +35,8 @@ typedef struct QueryDesc
 	/* These fields are provided by CreateQueryDesc */
 	CmdType		operation;		/* CMD_SELECT, CMD_UPDATE, etc. */
 	PlannedStmt *plannedstmt;	/* planner's output (could be utility, too) */
+	PartitionPruneResult *part_prune_result; /* ExecutorDoInitialPruning()'s
+											  * output for plannedstmt */
 	const char *sourceText;		/* source text of the query */
 	Snapshot	snapshot;		/* snapshot to use for query */
 	Snapshot	crosscheck_snapshot;	/* crosscheck for RI update/delete */
@@ -57,6 +59,7 @@ typedef struct QueryDesc
 
 /* in pquery.c */
 extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
+								  PartitionPruneResult *part_prune_result,
 								  const char *sourceText,
 								  Snapshot snapshot,
 								  Snapshot crosscheck_snapshot,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index 873772f188..57dc0e8077 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -185,6 +185,8 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
 /*
  * prototypes from functions in execMain.c
  */
+extern PartitionPruneResult *ExecutorDoInitialPruning(PlannedStmt *plannedstmt,
+													  ParamListInfo params);
 extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
 extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
 extern void ExecutorRun(QueryDesc *queryDesc,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 94b191f8ae..a8bf908d63 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -596,6 +596,8 @@ typedef struct EState
 	struct ExecRowMark **es_rowmarks;	/* Array of per-range-table-entry
 										 * ExecRowMarks, or NULL if none */
 	PlannedStmt *es_plannedstmt;	/* link to top of plan tree */
+	List		*es_part_prune_infos;	/* PlannedStmt.partPruneInfos */
+	struct PartitionPruneResult *es_part_prune_result; /* QueryDesc.part_prune_result */
 	const char *es_sourceText;	/* Source text from QueryDesc */
 
 	JunkFilter *es_junkFilter;	/* top-level junk filter, if any */
@@ -984,6 +986,34 @@ typedef struct DomainConstraintState
  */
 typedef TupleTableSlot *(*ExecProcNodeMtd) (struct PlanState *pstate);
 
+/*----------------
+ * PartitionPruneResult
+ *
+ * The result of invoking ExecutorDoInitialPruning() on a given
+ * PlannedStmt.
+ *
+ * Contains a list of Bitmapsets, one for every PartitionPruneInfo found in
+ * PlannedStmt.partPruneInfos, each holding the indexes of the subplans that
+ * remain after initial pruning is performed by ExecFindMatchingSubPlans().
+ * RT indexes of the leaf partitions scanned by those subplans across all
+ * PartitionPruneInfos are added into scan_leafpart_rtis.
+ *
+ * This is used by GetCachedPlan() to inform its callers of the pruning
+ * decisions made when performing AcquireExecutorLocks() on a given cached
+ * PlannedStmt, which the callers then pass onto the executor.  The executor
+ * refers to this node when made available when initializing the plan nodes to
+ * which those PartitionPruneInfos apply so that the same set of qualifying
+ * subplans are initialized, rather than deriving that set again by redoing
+ * initial pruning.
+ */
+typedef struct PartitionPruneResult
+{
+	NodeTag		type;
+
+	List		   *valid_subplan_offs_list;
+	Bitmapset	   *scan_leafpart_rtis;
+} PartitionPruneResult;
+
 /* ----------------
  *		PlanState node
  *
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 340d28f4e1..66416bce97 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -97,6 +97,9 @@ typedef enum NodeTag
 	T_PartitionPruneStepCombine,
 	T_PlanInvalItem,
 
+	/* TAGS FOR EXECUTOR PREP NODES (execnodes.h) */
+	T_PartitionPruneResult,
+
 	/*
 	 * TAGS FOR PLAN STATE NODES (execnodes.h)
 	 *
@@ -674,6 +677,7 @@ extern struct Bitmapset *readBitmapset(void);
 extern uintptr_t readDatum(bool typbyval);
 extern bool *readBoolCols(int numCols);
 extern int *readIntCols(int numCols);
+extern Index *readIndexCols(int numCols);
 extern Oid *readOidCols(int numCols);
 extern int16 *readAttrNumberCols(int numCols);
 
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index c5ab53e05c..11007cda25 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -107,6 +107,18 @@ typedef struct PlannerGlobal
 
 	List	   *appendRelations;	/* "flat" list of AppendRelInfos */
 
+	List	   *partPruneInfos;		/* List of PartitionPruneInfo contained in
+									 * the plan */
+
+	bool		containsInitialPruning;	/* Do any of those PartitionPruneInfos
+										 * have initial (pre-exec) pruning
+										 * steps in them? */
+
+	Bitmapset  *minLockRelids;	/* Indexes of all range table entries minus
+								 * indexes of range table entries of the leaf
+								 * partitions scanned by prunable subplans;
+								 * see AcquireExecutorLocks() */
+
 	List	   *relationOids;	/* OIDs of relations the plan depends on */
 
 	List	   *invalItems;		/* other dependencies, as PlanInvalItems */
@@ -377,6 +389,9 @@ struct PlannerInfo
 
 	/* Does this query modify any partition key columns? */
 	bool		partColsUpdated;
+
+	/* PartitionPruneInfos added in this query's plan. */
+	List	   *partPruneInfos;
 };
 
 
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index e43e360d9b..f8f3971f44 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -64,8 +64,20 @@ typedef struct PlannedStmt
 
 	struct Plan *planTree;		/* tree of Plan nodes */
 
+	List	   *partPruneInfos;	/* List of PartitionPruneInfo contained in
+								 * the plan */
+
+	bool		containsInitialPruning;	/* Do any of those PartitionPruneInfos
+										 * have initial (pre-exec) pruning
+										 * steps in them? */
+
 	List	   *rtable;			/* list of RangeTblEntry nodes */
 
+	Bitmapset  *minLockRelids;	/* Indexes of all range table entries minus
+								 * indexes of range table entries of the leaf
+								 * partitions scanned by prunable subplans;
+								 * see AcquireExecutorLocks() */
+
 	/* rtable indexes of target relations for INSERT/UPDATE/DELETE */
 	List	   *resultRelations;	/* integer list of RT indexes, or NIL */
 
@@ -262,8 +274,8 @@ typedef struct Append
 	 */
 	int			first_partial_plan;
 
-	/* Info for run-time subplan pruning; NULL if we're not doing that */
-	struct PartitionPruneInfo *part_prune_info;
+	/* Index to PlannerInfo.partPruneInfos or -1 if no run-time pruning */
+	int			part_prune_index;
 } Append;
 
 /* ----------------
@@ -282,8 +294,9 @@ typedef struct MergeAppend
 	Oid		   *sortOperators;	/* OIDs of operators to sort them by */
 	Oid		   *collations;		/* OIDs of collations */
 	bool	   *nullsFirst;		/* NULLS FIRST/LAST directions */
-	/* Info for run-time subplan pruning; NULL if we're not doing that */
-	struct PartitionPruneInfo *part_prune_info;
+
+	/* Index to PlannerInfo.partPruneInfos or -1 if no run-time pruning */
+	int			part_prune_index;
 } MergeAppend;
 
 /* ----------------
@@ -1191,6 +1204,13 @@ typedef struct PlanRowMark
  * prune_infos			List of Lists containing PartitionedRelPruneInfo nodes,
  *						one sublist per run-time-prunable partition hierarchy
  *						appearing in the parent plan node's subplans.
+ *
+ * needs_init_pruning	Does any of the PartitionedRelPruneInfos in
+ *						prune_infos have its initial_pruning_steps set?
+ *
+ * needs_exec_pruning	Does any of the PartitionedRelPruneInfos in
+ *						prune_infos have its exec_pruning_steps set?
+ *
  * other_subplans		Indexes of any subplans that are not accounted for
  *						by any of the PartitionedRelPruneInfo nodes in
  *						"prune_infos".  These subplans must not be pruned.
@@ -1199,6 +1219,8 @@ typedef struct PartitionPruneInfo
 {
 	NodeTag		type;
 	List	   *prune_infos;
+	bool		needs_init_pruning;
+	bool		needs_exec_pruning;
 	Bitmapset  *other_subplans;
 } PartitionPruneInfo;
 
@@ -1229,6 +1251,7 @@ typedef struct PartitionedRelPruneInfo
 	int		   *subplan_map;	/* subplan index by partition index, or -1 */
 	int		   *subpart_map;	/* subpart index by partition index, or -1 */
 	Oid		   *relid_map;		/* relation OID by partition index, or 0 */
+	Index	   *rti_map;		/* Range table index by partition index, or 0. */
 
 	/*
 	 * initial_pruning_steps shows how to prune during executor startup (i.e.,
diff --git a/src/include/partitioning/partprune.h b/src/include/partitioning/partprune.h
index 90684efa25..ebf0dcff8c 100644
--- a/src/include/partitioning/partprune.h
+++ b/src/include/partitioning/partprune.h
@@ -70,10 +70,10 @@ typedef struct PartitionPruneContext
 #define PruneCxtStateIdx(partnatts, step_id, keyno) \
 	((partnatts) * (step_id) + (keyno))
 
-extern PartitionPruneInfo *make_partition_pruneinfo(struct PlannerInfo *root,
-													struct RelOptInfo *parentrel,
-													List *subpaths,
-													List *prunequal);
+extern int make_partition_pruneinfo(struct PlannerInfo *root,
+									struct RelOptInfo *parentrel,
+									List *subpaths,
+									List *prunequal);
 extern Bitmapset *prune_append_rel_partitions(struct RelOptInfo *rel);
 extern Bitmapset *get_matching_partitions(PartitionPruneContext *context,
 										  List *pruning_steps);
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 95b99e3d25..449200b949 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -220,7 +220,8 @@ extern List *CachedPlanGetTargetList(CachedPlanSource *plansource,
 extern CachedPlan *GetCachedPlan(CachedPlanSource *plansource,
 								 ParamListInfo boundParams,
 								 ResourceOwner owner,
-								 QueryEnvironment *queryEnv);
+								 QueryEnvironment *queryEnv,
+								 List **part_prune_result_list);
 extern void ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner);
 
 extern bool CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index aeddbdafe5..9f7727a837 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -138,6 +138,7 @@ typedef struct PortalData
 	QueryCompletion qc;			/* command completion data for executed query */
 	List	   *stmts;			/* list of PlannedStmts */
 	CachedPlan *cplan;			/* CachedPlan, if stmts are from one */
+	List	   *part_prune_results;	/* list of PartitionPruneResults */
 
 	ParamListInfo portalParams; /* params to pass to query */
 	QueryEnvironment *queryEnv; /* environment for query */
@@ -242,6 +243,8 @@ extern void PortalDefineQuery(Portal portal,
 							  CommandTag commandTag,
 							  List *stmts,
 							  CachedPlan *cplan);
+extern void PortalStorePartitionPruneResults(Portal portal,
+											 List *part_prune_result_list);
 extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
 extern void PortalCreateHoldStore(Portal portal);
 extern void PortalHashTableDeleteAll(void);
-- 
2.24.1




* Re: generic plans and "initial" pruning
@ 2022-04-11 03:58  Zhihong Yu <[email protected]>
  parent: Amit Langote <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: Zhihong Yu @ 2022-04-11 03:58 UTC (permalink / raw)
  To: Amit Langote <[email protected]>; +Cc: David Rowley <[email protected]>; Robert Haas <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers

On Sun, Apr 10, 2022 at 8:05 PM Amit Langote <[email protected]>
wrote:

> On Fri, Apr 8, 2022 at 8:45 PM Amit Langote <[email protected]>
> wrote:
> > Most changes looked fine to me except a couple of typos, so I've
> > adopted those into the attached new version, even though I know it's
> > too late to try to apply it.
> >
> > + * XXX is it worth doing a bms_copy() on glob->minLockRelids if
> > + * glob->containsInitialPruning is true? I'm slightly worried that the
> > + * Bitmapset could have a very long empty tail resulting in excessive
> > + * looping during AcquireExecutorLocks().
> > + */
> >
> > I guess I trust your instincts about bitmapset operation efficiency
> > and what you've written here makes sense.  It's typical for leaf
> > partitions to have been appended toward the tail end of rtable and I'd
> > imagine their indexes would be in the tail words of minLockRelids.  If
> > copying the bitmapset removes those useless words, I don't see why we
> > shouldn't do that.  So added:
> >
> > + /*
> > + * It seems worth doing a bms_copy() on glob->minLockRelids if we deleted
> > + * bits from it just above to prevent empty tail bits resulting in
> > + * inefficient looping during AcquireExecutorLocks().
> > + */
> > + if (glob->containsInitialPruning)
> > + glob->minLockRelids = bms_copy(glob->minLockRelids)
> >
> > Not 100% sure about the comment I wrote.
>
> And the quoted code change missed a semicolon in the v14 that I
> hurriedly sent on Friday.   (Had apparently forgotten to `git add` the
> hunk to fix that).
>
> Sending v15 that fixes that to keep the cfbot green for now.
>
> --
> Amit Langote
> EDB: http://www.enterprisedb.com
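
For readers following the quoted bms_copy() discussion: the concern is that a word-array bitset keeps its allocated length after bits are deleted, so a scan still visits the now-empty tail words. Below is a minimal sketch of that effect using a toy bitset. ToySet and all toy_* functions are hypothetical and are not PostgreSQL's Bitmapset API. In particular, the sketch assumes a copy routine that trims trailing zero words, which is exactly the "if" in the quoted text and not something it asserts about bms_copy().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define TOY_BITS_PER_WORD 64

/* Toy word-array bitset; a hypothetical stand-in, NOT PostgreSQL's Bitmapset. */
typedef struct ToySet
{
	int			nwords;		/* allocated words, not "highest set bit" */
	uint64_t   *words;
} ToySet;

ToySet *
toy_make(int nwords)
{
	ToySet	   *s = malloc(sizeof(ToySet));

	assert(s != NULL && nwords > 0);
	s->nwords = nwords;
	s->words = calloc((size_t) nwords, sizeof(uint64_t));
	assert(s->words != NULL);
	return s;
}

void
toy_free(ToySet *s)
{
	free(s->words);
	free(s);
}

void
toy_add(ToySet *s, int x)
{
	assert(x >= 0 && x / TOY_BITS_PER_WORD < s->nwords);
	s->words[x / TOY_BITS_PER_WORD] |= UINT64_C(1) << (x % TOY_BITS_PER_WORD);
}

/* Deleting a member clears the bit but never shrinks the word array. */
void
toy_del(ToySet *s, int x)
{
	assert(x >= 0 && x / TOY_BITS_PER_WORD < s->nwords);
	s->words[x / TOY_BITS_PER_WORD] &= ~(UINT64_C(1) << (x % TOY_BITS_PER_WORD));
}

int
toy_is_member(const ToySet *s, int x)
{
	if (x < 0 || x / TOY_BITS_PER_WORD >= s->nwords)
		return 0;
	return (s->words[x / TOY_BITS_PER_WORD] >> (x % TOY_BITS_PER_WORD)) & 1;
}

/* A full scan has to look at every allocated word, empty or not. */
int
toy_words_scanned(const ToySet *s)
{
	return s->nwords;
}

/* ASSUMED behavior: copy while dropping trailing all-zero words. */
ToySet *
toy_copy_trimmed(const ToySet *s)
{
	int			n = s->nwords;
	ToySet	   *c;

	while (n > 1 && s->words[n - 1] == 0)
		n--;
	c = toy_make(n);
	memcpy(c->words, s->words, (size_t) n * sizeof(uint64_t));
	return c;
}
```

With a 16-word set holding members 3 and 1000, deleting 1000 leaves 16 words to scan, while the trimmed copy has only one. Whether a real copy gives the same saving depends on how the copy routine is implemented.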

Hi,

+               /* RT index of the partitione table. */

partitione -> partitioned

Cheers



* Re: generic plans and "initial" pruning
@ 2022-05-27 08:09  Amit Langote <[email protected]>
  parent: Zhihong Yu <[email protected]>
  0 siblings, 2 replies; 108+ messages in thread

From: Amit Langote @ 2022-05-27 08:09 UTC (permalink / raw)
  To: Zhihong Yu <[email protected]>; +Cc: David Rowley <[email protected]>; Robert Haas <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers

On Mon, Apr 11, 2022 at 12:53 PM Zhihong Yu <[email protected]> wrote:
> On Sun, Apr 10, 2022 at 8:05 PM Amit Langote <[email protected]> wrote:
>> Sending v15 that fixes that to keep the cfbot green for now.
>
> Hi,
>
> +               /* RT index of the partitione table. */
>
> partitione -> partitioned

Thanks, fixed.

Also, I broke this into two patches:

0001 contains the mechanical changes of moving PartitionPruneInfo out
of Append/MergeAppend into a list in PlannedStmt.

0002 is the main patch to "Optimize AcquireExecutorLocks() by locking
only unpruned partitions".
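
As a rough illustration of what 0001 does (a toy sketch, not code from the patch): each plan node stores an integer index instead of a pointer, the per-query list is later appended to a global list, and the node's index is shifted by the number of entries already in the global list before that append happens. All Toy* types and toy_* functions below are made up for this example; a fixed-size array stands in for a List, and only the rtindex adjustment and the -1 "no pruning" convention are taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_INFOS 32

/* Hypothetical stand-ins for PartitionPruneInfo, a List, and Append. */
typedef struct ToyPruneInfo
{
	int			rtindex;
} ToyPruneInfo;

typedef struct ToyInfoList
{
	int			length;
	ToyPruneInfo *items[TOY_MAX_INFOS];
} ToyInfoList;

typedef struct ToyAppend
{
	int			part_prune_index;	/* index into the list, or -1 */
} ToyAppend;

/* Append an entry and return its 0-based index in the list. */
int
toy_add_info(ToyInfoList *list, ToyPruneInfo *info)
{
	assert(list->length < TOY_MAX_INFOS);
	list->items[list->length++] = info;
	return list->length - 1;
}

/*
 * Turn a query-local index into a global one.  This must run before the
 * local entries are appended to the global list.
 */
void
toy_fix_index(ToyAppend *node, const ToyInfoList *global)
{
	if (node->part_prune_index >= 0)
		node->part_prune_index += global->length;
}

/* Append the local entries to the global list, offsetting rtindex. */
void
toy_flush(ToyInfoList *global, const ToyInfoList *local, int rtoffset)
{
	int			i;

	for (i = 0; i < local->length; i++)
	{
		local->items[i]->rtindex += rtoffset;
		toy_add_info(global, local->items[i]);
	}
}

/* Executor-side lookup: index to entry, NULL when no pruning applies. */
ToyPruneInfo *
toy_lookup(const ToyInfoList *global, const ToyAppend *node)
{
	if (node->part_prune_index < 0)
		return NULL;
	assert(node->part_prune_index < global->length);
	return global->items[node->part_prune_index];
}
```

The point of the indirection is that anything holding the global list can visit every entry with a plain loop, without walking the plan tree, while each node can still find its own entry in constant time.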

-- 
Thanks, Amit Langote
EDB: http://www.enterprisedb.com


Attachments:

  [application/octet-stream] v16-0001-Move-PartitioPruneInfo-out-of-plan-nodes-into-Pl.patch (21.2K, 2-v16-0001-Move-PartitioPruneInfo-out-of-plan-nodes-into-Pl.patch)
From 16fd07b7c8ffde7632ffa7b03e4595e1e08d7e06 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Fri, 27 May 2022 16:00:28 +0900
Subject: [PATCH v16 1/2] Move PartitionPruneInfo out of plan nodes into
 PlannedStmt

The planner will now add a given PartitionPruneInfo to
PlannedStmt.partPruneInfos instead of to the Append/MergeAppend plan
node to which it was added until now, and set an index field in
the plan node that points to the list element.

A later commit will make AcquireExecutorLocks() do the initial
partition pruning to determine a minimal set of partitions to be
locked to validate a plan tree, and it will need to consult the
PartitionPruneInfos referenced therein to do so.  It is better for
the PartitionPruneInfos to be accessible directly, by simply
iterating over PlannedStmt.partPruneInfos, than to have to find
them individually by walking the plan tree.
---
 src/backend/executor/execMain.c         |  1 +
 src/backend/executor/execParallel.c     |  1 +
 src/backend/executor/execPartition.c    |  4 +-
 src/backend/executor/execUtils.c        |  2 +
 src/backend/executor/nodeAppend.c       |  4 +-
 src/backend/executor/nodeMergeAppend.c  |  4 +-
 src/backend/nodes/copyfuncs.c           |  5 +-
 src/backend/nodes/outfuncs.c            |  7 ++-
 src/backend/nodes/readfuncs.c           |  5 +-
 src/backend/optimizer/plan/createplan.c | 24 ++++-----
 src/backend/optimizer/plan/planner.c    |  1 +
 src/backend/optimizer/plan/setrefs.c    | 65 +++++++++++++------------
 src/backend/partitioning/partprune.c    | 18 ++++---
 src/include/executor/execPartition.h    |  3 +-
 src/include/nodes/execnodes.h           |  2 +
 src/include/nodes/pathnodes.h           |  6 +++
 src/include/nodes/plannodes.h           | 12 +++--
 src/include/partitioning/partprune.h    |  8 +--
 18 files changed, 104 insertions(+), 68 deletions(-)

diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index ef2fd46092..72fc273524 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -825,6 +825,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 	ExecInitRangeTable(estate, rangeTable);
 
 	estate->es_plannedstmt = plannedstmt;
+	estate->es_part_prune_infos = plannedstmt->partPruneInfos;
 
 	/*
 	 * Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index f1fd7f7e8b..f73b8c2607 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -183,6 +183,7 @@ ExecSerializePlan(Plan *plan, EState *estate)
 	pstmt->dependsOnRole = false;
 	pstmt->parallelModeNeeded = false;
 	pstmt->planTree = plan;
+	pstmt->partPruneInfos = estate->es_part_prune_infos;
 	pstmt->rtable = estate->es_range_table;
 	pstmt->resultRelations = NIL;
 	pstmt->appendRelations = NIL;
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index e03ea27299..b55cdd2580 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -1638,11 +1638,13 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
 PartitionPruneState *
 ExecInitPartitionPruning(PlanState *planstate,
 						 int n_total_subplans,
-						 PartitionPruneInfo *pruneinfo,
+						 int part_prune_index,
 						 Bitmapset **initially_valid_subplans)
 {
 	PartitionPruneState *prunestate;
 	EState	   *estate = planstate->state;
+	PartitionPruneInfo *pruneinfo = list_nth(estate->es_part_prune_infos,
+											 part_prune_index);
 
 	/* We may need an expression context to evaluate partition exprs */
 	ExecAssignExprContext(estate, planstate);
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 9df1f81ea8..f9c7976ff2 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -119,6 +119,8 @@ CreateExecutorState(void)
 	estate->es_relations = NULL;
 	estate->es_rowmarks = NULL;
 	estate->es_plannedstmt = NULL;
+	estate->es_part_prune_infos = NIL;
+	estate->es_part_prune_result = NULL;
 
 	estate->es_junkFilter = NULL;
 
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 357e10a1d7..c6f86a6510 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -134,7 +134,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 	appendstate->as_begun = false;
 
 	/* If run-time partition pruning is enabled, then set that up now */
-	if (node->part_prune_info != NULL)
+	if (node->part_prune_index >= 0)
 	{
 		PartitionPruneState *prunestate;
 
@@ -145,7 +145,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 		 */
 		prunestate = ExecInitPartitionPruning(&appendstate->ps,
 											  list_length(node->appendplans),
-											  node->part_prune_info,
+											  node->part_prune_index,
 											  &validsubplans);
 		appendstate->as_prune_state = prunestate;
 		nplans = bms_num_members(validsubplans);
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index c5c62fa5c7..8d35860c30 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -82,7 +82,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
 	mergestate->ps.ExecProcNode = ExecMergeAppend;
 
 	/* If run-time partition pruning is enabled, then set that up now */
-	if (node->part_prune_info != NULL)
+	if (node->part_prune_index >= 0)
 	{
 		PartitionPruneState *prunestate;
 
@@ -93,7 +93,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
 		 */
 		prunestate = ExecInitPartitionPruning(&mergestate->ps,
 											  list_length(node->mergeplans),
-											  node->part_prune_info,
+											  node->part_prune_index,
 											  &validsubplans);
 		mergestate->ms_prune_state = prunestate;
 		nplans = bms_num_members(validsubplans);
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 51d630fa89..8fbeaa4f36 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -96,6 +96,7 @@ _copyPlannedStmt(const PlannedStmt *from)
 	COPY_SCALAR_FIELD(parallelModeNeeded);
 	COPY_SCALAR_FIELD(jitFlags);
 	COPY_NODE_FIELD(planTree);
+	COPY_NODE_FIELD(partPruneInfos);
 	COPY_NODE_FIELD(rtable);
 	COPY_NODE_FIELD(resultRelations);
 	COPY_NODE_FIELD(appendRelations);
@@ -253,7 +254,7 @@ _copyAppend(const Append *from)
 	COPY_NODE_FIELD(appendplans);
 	COPY_SCALAR_FIELD(nasyncplans);
 	COPY_SCALAR_FIELD(first_partial_plan);
-	COPY_NODE_FIELD(part_prune_info);
+	COPY_SCALAR_FIELD(part_prune_index);
 
 	return newnode;
 }
@@ -281,7 +282,7 @@ _copyMergeAppend(const MergeAppend *from)
 	COPY_POINTER_FIELD(sortOperators, from->numCols * sizeof(Oid));
 	COPY_POINTER_FIELD(collations, from->numCols * sizeof(Oid));
 	COPY_POINTER_FIELD(nullsFirst, from->numCols * sizeof(bool));
-	COPY_NODE_FIELD(part_prune_info);
+	COPY_SCALAR_FIELD(part_prune_index);
 
 	return newnode;
 }
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index ce12915592..72fcd8a6ee 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -321,6 +321,7 @@ _outPlannedStmt(StringInfo str, const PlannedStmt *node)
 	WRITE_BOOL_FIELD(parallelModeNeeded);
 	WRITE_INT_FIELD(jitFlags);
 	WRITE_NODE_FIELD(planTree);
+	WRITE_NODE_FIELD(partPruneInfos);
 	WRITE_NODE_FIELD(rtable);
 	WRITE_NODE_FIELD(resultRelations);
 	WRITE_NODE_FIELD(appendRelations);
@@ -450,7 +451,7 @@ _outAppend(StringInfo str, const Append *node)
 	WRITE_NODE_FIELD(appendplans);
 	WRITE_INT_FIELD(nasyncplans);
 	WRITE_INT_FIELD(first_partial_plan);
-	WRITE_NODE_FIELD(part_prune_info);
+	WRITE_INT_FIELD(part_prune_index);
 }
 
 static void
@@ -467,7 +468,7 @@ _outMergeAppend(StringInfo str, const MergeAppend *node)
 	WRITE_OID_ARRAY(sortOperators, node->numCols);
 	WRITE_OID_ARRAY(collations, node->numCols);
 	WRITE_BOOL_ARRAY(nullsFirst, node->numCols);
-	WRITE_NODE_FIELD(part_prune_info);
+	WRITE_INT_FIELD(part_prune_index);
 }
 
 static void
@@ -2434,6 +2435,7 @@ _outPlannerGlobal(StringInfo str, const PlannerGlobal *node)
 	WRITE_NODE_FIELD(finalrowmarks);
 	WRITE_NODE_FIELD(resultRelations);
 	WRITE_NODE_FIELD(appendRelations);
+	WRITE_NODE_FIELD(partPruneInfos);
 	WRITE_NODE_FIELD(relationOids);
 	WRITE_NODE_FIELD(invalItems);
 	WRITE_NODE_FIELD(paramExecTypes);
@@ -2501,6 +2503,7 @@ _outPlannerInfo(StringInfo str, const PlannerInfo *node)
 	WRITE_BITMAPSET_FIELD(curOuterRels);
 	WRITE_NODE_FIELD(curOuterParams);
 	WRITE_BOOL_FIELD(partColsUpdated);
+	WRITE_NODE_FIELD(partPruneInfos);
 }
 
 static void
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 6a05b69415..bf602ff93e 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -1817,6 +1817,7 @@ _readPlannedStmt(void)
 	READ_BOOL_FIELD(parallelModeNeeded);
 	READ_INT_FIELD(jitFlags);
 	READ_NODE_FIELD(planTree);
+	READ_NODE_FIELD(partPruneInfos);
 	READ_NODE_FIELD(rtable);
 	READ_NODE_FIELD(resultRelations);
 	READ_NODE_FIELD(appendRelations);
@@ -1949,7 +1950,7 @@ _readAppend(void)
 	READ_NODE_FIELD(appendplans);
 	READ_INT_FIELD(nasyncplans);
 	READ_INT_FIELD(first_partial_plan);
-	READ_NODE_FIELD(part_prune_info);
+	READ_INT_FIELD(part_prune_index);
 
 	READ_DONE();
 }
@@ -1971,7 +1972,7 @@ _readMergeAppend(void)
 	READ_OID_ARRAY(sortOperators, local_node->numCols);
 	READ_OID_ARRAY(collations, local_node->numCols);
 	READ_BOOL_ARRAY(nullsFirst, local_node->numCols);
-	READ_NODE_FIELD(part_prune_info);
+	READ_INT_FIELD(part_prune_index);
 
 	READ_DONE();
 }
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index 76606faa3e..58a05cf673 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -1203,7 +1203,6 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
 	ListCell   *subpaths;
 	int			nasyncplans = 0;
 	RelOptInfo *rel = best_path->path.parent;
-	PartitionPruneInfo *partpruneinfo = NULL;
 	int			nodenumsortkeys = 0;
 	AttrNumber *nodeSortColIdx = NULL;
 	Oid		   *nodeSortOperators = NULL;
@@ -1354,6 +1353,9 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
 		subplans = lappend(subplans, subplan);
 	}
 
+	/* Set below if we find quals that we can use to run-time prune */
+	plan->part_prune_index = -1;
+
 	/*
 	 * If any quals exist, they may be useful to perform further partition
 	 * pruning during execution.  Gather information needed by the executor to
@@ -1377,16 +1379,14 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
 		}
 
 		if (prunequal != NIL)
-			partpruneinfo =
-				make_partition_pruneinfo(root, rel,
-										 best_path->subpaths,
-										 prunequal);
+			plan->part_prune_index = make_partition_pruneinfo(root, rel,
+															  best_path->subpaths,
+															  prunequal);
 	}
 
 	plan->appendplans = subplans;
 	plan->nasyncplans = nasyncplans;
 	plan->first_partial_plan = best_path->first_partial_path;
-	plan->part_prune_info = partpruneinfo;
 
 	copy_generic_path_info(&plan->plan, (Path *) best_path);
 
@@ -1426,7 +1426,6 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
 	List	   *subplans = NIL;
 	ListCell   *subpaths;
 	RelOptInfo *rel = best_path->path.parent;
-	PartitionPruneInfo *partpruneinfo = NULL;
 
 	/*
 	 * We don't have the actual creation of the MergeAppend node split out
@@ -1519,6 +1518,9 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
 		subplans = lappend(subplans, subplan);
 	}
 
+	/* Set below if we find quals that we can use to run-time prune */
+	node->part_prune_index = -1;
+
 	/*
 	 * If any quals exist, they may be useful to perform further partition
 	 * pruning during execution.  Gather information needed by the executor to
@@ -1542,13 +1544,13 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
 		}
 
 		if (prunequal != NIL)
-			partpruneinfo = make_partition_pruneinfo(root, rel,
-													 best_path->subpaths,
-													 prunequal);
+			node->part_prune_index = make_partition_pruneinfo(root, rel,
+															  best_path->subpaths,
+															  prunequal);
 	}
 
 	node->mergeplans = subplans;
-	node->part_prune_info = partpruneinfo;
+
 
 	/*
 	 * If prepare_sort_from_pathkeys added sort columns, but we were told to
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index a0f2390334..32e658b5d6 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -518,6 +518,7 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 	result->dependsOnRole = glob->dependsOnRole;
 	result->parallelModeNeeded = glob->parallelModeNeeded;
 	result->planTree = top_plan;
+	result->partPruneInfos = glob->partPruneInfos;
 	result->rtable = glob->finalrtable;
 	result->resultRelations = glob->resultRelations;
 	result->appendRelations = glob->appendRelations;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index d95fd89807..aafe1c149d 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -348,6 +348,29 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 		}
 	}
 
+	/* Also fix up the information in PartitionPruneInfos. */
+	foreach (lc, root->partPruneInfos)
+	{
+		PartitionPruneInfo *pruneinfo = lfirst(lc);
+		ListCell  *l;
+
+		foreach(l, pruneinfo->prune_infos)
+		{
+			List	   *prune_infos = lfirst(l);
+			ListCell   *l2;
+
+			foreach(l2, prune_infos)
+			{
+				PartitionedRelPruneInfo *pinfo = lfirst(l2);
+
+				/* RT index of the table to which the pinfo belongs. */
+				pinfo->rtindex += rtoffset;
+			}
+		}
+
+		glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
+	}
+
 	return result;
 }
 
@@ -1640,21 +1663,12 @@ set_append_references(PlannerInfo *root,
 
 	aplan->apprelids = offset_relid_set(aplan->apprelids, rtoffset);
 
-	if (aplan->part_prune_info)
-	{
-		foreach(l, aplan->part_prune_info->prune_infos)
-		{
-			List	   *prune_infos = lfirst(l);
-			ListCell   *l2;
-
-			foreach(l2, prune_infos)
-			{
-				PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
-				pinfo->rtindex += rtoffset;
-			}
-		}
-	}
+	/*
+	 * PartitionPruneInfos will be added to a list in PlannerGlobal, so update
+	 * the index.
+	 */
+	if (aplan->part_prune_index >= 0)
+		aplan->part_prune_index += list_length(root->glob->partPruneInfos);
 
 	/* We don't need to recurse to lefttree or righttree ... */
 	Assert(aplan->plan.lefttree == NULL);
@@ -1712,21 +1726,12 @@ set_mergeappend_references(PlannerInfo *root,
 
 	mplan->apprelids = offset_relid_set(mplan->apprelids, rtoffset);
 
-	if (mplan->part_prune_info)
-	{
-		foreach(l, mplan->part_prune_info->prune_infos)
-		{
-			List	   *prune_infos = lfirst(l);
-			ListCell   *l2;
-
-			foreach(l2, prune_infos)
-			{
-				PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
-				pinfo->rtindex += rtoffset;
-			}
-		}
-	}
+	/*
+	 * PartitionPruneInfos will be added to a list in PlannerGlobal, so update
+	 * the index.
+	 */
+	if (mplan->part_prune_index >= 0)
+		mplan->part_prune_index += list_length(root->glob->partPruneInfos);
 
 	/* We don't need to recurse to lefttree or righttree ... */
 	Assert(mplan->plan.lefttree == NULL);
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 9d3c05aed3..d77f7d3aef 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -209,16 +209,20 @@ static void partkey_datum_from_expr(PartitionPruneContext *context,
 
 /*
  * make_partition_pruneinfo
- *		Builds a PartitionPruneInfo which can be used in the executor to allow
- *		additional partition pruning to take place.  Returns NULL when
- *		partition pruning would be useless.
+ *		Checks if the given set of quals can be used to build pruning steps
+ *		that the executor can use to prune away unneeded partitions.  If
+ *		suitable quals are found then a PartitionPruneInfo is built and tagged
+ *		onto the PlannerInfo's partPruneInfos list.
+ *
+ * The return value is the 0-based index of the item added to the
+ * partPruneInfos list or -1 if nothing was added.
  *
  * 'parentrel' is the RelOptInfo for an appendrel, and 'subpaths' is the list
  * of scan paths for its child rels.
  * 'prunequal' is a list of potential pruning quals (i.e., restriction
  * clauses that are applicable to the appendrel).
  */
-PartitionPruneInfo *
+int
 make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 						 List *subpaths,
 						 List *prunequal)
@@ -332,7 +336,7 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	 * quals, then we can just not bother with run-time pruning.
 	 */
 	if (prunerelinfos == NIL)
-		return NULL;
+		return -1;
 
 	/* Else build the result data structure */
 	pruneinfo = makeNode(PartitionPruneInfo);
@@ -358,7 +362,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	else
 		pruneinfo->other_subplans = NULL;
 
-	return pruneinfo;
+	root->partPruneInfos = lappend(root->partPruneInfos, pruneinfo);
+
+	return list_length(root->partPruneInfos) - 1;
 }
 
 /*
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 708435e952..bf962af7af 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -123,9 +123,8 @@ typedef struct PartitionPruneState
 
 extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
 													 int n_total_subplans,
-													 PartitionPruneInfo *pruneinfo,
+													 int part_prune_index,
 													 Bitmapset **initially_valid_subplans);
 extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
 										   bool initial_prune);
-
 #endif							/* EXECPARTITION_H */
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 5728801379..25e0bb976e 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -596,6 +596,8 @@ typedef struct EState
 	struct ExecRowMark **es_rowmarks;	/* Array of per-range-table-entry
 										 * ExecRowMarks, or NULL if none */
 	PlannedStmt *es_plannedstmt;	/* link to top of plan tree */
+	List		*es_part_prune_infos;	/* PlannedStmt.partPruneInfos */
+	struct PartitionPruneResult *es_part_prune_result; /* QueryDesc.part_prune_result */
 	const char *es_sourceText;	/* Source text from QueryDesc */
 
 	JunkFilter *es_junkFilter;	/* top-level junk filter, if any */
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index a6e5db4eec..6995b0ecec 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -107,6 +107,9 @@ typedef struct PlannerGlobal
 
 	List	   *appendRelations;	/* "flat" list of AppendRelInfos */
 
+	List	   *partPruneInfos;		/* List of PartitionPruneInfo contained in
+									 * the plan */
+
 	List	   *relationOids;	/* OIDs of relations the plan depends on */
 
 	List	   *invalItems;		/* other dependencies, as PlanInvalItems */
@@ -378,6 +381,9 @@ struct PlannerInfo
 
 	/* Does this query modify any partition key columns? */
 	bool		partColsUpdated;
+
+	/* PartitionPruneInfos added in this query's plan. */
+	List	   *partPruneInfos;
 };
 
 
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 0ea9a22dfb..297cacfb5b 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -64,6 +64,9 @@ typedef struct PlannedStmt
 
 	struct Plan *planTree;		/* tree of Plan nodes */
 
+	List	   *partPruneInfos;	/* List of PartitionPruneInfo contained in
+								 * the plan */
+
 	List	   *rtable;			/* list of RangeTblEntry nodes */
 
 	/* rtable indexes of target relations for INSERT/UPDATE/DELETE */
@@ -262,8 +265,8 @@ typedef struct Append
 	 */
 	int			first_partial_plan;
 
-	/* Info for run-time subplan pruning; NULL if we're not doing that */
-	struct PartitionPruneInfo *part_prune_info;
+	/* Index to PlannerInfo.partPruneInfos or -1 if no run-time pruning */
+	int			part_prune_index;
 } Append;
 
 /* ----------------
@@ -282,8 +285,9 @@ typedef struct MergeAppend
 	Oid		   *sortOperators;	/* OIDs of operators to sort them by */
 	Oid		   *collations;		/* OIDs of collations */
 	bool	   *nullsFirst;		/* NULLS FIRST/LAST directions */
-	/* Info for run-time subplan pruning; NULL if we're not doing that */
-	struct PartitionPruneInfo *part_prune_info;
+
+	/* Index to PlannerInfo.partPruneInfos or -1 if no run-time pruning */
+	int			part_prune_index;
 } MergeAppend;
 
 /* ----------------
diff --git a/src/include/partitioning/partprune.h b/src/include/partitioning/partprune.h
index 90684efa25..ebf0dcff8c 100644
--- a/src/include/partitioning/partprune.h
+++ b/src/include/partitioning/partprune.h
@@ -70,10 +70,10 @@ typedef struct PartitionPruneContext
 #define PruneCxtStateIdx(partnatts, step_id, keyno) \
 	((partnatts) * (step_id) + (keyno))
 
-extern PartitionPruneInfo *make_partition_pruneinfo(struct PlannerInfo *root,
-													struct RelOptInfo *parentrel,
-													List *subpaths,
-													List *prunequal);
+extern int make_partition_pruneinfo(struct PlannerInfo *root,
+									struct RelOptInfo *parentrel,
+									List *subpaths,
+									List *prunequal);
 extern Bitmapset *prune_append_rel_partitions(struct RelOptInfo *rel);
 extern Bitmapset *get_matching_partitions(PartitionPruneContext *context,
 										  List *pruning_steps);
-- 
2.35.3



  [application/octet-stream] v16-0002-Optimize-AcquireExecutorLocks-by-locking-only-un.patch (87.1K, 3-v16-0002-Optimize-AcquireExecutorLocks-by-locking-only-un.patch)
From 6654d7c2b5c54d69d3f8a0136cfaf5593a3b7aae Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Wed, 22 Dec 2021 16:55:17 +0900
Subject: [PATCH v16 2/2] Optimize AcquireExecutorLocks() by locking only
 unpruned partitions

This commit teaches AcquireExecutorLocks() to perform initial
partition pruning to notionally eliminate the subnodes contained in a
generic cached plan that need not be initialized during the actual
execution of the plan, and to skip locking the partitions scanned by
those subnodes.

The result of performing initial partition pruning this way, before the
actual execution has started, is passed to the executor in a
PartitionPruneResult, which the callers of the executor that used
plancache.c to get the plan supply along with the PlannedStmt.  It is NULL
when the plan is obtained by calling the planner directly or when the plan
obtained from plancache.c is not a generic one.
---
 src/backend/commands/copyto.c          |   2 +-
 src/backend/commands/createas.c        |   2 +-
 src/backend/commands/explain.c         |   7 +-
 src/backend/commands/extension.c       |   2 +-
 src/backend/commands/matview.c         |   2 +-
 src/backend/commands/prepare.c         |  26 ++-
 src/backend/executor/README            |  27 +++
 src/backend/executor/execMain.c        |  53 ++++++
 src/backend/executor/execParallel.c    |  27 ++-
 src/backend/executor/execPartition.c   | 234 +++++++++++++++++++++----
 src/backend/executor/functions.c       |   2 +-
 src/backend/executor/nodeAppend.c      |  11 +-
 src/backend/executor/nodeMergeAppend.c |   5 +-
 src/backend/executor/spi.c             |  27 ++-
 src/backend/nodes/copyfuncs.c          |  27 +++
 src/backend/nodes/outfuncs.c           |  29 +++
 src/backend/nodes/readfuncs.c          |  51 ++++++
 src/backend/optimizer/plan/planner.c   |   2 +
 src/backend/optimizer/plan/setrefs.c   |  45 +++++
 src/backend/partitioning/partprune.c   |  41 ++++-
 src/backend/tcop/postgres.c            |   8 +-
 src/backend/tcop/pquery.c              |  28 ++-
 src/backend/utils/cache/plancache.c    | 184 ++++++++++++++++---
 src/backend/utils/mmgr/portalmem.c     |  19 ++
 src/include/commands/explain.h         |   4 +-
 src/include/executor/execPartition.h   |   9 +-
 src/include/executor/execdesc.h        |   3 +
 src/include/executor/executor.h        |   2 +
 src/include/nodes/execnodes.h          |  28 +++
 src/include/nodes/nodes.h              |   4 +
 src/include/nodes/pathnodes.h          |   9 +
 src/include/nodes/plannodes.h          |  19 ++
 src/include/utils/plancache.h          |   3 +-
 src/include/utils/portal.h             |   3 +
 34 files changed, 849 insertions(+), 96 deletions(-)
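
As a rough illustration of the idea described in the commit message (this is
not code from the patch), the following self-contained C sketch models one
Append-like node over range partitions.  All names here (ToyAppend,
ToyPruneResult, toy_do_initial_pruning, and so on) are hypothetical stand-ins,
and the 64-bit masks merely stand in for the Bitmapsets that the patch keeps in
PartitionPruneResult.  It assumes that pruning depends only on a single
externally supplied parameter value.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy types; none of these exist in PostgreSQL. */
#define TOY_MAX_SUBPLANS 64

typedef struct ToyAppend
{
	int			nsubplans;					/* number of child subplans */
	int			leaf_rti[TOY_MAX_SUBPLANS]; /* RT index scanned by each child */
	int			lower[TOY_MAX_SUBPLANS];	/* inclusive lower range bound */
	int			upper[TOY_MAX_SUBPLANS];	/* exclusive upper range bound */
} ToyAppend;

typedef struct ToyPruneResult
{
	uint64_t	valid_subplan_offs; /* bit i set => subplan i survives */
	uint64_t	scan_leafpart_rtis; /* bit r set => RT index r needs a lock */
} ToyPruneResult;

/*
 * "Initial" pruning: uses only a parameter value known before execution
 * starts, so it can run while deciding which relations to lock.
 */
static ToyPruneResult
toy_do_initial_pruning(const ToyAppend *node, int param)
{
	ToyPruneResult result = {0, 0};

	assert(node->nsubplans >= 0 && node->nsubplans <= TOY_MAX_SUBPLANS);
	for (int i = 0; i < node->nsubplans; i++)
	{
		if (param >= node->lower[i] && param < node->upper[i])
		{
			assert(node->leaf_rti[i] >= 0 && node->leaf_rti[i] < 64);
			result.valid_subplan_offs |= UINT64_C(1) << i;
			result.scan_leafpart_rtis |= UINT64_C(1) << node->leaf_rti[i];
		}
	}
	return result;
}

/* Number of leaf partitions that would be locked for this result. */
static int
toy_count_locks(const ToyPruneResult *result)
{
	int			n = 0;

	/* Clear the lowest set bit on each iteration. */
	for (uint64_t m = result->scan_leafpart_rtis; m != 0; m &= m - 1)
		n++;
	return n;
}

/*
 * Executor-side check: reuse the saved result rather than pruning again,
 * so locking and initialization agree on the set of valid subplans.
 */
static int
toy_subplan_is_valid(const ToyPruneResult *result, int off)
{
	assert(off >= 0 && off < TOY_MAX_SUBPLANS);
	return (int) ((result->valid_subplan_offs >> off) & 1);
}
```

For example, with three partitions covering [0,10), [10,20) and [20,30) at RT
indexes 2, 3 and 4, calling toy_do_initial_pruning() with parameter value 15
leaves only subplan offset 1 valid, so only RT index 3 would need a lock.  The
point of saving the result is that node initialization can consult the same
mask instead of deriving it a second time.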

diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index fca29a9a10..d839517693 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -541,7 +541,7 @@ BeginCopyTo(ParseState *pstate,
 		((DR_copy *) dest)->cstate = cstate;
 
 		/* Create a QueryDesc requesting no output */
-		cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+		cstate->queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
 											dest, NULL, NULL, 0);
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 9abbb6b555..f6607f2454 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -325,7 +325,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
 		UpdateActiveSnapshotCommandId();
 
 		/* Create a QueryDesc, redirecting output to our tuple receiver */
-		queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+		queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
 									GetActiveSnapshot(), InvalidSnapshot,
 									dest, params, queryEnv, 0);
 
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 5d1f7089da..111d384982 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -407,7 +407,7 @@ ExplainOneQuery(Query *query, int cursorOptions,
 		}
 
 		/* run it (if needed) and produce output */
-		ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+		ExplainOnePlan(plan, NULL, into, es, queryString, params, queryEnv,
 					   &planduration, (es->buffers ? &bufusage : NULL));
 	}
 }
@@ -515,7 +515,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
  * to call it.
  */
 void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, PartitionPruneResult *part_prune_result,
+			   IntoClause *into, ExplainState *es,
 			   const char *queryString, ParamListInfo params,
 			   QueryEnvironment *queryEnv, const instr_time *planduration,
 			   const BufferUsage *bufusage)
@@ -563,7 +564,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
 		dest = None_Receiver;
 
 	/* Create a QueryDesc for the query */
-	queryDesc = CreateQueryDesc(plannedstmt, queryString,
+	queryDesc = CreateQueryDesc(plannedstmt, part_prune_result, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, params, queryEnv, instrument_option);
 
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index 767d9b9619..1d55a23ded 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -776,7 +776,7 @@ execute_sql_string(const char *sql)
 			{
 				QueryDesc  *qdesc;
 
-				qdesc = CreateQueryDesc(stmt,
+				qdesc = CreateQueryDesc(stmt, NULL,
 										sql,
 										GetActiveSnapshot(), NULL,
 										dest, NULL, NULL, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index d1ee106465..e878209674 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -408,7 +408,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
 	UpdateActiveSnapshotCommandId();
 
 	/* Create a QueryDesc, redirecting output to our tuple receiver */
-	queryDesc = CreateQueryDesc(plan, queryString,
+	queryDesc = CreateQueryDesc(plan, NULL, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, NULL, NULL, 0);
 
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 80738547ed..c7360712b1 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -155,6 +155,7 @@ ExecuteQuery(ParseState *pstate,
 	PreparedStatement *entry;
 	CachedPlan *cplan;
 	List	   *plan_list;
+	List	   *part_prune_result_list;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
 	Portal		portal;
@@ -193,7 +194,10 @@ ExecuteQuery(ParseState *pstate,
 									   entry->plansource->query_string);
 
 	/* Replan if needed, and increment plan refcount for portal */
-	cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL);
+	cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL,
+						  &part_prune_result_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_result_list));
 	plan_list = cplan->stmt_list;
 
 	/*
@@ -207,6 +211,9 @@ ExecuteQuery(ParseState *pstate,
 					  plan_list,
 					  cplan);
 
+	/* Copy PartitionPruneResults into the portal's context. */
+	PortalStorePartitionPruneResults(portal, part_prune_result_list);
+
 	/*
 	 * For CREATE TABLE ... AS EXECUTE, we must verify that the prepared
 	 * statement is one that produces tuples.  Currently we insist that it be
@@ -576,7 +583,9 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	const char *query_string;
 	CachedPlan *cplan;
 	List	   *plan_list;
-	ListCell   *p;
+	List	   *part_prune_result_list;
+	ListCell   *p,
+			   *pp;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
 	instr_time	planstart;
@@ -619,7 +628,10 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 
 	/* Replan if needed, and acquire a transient refcount */
 	cplan = GetCachedPlan(entry->plansource, paramLI,
-						  CurrentResourceOwner, queryEnv);
+						  CurrentResourceOwner, queryEnv,
+						  &part_prune_result_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_result_list));
 
 	INSTR_TIME_SET_CURRENT(planduration);
 	INSTR_TIME_SUBTRACT(planduration, planstart);
@@ -634,13 +646,15 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	plan_list = cplan->stmt_list;
 
 	/* Explain each query */
-	foreach(p, plan_list)
+	forboth(p, plan_list, pp, part_prune_result_list)
 	{
 		PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
+		PartitionPruneResult *part_prune_result = lfirst_node(PartitionPruneResult, pp);
 
 		if (pstmt->commandType != CMD_UTILITY)
-			ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv,
-						   &planduration, (es->buffers ? &bufusage : NULL));
+			ExplainOnePlan(pstmt, part_prune_result, into, es, query_string,
+						   paramLI, queryEnv, &planduration,
+						   (es->buffers ? &bufusage : NULL));
 		else
 			ExplainOneUtility(pstmt->utilityStmt, into, es, query_string,
 							  paramLI, queryEnv);
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 0b5183fc4a..e0802be723 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -65,6 +65,29 @@ found there.  This currently only occurs for Append and MergeAppend nodes.  In
 this case the non-required subplans are ignored and the executor state's
 subnode array will become out of sequence to the plan's subplan list.
 
+Actually, so-called execution-time pruning may also occur before execution
+has started.  One case where that occurs is when a cached generic
+plan is being validated for execution by plancache.c: GetCachedPlan(), which
+proceeds by locking all the relations that will be scanned by that plan.  If
+the generic plan contains nodes that can perform execution time partition
+pruning (that is, contain a PartitionPruneInfo), a subset of pruning steps
+contained in the PartitionPruneInfos that do not depend on execution actually
+having started (called "initial" pruning steps) are performed at this point
+to figure out the minimal set of child subplans that satisfy those pruning
+instructions.  AcquireExecutorLocks() looking at a particular plan will then
+lock only the relations scanned by those surviving subplans (along with those
+present in PlannedStmt.minLockRelids), and ignore those scanned by the pruned
+subplans, even though the pruned subplans themselves are not removed from the
+plan tree.  The result of pruning (that is, the set of indexes of surviving
+subplans in their parent's list of child subplans) is saved as a list of
+bitmapsets, with one element for every PartitionPruneInfo referenced in the
+plan (PlannedStmt.partPruneInfos).  The list is packaged into a
+PartitionPruneResult node, which is passed along with the PlannedStmt to the
+executor via the QueryDesc.  It is imperative that the executor and any third
+party code invoked by it that gets passed the plan tree look at the plan's
+PartitionPruneResult to determine whether a particular child subplan of a
+parent node that supports pruning is valid for a given execution.
+
 Each Plan node may have expression trees associated with it, to represent
 its target list, qualification conditions, etc.  These trees are also
 read-only to the executor, but the executor state for expression evaluation
@@ -286,6 +309,10 @@ Query Processing Control Flow
 
 This is a sketch of control flow for full query processing:
 
+  [ ExecutorDoInitialPruning ] --- an optional step to perform initial
+		partition pruning on the plan tree, the result of which is passed
+		to the executor via QueryDesc
+
 	CreateQueryDesc
 
 	ExecutorStart
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 72fc273524..45824624f8 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -49,6 +49,7 @@
 #include "commands/matview.h"
 #include "commands/trigger.h"
 #include "executor/execdebug.h"
+#include "executor/execPartition.h"
 #include "executor/nodeSubplan.h"
 #include "foreign/fdwapi.h"
 #include "jit/jit.h"
@@ -104,6 +105,56 @@ static void EvalPlanQualStart(EPQState *epqstate, Plan *planTree);
 
 /* end of local decls */
 
+/* ----------------------------------------------------------------
+ *		ExecutorDoInitialPruning
+ *
+ *		For each plan tree node that has been assigned a PartitionPruneInfo,
+ *		this performs initial partition pruning using the information contained
+ *		therein to determine the set of child subplans that satisfy the initial
+ *		pruning steps, to be returned as a bitmapset of their indexes in the
+ *		node's list of child subplans (for example, an Append's appendplans).
+ *
+ * Return value is a PartitionPruneResult node that contains a list of those
+ * bitmapsets, with one element for every PartitionPruneInfo, and a bitmapset
+ * of the RT indexes of all the leaf partitions scanned by those chosen
+ * subplans.  Note that the latter is shared across all PartitionPruneInfos.
+ *
+ * The executor must see exactly the same set of subplans as valid for
+ * execution when doing ExecInitNode() on the plan nodes whose
+ * PartitionPruneInfos are processed here.  So, it must get the set from the
+ * aforementioned PartitionPruneResult, instead of computing it all over
+ * again by redoing the initial pruning.  It's the caller's job to pass the
+ * PartitionPruneResult to the executor.
+ *
+ * Note: Partitioned tables mentioned in PartitionedRelPruneInfo nodes that
+ * drive the pruning will be locked before doing the pruning.
+ * ----------------------------------------------------------------
+ */
+PartitionPruneResult *
+ExecutorDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params)
+{
+	PartitionPruneResult *result;
+	ListCell *lc;
+
+	/* Only get here if there is any pruning to do. */
+	Assert(plannedstmt->containsInitialPruning);
+
+	result = makeNode(PartitionPruneResult);
+	foreach(lc, plannedstmt->partPruneInfos)
+	{
+		PartitionPruneInfo *pruneinfo = lfirst(lc);
+		Bitmapset *valid_subplan_offs;
+
+		valid_subplan_offs =
+			ExecPartitionDoInitialPruning(plannedstmt, params, pruneinfo,
+										  &result->scan_leafpart_rtis);
+		result->valid_subplan_offs_list =
+			lappend(result->valid_subplan_offs_list,
+					valid_subplan_offs);
+	}
+
+	return result;
+}
 
 /* ----------------------------------------------------------------
  *		ExecutorStart
@@ -806,6 +857,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 {
 	CmdType		operation = queryDesc->operation;
 	PlannedStmt *plannedstmt = queryDesc->plannedstmt;
+	PartitionPruneResult *part_prune_result = queryDesc->part_prune_result;
 	Plan	   *plan = plannedstmt->planTree;
 	List	   *rangeTable = plannedstmt->rtable;
 	EState	   *estate = queryDesc->estate;
@@ -826,6 +878,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 
 	estate->es_plannedstmt = plannedstmt;
 	estate->es_part_prune_infos = plannedstmt->partPruneInfos;
+	estate->es_part_prune_result = part_prune_result;
 
 	/*
 	 * Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index f73b8c2607..7e6dab5623 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -66,6 +66,7 @@
 #define PARALLEL_KEY_QUERY_TEXT		UINT64CONST(0xE000000000000008)
 #define PARALLEL_KEY_JIT_INSTRUMENTATION UINT64CONST(0xE000000000000009)
 #define PARALLEL_KEY_WAL_USAGE			UINT64CONST(0xE00000000000000A)
+#define PARALLEL_KEY_PARTITIONPRUNERESULT	UINT64CONST(0xE00000000000000B)
 
 #define PARALLEL_TUPLE_QUEUE_SIZE		65536
 
@@ -182,6 +183,7 @@ ExecSerializePlan(Plan *plan, EState *estate)
 	pstmt->transientPlan = false;
 	pstmt->dependsOnRole = false;
 	pstmt->parallelModeNeeded = false;
+	pstmt->containsInitialPruning = false;
 	pstmt->planTree = plan;
 	pstmt->partPruneInfos = estate->es_part_prune_infos;
 	pstmt->rtable = estate->es_range_table;
@@ -597,12 +599,15 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	FixedParallelExecutorState *fpes;
 	char	   *pstmt_data;
 	char	   *pstmt_space;
+	char	   *part_prune_result_data;
+	char	   *part_prune_result_space;
 	char	   *paramlistinfo_space;
 	BufferUsage *bufusage_space;
 	WalUsage   *walusage_space;
 	SharedExecutorInstrumentation *instrumentation = NULL;
 	SharedJitInstrumentation *jit_instrumentation = NULL;
 	int			pstmt_len;
+	int			part_prune_result_len;
 	int			paramlistinfo_len;
 	int			instrumentation_len = 0;
 	int			jit_instrumentation_len = 0;
@@ -631,6 +636,7 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 
 	/* Fix up and serialize plan to be sent to workers. */
 	pstmt_data = ExecSerializePlan(planstate->plan, estate);
+	part_prune_result_data = nodeToString(estate->es_part_prune_result);
 
 	/* Create a parallel context. */
 	pcxt = CreateParallelContext("postgres", "ParallelQueryMain", nworkers);
@@ -657,6 +663,11 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	shm_toc_estimate_chunk(&pcxt->estimator, pstmt_len);
 	shm_toc_estimate_keys(&pcxt->estimator, 1);
 
+	/* Estimate space for serialized PartitionPruneResult. */
+	part_prune_result_len = strlen(part_prune_result_data) + 1;
+	shm_toc_estimate_chunk(&pcxt->estimator, part_prune_result_len);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
 	/* Estimate space for serialized ParamListInfo. */
 	paramlistinfo_len = EstimateParamListSpace(estate->es_param_list_info);
 	shm_toc_estimate_chunk(&pcxt->estimator, paramlistinfo_len);
@@ -751,6 +762,12 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	memcpy(pstmt_space, pstmt_data, pstmt_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PLANNEDSTMT, pstmt_space);
 
+	/* Store serialized PartitionPruneResult */
+	part_prune_result_space = shm_toc_allocate(pcxt->toc, part_prune_result_len);
+	memcpy(part_prune_result_space, part_prune_result_data, part_prune_result_len);
+	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARTITIONPRUNERESULT,
+				   part_prune_result_space);
+
 	/* Store serialized ParamListInfo. */
 	paramlistinfo_space = shm_toc_allocate(pcxt->toc, paramlistinfo_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARAMLISTINFO, paramlistinfo_space);
@@ -1232,8 +1249,10 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 						 int instrument_options)
 {
 	char	   *pstmtspace;
+	char	   *part_prune_result_space;
 	char	   *paramspace;
 	PlannedStmt *pstmt;
+	PartitionPruneResult *part_prune_result;
 	ParamListInfo paramLI;
 	char	   *queryString;
 
@@ -1244,12 +1263,18 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 	pstmtspace = shm_toc_lookup(toc, PARALLEL_KEY_PLANNEDSTMT, false);
 	pstmt = (PlannedStmt *) stringToNode(pstmtspace);
 
+	/* Reconstruct leader-supplied PartitionPruneResult. */
+	part_prune_result_space =
+		shm_toc_lookup(toc, PARALLEL_KEY_PARTITIONPRUNERESULT, false);
+	part_prune_result = (PartitionPruneResult *)
+		stringToNode(part_prune_result_space);
+
 	/* Reconstruct ParamListInfo. */
 	paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
 	paramLI = RestoreParamList(&paramspace);
 
 	/* Create a QueryDesc for the query. */
-	return CreateQueryDesc(pstmt,
+	return CreateQueryDesc(pstmt, part_prune_result,
 						   queryString,
 						   GetActiveSnapshot(), InvalidSnapshot,
 						   receiver, paramLI, NULL, instrument_options);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index b55cdd2580..86227301e9 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -25,6 +25,7 @@
 #include "mb/pg_wchar.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
+#include "parser/parsetree.h"
 #include "partitioning/partbounds.h"
 #include "partitioning/partdesc.h"
 #include "partitioning/partprune.h"
@@ -185,7 +186,11 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
 static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
 static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
 static PartitionPruneState *CreatePartitionPruneState(PlanState *planstate,
-													  PartitionPruneInfo *pruneinfo);
+													  PartitionPruneInfo *pruneinfo,
+													  bool consider_initial_steps,
+													  bool consider_exec_steps,
+													  List *rtable, ExprContext *econtext,
+													  PartitionDirectory partdir);
 static void InitPartitionPruneContext(PartitionPruneContext *context,
 									  List *pruning_steps,
 									  PartitionDesc partdesc,
@@ -198,7 +203,8 @@ static void PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
 static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
 										   PartitionedRelPruningData *pprune,
 										   bool initial_prune,
-										   Bitmapset **validsubplans);
+										   Bitmapset **validsubplans,
+										   Bitmapset **scan_leafpart_rtis);
 
 
 /*
@@ -1593,8 +1599,10 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  * considered to be a stable expression, it can change value from one plan
  * node scan to the next during query execution.  Stable comparison
  * expressions that don't involve such Params allow partition pruning to be
- * done once during executor startup.  Expressions that do involve such Params
- * require us to prune separately for each scan of the parent plan node.
+ * done once during executor startup or during ExecutorDoInitialPruning() that
+ * runs as part of performing AcquireExecutorLocks() on a given plan tree.
+ * Expressions that do involve such Params require us to prune separately for
+ * each scan of the parent plan node.
  *
  * Note that pruning away unneeded subplans during executor startup has the
  * added benefit of not having to initialize the unneeded subplans at all.
@@ -1611,6 +1619,13 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *		account for initial pruning possibly having eliminated some of the
  *		subplans.
  *
+ * ExecPartitionDoInitialPruning:
+ *		Do initial pruning with the information contained in a given
+ *		PartitionPruneInfo to determine the minimal set of child subplans
+ *		to be executed of the parent plan node to which the PartitionPruneInfo
+ *		belongs and also the set of the RT indexes of leaf partitions that will
+ *		be scanned with those subplans.
+ *
  * ExecFindMatchingSubPlans:
  *		Returns indexes of matching subplans after evaluating the expressions
  *		that are safe to evaluate at a given point.  This function is first
@@ -1628,8 +1643,9 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *
  * On return, *initially_valid_subplans is assigned the set of indexes of
  * child subplans that must be initialized along with the parent plan node.
- * Initial pruning is performed here if needed and in that case only the
- * surviving subplans' indexes are added.
+ * Initial pruning is performed here if needed (unless it has already been done
+ * by ExecutorDoInitialPruning()), and in that case only the surviving
+ * subplans' indexes are added.
  *
  * If subplans are indeed pruned, subplan_map arrays contained in the returned
  * PartitionPruneState are re-sequenced to not count those, though only if the
@@ -1645,24 +1661,59 @@ ExecInitPartitionPruning(PlanState *planstate,
 	EState	   *estate = planstate->state;
 	PartitionPruneInfo *pruneinfo = list_nth(estate->es_part_prune_infos,
 											 part_prune_index);
+	PartitionPruneResult *pruneresult = estate->es_part_prune_result;
+	bool	do_pruning = (pruneinfo->needs_init_pruning ||
+						  pruneinfo->needs_exec_pruning);
 
-	/* We may need an expression context to evaluate partition exprs */
-	ExecAssignExprContext(estate, planstate);
+	/*
+	 * No need to do initial pruning if it was done already by
+	 * ExecutorDoInitialPruning(), which it would be if es_part_prune_result
+	 * has been set.
+	 */
+	if (pruneresult)
+		do_pruning = pruneinfo->needs_exec_pruning;
 
-	/* Create the working data structure for pruning */
-	prunestate = CreatePartitionPruneState(planstate, pruneinfo);
+	prunestate = NULL;
+	if (do_pruning)
+	{
+		/* We may need an expression context to evaluate partition exprs */
+		ExecAssignExprContext(estate, planstate);
+
+		/* For data reading, executor always omits detached partitions */
+		if (estate->es_partition_directory == NULL)
+			estate->es_partition_directory =
+				CreatePartitionDirectory(estate->es_query_cxt, false);
+
+		/*
+		 * Create the working data structure for pruning.  No need to consider
+		 * initial pruning steps if we have a PartitionPruneResult.
+		 */
+		prunestate = CreatePartitionPruneState(planstate, pruneinfo,
+											   pruneresult == NULL, true,
+											   NIL, planstate->ps_ExprContext,
+											   estate->es_partition_directory);
+	}
 
 	/*
 	 * Perform an initial partition prune pass, if required.
 	 */
-	if (prunestate->do_initial_prune)
-		*initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true);
+	if (pruneresult)
+	{
+		*initially_valid_subplans =
+			list_nth(pruneresult->valid_subplan_offs_list, part_prune_index);
+	}
+	else if (prunestate && prunestate->do_initial_prune)
+	{
+		*initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true,
+															 NULL);
+	}
 	else
 	{
 		/* No pruning, so we'll need to initialize all subplans */
 		Assert(n_total_subplans > 0);
 		*initially_valid_subplans = bms_add_range(NULL, 0,
 												  n_total_subplans - 1);
+		return prunestate;
 	}
 
 	/*
@@ -1670,7 +1721,8 @@ ExecInitPartitionPruning(PlanState *planstate,
 	 * that were removed above due to initial pruning.  No need to do this if
 	 * no steps were removed.
 	 */
-	if (bms_num_members(*initially_valid_subplans) < n_total_subplans)
+	if (prunestate &&
+		bms_num_members(*initially_valid_subplans) < n_total_subplans)
 	{
 		/*
 		 * We can safely skip this when !do_exec_prune, even though that
@@ -1686,11 +1738,73 @@ ExecInitPartitionPruning(PlanState *planstate,
 	return prunestate;
 }
 
+/*
+ * ExecPartitionDoInitialPruning
+ *		Perform initial pruning using given PartitionPruneInfo to determine
+ *		the minimal set of child subplans that will be executed and also the
+ *		set of RT indexes of the leaf partitions scanned by those subplans.
+ */
+Bitmapset *
+ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
+							  PartitionPruneInfo *pruneinfo,
+							  Bitmapset **scan_leafpart_rtis)
+{
+	List		 *rtable = plannedstmt->rtable;
+	ExprContext	 *econtext;
+	PartitionDirectory pdir;
+	MemoryContext oldcontext,
+				  tmpcontext;
+	PartitionPruneState *prunestate;
+	Bitmapset	 *valid_subplan_offs;
+
+	/*
+	 * A temporary context for memory allocations required while executing
+	 * partition pruning steps.
+	 */
+	tmpcontext = AllocSetContextCreate(CurrentMemoryContext,
+									   "initial pruning working data",
+									   ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(tmpcontext);
+
+	/*
+	 * PartitionDirectory to look up partition descriptors, which omits
+	 * detached partitions, just like in the executor proper.
+	 */
+	pdir = CreatePartitionDirectory(CurrentMemoryContext, false);
+
+	/*
+	 * We don't yet have a PlanState for the parent plan node, so we must
+	 * create a standalone ExprContext to evaluate pruning expressions,
+	 * equipped with the information about the EXTERN parameters that the
+	 * caller passed us.  Note that that's okay because the initial pruning
+	 * steps do not contain anything that requires the execution to have
+	 * started.
+	 */
+	econtext = CreateStandaloneExprContext();
+	econtext->ecxt_param_list_info = params;
+	prunestate = CreatePartitionPruneState(NULL, pruneinfo, true, false,
+										   rtable, econtext, pdir);
+	MemoryContextSwitchTo(oldcontext);
+
+	/* Do the initial pruning. */
+	valid_subplan_offs = ExecFindMatchingSubPlans(prunestate, true,
+												  scan_leafpart_rtis);
+
+	FreeExprContext(econtext, true);
+	DestroyPartitionDirectory(pdir);
+	MemoryContextDelete(tmpcontext);
+
+	return valid_subplan_offs;
+}
+
 /*
  * CreatePartitionPruneState
  *		Build the data structure required for calling ExecFindMatchingSubPlans
  *
- * 'planstate' is the parent plan node's execution state.
+ * 'planstate', if not NULL, is the parent plan node's execution state.  It
+ * can be NULL if being called before ExecutorStart(), in which case,
+ * 'rtable' (range table), 'econtext', and 'partdir' must be explicitly
+ * provided.
  *
  * 'pruneinfo' is a PartitionPruneInfo as generated by
  * make_partition_pruneinfo.  Here we build a PartitionPruneState containing a
@@ -1704,19 +1818,21 @@ ExecInitPartitionPruning(PlanState *planstate,
  * PartitionedRelPruneInfo.
  */
 static PartitionPruneState *
-CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
+CreatePartitionPruneState(PlanState *planstate,
+						  PartitionPruneInfo *pruneinfo,
+						  bool consider_initial_steps,
+						  bool consider_exec_steps,
+						  List *rtable, ExprContext *econtext,
+						  PartitionDirectory partdir)
 {
-	EState	   *estate = planstate->state;
+	EState	   *estate = planstate ? planstate->state : NULL;
 	PartitionPruneState *prunestate;
 	int			n_part_hierarchies;
 	ListCell   *lc;
 	int			i;
-	ExprContext *econtext = planstate->ps_ExprContext;
 
-	/* For data reading, executor always omits detached partitions */
-	if (estate->es_partition_directory == NULL)
-		estate->es_partition_directory =
-			CreatePartitionDirectory(estate->es_query_cxt, false);
+	Assert((estate != NULL) ||
+			(partdir != NULL && econtext != NULL && rtable != NIL));
 
 	n_part_hierarchies = list_length(pruneinfo->prune_infos);
 	Assert(n_part_hierarchies > 0);
@@ -1771,15 +1887,42 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 			PartitionKey partkey;
 
 			/*
-			 * We can rely on the copies of the partitioned table's partition
-			 * key and partition descriptor appearing in its relcache entry,
-			 * because that entry will be held open and locked for the
-			 * duration of this executor run.
+			 * Must open the relation ourselves when called before the
+			 * execution has started, such as when called during
+			 * ExecutorDoInitialPruning() on a cached plan.  In that case,
+			 * sub-partitions must be locked, because AcquirePlannerLocks()
+			 * would not have seen them. (1st relation in a partrelpruneinfos
+			 * list is always the root partitioned table appearing in the
+			 * query, which AcquirePlannerLocks() would have locked; the
+			 * Assert in relation_open() guards that assumption.)
+			 */
+			if (estate == NULL)
+			{
+				RangeTblEntry *rte = rt_fetch(pinfo->rtindex, rtable);
+				int		lockmode = (j == 0) ? NoLock : rte->rellockmode;
+
+				partrel = table_open(rte->relid, lockmode);
+			}
+			else
+				partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
+
+			/*
+			 * We can rely on the copy of the partitioned table's partition
+			 * key in its relcache entry, because it can't change (or get
+			 * destroyed) as long as the relation is locked.  The partition
+			 * descriptor is taken from the PartitionDirectory associated with
+			 * the table, which is held open long enough for the descriptor
+			 * to remain valid while it's used to perform the pruning steps.
 			 */
-			partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
 			partkey = RelationGetPartitionKey(partrel);
-			partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
-												partrel);
+			partdesc = PartitionDirectoryLookup(partdir, partrel);
+
+			/*
+			 * Must close partrel, keeping the lock taken, if we're not using
+			 * EState's entry.
+			 */
+			if (estate == NULL)
+				table_close(partrel, NoLock);
 
 			/*
 			 * Initialize the subplan_map and subpart_map.
@@ -1793,6 +1936,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 			Assert(partdesc->nparts >= pinfo->nparts);
 			pprune->nparts = partdesc->nparts;
 			pprune->subplan_map = palloc(sizeof(int) * partdesc->nparts);
+			pprune->rti_map = palloc(sizeof(Index) * partdesc->nparts);
 			if (partdesc->nparts == pinfo->nparts)
 			{
 				/*
@@ -1803,6 +1947,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 				pprune->subpart_map = pinfo->subpart_map;
 				memcpy(pprune->subplan_map, pinfo->subplan_map,
 					   sizeof(int) * pinfo->nparts);
+				memcpy(pprune->rti_map, pinfo->rti_map,
+					   sizeof(Index) * pinfo->nparts);
 
 				/*
 				 * Double-check that the list of unpruned relations has not
@@ -1853,6 +1999,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 							pinfo->subplan_map[pd_idx];
 						pprune->subpart_map[pp_idx] =
 							pinfo->subpart_map[pd_idx];
+						pprune->rti_map[pp_idx] =
+							pinfo->rti_map[pd_idx];
 						pd_idx++;
 					}
 					else
@@ -1860,6 +2008,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 						/* this partdesc entry is not in the plan */
 						pprune->subplan_map[pp_idx] = -1;
 						pprune->subpart_map[pp_idx] = -1;
+						pprune->rti_map[pp_idx] = 0;
 					}
 				}
 
@@ -1881,7 +2030,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 			 * Initialize pruning contexts as needed.
 			 */
 			pprune->initial_pruning_steps = pinfo->initial_pruning_steps;
-			if (pinfo->initial_pruning_steps)
+			if (consider_initial_steps && pinfo->initial_pruning_steps)
 			{
 				InitPartitionPruneContext(&pprune->initial_context,
 										  pinfo->initial_pruning_steps,
@@ -1891,7 +2040,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 				prunestate->do_initial_prune = true;
 			}
 			pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
-			if (pinfo->exec_pruning_steps)
+			if (consider_exec_steps && pinfo->exec_pruning_steps)
 			{
 				InitPartitionPruneContext(&pprune->exec_context,
 										  pinfo->exec_pruning_steps,
@@ -2119,10 +2268,14 @@ PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
  * Pass initial_prune if PARAM_EXEC Params cannot yet be evaluated.  This
  * differentiates the initial executor-time pruning step from later
  * runtime pruning.
+ *
+ * RT indexes of leaf partitions scanned by the chosen subplans are added to
+ * *scan_leafpart_rtis if the pointer is non-NULL.
  */
 Bitmapset *
 ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
-						 bool initial_prune)
+						 bool initial_prune,
+						 Bitmapset **scan_leafpart_rtis)
 {
 	Bitmapset  *result = NULL;
 	MemoryContext oldcontext;
@@ -2157,7 +2310,7 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
 		 */
 		pprune = &prunedata->partrelprunedata[0];
 		find_matching_subplans_recurse(prunedata, pprune, initial_prune,
-									   &result);
+									   &result, scan_leafpart_rtis);
 
 		/* Expression eval may have used space in ExprContext too */
 		if (pprune->exec_pruning_steps)
@@ -2171,6 +2324,8 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
 
 	/* Copy result out of the temp context before we reset it */
 	result = bms_copy(result);
+	if (scan_leafpart_rtis)
+		*scan_leafpart_rtis = bms_copy(*scan_leafpart_rtis);
 
 	MemoryContextReset(prunestate->prune_context);
 
@@ -2181,13 +2336,15 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
  * find_matching_subplans_recurse
  *		Recursive worker function for ExecFindMatchingSubPlans
  *
- * Adds valid (non-prunable) subplan IDs to *validsubplans
+ * Adds valid (non-prunable) subplan IDs to *validsubplans and RT indexes of
+ * the corresponding leaf partitions to *scan_leafpart_rtis (if asked for).
  */
 static void
 find_matching_subplans_recurse(PartitionPruningData *prunedata,
 							   PartitionedRelPruningData *pprune,
 							   bool initial_prune,
-							   Bitmapset **validsubplans)
+							   Bitmapset **validsubplans,
+							   Bitmapset **scan_leafpart_rtis)
 {
 	Bitmapset  *partset;
 	int			i;
@@ -2214,8 +2371,14 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
 	while ((i = bms_next_member(partset, i)) >= 0)
 	{
 		if (pprune->subplan_map[i] >= 0)
+		{
 			*validsubplans = bms_add_member(*validsubplans,
 											pprune->subplan_map[i]);
+			Assert(pprune->rti_map[i] > 0);
+			if (scan_leafpart_rtis)
+				*scan_leafpart_rtis = bms_add_member(*scan_leafpart_rtis,
+													 pprune->rti_map[i]);
+		}
 		else
 		{
 			int			partidx = pprune->subpart_map[i];
@@ -2223,7 +2386,8 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
 			if (partidx >= 0)
 				find_matching_subplans_recurse(prunedata,
 											   &prunedata->partrelprunedata[partidx],
-											   initial_prune, validsubplans);
+											   initial_prune, validsubplans,
+											   scan_leafpart_rtis);
 			else
 			{
 				/*
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index f9460ae506..a2182a6b1f 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -844,7 +844,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
 	else
 		dest = None_Receiver;
 
-	es->qd = CreateQueryDesc(es->stmt,
+	es->qd = CreateQueryDesc(es->stmt, NULL,
 							 fcache->src,
 							 GetActiveSnapshot(),
 							 InvalidSnapshot,
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index c6f86a6510..96880e122a 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -155,7 +155,8 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 		 * subplan, we can fill as_valid_subplans immediately, preventing
 		 * later calls to ExecFindMatchingSubPlans.
 		 */
-		if (!prunestate->do_exec_prune && nplans > 0)
+		if (appendstate->as_prune_state == NULL ||
+			(!appendstate->as_prune_state->do_exec_prune && nplans > 0))
 			appendstate->as_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
 	}
 	else
@@ -577,7 +578,7 @@ choose_next_subplan_locally(AppendState *node)
 		}
 		else if (node->as_valid_subplans == NULL)
 			node->as_valid_subplans =
-				ExecFindMatchingSubPlans(node->as_prune_state, false);
+				ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 
 		whichplan = -1;
 	}
@@ -642,7 +643,7 @@ choose_next_subplan_for_leader(AppendState *node)
 		if (node->as_valid_subplans == NULL)
 		{
 			node->as_valid_subplans =
-				ExecFindMatchingSubPlans(node->as_prune_state, false);
+				ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 
 			/*
 			 * Mark each invalid plan as finished to allow the loop below to
@@ -717,7 +718,7 @@ choose_next_subplan_for_worker(AppendState *node)
 	else if (node->as_valid_subplans == NULL)
 	{
 		node->as_valid_subplans =
-			ExecFindMatchingSubPlans(node->as_prune_state, false);
+			ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 		mark_invalid_subplans_as_finished(node);
 	}
 
@@ -868,7 +869,7 @@ ExecAppendAsyncBegin(AppendState *node)
 	if (node->as_valid_subplans == NULL)
 	{
 		node->as_valid_subplans =
-			ExecFindMatchingSubPlans(node->as_prune_state, false);
+			ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 
 		classify_matching_subplans(node);
 	}
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index 8d35860c30..2312e5a633 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -103,7 +103,8 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
 		 * subplan, we can fill ms_valid_subplans immediately, preventing
 		 * later calls to ExecFindMatchingSubPlans.
 		 */
-		if (!prunestate->do_exec_prune && nplans > 0)
+		if (mergestate->ms_prune_state == NULL ||
+			(!mergestate->ms_prune_state->do_exec_prune && nplans > 0))
 			mergestate->ms_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
 	}
 	else
@@ -218,7 +219,7 @@ ExecMergeAppend(PlanState *pstate)
 		 */
 		if (node->ms_valid_subplans == NULL)
 			node->ms_valid_subplans =
-				ExecFindMatchingSubPlans(node->ms_prune_state, false);
+				ExecFindMatchingSubPlans(node->ms_prune_state, false, NULL);
 
 		/*
 		 * First time through: pull the first tuple from each valid subplan,
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 29bc26669b..303a572c02 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1578,6 +1578,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 	CachedPlanSource *plansource;
 	CachedPlan *cplan;
 	List	   *stmt_list;
+	List	   *part_prune_result_list;
 	char	   *query_string;
 	Snapshot	snapshot;
 	MemoryContext oldcontext;
@@ -1657,7 +1658,10 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 	 */
 
 	/* Replan if needed, and increment plan refcount for portal */
-	cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv);
+	cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv,
+						  &part_prune_result_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_result_list));
 	stmt_list = cplan->stmt_list;
 
 	if (!plan->saved)
@@ -1685,6 +1689,9 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 					  stmt_list,
 					  cplan);
 
+	/* Copy PartitionPruneResults into the portal's context. */
+	PortalStorePartitionPruneResults(portal, part_prune_result_list);
+
 	/*
 	 * Set up options for portal.  Default SCROLL type is chosen the same way
 	 * as PerformCursorOpen does it.
@@ -2092,7 +2099,8 @@ SPI_plan_get_cached_plan(SPIPlanPtr plan)
 	/* Get the generic plan for the query */
 	cplan = GetCachedPlan(plansource, NULL,
 						  plan->saved ? CurrentResourceOwner : NULL,
-						  _SPI_current->queryEnv);
+						  _SPI_current->queryEnv,
+						  NULL /* Not interested in PartitionPruneResults */);
 	Assert(cplan == plansource->gplan);
 
 	/* Pop the error context stack */
@@ -2473,7 +2481,9 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 	{
 		CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
 		List	   *stmt_list;
-		ListCell   *lc2;
+		List	   *part_prune_result_list;
+		ListCell   *lc2,
+				   *lc3;
 
 		spicallbackarg.query = plansource->query_string;
 
@@ -2549,8 +2559,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 		 * plan, the refcount must be backed by the plan_owner.
 		 */
 		cplan = GetCachedPlan(plansource, options->params,
-							  plan_owner, _SPI_current->queryEnv);
-
+							  plan_owner, _SPI_current->queryEnv,
+							  &part_prune_result_list);
+		Assert(list_length(cplan->stmt_list) ==
+			   list_length(part_prune_result_list));
 		stmt_list = cplan->stmt_list;
 
 		/*
@@ -2589,9 +2601,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 			}
 		}
 
-		foreach(lc2, stmt_list)
+		forboth(lc2, stmt_list, lc3, part_prune_result_list)
 		{
 			PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+			PartitionPruneResult *part_prune_result = lfirst_node(PartitionPruneResult, lc3);
 			bool		canSetTag = stmt->canSetTag;
 			DestReceiver *dest;
 
@@ -2663,7 +2676,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 				else
 					snap = InvalidSnapshot;
 
-				qdesc = CreateQueryDesc(stmt,
+				qdesc = CreateQueryDesc(stmt, part_prune_result,
 										plansource->query_string,
 										snap, crosscheck_snapshot,
 										dest,
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 8fbeaa4f36..ca139797a8 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -97,7 +97,9 @@ _copyPlannedStmt(const PlannedStmt *from)
 	COPY_SCALAR_FIELD(jitFlags);
 	COPY_NODE_FIELD(planTree);
 	COPY_NODE_FIELD(partPruneInfos);
+	COPY_SCALAR_FIELD(containsInitialPruning);
 	COPY_NODE_FIELD(rtable);
+	COPY_BITMAPSET_FIELD(minLockRelids);
 	COPY_NODE_FIELD(resultRelations);
 	COPY_NODE_FIELD(appendRelations);
 	COPY_NODE_FIELD(subplans);
@@ -1284,6 +1286,8 @@ _copyPartitionPruneInfo(const PartitionPruneInfo *from)
 	PartitionPruneInfo *newnode = makeNode(PartitionPruneInfo);
 
 	COPY_NODE_FIELD(prune_infos);
+	COPY_SCALAR_FIELD(needs_init_pruning);
+	COPY_SCALAR_FIELD(needs_exec_pruning);
 	COPY_BITMAPSET_FIELD(other_subplans);
 
 	return newnode;
@@ -1300,6 +1304,7 @@ _copyPartitionedRelPruneInfo(const PartitionedRelPruneInfo *from)
 	COPY_POINTER_FIELD(subplan_map, from->nparts * sizeof(int));
 	COPY_POINTER_FIELD(subpart_map, from->nparts * sizeof(int));
 	COPY_POINTER_FIELD(relid_map, from->nparts * sizeof(Oid));
+	COPY_POINTER_FIELD(rti_map, from->nparts * sizeof(Index));
 	COPY_NODE_FIELD(initial_pruning_steps);
 	COPY_NODE_FIELD(exec_pruning_steps);
 	COPY_BITMAPSET_FIELD(execparamids);
@@ -5475,6 +5480,21 @@ _copyExtensibleNode(const ExtensibleNode *from)
 	return newnode;
 }
 
+/* ****************************************************************
+ *					execnodes.h copy functions
+ * ****************************************************************
+ */
+static PartitionPruneResult *
+_copyPartitionPruneResult(const PartitionPruneResult *from)
+{
+	PartitionPruneResult *newnode = makeNode(PartitionPruneResult);
+
+	COPY_NODE_FIELD(valid_subplan_offs_list);
+	COPY_BITMAPSET_FIELD(scan_leafpart_rtis);
+
+	return newnode;
+}
+
 /* ****************************************************************
  *					value.h copy functions
  * ****************************************************************
@@ -6571,6 +6591,13 @@ copyObjectImpl(const void *from)
 			retval = _copyPublicationTable(from);
 			break;
 
+			/*
+			 * EXECUTION NODES
+			 */
+		case T_PartitionPruneResult:
+			retval = _copyPartitionPruneResult(from);
+			break;
+
 			/*
 			 * MISCELLANEOUS NODES
 			 */
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 72fcd8a6ee..53010bf059 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -322,7 +322,9 @@ _outPlannedStmt(StringInfo str, const PlannedStmt *node)
 	WRITE_INT_FIELD(jitFlags);
 	WRITE_NODE_FIELD(planTree);
 	WRITE_NODE_FIELD(partPruneInfos);
+	WRITE_BOOL_FIELD(containsInitialPruning);
 	WRITE_NODE_FIELD(rtable);
+	WRITE_BITMAPSET_FIELD(minLockRelids);
 	WRITE_NODE_FIELD(resultRelations);
 	WRITE_NODE_FIELD(appendRelations);
 	WRITE_NODE_FIELD(subplans);
@@ -1017,6 +1019,8 @@ _outPartitionPruneInfo(StringInfo str, const PartitionPruneInfo *node)
 	WRITE_NODE_TYPE("PARTITIONPRUNEINFO");
 
 	WRITE_NODE_FIELD(prune_infos);
+	WRITE_BOOL_FIELD(needs_init_pruning);
+	WRITE_BOOL_FIELD(needs_exec_pruning);
 	WRITE_BITMAPSET_FIELD(other_subplans);
 }
 
@@ -1031,6 +1035,7 @@ _outPartitionedRelPruneInfo(StringInfo str, const PartitionedRelPruneInfo *node)
 	WRITE_INT_ARRAY(subplan_map, node->nparts);
 	WRITE_INT_ARRAY(subpart_map, node->nparts);
 	WRITE_OID_ARRAY(relid_map, node->nparts);
+	WRITE_INDEX_ARRAY(rti_map, node->nparts);
 	WRITE_NODE_FIELD(initial_pruning_steps);
 	WRITE_NODE_FIELD(exec_pruning_steps);
 	WRITE_BITMAPSET_FIELD(execparamids);
@@ -2436,6 +2441,8 @@ _outPlannerGlobal(StringInfo str, const PlannerGlobal *node)
 	WRITE_NODE_FIELD(resultRelations);
 	WRITE_NODE_FIELD(appendRelations);
 	WRITE_NODE_FIELD(partPruneInfos);
+	WRITE_BOOL_FIELD(containsInitialPruning);
+	WRITE_BITMAPSET_FIELD(minLockRelids);
 	WRITE_NODE_FIELD(relationOids);
 	WRITE_NODE_FIELD(invalItems);
 	WRITE_NODE_FIELD(paramExecTypes);
@@ -2857,6 +2864,21 @@ _outExtensibleNode(StringInfo str, const ExtensibleNode *node)
 	methods->nodeOut(str, node);
 }
 
+/*****************************************************************************
+ *
+ *	Stuff from execnodes.h
+ *
+ *****************************************************************************/
+
+static void
+_outPartitionPruneResult(StringInfo str, const PartitionPruneResult *node)
+{
+	WRITE_NODE_TYPE("PARTITIONPRUNERESULT");
+
+	WRITE_NODE_FIELD(valid_subplan_offs_list);
+	WRITE_BITMAPSET_FIELD(scan_leafpart_rtis);
+}
+
 /*****************************************************************************
  *
  *	Stuff from parsenodes.h.
@@ -4766,6 +4788,13 @@ outNode(StringInfo str, const void *obj)
 				_outJsonTableSibling(str, obj);
 				break;
 
+				/*
+				 * EXECUTION NODES
+				 */
+			case T_PartitionPruneResult:
+				_outPartitionPruneResult(str, obj);
+				break;
+
 			default:
 
 				/*
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index bf602ff93e..c1d131aa99 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -164,6 +164,11 @@
 	token = pg_strtok(&length);		/* skip :fldname */ \
 	local_node->fldname = readIntCols(len)
 
+/* Read an Index array */
+#define READ_INDEX_ARRAY(fldname, len) \
+	token = pg_strtok(&length);		/* skip :fldname */ \
+	local_node->fldname = readIndexCols(len)
+
 /* Read a bool array */
 #define READ_BOOL_ARRAY(fldname, len) \
 	token = pg_strtok(&length);		/* skip :fldname */ \
@@ -1818,7 +1823,9 @@ _readPlannedStmt(void)
 	READ_INT_FIELD(jitFlags);
 	READ_NODE_FIELD(planTree);
 	READ_NODE_FIELD(partPruneInfos);
+	READ_BOOL_FIELD(containsInitialPruning);
 	READ_NODE_FIELD(rtable);
+	READ_BITMAPSET_FIELD(minLockRelids);
 	READ_NODE_FIELD(resultRelations);
 	READ_NODE_FIELD(appendRelations);
 	READ_NODE_FIELD(subplans);
@@ -2770,6 +2777,8 @@ _readPartitionPruneInfo(void)
 	READ_LOCALS(PartitionPruneInfo);
 
 	READ_NODE_FIELD(prune_infos);
+	READ_BOOL_FIELD(needs_init_pruning);
+	READ_BOOL_FIELD(needs_exec_pruning);
 	READ_BITMAPSET_FIELD(other_subplans);
 
 	READ_DONE();
@@ -2786,6 +2795,7 @@ _readPartitionedRelPruneInfo(void)
 	READ_INT_ARRAY(subplan_map, local_node->nparts);
 	READ_INT_ARRAY(subpart_map, local_node->nparts);
 	READ_OID_ARRAY(relid_map, local_node->nparts);
+	READ_INDEX_ARRAY(rti_map, local_node->nparts);
 	READ_NODE_FIELD(initial_pruning_steps);
 	READ_NODE_FIELD(exec_pruning_steps);
 	READ_BITMAPSET_FIELD(execparamids);
@@ -2939,6 +2949,21 @@ _readPartitionRangeDatum(void)
 	READ_DONE();
 }
 
+
+/*
+ * _readPartitionPruneResult
+ */
+static PartitionPruneResult *
+_readPartitionPruneResult(void)
+{
+	READ_LOCALS(PartitionPruneResult);
+
+	READ_NODE_FIELD(valid_subplan_offs_list);
+	READ_BITMAPSET_FIELD(scan_leafpart_rtis);
+
+	READ_DONE();
+}
+
 /*
  * parseNodeString
  *
@@ -3236,6 +3261,8 @@ parseNodeString(void)
 		return_value = _readJsonTableParent();
 	else if (MATCH("JSONTABLESIBLING", 16))
 		return_value = _readJsonTableSibling();
+	else if (MATCH("PARTITIONPRUNERESULT", 20))
+		return_value = _readPartitionPruneResult();
 	else
 	{
 		elog(ERROR, "badly formatted node string \"%.32s\"...", token);
@@ -3379,6 +3406,30 @@ readIntCols(int numCols)
 	return int_vals;
 }
 
+/*
+ * readIndexCols
+ */
+Index *
+readIndexCols(int numCols)
+{
+	int			tokenLength,
+				i;
+	const char *token;
+	Index	   *index_vals;
+
+	if (numCols <= 0)
+		return NULL;
+
+	index_vals = (Index *) palloc(numCols * sizeof(Index));
+	for (i = 0; i < numCols; i++)
+	{
+		token = pg_strtok(&tokenLength);
+		index_vals[i] = atoui(token);
+	}
+
+	return index_vals;
+}
+
 /*
  * readBoolCols
  */
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 32e658b5d6..edbf19716e 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -519,7 +519,9 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 	result->parallelModeNeeded = glob->parallelModeNeeded;
 	result->planTree = top_plan;
 	result->partPruneInfos = glob->partPruneInfos;
+	result->containsInitialPruning = glob->containsInitialPruning;
 	result->rtable = glob->finalrtable;
+	result->minLockRelids = glob->minLockRelids;
 	result->resultRelations = glob->resultRelations;
 	result->appendRelations = glob->appendRelations;
 	result->subplans = glob->subplans;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index aafe1c149d..a32fc70785 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -270,6 +270,16 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 	 */
 	add_rtes_to_flat_rtable(root, false);
 
+	/*
+	 * Add the query's adjusted range of RT indexes to glob->minLockRelids.
+	 * The adjusted RT indexes of prunable relations will be deleted from the
+	 * set below where PartitionPruneInfos are processed.
+	 */
+	glob->minLockRelids =
+		bms_add_range(glob->minLockRelids,
+					  rtoffset + 1,
+					  rtoffset + list_length(root->parse->rtable));
+
 	/*
 	 * Adjust RT indexes of PlanRowMarks and add to final rowmarks list
 	 */
@@ -352,6 +362,7 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 	foreach (lc, root->partPruneInfos)
 	{
 		PartitionPruneInfo *pruneinfo = lfirst(lc);
+		Bitmapset *leafpart_rtis = NULL;
 		ListCell  *l;
 
 		foreach(l, pruneinfo->prune_infos)
@@ -362,15 +373,49 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 			foreach(l2, prune_infos)
 			{
 				PartitionedRelPruneInfo *pinfo = lfirst(l2);
+				int		i;
 
 				/* RT index of the table to which the pinfo belongs. */
 				pinfo->rtindex += rtoffset;
+
+				/* Likewise for the leaf partitions that might be scanned. */
+				for (i = 0; i < pinfo->nparts; i++)
+				{
+					if (pinfo->rti_map[i] > 0 && pinfo->subplan_map[i] >= 0)
+					{
+						pinfo->rti_map[i] += rtoffset;
+						leafpart_rtis = bms_add_member(leafpart_rtis,
+													   pinfo->rti_map[i]);
+					}
+				}
 			}
 		}
 
+		if (pruneinfo->needs_init_pruning)
+		{
+			glob->containsInitialPruning = true;
+
+			/*
+			 * Delete the leaf partition RTIs from the global set of relations
+			 * to be locked before executing the plan.  AcquireExecutorLocks()
+			 * will find the ones to add to the set after performing initial
+			 * pruning.
+			 */
+			glob->minLockRelids = bms_del_members(glob->minLockRelids,
+												  leafpart_rtis);
+		}
+
 		glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
 	}
 
+	/*
+	 * It seems worth doing a bms_copy() on glob->minLockRelids if we deleted
+	 * bits from it just above, to prevent empty tail bits from resulting in
+	 * inefficient looping during AcquireExecutorLocks().
+	 */
+	if (glob->containsInitialPruning)
+		glob->minLockRelids = bms_copy(glob->minLockRelids);
+
 	return result;
 }
 
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index d77f7d3aef..952c5b8327 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -144,7 +144,9 @@ static List *make_partitionedrel_pruneinfo(PlannerInfo *root,
 										   List *prunequal,
 										   Bitmapset *partrelids,
 										   int *relid_subplan_map,
-										   Bitmapset **matchedsubplans);
+										   Bitmapset **matchedsubplans,
+										   bool *needs_init_pruning,
+										   bool *needs_exec_pruning);
 static void gen_partprune_steps(RelOptInfo *rel, List *clauses,
 								PartClauseTarget target,
 								GeneratePruningStepsContext *context);
@@ -234,6 +236,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	int		   *relid_subplan_map;
 	ListCell   *lc;
 	int			i;
+	bool		needs_init_pruning = false;
+	bool		needs_exec_pruning = false;
 
 	/*
 	 * Scan the subpaths to see which ones are scans of partition child
@@ -313,12 +317,16 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		Bitmapset  *partrelids = (Bitmapset *) lfirst(lc);
 		List	   *pinfolist;
 		Bitmapset  *matchedsubplans = NULL;
+		bool		partrel_needs_init_pruning;
+		bool		partrel_needs_exec_pruning;
 
 		pinfolist = make_partitionedrel_pruneinfo(root, parentrel,
 												  prunequal,
 												  partrelids,
 												  relid_subplan_map,
-												  &matchedsubplans);
+												  &matchedsubplans,
+												  &partrel_needs_init_pruning,
+												  &partrel_needs_exec_pruning);
 
 		/* When pruning is possible, record the matched subplans */
 		if (pinfolist != NIL)
@@ -327,6 +335,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 			allmatchedsubplans = bms_join(matchedsubplans,
 										  allmatchedsubplans);
 		}
+
+		needs_init_pruning |= partrel_needs_init_pruning;
+		needs_exec_pruning |= partrel_needs_exec_pruning;
 	}
 
 	pfree(relid_subplan_map);
@@ -341,6 +352,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	/* Else build the result data structure */
 	pruneinfo = makeNode(PartitionPruneInfo);
 	pruneinfo->prune_infos = prunerelinfos;
+	pruneinfo->needs_init_pruning = needs_init_pruning;
+	pruneinfo->needs_exec_pruning = needs_exec_pruning;
 
 	/*
 	 * Some subplans may not belong to any of the identified partitioned rels.
@@ -441,13 +454,18 @@ add_part_relids(List *allpartrelids, Bitmapset *partrelids)
  * If we cannot find any useful run-time pruning steps, return NIL.
  * However, on success, each rel identified in partrelids will have
  * an element in the result list, even if some of them are useless.
+ * *needs_init_pruning and *needs_exec_pruning are set to indicate that the
+ * returned PartitionedRelPruneInfos contain pruning steps that can be
+ * performed before and after execution begins, respectively.
  */
 static List *
 make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 							  List *prunequal,
 							  Bitmapset *partrelids,
 							  int *relid_subplan_map,
-							  Bitmapset **matchedsubplans)
+							  Bitmapset **matchedsubplans,
+							  bool *needs_init_pruning,
+							  bool *needs_exec_pruning)
 {
 	RelOptInfo *targetpart = NULL;
 	List	   *pinfolist = NIL;
@@ -458,6 +476,10 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	int			rti;
 	int			i;
 
+	/* Will find out below. */
+	*needs_init_pruning = false;
+	*needs_exec_pruning = false;
+
 	/*
 	 * Examine each partitioned rel, constructing a temporary array to map
 	 * from planner relids to index of the partitioned rel, and building a
@@ -545,6 +567,9 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		 * executor per-scan pruning steps.  This first pass creates startup
 		 * pruning steps and detects whether there's any possibly-useful quals
 		 * that would require per-scan pruning.
+		 *
+		 * The first pass also determines whether the 2nd pass is necessary
+		 * by detecting the presence of EXEC parameters.
 		 */
 		gen_partprune_steps(subpart, partprunequal, PARTTARGET_INITIAL,
 							&context);
@@ -619,6 +644,12 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		pinfo->execparamids = execparamids;
 		/* Remaining fields will be filled in the next loop */
 
+		/* record which types of pruning steps we've seen so far */
+		if (initial_pruning_steps != NIL)
+			*needs_init_pruning = true;
+		if (exec_pruning_steps != NIL)
+			*needs_exec_pruning = true;
+
 		pinfolist = lappend(pinfolist, pinfo);
 	}
 
@@ -646,6 +677,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		int		   *subplan_map;
 		int		   *subpart_map;
 		Oid		   *relid_map;
+		Index	   *rti_map;
 
 		/*
 		 * Construct the subplan and subpart maps for this partitioning level.
@@ -658,6 +690,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		subpart_map = (int *) palloc(nparts * sizeof(int));
 		memset(subpart_map, -1, nparts * sizeof(int));
 		relid_map = (Oid *) palloc0(nparts * sizeof(Oid));
+		rti_map = (Index *) palloc0(nparts * sizeof(Index));
 		present_parts = NULL;
 
 		i = -1;
@@ -672,6 +705,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 			subplan_map[i] = subplanidx = relid_subplan_map[partrel->relid] - 1;
 			subpart_map[i] = subpartidx = relid_subpart_map[partrel->relid] - 1;
 			relid_map[i] = planner_rt_fetch(partrel->relid, root)->relid;
+			rti_map[i] = partrel->relid;
 			if (subplanidx >= 0)
 			{
 				present_parts = bms_add_member(present_parts, i);
@@ -696,6 +730,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		pinfo->subplan_map = subplan_map;
 		pinfo->subpart_map = subpart_map;
 		pinfo->relid_map = relid_map;
+		pinfo->rti_map = rti_map;
 	}
 
 	pfree(relid_subpart_map);
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 8b6b5bbaaa..7f0eda48a4 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1603,6 +1603,7 @@ exec_bind_message(StringInfo input_message)
 	int16	   *rformats = NULL;
 	CachedPlanSource *psrc;
 	CachedPlan *cplan;
+	List	   *part_prune_result_list;
 	Portal		portal;
 	char	   *query_string;
 	char	   *saved_stmt_name;
@@ -1978,7 +1979,9 @@ exec_bind_message(StringInfo input_message)
 	 * will be generated in MessageContext.  The plan refcount will be
 	 * assigned to the Portal, so it will be released at portal destruction.
 	 */
-	cplan = GetCachedPlan(psrc, params, NULL, NULL);
+	cplan = GetCachedPlan(psrc, params, NULL, NULL, &part_prune_result_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_result_list));
 
 	/*
 	 * Now we can define the portal.
@@ -1993,6 +1996,9 @@ exec_bind_message(StringInfo input_message)
 					  cplan->stmt_list,
 					  cplan);
 
+	/* Copy PartitionPruneResults into the portal's context. */
+	PortalStorePartitionPruneResults(portal, part_prune_result_list);
+
 	/* Done with the snapshot used for parameter I/O and parsing/planning */
 	if (snapshot_set)
 		PopActiveSnapshot();
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 5aa5a350f3..8cc2e2162d 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -35,7 +35,7 @@
 Portal		ActivePortal = NULL;
 
 
-static void ProcessQuery(PlannedStmt *plan,
+static void ProcessQuery(PlannedStmt *plan, PartitionPruneResult *part_prune_result,
 						 const char *sourceText,
 						 ParamListInfo params,
 						 QueryEnvironment *queryEnv,
@@ -65,6 +65,7 @@ static void DoPortalRewind(Portal portal);
  */
 QueryDesc *
 CreateQueryDesc(PlannedStmt *plannedstmt,
+				PartitionPruneResult *part_prune_result,
 				const char *sourceText,
 				Snapshot snapshot,
 				Snapshot crosscheck_snapshot,
@@ -77,6 +78,8 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
 
 	qd->operation = plannedstmt->commandType;	/* operation */
 	qd->plannedstmt = plannedstmt;	/* plan */
+	qd->part_prune_result = part_prune_result;	/* ExecutorDoInitialPruning()
+												 * output for plan */
 	qd->sourceText = sourceText;	/* query text */
 	qd->snapshot = RegisterSnapshot(snapshot);	/* snapshot */
 	/* RI check snapshot */
@@ -122,6 +125,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  *		PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
  *
  *	plan: the plan tree for the query
+ *	part_prune_result: ExecutorDoInitialPruning() output for the plan tree
  *	sourceText: the source text of the query
  *	params: any parameters needed
  *	dest: where to send results
@@ -134,6 +138,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  */
 static void
 ProcessQuery(PlannedStmt *plan,
+			 PartitionPruneResult *part_prune_result,
 			 const char *sourceText,
 			 ParamListInfo params,
 			 QueryEnvironment *queryEnv,
@@ -145,7 +150,7 @@ ProcessQuery(PlannedStmt *plan,
 	/*
 	 * Create the QueryDesc object
 	 */
-	queryDesc = CreateQueryDesc(plan, sourceText,
+	queryDesc = CreateQueryDesc(plan, part_prune_result, sourceText,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, params, queryEnv, 0);
 
@@ -491,8 +496,13 @@ PortalStart(Portal portal, ParamListInfo params,
 				/*
 				 * Create QueryDesc in portal's context; for the moment, set
 				 * the destination to DestNone.
+				 *
+				 * There is no PartitionPruneResult unless the PlannedStmt is
+				 * from a CachedPlan.
 				 */
 				queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+											portal->part_prune_results == NIL ? NULL :
+											linitial(portal->part_prune_results),
 											portal->sourceText,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
@@ -1225,6 +1235,8 @@ PortalRunMulti(Portal portal,
 
 		if (pstmt->utilityStmt == NULL)
 		{
+			PartitionPruneResult *part_prune_result = NULL;
+
 			/*
 			 * process a plannable query.
 			 */
@@ -1271,10 +1283,18 @@ PortalRunMulti(Portal portal,
 			else
 				UpdateActiveSnapshotCommandId();
 
+			/*
+			 * Determine if there's a corresponding PartitionPruneResult for
+			 * this PlannedStmt.
+			 */
+			if (portal->part_prune_results != NIL)
+				part_prune_result = list_nth(portal->part_prune_results,
+											 foreach_current_index(stmtlist_item));
+
 			if (pstmt->canSetTag)
 			{
 				/* statement can set tag string */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, part_prune_result,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
@@ -1283,7 +1303,7 @@ PortalRunMulti(Portal portal,
 			else
 			{
 				/* stmt added by rewrite cannot set tag */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, part_prune_result,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 0d6a295674..8c164741f7 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -99,14 +99,19 @@ static dlist_head cached_expression_list = DLIST_STATIC_INIT(cached_expression_l
 static void ReleaseGenericPlan(CachedPlanSource *plansource);
 static List *RevalidateCachedQuery(CachedPlanSource *plansource,
 								   QueryEnvironment *queryEnv);
-static bool CheckCachedPlan(CachedPlanSource *plansource);
+static bool CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+							List **part_prune_result_list);
 static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
-								   ParamListInfo boundParams, QueryEnvironment *queryEnv);
+								   ParamListInfo boundParams, QueryEnvironment *queryEnv,
+								   List **part_prune_result_list);
 static bool choose_custom_plan(CachedPlanSource *plansource,
 							   ParamListInfo boundParams);
 static double cached_plan_cost(CachedPlan *plan, bool include_planner);
 static Query *QueryListGetPrimaryStmt(List *stmts);
-static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static void AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+								 List **part_prune_result_list,
+								 List **lockedRelids_per_stmt);
+static void ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt);
 static void AcquirePlannerLocks(List *stmt_list, bool acquire);
 static void ScanQueryForLocks(Query *parsetree, bool acquire);
 static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -790,15 +795,20 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
  *
  * On a "true" return, we have acquired the locks needed to run the plan.
  * (We must do this for the "true" result to be race-condition-free.)
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_result_list.
  */
 static bool
-CheckCachedPlan(CachedPlanSource *plansource)
+CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+				List **part_prune_result_list)
 {
 	CachedPlan *plan = plansource->gplan;
 
 	/* Assert that caller checked the querytree */
 	Assert(plansource->is_valid);
 
+	*part_prune_result_list = NIL;
+
 	/* If there's no generic plan, just say "false" */
 	if (!plan)
 		return false;
@@ -820,13 +830,21 @@ CheckCachedPlan(CachedPlanSource *plansource)
 	 */
 	if (plan->is_valid)
 	{
+		List   *lockedRelids_per_stmt;
+
 		/*
 		 * Plan must have positive refcount because it is referenced by
 		 * plansource; so no need to fear it disappears under us here.
 		 */
 		Assert(plan->refcount > 0);
 
-		AcquireExecutorLocks(plan->stmt_list, true);
+		/*
+		 * Lock relations scanned by the plan.  This is where the pruning
+		 * happens if needed.
+		 */
+		AcquireExecutorLocks(plan->stmt_list, boundParams,
+							 part_prune_result_list,
+							 &lockedRelids_per_stmt);
 
 		/*
 		 * If plan was transient, check to see if TransactionXmin has
@@ -848,7 +866,14 @@ CheckCachedPlan(CachedPlanSource *plansource)
 		}
 
 		/* Oops, the race case happened.  Release useless locks. */
-		AcquireExecutorLocks(plan->stmt_list, false);
+		ReleaseExecutorLocks(plan->stmt_list, lockedRelids_per_stmt);
+
+		/*
+		 * The output list and the objects in it were allocated in the
+		 * caller's (hopefully short-lived) context, so they will not stay
+		 * leaked for long; still, reset the pointer so nobody looks at it.
+		 */
+		*part_prune_result_list = NIL;
 	}
 
 	/*
@@ -874,10 +899,15 @@ CheckCachedPlan(CachedPlanSource *plansource)
  * Planning work is done in the caller's memory context.  The finished plan
  * is in a child memory context, which typically should get reparented
  * (unless this is a one-shot plan, in which case we don't copy the plan).
+ *
+ * A list of NULLs is returned in *part_prune_result_list, meaning that no
+ * PartitionPruneResult nodes have yet been created for the plans in
+ * stmt_list.
  */
 static CachedPlan *
 BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
-				ParamListInfo boundParams, QueryEnvironment *queryEnv)
+				ParamListInfo boundParams, QueryEnvironment *queryEnv,
+				List **part_prune_result_list)
 {
 	CachedPlan *plan;
 	List	   *plist;
@@ -1007,6 +1037,17 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 
 	MemoryContextSwitchTo(oldcxt);
 
+	/*
+	 * There are no PartitionPruneResults to add yet, but the list must
+	 * still have the same number of elements as the list of
+	 * PlannedStmts, so fill it with NULLs.
+	 */
+	*part_prune_result_list = NIL;
+	foreach(lc, plist)
+	{
+		*part_prune_result_list = lappend(*part_prune_result_list, NULL);
+	}
+
 	return plan;
 }
 
@@ -1126,6 +1167,17 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
  * plan or a custom plan for the given parameters: the caller does not know
  * which it will get.
  *
+ * If part_prune_result_list is not NULL, *part_prune_result_list receives one
+ * element per PlannedStmt in the returned CachedPlan: either a
+ * PartitionPruneResult or NULL.  It is a PartitionPruneResult when the
+ * PlannedStmt comes from an existing, otherwise valid CachedPlan and contains
+ * at least one PartitionPruneInfo with "initial" pruning steps.  Those steps
+ * are performed by calling ExecutorDoInitialPruning(), which prunes away
+ * subplans that don't match the pruning conditions so that
+ * AcquireExecutorLocks() locks only the surviving leaf partitions.  The
+ * PartitionPruneResult contains a list of bitmapsets of the indexes of
+ * matching subplans, one for each PartitionPruneInfo.
+ *
  * On return, the plan is valid and we have sufficient locks to begin
  * execution.
  *
@@ -1139,11 +1191,13 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
  */
 CachedPlan *
 GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
-			  ResourceOwner owner, QueryEnvironment *queryEnv)
+			  ResourceOwner owner, QueryEnvironment *queryEnv,
+			  List **part_prune_result_list)
 {
 	CachedPlan *plan = NULL;
 	List	   *qlist;
 	bool		customplan;
+	List	   *my_part_prune_result_list;
 
 	/* Assert caller is doing things in a sane order */
 	Assert(plansource->magic == CACHEDPLANSOURCE_MAGIC);
@@ -1160,7 +1214,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 
 	if (!customplan)
 	{
-		if (CheckCachedPlan(plansource))
+		if (CheckCachedPlan(plansource, boundParams,
+							&my_part_prune_result_list))
 		{
 			/* We want a generic plan, and we already have a valid one */
 			plan = plansource->gplan;
@@ -1169,7 +1224,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 		else
 		{
 			/* Build a new generic plan */
-			plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv);
+			plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv,
+								   &my_part_prune_result_list);
 			/* Just make real sure plansource->gplan is clear */
 			ReleaseGenericPlan(plansource);
 			/* Link the new generic plan into the plansource */
@@ -1214,7 +1270,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 	if (customplan)
 	{
 		/* Build a custom plan */
-		plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv);
+		plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv,
+							   &my_part_prune_result_list);
 		/* Accumulate total costs of custom plans */
 		plansource->total_custom_cost += cached_plan_cost(plan, true);
 
@@ -1246,6 +1303,9 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 		plan->is_saved = true;
 	}
 
+	if (part_prune_result_list)
+		*part_prune_result_list = my_part_prune_result_list;
+
 	return plan;
 }
 
@@ -1737,17 +1797,29 @@ QueryListGetPrimaryStmt(List *stmts)
 
 /*
  * AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
- * or release them if acquire is false.
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_result_list.
+ *
+ * On return, *lockedRelids_per_stmt will contain a bitmapset for every
+ * PlannedStmt in stmt_list, containing the RT indexes of relation entries
+ * in its range table that were actually locked, or NULL if the PlannedStmt
+ * contains a utility statement.
  */
 static void
-AcquireExecutorLocks(List *stmt_list, bool acquire)
+AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+					 List **part_prune_result_list,
+					 List **lockedRelids_per_stmt)
 {
 	ListCell   *lc1;
 
+	*part_prune_result_list = *lockedRelids_per_stmt = NIL;
 	foreach(lc1, stmt_list)
 	{
 		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
-		ListCell   *lc2;
+		PartitionPruneResult *part_prune_result = NULL;
+		Bitmapset  *allLockRelids;
+		Bitmapset  *lockedRelids = NULL;
+		int			rti;
 
 		if (plannedstmt->commandType == CMD_UTILITY)
 		{
@@ -1761,13 +1833,35 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 			Query	   *query = UtilityContainsQuery(plannedstmt->utilityStmt);
 
 			if (query)
-				ScanQueryForLocks(query, acquire);
+				ScanQueryForLocks(query, true);
+			*part_prune_result_list = lappend(*part_prune_result_list, NULL);
 			continue;
 		}
 
-		foreach(lc2, plannedstmt->rtable)
+		/*
+		 * Figure out the set of relations that would need to be locked
+		 * before executing the plan.
+		 */
+		if (plannedstmt->containsInitialPruning)
 		{
-			RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
+			/*
+			 * Obtain the set of partitions to be locked from the
+			 * PartitionPruneInfos by considering the result of performing
+			 * initial partition pruning.
+			 */
+			part_prune_result =
+				ExecutorDoInitialPruning(plannedstmt, boundParams);
+
+			allLockRelids = bms_union(plannedstmt->minLockRelids,
+									  part_prune_result->scan_leafpart_rtis);
+		}
+		else
+			allLockRelids = plannedstmt->minLockRelids;
+
+		rti = -1;
+		while ((rti = bms_next_member(allLockRelids, rti)) > 0)
+		{
+			RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
 
 			if (rte->rtekind != RTE_RELATION)
 				continue;
@@ -1778,10 +1872,58 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 			 * fail if it's been dropped entirely --- we'll just transiently
 			 * acquire a non-conflicting lock.
 			 */
-			if (acquire)
-				LockRelationOid(rte->relid, rte->rellockmode);
-			else
-				UnlockRelationOid(rte->relid, rte->rellockmode);
+			LockRelationOid(rte->relid, rte->rellockmode);
+			lockedRelids = bms_add_member(lockedRelids, rti);
+		}
+
+		*part_prune_result_list = lappend(*part_prune_result_list,
+										  part_prune_result);
+		*lockedRelids_per_stmt = lappend(*lockedRelids_per_stmt, lockedRelids);
+	}
+}
+
+/*
+ * ReleaseExecutorLocks
+ * 		Release the locks acquired by an earlier call to
+ * 		AcquireExecutorLocks()
+ */
+static void
+ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt)
+{
+	ListCell   *lc1,
+			   *lc2;
+
+	forboth(lc1, stmt_list, lc2, lockedRelids_per_stmt)
+	{
+		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
+		Bitmapset  *lockedRelids = lfirst(lc2);
+		int			rti;
+
+		if (plannedstmt->commandType == CMD_UTILITY)
+		{
+			/*
+			 * Ignore utility statements, except those (such as EXPLAIN) that
+			 * contain a parsed-but-not-planned query.  Note: it's okay to use
+			 * ScanQueryForLocks, even though the query hasn't been through
+			 * rule rewriting, because rewriting doesn't change the query
+			 * representation.
+			 */
+			Query	   *query = UtilityContainsQuery(plannedstmt->utilityStmt);
+
+			if (query)
+				ScanQueryForLocks(query, false);
+			continue;
+		}
+
+		rti = -1;
+		while ((rti = bms_next_member(lockedRelids, rti)) >= 0)
+		{
+			RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
+
+			Assert(rte->rtekind == RTE_RELATION);
+
+			/* See the comment in AcquireExecutorLocks(). */
+			UnlockRelationOid(rte->relid, rte->rellockmode);
 		}
 	}
 }
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index d549f66d4a..1bbe6b704b 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -303,6 +303,25 @@ PortalDefineQuery(Portal portal,
 	portal->status = PORTAL_DEFINED;
 }
 
+/*
+ * PortalStorePartitionPruneResults
+ *		Copy the given list of PartitionPruneResults into the portal's
+ *		context
+ *
+ * This allows the caller to ensure that the list exists as long as the portal
+ * does.
+ */
+void
+PortalStorePartitionPruneResults(Portal portal, List *part_prune_results)
+{
+	MemoryContext	oldcxt;
+
+	AssertArg(PortalIsValid(portal));
+	oldcxt = MemoryContextSwitchTo(portal->portalContext);
+	portal->part_prune_results = copyObject(part_prune_results);
+	MemoryContextSwitchTo(oldcxt);
+}
+
 /*
  * PortalReleaseCachedPlan
  *		Release a portal's reference to its cached plan, if any.
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 666977fb1f..bbc8c42d88 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -87,7 +87,9 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
 							  ExplainState *es, const char *queryString,
 							  ParamListInfo params, QueryEnvironment *queryEnv);
 
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt,
+						   PartitionPruneResult *part_prune_result,
+						   IntoClause *into,
 						   ExplainState *es, const char *queryString,
 						   ParamListInfo params, QueryEnvironment *queryEnv,
 						   const instr_time *planduration,
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index bf962af7af..bd8776402e 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -45,6 +45,7 @@ extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
  * nparts						Length of subplan_map[] and subpart_map[].
  * subplan_map					Subplan index by partition index, or -1.
  * subpart_map					Subpart index by partition index, or -1.
+ * rti_map						Range table index by partition index, or 0.
  * present_parts				A Bitmapset of the partition indexes that we
  *								have subplans or subparts for.
  * initial_pruning_steps		List of PartitionPruneSteps used to
@@ -61,6 +62,7 @@ typedef struct PartitionedRelPruningData
 	int			nparts;
 	int		   *subplan_map;
 	int		   *subpart_map;
+	Index	   *rti_map;
 	Bitmapset  *present_parts;
 	List	   *initial_pruning_steps;
 	List	   *exec_pruning_steps;
@@ -126,5 +128,10 @@ extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
 													 int part_prune_index,
 													 Bitmapset **initially_valid_subplans);
 extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
-										   bool initial_prune);
+										   bool initial_prune,
+										   Bitmapset **scan_leafpart_rtis);
+extern Bitmapset *ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt,
+								ParamListInfo params,
+								PartitionPruneInfo *pruneinfo,
+								Bitmapset **scan_leafpart_rtis);
 #endif							/* EXECPARTITION_H */
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index e79e2c001f..60d5644908 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -35,6 +35,8 @@ typedef struct QueryDesc
 	/* These fields are provided by CreateQueryDesc */
 	CmdType		operation;		/* CMD_SELECT, CMD_UPDATE, etc. */
 	PlannedStmt *plannedstmt;	/* planner's output (could be utility, too) */
+	PartitionPruneResult *part_prune_result; /* ExecutorDoInitialPruning()'s
+											  * output for plannedstmt */
 	const char *sourceText;		/* source text of the query */
 	Snapshot	snapshot;		/* snapshot to use for query */
 	Snapshot	crosscheck_snapshot;	/* crosscheck for RI update/delete */
@@ -57,6 +59,7 @@ typedef struct QueryDesc
 
 /* in pquery.c */
 extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
+								  PartitionPruneResult *part_prune_result,
 								  const char *sourceText,
 								  Snapshot snapshot,
 								  Snapshot crosscheck_snapshot,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index d68a6b9d28..5c4a282be0 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -185,6 +185,8 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
 /*
  * prototypes from functions in execMain.c
  */
+extern PartitionPruneResult *ExecutorDoInitialPruning(PlannedStmt *plannedstmt,
+													  ParamListInfo params);
 extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
 extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
 extern void ExecutorRun(QueryDesc *queryDesc,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 25e0bb976e..d3ae0fa52d 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -986,6 +986,34 @@ typedef struct DomainConstraintState
  */
 typedef TupleTableSlot *(*ExecProcNodeMtd) (struct PlanState *pstate);
 
+/*----------------
+ * PartitionPruneResult
+ *
+ * The result of invoking ExecutorDoInitialPruning() on a given
+ * PlannedStmt.
+ *
+ * valid_subplan_offs_list contains one Bitmapset per PartitionPruneInfo in
+ * PlannedStmt.partPruneInfos, holding the indexes of the subplans that remain
+ * after performing initial pruning by calling ExecFindMatchingSubPlans().
+ * The RT indexes of the leaf partitions scanned by those subplans, across all
+ * PartitionPruneInfos, are collected in scan_leafpart_rtis.
+ *
+ * GetCachedPlan() uses this to inform its callers of the pruning decisions
+ * made while performing AcquireExecutorLocks() on a given cached
+ * PlannedStmt; the callers then pass it on to the executor.  When the node
+ * is available, the executor refers to it while initializing the plan nodes
+ * to which those PartitionPruneInfos apply, so that the same set of
+ * qualifying subplans is initialized rather than being derived again by
+ * redoing initial pruning.
+ */
+typedef struct PartitionPruneResult
+{
+	NodeTag		type;
+
+	List		   *valid_subplan_offs_list;
+	Bitmapset	   *scan_leafpart_rtis;
+} PartitionPruneResult;
+
 /* ----------------
  *		PlanState node
  *
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index b3b407579b..84d67d5dcf 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -97,6 +97,9 @@ typedef enum NodeTag
 	T_PartitionPruneStepCombine,
 	T_PlanInvalItem,
 
+	/* TAGS FOR EXECUTOR PREP NODES (execnodes.h) */
+	T_PartitionPruneResult,
+
 	/*
 	 * TAGS FOR PLAN STATE NODES (execnodes.h)
 	 *
@@ -674,6 +677,7 @@ extern struct Bitmapset *readBitmapset(void);
 extern uintptr_t readDatum(bool typbyval);
 extern bool *readBoolCols(int numCols);
 extern int *readIntCols(int numCols);
+extern Index *readIndexCols(int numCols);
 extern Oid *readOidCols(int numCols);
 extern int16 *readAttrNumberCols(int numCols);
 
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 6995b0ecec..c47ce6c09b 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -110,6 +110,15 @@ typedef struct PlannerGlobal
 	List	   *partPruneInfos;		/* List of PartitionPruneInfo contained in
 									 * the plan */
 
+	bool		containsInitialPruning;	/* Do any of those PartitionPruneInfos
+										 * have initial (pre-exec) pruning
+										 * steps in them? */
+
+	Bitmapset  *minLockRelids;	/* Indexes of all range table entries minus
+								 * indexes of range table entries of the leaf
+								 * partitions scanned by prunable subplans;
+								 * see AcquireExecutorLocks() */
+
 	List	   *relationOids;	/* OIDs of relations the plan depends on */
 
 	List	   *invalItems;		/* other dependencies, as PlanInvalItems */
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 297cacfb5b..ffb52e2ac2 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -67,8 +67,17 @@ typedef struct PlannedStmt
 	List	   *partPruneInfos;	/* List of PartitionPruneInfo contained in
 								 * the plan */
 
+	bool		containsInitialPruning;	/* Do any of those PartitionPruneInfos
+										 * have initial (pre-exec) pruning
+										 * steps in them? */
+
 	List	   *rtable;			/* list of RangeTblEntry nodes */
 
+	Bitmapset  *minLockRelids;	/* Indexes of all range table entries minus
+								 * indexes of range table entries of the leaf
+								 * partitions scanned by prunable subplans;
+								 * see AcquireExecutorLocks() */
+
 	/* rtable indexes of target relations for INSERT/UPDATE/DELETE */
 	List	   *resultRelations;	/* integer list of RT indexes, or NIL */
 
@@ -1196,6 +1205,13 @@ typedef struct PlanRowMark
  * prune_infos			List of Lists containing PartitionedRelPruneInfo nodes,
  *						one sublist per run-time-prunable partition hierarchy
  *						appearing in the parent plan node's subplans.
+ *
+ * needs_init_pruning	Does any of the PartitionedRelPruneInfos in
+ *						prune_infos have its initial_pruning_steps set?
+ *
+ * needs_exec_pruning	Does any of the PartitionedRelPruneInfos in
+ *						prune_infos have its exec_pruning_steps set?
+ *
  * other_subplans		Indexes of any subplans that are not accounted for
  *						by any of the PartitionedRelPruneInfo nodes in
  *						"prune_infos".  These subplans must not be pruned.
@@ -1204,6 +1220,8 @@ typedef struct PartitionPruneInfo
 {
 	NodeTag		type;
 	List	   *prune_infos;
+	bool		needs_init_pruning;
+	bool		needs_exec_pruning;
 	Bitmapset  *other_subplans;
 } PartitionPruneInfo;
 
@@ -1234,6 +1252,7 @@ typedef struct PartitionedRelPruneInfo
 	int		   *subplan_map;	/* subplan index by partition index, or -1 */
 	int		   *subpart_map;	/* subpart index by partition index, or -1 */
 	Oid		   *relid_map;		/* relation OID by partition index, or 0 */
+	Index	   *rti_map;		/* range table index by partition index, or 0 */
 
 	/*
 	 * initial_pruning_steps shows how to prune during executor startup (i.e.,
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 0499635f59..1c5bb5ece1 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -220,7 +220,8 @@ extern List *CachedPlanGetTargetList(CachedPlanSource *plansource,
 extern CachedPlan *GetCachedPlan(CachedPlanSource *plansource,
 								 ParamListInfo boundParams,
 								 ResourceOwner owner,
-								 QueryEnvironment *queryEnv);
+								 QueryEnvironment *queryEnv,
+								 List **part_prune_result_list);
 extern void ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner);
 
 extern bool CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index aeddbdafe5..9f7727a837 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -138,6 +138,7 @@ typedef struct PortalData
 	QueryCompletion qc;			/* command completion data for executed query */
 	List	   *stmts;			/* list of PlannedStmts */
 	CachedPlan *cplan;			/* CachedPlan, if stmts are from one */
+	List	   *part_prune_results;	/* list of PartitionPruneResults */
 
 	ParamListInfo portalParams; /* params to pass to query */
 	QueryEnvironment *queryEnv; /* environment for query */
@@ -242,6 +243,8 @@ extern void PortalDefineQuery(Portal portal,
 							  CommandTag commandTag,
 							  List *stmts,
 							  CachedPlan *cplan);
+extern void PortalStorePartitionPruneResults(Portal portal,
+											 List *part_prune_result_list);
 extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
 extern void PortalCreateHoldStore(Portal portal);
 extern void PortalHashTableDeleteAll(void);
-- 
2.35.3
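
For readers following the plancache.c hunks above, the acquire/release
bookkeeping can be sketched in isolation.  This is only a minimal standalone
sketch, not code from the patch: a uint64_t stands in for Bitmapset, the
toy_lock_count[] array stands in for the lock manager, and the names
toy_next_member(), toy_acquire_locks() and toy_release_locks() are all
hypothetical.  It shows the idea of locking minLockRelids plus the leaf
partitions that survive initial pruning, remembering which RT indexes were
locked, and later releasing exactly that set.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MAX_RTI 64

/* Hypothetical stand-in for the lock manager: a lock count per RT index. */
static int toy_lock_count[TOY_MAX_RTI];

/*
 * Smallest member of 'set' greater than 'prev', or -2 if there is none.
 * This follows the return convention of bms_next_member().
 */
static int
toy_next_member(uint64_t set, int prev)
{
	for (int i = prev + 1; i < TOY_MAX_RTI; i++)
	{
		if (set & (UINT64_C(1) << i))
			return i;
	}
	return -2;
}

/*
 * Lock the always-needed relations plus the leaf partitions that survived
 * initial pruning, and return the set of RT indexes actually locked.
 */
static uint64_t
toy_acquire_locks(uint64_t min_lock_relids, uint64_t scan_leafpart_rtis)
{
	uint64_t	to_lock = min_lock_relids | scan_leafpart_rtis;
	uint64_t	locked = 0;
	int			rti = -1;

	while ((rti = toy_next_member(to_lock, rti)) >= 0)
	{
		toy_lock_count[rti]++;
		locked |= UINT64_C(1) << rti;
	}
	return locked;
}

/* Release exactly the locks recorded by toy_acquire_locks(). */
static void
toy_release_locks(uint64_t locked)
{
	int			rti = -1;

	while ((rti = toy_next_member(locked, rti)) >= 0)
	{
		assert(toy_lock_count[rti] > 0);
		toy_lock_count[rti]--;
	}
}
```

Returning the locked set from the acquire step is what lets the release step
avoid redoing the pruning: it only has to iterate over what was recorded.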




* Re: generic plans and "initial" pruning
@ 2022-05-27 20:08  Zhihong Yu <[email protected]>
  parent: Amit Langote <[email protected]>
  1 sibling, 0 replies; 108+ messages in thread

From: Zhihong Yu @ 2022-05-27 20:08 UTC (permalink / raw)
  To: Amit Langote <[email protected]>; +Cc: David Rowley <[email protected]>; Robert Haas <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers

On Fri, May 27, 2022 at 1:10 AM Amit Langote <[email protected]> wrote:

> On Mon, Apr 11, 2022 at 12:53 PM Zhihong Yu <[email protected]> wrote:
> > On Sun, Apr 10, 2022 at 8:05 PM Amit Langote <[email protected]> wrote:
> >> Sending v15 that fixes that to keep the cfbot green for now.
> >
> > Hi,
> >
> > +               /* RT index of the partitione table. */
> >
> > partitione -> partitioned
>
> Thanks, fixed.
>
> Also, I broke this into patches:
>
> 0001 contains the mechanical changes of moving PartitionPruneInfo out
> of Append/MergeAppend into a list in PlannedStmt.
>
> 0002 is the main patch to "Optimize AcquireExecutorLocks() by locking
> only unpruned partitions".
>
> --
> Thanks, Amit Langote
> EDB: http://www.enterprisedb.com

Hi,
In the description:

is made available to the actual execution via
PartitionPruneResult, made available along with the PlannedStmt by the

I think the second `made available` is redundant (can be omitted).

+ * Initial pruning is performed here if needed (unless it has already been done
+ * by ExecDoInitialPruning()), and in that case only the surviving subplans'

I wonder if there is a typo above - I can't find ExecDoInitialPruning
in either the PG codebase or the patches (except in this comment).
I think it should be ExecutorDoInitialPruning.

+    * bit from it just above to prevent empty tail bits resulting in

I searched the code base but didn't find any mention of `empty tail bit`.
Would you mind explaining it a bit?

Cheers



* Re: generic plans and "initial" pruning
@ 2022-07-05 17:43  Jacob Champion <[email protected]>
  parent: Amit Langote <[email protected]>
  1 sibling, 1 reply; 108+ messages in thread

From: Jacob Champion @ 2022-07-05 17:43 UTC (permalink / raw)
  To: Amit Langote <[email protected]>; +Cc: Zhihong Yu <[email protected]>; David Rowley <[email protected]>; Robert Haas <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers

On Fri, May 27, 2022 at 1:09 AM Amit Langote <[email protected]> wrote:
> 0001 contains the mechanical changes of moving PartitionPruneInfo out
> of Append/MergeAppend into a list in PlannedStmt.
>
> 0002 is the main patch to "Optimize AcquireExecutorLocks() by locking
> only unpruned partitions".

This patchset will need to be rebased over 835d476fd21; looks like
just a cosmetic change.

--Jacob






* Re: generic plans and "initial" pruning
@ 2022-07-06 02:37  Amit Langote <[email protected]>
  parent: Jacob Champion <[email protected]>
  0 siblings, 2 replies; 108+ messages in thread

From: Amit Langote @ 2022-07-06 02:37 UTC (permalink / raw)
  To: Jacob Champion <[email protected]>; +Cc: Zhihong Yu <[email protected]>; David Rowley <[email protected]>; Robert Haas <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers

On Wed, Jul 6, 2022 at 2:43 AM Jacob Champion <[email protected]> wrote:
> On Fri, May 27, 2022 at 1:09 AM Amit Langote <[email protected]> wrote:
> > 0001 contains the mechanical changes of moving PartitionPruneInfo out
> > of Append/MergeAppend into a list in PlannedStmt.
> >
> > 0002 is the main patch to "Optimize AcquireExecutorLocks() by locking
> > only unpruned partitions".
>
> This patchset will need to be rebased over 835d476fd21; looks like
> just a cosmetic change.

Thanks for the heads up.

Rebased, and also addressed the comments Zhihong Yu gave on May 28.

-- 
Thanks, Amit Langote
EDB: http://www.enterprisedb.com


Attachments:

  [application/octet-stream] v17-0001-Move-PartitioPruneInfo-out-of-plan-nodes-into-Pl.patch (21.2K, 2-v17-0001-Move-PartitioPruneInfo-out-of-plan-nodes-into-Pl.patch)
From 665055be44caaec9dcc2a3251f20ceb3c678fa3d Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Fri, 27 May 2022 16:00:28 +0900
Subject: [PATCH v17 1/2] Move PartitionPruneInfo out of plan nodes into
 PlannedStmt

The planner will now add a given PartitionPruneInfo to
PlannedStmt.partPruneInfos instead of directly to the
Append/MergeAppend plan node.  What gets set instead in the
latter is an index field which points to the list element
of PlannedStmt.partPruneInfos containing the PartitionPruneInfo
belonging to the plan node.

A later commit will make AcquireExecutorLocks() do the initial
partition pruning to determine a minimal set of partitions to be
locked when validating a plan tree, and it will need to consult the
PartitionPruneInfos referenced therein to do so.  It is better for
the PartitionPruneInfos to be directly accessible than to require a
walk of the plan tree to find them; with this change, finding them
is a simple iteration over PlannedStmt.partPruneInfos.
---
 src/backend/executor/execMain.c         |  1 +
 src/backend/executor/execParallel.c     |  1 +
 src/backend/executor/execPartition.c    |  4 +-
 src/backend/executor/execUtils.c        |  2 +
 src/backend/executor/nodeAppend.c       |  4 +-
 src/backend/executor/nodeMergeAppend.c  |  4 +-
 src/backend/nodes/copyfuncs.c           |  5 +-
 src/backend/nodes/outfuncs.c            |  7 ++-
 src/backend/nodes/readfuncs.c           |  5 +-
 src/backend/optimizer/plan/createplan.c | 24 ++++-----
 src/backend/optimizer/plan/planner.c    |  1 +
 src/backend/optimizer/plan/setrefs.c    | 65 +++++++++++++------------
 src/backend/partitioning/partprune.c    | 18 ++++---
 src/include/executor/execPartition.h    |  3 +-
 src/include/nodes/execnodes.h           |  2 +
 src/include/nodes/pathnodes.h           |  6 +++
 src/include/nodes/plannodes.h           | 11 +++--
 src/include/partitioning/partprune.h    |  8 +--
 18 files changed, 103 insertions(+), 68 deletions(-)

diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index ef2fd46092..72fc273524 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -825,6 +825,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 	ExecInitRangeTable(estate, rangeTable);
 
 	estate->es_plannedstmt = plannedstmt;
+	estate->es_part_prune_infos = plannedstmt->partPruneInfos;
 
 	/*
 	 * Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index f1fd7f7e8b..f73b8c2607 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -183,6 +183,7 @@ ExecSerializePlan(Plan *plan, EState *estate)
 	pstmt->dependsOnRole = false;
 	pstmt->parallelModeNeeded = false;
 	pstmt->planTree = plan;
+	pstmt->partPruneInfos = estate->es_part_prune_infos;
 	pstmt->rtable = estate->es_range_table;
 	pstmt->resultRelations = NIL;
 	pstmt->appendRelations = NIL;
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index e03ea27299..b55cdd2580 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -1638,11 +1638,13 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
 PartitionPruneState *
 ExecInitPartitionPruning(PlanState *planstate,
 						 int n_total_subplans,
-						 PartitionPruneInfo *pruneinfo,
+						 int part_prune_index,
 						 Bitmapset **initially_valid_subplans)
 {
 	PartitionPruneState *prunestate;
 	EState	   *estate = planstate->state;
+	PartitionPruneInfo *pruneinfo = list_nth(estate->es_part_prune_infos,
+											 part_prune_index);
 
 	/* We may need an expression context to evaluate partition exprs */
 	ExecAssignExprContext(estate, planstate);
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 9df1f81ea8..f9c7976ff2 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -119,6 +119,8 @@ CreateExecutorState(void)
 	estate->es_relations = NULL;
 	estate->es_rowmarks = NULL;
 	estate->es_plannedstmt = NULL;
+	estate->es_part_prune_infos = NIL;
+	estate->es_part_prune_result = NULL;
 
 	estate->es_junkFilter = NULL;
 
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 357e10a1d7..c6f86a6510 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -134,7 +134,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 	appendstate->as_begun = false;
 
 	/* If run-time partition pruning is enabled, then set that up now */
-	if (node->part_prune_info != NULL)
+	if (node->part_prune_index >= 0)
 	{
 		PartitionPruneState *prunestate;
 
@@ -145,7 +145,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 		 */
 		prunestate = ExecInitPartitionPruning(&appendstate->ps,
 											  list_length(node->appendplans),
-											  node->part_prune_info,
+											  node->part_prune_index,
 											  &validsubplans);
 		appendstate->as_prune_state = prunestate;
 		nplans = bms_num_members(validsubplans);
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index c5c62fa5c7..8d35860c30 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -82,7 +82,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
 	mergestate->ps.ExecProcNode = ExecMergeAppend;
 
 	/* If run-time partition pruning is enabled, then set that up now */
-	if (node->part_prune_info != NULL)
+	if (node->part_prune_index >= 0)
 	{
 		PartitionPruneState *prunestate;
 
@@ -93,7 +93,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
 		 */
 		prunestate = ExecInitPartitionPruning(&mergestate->ps,
 											  list_length(node->mergeplans),
-											  node->part_prune_info,
+											  node->part_prune_index,
 											  &validsubplans);
 		mergestate->ms_prune_state = prunestate;
 		nplans = bms_num_members(validsubplans);
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 706d283a92..b02b4a641c 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -96,6 +96,7 @@ _copyPlannedStmt(const PlannedStmt *from)
 	COPY_SCALAR_FIELD(parallelModeNeeded);
 	COPY_SCALAR_FIELD(jitFlags);
 	COPY_NODE_FIELD(planTree);
+	COPY_NODE_FIELD(partPruneInfos);
 	COPY_NODE_FIELD(rtable);
 	COPY_NODE_FIELD(resultRelations);
 	COPY_NODE_FIELD(appendRelations);
@@ -253,7 +254,7 @@ _copyAppend(const Append *from)
 	COPY_NODE_FIELD(appendplans);
 	COPY_SCALAR_FIELD(nasyncplans);
 	COPY_SCALAR_FIELD(first_partial_plan);
-	COPY_NODE_FIELD(part_prune_info);
+	COPY_SCALAR_FIELD(part_prune_index);
 
 	return newnode;
 }
@@ -281,7 +282,7 @@ _copyMergeAppend(const MergeAppend *from)
 	COPY_POINTER_FIELD(sortOperators, from->numCols * sizeof(Oid));
 	COPY_POINTER_FIELD(collations, from->numCols * sizeof(Oid));
 	COPY_POINTER_FIELD(nullsFirst, from->numCols * sizeof(bool));
-	COPY_NODE_FIELD(part_prune_info);
+	COPY_SCALAR_FIELD(part_prune_index);
 
 	return newnode;
 }
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 4315c53080..7618444b4d 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -325,6 +325,7 @@ _outPlannedStmt(StringInfo str, const PlannedStmt *node)
 	WRITE_BOOL_FIELD(parallelModeNeeded);
 	WRITE_INT_FIELD(jitFlags);
 	WRITE_NODE_FIELD(planTree);
+	WRITE_NODE_FIELD(partPruneInfos);
 	WRITE_NODE_FIELD(rtable);
 	WRITE_NODE_FIELD(resultRelations);
 	WRITE_NODE_FIELD(appendRelations);
@@ -454,7 +455,7 @@ _outAppend(StringInfo str, const Append *node)
 	WRITE_NODE_FIELD(appendplans);
 	WRITE_INT_FIELD(nasyncplans);
 	WRITE_INT_FIELD(first_partial_plan);
-	WRITE_NODE_FIELD(part_prune_info);
+	WRITE_INT_FIELD(part_prune_index);
 }
 
 static void
@@ -471,7 +472,7 @@ _outMergeAppend(StringInfo str, const MergeAppend *node)
 	WRITE_OID_ARRAY(sortOperators, node->numCols);
 	WRITE_OID_ARRAY(collations, node->numCols);
 	WRITE_BOOL_ARRAY(nullsFirst, node->numCols);
-	WRITE_NODE_FIELD(part_prune_info);
+	WRITE_INT_FIELD(part_prune_index);
 }
 
 static void
@@ -2438,6 +2439,7 @@ _outPlannerGlobal(StringInfo str, const PlannerGlobal *node)
 	WRITE_NODE_FIELD(finalrowmarks);
 	WRITE_NODE_FIELD(resultRelations);
 	WRITE_NODE_FIELD(appendRelations);
+	WRITE_NODE_FIELD(partPruneInfos);
 	WRITE_NODE_FIELD(relationOids);
 	WRITE_NODE_FIELD(invalItems);
 	WRITE_NODE_FIELD(paramExecTypes);
@@ -2505,6 +2507,7 @@ _outPlannerInfo(StringInfo str, const PlannerInfo *node)
 	WRITE_BITMAPSET_FIELD(curOuterRels);
 	WRITE_NODE_FIELD(curOuterParams);
 	WRITE_BOOL_FIELD(partColsUpdated);
+	WRITE_NODE_FIELD(partPruneInfos);
 }
 
 static void
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 6a05b69415..bf602ff93e 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -1817,6 +1817,7 @@ _readPlannedStmt(void)
 	READ_BOOL_FIELD(parallelModeNeeded);
 	READ_INT_FIELD(jitFlags);
 	READ_NODE_FIELD(planTree);
+	READ_NODE_FIELD(partPruneInfos);
 	READ_NODE_FIELD(rtable);
 	READ_NODE_FIELD(resultRelations);
 	READ_NODE_FIELD(appendRelations);
@@ -1949,7 +1950,7 @@ _readAppend(void)
 	READ_NODE_FIELD(appendplans);
 	READ_INT_FIELD(nasyncplans);
 	READ_INT_FIELD(first_partial_plan);
-	READ_NODE_FIELD(part_prune_info);
+	READ_INT_FIELD(part_prune_index);
 
 	READ_DONE();
 }
@@ -1971,7 +1972,7 @@ _readMergeAppend(void)
 	READ_OID_ARRAY(sortOperators, local_node->numCols);
 	READ_OID_ARRAY(collations, local_node->numCols);
 	READ_BOOL_ARRAY(nullsFirst, local_node->numCols);
-	READ_NODE_FIELD(part_prune_info);
+	READ_INT_FIELD(part_prune_index);
 
 	READ_DONE();
 }
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index 76606faa3e..58a05cf673 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -1203,7 +1203,6 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
 	ListCell   *subpaths;
 	int			nasyncplans = 0;
 	RelOptInfo *rel = best_path->path.parent;
-	PartitionPruneInfo *partpruneinfo = NULL;
 	int			nodenumsortkeys = 0;
 	AttrNumber *nodeSortColIdx = NULL;
 	Oid		   *nodeSortOperators = NULL;
@@ -1354,6 +1353,9 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
 		subplans = lappend(subplans, subplan);
 	}
 
+	/* Set below if we find quals that we can use to run-time prune */
+	plan->part_prune_index = -1;
+
 	/*
 	 * If any quals exist, they may be useful to perform further partition
 	 * pruning during execution.  Gather information needed by the executor to
@@ -1377,16 +1379,14 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
 		}
 
 		if (prunequal != NIL)
-			partpruneinfo =
-				make_partition_pruneinfo(root, rel,
-										 best_path->subpaths,
-										 prunequal);
+			plan->part_prune_index = make_partition_pruneinfo(root, rel,
+															  best_path->subpaths,
+															  prunequal);
 	}
 
 	plan->appendplans = subplans;
 	plan->nasyncplans = nasyncplans;
 	plan->first_partial_plan = best_path->first_partial_path;
-	plan->part_prune_info = partpruneinfo;
 
 	copy_generic_path_info(&plan->plan, (Path *) best_path);
 
@@ -1426,7 +1426,6 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
 	List	   *subplans = NIL;
 	ListCell   *subpaths;
 	RelOptInfo *rel = best_path->path.parent;
-	PartitionPruneInfo *partpruneinfo = NULL;
 
 	/*
 	 * We don't have the actual creation of the MergeAppend node split out
@@ -1519,6 +1518,9 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
 		subplans = lappend(subplans, subplan);
 	}
 
+	/* Set below if we find quals that we can use to run-time prune */
+	node->part_prune_index = -1;
+
 	/*
 	 * If any quals exist, they may be useful to perform further partition
 	 * pruning during execution.  Gather information needed by the executor to
@@ -1542,13 +1544,13 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
 		}
 
 		if (prunequal != NIL)
-			partpruneinfo = make_partition_pruneinfo(root, rel,
-													 best_path->subpaths,
-													 prunequal);
+			node->part_prune_index = make_partition_pruneinfo(root, rel,
+															  best_path->subpaths,
+															  prunequal);
 	}
 
 	node->mergeplans = subplans;
-	node->part_prune_info = partpruneinfo;
+
 
 	/*
 	 * If prepare_sort_from_pathkeys added sort columns, but we were told to
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 06ad856eac..b11249ed8f 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -518,6 +518,7 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 	result->dependsOnRole = glob->dependsOnRole;
 	result->parallelModeNeeded = glob->parallelModeNeeded;
 	result->planTree = top_plan;
+	result->partPruneInfos = glob->partPruneInfos;
 	result->rtable = glob->finalrtable;
 	result->resultRelations = glob->resultRelations;
 	result->appendRelations = glob->appendRelations;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 9cef92cab2..b8d5610593 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -348,6 +348,29 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 		}
 	}
 
+	/* Also fix up the information in PartitionPruneInfos. */
+	foreach (lc, root->partPruneInfos)
+	{
+		PartitionPruneInfo *pruneinfo = lfirst(lc);
+		ListCell  *l;
+
+		foreach(l, pruneinfo->prune_infos)
+		{
+			List	   *prune_infos = lfirst(l);
+			ListCell   *l2;
+
+			foreach(l2, prune_infos)
+			{
+				PartitionedRelPruneInfo *pinfo = lfirst(l2);
+
+				/* RT index of the table to which the pinfo belongs. */
+				pinfo->rtindex += rtoffset;
+			}
+		}
+
+		glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
+	}
+
 	return result;
 }
 
@@ -1655,21 +1678,12 @@ set_append_references(PlannerInfo *root,
 
 	aplan->apprelids = offset_relid_set(aplan->apprelids, rtoffset);
 
-	if (aplan->part_prune_info)
-	{
-		foreach(l, aplan->part_prune_info->prune_infos)
-		{
-			List	   *prune_infos = lfirst(l);
-			ListCell   *l2;
-
-			foreach(l2, prune_infos)
-			{
-				PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
-				pinfo->rtindex += rtoffset;
-			}
-		}
-	}
+	/*
+	 * PartitionPruneInfos will be added to a list in PlannerGlobal, so update
+	 * the index.
+	 */
+	if (aplan->part_prune_index >= 0)
+		aplan->part_prune_index += list_length(root->glob->partPruneInfos);
 
 	/* We don't need to recurse to lefttree or righttree ... */
 	Assert(aplan->plan.lefttree == NULL);
@@ -1727,21 +1741,12 @@ set_mergeappend_references(PlannerInfo *root,
 
 	mplan->apprelids = offset_relid_set(mplan->apprelids, rtoffset);
 
-	if (mplan->part_prune_info)
-	{
-		foreach(l, mplan->part_prune_info->prune_infos)
-		{
-			List	   *prune_infos = lfirst(l);
-			ListCell   *l2;
-
-			foreach(l2, prune_infos)
-			{
-				PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
-				pinfo->rtindex += rtoffset;
-			}
-		}
-	}
+	/*
+	 * PartitionPruneInfos will be added to a list in PlannerGlobal, so update
+	 * the index.
+	 */
+	if (mplan->part_prune_index >= 0)
+		mplan->part_prune_index += list_length(root->glob->partPruneInfos);
 
 	/* We don't need to recurse to lefttree or righttree ... */
 	Assert(mplan->plan.lefttree == NULL);
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 9d3c05aed3..d77f7d3aef 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -209,16 +209,20 @@ static void partkey_datum_from_expr(PartitionPruneContext *context,
 
 /*
  * make_partition_pruneinfo
- *		Builds a PartitionPruneInfo which can be used in the executor to allow
- *		additional partition pruning to take place.  Returns NULL when
- *		partition pruning would be useless.
+ *		Checks if the given set of quals can be used to build pruning steps
+ *		that the executor can use to prune away unneeded partitions.  If
+ *		suitable quals are found then a PartitionPruneInfo is built and tagged
+ *		onto the PlannerInfo's partPruneInfos list.
+ *
+ * The return value is the 0-based index of the item added to the
+ * partPruneInfos list or -1 if nothing was added.
  *
  * 'parentrel' is the RelOptInfo for an appendrel, and 'subpaths' is the list
  * of scan paths for its child rels.
  * 'prunequal' is a list of potential pruning quals (i.e., restriction
  * clauses that are applicable to the appendrel).
  */
-PartitionPruneInfo *
+int
 make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 						 List *subpaths,
 						 List *prunequal)
@@ -332,7 +336,7 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	 * quals, then we can just not bother with run-time pruning.
 	 */
 	if (prunerelinfos == NIL)
-		return NULL;
+		return -1;
 
 	/* Else build the result data structure */
 	pruneinfo = makeNode(PartitionPruneInfo);
@@ -358,7 +362,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	else
 		pruneinfo->other_subplans = NULL;
 
-	return pruneinfo;
+	root->partPruneInfos = lappend(root->partPruneInfos, pruneinfo);
+
+	return list_length(root->partPruneInfos) - 1;
 }
 
 /*
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 708435e952..bf962af7af 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -123,9 +123,8 @@ typedef struct PartitionPruneState
 
 extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
 													 int n_total_subplans,
-													 PartitionPruneInfo *pruneinfo,
+													 int part_prune_index,
 													 Bitmapset **initially_valid_subplans);
 extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
 										   bool initial_prune);
-
 #endif							/* EXECPARTITION_H */
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 5728801379..25e0bb976e 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -596,6 +596,8 @@ typedef struct EState
 	struct ExecRowMark **es_rowmarks;	/* Array of per-range-table-entry
 										 * ExecRowMarks, or NULL if none */
 	PlannedStmt *es_plannedstmt;	/* link to top of plan tree */
+	List		*es_part_prune_infos;	/* PlannedStmt.partPruneInfos */
+	struct PartitionPruneResult *es_part_prune_result; /* QueryDesc.part_prune_result */
 	const char *es_sourceText;	/* Source text from QueryDesc */
 
 	JunkFilter *es_junkFilter;	/* top-level junk filter, if any */
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index b88cfb8dc0..a0f3a46334 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -107,6 +107,9 @@ typedef struct PlannerGlobal
 
 	List	   *appendRelations;	/* "flat" list of AppendRelInfos */
 
+	List	   *partPruneInfos;		/* List of PartitionPruneInfo contained in
+									 * the plan */
+
 	List	   *relationOids;	/* OIDs of relations the plan depends on */
 
 	List	   *invalItems;		/* other dependencies, as PlanInvalItems */
@@ -386,6 +389,9 @@ struct PlannerInfo
 
 	/* Does this query modify any partition key columns? */
 	bool		partColsUpdated;
+
+	/* PartitionPruneInfos added in this query's plan. */
+	List	   *partPruneInfos;
 };
 
 
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index d5c0ebe859..c3f4a39657 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -64,6 +64,9 @@ typedef struct PlannedStmt
 
 	struct Plan *planTree;		/* tree of Plan nodes */
 
+	List	   *partPruneInfos;	/* List of PartitionPruneInfo contained in
+								 * the plan */
+
 	List	   *rtable;			/* list of RangeTblEntry nodes */
 
 	/* rtable indexes of target relations for INSERT/UPDATE/DELETE */
@@ -262,8 +265,8 @@ typedef struct Append
 	 */
 	int			first_partial_plan;
 
-	/* Info for run-time subplan pruning; NULL if we're not doing that */
-	struct PartitionPruneInfo *part_prune_info;
+	/* Index to PlannerInfo.partPruneInfos or -1 if no run-time pruning */
+	int			part_prune_index;
 } Append;
 
 /* ----------------
@@ -297,8 +300,8 @@ typedef struct MergeAppend
 	/* NULLS FIRST/LAST directions */
 	bool	   *nullsFirst;
 
-	/* Info for run-time subplan pruning; NULL if we're not doing that */
-	struct PartitionPruneInfo *part_prune_info;
+	/* Index to PlannerInfo.partPruneInfos or -1 if no run-time pruning */
+	int			part_prune_index;
 } MergeAppend;
 
 /* ----------------
diff --git a/src/include/partitioning/partprune.h b/src/include/partitioning/partprune.h
index 90684efa25..ebf0dcff8c 100644
--- a/src/include/partitioning/partprune.h
+++ b/src/include/partitioning/partprune.h
@@ -70,10 +70,10 @@ typedef struct PartitionPruneContext
 #define PruneCxtStateIdx(partnatts, step_id, keyno) \
 	((partnatts) * (step_id) + (keyno))
 
-extern PartitionPruneInfo *make_partition_pruneinfo(struct PlannerInfo *root,
-													struct RelOptInfo *parentrel,
-													List *subpaths,
-													List *prunequal);
+extern int make_partition_pruneinfo(struct PlannerInfo *root,
+									struct RelOptInfo *parentrel,
+									List *subpaths,
+									List *prunequal);
 extern Bitmapset *prune_append_rel_partitions(struct RelOptInfo *rel);
 extern Bitmapset *get_matching_partitions(PartitionPruneContext *context,
 										  List *pruning_steps);
-- 
2.35.3



  [application/octet-stream] v17-0002-Optimize-AcquireExecutorLocks-by-locking-only-un.patch (87.2K, 3-v17-0002-Optimize-AcquireExecutorLocks-by-locking-only-un.patch)
From e5d0283732311fb068ad75ee4ff282ebe5306266 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Wed, 22 Dec 2021 16:55:17 +0900
Subject: [PATCH v17 2/2] Optimize AcquireExecutorLocks() by locking only
 unpruned partitions

This commit teaches AcquireExecutorLocks() to perform initial
partition pruning on a generic cached plan, notionally eliminating the
subnodes that need not be initialized during the actual execution of
the plan, and to skip locking the partitions scanned by those
subnodes.

The result of that pruning, done before execution has started, is
passed to the executor as a PartitionPruneResult, which callers that
used plancache.c to get the plan supply along with the PlannedStmt.
It is NULL when the plan is obtained by calling the planner directly
or when the plan obtained from plancache.c is not a generic one.
---
 src/backend/commands/copyto.c          |   2 +-
 src/backend/commands/createas.c        |   2 +-
 src/backend/commands/explain.c         |   7 +-
 src/backend/commands/extension.c       |   2 +-
 src/backend/commands/matview.c         |   2 +-
 src/backend/commands/prepare.c         |  26 ++-
 src/backend/executor/README            |  32 ++++
 src/backend/executor/execMain.c        |  53 ++++++
 src/backend/executor/execParallel.c    |  27 ++-
 src/backend/executor/execPartition.c   | 234 +++++++++++++++++++++----
 src/backend/executor/functions.c       |   2 +-
 src/backend/executor/nodeAppend.c      |  11 +-
 src/backend/executor/nodeMergeAppend.c |   5 +-
 src/backend/executor/spi.c             |  27 ++-
 src/backend/nodes/copyfuncs.c          |  27 +++
 src/backend/nodes/outfuncs.c           |  29 +++
 src/backend/nodes/readfuncs.c          |  51 ++++++
 src/backend/optimizer/plan/planner.c   |   2 +
 src/backend/optimizer/plan/setrefs.c   |  46 +++++
 src/backend/partitioning/partprune.c   |  41 ++++-
 src/backend/tcop/postgres.c            |   8 +-
 src/backend/tcop/pquery.c              |  28 ++-
 src/backend/utils/cache/plancache.c    | 184 ++++++++++++++++---
 src/backend/utils/mmgr/portalmem.c     |  19 ++
 src/include/commands/explain.h         |   4 +-
 src/include/executor/execPartition.h   |   9 +-
 src/include/executor/execdesc.h        |   3 +
 src/include/executor/executor.h        |   2 +
 src/include/nodes/execnodes.h          |  27 +++
 src/include/nodes/nodes.h              |   4 +
 src/include/nodes/pathnodes.h          |   9 +
 src/include/nodes/plannodes.h          |  21 +++
 src/include/utils/plancache.h          |   3 +-
 src/include/utils/portal.h             |   3 +
 34 files changed, 856 insertions(+), 96 deletions(-)

diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index fca29a9a10..d839517693 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -541,7 +541,7 @@ BeginCopyTo(ParseState *pstate,
 		((DR_copy *) dest)->cstate = cstate;
 
 		/* Create a QueryDesc requesting no output */
-		cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+		cstate->queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
 											dest, NULL, NULL, 0);
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 9abbb6b555..f6607f2454 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -325,7 +325,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
 		UpdateActiveSnapshotCommandId();
 
 		/* Create a QueryDesc, redirecting output to our tuple receiver */
-		queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+		queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
 									GetActiveSnapshot(), InvalidSnapshot,
 									dest, params, queryEnv, 0);
 
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index e29c2ae206..e41b13a3ea 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -407,7 +407,7 @@ ExplainOneQuery(Query *query, int cursorOptions,
 		}
 
 		/* run it (if needed) and produce output */
-		ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+		ExplainOnePlan(plan, NULL, into, es, queryString, params, queryEnv,
 					   &planduration, (es->buffers ? &bufusage : NULL));
 	}
 }
@@ -515,7 +515,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
  * to call it.
  */
 void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, PartitionPruneResult *part_prune_result,
+			   IntoClause *into, ExplainState *es,
 			   const char *queryString, ParamListInfo params,
 			   QueryEnvironment *queryEnv, const instr_time *planduration,
 			   const BufferUsage *bufusage)
@@ -563,7 +564,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
 		dest = None_Receiver;
 
 	/* Create a QueryDesc for the query */
-	queryDesc = CreateQueryDesc(plannedstmt, queryString,
+	queryDesc = CreateQueryDesc(plannedstmt, part_prune_result, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, params, queryEnv, instrument_option);
 
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index 3db859c3ea..631cc07217 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -776,7 +776,7 @@ execute_sql_string(const char *sql)
 			{
 				QueryDesc  *qdesc;
 
-				qdesc = CreateQueryDesc(stmt,
+				qdesc = CreateQueryDesc(stmt, NULL,
 										sql,
 										GetActiveSnapshot(), NULL,
 										dest, NULL, NULL, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index d1ee106465..e878209674 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -408,7 +408,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
 	UpdateActiveSnapshotCommandId();
 
 	/* Create a QueryDesc, redirecting output to our tuple receiver */
-	queryDesc = CreateQueryDesc(plan, queryString,
+	queryDesc = CreateQueryDesc(plan, NULL, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, NULL, NULL, 0);
 
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 2333aae467..83465e40f8 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -155,6 +155,7 @@ ExecuteQuery(ParseState *pstate,
 	PreparedStatement *entry;
 	CachedPlan *cplan;
 	List	   *plan_list;
+	List	   *part_prune_result_list;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
 	Portal		portal;
@@ -193,7 +194,10 @@ ExecuteQuery(ParseState *pstate,
 									   entry->plansource->query_string);
 
 	/* Replan if needed, and increment plan refcount for portal */
-	cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL);
+	cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL,
+						  &part_prune_result_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_result_list));
 	plan_list = cplan->stmt_list;
 
 	/*
@@ -207,6 +211,9 @@ ExecuteQuery(ParseState *pstate,
 					  plan_list,
 					  cplan);
 
+	/* Copy PartitionPruneResults into the portal's context. */
+	PortalStorePartitionPruneResults(portal, part_prune_result_list);
+
 	/*
 	 * For CREATE TABLE ... AS EXECUTE, we must verify that the prepared
 	 * statement is one that produces tuples.  Currently we insist that it be
@@ -576,7 +583,9 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	const char *query_string;
 	CachedPlan *cplan;
 	List	   *plan_list;
-	ListCell   *p;
+	List	   *part_prune_result_list;
+	ListCell   *p,
+			   *pp;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
 	instr_time	planstart;
@@ -619,7 +628,10 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 
 	/* Replan if needed, and acquire a transient refcount */
 	cplan = GetCachedPlan(entry->plansource, paramLI,
-						  CurrentResourceOwner, queryEnv);
+						  CurrentResourceOwner, queryEnv,
+						  &part_prune_result_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_result_list));
 
 	INSTR_TIME_SET_CURRENT(planduration);
 	INSTR_TIME_SUBTRACT(planduration, planstart);
@@ -634,13 +646,15 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	plan_list = cplan->stmt_list;
 
 	/* Explain each query */
-	foreach(p, plan_list)
+	forboth(p, plan_list, pp, part_prune_result_list)
 	{
 		PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
+		PartitionPruneResult *part_prune_result = lfirst_node(PartitionPruneResult, pp);
 
 		if (pstmt->commandType != CMD_UTILITY)
-			ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv,
-						   &planduration, (es->buffers ? &bufusage : NULL));
+			ExplainOnePlan(pstmt, part_prune_result, into, es, query_string,
+						   paramLI, queryEnv, &planduration,
+						   (es->buffers ? &bufusage : NULL));
 		else
 			ExplainOneUtility(pstmt->utilityStmt, into, es, query_string,
 							  paramLI, queryEnv);
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 0b5183fc4a..953a476ea5 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -65,6 +65,34 @@ found there.  This currently only occurs for Append and MergeAppend nodes.  In
 this case the non-required subplans are ignored and the executor state's
 subnode array will become out of sequence to the plan's subplan list.
 
+In fact, so-called execution-time pruning may occur even before execution
+has started.  One case where that occurs is when a cached generic
+plan is being validated for execution by plancache.c: GetCachedPlan(), which
+works by locking all the relations that will be scanned by that plan.  If the
+generic plan contains nodes that can perform execution time partition pruning
+(that is, contain a PartitionPruneInfo), a subset of pruning steps contained
+in a given node's PartitionPruneInfo that do not depend on the execution
+actually having started (called "initial" pruning steps) are performed at this
+point to figure out the minimal set of child subplans that satisfy those
+pruning steps.  AcquireExecutorLocks() looking at a given plan tree will then
+lock only the relations scanned by the child subplans that survived such
+pruning, along with those present in PlannedStmt.minLockRelids.  Note that the
+subplans are only notionally pruned in that they are not removed from the plan
+tree as such.
+
+To prevent the executor and any third party execution code that can look at
+the plan tree from trying to execute the subplans that were pruned as
+described above, the result of pruning is passed to the executor as a
+PartitionPruneResult node via the QueryDesc.  It consists of the set of
+indexes of surviving subplans in their respective parent plan node's list of
+child subplans, saved as a list of bitmapsets, with one element for every
+parent plan node whose PartitionPruneInfo is present in
+PlannedStmt.partPruneInfos.  In other words, the executor should not
+re-evaluate the set of initially valid subplans by redoing the initial pruning
+if it was already done by AcquireExecutorLocks(), because the re-evaluation may
+very well end up resulting in a different set of subplans, containing some
+whose relations were not locked by AcquireExecutorLocks().
+
 Each Plan node may have expression trees associated with it, to represent
 its target list, qualification conditions, etc.  These trees are also
 read-only to the executor, but the executor state for expression evaluation
@@ -286,6 +314,10 @@ Query Processing Control Flow
 
 This is a sketch of control flow for full query processing:
 
+  [ ExecutorDoInitialPruning ] --- an optional step to perform initial
+		partition pruning on the plan tree, the result of which is passed
+		to the executor via QueryDesc
+
 	CreateQueryDesc
 
 	ExecutorStart
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 72fc273524..45824624f8 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -49,6 +49,7 @@
 #include "commands/matview.h"
 #include "commands/trigger.h"
 #include "executor/execdebug.h"
+#include "executor/execPartition.h"
 #include "executor/nodeSubplan.h"
 #include "foreign/fdwapi.h"
 #include "jit/jit.h"
@@ -104,6 +105,56 @@ static void EvalPlanQualStart(EPQState *epqstate, Plan *planTree);
 
 /* end of local decls */
 
+/* ----------------------------------------------------------------
+ *		ExecutorDoInitialPruning
+ *
+ *		For each plan tree node that has been assigned a PartitionPruneInfo,
+ *		this performs initial partition pruning using the information contained
+ *		therein to determine the set of child subplans that satisfy the initial
+ *		pruning steps, to be returned as a bitmapset of their indexes in the
+ *		node's list of child subplans (for example, an Append's appendplans).
+ *
+ * Return value is a PartitionPruneResult node that contains a list of those
+ * bitmapsets, with one element for every PartitionPruneInfo, and a bitmapset
+ * of the RT indexes of all the leaf partitions scanned by those chosen
+ * subplans.  Note that the latter is shared across all PartitionPruneInfos.
+ *
+ * The executor must see exactly the same set of subplans as valid for
+ * execution when doing ExecInitNode() on the plan nodes whose
+ * PartitionPruneInfos are processed here.  So, it must get the set from the
+ * aforementioned PartitionPruneResult, instead of computing it all over
+ * again by redoing the initial pruning.  It's the caller's job to pass the
+ * PartitionPruneResult to the executor.
+ *
+ * Note: Partitioned tables mentioned in PartitionedRelPruneInfo nodes that
+ * drive the pruning will be locked before doing the pruning.
+ * ----------------------------------------------------------------
+ */
+PartitionPruneResult *
+ExecutorDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params)
+{
+	PartitionPruneResult *result;
+	ListCell *lc;
+
+	/* Only get here if there is any pruning to do. */
+	Assert(plannedstmt->containsInitialPruning);
+
+	result = makeNode(PartitionPruneResult);
+	foreach(lc, plannedstmt->partPruneInfos)
+	{
+		PartitionPruneInfo *pruneinfo = lfirst(lc);
+		Bitmapset *valid_subplan_offs;
+
+		valid_subplan_offs =
+			ExecPartitionDoInitialPruning(plannedstmt, params, pruneinfo,
+										  &result->scan_leafpart_rtis);
+		result->valid_subplan_offs_list =
+			lappend(result->valid_subplan_offs_list,
+					valid_subplan_offs);
+	}
+
+	return result;
+}
 
 /* ----------------------------------------------------------------
  *		ExecutorStart
@@ -806,6 +857,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 {
 	CmdType		operation = queryDesc->operation;
 	PlannedStmt *plannedstmt = queryDesc->plannedstmt;
+	PartitionPruneResult *part_prune_result = queryDesc->part_prune_result;
 	Plan	   *plan = plannedstmt->planTree;
 	List	   *rangeTable = plannedstmt->rtable;
 	EState	   *estate = queryDesc->estate;
@@ -826,6 +878,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 
 	estate->es_plannedstmt = plannedstmt;
 	estate->es_part_prune_infos = plannedstmt->partPruneInfos;
+	estate->es_part_prune_result = part_prune_result;
 
 	/*
 	 * Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index f73b8c2607..7e6dab5623 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -66,6 +66,7 @@
 #define PARALLEL_KEY_QUERY_TEXT		UINT64CONST(0xE000000000000008)
 #define PARALLEL_KEY_JIT_INSTRUMENTATION UINT64CONST(0xE000000000000009)
 #define PARALLEL_KEY_WAL_USAGE			UINT64CONST(0xE00000000000000A)
+#define PARALLEL_KEY_PARTITIONPRUNERESULT	UINT64CONST(0xE00000000000000B)
 
 #define PARALLEL_TUPLE_QUEUE_SIZE		65536
 
@@ -182,6 +183,7 @@ ExecSerializePlan(Plan *plan, EState *estate)
 	pstmt->transientPlan = false;
 	pstmt->dependsOnRole = false;
 	pstmt->parallelModeNeeded = false;
+	pstmt->containsInitialPruning = false;
 	pstmt->planTree = plan;
 	pstmt->partPruneInfos = estate->es_part_prune_infos;
 	pstmt->rtable = estate->es_range_table;
@@ -597,12 +599,15 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	FixedParallelExecutorState *fpes;
 	char	   *pstmt_data;
 	char	   *pstmt_space;
+	char	   *part_prune_result_data;
+	char	   *part_prune_result_space;
 	char	   *paramlistinfo_space;
 	BufferUsage *bufusage_space;
 	WalUsage   *walusage_space;
 	SharedExecutorInstrumentation *instrumentation = NULL;
 	SharedJitInstrumentation *jit_instrumentation = NULL;
 	int			pstmt_len;
+	int			part_prune_result_len;
 	int			paramlistinfo_len;
 	int			instrumentation_len = 0;
 	int			jit_instrumentation_len = 0;
@@ -631,6 +636,7 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 
 	/* Fix up and serialize plan to be sent to workers. */
 	pstmt_data = ExecSerializePlan(planstate->plan, estate);
+	part_prune_result_data = nodeToString(estate->es_part_prune_result);
 
 	/* Create a parallel context. */
 	pcxt = CreateParallelContext("postgres", "ParallelQueryMain", nworkers);
@@ -657,6 +663,11 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	shm_toc_estimate_chunk(&pcxt->estimator, pstmt_len);
 	shm_toc_estimate_keys(&pcxt->estimator, 1);
 
+	/* Estimate space for serialized PartitionPruneResult. */
+	part_prune_result_len = strlen(part_prune_result_data) + 1;
+	shm_toc_estimate_chunk(&pcxt->estimator, part_prune_result_len);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
 	/* Estimate space for serialized ParamListInfo. */
 	paramlistinfo_len = EstimateParamListSpace(estate->es_param_list_info);
 	shm_toc_estimate_chunk(&pcxt->estimator, paramlistinfo_len);
@@ -751,6 +762,12 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	memcpy(pstmt_space, pstmt_data, pstmt_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PLANNEDSTMT, pstmt_space);
 
+	/* Store serialized PartitionPruneResult */
+	part_prune_result_space = shm_toc_allocate(pcxt->toc, part_prune_result_len);
+	memcpy(part_prune_result_space, part_prune_result_data, part_prune_result_len);
+	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARTITIONPRUNERESULT,
+				   part_prune_result_space);
+
 	/* Store serialized ParamListInfo. */
 	paramlistinfo_space = shm_toc_allocate(pcxt->toc, paramlistinfo_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARAMLISTINFO, paramlistinfo_space);
@@ -1232,8 +1249,10 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 						 int instrument_options)
 {
 	char	   *pstmtspace;
+	char	   *part_prune_result_space;
 	char	   *paramspace;
 	PlannedStmt *pstmt;
+	PartitionPruneResult *part_prune_result;
 	ParamListInfo paramLI;
 	char	   *queryString;
 
@@ -1244,12 +1263,18 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 	pstmtspace = shm_toc_lookup(toc, PARALLEL_KEY_PLANNEDSTMT, false);
 	pstmt = (PlannedStmt *) stringToNode(pstmtspace);
 
+	/* Reconstruct leader-supplied PartitionPruneResult. */
+	part_prune_result_space =
+		shm_toc_lookup(toc, PARALLEL_KEY_PARTITIONPRUNERESULT, false);
+	part_prune_result = (PartitionPruneResult *)
+		stringToNode(part_prune_result_space);
+
 	/* Reconstruct ParamListInfo. */
 	paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
 	paramLI = RestoreParamList(&paramspace);
 
 	/* Create a QueryDesc for the query. */
-	return CreateQueryDesc(pstmt,
+	return CreateQueryDesc(pstmt, part_prune_result,
 						   queryString,
 						   GetActiveSnapshot(), InvalidSnapshot,
 						   receiver, paramLI, NULL, instrument_options);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index b55cdd2580..24e6f6e988 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -25,6 +25,7 @@
 #include "mb/pg_wchar.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
+#include "parser/parsetree.h"
 #include "partitioning/partbounds.h"
 #include "partitioning/partdesc.h"
 #include "partitioning/partprune.h"
@@ -185,7 +186,11 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
 static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
 static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
 static PartitionPruneState *CreatePartitionPruneState(PlanState *planstate,
-													  PartitionPruneInfo *pruneinfo);
+													  PartitionPruneInfo *pruneinfo,
+													  bool consider_initial_steps,
+													  bool consider_exec_steps,
+													  List *rtable, ExprContext *econtext,
+													  PartitionDirectory partdir);
 static void InitPartitionPruneContext(PartitionPruneContext *context,
 									  List *pruning_steps,
 									  PartitionDesc partdesc,
@@ -198,7 +203,8 @@ static void PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
 static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
 										   PartitionedRelPruningData *pprune,
 										   bool initial_prune,
-										   Bitmapset **validsubplans);
+										   Bitmapset **validsubplans,
+										   Bitmapset **scan_leafpart_rtis);
 
 
 /*
@@ -1593,8 +1599,10 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  * considered to be a stable expression, it can change value from one plan
  * node scan to the next during query execution.  Stable comparison
  * expressions that don't involve such Params allow partition pruning to be
- * done once during executor startup.  Expressions that do involve such Params
- * require us to prune separately for each scan of the parent plan node.
+ * done once during executor startup or during ExecutorDoInitialPruning(),
+ * which runs as part of performing AcquireExecutorLocks() on a given plan tree.
+ * Expressions that do involve such Params require us to prune separately for
+ * each scan of the parent plan node.
  *
  * Note that pruning away unneeded subplans during executor startup has the
  * added benefit of not having to initialize the unneeded subplans at all.
@@ -1611,6 +1619,13 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *		account for initial pruning possibly having eliminated some of the
  *		subplans.
  *
+ * ExecPartitionDoInitialPruning:
+ *		Do initial pruning with the information contained in a given
+ *		PartitionPruneInfo to determine the minimal set of child subplans
+ *		of the parent plan node (to which the PartitionPruneInfo belongs)
+ *		that need to be executed, and also the set of RT indexes of the leaf
+ *		partitions that will be scanned by those subplans.
+ *
  * ExecFindMatchingSubPlans:
  *		Returns indexes of matching subplans after evaluating the expressions
  *		that are safe to evaluate at a given point.  This function is first
@@ -1628,8 +1643,9 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *
  * On return, *initially_valid_subplans is assigned the set of indexes of
  * child subplans that must be initialized along with the parent plan node.
- * Initial pruning is performed here if needed and in that case only the
- * surviving subplans' indexes are added.
+ * Initial pruning is performed here if needed (unless it has already been done
+ * by ExecutorDoInitialPruning()), and in that case only the surviving
+ * subplans' indexes are added.
  *
  * If subplans are indeed pruned, subplan_map arrays contained in the returned
  * PartitionPruneState are re-sequenced to not count those, though only if the
@@ -1645,24 +1661,59 @@ ExecInitPartitionPruning(PlanState *planstate,
 	EState	   *estate = planstate->state;
 	PartitionPruneInfo *pruneinfo = list_nth(estate->es_part_prune_infos,
 											 part_prune_index);
+	PartitionPruneResult *pruneresult = estate->es_part_prune_result;
+	bool	do_pruning = (pruneinfo->needs_init_pruning ||
+						  pruneinfo->needs_exec_pruning);
 
-	/* We may need an expression context to evaluate partition exprs */
-	ExecAssignExprContext(estate, planstate);
+	/*
+	 * No need to do initial pruning if it was done already by
+	 * ExecutorDoInitialPruning(), which it would be if es_part_prune_result
+	 * has been set.
+	 */
+	if (pruneresult)
+		do_pruning = pruneinfo->needs_exec_pruning;
 
-	/* Create the working data structure for pruning */
-	prunestate = CreatePartitionPruneState(planstate, pruneinfo);
+	prunestate = NULL;
+	if (do_pruning)
+	{
+		/* We may need an expression context to evaluate partition exprs */
+		ExecAssignExprContext(estate, planstate);
+
+		/* For data reading, executor always omits detached partitions */
+		if (estate->es_partition_directory == NULL)
+			estate->es_partition_directory =
+				CreatePartitionDirectory(estate->es_query_cxt, false);
+
+		/*
+		 * Create the working data structure for pruning.  No need to consider
+		 * initial pruning steps if we have a PartitionPruneResult.
+		 */
+		prunestate = CreatePartitionPruneState(planstate, pruneinfo,
+											   pruneresult == NULL, true,
+											   NIL, planstate->ps_ExprContext,
+											   estate->es_partition_directory);
+	}
 
 	/*
 	 * Perform an initial partition prune pass, if required.
 	 */
-	if (prunestate->do_initial_prune)
-		*initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true);
+	if (pruneresult)
+	{
+		*initially_valid_subplans =
+			list_nth(pruneresult->valid_subplan_offs_list, part_prune_index);
+	}
+	else if (prunestate && prunestate->do_initial_prune)
+	{
+		*initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true,
+															 NULL);
+	}
 	else
 	{
 		/* No pruning, so we'll need to initialize all subplans */
 		Assert(n_total_subplans > 0);
 		*initially_valid_subplans = bms_add_range(NULL, 0,
 												  n_total_subplans - 1);
+		return prunestate;
 	}
 
 	/*
@@ -1670,7 +1721,8 @@ ExecInitPartitionPruning(PlanState *planstate,
 	 * that were removed above due to initial pruning.  No need to do this if
 	 * no steps were removed.
 	 */
-	if (bms_num_members(*initially_valid_subplans) < n_total_subplans)
+	if (prunestate &&
+		bms_num_members(*initially_valid_subplans) < n_total_subplans)
 	{
 		/*
 		 * We can safely skip this when !do_exec_prune, even though that
@@ -1686,11 +1738,73 @@ ExecInitPartitionPruning(PlanState *planstate,
 	return prunestate;
 }
 
+/*
+ * ExecPartitionDoInitialPruning
+ *		Perform initial pruning using given PartitionPruneInfo to determine
+ *		the minimal set of child subplans that will be executed and also the
+ *		set of RT indexes of the leaf partitions scanned by those subplans.
+ */
+Bitmapset *
+ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
+							  PartitionPruneInfo *pruneinfo,
+							  Bitmapset **scan_leafpart_rtis)
+{
+	List		 *rtable = plannedstmt->rtable;
+	ExprContext	 *econtext;
+	PartitionDirectory pdir;
+	MemoryContext oldcontext,
+				  tmpcontext;
+	PartitionPruneState *prunestate;
+	Bitmapset	 *valid_subplan_offs;
+
+	/*
+	 * A temporary context for memory allocations required while executing
+	 * partition pruning steps.
+	 */
+	tmpcontext = AllocSetContextCreate(CurrentMemoryContext,
+									   "initial pruning working data",
+									   ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(tmpcontext);
+
+	/*
+	 * PartitionDirectory to look up partition descriptors, which omits
+	 * detached partitions, just like in the executor proper.
+	 */
+	pdir = CreatePartitionDirectory(CurrentMemoryContext, false);
+
+	/*
+	 * We don't yet have a PlanState for the parent plan node, so we must
+	 * create a standalone ExprContext to evaluate pruning expressions,
+	 * equipped with the information about the EXTERN parameters that the
+	 * caller passed us.  Note that that's okay because the initial pruning
+	 * steps do not contain anything that requires the execution to have
+	 * started.
+	 */
+	econtext = CreateStandaloneExprContext();
+	econtext->ecxt_param_list_info = params;
+	prunestate = CreatePartitionPruneState(NULL, pruneinfo, true, false,
+										   rtable, econtext, pdir);
+	MemoryContextSwitchTo(oldcontext);
+
+	/* Do the initial pruning. */
+	valid_subplan_offs = ExecFindMatchingSubPlans(prunestate, true,
+												  scan_leafpart_rtis);
+
+	FreeExprContext(econtext, true);
+	DestroyPartitionDirectory(pdir);
+	MemoryContextDelete(tmpcontext);
+
+	return valid_subplan_offs;
+}
+
 /*
  * CreatePartitionPruneState
  *		Build the data structure required for calling ExecFindMatchingSubPlans
  *
- * 'planstate' is the parent plan node's execution state.
+ * 'planstate', if not NULL, is the parent plan node's execution state.  It
+ * can be NULL if being called before ExecutorStart(), in which case,
+ * 'rtable' (range table), 'econtext', and 'partdir' must be explicitly
+ * provided.
  *
  * 'pruneinfo' is a PartitionPruneInfo as generated by
  * make_partition_pruneinfo.  Here we build a PartitionPruneState containing a
@@ -1704,19 +1818,21 @@ ExecInitPartitionPruning(PlanState *planstate,
  * PartitionedRelPruneInfo.
  */
 static PartitionPruneState *
-CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
+CreatePartitionPruneState(PlanState *planstate,
+						  PartitionPruneInfo *pruneinfo,
+						  bool consider_initial_steps,
+						  bool consider_exec_steps,
+						  List *rtable, ExprContext *econtext,
+						  PartitionDirectory partdir)
 {
-	EState	   *estate = planstate->state;
+	EState	   *estate = planstate ? planstate->state : NULL;
 	PartitionPruneState *prunestate;
 	int			n_part_hierarchies;
 	ListCell   *lc;
 	int			i;
-	ExprContext *econtext = planstate->ps_ExprContext;
 
-	/* For data reading, executor always omits detached partitions */
-	if (estate->es_partition_directory == NULL)
-		estate->es_partition_directory =
-			CreatePartitionDirectory(estate->es_query_cxt, false);
+	Assert((estate != NULL) ||
+			(partdir != NULL && econtext != NULL && rtable != NIL));
 
 	n_part_hierarchies = list_length(pruneinfo->prune_infos);
 	Assert(n_part_hierarchies > 0);
@@ -1771,15 +1887,42 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 			PartitionKey partkey;
 
 			/*
-			 * We can rely on the copies of the partitioned table's partition
-			 * key and partition descriptor appearing in its relcache entry,
-			 * because that entry will be held open and locked for the
-			 * duration of this executor run.
+			 * Must open the relation by ourselves when called before the
+			 * execution has started, such as when called during
+			 * ExecutorDoInitialPruning() on a cached plan.  In that case,
+			 * sub-partitions must be locked, because AcquirePlannerLocks()
+			 * would not have seen them. (1st relation in a partrelpruneinfos
+			 * list is always the root partitioned table appearing in the
+			 * query, which AcquirePlannerLocks() would have locked; the
+			 * Assert in relation_open() guards that assumption.)
+			 */
+			if (estate == NULL)
+			{
+				RangeTblEntry *rte = rt_fetch(pinfo->rtindex, rtable);
+				int		lockmode = (j == 0) ? NoLock : rte->rellockmode;
+
+				partrel = table_open(rte->relid, lockmode);
+			}
+			else
+				partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
+
+			/*
+			 * We can rely on the copy of the partitioned table's partition
+			 * key in its relcache entry, because it can't change (or
+			 * get destroyed) as long as the relation is locked.  Partition
+			 * descriptor is taken from the PartitionDirectory associated with
+			 * the table that is held open long enough for the descriptor to
+			 * remain valid while it's used to perform the pruning steps.
 			 */
-			partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
 			partkey = RelationGetPartitionKey(partrel);
-			partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
-												partrel);
+			partdesc = PartitionDirectoryLookup(partdir, partrel);
+
+			/*
+			 * Must close partrel, keeping the lock taken, if we're not using
+			 * EState's entry.
+			 */
+			if (estate == NULL)
+				table_close(partrel, NoLock);
 
 			/*
 			 * Initialize the subplan_map and subpart_map.
@@ -1793,6 +1936,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 			Assert(partdesc->nparts >= pinfo->nparts);
 			pprune->nparts = partdesc->nparts;
 			pprune->subplan_map = palloc(sizeof(int) * partdesc->nparts);
+			pprune->rti_map = palloc(sizeof(Index) * partdesc->nparts);
 			if (partdesc->nparts == pinfo->nparts)
 			{
 				/*
@@ -1803,6 +1947,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 				pprune->subpart_map = pinfo->subpart_map;
 				memcpy(pprune->subplan_map, pinfo->subplan_map,
 					   sizeof(int) * pinfo->nparts);
+				memcpy(pprune->rti_map, pinfo->rti_map,
+					   sizeof(Index) * pinfo->nparts);
 
 				/*
 				 * Double-check that the list of unpruned relations has not
@@ -1853,6 +1999,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 							pinfo->subplan_map[pd_idx];
 						pprune->subpart_map[pp_idx] =
 							pinfo->subpart_map[pd_idx];
+						pprune->rti_map[pp_idx] =
+							pinfo->rti_map[pd_idx];
 						pd_idx++;
 					}
 					else
@@ -1860,6 +2008,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 						/* this partdesc entry is not in the plan */
 						pprune->subplan_map[pp_idx] = -1;
 						pprune->subpart_map[pp_idx] = -1;
+						pprune->rti_map[pp_idx] = 0;
 					}
 				}
 
@@ -1881,7 +2030,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 			 * Initialize pruning contexts as needed.
 			 */
 			pprune->initial_pruning_steps = pinfo->initial_pruning_steps;
-			if (pinfo->initial_pruning_steps)
+			if (consider_initial_steps && pinfo->initial_pruning_steps)
 			{
 				InitPartitionPruneContext(&pprune->initial_context,
 										  pinfo->initial_pruning_steps,
@@ -1891,7 +2040,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 				prunestate->do_initial_prune = true;
 			}
 			pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
-			if (pinfo->exec_pruning_steps)
+			if (consider_exec_steps && pinfo->exec_pruning_steps)
 			{
 				InitPartitionPruneContext(&pprune->exec_context,
 										  pinfo->exec_pruning_steps,
@@ -2119,10 +2268,14 @@ PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
  * Pass initial_prune if PARAM_EXEC Params cannot yet be evaluated.  This
  * differentiates the initial executor-time pruning step from later
  * runtime pruning.
+ *
+ * RT indexes of leaf partitions scanned by the chosen subplans are added to
+ * *scan_leafpart_rtis if the pointer is non-NULL.
  */
 Bitmapset *
 ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
-						 bool initial_prune)
+						 bool initial_prune,
+						 Bitmapset **scan_leafpart_rtis)
 {
 	Bitmapset  *result = NULL;
 	MemoryContext oldcontext;
@@ -2157,7 +2310,7 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
 		 */
 		pprune = &prunedata->partrelprunedata[0];
 		find_matching_subplans_recurse(prunedata, pprune, initial_prune,
-									   &result);
+									   &result, scan_leafpart_rtis);
 
 		/* Expression eval may have used space in ExprContext too */
 		if (pprune->exec_pruning_steps)
@@ -2171,6 +2324,8 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
 
 	/* Copy result out of the temp context before we reset it */
 	result = bms_copy(result);
+	if (scan_leafpart_rtis)
+		*scan_leafpart_rtis = bms_copy(*scan_leafpart_rtis);
 
 	MemoryContextReset(prunestate->prune_context);
 
@@ -2181,13 +2336,15 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
  * find_matching_subplans_recurse
  *		Recursive worker function for ExecFindMatchingSubPlans
  *
- * Adds valid (non-prunable) subplan IDs to *validsubplans
+ * Adds valid (non-prunable) subplan IDs to *validsubplans and RT indexes of
+ * the corresponding leaf partitions to *scan_leafpart_rtis (if asked for).
  */
 static void
 find_matching_subplans_recurse(PartitionPruningData *prunedata,
 							   PartitionedRelPruningData *pprune,
 							   bool initial_prune,
-							   Bitmapset **validsubplans)
+							   Bitmapset **validsubplans,
+							   Bitmapset **scan_leafpart_rtis)
 {
 	Bitmapset  *partset;
 	int			i;
@@ -2214,8 +2371,14 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
 	while ((i = bms_next_member(partset, i)) >= 0)
 	{
 		if (pprune->subplan_map[i] >= 0)
+		{
 			*validsubplans = bms_add_member(*validsubplans,
 											pprune->subplan_map[i]);
+			Assert(pprune->rti_map[i] > 0);
+			if (scan_leafpart_rtis)
+				*scan_leafpart_rtis = bms_add_member(*scan_leafpart_rtis,
+													 pprune->rti_map[i]);
+		}
 		else
 		{
 			int			partidx = pprune->subpart_map[i];
@@ -2223,7 +2386,8 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
 			if (partidx >= 0)
 				find_matching_subplans_recurse(prunedata,
 											   &prunedata->partrelprunedata[partidx],
-											   initial_prune, validsubplans);
+											   initial_prune, validsubplans,
+											   scan_leafpart_rtis);
 			else
 			{
 				/*
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index f9460ae506..a2182a6b1f 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -844,7 +844,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
 	else
 		dest = None_Receiver;
 
-	es->qd = CreateQueryDesc(es->stmt,
+	es->qd = CreateQueryDesc(es->stmt, NULL,
 							 fcache->src,
 							 GetActiveSnapshot(),
 							 InvalidSnapshot,
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index c6f86a6510..96880e122a 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -155,7 +155,8 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 		 * subplan, we can fill as_valid_subplans immediately, preventing
 		 * later calls to ExecFindMatchingSubPlans.
 		 */
-		if (!prunestate->do_exec_prune && nplans > 0)
+		if (appendstate->as_prune_state == NULL ||
+			(!appendstate->as_prune_state->do_exec_prune && nplans > 0))
 			appendstate->as_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
 	}
 	else
@@ -577,7 +578,7 @@ choose_next_subplan_locally(AppendState *node)
 		}
 		else if (node->as_valid_subplans == NULL)
 			node->as_valid_subplans =
-				ExecFindMatchingSubPlans(node->as_prune_state, false);
+				ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 
 		whichplan = -1;
 	}
@@ -642,7 +643,7 @@ choose_next_subplan_for_leader(AppendState *node)
 		if (node->as_valid_subplans == NULL)
 		{
 			node->as_valid_subplans =
-				ExecFindMatchingSubPlans(node->as_prune_state, false);
+				ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 
 			/*
 			 * Mark each invalid plan as finished to allow the loop below to
@@ -717,7 +718,7 @@ choose_next_subplan_for_worker(AppendState *node)
 	else if (node->as_valid_subplans == NULL)
 	{
 		node->as_valid_subplans =
-			ExecFindMatchingSubPlans(node->as_prune_state, false);
+			ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 		mark_invalid_subplans_as_finished(node);
 	}
 
@@ -868,7 +869,7 @@ ExecAppendAsyncBegin(AppendState *node)
 	if (node->as_valid_subplans == NULL)
 	{
 		node->as_valid_subplans =
-			ExecFindMatchingSubPlans(node->as_prune_state, false);
+			ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 
 		classify_matching_subplans(node);
 	}
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index 8d35860c30..2312e5a633 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -103,7 +103,8 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
 		 * subplan, we can fill ms_valid_subplans immediately, preventing
 		 * later calls to ExecFindMatchingSubPlans.
 		 */
-		if (!prunestate->do_exec_prune && nplans > 0)
+		if (mergestate->ms_prune_state == NULL ||
+			(!mergestate->ms_prune_state->do_exec_prune && nplans > 0))
 			mergestate->ms_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
 	}
 	else
@@ -218,7 +219,7 @@ ExecMergeAppend(PlanState *pstate)
 		 */
 		if (node->ms_valid_subplans == NULL)
 			node->ms_valid_subplans =
-				ExecFindMatchingSubPlans(node->ms_prune_state, false);
+				ExecFindMatchingSubPlans(node->ms_prune_state, false, NULL);
 
 		/*
 		 * First time through: pull the first tuple from each valid subplan,
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 29bc26669b..303a572c02 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1578,6 +1578,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 	CachedPlanSource *plansource;
 	CachedPlan *cplan;
 	List	   *stmt_list;
+	List	   *part_prune_result_list;
 	char	   *query_string;
 	Snapshot	snapshot;
 	MemoryContext oldcontext;
@@ -1657,7 +1658,10 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 	 */
 
 	/* Replan if needed, and increment plan refcount for portal */
-	cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv);
+	cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv,
+						  &part_prune_result_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_result_list));
 	stmt_list = cplan->stmt_list;
 
 	if (!plan->saved)
@@ -1685,6 +1689,9 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 					  stmt_list,
 					  cplan);
 
+	/* Copy PartitionPruneResults into the portal's context. */
+	PortalStorePartitionPruneResults(portal, part_prune_result_list);
+
 	/*
 	 * Set up options for portal.  Default SCROLL type is chosen the same way
 	 * as PerformCursorOpen does it.
@@ -2092,7 +2099,8 @@ SPI_plan_get_cached_plan(SPIPlanPtr plan)
 	/* Get the generic plan for the query */
 	cplan = GetCachedPlan(plansource, NULL,
 						  plan->saved ? CurrentResourceOwner : NULL,
-						  _SPI_current->queryEnv);
+						  _SPI_current->queryEnv,
+						  NULL /* Not interested in PartitionPruneResults */);
 	Assert(cplan == plansource->gplan);
 
 	/* Pop the error context stack */
@@ -2473,7 +2481,9 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 	{
 		CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
 		List	   *stmt_list;
-		ListCell   *lc2;
+		List	   *part_prune_result_list;
+		ListCell   *lc2,
+				   *lc3;
 
 		spicallbackarg.query = plansource->query_string;
 
@@ -2549,8 +2559,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 		 * plan, the refcount must be backed by the plan_owner.
 		 */
 		cplan = GetCachedPlan(plansource, options->params,
-							  plan_owner, _SPI_current->queryEnv);
-
+							  plan_owner, _SPI_current->queryEnv,
+							  &part_prune_result_list);
+		Assert(list_length(cplan->stmt_list) ==
+			   list_length(part_prune_result_list));
 		stmt_list = cplan->stmt_list;
 
 		/*
@@ -2589,9 +2601,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 			}
 		}
 
-		foreach(lc2, stmt_list)
+		forboth(lc2, stmt_list, lc3, part_prune_result_list)
 		{
 			PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+			PartitionPruneResult *part_prune_result = lfirst_node(PartitionPruneResult, lc3);
 			bool		canSetTag = stmt->canSetTag;
 			DestReceiver *dest;
 
@@ -2663,7 +2676,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 				else
 					snap = InvalidSnapshot;
 
-				qdesc = CreateQueryDesc(stmt,
+				qdesc = CreateQueryDesc(stmt, part_prune_result,
 										plansource->query_string,
 										snap, crosscheck_snapshot,
 										dest,
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index b02b4a641c..332d58381b 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -97,7 +97,9 @@ _copyPlannedStmt(const PlannedStmt *from)
 	COPY_SCALAR_FIELD(jitFlags);
 	COPY_NODE_FIELD(planTree);
 	COPY_NODE_FIELD(partPruneInfos);
+	COPY_SCALAR_FIELD(containsInitialPruning);
 	COPY_NODE_FIELD(rtable);
+	COPY_BITMAPSET_FIELD(minLockRelids);
 	COPY_NODE_FIELD(resultRelations);
 	COPY_NODE_FIELD(appendRelations);
 	COPY_NODE_FIELD(subplans);
@@ -1284,6 +1286,8 @@ _copyPartitionPruneInfo(const PartitionPruneInfo *from)
 	PartitionPruneInfo *newnode = makeNode(PartitionPruneInfo);
 
 	COPY_NODE_FIELD(prune_infos);
+	COPY_SCALAR_FIELD(needs_init_pruning);
+	COPY_SCALAR_FIELD(needs_exec_pruning);
 	COPY_BITMAPSET_FIELD(other_subplans);
 
 	return newnode;
@@ -1300,6 +1304,7 @@ _copyPartitionedRelPruneInfo(const PartitionedRelPruneInfo *from)
 	COPY_POINTER_FIELD(subplan_map, from->nparts * sizeof(int));
 	COPY_POINTER_FIELD(subpart_map, from->nparts * sizeof(int));
 	COPY_POINTER_FIELD(relid_map, from->nparts * sizeof(Oid));
+	COPY_POINTER_FIELD(rti_map, from->nparts * sizeof(Index));
 	COPY_NODE_FIELD(initial_pruning_steps);
 	COPY_NODE_FIELD(exec_pruning_steps);
 	COPY_BITMAPSET_FIELD(execparamids);
@@ -5476,6 +5481,21 @@ _copyExtensibleNode(const ExtensibleNode *from)
 	return newnode;
 }
 
+/* ****************************************************************
+ *					execnodes.h copy functions
+ * ****************************************************************
+ */
+static PartitionPruneResult *
+_copyPartitionPruneResult(const PartitionPruneResult *from)
+{
+	PartitionPruneResult *newnode = makeNode(PartitionPruneResult);
+
+	COPY_NODE_FIELD(valid_subplan_offs_list);
+	COPY_BITMAPSET_FIELD(scan_leafpart_rtis);
+
+	return newnode;
+}
+
 /* ****************************************************************
  *					value.h copy functions
  * ****************************************************************
@@ -6572,6 +6592,13 @@ copyObjectImpl(const void *from)
 			retval = _copyPublicationTable(from);
 			break;
 
+			/*
+			 * EXECUTION NODES
+			 */
+		case T_PartitionPruneResult:
+			retval = _copyPartitionPruneResult(from);
+			break;
+
 			/*
 			 * MISCELLANEOUS NODES
 			 */
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 7618444b4d..7346820eee 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -326,7 +326,9 @@ _outPlannedStmt(StringInfo str, const PlannedStmt *node)
 	WRITE_INT_FIELD(jitFlags);
 	WRITE_NODE_FIELD(planTree);
 	WRITE_NODE_FIELD(partPruneInfos);
+	WRITE_BOOL_FIELD(containsInitialPruning);
 	WRITE_NODE_FIELD(rtable);
+	WRITE_BITMAPSET_FIELD(minLockRelids);
 	WRITE_NODE_FIELD(resultRelations);
 	WRITE_NODE_FIELD(appendRelations);
 	WRITE_NODE_FIELD(subplans);
@@ -1021,6 +1023,8 @@ _outPartitionPruneInfo(StringInfo str, const PartitionPruneInfo *node)
 	WRITE_NODE_TYPE("PARTITIONPRUNEINFO");
 
 	WRITE_NODE_FIELD(prune_infos);
+	WRITE_BOOL_FIELD(needs_init_pruning);
+	WRITE_BOOL_FIELD(needs_exec_pruning);
 	WRITE_BITMAPSET_FIELD(other_subplans);
 }
 
@@ -1035,6 +1039,7 @@ _outPartitionedRelPruneInfo(StringInfo str, const PartitionedRelPruneInfo *node)
 	WRITE_INT_ARRAY(subplan_map, node->nparts);
 	WRITE_INT_ARRAY(subpart_map, node->nparts);
 	WRITE_OID_ARRAY(relid_map, node->nparts);
+	WRITE_INDEX_ARRAY(rti_map, node->nparts);
 	WRITE_NODE_FIELD(initial_pruning_steps);
 	WRITE_NODE_FIELD(exec_pruning_steps);
 	WRITE_BITMAPSET_FIELD(execparamids);
@@ -2440,6 +2445,8 @@ _outPlannerGlobal(StringInfo str, const PlannerGlobal *node)
 	WRITE_NODE_FIELD(resultRelations);
 	WRITE_NODE_FIELD(appendRelations);
 	WRITE_NODE_FIELD(partPruneInfos);
+	WRITE_BOOL_FIELD(containsInitialPruning);
+	WRITE_BITMAPSET_FIELD(minLockRelids);
 	WRITE_NODE_FIELD(relationOids);
 	WRITE_NODE_FIELD(invalItems);
 	WRITE_NODE_FIELD(paramExecTypes);
@@ -2861,6 +2868,21 @@ _outExtensibleNode(StringInfo str, const ExtensibleNode *node)
 	methods->nodeOut(str, node);
 }
 
+/*****************************************************************************
+ *
+ *	Stuff from execnodes.h
+ *
+ *****************************************************************************/
+
+static void
+_outPartitionPruneResult(StringInfo str, const PartitionPruneResult *node)
+{
+	WRITE_NODE_TYPE("PARTITIONPRUNERESULT");
+
+	WRITE_NODE_FIELD(valid_subplan_offs_list);
+	WRITE_BITMAPSET_FIELD(scan_leafpart_rtis);
+}
+
 /*****************************************************************************
  *
  *	Stuff from parsenodes.h.
@@ -4770,6 +4792,13 @@ outNode(StringInfo str, const void *obj)
 				_outJsonTableSibling(str, obj);
 				break;
 
+				/*
+				 * EXECUTION NODES
+				 */
+			case T_PartitionPruneResult:
+				_outPartitionPruneResult(str, obj);
+				break;
+
 			default:
 
 				/*
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index bf602ff93e..c1d131aa99 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -164,6 +164,11 @@
 	token = pg_strtok(&length);		/* skip :fldname */ \
 	local_node->fldname = readIntCols(len)
 
+/* Read an Index array */
+#define READ_INDEX_ARRAY(fldname, len) \
+	token = pg_strtok(&length);		/* skip :fldname */ \
+	local_node->fldname = readIndexCols(len)
+
 /* Read a bool array */
 #define READ_BOOL_ARRAY(fldname, len) \
 	token = pg_strtok(&length);		/* skip :fldname */ \
@@ -1818,7 +1823,9 @@ _readPlannedStmt(void)
 	READ_INT_FIELD(jitFlags);
 	READ_NODE_FIELD(planTree);
 	READ_NODE_FIELD(partPruneInfos);
+	READ_BOOL_FIELD(containsInitialPruning);
 	READ_NODE_FIELD(rtable);
+	READ_BITMAPSET_FIELD(minLockRelids);
 	READ_NODE_FIELD(resultRelations);
 	READ_NODE_FIELD(appendRelations);
 	READ_NODE_FIELD(subplans);
@@ -2770,6 +2777,8 @@ _readPartitionPruneInfo(void)
 	READ_LOCALS(PartitionPruneInfo);
 
 	READ_NODE_FIELD(prune_infos);
+	READ_BOOL_FIELD(needs_init_pruning);
+	READ_BOOL_FIELD(needs_exec_pruning);
 	READ_BITMAPSET_FIELD(other_subplans);
 
 	READ_DONE();
@@ -2786,6 +2795,7 @@ _readPartitionedRelPruneInfo(void)
 	READ_INT_ARRAY(subplan_map, local_node->nparts);
 	READ_INT_ARRAY(subpart_map, local_node->nparts);
 	READ_OID_ARRAY(relid_map, local_node->nparts);
+	READ_INDEX_ARRAY(rti_map, local_node->nparts);
 	READ_NODE_FIELD(initial_pruning_steps);
 	READ_NODE_FIELD(exec_pruning_steps);
 	READ_BITMAPSET_FIELD(execparamids);
@@ -2939,6 +2949,21 @@ _readPartitionRangeDatum(void)
 	READ_DONE();
 }
 
+
+/*
+ * _readPartitionPruneResult
+ */
+static PartitionPruneResult *
+_readPartitionPruneResult(void)
+{
+	READ_LOCALS(PartitionPruneResult);
+
+	READ_NODE_FIELD(valid_subplan_offs_list);
+	READ_BITMAPSET_FIELD(scan_leafpart_rtis);
+
+	READ_DONE();
+}
+
 /*
  * parseNodeString
  *
@@ -3236,6 +3261,8 @@ parseNodeString(void)
 		return_value = _readJsonTableParent();
 	else if (MATCH("JSONTABLESIBLING", 16))
 		return_value = _readJsonTableSibling();
+	else if (MATCH("PARTITIONPRUNERESULT", 20))
+		return_value = _readPartitionPruneResult();
 	else
 	{
 		elog(ERROR, "badly formatted node string \"%.32s\"...", token);
@@ -3379,6 +3406,30 @@ readIntCols(int numCols)
 	return int_vals;
 }
 
+/*
+ * readIndexCols
+ */
+Index *
+readIndexCols(int numCols)
+{
+	int			tokenLength,
+				i;
+	const char *token;
+	Index	   *index_vals;
+
+	if (numCols <= 0)
+		return NULL;
+
+	index_vals = (Index *) palloc(numCols * sizeof(Index));
+	for (i = 0; i < numCols; i++)
+	{
+		token = pg_strtok(&tokenLength);
+		index_vals[i] = atoui(token);
+	}
+
+	return index_vals;
+}
+
 /*
  * readBoolCols
  */
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index b11249ed8f..7141035cc4 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -519,7 +519,9 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 	result->parallelModeNeeded = glob->parallelModeNeeded;
 	result->planTree = top_plan;
 	result->partPruneInfos = glob->partPruneInfos;
+	result->containsInitialPruning = glob->containsInitialPruning;
 	result->rtable = glob->finalrtable;
+	result->minLockRelids = glob->minLockRelids;
 	result->resultRelations = glob->resultRelations;
 	result->appendRelations = glob->appendRelations;
 	result->subplans = glob->subplans;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index b8d5610593..da749e331e 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -270,6 +270,16 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 	 */
 	add_rtes_to_flat_rtable(root, false);
 
+	/*
+	 * Add the query's adjusted range of RT indexes to glob->minLockRelids.
+	 * The adjusted RT indexes of prunable relations will be deleted from the
+	 * set below where PartitionPruneInfos are processed.
+	 */
+	glob->minLockRelids =
+		bms_add_range(glob->minLockRelids,
+					  rtoffset + 1,
+					  rtoffset + list_length(root->parse->rtable));
+
 	/*
 	 * Adjust RT indexes of PlanRowMarks and add to final rowmarks list
 	 */
@@ -352,6 +362,7 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 	foreach (lc, root->partPruneInfos)
 	{
 		PartitionPruneInfo *pruneinfo = lfirst(lc);
+		Bitmapset *leafpart_rtis = NULL;
 		ListCell  *l;
 
 		foreach(l, pruneinfo->prune_infos)
@@ -362,15 +373,50 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 			foreach(l2, prune_infos)
 			{
 				PartitionedRelPruneInfo *pinfo = lfirst(l2);
+				int		i;
 
 				/* RT index of the table to which the pinfo belongs. */
 				pinfo->rtindex += rtoffset;
+
+				/* Likewise for leaf partitions that might be scanned. */
+				for (i = 0; i < pinfo->nparts; i++)
+				{
+					if (pinfo->rti_map[i] > 0 && pinfo->subplan_map[i] >= 0)
+					{
+						pinfo->rti_map[i] += rtoffset;
+						leafpart_rtis = bms_add_member(leafpart_rtis,
+													   pinfo->rti_map[i]);
+					}
+				}
 			}
 		}
 
+		if (pruneinfo->needs_init_pruning)
+		{
+			glob->containsInitialPruning = true;
+
+			/*
+			 * Delete the leaf partition RTIs from the global set of relations
+			 * to be locked before executing the plan.  AcquireExecutorLocks()
+			 * will find the ones to add to the set after performing initial
+			 * pruning.
+			 */
+			glob->minLockRelids = bms_del_members(glob->minLockRelids,
+												  leafpart_rtis);
+		}
+
 		glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
 	}
 
+	/*
+	 * It seems worth doing a bms_copy() on glob->minLockRelids if we deleted
+	 * bits from it above to get rid of any empty tail bits.  It seems better
+	 * for the loop over this set in AcquireExecutorLocks() to not have to go
+	 * through those useless bit words.
+	 */
+	if (glob->containsInitialPruning)
+		glob->minLockRelids = bms_copy(glob->minLockRelids);
+
 	return result;
 }
 
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index d77f7d3aef..952c5b8327 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -144,7 +144,9 @@ static List *make_partitionedrel_pruneinfo(PlannerInfo *root,
 										   List *prunequal,
 										   Bitmapset *partrelids,
 										   int *relid_subplan_map,
-										   Bitmapset **matchedsubplans);
+										   Bitmapset **matchedsubplans,
+										   bool *needs_init_pruning,
+										   bool *needs_exec_pruning);
 static void gen_partprune_steps(RelOptInfo *rel, List *clauses,
 								PartClauseTarget target,
 								GeneratePruningStepsContext *context);
@@ -234,6 +236,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	int		   *relid_subplan_map;
 	ListCell   *lc;
 	int			i;
+	bool		needs_init_pruning = false;
+	bool		needs_exec_pruning = false;
 
 	/*
 	 * Scan the subpaths to see which ones are scans of partition child
@@ -313,12 +317,16 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		Bitmapset  *partrelids = (Bitmapset *) lfirst(lc);
 		List	   *pinfolist;
 		Bitmapset  *matchedsubplans = NULL;
+		bool		partrel_needs_init_pruning;
+		bool		partrel_needs_exec_pruning;
 
 		pinfolist = make_partitionedrel_pruneinfo(root, parentrel,
 												  prunequal,
 												  partrelids,
 												  relid_subplan_map,
-												  &matchedsubplans);
+												  &matchedsubplans,
+												  &partrel_needs_init_pruning,
+												  &partrel_needs_exec_pruning);
 
 		/* When pruning is possible, record the matched subplans */
 		if (pinfolist != NIL)
@@ -327,6 +335,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 			allmatchedsubplans = bms_join(matchedsubplans,
 										  allmatchedsubplans);
 		}
+
+		needs_init_pruning |= partrel_needs_init_pruning;
+		needs_exec_pruning |= partrel_needs_exec_pruning;
 	}
 
 	pfree(relid_subplan_map);
@@ -341,6 +352,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	/* Else build the result data structure */
 	pruneinfo = makeNode(PartitionPruneInfo);
 	pruneinfo->prune_infos = prunerelinfos;
+	pruneinfo->needs_init_pruning = needs_init_pruning;
+	pruneinfo->needs_exec_pruning = needs_exec_pruning;
 
 	/*
 	 * Some subplans may not belong to any of the identified partitioned rels.
@@ -441,13 +454,18 @@ add_part_relids(List *allpartrelids, Bitmapset *partrelids)
  * If we cannot find any useful run-time pruning steps, return NIL.
  * However, on success, each rel identified in partrelids will have
  * an element in the result list, even if some of them are useless.
+ * *needs_init_pruning and *needs_exec_pruning are set to indicate whether
+ * the returned PartitionedRelPruneInfos contain pruning steps that can be
+ * performed before and after execution begins, respectively.
  */
 static List *
 make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 							  List *prunequal,
 							  Bitmapset *partrelids,
 							  int *relid_subplan_map,
-							  Bitmapset **matchedsubplans)
+							  Bitmapset **matchedsubplans,
+							  bool *needs_init_pruning,
+							  bool *needs_exec_pruning)
 {
 	RelOptInfo *targetpart = NULL;
 	List	   *pinfolist = NIL;
@@ -458,6 +476,10 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	int			rti;
 	int			i;
 
+	/* Will find out below. */
+	*needs_init_pruning = false;
+	*needs_exec_pruning = false;
+
 	/*
 	 * Examine each partitioned rel, constructing a temporary array to map
 	 * from planner relids to index of the partitioned rel, and building a
@@ -545,6 +567,9 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		 * executor per-scan pruning steps.  This first pass creates startup
 		 * pruning steps and detects whether there's any possibly-useful quals
 		 * that would require per-scan pruning.
+		 *
+		 * The first pass also tells us whether the second pass is needed,
+		 * by noting the presence of EXEC parameters.
 		 */
 		gen_partprune_steps(subpart, partprunequal, PARTTARGET_INITIAL,
 							&context);
@@ -619,6 +644,12 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		pinfo->execparamids = execparamids;
 		/* Remaining fields will be filled in the next loop */
 
+		/* record which types of pruning steps we've seen so far */
+		if (initial_pruning_steps != NIL)
+			*needs_init_pruning = true;
+		if (exec_pruning_steps != NIL)
+			*needs_exec_pruning = true;
+
 		pinfolist = lappend(pinfolist, pinfo);
 	}
 
@@ -646,6 +677,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		int		   *subplan_map;
 		int		   *subpart_map;
 		Oid		   *relid_map;
+		Index	   *rti_map;
 
 		/*
 		 * Construct the subplan and subpart maps for this partitioning level.
@@ -658,6 +690,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		subpart_map = (int *) palloc(nparts * sizeof(int));
 		memset(subpart_map, -1, nparts * sizeof(int));
 		relid_map = (Oid *) palloc0(nparts * sizeof(Oid));
+		rti_map = (Index *) palloc0(nparts * sizeof(Index));
 		present_parts = NULL;
 
 		i = -1;
@@ -672,6 +705,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 			subplan_map[i] = subplanidx = relid_subplan_map[partrel->relid] - 1;
 			subpart_map[i] = subpartidx = relid_subpart_map[partrel->relid] - 1;
 			relid_map[i] = planner_rt_fetch(partrel->relid, root)->relid;
+			rti_map[i] = partrel->relid;
 			if (subplanidx >= 0)
 			{
 				present_parts = bms_add_member(present_parts, i);
@@ -696,6 +730,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		pinfo->subplan_map = subplan_map;
 		pinfo->subpart_map = subpart_map;
 		pinfo->relid_map = relid_map;
+		pinfo->rti_map = rti_map;
 	}
 
 	pfree(relid_subpart_map);
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 5ab91c2c58..5ae967608d 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1603,6 +1603,7 @@ exec_bind_message(StringInfo input_message)
 	int16	   *rformats = NULL;
 	CachedPlanSource *psrc;
 	CachedPlan *cplan;
+	List	   *part_prune_result_list;
 	Portal		portal;
 	char	   *query_string;
 	char	   *saved_stmt_name;
@@ -1978,7 +1979,9 @@ exec_bind_message(StringInfo input_message)
 	 * will be generated in MessageContext.  The plan refcount will be
 	 * assigned to the Portal, so it will be released at portal destruction.
 	 */
-	cplan = GetCachedPlan(psrc, params, NULL, NULL);
+	cplan = GetCachedPlan(psrc, params, NULL, NULL, &part_prune_result_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_result_list));
 
 	/*
 	 * Now we can define the portal.
@@ -1993,6 +1996,9 @@ exec_bind_message(StringInfo input_message)
 					  cplan->stmt_list,
 					  cplan);
 
+	/* Copy PartitionPruneResults into the portal's context. */
+	PortalStorePartitionPruneResults(portal, part_prune_result_list);
+
 	/* Done with the snapshot used for parameter I/O and parsing/planning */
 	if (snapshot_set)
 		PopActiveSnapshot();
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 5aa5a350f3..8cc2e2162d 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -35,7 +35,7 @@
 Portal		ActivePortal = NULL;
 
 
-static void ProcessQuery(PlannedStmt *plan,
+static void ProcessQuery(PlannedStmt *plan, PartitionPruneResult *part_prune_result,
 						 const char *sourceText,
 						 ParamListInfo params,
 						 QueryEnvironment *queryEnv,
@@ -65,6 +65,7 @@ static void DoPortalRewind(Portal portal);
  */
 QueryDesc *
 CreateQueryDesc(PlannedStmt *plannedstmt,
+				PartitionPruneResult *part_prune_result,
 				const char *sourceText,
 				Snapshot snapshot,
 				Snapshot crosscheck_snapshot,
@@ -77,6 +78,8 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
 
 	qd->operation = plannedstmt->commandType;	/* operation */
 	qd->plannedstmt = plannedstmt;	/* plan */
+	qd->part_prune_result = part_prune_result;	/* ExecutorDoInitialPruning()
+												 * output for plan */
 	qd->sourceText = sourceText;	/* query text */
 	qd->snapshot = RegisterSnapshot(snapshot);	/* snapshot */
 	/* RI check snapshot */
@@ -122,6 +125,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  *		PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
  *
  *	plan: the plan tree for the query
+ *	part_prune_result: ExecutorDoInitialPruning() output for the plan tree
  *	sourceText: the source text of the query
  *	params: any parameters needed
  *	dest: where to send results
@@ -134,6 +138,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  */
 static void
 ProcessQuery(PlannedStmt *plan,
+			 PartitionPruneResult *part_prune_result,
 			 const char *sourceText,
 			 ParamListInfo params,
 			 QueryEnvironment *queryEnv,
@@ -145,7 +150,7 @@ ProcessQuery(PlannedStmt *plan,
 	/*
 	 * Create the QueryDesc object
 	 */
-	queryDesc = CreateQueryDesc(plan, sourceText,
+	queryDesc = CreateQueryDesc(plan, part_prune_result, sourceText,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, params, queryEnv, 0);
 
@@ -491,8 +496,13 @@ PortalStart(Portal portal, ParamListInfo params,
 				/*
 				 * Create QueryDesc in portal's context; for the moment, set
 				 * the destination to DestNone.
+				 *
+				 * There is no PartitionPruneResult unless the PlannedStmt is
+				 * from a CachedPlan.
 				 */
 				queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+											portal->part_prune_results == NIL ? NULL :
+											linitial(portal->part_prune_results),
 											portal->sourceText,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
@@ -1225,6 +1235,8 @@ PortalRunMulti(Portal portal,
 
 		if (pstmt->utilityStmt == NULL)
 		{
+			PartitionPruneResult *part_prune_result = NULL;
+
 			/*
 			 * process a plannable query.
 			 */
@@ -1271,10 +1283,18 @@ PortalRunMulti(Portal portal,
 			else
 				UpdateActiveSnapshotCommandId();
 
+			/*
+			 * Determine if there's a corresponding PartitionPruneResult for
+			 * this PlannedStmt.
+			 */
+			if (portal->part_prune_results != NIL)
+				part_prune_result = list_nth(portal->part_prune_results,
+											 foreach_current_index(stmtlist_item));
+
 			if (pstmt->canSetTag)
 			{
 				/* statement can set tag string */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, part_prune_result,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
@@ -1283,7 +1303,7 @@ PortalRunMulti(Portal portal,
 			else
 			{
 				/* stmt added by rewrite cannot set tag */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, part_prune_result,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 0d6a295674..8c164741f7 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -99,14 +99,19 @@ static dlist_head cached_expression_list = DLIST_STATIC_INIT(cached_expression_l
 static void ReleaseGenericPlan(CachedPlanSource *plansource);
 static List *RevalidateCachedQuery(CachedPlanSource *plansource,
 								   QueryEnvironment *queryEnv);
-static bool CheckCachedPlan(CachedPlanSource *plansource);
+static bool CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+							List **part_prune_result_list);
 static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
-								   ParamListInfo boundParams, QueryEnvironment *queryEnv);
+								   ParamListInfo boundParams, QueryEnvironment *queryEnv,
+								   List **part_prune_result_list);
 static bool choose_custom_plan(CachedPlanSource *plansource,
 							   ParamListInfo boundParams);
 static double cached_plan_cost(CachedPlan *plan, bool include_planner);
 static Query *QueryListGetPrimaryStmt(List *stmts);
-static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static void AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+								 List **part_prune_result_list,
+								 List **lockedRelids_per_stmt);
+static void ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt);
 static void AcquirePlannerLocks(List *stmt_list, bool acquire);
 static void ScanQueryForLocks(Query *parsetree, bool acquire);
 static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -790,15 +795,20 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
  *
  * On a "true" return, we have acquired the locks needed to run the plan.
  * (We must do this for the "true" result to be race-condition-free.)
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_result_list.
  */
 static bool
-CheckCachedPlan(CachedPlanSource *plansource)
+CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+				List **part_prune_result_list)
 {
 	CachedPlan *plan = plansource->gplan;
 
 	/* Assert that caller checked the querytree */
 	Assert(plansource->is_valid);
 
+	*part_prune_result_list = NIL;
+
 	/* If there's no generic plan, just say "false" */
 	if (!plan)
 		return false;
@@ -820,13 +830,21 @@ CheckCachedPlan(CachedPlanSource *plansource)
 	 */
 	if (plan->is_valid)
 	{
+		List   *lockedRelids_per_stmt;
+
 		/*
 		 * Plan must have positive refcount because it is referenced by
 		 * plansource; so no need to fear it disappears under us here.
 		 */
 		Assert(plan->refcount > 0);
 
-		AcquireExecutorLocks(plan->stmt_list, true);
+		/*
+		 * Lock relations scanned by the plan.  This is also where "initial"
+		 * pruning is performed, if the plan needs it.
+		 */
+		AcquireExecutorLocks(plan->stmt_list, boundParams,
+							 part_prune_result_list,
+							 &lockedRelids_per_stmt);
 
 		/*
 		 * If plan was transient, check to see if TransactionXmin has
@@ -848,7 +866,14 @@ CheckCachedPlan(CachedPlanSource *plansource)
 		}
 
 		/* Oops, the race case happened.  Release useless locks. */
-		AcquireExecutorLocks(plan->stmt_list, false);
+		ReleaseExecutorLocks(plan->stmt_list, lockedRelids_per_stmt);
+
+		/*
+		 * The output list and its elements were allocated in the caller's
+		 * hopefully short-lived context, so they will not stay leaked for
+		 * long.  Still, reset the pointer so it isn't looked at by accident.
+		 */
+		*part_prune_result_list = NIL;
 	}
 
 	/*
@@ -874,10 +899,15 @@ CheckCachedPlan(CachedPlanSource *plansource)
  * Planning work is done in the caller's memory context.  The finished plan
  * is in a child memory context, which typically should get reparented
  * (unless this is a one-shot plan, in which case we don't copy the plan).
+ *
+ * A list of NULLs is returned in *part_prune_result_list, meaning that no
+ * PartitionPruneResult nodes have yet been created for the plans in
+ * stmt_list.
  */
 static CachedPlan *
 BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
-				ParamListInfo boundParams, QueryEnvironment *queryEnv)
+				ParamListInfo boundParams, QueryEnvironment *queryEnv,
+				List **part_prune_result_list)
 {
 	CachedPlan *plan;
 	List	   *plist;
@@ -1007,6 +1037,17 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 
 	MemoryContextSwitchTo(oldcxt);
 
+	/*
+	 * There are no PartitionPruneResults to return for a freshly built
+	 * plan, but callers expect one list element per PlannedStmt, so fill
+	 * the list with NULLs.
+	 */
+	*part_prune_result_list = NIL;
+	foreach(lc, plist)
+	{
+		*part_prune_result_list = lappend(*part_prune_result_list, NULL);
+	}
+
 	return plan;
 }
 
@@ -1126,6 +1167,17 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
  * plan or a custom plan for the given parameters: the caller does not know
  * which it will get.
  *
+ * If part_prune_result_list is not NULL, *part_prune_result_list receives
+ * one element per PlannedStmt in the returned CachedPlan, each either a
+ * PartitionPruneResult or NULL.  It is a PartitionPruneResult only when the
+ * PlannedStmt comes from an existing, still-valid generic CachedPlan and
+ * contains at least one PartitionPruneInfo with "initial" pruning steps.
+ * AcquireExecutorLocks() performs those steps via ExecutorDoInitialPruning()
+ * so that it locks only the leaf partitions whose subplans survive pruning.
+ * The PartitionPruneResult contains a list of bitmapsets of the indexes of
+ * matching subplans, one for each PartitionPruneInfo, plus the set of RT
+ * indexes of the leaf partitions to be scanned.
+ *
  * On return, the plan is valid and we have sufficient locks to begin
  * execution.
  *
@@ -1139,11 +1191,13 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
  */
 CachedPlan *
 GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
-			  ResourceOwner owner, QueryEnvironment *queryEnv)
+			  ResourceOwner owner, QueryEnvironment *queryEnv,
+			  List **part_prune_result_list)
 {
 	CachedPlan *plan = NULL;
 	List	   *qlist;
 	bool		customplan;
+	List	   *my_part_prune_result_list;
 
 	/* Assert caller is doing things in a sane order */
 	Assert(plansource->magic == CACHEDPLANSOURCE_MAGIC);
@@ -1160,7 +1214,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 
 	if (!customplan)
 	{
-		if (CheckCachedPlan(plansource))
+		if (CheckCachedPlan(plansource, boundParams,
+							&my_part_prune_result_list))
 		{
 			/* We want a generic plan, and we already have a valid one */
 			plan = plansource->gplan;
@@ -1169,7 +1224,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 		else
 		{
 			/* Build a new generic plan */
-			plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv);
+			plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv,
+								   &my_part_prune_result_list);
 			/* Just make real sure plansource->gplan is clear */
 			ReleaseGenericPlan(plansource);
 			/* Link the new generic plan into the plansource */
@@ -1214,7 +1270,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 	if (customplan)
 	{
 		/* Build a custom plan */
-		plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv);
+		plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv,
+							   &my_part_prune_result_list);
 		/* Accumulate total costs of custom plans */
 		plansource->total_custom_cost += cached_plan_cost(plan, true);
 
@@ -1246,6 +1303,9 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 		plan->is_saved = true;
 	}
 
+	if (part_prune_result_list)
+		*part_prune_result_list = my_part_prune_result_list;
+
 	return plan;
 }
 
@@ -1737,17 +1797,29 @@ QueryListGetPrimaryStmt(List *stmts)
 
 /*
  * AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
- * or release them if acquire is false.
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_result_list.
+ *
+ * On return, *lockedRelids_per_stmt will contain a bitmapset for every
+ * PlannedStmt in stmt_list, containing the RT indexes of relation entries
+ * in its range table that were actually locked, or NULL if the PlannedStmt
+ * contains a utility statement.
  */
 static void
-AcquireExecutorLocks(List *stmt_list, bool acquire)
+AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+					 List **part_prune_result_list,
+					 List **lockedRelids_per_stmt)
 {
 	ListCell   *lc1;
 
+	*part_prune_result_list = *lockedRelids_per_stmt = NIL;
 	foreach(lc1, stmt_list)
 	{
 		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
-		ListCell   *lc2;
+		PartitionPruneResult *part_prune_result = NULL;
+		Bitmapset  *allLockRelids;
+		Bitmapset  *lockedRelids = NULL;
+		int			rti;
 
 		if (plannedstmt->commandType == CMD_UTILITY)
 		{
@@ -1761,13 +1833,35 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 			Query	   *query = UtilityContainsQuery(plannedstmt->utilityStmt);
 
 			if (query)
-				ScanQueryForLocks(query, acquire);
+				ScanQueryForLocks(query, true);
+			*part_prune_result_list = lappend(*part_prune_result_list, NULL);
 			continue;
 		}
 
-		foreach(lc2, plannedstmt->rtable)
+		/*
+		 * Figure out the set of relations that would need to be locked
+		 * before executing the plan.
+		 */
+		if (plannedstmt->containsInitialPruning)
 		{
-			RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
+			/*
+			 * Obtain the set of partitions to be locked from the
+			 * PartitionPruneInfos by considering the result of performing
+			 * initial partition pruning.
+			 */
+			part_prune_result =
+				ExecutorDoInitialPruning(plannedstmt, boundParams);
+
+			allLockRelids = bms_union(plannedstmt->minLockRelids,
+									  part_prune_result->scan_leafpart_rtis);
+		}
+		else
+			allLockRelids = plannedstmt->minLockRelids;
+
+		rti = -1;
+		while ((rti = bms_next_member(allLockRelids, rti)) > 0)
+		{
+			RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
 
 			if (rte->rtekind != RTE_RELATION)
 				continue;
@@ -1778,10 +1872,58 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 			 * fail if it's been dropped entirely --- we'll just transiently
 			 * acquire a non-conflicting lock.
 			 */
-			if (acquire)
-				LockRelationOid(rte->relid, rte->rellockmode);
-			else
-				UnlockRelationOid(rte->relid, rte->rellockmode);
+			LockRelationOid(rte->relid, rte->rellockmode);
+			lockedRelids = bms_add_member(lockedRelids, rti);
+		}
+
+		*part_prune_result_list = lappend(*part_prune_result_list,
+										  part_prune_result);
+		*lockedRelids_per_stmt = lappend(*lockedRelids_per_stmt, lockedRelids);
+	}
+}
+
+/*
+ * ReleaseExecutorLocks
+ * 		Release locks that would've been acquired by an earlier call to
+ * 		AcquireExecutorLocks()
+ */
+static void
+ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt)
+{
+	ListCell   *lc1,
+			   *lc2;
+
+	forboth(lc1, stmt_list, lc2, lockedRelids_per_stmt)
+	{
+		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
+		Bitmapset  *lockedRelids = lfirst(lc2);
+		int			rti;
+
+		if (plannedstmt->commandType == CMD_UTILITY)
+		{
+			/*
+			 * Ignore utility statements, except those (such as EXPLAIN) that
+			 * contain a parsed-but-not-planned query.  Note: it's okay to use
+			 * ScanQueryForLocks, even though the query hasn't been through
+			 * rule rewriting, because rewriting doesn't change the query
+			 * representation.
+			 */
+			Query	   *query = UtilityContainsQuery(plannedstmt->utilityStmt);
+
+			if (query)
+				ScanQueryForLocks(query, false);
+			continue;
+		}
+
+		rti = -1;
+		while ((rti = bms_next_member(lockedRelids, rti)) >= 0)
+		{
+			RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
+
+			Assert(rte->rtekind == RTE_RELATION);
+
+			/* See the comment in AcquireExecutorLocks(). */
+			UnlockRelationOid(rte->relid, rte->rellockmode);
 		}
 	}
 }
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index d549f66d4a..1bbe6b704b 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -303,6 +303,25 @@ PortalDefineQuery(Portal portal,
 	portal->status = PORTAL_DEFINED;
 }
 
+/*
+ * PortalStorePartitionPruneResults
+ *		Copy the given list of PartitionPruneResults into the portal's
+ *		context
+ *
+ * This allows the caller to ensure that the list exists as long as the portal
+ * does.
+ */
+void
+PortalStorePartitionPruneResults(Portal portal, List *part_prune_results)
+{
+	MemoryContext	oldcxt;
+
+	AssertArg(PortalIsValid(portal));
+	oldcxt = MemoryContextSwitchTo(portal->portalContext);
+	portal->part_prune_results = copyObject(part_prune_results);
+	MemoryContextSwitchTo(oldcxt);
+}
+
 /*
  * PortalReleaseCachedPlan
  *		Release a portal's reference to its cached plan, if any.
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 9ebde089ae..e57e133f0e 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -87,7 +87,9 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
 							  ExplainState *es, const char *queryString,
 							  ParamListInfo params, QueryEnvironment *queryEnv);
 
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt,
+						   PartitionPruneResult *part_prune_result,
+						   IntoClause *into,
 						   ExplainState *es, const char *queryString,
 						   ParamListInfo params, QueryEnvironment *queryEnv,
 						   const instr_time *planduration,
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index bf962af7af..bd8776402e 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -45,6 +45,7 @@ extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
  * nparts						Length of subplan_map[] and subpart_map[].
  * subplan_map					Subplan index by partition index, or -1.
  * subpart_map					Subpart index by partition index, or -1.
+ * rti_map						Range table index by partition index, or 0.
  * present_parts				A Bitmapset of the partition indexes that we
  *								have subplans or subparts for.
  * initial_pruning_steps		List of PartitionPruneSteps used to
@@ -61,6 +62,7 @@ typedef struct PartitionedRelPruningData
 	int			nparts;
 	int		   *subplan_map;
 	int		   *subpart_map;
+	Index	   *rti_map;
 	Bitmapset  *present_parts;
 	List	   *initial_pruning_steps;
 	List	   *exec_pruning_steps;
@@ -126,5 +128,10 @@ extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
 													 int part_prune_index,
 													 Bitmapset **initially_valid_subplans);
 extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
-										   bool initial_prune);
+										   bool initial_prune,
+										   Bitmapset **scan_leafpart_rtis);
+extern Bitmapset *ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt,
+								ParamListInfo params,
+								PartitionPruneInfo *pruneinfo,
+								Bitmapset **scan_leafpart_rtis);
 #endif							/* EXECPARTITION_H */
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index e79e2c001f..60d5644908 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -35,6 +35,8 @@ typedef struct QueryDesc
 	/* These fields are provided by CreateQueryDesc */
 	CmdType		operation;		/* CMD_SELECT, CMD_UPDATE, etc. */
 	PlannedStmt *plannedstmt;	/* planner's output (could be utility, too) */
+	PartitionPruneResult *part_prune_result; /* ExecutorDoInitialPruning()'s
+											  * output for plannedstmt */
 	const char *sourceText;		/* source text of the query */
 	Snapshot	snapshot;		/* snapshot to use for query */
 	Snapshot	crosscheck_snapshot;	/* crosscheck for RI update/delete */
@@ -57,6 +59,7 @@ typedef struct QueryDesc
 
 /* in pquery.c */
 extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
+								  PartitionPruneResult *part_prune_result,
 								  const char *sourceText,
 								  Snapshot snapshot,
 								  Snapshot crosscheck_snapshot,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index d68a6b9d28..5c4a282be0 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -185,6 +185,8 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
 /*
  * prototypes from functions in execMain.c
  */
+extern PartitionPruneResult *ExecutorDoInitialPruning(PlannedStmt *plannedstmt,
+													  ParamListInfo params);
 extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
 extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
 extern void ExecutorRun(QueryDesc *queryDesc,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 25e0bb976e..4d4bb3fc3c 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -986,6 +986,33 @@ typedef struct DomainConstraintState
  */
 typedef TupleTableSlot *(*ExecProcNodeMtd) (struct PlanState *pstate);
 
+/*----------------
+ * PartitionPruneResult
+ *
+ * The result of invoking ExecutorDoInitialPruning() on a given
+ * PlannedStmt.
+ *
+ * Contains a list of Bitmapsets, one for every PartitionPruneInfo found in
+ * PlannedStmt.partPruneInfos, each holding the indexes of the subplans that
+ * remain after initial pruning is performed by ExecFindMatchingSubPlans().
+ * The RT indexes of the leaf partitions scanned by those subplans, across
+ * all PartitionPruneInfos, are collected in scan_leafpart_rtis.
+ *
+ * This is used by GetCachedPlan() to inform its callers of the pruning
+ * decisions made when performing AcquireExecutorLocks() on a given cached
+ * PlannedStmt, which the callers then pass on to the executor.  The executor
+ * refers to this node when initializing the plan nodes which contain subplans
+ * that may have been pruned by ExecutorDoInitialPruning(), rather than
+ * redoing initial pruning.
+ */
+typedef struct PartitionPruneResult
+{
+	NodeTag		type;
+
+	List		   *valid_subplan_offs_list;
+	Bitmapset	   *scan_leafpart_rtis;
+} PartitionPruneResult;
+
 /* ----------------
  *		PlanState node
  *
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 7ce1fc4deb..c7f256028e 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -97,6 +97,9 @@ typedef enum NodeTag
 	T_PartitionPruneStepCombine,
 	T_PlanInvalItem,
 
+	/* TAGS FOR EXECUTOR PREP NODES (execnodes.h) */
+	T_PartitionPruneResult,
+
 	/*
 	 * TAGS FOR PLAN STATE NODES (execnodes.h)
 	 *
@@ -675,6 +678,7 @@ extern struct Bitmapset *readBitmapset(void);
 extern uintptr_t readDatum(bool typbyval);
 extern bool *readBoolCols(int numCols);
 extern int *readIntCols(int numCols);
+extern Index *readIndexCols(int numCols);
 extern Oid *readOidCols(int numCols);
 extern int16 *readAttrNumberCols(int numCols);
 
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index a0f3a46334..c2d91bb12f 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -110,6 +110,15 @@ typedef struct PlannerGlobal
 	List	   *partPruneInfos;		/* List of PartitionPruneInfo contained in
 									 * the plan */
 
+	bool		containsInitialPruning;	/* Do any of those PartitionPruneInfos
+										 * have initial (pre-exec) pruning
+										 * steps in them? */
+
+	Bitmapset  *minLockRelids;	/* Indexes of all range table entries minus
+								 * indexes of range table entries of the leaf
+								 * partitions scanned by prunable subplans;
+								 * see AcquireExecutorLocks() */
+
 	List	   *relationOids;	/* OIDs of relations the plan depends on */
 
 	List	   *invalItems;		/* other dependencies, as PlanInvalItems */
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index c3f4a39657..869bf535bc 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -67,8 +67,17 @@ typedef struct PlannedStmt
 	List	   *partPruneInfos;	/* List of PartitionPruneInfo contained in
 								 * the plan */
 
+	bool		containsInitialPruning;	/* Do any of those PartitionPruneInfos
+										 * have initial (pre-exec) pruning
+										 * steps in them? */
+
 	List	   *rtable;			/* list of RangeTblEntry nodes */
 
+	Bitmapset  *minLockRelids;	/* Indexes of all range table entries minus
+								 * indexes of range table entries of the leaf
+								 * partitions scanned by prunable subplans;
+								 * see AcquireExecutorLocks() */
+
 	/* rtable indexes of target relations for INSERT/UPDATE/DELETE */
 	List	   *resultRelations;	/* integer list of RT indexes, or NIL */
 
@@ -1386,6 +1395,13 @@ typedef struct PlanRowMark
  * prune_infos			List of Lists containing PartitionedRelPruneInfo nodes,
  *						one sublist per run-time-prunable partition hierarchy
  *						appearing in the parent plan node's subplans.
+ *
+ * needs_init_pruning	Does any of the PartitionedRelPruneInfos in
+ *						prune_infos have its initial_pruning_steps set?
+ *
+ * needs_exec_pruning	Does any of the PartitionedRelPruneInfos in
+ *						prune_infos have its exec_pruning_steps set?
+ *
  * other_subplans		Indexes of any subplans that are not accounted for
  *						by any of the PartitionedRelPruneInfo nodes in
  *						"prune_infos".  These subplans must not be pruned.
@@ -1394,6 +1410,8 @@ typedef struct PartitionPruneInfo
 {
 	NodeTag		type;
 	List	   *prune_infos;
+	bool		needs_init_pruning;
+	bool		needs_exec_pruning;
 	Bitmapset  *other_subplans;
 } PartitionPruneInfo;
 
@@ -1436,6 +1454,9 @@ typedef struct PartitionedRelPruneInfo
 	/* relation OID by partition index, or 0 */
 	Oid		   *relid_map;
 
+	/* Range table index by partition index, or 0. */
+	Index	   *rti_map;
+
 	/*
 	 * initial_pruning_steps shows how to prune during executor startup (i.e.,
 	 * without use of any PARAM_EXEC Params); it is NIL if no startup pruning
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 0499635f59..1c5bb5ece1 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -220,7 +220,8 @@ extern List *CachedPlanGetTargetList(CachedPlanSource *plansource,
 extern CachedPlan *GetCachedPlan(CachedPlanSource *plansource,
 								 ParamListInfo boundParams,
 								 ResourceOwner owner,
-								 QueryEnvironment *queryEnv);
+								 QueryEnvironment *queryEnv,
+								 List **part_prune_result_list);
 extern void ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner);
 
 extern bool CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index aeddbdafe5..9f7727a837 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -138,6 +138,7 @@ typedef struct PortalData
 	QueryCompletion qc;			/* command completion data for executed query */
 	List	   *stmts;			/* list of PlannedStmts */
 	CachedPlan *cplan;			/* CachedPlan, if stmts are from one */
+	List	   *part_prune_results;	/* list of PartitionPruneResults */
 
 	ParamListInfo portalParams; /* params to pass to query */
 	QueryEnvironment *queryEnv; /* environment for query */
@@ -242,6 +243,8 @@ extern void PortalDefineQuery(Portal portal,
 							  CommandTag commandTag,
 							  List *stmts,
 							  CachedPlan *cplan);
+extern void PortalStorePartitionPruneResults(Portal portal,
+											 List *part_prune_result_list);
 extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
 extern void PortalCreateHoldStore(Portal portal);
 extern void PortalHashTableDeleteAll(void);
-- 
2.35.3
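
For readers who want the gist of how the executor is meant to consume a
PartitionPruneResult (per the execnodes.h comment above), here is a
minimal standalone sketch.  It is not code from the patch: the Toy*
types, the toy_* functions, and the use of a 64-bit mask in place of a
Bitmapset are all assumptions made for illustration.  The idea shown is
only that a node looks up its surviving subplans by its
PartitionPruneInfo index when a result was supplied, and performs
initial pruning itself only when none was (for example, when the plan
did not come from plancache.c as a generic plan).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_PRUNE_INFOS 8

/*
 * Toy stand-in for PartitionPruneResult: one bitmask of surviving subplan
 * offsets per PartitionPruneInfo.  (Hypothetical; the patch uses a List of
 * Bitmapsets.)
 */
typedef struct ToyPruneResult
{
    int      ninfos;
    uint64_t valid_subplan_offs[TOY_MAX_PRUNE_INFOS];
} ToyPruneResult;

/* Counts how often pruning was actually (re)done; for demonstration only. */
int toy_prune_calls = 0;

/* Hypothetical pruning routine: pretends only subplan 0 survives. */
uint64_t
toy_do_initial_pruning(int part_prune_index)
{
    (void) part_prune_index;
    toy_prune_calls++;
    return UINT64_C(1);
}

/*
 * Return the set of initially valid subplans for the node whose
 * PartitionPruneInfo sits at part_prune_index.  If a result computed
 * earlier is available, reuse it rather than pruning again, so that the
 * executor sees the same subplans whose relations were locked.
 */
uint64_t
toy_initially_valid_subplans(const ToyPruneResult *result,
                             int part_prune_index)
{
    if (result != NULL)
    {
        assert(part_prune_index >= 0 && part_prune_index < result->ninfos);
        return result->valid_subplan_offs[part_prune_index];
    }
    return toy_do_initial_pruning(part_prune_index);
}
```

Passing a non-NULL result never invokes toy_do_initial_pruning(), which
mirrors the requirement that the executor not recompute a possibly
different set of subplans than the one the locks were taken for.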




* Re: generic plans and "initial" pruning
@ 2022-07-13 06:40  Amit Langote <[email protected]>
  parent: Amit Langote <[email protected]>
  1 sibling, 1 reply; 108+ messages in thread

From: Amit Langote @ 2022-07-13 06:40 UTC (permalink / raw)
  To: Jacob Champion <[email protected]>; +Cc: Zhihong Yu <[email protected]>; David Rowley <[email protected]>; Robert Haas <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers

Rebased over 964d01ae90c.

-- 
Thanks, Amit Langote
EDB: http://www.enterprisedb.com


Attachments:

  [application/octet-stream] v18-0002-Optimize-AcquireExecutorLocks-by-locking-only-un.patch (81.4K, 2-v18-0002-Optimize-AcquireExecutorLocks-by-locking-only-un.patch)
From 567059057ee35bcd8ca066f46d4c6b23641af090 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Wed, 22 Dec 2021 16:55:17 +0900
Subject: [PATCH v18 2/2] Optimize AcquireExecutorLocks() by locking only
 unpruned partitions

This commit teaches AcquireExecutorLocks() to perform initial
partition pruning to notionally eliminate the subnodes of a generic
cached plan that need not be initialized during the actual execution
of the plan, and to skip locking the partitions scanned by those
subnodes.

The result of performing initial partition pruning this way, before
the actual execution has started, is passed to the executor as a
PartitionPruneResult, which the callers of the executor that used
plancache.c to get the plan supply along with the PlannedStmt.  It is
NULL if the plan was obtained by calling the planner directly or if
the plan obtained from plancache.c is not a generic one.
---
 src/backend/commands/copyto.c          |   2 +-
 src/backend/commands/createas.c        |   2 +-
 src/backend/commands/explain.c         |   7 +-
 src/backend/commands/extension.c       |   2 +-
 src/backend/commands/matview.c         |   2 +-
 src/backend/commands/prepare.c         |  26 ++-
 src/backend/executor/README            |  32 ++++
 src/backend/executor/execMain.c        |  53 ++++++
 src/backend/executor/execParallel.c    |  27 ++-
 src/backend/executor/execPartition.c   | 234 +++++++++++++++++++++----
 src/backend/executor/functions.c       |   2 +-
 src/backend/executor/nodeAppend.c      |  11 +-
 src/backend/executor/nodeMergeAppend.c |   5 +-
 src/backend/executor/spi.c             |  27 ++-
 src/backend/nodes/copyfuncs.c          |   1 -
 src/backend/nodes/outfuncs.c           |   1 -
 src/backend/nodes/readfuncs.c          |  29 +++
 src/backend/optimizer/plan/planner.c   |   2 +
 src/backend/optimizer/plan/setrefs.c   |  46 +++++
 src/backend/partitioning/partprune.c   |  41 ++++-
 src/backend/tcop/postgres.c            |   8 +-
 src/backend/tcop/pquery.c              |  28 ++-
 src/backend/utils/cache/plancache.c    | 187 +++++++++++++++++---
 src/backend/utils/mmgr/portalmem.c     |  19 ++
 src/include/commands/explain.h         |   4 +-
 src/include/executor/execPartition.h   |   9 +-
 src/include/executor/execdesc.h        |   3 +
 src/include/executor/executor.h        |   2 +
 src/include/nodes/execnodes.h          |  27 +++
 src/include/nodes/nodes.h              |   1 +
 src/include/nodes/pathnodes.h          |  13 ++
 src/include/nodes/plannodes.h          |  21 +++
 src/include/utils/plancache.h          |   3 +-
 src/include/utils/portal.h             |   3 +
 34 files changed, 782 insertions(+), 98 deletions(-)
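
As a rough aid for reading the plancache.c changes, the lock-set
computation described in the commit message can be modeled as below.
This is a standalone toy, not code from the patch: ToyRelids and the
toy_* functions are hypothetical names, and a 64-bit mask stands in for
a Bitmapset of RT indexes.  It assumes only what the patch states, namely
that the relations to lock are minLockRelids plus, when the plan contains
initial pruning steps, the RT indexes of the leaf partitions that
survive pruning (scan_leafpart_rtis).

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for a Bitmapset: bit i set means RT index i is a member. */
typedef uint64_t ToyRelids;

/* Add RT index rti (0..63) to the set. */
ToyRelids
toy_add_member(ToyRelids set, int rti)
{
    assert(rti >= 0 && rti < 64);
    return set | (UINT64_C(1) << rti);
}

/* Is RT index rti a member of the set? */
int
toy_is_member(ToyRelids set, int rti)
{
    assert(rti >= 0 && rti < 64);
    return (int) ((set >> rti) & 1);
}

/*
 * Hypothetical model of choosing which relations to lock: always the
 * minimal set, plus the surviving leaf partitions when initial pruning
 * was performed.
 */
ToyRelids
toy_relids_to_lock(ToyRelids min_lock_relids,
                   int contains_initial_pruning,
                   ToyRelids scan_leafpart_rtis)
{
    if (contains_initial_pruning)
        return min_lock_relids | scan_leafpart_rtis;
    return min_lock_relids;
}

/* Count the members, i.e. how many relations would be locked. */
int
toy_num_locks(ToyRelids set)
{
    int n = 0;

    for (int rti = 0; rti < 64; rti++)
        n += toy_is_member(set, rti);
    return n;
}
```

For instance, with a partitioned parent at RT index 1, leaf partitions
at 2, 3 and 4, and an unrelated table at 5, a minimal set of {1, 5} and
a pruning result of {3} would lead to three relations being locked
instead of five.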

diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index fca29a9a10..d839517693 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -541,7 +541,7 @@ BeginCopyTo(ParseState *pstate,
 		((DR_copy *) dest)->cstate = cstate;
 
 		/* Create a QueryDesc requesting no output */
-		cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+		cstate->queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
 											dest, NULL, NULL, 0);
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 9abbb6b555..f6607f2454 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -325,7 +325,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
 		UpdateActiveSnapshotCommandId();
 
 		/* Create a QueryDesc, redirecting output to our tuple receiver */
-		queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+		queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
 									GetActiveSnapshot(), InvalidSnapshot,
 									dest, params, queryEnv, 0);
 
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index e29c2ae206..e41b13a3ea 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -407,7 +407,7 @@ ExplainOneQuery(Query *query, int cursorOptions,
 		}
 
 		/* run it (if needed) and produce output */
-		ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+		ExplainOnePlan(plan, NULL, into, es, queryString, params, queryEnv,
 					   &planduration, (es->buffers ? &bufusage : NULL));
 	}
 }
@@ -515,7 +515,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
  * to call it.
  */
 void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, PartitionPruneResult *part_prune_result,
+			   IntoClause *into, ExplainState *es,
 			   const char *queryString, ParamListInfo params,
 			   QueryEnvironment *queryEnv, const instr_time *planduration,
 			   const BufferUsage *bufusage)
@@ -563,7 +564,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
 		dest = None_Receiver;
 
 	/* Create a QueryDesc for the query */
-	queryDesc = CreateQueryDesc(plannedstmt, queryString,
+	queryDesc = CreateQueryDesc(plannedstmt, part_prune_result, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, params, queryEnv, instrument_option);
 
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index 3db859c3ea..631cc07217 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -776,7 +776,7 @@ execute_sql_string(const char *sql)
 			{
 				QueryDesc  *qdesc;
 
-				qdesc = CreateQueryDesc(stmt,
+				qdesc = CreateQueryDesc(stmt, NULL,
 										sql,
 										GetActiveSnapshot(), NULL,
 										dest, NULL, NULL, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 9ac0383459..b0ed96e56c 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -408,7 +408,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
 	UpdateActiveSnapshotCommandId();
 
 	/* Create a QueryDesc, redirecting output to our tuple receiver */
-	queryDesc = CreateQueryDesc(plan, queryString,
+	queryDesc = CreateQueryDesc(plan, NULL, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, NULL, NULL, 0);
 
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 2333aae467..83465e40f8 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -155,6 +155,7 @@ ExecuteQuery(ParseState *pstate,
 	PreparedStatement *entry;
 	CachedPlan *cplan;
 	List	   *plan_list;
+	List	   *part_prune_result_list;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
 	Portal		portal;
@@ -193,7 +194,10 @@ ExecuteQuery(ParseState *pstate,
 									   entry->plansource->query_string);
 
 	/* Replan if needed, and increment plan refcount for portal */
-	cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL);
+	cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL,
+						  &part_prune_result_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_result_list));
 	plan_list = cplan->stmt_list;
 
 	/*
@@ -207,6 +211,9 @@ ExecuteQuery(ParseState *pstate,
 					  plan_list,
 					  cplan);
 
+	/* Copy PartitionPruneResults into the portal's context. */
+	PortalStorePartitionPruneResults(portal, part_prune_result_list);
+
 	/*
 	 * For CREATE TABLE ... AS EXECUTE, we must verify that the prepared
 	 * statement is one that produces tuples.  Currently we insist that it be
@@ -576,7 +583,9 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	const char *query_string;
 	CachedPlan *cplan;
 	List	   *plan_list;
-	ListCell   *p;
+	List	   *part_prune_result_list;
+	ListCell   *p,
+			   *pp;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
 	instr_time	planstart;
@@ -619,7 +628,10 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 
 	/* Replan if needed, and acquire a transient refcount */
 	cplan = GetCachedPlan(entry->plansource, paramLI,
-						  CurrentResourceOwner, queryEnv);
+						  CurrentResourceOwner, queryEnv,
+						  &part_prune_result_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_result_list));
 
 	INSTR_TIME_SET_CURRENT(planduration);
 	INSTR_TIME_SUBTRACT(planduration, planstart);
@@ -634,13 +646,15 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	plan_list = cplan->stmt_list;
 
 	/* Explain each query */
-	foreach(p, plan_list)
+	forboth(p, plan_list, pp, part_prune_result_list)
 	{
 		PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
+		PartitionPruneResult *part_prune_result = lfirst_node(PartitionPruneResult, pp);
 
 		if (pstmt->commandType != CMD_UTILITY)
-			ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv,
-						   &planduration, (es->buffers ? &bufusage : NULL));
+			ExplainOnePlan(pstmt, part_prune_result, into, es, query_string,
+						   paramLI, queryEnv, &planduration,
+						   (es->buffers ? &bufusage : NULL));
 		else
 			ExplainOneUtility(pstmt->utilityStmt, into, es, query_string,
 							  paramLI, queryEnv);
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 0b5183fc4a..953a476ea5 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -65,6 +65,34 @@ found there.  This currently only occurs for Append and MergeAppend nodes.  In
 this case the non-required subplans are ignored and the executor state's
 subnode array will become out of sequence to the plan's subplan list.
 
+Actually, the so-called execution time pruning may occur even before the
+execution has started.  One case where that occurs is when a cached generic
+plan is being validated for execution by plancache.c: GetCachedPlan(), which
+works by locking all the relations that will be scanned by that plan.  If the
+generic plan contains nodes that can perform execution time partition pruning
+(that is, contain a PartitionPruneInfo), a subset of pruning steps contained
+in a given node's PartitionPruneInfo that do not depend on the execution
+actually having started (called "initial" pruning steps) are performed at this
+point to figure out the minimal set of child subplans that satisfy those
+pruning steps.  AcquireExecutorLocks() looking at a given plan tree will then
+lock only the relations scanned by the child subplans that survived such
+pruning, along with those present in PlannedStmt.minLockRelids.  Note that the
+subplans are only notionally pruned in that they are not removed from the plan
+tree as such.
+
+To prevent the executor and any third party execution code that can look at
+the plan tree from trying to execute the subplans that were pruned as
+described above, the result of pruning is passed to the executor as a
+PartitionPruneResult node via the QueryDesc.  It consists of the set of
+indexes of surviving subplans in their respective parent plan node's list of
+child subplans, saved as a list of bitmapsets, with one element for every
+parent plan node whose PartitionPruneInfo is present in
+PlannedStmt.partPruneInfos.  In other words, the executor should not
+re-evaluate the set of initially valid subplans by redoing the initial pruning
+if it was already done by AcquireExecutorLocks(), because the re-evaluation may
+very well end up resulting in a different set of subplans, containing some
+whose relations were not locked by AcquireExecutorLocks().
+
 Each Plan node may have expression trees associated with it, to represent
 its target list, qualification conditions, etc.  These trees are also
 read-only to the executor, but the executor state for expression evaluation
@@ -286,6 +314,10 @@ Query Processing Control Flow
 
 This is a sketch of control flow for full query processing:
 
+  [ ExecutorDoInitialPruning ] --- an optional step to perform initial
+		partition pruning on the plan tree, the result of which is passed
+		to the executor via QueryDesc
+
 	CreateQueryDesc
 
 	ExecutorStart
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 72fc273524..45824624f8 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -49,6 +49,7 @@
 #include "commands/matview.h"
 #include "commands/trigger.h"
 #include "executor/execdebug.h"
+#include "executor/execPartition.h"
 #include "executor/nodeSubplan.h"
 #include "foreign/fdwapi.h"
 #include "jit/jit.h"
@@ -104,6 +105,56 @@ static void EvalPlanQualStart(EPQState *epqstate, Plan *planTree);
 
 /* end of local decls */
 
+/* ----------------------------------------------------------------
+ *		ExecutorDoInitialPruning
+ *
+ *		For each plan tree node that has been assigned a PartitionPruneInfo,
+ *		this performs initial partition pruning using the information contained
+ *		therein to determine the set of child subplans that satisfy the initial
+ *		pruning steps, to be returned as a bitmapset of their indexes in the
+ *		node's list of child subplans (for example, an Append's appendplans).
+ *
+ * Return value is a PartitionPruneResult node that contains a list of those
+ * bitmapsets, with one element for every PartitionPruneInfo, and a bitmapset
+ * of the RT indexes of all the leaf partitions scanned by those chosen
+ * subplans.  Note that the latter is shared across all PartitionPruneInfos.
+ *
+ * The executor must see exactly the same set of subplans as valid for
+ * execution when doing ExecInitNode() on the plan nodes whose
+ * PartitionPruneInfos are processed here.  So, it must get the set from the
+ * aforementioned PartitionPruneResult, instead of computing it all over
+ * again by redoing the initial pruning.  It's the caller's job to pass the
+ * PartitionPruneResult to the executor.
+ *
+ * Note: Partitioned tables mentioned in PartitionedRelPruneInfo nodes that
+ * drive the pruning will be locked before doing the pruning.
+ * ----------------------------------------------------------------
+ */
+PartitionPruneResult *
+ExecutorDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params)
+{
+	PartitionPruneResult *result;
+	ListCell *lc;
+
+	/* Only get here if there is any pruning to do. */
+	Assert(plannedstmt->containsInitialPruning);
+
+	result = makeNode(PartitionPruneResult);
+	foreach(lc, plannedstmt->partPruneInfos)
+	{
+		PartitionPruneInfo *pruneinfo = lfirst(lc);
+		Bitmapset *valid_subplan_offs;
+
+		valid_subplan_offs =
+			ExecPartitionDoInitialPruning(plannedstmt, params, pruneinfo,
+										  &result->scan_leafpart_rtis);
+		result->valid_subplan_offs_list =
+			lappend(result->valid_subplan_offs_list,
+					valid_subplan_offs);
+	}
+
+	return result;
+}
 
 /* ----------------------------------------------------------------
  *		ExecutorStart
@@ -806,6 +857,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 {
 	CmdType		operation = queryDesc->operation;
 	PlannedStmt *plannedstmt = queryDesc->plannedstmt;
+	PartitionPruneResult *part_prune_result = queryDesc->part_prune_result;
 	Plan	   *plan = plannedstmt->planTree;
 	List	   *rangeTable = plannedstmt->rtable;
 	EState	   *estate = queryDesc->estate;
@@ -826,6 +878,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 
 	estate->es_plannedstmt = plannedstmt;
 	estate->es_part_prune_infos = plannedstmt->partPruneInfos;
+	estate->es_part_prune_result = part_prune_result;
 
 	/*
 	 * Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index f73b8c2607..7e6dab5623 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -66,6 +66,7 @@
 #define PARALLEL_KEY_QUERY_TEXT		UINT64CONST(0xE000000000000008)
 #define PARALLEL_KEY_JIT_INSTRUMENTATION UINT64CONST(0xE000000000000009)
 #define PARALLEL_KEY_WAL_USAGE			UINT64CONST(0xE00000000000000A)
+#define PARALLEL_KEY_PARTITIONPRUNERESULT	UINT64CONST(0xE00000000000000B)
 
 #define PARALLEL_TUPLE_QUEUE_SIZE		65536
 
@@ -182,6 +183,7 @@ ExecSerializePlan(Plan *plan, EState *estate)
 	pstmt->transientPlan = false;
 	pstmt->dependsOnRole = false;
 	pstmt->parallelModeNeeded = false;
+	pstmt->containsInitialPruning = false;
 	pstmt->planTree = plan;
 	pstmt->partPruneInfos = estate->es_part_prune_infos;
 	pstmt->rtable = estate->es_range_table;
@@ -597,12 +599,15 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	FixedParallelExecutorState *fpes;
 	char	   *pstmt_data;
 	char	   *pstmt_space;
+	char	   *part_prune_result_data;
+	char	   *part_prune_result_space;
 	char	   *paramlistinfo_space;
 	BufferUsage *bufusage_space;
 	WalUsage   *walusage_space;
 	SharedExecutorInstrumentation *instrumentation = NULL;
 	SharedJitInstrumentation *jit_instrumentation = NULL;
 	int			pstmt_len;
+	int			part_prune_result_len;
 	int			paramlistinfo_len;
 	int			instrumentation_len = 0;
 	int			jit_instrumentation_len = 0;
@@ -631,6 +636,7 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 
 	/* Fix up and serialize plan to be sent to workers. */
 	pstmt_data = ExecSerializePlan(planstate->plan, estate);
+	part_prune_result_data = nodeToString(estate->es_part_prune_result);
 
 	/* Create a parallel context. */
 	pcxt = CreateParallelContext("postgres", "ParallelQueryMain", nworkers);
@@ -657,6 +663,11 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	shm_toc_estimate_chunk(&pcxt->estimator, pstmt_len);
 	shm_toc_estimate_keys(&pcxt->estimator, 1);
 
+	/* Estimate space for serialized PartitionPruneResult. */
+	part_prune_result_len = strlen(part_prune_result_data) + 1;
+	shm_toc_estimate_chunk(&pcxt->estimator, part_prune_result_len);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
 	/* Estimate space for serialized ParamListInfo. */
 	paramlistinfo_len = EstimateParamListSpace(estate->es_param_list_info);
 	shm_toc_estimate_chunk(&pcxt->estimator, paramlistinfo_len);
@@ -751,6 +762,12 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	memcpy(pstmt_space, pstmt_data, pstmt_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PLANNEDSTMT, pstmt_space);
 
+	/* Store serialized PartitionPruneResult. */
+	part_prune_result_space = shm_toc_allocate(pcxt->toc, part_prune_result_len);
+	memcpy(part_prune_result_space, part_prune_result_data, part_prune_result_len);
+	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARTITIONPRUNERESULT,
+				   part_prune_result_space);
+
 	/* Store serialized ParamListInfo. */
 	paramlistinfo_space = shm_toc_allocate(pcxt->toc, paramlistinfo_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARAMLISTINFO, paramlistinfo_space);
@@ -1232,8 +1249,10 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 						 int instrument_options)
 {
 	char	   *pstmtspace;
+	char	   *part_prune_result_space;
 	char	   *paramspace;
 	PlannedStmt *pstmt;
+	PartitionPruneResult *part_prune_result;
 	ParamListInfo paramLI;
 	char	   *queryString;
 
@@ -1244,12 +1263,18 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 	pstmtspace = shm_toc_lookup(toc, PARALLEL_KEY_PLANNEDSTMT, false);
 	pstmt = (PlannedStmt *) stringToNode(pstmtspace);
 
+	/* Reconstruct leader-supplied PartitionPruneResult. */
+	part_prune_result_space =
+		shm_toc_lookup(toc, PARALLEL_KEY_PARTITIONPRUNERESULT, false);
+	part_prune_result = (PartitionPruneResult *)
+		stringToNode(part_prune_result_space);
+
 	/* Reconstruct ParamListInfo. */
 	paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
 	paramLI = RestoreParamList(&paramspace);
 
 	/* Create a QueryDesc for the query. */
-	return CreateQueryDesc(pstmt,
+	return CreateQueryDesc(pstmt, part_prune_result,
 						   queryString,
 						   GetActiveSnapshot(), InvalidSnapshot,
 						   receiver, paramLI, NULL, instrument_options);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index b55cdd2580..24e6f6e988 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -25,6 +25,7 @@
 #include "mb/pg_wchar.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
+#include "parser/parsetree.h"
 #include "partitioning/partbounds.h"
 #include "partitioning/partdesc.h"
 #include "partitioning/partprune.h"
@@ -185,7 +186,11 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
 static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
 static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
 static PartitionPruneState *CreatePartitionPruneState(PlanState *planstate,
-													  PartitionPruneInfo *pruneinfo);
+													  PartitionPruneInfo *pruneinfo,
+													  bool consider_initial_steps,
+													  bool consider_exec_steps,
+													  List *rtable, ExprContext *econtext,
+													  PartitionDirectory partdir);
 static void InitPartitionPruneContext(PartitionPruneContext *context,
 									  List *pruning_steps,
 									  PartitionDesc partdesc,
@@ -198,7 +203,8 @@ static void PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
 static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
 										   PartitionedRelPruningData *pprune,
 										   bool initial_prune,
-										   Bitmapset **validsubplans);
+										   Bitmapset **validsubplans,
+										   Bitmapset **scan_leafpart_rtis);
 
 
 /*
@@ -1593,8 +1599,10 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  * considered to be a stable expression, it can change value from one plan
  * node scan to the next during query execution.  Stable comparison
  * expressions that don't involve such Params allow partition pruning to be
- * done once during executor startup.  Expressions that do involve such Params
- * require us to prune separately for each scan of the parent plan node.
+ * done once during executor startup or during ExecutorDoInitialPruning(),
+ * which runs as part of AcquireExecutorLocks() on a given plan tree.
+ * Expressions that do involve such Params require us to prune separately for
+ * each scan of the parent plan node.
  *
  * Note that pruning away unneeded subplans during executor startup has the
  * added benefit of not having to initialize the unneeded subplans at all.
@@ -1611,6 +1619,13 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *		account for initial pruning possibly having eliminated some of the
  *		subplans.
  *
+ * ExecPartitionDoInitialPruning:
+ *		Do initial pruning with the information contained in a given
+ *		PartitionPruneInfo to determine the minimal set of child subplans
+ *		of the parent plan node (the one the PartitionPruneInfo belongs to)
+ *		that need to be executed, and also the set of RT indexes of the
+ *		leaf partitions that will be scanned by those subplans.
+ *
  * ExecFindMatchingSubPlans:
  *		Returns indexes of matching subplans after evaluating the expressions
  *		that are safe to evaluate at a given point.  This function is first
@@ -1628,8 +1643,9 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *
  * On return, *initially_valid_subplans is assigned the set of indexes of
  * child subplans that must be initialized along with the parent plan node.
- * Initial pruning is performed here if needed and in that case only the
- * surviving subplans' indexes are added.
+ * Initial pruning is performed here if needed (unless it has already been done
+ * by ExecutorDoInitialPruning()), and in that case only the surviving
+ * subplans' indexes are added.
  *
  * If subplans are indeed pruned, subplan_map arrays contained in the returned
  * PartitionPruneState are re-sequenced to not count those, though only if the
@@ -1645,24 +1661,59 @@ ExecInitPartitionPruning(PlanState *planstate,
 	EState	   *estate = planstate->state;
 	PartitionPruneInfo *pruneinfo = list_nth(estate->es_part_prune_infos,
 											 part_prune_index);
+	PartitionPruneResult *pruneresult = estate->es_part_prune_result;
+	bool	do_pruning = (pruneinfo->needs_init_pruning ||
+						  pruneinfo->needs_exec_pruning);
 
-	/* We may need an expression context to evaluate partition exprs */
-	ExecAssignExprContext(estate, planstate);
+	/*
+	 * No need to do initial pruning if it was already done by
+	 * ExecutorDoInitialPruning(), which is the case if es_part_prune_result
+	 * has been set.
+	 */
+	if (pruneresult)
+		do_pruning = pruneinfo->needs_exec_pruning;
 
-	/* Create the working data structure for pruning */
-	prunestate = CreatePartitionPruneState(planstate, pruneinfo);
+	prunestate = NULL;
+	if (do_pruning)
+	{
+		/* We may need an expression context to evaluate partition exprs */
+		ExecAssignExprContext(estate, planstate);
+
+		/* For data reading, executor always omits detached partitions */
+		if (estate->es_partition_directory == NULL)
+			estate->es_partition_directory =
+				CreatePartitionDirectory(estate->es_query_cxt, false);
+
+		/*
+		 * Create the working data structure for pruning.  No need to consider
+		 * initial pruning steps if we have a PartitionPruneResult.
+		 */
+		prunestate = CreatePartitionPruneState(planstate, pruneinfo,
+											   pruneresult == NULL, true,
+											   NIL, planstate->ps_ExprContext,
+											   estate->es_partition_directory);
+	}
 
 	/*
 	 * Perform an initial partition prune pass, if required.
 	 */
-	if (prunestate->do_initial_prune)
-		*initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true);
+	if (pruneresult)
+	{
+		*initially_valid_subplans =
+			list_nth(pruneresult->valid_subplan_offs_list, part_prune_index);
+	}
+	else if (prunestate && prunestate->do_initial_prune)
+	{
+		*initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true,
+															 NULL);
+	}
 	else
 	{
 		/* No pruning, so we'll need to initialize all subplans */
 		Assert(n_total_subplans > 0);
 		*initially_valid_subplans = bms_add_range(NULL, 0,
 												  n_total_subplans - 1);
+		return prunestate;
 	}
 
 	/*
@@ -1670,7 +1721,8 @@ ExecInitPartitionPruning(PlanState *planstate,
 	 * that were removed above due to initial pruning.  No need to do this if
 	 * no steps were removed.
 	 */
-	if (bms_num_members(*initially_valid_subplans) < n_total_subplans)
+	if (prunestate &&
+		bms_num_members(*initially_valid_subplans) < n_total_subplans)
 	{
 		/*
 		 * We can safely skip this when !do_exec_prune, even though that
@@ -1686,11 +1738,73 @@ ExecInitPartitionPruning(PlanState *planstate,
 	return prunestate;
 }
 
+/*
+ * ExecPartitionDoInitialPruning
+ *		Perform initial pruning using given PartitionPruneInfo to determine
+ *		the minimal set of child subplans that will be executed and also the
+ *		set of RT indexes of the leaf partitions scanned by those subplans.
+ */
+Bitmapset *
+ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
+							  PartitionPruneInfo *pruneinfo,
+							  Bitmapset **scan_leafpart_rtis)
+{
+	List		 *rtable = plannedstmt->rtable;
+	ExprContext	 *econtext;
+	PartitionDirectory pdir;
+	MemoryContext oldcontext,
+				  tmpcontext;
+	PartitionPruneState *prunestate;
+	Bitmapset	 *valid_subplan_offs;
+
+	/*
+	 * A temporary context for memory allocations required while executing
+	 * partition pruning steps.
+	 */
+	tmpcontext = AllocSetContextCreate(CurrentMemoryContext,
+									   "initial pruning working data",
+									   ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(tmpcontext);
+
+	/*
+	 * PartitionDirectory to look up partition descriptors, which omits
+	 * detached partitions, just like in the executor proper.
+	 */
+	pdir = CreatePartitionDirectory(CurrentMemoryContext, false);
+
+	/*
+	 * We don't yet have a PlanState for the parent plan node, so we must
+	 * create a standalone ExprContext to evaluate pruning expressions,
+	 * equipped with the information about the EXTERN parameters that the
+	 * caller passed us.  Note that that's okay because the initial pruning
+	 * steps do not contain anything that requires the execution to have
+	 * started.
+	 */
+	econtext = CreateStandaloneExprContext();
+	econtext->ecxt_param_list_info = params;
+	prunestate = CreatePartitionPruneState(NULL, pruneinfo, true, false,
+										   rtable, econtext, pdir);
+	MemoryContextSwitchTo(oldcontext);
+
+	/* Do the initial pruning. */
+	valid_subplan_offs = ExecFindMatchingSubPlans(prunestate, true,
+												  scan_leafpart_rtis);
+
+	FreeExprContext(econtext, true);
+	DestroyPartitionDirectory(pdir);
+	MemoryContextDelete(tmpcontext);
+
+	return valid_subplan_offs;
+}
+
 /*
  * CreatePartitionPruneState
  *		Build the data structure required for calling ExecFindMatchingSubPlans
  *
- * 'planstate' is the parent plan node's execution state.
+ * 'planstate', if not NULL, is the parent plan node's execution state.  It
+ * can be NULL if being called before ExecutorStart(), in which case,
+ * 'rtable' (range table), 'econtext', and 'partdir' must be explicitly
+ * provided.
  *
  * 'pruneinfo' is a PartitionPruneInfo as generated by
  * make_partition_pruneinfo.  Here we build a PartitionPruneState containing a
@@ -1704,19 +1818,21 @@ ExecInitPartitionPruning(PlanState *planstate,
  * PartitionedRelPruneInfo.
  */
 static PartitionPruneState *
-CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
+CreatePartitionPruneState(PlanState *planstate,
+						  PartitionPruneInfo *pruneinfo,
+						  bool consider_initial_steps,
+						  bool consider_exec_steps,
+						  List *rtable, ExprContext *econtext,
+						  PartitionDirectory partdir)
 {
-	EState	   *estate = planstate->state;
+	EState	   *estate = planstate ? planstate->state : NULL;
 	PartitionPruneState *prunestate;
 	int			n_part_hierarchies;
 	ListCell   *lc;
 	int			i;
-	ExprContext *econtext = planstate->ps_ExprContext;
 
-	/* For data reading, executor always omits detached partitions */
-	if (estate->es_partition_directory == NULL)
-		estate->es_partition_directory =
-			CreatePartitionDirectory(estate->es_query_cxt, false);
+	Assert((estate != NULL) ||
+			(partdir != NULL && econtext != NULL && rtable != NIL));
 
 	n_part_hierarchies = list_length(pruneinfo->prune_infos);
 	Assert(n_part_hierarchies > 0);
@@ -1771,15 +1887,42 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 			PartitionKey partkey;
 
 			/*
-			 * We can rely on the copies of the partitioned table's partition
-			 * key and partition descriptor appearing in its relcache entry,
-			 * because that entry will be held open and locked for the
-			 * duration of this executor run.
+			 * We must open the relation ourselves when called before
+			 * execution has started, such as during
+			 * ExecutorDoInitialPruning() on a cached plan.  In that case,
+			 * sub-partitions must be locked, because AcquirePlannerLocks()
+			 * would not have seen them.  (The first relation in a
+			 * partrelpruneinfos list is always the root partitioned table
+			 * appearing in the query, which AcquirePlannerLocks() would have
+			 * locked; the Assert in relation_open() guards that assumption.)
+			 */
+			if (estate == NULL)
+			{
+				RangeTblEntry *rte = rt_fetch(pinfo->rtindex, rtable);
+				int		lockmode = (j == 0) ? NoLock : rte->rellockmode;
+
+				partrel = table_open(rte->relid, lockmode);
+			}
+			else
+				partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
+
+			/*
+			 * We can rely on the copy of the partitioned table's partition
+			 * key in its relcache entry, because it can't change (or get
+			 * destroyed) as long as the relation is locked.  The partition
+			 * descriptor is taken from the PartitionDirectory, which is
+			 * kept around long enough for the descriptor to remain valid
+			 * while it's used to perform the pruning steps.
 			 */
-			partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
 			partkey = RelationGetPartitionKey(partrel);
-			partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
-												partrel);
+			partdesc = PartitionDirectoryLookup(partdir, partrel);
+
+			/*
+			 * Must close partrel, keeping the lock taken, if we're not using
+			 * EState's entry.
+			 */
+			if (estate == NULL)
+				table_close(partrel, NoLock);
 
 			/*
 			 * Initialize the subplan_map and subpart_map.
@@ -1793,6 +1936,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 			Assert(partdesc->nparts >= pinfo->nparts);
 			pprune->nparts = partdesc->nparts;
 			pprune->subplan_map = palloc(sizeof(int) * partdesc->nparts);
+			pprune->rti_map = palloc(sizeof(Index) * partdesc->nparts);
 			if (partdesc->nparts == pinfo->nparts)
 			{
 				/*
@@ -1803,6 +1947,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 				pprune->subpart_map = pinfo->subpart_map;
 				memcpy(pprune->subplan_map, pinfo->subplan_map,
 					   sizeof(int) * pinfo->nparts);
+				memcpy(pprune->rti_map, pinfo->rti_map,
+					   sizeof(Index) * pinfo->nparts);
 
 				/*
 				 * Double-check that the list of unpruned relations has not
@@ -1853,6 +1999,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 							pinfo->subplan_map[pd_idx];
 						pprune->subpart_map[pp_idx] =
 							pinfo->subpart_map[pd_idx];
+						pprune->rti_map[pp_idx] =
+							pinfo->rti_map[pd_idx];
 						pd_idx++;
 					}
 					else
@@ -1860,6 +2008,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 						/* this partdesc entry is not in the plan */
 						pprune->subplan_map[pp_idx] = -1;
 						pprune->subpart_map[pp_idx] = -1;
+						pprune->rti_map[pp_idx] = 0;
 					}
 				}
 
@@ -1881,7 +2030,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 			 * Initialize pruning contexts as needed.
 			 */
 			pprune->initial_pruning_steps = pinfo->initial_pruning_steps;
-			if (pinfo->initial_pruning_steps)
+			if (consider_initial_steps && pinfo->initial_pruning_steps)
 			{
 				InitPartitionPruneContext(&pprune->initial_context,
 										  pinfo->initial_pruning_steps,
@@ -1891,7 +2040,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 				prunestate->do_initial_prune = true;
 			}
 			pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
-			if (pinfo->exec_pruning_steps)
+			if (consider_exec_steps && pinfo->exec_pruning_steps)
 			{
 				InitPartitionPruneContext(&pprune->exec_context,
 										  pinfo->exec_pruning_steps,
@@ -2119,10 +2268,14 @@ PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
  * Pass initial_prune if PARAM_EXEC Params cannot yet be evaluated.  This
  * differentiates the initial executor-time pruning step from later
  * runtime pruning.
+ *
+ * RT indexes of leaf partitions scanned by the chosen subplans are added to
+ * *scan_leafpart_rtis if the pointer is non-NULL.
  */
 Bitmapset *
 ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
-						 bool initial_prune)
+						 bool initial_prune,
+						 Bitmapset **scan_leafpart_rtis)
 {
 	Bitmapset  *result = NULL;
 	MemoryContext oldcontext;
@@ -2157,7 +2310,7 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
 		 */
 		pprune = &prunedata->partrelprunedata[0];
 		find_matching_subplans_recurse(prunedata, pprune, initial_prune,
-									   &result);
+									   &result, scan_leafpart_rtis);
 
 		/* Expression eval may have used space in ExprContext too */
 		if (pprune->exec_pruning_steps)
@@ -2171,6 +2324,8 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
 
 	/* Copy result out of the temp context before we reset it */
 	result = bms_copy(result);
+	if (scan_leafpart_rtis)
+		*scan_leafpart_rtis = bms_copy(*scan_leafpart_rtis);
 
 	MemoryContextReset(prunestate->prune_context);
 
@@ -2181,13 +2336,15 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
  * find_matching_subplans_recurse
  *		Recursive worker function for ExecFindMatchingSubPlans
  *
- * Adds valid (non-prunable) subplan IDs to *validsubplans
+ * Adds valid (non-prunable) subplan IDs to *validsubplans and RT indexes
+ * of the corresponding leaf partitions to *scan_leafpart_rtis (if asked for).
  */
 static void
 find_matching_subplans_recurse(PartitionPruningData *prunedata,
 							   PartitionedRelPruningData *pprune,
 							   bool initial_prune,
-							   Bitmapset **validsubplans)
+							   Bitmapset **validsubplans,
+							   Bitmapset **scan_leafpart_rtis)
 {
 	Bitmapset  *partset;
 	int			i;
@@ -2214,8 +2371,14 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
 	while ((i = bms_next_member(partset, i)) >= 0)
 	{
 		if (pprune->subplan_map[i] >= 0)
+		{
 			*validsubplans = bms_add_member(*validsubplans,
 											pprune->subplan_map[i]);
+			Assert(pprune->rti_map[i] > 0);
+			if (scan_leafpart_rtis)
+				*scan_leafpart_rtis = bms_add_member(*scan_leafpart_rtis,
+													 pprune->rti_map[i]);
+		}
 		else
 		{
 			int			partidx = pprune->subpart_map[i];
@@ -2223,7 +2386,8 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
 			if (partidx >= 0)
 				find_matching_subplans_recurse(prunedata,
 											   &prunedata->partrelprunedata[partidx],
-											   initial_prune, validsubplans);
+											   initial_prune, validsubplans,
+											   scan_leafpart_rtis);
 			else
 			{
 				/*
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index f9460ae506..a2182a6b1f 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -844,7 +844,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
 	else
 		dest = None_Receiver;
 
-	es->qd = CreateQueryDesc(es->stmt,
+	es->qd = CreateQueryDesc(es->stmt, NULL,
 							 fcache->src,
 							 GetActiveSnapshot(),
 							 InvalidSnapshot,
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index c6f86a6510..96880e122a 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -155,7 +155,8 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 		 * subplan, we can fill as_valid_subplans immediately, preventing
 		 * later calls to ExecFindMatchingSubPlans.
 		 */
-		if (!prunestate->do_exec_prune && nplans > 0)
+		if (appendstate->as_prune_state == NULL ||
+			(!appendstate->as_prune_state->do_exec_prune && nplans > 0))
 			appendstate->as_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
 	}
 	else
@@ -577,7 +578,7 @@ choose_next_subplan_locally(AppendState *node)
 		}
 		else if (node->as_valid_subplans == NULL)
 			node->as_valid_subplans =
-				ExecFindMatchingSubPlans(node->as_prune_state, false);
+				ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 
 		whichplan = -1;
 	}
@@ -642,7 +643,7 @@ choose_next_subplan_for_leader(AppendState *node)
 		if (node->as_valid_subplans == NULL)
 		{
 			node->as_valid_subplans =
-				ExecFindMatchingSubPlans(node->as_prune_state, false);
+				ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 
 			/*
 			 * Mark each invalid plan as finished to allow the loop below to
@@ -717,7 +718,7 @@ choose_next_subplan_for_worker(AppendState *node)
 	else if (node->as_valid_subplans == NULL)
 	{
 		node->as_valid_subplans =
-			ExecFindMatchingSubPlans(node->as_prune_state, false);
+			ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 		mark_invalid_subplans_as_finished(node);
 	}
 
@@ -868,7 +869,7 @@ ExecAppendAsyncBegin(AppendState *node)
 	if (node->as_valid_subplans == NULL)
 	{
 		node->as_valid_subplans =
-			ExecFindMatchingSubPlans(node->as_prune_state, false);
+			ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 
 		classify_matching_subplans(node);
 	}
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index 8d35860c30..2312e5a633 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -103,7 +103,8 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
 		 * subplan, we can fill ms_valid_subplans immediately, preventing
 		 * later calls to ExecFindMatchingSubPlans.
 		 */
-		if (!prunestate->do_exec_prune && nplans > 0)
+		if (mergestate->ms_prune_state == NULL ||
+			(!mergestate->ms_prune_state->do_exec_prune && nplans > 0))
 			mergestate->ms_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
 	}
 	else
@@ -218,7 +219,7 @@ ExecMergeAppend(PlanState *pstate)
 		 */
 		if (node->ms_valid_subplans == NULL)
 			node->ms_valid_subplans =
-				ExecFindMatchingSubPlans(node->ms_prune_state, false);
+				ExecFindMatchingSubPlans(node->ms_prune_state, false, NULL);
 
 		/*
 		 * First time through: pull the first tuple from each valid subplan,
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 29bc26669b..303a572c02 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1578,6 +1578,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 	CachedPlanSource *plansource;
 	CachedPlan *cplan;
 	List	   *stmt_list;
+	List	   *part_prune_result_list;
 	char	   *query_string;
 	Snapshot	snapshot;
 	MemoryContext oldcontext;
@@ -1657,7 +1658,10 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 	 */
 
 	/* Replan if needed, and increment plan refcount for portal */
-	cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv);
+	cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv,
+						  &part_prune_result_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_result_list));
 	stmt_list = cplan->stmt_list;
 
 	if (!plan->saved)
@@ -1685,6 +1689,9 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 					  stmt_list,
 					  cplan);
 
+	/* Copy PartitionPruneResults into the portal's context. */
+	PortalStorePartitionPruneResults(portal, part_prune_result_list);
+
 	/*
 	 * Set up options for portal.  Default SCROLL type is chosen the same way
 	 * as PerformCursorOpen does it.
@@ -2092,7 +2099,8 @@ SPI_plan_get_cached_plan(SPIPlanPtr plan)
 	/* Get the generic plan for the query */
 	cplan = GetCachedPlan(plansource, NULL,
 						  plan->saved ? CurrentResourceOwner : NULL,
-						  _SPI_current->queryEnv);
+						  _SPI_current->queryEnv,
+						  NULL /* Not interested in PartitionPruneResults */);
 	Assert(cplan == plansource->gplan);
 
 	/* Pop the error context stack */
@@ -2473,7 +2481,9 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 	{
 		CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
 		List	   *stmt_list;
-		ListCell   *lc2;
+		List	   *part_prune_result_list;
+		ListCell   *lc2,
+				   *lc3;
 
 		spicallbackarg.query = plansource->query_string;
 
@@ -2549,8 +2559,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 		 * plan, the refcount must be backed by the plan_owner.
 		 */
 		cplan = GetCachedPlan(plansource, options->params,
-							  plan_owner, _SPI_current->queryEnv);
-
+							  plan_owner, _SPI_current->queryEnv,
+							  &part_prune_result_list);
+		Assert(list_length(cplan->stmt_list) ==
+			   list_length(part_prune_result_list));
 		stmt_list = cplan->stmt_list;
 
 		/*
@@ -2589,9 +2601,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 			}
 		}
 
-		foreach(lc2, stmt_list)
+		forboth(lc2, stmt_list, lc3, part_prune_result_list)
 		{
 			PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+			PartitionPruneResult *part_prune_result = lfirst_node(PartitionPruneResult, lc3);
 			bool		canSetTag = stmt->canSetTag;
 			DestReceiver *dest;
 
@@ -2663,7 +2676,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 				else
 					snap = InvalidSnapshot;
 
-				qdesc = CreateQueryDesc(stmt,
+				qdesc = CreateQueryDesc(stmt, part_prune_result,
 										plansource->query_string,
 										snap, crosscheck_snapshot,
 										dest,
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index e76fda8eba..afd0332ddd 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -160,7 +160,6 @@ _copyExtensibleNode(const ExtensibleNode *from)
 	return newnode;
 }
 
-
 /*
  * copyObjectImpl -- implementation of copyObject(); see nodes/nodes.h
  *
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 81f6a9093c..84a195adca 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -294,7 +294,6 @@ outDatum(StringInfo str, Datum value, int typlen, bool typbyval)
 
 #include "outfuncs.funcs.c"
 
-
 /*
  * Support functions for nodes with custom_read_write attribute or
  * special_read_write attribute
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 1421686938..d57478bde9 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -158,6 +158,11 @@
 	token = pg_strtok(&length);		/* skip :fldname */ \
 	local_node->fldname = readIntCols(len)
 
+/* Read an Index array */
+#define READ_INDEX_ARRAY(fldname, len) \
+	token = pg_strtok(&length);		/* skip :fldname */ \
+	local_node->fldname = readIndexCols(len)
+
 /* Read a bool array */
 #define READ_BOOL_ARRAY(fldname, len) \
 	token = pg_strtok(&length);		/* skip :fldname */ \
@@ -623,6 +628,30 @@ readIntCols(int numCols)
 	return int_vals;
 }
 
+/*
+ * readIndexCols
+ */
+Index *
+readIndexCols(int numCols)
+{
+	int			tokenLength,
+				i;
+	const char *token;
+	Index	   *index_vals;
+
+	if (numCols <= 0)
+		return NULL;
+
+	index_vals = (Index *) palloc(numCols * sizeof(Index));
+	for (i = 0; i < numCols; i++)
+	{
+		token = pg_strtok(&tokenLength);
+		index_vals[i] = atoui(token);
+	}
+
+	return index_vals;
+}
+
 /*
  * readBoolCols
  */
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index b11249ed8f..7141035cc4 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -519,7 +519,9 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 	result->parallelModeNeeded = glob->parallelModeNeeded;
 	result->planTree = top_plan;
 	result->partPruneInfos = glob->partPruneInfos;
+	result->containsInitialPruning = glob->containsInitialPruning;
 	result->rtable = glob->finalrtable;
+	result->minLockRelids = glob->minLockRelids;
 	result->resultRelations = glob->resultRelations;
 	result->appendRelations = glob->appendRelations;
 	result->subplans = glob->subplans;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index b8d5610593..da749e331e 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -270,6 +270,16 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 	 */
 	add_rtes_to_flat_rtable(root, false);
 
+	/*
+	 * Add the query's adjusted range of RT indexes to glob->minLockRelids.
+	 * The adjusted RT indexes of prunable relations will be deleted from the
+	 * set below where PartitionPruneInfos are processed.
+	 */
+	glob->minLockRelids =
+		bms_add_range(glob->minLockRelids,
+					  rtoffset + 1,
+					  rtoffset + list_length(root->parse->rtable));
+
 	/*
 	 * Adjust RT indexes of PlanRowMarks and add to final rowmarks list
 	 */
@@ -352,6 +362,7 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 	foreach (lc, root->partPruneInfos)
 	{
 		PartitionPruneInfo *pruneinfo = lfirst(lc);
+		Bitmapset *leafpart_rtis = NULL;
 		ListCell  *l;
 
 		foreach(l, pruneinfo->prune_infos)
@@ -362,15 +373,50 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 			foreach(l2, prune_infos)
 			{
 				PartitionedRelPruneInfo *pinfo = lfirst(l2);
+				int		i;
 
 				/* RT index of the table to which the pinfo belongs. */
 				pinfo->rtindex += rtoffset;
+
+				/* Also of the leaf partitions that might be scanned. */
+				for (i = 0; i < pinfo->nparts; i++)
+				{
+					if (pinfo->rti_map[i] > 0 && pinfo->subplan_map[i] >= 0)
+					{
+						pinfo->rti_map[i] += rtoffset;
+						leafpart_rtis = bms_add_member(leafpart_rtis,
+													   pinfo->rti_map[i]);
+					}
+				}
 			}
 		}
 
+		if (pruneinfo->needs_init_pruning)
+		{
+			glob->containsInitialPruning = true;
+
+			/*
+			 * Delete the leaf partition RTIs from the global set of relations
+			 * to be locked before executing the plan.  AcquireExecutorLocks()
+			 * will find the ones to add to the set after performing initial
+			 * pruning.
+			 */
+			glob->minLockRelids = bms_del_members(glob->minLockRelids,
+												  leafpart_rtis);
+		}
+
 		glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
 	}
 
+	/*
+	 * It seems worth doing a bms_copy() on glob->minLockRelids if we deleted
+	 * bits from it above to get rid of any empty tail bits.  It seems better
+	 * for the loop over this set in AcquireExecutorLocks() to not have to go
+	 * through those useless bit words.
+	 */
+	if (glob->containsInitialPruning)
+		glob->minLockRelids = bms_copy(glob->minLockRelids);
+
 	return result;
 }
 
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index d77f7d3aef..952c5b8327 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -144,7 +144,9 @@ static List *make_partitionedrel_pruneinfo(PlannerInfo *root,
 										   List *prunequal,
 										   Bitmapset *partrelids,
 										   int *relid_subplan_map,
-										   Bitmapset **matchedsubplans);
+										   Bitmapset **matchedsubplans,
+										   bool *needs_init_pruning,
+										   bool *needs_exec_pruning);
 static void gen_partprune_steps(RelOptInfo *rel, List *clauses,
 								PartClauseTarget target,
 								GeneratePruningStepsContext *context);
@@ -234,6 +236,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	int		   *relid_subplan_map;
 	ListCell   *lc;
 	int			i;
+	bool		needs_init_pruning = false;
+	bool		needs_exec_pruning = false;
 
 	/*
 	 * Scan the subpaths to see which ones are scans of partition child
@@ -313,12 +317,16 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		Bitmapset  *partrelids = (Bitmapset *) lfirst(lc);
 		List	   *pinfolist;
 		Bitmapset  *matchedsubplans = NULL;
+		bool		partrel_needs_init_pruning;
+		bool		partrel_needs_exec_pruning;
 
 		pinfolist = make_partitionedrel_pruneinfo(root, parentrel,
 												  prunequal,
 												  partrelids,
 												  relid_subplan_map,
-												  &matchedsubplans);
+												  &matchedsubplans,
+												  &partrel_needs_init_pruning,
+												  &partrel_needs_exec_pruning);
 
 		/* When pruning is possible, record the matched subplans */
 		if (pinfolist != NIL)
@@ -327,6 +335,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 			allmatchedsubplans = bms_join(matchedsubplans,
 										  allmatchedsubplans);
 		}
+
+		needs_init_pruning |= partrel_needs_init_pruning;
+		needs_exec_pruning |= partrel_needs_exec_pruning;
 	}
 
 	pfree(relid_subplan_map);
@@ -341,6 +352,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	/* Else build the result data structure */
 	pruneinfo = makeNode(PartitionPruneInfo);
 	pruneinfo->prune_infos = prunerelinfos;
+	pruneinfo->needs_init_pruning = needs_init_pruning;
+	pruneinfo->needs_exec_pruning = needs_exec_pruning;
 
 	/*
 	 * Some subplans may not belong to any of the identified partitioned rels.
@@ -441,13 +454,18 @@ add_part_relids(List *allpartrelids, Bitmapset *partrelids)
  * If we cannot find any useful run-time pruning steps, return NIL.
  * However, on success, each rel identified in partrelids will have
  * an element in the result list, even if some of them are useless.
+ * *needs_init_pruning and *needs_exec_pruning are set to indicate that the
+ * returned PartitionedRelPruneInfos contain pruning steps that can be
+ * performed before and after execution begins, respectively.
  */
 static List *
 make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 							  List *prunequal,
 							  Bitmapset *partrelids,
 							  int *relid_subplan_map,
-							  Bitmapset **matchedsubplans)
+							  Bitmapset **matchedsubplans,
+							  bool *needs_init_pruning,
+							  bool *needs_exec_pruning)
 {
 	RelOptInfo *targetpart = NULL;
 	List	   *pinfolist = NIL;
@@ -458,6 +476,10 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	int			rti;
 	int			i;
 
+	/* Will find out below. */
+	*needs_init_pruning = false;
+	*needs_exec_pruning = false;
+
 	/*
 	 * Examine each partitioned rel, constructing a temporary array to map
 	 * from planner relids to index of the partitioned rel, and building a
@@ -545,6 +567,9 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		 * executor per-scan pruning steps.  This first pass creates startup
 		 * pruning steps and detects whether there's any possibly-useful quals
 		 * that would require per-scan pruning.
+		 *
+		 * The first pass also determines whether the 2nd pass is necessary,
+		 * by detecting the presence of EXEC parameters.
 		 */
 		gen_partprune_steps(subpart, partprunequal, PARTTARGET_INITIAL,
 							&context);
@@ -619,6 +644,12 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		pinfo->execparamids = execparamids;
 		/* Remaining fields will be filled in the next loop */
 
+		/* record which types of pruning steps we've seen so far */
+		if (initial_pruning_steps != NIL)
+			*needs_init_pruning = true;
+		if (exec_pruning_steps != NIL)
+			*needs_exec_pruning = true;
+
 		pinfolist = lappend(pinfolist, pinfo);
 	}
 
@@ -646,6 +677,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		int		   *subplan_map;
 		int		   *subpart_map;
 		Oid		   *relid_map;
+		Index	   *rti_map;
 
 		/*
 		 * Construct the subplan and subpart maps for this partitioning level.
@@ -658,6 +690,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		subpart_map = (int *) palloc(nparts * sizeof(int));
 		memset(subpart_map, -1, nparts * sizeof(int));
 		relid_map = (Oid *) palloc0(nparts * sizeof(Oid));
+		rti_map = (Index *) palloc0(nparts * sizeof(Index));
 		present_parts = NULL;
 
 		i = -1;
@@ -672,6 +705,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 			subplan_map[i] = subplanidx = relid_subplan_map[partrel->relid] - 1;
 			subpart_map[i] = subpartidx = relid_subpart_map[partrel->relid] - 1;
 			relid_map[i] = planner_rt_fetch(partrel->relid, root)->relid;
+			rti_map[i] = partrel->relid;
 			if (subplanidx >= 0)
 			{
 				present_parts = bms_add_member(present_parts, i);
@@ -696,6 +730,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		pinfo->subplan_map = subplan_map;
 		pinfo->subpart_map = subpart_map;
 		pinfo->relid_map = relid_map;
+		pinfo->rti_map = rti_map;
 	}
 
 	pfree(relid_subpart_map);
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 6f18b68856..16bda42f11 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1596,6 +1596,7 @@ exec_bind_message(StringInfo input_message)
 	int16	   *rformats = NULL;
 	CachedPlanSource *psrc;
 	CachedPlan *cplan;
+	List	   *part_prune_result_list;
 	Portal		portal;
 	char	   *query_string;
 	char	   *saved_stmt_name;
@@ -1971,7 +1972,9 @@ exec_bind_message(StringInfo input_message)
 	 * will be generated in MessageContext.  The plan refcount will be
 	 * assigned to the Portal, so it will be released at portal destruction.
 	 */
-	cplan = GetCachedPlan(psrc, params, NULL, NULL);
+	cplan = GetCachedPlan(psrc, params, NULL, NULL, &part_prune_result_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_result_list));
 
 	/*
 	 * Now we can define the portal.
@@ -1986,6 +1989,9 @@ exec_bind_message(StringInfo input_message)
 					  cplan->stmt_list,
 					  cplan);
 
+	/* Copy PartitionPruneResults into the portal's context. */
+	PortalStorePartitionPruneResults(portal, part_prune_result_list);
+
 	/* Done with the snapshot used for parameter I/O and parsing/planning */
 	if (snapshot_set)
 		PopActiveSnapshot();
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 5aa5a350f3..8cc2e2162d 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -35,7 +35,7 @@
 Portal		ActivePortal = NULL;
 
 
-static void ProcessQuery(PlannedStmt *plan,
+static void ProcessQuery(PlannedStmt *plan, PartitionPruneResult *part_prune_result,
 						 const char *sourceText,
 						 ParamListInfo params,
 						 QueryEnvironment *queryEnv,
@@ -65,6 +65,7 @@ static void DoPortalRewind(Portal portal);
  */
 QueryDesc *
 CreateQueryDesc(PlannedStmt *plannedstmt,
+				PartitionPruneResult *part_prune_result,
 				const char *sourceText,
 				Snapshot snapshot,
 				Snapshot crosscheck_snapshot,
@@ -77,6 +78,8 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
 
 	qd->operation = plannedstmt->commandType;	/* operation */
 	qd->plannedstmt = plannedstmt;	/* plan */
+	qd->part_prune_result = part_prune_result;	/* ExecutorDoInitialPruning()
+												 * output for plan */
 	qd->sourceText = sourceText;	/* query text */
 	qd->snapshot = RegisterSnapshot(snapshot);	/* snapshot */
 	/* RI check snapshot */
@@ -122,6 +125,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  *		PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
  *
  *	plan: the plan tree for the query
+ *	part_prune_result: ExecutorDoInitialPruning() output for the plan tree
  *	sourceText: the source text of the query
  *	params: any parameters needed
  *	dest: where to send results
@@ -134,6 +138,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  */
 static void
 ProcessQuery(PlannedStmt *plan,
+			 PartitionPruneResult *part_prune_result,
 			 const char *sourceText,
 			 ParamListInfo params,
 			 QueryEnvironment *queryEnv,
@@ -145,7 +150,7 @@ ProcessQuery(PlannedStmt *plan,
 	/*
 	 * Create the QueryDesc object
 	 */
-	queryDesc = CreateQueryDesc(plan, sourceText,
+	queryDesc = CreateQueryDesc(plan, part_prune_result, sourceText,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, params, queryEnv, 0);
 
@@ -491,8 +496,13 @@ PortalStart(Portal portal, ParamListInfo params,
 				/*
 				 * Create QueryDesc in portal's context; for the moment, set
 				 * the destination to DestNone.
+				 *
+				 * There is no PartitionPruneResult unless the PlannedStmt is
+				 * from a CachedPlan.
 				 */
 				queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+											portal->part_prune_results == NIL ? NULL :
+											linitial(portal->part_prune_results),
 											portal->sourceText,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
@@ -1225,6 +1235,8 @@ PortalRunMulti(Portal portal,
 
 		if (pstmt->utilityStmt == NULL)
 		{
+			PartitionPruneResult *part_prune_result = NULL;
+
 			/*
 			 * process a plannable query.
 			 */
@@ -1271,10 +1283,18 @@ PortalRunMulti(Portal portal,
 			else
 				UpdateActiveSnapshotCommandId();
 
+			/*
+			 * Determine if there's a corresponding PartitionPruneResult for
+			 * this PlannedStmt.
+			 */
+			if (portal->part_prune_results != NIL)
+				part_prune_result = list_nth(portal->part_prune_results,
+											 foreach_current_index(stmtlist_item));
+
 			if (pstmt->canSetTag)
 			{
 				/* statement can set tag string */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, part_prune_result,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
@@ -1283,7 +1303,7 @@ PortalRunMulti(Portal portal,
 			else
 			{
 				/* stmt added by rewrite cannot set tag */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, part_prune_result,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 0d6a295674..d1c9605979 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -99,14 +99,19 @@ static dlist_head cached_expression_list = DLIST_STATIC_INIT(cached_expression_l
 static void ReleaseGenericPlan(CachedPlanSource *plansource);
 static List *RevalidateCachedQuery(CachedPlanSource *plansource,
 								   QueryEnvironment *queryEnv);
-static bool CheckCachedPlan(CachedPlanSource *plansource);
+static bool CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+							List **part_prune_result_list);
 static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
-								   ParamListInfo boundParams, QueryEnvironment *queryEnv);
+								   ParamListInfo boundParams, QueryEnvironment *queryEnv,
+								   List **part_prune_result_list);
 static bool choose_custom_plan(CachedPlanSource *plansource,
 							   ParamListInfo boundParams);
 static double cached_plan_cost(CachedPlan *plan, bool include_planner);
 static Query *QueryListGetPrimaryStmt(List *stmts);
-static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static void AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+								 List **part_prune_result_list,
+								 List **lockedRelids_per_stmt);
+static void ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt);
 static void AcquirePlannerLocks(List *stmt_list, bool acquire);
 static void ScanQueryForLocks(Query *parsetree, bool acquire);
 static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -790,15 +795,20 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
  *
  * On a "true" return, we have acquired the locks needed to run the plan.
  * (We must do this for the "true" result to be race-condition-free.)
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_result_list.
  */
 static bool
-CheckCachedPlan(CachedPlanSource *plansource)
+CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+				List **part_prune_result_list)
 {
 	CachedPlan *plan = plansource->gplan;
 
 	/* Assert that caller checked the querytree */
 	Assert(plansource->is_valid);
 
+	*part_prune_result_list = NIL;
+
 	/* If there's no generic plan, just say "false" */
 	if (!plan)
 		return false;
@@ -820,13 +830,21 @@ CheckCachedPlan(CachedPlanSource *plansource)
 	 */
 	if (plan->is_valid)
 	{
+		List   *lockedRelids_per_stmt;
+
 		/*
 		 * Plan must have positive refcount because it is referenced by
 		 * plansource; so no need to fear it disappears under us here.
 		 */
 		Assert(plan->refcount > 0);
 
-		AcquireExecutorLocks(plan->stmt_list, true);
+		/*
+		 * Lock relations scanned by the plan.  This is also where initial
+		 * pruning is performed, if the plan needs it.
+		 */
+		AcquireExecutorLocks(plan->stmt_list, boundParams,
+							 part_prune_result_list,
+							 &lockedRelids_per_stmt);
 
 		/*
 		 * If plan was transient, check to see if TransactionXmin has
@@ -848,7 +866,14 @@ CheckCachedPlan(CachedPlanSource *plansource)
 		}
 
 		/* Oops, the race case happened.  Release useless locks. */
-		AcquireExecutorLocks(plan->stmt_list, false);
+		ReleaseExecutorLocks(plan->stmt_list, lockedRelids_per_stmt);
+
+		/*
+		 * The output list and any objects therein have been allocated in the
+		 * caller's hopefully short-lived context, so they will not stay leaked
+		 * for long; reset the pointer anyway so it isn't looked at by accident.
+		 */
+		*part_prune_result_list = NIL;
 	}
 
 	/*
@@ -874,10 +899,15 @@ CheckCachedPlan(CachedPlanSource *plansource)
  * Planning work is done in the caller's memory context.  The finished plan
  * is in a child memory context, which typically should get reparented
  * (unless this is a one-shot plan, in which case we don't copy the plan).
+ *
+ * A list of NULLs is returned in *part_prune_result_list, meaning that no
+ * PartitionPruneResult nodes have yet been created for the plans in
+ * stmt_list.
  */
 static CachedPlan *
 BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
-				ParamListInfo boundParams, QueryEnvironment *queryEnv)
+				ParamListInfo boundParams, QueryEnvironment *queryEnv,
+				List **part_prune_result_list)
 {
 	CachedPlan *plan;
 	List	   *plist;
@@ -1007,6 +1037,17 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 
 	MemoryContextSwitchTo(oldcxt);
 
+	/*
+	 * There are no PartitionPruneResults to add yet, but the list must
+	 * still be initialized to have the same number of elements as the list
+	 * of PlannedStmts.
+	 */
+	*part_prune_result_list = NIL;
+	foreach(lc, plist)
+	{
+		*part_prune_result_list = lappend(*part_prune_result_list, NULL);
+	}
+
 	return plan;
 }
 
@@ -1126,6 +1167,17 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
  * plan or a custom plan for the given parameters: the caller does not know
  * which it will get.
  *
+ * For every PlannedStmt found in the returned CachedPlan, an element that
+ * is either a PartitionPruneResult or NULL is added to
+ * *part_prune_result_list, if the caller passed one.  It is the former if
+ * the PlannedStmt comes from an existing, otherwise valid CachedPlan and
+ * contains at least one PartitionPruneInfo that has "initial" pruning steps.
+ * Those steps are performed by calling ExecutorDoInitialPruning(), so that
+ * AcquireExecutorLocks() locks only the leaf partitions whose subplans
+ * match the pruning conditions.  The PartitionPruneResult contains a list
+ * of bitmapsets of the indexes of matching subplans, one for each
+ * PartitionPruneInfo.
+ *
  * On return, the plan is valid and we have sufficient locks to begin
  * execution.
  *
@@ -1139,11 +1191,13 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
  */
 CachedPlan *
 GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
-			  ResourceOwner owner, QueryEnvironment *queryEnv)
+			  ResourceOwner owner, QueryEnvironment *queryEnv,
+			  List **part_prune_result_list)
 {
 	CachedPlan *plan = NULL;
 	List	   *qlist;
 	bool		customplan;
+	List	   *my_part_prune_result_list;
 
 	/* Assert caller is doing things in a sane order */
 	Assert(plansource->magic == CACHEDPLANSOURCE_MAGIC);
@@ -1160,7 +1214,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 
 	if (!customplan)
 	{
-		if (CheckCachedPlan(plansource))
+		if (CheckCachedPlan(plansource, boundParams,
+							&my_part_prune_result_list))
 		{
 			/* We want a generic plan, and we already have a valid one */
 			plan = plansource->gplan;
@@ -1169,7 +1224,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 		else
 		{
 			/* Build a new generic plan */
-			plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv);
+			plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv,
+								   &my_part_prune_result_list);
 			/* Just make real sure plansource->gplan is clear */
 			ReleaseGenericPlan(plansource);
 			/* Link the new generic plan into the plansource */
@@ -1214,7 +1270,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 	if (customplan)
 	{
 		/* Build a custom plan */
-		plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv);
+		plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv,
+							   &my_part_prune_result_list);
 		/* Accumulate total costs of custom plans */
 		plansource->total_custom_cost += cached_plan_cost(plan, true);
 
@@ -1246,6 +1303,9 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 		plan->is_saved = true;
 	}
 
+	if (part_prune_result_list)
+		*part_prune_result_list = my_part_prune_result_list;
+
 	return plan;
 }
 
@@ -1737,17 +1797,29 @@ QueryListGetPrimaryStmt(List *stmts)
 
 /*
- * AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
- * or release them if acquire is false.
+ * AcquireExecutorLocks: acquire locks needed for execution of a cached plan.
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_result_list.
+ *
+ * On return, *lockedRelids_per_stmt will contain a bitmapset for every
+ * PlannedStmt in stmt_list, containing the RT indexes of relation entries
+ * in its range table that were actually locked, or NULL if the PlannedStmt
+ * contains a utility statement.
  */
 static void
-AcquireExecutorLocks(List *stmt_list, bool acquire)
+AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+					 List **part_prune_result_list,
+					 List **lockedRelids_per_stmt)
 {
 	ListCell   *lc1;
 
+	*part_prune_result_list = *lockedRelids_per_stmt = NIL;
 	foreach(lc1, stmt_list)
 	{
 		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
-		ListCell   *lc2;
+		PartitionPruneResult *part_prune_result = NULL;
+		Bitmapset  *allLockRelids;
+		Bitmapset  *lockedRelids = NULL;
+		int			rti;
 
 		if (plannedstmt->commandType == CMD_UTILITY)
 		{
@@ -1761,13 +1833,38 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 			Query	   *query = UtilityContainsQuery(plannedstmt->utilityStmt);
 
 			if (query)
-				ScanQueryForLocks(query, acquire);
+				ScanQueryForLocks(query, true);
+			*part_prune_result_list = lappend(*part_prune_result_list, NULL);
 			continue;
 		}
 
-		foreach(lc2, plannedstmt->rtable)
+		/*
+		 * Figure out the set of relations that would need to be locked
+		 * before executing the plan.
+		 */
+		if (plannedstmt->containsInitialPruning)
 		{
-			RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
+			/*
+			 * Obtain the set of leaf partitions to be locked.
+			 *
+			 * The following does initial partition pruning using the
+			 * PartitionPruneInfos found in plannedstmt->partPruneInfos and
+			 * finds leaf partitions that survive that pruning across all the
+			 * nodes in the plan tree.
+			 */
+			part_prune_result =
+				ExecutorDoInitialPruning(plannedstmt, boundParams);
+
+			allLockRelids = bms_union(plannedstmt->minLockRelids,
+									  part_prune_result->scan_leafpart_rtis);
+		}
+		else
+			allLockRelids = plannedstmt->minLockRelids;
+
+		rti = -1;
+		while ((rti = bms_next_member(allLockRelids, rti)) > 0)
+		{
+			RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
 
 			if (rte->rtekind != RTE_RELATION)
 				continue;
@@ -1778,10 +1875,58 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 			 * fail if it's been dropped entirely --- we'll just transiently
 			 * acquire a non-conflicting lock.
 			 */
-			if (acquire)
-				LockRelationOid(rte->relid, rte->rellockmode);
-			else
-				UnlockRelationOid(rte->relid, rte->rellockmode);
+			LockRelationOid(rte->relid, rte->rellockmode);
+			lockedRelids = bms_add_member(lockedRelids, rti);
+		}
+
+		*part_prune_result_list = lappend(*part_prune_result_list,
+										  part_prune_result);
+		*lockedRelids_per_stmt = lappend(*lockedRelids_per_stmt, lockedRelids);
+	}
+}
+
+/*
+ * ReleaseExecutorLocks
+ * 		Release locks acquired by an earlier call to
+ * 		AcquireExecutorLocks()
+ */
+static void
+ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt)
+{
+	ListCell   *lc1,
+			   *lc2;
+
+	forboth(lc1, stmt_list, lc2, lockedRelids_per_stmt)
+	{
+		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
+		Bitmapset  *lockedRelids = lfirst(lc2);
+		int			rti;
+
+		if (plannedstmt->commandType == CMD_UTILITY)
+		{
+			/*
+			 * Ignore utility statements, except those (such as EXPLAIN) that
+			 * contain a parsed-but-not-planned query.  Note: it's okay to use
+			 * ScanQueryForLocks, even though the query hasn't been through
+			 * rule rewriting, because rewriting doesn't change the query
+			 * representation.
+			 */
+			Query	   *query = UtilityContainsQuery(plannedstmt->utilityStmt);
+
+			if (query)
+				ScanQueryForLocks(query, false);
+			continue;
+		}
+
+		rti = -1;
+		while ((rti = bms_next_member(lockedRelids, rti)) >= 0)
+		{
+			RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
+
+			Assert(rte->rtekind == RTE_RELATION);
+
+			/* See the comment in AcquireExecutorLocks(). */
+			UnlockRelationOid(rte->relid, rte->rellockmode);
 		}
 	}
 }
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index d549f66d4a..1bbe6b704b 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -303,6 +303,25 @@ PortalDefineQuery(Portal portal,
 	portal->status = PORTAL_DEFINED;
 }
 
+/*
+ * PortalStorePartitionPruneResults
+ *		Copy the given list of PartitionPruneResults into the portal's
+ *		context
+ *
+ * This allows the caller to ensure that the list exists as long as the portal
+ * does.
+ */
+void
+PortalStorePartitionPruneResults(Portal portal, List *part_prune_results)
+{
+	MemoryContext	oldcxt;
+
+	AssertArg(PortalIsValid(portal));
+	oldcxt = MemoryContextSwitchTo(portal->portalContext);
+	portal->part_prune_results = copyObject(part_prune_results);
+	MemoryContextSwitchTo(oldcxt);
+}
+
 /*
  * PortalReleaseCachedPlan
  *		Release a portal's reference to its cached plan, if any.
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 9ebde089ae..e57e133f0e 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -87,7 +87,9 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
 							  ExplainState *es, const char *queryString,
 							  ParamListInfo params, QueryEnvironment *queryEnv);
 
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt,
+						   PartitionPruneResult *part_prune_result,
+						   IntoClause *into,
 						   ExplainState *es, const char *queryString,
 						   ParamListInfo params, QueryEnvironment *queryEnv,
 						   const instr_time *planduration,
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index bf962af7af..bd8776402e 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -45,6 +45,7 @@ extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
  * nparts						Length of subplan_map[] and subpart_map[].
  * subplan_map					Subplan index by partition index, or -1.
  * subpart_map					Subpart index by partition index, or -1.
+ * rti_map						Range table index by partition index, or 0.
  * present_parts				A Bitmapset of the partition indexes that we
  *								have subplans or subparts for.
  * initial_pruning_steps		List of PartitionPruneSteps used to
@@ -61,6 +62,7 @@ typedef struct PartitionedRelPruningData
 	int			nparts;
 	int		   *subplan_map;
 	int		   *subpart_map;
+	Index	   *rti_map;
 	Bitmapset  *present_parts;
 	List	   *initial_pruning_steps;
 	List	   *exec_pruning_steps;
@@ -126,5 +128,10 @@ extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
 													 int part_prune_index,
 													 Bitmapset **initially_valid_subplans);
 extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
-										   bool initial_prune);
+										   bool initial_prune,
+										   Bitmapset **scan_leafpart_rtis);
+extern Bitmapset *ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt,
+								ParamListInfo params,
+								PartitionPruneInfo *pruneinfo,
+								Bitmapset **scan_leafpart_rtis);
 #endif							/* EXECPARTITION_H */
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index e79e2c001f..60d5644908 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -35,6 +35,8 @@ typedef struct QueryDesc
 	/* These fields are provided by CreateQueryDesc */
 	CmdType		operation;		/* CMD_SELECT, CMD_UPDATE, etc. */
 	PlannedStmt *plannedstmt;	/* planner's output (could be utility, too) */
+	PartitionPruneResult *part_prune_result; /* ExecutorDoInitialPruning()'s
+											  * output for plannedstmt */
 	const char *sourceText;		/* source text of the query */
 	Snapshot	snapshot;		/* snapshot to use for query */
 	Snapshot	crosscheck_snapshot;	/* crosscheck for RI update/delete */
@@ -57,6 +59,7 @@ typedef struct QueryDesc
 
 /* in pquery.c */
 extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
+								  PartitionPruneResult *part_prune_result,
 								  const char *sourceText,
 								  Snapshot snapshot,
 								  Snapshot crosscheck_snapshot,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index d68a6b9d28..5c4a282be0 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -185,6 +185,8 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
 /*
  * prototypes from functions in execMain.c
  */
+extern PartitionPruneResult *ExecutorDoInitialPruning(PlannedStmt *plannedstmt,
+													  ParamListInfo params);
 extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
 extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
 extern void ExecutorRun(QueryDesc *queryDesc,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 63a89474db..12ea06c2f6 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -1001,6 +1001,33 @@ typedef struct DomainConstraintState
  */
 typedef TupleTableSlot *(*ExecProcNodeMtd) (struct PlanState *pstate);
 
+/*----------------
+ * PartitionPruneResult
+ *
+ * The result of invoking ExecutorDoInitialPruning() on a given
+ * PlannedStmt.
+ *
+ * Contains a list of Bitmapsets of the indexes of the subplans remaining after
+ * performing initial pruning by calling ExecFindMatchingSubPlans() for every
+ * PartitionPruneInfo found in PlannedStmt.partPruneInfos.  RT indexes of the
+ * leaf partitions scanned by those subplans across all PartitionPruneInfos
+ * are added into scan_leafpart_rtis.
+ *
+ * This is used by GetCachedPlan() to inform its callers of the pruning
+ * decisions made when performing AcquireExecutorLocks() on a given cached
+ * PlannedStmt, which the callers then pass on to the executor.  The executor
+ * refers to this node when initializing the plan nodes which contain subplans
+ * that may have been pruned by ExecutorDoInitialPruning(), rather than
+ * redoing initial pruning.
+ */
+typedef struct PartitionPruneResult
+{
+	NodeTag		type;
+
+	List		   *valid_subplan_offs_list;
+	Bitmapset	   *scan_leafpart_rtis;
+} PartitionPruneResult;
+
 /* ----------------
  *		PlanState node
  *
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index cdd6debfa0..b33d9e426d 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -212,6 +212,7 @@ extern struct Bitmapset *readBitmapset(void);
 extern uintptr_t readDatum(bool typbyval);
 extern bool *readBoolCols(int numCols);
 extern int *readIntCols(int numCols);
+extern Index *readIndexCols(int numCols);
 extern Oid *readOidCols(int numCols);
 extern int16 *readAttrNumberCols(int numCols);
 
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index d87957ff6c..7957aeb6d7 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -125,6 +125,19 @@ typedef struct PlannerGlobal
 	/* List of PartitionPruneInfo contained in the plan */
 	List	   *partPruneInfos;
 
+	/*
+	 * Do any of those PartitionPruneInfos have initial (pre-exec) pruning
+	 * steps in them?
+	 */
+	bool		containsInitialPruning;
+
+	/*
+	 * Indexes of all range table entries minus indexes of range table entries
+	 * of the leaf partitions scanned by prunable subplans; see
+	 * AcquireExecutorLocks()
+	 */
+	Bitmapset  *minLockRelids;
+
 	/* OIDs of relations the plan depends on */
 	List	   *relationOids;
 
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index f2daabb3b7..1d2c0d9bdf 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -72,8 +72,17 @@ typedef struct PlannedStmt
 	List	   *partPruneInfos;	/* List of PartitionPruneInfo contained in
 								 * the plan */
 
+	bool		containsInitialPruning;	/* Do any of those PartitionPruneInfos
+										 * have initial (pre-exec) pruning
+										 * steps in them? */
+
 	List	   *rtable;			/* list of RangeTblEntry nodes */
 
+	Bitmapset  *minLockRelids;	/* Indexes of all range table entries minus
+								 * indexes of range table entries of the leaf
+								 * partitions scanned by prunable subplans;
+								 * see AcquireExecutorLocks() */
+
 	/* rtable indexes of target relations for INSERT/UPDATE/DELETE */
 	List	   *resultRelations;	/* integer list of RT indexes, or NIL */
 
@@ -1409,6 +1418,13 @@ typedef struct PlanRowMark
  * prune_infos			List of Lists containing PartitionedRelPruneInfo nodes,
  *						one sublist per run-time-prunable partition hierarchy
  *						appearing in the parent plan node's subplans.
+ *
+ * needs_init_pruning	Does any of the PartitionedRelPruneInfos in
+ *						prune_infos have its initial_pruning_steps set?
+ *
+ * needs_exec_pruning	Does any of the PartitionedRelPruneInfos in
+ *						prune_infos have its exec_pruning_steps set?
+ *
  * other_subplans		Indexes of any subplans that are not accounted for
  *						by any of the PartitionedRelPruneInfo nodes in
  *						"prune_infos".  These subplans must not be pruned.
@@ -1419,6 +1435,8 @@ typedef struct PartitionPruneInfo
 
 	NodeTag		type;
 	List	   *prune_infos;
+	bool		needs_init_pruning;
+	bool		needs_exec_pruning;
 	Bitmapset  *other_subplans;
 } PartitionPruneInfo;
 
@@ -1463,6 +1481,9 @@ typedef struct PartitionedRelPruneInfo
 	/* relation OID by partition index, or 0 */
 	Oid		   *relid_map pg_node_attr(array_size(nparts));
 
+	/* Range table index by partition index, or 0. */
+	Index	   *rti_map pg_node_attr(array_size(nparts));
+
 	/*
 	 * initial_pruning_steps shows how to prune during executor startup (i.e.,
 	 * without use of any PARAM_EXEC Params); it is NIL if no startup pruning
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 0499635f59..1c5bb5ece1 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -220,7 +220,8 @@ extern List *CachedPlanGetTargetList(CachedPlanSource *plansource,
 extern CachedPlan *GetCachedPlan(CachedPlanSource *plansource,
 								 ParamListInfo boundParams,
 								 ResourceOwner owner,
-								 QueryEnvironment *queryEnv);
+								 QueryEnvironment *queryEnv,
+								 List **part_prune_result_list);
 extern void ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner);
 
 extern bool CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index aeddbdafe5..9f7727a837 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -138,6 +138,7 @@ typedef struct PortalData
 	QueryCompletion qc;			/* command completion data for executed query */
 	List	   *stmts;			/* list of PlannedStmts */
 	CachedPlan *cplan;			/* CachedPlan, if stmts are from one */
+	List	   *part_prune_results;	/* list of PartitionPruneResults */
 
 	ParamListInfo portalParams; /* params to pass to query */
 	QueryEnvironment *queryEnv; /* environment for query */
@@ -242,6 +243,8 @@ extern void PortalDefineQuery(Portal portal,
 							  CommandTag commandTag,
 							  List *stmts,
 							  CachedPlan *cplan);
+extern void PortalStorePartitionPruneResults(Portal portal,
+											 List *part_prune_result_list);
 extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
 extern void PortalCreateHoldStore(Portal portal);
 extern void PortalHashTableDeleteAll(void);
-- 
2.35.3
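
As a rough illustration of the minLockRelids comment in the patch above (all range table indexes minus those of leaf partitions scanned by prunable subplans), here is a minimal standalone sketch.  It uses a plain 64-bit mask rather than PostgreSQL's Bitmapset, and RelidSetSketch and the sketch_* functions are hypothetical names, not part of the patch.  The final union with the partitions that survive initial pruning is an assumption about how AcquireExecutorLocks() would combine the two sets, based on the description at the top of this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in: bit i set means range table index i is a member. */
typedef uint64_t RelidSetSketch;

/* Add one range table index (0..63) to the set. */
static RelidSetSketch
sketch_add_member(RelidSetSketch set, int rti)
{
	assert(rti >= 0 && rti < 64);
	return set | (UINT64_C(1) << rti);
}

/*
 * Plan-time step: every range table index except those of leaf partitions
 * that sit under subplans subject to initial pruning.
 */
static RelidSetSketch
sketch_min_lock_relids(RelidSetSketch all_relids,
					   RelidSetSketch prunable_leaf_relids)
{
	return all_relids & ~prunable_leaf_relids;
}

/*
 * Lock-time step (assumed): the always-locked set plus whichever prunable
 * leaf partitions survived initial pruning for the current parameters.
 */
static RelidSetSketch
sketch_relids_to_lock(RelidSetSketch min_lock_relids,
					  RelidSetSketch surviving_leaf_relids)
{
	return min_lock_relids | surviving_leaf_relids;
}
```

With five range table entries of which 3, 4 and 5 are prunable leaf partitions, the minimal set would be {1, 2}; if only partition 4 survives initial pruning, the set to lock becomes {1, 2, 4}.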



  [application/octet-stream] v18-0001-Move-PartitioPruneInfo-out-of-plan-nodes-into-Pl.patch (17.8K, 3-v18-0001-Move-PartitioPruneInfo-out-of-plan-nodes-into-Pl.patch)
From 571424d7f1d5cb8b3ee59853649d35731b033b03 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Fri, 27 May 2022 16:00:28 +0900
Subject: [PATCH v18 1/2] Move PartitionPruneInfo out of plan nodes into
 PlannedStmt

The planner will now add a given PartitionPruneInfo to
PlannedStmt.partPruneInfos instead of directly to the
Append/MergeAppend plan node.  What gets set instead in the
latter is an index field which points to the list element
of PlannedStmt.partPruneInfos containing the PartitionPruneInfo
belonging to the plan node.

A later commit will make AcquireExecutorLocks() do the initial
partition pruning to determine a minimal set of partitions to be
locked when validating a plan tree, and it will need to consult the
PartitionPruneInfos referenced therein to do so.  That is easier if
the PartitionPruneInfos can be reached by simply iterating over
PlannedStmt.partPruneInfos rather than by walking the plan tree to
find them.
---
 src/backend/executor/execMain.c         |  1 +
 src/backend/executor/execParallel.c     |  1 +
 src/backend/executor/execPartition.c    |  4 +-
 src/backend/executor/execUtils.c        |  2 +
 src/backend/executor/nodeAppend.c       |  4 +-
 src/backend/executor/nodeMergeAppend.c  |  4 +-
 src/backend/nodes/outfuncs.c            |  1 -
 src/backend/optimizer/plan/createplan.c | 24 ++++-----
 src/backend/optimizer/plan/planner.c    |  1 +
 src/backend/optimizer/plan/setrefs.c    | 65 +++++++++++++------------
 src/backend/partitioning/partprune.c    | 18 ++++---
 src/include/executor/execPartition.h    |  3 +-
 src/include/nodes/execnodes.h           |  2 +
 src/include/nodes/pathnodes.h           |  6 +++
 src/include/nodes/plannodes.h           | 11 +++--
 src/include/partitioning/partprune.h    |  8 +--
 16 files changed, 92 insertions(+), 63 deletions(-)

diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index ef2fd46092..72fc273524 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -825,6 +825,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 	ExecInitRangeTable(estate, rangeTable);
 
 	estate->es_plannedstmt = plannedstmt;
+	estate->es_part_prune_infos = plannedstmt->partPruneInfos;
 
 	/*
 	 * Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index f1fd7f7e8b..f73b8c2607 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -183,6 +183,7 @@ ExecSerializePlan(Plan *plan, EState *estate)
 	pstmt->dependsOnRole = false;
 	pstmt->parallelModeNeeded = false;
 	pstmt->planTree = plan;
+	pstmt->partPruneInfos = estate->es_part_prune_infos;
 	pstmt->rtable = estate->es_range_table;
 	pstmt->resultRelations = NIL;
 	pstmt->appendRelations = NIL;
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index e03ea27299..b55cdd2580 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -1638,11 +1638,13 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
 PartitionPruneState *
 ExecInitPartitionPruning(PlanState *planstate,
 						 int n_total_subplans,
-						 PartitionPruneInfo *pruneinfo,
+						 int part_prune_index,
 						 Bitmapset **initially_valid_subplans)
 {
 	PartitionPruneState *prunestate;
 	EState	   *estate = planstate->state;
+	PartitionPruneInfo *pruneinfo = list_nth(estate->es_part_prune_infos,
+											 part_prune_index);
 
 	/* We may need an expression context to evaluate partition exprs */
 	ExecAssignExprContext(estate, planstate);
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 9df1f81ea8..f9c7976ff2 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -119,6 +119,8 @@ CreateExecutorState(void)
 	estate->es_relations = NULL;
 	estate->es_rowmarks = NULL;
 	estate->es_plannedstmt = NULL;
+	estate->es_part_prune_infos = NIL;
+	estate->es_part_prune_result = NULL;
 
 	estate->es_junkFilter = NULL;
 
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 357e10a1d7..c6f86a6510 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -134,7 +134,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 	appendstate->as_begun = false;
 
 	/* If run-time partition pruning is enabled, then set that up now */
-	if (node->part_prune_info != NULL)
+	if (node->part_prune_index >= 0)
 	{
 		PartitionPruneState *prunestate;
 
@@ -145,7 +145,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 		 */
 		prunestate = ExecInitPartitionPruning(&appendstate->ps,
 											  list_length(node->appendplans),
-											  node->part_prune_info,
+											  node->part_prune_index,
 											  &validsubplans);
 		appendstate->as_prune_state = prunestate;
 		nplans = bms_num_members(validsubplans);
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index c5c62fa5c7..8d35860c30 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -82,7 +82,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
 	mergestate->ps.ExecProcNode = ExecMergeAppend;
 
 	/* If run-time partition pruning is enabled, then set that up now */
-	if (node->part_prune_info != NULL)
+	if (node->part_prune_index >= 0)
 	{
 		PartitionPruneState *prunestate;
 
@@ -93,7 +93,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
 		 */
 		prunestate = ExecInitPartitionPruning(&mergestate->ps,
 											  list_length(node->mergeplans),
-											  node->part_prune_info,
+											  node->part_prune_index,
 											  &validsubplans);
 		mergestate->ms_prune_state = prunestate;
 		nplans = bms_num_members(validsubplans);
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 4d776e7b51..81f6a9093c 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -299,7 +299,6 @@ outDatum(StringInfo str, Datum value, int typlen, bool typbyval)
  * Support functions for nodes with custom_read_write attribute or
  * special_read_write attribute
  */
-
 static void
 _outConst(StringInfo str, const Const *node)
 {
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index 76606faa3e..58a05cf673 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -1203,7 +1203,6 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
 	ListCell   *subpaths;
 	int			nasyncplans = 0;
 	RelOptInfo *rel = best_path->path.parent;
-	PartitionPruneInfo *partpruneinfo = NULL;
 	int			nodenumsortkeys = 0;
 	AttrNumber *nodeSortColIdx = NULL;
 	Oid		   *nodeSortOperators = NULL;
@@ -1354,6 +1353,9 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
 		subplans = lappend(subplans, subplan);
 	}
 
+	/* Set below if we find quals that we can use to run-time prune */
+	plan->part_prune_index = -1;
+
 	/*
 	 * If any quals exist, they may be useful to perform further partition
 	 * pruning during execution.  Gather information needed by the executor to
@@ -1377,16 +1379,14 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
 		}
 
 		if (prunequal != NIL)
-			partpruneinfo =
-				make_partition_pruneinfo(root, rel,
-										 best_path->subpaths,
-										 prunequal);
+			plan->part_prune_index = make_partition_pruneinfo(root, rel,
+															  best_path->subpaths,
+															  prunequal);
 	}
 
 	plan->appendplans = subplans;
 	plan->nasyncplans = nasyncplans;
 	plan->first_partial_plan = best_path->first_partial_path;
-	plan->part_prune_info = partpruneinfo;
 
 	copy_generic_path_info(&plan->plan, (Path *) best_path);
 
@@ -1426,7 +1426,6 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
 	List	   *subplans = NIL;
 	ListCell   *subpaths;
 	RelOptInfo *rel = best_path->path.parent;
-	PartitionPruneInfo *partpruneinfo = NULL;
 
 	/*
 	 * We don't have the actual creation of the MergeAppend node split out
@@ -1519,6 +1518,9 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
 		subplans = lappend(subplans, subplan);
 	}
 
+	/* Set below if we find quals that we can use to run-time prune */
+	node->part_prune_index = -1;
+
 	/*
 	 * If any quals exist, they may be useful to perform further partition
 	 * pruning during execution.  Gather information needed by the executor to
@@ -1542,13 +1544,13 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
 		}
 
 		if (prunequal != NIL)
-			partpruneinfo = make_partition_pruneinfo(root, rel,
-													 best_path->subpaths,
-													 prunequal);
+			node->part_prune_index = make_partition_pruneinfo(root, rel,
+															  best_path->subpaths,
+															  prunequal);
 	}
 
 	node->mergeplans = subplans;
-	node->part_prune_info = partpruneinfo;
+
 
 	/*
 	 * If prepare_sort_from_pathkeys added sort columns, but we were told to
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 06ad856eac..b11249ed8f 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -518,6 +518,7 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 	result->dependsOnRole = glob->dependsOnRole;
 	result->parallelModeNeeded = glob->parallelModeNeeded;
 	result->planTree = top_plan;
+	result->partPruneInfos = glob->partPruneInfos;
 	result->rtable = glob->finalrtable;
 	result->resultRelations = glob->resultRelations;
 	result->appendRelations = glob->appendRelations;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 9cef92cab2..b8d5610593 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -348,6 +348,29 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 		}
 	}
 
+	/* Also fix up the information in PartitionPruneInfos. */
+	foreach (lc, root->partPruneInfos)
+	{
+		PartitionPruneInfo *pruneinfo = lfirst(lc);
+		ListCell  *l;
+
+		foreach(l, pruneinfo->prune_infos)
+		{
+			List	   *prune_infos = lfirst(l);
+			ListCell   *l2;
+
+			foreach(l2, prune_infos)
+			{
+				PartitionedRelPruneInfo *pinfo = lfirst(l2);
+
+				/* RT index of the table to which the pinfo belongs. */
+				pinfo->rtindex += rtoffset;
+			}
+		}
+
+		glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
+	}
+
 	return result;
 }
 
@@ -1655,21 +1678,12 @@ set_append_references(PlannerInfo *root,
 
 	aplan->apprelids = offset_relid_set(aplan->apprelids, rtoffset);
 
-	if (aplan->part_prune_info)
-	{
-		foreach(l, aplan->part_prune_info->prune_infos)
-		{
-			List	   *prune_infos = lfirst(l);
-			ListCell   *l2;
-
-			foreach(l2, prune_infos)
-			{
-				PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
-				pinfo->rtindex += rtoffset;
-			}
-		}
-	}
+	/*
+	 * PartitionPruneInfos will be added to a list in PlannerGlobal, so update
+	 * the index.
+	 */
+	if (aplan->part_prune_index >= 0)
+		aplan->part_prune_index += list_length(root->glob->partPruneInfos);
 
 	/* We don't need to recurse to lefttree or righttree ... */
 	Assert(aplan->plan.lefttree == NULL);
@@ -1727,21 +1741,12 @@ set_mergeappend_references(PlannerInfo *root,
 
 	mplan->apprelids = offset_relid_set(mplan->apprelids, rtoffset);
 
-	if (mplan->part_prune_info)
-	{
-		foreach(l, mplan->part_prune_info->prune_infos)
-		{
-			List	   *prune_infos = lfirst(l);
-			ListCell   *l2;
-
-			foreach(l2, prune_infos)
-			{
-				PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
-				pinfo->rtindex += rtoffset;
-			}
-		}
-	}
+	/*
+	 * PartitionPruneInfos will be added to a list in PlannerGlobal, so update
+	 * the index.
+	 */
+	if (mplan->part_prune_index >= 0)
+		mplan->part_prune_index += list_length(root->glob->partPruneInfos);
 
 	/* We don't need to recurse to lefttree or righttree ... */
 	Assert(mplan->plan.lefttree == NULL);
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 9d3c05aed3..d77f7d3aef 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -209,16 +209,20 @@ static void partkey_datum_from_expr(PartitionPruneContext *context,
 
 /*
  * make_partition_pruneinfo
- *		Builds a PartitionPruneInfo which can be used in the executor to allow
- *		additional partition pruning to take place.  Returns NULL when
- *		partition pruning would be useless.
+ *		Checks if the given set of quals can be used to build pruning steps
+ *		that the executor can use to prune away unneeded partitions.  If
+ *		suitable quals are found then a PartitionPruneInfo is built and tagged
+ *		onto the PlannerInfo's partPruneInfos list.
+ *
+ * The return value is the 0-based index of the item added to the
+ * partPruneInfos list or -1 if nothing was added.
  *
  * 'parentrel' is the RelOptInfo for an appendrel, and 'subpaths' is the list
  * of scan paths for its child rels.
  * 'prunequal' is a list of potential pruning quals (i.e., restriction
  * clauses that are applicable to the appendrel).
  */
-PartitionPruneInfo *
+int
 make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 						 List *subpaths,
 						 List *prunequal)
@@ -332,7 +336,7 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	 * quals, then we can just not bother with run-time pruning.
 	 */
 	if (prunerelinfos == NIL)
-		return NULL;
+		return -1;
 
 	/* Else build the result data structure */
 	pruneinfo = makeNode(PartitionPruneInfo);
@@ -358,7 +362,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	else
 		pruneinfo->other_subplans = NULL;
 
-	return pruneinfo;
+	root->partPruneInfos = lappend(root->partPruneInfos, pruneinfo);
+
+	return list_length(root->partPruneInfos) - 1;
 }
 
 /*
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 708435e952..bf962af7af 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -123,9 +123,8 @@ typedef struct PartitionPruneState
 
 extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
 													 int n_total_subplans,
-													 PartitionPruneInfo *pruneinfo,
+													 int part_prune_index,
 													 Bitmapset **initially_valid_subplans);
 extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
 										   bool initial_prune);
-
 #endif							/* EXECPARTITION_H */
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 01b1727fc0..63a89474db 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -611,6 +611,8 @@ typedef struct EState
 	struct ExecRowMark **es_rowmarks;	/* Array of per-range-table-entry
 										 * ExecRowMarks, or NULL if none */
 	PlannedStmt *es_plannedstmt;	/* link to top of plan tree */
+	List		*es_part_prune_infos;	/* PlannedStmt.partPruneInfos */
+	struct PartitionPruneResult *es_part_prune_result; /* QueryDesc.part_prune_result */
 	const char *es_sourceText;	/* Source text from QueryDesc */
 
 	JunkFilter *es_junkFilter;	/* top-level junk filter, if any */
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 44ffc73f15..d87957ff6c 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -122,6 +122,9 @@ typedef struct PlannerGlobal
 	/* "flat" list of AppendRelInfos */
 	List	   *appendRelations;
 
+	/* List of PartitionPruneInfo contained in the plan */
+	List	   *partPruneInfos;
+
 	/* OIDs of relations the plan depends on */
 	List	   *relationOids;
 
@@ -480,6 +483,9 @@ struct PlannerInfo
 
 	/* Does this query modify any partition key columns? */
 	bool		partColsUpdated;
+
+	/* PartitionPruneInfos added in this query's plan. */
+	List	   *partPruneInfos;
 };
 
 
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index dca2a21e7a..f2daabb3b7 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -69,6 +69,9 @@ typedef struct PlannedStmt
 
 	struct Plan *planTree;		/* tree of Plan nodes */
 
+	List	   *partPruneInfos;	/* List of PartitionPruneInfo contained in
+								 * the plan */
+
 	List	   *rtable;			/* list of RangeTblEntry nodes */
 
 	/* rtable indexes of target relations for INSERT/UPDATE/DELETE */
@@ -269,8 +272,8 @@ typedef struct Append
 	 */
 	int			first_partial_plan;
 
-	/* Info for run-time subplan pruning; NULL if we're not doing that */
-	struct PartitionPruneInfo *part_prune_info;
+	/* Index into PlannedStmt.partPruneInfos, or -1 if no run-time pruning */
+	int			part_prune_index;
 } Append;
 
 /* ----------------
@@ -304,8 +307,8 @@ typedef struct MergeAppend
 	/* NULLS FIRST/LAST directions */
 	bool	   *nullsFirst pg_node_attr(array_size(numCols));
 
-	/* Info for run-time subplan pruning; NULL if we're not doing that */
-	struct PartitionPruneInfo *part_prune_info;
+	/* Index into PlannedStmt.partPruneInfos, or -1 if no run-time pruning */
+	int			part_prune_index;
 } MergeAppend;
 
 /* ----------------
diff --git a/src/include/partitioning/partprune.h b/src/include/partitioning/partprune.h
index 90684efa25..ebf0dcff8c 100644
--- a/src/include/partitioning/partprune.h
+++ b/src/include/partitioning/partprune.h
@@ -70,10 +70,10 @@ typedef struct PartitionPruneContext
 #define PruneCxtStateIdx(partnatts, step_id, keyno) \
 	((partnatts) * (step_id) + (keyno))
 
-extern PartitionPruneInfo *make_partition_pruneinfo(struct PlannerInfo *root,
-													struct RelOptInfo *parentrel,
-													List *subpaths,
-													List *prunequal);
+extern int make_partition_pruneinfo(struct PlannerInfo *root,
+									struct RelOptInfo *parentrel,
+									List *subpaths,
+									List *prunequal);
 extern Bitmapset *prune_append_rel_partitions(struct RelOptInfo *rel);
 extern Bitmapset *get_matching_partitions(PartitionPruneContext *context,
 										  List *pruning_steps);
-- 
2.35.3
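
To illustrate the index scheme this patch introduces, here is a minimal standalone sketch.  It does not use PostgreSQL's List or node types; PruneInfoSketch, InfoListSketch and all sketch_* functions are hypothetical names made up for illustration.  It assumes, as the setrefs.c hunks appear to do, that a plan node's index is shifted by the current length of the global list before the query's own entries are appended to that list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a PartitionedRelPruneInfo-like entry. */
typedef struct PruneInfoSketch
{
	int			rtindex;		/* range table index of the partitioned table */
} PruneInfoSketch;

#define SKETCH_MAX_INFOS 16

/* Hypothetical fixed-size list, standing in for a List of prune infos. */
typedef struct InfoListSketch
{
	PruneInfoSketch *items[SKETCH_MAX_INFOS];
	int			length;
} InfoListSketch;

/* Append an entry and return its 0-based position in the list. */
static int
sketch_append(InfoListSketch *list, PruneInfoSketch *info)
{
	assert(list->length < SKETCH_MAX_INFOS);
	list->items[list->length] = info;
	return list->length++;
}

/*
 * Turn a per-query index into an index into the global list.  This must run
 * before the query's entries are moved, because it relies on the global
 * list's current length.  -1 ("no run-time pruning") is passed through.
 */
static int
sketch_globalize_index(int local_index, const InfoListSketch *global)
{
	if (local_index < 0)
		return -1;
	return local_index + global->length;
}

/* Move a query's entries to the global list, shifting rtindex by rtoffset. */
static void
sketch_flatten(InfoListSketch *global, InfoListSketch *local, int rtoffset)
{
	for (int i = 0; i < local->length; i++)
	{
		local->items[i]->rtindex += rtoffset;
		sketch_append(global, local->items[i]);
	}
	local->length = 0;
}

/* Executor-side lookup by index; returns NULL for -1 or out-of-range. */
static PruneInfoSketch *
sketch_lookup(const InfoListSketch *global, int index)
{
	if (index < 0 || index >= global->length)
		return NULL;
	return global->items[index];
}
```

The point of the design, as far as I can tell, is that a consumer such as AcquireExecutorLocks() can iterate over the flat list without walking the plan tree, while each Append/MergeAppend node still finds its own entry through the stored index.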




* Re: generic plans and "initial" pruning
@ 2022-07-13 07:03  Amit Langote <[email protected]>
  parent: Amit Langote <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: Amit Langote @ 2022-07-13 07:03 UTC (permalink / raw)
  To: Jacob Champion <[email protected]>; +Cc: Zhihong Yu <[email protected]>; David Rowley <[email protected]>; Robert Haas <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers

On Wed, Jul 13, 2022 at 3:40 PM Amit Langote <[email protected]> wrote:
> Rebased over 964d01ae90c.

Sorry, left some pointless hunks in there while rebasing.  Fixed in
the attached.

-- 
Thanks, Amit Langote
EDB: http://www.enterprisedb.com


Attachments:

  [application/octet-stream] v19-0001-Move-PartitioPruneInfo-out-of-plan-nodes-into-Pl.patch (17.3K, 2-v19-0001-Move-PartitioPruneInfo-out-of-plan-nodes-into-Pl.patch)
From 9fa5cd5f4256b7249ab6f560edca9d3609a126ef Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Fri, 27 May 2022 16:00:28 +0900
Subject: [PATCH v19 1/2] Move PartitionPruneInfo out of plan nodes into
 PlannedStmt

The planner will now add a given PartitionPruneInfo to
PlannedStmt.partPruneInfos instead of directly to the
Append/MergeAppend plan node.  What gets set instead in the
latter is an index field which points to the list element
of PlannedStmt.partPruneInfos containing the PartitionPruneInfo
belonging to the plan node.

A later commit will make AcquireExecutorLocks() do the initial
partition pruning to determine a minimal set of partitions to be
locked when validating a plan tree, and it will need to consult the
PartitionPruneInfos referenced therein to do so.  That is easier if
the PartitionPruneInfos can be reached by simply iterating over
PlannedStmt.partPruneInfos rather than by walking the plan tree to
find them.
---
 src/backend/executor/execMain.c         |  1 +
 src/backend/executor/execParallel.c     |  1 +
 src/backend/executor/execPartition.c    |  4 +-
 src/backend/executor/execUtils.c        |  2 +
 src/backend/executor/nodeAppend.c       |  4 +-
 src/backend/executor/nodeMergeAppend.c  |  4 +-
 src/backend/optimizer/plan/createplan.c | 24 ++++-----
 src/backend/optimizer/plan/planner.c    |  1 +
 src/backend/optimizer/plan/setrefs.c    | 65 +++++++++++++------------
 src/backend/partitioning/partprune.c    | 18 ++++---
 src/include/executor/execPartition.h    |  3 +-
 src/include/nodes/execnodes.h           |  2 +
 src/include/nodes/pathnodes.h           |  6 +++
 src/include/nodes/plannodes.h           | 11 +++--
 src/include/partitioning/partprune.h    |  8 +--
 15 files changed, 92 insertions(+), 62 deletions(-)

diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index ef2fd46092..72fc273524 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -825,6 +825,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 	ExecInitRangeTable(estate, rangeTable);
 
 	estate->es_plannedstmt = plannedstmt;
+	estate->es_part_prune_infos = plannedstmt->partPruneInfos;
 
 	/*
 	 * Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index f1fd7f7e8b..f73b8c2607 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -183,6 +183,7 @@ ExecSerializePlan(Plan *plan, EState *estate)
 	pstmt->dependsOnRole = false;
 	pstmt->parallelModeNeeded = false;
 	pstmt->planTree = plan;
+	pstmt->partPruneInfos = estate->es_part_prune_infos;
 	pstmt->rtable = estate->es_range_table;
 	pstmt->resultRelations = NIL;
 	pstmt->appendRelations = NIL;
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index e03ea27299..b55cdd2580 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -1638,11 +1638,13 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
 PartitionPruneState *
 ExecInitPartitionPruning(PlanState *planstate,
 						 int n_total_subplans,
-						 PartitionPruneInfo *pruneinfo,
+						 int part_prune_index,
 						 Bitmapset **initially_valid_subplans)
 {
 	PartitionPruneState *prunestate;
 	EState	   *estate = planstate->state;
+	PartitionPruneInfo *pruneinfo = list_nth(estate->es_part_prune_infos,
+											 part_prune_index);
 
 	/* We may need an expression context to evaluate partition exprs */
 	ExecAssignExprContext(estate, planstate);
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 9df1f81ea8..f9c7976ff2 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -119,6 +119,8 @@ CreateExecutorState(void)
 	estate->es_relations = NULL;
 	estate->es_rowmarks = NULL;
 	estate->es_plannedstmt = NULL;
+	estate->es_part_prune_infos = NIL;
+	estate->es_part_prune_result = NULL;
 
 	estate->es_junkFilter = NULL;
 
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 357e10a1d7..c6f86a6510 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -134,7 +134,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 	appendstate->as_begun = false;
 
 	/* If run-time partition pruning is enabled, then set that up now */
-	if (node->part_prune_info != NULL)
+	if (node->part_prune_index >= 0)
 	{
 		PartitionPruneState *prunestate;
 
@@ -145,7 +145,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 		 */
 		prunestate = ExecInitPartitionPruning(&appendstate->ps,
 											  list_length(node->appendplans),
-											  node->part_prune_info,
+											  node->part_prune_index,
 											  &validsubplans);
 		appendstate->as_prune_state = prunestate;
 		nplans = bms_num_members(validsubplans);
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index c5c62fa5c7..8d35860c30 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -82,7 +82,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
 	mergestate->ps.ExecProcNode = ExecMergeAppend;
 
 	/* If run-time partition pruning is enabled, then set that up now */
-	if (node->part_prune_info != NULL)
+	if (node->part_prune_index >= 0)
 	{
 		PartitionPruneState *prunestate;
 
@@ -93,7 +93,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
 		 */
 		prunestate = ExecInitPartitionPruning(&mergestate->ps,
 											  list_length(node->mergeplans),
-											  node->part_prune_info,
+											  node->part_prune_index,
 											  &validsubplans);
 		mergestate->ms_prune_state = prunestate;
 		nplans = bms_num_members(validsubplans);
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index e37f2933eb..fd8ab4a167 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -1203,7 +1203,6 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
 	ListCell   *subpaths;
 	int			nasyncplans = 0;
 	RelOptInfo *rel = best_path->path.parent;
-	PartitionPruneInfo *partpruneinfo = NULL;
 	int			nodenumsortkeys = 0;
 	AttrNumber *nodeSortColIdx = NULL;
 	Oid		   *nodeSortOperators = NULL;
@@ -1354,6 +1353,9 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
 		subplans = lappend(subplans, subplan);
 	}
 
+	/* Set below if we find quals that we can use to run-time prune */
+	plan->part_prune_index = -1;
+
 	/*
 	 * If any quals exist, they may be useful to perform further partition
 	 * pruning during execution.  Gather information needed by the executor to
@@ -1377,16 +1379,14 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
 		}
 
 		if (prunequal != NIL)
-			partpruneinfo =
-				make_partition_pruneinfo(root, rel,
-										 best_path->subpaths,
-										 prunequal);
+			plan->part_prune_index = make_partition_pruneinfo(root, rel,
+															  best_path->subpaths,
+															  prunequal);
 	}
 
 	plan->appendplans = subplans;
 	plan->nasyncplans = nasyncplans;
 	plan->first_partial_plan = best_path->first_partial_path;
-	plan->part_prune_info = partpruneinfo;
 
 	copy_generic_path_info(&plan->plan, (Path *) best_path);
 
@@ -1425,7 +1425,6 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
 	List	   *subplans = NIL;
 	ListCell   *subpaths;
 	RelOptInfo *rel = best_path->path.parent;
-	PartitionPruneInfo *partpruneinfo = NULL;
 
 	/*
 	 * We don't have the actual creation of the MergeAppend node split out
@@ -1518,6 +1517,9 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
 		subplans = lappend(subplans, subplan);
 	}
 
+	/* Set below if we find quals that we can use to run-time prune */
+	node->part_prune_index = -1;
+
 	/*
 	 * If any quals exist, they may be useful to perform further partition
 	 * pruning during execution.  Gather information needed by the executor to
@@ -1541,13 +1543,13 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
 		}
 
 		if (prunequal != NIL)
-			partpruneinfo = make_partition_pruneinfo(root, rel,
-													 best_path->subpaths,
-													 prunequal);
+			node->part_prune_index = make_partition_pruneinfo(root, rel,
+															  best_path->subpaths,
+															  prunequal);
 	}
 
 	node->mergeplans = subplans;
-	node->part_prune_info = partpruneinfo;
+
 
 	/*
 	 * If prepare_sort_from_pathkeys added sort columns, but we were told to
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 06ad856eac..b11249ed8f 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -518,6 +518,7 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 	result->dependsOnRole = glob->dependsOnRole;
 	result->parallelModeNeeded = glob->parallelModeNeeded;
 	result->planTree = top_plan;
+	result->partPruneInfos = glob->partPruneInfos;
 	result->rtable = glob->finalrtable;
 	result->resultRelations = glob->resultRelations;
 	result->appendRelations = glob->appendRelations;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 9cef92cab2..b8d5610593 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -348,6 +348,29 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 		}
 	}
 
+	/* Also fix up the information in PartitionPruneInfos. */
+	foreach (lc, root->partPruneInfos)
+	{
+		PartitionPruneInfo *pruneinfo = lfirst(lc);
+		ListCell  *l;
+
+		foreach(l, pruneinfo->prune_infos)
+		{
+			List	   *prune_infos = lfirst(l);
+			ListCell   *l2;
+
+			foreach(l2, prune_infos)
+			{
+				PartitionedRelPruneInfo *pinfo = lfirst(l2);
+
+				/* RT index of the table to which the pinfo belongs. */
+				pinfo->rtindex += rtoffset;
+			}
+		}
+
+		glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
+	}
+
 	return result;
 }
 
@@ -1655,21 +1678,12 @@ set_append_references(PlannerInfo *root,
 
 	aplan->apprelids = offset_relid_set(aplan->apprelids, rtoffset);
 
-	if (aplan->part_prune_info)
-	{
-		foreach(l, aplan->part_prune_info->prune_infos)
-		{
-			List	   *prune_infos = lfirst(l);
-			ListCell   *l2;
-
-			foreach(l2, prune_infos)
-			{
-				PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
-				pinfo->rtindex += rtoffset;
-			}
-		}
-	}
+	/*
+	 * PartitionPruneInfos will be added to a list in PlannerGlobal, so update
+	 * the index.
+	 */
+	if (aplan->part_prune_index >= 0)
+		aplan->part_prune_index += list_length(root->glob->partPruneInfos);
 
 	/* We don't need to recurse to lefttree or righttree ... */
 	Assert(aplan->plan.lefttree == NULL);
@@ -1727,21 +1741,12 @@ set_mergeappend_references(PlannerInfo *root,
 
 	mplan->apprelids = offset_relid_set(mplan->apprelids, rtoffset);
 
-	if (mplan->part_prune_info)
-	{
-		foreach(l, mplan->part_prune_info->prune_infos)
-		{
-			List	   *prune_infos = lfirst(l);
-			ListCell   *l2;
-
-			foreach(l2, prune_infos)
-			{
-				PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
-				pinfo->rtindex += rtoffset;
-			}
-		}
-	}
+	/*
+	 * PartitionPruneInfos will be added to a list in PlannerGlobal, so update
+	 * the index.
+	 */
+	if (mplan->part_prune_index >= 0)
+		mplan->part_prune_index += list_length(root->glob->partPruneInfos);
 
 	/* We don't need to recurse to lefttree or righttree ... */
 	Assert(mplan->plan.lefttree == NULL);
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 9d3c05aed3..d77f7d3aef 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -209,16 +209,20 @@ static void partkey_datum_from_expr(PartitionPruneContext *context,
 
 /*
  * make_partition_pruneinfo
- *		Builds a PartitionPruneInfo which can be used in the executor to allow
- *		additional partition pruning to take place.  Returns NULL when
- *		partition pruning would be useless.
+ *		Checks if the given set of quals can be used to build pruning steps
+ *		that the executor can use to prune away unneeded partitions.  If
+ *		suitable quals are found then a PartitionPruneInfo is built and tagged
+ *		onto the PlannerInfo's partPruneInfos list.
+ *
+ * The return value is the 0-based index of the item added to the
+ * partPruneInfos list or -1 if nothing was added.
  *
  * 'parentrel' is the RelOptInfo for an appendrel, and 'subpaths' is the list
  * of scan paths for its child rels.
  * 'prunequal' is a list of potential pruning quals (i.e., restriction
  * clauses that are applicable to the appendrel).
  */
-PartitionPruneInfo *
+int
 make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 						 List *subpaths,
 						 List *prunequal)
@@ -332,7 +336,7 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	 * quals, then we can just not bother with run-time pruning.
 	 */
 	if (prunerelinfos == NIL)
-		return NULL;
+		return -1;
 
 	/* Else build the result data structure */
 	pruneinfo = makeNode(PartitionPruneInfo);
@@ -358,7 +362,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	else
 		pruneinfo->other_subplans = NULL;
 
-	return pruneinfo;
+	root->partPruneInfos = lappend(root->partPruneInfos, pruneinfo);
+
+	return list_length(root->partPruneInfos) - 1;
 }
 
 /*
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 708435e952..bf962af7af 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -123,9 +123,8 @@ typedef struct PartitionPruneState
 
 extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
 													 int n_total_subplans,
-													 PartitionPruneInfo *pruneinfo,
+													 int part_prune_index,
 													 Bitmapset **initially_valid_subplans);
 extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
 										   bool initial_prune);
-
 #endif							/* EXECPARTITION_H */
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 01b1727fc0..63a89474db 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -611,6 +611,8 @@ typedef struct EState
 	struct ExecRowMark **es_rowmarks;	/* Array of per-range-table-entry
 										 * ExecRowMarks, or NULL if none */
 	PlannedStmt *es_plannedstmt;	/* link to top of plan tree */
+	List		*es_part_prune_infos;	/* PlannedStmt.partPruneInfos */
+	struct PartitionPruneResult *es_part_prune_result; /* QueryDesc.part_prune_result */
 	const char *es_sourceText;	/* Source text from QueryDesc */
 
 	JunkFilter *es_junkFilter;	/* top-level junk filter, if any */
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 44ffc73f15..d87957ff6c 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -122,6 +122,9 @@ typedef struct PlannerGlobal
 	/* "flat" list of AppendRelInfos */
 	List	   *appendRelations;
 
+	/* List of PartitionPruneInfo contained in the plan */
+	List	   *partPruneInfos;
+
 	/* OIDs of relations the plan depends on */
 	List	   *relationOids;
 
@@ -480,6 +483,9 @@ struct PlannerInfo
 
 	/* Does this query modify any partition key columns? */
 	bool		partColsUpdated;
+
+	/* PartitionPruneInfos added in this query's plan. */
+	List	   *partPruneInfos;
 };
 
 
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index dca2a21e7a..f2daabb3b7 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -69,6 +69,9 @@ typedef struct PlannedStmt
 
 	struct Plan *planTree;		/* tree of Plan nodes */
 
+	List	   *partPruneInfos;	/* List of PartitionPruneInfo contained in
+								 * the plan */
+
 	List	   *rtable;			/* list of RangeTblEntry nodes */
 
 	/* rtable indexes of target relations for INSERT/UPDATE/DELETE */
@@ -269,8 +272,8 @@ typedef struct Append
 	 */
 	int			first_partial_plan;
 
-	/* Info for run-time subplan pruning; NULL if we're not doing that */
-	struct PartitionPruneInfo *part_prune_info;
+	/* Index to PlannerInfo.partPruneInfos or -1 if no run-time pruning */
+	int			part_prune_index;
 } Append;
 
 /* ----------------
@@ -304,8 +307,8 @@ typedef struct MergeAppend
 	/* NULLS FIRST/LAST directions */
 	bool	   *nullsFirst pg_node_attr(array_size(numCols));
 
-	/* Info for run-time subplan pruning; NULL if we're not doing that */
-	struct PartitionPruneInfo *part_prune_info;
+	/* Index to PlannerInfo.partPruneInfos or -1 if no run-time pruning */
+	int			part_prune_index;
 } MergeAppend;
 
 /* ----------------
diff --git a/src/include/partitioning/partprune.h b/src/include/partitioning/partprune.h
index 90684efa25..ebf0dcff8c 100644
--- a/src/include/partitioning/partprune.h
+++ b/src/include/partitioning/partprune.h
@@ -70,10 +70,10 @@ typedef struct PartitionPruneContext
 #define PruneCxtStateIdx(partnatts, step_id, keyno) \
 	((partnatts) * (step_id) + (keyno))
 
-extern PartitionPruneInfo *make_partition_pruneinfo(struct PlannerInfo *root,
-													struct RelOptInfo *parentrel,
-													List *subpaths,
-													List *prunequal);
+extern int make_partition_pruneinfo(struct PlannerInfo *root,
+									struct RelOptInfo *parentrel,
+									List *subpaths,
+									List *prunequal);
 extern Bitmapset *prune_append_rel_partitions(struct RelOptInfo *rel);
 extern Bitmapset *get_matching_partitions(PartitionPruneContext *context,
 										  List *pruning_steps);
-- 
2.35.3
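
The index bookkeeping in the patch above (make_partition_pruneinfo()
returning a 0-based index into the per-PlannerInfo list, and setrefs.c
later shifting that index by the current length of the PlannerGlobal list
before the per-query entries are appended to it) can be sketched in
isolation.  This is only an illustration under stated assumptions, not
code from the patch: InfoList, info_list_append(), globalize_index(),
flush_to_global() and run_example() are hypothetical stand-ins for the
List machinery and the planner functions.

```c
#include <assert.h>
#include <string.h>

#define MAX_INFOS 16

/* Hypothetical stand-in for a List of PartitionPruneInfo nodes. */
typedef struct InfoList
{
	int			len;
	const char *items[MAX_INFOS];
} InfoList;

/* Append an item and return its 0-based index (lappend + list_length - 1). */
static int
info_list_append(InfoList *list, const char *item)
{
	assert(list->len < MAX_INFOS);
	list->items[list->len] = item;
	return list->len++;
}

/*
 * Shift a query-local index so that it addresses the global list.
 * -1 means "no run-time pruning" and is left alone.
 */
static int
globalize_index(int local_index, const InfoList *global)
{
	return (local_index >= 0) ? local_index + global->len : -1;
}

/* Move every entry of the query-local list to the end of the global list. */
static void
flush_to_global(InfoList *global, InfoList *local)
{
	for (int i = 0; i < local->len; i++)
		info_list_append(global, local->items[i]);
	local->len = 0;
}

/*
 * Two subqueries processed one after the other.  Returns the global index
 * of the second subquery's second node, or -1 if the lookup is wrong.
 */
static int
run_example(void)
{
	InfoList	global = {0};
	InfoList	sub1 = {0};
	InfoList	sub2 = {0};
	int			a, b, c;

	a = info_list_append(&sub1, "sub1/append");
	a = globalize_index(a, &global);	/* global list still empty */
	flush_to_global(&global, &sub1);

	b = info_list_append(&sub2, "sub2/append");
	c = info_list_append(&sub2, "sub2/mergeappend");
	b = globalize_index(b, &global);	/* shifted past sub1's entry */
	c = globalize_index(c, &global);
	flush_to_global(&global, &sub2);

	if (strcmp(global.items[a], "sub1/append") != 0 ||
		strcmp(global.items[b], "sub2/append") != 0 ||
		strcmp(global.items[c], "sub2/mergeappend") != 0)
		return -1;
	return c;
}
```

The order matters in this sketch: indexes are shifted while the global
list still holds only the entries of previously processed queries, and the
local entries are appended afterwards, so each shifted index lands on its
own entry.  Calling run_example() from a main() is enough to try it out.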



  [application/octet-stream] v19-0002-Optimize-AcquireExecutorLocks-by-locking-only-un.patch (80.6K, 3-v19-0002-Optimize-AcquireExecutorLocks-by-locking-only-un.patch)
From b67911f2ae182f7158501e7ce4b1799ff2e1efb4 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Wed, 22 Dec 2021 16:55:17 +0900
Subject: [PATCH v19 2/2] Optimize AcquireExecutorLocks() by locking only
 unpruned partitions

This commit teaches AcquireExecutorLocks() to perform initial partition
pruning, which notionally eliminates the subnodes of a generic cached
plan that need not be initialized during actual execution of the plan,
and to skip locking the partitions scanned by those subnodes.

The result of performing initial partition pruning this way, before
actual execution has started, is passed to the executor as a
PartitionPruneResult, which callers of the executor that used plancache.c
to get the plan supply along with the PlannedStmt.  It is NULL when the
plan is obtained by calling the planner directly or when the plan
obtained from plancache.c is not a generic one.
---
 src/backend/commands/copyto.c          |   2 +-
 src/backend/commands/createas.c        |   2 +-
 src/backend/commands/explain.c         |   7 +-
 src/backend/commands/extension.c       |   2 +-
 src/backend/commands/matview.c         |   2 +-
 src/backend/commands/prepare.c         |  26 ++-
 src/backend/executor/README            |  32 ++++
 src/backend/executor/execMain.c        |  53 ++++++
 src/backend/executor/execParallel.c    |  27 ++-
 src/backend/executor/execPartition.c   | 234 +++++++++++++++++++++----
 src/backend/executor/functions.c       |   2 +-
 src/backend/executor/nodeAppend.c      |  11 +-
 src/backend/executor/nodeMergeAppend.c |   5 +-
 src/backend/executor/spi.c             |  27 ++-
 src/backend/nodes/readfuncs.c          |  29 +++
 src/backend/optimizer/plan/planner.c   |   2 +
 src/backend/optimizer/plan/setrefs.c   |  46 +++++
 src/backend/partitioning/partprune.c   |  41 ++++-
 src/backend/tcop/postgres.c            |   8 +-
 src/backend/tcop/pquery.c              |  28 ++-
 src/backend/utils/cache/plancache.c    | 187 +++++++++++++++++---
 src/backend/utils/mmgr/portalmem.c     |  19 ++
 src/include/commands/explain.h         |   4 +-
 src/include/executor/execPartition.h   |   9 +-
 src/include/executor/execdesc.h        |   3 +
 src/include/executor/executor.h        |   2 +
 src/include/nodes/execnodes.h          |  27 +++
 src/include/nodes/nodes.h              |   1 +
 src/include/nodes/pathnodes.h          |  13 ++
 src/include/nodes/plannodes.h          |  21 +++
 src/include/utils/plancache.h          |   3 +-
 src/include/utils/portal.h             |   3 +
 32 files changed, 782 insertions(+), 96 deletions(-)

diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index fca29a9a10..d839517693 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -541,7 +541,7 @@ BeginCopyTo(ParseState *pstate,
 		((DR_copy *) dest)->cstate = cstate;
 
 		/* Create a QueryDesc requesting no output */
-		cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+		cstate->queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
 											dest, NULL, NULL, 0);
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 9abbb6b555..f6607f2454 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -325,7 +325,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
 		UpdateActiveSnapshotCommandId();
 
 		/* Create a QueryDesc, redirecting output to our tuple receiver */
-		queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+		queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
 									GetActiveSnapshot(), InvalidSnapshot,
 									dest, params, queryEnv, 0);
 
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index e29c2ae206..e41b13a3ea 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -407,7 +407,7 @@ ExplainOneQuery(Query *query, int cursorOptions,
 		}
 
 		/* run it (if needed) and produce output */
-		ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+		ExplainOnePlan(plan, NULL, into, es, queryString, params, queryEnv,
 					   &planduration, (es->buffers ? &bufusage : NULL));
 	}
 }
@@ -515,7 +515,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
  * to call it.
  */
 void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, PartitionPruneResult *part_prune_result,
+			   IntoClause *into, ExplainState *es,
 			   const char *queryString, ParamListInfo params,
 			   QueryEnvironment *queryEnv, const instr_time *planduration,
 			   const BufferUsage *bufusage)
@@ -563,7 +564,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
 		dest = None_Receiver;
 
 	/* Create a QueryDesc for the query */
-	queryDesc = CreateQueryDesc(plannedstmt, queryString,
+	queryDesc = CreateQueryDesc(plannedstmt, part_prune_result, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, params, queryEnv, instrument_option);
 
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index 3db859c3ea..631cc07217 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -776,7 +776,7 @@ execute_sql_string(const char *sql)
 			{
 				QueryDesc  *qdesc;
 
-				qdesc = CreateQueryDesc(stmt,
+				qdesc = CreateQueryDesc(stmt, NULL,
 										sql,
 										GetActiveSnapshot(), NULL,
 										dest, NULL, NULL, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 9ac0383459..b0ed96e56c 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -408,7 +408,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
 	UpdateActiveSnapshotCommandId();
 
 	/* Create a QueryDesc, redirecting output to our tuple receiver */
-	queryDesc = CreateQueryDesc(plan, queryString,
+	queryDesc = CreateQueryDesc(plan, NULL, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, NULL, NULL, 0);
 
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 2333aae467..83465e40f8 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -155,6 +155,7 @@ ExecuteQuery(ParseState *pstate,
 	PreparedStatement *entry;
 	CachedPlan *cplan;
 	List	   *plan_list;
+	List	   *part_prune_result_list;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
 	Portal		portal;
@@ -193,7 +194,10 @@ ExecuteQuery(ParseState *pstate,
 									   entry->plansource->query_string);
 
 	/* Replan if needed, and increment plan refcount for portal */
-	cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL);
+	cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL,
+						  &part_prune_result_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_result_list));
 	plan_list = cplan->stmt_list;
 
 	/*
@@ -207,6 +211,9 @@ ExecuteQuery(ParseState *pstate,
 					  plan_list,
 					  cplan);
 
+	/* Copy PartitionPruneResults into the portal's context. */
+	PortalStorePartitionPruneResults(portal, part_prune_result_list);
+
 	/*
 	 * For CREATE TABLE ... AS EXECUTE, we must verify that the prepared
 	 * statement is one that produces tuples.  Currently we insist that it be
@@ -576,7 +583,9 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	const char *query_string;
 	CachedPlan *cplan;
 	List	   *plan_list;
-	ListCell   *p;
+	List	   *part_prune_result_list;
+	ListCell   *p,
+			   *pp;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
 	instr_time	planstart;
@@ -619,7 +628,10 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 
 	/* Replan if needed, and acquire a transient refcount */
 	cplan = GetCachedPlan(entry->plansource, paramLI,
-						  CurrentResourceOwner, queryEnv);
+						  CurrentResourceOwner, queryEnv,
+						  &part_prune_result_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_result_list));
 
 	INSTR_TIME_SET_CURRENT(planduration);
 	INSTR_TIME_SUBTRACT(planduration, planstart);
@@ -634,13 +646,15 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	plan_list = cplan->stmt_list;
 
 	/* Explain each query */
-	foreach(p, plan_list)
+	forboth(p, plan_list, pp, part_prune_result_list)
 	{
 		PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
+		PartitionPruneResult *part_prune_result = lfirst_node(PartitionPruneResult, pp);
 
 		if (pstmt->commandType != CMD_UTILITY)
-			ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv,
-						   &planduration, (es->buffers ? &bufusage : NULL));
+			ExplainOnePlan(pstmt, part_prune_result, into, es, query_string,
+						   paramLI, queryEnv, &planduration,
+						   (es->buffers ? &bufusage : NULL));
 		else
 			ExplainOneUtility(pstmt->utilityStmt, into, es, query_string,
 							  paramLI, queryEnv);
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 0b5183fc4a..953a476ea5 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -65,6 +65,34 @@ found there.  This currently only occurs for Append and MergeAppend nodes.  In
 this case the non-required subplans are ignored and the executor state's
 subnode array will become out of sequence to the plan's subplan list.
 
+Actually, the so-called execution time pruning may also occur even before the
+execution has started.  One case where that occurs is when a cached generic
+plan is being validated for execution by plancache.c: GetCachedPlan(), which
+works by locking all the relations that will be scanned by that plan.  If the
+generic plan contains nodes that can perform execution time partition pruning
+(that is, contain a PartitionPruneInfo), a subset of pruning steps contained
+in a given node's PartitionPruneInfo that do not depend on the execution
+actually having started (called "initial" pruning steps) are performed at this
+point to figure out the minimal set of child subplans that satisfy those
+pruning steps.  AcquireExecutorLocks() looking at a given plan tree will then
+lock only the relations scanned by the child subplans that survived such
+pruning, along with those present in PlannedStmt.minLockRelids.  Note that the
+subplans are only notionally pruned in that they are not removed from the plan
+tree as such.
+
+To prevent the executor and any third party execution code that can look at
+the plan tree from trying to execute the subplans that were pruned as
+described above, the result of pruning is passed to the executor as a
+PartitionPruneResult node via the QueryDesc.  It consists of the set of
+indexes of surviving subplans in their respective parent plan node's list of
+child subplans, saved as a list of bitmapsets, with one element for every
+parent plan node whose PartitionPruneInfo is present in
+PlannedStmt.partPruneInfos.  In other words, the executor should not
+re-evaluate the set of initially valid subplans by redoing the initial pruning
+if it was already done by AcquireExecutorLocks(), because the re-evaluation may
+very well end up resulting in a different set of subplans, containing some
+whose relations were not locked by AcquireExecutorLocks().
+
 Each Plan node may have expression trees associated with it, to represent
 its target list, qualification conditions, etc.  These trees are also
 read-only to the executor, but the executor state for expression evaluation
@@ -286,6 +314,10 @@ Query Processing Control Flow
 
 This is a sketch of control flow for full query processing:
 
+  [ ExecutorDoInitialPruning ] --- an optional step to perform initial
+		partition pruning on the plan tree the result of which is passed
+		to the executor via QueryDesc
+
 	CreateQueryDesc
 
 	ExecutorStart
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 72fc273524..45824624f8 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -49,6 +49,7 @@
 #include "commands/matview.h"
 #include "commands/trigger.h"
 #include "executor/execdebug.h"
+#include "executor/execPartition.h"
 #include "executor/nodeSubplan.h"
 #include "foreign/fdwapi.h"
 #include "jit/jit.h"
@@ -104,6 +105,56 @@ static void EvalPlanQualStart(EPQState *epqstate, Plan *planTree);
 
 /* end of local decls */
 
+/* ----------------------------------------------------------------
+ *		ExecutorDoInitialPruning
+ *
+ *		For each plan tree node that has been assigned a PartitionPruneInfo,
+ *		this performs initial partition pruning using the information contained
+ *		therein to determine the set of child subplans that satisfy the initial
+ *		pruning steps, to be returned as a bitmapset of their indexes in the
+ *		node's list of child subplans (for example, an Append's appendplans).
+ *
+ * Return value is a PartitionPruneResult node that contains a list of those
+ * bitmapsets, with one element for every PartitionPruneInfo, and a bitmapset
+ * of the RT indexes of all the leaf partitions scanned by those chosen
+ * subplans.  Note that the latter is shared across all PartitionPruneInfos.
+ *
+ * The executor must see exactly the same set of subplans as valid for
+ * execution when doing ExecInitNode() on the plan nodes whose
+ * PartitionPruneInfos are processed here.  So, it must get the set from the
+ * aforementioned PartitionPruneResult, instead of computing it all over
+ * again by redoing the initial pruning.  It's the caller's job to pass the
+ * PartitionPruneResult to the executor.
+ *
+ * Note: Partitioned tables mentioned in PartitionedRelPruneInfo nodes that
+ * drive the pruning will be locked before doing the pruning.
+ * ----------------------------------------------------------------
+ */
+PartitionPruneResult *
+ExecutorDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params)
+{
+	PartitionPruneResult *result;
+	ListCell *lc;
+
+	/* Only get here if there is any pruning to do. */
+	Assert(plannedstmt->containsInitialPruning);
+
+	result = makeNode(PartitionPruneResult);
+	foreach(lc, plannedstmt->partPruneInfos)
+	{
+		PartitionPruneInfo *pruneinfo = lfirst(lc);
+		Bitmapset *valid_subplan_offs;
+
+		valid_subplan_offs =
+			ExecPartitionDoInitialPruning(plannedstmt, params, pruneinfo,
+										  &result->scan_leafpart_rtis);
+		result->valid_subplan_offs_list =
+			lappend(result->valid_subplan_offs_list,
+					valid_subplan_offs);
+	}
+
+	return result;
+}
 
 /* ----------------------------------------------------------------
  *		ExecutorStart
@@ -806,6 +857,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 {
 	CmdType		operation = queryDesc->operation;
 	PlannedStmt *plannedstmt = queryDesc->plannedstmt;
+	PartitionPruneResult *part_prune_result = queryDesc->part_prune_result;
 	Plan	   *plan = plannedstmt->planTree;
 	List	   *rangeTable = plannedstmt->rtable;
 	EState	   *estate = queryDesc->estate;
@@ -826,6 +878,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 
 	estate->es_plannedstmt = plannedstmt;
 	estate->es_part_prune_infos = plannedstmt->partPruneInfos;
+	estate->es_part_prune_result = part_prune_result;
 
 	/*
 	 * Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index f73b8c2607..7e6dab5623 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -66,6 +66,7 @@
 #define PARALLEL_KEY_QUERY_TEXT		UINT64CONST(0xE000000000000008)
 #define PARALLEL_KEY_JIT_INSTRUMENTATION UINT64CONST(0xE000000000000009)
 #define PARALLEL_KEY_WAL_USAGE			UINT64CONST(0xE00000000000000A)
+#define PARALLEL_KEY_PARTITIONPRUNERESULT	UINT64CONST(0xE00000000000000B)
 
 #define PARALLEL_TUPLE_QUEUE_SIZE		65536
 
@@ -182,6 +183,7 @@ ExecSerializePlan(Plan *plan, EState *estate)
 	pstmt->transientPlan = false;
 	pstmt->dependsOnRole = false;
 	pstmt->parallelModeNeeded = false;
+	pstmt->containsInitialPruning = false;
 	pstmt->planTree = plan;
 	pstmt->partPruneInfos = estate->es_part_prune_infos;
 	pstmt->rtable = estate->es_range_table;
@@ -597,12 +599,15 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	FixedParallelExecutorState *fpes;
 	char	   *pstmt_data;
 	char	   *pstmt_space;
+	char	   *part_prune_result_data;
+	char	   *part_prune_result_space;
 	char	   *paramlistinfo_space;
 	BufferUsage *bufusage_space;
 	WalUsage   *walusage_space;
 	SharedExecutorInstrumentation *instrumentation = NULL;
 	SharedJitInstrumentation *jit_instrumentation = NULL;
 	int			pstmt_len;
+	int			part_prune_result_len;
 	int			paramlistinfo_len;
 	int			instrumentation_len = 0;
 	int			jit_instrumentation_len = 0;
@@ -631,6 +636,7 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 
 	/* Fix up and serialize plan to be sent to workers. */
 	pstmt_data = ExecSerializePlan(planstate->plan, estate);
+	part_prune_result_data = nodeToString(estate->es_part_prune_result);
 
 	/* Create a parallel context. */
 	pcxt = CreateParallelContext("postgres", "ParallelQueryMain", nworkers);
@@ -657,6 +663,11 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	shm_toc_estimate_chunk(&pcxt->estimator, pstmt_len);
 	shm_toc_estimate_keys(&pcxt->estimator, 1);
 
+	/* Estimate space for serialized PartitionPruneResult. */
+	part_prune_result_len = strlen(part_prune_result_data) + 1;
+	shm_toc_estimate_chunk(&pcxt->estimator, part_prune_result_len);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
 	/* Estimate space for serialized ParamListInfo. */
 	paramlistinfo_len = EstimateParamListSpace(estate->es_param_list_info);
 	shm_toc_estimate_chunk(&pcxt->estimator, paramlistinfo_len);
@@ -751,6 +762,12 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	memcpy(pstmt_space, pstmt_data, pstmt_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PLANNEDSTMT, pstmt_space);
 
+	/* Store serialized PartitionPruneResult */
+	part_prune_result_space = shm_toc_allocate(pcxt->toc, part_prune_result_len);
+	memcpy(part_prune_result_space, part_prune_result_data, part_prune_result_len);
+	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARTITIONPRUNERESULT,
+				   part_prune_result_space);
+
 	/* Store serialized ParamListInfo. */
 	paramlistinfo_space = shm_toc_allocate(pcxt->toc, paramlistinfo_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARAMLISTINFO, paramlistinfo_space);
@@ -1232,8 +1249,10 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 						 int instrument_options)
 {
 	char	   *pstmtspace;
+	char	   *part_prune_result_space;
 	char	   *paramspace;
 	PlannedStmt *pstmt;
+	PartitionPruneResult *part_prune_result;
 	ParamListInfo paramLI;
 	char	   *queryString;
 
@@ -1244,12 +1263,18 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 	pstmtspace = shm_toc_lookup(toc, PARALLEL_KEY_PLANNEDSTMT, false);
 	pstmt = (PlannedStmt *) stringToNode(pstmtspace);
 
+	/* Reconstruct leader-supplied PartitionPruneResult. */
+	part_prune_result_space =
+		shm_toc_lookup(toc, PARALLEL_KEY_PARTITIONPRUNERESULT, false);
+	part_prune_result = (PartitionPruneResult *)
+		stringToNode(part_prune_result_space);
+
 	/* Reconstruct ParamListInfo. */
 	paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
 	paramLI = RestoreParamList(&paramspace);
 
 	/* Create a QueryDesc for the query. */
-	return CreateQueryDesc(pstmt,
+	return CreateQueryDesc(pstmt, part_prune_result,
 						   queryString,
 						   GetActiveSnapshot(), InvalidSnapshot,
 						   receiver, paramLI, NULL, instrument_options);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index b55cdd2580..24e6f6e988 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -25,6 +25,7 @@
 #include "mb/pg_wchar.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
+#include "parser/parsetree.h"
 #include "partitioning/partbounds.h"
 #include "partitioning/partdesc.h"
 #include "partitioning/partprune.h"
@@ -185,7 +186,11 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
 static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
 static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
 static PartitionPruneState *CreatePartitionPruneState(PlanState *planstate,
-													  PartitionPruneInfo *pruneinfo);
+													  PartitionPruneInfo *pruneinfo,
+													  bool consider_initial_steps,
+													  bool consider_exec_steps,
+													  List *rtable, ExprContext *econtext,
+													  PartitionDirectory partdir);
 static void InitPartitionPruneContext(PartitionPruneContext *context,
 									  List *pruning_steps,
 									  PartitionDesc partdesc,
@@ -198,7 +203,8 @@ static void PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
 static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
 										   PartitionedRelPruningData *pprune,
 										   bool initial_prune,
-										   Bitmapset **validsubplans);
+										   Bitmapset **validsubplans,
+										   Bitmapset **scan_leafpart_rtis);
 
 
 /*
@@ -1593,8 +1599,10 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  * considered to be a stable expression, it can change value from one plan
  * node scan to the next during query execution.  Stable comparison
  * expressions that don't involve such Params allow partition pruning to be
- * done once during executor startup.  Expressions that do involve such Params
- * require us to prune separately for each scan of the parent plan node.
+ * done once during executor startup or during ExecutorDoInitialPruning() that
+ * runs as part of performing AcquireExecutorLocks() on a given plan tree.
+ * Expressions that do involve such Params require us to prune separately for
+ * each scan of the parent plan node.
  *
  * Note that pruning away unneeded subplans during executor startup has the
  * added benefit of not having to initialize the unneeded subplans at all.
@@ -1611,6 +1619,13 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *		account for initial pruning possibly having eliminated some of the
  *		subplans.
  *
+ * ExecPartitionDoInitialPruning:
+ *		Do initial pruning with the information contained in a given
+ *		PartitionPruneInfo to determine the minimal set of child subplans
+ *		of the parent plan node (the one to which the PartitionPruneInfo
+ *		belongs) that need to be executed, and also the set of RT indexes
+ *		of the leaf partitions that those subplans will scan.
+ *
  * ExecFindMatchingSubPlans:
  *		Returns indexes of matching subplans after evaluating the expressions
  *		that are safe to evaluate at a given point.  This function is first
@@ -1628,8 +1643,9 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *
  * On return, *initially_valid_subplans is assigned the set of indexes of
  * child subplans that must be initialized along with the parent plan node.
- * Initial pruning is performed here if needed and in that case only the
- * surviving subplans' indexes are added.
+ * Initial pruning is performed here if needed (unless it has already been done
+ * by ExecutorDoInitialPruning()), and in that case only the surviving
+ * subplans' indexes are added.
  *
  * If subplans are indeed pruned, subplan_map arrays contained in the returned
  * PartitionPruneState are re-sequenced to not count those, though only if the
@@ -1645,24 +1661,59 @@ ExecInitPartitionPruning(PlanState *planstate,
 	EState	   *estate = planstate->state;
 	PartitionPruneInfo *pruneinfo = list_nth(estate->es_part_prune_infos,
 											 part_prune_index);
+	PartitionPruneResult *pruneresult = estate->es_part_prune_result;
+	bool	do_pruning = (pruneinfo->needs_init_pruning ||
+						  pruneinfo->needs_exec_pruning);
 
-	/* We may need an expression context to evaluate partition exprs */
-	ExecAssignExprContext(estate, planstate);
+	/*
+	 * No need to do initial pruning if it was done already by
+	 * ExecutorDoInitialPruning(), which it would be if es_part_prune_result
+	 * has been set.
+	 */
+	if (pruneresult)
+		do_pruning = pruneinfo->needs_exec_pruning;
 
-	/* Create the working data structure for pruning */
-	prunestate = CreatePartitionPruneState(planstate, pruneinfo);
+	prunestate = NULL;
+	if (do_pruning)
+	{
+		/* We may need an expression context to evaluate partition exprs */
+		ExecAssignExprContext(estate, planstate);
+
+		/* For data reading, executor always omits detached partitions */
+		if (estate->es_partition_directory == NULL)
+			estate->es_partition_directory =
+				CreatePartitionDirectory(estate->es_query_cxt, false);
+
+		/*
+		 * Create the working data structure for pruning.  No need to consider
+		 * initial pruning steps if we have a PartitionPruneResult.
+		 */
+		prunestate = CreatePartitionPruneState(planstate, pruneinfo,
+											   pruneresult == NULL, true,
+											   NIL, planstate->ps_ExprContext,
+											   estate->es_partition_directory);
+	}
 
 	/*
 	 * Perform an initial partition prune pass, if required.
 	 */
-	if (prunestate->do_initial_prune)
-		*initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true);
+	if (pruneresult)
+	{
+		*initially_valid_subplans =
+			list_nth(pruneresult->valid_subplan_offs_list, part_prune_index);
+	}
+	else if (prunestate && prunestate->do_initial_prune)
+	{
+		*initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true,
+															 NULL);
+	}
 	else
 	{
 		/* No pruning, so we'll need to initialize all subplans */
 		Assert(n_total_subplans > 0);
 		*initially_valid_subplans = bms_add_range(NULL, 0,
 												  n_total_subplans - 1);
+		return prunestate;
 	}
 
 	/*
@@ -1670,7 +1721,8 @@ ExecInitPartitionPruning(PlanState *planstate,
 	 * that were removed above due to initial pruning.  No need to do this if
 	 * no steps were removed.
 	 */
-	if (bms_num_members(*initially_valid_subplans) < n_total_subplans)
+	if (prunestate &&
+		bms_num_members(*initially_valid_subplans) < n_total_subplans)
 	{
 		/*
 		 * We can safely skip this when !do_exec_prune, even though that
@@ -1686,11 +1738,73 @@ ExecInitPartitionPruning(PlanState *planstate,
 	return prunestate;
 }
 
+/*
+ * ExecPartitionDoInitialPruning
+ *		Perform initial pruning using given PartitionPruneInfo to determine
+ *		the minimal set of child subplans that will be executed and also the
+ *		set of RT indexes of the leaf partitions scanned by those subplans.
+ */
+Bitmapset *
+ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
+							  PartitionPruneInfo *pruneinfo,
+							  Bitmapset **scan_leafpart_rtis)
+{
+	List		 *rtable = plannedstmt->rtable;
+	ExprContext	 *econtext;
+	PartitionDirectory pdir;
+	MemoryContext oldcontext,
+				  tmpcontext;
+	PartitionPruneState *prunestate;
+	Bitmapset	 *valid_subplan_offs;
+
+	/*
+	 * A temporary context for memory allocations required while executing
+	 * partition pruning steps.
+	 */
+	tmpcontext = AllocSetContextCreate(CurrentMemoryContext,
+									   "initial pruning working data",
+									   ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(tmpcontext);
+
+	/*
+	 * PartitionDirectory to look up partition descriptors, which omits
+	 * detached partitions, just like in the executor proper.
+	 */
+	pdir = CreatePartitionDirectory(CurrentMemoryContext, false);
+
+	/*
+	 * We don't yet have a PlanState for the parent plan node, so we must
+	 * create a standalone ExprContext to evaluate pruning expressions,
+	 * equipped with the information about the EXTERN parameters that the
+	 * caller passed us.  Note that that's okay because the initial pruning
+	 * steps do not contain anything that requires the execution to have
+	 * started.
+	 */
+	econtext = CreateStandaloneExprContext();
+	econtext->ecxt_param_list_info = params;
+	prunestate = CreatePartitionPruneState(NULL, pruneinfo, true, false,
+										   rtable, econtext, pdir);
+	MemoryContextSwitchTo(oldcontext);
+
+	/* Do the initial pruning. */
+	valid_subplan_offs = ExecFindMatchingSubPlans(prunestate, true,
+												  scan_leafpart_rtis);
+
+	FreeExprContext(econtext, true);
+	DestroyPartitionDirectory(pdir);
+	MemoryContextDelete(tmpcontext);
+
+	return valid_subplan_offs;
+}
+
 /*
  * CreatePartitionPruneState
  *		Build the data structure required for calling ExecFindMatchingSubPlans
  *
- * 'planstate' is the parent plan node's execution state.
+ * 'planstate', if not NULL, is the parent plan node's execution state.  It
+ * can be NULL if being called before ExecutorStart(), in which case,
+ * 'rtable' (range table), 'econtext', and 'partdir' must be explicitly
+ * provided.
  *
  * 'pruneinfo' is a PartitionPruneInfo as generated by
  * make_partition_pruneinfo.  Here we build a PartitionPruneState containing a
@@ -1704,19 +1818,21 @@ ExecInitPartitionPruning(PlanState *planstate,
  * PartitionedRelPruneInfo.
  */
 static PartitionPruneState *
-CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
+CreatePartitionPruneState(PlanState *planstate,
+						  PartitionPruneInfo *pruneinfo,
+						  bool consider_initial_steps,
+						  bool consider_exec_steps,
+						  List *rtable, ExprContext *econtext,
+						  PartitionDirectory partdir)
 {
-	EState	   *estate = planstate->state;
+	EState	   *estate = planstate ? planstate->state : NULL;
 	PartitionPruneState *prunestate;
 	int			n_part_hierarchies;
 	ListCell   *lc;
 	int			i;
-	ExprContext *econtext = planstate->ps_ExprContext;
 
-	/* For data reading, executor always omits detached partitions */
-	if (estate->es_partition_directory == NULL)
-		estate->es_partition_directory =
-			CreatePartitionDirectory(estate->es_query_cxt, false);
+	Assert((estate != NULL) ||
+			(partdir != NULL && econtext != NULL && rtable != NIL));
 
 	n_part_hierarchies = list_length(pruneinfo->prune_infos);
 	Assert(n_part_hierarchies > 0);
@@ -1771,15 +1887,42 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 			PartitionKey partkey;
 
 			/*
-			 * We can rely on the copies of the partitioned table's partition
-			 * key and partition descriptor appearing in its relcache entry,
-			 * because that entry will be held open and locked for the
-			 * duration of this executor run.
+			 * Must open the relation ourselves when called before execution
+			 * has started, such as when called during
+			 * ExecutorDoInitialPruning() on a cached plan.  In that case,
+			 * sub-partitions must be locked, because AcquirePlannerLocks()
+			 * would not have seen them.  (The 1st relation in a
+			 * partrelpruneinfos list is always the root partitioned table
+			 * appearing in the query, which AcquirePlannerLocks() would have
+			 * locked; the Assert in relation_open() guards that assumption.)
+			 */
+			if (estate == NULL)
+			{
+				RangeTblEntry *rte = rt_fetch(pinfo->rtindex, rtable);
+				int		lockmode = (j == 0) ? NoLock : rte->rellockmode;
+
+				partrel = table_open(rte->relid, lockmode);
+			}
+			else
+				partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
+
+			/*
+			 * We can rely on the copy of the partitioned table's partition
+			 * key in its relcache entry, because it can't change (or
+			 * get destroyed) as long as the relation is locked.  Partition
+			 * descriptor is taken from the PartitionDirectory associated with
+			 * the table that is held open long enough for the descriptor to
+			 * remain valid while it's used to perform the pruning steps.
 			 */
-			partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
 			partkey = RelationGetPartitionKey(partrel);
-			partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
-												partrel);
+			partdesc = PartitionDirectoryLookup(partdir, partrel);
+
+			/*
+			 * Must close partrel, keeping the lock taken, if we're not using
+			 * EState's entry.
+			 */
+			if (estate == NULL)
+				table_close(partrel, NoLock);
 
 			/*
 			 * Initialize the subplan_map and subpart_map.
@@ -1793,6 +1936,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 			Assert(partdesc->nparts >= pinfo->nparts);
 			pprune->nparts = partdesc->nparts;
 			pprune->subplan_map = palloc(sizeof(int) * partdesc->nparts);
+			pprune->rti_map = palloc(sizeof(Index) * partdesc->nparts);
 			if (partdesc->nparts == pinfo->nparts)
 			{
 				/*
@@ -1803,6 +1947,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 				pprune->subpart_map = pinfo->subpart_map;
 				memcpy(pprune->subplan_map, pinfo->subplan_map,
 					   sizeof(int) * pinfo->nparts);
+				memcpy(pprune->rti_map, pinfo->rti_map,
+					   sizeof(Index) * pinfo->nparts);
 
 				/*
 				 * Double-check that the list of unpruned relations has not
@@ -1853,6 +1999,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 							pinfo->subplan_map[pd_idx];
 						pprune->subpart_map[pp_idx] =
 							pinfo->subpart_map[pd_idx];
+						pprune->rti_map[pp_idx] =
+							pinfo->rti_map[pd_idx];
 						pd_idx++;
 					}
 					else
@@ -1860,6 +2008,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 						/* this partdesc entry is not in the plan */
 						pprune->subplan_map[pp_idx] = -1;
 						pprune->subpart_map[pp_idx] = -1;
+						pprune->rti_map[pp_idx] = 0;
 					}
 				}
 
@@ -1881,7 +2030,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 			 * Initialize pruning contexts as needed.
 			 */
 			pprune->initial_pruning_steps = pinfo->initial_pruning_steps;
-			if (pinfo->initial_pruning_steps)
+			if (consider_initial_steps && pinfo->initial_pruning_steps)
 			{
 				InitPartitionPruneContext(&pprune->initial_context,
 										  pinfo->initial_pruning_steps,
@@ -1891,7 +2040,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 				prunestate->do_initial_prune = true;
 			}
 			pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
-			if (pinfo->exec_pruning_steps)
+			if (consider_exec_steps && pinfo->exec_pruning_steps)
 			{
 				InitPartitionPruneContext(&pprune->exec_context,
 										  pinfo->exec_pruning_steps,
@@ -2119,10 +2268,14 @@ PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
  * Pass initial_prune if PARAM_EXEC Params cannot yet be evaluated.  This
  * differentiates the initial executor-time pruning step from later
  * runtime pruning.
+ *
+ * RT indexes of leaf partitions scanned by the chosen subplans are added to
+ * *scan_leafpart_rtis if the pointer is non-NULL.
  */
 Bitmapset *
 ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
-						 bool initial_prune)
+						 bool initial_prune,
+						 Bitmapset **scan_leafpart_rtis)
 {
 	Bitmapset  *result = NULL;
 	MemoryContext oldcontext;
@@ -2157,7 +2310,7 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
 		 */
 		pprune = &prunedata->partrelprunedata[0];
 		find_matching_subplans_recurse(prunedata, pprune, initial_prune,
-									   &result);
+									   &result, scan_leafpart_rtis);
 
 		/* Expression eval may have used space in ExprContext too */
 		if (pprune->exec_pruning_steps)
@@ -2171,6 +2324,8 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
 
 	/* Copy result out of the temp context before we reset it */
 	result = bms_copy(result);
+	if (scan_leafpart_rtis)
+		*scan_leafpart_rtis = bms_copy(*scan_leafpart_rtis);
 
 	MemoryContextReset(prunestate->prune_context);
 
@@ -2181,13 +2336,15 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
  * find_matching_subplans_recurse
  *		Recursive worker function for ExecFindMatchingSubPlans
  *
- * Adds valid (non-prunable) subplan IDs to *validsubplans
+ * Adds valid (non-prunable) subplan IDs to *validsubplans and RT indexes of
+ * the corresponding leaf partitions to *scan_leafpart_rtis (if asked for).
  */
 static void
 find_matching_subplans_recurse(PartitionPruningData *prunedata,
 							   PartitionedRelPruningData *pprune,
 							   bool initial_prune,
-							   Bitmapset **validsubplans)
+							   Bitmapset **validsubplans,
+							   Bitmapset **scan_leafpart_rtis)
 {
 	Bitmapset  *partset;
 	int			i;
@@ -2214,8 +2371,14 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
 	while ((i = bms_next_member(partset, i)) >= 0)
 	{
 		if (pprune->subplan_map[i] >= 0)
+		{
 			*validsubplans = bms_add_member(*validsubplans,
 											pprune->subplan_map[i]);
+			Assert(pprune->rti_map[i] > 0);
+			if (scan_leafpart_rtis)
+				*scan_leafpart_rtis = bms_add_member(*scan_leafpart_rtis,
+													 pprune->rti_map[i]);
+		}
 		else
 		{
 			int			partidx = pprune->subpart_map[i];
@@ -2223,7 +2386,8 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
 			if (partidx >= 0)
 				find_matching_subplans_recurse(prunedata,
 											   &prunedata->partrelprunedata[partidx],
-											   initial_prune, validsubplans);
+											   initial_prune, validsubplans,
+											   scan_leafpart_rtis);
 			else
 			{
 				/*
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index f9460ae506..a2182a6b1f 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -844,7 +844,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
 	else
 		dest = None_Receiver;
 
-	es->qd = CreateQueryDesc(es->stmt,
+	es->qd = CreateQueryDesc(es->stmt, NULL,
 							 fcache->src,
 							 GetActiveSnapshot(),
 							 InvalidSnapshot,
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index c6f86a6510..96880e122a 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -155,7 +155,8 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 		 * subplan, we can fill as_valid_subplans immediately, preventing
 		 * later calls to ExecFindMatchingSubPlans.
 		 */
-		if (!prunestate->do_exec_prune && nplans > 0)
+		if (appendstate->as_prune_state == NULL ||
+			(!appendstate->as_prune_state->do_exec_prune && nplans > 0))
 			appendstate->as_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
 	}
 	else
@@ -577,7 +578,7 @@ choose_next_subplan_locally(AppendState *node)
 		}
 		else if (node->as_valid_subplans == NULL)
 			node->as_valid_subplans =
-				ExecFindMatchingSubPlans(node->as_prune_state, false);
+				ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 
 		whichplan = -1;
 	}
@@ -642,7 +643,7 @@ choose_next_subplan_for_leader(AppendState *node)
 		if (node->as_valid_subplans == NULL)
 		{
 			node->as_valid_subplans =
-				ExecFindMatchingSubPlans(node->as_prune_state, false);
+				ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 
 			/*
 			 * Mark each invalid plan as finished to allow the loop below to
@@ -717,7 +718,7 @@ choose_next_subplan_for_worker(AppendState *node)
 	else if (node->as_valid_subplans == NULL)
 	{
 		node->as_valid_subplans =
-			ExecFindMatchingSubPlans(node->as_prune_state, false);
+			ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 		mark_invalid_subplans_as_finished(node);
 	}
 
@@ -868,7 +869,7 @@ ExecAppendAsyncBegin(AppendState *node)
 	if (node->as_valid_subplans == NULL)
 	{
 		node->as_valid_subplans =
-			ExecFindMatchingSubPlans(node->as_prune_state, false);
+			ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 
 		classify_matching_subplans(node);
 	}
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index 8d35860c30..2312e5a633 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -103,7 +103,8 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
 		 * subplan, we can fill ms_valid_subplans immediately, preventing
 		 * later calls to ExecFindMatchingSubPlans.
 		 */
-		if (!prunestate->do_exec_prune && nplans > 0)
+		if (mergestate->ms_prune_state == NULL ||
+			(!mergestate->ms_prune_state->do_exec_prune && nplans > 0))
 			mergestate->ms_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
 	}
 	else
@@ -218,7 +219,7 @@ ExecMergeAppend(PlanState *pstate)
 		 */
 		if (node->ms_valid_subplans == NULL)
 			node->ms_valid_subplans =
-				ExecFindMatchingSubPlans(node->ms_prune_state, false);
+				ExecFindMatchingSubPlans(node->ms_prune_state, false, NULL);
 
 		/*
 		 * First time through: pull the first tuple from each valid subplan,
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 29bc26669b..303a572c02 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1578,6 +1578,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 	CachedPlanSource *plansource;
 	CachedPlan *cplan;
 	List	   *stmt_list;
+	List	   *part_prune_result_list;
 	char	   *query_string;
 	Snapshot	snapshot;
 	MemoryContext oldcontext;
@@ -1657,7 +1658,10 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 	 */
 
 	/* Replan if needed, and increment plan refcount for portal */
-	cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv);
+	cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv,
+						  &part_prune_result_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_result_list));
 	stmt_list = cplan->stmt_list;
 
 	if (!plan->saved)
@@ -1685,6 +1689,9 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 					  stmt_list,
 					  cplan);
 
+	/* Copy PartitionPruneResults into the portal's context. */
+	PortalStorePartitionPruneResults(portal, part_prune_result_list);
+
 	/*
 	 * Set up options for portal.  Default SCROLL type is chosen the same way
 	 * as PerformCursorOpen does it.
@@ -2092,7 +2099,8 @@ SPI_plan_get_cached_plan(SPIPlanPtr plan)
 	/* Get the generic plan for the query */
 	cplan = GetCachedPlan(plansource, NULL,
 						  plan->saved ? CurrentResourceOwner : NULL,
-						  _SPI_current->queryEnv);
+						  _SPI_current->queryEnv,
+						  NULL /* Not interested in PartitionPruneResults */);
 	Assert(cplan == plansource->gplan);
 
 	/* Pop the error context stack */
@@ -2473,7 +2481,9 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 	{
 		CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
 		List	   *stmt_list;
-		ListCell   *lc2;
+		List	   *part_prune_result_list;
+		ListCell   *lc2,
+				   *lc3;
 
 		spicallbackarg.query = plansource->query_string;
 
@@ -2549,8 +2559,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 		 * plan, the refcount must be backed by the plan_owner.
 		 */
 		cplan = GetCachedPlan(plansource, options->params,
-							  plan_owner, _SPI_current->queryEnv);
-
+							  plan_owner, _SPI_current->queryEnv,
+							  &part_prune_result_list);
+		Assert(list_length(cplan->stmt_list) ==
+			   list_length(part_prune_result_list));
 		stmt_list = cplan->stmt_list;
 
 		/*
@@ -2589,9 +2601,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 			}
 		}
 
-		foreach(lc2, stmt_list)
+		forboth(lc2, stmt_list, lc3, part_prune_result_list)
 		{
 			PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+			PartitionPruneResult *part_prune_result = lfirst_node(PartitionPruneResult, lc3);
 			bool		canSetTag = stmt->canSetTag;
 			DestReceiver *dest;
 
@@ -2663,7 +2676,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 				else
 					snap = InvalidSnapshot;
 
-				qdesc = CreateQueryDesc(stmt,
+				qdesc = CreateQueryDesc(stmt, part_prune_result,
 										plansource->query_string,
 										snap, crosscheck_snapshot,
 										dest,
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 1421686938..d57478bde9 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -158,6 +158,11 @@
 	token = pg_strtok(&length);		/* skip :fldname */ \
 	local_node->fldname = readIntCols(len)
 
+/* Read an Index array */
+#define READ_INDEX_ARRAY(fldname, len) \
+	token = pg_strtok(&length);		/* skip :fldname */ \
+	local_node->fldname = readIndexCols(len)
+
 /* Read a bool array */
 #define READ_BOOL_ARRAY(fldname, len) \
 	token = pg_strtok(&length);		/* skip :fldname */ \
@@ -623,6 +628,30 @@ readIntCols(int numCols)
 	return int_vals;
 }
 
+/*
+ * readIndexCols
+ */
+Index *
+readIndexCols(int numCols)
+{
+	int			tokenLength,
+				i;
+	const char *token;
+	Index	   *index_vals;
+
+	if (numCols <= 0)
+		return NULL;
+
+	index_vals = (Index *) palloc(numCols * sizeof(Index));
+	for (i = 0; i < numCols; i++)
+	{
+		token = pg_strtok(&tokenLength);
+		index_vals[i] = atoui(token);
+	}
+
+	return index_vals;
+}
+
 /*
  * readBoolCols
  */
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index b11249ed8f..7141035cc4 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -519,7 +519,9 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 	result->parallelModeNeeded = glob->parallelModeNeeded;
 	result->planTree = top_plan;
 	result->partPruneInfos = glob->partPruneInfos;
+	result->containsInitialPruning = glob->containsInitialPruning;
 	result->rtable = glob->finalrtable;
+	result->minLockRelids = glob->minLockRelids;
 	result->resultRelations = glob->resultRelations;
 	result->appendRelations = glob->appendRelations;
 	result->subplans = glob->subplans;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index b8d5610593..da749e331e 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -270,6 +270,16 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 	 */
 	add_rtes_to_flat_rtable(root, false);
 
+	/*
+	 * Add the query's adjusted range of RT indexes to glob->minLockRelids.
+	 * The adjusted RT indexes of prunable relations will be deleted from the
+	 * set below where PartitionPruneInfos are processed.
+	 */
+	glob->minLockRelids =
+		bms_add_range(glob->minLockRelids,
+					  rtoffset + 1,
+					  rtoffset + list_length(root->parse->rtable));
+
 	/*
 	 * Adjust RT indexes of PlanRowMarks and add to final rowmarks list
 	 */
@@ -352,6 +362,7 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 	foreach (lc, root->partPruneInfos)
 	{
 		PartitionPruneInfo *pruneinfo = lfirst(lc);
+		Bitmapset *leafpart_rtis = NULL;
 		ListCell  *l;
 
 		foreach(l, pruneinfo->prune_infos)
@@ -362,15 +373,50 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 			foreach(l2, prune_infos)
 			{
 				PartitionedRelPruneInfo *pinfo = lfirst(l2);
+				int		i;
 
 				/* RT index of the table to which the pinfo belongs. */
 				pinfo->rtindex += rtoffset;
+
+				/* Also of the leaf partitions that might be scanned. */
+				for (i = 0; i < pinfo->nparts; i++)
+				{
+					if (pinfo->rti_map[i] > 0 && pinfo->subplan_map[i] >= 0)
+					{
+						pinfo->rti_map[i] += rtoffset;
+						leafpart_rtis = bms_add_member(leafpart_rtis,
+													   pinfo->rti_map[i]);
+					}
+				}
 			}
 		}
 
+		if (pruneinfo->needs_init_pruning)
+		{
+			glob->containsInitialPruning = true;
+
+			/*
+			 * Delete the leaf partition RTIs from the global set of relations
+			 * to be locked before executing the plan.  AcquireExecutorLocks()
+			 * will find the ones to add to the set after performing initial
+			 * pruning.
+			 */
+			glob->minLockRelids = bms_del_members(glob->minLockRelids,
+												  leafpart_rtis);
+		}
+
 		glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
 	}
 
+	/*
+	 * It seems worth doing a bms_copy() on glob->minLockRelids if we deleted
+	 * bits from it above to get rid of any empty tail bits.  It seems better
+	 * for the loop over this set in AcquireExecutorLocks() to not have to go
+	 * through those useless bit words.
+	 */
+	if (glob->containsInitialPruning)
+		glob->minLockRelids = bms_copy(glob->minLockRelids);
+
 	return result;
 }
 
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index d77f7d3aef..952c5b8327 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -144,7 +144,9 @@ static List *make_partitionedrel_pruneinfo(PlannerInfo *root,
 										   List *prunequal,
 										   Bitmapset *partrelids,
 										   int *relid_subplan_map,
-										   Bitmapset **matchedsubplans);
+										   Bitmapset **matchedsubplans,
+										   bool *needs_init_pruning,
+										   bool *needs_exec_pruning);
 static void gen_partprune_steps(RelOptInfo *rel, List *clauses,
 								PartClauseTarget target,
 								GeneratePruningStepsContext *context);
@@ -234,6 +236,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	int		   *relid_subplan_map;
 	ListCell   *lc;
 	int			i;
+	bool		needs_init_pruning = false;
+	bool		needs_exec_pruning = false;
 
 	/*
 	 * Scan the subpaths to see which ones are scans of partition child
@@ -313,12 +317,16 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		Bitmapset  *partrelids = (Bitmapset *) lfirst(lc);
 		List	   *pinfolist;
 		Bitmapset  *matchedsubplans = NULL;
+		bool		partrel_needs_init_pruning;
+		bool		partrel_needs_exec_pruning;
 
 		pinfolist = make_partitionedrel_pruneinfo(root, parentrel,
 												  prunequal,
 												  partrelids,
 												  relid_subplan_map,
-												  &matchedsubplans);
+												  &matchedsubplans,
+												  &partrel_needs_init_pruning,
+												  &partrel_needs_exec_pruning);
 
 		/* When pruning is possible, record the matched subplans */
 		if (pinfolist != NIL)
@@ -327,6 +335,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 			allmatchedsubplans = bms_join(matchedsubplans,
 										  allmatchedsubplans);
 		}
+
+		needs_init_pruning |= partrel_needs_init_pruning;
+		needs_exec_pruning |= partrel_needs_exec_pruning;
 	}
 
 	pfree(relid_subplan_map);
@@ -341,6 +352,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	/* Else build the result data structure */
 	pruneinfo = makeNode(PartitionPruneInfo);
 	pruneinfo->prune_infos = prunerelinfos;
+	pruneinfo->needs_init_pruning = needs_init_pruning;
+	pruneinfo->needs_exec_pruning = needs_exec_pruning;
 
 	/*
 	 * Some subplans may not belong to any of the identified partitioned rels.
@@ -441,13 +454,18 @@ add_part_relids(List *allpartrelids, Bitmapset *partrelids)
  * If we cannot find any useful run-time pruning steps, return NIL.
  * However, on success, each rel identified in partrelids will have
  * an element in the result list, even if some of them are useless.
+ * *needs_init_pruning and *needs_exec_pruning are set to indicate that the
+ * returned PartitionedRelPruneInfos contain pruning steps that can be
+ * performed before and after execution begins, respectively.
  */
 static List *
 make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 							  List *prunequal,
 							  Bitmapset *partrelids,
 							  int *relid_subplan_map,
-							  Bitmapset **matchedsubplans)
+							  Bitmapset **matchedsubplans,
+							  bool *needs_init_pruning,
+							  bool *needs_exec_pruning)
 {
 	RelOptInfo *targetpart = NULL;
 	List	   *pinfolist = NIL;
@@ -458,6 +476,10 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	int			rti;
 	int			i;
 
+	/* Will find out below. */
+	*needs_init_pruning = false;
+	*needs_exec_pruning = false;
+
 	/*
 	 * Examine each partitioned rel, constructing a temporary array to map
 	 * from planner relids to index of the partitioned rel, and building a
@@ -545,6 +567,9 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		 * executor per-scan pruning steps.  This first pass creates startup
 		 * pruning steps and detects whether there's any possibly-useful quals
 		 * that would require per-scan pruning.
+		 *
+		 * The first pass also determines whether the 2nd pass is necessary
+		 * by checking for the presence of EXEC parameters.
 		 */
 		gen_partprune_steps(subpart, partprunequal, PARTTARGET_INITIAL,
 							&context);
@@ -619,6 +644,12 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		pinfo->execparamids = execparamids;
 		/* Remaining fields will be filled in the next loop */
 
+		/* record which types of pruning steps we've seen so far */
+		if (initial_pruning_steps != NIL)
+			*needs_init_pruning = true;
+		if (exec_pruning_steps != NIL)
+			*needs_exec_pruning = true;
+
 		pinfolist = lappend(pinfolist, pinfo);
 	}
 
@@ -646,6 +677,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		int		   *subplan_map;
 		int		   *subpart_map;
 		Oid		   *relid_map;
+		Index	   *rti_map;
 
 		/*
 		 * Construct the subplan and subpart maps for this partitioning level.
@@ -658,6 +690,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		subpart_map = (int *) palloc(nparts * sizeof(int));
 		memset(subpart_map, -1, nparts * sizeof(int));
 		relid_map = (Oid *) palloc0(nparts * sizeof(Oid));
+		rti_map = (Index *) palloc0(nparts * sizeof(Index));
 		present_parts = NULL;
 
 		i = -1;
@@ -672,6 +705,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 			subplan_map[i] = subplanidx = relid_subplan_map[partrel->relid] - 1;
 			subpart_map[i] = subpartidx = relid_subpart_map[partrel->relid] - 1;
 			relid_map[i] = planner_rt_fetch(partrel->relid, root)->relid;
+			rti_map[i] = partrel->relid;
 			if (subplanidx >= 0)
 			{
 				present_parts = bms_add_member(present_parts, i);
@@ -696,6 +730,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		pinfo->subplan_map = subplan_map;
 		pinfo->subpart_map = subpart_map;
 		pinfo->relid_map = relid_map;
+		pinfo->rti_map = rti_map;
 	}
 
 	pfree(relid_subpart_map);
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 6f18b68856..16bda42f11 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1596,6 +1596,7 @@ exec_bind_message(StringInfo input_message)
 	int16	   *rformats = NULL;
 	CachedPlanSource *psrc;
 	CachedPlan *cplan;
+	List	   *part_prune_result_list;
 	Portal		portal;
 	char	   *query_string;
 	char	   *saved_stmt_name;
@@ -1971,7 +1972,9 @@ exec_bind_message(StringInfo input_message)
 	 * will be generated in MessageContext.  The plan refcount will be
 	 * assigned to the Portal, so it will be released at portal destruction.
 	 */
-	cplan = GetCachedPlan(psrc, params, NULL, NULL);
+	cplan = GetCachedPlan(psrc, params, NULL, NULL, &part_prune_result_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_result_list));
 
 	/*
 	 * Now we can define the portal.
@@ -1986,6 +1989,9 @@ exec_bind_message(StringInfo input_message)
 					  cplan->stmt_list,
 					  cplan);
 
+	/* Copy PartitionPruneResults into the portal's context. */
+	PortalStorePartitionPruneResults(portal, part_prune_result_list);
+
 	/* Done with the snapshot used for parameter I/O and parsing/planning */
 	if (snapshot_set)
 		PopActiveSnapshot();
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 5aa5a350f3..8cc2e2162d 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -35,7 +35,7 @@
 Portal		ActivePortal = NULL;
 
 
-static void ProcessQuery(PlannedStmt *plan,
+static void ProcessQuery(PlannedStmt *plan, PartitionPruneResult *part_prune_result,
 						 const char *sourceText,
 						 ParamListInfo params,
 						 QueryEnvironment *queryEnv,
@@ -65,6 +65,7 @@ static void DoPortalRewind(Portal portal);
  */
 QueryDesc *
 CreateQueryDesc(PlannedStmt *plannedstmt,
+				PartitionPruneResult *part_prune_result,
 				const char *sourceText,
 				Snapshot snapshot,
 				Snapshot crosscheck_snapshot,
@@ -77,6 +78,8 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
 
 	qd->operation = plannedstmt->commandType;	/* operation */
 	qd->plannedstmt = plannedstmt;	/* plan */
+	qd->part_prune_result = part_prune_result;	/* ExecutorDoInitialPruning()
+												 * output for plan */
 	qd->sourceText = sourceText;	/* query text */
 	qd->snapshot = RegisterSnapshot(snapshot);	/* snapshot */
 	/* RI check snapshot */
@@ -122,6 +125,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  *		PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
  *
  *	plan: the plan tree for the query
+ *	part_prune_result: ExecutorDoInitialPruning() output for the plan tree
  *	sourceText: the source text of the query
  *	params: any parameters needed
  *	dest: where to send results
@@ -134,6 +138,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  */
 static void
 ProcessQuery(PlannedStmt *plan,
+			 PartitionPruneResult *part_prune_result,
 			 const char *sourceText,
 			 ParamListInfo params,
 			 QueryEnvironment *queryEnv,
@@ -145,7 +150,7 @@ ProcessQuery(PlannedStmt *plan,
 	/*
 	 * Create the QueryDesc object
 	 */
-	queryDesc = CreateQueryDesc(plan, sourceText,
+	queryDesc = CreateQueryDesc(plan, part_prune_result, sourceText,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, params, queryEnv, 0);
 
@@ -491,8 +496,13 @@ PortalStart(Portal portal, ParamListInfo params,
 				/*
 				 * Create QueryDesc in portal's context; for the moment, set
 				 * the destination to DestNone.
+				 *
+				 * There is no PartitionPruneResult unless the PlannedStmt is
+				 * from a CachedPlan.
 				 */
 				queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+											portal->part_prune_results == NIL ? NULL :
+											linitial(portal->part_prune_results),
 											portal->sourceText,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
@@ -1225,6 +1235,8 @@ PortalRunMulti(Portal portal,
 
 		if (pstmt->utilityStmt == NULL)
 		{
+			PartitionPruneResult *part_prune_result = NULL;
+
 			/*
 			 * process a plannable query.
 			 */
@@ -1271,10 +1283,18 @@ PortalRunMulti(Portal portal,
 			else
 				UpdateActiveSnapshotCommandId();
 
+			/*
+			 * Determine if there's a corresponding PartitionPruneResult for
+			 * this PlannedStmt.
+			 */
+			if (portal->part_prune_results != NIL)
+				part_prune_result = list_nth(portal->part_prune_results,
+											 foreach_current_index(stmtlist_item));
+
 			if (pstmt->canSetTag)
 			{
 				/* statement can set tag string */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, part_prune_result,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
@@ -1283,7 +1303,7 @@ PortalRunMulti(Portal portal,
 			else
 			{
 				/* stmt added by rewrite cannot set tag */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, part_prune_result,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 0d6a295674..d1c9605979 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -99,14 +99,19 @@ static dlist_head cached_expression_list = DLIST_STATIC_INIT(cached_expression_l
 static void ReleaseGenericPlan(CachedPlanSource *plansource);
 static List *RevalidateCachedQuery(CachedPlanSource *plansource,
 								   QueryEnvironment *queryEnv);
-static bool CheckCachedPlan(CachedPlanSource *plansource);
+static bool CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+							List **part_prune_result_list);
 static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
-								   ParamListInfo boundParams, QueryEnvironment *queryEnv);
+								   ParamListInfo boundParams, QueryEnvironment *queryEnv,
+								   List **part_prune_result_list);
 static bool choose_custom_plan(CachedPlanSource *plansource,
 							   ParamListInfo boundParams);
 static double cached_plan_cost(CachedPlan *plan, bool include_planner);
 static Query *QueryListGetPrimaryStmt(List *stmts);
-static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static void AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+								 List **part_prune_result_list,
+								 List **lockedRelids_per_stmt);
+static void ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt);
 static void AcquirePlannerLocks(List *stmt_list, bool acquire);
 static void ScanQueryForLocks(Query *parsetree, bool acquire);
 static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -790,15 +795,20 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
  *
  * On a "true" return, we have acquired the locks needed to run the plan.
  * (We must do this for the "true" result to be race-condition-free.)
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_result_list.
  */
 static bool
-CheckCachedPlan(CachedPlanSource *plansource)
+CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+				List **part_prune_result_list)
 {
 	CachedPlan *plan = plansource->gplan;
 
 	/* Assert that caller checked the querytree */
 	Assert(plansource->is_valid);
 
+	*part_prune_result_list = NIL;
+
 	/* If there's no generic plan, just say "false" */
 	if (!plan)
 		return false;
@@ -820,13 +830,21 @@ CheckCachedPlan(CachedPlanSource *plansource)
 	 */
 	if (plan->is_valid)
 	{
+		List   *lockedRelids_per_stmt;
+
 		/*
 		 * Plan must have positive refcount because it is referenced by
 		 * plansource; so no need to fear it disappears under us here.
 		 */
 		Assert(plan->refcount > 0);
 
-		AcquireExecutorLocks(plan->stmt_list, true);
+		/*
+		 * Lock relations scanned by the plan.  This is where the pruning
+		 * happens if needed.
+		 */
+		AcquireExecutorLocks(plan->stmt_list, boundParams,
+							 part_prune_result_list,
+							 &lockedRelids_per_stmt);
 
 		/*
 		 * If plan was transient, check to see if TransactionXmin has
@@ -848,7 +866,14 @@ CheckCachedPlan(CachedPlanSource *plansource)
 		}
 
 		/* Oops, the race case happened.  Release useless locks. */
-		AcquireExecutorLocks(plan->stmt_list, false);
+		ReleaseExecutorLocks(plan->stmt_list, lockedRelids_per_stmt);
+
+		/*
+		 * The output list and its contents were allocated in the caller's
+		 * (hopefully short-lived) context, so they won't stay leaked for long.
+		 * Still, reset the pointer so that nobody looks at it by accident.
+		 */
+		*part_prune_result_list = NIL;
 	}
 
 	/*
@@ -874,10 +899,15 @@ CheckCachedPlan(CachedPlanSource *plansource)
  * Planning work is done in the caller's memory context.  The finished plan
  * is in a child memory context, which typically should get reparented
  * (unless this is a one-shot plan, in which case we don't copy the plan).
+ *
+ * A list of NULLs is returned in *part_prune_result_list, meaning that no
+ * PartitionPruneResult nodes have yet been created for the plans in
+ * stmt_list.
  */
 static CachedPlan *
 BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
-				ParamListInfo boundParams, QueryEnvironment *queryEnv)
+				ParamListInfo boundParams, QueryEnvironment *queryEnv,
+				List **part_prune_result_list)
 {
 	CachedPlan *plan;
 	List	   *plist;
@@ -1007,6 +1037,17 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 
 	MemoryContextSwitchTo(oldcxt);
 
+	/*
+	 * No actual PartitionPruneResults yet to add, though must initialize
+	 * the list to have the same number of elements as the list of
+	 * PlannedStmts.
+	 */
+	*part_prune_result_list = NIL;
+	foreach(lc, plist)
+	{
+		*part_prune_result_list = lappend(*part_prune_result_list, NULL);
+	}
+
 	return plan;
 }
 
@@ -1126,6 +1167,17 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
  * plan or a custom plan for the given parameters: the caller does not know
  * which it will get.
  *
+ * For every PlannedStmt in the returned CachedPlan, one element, either a
+ * PartitionPruneResult or NULL, is added to *part_prune_result_list, provided
+ * the caller passed a non-NULL pointer.  The element is a
+ * PartitionPruneResult if the PlannedStmt comes from an existing, otherwise
+ * valid CachedPlan and contains at least one PartitionPruneInfo that has
+ * "initial" pruning steps.  Those steps are performed by calling
+ * ExecutorDoInitialPruning(), which prunes subplans that don't match the
+ * pruning conditions so that AcquireExecutorLocks() locks only the surviving
+ * leaf partitions.  The PartitionPruneResult contains a list of bitmapsets of
+ * the indexes of matching subplans, one for each PartitionPruneInfo.
+ *
  * On return, the plan is valid and we have sufficient locks to begin
  * execution.
  *
@@ -1139,11 +1191,13 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
  */
 CachedPlan *
 GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
-			  ResourceOwner owner, QueryEnvironment *queryEnv)
+			  ResourceOwner owner, QueryEnvironment *queryEnv,
+			  List **part_prune_result_list)
 {
 	CachedPlan *plan = NULL;
 	List	   *qlist;
 	bool		customplan;
+	List	   *my_part_prune_result_list;
 
 	/* Assert caller is doing things in a sane order */
 	Assert(plansource->magic == CACHEDPLANSOURCE_MAGIC);
@@ -1160,7 +1214,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 
 	if (!customplan)
 	{
-		if (CheckCachedPlan(plansource))
+		if (CheckCachedPlan(plansource, boundParams,
+							&my_part_prune_result_list))
 		{
 			/* We want a generic plan, and we already have a valid one */
 			plan = plansource->gplan;
@@ -1169,7 +1224,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 		else
 		{
 			/* Build a new generic plan */
-			plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv);
+			plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv,
+								   &my_part_prune_result_list);
 			/* Just make real sure plansource->gplan is clear */
 			ReleaseGenericPlan(plansource);
 			/* Link the new generic plan into the plansource */
@@ -1214,7 +1270,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 	if (customplan)
 	{
 		/* Build a custom plan */
-		plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv);
+		plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv,
+							   &my_part_prune_result_list);
 		/* Accumulate total costs of custom plans */
 		plansource->total_custom_cost += cached_plan_cost(plan, true);
 
@@ -1246,6 +1303,9 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 		plan->is_saved = true;
 	}
 
+	if (part_prune_result_list)
+		*part_prune_result_list = my_part_prune_result_list;
+
 	return plan;
 }
 
@@ -1737,17 +1797,29 @@ QueryListGetPrimaryStmt(List *stmts)
 
 /*
  * AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
- * or release them if acquire is false.
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_result_list.
+ *
+ * On return, *lockedRelids_per_stmt will contain a bitmapset for every
+ * PlannedStmt in stmt_list, containing the RT indexes of relation entries
+ * in its range table that were actually locked, or NULL if the PlannedStmt
+ * contains a utility statement.
  */
 static void
-AcquireExecutorLocks(List *stmt_list, bool acquire)
+AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+					 List **part_prune_result_list,
+					 List **lockedRelids_per_stmt)
 {
 	ListCell   *lc1;
 
+	*part_prune_result_list = *lockedRelids_per_stmt = NIL;
 	foreach(lc1, stmt_list)
 	{
 		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
-		ListCell   *lc2;
+		PartitionPruneResult *part_prune_result = NULL;
+		Bitmapset  *allLockRelids;
+		Bitmapset  *lockedRelids = NULL;
+		int			rti;
 
 		if (plannedstmt->commandType == CMD_UTILITY)
 		{
@@ -1761,13 +1833,38 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 			Query	   *query = UtilityContainsQuery(plannedstmt->utilityStmt);
 
 			if (query)
-				ScanQueryForLocks(query, acquire);
+				ScanQueryForLocks(query, true);
+			*part_prune_result_list = lappend(*part_prune_result_list, NULL);
 			continue;
 		}
 
-		foreach(lc2, plannedstmt->rtable)
+		/*
+		 * Figure out the set of relations that would need to be locked
+		 * before executing the plan.
+		 */
+		if (plannedstmt->containsInitialPruning)
 		{
-			RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
+			/*
+			 * Obtain the set of leaf partitions to be locked.
+			 *
+			 * The following does initial partition pruning using the
+			 * PartitionPruneInfos found in plannedstmt->partPruneInfos and
+			 * finds leaf partitions that survive that pruning across all the
+			 * nodes in the plan tree.
+			 */
+			part_prune_result =
+				ExecutorDoInitialPruning(plannedstmt, boundParams);
+
+			allLockRelids = bms_union(plannedstmt->minLockRelids,
+									  part_prune_result->scan_leafpart_rtis);
+		}
+		else
+			allLockRelids = plannedstmt->minLockRelids;
+
+		rti = -1;
+		while ((rti = bms_next_member(allLockRelids, rti)) > 0)
+		{
+			RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
 
 			if (rte->rtekind != RTE_RELATION)
 				continue;
@@ -1778,10 +1875,58 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 			 * fail if it's been dropped entirely --- we'll just transiently
 			 * acquire a non-conflicting lock.
 			 */
-			if (acquire)
-				LockRelationOid(rte->relid, rte->rellockmode);
-			else
-				UnlockRelationOid(rte->relid, rte->rellockmode);
+			LockRelationOid(rte->relid, rte->rellockmode);
+			lockedRelids = bms_add_member(lockedRelids, rti);
+		}
+
+		*part_prune_result_list = lappend(*part_prune_result_list,
+										  part_prune_result);
+		*lockedRelids_per_stmt = lappend(*lockedRelids_per_stmt, lockedRelids);
+	}
+}
+
+/*
+ * ReleaseExecutorLocks
+ * 		Release locks that would've been acquired by an earlier call to
+ * 		AcquireExecutorLocks()
+ */
+static void
+ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt)
+{
+	ListCell   *lc1,
+			   *lc2;
+
+	forboth(lc1, stmt_list, lc2, lockedRelids_per_stmt)
+	{
+		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
+		Bitmapset  *lockedRelids = lfirst(lc2);
+		int			rti;
+
+		if (plannedstmt->commandType == CMD_UTILITY)
+		{
+			/*
+			 * Ignore utility statements, except those (such as EXPLAIN) that
+			 * contain a parsed-but-not-planned query.  Note: it's okay to use
+			 * ScanQueryForLocks, even though the query hasn't been through
+			 * rule rewriting, because rewriting doesn't change the query
+			 * representation.
+			 */
+			Query	   *query = UtilityContainsQuery(plannedstmt->utilityStmt);
+
+			if (query)
+				ScanQueryForLocks(query, false);
+			continue;
+		}
+
+		rti = -1;
+		while ((rti = bms_next_member(lockedRelids, rti)) >= 0)
+		{
+			RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
+
+			Assert(rte->rtekind == RTE_RELATION);
+
+			/* See the comment in AcquireExecutorLocks(). */
+			UnlockRelationOid(rte->relid, rte->rellockmode);
 		}
 	}
 }
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index d549f66d4a..1bbe6b704b 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -303,6 +303,25 @@ PortalDefineQuery(Portal portal,
 	portal->status = PORTAL_DEFINED;
 }
 
+/*
+ * PortalStorePartitionPruneResults
+ *		Copy the given list of PartitionPruneResults into the portal's
+ *		context
+ *
+ * This allows the caller to ensure that the list exists as long as the portal
+ * does.
+ */
+void
+PortalStorePartitionPruneResults(Portal portal, List *part_prune_results)
+{
+	MemoryContext	oldcxt;
+
+	AssertArg(PortalIsValid(portal));
+	oldcxt = MemoryContextSwitchTo(portal->portalContext);
+	portal->part_prune_results = copyObject(part_prune_results);
+	MemoryContextSwitchTo(oldcxt);
+}
+
 /*
  * PortalReleaseCachedPlan
  *		Release a portal's reference to its cached plan, if any.
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 9ebde089ae..e57e133f0e 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -87,7 +87,9 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
 							  ExplainState *es, const char *queryString,
 							  ParamListInfo params, QueryEnvironment *queryEnv);
 
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt,
+						   PartitionPruneResult *part_prune_result,
+						   IntoClause *into,
 						   ExplainState *es, const char *queryString,
 						   ParamListInfo params, QueryEnvironment *queryEnv,
 						   const instr_time *planduration,
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index bf962af7af..bd8776402e 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -45,6 +45,7 @@ extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
  * nparts						Length of subplan_map[] and subpart_map[].
  * subplan_map					Subplan index by partition index, or -1.
  * subpart_map					Subpart index by partition index, or -1.
+ * rti_map						Range table index by partition index, or 0.
  * present_parts				A Bitmapset of the partition indexes that we
  *								have subplans or subparts for.
  * initial_pruning_steps		List of PartitionPruneSteps used to
@@ -61,6 +62,7 @@ typedef struct PartitionedRelPruningData
 	int			nparts;
 	int		   *subplan_map;
 	int		   *subpart_map;
+	Index	   *rti_map;
 	Bitmapset  *present_parts;
 	List	   *initial_pruning_steps;
 	List	   *exec_pruning_steps;
@@ -126,5 +128,10 @@ extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
 													 int part_prune_index,
 													 Bitmapset **initially_valid_subplans);
 extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
-										   bool initial_prune);
+										   bool initial_prune,
+										   Bitmapset **scan_leafpart_rtis);
+extern Bitmapset *ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt,
+								ParamListInfo params,
+								PartitionPruneInfo *pruneinfo,
+								Bitmapset **scan_leafpart_rtis);
 #endif							/* EXECPARTITION_H */
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index e79e2c001f..60d5644908 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -35,6 +35,8 @@ typedef struct QueryDesc
 	/* These fields are provided by CreateQueryDesc */
 	CmdType		operation;		/* CMD_SELECT, CMD_UPDATE, etc. */
 	PlannedStmt *plannedstmt;	/* planner's output (could be utility, too) */
+	PartitionPruneResult *part_prune_result; /* ExecutorDoInitialPruning()'s
+											  * output for plannedstmt */
 	const char *sourceText;		/* source text of the query */
 	Snapshot	snapshot;		/* snapshot to use for query */
 	Snapshot	crosscheck_snapshot;	/* crosscheck for RI update/delete */
@@ -57,6 +59,7 @@ typedef struct QueryDesc
 
 /* in pquery.c */
 extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
+								  PartitionPruneResult *part_prune_result,
 								  const char *sourceText,
 								  Snapshot snapshot,
 								  Snapshot crosscheck_snapshot,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index d68a6b9d28..5c4a282be0 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -185,6 +185,8 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
 /*
  * prototypes from functions in execMain.c
  */
+extern PartitionPruneResult *ExecutorDoInitialPruning(PlannedStmt *plannedstmt,
+													  ParamListInfo params);
 extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
 extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
 extern void ExecutorRun(QueryDesc *queryDesc,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 63a89474db..12ea06c2f6 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -1001,6 +1001,33 @@ typedef struct DomainConstraintState
  */
 typedef TupleTableSlot *(*ExecProcNodeMtd) (struct PlanState *pstate);
 
+/*----------------
+ * PartitionPruneResult
+ *
+ * The result of invoking ExecutorDoInitialPruning() on a given
+ * PlannedStmt.
+ *
+ * Contains a list of Bitmapsets of the indexes of the subplans remaining after
+ * performing initial pruning by calling ExecFindMatchingSubPlans() for every
+ * PartitionPruneInfo found in PlannedStmt.partPruneInfos.  RT indexes of the
+ * leaf partitions scanned by those subplans across all PartitionPruneInfos
+ * are added into scan_leafpart_rtis.
+ *
+ * This is used by GetCachedPlan() to inform its callers of the pruning
+ * decisions made when performing AcquireExecutorLocks() on a given cached
+ * PlannedStmt, which the callers then pass on to the executor.  The executor
+ * refers to this node when initializing the plan nodes which contain subplans
+ * that may have been pruned by ExecutorDoInitialPruning(), rather than
+ * redoing initial pruning.
+ */
+typedef struct PartitionPruneResult
+{
+	NodeTag		type;
+
+	List		   *valid_subplan_offs_list;
+	Bitmapset	   *scan_leafpart_rtis;
+} PartitionPruneResult;
+
 /* ----------------
  *		PlanState node
  *
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index cdd6debfa0..b33d9e426d 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -212,6 +212,7 @@ extern struct Bitmapset *readBitmapset(void);
 extern uintptr_t readDatum(bool typbyval);
 extern bool *readBoolCols(int numCols);
 extern int *readIntCols(int numCols);
+extern Index *readIndexCols(int numCols);
 extern Oid *readOidCols(int numCols);
 extern int16 *readAttrNumberCols(int numCols);
 
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index d87957ff6c..7957aeb6d7 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -125,6 +125,19 @@ typedef struct PlannerGlobal
 	/* List of PartitionPruneInfo contained in the plan */
 	List	   *partPruneInfos;
 
+	/*
+	 * Do any of those PartitionPruneInfos have initial (pre-exec) pruning
+	 * steps in them?
+	 */
+	bool		containsInitialPruning;
+
+	/*
+	 * Indexes of all range table entries minus indexes of range table entries
+	 * of the leaf partitions scanned by prunable subplans; see
+	 * AcquireExecutorLocks()
+	 */
+	Bitmapset  *minLockRelids;
+
 	/* OIDs of relations the plan depends on */
 	List	   *relationOids;
 
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index f2daabb3b7..1d2c0d9bdf 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -72,8 +72,17 @@ typedef struct PlannedStmt
 	List	   *partPruneInfos;	/* List of PartitionPruneInfo contained in
 								 * the plan */
 
+	bool		containsInitialPruning;	/* Do any of those PartitionPruneInfos
+										 * have initial (pre-exec) pruning
+										 * steps in them? */
+
 	List	   *rtable;			/* list of RangeTblEntry nodes */
 
+	Bitmapset  *minLockRelids;	/* Indexes of all range table entries minus
+								 * indexes of range table entries of the leaf
+								 * partitions scanned by prunable subplans;
+								 * see AcquireExecutorLocks() */
+
 	/* rtable indexes of target relations for INSERT/UPDATE/DELETE */
 	List	   *resultRelations;	/* integer list of RT indexes, or NIL */
 
@@ -1409,6 +1418,13 @@ typedef struct PlanRowMark
  * prune_infos			List of Lists containing PartitionedRelPruneInfo nodes,
  *						one sublist per run-time-prunable partition hierarchy
  *						appearing in the parent plan node's subplans.
+ *
+ * needs_init_pruning	Do any of the PartitionedRelPruneInfos in
+ *						prune_infos have initial_pruning_steps set?
+ *
+ * needs_exec_pruning	Do any of the PartitionedRelPruneInfos in
+ *						prune_infos have exec_pruning_steps set?
+ *
  * other_subplans		Indexes of any subplans that are not accounted for
  *						by any of the PartitionedRelPruneInfo nodes in
  *						"prune_infos".  These subplans must not be pruned.
@@ -1419,6 +1435,8 @@ typedef struct PartitionPruneInfo
 
 	NodeTag		type;
 	List	   *prune_infos;
+	bool		needs_init_pruning;
+	bool		needs_exec_pruning;
 	Bitmapset  *other_subplans;
 } PartitionPruneInfo;
 
@@ -1463,6 +1481,9 @@ typedef struct PartitionedRelPruneInfo
 	/* relation OID by partition index, or 0 */
 	Oid		   *relid_map pg_node_attr(array_size(nparts));
 
+	/* Range table index by partition index, or 0. */
+	Index	   *rti_map pg_node_attr(array_size(nparts));
+
 	/*
 	 * initial_pruning_steps shows how to prune during executor startup (i.e.,
 	 * without use of any PARAM_EXEC Params); it is NIL if no startup pruning
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 0499635f59..1c5bb5ece1 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -220,7 +220,8 @@ extern List *CachedPlanGetTargetList(CachedPlanSource *plansource,
 extern CachedPlan *GetCachedPlan(CachedPlanSource *plansource,
 								 ParamListInfo boundParams,
 								 ResourceOwner owner,
-								 QueryEnvironment *queryEnv);
+								 QueryEnvironment *queryEnv,
+								 List **part_prune_result_list);
 extern void ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner);
 
 extern bool CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index aeddbdafe5..9f7727a837 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -138,6 +138,7 @@ typedef struct PortalData
 	QueryCompletion qc;			/* command completion data for executed query */
 	List	   *stmts;			/* list of PlannedStmts */
 	CachedPlan *cplan;			/* CachedPlan, if stmts are from one */
+	List	   *part_prune_results;	/* list of PartitionPruneResults */
 
 	ParamListInfo portalParams; /* params to pass to query */
 	QueryEnvironment *queryEnv; /* environment for query */
@@ -242,6 +243,8 @@ extern void PortalDefineQuery(Portal portal,
 							  CommandTag commandTag,
 							  List *stmts,
 							  CachedPlan *cplan);
+extern void PortalStorePartitionPruneResults(Portal portal,
+											 List *part_prune_result_list);
 extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
 extern void PortalCreateHoldStore(Portal portal);
 extern void PortalHashTableDeleteAll(void);
-- 
2.35.3
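
To make the locking scheme in the patch above easier to follow, here is a
minimal standalone sketch in C.  It is not PostgreSQL code: RelidSet (a
64-bit mask standing in for a Bitmapset of RT indexes), lock_count[],
acquire_locks() and release_locks() are all hypothetical names invented for
illustration.  The idea it tries to show is the one described in the
comments of AcquireExecutorLocks() and ReleaseExecutorLocks(): lock the
always-needed relations plus only those leaf partitions that survive
initial pruning, remember exactly which ones were locked, and later release
that same set.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a Bitmapset of range table (RT) indexes. */
typedef uint64_t RelidSet;

#define MAX_RELS 64

/* Toy lock table, indexed by RT index; counts locks currently held. */
static int	lock_count[MAX_RELS];

static RelidSet
relid_set_add(RelidSet set, int rti)
{
	return set | ((RelidSet) 1 << rti);
}

static int
relid_set_contains(RelidSet set, int rti)
{
	return (int) ((set >> rti) & 1);
}

/*
 * Lock the relations that must always be locked plus the leaf partitions
 * that survived initial pruning.  Returns the set that was actually locked
 * so that the caller can undo exactly that later.
 */
static RelidSet
acquire_locks(RelidSet min_lock_relids, RelidSet surviving_leaf_rtis)
{
	RelidSet	all = min_lock_relids | surviving_leaf_rtis;
	RelidSet	locked = 0;

	/* RT indexes start at 1, so index 0 is never considered. */
	for (int rti = 1; rti < MAX_RELS; rti++)
	{
		if (!relid_set_contains(all, rti))
			continue;
		lock_count[rti]++;
		locked = relid_set_add(locked, rti);
	}
	return locked;
}

/* Release the locks recorded by an earlier acquire_locks() call. */
static void
release_locks(RelidSet locked)
{
	for (int rti = 1; rti < MAX_RELS; rti++)
	{
		if (relid_set_contains(locked, rti))
			lock_count[rti]--;
	}
}
```

With a parent at RT index 1 and leaf partitions at 2..5, calling
acquire_locks() with min_lock_relids = {1} and surviving_leaf_rtis = {3}
locks only indexes 1 and 3; the pruned partitions at 2, 4 and 5 are never
touched, which is where the savings for generic plans are expected to come
from.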




* Re: generic plans and "initial" pruning
@ 2022-07-27 03:00  Amit Langote <[email protected]>
  parent: Amit Langote <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: Amit Langote @ 2022-07-27 03:00 UTC (permalink / raw)
  To: Jacob Champion <[email protected]>; +Cc: Zhihong Yu <[email protected]>; David Rowley <[email protected]>; Robert Haas <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers

On Wed, Jul 13, 2022 at 4:03 PM Amit Langote <[email protected]> wrote:
> On Wed, Jul 13, 2022 at 3:40 PM Amit Langote <[email protected]> wrote:
> > Rebased over 964d01ae90c.
>
> Sorry, left some pointless hunks in there while rebasing.  Fixed in
> the attached.

Needed to be rebased again, over 2d04277121f this time.

-- 
Thanks, Amit Langote
EDB: http://www.enterprisedb.com


Attachments:

  [application/octet-stream] v20-0001-Move-PartitioPruneInfo-out-of-plan-nodes-into-Pl.patch (17.3K, 2-v20-0001-Move-PartitioPruneInfo-out-of-plan-nodes-into-Pl.patch)
  download | inline diff:
From 8de25528e8f388beffdab3d7c9905712e2f8eeef Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Fri, 27 May 2022 16:00:28 +0900
Subject: [PATCH v20 1/2] Move PartitionPruneInfo out of plan nodes into
 PlannedStmt

The planner will now add a given PartitionPruneInfo to
PlannedStmt.partPruneInfos instead of directly to the
Append/MergeAppend plan node.  What gets set instead in the
latter is an index field which points to the list element
of PlannedStmt.partPruneInfos containing the PartitionPruneInfo
belonging to the plan node.

A later commit will make AcquireExecutorLocks() do the initial
partition pruning to determine a minimal set of partitions to be
locked when validating a plan tree, and it will need to consult the
PartitionPruneInfos referenced therein to do so.  It is better for
the PartitionPruneInfos to be directly accessible, by simply
iterating over PlannedStmt.partPruneInfos, than to require a walk
of the plan tree to find them.
---
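
The indirection described in the commit message can be sketched with a
small standalone C example.  This is only an illustration under assumed
names: ToyPlannedStmt, ToyAppend, ToyPruneInfo and the two helper functions
are hypothetical and merely mimic the shape of PlannedStmt.partPruneInfos
and the part_prune_index field, with -1 meaning "no run-time pruning" as in
create_append_plan().

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_PRUNE_INFOS 8

/* Hypothetical stand-in for a PartitionPruneInfo. */
typedef struct ToyPruneInfo
{
	int			nparts;
} ToyPruneInfo;

/* Hypothetical stand-in for a PlannedStmt owning all pruning infos. */
typedef struct ToyPlannedStmt
{
	ToyPruneInfo *part_prune_infos[TOY_MAX_PRUNE_INFOS];
	int			n_part_prune_infos;
} ToyPlannedStmt;

/* Hypothetical stand-in for an Append/MergeAppend plan node. */
typedef struct ToyAppend
{
	int			part_prune_index;	/* -1 if the node has no pruning info */
} ToyAppend;

/*
 * Planner side: store the pruning info in the statement-level list and
 * return its position, which the plan node then remembers.  Returns -1 if
 * the toy list is full.
 */
static int
toy_register_prune_info(ToyPlannedStmt *stmt, ToyPruneInfo *info)
{
	if (stmt->n_part_prune_infos >= TOY_MAX_PRUNE_INFOS)
		return -1;
	stmt->part_prune_infos[stmt->n_part_prune_infos] = info;
	return stmt->n_part_prune_infos++;
}

/*
 * Executor side: resolve a node's index back to its pruning info, or
 * return NULL when the node has none.
 */
static ToyPruneInfo *
toy_lookup_prune_info(const ToyPlannedStmt *stmt, const ToyAppend *node)
{
	if (node->part_prune_index < 0 ||
		node->part_prune_index >= stmt->n_part_prune_infos)
		return NULL;
	return stmt->part_prune_infos[node->part_prune_index];
}
```

Because every pruning info lives in one flat list, code that needs all of
them (such as the lock-acquisition step mentioned above) can loop over
part_prune_infos[0 .. n_part_prune_infos - 1] without descending into the
plan tree.
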
 src/backend/executor/execMain.c         |  1 +
 src/backend/executor/execParallel.c     |  1 +
 src/backend/executor/execPartition.c    |  4 +-
 src/backend/executor/execUtils.c        |  2 +
 src/backend/executor/nodeAppend.c       |  4 +-
 src/backend/executor/nodeMergeAppend.c  |  4 +-
 src/backend/optimizer/plan/createplan.c | 24 ++++-----
 src/backend/optimizer/plan/planner.c    |  1 +
 src/backend/optimizer/plan/setrefs.c    | 65 +++++++++++++------------
 src/backend/partitioning/partprune.c    | 18 ++++---
 src/include/executor/execPartition.h    |  3 +-
 src/include/nodes/execnodes.h           |  2 +
 src/include/nodes/pathnodes.h           |  6 +++
 src/include/nodes/plannodes.h           | 11 +++--
 src/include/partitioning/partprune.h    |  8 +--
 15 files changed, 92 insertions(+), 62 deletions(-)

diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index ef2fd46092..72fc273524 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -825,6 +825,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 	ExecInitRangeTable(estate, rangeTable);
 
 	estate->es_plannedstmt = plannedstmt;
+	estate->es_part_prune_infos = plannedstmt->partPruneInfos;
 
 	/*
 	 * Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index f1fd7f7e8b..f73b8c2607 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -183,6 +183,7 @@ ExecSerializePlan(Plan *plan, EState *estate)
 	pstmt->dependsOnRole = false;
 	pstmt->parallelModeNeeded = false;
 	pstmt->planTree = plan;
+	pstmt->partPruneInfos = estate->es_part_prune_infos;
 	pstmt->rtable = estate->es_range_table;
 	pstmt->resultRelations = NIL;
 	pstmt->appendRelations = NIL;
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index e03ea27299..b55cdd2580 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -1638,11 +1638,13 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
 PartitionPruneState *
 ExecInitPartitionPruning(PlanState *planstate,
 						 int n_total_subplans,
-						 PartitionPruneInfo *pruneinfo,
+						 int part_prune_index,
 						 Bitmapset **initially_valid_subplans)
 {
 	PartitionPruneState *prunestate;
 	EState	   *estate = planstate->state;
+	PartitionPruneInfo *pruneinfo = list_nth(estate->es_part_prune_infos,
+											 part_prune_index);
 
 	/* We may need an expression context to evaluate partition exprs */
 	ExecAssignExprContext(estate, planstate);
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 9df1f81ea8..f9c7976ff2 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -119,6 +119,8 @@ CreateExecutorState(void)
 	estate->es_relations = NULL;
 	estate->es_rowmarks = NULL;
 	estate->es_plannedstmt = NULL;
+	estate->es_part_prune_infos = NIL;
+	estate->es_part_prune_result = NULL;
 
 	estate->es_junkFilter = NULL;
 
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 357e10a1d7..c6f86a6510 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -134,7 +134,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 	appendstate->as_begun = false;
 
 	/* If run-time partition pruning is enabled, then set that up now */
-	if (node->part_prune_info != NULL)
+	if (node->part_prune_index >= 0)
 	{
 		PartitionPruneState *prunestate;
 
@@ -145,7 +145,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 		 */
 		prunestate = ExecInitPartitionPruning(&appendstate->ps,
 											  list_length(node->appendplans),
-											  node->part_prune_info,
+											  node->part_prune_index,
 											  &validsubplans);
 		appendstate->as_prune_state = prunestate;
 		nplans = bms_num_members(validsubplans);
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index c5c62fa5c7..8d35860c30 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -82,7 +82,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
 	mergestate->ps.ExecProcNode = ExecMergeAppend;
 
 	/* If run-time partition pruning is enabled, then set that up now */
-	if (node->part_prune_info != NULL)
+	if (node->part_prune_index >= 0)
 	{
 		PartitionPruneState *prunestate;
 
@@ -93,7 +93,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
 		 */
 		prunestate = ExecInitPartitionPruning(&mergestate->ps,
 											  list_length(node->mergeplans),
-											  node->part_prune_info,
+											  node->part_prune_index,
 											  &validsubplans);
 		mergestate->ms_prune_state = prunestate;
 		nplans = bms_num_members(validsubplans);
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index e37f2933eb..fd8ab4a167 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -1203,7 +1203,6 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
 	ListCell   *subpaths;
 	int			nasyncplans = 0;
 	RelOptInfo *rel = best_path->path.parent;
-	PartitionPruneInfo *partpruneinfo = NULL;
 	int			nodenumsortkeys = 0;
 	AttrNumber *nodeSortColIdx = NULL;
 	Oid		   *nodeSortOperators = NULL;
@@ -1354,6 +1353,9 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
 		subplans = lappend(subplans, subplan);
 	}
 
+	/* Set below if we find quals that we can use to run-time prune */
+	plan->part_prune_index = -1;
+
 	/*
 	 * If any quals exist, they may be useful to perform further partition
 	 * pruning during execution.  Gather information needed by the executor to
@@ -1377,16 +1379,14 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
 		}
 
 		if (prunequal != NIL)
-			partpruneinfo =
-				make_partition_pruneinfo(root, rel,
-										 best_path->subpaths,
-										 prunequal);
+			plan->part_prune_index = make_partition_pruneinfo(root, rel,
+															  best_path->subpaths,
+															  prunequal);
 	}
 
 	plan->appendplans = subplans;
 	plan->nasyncplans = nasyncplans;
 	plan->first_partial_plan = best_path->first_partial_path;
-	plan->part_prune_info = partpruneinfo;
 
 	copy_generic_path_info(&plan->plan, (Path *) best_path);
 
@@ -1425,7 +1425,6 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
 	List	   *subplans = NIL;
 	ListCell   *subpaths;
 	RelOptInfo *rel = best_path->path.parent;
-	PartitionPruneInfo *partpruneinfo = NULL;
 
 	/*
 	 * We don't have the actual creation of the MergeAppend node split out
@@ -1518,6 +1517,9 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
 		subplans = lappend(subplans, subplan);
 	}
 
+	/* Set below if we find quals that we can use to run-time prune */
+	node->part_prune_index = -1;
+
 	/*
 	 * If any quals exist, they may be useful to perform further partition
 	 * pruning during execution.  Gather information needed by the executor to
@@ -1541,13 +1543,13 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
 		}
 
 		if (prunequal != NIL)
-			partpruneinfo = make_partition_pruneinfo(root, rel,
-													 best_path->subpaths,
-													 prunequal);
+			node->part_prune_index = make_partition_pruneinfo(root, rel,
+															  best_path->subpaths,
+															  prunequal);
 	}
 
 	node->mergeplans = subplans;
-	node->part_prune_info = partpruneinfo;
+
 
 	/*
 	 * If prepare_sort_from_pathkeys added sort columns, but we were told to
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 06ad856eac..b11249ed8f 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -518,6 +518,7 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 	result->dependsOnRole = glob->dependsOnRole;
 	result->parallelModeNeeded = glob->parallelModeNeeded;
 	result->planTree = top_plan;
+	result->partPruneInfos = glob->partPruneInfos;
 	result->rtable = glob->finalrtable;
 	result->resultRelations = glob->resultRelations;
 	result->appendRelations = glob->appendRelations;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 1cb0abdbc1..720f20f563 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -348,6 +348,29 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 		}
 	}
 
+	/* Also fix up the information in PartitionPruneInfos. */
+	foreach (lc, root->partPruneInfos)
+	{
+		PartitionPruneInfo *pruneinfo = lfirst(lc);
+		ListCell  *l;
+
+		foreach(l, pruneinfo->prune_infos)
+		{
+			List	   *prune_infos = lfirst(l);
+			ListCell   *l2;
+
+			foreach(l2, prune_infos)
+			{
+				PartitionedRelPruneInfo *pinfo = lfirst(l2);
+
+				/* RT index of the table to which the pinfo belongs. */
+				pinfo->rtindex += rtoffset;
+			}
+		}
+
+		glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
+	}
+
 	return result;
 }
 
@@ -1658,21 +1681,12 @@ set_append_references(PlannerInfo *root,
 
 	aplan->apprelids = offset_relid_set(aplan->apprelids, rtoffset);
 
-	if (aplan->part_prune_info)
-	{
-		foreach(l, aplan->part_prune_info->prune_infos)
-		{
-			List	   *prune_infos = lfirst(l);
-			ListCell   *l2;
-
-			foreach(l2, prune_infos)
-			{
-				PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
-				pinfo->rtindex += rtoffset;
-			}
-		}
-	}
+	/*
+	 * PartitionPruneInfos will be added to a list in PlannerGlobal, so update
+	 * the index.
+	 */
+	if (aplan->part_prune_index >= 0)
+		aplan->part_prune_index += list_length(root->glob->partPruneInfos);
 
 	/* We don't need to recurse to lefttree or righttree ... */
 	Assert(aplan->plan.lefttree == NULL);
@@ -1734,21 +1748,12 @@ set_mergeappend_references(PlannerInfo *root,
 
 	mplan->apprelids = offset_relid_set(mplan->apprelids, rtoffset);
 
-	if (mplan->part_prune_info)
-	{
-		foreach(l, mplan->part_prune_info->prune_infos)
-		{
-			List	   *prune_infos = lfirst(l);
-			ListCell   *l2;
-
-			foreach(l2, prune_infos)
-			{
-				PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
-				pinfo->rtindex += rtoffset;
-			}
-		}
-	}
+	/*
+	 * PartitionPruneInfos will be added to a list in PlannerGlobal, so update
+	 * the index.
+	 */
+	if (mplan->part_prune_index >= 0)
+		mplan->part_prune_index += list_length(root->glob->partPruneInfos);
 
 	/* We don't need to recurse to lefttree or righttree ... */
 	Assert(mplan->plan.lefttree == NULL);
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 9d3c05aed3..d77f7d3aef 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -209,16 +209,20 @@ static void partkey_datum_from_expr(PartitionPruneContext *context,
 
 /*
  * make_partition_pruneinfo
- *		Builds a PartitionPruneInfo which can be used in the executor to allow
- *		additional partition pruning to take place.  Returns NULL when
- *		partition pruning would be useless.
+ *		Checks if the given set of quals can be used to build pruning steps
+ *		that the executor can use to prune away unneeded partitions.  If
+ *		suitable quals are found then a PartitionPruneInfo is built and tagged
+ *		onto the PlannerInfo's partPruneInfos list.
+ *
+ * The return value is the 0-based index of the item added to the
+ * partPruneInfos list or -1 if nothing was added.
  *
  * 'parentrel' is the RelOptInfo for an appendrel, and 'subpaths' is the list
  * of scan paths for its child rels.
  * 'prunequal' is a list of potential pruning quals (i.e., restriction
  * clauses that are applicable to the appendrel).
  */
-PartitionPruneInfo *
+int
 make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 						 List *subpaths,
 						 List *prunequal)
@@ -332,7 +336,7 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	 * quals, then we can just not bother with run-time pruning.
 	 */
 	if (prunerelinfos == NIL)
-		return NULL;
+		return -1;
 
 	/* Else build the result data structure */
 	pruneinfo = makeNode(PartitionPruneInfo);
@@ -358,7 +362,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	else
 		pruneinfo->other_subplans = NULL;
 
-	return pruneinfo;
+	root->partPruneInfos = lappend(root->partPruneInfos, pruneinfo);
+
+	return list_length(root->partPruneInfos) - 1;
 }
 
 /*
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 708435e952..bf962af7af 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -123,9 +123,8 @@ typedef struct PartitionPruneState
 
 extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
 													 int n_total_subplans,
-													 PartitionPruneInfo *pruneinfo,
+													 int part_prune_index,
 													 Bitmapset **initially_valid_subplans);
 extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
 										   bool initial_prune);
-
 #endif							/* EXECPARTITION_H */
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 01b1727fc0..63a89474db 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -611,6 +611,8 @@ typedef struct EState
 	struct ExecRowMark **es_rowmarks;	/* Array of per-range-table-entry
 										 * ExecRowMarks, or NULL if none */
 	PlannedStmt *es_plannedstmt;	/* link to top of plan tree */
+	List		*es_part_prune_infos;	/* PlannedStmt.partPruneInfos */
+	struct PartitionPruneResult *es_part_prune_result; /* QueryDesc.part_prune_result */
 	const char *es_sourceText;	/* Source text from QueryDesc */
 
 	JunkFilter *es_junkFilter;	/* top-level junk filter, if any */
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index e2081db4ed..a4e6b4db92 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -122,6 +122,9 @@ typedef struct PlannerGlobal
 	/* "flat" list of AppendRelInfos */
 	List	   *appendRelations;
 
+	/* List of PartitionPruneInfo contained in the plan */
+	List	   *partPruneInfos;
+
 	/* OIDs of relations the plan depends on */
 	List	   *relationOids;
 
@@ -488,6 +491,9 @@ struct PlannerInfo
 
 	/* Does this query modify any partition key columns? */
 	bool		partColsUpdated;
+
+	/* PartitionPruneInfos added in this query's plan. */
+	List	   *partPruneInfos;
 };
 
 
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index dca2a21e7a..f2daabb3b7 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -69,6 +69,9 @@ typedef struct PlannedStmt
 
 	struct Plan *planTree;		/* tree of Plan nodes */
 
+	List	   *partPruneInfos;	/* List of PartitionPruneInfo contained in
+								 * the plan */
+
 	List	   *rtable;			/* list of RangeTblEntry nodes */
 
 	/* rtable indexes of target relations for INSERT/UPDATE/DELETE */
@@ -269,8 +272,8 @@ typedef struct Append
 	 */
 	int			first_partial_plan;
 
-	/* Info for run-time subplan pruning; NULL if we're not doing that */
-	struct PartitionPruneInfo *part_prune_info;
+	/* Index to PlannerInfo.partPruneInfos or -1 if no run-time pruning */
+	int			part_prune_index;
 } Append;
 
 /* ----------------
@@ -304,8 +307,8 @@ typedef struct MergeAppend
 	/* NULLS FIRST/LAST directions */
 	bool	   *nullsFirst pg_node_attr(array_size(numCols));
 
-	/* Info for run-time subplan pruning; NULL if we're not doing that */
-	struct PartitionPruneInfo *part_prune_info;
+	/* Index to PlannerInfo.partPruneInfos or -1 if no run-time pruning */
+	int			part_prune_index;
 } MergeAppend;
 
 /* ----------------
diff --git a/src/include/partitioning/partprune.h b/src/include/partitioning/partprune.h
index 90684efa25..ebf0dcff8c 100644
--- a/src/include/partitioning/partprune.h
+++ b/src/include/partitioning/partprune.h
@@ -70,10 +70,10 @@ typedef struct PartitionPruneContext
 #define PruneCxtStateIdx(partnatts, step_id, keyno) \
 	((partnatts) * (step_id) + (keyno))
 
-extern PartitionPruneInfo *make_partition_pruneinfo(struct PlannerInfo *root,
-													struct RelOptInfo *parentrel,
-													List *subpaths,
-													List *prunequal);
+extern int make_partition_pruneinfo(struct PlannerInfo *root,
+									struct RelOptInfo *parentrel,
+									List *subpaths,
+									List *prunequal);
 extern Bitmapset *prune_append_rel_partitions(struct RelOptInfo *rel);
 extern Bitmapset *get_matching_partitions(PartitionPruneContext *context,
 										  List *pruning_steps);
-- 
2.35.3
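
To illustrate the bookkeeping that the patch above introduces, here is a minimal, self-contained sketch, not the patch's code.  It assumes simplified stand-ins (InfoList, PlanNode, add_info, fix_index, flatten, and lookup are all hypothetical names) for the PartitionPruneInfo lists and the Append/MergeAppend nodes.  A plan node stores an integer index instead of a pointer, and that index is shifted by the current length of the global list before the query level's own entries are appended to it.

```c
#include <assert.h>

#define MAX_INFOS 16

/* Hypothetical stand-in for a list of PartitionPruneInfos; only labels are kept. */
typedef struct InfoList
{
	const char *items[MAX_INFOS];
	int			len;
} InfoList;

/* Hypothetical stand-in for an Append/MergeAppend plan node. */
typedef struct PlanNode
{
	int			part_prune_index;	/* -1 means no run-time pruning */
} PlanNode;

/* Append an entry and return its 0-based index in the list. */
static int
add_info(InfoList *list, const char *label)
{
	assert(list->len < MAX_INFOS);
	list->items[list->len] = label;
	return list->len++;
}

/* Shift a node's per-level index by the current length of the global list. */
static void
fix_index(PlanNode *node, const InfoList *global)
{
	if (node->part_prune_index >= 0)
		node->part_prune_index += global->len;
}

/* Copy one query level's entries onto the end of the global list. */
static void
flatten(InfoList *global, const InfoList *local)
{
	for (int i = 0; i < local->len; i++)
		add_info(global, local->items[i]);
}

/* Fetch the entry a node refers to, given the flattened global list. */
static const char *
lookup(const InfoList *global, const PlanNode *node)
{
	assert(node->part_prune_index >= 0 &&
		   node->part_prune_index < global->len);
	return global->items[node->part_prune_index];
}
```

In this sketch the ordering matters: fix_index() must run for every node of a query level before flatten() is called for that level, otherwise the offset would already include the level's own entries.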



  [application/octet-stream] v20-0002-Optimize-AcquireExecutorLocks-by-locking-only-un.patch (80.5K, 3-v20-0002-Optimize-AcquireExecutorLocks-by-locking-only-un.patch)
  download | inline diff:
From 7a1454c6a1ecde5c871bec5a4d646da4e41a62c3 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Wed, 22 Dec 2021 16:55:17 +0900
Subject: [PATCH v20 2/2] Optimize AcquireExecutorLocks() by locking only
 unpruned partitions

This commit teaches AcquireExecutorLocks() to perform initial
partition pruning on a generic cached plan.  Subnodes that the pruning
shows need not be initialized during the actual execution of the plan
are notionally eliminated, and the partitions scanned by those
subnodes are not locked.

The result of performing initial partition pruning before execution
has started is passed to the executor as a PartitionPruneResult, which
callers that used plancache.c to get the plan supply along with the
PlannedStmt.  It is NULL when the plan is obtained by calling the
planner directly or when the plan obtained from plancache.c is not a
generic one.
---
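
As a rough illustration of the idea described in the commit message, the following self-contained sketch shows pruning once, locking only the survivors, and having the later initialization step reuse the saved result.  It is not the patch's code: Subplan, initial_prune, lock_unpruned, and init_subplans are hypothetical names, a 32-bit mask stands in for a Bitmapset, and "pruning" is reduced to a range check against a single parameter value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: each subplan scans one partition covering [lo, hi). */
typedef struct Subplan
{
	int			lo;
	int			hi;
	bool		locked;
} Subplan;

/* "Initial pruning": keep the subplans whose range contains the parameter. */
static uint32_t
initial_prune(const Subplan *subplans, int n, int param)
{
	uint32_t	valid = 0;

	assert(n <= 32);
	for (int i = 0; i < n; i++)
	{
		if (param >= subplans[i].lo && param < subplans[i].hi)
			valid |= (uint32_t) 1 << i;
	}
	return valid;
}

/* Lock only the partitions of surviving subplans; return how many were locked. */
static int
lock_unpruned(Subplan *subplans, int n, uint32_t valid)
{
	int			nlocked = 0;

	for (int i = 0; i < n; i++)
	{
		if (valid & ((uint32_t) 1 << i))
		{
			subplans[i].locked = true;
			nlocked++;
		}
	}
	return nlocked;
}

/*
 * Initialization step: uses the saved mask instead of pruning again.
 * Returns false if it would have to touch an unlocked partition.
 */
static bool
init_subplans(const Subplan *subplans, int n, uint32_t valid)
{
	for (int i = 0; i < n; i++)
	{
		if ((valid & ((uint32_t) 1 << i)) && !subplans[i].locked)
			return false;
	}
	return true;
}
```

Passing the mask computed by initial_prune() to both lock_unpruned() and init_subplans() keeps the two steps in agreement.  If the second step recomputed the mask and got a different answer, it could select a subplan whose partition was never locked, which is the hazard the saved result is meant to avoid.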
 src/backend/commands/copyto.c          |   2 +-
 src/backend/commands/createas.c        |   2 +-
 src/backend/commands/explain.c         |   7 +-
 src/backend/commands/extension.c       |   2 +-
 src/backend/commands/matview.c         |   2 +-
 src/backend/commands/prepare.c         |  26 ++-
 src/backend/executor/README            |  32 ++++
 src/backend/executor/execMain.c        |  53 ++++++
 src/backend/executor/execParallel.c    |  27 ++-
 src/backend/executor/execPartition.c   | 234 +++++++++++++++++++++----
 src/backend/executor/functions.c       |   2 +-
 src/backend/executor/nodeAppend.c      |  11 +-
 src/backend/executor/nodeMergeAppend.c |   5 +-
 src/backend/executor/spi.c             |  27 ++-
 src/backend/nodes/readfuncs.c          |   8 +-
 src/backend/optimizer/plan/planner.c   |   2 +
 src/backend/optimizer/plan/setrefs.c   |  46 +++++
 src/backend/partitioning/partprune.c   |  41 ++++-
 src/backend/tcop/postgres.c            |   8 +-
 src/backend/tcop/pquery.c              |  28 ++-
 src/backend/utils/cache/plancache.c    | 187 +++++++++++++++++---
 src/backend/utils/mmgr/portalmem.c     |  19 ++
 src/include/commands/explain.h         |   4 +-
 src/include/executor/execPartition.h   |   9 +-
 src/include/executor/execdesc.h        |   3 +
 src/include/executor/executor.h        |   2 +
 src/include/nodes/execnodes.h          |  27 +++
 src/include/nodes/nodes.h              |   1 +
 src/include/nodes/pathnodes.h          |  13 ++
 src/include/nodes/plannodes.h          |  21 +++
 src/include/utils/plancache.h          |   3 +-
 src/include/utils/portal.h             |   3 +
 32 files changed, 759 insertions(+), 98 deletions(-)

diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index fca29a9a10..d839517693 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -541,7 +541,7 @@ BeginCopyTo(ParseState *pstate,
 		((DR_copy *) dest)->cstate = cstate;
 
 		/* Create a QueryDesc requesting no output */
-		cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+		cstate->queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
 											dest, NULL, NULL, 0);
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 9abbb6b555..f6607f2454 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -325,7 +325,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
 		UpdateActiveSnapshotCommandId();
 
 		/* Create a QueryDesc, redirecting output to our tuple receiver */
-		queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+		queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
 									GetActiveSnapshot(), InvalidSnapshot,
 									dest, params, queryEnv, 0);
 
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index e29c2ae206..e41b13a3ea 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -407,7 +407,7 @@ ExplainOneQuery(Query *query, int cursorOptions,
 		}
 
 		/* run it (if needed) and produce output */
-		ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+		ExplainOnePlan(plan, NULL, into, es, queryString, params, queryEnv,
 					   &planduration, (es->buffers ? &bufusage : NULL));
 	}
 }
@@ -515,7 +515,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
  * to call it.
  */
 void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, PartitionPruneResult *part_prune_result,
+			   IntoClause *into, ExplainState *es,
 			   const char *queryString, ParamListInfo params,
 			   QueryEnvironment *queryEnv, const instr_time *planduration,
 			   const BufferUsage *bufusage)
@@ -563,7 +564,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
 		dest = None_Receiver;
 
 	/* Create a QueryDesc for the query */
-	queryDesc = CreateQueryDesc(plannedstmt, queryString,
+	queryDesc = CreateQueryDesc(plannedstmt, part_prune_result, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, params, queryEnv, instrument_option);
 
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index 6b6720c690..374c0ff807 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -776,7 +776,7 @@ execute_sql_string(const char *sql)
 			{
 				QueryDesc  *qdesc;
 
-				qdesc = CreateQueryDesc(stmt,
+				qdesc = CreateQueryDesc(stmt, NULL,
 										sql,
 										GetActiveSnapshot(), NULL,
 										dest, NULL, NULL, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 9ac0383459..b0ed96e56c 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -408,7 +408,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
 	UpdateActiveSnapshotCommandId();
 
 	/* Create a QueryDesc, redirecting output to our tuple receiver */
-	queryDesc = CreateQueryDesc(plan, queryString,
+	queryDesc = CreateQueryDesc(plan, NULL, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, NULL, NULL, 0);
 
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 579825c159..b6285958bc 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -155,6 +155,7 @@ ExecuteQuery(ParseState *pstate,
 	PreparedStatement *entry;
 	CachedPlan *cplan;
 	List	   *plan_list;
+	List	   *part_prune_result_list;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
 	Portal		portal;
@@ -193,7 +194,10 @@ ExecuteQuery(ParseState *pstate,
 									   entry->plansource->query_string);
 
 	/* Replan if needed, and increment plan refcount for portal */
-	cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL);
+	cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL,
+						  &part_prune_result_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_result_list));
 	plan_list = cplan->stmt_list;
 
 	/*
@@ -207,6 +211,9 @@ ExecuteQuery(ParseState *pstate,
 					  plan_list,
 					  cplan);
 
+	/* Copy PartitionPruneResults into the portal's context. */
+	PortalStorePartitionPruneResults(portal, part_prune_result_list);
+
 	/*
 	 * For CREATE TABLE ... AS EXECUTE, we must verify that the prepared
 	 * statement is one that produces tuples.  Currently we insist that it be
@@ -576,7 +583,9 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	const char *query_string;
 	CachedPlan *cplan;
 	List	   *plan_list;
-	ListCell   *p;
+	List	   *part_prune_result_list;
+	ListCell   *p,
+			   *pp;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
 	instr_time	planstart;
@@ -619,7 +628,10 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 
 	/* Replan if needed, and acquire a transient refcount */
 	cplan = GetCachedPlan(entry->plansource, paramLI,
-						  CurrentResourceOwner, queryEnv);
+						  CurrentResourceOwner, queryEnv,
+						  &part_prune_result_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_result_list));
 
 	INSTR_TIME_SET_CURRENT(planduration);
 	INSTR_TIME_SUBTRACT(planduration, planstart);
@@ -634,13 +646,15 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	plan_list = cplan->stmt_list;
 
 	/* Explain each query */
-	foreach(p, plan_list)
+	forboth(p, plan_list, pp, part_prune_result_list)
 	{
 		PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
+		PartitionPruneResult *part_prune_result = lfirst_node(PartitionPruneResult, pp);
 
 		if (pstmt->commandType != CMD_UTILITY)
-			ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv,
-						   &planduration, (es->buffers ? &bufusage : NULL));
+			ExplainOnePlan(pstmt, part_prune_result, into, es, query_string,
+						   paramLI, queryEnv, &planduration,
+						   (es->buffers ? &bufusage : NULL));
 		else
 			ExplainOneUtility(pstmt->utilityStmt, into, es, query_string,
 							  paramLI, queryEnv);
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 0b5183fc4a..953a476ea5 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -65,6 +65,34 @@ found there.  This currently only occurs for Append and MergeAppend nodes.  In
 this case the non-required subplans are ignored and the executor state's
 subnode array will become out of sequence to the plan's subplan list.
 
+So-called execution-time pruning may also occur before execution has actually
+started.  One case where that occurs is when a cached generic plan is being
+validated for execution by plancache.c: GetCachedPlan(), which
+works by locking all the relations that will be scanned by that plan.  If the
+generic plan contains nodes that can perform execution time partition pruning
+(that is, contain a PartitionPruneInfo), a subset of pruning steps contained
+in a given node's PartitionPruneInfo that do not depend on the execution
+actually having started (called "initial" pruning steps) are performed at this
+point to figure out the minimal set of child subplans that satisfy those
+pruning steps.  AcquireExecutorLocks() looking at a given plan tree will then
+lock only the relations scanned by the child subplans that survived such
+pruning, along with those present in PlannedStmt.minLockRelids.  Note that the
+subplans are only notionally pruned in that they are not removed from the plan
+tree as such.
+
+To prevent the executor and any third party execution code that can look at
+the plan tree from trying to execute the subplans that were pruned as
+described above, the result of pruning is passed to the executor as a
+PartitionPruneResult node via the QueryDesc.  It consists of the set of
+indexes of surviving subplans in their respective parent plan node's list of
+child subplans, saved as a list of bitmapsets, with one element for every
+parent plan node whose PartitionPruneInfo is present in
+PlannedStmt.partPruneInfos.  In other words, the executor should not
+re-evaluate the set of initially valid subplans by redoing the initial pruning
+if it was already done by AcquireExecutorLocks(), because the re-evaluation may
+very well end up resulting in a different set of subplans, containing some
+whose relations were not locked by AcquireExecutorLocks().
+
 Each Plan node may have expression trees associated with it, to represent
 its target list, qualification conditions, etc.  These trees are also
 read-only to the executor, but the executor state for expression evaluation
@@ -286,6 +314,10 @@ Query Processing Control Flow
 
 This is a sketch of control flow for full query processing:
 
+  [ ExecutorDoInitialPruning ] --- an optional step to perform initial
+		partition pruning on the plan tree, the result of which is passed
+		to the executor via QueryDesc
+
 	CreateQueryDesc
 
 	ExecutorStart
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 72fc273524..45824624f8 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -49,6 +49,7 @@
 #include "commands/matview.h"
 #include "commands/trigger.h"
 #include "executor/execdebug.h"
+#include "executor/execPartition.h"
 #include "executor/nodeSubplan.h"
 #include "foreign/fdwapi.h"
 #include "jit/jit.h"
@@ -104,6 +105,56 @@ static void EvalPlanQualStart(EPQState *epqstate, Plan *planTree);
 
 /* end of local decls */
 
+/* ----------------------------------------------------------------
+ *		ExecutorDoInitialPruning
+ *
+ *		For each plan tree node that has been assigned a PartitionPruneInfo,
+ *		this performs initial partition pruning using the information contained
+ *		therein to determine the set of child subplans that satisfy the initial
+ *		pruning steps, to be returned as a bitmapset of their indexes in the
+ *		node's list of child subplans (for example, an Append's appendplans).
+ *
+ * Return value is a PartitionPruneResult node that contains a list of those
+ * bitmapsets, with one element for every PartitionPruneInfo, and a bitmapset
+ * of the RT indexes of all the leaf partitions scanned by those chosen
+ * subplans.  Note that the latter is shared across all PartitionPruneInfos.
+ *
+ * The executor must see exactly the same set of subplans as valid for
+ * execution when doing ExecInitNode() on the plan nodes whose
+ * PartitionPruneInfos are processed here.  So, it must get the set from the
+ * aforementioned PartitionPruneResult, instead of computing it all over
+ * again by redoing the initial pruning.  It's the caller's job to pass the
+ * PartitionPruneResult to the executor.
+ *
+ * Note: Partitioned tables mentioned in PartitionedRelPruneInfo nodes that
+ * drive the pruning will be locked before doing the pruning.
+ * ----------------------------------------------------------------
+ */
+PartitionPruneResult *
+ExecutorDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params)
+{
+	PartitionPruneResult *result;
+	ListCell *lc;
+
+	/* Only get here if there is any pruning to do. */
+	Assert(plannedstmt->containsInitialPruning);
+
+	result = makeNode(PartitionPruneResult);
+	foreach(lc, plannedstmt->partPruneInfos)
+	{
+		PartitionPruneInfo *pruneinfo = lfirst(lc);
+		Bitmapset *valid_subplan_offs;
+
+		valid_subplan_offs =
+			ExecPartitionDoInitialPruning(plannedstmt, params, pruneinfo,
+										  &result->scan_leafpart_rtis);
+		result->valid_subplan_offs_list =
+			lappend(result->valid_subplan_offs_list,
+					valid_subplan_offs);
+	}
+
+	return result;
+}
 
 /* ----------------------------------------------------------------
  *		ExecutorStart
@@ -806,6 +857,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 {
 	CmdType		operation = queryDesc->operation;
 	PlannedStmt *plannedstmt = queryDesc->plannedstmt;
+	PartitionPruneResult *part_prune_result = queryDesc->part_prune_result;
 	Plan	   *plan = plannedstmt->planTree;
 	List	   *rangeTable = plannedstmt->rtable;
 	EState	   *estate = queryDesc->estate;
@@ -826,6 +878,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 
 	estate->es_plannedstmt = plannedstmt;
 	estate->es_part_prune_infos = plannedstmt->partPruneInfos;
+	estate->es_part_prune_result = part_prune_result;
 
 	/*
 	 * Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index f73b8c2607..7e6dab5623 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -66,6 +66,7 @@
 #define PARALLEL_KEY_QUERY_TEXT		UINT64CONST(0xE000000000000008)
 #define PARALLEL_KEY_JIT_INSTRUMENTATION UINT64CONST(0xE000000000000009)
 #define PARALLEL_KEY_WAL_USAGE			UINT64CONST(0xE00000000000000A)
+#define PARALLEL_KEY_PARTITIONPRUNERESULT	UINT64CONST(0xE00000000000000B)
 
 #define PARALLEL_TUPLE_QUEUE_SIZE		65536
 
@@ -182,6 +183,7 @@ ExecSerializePlan(Plan *plan, EState *estate)
 	pstmt->transientPlan = false;
 	pstmt->dependsOnRole = false;
 	pstmt->parallelModeNeeded = false;
+	pstmt->containsInitialPruning = false;
 	pstmt->planTree = plan;
 	pstmt->partPruneInfos = estate->es_part_prune_infos;
 	pstmt->rtable = estate->es_range_table;
@@ -597,12 +599,15 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	FixedParallelExecutorState *fpes;
 	char	   *pstmt_data;
 	char	   *pstmt_space;
+	char	   *part_prune_result_data;
+	char	   *part_prune_result_space;
 	char	   *paramlistinfo_space;
 	BufferUsage *bufusage_space;
 	WalUsage   *walusage_space;
 	SharedExecutorInstrumentation *instrumentation = NULL;
 	SharedJitInstrumentation *jit_instrumentation = NULL;
 	int			pstmt_len;
+	int			part_prune_result_len;
 	int			paramlistinfo_len;
 	int			instrumentation_len = 0;
 	int			jit_instrumentation_len = 0;
@@ -631,6 +636,7 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 
 	/* Fix up and serialize plan to be sent to workers. */
 	pstmt_data = ExecSerializePlan(planstate->plan, estate);
+	part_prune_result_data = nodeToString(estate->es_part_prune_result);
 
 	/* Create a parallel context. */
 	pcxt = CreateParallelContext("postgres", "ParallelQueryMain", nworkers);
@@ -657,6 +663,11 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	shm_toc_estimate_chunk(&pcxt->estimator, pstmt_len);
 	shm_toc_estimate_keys(&pcxt->estimator, 1);
 
+	/* Estimate space for serialized PartitionPruneResult. */
+	part_prune_result_len = strlen(part_prune_result_data) + 1;
+	shm_toc_estimate_chunk(&pcxt->estimator, part_prune_result_len);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
 	/* Estimate space for serialized ParamListInfo. */
 	paramlistinfo_len = EstimateParamListSpace(estate->es_param_list_info);
 	shm_toc_estimate_chunk(&pcxt->estimator, paramlistinfo_len);
@@ -751,6 +762,12 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	memcpy(pstmt_space, pstmt_data, pstmt_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PLANNEDSTMT, pstmt_space);
 
+	/* Store serialized PartitionPruneResult */
+	part_prune_result_space = shm_toc_allocate(pcxt->toc, part_prune_result_len);
+	memcpy(part_prune_result_space, part_prune_result_data, part_prune_result_len);
+	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARTITIONPRUNERESULT,
+				   part_prune_result_space);
+
 	/* Store serialized ParamListInfo. */
 	paramlistinfo_space = shm_toc_allocate(pcxt->toc, paramlistinfo_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARAMLISTINFO, paramlistinfo_space);
@@ -1232,8 +1249,10 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 						 int instrument_options)
 {
 	char	   *pstmtspace;
+	char	   *part_prune_result_space;
 	char	   *paramspace;
 	PlannedStmt *pstmt;
+	PartitionPruneResult *part_prune_result;
 	ParamListInfo paramLI;
 	char	   *queryString;
 
@@ -1244,12 +1263,18 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 	pstmtspace = shm_toc_lookup(toc, PARALLEL_KEY_PLANNEDSTMT, false);
 	pstmt = (PlannedStmt *) stringToNode(pstmtspace);
 
+	/* Reconstruct leader-supplied PartitionPruneResult. */
+	part_prune_result_space =
+		shm_toc_lookup(toc, PARALLEL_KEY_PARTITIONPRUNERESULT, false);
+	part_prune_result = (PartitionPruneResult *)
+		stringToNode(part_prune_result_space);
+
 	/* Reconstruct ParamListInfo. */
 	paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
 	paramLI = RestoreParamList(&paramspace);
 
 	/* Create a QueryDesc for the query. */
-	return CreateQueryDesc(pstmt,
+	return CreateQueryDesc(pstmt, part_prune_result,
 						   queryString,
 						   GetActiveSnapshot(), InvalidSnapshot,
 						   receiver, paramLI, NULL, instrument_options);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index b55cdd2580..24e6f6e988 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -25,6 +25,7 @@
 #include "mb/pg_wchar.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
+#include "parser/parsetree.h"
 #include "partitioning/partbounds.h"
 #include "partitioning/partdesc.h"
 #include "partitioning/partprune.h"
@@ -185,7 +186,11 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
 static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
 static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
 static PartitionPruneState *CreatePartitionPruneState(PlanState *planstate,
-													  PartitionPruneInfo *pruneinfo);
+													  PartitionPruneInfo *pruneinfo,
+													  bool consider_initial_steps,
+													  bool consider_exec_steps,
+													  List *rtable, ExprContext *econtext,
+													  PartitionDirectory partdir);
 static void InitPartitionPruneContext(PartitionPruneContext *context,
 									  List *pruning_steps,
 									  PartitionDesc partdesc,
@@ -198,7 +203,8 @@ static void PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
 static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
 										   PartitionedRelPruningData *pprune,
 										   bool initial_prune,
-										   Bitmapset **validsubplans);
+										   Bitmapset **validsubplans,
+										   Bitmapset **scan_leafpart_rtis);
 
 
 /*
@@ -1593,8 +1599,10 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  * considered to be a stable expression, it can change value from one plan
  * node scan to the next during query execution.  Stable comparison
  * expressions that don't involve such Params allow partition pruning to be
- * done once during executor startup.  Expressions that do involve such Params
- * require us to prune separately for each scan of the parent plan node.
+ * done once, either during executor startup or in
+ * ExecutorDoInitialPruning(), which runs as part of AcquireExecutorLocks()
+ * on a given plan tree.  Expressions that do involve such Params require us
+ * to prune separately for each scan of the parent plan node.
  *
  * Note that pruning away unneeded subplans during executor startup has the
  * added benefit of not having to initialize the unneeded subplans at all.
@@ -1611,6 +1619,13 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *		account for initial pruning possibly having eliminated some of the
  *		subplans.
  *
+ * ExecPartitionDoInitialPruning:
+ *		Do initial pruning with the information contained in a given
+ *		PartitionPruneInfo to determine the minimal set of child subplans
+ *		of the parent plan node (the one to which the PartitionPruneInfo
+ *		belongs) that need to be executed, and also the set of RT indexes of
+ *		the leaf partitions that those subplans will scan.
+ *
  * ExecFindMatchingSubPlans:
  *		Returns indexes of matching subplans after evaluating the expressions
  *		that are safe to evaluate at a given point.  This function is first
@@ -1628,8 +1643,9 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *
  * On return, *initially_valid_subplans is assigned the set of indexes of
  * child subplans that must be initialized along with the parent plan node.
- * Initial pruning is performed here if needed and in that case only the
- * surviving subplans' indexes are added.
+ * Initial pruning is performed here if needed (unless it has already been done
+ * by ExecutorDoInitialPruning()), and in that case only the surviving
+ * subplans' indexes are added.
  *
  * If subplans are indeed pruned, subplan_map arrays contained in the returned
  * PartitionPruneState are re-sequenced to not count those, though only if the
@@ -1645,24 +1661,59 @@ ExecInitPartitionPruning(PlanState *planstate,
 	EState	   *estate = planstate->state;
 	PartitionPruneInfo *pruneinfo = list_nth(estate->es_part_prune_infos,
 											 part_prune_index);
+	PartitionPruneResult *pruneresult = estate->es_part_prune_result;
+	bool	do_pruning = (pruneinfo->needs_init_pruning ||
+						  pruneinfo->needs_exec_pruning);
 
-	/* We may need an expression context to evaluate partition exprs */
-	ExecAssignExprContext(estate, planstate);
+	/*
+	 * No need to do initial pruning if it was done already by
+	 * ExecutorDoInitialPruning(), which it would be if es_part_prune_result
+	 * has been set.
+	 */
+	if (pruneresult)
+		do_pruning = pruneinfo->needs_exec_pruning;
 
-	/* Create the working data structure for pruning */
-	prunestate = CreatePartitionPruneState(planstate, pruneinfo);
+	prunestate = NULL;
+	if (do_pruning)
+	{
+		/* We may need an expression context to evaluate partition exprs */
+		ExecAssignExprContext(estate, planstate);
+
+		/* For data reading, executor always omits detached partitions */
+		if (estate->es_partition_directory == NULL)
+			estate->es_partition_directory =
+				CreatePartitionDirectory(estate->es_query_cxt, false);
+
+		/*
+		 * Create the working data structure for pruning.  No need to consider
+		 * initial pruning steps if we have a PartitionPruneResult.
+		 */
+		prunestate = CreatePartitionPruneState(planstate, pruneinfo,
+											   pruneresult == NULL, true,
+											   NIL, planstate->ps_ExprContext,
+											   estate->es_partition_directory);
+	}
 
 	/*
 	 * Perform an initial partition prune pass, if required.
 	 */
-	if (prunestate->do_initial_prune)
-		*initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true);
+	if (pruneresult)
+	{
+		*initially_valid_subplans =
+			list_nth(pruneresult->valid_subplan_offs_list, part_prune_index);
+	}
+	else if (prunestate && prunestate->do_initial_prune)
+	{
+		*initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true,
+															 NULL);
+	}
 	else
 	{
 		/* No pruning, so we'll need to initialize all subplans */
 		Assert(n_total_subplans > 0);
 		*initially_valid_subplans = bms_add_range(NULL, 0,
 												  n_total_subplans - 1);
+		return prunestate;
 	}
 
 	/*
@@ -1670,7 +1721,8 @@ ExecInitPartitionPruning(PlanState *planstate,
 	 * that were removed above due to initial pruning.  No need to do this if
 	 * no steps were removed.
 	 */
-	if (bms_num_members(*initially_valid_subplans) < n_total_subplans)
+	if (prunestate &&
+		bms_num_members(*initially_valid_subplans) < n_total_subplans)
 	{
 		/*
 		 * We can safely skip this when !do_exec_prune, even though that
@@ -1686,11 +1738,73 @@ ExecInitPartitionPruning(PlanState *planstate,
 	return prunestate;
 }
 
+/*
+ * ExecPartitionDoInitialPruning
+ *		Perform initial pruning using given PartitionPruneInfo to determine
+ *		the minimal set of child subplans that will be executed and also the
+ *		set of RT indexes of the leaf partitions scanned by those subplans.
+ */
+Bitmapset *
+ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
+							  PartitionPruneInfo *pruneinfo,
+							  Bitmapset **scan_leafpart_rtis)
+{
+	List		 *rtable = plannedstmt->rtable;
+	ExprContext	 *econtext;
+	PartitionDirectory pdir;
+	MemoryContext oldcontext,
+				  tmpcontext;
+	PartitionPruneState *prunestate;
+	Bitmapset	 *valid_subplan_offs;
+
+	/*
+	 * A temporary context for memory allocations required while executing
+	 * partition pruning steps.
+	 */
+	tmpcontext = AllocSetContextCreate(CurrentMemoryContext,
+									   "initial pruning working data",
+									   ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(tmpcontext);
+
+	/*
+	 * PartitionDirectory to look up partition descriptors, which omits
+	 * detached partitions, just like in the executor proper.
+	 */
+	pdir = CreatePartitionDirectory(CurrentMemoryContext, false);
+
+	/*
+	 * We don't yet have a PlanState for the parent plan node, so we must
+	 * create a standalone ExprContext to evaluate pruning expressions,
+	 * equipped with the information about the EXTERN parameters that the
+	 * caller passed us.  Note that that's okay because the initial pruning
+	 * steps do not contain anything that requires the execution to have
+	 * started.
+	 */
+	econtext = CreateStandaloneExprContext();
+	econtext->ecxt_param_list_info = params;
+	prunestate = CreatePartitionPruneState(NULL, pruneinfo, true, false,
+										   rtable, econtext, pdir);
+	MemoryContextSwitchTo(oldcontext);
+
+	/* Do the initial pruning. */
+	valid_subplan_offs = ExecFindMatchingSubPlans(prunestate, true,
+												  scan_leafpart_rtis);
+
+	FreeExprContext(econtext, true);
+	DestroyPartitionDirectory(pdir);
+	MemoryContextDelete(tmpcontext);
+
+	return valid_subplan_offs;
+}
+
 /*
  * CreatePartitionPruneState
  *		Build the data structure required for calling ExecFindMatchingSubPlans
  *
- * 'planstate' is the parent plan node's execution state.
+ * 'planstate', if not NULL, is the parent plan node's execution state.  It
+ * can be NULL if being called before ExecutorStart(), in which case,
+ * 'rtable' (range table), 'econtext', and 'partdir' must be explicitly
+ * provided.
  *
  * 'pruneinfo' is a PartitionPruneInfo as generated by
  * make_partition_pruneinfo.  Here we build a PartitionPruneState containing a
@@ -1704,19 +1818,21 @@ ExecInitPartitionPruning(PlanState *planstate,
  * PartitionedRelPruneInfo.
  */
 static PartitionPruneState *
-CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
+CreatePartitionPruneState(PlanState *planstate,
+						  PartitionPruneInfo *pruneinfo,
+						  bool consider_initial_steps,
+						  bool consider_exec_steps,
+						  List *rtable, ExprContext *econtext,
+						  PartitionDirectory partdir)
 {
-	EState	   *estate = planstate->state;
+	EState	   *estate = planstate ? planstate->state : NULL;
 	PartitionPruneState *prunestate;
 	int			n_part_hierarchies;
 	ListCell   *lc;
 	int			i;
-	ExprContext *econtext = planstate->ps_ExprContext;
 
-	/* For data reading, executor always omits detached partitions */
-	if (estate->es_partition_directory == NULL)
-		estate->es_partition_directory =
-			CreatePartitionDirectory(estate->es_query_cxt, false);
+	Assert((estate != NULL) ||
+			(partdir != NULL && econtext != NULL && rtable != NIL));
 
 	n_part_hierarchies = list_length(pruneinfo->prune_infos);
 	Assert(n_part_hierarchies > 0);
@@ -1771,15 +1887,42 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 			PartitionKey partkey;
 
 			/*
-			 * We can rely on the copies of the partitioned table's partition
-			 * key and partition descriptor appearing in its relcache entry,
-			 * because that entry will be held open and locked for the
-			 * duration of this executor run.
+			 * We must open the relation ourselves when called before
+			 * execution has started, for example from
+			 * ExecutorDoInitialPruning() on a cached plan.  In that case,
+			 * sub-partitions must be locked, because AcquirePlannerLocks()
+			 * would not have seen them.  (The 1st relation in a
+			 * partrelpruneinfos list is always the root partitioned table
+			 * appearing in the query, which AcquirePlannerLocks() would have
+			 * locked; the Assert in relation_open() guards that assumption.)
+			 */
+			if (estate == NULL)
+			{
+				RangeTblEntry *rte = rt_fetch(pinfo->rtindex, rtable);
+				int		lockmode = (j == 0) ? NoLock : rte->rellockmode;
+
+				partrel = table_open(rte->relid, lockmode);
+			}
+			else
+				partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
+
+			/*
+			 * We can rely on the copy of the partitioned table's partition
+			 * key in its relcache entry, because it can't change (or get
+			 * destroyed) as long as the relation is locked.  The partition
+			 * descriptor is taken from the given PartitionDirectory, which
+			 * is kept around long enough for the descriptor to remain valid
+			 * while it's used to perform the pruning steps.
 			 */
-			partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
 			partkey = RelationGetPartitionKey(partrel);
-			partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
-												partrel);
+			partdesc = PartitionDirectoryLookup(partdir, partrel);
+
+			/*
+			 * Must close partrel, keeping the lock taken, if we're not using
+			 * EState's entry.
+			 */
+			if (estate == NULL)
+				table_close(partrel, NoLock);
 
 			/*
 			 * Initialize the subplan_map and subpart_map.
@@ -1793,6 +1936,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 			Assert(partdesc->nparts >= pinfo->nparts);
 			pprune->nparts = partdesc->nparts;
 			pprune->subplan_map = palloc(sizeof(int) * partdesc->nparts);
+			pprune->rti_map = palloc(sizeof(Index) * partdesc->nparts);
 			if (partdesc->nparts == pinfo->nparts)
 			{
 				/*
@@ -1803,6 +1947,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 				pprune->subpart_map = pinfo->subpart_map;
 				memcpy(pprune->subplan_map, pinfo->subplan_map,
 					   sizeof(int) * pinfo->nparts);
+				memcpy(pprune->rti_map, pinfo->rti_map,
+					   sizeof(Index) * pinfo->nparts);
 
 				/*
 				 * Double-check that the list of unpruned relations has not
@@ -1853,6 +1999,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 							pinfo->subplan_map[pd_idx];
 						pprune->subpart_map[pp_idx] =
 							pinfo->subpart_map[pd_idx];
+						pprune->rti_map[pp_idx] =
+							pinfo->rti_map[pd_idx];
 						pd_idx++;
 					}
 					else
@@ -1860,6 +2008,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 						/* this partdesc entry is not in the plan */
 						pprune->subplan_map[pp_idx] = -1;
 						pprune->subpart_map[pp_idx] = -1;
+						pprune->rti_map[pp_idx] = 0;
 					}
 				}
 
@@ -1881,7 +2030,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 			 * Initialize pruning contexts as needed.
 			 */
 			pprune->initial_pruning_steps = pinfo->initial_pruning_steps;
-			if (pinfo->initial_pruning_steps)
+			if (consider_initial_steps && pinfo->initial_pruning_steps)
 			{
 				InitPartitionPruneContext(&pprune->initial_context,
 										  pinfo->initial_pruning_steps,
@@ -1891,7 +2040,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 				prunestate->do_initial_prune = true;
 			}
 			pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
-			if (pinfo->exec_pruning_steps)
+			if (consider_exec_steps && pinfo->exec_pruning_steps)
 			{
 				InitPartitionPruneContext(&pprune->exec_context,
 										  pinfo->exec_pruning_steps,
@@ -2119,10 +2268,14 @@ PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
  * Pass initial_prune if PARAM_EXEC Params cannot yet be evaluated.  This
  * differentiates the initial executor-time pruning step from later
  * runtime pruning.
+ *
+ * RT indexes of leaf partitions scanned by the chosen subplans are added to
+ * *scan_leafpart_rtis if the pointer is non-NULL.
  */
 Bitmapset *
 ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
-						 bool initial_prune)
+						 bool initial_prune,
+						 Bitmapset **scan_leafpart_rtis)
 {
 	Bitmapset  *result = NULL;
 	MemoryContext oldcontext;
@@ -2157,7 +2310,7 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
 		 */
 		pprune = &prunedata->partrelprunedata[0];
 		find_matching_subplans_recurse(prunedata, pprune, initial_prune,
-									   &result);
+									   &result, scan_leafpart_rtis);
 
 		/* Expression eval may have used space in ExprContext too */
 		if (pprune->exec_pruning_steps)
@@ -2171,6 +2324,8 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
 
 	/* Copy result out of the temp context before we reset it */
 	result = bms_copy(result);
+	if (scan_leafpart_rtis)
+		*scan_leafpart_rtis = bms_copy(*scan_leafpart_rtis);
 
 	MemoryContextReset(prunestate->prune_context);
 
@@ -2181,13 +2336,15 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
  * find_matching_subplans_recurse
  *		Recursive worker function for ExecFindMatchingSubPlans
  *
- * Adds valid (non-prunable) subplan IDs to *validsubplans
+ * Adds valid (non-prunable) subplan IDs to *validsubplans and RT indexes of
+ * the corresponding leaf partitions to *scan_leafpart_rtis (if asked for).
  */
 static void
 find_matching_subplans_recurse(PartitionPruningData *prunedata,
 							   PartitionedRelPruningData *pprune,
 							   bool initial_prune,
-							   Bitmapset **validsubplans)
+							   Bitmapset **validsubplans,
+							   Bitmapset **scan_leafpart_rtis)
 {
 	Bitmapset  *partset;
 	int			i;
@@ -2214,8 +2371,14 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
 	while ((i = bms_next_member(partset, i)) >= 0)
 	{
 		if (pprune->subplan_map[i] >= 0)
+		{
 			*validsubplans = bms_add_member(*validsubplans,
 											pprune->subplan_map[i]);
+			Assert(pprune->rti_map[i] > 0);
+			if (scan_leafpart_rtis)
+				*scan_leafpart_rtis = bms_add_member(*scan_leafpart_rtis,
+													 pprune->rti_map[i]);
+		}
 		else
 		{
 			int			partidx = pprune->subpart_map[i];
@@ -2223,7 +2386,8 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
 			if (partidx >= 0)
 				find_matching_subplans_recurse(prunedata,
 											   &prunedata->partrelprunedata[partidx],
-											   initial_prune, validsubplans);
+											   initial_prune, validsubplans,
+											   scan_leafpart_rtis);
 			else
 			{
 				/*
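As an aside for reviewers, the way find_matching_subplans_recurse() now gathers two sets in one walk can be sketched in standalone C.  This is only an illustration under simplifying assumptions: DemoPruneData, demo_find_matching() and demo_rels are hypothetical names, plain 32-bit masks stand in for Bitmapsets, and the "surviving" mask stands in for the result of running the pruning steps.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical, simplified stand-in for PartitionedRelPruningData.  Sets are
 * plain 32-bit masks here instead of Bitmapsets.
 */
typedef struct DemoPruneData
{
	int			nparts;
	uint32_t	surviving;		/* bit i set: partition i survived pruning */
	const int  *subplan_map;	/* subplan index of leaf i, else -1 */
	const int  *subpart_map;	/* index of sub-partitioned table i, else -1 */
	const unsigned int *rti_map;	/* RT index of leaf i, else 0 */
} DemoPruneData;

/*
 * Collect the subplan indexes of surviving leaf partitions and, if the
 * caller asked for them, the RT indexes of those same leaf partitions.
 */
static void
demo_find_matching(const DemoPruneData *all, const DemoPruneData *pprune,
				   uint32_t *validsubplans, uint32_t *scan_leafpart_rtis)
{
	for (int i = 0; i < pprune->nparts; i++)
	{
		if ((pprune->surviving & (UINT32_C(1) << i)) == 0)
			continue;			/* this partition was pruned */

		if (pprune->subplan_map[i] >= 0)
		{
			*validsubplans |= UINT32_C(1) << pprune->subplan_map[i];
			assert(pprune->rti_map[i] > 0);
			if (scan_leafpart_rtis != NULL)
				*scan_leafpart_rtis |= UINT32_C(1) << pprune->rti_map[i];
		}
		else if (pprune->subpart_map[i] >= 0)
			demo_find_matching(all, &all[pprune->subpart_map[i]],
							   validsubplans, scan_leafpart_rtis);
	}
}

/* Two-level example: the root's partition 1 is itself partitioned. */
static const int root_subplan_map[] = {0, -1};
static const int root_subpart_map[] = {-1, 1};
static const unsigned int root_rti_map[] = {2, 0};
static const int child_subplan_map[] = {1, 2};
static const int child_subpart_map[] = {-1, -1};
static const unsigned int child_rti_map[] = {4, 5};

static const DemoPruneData demo_rels[] = {
	/* root: only partition 1 (the sub-partitioned one) survives */
	{2, 0x2, root_subplan_map, root_subpart_map, root_rti_map},
	/* child: only its partition 0 survives */
	{2, 0x1, child_subplan_map, child_subpart_map, child_rti_map},
};
```

Calling demo_find_matching(demo_rels, &demo_rels[0], &subplans, &rtis) with both masks zeroed leaves only subplan 1 and RT index 4 set.  Passing NULL for the last argument skips the RT index collection, as the executor-time callers in the patch do.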
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index 076226868f..ed359b5153 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -842,7 +842,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
 	else
 		dest = None_Receiver;
 
-	es->qd = CreateQueryDesc(es->stmt,
+	es->qd = CreateQueryDesc(es->stmt, NULL,
 							 fcache->src,
 							 GetActiveSnapshot(),
 							 InvalidSnapshot,
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index c6f86a6510..96880e122a 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -155,7 +155,8 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 		 * subplan, we can fill as_valid_subplans immediately, preventing
 		 * later calls to ExecFindMatchingSubPlans.
 		 */
-		if (!prunestate->do_exec_prune && nplans > 0)
+		if (appendstate->as_prune_state == NULL ||
+			(!appendstate->as_prune_state->do_exec_prune && nplans > 0))
 			appendstate->as_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
 	}
 	else
@@ -577,7 +578,7 @@ choose_next_subplan_locally(AppendState *node)
 		}
 		else if (node->as_valid_subplans == NULL)
 			node->as_valid_subplans =
-				ExecFindMatchingSubPlans(node->as_prune_state, false);
+				ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 
 		whichplan = -1;
 	}
@@ -642,7 +643,7 @@ choose_next_subplan_for_leader(AppendState *node)
 		if (node->as_valid_subplans == NULL)
 		{
 			node->as_valid_subplans =
-				ExecFindMatchingSubPlans(node->as_prune_state, false);
+				ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 
 			/*
 			 * Mark each invalid plan as finished to allow the loop below to
@@ -717,7 +718,7 @@ choose_next_subplan_for_worker(AppendState *node)
 	else if (node->as_valid_subplans == NULL)
 	{
 		node->as_valid_subplans =
-			ExecFindMatchingSubPlans(node->as_prune_state, false);
+			ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 		mark_invalid_subplans_as_finished(node);
 	}
 
@@ -868,7 +869,7 @@ ExecAppendAsyncBegin(AppendState *node)
 	if (node->as_valid_subplans == NULL)
 	{
 		node->as_valid_subplans =
-			ExecFindMatchingSubPlans(node->as_prune_state, false);
+			ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 
 		classify_matching_subplans(node);
 	}
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index 8d35860c30..2312e5a633 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -103,7 +103,8 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
 		 * subplan, we can fill ms_valid_subplans immediately, preventing
 		 * later calls to ExecFindMatchingSubPlans.
 		 */
-		if (!prunestate->do_exec_prune && nplans > 0)
+		if (mergestate->ms_prune_state == NULL ||
+			(!mergestate->ms_prune_state->do_exec_prune && nplans > 0))
 			mergestate->ms_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
 	}
 	else
@@ -218,7 +219,7 @@ ExecMergeAppend(PlanState *pstate)
 		 */
 		if (node->ms_valid_subplans == NULL)
 			node->ms_valid_subplans =
-				ExecFindMatchingSubPlans(node->ms_prune_state, false);
+				ExecFindMatchingSubPlans(node->ms_prune_state, false, NULL);
 
 		/*
 		 * First time through: pull the first tuple from each valid subplan,
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 29bc26669b..303a572c02 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1578,6 +1578,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 	CachedPlanSource *plansource;
 	CachedPlan *cplan;
 	List	   *stmt_list;
+	List	   *part_prune_result_list;
 	char	   *query_string;
 	Snapshot	snapshot;
 	MemoryContext oldcontext;
@@ -1657,7 +1658,10 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 	 */
 
 	/* Replan if needed, and increment plan refcount for portal */
-	cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv);
+	cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv,
+						  &part_prune_result_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_result_list));
 	stmt_list = cplan->stmt_list;
 
 	if (!plan->saved)
@@ -1685,6 +1689,9 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 					  stmt_list,
 					  cplan);
 
+	/* Copy PartitionPruneResults into the portal's context. */
+	PortalStorePartitionPruneResults(portal, part_prune_result_list);
+
 	/*
 	 * Set up options for portal.  Default SCROLL type is chosen the same way
 	 * as PerformCursorOpen does it.
@@ -2092,7 +2099,8 @@ SPI_plan_get_cached_plan(SPIPlanPtr plan)
 	/* Get the generic plan for the query */
 	cplan = GetCachedPlan(plansource, NULL,
 						  plan->saved ? CurrentResourceOwner : NULL,
-						  _SPI_current->queryEnv);
+						  _SPI_current->queryEnv,
+						  NULL /* Not interested in PartitionPruneResults */);
 	Assert(cplan == plansource->gplan);
 
 	/* Pop the error context stack */
@@ -2473,7 +2481,9 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 	{
 		CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
 		List	   *stmt_list;
-		ListCell   *lc2;
+		List	   *part_prune_result_list;
+		ListCell   *lc2,
+				   *lc3;
 
 		spicallbackarg.query = plansource->query_string;
 
@@ -2549,8 +2559,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 		 * plan, the refcount must be backed by the plan_owner.
 		 */
 		cplan = GetCachedPlan(plansource, options->params,
-							  plan_owner, _SPI_current->queryEnv);
-
+							  plan_owner, _SPI_current->queryEnv,
+							  &part_prune_result_list);
+		Assert(list_length(cplan->stmt_list) ==
+			   list_length(part_prune_result_list));
 		stmt_list = cplan->stmt_list;
 
 		/*
@@ -2589,9 +2601,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 			}
 		}
 
-		foreach(lc2, stmt_list)
+		forboth(lc2, stmt_list, lc3, part_prune_result_list)
 		{
 			PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+			PartitionPruneResult *part_prune_result = lfirst_node(PartitionPruneResult, lc3);
 			bool		canSetTag = stmt->canSetTag;
 			DestReceiver *dest;
 
@@ -2663,7 +2676,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 				else
 					snap = InvalidSnapshot;
 
-				qdesc = CreateQueryDesc(stmt,
+				qdesc = CreateQueryDesc(stmt, part_prune_result,
 										plansource->query_string,
 										snap, crosscheck_snapshot,
 										dest,
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index bee62fc15c..e7886afa35 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -158,6 +158,11 @@
 	token = pg_strtok(&length);		/* skip :fldname */ \
 	local_node->fldname = readIntCols(len)
 
+/* Read an Index array */
+#define READ_INDEX_ARRAY(fldname, len) \
+	token = pg_strtok(&length);		/* skip :fldname */ \
+	local_node->fldname = readIndexCols(len)
+
 /* Read a bool array */
 #define READ_BOOL_ARRAY(fldname, len) \
 	token = pg_strtok(&length);		/* skip :fldname */ \
@@ -542,7 +547,6 @@ fnname(int numCols) \
  */
 READ_SCALAR_ARRAY(readAttrNumberCols, int16, atoi)
 READ_SCALAR_ARRAY(readOidCols, Oid, atooid)
-/* outfuncs.c has writeIndexCols, but we don't yet need that here */
-/* READ_SCALAR_ARRAY(readIndexCols, Index, atoui) */
+READ_SCALAR_ARRAY(readIndexCols, Index, atoui)
 READ_SCALAR_ARRAY(readIntCols, int, atoi)
 READ_SCALAR_ARRAY(readBoolCols, bool, strtobool)
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index b11249ed8f..7141035cc4 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -519,7 +519,9 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 	result->parallelModeNeeded = glob->parallelModeNeeded;
 	result->planTree = top_plan;
 	result->partPruneInfos = glob->partPruneInfos;
+	result->containsInitialPruning = glob->containsInitialPruning;
 	result->rtable = glob->finalrtable;
+	result->minLockRelids = glob->minLockRelids;
 	result->resultRelations = glob->resultRelations;
 	result->appendRelations = glob->appendRelations;
 	result->subplans = glob->subplans;
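The minLockRelids set copied into the PlannedStmt here is computed in set_plan_references(): it starts as the full range of the query's RT indexes, and the leaf partitions covered by initial pruning are then removed so that AcquireExecutorLocks() can add back only the survivors.  A rough standalone sketch of that set arithmetic follows.  It assumes a 64-bit mask in place of a Bitmapset and a single PartitionPruneInfo; demo_min_lock_relids() is a hypothetical name, not something in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical sketch: compute the set of RT indexes that must always be
 * locked before execution.  Bit N of a mask represents RT index N.
 */
static uint64_t
demo_min_lock_relids(int rtoffset, int nrels, uint64_t leafpart_rtis,
					 bool needs_init_pruning)
{
	uint64_t	min_lock_relids = 0;

	assert(rtoffset >= 0 && nrels > 0 && rtoffset + nrels < 64);

	/* Start with every RT index of the query: rtoffset+1 .. rtoffset+nrels. */
	for (int rti = rtoffset + 1; rti <= rtoffset + nrels; rti++)
		min_lock_relids |= UINT64_C(1) << rti;

	/*
	 * Leaf partitions subject to initial pruning are left out; which of them
	 * to lock is decided only after the pruning steps have been run.
	 */
	if (needs_init_pruning)
		min_lock_relids &= ~leafpart_rtis;

	return min_lock_relids;
}
```

For a parent at RT index 1 with leaf partitions at 2..4, demo_min_lock_relids(0, 4, 0x1C, true) keeps only the parent's bit, whereas passing false keeps all four relations in the set.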
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 720f20f563..61d6934978 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -270,6 +270,16 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 	 */
 	add_rtes_to_flat_rtable(root, false);
 
+	/*
+	 * Add the query's adjusted range of RT indexes to glob->minLockRelids.
+	 * The adjusted RT indexes of prunable relations will be deleted from the
+	 * set below where PartitionPruneInfos are processed.
+	 */
+	glob->minLockRelids =
+		bms_add_range(glob->minLockRelids,
+					  rtoffset + 1,
+					  rtoffset + list_length(root->parse->rtable));
+
 	/*
 	 * Adjust RT indexes of PlanRowMarks and add to final rowmarks list
 	 */
@@ -352,6 +362,7 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 	foreach (lc, root->partPruneInfos)
 	{
 		PartitionPruneInfo *pruneinfo = lfirst(lc);
+		Bitmapset *leafpart_rtis = NULL;
 		ListCell  *l;
 
 		foreach(l, pruneinfo->prune_infos)
@@ -362,15 +373,50 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 			foreach(l2, prune_infos)
 			{
 				PartitionedRelPruneInfo *pinfo = lfirst(l2);
+				int		i;
 
 				/* RT index of the table to which the pinfo belongs. */
 				pinfo->rtindex += rtoffset;
+
+				/* Also adjust those of leaf partitions that might be scanned. */
+				for (i = 0; i < pinfo->nparts; i++)
+				{
+					if (pinfo->rti_map[i] > 0 && pinfo->subplan_map[i] >= 0)
+					{
+						pinfo->rti_map[i] += rtoffset;
+						leafpart_rtis = bms_add_member(leafpart_rtis,
+													   pinfo->rti_map[i]);
+					}
+				}
 			}
 		}
 
+		if (pruneinfo->needs_init_pruning)
+		{
+			glob->containsInitialPruning = true;
+
+			/*
+			 * Delete the leaf partition RTIs from the global set of relations
+			 * to be locked before executing the plan.  AcquireExecutorLocks()
+			 * will find the ones to add to the set after performing initial
+			 * pruning.
+			 */
+			glob->minLockRelids = bms_del_members(glob->minLockRelids,
+												  leafpart_rtis);
+		}
+
 		glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
 	}
 
+	/*
+	 * It seems worth doing a bms_copy() on glob->minLockRelids if we deleted
+	 * bits from it above to get rid of any empty tail bits.  It seems better
+	 * for the loop over this set in AcquireExecutorLocks() to not have to go
+	 * through those useless bit words.
+	 */
+	if (glob->containsInitialPruning)
+		glob->minLockRelids = bms_copy(glob->minLockRelids);
+
 	return result;
 }
 
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index d77f7d3aef..952c5b8327 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -144,7 +144,9 @@ static List *make_partitionedrel_pruneinfo(PlannerInfo *root,
 										   List *prunequal,
 										   Bitmapset *partrelids,
 										   int *relid_subplan_map,
-										   Bitmapset **matchedsubplans);
+										   Bitmapset **matchedsubplans,
+										   bool *needs_init_pruning,
+										   bool *needs_exec_pruning);
 static void gen_partprune_steps(RelOptInfo *rel, List *clauses,
 								PartClauseTarget target,
 								GeneratePruningStepsContext *context);
@@ -234,6 +236,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	int		   *relid_subplan_map;
 	ListCell   *lc;
 	int			i;
+	bool		needs_init_pruning = false;
+	bool		needs_exec_pruning = false;
 
 	/*
 	 * Scan the subpaths to see which ones are scans of partition child
@@ -313,12 +317,16 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		Bitmapset  *partrelids = (Bitmapset *) lfirst(lc);
 		List	   *pinfolist;
 		Bitmapset  *matchedsubplans = NULL;
+		bool		partrel_needs_init_pruning;
+		bool		partrel_needs_exec_pruning;
 
 		pinfolist = make_partitionedrel_pruneinfo(root, parentrel,
 												  prunequal,
 												  partrelids,
 												  relid_subplan_map,
-												  &matchedsubplans);
+												  &matchedsubplans,
+												  &partrel_needs_init_pruning,
+												  &partrel_needs_exec_pruning);
 
 		/* When pruning is possible, record the matched subplans */
 		if (pinfolist != NIL)
@@ -327,6 +335,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 			allmatchedsubplans = bms_join(matchedsubplans,
 										  allmatchedsubplans);
 		}
+
+		needs_init_pruning |= partrel_needs_init_pruning;
+		needs_exec_pruning |= partrel_needs_exec_pruning;
 	}
 
 	pfree(relid_subplan_map);
@@ -341,6 +352,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	/* Else build the result data structure */
 	pruneinfo = makeNode(PartitionPruneInfo);
 	pruneinfo->prune_infos = prunerelinfos;
+	pruneinfo->needs_init_pruning = needs_init_pruning;
+	pruneinfo->needs_exec_pruning = needs_exec_pruning;
 
 	/*
 	 * Some subplans may not belong to any of the identified partitioned rels.
@@ -441,13 +454,18 @@ add_part_relids(List *allpartrelids, Bitmapset *partrelids)
  * If we cannot find any useful run-time pruning steps, return NIL.
  * However, on success, each rel identified in partrelids will have
  * an element in the result list, even if some of them are useless.
+ * *needs_init_pruning and *needs_exec_pruning are set to indicate whether
+ * the returned PartitionedRelPruneInfos contain pruning steps that can be
+ * performed before and after execution begins, respectively.
  */
 static List *
 make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 							  List *prunequal,
 							  Bitmapset *partrelids,
 							  int *relid_subplan_map,
-							  Bitmapset **matchedsubplans)
+							  Bitmapset **matchedsubplans,
+							  bool *needs_init_pruning,
+							  bool *needs_exec_pruning)
 {
 	RelOptInfo *targetpart = NULL;
 	List	   *pinfolist = NIL;
@@ -458,6 +476,10 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	int			rti;
 	int			i;
 
+	/* Will find out below. */
+	*needs_init_pruning = false;
+	*needs_exec_pruning = false;
+
 	/*
 	 * Examine each partitioned rel, constructing a temporary array to map
 	 * from planner relids to index of the partitioned rel, and building a
@@ -545,6 +567,9 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		 * executor per-scan pruning steps.  This first pass creates startup
 		 * pruning steps and detects whether there's any possibly-useful quals
 		 * that would require per-scan pruning.
+		 *
+		 * The first pass also determines whether the 2nd pass is necessary by
+		 * checking for the presence of EXEC parameters.
 		 */
 		gen_partprune_steps(subpart, partprunequal, PARTTARGET_INITIAL,
 							&context);
@@ -619,6 +644,12 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		pinfo->execparamids = execparamids;
 		/* Remaining fields will be filled in the next loop */
 
+		/* record which types of pruning steps we've seen so far */
+		if (initial_pruning_steps != NIL)
+			*needs_init_pruning = true;
+		if (exec_pruning_steps != NIL)
+			*needs_exec_pruning = true;
+
 		pinfolist = lappend(pinfolist, pinfo);
 	}
 
@@ -646,6 +677,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		int		   *subplan_map;
 		int		   *subpart_map;
 		Oid		   *relid_map;
+		Index	   *rti_map;
 
 		/*
 		 * Construct the subplan and subpart maps for this partitioning level.
@@ -658,6 +690,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		subpart_map = (int *) palloc(nparts * sizeof(int));
 		memset(subpart_map, -1, nparts * sizeof(int));
 		relid_map = (Oid *) palloc0(nparts * sizeof(Oid));
+		rti_map = (Index *) palloc0(nparts * sizeof(Index));
 		present_parts = NULL;
 
 		i = -1;
@@ -672,6 +705,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 			subplan_map[i] = subplanidx = relid_subplan_map[partrel->relid] - 1;
 			subpart_map[i] = subpartidx = relid_subpart_map[partrel->relid] - 1;
 			relid_map[i] = planner_rt_fetch(partrel->relid, root)->relid;
+			rti_map[i] = partrel->relid;
 			if (subplanidx >= 0)
 			{
 				present_parts = bms_add_member(present_parts, i);
@@ -696,6 +730,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		pinfo->subplan_map = subplan_map;
 		pinfo->subpart_map = subpart_map;
 		pinfo->relid_map = relid_map;
+		pinfo->rti_map = rti_map;
 	}
 
 	pfree(relid_subpart_map);
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 078fbdb5a0..02fc5a011b 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1603,6 +1603,7 @@ exec_bind_message(StringInfo input_message)
 	int16	   *rformats = NULL;
 	CachedPlanSource *psrc;
 	CachedPlan *cplan;
+	List	   *part_prune_result_list;
 	Portal		portal;
 	char	   *query_string;
 	char	   *saved_stmt_name;
@@ -1978,7 +1979,9 @@ exec_bind_message(StringInfo input_message)
 	 * will be generated in MessageContext.  The plan refcount will be
 	 * assigned to the Portal, so it will be released at portal destruction.
 	 */
-	cplan = GetCachedPlan(psrc, params, NULL, NULL);
+	cplan = GetCachedPlan(psrc, params, NULL, NULL, &part_prune_result_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_result_list));
 
 	/*
 	 * Now we can define the portal.
@@ -1993,6 +1996,9 @@ exec_bind_message(StringInfo input_message)
 					  cplan->stmt_list,
 					  cplan);
 
+	/* Copy PartitionPruneResults into the portal's context. */
+	PortalStorePartitionPruneResults(portal, part_prune_result_list);
+
 	/* Done with the snapshot used for parameter I/O and parsing/planning */
 	if (snapshot_set)
 		PopActiveSnapshot();
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 5aa5a350f3..8cc2e2162d 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -35,7 +35,7 @@
 Portal		ActivePortal = NULL;
 
 
-static void ProcessQuery(PlannedStmt *plan,
+static void ProcessQuery(PlannedStmt *plan, PartitionPruneResult *part_prune_result,
 						 const char *sourceText,
 						 ParamListInfo params,
 						 QueryEnvironment *queryEnv,
@@ -65,6 +65,7 @@ static void DoPortalRewind(Portal portal);
  */
 QueryDesc *
 CreateQueryDesc(PlannedStmt *plannedstmt,
+				PartitionPruneResult *part_prune_result,
 				const char *sourceText,
 				Snapshot snapshot,
 				Snapshot crosscheck_snapshot,
@@ -77,6 +78,8 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
 
 	qd->operation = plannedstmt->commandType;	/* operation */
 	qd->plannedstmt = plannedstmt;	/* plan */
+	qd->part_prune_result = part_prune_result;	/* ExecutorDoInitialPruning()
+												 * output for plan */
 	qd->sourceText = sourceText;	/* query text */
 	qd->snapshot = RegisterSnapshot(snapshot);	/* snapshot */
 	/* RI check snapshot */
@@ -122,6 +125,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  *		PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
  *
  *	plan: the plan tree for the query
+ *	part_prune_result: ExecutorDoInitialPruning() output for the plan tree
  *	sourceText: the source text of the query
  *	params: any parameters needed
  *	dest: where to send results
@@ -134,6 +138,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  */
 static void
 ProcessQuery(PlannedStmt *plan,
+			 PartitionPruneResult *part_prune_result,
 			 const char *sourceText,
 			 ParamListInfo params,
 			 QueryEnvironment *queryEnv,
@@ -145,7 +150,7 @@ ProcessQuery(PlannedStmt *plan,
 	/*
 	 * Create the QueryDesc object
 	 */
-	queryDesc = CreateQueryDesc(plan, sourceText,
+	queryDesc = CreateQueryDesc(plan, part_prune_result, sourceText,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, params, queryEnv, 0);
 
@@ -491,8 +496,13 @@ PortalStart(Portal portal, ParamListInfo params,
 				/*
 				 * Create QueryDesc in portal's context; for the moment, set
 				 * the destination to DestNone.
+				 *
+				 * There is no PartitionPruneResult unless the PlannedStmt is
+				 * from a CachedPlan.
 				 */
 				queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+											portal->part_prune_results == NIL ? NULL :
+											linitial(portal->part_prune_results),
 											portal->sourceText,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
@@ -1225,6 +1235,8 @@ PortalRunMulti(Portal portal,
 
 		if (pstmt->utilityStmt == NULL)
 		{
+			PartitionPruneResult *part_prune_result = NULL;
+
 			/*
 			 * process a plannable query.
 			 */
@@ -1271,10 +1283,18 @@ PortalRunMulti(Portal portal,
 			else
 				UpdateActiveSnapshotCommandId();
 
+			/*
+			 * Determine if there's a corresponding PartitionPruneResult for
+			 * this PlannedStmt.
+			 */
+			if (portal->part_prune_results != NIL)
+				part_prune_result = list_nth(portal->part_prune_results,
+											 foreach_current_index(stmtlist_item));
+
 			if (pstmt->canSetTag)
 			{
 				/* statement can set tag string */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, part_prune_result,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
@@ -1283,7 +1303,7 @@ PortalRunMulti(Portal portal,
 			else
 			{
 				/* stmt added by rewrite cannot set tag */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, part_prune_result,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 0d6a295674..d1c9605979 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -99,14 +99,19 @@ static dlist_head cached_expression_list = DLIST_STATIC_INIT(cached_expression_l
 static void ReleaseGenericPlan(CachedPlanSource *plansource);
 static List *RevalidateCachedQuery(CachedPlanSource *plansource,
 								   QueryEnvironment *queryEnv);
-static bool CheckCachedPlan(CachedPlanSource *plansource);
+static bool CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+							List **part_prune_result_list);
 static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
-								   ParamListInfo boundParams, QueryEnvironment *queryEnv);
+								   ParamListInfo boundParams, QueryEnvironment *queryEnv,
+								   List **part_prune_result_list);
 static bool choose_custom_plan(CachedPlanSource *plansource,
 							   ParamListInfo boundParams);
 static double cached_plan_cost(CachedPlan *plan, bool include_planner);
 static Query *QueryListGetPrimaryStmt(List *stmts);
-static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static void AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+								 List **part_prune_result_list,
+								 List **lockedRelids_per_stmt);
+static void ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt);
 static void AcquirePlannerLocks(List *stmt_list, bool acquire);
 static void ScanQueryForLocks(Query *parsetree, bool acquire);
 static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -790,15 +795,20 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
  *
  * On a "true" return, we have acquired the locks needed to run the plan.
  * (We must do this for the "true" result to be race-condition-free.)
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_result_list.
  */
 static bool
-CheckCachedPlan(CachedPlanSource *plansource)
+CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+				List **part_prune_result_list)
 {
 	CachedPlan *plan = plansource->gplan;
 
 	/* Assert that caller checked the querytree */
 	Assert(plansource->is_valid);
 
+	*part_prune_result_list = NIL;
+
 	/* If there's no generic plan, just say "false" */
 	if (!plan)
 		return false;
@@ -820,13 +830,21 @@ CheckCachedPlan(CachedPlanSource *plansource)
 	 */
 	if (plan->is_valid)
 	{
+		List   *lockedRelids_per_stmt;
+
 		/*
 		 * Plan must have positive refcount because it is referenced by
 		 * plansource; so no need to fear it disappears under us here.
 		 */
 		Assert(plan->refcount > 0);
 
-		AcquireExecutorLocks(plan->stmt_list, true);
+		/*
+		 * Lock relations scanned by the plan.  This is where the pruning
+		 * happens if needed.
+		 */
+		AcquireExecutorLocks(plan->stmt_list, boundParams,
+							 part_prune_result_list,
+							 &lockedRelids_per_stmt);
 
 		/*
 		 * If plan was transient, check to see if TransactionXmin has
@@ -848,7 +866,14 @@ CheckCachedPlan(CachedPlanSource *plansource)
 		}
 
 		/* Oops, the race case happened.  Release useless locks. */
-		AcquireExecutorLocks(plan->stmt_list, false);
+		ReleaseExecutorLocks(plan->stmt_list, lockedRelids_per_stmt);
+
+		/*
+		 * The output list and any objects therein have been allocated in the
+		 * caller's hopefully short-lived context, so they will not stay leaked
+		 * for long; still, reset the list so it isn't accidentally looked at.
+		 */
+		*part_prune_result_list = NIL;
 	}
 
 	/*
@@ -874,10 +899,15 @@ CheckCachedPlan(CachedPlanSource *plansource)
  * Planning work is done in the caller's memory context.  The finished plan
  * is in a child memory context, which typically should get reparented
  * (unless this is a one-shot plan, in which case we don't copy the plan).
+ *
+ * A list of NULLs is returned in *part_prune_result_list, meaning that no
+ * PartitionPruneResult nodes have yet been created for the plans in
+ * stmt_list.
  */
 static CachedPlan *
 BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
-				ParamListInfo boundParams, QueryEnvironment *queryEnv)
+				ParamListInfo boundParams, QueryEnvironment *queryEnv,
+				List **part_prune_result_list)
 {
 	CachedPlan *plan;
 	List	   *plist;
@@ -1007,6 +1037,17 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 
 	MemoryContextSwitchTo(oldcxt);
 
+	/*
+	 * No actual PartitionPruneResults yet to add, though must initialize
+	 * the list to have the same number of elements as the list of
+	 * PlannedStmts.
+	 */
+	*part_prune_result_list = NIL;
+	foreach(lc, plist)
+	{
+		*part_prune_result_list = lappend(*part_prune_result_list, NULL);
+	}
+
 	return plan;
 }
 
@@ -1126,6 +1167,17 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
  * plan or a custom plan for the given parameters: the caller does not know
  * which it will get.
  *
+ * For every PlannedStmt found in the returned CachedPlan, an element that
+ * is either a PartitionPruneResult or a NULL is added to
+ * *part_prune_result_list if needed.  The former if the PlannedStmt is from
+ * the existing CachedPlan that is otherwise valid and contains at least one
+ * PartitionPruneInfo that has "initial" pruning steps.  Those steps are
+ * performed by calling ExecutorDoInitialPruning() to determine only those
+ * leaf partitions that need to be locked by AcquireExecutorLocks() by pruning
+ * away subplans that don't match the pruning conditions.  The
+ * PartitionPruneResult contains a list of bitmapsets of the indexes of
+ * matching subplans, one for each PartitionPruneInfo.
+ *
  * On return, the plan is valid and we have sufficient locks to begin
  * execution.
  *
@@ -1139,11 +1191,13 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
  */
 CachedPlan *
 GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
-			  ResourceOwner owner, QueryEnvironment *queryEnv)
+			  ResourceOwner owner, QueryEnvironment *queryEnv,
+			  List **part_prune_result_list)
 {
 	CachedPlan *plan = NULL;
 	List	   *qlist;
 	bool		customplan;
+	List	   *my_part_prune_result_list;
 
 	/* Assert caller is doing things in a sane order */
 	Assert(plansource->magic == CACHEDPLANSOURCE_MAGIC);
@@ -1160,7 +1214,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 
 	if (!customplan)
 	{
-		if (CheckCachedPlan(plansource))
+		if (CheckCachedPlan(plansource, boundParams,
+							&my_part_prune_result_list))
 		{
 			/* We want a generic plan, and we already have a valid one */
 			plan = plansource->gplan;
@@ -1169,7 +1224,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 		else
 		{
 			/* Build a new generic plan */
-			plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv);
+			plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv,
+								   &my_part_prune_result_list);
 			/* Just make real sure plansource->gplan is clear */
 			ReleaseGenericPlan(plansource);
 			/* Link the new generic plan into the plansource */
@@ -1214,7 +1270,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 	if (customplan)
 	{
 		/* Build a custom plan */
-		plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv);
+		plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv,
+							   &my_part_prune_result_list);
 		/* Accumulate total costs of custom plans */
 		plansource->total_custom_cost += cached_plan_cost(plan, true);
 
@@ -1246,6 +1303,9 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 		plan->is_saved = true;
 	}
 
+	if (part_prune_result_list)
+		*part_prune_result_list = my_part_prune_result_list;
+
 	return plan;
 }
 
@@ -1737,17 +1797,29 @@ QueryListGetPrimaryStmt(List *stmts)
 
 /*
  * AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
- * or release them if acquire is false.
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_result_list.
+ *
+ * On return, *lockedRelids_per_stmt will contain a bitmapset for every
+ * PlannedStmt in stmt_list, containing the RT indexes of relation entries
+ * in its range table that were actually locked, or NULL if the PlannedStmt
+ * contains a utility statement.
  */
 static void
-AcquireExecutorLocks(List *stmt_list, bool acquire)
+AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+					 List **part_prune_result_list,
+					 List **lockedRelids_per_stmt)
 {
 	ListCell   *lc1;
 
+	*part_prune_result_list = *lockedRelids_per_stmt = NIL;
 	foreach(lc1, stmt_list)
 	{
 		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
-		ListCell   *lc2;
+		PartitionPruneResult *part_prune_result = NULL;
+		Bitmapset  *allLockRelids;
+		Bitmapset  *lockedRelids = NULL;
+		int			rti;
 
 		if (plannedstmt->commandType == CMD_UTILITY)
 		{
@@ -1761,13 +1833,38 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 			Query	   *query = UtilityContainsQuery(plannedstmt->utilityStmt);
 
 			if (query)
-				ScanQueryForLocks(query, acquire);
+				ScanQueryForLocks(query, true);
+			*part_prune_result_list = lappend(*part_prune_result_list, NULL);
 			continue;
 		}
 
-		foreach(lc2, plannedstmt->rtable)
+		/*
+		 * Figure out the set of relations that would need to be locked
+		 * before executing the plan.
+		 */
+		if (plannedstmt->containsInitialPruning)
 		{
-			RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
+			/*
+			 * Obtain the set of leaf partitions to be locked.
+			 *
+			 * The following does initial partition pruning using the
+			 * PartitionPruneInfos found in plannedstmt->partPruneInfos and
+			 * finds leaf partitions that survive that pruning across all the
+			 * nodes in the plan tree.
+			 */
+			part_prune_result =
+				ExecutorDoInitialPruning(plannedstmt, boundParams);
+
+			allLockRelids = bms_union(plannedstmt->minLockRelids,
+									  part_prune_result->scan_leafpart_rtis);
+		}
+		else
+			allLockRelids = plannedstmt->minLockRelids;
+
+		rti = -1;
+		while ((rti = bms_next_member(allLockRelids, rti)) > 0)
+		{
+			RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
 
 			if (rte->rtekind != RTE_RELATION)
 				continue;
@@ -1778,10 +1875,58 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 			 * fail if it's been dropped entirely --- we'll just transiently
 			 * acquire a non-conflicting lock.
 			 */
-			if (acquire)
-				LockRelationOid(rte->relid, rte->rellockmode);
-			else
-				UnlockRelationOid(rte->relid, rte->rellockmode);
+			LockRelationOid(rte->relid, rte->rellockmode);
+			lockedRelids = bms_add_member(lockedRelids, rti);
+		}
+
+		*part_prune_result_list = lappend(*part_prune_result_list,
+										  part_prune_result);
+		*lockedRelids_per_stmt = lappend(*lockedRelids_per_stmt, lockedRelids);
+	}
+}
+
+/*
+ * ReleaseExecutorLocks
+ * 		Release locks that would've been acquired by an earlier call to
+ * 		AcquireExecutorLocks()
+ */
+static void
+ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt)
+{
+	ListCell   *lc1,
+			   *lc2;
+
+	forboth(lc1, stmt_list, lc2, lockedRelids_per_stmt)
+	{
+		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
+		Bitmapset  *lockedRelids = lfirst(lc2);
+		int			rti;
+
+		if (plannedstmt->commandType == CMD_UTILITY)
+		{
+			/*
+			 * Ignore utility statements, except those (such as EXPLAIN) that
+			 * contain a parsed-but-not-planned query.  Note: it's okay to use
+			 * ScanQueryForLocks, even though the query hasn't been through
+			 * rule rewriting, because rewriting doesn't change the query
+			 * representation.
+			 */
+			Query	   *query = UtilityContainsQuery(plannedstmt->utilityStmt);
+
+			if (query)
+				ScanQueryForLocks(query, false);
+			continue;
+		}
+
+		rti = -1;
+		while ((rti = bms_next_member(lockedRelids, rti)) >= 0)
+		{
+			RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
+
+			Assert(rte->rtekind == RTE_RELATION);
+
+			/* See the comment in AcquireExecutorLocks(). */
+			UnlockRelationOid(rte->relid, rte->rellockmode);
 		}
 	}
 }
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index 3a161bdb88..27407a7f0f 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -303,6 +303,25 @@ PortalDefineQuery(Portal portal,
 	portal->status = PORTAL_DEFINED;
 }
 
+/*
+ * PortalStorePartitionPruneResults
+ *		Copy the given list of PartitionPruneResults into the portal's
+ *		context
+ *
+ * This allows the caller to ensure that the list exists as long as the portal
+ * does.
+ */
+void
+PortalStorePartitionPruneResults(Portal portal, List *part_prune_results)
+{
+	MemoryContext	oldcxt;
+
+	AssertArg(PortalIsValid(portal));
+	oldcxt = MemoryContextSwitchTo(portal->portalContext);
+	portal->part_prune_results = copyObject(part_prune_results);
+	MemoryContextSwitchTo(oldcxt);
+}
+
 /*
  * PortalReleaseCachedPlan
  *		Release a portal's reference to its cached plan, if any.
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 9ebde089ae..e57e133f0e 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -87,7 +87,9 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
 							  ExplainState *es, const char *queryString,
 							  ParamListInfo params, QueryEnvironment *queryEnv);
 
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt,
+						   PartitionPruneResult *part_prune_result,
+						   IntoClause *into,
 						   ExplainState *es, const char *queryString,
 						   ParamListInfo params, QueryEnvironment *queryEnv,
 						   const instr_time *planduration,
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index bf962af7af..bd8776402e 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -45,6 +45,7 @@ extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
  * nparts						Length of subplan_map[] and subpart_map[].
  * subplan_map					Subplan index by partition index, or -1.
  * subpart_map					Subpart index by partition index, or -1.
+ * rti_map						Range table index by partition index, or 0.
  * present_parts				A Bitmapset of the partition indexes that we
  *								have subplans or subparts for.
  * initial_pruning_steps		List of PartitionPruneSteps used to
@@ -61,6 +62,7 @@ typedef struct PartitionedRelPruningData
 	int			nparts;
 	int		   *subplan_map;
 	int		   *subpart_map;
+	Index	   *rti_map;
 	Bitmapset  *present_parts;
 	List	   *initial_pruning_steps;
 	List	   *exec_pruning_steps;
@@ -126,5 +128,10 @@ extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
 													 int part_prune_index,
 													 Bitmapset **initially_valid_subplans);
 extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
-										   bool initial_prune);
+										   bool initial_prune,
+										   Bitmapset **scan_leafpart_rtis);
+extern Bitmapset *ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt,
+								ParamListInfo params,
+								PartitionPruneInfo *pruneinfo,
+								Bitmapset **scan_leafpart_rtis);
 #endif							/* EXECPARTITION_H */
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index e79e2c001f..60d5644908 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -35,6 +35,8 @@ typedef struct QueryDesc
 	/* These fields are provided by CreateQueryDesc */
 	CmdType		operation;		/* CMD_SELECT, CMD_UPDATE, etc. */
 	PlannedStmt *plannedstmt;	/* planner's output (could be utility, too) */
+	PartitionPruneResult *part_prune_result; /* ExecutorDoInitialPruning()'s
+											  * output for plannedstmt */
 	const char *sourceText;		/* source text of the query */
 	Snapshot	snapshot;		/* snapshot to use for query */
 	Snapshot	crosscheck_snapshot;	/* crosscheck for RI update/delete */
@@ -57,6 +59,7 @@ typedef struct QueryDesc
 
 /* in pquery.c */
 extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
+								  PartitionPruneResult *part_prune_result,
 								  const char *sourceText,
 								  Snapshot snapshot,
 								  Snapshot crosscheck_snapshot,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index d68a6b9d28..5c4a282be0 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -185,6 +185,8 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
 /*
  * prototypes from functions in execMain.c
  */
+extern PartitionPruneResult *ExecutorDoInitialPruning(PlannedStmt *plannedstmt,
+													  ParamListInfo params);
 extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
 extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
 extern void ExecutorRun(QueryDesc *queryDesc,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 63a89474db..12ea06c2f6 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -1001,6 +1001,33 @@ typedef struct DomainConstraintState
  */
 typedef TupleTableSlot *(*ExecProcNodeMtd) (struct PlanState *pstate);
 
+/*----------------
+ * PartitionPruneResult
+ *
+ * The result of invoking ExecutorDoInitialPruning() on a given
+ * PlannedStmt.
+ *
+ * Contains a list of Bitmapset of the indexes of the subplans remaining after
+ * performing initial pruning by calling ExecFindMatchingSubPlans() for every
+ * PartitionPruneInfo found in PlannedStmt.partPruneInfos.  RT indexes of the
+ * leaf partitions scanned by those subplans across all PartitionPruneInfos
+ * are added into scan_leafpart_rtis.
+ *
+ * This is used by GetCachedPlan() to inform its callers of the pruning
+ * decisions made when performing AcquireExecutorLocks() on a given cached
+ * PlannedStmt, which the callers then pass on to the executor.  The executor
+ * refers to this node when initializing the plan nodes which contain subplans
+ * that may have been pruned by ExecutorDoInitialPruning(), rather than
+ * redoing initial pruning.
+ */
+typedef struct PartitionPruneResult
+{
+	NodeTag		type;
+
+	List		   *valid_subplan_offs_list;
+	Bitmapset	   *scan_leafpart_rtis;
+} PartitionPruneResult;
+
 /* ----------------
  *		PlanState node
  *
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index cdd6debfa0..b33d9e426d 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -212,6 +212,7 @@ extern struct Bitmapset *readBitmapset(void);
 extern uintptr_t readDatum(bool typbyval);
 extern bool *readBoolCols(int numCols);
 extern int *readIntCols(int numCols);
+extern Index *readIndexCols(int numCols);
 extern Oid *readOidCols(int numCols);
 extern int16 *readAttrNumberCols(int numCols);
 
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index a4e6b4db92..86eda6c7c3 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -125,6 +125,19 @@ typedef struct PlannerGlobal
 	/* List of PartitionPruneInfo contained in the plan */
 	List	   *partPruneInfos;
 
+	/*
+	 * Do any of those PartitionPruneInfos have initial (pre-exec) pruning
+	 * steps in them?
+	 */
+	bool		containsInitialPruning;
+
+	/*
+	 * Indexes of all range table entries minus indexes of range table entries
+	 * of the leaf partitions scanned by prunable subplans; see
+	 * AcquireExecutorLocks()
+	 */
+	Bitmapset  *minLockRelids;
+
 	/* OIDs of relations the plan depends on */
 	List	   *relationOids;
 
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index f2daabb3b7..1d2c0d9bdf 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -72,8 +72,17 @@ typedef struct PlannedStmt
 	List	   *partPruneInfos;	/* List of PartitionPruneInfo contained in
 								 * the plan */
 
+	bool		containsInitialPruning;	/* Do any of those PartitionPruneInfos
+										 * have initial (pre-exec) pruning
+										 * steps in them? */
+
 	List	   *rtable;			/* list of RangeTblEntry nodes */
 
+	Bitmapset  *minLockRelids;	/* Indexes of all range table entries minus
+								 * indexes of range table entries of the leaf
+								 * partitions scanned by prunable subplans;
+								 * see AcquireExecutorLocks() */
+
 	/* rtable indexes of target relations for INSERT/UPDATE/DELETE */
 	List	   *resultRelations;	/* integer list of RT indexes, or NIL */
 
@@ -1409,6 +1418,13 @@ typedef struct PlanRowMark
  * prune_infos			List of Lists containing PartitionedRelPruneInfo nodes,
  *						one sublist per run-time-prunable partition hierarchy
  *						appearing in the parent plan node's subplans.
+ *
+ * needs_init_pruning	Do any of the PartitionedRelPruneInfos in
+ *						prune_infos have initial_pruning_steps set?
+ *
+ * needs_exec_pruning	Do any of the PartitionedRelPruneInfos in
+ *						prune_infos have exec_pruning_steps set?
+ *
  * other_subplans		Indexes of any subplans that are not accounted for
  *						by any of the PartitionedRelPruneInfo nodes in
  *						"prune_infos".  These subplans must not be pruned.
@@ -1419,6 +1435,8 @@ typedef struct PartitionPruneInfo
 
 	NodeTag		type;
 	List	   *prune_infos;
+	bool		needs_init_pruning;
+	bool		needs_exec_pruning;
 	Bitmapset  *other_subplans;
 } PartitionPruneInfo;
 
@@ -1463,6 +1481,9 @@ typedef struct PartitionedRelPruneInfo
 	/* relation OID by partition index, or 0 */
 	Oid		   *relid_map pg_node_attr(array_size(nparts));
 
+	/* Range table index by partition index, or 0. */
+	Index	   *rti_map pg_node_attr(array_size(nparts));
+
 	/*
 	 * initial_pruning_steps shows how to prune during executor startup (i.e.,
 	 * without use of any PARAM_EXEC Params); it is NIL if no startup pruning
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 0499635f59..1c5bb5ece1 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -220,7 +220,8 @@ extern List *CachedPlanGetTargetList(CachedPlanSource *plansource,
 extern CachedPlan *GetCachedPlan(CachedPlanSource *plansource,
 								 ParamListInfo boundParams,
 								 ResourceOwner owner,
-								 QueryEnvironment *queryEnv);
+								 QueryEnvironment *queryEnv,
+								 List **part_prune_result_list);
 extern void ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner);
 
 extern bool CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index aeddbdafe5..9f7727a837 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -138,6 +138,7 @@ typedef struct PortalData
 	QueryCompletion qc;			/* command completion data for executed query */
 	List	   *stmts;			/* list of PlannedStmts */
 	CachedPlan *cplan;			/* CachedPlan, if stmts are from one */
+	List	   *part_prune_results;	/* list of PartitionPruneResults */
 
 	ParamListInfo portalParams; /* params to pass to query */
 	QueryEnvironment *queryEnv; /* environment for query */
@@ -242,6 +243,8 @@ extern void PortalDefineQuery(Portal portal,
 							  CommandTag commandTag,
 							  List *stmts,
 							  CachedPlan *cplan);
+extern void PortalStorePartitionPruneResults(Portal portal,
+											 List *part_prune_result_list);
 extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
 extern void PortalCreateHoldStore(Portal portal);
 extern void PortalHashTableDeleteAll(void);
-- 
2.35.3
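
As a rough illustration of the locking rule that the patch implements in
AcquireExecutorLocks(), here is a minimal, self-contained sketch.  It is not
code from the patch: it assumes a toy 64-bit mask in place of Bitmapset and a
hypothetical LeafPart struct with integer range bounds, and it models
"initial" pruning as a simple range check against one bound parameter.  Only
the idea is taken from the patch: lock minLockRelids plus whichever leaf
partitions survive initial pruning.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for a Bitmapset: bit i set means RT index i is a member. */
typedef uint64_t RelidSet;

/* Hypothetical, simplified description of one prunable leaf partition. */
typedef struct LeafPart
{
	int			rti;		/* range table index of the partition */
	int			lower;		/* inclusive lower bound of its key range */
	int			upper;		/* exclusive upper bound of its key range */
} LeafPart;

/*
 * Toy "initial" pruning for a qual of the form key = $1: keep only the
 * partitions whose range can contain the bound parameter value.
 */
static RelidSet
surviving_leafpart_rtis(const LeafPart *parts, int nparts, int param)
{
	RelidSet	result = 0;

	for (int i = 0; i < nparts; i++)
	{
		if (param >= parts[i].lower && param < parts[i].upper)
			result |= (RelidSet) 1 << parts[i].rti;
	}
	return result;
}

/*
 * Relations to lock: those that must always be locked (min_lock_relids),
 * plus the leaf partitions that survived initial pruning, if the plan
 * contains any initial pruning steps at all.
 */
static RelidSet
relids_to_lock(RelidSet min_lock_relids, bool contains_initial_pruning,
			   const LeafPart *parts, int nparts, int param)
{
	if (!contains_initial_pruning)
		return min_lock_relids;
	return min_lock_relids | surviving_leafpart_rtis(parts, nparts, param);
}

/* Number of relations in the set, i.e. how many locks would be taken. */
static int
count_relids(RelidSet set)
{
	int			n = 0;

	for (; set != 0; set &= set - 1)
		n++;
	return n;
}
```

In this sketch, with a parent at RT index 1 and three range partitions at RT
indexes 2, 3 and 4 covering [0,10), [10,20) and [20,30), a bound parameter
value of 15 leads to locking only RT indexes 1 and 3 rather than all four
relations.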




* Re: generic plans and "initial" pruning
@ 2022-07-27 16:27  Robert Haas <[email protected]>
  parent: Amit Langote <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: Robert Haas @ 2022-07-27 16:27 UTC (permalink / raw)
  To: Amit Langote <[email protected]>; +Cc: Jacob Champion <[email protected]>; Zhihong Yu <[email protected]>; David Rowley <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers

On Tue, Jul 26, 2022 at 11:01 PM Amit Langote <[email protected]> wrote:
> Needed to be rebased again, over 2d04277121f this time.

0001 adds es_part_prune_result but does not use it, so maybe the
introduction of that field should be deferred until it's needed for
something.

I wonder whether it's really necessary to add the PartitionPruneInfo
objects to a list in PlannerInfo first and then roll them up into
PlannerGlobal later. I know we do that for range table entries, but
I've never quite understood why we do it that way instead of creating
a flat range table in PlannerGlobal from the start. And so by
extension I wonder whether this table couldn't be flat from the start
also.

-- 
Robert Haas
EDB: http://www.enterprisedb.com






* Re: generic plans and "initial" pruning
@ 2022-07-29 04:20  Amit Langote <[email protected]>
  parent: Robert Haas <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: Amit Langote @ 2022-07-29 04:20 UTC (permalink / raw)
  To: Robert Haas <[email protected]>; +Cc: Jacob Champion <[email protected]>; Zhihong Yu <[email protected]>; David Rowley <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers

On Thu, Jul 28, 2022 at 1:27 AM Robert Haas <[email protected]> wrote:
> On Tue, Jul 26, 2022 at 11:01 PM Amit Langote <[email protected]> wrote:
> > Needed to be rebased again, over 2d04277121f this time.

Thanks for looking.

> 0001 adds es_part_prune_result but does not use it, so maybe the
> introduction of that field should be deferred until it's needed for
> something.

Oops, looks like a mistake when breaking the patch.  Will move that bit to 0002.

> I wonder whether it's really necessary to add the PartitionPruneInfo
> objects to a list in PlannerInfo first and then roll them up into
> PlannerGlobal later. I know we do that for range table entries, but
> I've never quite understood why we do it that way instead of creating
> a flat range table in PlannerGlobal from the start. And so by
> extension I wonder whether this table couldn't be flat from the start
> also.

Tom may want to correct me but my understanding of why the planner
waits till the end of planning to start populating the PlannerGlobal
range table is that it is not until then that we know which subqueries
will be scanned by the final plan tree, so also whose range table
entries will be included in the range table passed to the executor.  I
can see that subquery pull-up causes a pulled-up subquery's range
table entries to be added into the parent query's and all its nodes
changed using OffsetVarNodes() to refer to the new RT indexes.  But
for subqueries that are not pulled up, their subplans' nodes (present
in PlannerGlobal.subplans) would still refer to the original RT
indexes (per range table in the corresponding PlannerGlobal.subroot),
which must be fixed and the end of planning is the time to do so.  Or
maybe that could be done when build_subplan() creates a subplan and
adds it to PlannerGlobal.subplans, but for some reason it's not?
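
To make the renumbering described above concrete, here is a minimal
sketch of appending a subquery-local range table to a flat one and then
shifting the RT indexes that referred to the local table.  Everything in
it (ToyRangeTable, ToyVar, toy_flatten_rtable, toy_fix_vars, toy_lookup)
is a hypothetical stand-in for illustration, not the planner's actual
data structures or the code in setrefs.c.

```c
#include <assert.h>

#define TOY_MAX_RTES 16

/* Hypothetical stand-in for a range table: an array of relation OIDs. */
typedef struct ToyRangeTable
{
	unsigned int relids[TOY_MAX_RTES];
	int			nentries;
} ToyRangeTable;

/* Hypothetical stand-in for a Var: only the 1-based RT index. */
typedef struct ToyVar
{
	int			varno;
} ToyVar;

/*
 * Append a subquery-local range table to the flat one.  Returns the offset
 * that must be added to every RT index that referred to the local table.
 */
static int
toy_flatten_rtable(ToyRangeTable *flat, const ToyRangeTable *local)
{
	int			rtoffset = flat->nentries;

	assert(flat->nentries + local->nentries <= TOY_MAX_RTES);
	for (int i = 0; i < local->nentries; i++)
		flat->relids[flat->nentries++] = local->relids[i];

	return rtoffset;
}

/* Shift the RT index of each Var that belonged to the local table. */
static void
toy_fix_vars(ToyVar *vars, int nvars, int rtoffset)
{
	for (int i = 0; i < nvars; i++)
		vars[i].varno += rtoffset;
}

/* Look up the relation a Var points to in the flat range table. */
static unsigned int
toy_lookup(const ToyRangeTable *flat, const ToyVar *var)
{
	assert(var->varno >= 1 && var->varno <= flat->nentries);
	return flat->relids[var->varno - 1];
}
```

In this toy model, a Var with varno 1 in a subquery whose range table is
appended after two parent entries ends up with varno 3, which is the
kind of fix-up that has to wait until the set of surviving subqueries is
known.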

--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com






* Re: generic plans and "initial" pruning
@ 2022-07-29 04:55  Tom Lane <[email protected]>
  parent: Amit Langote <[email protected]>
  1 sibling, 2 replies; 108+ messages in thread

From: Tom Lane @ 2022-07-29 04:55 UTC (permalink / raw)
  To: Amit Langote <[email protected]>; +Cc: Robert Haas <[email protected]>; Jacob Champion <[email protected]>; Zhihong Yu <[email protected]>; David Rowley <[email protected]>; pgsql-hackers

Amit Langote <[email protected]> writes:
> On Thu, Jul 28, 2022 at 1:27 AM Robert Haas <[email protected]> wrote:
>> I wonder whether it's really necessary to add the PartitionPruneInfo
>> objects to a list in PlannerInfo first and then roll them up into
>> PlannerGlobal later. I know we do that for range table entries, but
>> I've never quite understood why we do it that way instead of creating
>> a flat range table in PlannerGlobal from the start. And so by
>> extension I wonder whether this table couldn't be flat from the start
>> also.

> Tom may want to correct me but my understanding of why the planner
> waits till the end of planning to start populating the PlannerGlobal
> range table is that it is not until then that we know which subqueries
> will be scanned by the final plan tree, so also whose range table
> entries will be included in the range table passed to the executor.

It would not be profitable to flatten the range table before we've
done remove_useless_joins.  We'd end up with useless entries from
subqueries that ultimately aren't there.  We could perhaps do it
after we finish that phase, but I don't really see the point: it
wouldn't be better than what we do now, just the same work at a
different time.

			regards, tom lane






* Re: generic plans and "initial" pruning
@ 2022-07-29 12:22  Robert Haas <[email protected]>
  parent: Tom Lane <[email protected]>
  1 sibling, 1 reply; 108+ messages in thread

From: Robert Haas @ 2022-07-29 12:22 UTC (permalink / raw)
  To: Tom Lane <[email protected]>; +Cc: Amit Langote <[email protected]>; Jacob Champion <[email protected]>; Zhihong Yu <[email protected]>; David Rowley <[email protected]>; pgsql-hackers

On Fri, Jul 29, 2022 at 12:55 AM Tom Lane <[email protected]> wrote:
> It would not be profitable to flatten the range table before we've
> done remove_useless_joins.  We'd end up with useless entries from
> subqueries that ultimately aren't there.  We could perhaps do it
> after we finish that phase, but I don't really see the point: it
> wouldn't be better than what we do now, just the same work at a
> different time.

That's not quite my question, though. Why do we ever build a non-flat
range table in the first place? Like, instead of assigning indexes
relative to the current subquery level, why not just assign them
relative to the whole query from the start? It can't really be that
we've done it this way because of remove_useless_joins(), because
we've been building separate range tables and later flattening them
for longer than join removal has existed as a feature.

What bugs me is that it's very much not free. By building a bunch of
separate range tables and combining them later, we generate extra
work: we have to go back and adjust RT indexes after-the-fact. We pay
that overhead for every query, not just the ones that end up with some
unused entries in the range table. And why would it matter if we did
end up with some useless entries in the range table, anyway? If
there's some semantic difference, we could add a flag to mark those
entries as needing to be ignored, which seems way better than crawling
all over the whole tree adjusting RTIs everywhere.

I don't really expect that we're ever going to change this -- and
certainly not on this thread. The idea of running around and replacing
RT indexes all over the tree is deeply embedded in the system. But are
we really sure we want to add a second kind of index that we have to
run around and adjust at the same time?

If we are, so be it, I guess. It just looks really ugly and unnecessary to me.
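
For readers who want to see what "a second kind of index" amounts to,
here is a rough sketch of the two fix-ups happening side by side.  It is
loosely modelled on the setrefs.c hunk in 0001, where pinfo->rtindex is
shifted by rtoffset and part_prune_index is shifted by the current
length of glob->partPruneInfos.  The types and the function
(ToyAppend, ToyGlobal, toy_set_references) are hypothetical
simplifications, not the patch's code.

```c
#include <assert.h>

/* Hypothetical plan node carrying both kinds of index. */
typedef struct ToyAppend
{
	int			rtindex;			/* 1-based index into the range table */
	int			part_prune_index;	/* 0-based index into the pruning-info
									 * list, or -1 if there is none */
} ToyAppend;

/* Hypothetical global state: only the lengths of the two flat lists. */
typedef struct ToyGlobal
{
	int			n_rtes;
	int			n_prune_infos;
} ToyGlobal;

/*
 * Fix up one query level's nodes against the flat lists, then account for
 * the entries that this level contributes to those lists.
 */
static void
toy_set_references(ToyGlobal *glob, ToyAppend *nodes, int nnodes,
				   int local_rtes, int local_prune_infos)
{
	int			rtoffset = glob->n_rtes;
	int			pruneoffset = glob->n_prune_infos;

	for (int i = 0; i < nnodes; i++)
	{
		nodes[i].rtindex += rtoffset;
		/* -1 means "no pruning info" and must be left alone */
		if (nodes[i].part_prune_index >= 0)
			nodes[i].part_prune_index += pruneoffset;
	}

	glob->n_rtes += local_rtes;
	glob->n_prune_infos += local_prune_infos;
}
```

The point of the sketch is only that each query level now needs two
offsets rather than one; whether that cost is acceptable is the open
question above.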

-- 
Robert Haas
EDB: http://www.enterprisedb.com






* Re: generic plans and "initial" pruning
@ 2022-07-29 15:04  Tom Lane <[email protected]>
  parent: Tom Lane <[email protected]>
  1 sibling, 1 reply; 108+ messages in thread

From: Tom Lane @ 2022-07-29 15:04 UTC (permalink / raw)
  To: Robert Haas <[email protected]>; +Cc: Amit Langote <[email protected]>; Jacob Champion <[email protected]>; Zhihong Yu <[email protected]>; David Rowley <[email protected]>; pgsql-hackers

Robert Haas <[email protected]> writes:
> That's not quite my question, though. Why do we ever build a non-flat
> range table in the first place? Like, instead of assigning indexes
> relative to the current subquery level, why not just assign them
> relative to the whole query from the start?

We could probably make that work, but I'm skeptical that it would
really be an improvement overall, for a couple of reasons.

(1) The need for merge-rangetables-and-renumber-Vars logic doesn't
go away.  It just moves from setrefs.c to the rewriter, which would
have to do it when expanding views.  This would be a net loss
performance-wise, I think, because setrefs.c can do it as part of a
parsetree scan that it has to perform anyway for other housekeeping
reasons; but the rewriter would need a brand new pass over the tree.
Admittedly that pass would only happen for view replacement, but
it's still not open-and-shut that there'd be a performance win.

(2) The need for varlevelsup and similar fields doesn't go away,
I think, because we need those for semantic purposes such as
discovering the query level that aggregates are associated with.
That means that subquery flattening still has to make a pass over
the tree to touch every Var's varlevelsup; so not having to adjust
varno at the same time would save little.

I'm not sure whether I think it's a net plus or net minus that
varno would become effectively independent of varlevelsup.
It'd be different from the way we think of them now, for sure,
and I think it'd take a while to flush out bugs arising from such
a redefinition.

> I don't really expect that we're ever going to change this -- and
> certainly not on this thread. The idea of running around and replacing
> RT indexes all over the tree is deeply embedded in the system. But are
> we really sure we want to add a second kind of index that we have to
> run around and adjust at the same time?

You probably want to avert your eyes from [1], then ;-).  Although
I'm far from convinced that the cross-list index fields currently
proposed there are actually necessary; the cost to adjust them
during rangetable merging could outweigh any benefit.

			regards, tom lane

[1] https://www.postgresql.org/message-id/flat/CA+HiwqGjJDmUhDSfv-U2qhKJjt9ST7Xh9JXC_irsAQ1TAUsJYg@mail....






* Re: generic plans and "initial" pruning
@ 2022-07-29 15:56  Robert Haas <[email protected]>
  parent: Tom Lane <[email protected]>
  0 siblings, 0 replies; 108+ messages in thread

From: Robert Haas @ 2022-07-29 15:56 UTC (permalink / raw)
  To: Tom Lane <[email protected]>; +Cc: Amit Langote <[email protected]>; Jacob Champion <[email protected]>; Zhihong Yu <[email protected]>; David Rowley <[email protected]>; pgsql-hackers

On Fri, Jul 29, 2022 at 11:04 AM Tom Lane <[email protected]> wrote:
> We could probably make that work, but I'm skeptical that it would
> really be an improvement overall, for a couple of reasons.
>
> (1) The need for merge-rangetables-and-renumber-Vars logic doesn't
> go away.  It just moves from setrefs.c to the rewriter, which would
> have to do it when expanding views.  This would be a net loss
> performance-wise, I think, because setrefs.c can do it as part of a
> parsetree scan that it has to perform anyway for other housekeeping
> reasons; but the rewriter would need a brand new pass over the tree.
> Admittedly that pass would only happen for view replacement, but
> it's still not open-and-shut that there'd be a performance win.
>
> (2) The need for varlevelsup and similar fields doesn't go away,
> I think, because we need those for semantic purposes such as
> discovering the query level that aggregates are associated with.
> That means that subquery flattening still has to make a pass over
> the tree to touch every Var's varlevelsup; so not having to adjust
> varno at the same time would save little.
>
> I'm not sure whether I think it's a net plus or net minus that
> varno would become effectively independent of varlevelsup.
> It'd be different from the way we think of them now, for sure,
> and I think it'd take a while to flush out bugs arising from such
> a redefinition.

Interesting. Thanks for your thoughts. I guess it's not as clear-cut
as I thought, but I still can't help feeling like we're doing an awful
lot of expensive rearrangement at the end of query planning.

I kind of wonder whether varlevelsup is the wrong idea. Like, suppose
we instead handed out subquery identifiers serially, sort of like what
we do with SubTransactionId values. Then instead of testing whether
varlevelsup>0 you test whether varsubqueryid==mysubqueryid. If you
flatten a query into its parent, you still need to adjust every var
that refers to the dead subquery, but you don't need to adjust vars
that refer to subqueries underneath it. Their level changes, but their
identity doesn't. Maybe that doesn't really help that much, but it's
always struck me as a little unfortunate that we basically test
whether a var is equal by testing whether the varno and varlevelsup
are equal. That only works if you assume that you can never end up
comparing two vars from thoroughly unrelated parts of the tree, such
that the subquery one level up from one might be different from the
subquery one level up from the other.
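
A small sketch of the ambiguity, under heavy simplification: the two
structs below keep only the fields relevant to the comparison, and the
"by id" variant follows the varsubqueryid idea floated above, which does
not exist in PostgreSQL.  All names here (ToyVarByLevel, ToyVarById,
toy_equal_by_level, toy_equal_by_id) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Var identified the current way: RT index plus levels up. */
typedef struct ToyVarByLevel
{
	int			varno;
	int			varlevelsup;
} ToyVarByLevel;

/* Var identified by a serially assigned id of the subquery it refers to. */
typedef struct ToyVarById
{
	int			varsubqueryid;
	int			varno;
} ToyVarById;

static bool
toy_equal_by_level(const ToyVarByLevel *a, const ToyVarByLevel *b)
{
	return a->varno == b->varno && a->varlevelsup == b->varlevelsup;
}

static bool
toy_equal_by_id(const ToyVarById *a, const ToyVarById *b)
{
	return a->varsubqueryid == b->varsubqueryid && a->varno == b->varno;
}
```

Suppose subqueries A and B are siblings with ids 2 and 3, and each
contains a nested subquery whose Var refers to RT entry 1 one level up.
Both Vars are (varno 1, varlevelsup 1) and compare equal by level even
though they refer to different subqueries, whereas by id they are
(2, 1) and (3, 1) and compare unequal.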

> > I don't really expect that we're ever going to change this -- and
> > certainly not on this thread. The idea of running around and replacing
> > RT indexes all over the tree is deeply embedded in the system. But are
> > we really sure we want to add a second kind of index that we have to
> > run around and adjust at the same time?
>
> You probably want to avert your eyes from [1], then ;-).  Although
> I'm far from convinced that the cross-list index fields currently
> proposed there are actually necessary; the cost to adjust them
> during rangetable merging could outweigh any benefit.

I really like the idea of that patch overall, actually; I think
permissions checking is a good example of something that shouldn't
require walking the whole query tree but currently does. And actually,
I think the same thing is true here: we shouldn't need to walk the
whole query tree to find the pruning information, but right now we do.
I'm just uncertain whether what Amit has implemented is the
least-annoying way to go about it... any thoughts on that,
specifically as it pertains to this patch?

-- 
Robert Haas
EDB: http://www.enterprisedb.com






* Re: generic plans and "initial" pruning
@ 2022-07-29 16:47  Tom Lane <[email protected]>
  parent: Robert Haas <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: Tom Lane @ 2022-07-29 16:47 UTC (permalink / raw)
  To: Robert Haas <[email protected]>; +Cc: Amit Langote <[email protected]>; Jacob Champion <[email protected]>; Zhihong Yu <[email protected]>; David Rowley <[email protected]>; pgsql-hackers

Robert Haas <[email protected]> writes:
> ... it's
> always struck me as a little unfortunate that we basically test
> whether a var is equal by testing whether the varno and varlevelsup
> are equal. That only works if you assume that you can never end up
> comparing two vars from thoroughly unrelated parts of the tree, such
> that the subquery one level up from one might be different from the
> subquery one level up from the other.

Yeah, that's always bothered me a little as well.  I've yet to see a
case where it causes a problem in practice.  But I think that if, say,
we were to try to do any sort of cross-query-level optimization, then
the ambiguity could rise up to bite us.  That might be a situation
where a flat rangetable would be worth the trouble.

> I'm just uncertain whether what Amit has implemented is the
> least-annoying way to go about it... any thoughts on that,
> specifically as it pertains to this patch?

I haven't looked at this patch at all.  I'll try to make some
time for it, but probably not today.

			regards, tom lane






* Re: generic plans and "initial" pruning
@ 2022-07-29 16:55  Robert Haas <[email protected]>
  parent: Tom Lane <[email protected]>
  0 siblings, 0 replies; 108+ messages in thread

From: Robert Haas @ 2022-07-29 16:55 UTC (permalink / raw)
  To: Tom Lane <[email protected]>; +Cc: Amit Langote <[email protected]>; Jacob Champion <[email protected]>; Zhihong Yu <[email protected]>; David Rowley <[email protected]>; pgsql-hackers

On Fri, Jul 29, 2022 at 12:47 PM Tom Lane <[email protected]> wrote:
> > I'm just uncertain whether what Amit has implemented is the
> > least-annoying way to go about it... any thoughts on that,
> > specifically as it pertains to this patch?
>
> I haven't looked at this patch at all.  I'll try to make some
> time for it, but probably not today.

OK, thanks. The preliminary patch I'm talking about here is pretty
short, so you could probably look at that part of it, at least, in
some relatively small amount of time. And I think it's also in pretty
reasonable shape apart from this issue. But, as usual, there's the
question of how well one can evaluate a preliminary patch without
reviewing the full patch in detail.

-- 
Robert Haas
EDB: http://www.enterprisedb.com






* Re: generic plans and "initial" pruning
@ 2022-10-12 07:36  Amit Langote <[email protected]>
  parent: Amit Langote <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: Amit Langote @ 2022-10-12 07:36 UTC (permalink / raw)
  To: Robert Haas <[email protected]>; +Cc: Jacob Champion <[email protected]>; Zhihong Yu <[email protected]>; David Rowley <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers

On Fri, Jul 29, 2022 at 1:20 PM Amit Langote <[email protected]> wrote:
> On Thu, Jul 28, 2022 at 1:27 AM Robert Haas <[email protected]> wrote:
> > 0001 adds es_part_prune_result but does not use it, so maybe the
> > introduction of that field should be deferred until it's needed for
> > something.
>
> Oops, looks like a mistake when breaking the patch.  Will move that bit to 0002.

Fixed that and also noticed that I had defined PartitionPruneResult in
the wrong header (execnodes.h).  That led to PartitionPruneResult
nodes not being able to be written and read, because
src/backend/nodes/gen_node_support.pl doesn't create _out* and _read*
routines for the nodes defined in execnodes.h.  I moved its definition
to plannodes.h, even though it is not actually the planner that
instantiates those; no other include/nodes header sounds better.

One more thing I realized is that Bitmapsets added to the List
PartitionPruneResult.valid_subplan_offs_list are not actually
read/write-able.  That's a problem that I also faced in [1], so I
proposed a patch there to make Bitmapset a read/write-able Node and
mark (only) the Bitmapsets that are added into read/write-able node
trees with the corresponding NodeTag.  I'm including that patch here
as well (0002) for the main patch to work (pass
-DWRITE_READ_PARSE_PLAN_TREES build tests), though it might make sense
to discuss it in its own thread?

--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com

[1] https://www.postgresql.org/message-id/CA%2BHiwqH80qX1ZLx3HyHmBrOzLQeuKuGx6FzGep0F_9zw9L4PAA%40mail.g...


Attachments:

  [application/octet-stream] v21-0001-Move-PartitioPruneInfo-out-of-plan-nodes-into-Pl.patch (17.2K, 2-v21-0001-Move-PartitioPruneInfo-out-of-plan-nodes-into-Pl.patch)
From 06cda14113c3572440a716a4aacb250b2ed52f52 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Fri, 27 May 2022 16:00:28 +0900
Subject: [PATCH v21 1/3] Move PartitionPruneInfo out of plan nodes into
 PlannedStmt

The planner will now add a given PartitionPruneInfo to
PlannedStmt.partPruneInfos instead of directly to the
Append/MergeAppend plan node.  What gets set instead in the
latter is an index field which points to the list element
of PlannedStmt.partPruneInfos containing the PartitionPruneInfo
belonging to the plan node.

A later commit will make AcquireExecutorLocks() do the initial
partition pruning to determine a minimal set of partitions to be
locked when validating a plan tree and it will need to consult the
PartitionPruneInfos referenced therein to do so.  It would be better
for the PartitionPruneInfos to be accessible directly than requiring
a walk of the plan tree to find them, which is easier when it can be
done by simply iterating over PlannedStmt.partPruneInfos.
---
 src/backend/executor/execMain.c         |  1 +
 src/backend/executor/execParallel.c     |  1 +
 src/backend/executor/execPartition.c    |  4 +-
 src/backend/executor/execUtils.c        |  1 +
 src/backend/executor/nodeAppend.c       |  4 +-
 src/backend/executor/nodeMergeAppend.c  |  4 +-
 src/backend/optimizer/plan/createplan.c | 24 ++++-----
 src/backend/optimizer/plan/planner.c    |  1 +
 src/backend/optimizer/plan/setrefs.c    | 65 +++++++++++++------------
 src/backend/partitioning/partprune.c    | 18 ++++---
 src/include/executor/execPartition.h    |  3 +-
 src/include/nodes/execnodes.h           |  1 +
 src/include/nodes/pathnodes.h           |  6 +++
 src/include/nodes/plannodes.h           | 11 +++--
 src/include/partitioning/partprune.h    |  8 +--
 15 files changed, 90 insertions(+), 62 deletions(-)

diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index d78862e660..32475e33ff 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -825,6 +825,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 	ExecInitRangeTable(estate, rangeTable);
 
 	estate->es_plannedstmt = plannedstmt;
+	estate->es_part_prune_infos = plannedstmt->partPruneInfos;
 
 	/*
 	 * Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index 99512826c5..aca0c6f323 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -183,6 +183,7 @@ ExecSerializePlan(Plan *plan, EState *estate)
 	pstmt->dependsOnRole = false;
 	pstmt->parallelModeNeeded = false;
 	pstmt->planTree = plan;
+	pstmt->partPruneInfos = estate->es_part_prune_infos;
 	pstmt->rtable = estate->es_range_table;
 	pstmt->resultRelations = NIL;
 	pstmt->appendRelations = NIL;
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 40e3c07693..80197d5141 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -1791,11 +1791,13 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
 PartitionPruneState *
 ExecInitPartitionPruning(PlanState *planstate,
 						 int n_total_subplans,
-						 PartitionPruneInfo *pruneinfo,
+						 int part_prune_index,
 						 Bitmapset **initially_valid_subplans)
 {
 	PartitionPruneState *prunestate;
 	EState	   *estate = planstate->state;
+	PartitionPruneInfo *pruneinfo = list_nth(estate->es_part_prune_infos,
+											 part_prune_index);
 
 	/* We may need an expression context to evaluate partition exprs */
 	ExecAssignExprContext(estate, planstate);
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 9df1f81ea8..21f4c10937 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -119,6 +119,7 @@ CreateExecutorState(void)
 	estate->es_relations = NULL;
 	estate->es_rowmarks = NULL;
 	estate->es_plannedstmt = NULL;
+	estate->es_part_prune_infos = NIL;
 
 	estate->es_junkFilter = NULL;
 
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 357e10a1d7..c6f86a6510 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -134,7 +134,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 	appendstate->as_begun = false;
 
 	/* If run-time partition pruning is enabled, then set that up now */
-	if (node->part_prune_info != NULL)
+	if (node->part_prune_index >= 0)
 	{
 		PartitionPruneState *prunestate;
 
@@ -145,7 +145,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 		 */
 		prunestate = ExecInitPartitionPruning(&appendstate->ps,
 											  list_length(node->appendplans),
-											  node->part_prune_info,
+											  node->part_prune_index,
 											  &validsubplans);
 		appendstate->as_prune_state = prunestate;
 		nplans = bms_num_members(validsubplans);
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index c5c62fa5c7..8d35860c30 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -82,7 +82,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
 	mergestate->ps.ExecProcNode = ExecMergeAppend;
 
 	/* If run-time partition pruning is enabled, then set that up now */
-	if (node->part_prune_info != NULL)
+	if (node->part_prune_index >= 0)
 	{
 		PartitionPruneState *prunestate;
 
@@ -93,7 +93,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
 		 */
 		prunestate = ExecInitPartitionPruning(&mergestate->ps,
 											  list_length(node->mergeplans),
-											  node->part_prune_info,
+											  node->part_prune_index,
 											  &validsubplans);
 		mergestate->ms_prune_state = prunestate;
 		nplans = bms_num_members(validsubplans);
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index ab4d8e201d..2bfb817d75 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -1203,7 +1203,6 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
 	ListCell   *subpaths;
 	int			nasyncplans = 0;
 	RelOptInfo *rel = best_path->path.parent;
-	PartitionPruneInfo *partpruneinfo = NULL;
 	int			nodenumsortkeys = 0;
 	AttrNumber *nodeSortColIdx = NULL;
 	Oid		   *nodeSortOperators = NULL;
@@ -1354,6 +1353,9 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
 		subplans = lappend(subplans, subplan);
 	}
 
+	/* Set below if we find quals that we can use to run-time prune */
+	plan->part_prune_index = -1;
+
 	/*
 	 * If any quals exist, they may be useful to perform further partition
 	 * pruning during execution.  Gather information needed by the executor to
@@ -1377,16 +1379,14 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
 		}
 
 		if (prunequal != NIL)
-			partpruneinfo =
-				make_partition_pruneinfo(root, rel,
-										 best_path->subpaths,
-										 prunequal);
+			plan->part_prune_index = make_partition_pruneinfo(root, rel,
+															  best_path->subpaths,
+															  prunequal);
 	}
 
 	plan->appendplans = subplans;
 	plan->nasyncplans = nasyncplans;
 	plan->first_partial_plan = best_path->first_partial_path;
-	plan->part_prune_info = partpruneinfo;
 
 	copy_generic_path_info(&plan->plan, (Path *) best_path);
 
@@ -1425,7 +1425,6 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
 	List	   *subplans = NIL;
 	ListCell   *subpaths;
 	RelOptInfo *rel = best_path->path.parent;
-	PartitionPruneInfo *partpruneinfo = NULL;
 
 	/*
 	 * We don't have the actual creation of the MergeAppend node split out
@@ -1518,6 +1517,9 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
 		subplans = lappend(subplans, subplan);
 	}
 
+	/* Set below if we find quals that we can use to run-time prune */
+	node->part_prune_index = -1;
+
 	/*
 	 * If any quals exist, they may be useful to perform further partition
 	 * pruning during execution.  Gather information needed by the executor to
@@ -1541,13 +1543,13 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
 		}
 
 		if (prunequal != NIL)
-			partpruneinfo = make_partition_pruneinfo(root, rel,
-													 best_path->subpaths,
-													 prunequal);
+			node->part_prune_index = make_partition_pruneinfo(root, rel,
+															  best_path->subpaths,
+															  prunequal);
 	}
 
 	node->mergeplans = subplans;
-	node->part_prune_info = partpruneinfo;
+
 
 	/*
 	 * If prepare_sort_from_pathkeys added sort columns, but we were told to
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 5d0fd6e072..31fff597a7 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -519,6 +519,7 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 	result->dependsOnRole = glob->dependsOnRole;
 	result->parallelModeNeeded = glob->parallelModeNeeded;
 	result->planTree = top_plan;
+	result->partPruneInfos = glob->partPruneInfos;
 	result->rtable = glob->finalrtable;
 	result->resultRelations = glob->resultRelations;
 	result->appendRelations = glob->appendRelations;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 1cb0abdbc1..720f20f563 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -348,6 +348,29 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 		}
 	}
 
+	/* Also fix up the information in PartitionPruneInfos. */
+	foreach (lc, root->partPruneInfos)
+	{
+		PartitionPruneInfo *pruneinfo = lfirst(lc);
+		ListCell  *l;
+
+		foreach(l, pruneinfo->prune_infos)
+		{
+			List	   *prune_infos = lfirst(l);
+			ListCell   *l2;
+
+			foreach(l2, prune_infos)
+			{
+				PartitionedRelPruneInfo *pinfo = lfirst(l2);
+
+				/* RT index of the table to which the pinfo belongs. */
+				pinfo->rtindex += rtoffset;
+			}
+		}
+
+		glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
+	}
+
 	return result;
 }
 
@@ -1658,21 +1681,12 @@ set_append_references(PlannerInfo *root,
 
 	aplan->apprelids = offset_relid_set(aplan->apprelids, rtoffset);
 
-	if (aplan->part_prune_info)
-	{
-		foreach(l, aplan->part_prune_info->prune_infos)
-		{
-			List	   *prune_infos = lfirst(l);
-			ListCell   *l2;
-
-			foreach(l2, prune_infos)
-			{
-				PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
-				pinfo->rtindex += rtoffset;
-			}
-		}
-	}
+	/*
+	 * PartitionPruneInfos will be added to a list in PlannerGlobal, so update
+	 * the index.
+	 */
+	if (aplan->part_prune_index >= 0)
+		aplan->part_prune_index += list_length(root->glob->partPruneInfos);
 
 	/* We don't need to recurse to lefttree or righttree ... */
 	Assert(aplan->plan.lefttree == NULL);
@@ -1734,21 +1748,12 @@ set_mergeappend_references(PlannerInfo *root,
 
 	mplan->apprelids = offset_relid_set(mplan->apprelids, rtoffset);
 
-	if (mplan->part_prune_info)
-	{
-		foreach(l, mplan->part_prune_info->prune_infos)
-		{
-			List	   *prune_infos = lfirst(l);
-			ListCell   *l2;
-
-			foreach(l2, prune_infos)
-			{
-				PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
-				pinfo->rtindex += rtoffset;
-			}
-		}
-	}
+	/*
+	 * PartitionPruneInfos will be added to a list in PlannerGlobal, so update
+	 * the index.
+	 */
+	if (mplan->part_prune_index >= 0)
+		mplan->part_prune_index += list_length(root->glob->partPruneInfos);
 
 	/* We don't need to recurse to lefttree or righttree ... */
 	Assert(mplan->plan.lefttree == NULL);
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 6188bf69cb..6565b6ed01 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -209,16 +209,20 @@ static void partkey_datum_from_expr(PartitionPruneContext *context,
 
 /*
  * make_partition_pruneinfo
- *		Builds a PartitionPruneInfo which can be used in the executor to allow
- *		additional partition pruning to take place.  Returns NULL when
- *		partition pruning would be useless.
+ *		Checks if the given set of quals can be used to build pruning steps
+ *		that the executor can use to prune away unneeded partitions.  If
+ *		suitable quals are found then a PartitionPruneInfo is built and tagged
+ *		onto the PlannerInfo's partPruneInfos list.
+ *
+ * The return value is the 0-based index of the item added to the
+ * partPruneInfos list or -1 if nothing was added.
  *
  * 'parentrel' is the RelOptInfo for an appendrel, and 'subpaths' is the list
  * of scan paths for its child rels.
  * 'prunequal' is a list of potential pruning quals (i.e., restriction
  * clauses that are applicable to the appendrel).
  */
-PartitionPruneInfo *
+int
 make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 						 List *subpaths,
 						 List *prunequal)
@@ -332,7 +336,7 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	 * quals, then we can just not bother with run-time pruning.
 	 */
 	if (prunerelinfos == NIL)
-		return NULL;
+		return -1;
 
 	/* Else build the result data structure */
 	pruneinfo = makeNode(PartitionPruneInfo);
@@ -358,7 +362,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	else
 		pruneinfo->other_subplans = NULL;
 
-	return pruneinfo;
+	root->partPruneInfos = lappend(root->partPruneInfos, pruneinfo);
+
+	return list_length(root->partPruneInfos) - 1;
 }
 
 /*
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 708435e952..bf962af7af 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -123,9 +123,8 @@ typedef struct PartitionPruneState
 
 extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
 													 int n_total_subplans,
-													 PartitionPruneInfo *pruneinfo,
+													 int part_prune_index,
 													 Bitmapset **initially_valid_subplans);
 extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
 										   bool initial_prune);
-
 #endif							/* EXECPARTITION_H */
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 01b1727fc0..4a741b053f 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -611,6 +611,7 @@ typedef struct EState
 	struct ExecRowMark **es_rowmarks;	/* Array of per-range-table-entry
 										 * ExecRowMarks, or NULL if none */
 	PlannedStmt *es_plannedstmt;	/* link to top of plan tree */
+	List		*es_part_prune_infos;	/* PlannedStmt.partPruneInfos */
 	const char *es_sourceText;	/* Source text from QueryDesc */
 
 	JunkFilter *es_junkFilter;	/* top-level junk filter, if any */
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 6bda383bea..e392fb6fc0 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -122,6 +122,9 @@ typedef struct PlannerGlobal
 	/* "flat" list of AppendRelInfos */
 	List	   *appendRelations;
 
+	/* List of PartitionPruneInfo contained in the plan */
+	List	   *partPruneInfos;
+
 	/* OIDs of relations the plan depends on */
 	List	   *relationOids;
 
@@ -503,6 +506,9 @@ struct PlannerInfo
 
 	/* Does this query modify any partition key columns? */
 	bool		partColsUpdated;
+
+	/* PartitionPruneInfos added in this query's plan. */
+	List	   *partPruneInfos;
 };
 
 
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 21e642a64c..3eb3e6e527 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -70,6 +70,9 @@ typedef struct PlannedStmt
 
 	struct Plan *planTree;		/* tree of Plan nodes */
 
+	List	   *partPruneInfos;	/* List of PartitionPruneInfo contained in
+								 * the plan */
+
 	List	   *rtable;			/* list of RangeTblEntry nodes */
 
 	/* rtable indexes of target relations for INSERT/UPDATE/DELETE */
@@ -270,8 +273,8 @@ typedef struct Append
 	 */
 	int			first_partial_plan;
 
-	/* Info for run-time subplan pruning; NULL if we're not doing that */
-	struct PartitionPruneInfo *part_prune_info;
+	/* Index to PlannerInfo.partPruneInfos or -1 if no run-time pruning */
+	int			part_prune_index;
 } Append;
 
 /* ----------------
@@ -305,8 +308,8 @@ typedef struct MergeAppend
 	/* NULLS FIRST/LAST directions */
 	bool	   *nullsFirst pg_node_attr(array_size(numCols));
 
-	/* Info for run-time subplan pruning; NULL if we're not doing that */
-	struct PartitionPruneInfo *part_prune_info;
+	/* Index to PlannerInfo.partPruneInfos or -1 if no run-time pruning */
+	int			part_prune_index;
 } MergeAppend;
 
 /* ----------------
diff --git a/src/include/partitioning/partprune.h b/src/include/partitioning/partprune.h
index 90684efa25..ebf0dcff8c 100644
--- a/src/include/partitioning/partprune.h
+++ b/src/include/partitioning/partprune.h
@@ -70,10 +70,10 @@ typedef struct PartitionPruneContext
 #define PruneCxtStateIdx(partnatts, step_id, keyno) \
 	((partnatts) * (step_id) + (keyno))
 
-extern PartitionPruneInfo *make_partition_pruneinfo(struct PlannerInfo *root,
-													struct RelOptInfo *parentrel,
-													List *subpaths,
-													List *prunequal);
+extern int make_partition_pruneinfo(struct PlannerInfo *root,
+									struct RelOptInfo *parentrel,
+									List *subpaths,
+									List *prunequal);
 extern Bitmapset *prune_append_rel_partitions(struct RelOptInfo *rel);
 extern Bitmapset *get_matching_partitions(PartitionPruneContext *context,
 										  List *pruning_steps);
-- 
2.35.3
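
Reader's note on the index bookkeeping in the patch above.  This is a
minimal sketch in plain C, not PostgreSQL code.  It models how
make_partition_pruneinfo() returns the 0-based position of the entry it
added to the per-query list (or -1 if nothing was added), and how setrefs.c
shifts that position by the current length of the PlannerGlobal list.  It
assumes the per-query entries are appended to the global list afterwards,
as the setrefs.c comment suggests.  InfoList, add_prune_info(),
globalize_index() and append_all() are hypothetical names, and the
fixed-size array is only a stand-in for a List.

```c
#include <stddef.h>

#define MAX_INFOS 16

/* Toy stand-in for a List of PartitionPruneInfo pointers (hypothetical). */
typedef struct InfoList
{
    int         len;
    const char *items[MAX_INFOS];
} InfoList;

/*
 * Same return convention as make_partition_pruneinfo() in the patch:
 * the 0-based index of the entry added, or -1 if nothing was added.
 */
static int
add_prune_info(InfoList *query_list, const char *info)
{
    if (info == NULL || query_list->len >= MAX_INFOS)
        return -1;
    query_list->items[query_list->len++] = info;
    return query_list->len - 1;
}

/*
 * Same adjustment as in set_mergeappend_references(): a valid per-query
 * index is shifted by the number of entries already in the global list.
 * -1 ("no run-time pruning") is left alone.
 */
static int
globalize_index(int part_prune_index, const InfoList *global_list)
{
    if (part_prune_index >= 0)
        part_prune_index += global_list->len;
    return part_prune_index;
}

/* Assumed follow-up step: move the per-query entries to the global list. */
static void
append_all(InfoList *global_list, const InfoList *query_list)
{
    for (int i = 0;
         i < query_list->len && global_list->len < MAX_INFOS;
         i++)
        global_list->items[global_list->len++] = query_list->items[i];
}
```

The point of the int index is that a plan node no longer carries a pointer
to its PartitionPruneInfo.  Anything holding the flat list can find the
entry by position, which is what lets code outside the plan tree walk all
pruning infos of a statement.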



  [application/octet-stream] v21-0003-Optimize-AcquireExecutorLocks-by-locking-only-un.patch (81.7K, 3-v21-0003-Optimize-AcquireExecutorLocks-by-locking-only-un.patch)
  download | inline diff:
From ce28c4cfe8bc69e313ba7f59b048fe96f73139a6 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Wed, 22 Dec 2021 16:55:17 +0900
Subject: [PATCH v21 3/3] Optimize AcquireExecutorLocks() by locking only
 unpruned partitions

This commit teaches AcquireExecutorLocks() to perform initial
partition pruning to notionally eliminate the subnodes contained in a
generic cached plan that need not be initialized during the actual
execution of the plan, and to skip locking the partitions scanned by
those subnodes.

The result of performing initial partition pruning this way, before
execution has started, is passed to the executor as a
PartitionPruneResult, which callers that got the plan from plancache.c
supply along with the PlannedStmt.  It is NULL when the plan is
obtained by calling the planner directly or when the plan obtained
from plancache.c is not a generic one.
---
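Reader's note: a toy model, in plain C rather than PostgreSQL code, of the
three-way choice this patch introduces in ExecInitPartitionPruning().  If a
PartitionPruneResult was supplied, its saved set is reused.  Otherwise
initial pruning is done on the spot if the node needs it, and failing that
all subplans are initialized.  SubplanSet, ToyPruneResult and
choose_initially_valid_subplans() are hypothetical names.  A uint64_t mask
stands in for a Bitmapset, and freshly_pruned stands for what
ExecFindMatchingSubPlans() would return (computed eagerly here only to
keep the sketch short).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit i set means subplan i is valid; toy replacement for Bitmapset. */
typedef uint64_t SubplanSet;

/* Toy PartitionPruneResult: one saved set per PartitionPruneInfo. */
typedef struct ToyPruneResult
{
    int               nsets;
    const SubplanSet *valid_subplan_offs;
} ToyPruneResult;

/* Set containing subplans 0 .. n-1 (n limited to 64 in this sketch). */
static SubplanSet
all_subplans(int n)
{
    if (n <= 0)
        return 0;
    return (n >= 64) ? UINT64_MAX : ((UINT64_C(1) << n) - 1);
}

static SubplanSet
choose_initially_valid_subplans(const ToyPruneResult *pruneresult,
                                int part_prune_index,
                                int n_total_subplans,
                                bool needs_init_pruning,
                                SubplanSet freshly_pruned)
{
    /* Pruning already done before execution: reuse it, never redo it. */
    if (pruneresult != NULL)
        return pruneresult->valid_subplan_offs[part_prune_index];

    /* No saved result: do the initial pruning now, if the node has any. */
    if (needs_init_pruning)
        return freshly_pruned;

    /* No initial pruning at all: every subplan must be initialized. */
    return all_subplans(n_total_subplans);
}
```

The first branch is the one that matters for correctness: redoing the
pruning could select subplans whose relations were never locked.
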
 src/backend/commands/copyto.c          |   2 +-
 src/backend/commands/createas.c        |   2 +-
 src/backend/commands/explain.c         |   7 +-
 src/backend/commands/extension.c       |   2 +-
 src/backend/commands/matview.c         |   2 +-
 src/backend/commands/prepare.c         |  26 ++-
 src/backend/executor/README            |  32 ++++
 src/backend/executor/execMain.c        |  55 ++++++
 src/backend/executor/execParallel.c    |  27 ++-
 src/backend/executor/execPartition.c   | 238 +++++++++++++++++++++----
 src/backend/executor/execUtils.c       |   1 +
 src/backend/executor/functions.c       |   2 +-
 src/backend/executor/nodeAppend.c      |  11 +-
 src/backend/executor/nodeMergeAppend.c |   5 +-
 src/backend/executor/spi.c             |  27 ++-
 src/backend/nodes/readfuncs.c          |   8 +-
 src/backend/optimizer/plan/planner.c   |   2 +
 src/backend/optimizer/plan/setrefs.c   |  46 +++++
 src/backend/partitioning/partprune.c   |  41 ++++-
 src/backend/tcop/postgres.c            |   8 +-
 src/backend/tcop/pquery.c              |  28 ++-
 src/backend/utils/cache/plancache.c    | 187 ++++++++++++++++---
 src/backend/utils/mmgr/portalmem.c     |  19 ++
 src/include/commands/explain.h         |   4 +-
 src/include/executor/execPartition.h   |   9 +-
 src/include/executor/execdesc.h        |   3 +
 src/include/executor/executor.h        |   2 +
 src/include/nodes/execnodes.h          |   1 +
 src/include/nodes/nodes.h              |   1 +
 src/include/nodes/pathnodes.h          |  12 ++
 src/include/nodes/plannodes.h          |  47 +++++
 src/include/utils/plancache.h          |   3 +-
 src/include/utils/portal.h             |   3 +
 33 files changed, 763 insertions(+), 100 deletions(-)

diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index 2527e66059..df4b0dcf0e 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -558,7 +558,7 @@ BeginCopyTo(ParseState *pstate,
 		((DR_copy *) dest)->cstate = cstate;
 
 		/* Create a QueryDesc requesting no output */
-		cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+		cstate->queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
 											dest, NULL, NULL, 0);
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 152c29b551..462651910a 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -325,7 +325,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
 		UpdateActiveSnapshotCommandId();
 
 		/* Create a QueryDesc, redirecting output to our tuple receiver */
-		queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+		queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
 									GetActiveSnapshot(), InvalidSnapshot,
 									dest, params, queryEnv, 0);
 
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index f86983c660..219c63fa81 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -407,7 +407,7 @@ ExplainOneQuery(Query *query, int cursorOptions,
 		}
 
 		/* run it (if needed) and produce output */
-		ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+		ExplainOnePlan(plan, NULL, into, es, queryString, params, queryEnv,
 					   &planduration, (es->buffers ? &bufusage : NULL));
 	}
 }
@@ -515,7 +515,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
  * to call it.
  */
 void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, PartitionPruneResult *part_prune_result,
+			   IntoClause *into, ExplainState *es,
 			   const char *queryString, ParamListInfo params,
 			   QueryEnvironment *queryEnv, const instr_time *planduration,
 			   const BufferUsage *bufusage)
@@ -563,7 +564,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
 		dest = None_Receiver;
 
 	/* Create a QueryDesc for the query */
-	queryDesc = CreateQueryDesc(plannedstmt, queryString,
+	queryDesc = CreateQueryDesc(plannedstmt, part_prune_result, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, params, queryEnv, instrument_option);
 
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index 6b6720c690..374c0ff807 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -776,7 +776,7 @@ execute_sql_string(const char *sql)
 			{
 				QueryDesc  *qdesc;
 
-				qdesc = CreateQueryDesc(stmt,
+				qdesc = CreateQueryDesc(stmt, NULL,
 										sql,
 										GetActiveSnapshot(), NULL,
 										dest, NULL, NULL, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 9ac0383459..b0ed96e56c 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -408,7 +408,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
 	UpdateActiveSnapshotCommandId();
 
 	/* Create a QueryDesc, redirecting output to our tuple receiver */
-	queryDesc = CreateQueryDesc(plan, queryString,
+	queryDesc = CreateQueryDesc(plan, NULL, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, NULL, NULL, 0);
 
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index c4b54d0547..69e02e0346 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -155,6 +155,7 @@ ExecuteQuery(ParseState *pstate,
 	PreparedStatement *entry;
 	CachedPlan *cplan;
 	List	   *plan_list;
+	List	   *part_prune_result_list;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
 	Portal		portal;
@@ -193,7 +194,10 @@ ExecuteQuery(ParseState *pstate,
 									   entry->plansource->query_string);
 
 	/* Replan if needed, and increment plan refcount for portal */
-	cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL);
+	cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL,
+						  &part_prune_result_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_result_list));
 	plan_list = cplan->stmt_list;
 
 	/*
@@ -207,6 +211,9 @@ ExecuteQuery(ParseState *pstate,
 					  plan_list,
 					  cplan);
 
+	/* Copy PartitionPruneResults into the portal's context. */
+	PortalStorePartitionPruneResults(portal, part_prune_result_list);
+
 	/*
 	 * For CREATE TABLE ... AS EXECUTE, we must verify that the prepared
 	 * statement is one that produces tuples.  Currently we insist that it be
@@ -576,7 +583,9 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	const char *query_string;
 	CachedPlan *cplan;
 	List	   *plan_list;
-	ListCell   *p;
+	List	   *part_prune_result_list;
+	ListCell   *p,
+			   *pp;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
 	instr_time	planstart;
@@ -619,7 +628,10 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 
 	/* Replan if needed, and acquire a transient refcount */
 	cplan = GetCachedPlan(entry->plansource, paramLI,
-						  CurrentResourceOwner, queryEnv);
+						  CurrentResourceOwner, queryEnv,
+						  &part_prune_result_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_result_list));
 
 	INSTR_TIME_SET_CURRENT(planduration);
 	INSTR_TIME_SUBTRACT(planduration, planstart);
@@ -634,13 +646,15 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	plan_list = cplan->stmt_list;
 
 	/* Explain each query */
-	foreach(p, plan_list)
+	forboth(p, plan_list, pp, part_prune_result_list)
 	{
 		PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
+		PartitionPruneResult *part_prune_result = lfirst_node(PartitionPruneResult, pp);
 
 		if (pstmt->commandType != CMD_UTILITY)
-			ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv,
-						   &planduration, (es->buffers ? &bufusage : NULL));
+			ExplainOnePlan(pstmt, part_prune_result, into, es, query_string,
+						   paramLI, queryEnv, &planduration,
+						   (es->buffers ? &bufusage : NULL));
 		else
 			ExplainOneUtility(pstmt->utilityStmt, into, es, query_string,
 							  paramLI, queryEnv);
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 0b5183fc4a..953a476ea5 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -65,6 +65,34 @@ found there.  This currently only occurs for Append and MergeAppend nodes.  In
 this case the non-required subplans are ignored and the executor state's
 subnode array will become out of sequence to the plan's subplan list.
 
+Actually, so-called execution time pruning may occur even before
+execution has started.  One case where that occurs is when a cached generic
+plan is being validated for execution by plancache.c: GetCachedPlan(), which
+works by locking all the relations that will be scanned by that plan.  If the
+generic plan contains nodes that can perform execution time partition pruning
+(that is, contain a PartitionPruneInfo), a subset of pruning steps contained
+in a given node's PartitionPruneInfo that do not depend on the execution
+actually having started (called "initial" pruning steps) are performed at this
+point to figure out the minimal set of child subplans that satisfy those
+pruning steps.  AcquireExecutorLocks() looking at a given plan tree will then
+lock only the relations scanned by the child subplans that survived such
+pruning, along with those present in PlannedStmt.minLockRelids.  Note that the
+subplans are only notionally pruned in that they are not removed from the plan
+tree as such.
+
+To prevent the executor and any third-party execution code that can look at
+the plan tree from trying to execute the subplans that were pruned as
+described above, the result of pruning is passed to the executor as a
+PartitionPruneResult node via the QueryDesc.  It consists of the set of
+indexes of surviving subplans in their respective parent plan node's list of
+child subplans, saved as a list of bitmapsets, with one element for every
+parent plan node whose PartitionPruneInfo is present in
+PlannedStmt.partPruneInfos.  In other words, the executor should not
+re-evaluate the set of initially valid subplans by redoing the initial pruning
+if it was already done by AcquireExecutorLocks(), because the re-evaluation may
+very well end up resulting in a different set of subplans, containing some
+whose relations were not locked by AcquireExecutorLocks().
+
 Each Plan node may have expression trees associated with it, to represent
 its target list, qualification conditions, etc.  These trees are also
 read-only to the executor, but the executor state for expression evaluation
@@ -286,6 +314,10 @@ Query Processing Control Flow
 
 This is a sketch of control flow for full query processing:
 
+  [ ExecutorDoInitialPruning ] --- an optional step to perform initial
+		partition pruning on the plan tree, the result of which is passed
+		to the executor via QueryDesc
+
 	CreateQueryDesc
 
 	ExecutorStart
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 32475e33ff..6e2cd1596f 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -49,6 +49,7 @@
 #include "commands/matview.h"
 #include "commands/trigger.h"
 #include "executor/execdebug.h"
+#include "executor/execPartition.h"
 #include "executor/nodeSubplan.h"
 #include "foreign/fdwapi.h"
 #include "jit/jit.h"
@@ -104,6 +105,58 @@ static void EvalPlanQualStart(EPQState *epqstate, Plan *planTree);
 
 /* end of local decls */
 
+/* ----------------------------------------------------------------
+ *		ExecutorDoInitialPruning
+ *
+ *		For each plan tree node that has been assigned a PartitionPruneInfo,
+ *		this performs initial partition pruning using the information contained
+ *		therein to determine the set of child subplans that satisfy the initial
+ *		pruning steps, to be returned as a bitmapset of their indexes in the
+ *		node's list of child subplans (for example, an Append's appendplans).
+ *
+ * Return value is a PartitionPruneResult node that contains a list of those
+ * bitmapsets, with one element for every PartitionPruneInfo, and a bitmapset
+ * of the RT indexes of all the leaf partitions scanned by those chosen
+ * subplans.  Note that the latter is shared across all PartitionPruneInfos.
+ *
+ * The executor must see exactly the same set of subplans as valid for
+ * execution when doing ExecInitNode() on the plan nodes whose
+ * PartitionPruneInfos are processed here.  So, it must get the set from the
+ * aforementioned PartitionPruneResult, instead of computing it all over
+ * again by redoing the initial pruning.  It's the caller's job to pass the
+ * PartitionPruneResult to the executor.
+ *
+ * Note: Partitioned tables mentioned in PartitionedRelPruneInfo nodes that
+ * drive the pruning will be locked before doing the pruning.
+ * ----------------------------------------------------------------
+ */
+PartitionPruneResult *
+ExecutorDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params)
+{
+	PartitionPruneResult *result;
+	ListCell *lc;
+
+	/* Only get here if there is any pruning to do. */
+	Assert(plannedstmt->containsInitialPruning);
+
+	result = makeNode(PartitionPruneResult);
+	foreach(lc, plannedstmt->partPruneInfos)
+	{
+		PartitionPruneInfo *pruneinfo = lfirst(lc);
+		Bitmapset *valid_subplan_offs;
+
+		valid_subplan_offs =
+			ExecPartitionDoInitialPruning(plannedstmt, params, pruneinfo,
+										  &result->scan_leafpart_rtis);
+		if (valid_subplan_offs)
+			valid_subplan_offs->type = T_Bitmapset;
+		result->valid_subplan_offs_list =
+			lappend(result->valid_subplan_offs_list,
+					valid_subplan_offs);
+	}
+
+	return result;
+}
 
 /* ----------------------------------------------------------------
  *		ExecutorStart
@@ -806,6 +859,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 {
 	CmdType		operation = queryDesc->operation;
 	PlannedStmt *plannedstmt = queryDesc->plannedstmt;
+	PartitionPruneResult *part_prune_result = queryDesc->part_prune_result;
 	Plan	   *plan = plannedstmt->planTree;
 	List	   *rangeTable = plannedstmt->rtable;
 	EState	   *estate = queryDesc->estate;
@@ -826,6 +880,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 
 	estate->es_plannedstmt = plannedstmt;
 	estate->es_part_prune_infos = plannedstmt->partPruneInfos;
+	estate->es_part_prune_result = part_prune_result;
 
 	/*
 	 * Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index aca0c6f323..abae5b8623 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -66,6 +66,7 @@
 #define PARALLEL_KEY_QUERY_TEXT		UINT64CONST(0xE000000000000008)
 #define PARALLEL_KEY_JIT_INSTRUMENTATION UINT64CONST(0xE000000000000009)
 #define PARALLEL_KEY_WAL_USAGE			UINT64CONST(0xE00000000000000A)
+#define PARALLEL_KEY_PARTITIONPRUNERESULT	UINT64CONST(0xE00000000000000B)
 
 #define PARALLEL_TUPLE_QUEUE_SIZE		65536
 
@@ -182,6 +183,7 @@ ExecSerializePlan(Plan *plan, EState *estate)
 	pstmt->transientPlan = false;
 	pstmt->dependsOnRole = false;
 	pstmt->parallelModeNeeded = false;
+	pstmt->containsInitialPruning = false;
 	pstmt->planTree = plan;
 	pstmt->partPruneInfos = estate->es_part_prune_infos;
 	pstmt->rtable = estate->es_range_table;
@@ -597,12 +599,15 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	FixedParallelExecutorState *fpes;
 	char	   *pstmt_data;
 	char	   *pstmt_space;
+	char	   *part_prune_result_data;
+	char	   *part_prune_result_space;
 	char	   *paramlistinfo_space;
 	BufferUsage *bufusage_space;
 	WalUsage   *walusage_space;
 	SharedExecutorInstrumentation *instrumentation = NULL;
 	SharedJitInstrumentation *jit_instrumentation = NULL;
 	int			pstmt_len;
+	int			part_prune_result_len;
 	int			paramlistinfo_len;
 	int			instrumentation_len = 0;
 	int			jit_instrumentation_len = 0;
@@ -631,6 +636,7 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 
 	/* Fix up and serialize plan to be sent to workers. */
 	pstmt_data = ExecSerializePlan(planstate->plan, estate);
+	part_prune_result_data = nodeToString(estate->es_part_prune_result);
 
 	/* Create a parallel context. */
 	pcxt = CreateParallelContext("postgres", "ParallelQueryMain", nworkers);
@@ -657,6 +663,11 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	shm_toc_estimate_chunk(&pcxt->estimator, pstmt_len);
 	shm_toc_estimate_keys(&pcxt->estimator, 1);
 
+	/* Estimate space for serialized PartitionPruneResult. */
+	part_prune_result_len = strlen(part_prune_result_data) + 1;
+	shm_toc_estimate_chunk(&pcxt->estimator, part_prune_result_len);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
 	/* Estimate space for serialized ParamListInfo. */
 	paramlistinfo_len = EstimateParamListSpace(estate->es_param_list_info);
 	shm_toc_estimate_chunk(&pcxt->estimator, paramlistinfo_len);
@@ -751,6 +762,12 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	memcpy(pstmt_space, pstmt_data, pstmt_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PLANNEDSTMT, pstmt_space);
 
+	/* Store serialized PartitionPruneResult */
+	part_prune_result_space = shm_toc_allocate(pcxt->toc, part_prune_result_len);
+	memcpy(part_prune_result_space, part_prune_result_data, part_prune_result_len);
+	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARTITIONPRUNERESULT,
+				   part_prune_result_space);
+
 	/* Store serialized ParamListInfo. */
 	paramlistinfo_space = shm_toc_allocate(pcxt->toc, paramlistinfo_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARAMLISTINFO, paramlistinfo_space);
@@ -1232,8 +1249,10 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 						 int instrument_options)
 {
 	char	   *pstmtspace;
+	char	   *part_prune_result_space;
 	char	   *paramspace;
 	PlannedStmt *pstmt;
+	PartitionPruneResult *part_prune_result;
 	ParamListInfo paramLI;
 	char	   *queryString;
 
@@ -1244,12 +1263,18 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 	pstmtspace = shm_toc_lookup(toc, PARALLEL_KEY_PLANNEDSTMT, false);
 	pstmt = (PlannedStmt *) stringToNode(pstmtspace);
 
+	/* Reconstruct leader-supplied PartitionPruneResult. */
+	part_prune_result_space =
+		shm_toc_lookup(toc, PARALLEL_KEY_PARTITIONPRUNERESULT, false);
+	part_prune_result = (PartitionPruneResult *)
+		stringToNode(part_prune_result_space);
+
 	/* Reconstruct ParamListInfo. */
 	paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
 	paramLI = RestoreParamList(&paramspace);
 
 	/* Create a QueryDesc for the query. */
-	return CreateQueryDesc(pstmt,
+	return CreateQueryDesc(pstmt, part_prune_result,
 						   queryString,
 						   GetActiveSnapshot(), InvalidSnapshot,
 						   receiver, paramLI, NULL, instrument_options);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 80197d5141..b612c24d62 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -25,6 +25,7 @@
 #include "mb/pg_wchar.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
+#include "parser/parsetree.h"
 #include "partitioning/partbounds.h"
 #include "partitioning/partdesc.h"
 #include "partitioning/partprune.h"
@@ -185,7 +186,11 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
 static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
 static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
 static PartitionPruneState *CreatePartitionPruneState(PlanState *planstate,
-													  PartitionPruneInfo *pruneinfo);
+													  PartitionPruneInfo *pruneinfo,
+													  bool consider_initial_steps,
+													  bool consider_exec_steps,
+													  List *rtable, ExprContext *econtext,
+													  PartitionDirectory partdir);
 static void InitPartitionPruneContext(PartitionPruneContext *context,
 									  List *pruning_steps,
 									  PartitionDesc partdesc,
@@ -198,7 +203,8 @@ static void PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
 static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
 										   PartitionedRelPruningData *pprune,
 										   bool initial_prune,
-										   Bitmapset **validsubplans);
+										   Bitmapset **validsubplans,
+										   Bitmapset **scan_leafpart_rtis);
 
 
 /*
@@ -1746,8 +1752,10 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  * considered to be a stable expression, it can change value from one plan
  * node scan to the next during query execution.  Stable comparison
  * expressions that don't involve such Params allow partition pruning to be
- * done once during executor startup.  Expressions that do involve such Params
- * require us to prune separately for each scan of the parent plan node.
+ * done once during executor startup or during ExecutorDoInitialPruning() that
+ * runs as part of performing AcquireExecutorLocks() on a given plan tree.
+ * Expressions that do involve such Params require us to prune separately for
+ * each scan of the parent plan node.
  *
  * Note that pruning away unneeded subplans during executor startup has the
  * added benefit of not having to initialize the unneeded subplans at all.
@@ -1764,6 +1772,13 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *		account for initial pruning possibly having eliminated some of the
  *		subplans.
  *
+ * ExecPartitionDoInitialPruning:
+ *		Do initial pruning with the information contained in a given
+ *		PartitionPruneInfo to determine the minimal set of child subplans
+ *		of the parent plan node (the one to which the PartitionPruneInfo
+ *		belongs) that need to be executed, and also the set of RT indexes
+ *		of the leaf partitions that will be scanned by those subplans.
+ *
  * ExecFindMatchingSubPlans:
  *		Returns indexes of matching subplans after evaluating the expressions
  *		that are safe to evaluate at a given point.  This function is first
@@ -1781,8 +1796,9 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *
  * On return, *initially_valid_subplans is assigned the set of indexes of
  * child subplans that must be initialized along with the parent plan node.
- * Initial pruning is performed here if needed and in that case only the
- * surviving subplans' indexes are added.
+ * Initial pruning is performed here if needed (unless it has already been done
+ * by ExecutorDoInitialPruning()), and in that case only the surviving
+ * subplans' indexes are added.
  *
  * If subplans are indeed pruned, subplan_map arrays contained in the returned
  * PartitionPruneState are re-sequenced to not count those, though only if the
@@ -1794,28 +1810,62 @@ ExecInitPartitionPruning(PlanState *planstate,
 						 int part_prune_index,
 						 Bitmapset **initially_valid_subplans)
 {
-	PartitionPruneState *prunestate;
+	PartitionPruneState *prunestate = NULL;
 	EState	   *estate = planstate->state;
 	PartitionPruneInfo *pruneinfo = list_nth(estate->es_part_prune_infos,
 											 part_prune_index);
+	PartitionPruneResult *pruneresult = estate->es_part_prune_result;
+	bool	do_pruning = (pruneinfo->needs_init_pruning ||
+						  pruneinfo->needs_exec_pruning);
 
-	/* We may need an expression context to evaluate partition exprs */
-	ExecAssignExprContext(estate, planstate);
+	/*
+	 * No need to do initial pruning if it was done already by
+	 * ExecutorDoInitialPruning(), which it would be if es_part_prune_result
+	 * has been set.
+	 */
+	if (pruneresult)
+		do_pruning = pruneinfo->needs_exec_pruning;
 
-	/* Create the working data structure for pruning */
-	prunestate = CreatePartitionPruneState(planstate, pruneinfo);
+	if (do_pruning)
+	{
+		/* We may need an expression context to evaluate partition exprs */
+		ExecAssignExprContext(estate, planstate);
+
+		/* For data reading, executor always includes detached partitions */
+		if (estate->es_partition_directory == NULL)
+			estate->es_partition_directory =
+				CreatePartitionDirectory(estate->es_query_cxt, false);
+
+		/*
+		 * Create the working data structure for pruning.  No need to consider
+		 * initial pruning steps if we have a PartitionPruneResult.
+		 */
+		prunestate = CreatePartitionPruneState(planstate, pruneinfo,
+											   pruneresult == NULL, true,
+											   NIL, planstate->ps_ExprContext,
+											   estate->es_partition_directory);
+	}
 
 	/*
 	 * Perform an initial partition prune pass, if required.
 	 */
-	if (prunestate->do_initial_prune)
-		*initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true);
+	if (pruneresult)
+	{
+		*initially_valid_subplans =
+			list_nth(pruneresult->valid_subplan_offs_list, part_prune_index);
+	}
+	else if (prunestate && prunestate->do_initial_prune)
+	{
+		*initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true,
+															 NULL);
+	}
 	else
 	{
-		/* No pruning, so we'll need to initialize all subplans */
+		/* No initial pruning, so we'll need to initialize all subplans */
 		Assert(n_total_subplans > 0);
 		*initially_valid_subplans = bms_add_range(NULL, 0,
 												  n_total_subplans - 1);
+		return prunestate;
 	}
 
 	/*
@@ -1823,7 +1873,8 @@ ExecInitPartitionPruning(PlanState *planstate,
 	 * that were removed above due to initial pruning.  No need to do this if
 	 * no steps were removed.
 	 */
-	if (bms_num_members(*initially_valid_subplans) < n_total_subplans)
+	if (prunestate &&
+		bms_num_members(*initially_valid_subplans) < n_total_subplans)
 	{
 		/*
 		 * We can safely skip this when !do_exec_prune, even though that
@@ -1839,11 +1890,74 @@ ExecInitPartitionPruning(PlanState *planstate,
 	return prunestate;
 }
 
+/*
+ * ExecPartitionDoInitialPruning
+ *		Perform initial pruning using given PartitionPruneInfo to determine
+ *		the minimal set of child subplans that will be executed and also the
+ *		set of RT indexes of the leaf partitions scanned by those subplans.
+ */
+Bitmapset *
+ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
+							  PartitionPruneInfo *pruneinfo,
+							  Bitmapset **scan_leafpart_rtis)
+{
+	List		 *rtable = plannedstmt->rtable;
+	ExprContext	 *econtext;
+	PartitionDirectory pdir;
+	MemoryContext oldcontext,
+				  tmpcontext;
+	PartitionPruneState *prunestate;
+	Bitmapset	 *valid_subplan_offs;
+
+	/*
+	 * A temporary context for memory allocations required while executing
+	 * partition pruning steps.
+	 */
+	tmpcontext = AllocSetContextCreate(CurrentMemoryContext,
+									   "initial pruning working data",
+									   ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(tmpcontext);
+
+	/*
+	 * PartitionDirectory to look up partition descriptors.
+	 * Note that we don't omit detached partitions, just like during
+	 * execution proper.
+	 */
+	pdir = CreatePartitionDirectory(CurrentMemoryContext, false);
+
+	/*
+	 * We don't yet have a PlanState for the parent plan node, so we must
+	 * create a standalone ExprContext to evaluate pruning expressions,
+	 * equipped with the information about the EXTERN parameters that the
+	 * caller passed us.  Note that that's okay because the initial pruning
+	 * steps do not contain anything that requires the execution to have
+	 * started and thus need the information contained in a PlanState.
+	 */
+	econtext = CreateStandaloneExprContext();
+	econtext->ecxt_param_list_info = params;
+	prunestate = CreatePartitionPruneState(NULL, pruneinfo, true, false,
+										   rtable, econtext, pdir);
+	MemoryContextSwitchTo(oldcontext);
+
+	/* Do the initial pruning. */
+	valid_subplan_offs = ExecFindMatchingSubPlans(prunestate, true,
+												  scan_leafpart_rtis);
+
+	FreeExprContext(econtext, true);
+	DestroyPartitionDirectory(pdir);
+	MemoryContextDelete(tmpcontext);
+
+	return valid_subplan_offs;
+}
+
 /*
  * CreatePartitionPruneState
  *		Build the data structure required for calling ExecFindMatchingSubPlans
  *
- * 'planstate' is the parent plan node's execution state.
+ * 'planstate', if not NULL, is the parent plan node's execution state.  It
+ * can be NULL if being called before ExecutorStart(), in which case,
+ * 'rtable' (range table), 'econtext', and 'partdir' must be explicitly
+ * provided.
  *
  * 'pruneinfo' is a PartitionPruneInfo as generated by
  * make_partition_pruneinfo.  Here we build a PartitionPruneState containing a
@@ -1857,19 +1971,21 @@ ExecInitPartitionPruning(PlanState *planstate,
  * PartitionedRelPruneInfo.
  */
 static PartitionPruneState *
-CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
+CreatePartitionPruneState(PlanState *planstate,
+						  PartitionPruneInfo *pruneinfo,
+						  bool consider_initial_steps,
+						  bool consider_exec_steps,
+						  List *rtable, ExprContext *econtext,
+						  PartitionDirectory partdir)
 {
-	EState	   *estate = planstate->state;
+	EState	   *estate = planstate ? planstate->state : NULL;
 	PartitionPruneState *prunestate;
 	int			n_part_hierarchies;
 	ListCell   *lc;
 	int			i;
-	ExprContext *econtext = planstate->ps_ExprContext;
 
-	/* For data reading, executor always omits detached partitions */
-	if (estate->es_partition_directory == NULL)
-		estate->es_partition_directory =
-			CreatePartitionDirectory(estate->es_query_cxt, false);
+	Assert((estate != NULL) ||
+			(partdir != NULL && econtext != NULL && rtable != NIL));
 
 	n_part_hierarchies = list_length(pruneinfo->prune_infos);
 	Assert(n_part_hierarchies > 0);
@@ -1924,15 +2040,42 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 			PartitionKey partkey;
 
 			/*
-			 * We can rely on the copies of the partitioned table's partition
-			 * key and partition descriptor appearing in its relcache entry,
-			 * because that entry will be held open and locked for the
-			 * duration of this executor run.
+			 * Must open the relation ourselves when called before the
+			 * execution has started, such as when called during
+			 * ExecutorDoInitialPruning() on a cached plan.  In that case,
+			 * sub-partitions must be locked, because AcquirePlannerLocks()
+			 * would not have seen them. (1st relation in a partrelpruneinfos
+			 * list is always the root partitioned table appearing in the
+			 * query, which AcquirePlannerLocks() would have locked; the
+			 * Assert in relation_open() guards that assumption.)
+			 */
+			if (estate == NULL)
+			{
+				RangeTblEntry *rte = rt_fetch(pinfo->rtindex, rtable);
+				int		lockmode = (j == 0) ? NoLock : rte->rellockmode;
+
+				partrel = table_open(rte->relid, lockmode);
+			}
+			else
+				partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
+
+			/*
+			 * We can rely on the copy of the partitioned table's partition
+			 * key in its relcache entry, because it can't change (or
+			 * get destroyed) as long as the relation is locked.  Partition
+			 * descriptor is taken from the PartitionDirectory associated with
+			 * the table that is held open long enough for the descriptor to
+			 * remain valid while it's used to perform the pruning steps.
 			 */
-			partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
 			partkey = RelationGetPartitionKey(partrel);
-			partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
-												partrel);
+			partdesc = PartitionDirectoryLookup(partdir, partrel);
+
+			/*
+			 * Must close partrel, keeping the lock taken, if we're not using
+			 * EState's entry.
+			 */
+			if (estate == NULL)
+				table_close(partrel, NoLock);
 
 			/*
 			 * Initialize the subplan_map and subpart_map.
@@ -1946,6 +2089,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 			Assert(partdesc->nparts >= pinfo->nparts);
 			pprune->nparts = partdesc->nparts;
 			pprune->subplan_map = palloc(sizeof(int) * partdesc->nparts);
+			pprune->rti_map = palloc(sizeof(Index) * partdesc->nparts);
 			if (partdesc->nparts == pinfo->nparts)
 			{
 				/*
@@ -1956,6 +2100,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 				pprune->subpart_map = pinfo->subpart_map;
 				memcpy(pprune->subplan_map, pinfo->subplan_map,
 					   sizeof(int) * pinfo->nparts);
+				memcpy(pprune->rti_map, pinfo->rti_map,
+					   sizeof(Index) * pinfo->nparts);
 
 				/*
 				 * Double-check that the list of unpruned relations has not
@@ -2006,6 +2152,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 							pinfo->subplan_map[pd_idx];
 						pprune->subpart_map[pp_idx] =
 							pinfo->subpart_map[pd_idx];
+						pprune->rti_map[pp_idx] =
+							pinfo->rti_map[pd_idx];
 						pd_idx++;
 					}
 					else
@@ -2013,6 +2161,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 						/* this partdesc entry is not in the plan */
 						pprune->subplan_map[pp_idx] = -1;
 						pprune->subpart_map[pp_idx] = -1;
+						pprune->rti_map[pp_idx] = 0;
 					}
 				}
 
@@ -2034,7 +2183,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 			 * Initialize pruning contexts as needed.
 			 */
 			pprune->initial_pruning_steps = pinfo->initial_pruning_steps;
-			if (pinfo->initial_pruning_steps)
+			if (consider_initial_steps && pinfo->initial_pruning_steps)
 			{
 				InitPartitionPruneContext(&pprune->initial_context,
 										  pinfo->initial_pruning_steps,
@@ -2044,7 +2193,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 				prunestate->do_initial_prune = true;
 			}
 			pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
-			if (pinfo->exec_pruning_steps)
+			if (consider_exec_steps && pinfo->exec_pruning_steps)
 			{
 				InitPartitionPruneContext(&pprune->exec_context,
 										  pinfo->exec_pruning_steps,
@@ -2272,10 +2421,14 @@ PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
  * Pass initial_prune if PARAM_EXEC Params cannot yet be evaluated.  This
  * differentiates the initial executor-time pruning step from later
  * runtime pruning.
+ *
+ * RT indexes of leaf partitions scanned by the chosen subplans are added to
+ * *scan_leafpart_rtis if the pointer is non-NULL.
  */
 Bitmapset *
 ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
-						 bool initial_prune)
+						 bool initial_prune,
+						 Bitmapset **scan_leafpart_rtis)
 {
 	Bitmapset  *result = NULL;
 	MemoryContext oldcontext;
@@ -2310,7 +2463,7 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
 		 */
 		pprune = &prunedata->partrelprunedata[0];
 		find_matching_subplans_recurse(prunedata, pprune, initial_prune,
-									   &result);
+									   &result, scan_leafpart_rtis);
 
 		/* Expression eval may have used space in ExprContext too */
 		if (pprune->exec_pruning_steps)
@@ -2324,6 +2477,8 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
 
 	/* Copy result out of the temp context before we reset it */
 	result = bms_copy(result);
+	if (scan_leafpart_rtis)
+		*scan_leafpart_rtis = bms_copy(*scan_leafpart_rtis);
 
 	MemoryContextReset(prunestate->prune_context);
 
@@ -2334,13 +2489,15 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
  * find_matching_subplans_recurse
  *		Recursive worker function for ExecFindMatchingSubPlans
  *
- * Adds valid (non-prunable) subplan IDs to *validsubplans
+ * Adds valid (non-prunable) subplan IDs to *validsubplans and RT indexes of
+ * the corresponding leaf partitions to *scan_leafpart_rtis (if asked for).
  */
 static void
 find_matching_subplans_recurse(PartitionPruningData *prunedata,
 							   PartitionedRelPruningData *pprune,
 							   bool initial_prune,
-							   Bitmapset **validsubplans)
+							   Bitmapset **validsubplans,
+							   Bitmapset **scan_leafpart_rtis)
 {
 	Bitmapset  *partset;
 	int			i;
@@ -2367,8 +2524,14 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
 	while ((i = bms_next_member(partset, i)) >= 0)
 	{
 		if (pprune->subplan_map[i] >= 0)
+		{
 			*validsubplans = bms_add_member(*validsubplans,
 											pprune->subplan_map[i]);
+			Assert(pprune->rti_map[i] > 0);
+			if (scan_leafpart_rtis)
+				*scan_leafpart_rtis = bms_add_member(*scan_leafpart_rtis,
+													 pprune->rti_map[i]);
+		}
 		else
 		{
 			int			partidx = pprune->subpart_map[i];
@@ -2376,7 +2539,8 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
 			if (partidx >= 0)
 				find_matching_subplans_recurse(prunedata,
 											   &prunedata->partrelprunedata[partidx],
-											   initial_prune, validsubplans);
+											   initial_prune, validsubplans,
+											   scan_leafpart_rtis);
 			else
 			{
 				/*
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 21f4c10937..bb7d028463 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -134,6 +134,7 @@ CreateExecutorState(void)
 	estate->es_param_exec_vals = NULL;
 
 	estate->es_queryEnv = NULL;
+	estate->es_part_prune_result = NULL;
 
 	estate->es_query_cxt = qcontext;
 
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index e134a82ff7..901768cc34 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -842,7 +842,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
 	else
 		dest = None_Receiver;
 
-	es->qd = CreateQueryDesc(es->stmt,
+	es->qd = CreateQueryDesc(es->stmt, NULL,
 							 fcache->src,
 							 GetActiveSnapshot(),
 							 InvalidSnapshot,
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index c6f86a6510..96880e122a 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -155,7 +155,8 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 		 * subplan, we can fill as_valid_subplans immediately, preventing
 		 * later calls to ExecFindMatchingSubPlans.
 		 */
-		if (!prunestate->do_exec_prune && nplans > 0)
+		if (appendstate->as_prune_state == NULL ||
+			(!appendstate->as_prune_state->do_exec_prune && nplans > 0))
 			appendstate->as_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
 	}
 	else
@@ -577,7 +578,7 @@ choose_next_subplan_locally(AppendState *node)
 		}
 		else if (node->as_valid_subplans == NULL)
 			node->as_valid_subplans =
-				ExecFindMatchingSubPlans(node->as_prune_state, false);
+				ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 
 		whichplan = -1;
 	}
@@ -642,7 +643,7 @@ choose_next_subplan_for_leader(AppendState *node)
 		if (node->as_valid_subplans == NULL)
 		{
 			node->as_valid_subplans =
-				ExecFindMatchingSubPlans(node->as_prune_state, false);
+				ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 
 			/*
 			 * Mark each invalid plan as finished to allow the loop below to
@@ -717,7 +718,7 @@ choose_next_subplan_for_worker(AppendState *node)
 	else if (node->as_valid_subplans == NULL)
 	{
 		node->as_valid_subplans =
-			ExecFindMatchingSubPlans(node->as_prune_state, false);
+			ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 		mark_invalid_subplans_as_finished(node);
 	}
 
@@ -868,7 +869,7 @@ ExecAppendAsyncBegin(AppendState *node)
 	if (node->as_valid_subplans == NULL)
 	{
 		node->as_valid_subplans =
-			ExecFindMatchingSubPlans(node->as_prune_state, false);
+			ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 
 		classify_matching_subplans(node);
 	}
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index 8d35860c30..2312e5a633 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -103,7 +103,8 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
 		 * subplan, we can fill ms_valid_subplans immediately, preventing
 		 * later calls to ExecFindMatchingSubPlans.
 		 */
-		if (!prunestate->do_exec_prune && nplans > 0)
+		if (mergestate->ms_prune_state == NULL ||
+			(!mergestate->ms_prune_state->do_exec_prune && nplans > 0))
 			mergestate->ms_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
 	}
 	else
@@ -218,7 +219,7 @@ ExecMergeAppend(PlanState *pstate)
 		 */
 		if (node->ms_valid_subplans == NULL)
 			node->ms_valid_subplans =
-				ExecFindMatchingSubPlans(node->ms_prune_state, false);
+				ExecFindMatchingSubPlans(node->ms_prune_state, false, NULL);
 
 		/*
 		 * First time through: pull the first tuple from each valid subplan,
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index fd5796f1b9..b3faeae2af 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1578,6 +1578,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 	CachedPlanSource *plansource;
 	CachedPlan *cplan;
 	List	   *stmt_list;
+	List	   *part_prune_result_list;
 	char	   *query_string;
 	Snapshot	snapshot;
 	MemoryContext oldcontext;
@@ -1657,7 +1658,10 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 	 */
 
 	/* Replan if needed, and increment plan refcount for portal */
-	cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv);
+	cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv,
+						  &part_prune_result_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_result_list));
 	stmt_list = cplan->stmt_list;
 
 	if (!plan->saved)
@@ -1685,6 +1689,9 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 					  stmt_list,
 					  cplan);
 
+	/* Copy PartitionPruneResults into the portal's context. */
+	PortalStorePartitionPruneResults(portal, part_prune_result_list);
+
 	/*
 	 * Set up options for portal.  Default SCROLL type is chosen the same way
 	 * as PerformCursorOpen does it.
@@ -2092,7 +2099,8 @@ SPI_plan_get_cached_plan(SPIPlanPtr plan)
 	/* Get the generic plan for the query */
 	cplan = GetCachedPlan(plansource, NULL,
 						  plan->saved ? CurrentResourceOwner : NULL,
-						  _SPI_current->queryEnv);
+						  _SPI_current->queryEnv,
+						  NULL /* Not interested in PartitionPruneResults */);
 	Assert(cplan == plansource->gplan);
 
 	/* Pop the error context stack */
@@ -2473,7 +2481,9 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 	{
 		CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
 		List	   *stmt_list;
-		ListCell   *lc2;
+		List	   *part_prune_result_list;
+		ListCell   *lc2,
+				   *lc3;
 
 		spicallbackarg.query = plansource->query_string;
 
@@ -2549,8 +2559,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 		 * plan, the refcount must be backed by the plan_owner.
 		 */
 		cplan = GetCachedPlan(plansource, options->params,
-							  plan_owner, _SPI_current->queryEnv);
-
+							  plan_owner, _SPI_current->queryEnv,
+							  &part_prune_result_list);
+		Assert(list_length(cplan->stmt_list) ==
+			   list_length(part_prune_result_list));
 		stmt_list = cplan->stmt_list;
 
 		/*
@@ -2589,9 +2601,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 			}
 		}
 
-		foreach(lc2, stmt_list)
+		forboth(lc2, stmt_list, lc3, part_prune_result_list)
 		{
 			PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+			PartitionPruneResult *part_prune_result = lfirst_node(PartitionPruneResult, lc3);
 			bool		canSetTag = stmt->canSetTag;
 			DestReceiver *dest;
 
@@ -2663,7 +2676,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 				else
 					snap = InvalidSnapshot;
 
-				qdesc = CreateQueryDesc(stmt,
+				qdesc = CreateQueryDesc(stmt, part_prune_result,
 										plansource->query_string,
 										snap, crosscheck_snapshot,
 										dest,
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 4d6902d3ac..c34226a83b 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -158,6 +158,11 @@
 	token = pg_strtok(&length);		/* skip :fldname */ \
 	local_node->fldname = readIntCols(len)
 
+/* Read an Index array */
+#define READ_INDEX_ARRAY(fldname, len) \
+	token = pg_strtok(&length);		/* skip :fldname */ \
+	local_node->fldname = readIndexCols(len)
+
 /* Read a bool array */
 #define READ_BOOL_ARRAY(fldname, len) \
 	token = pg_strtok(&length);		/* skip :fldname */ \
@@ -799,7 +804,6 @@ fnname(int numCols) \
  */
 READ_SCALAR_ARRAY(readAttrNumberCols, int16, atoi)
 READ_SCALAR_ARRAY(readOidCols, Oid, atooid)
-/* outfuncs.c has writeIndexCols, but we don't yet need that here */
-/* READ_SCALAR_ARRAY(readIndexCols, Index, atoui) */
+READ_SCALAR_ARRAY(readIndexCols, Index, atoui)
 READ_SCALAR_ARRAY(readIntCols, int, atoi)
 READ_SCALAR_ARRAY(readBoolCols, bool, strtobool)
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 31fff597a7..4097cf7164 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -520,7 +520,9 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 	result->parallelModeNeeded = glob->parallelModeNeeded;
 	result->planTree = top_plan;
 	result->partPruneInfos = glob->partPruneInfos;
+	result->containsInitialPruning = glob->containsInitialPruning;
 	result->rtable = glob->finalrtable;
+	result->minLockRelids = glob->minLockRelids;
 	result->resultRelations = glob->resultRelations;
 	result->appendRelations = glob->appendRelations;
 	result->subplans = glob->subplans;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 720f20f563..61d6934978 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -270,6 +270,16 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 	 */
 	add_rtes_to_flat_rtable(root, false);
 
+	/*
+	 * Add the query's adjusted range of RT indexes to glob->minLockRelids.
+	 * The adjusted RT indexes of prunable relations will be deleted from the
+	 * set below where PartitionPruneInfos are processed.
+	 */
+	glob->minLockRelids =
+		bms_add_range(glob->minLockRelids,
+					  rtoffset + 1,
+					  rtoffset + list_length(root->parse->rtable));
+
 	/*
 	 * Adjust RT indexes of PlanRowMarks and add to final rowmarks list
 	 */
@@ -352,6 +362,7 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 	foreach (lc, root->partPruneInfos)
 	{
 		PartitionPruneInfo *pruneinfo = lfirst(lc);
+		Bitmapset *leafpart_rtis = NULL;
 		ListCell  *l;
 
 		foreach(l, pruneinfo->prune_infos)
@@ -362,15 +373,50 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 			foreach(l2, prune_infos)
 			{
 				PartitionedRelPruneInfo *pinfo = lfirst(l2);
+				int		i;
 
 				/* RT index of the table to which the pinfo belongs. */
 				pinfo->rtindex += rtoffset;
+
+				/* Also of the leaf partitions that might be scanned. */
+				for (i = 0; i < pinfo->nparts; i++)
+				{
+					if (pinfo->rti_map[i] > 0 && pinfo->subplan_map[i] >= 0)
+					{
+						pinfo->rti_map[i] += rtoffset;
+						leafpart_rtis = bms_add_member(leafpart_rtis,
+													   pinfo->rti_map[i]);
+					}
+				}
 			}
 		}
 
+		if (pruneinfo->needs_init_pruning)
+		{
+			glob->containsInitialPruning = true;
+
+			/*
+			 * Delete the leaf partition RTIs from the global set of relations
+			 * to be locked before executing the plan.  AcquireExecutorLocks()
+			 * will find the ones to add to the set after performing initial
+			 * pruning.
+			 */
+			glob->minLockRelids = bms_del_members(glob->minLockRelids,
+												  leafpart_rtis);
+		}
+
 		glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
 	}
 
+	/*
+	 * It seems worth doing a bms_copy() on glob->minLockRelids if we deleted
+	 * bits from it above to get rid of any empty tail bits.  It seems better
+	 * for the loop over this set in AcquireExecutorLocks() to not have to go
+	 * through those useless bit words.
+	 */
+	if (glob->containsInitialPruning)
+		glob->minLockRelids = bms_copy(glob->minLockRelids);
+
 	return result;
 }
 
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 6565b6ed01..37f3e6af61 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -144,7 +144,9 @@ static List *make_partitionedrel_pruneinfo(PlannerInfo *root,
 										   List *prunequal,
 										   Bitmapset *partrelids,
 										   int *relid_subplan_map,
-										   Bitmapset **matchedsubplans);
+										   Bitmapset **matchedsubplans,
+										   bool *needs_init_pruning,
+										   bool *needs_exec_pruning);
 static void gen_partprune_steps(RelOptInfo *rel, List *clauses,
 								PartClauseTarget target,
 								GeneratePruningStepsContext *context);
@@ -234,6 +236,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	int		   *relid_subplan_map;
 	ListCell   *lc;
 	int			i;
+	bool		needs_init_pruning = false;
+	bool		needs_exec_pruning = false;
 
 	/*
 	 * Scan the subpaths to see which ones are scans of partition child
@@ -313,12 +317,16 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		Bitmapset  *partrelids = (Bitmapset *) lfirst(lc);
 		List	   *pinfolist;
 		Bitmapset  *matchedsubplans = NULL;
+		bool		partrel_needs_init_pruning;
+		bool		partrel_needs_exec_pruning;
 
 		pinfolist = make_partitionedrel_pruneinfo(root, parentrel,
 												  prunequal,
 												  partrelids,
 												  relid_subplan_map,
-												  &matchedsubplans);
+												  &matchedsubplans,
+												  &partrel_needs_init_pruning,
+												  &partrel_needs_exec_pruning);
 
 		/* When pruning is possible, record the matched subplans */
 		if (pinfolist != NIL)
@@ -327,6 +335,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 			allmatchedsubplans = bms_join(matchedsubplans,
 										  allmatchedsubplans);
 		}
+
+		needs_init_pruning |= partrel_needs_init_pruning;
+		needs_exec_pruning |= partrel_needs_exec_pruning;
 	}
 
 	pfree(relid_subplan_map);
@@ -341,6 +352,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	/* Else build the result data structure */
 	pruneinfo = makeNode(PartitionPruneInfo);
 	pruneinfo->prune_infos = prunerelinfos;
+	pruneinfo->needs_init_pruning = needs_init_pruning;
+	pruneinfo->needs_exec_pruning = needs_exec_pruning;
 
 	/*
 	 * Some subplans may not belong to any of the identified partitioned rels.
@@ -441,13 +454,18 @@ add_part_relids(List *allpartrelids, Bitmapset *partrelids)
  * If we cannot find any useful run-time pruning steps, return NIL.
  * However, on success, each rel identified in partrelids will have
  * an element in the result list, even if some of them are useless.
+ * *needs_init_pruning and *needs_exec_pruning are set to indicate that the
+ * returned PartitionedRelPruneInfos contain pruning steps that can be
+ * performed before and after execution begins, respectively.
  */
 static List *
 make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 							  List *prunequal,
 							  Bitmapset *partrelids,
 							  int *relid_subplan_map,
-							  Bitmapset **matchedsubplans)
+							  Bitmapset **matchedsubplans,
+							  bool *needs_init_pruning,
+							  bool *needs_exec_pruning)
 {
 	RelOptInfo *targetpart = NULL;
 	List	   *pinfolist = NIL;
@@ -458,6 +476,10 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	int			rti;
 	int			i;
 
+	/* Will find out below. */
+	*needs_init_pruning = false;
+	*needs_exec_pruning = false;
+
 	/*
 	 * Examine each partitioned rel, constructing a temporary array to map
 	 * from planner relids to index of the partitioned rel, and building a
@@ -545,6 +567,9 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		 * executor per-scan pruning steps.  This first pass creates startup
 		 * pruning steps and detects whether there's any possibly-useful quals
 		 * that would require per-scan pruning.
+		 *
+		 * The first pass also determines whether the 2nd pass is necessary
+		 * by checking for the presence of EXEC parameters.
 		 */
 		gen_partprune_steps(subpart, partprunequal, PARTTARGET_INITIAL,
 							&context);
@@ -619,6 +644,12 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		pinfo->execparamids = execparamids;
 		/* Remaining fields will be filled in the next loop */
 
+		/* record which types of pruning steps we've seen so far */
+		if (initial_pruning_steps != NIL)
+			*needs_init_pruning = true;
+		if (exec_pruning_steps != NIL)
+			*needs_exec_pruning = true;
+
 		pinfolist = lappend(pinfolist, pinfo);
 	}
 
@@ -646,6 +677,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		int		   *subplan_map;
 		int		   *subpart_map;
 		Oid		   *relid_map;
+		Index	   *rti_map;
 
 		/*
 		 * Construct the subplan and subpart maps for this partitioning level.
@@ -658,6 +690,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		subpart_map = (int *) palloc(nparts * sizeof(int));
 		memset(subpart_map, -1, nparts * sizeof(int));
 		relid_map = (Oid *) palloc0(nparts * sizeof(Oid));
+		rti_map = (Index *) palloc0(nparts * sizeof(Index));
 		present_parts = NULL;
 
 		i = -1;
@@ -672,6 +705,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 			subplan_map[i] = subplanidx = relid_subplan_map[partrel->relid] - 1;
 			subpart_map[i] = subpartidx = relid_subpart_map[partrel->relid] - 1;
 			relid_map[i] = planner_rt_fetch(partrel->relid, root)->relid;
+			rti_map[i] = partrel->relid;
 			if (subplanidx >= 0)
 			{
 				present_parts = bms_add_member(present_parts, i);
@@ -696,6 +730,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		pinfo->subplan_map = subplan_map;
 		pinfo->subpart_map = subpart_map;
 		pinfo->relid_map = relid_map;
+		pinfo->rti_map = rti_map;
 	}
 
 	pfree(relid_subpart_map);
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 27dee29f42..5a37c4160b 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1598,6 +1598,7 @@ exec_bind_message(StringInfo input_message)
 	int16	   *rformats = NULL;
 	CachedPlanSource *psrc;
 	CachedPlan *cplan;
+	List	   *part_prune_result_list;
 	Portal		portal;
 	char	   *query_string;
 	char	   *saved_stmt_name;
@@ -1972,7 +1973,9 @@ exec_bind_message(StringInfo input_message)
 	 * will be generated in MessageContext.  The plan refcount will be
 	 * assigned to the Portal, so it will be released at portal destruction.
 	 */
-	cplan = GetCachedPlan(psrc, params, NULL, NULL);
+	cplan = GetCachedPlan(psrc, params, NULL, NULL, &part_prune_result_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_result_list));
 
 	/*
 	 * Now we can define the portal.
@@ -1987,6 +1990,9 @@ exec_bind_message(StringInfo input_message)
 					  cplan->stmt_list,
 					  cplan);
 
+	/* Copy PartitionPruneResults into the portal's context. */
+	PortalStorePartitionPruneResults(portal, part_prune_result_list);
+
 	/* Done with the snapshot used for parameter I/O and parsing/planning */
 	if (snapshot_set)
 		PopActiveSnapshot();
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 5aa5a350f3..8cc2e2162d 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -35,7 +35,7 @@
 Portal		ActivePortal = NULL;
 
 
-static void ProcessQuery(PlannedStmt *plan,
+static void ProcessQuery(PlannedStmt *plan, PartitionPruneResult *part_prune_result,
 						 const char *sourceText,
 						 ParamListInfo params,
 						 QueryEnvironment *queryEnv,
@@ -65,6 +65,7 @@ static void DoPortalRewind(Portal portal);
  */
 QueryDesc *
 CreateQueryDesc(PlannedStmt *plannedstmt,
+				PartitionPruneResult *part_prune_result,
 				const char *sourceText,
 				Snapshot snapshot,
 				Snapshot crosscheck_snapshot,
@@ -77,6 +78,8 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
 
 	qd->operation = plannedstmt->commandType;	/* operation */
 	qd->plannedstmt = plannedstmt;	/* plan */
+	qd->part_prune_result = part_prune_result;	/* ExecutorDoInitialPruning()
+												 * output for plan */
 	qd->sourceText = sourceText;	/* query text */
 	qd->snapshot = RegisterSnapshot(snapshot);	/* snapshot */
 	/* RI check snapshot */
@@ -122,6 +125,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  *		PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
  *
  *	plan: the plan tree for the query
+ *	part_prune_result: ExecutorDoInitialPruning() output for the plan tree
  *	sourceText: the source text of the query
  *	params: any parameters needed
  *	dest: where to send results
@@ -134,6 +138,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  */
 static void
 ProcessQuery(PlannedStmt *plan,
+			 PartitionPruneResult *part_prune_result,
 			 const char *sourceText,
 			 ParamListInfo params,
 			 QueryEnvironment *queryEnv,
@@ -145,7 +150,7 @@ ProcessQuery(PlannedStmt *plan,
 	/*
 	 * Create the QueryDesc object
 	 */
-	queryDesc = CreateQueryDesc(plan, sourceText,
+	queryDesc = CreateQueryDesc(plan, part_prune_result, sourceText,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, params, queryEnv, 0);
 
@@ -491,8 +496,13 @@ PortalStart(Portal portal, ParamListInfo params,
 				/*
 				 * Create QueryDesc in portal's context; for the moment, set
 				 * the destination to DestNone.
+				 *
+				 * There is no PartitionPruneResult unless the PlannedStmt is
+				 * from a CachedPlan.
 				 */
 				queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+											portal->part_prune_results == NIL ? NULL :
+											linitial(portal->part_prune_results),
 											portal->sourceText,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
@@ -1225,6 +1235,8 @@ PortalRunMulti(Portal portal,
 
 		if (pstmt->utilityStmt == NULL)
 		{
+			PartitionPruneResult *part_prune_result = NULL;
+
 			/*
 			 * process a plannable query.
 			 */
@@ -1271,10 +1283,18 @@ PortalRunMulti(Portal portal,
 			else
 				UpdateActiveSnapshotCommandId();
 
+			/*
+			 * Determine if there's a corresponding PartitionPruneResult for
+			 * this PlannedStmt.
+			 */
+			if (portal->part_prune_results != NIL)
+				part_prune_result = list_nth(portal->part_prune_results,
+											 foreach_current_index(stmtlist_item));
+
 			if (pstmt->canSetTag)
 			{
 				/* statement can set tag string */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, part_prune_result,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
@@ -1283,7 +1303,7 @@ PortalRunMulti(Portal portal,
 			else
 			{
 				/* stmt added by rewrite cannot set tag */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, part_prune_result,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 0d6a295674..c8281e7201 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -99,14 +99,19 @@ static dlist_head cached_expression_list = DLIST_STATIC_INIT(cached_expression_l
 static void ReleaseGenericPlan(CachedPlanSource *plansource);
 static List *RevalidateCachedQuery(CachedPlanSource *plansource,
 								   QueryEnvironment *queryEnv);
-static bool CheckCachedPlan(CachedPlanSource *plansource);
+static bool CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+							List **part_prune_result_list);
 static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
-								   ParamListInfo boundParams, QueryEnvironment *queryEnv);
+								   ParamListInfo boundParams, QueryEnvironment *queryEnv,
+								   List **part_prune_result_list);
 static bool choose_custom_plan(CachedPlanSource *plansource,
 							   ParamListInfo boundParams);
 static double cached_plan_cost(CachedPlan *plan, bool include_planner);
 static Query *QueryListGetPrimaryStmt(List *stmts);
-static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static void AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+								 List **part_prune_result_list,
+								 List **lockedRelids_per_stmt);
+static void ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt);
 static void AcquirePlannerLocks(List *stmt_list, bool acquire);
 static void ScanQueryForLocks(Query *parsetree, bool acquire);
 static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -790,15 +795,20 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
  *
  * On a "true" return, we have acquired the locks needed to run the plan.
  * (We must do this for the "true" result to be race-condition-free.)
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_result_list.
  */
 static bool
-CheckCachedPlan(CachedPlanSource *plansource)
+CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+				List **part_prune_result_list)
 {
 	CachedPlan *plan = plansource->gplan;
 
 	/* Assert that caller checked the querytree */
 	Assert(plansource->is_valid);
 
+	*part_prune_result_list = NIL;
+
 	/* If there's no generic plan, just say "false" */
 	if (!plan)
 		return false;
@@ -820,13 +830,21 @@ CheckCachedPlan(CachedPlanSource *plansource)
 	 */
 	if (plan->is_valid)
 	{
+		List   *lockedRelids_per_stmt;
+
 		/*
 		 * Plan must have positive refcount because it is referenced by
 		 * plansource; so no need to fear it disappears under us here.
 		 */
 		Assert(plan->refcount > 0);
 
-		AcquireExecutorLocks(plan->stmt_list, true);
+		/*
+		 * Lock relations scanned by the plan.  This is where the pruning
+		 * happens if needed.
+		 */
+		AcquireExecutorLocks(plan->stmt_list, boundParams,
+							 part_prune_result_list,
+							 &lockedRelids_per_stmt);
 
 		/*
 		 * If plan was transient, check to see if TransactionXmin has
@@ -848,7 +866,14 @@ CheckCachedPlan(CachedPlanSource *plansource)
 		}
 
 		/* Oops, the race case happened.  Release useless locks. */
-		AcquireExecutorLocks(plan->stmt_list, false);
+		ReleaseExecutorLocks(plan->stmt_list, lockedRelids_per_stmt);
+
+		/*
+		 * The output list and its contents were allocated in the caller's
+		 * (hopefully short-lived) context, so they won't stay leaked for
+		 * long; still, reset the pointer so nobody looks at the stale list.
+		 */
+		*part_prune_result_list = NIL;
 	}
 
 	/*
@@ -874,10 +899,15 @@ CheckCachedPlan(CachedPlanSource *plansource)
  * Planning work is done in the caller's memory context.  The finished plan
  * is in a child memory context, which typically should get reparented
  * (unless this is a one-shot plan, in which case we don't copy the plan).
+ *
+ * A list of NULLs is returned in *part_prune_result_list, meaning that no
+ * PartitionPruneResult nodes have yet been created for the plans in
+ * stmt_list.
  */
 static CachedPlan *
 BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
-				ParamListInfo boundParams, QueryEnvironment *queryEnv)
+				ParamListInfo boundParams, QueryEnvironment *queryEnv,
+				List **part_prune_result_list)
 {
 	CachedPlan *plan;
 	List	   *plist;
@@ -1007,6 +1037,17 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 
 	MemoryContextSwitchTo(oldcxt);
 
+	/*
+	 * No actual PartitionPruneResults yet to add, though must initialize
+	 * the list to have the same number of elements as the list of
+	 * PlannedStmts.
+	 */
+	*part_prune_result_list = NIL;
+	foreach(lc, plist)
+	{
+		*part_prune_result_list = lappend(*part_prune_result_list, NULL);
+	}
+
 	return plan;
 }
 
@@ -1126,6 +1167,17 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
  * plan or a custom plan for the given parameters: the caller does not know
  * which it will get.
  *
+ * For every PlannedStmt found in the returned CachedPlan, an element that
+ * is either a PartitionPruneResult or a NULL is added to
+ * *part_prune_result_list if needed.  The former if the PlannedStmt is from
+ * the existing CachedPlan that is otherwise valid and contains at least one
+ * PartitionPruneInfo that has "initial" pruning steps.  Those steps are
+ * performed by calling ExecutorDoInitialPruning() to determine only those
+ * leaf partitions that need to be locked by AcquireExecutorLocks() by pruning
+ * away subplans that don't match the pruning conditions.  The
+ * PartitionPruneResult contains a list of bitmapsets of the indexes of
+ * matching subplans, one for each PartitionPruneInfo.
+ *
  * On return, the plan is valid and we have sufficient locks to begin
  * execution.
  *
@@ -1139,11 +1191,13 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
  */
 CachedPlan *
 GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
-			  ResourceOwner owner, QueryEnvironment *queryEnv)
+			  ResourceOwner owner, QueryEnvironment *queryEnv,
+			  List **part_prune_result_list)
 {
 	CachedPlan *plan = NULL;
 	List	   *qlist;
 	bool		customplan;
+	List	   *my_part_prune_result_list;
 
 	/* Assert caller is doing things in a sane order */
 	Assert(plansource->magic == CACHEDPLANSOURCE_MAGIC);
@@ -1160,7 +1214,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 
 	if (!customplan)
 	{
-		if (CheckCachedPlan(plansource))
+		if (CheckCachedPlan(plansource, boundParams,
+							&my_part_prune_result_list))
 		{
 			/* We want a generic plan, and we already have a valid one */
 			plan = plansource->gplan;
@@ -1169,7 +1224,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 		else
 		{
 			/* Build a new generic plan */
-			plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv);
+			plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv,
+								   &my_part_prune_result_list);
 			/* Just make real sure plansource->gplan is clear */
 			ReleaseGenericPlan(plansource);
 			/* Link the new generic plan into the plansource */
@@ -1214,7 +1270,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 	if (customplan)
 	{
 		/* Build a custom plan */
-		plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv);
+		plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv,
+							   &my_part_prune_result_list);
 		/* Accumulate total costs of custom plans */
 		plansource->total_custom_cost += cached_plan_cost(plan, true);
 
@@ -1246,6 +1303,9 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 		plan->is_saved = true;
 	}
 
+	if (part_prune_result_list)
+		*part_prune_result_list = my_part_prune_result_list;
+
 	return plan;
 }
 
@@ -1737,17 +1797,29 @@ QueryListGetPrimaryStmt(List *stmts)
 
 /*
  * AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
- * or release them if acquire is false.
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_result_list.
+ *
+ * On return, *lockedRelids_per_stmt will contain a bitmapset for every
+ * PlannedStmt in stmt_list, containing the RT indexes of relation entries
+ * in its range table that were actually locked, or NULL if the PlannedStmt
+ * contains a utility statement.
  */
 static void
-AcquireExecutorLocks(List *stmt_list, bool acquire)
+AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+					 List **part_prune_result_list,
+					 List **lockedRelids_per_stmt)
 {
 	ListCell   *lc1;
 
+	*part_prune_result_list = *lockedRelids_per_stmt = NIL;
 	foreach(lc1, stmt_list)
 	{
 		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
-		ListCell   *lc2;
+		PartitionPruneResult *part_prune_result = NULL;
+		Bitmapset  *allLockRelids;
+		Bitmapset  *lockedRelids = NULL;
+		int			rti;
 
 		if (plannedstmt->commandType == CMD_UTILITY)
 		{
@@ -1761,13 +1833,37 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 			Query	   *query = UtilityContainsQuery(plannedstmt->utilityStmt);
 
 			if (query)
-				ScanQueryForLocks(query, acquire);
+				ScanQueryForLocks(query, true);
+			*part_prune_result_list = lappend(*part_prune_result_list, NULL);
 			continue;
 		}
 
-		foreach(lc2, plannedstmt->rtable)
+		/*
+		 * Figure out the set of relations that would need to be locked
+		 * before executing the plan.
+		 */
+		if (plannedstmt->containsInitialPruning)
 		{
-			RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
+			/*
+			 * Obtain the set of leaf partitions to be locked.
+			 *
+			 * The following does initial partition pruning using the
+			 * PartitionPruneInfos found in plannedstmt->partPruneInfos and
+			 * finds leaf partitions that survive that pruning across all the
+			 * nodes in the plan tree.
+			 */
+			part_prune_result = ExecutorDoInitialPruning(plannedstmt,
+														 boundParams);
+			allLockRelids = bms_union(plannedstmt->minLockRelids,
+									  part_prune_result->scan_leafpart_rtis);
+		}
+		else
+			allLockRelids = plannedstmt->minLockRelids;
+
+		rti = -1;
+		while ((rti = bms_next_member(allLockRelids, rti)) > 0)
+		{
+			RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
 
 			if (rte->rtekind != RTE_RELATION)
 				continue;
@@ -1778,10 +1874,59 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 			 * fail if it's been dropped entirely --- we'll just transiently
 			 * acquire a non-conflicting lock.
 			 */
-			if (acquire)
-				LockRelationOid(rte->relid, rte->rellockmode);
-			else
-				UnlockRelationOid(rte->relid, rte->rellockmode);
+			LockRelationOid(rte->relid, rte->rellockmode);
+			lockedRelids = bms_add_member(lockedRelids, rti);
+		}
+
+		*part_prune_result_list = lappend(*part_prune_result_list,
+										  part_prune_result);
+		*lockedRelids_per_stmt = lappend(*lockedRelids_per_stmt, lockedRelids);
+	}
+}
+
+/*
+ * ReleaseExecutorLocks
+ * 		Release locks that would've been acquired by an earlier call to
+ * 		AcquireExecutorLocks()
+ */
+static void
+ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt)
+{
+	ListCell   *lc1,
+			   *lc2;
+
+	forboth(lc1, stmt_list, lc2, lockedRelids_per_stmt)
+	{
+		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
+		Bitmapset  *lockedRelids = lfirst(lc2);
+		int			rti;
+
+		if (plannedstmt->commandType == CMD_UTILITY)
+		{
+			/*
+			 * Ignore utility statements, except those (such as EXPLAIN) that
+			 * contain a parsed-but-not-planned query.  Note: it's okay to use
+			 * ScanQueryForLocks, even though the query hasn't been through
+			 * rule rewriting, because rewriting doesn't change the query
+			 * representation.
+			 */
+			Query	   *query = UtilityContainsQuery(plannedstmt->utilityStmt);
+
+			Assert(lockedRelids == NULL);
+			if (query)
+				ScanQueryForLocks(query, false);
+			continue;
+		}
+
+		rti = -1;
+		while ((rti = bms_next_member(lockedRelids, rti)) >= 0)
+		{
+			RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
+
+			Assert(rte->rtekind == RTE_RELATION);
+
+			/* See the comment in AcquireExecutorLocks(). */
+			UnlockRelationOid(rte->relid, rte->rellockmode);
 		}
 	}
 }
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index 3a161bdb88..27407a7f0f 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -303,6 +303,25 @@ PortalDefineQuery(Portal portal,
 	portal->status = PORTAL_DEFINED;
 }
 
+/*
+ * PortalStorePartitionPruneResults
+ *		Copy the given list of PartitionPruneResults into the portal's
+ *		context
+ *
+ * This allows the caller to ensure that the list exists as long as the portal
+ * does.
+ */
+void
+PortalStorePartitionPruneResults(Portal portal, List *part_prune_results)
+{
+	MemoryContext	oldcxt;
+
+	AssertArg(PortalIsValid(portal));
+	oldcxt = MemoryContextSwitchTo(portal->portalContext);
+	portal->part_prune_results = copyObject(part_prune_results);
+	MemoryContextSwitchTo(oldcxt);
+}
+
 /*
  * PortalReleaseCachedPlan
  *		Release a portal's reference to its cached plan, if any.
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 9ebde089ae..e57e133f0e 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -87,7 +87,9 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
 							  ExplainState *es, const char *queryString,
 							  ParamListInfo params, QueryEnvironment *queryEnv);
 
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt,
+						   PartitionPruneResult *part_prune_result,
+						   IntoClause *into,
 						   ExplainState *es, const char *queryString,
 						   ParamListInfo params, QueryEnvironment *queryEnv,
 						   const instr_time *planduration,
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index bf962af7af..bd8776402e 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -45,6 +45,7 @@ extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
  * nparts						Length of subplan_map[] and subpart_map[].
  * subplan_map					Subplan index by partition index, or -1.
  * subpart_map					Subpart index by partition index, or -1.
+ * rti_map						Range table index by partition index, or 0.
  * present_parts				A Bitmapset of the partition indexes that we
  *								have subplans or subparts for.
  * initial_pruning_steps		List of PartitionPruneSteps used to
@@ -61,6 +62,7 @@ typedef struct PartitionedRelPruningData
 	int			nparts;
 	int		   *subplan_map;
 	int		   *subpart_map;
+	Index	   *rti_map;
 	Bitmapset  *present_parts;
 	List	   *initial_pruning_steps;
 	List	   *exec_pruning_steps;
@@ -126,5 +128,10 @@ extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
 													 int part_prune_index,
 													 Bitmapset **initially_valid_subplans);
 extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
-										   bool initial_prune);
+										   bool initial_prune,
+										   Bitmapset **scan_leafpart_rtis);
+extern Bitmapset *ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt,
+								ParamListInfo params,
+								PartitionPruneInfo *pruneinfo,
+								Bitmapset **scan_leafpart_rtis);
 #endif							/* EXECPARTITION_H */
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index e79e2c001f..60d5644908 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -35,6 +35,8 @@ typedef struct QueryDesc
 	/* These fields are provided by CreateQueryDesc */
 	CmdType		operation;		/* CMD_SELECT, CMD_UPDATE, etc. */
 	PlannedStmt *plannedstmt;	/* planner's output (could be utility, too) */
+	PartitionPruneResult *part_prune_result; /* ExecutorDoInitialPruning()'s
+											  * output for plannedstmt */
 	const char *sourceText;		/* source text of the query */
 	Snapshot	snapshot;		/* snapshot to use for query */
 	Snapshot	crosscheck_snapshot;	/* crosscheck for RI update/delete */
@@ -57,6 +59,7 @@ typedef struct QueryDesc
 
 /* in pquery.c */
 extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
+								  PartitionPruneResult *part_prune_result,
 								  const char *sourceText,
 								  Snapshot snapshot,
 								  Snapshot crosscheck_snapshot,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index ed95ed1176..6ae897d5d1 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -185,6 +185,8 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
 /*
  * prototypes from functions in execMain.c
  */
+extern PartitionPruneResult *ExecutorDoInitialPruning(PlannedStmt *plannedstmt,
+													  ParamListInfo params);
 extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
 extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
 extern void ExecutorRun(QueryDesc *queryDesc,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 4a741b053f..63a89474db 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -612,6 +612,7 @@ typedef struct EState
 										 * ExecRowMarks, or NULL if none */
 	PlannedStmt *es_plannedstmt;	/* link to top of plan tree */
 	List		*es_part_prune_infos;	/* PlannedStmt.partPruneInfos */
+	struct PartitionPruneResult *es_part_prune_result; /* QueryDesc.part_prune_result */
 	const char *es_sourceText;	/* Source text from QueryDesc */
 
 	JunkFilter *es_junkFilter;	/* top-level junk filter, if any */
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index a80f43e540..937cc4629d 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -212,6 +212,7 @@ extern struct Bitmapset *readBitmapset(void);
 extern uintptr_t readDatum(bool typbyval);
 extern bool *readBoolCols(int numCols);
 extern int *readIntCols(int numCols);
+extern Index *readIndexCols(int numCols);
 extern Oid *readOidCols(int numCols);
 extern int16 *readAttrNumberCols(int numCols);
 
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index e392fb6fc0..494ae461be 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -125,6 +125,18 @@ typedef struct PlannerGlobal
 	/* List of PartitionPruneInfo contained in the plan */
 	List	   *partPruneInfos;
 
+	/*
+	 * Do any of those PartitionPruneInfos have initial pruning steps in them?
+	 */
+	bool		containsInitialPruning;
+
+	/*
+	 * Indexes of all range table entries minus indexes of range table entries
+	 * of the leaf partitions scanned by prunable subplans; see
+	 * AcquireExecutorLocks()
+	 */
+	Bitmapset  *minLockRelids;
+
 	/* OIDs of relations the plan depends on */
 	List	   *relationOids;
 
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 3eb3e6e527..a1e06719e6 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -73,8 +73,17 @@ typedef struct PlannedStmt
 	List	   *partPruneInfos;	/* List of PartitionPruneInfo contained in
 								 * the plan */
 
+	bool		containsInitialPruning;	/* Do any of those PartitionPruneInfos
+										 * have initial pruning steps in them?
+										 */
+
 	List	   *rtable;			/* list of RangeTblEntry nodes */
 
+	Bitmapset  *minLockRelids;	/* Indexes of all range table entries minus
+								 * indexes of range table entries of the leaf
+								 * partitions scanned by prunable subplans;
+								 * see AcquireExecutorLocks() */
+
 	/* rtable indexes of target relations for INSERT/UPDATE/DELETE */
 	List	   *resultRelations;	/* integer list of RT indexes, or NIL */
 
@@ -1410,6 +1419,13 @@ typedef struct PlanRowMark
  * prune_infos			List of Lists containing PartitionedRelPruneInfo nodes,
  *						one sublist per run-time-prunable partition hierarchy
  *						appearing in the parent plan node's subplans.
+ *
+ * needs_init_pruning	Do any of the PartitionedRelPruneInfos in
+ *						prune_infos have initial_pruning_steps set?
+ *
+ * needs_exec_pruning	Do any of the PartitionedRelPruneInfos in
+ *						prune_infos have exec_pruning_steps set?
+ *
  * other_subplans		Indexes of any subplans that are not accounted for
  *						by any of the PartitionedRelPruneInfo nodes in
  *						"prune_infos".  These subplans must not be pruned.
@@ -1420,6 +1436,8 @@ typedef struct PartitionPruneInfo
 
 	NodeTag		type;
 	List	   *prune_infos;
+	bool		needs_init_pruning;
+	bool		needs_exec_pruning;
 	Bitmapset  *other_subplans;
 } PartitionPruneInfo;
 
@@ -1464,6 +1482,9 @@ typedef struct PartitionedRelPruneInfo
 	/* relation OID by partition index, or 0 */
 	Oid		   *relid_map pg_node_attr(array_size(nparts));
 
+	/* Range table index by partition index, or 0. */
+	Index	   *rti_map pg_node_attr(array_size(nparts));
+
 	/*
 	 * initial_pruning_steps shows how to prune during executor startup (i.e.,
 	 * without use of any PARAM_EXEC Params); it is NIL if no startup pruning
@@ -1548,6 +1569,32 @@ typedef struct PartitionPruneStepCombine
 	List	   *source_stepids;
 } PartitionPruneStepCombine;
 
+/*----------------
+ * PartitionPruneResult
+ *
+ * The result of an ExecutorDoInitialPruning() invocation on a given
+ * PlannedStmt.
+ *
+ * Contains a list of Bitmapsets of the indexes of the subplans remaining
+ * after performing initial pruning by calling ExecFindMatchingSubPlans() for
+ * every PartitionPruneInfo found in PlannedStmt.partPruneInfos.  RT indexes
+ * of the leaf partitions scanned by those subplans across all
+ * PartitionPruneInfos are added into scan_leafpart_rtis.
+ *
+ * This is used by GetCachedPlan() to inform its callers of the pruning
+ * decisions made when performing AcquireExecutorLocks() on a given cached
+ * PlannedStmt, which the callers then pass on to the executor.  The executor
+ * refers to this node when initializing the plan nodes which contain subplans
+ * that may have been pruned by ExecutorDoInitialPruning(), rather than
+ * redoing initial pruning.
+ */
+typedef struct PartitionPruneResult
+{
+	NodeTag		type;
+
+	List		   *valid_subplan_offs_list;
+	Bitmapset	   *scan_leafpart_rtis;
+} PartitionPruneResult;
 
 /*
  * Plan invalidation info
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 0499635f59..1c5bb5ece1 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -220,7 +220,8 @@ extern List *CachedPlanGetTargetList(CachedPlanSource *plansource,
 extern CachedPlan *GetCachedPlan(CachedPlanSource *plansource,
 								 ParamListInfo boundParams,
 								 ResourceOwner owner,
-								 QueryEnvironment *queryEnv);
+								 QueryEnvironment *queryEnv,
+								 List **part_prune_result_list);
 extern void ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner);
 
 extern bool CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index aeddbdafe5..9f7727a837 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -138,6 +138,7 @@ typedef struct PortalData
 	QueryCompletion qc;			/* command completion data for executed query */
 	List	   *stmts;			/* list of PlannedStmts */
 	CachedPlan *cplan;			/* CachedPlan, if stmts are from one */
+	List	   *part_prune_results;	/* list of PartitionPruneResults */
 
 	ParamListInfo portalParams; /* params to pass to query */
 	QueryEnvironment *queryEnv; /* environment for query */
@@ -242,6 +243,8 @@ extern void PortalDefineQuery(Portal portal,
 							  CommandTag commandTag,
 							  List *stmts,
 							  CachedPlan *cplan);
+extern void PortalStorePartitionPruneResults(Portal portal,
+											 List *part_prune_result_list);
 extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
 extern void PortalCreateHoldStore(Portal portal);
 extern void PortalHashTableDeleteAll(void);
-- 
2.35.3
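
The locking change in the patch above replaces the "acquire" flag with a
record of what was actually locked, so that the release step can undo
exactly that set.  A minimal standalone sketch of that pattern follows.
Everything in it is a hypothetical stand-in and not PostgreSQL API: the
lock_count[] table, lock_rel()/unlock_rel(), the uint64_t mask used in
place of a Bitmapset, and the 63-entry limit are all assumptions made
for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_RTI 63				/* hypothetical: RT indexes 1..63 fit in one word */

/* Hypothetical lock table: lock_count[rti] is how often rti is locked. */
static int	lock_count[MAX_RTI + 1];

static void
lock_rel(int rti)
{
	lock_count[rti]++;
}

static void
unlock_rel(int rti)
{
	assert(lock_count[rti] > 0);
	lock_count[rti]--;
}

/*
 * Lock every relation in (min_lock | surviving_leaf) and return the set
 * that was actually locked.  This mirrors the union of minLockRelids and
 * scan_leafpart_rtis in the patch.  Bit i set means RT index i.
 */
static uint64_t
acquire_locks(uint64_t min_lock, uint64_t surviving_leaf)
{
	uint64_t	all = min_lock | surviving_leaf;
	uint64_t	locked = 0;

	for (int rti = 1; rti <= MAX_RTI; rti++)
	{
		if (all & (UINT64_C(1) << rti))
		{
			lock_rel(rti);
			locked |= UINT64_C(1) << rti;
		}
	}
	return locked;
}

/* Release exactly the locks recorded by an earlier acquire_locks() call. */
static void
release_locks(uint64_t locked)
{
	for (int rti = 1; rti <= MAX_RTI; rti++)
	{
		if (locked & (UINT64_C(1) << rti))
			unlock_rel(rti);
	}
}
```

The point of returning the locked set is that the release path never has
to repeat the pruning to find out which partitions were skipped.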



  [application/octet-stream] v21-0002-Allow-adding-Bitmapsets-as-Nodes-into-plan-trees.patch (5.5K, 4-v21-0002-Allow-adding-Bitmapsets-as-Nodes-into-plan-trees.patch)
From 41465f94e426a0b22b070ab8034de19cfdb6daa4 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Thu, 6 Oct 2022 17:31:37 +0900
Subject: [PATCH v21 2/3] Allow adding Bitmapsets as Nodes into plan trees

Note that this only adds some infrastructure bits and none of the
existing bitmapsets that are added to plan trees have been changed
to instead add the Node version.  So, the plan trees, or really the
bitmapsets contained in them, look the same as before as far as
Node write/read functionality is concerned.

This is needed, because it is not currently possible to write and
then read back Bitmapsets that are not direct members of write/read
capable Nodes; for example, if one needs to add a List of Bitmapsets
to a plan tree.  The most straightforward way to do that is to make
Bitmapsets be written with outNode() and read with nodeRead().
---
 src/backend/nodes/Makefile             |  3 ++-
 src/backend/nodes/copyfuncs.c          | 11 +++++++++++
 src/backend/nodes/equalfuncs.c         |  6 ++++++
 src/backend/nodes/gen_node_support.pl  |  1 +
 src/backend/nodes/outfuncs.c           | 11 +++++++++++
 src/backend/nodes/readfuncs.c          |  4 ++++
 src/backend/optimizer/prep/preptlist.c |  1 -
 src/include/nodes/bitmapset.h          |  5 +++++
 src/include/nodes/meson.build          |  1 +
 9 files changed, 41 insertions(+), 2 deletions(-)

diff --git a/src/backend/nodes/Makefile b/src/backend/nodes/Makefile
index 7450e191ee..da5307771b 100644
--- a/src/backend/nodes/Makefile
+++ b/src/backend/nodes/Makefile
@@ -57,7 +57,8 @@ node_headers = \
 	nodes/replnodes.h \
 	nodes/supportnodes.h \
 	nodes/value.h \
-	utils/rel.h
+	utils/rel.h \
+	nodes/bitmapset.h
 
 # see also catalog/Makefile for an explanation of these make rules
 
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index e76fda8eba..1482019327 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -160,6 +160,17 @@ _copyExtensibleNode(const ExtensibleNode *from)
 	return newnode;
 }
 
+/* Custom copy routine for Node bitmapsets */
+static Bitmapset *
+_copyBitmapset(const Bitmapset *from)
+{
+	Bitmapset *newnode = bms_copy(from);
+
+	newnode->type = T_Bitmapset;
+
+	return newnode;
+}
+
 
 /*
  * copyObjectImpl -- implementation of copyObject(); see nodes/nodes.h
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 0373aa30fe..e8706c461a 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -210,6 +210,12 @@ _equalList(const List *a, const List *b)
 	return true;
 }
 
+/* Custom equal routine for Node bitmapsets */
+static bool
+_equalBitmapset(const Bitmapset *a, const Bitmapset *b)
+{
+	return bms_equal(a, b);
+}
 
 /*
  * equal
diff --git a/src/backend/nodes/gen_node_support.pl b/src/backend/nodes/gen_node_support.pl
index 81b8c184a9..ccb5aff874 100644
--- a/src/backend/nodes/gen_node_support.pl
+++ b/src/backend/nodes/gen_node_support.pl
@@ -71,6 +71,7 @@ my @all_input_files = qw(
   nodes/supportnodes.h
   nodes/value.h
   utils/rel.h
+  nodes/bitmapset.h
 );
 
 # Nodes from these input files are automatically treated as nodetag_only.
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 64c65f060b..b3ffd8cec2 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -328,6 +328,17 @@ outBitmapset(StringInfo str, const Bitmapset *bms)
 	appendStringInfoChar(str, ')');
 }
 
+/* Custom write routine for Node bitmapsets */
+static void
+_outBitmapset(StringInfo str, const Bitmapset *bms)
+{
+	Assert(IsA(bms, Bitmapset));
+	WRITE_NODE_TYPE("BITMAPSET");
+
+	outBitmapset(str, bms);
+}
+
+
 /*
  * Print the value of a Datum given its type.
  */
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index b4ff855f7c..4d6902d3ac 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -230,6 +230,10 @@ _readBitmapset(void)
 		result = bms_add_member(result, val);
 	}
 
+	/* XXX maybe do `result = makeNode(Bitmapset);` at the top? */
+	if (result)
+		result->type = T_Bitmapset;
+
 	return result;
 }
 
diff --git a/src/backend/optimizer/prep/preptlist.c b/src/backend/optimizer/prep/preptlist.c
index 137b28323d..e5c1103316 100644
--- a/src/backend/optimizer/prep/preptlist.c
+++ b/src/backend/optimizer/prep/preptlist.c
@@ -337,7 +337,6 @@ extract_update_targetlist_colnos(List *tlist)
 	return update_colnos;
 }
 
-
 /*****************************************************************************
  *
  *		TARGETLIST EXPANSION
diff --git a/src/include/nodes/bitmapset.h b/src/include/nodes/bitmapset.h
index 75b5ce1a8e..9046ca177f 100644
--- a/src/include/nodes/bitmapset.h
+++ b/src/include/nodes/bitmapset.h
@@ -20,6 +20,8 @@
 #ifndef BITMAPSET_H
 #define BITMAPSET_H
 
+#include "nodes/nodes.h"
+
 /*
  * Forward decl to save including pg_list.h
  */
@@ -48,6 +50,9 @@ typedef int32 signedbitmapword; /* must be the matching signed type */
 
 typedef struct Bitmapset
 {
+	pg_node_attr(custom_copy_equal, custom_read_write)
+
+	NodeTag		type;
 	int			nwords;			/* number of words in array */
 	bitmapword	words[FLEXIBLE_ARRAY_MEMBER];	/* really [nwords] */
 } Bitmapset;
diff --git a/src/include/nodes/meson.build b/src/include/nodes/meson.build
index b7df232081..94701af8e1 100644
--- a/src/include/nodes/meson.build
+++ b/src/include/nodes/meson.build
@@ -19,6 +19,7 @@ node_support_input_i = [
   'nodes/supportnodes.h',
   'nodes/value.h',
   'utils/rel.h',
+  'nodes/bitmapset.h',
 ]
 
 node_support_input = []
-- 
2.35.3
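
The idea in the patch above, carrying a node tag inside the set so that
generic copy and equality code can dispatch on it, can be illustrated
with a small standalone sketch.  The ToyTag/ToySet types and the toy_*
functions are hypothetical stand-ins, not the real Bitmapset or
copyObject() machinery.  As in the patch's _copyBitmapset(), the copy
routine stamps the tag on its result; unlike bms_equal(), the toy
equality check assumes non-NULL inputs and does not treat sets of
different word counts as possibly equal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

typedef enum ToyTag
{
	TOY_INVALID = 0,			/* plain set, not part of a node tree */
	TOY_SET = 1					/* set that generic node code may handle */
} ToyTag;

typedef struct ToySet
{
	ToyTag		type;
	int			nwords;			/* number of words in array */
	unsigned	words[];		/* really [nwords] */
} ToySet;

/* Allocate an empty, untagged set with room for nwords words. */
static ToySet *
toy_make(int nwords)
{
	ToySet	   *s = calloc(1, sizeof(ToySet) + (size_t) nwords * sizeof(unsigned));

	if (s == NULL)
		abort();
	s->nwords = nwords;
	return s;
}

/* Copy a set and mark the copy as a node. */
static ToySet *
toy_copy_as_node(const ToySet *from)
{
	size_t		size = sizeof(ToySet) + (size_t) from->nwords * sizeof(unsigned);
	ToySet	   *s = malloc(size);

	if (s == NULL)
		abort();
	memcpy(s, from, size);
	s->type = TOY_SET;
	return s;
}

/* Compare membership only; the tag does not take part in equality. */
static bool
toy_equal(const ToySet *a, const ToySet *b)
{
	if (a->nwords != b->nwords)
		return false;
	return memcmp(a->words, b->words,
				  (size_t) a->nwords * sizeof(unsigned)) == 0;
}
```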



^ permalink  raw  reply  [nested|flat] 108+ messages in thread

* Re: generic plans and "initial" pruning
@ 2022-10-17 09:29  Amit Langote <[email protected]>
  parent: Amit Langote <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: Amit Langote @ 2022-10-17 09:29 UTC (permalink / raw)
  To: Robert Haas <[email protected]>; +Cc: Jacob Champion <[email protected]>; Zhihong Yu <[email protected]>; David Rowley <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers

On Wed, Oct 12, 2022 at 4:36 PM Amit Langote <[email protected]> wrote:
> On Fri, Jul 29, 2022 at 1:20 PM Amit Langote <[email protected]> wrote:
> > On Thu, Jul 28, 2022 at 1:27 AM Robert Haas <[email protected]> wrote:
> > > 0001 adds es_part_prune_result but does not use it, so maybe the
> > > introduction of that field should be deferred until it's needed for
> > > something.
> >
> > Oops, looks like a mistake when breaking the patch.  Will move that bit to 0002.
>
> Fixed that and also noticed that I had defined PartitionPruneResult in
> the wrong header (execnodes.h).  That led to PartitionPruneResult
> nodes not being able to be written and read, because
> src/backend/nodes/gen_node_support.pl doesn't create _out* and _read*
> routines for the nodes defined in execnodes.h.  I moved its definition
> to plannodes.h, even though it is not actually the planner that
> instantiates those; no other include/nodes header sounds better.
>
> One more thing I realized is that Bitmapsets added to the List
> PartitionPruneResult.valid_subplan_offs_list are not actually
> read/write-able.  That's a problem that I also faced in [1], so I
> proposed a patch there to make Bitmapset a read/write-able Node and
> mark (only) the Bitmapsets that are added into read/write-able node
> trees with the corresponding NodeTag.  I'm including that patch here
> as well (0002) for the main patch to work (pass
> -DWRITE_READ_PARSE_PLAN_TREES build tests), though it might make sense
> to discuss it in its own thread?

I had second thoughts about using a List of Bitmapsets for this; with
the change described below, the make-Bitmapset-Nodes patch is no
longer needed.

I had defined PartitionPruneResult such that it stood for the results
of pruning for all PartitionPruneInfos contained in
PlannedStmt.partPruneInfos (covering all Append/MergeAppend nodes that
can use partition pruning in a given plan).  So, it had a List of
Bitmapsets.  I think it's better for PartitionPruneResult to cover
only one PartitionPruneInfo and thus need only a Bitmapset and not a
List thereof, which I have implemented in the attached updated patch
0002.  So, instead of passing around a single PartitionPruneResult
with each PlannedStmt, this now passes a List of PartitionPruneResults,
with one entry for each PartitionPruneInfo in
PlannedStmt.partPruneInfos.
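
To make the new shape concrete, here is a minimal sketch of how a plan
node could look up its own pruning result through its index.  It uses
simplified stand-in types, not the real PostgreSQL definitions:
SubplanSet (a 64-bit mask in place of Bitmapset), SketchPruneResult,
SketchAppend and the sketch_* functions are hypothetical names that do
not appear in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for Bitmapset: bit i set means subplan i survived pruning. */
typedef uint64_t SubplanSet;

/* Hypothetical stand-in: one result per PartitionPruneInfo. */
typedef struct SketchPruneResult
{
	SubplanSet	valid_subplan_offs;
} SketchPruneResult;

/* Hypothetical stand-in for an Append/MergeAppend plan node. */
typedef struct SketchAppend
{
	int			nsubplans;
	int			part_prune_index;	/* -1 if no run-time pruning */
} SketchAppend;

/*
 * Return the set of subplans to initialize for 'node'.  'results' is the
 * per-statement array that parallels the list of prune infos.  When it is
 * NULL, or the node has no prune info, the sketch simply keeps every
 * subplan (an assumption made for brevity; the real executor would run
 * initial pruning itself in the former case).
 */
static SubplanSet
sketch_valid_subplans(const SketchAppend *node,
					  const SketchPruneResult *results, int nresults)
{
	SubplanSet	all;

	assert(node->nsubplans > 0 && node->nsubplans < 64);
	all = (UINT64_C(1) << node->nsubplans) - 1;

	if (node->part_prune_index < 0 || results == NULL)
		return all;

	assert(node->part_prune_index < nresults);
	return results[node->part_prune_index].valid_subplan_offs & all;
}

/* Test whether the subplan at offset 'off' is a member of 'set'. */
static bool
sketch_subplan_is_valid(SubplanSet set, int off)
{
	return (set & (UINT64_C(1) << off)) != 0;
}
```

The point of the one-result-per-prune-info layout is that each result
holds a single set, so a node only needs its index to find it and no
nested List of sets has to be serialized.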

-- 
Thanks, Amit Langote
EDB: http://www.enterprisedb.com


Attachments:

  [application/octet-stream] v22-0001-Move-PartitioPruneInfo-out-of-plan-nodes-into-Pl.patch (17.2K, 2-v22-0001-Move-PartitioPruneInfo-out-of-plan-nodes-into-Pl.patch)
  download | inline diff:
From 27db8ab066dace77953d71a6446788190b66ce60 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Fri, 27 May 2022 16:00:28 +0900
Subject: [PATCH v22 1/2] Move PartitionPruneInfo out of plan nodes into
 PlannedStmt

The planner will now add a given PartitionPruneInfo to
PlannedStmt.partPruneInfos instead of directly to the
Append/MergeAppend plan node.  What gets set instead in the
latter is an index field which points to the list element
of PlannedStmt.partPruneInfos containing the PartitionPruneInfo
belonging to the plan node.

A later commit will make AcquireExecutorLocks() do the initial
partition pruning to determine a minimal set of partitions to be
locked when validating a plan tree, and it will need to consult the
PartitionPruneInfos referenced therein to do so.  It is better for
the PartitionPruneInfos to be accessible directly rather than
requiring a walk of the plan tree to find them; with this change,
that can be done by simply iterating over PlannedStmt.partPruneInfos.
---
 src/backend/executor/execMain.c         |  1 +
 src/backend/executor/execParallel.c     |  1 +
 src/backend/executor/execPartition.c    |  4 +-
 src/backend/executor/execUtils.c        |  1 +
 src/backend/executor/nodeAppend.c       |  4 +-
 src/backend/executor/nodeMergeAppend.c  |  4 +-
 src/backend/optimizer/plan/createplan.c | 24 ++++-----
 src/backend/optimizer/plan/planner.c    |  1 +
 src/backend/optimizer/plan/setrefs.c    | 65 +++++++++++++------------
 src/backend/partitioning/partprune.c    | 18 ++++---
 src/include/executor/execPartition.h    |  3 +-
 src/include/nodes/execnodes.h           |  1 +
 src/include/nodes/pathnodes.h           |  6 +++
 src/include/nodes/plannodes.h           | 11 +++--
 src/include/partitioning/partprune.h    |  8 +--
 15 files changed, 90 insertions(+), 62 deletions(-)

diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index d78862e660..32475e33ff 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -825,6 +825,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 	ExecInitRangeTable(estate, rangeTable);
 
 	estate->es_plannedstmt = plannedstmt;
+	estate->es_part_prune_infos = plannedstmt->partPruneInfos;
 
 	/*
 	 * Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index 99512826c5..aca0c6f323 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -183,6 +183,7 @@ ExecSerializePlan(Plan *plan, EState *estate)
 	pstmt->dependsOnRole = false;
 	pstmt->parallelModeNeeded = false;
 	pstmt->planTree = plan;
+	pstmt->partPruneInfos = estate->es_part_prune_infos;
 	pstmt->rtable = estate->es_range_table;
 	pstmt->resultRelations = NIL;
 	pstmt->appendRelations = NIL;
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 40e3c07693..80197d5141 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -1791,11 +1791,13 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
 PartitionPruneState *
 ExecInitPartitionPruning(PlanState *planstate,
 						 int n_total_subplans,
-						 PartitionPruneInfo *pruneinfo,
+						 int part_prune_index,
 						 Bitmapset **initially_valid_subplans)
 {
 	PartitionPruneState *prunestate;
 	EState	   *estate = planstate->state;
+	PartitionPruneInfo *pruneinfo = list_nth(estate->es_part_prune_infos,
+											 part_prune_index);
 
 	/* We may need an expression context to evaluate partition exprs */
 	ExecAssignExprContext(estate, planstate);
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 9df1f81ea8..21f4c10937 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -119,6 +119,7 @@ CreateExecutorState(void)
 	estate->es_relations = NULL;
 	estate->es_rowmarks = NULL;
 	estate->es_plannedstmt = NULL;
+	estate->es_part_prune_infos = NIL;
 
 	estate->es_junkFilter = NULL;
 
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 357e10a1d7..c6f86a6510 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -134,7 +134,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 	appendstate->as_begun = false;
 
 	/* If run-time partition pruning is enabled, then set that up now */
-	if (node->part_prune_info != NULL)
+	if (node->part_prune_index >= 0)
 	{
 		PartitionPruneState *prunestate;
 
@@ -145,7 +145,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 		 */
 		prunestate = ExecInitPartitionPruning(&appendstate->ps,
 											  list_length(node->appendplans),
-											  node->part_prune_info,
+											  node->part_prune_index,
 											  &validsubplans);
 		appendstate->as_prune_state = prunestate;
 		nplans = bms_num_members(validsubplans);
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index c5c62fa5c7..8d35860c30 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -82,7 +82,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
 	mergestate->ps.ExecProcNode = ExecMergeAppend;
 
 	/* If run-time partition pruning is enabled, then set that up now */
-	if (node->part_prune_info != NULL)
+	if (node->part_prune_index >= 0)
 	{
 		PartitionPruneState *prunestate;
 
@@ -93,7 +93,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
 		 */
 		prunestate = ExecInitPartitionPruning(&mergestate->ps,
 											  list_length(node->mergeplans),
-											  node->part_prune_info,
+											  node->part_prune_index,
 											  &validsubplans);
 		mergestate->ms_prune_state = prunestate;
 		nplans = bms_num_members(validsubplans);
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index ac86ce9003..50a5719ac6 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -1203,7 +1203,6 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
 	ListCell   *subpaths;
 	int			nasyncplans = 0;
 	RelOptInfo *rel = best_path->path.parent;
-	PartitionPruneInfo *partpruneinfo = NULL;
 	int			nodenumsortkeys = 0;
 	AttrNumber *nodeSortColIdx = NULL;
 	Oid		   *nodeSortOperators = NULL;
@@ -1354,6 +1353,9 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
 		subplans = lappend(subplans, subplan);
 	}
 
+	/* Set below if we find quals that we can use to run-time prune */
+	plan->part_prune_index = -1;
+
 	/*
 	 * If any quals exist, they may be useful to perform further partition
 	 * pruning during execution.  Gather information needed by the executor to
@@ -1377,16 +1379,14 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
 		}
 
 		if (prunequal != NIL)
-			partpruneinfo =
-				make_partition_pruneinfo(root, rel,
-										 best_path->subpaths,
-										 prunequal);
+			plan->part_prune_index = make_partition_pruneinfo(root, rel,
+															  best_path->subpaths,
+															  prunequal);
 	}
 
 	plan->appendplans = subplans;
 	plan->nasyncplans = nasyncplans;
 	plan->first_partial_plan = best_path->first_partial_path;
-	plan->part_prune_info = partpruneinfo;
 
 	copy_generic_path_info(&plan->plan, (Path *) best_path);
 
@@ -1425,7 +1425,6 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
 	List	   *subplans = NIL;
 	ListCell   *subpaths;
 	RelOptInfo *rel = best_path->path.parent;
-	PartitionPruneInfo *partpruneinfo = NULL;
 
 	/*
 	 * We don't have the actual creation of the MergeAppend node split out
@@ -1518,6 +1517,9 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
 		subplans = lappend(subplans, subplan);
 	}
 
+	/* Set below if we find quals that we can use to run-time prune */
+	node->part_prune_index = -1;
+
 	/*
 	 * If any quals exist, they may be useful to perform further partition
 	 * pruning during execution.  Gather information needed by the executor to
@@ -1541,13 +1543,13 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
 		}
 
 		if (prunequal != NIL)
-			partpruneinfo = make_partition_pruneinfo(root, rel,
-													 best_path->subpaths,
-													 prunequal);
+			node->part_prune_index = make_partition_pruneinfo(root, rel,
+															  best_path->subpaths,
+															  prunequal);
 	}
 
 	node->mergeplans = subplans;
-	node->part_prune_info = partpruneinfo;
+
 
 	/*
 	 * If prepare_sort_from_pathkeys added sort columns, but we were told to
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 5d0fd6e072..31fff597a7 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -519,6 +519,7 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 	result->dependsOnRole = glob->dependsOnRole;
 	result->parallelModeNeeded = glob->parallelModeNeeded;
 	result->planTree = top_plan;
+	result->partPruneInfos = glob->partPruneInfos;
 	result->rtable = glob->finalrtable;
 	result->resultRelations = glob->resultRelations;
 	result->appendRelations = glob->appendRelations;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 1cb0abdbc1..720f20f563 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -348,6 +348,29 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 		}
 	}
 
+	/* Also fix up the information in PartitionPruneInfos. */
+	foreach (lc, root->partPruneInfos)
+	{
+		PartitionPruneInfo *pruneinfo = lfirst(lc);
+		ListCell  *l;
+
+		foreach(l, pruneinfo->prune_infos)
+		{
+			List	   *prune_infos = lfirst(l);
+			ListCell   *l2;
+
+			foreach(l2, prune_infos)
+			{
+				PartitionedRelPruneInfo *pinfo = lfirst(l2);
+
+				/* RT index of the table to which the pinfo belongs. */
+				pinfo->rtindex += rtoffset;
+			}
+		}
+
+		glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
+	}
+
 	return result;
 }
 
@@ -1658,21 +1681,12 @@ set_append_references(PlannerInfo *root,
 
 	aplan->apprelids = offset_relid_set(aplan->apprelids, rtoffset);
 
-	if (aplan->part_prune_info)
-	{
-		foreach(l, aplan->part_prune_info->prune_infos)
-		{
-			List	   *prune_infos = lfirst(l);
-			ListCell   *l2;
-
-			foreach(l2, prune_infos)
-			{
-				PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
-				pinfo->rtindex += rtoffset;
-			}
-		}
-	}
+	/*
+	 * PartitionPruneInfos will be added to a list in PlannerGlobal, so update
+	 * the index.
+	 */
+	if (aplan->part_prune_index >= 0)
+		aplan->part_prune_index += list_length(root->glob->partPruneInfos);
 
 	/* We don't need to recurse to lefttree or righttree ... */
 	Assert(aplan->plan.lefttree == NULL);
@@ -1734,21 +1748,12 @@ set_mergeappend_references(PlannerInfo *root,
 
 	mplan->apprelids = offset_relid_set(mplan->apprelids, rtoffset);
 
-	if (mplan->part_prune_info)
-	{
-		foreach(l, mplan->part_prune_info->prune_infos)
-		{
-			List	   *prune_infos = lfirst(l);
-			ListCell   *l2;
-
-			foreach(l2, prune_infos)
-			{
-				PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
-				pinfo->rtindex += rtoffset;
-			}
-		}
-	}
+	/*
+	 * PartitionPruneInfos will be added to a list in PlannerGlobal, so update
+	 * the index.
+	 */
+	if (mplan->part_prune_index >= 0)
+		mplan->part_prune_index += list_length(root->glob->partPruneInfos);
 
 	/* We don't need to recurse to lefttree or righttree ... */
 	Assert(mplan->plan.lefttree == NULL);
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 6188bf69cb..6565b6ed01 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -209,16 +209,20 @@ static void partkey_datum_from_expr(PartitionPruneContext *context,
 
 /*
  * make_partition_pruneinfo
- *		Builds a PartitionPruneInfo which can be used in the executor to allow
- *		additional partition pruning to take place.  Returns NULL when
- *		partition pruning would be useless.
+ *		Checks if the given set of quals can be used to build pruning steps
+ *		that the executor can use to prune away unneeded partitions.  If
+ *		suitable quals are found then a PartitionPruneInfo is built and tagged
+ *		onto the PlannerInfo's partPruneInfos list.
+ *
+ * The return value is the 0-based index of the item added to the
+ * partPruneInfos list or -1 if nothing was added.
  *
  * 'parentrel' is the RelOptInfo for an appendrel, and 'subpaths' is the list
  * of scan paths for its child rels.
  * 'prunequal' is a list of potential pruning quals (i.e., restriction
  * clauses that are applicable to the appendrel).
  */
-PartitionPruneInfo *
+int
 make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 						 List *subpaths,
 						 List *prunequal)
@@ -332,7 +336,7 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	 * quals, then we can just not bother with run-time pruning.
 	 */
 	if (prunerelinfos == NIL)
-		return NULL;
+		return -1;
 
 	/* Else build the result data structure */
 	pruneinfo = makeNode(PartitionPruneInfo);
@@ -358,7 +362,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	else
 		pruneinfo->other_subplans = NULL;
 
-	return pruneinfo;
+	root->partPruneInfos = lappend(root->partPruneInfos, pruneinfo);
+
+	return list_length(root->partPruneInfos) - 1;
 }
 
 /*
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 708435e952..bf962af7af 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -123,9 +123,8 @@ typedef struct PartitionPruneState
 
 extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
 													 int n_total_subplans,
-													 PartitionPruneInfo *pruneinfo,
+													 int part_prune_index,
 													 Bitmapset **initially_valid_subplans);
 extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
 										   bool initial_prune);
-
 #endif							/* EXECPARTITION_H */
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 01b1727fc0..4a741b053f 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -611,6 +611,7 @@ typedef struct EState
 	struct ExecRowMark **es_rowmarks;	/* Array of per-range-table-entry
 										 * ExecRowMarks, or NULL if none */
 	PlannedStmt *es_plannedstmt;	/* link to top of plan tree */
+	List		*es_part_prune_infos;	/* PlannedStmt.partPruneInfos */
 	const char *es_sourceText;	/* Source text from QueryDesc */
 
 	JunkFilter *es_junkFilter;	/* top-level junk filter, if any */
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 6bda383bea..e392fb6fc0 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -122,6 +122,9 @@ typedef struct PlannerGlobal
 	/* "flat" list of AppendRelInfos */
 	List	   *appendRelations;
 
+	/* List of PartitionPruneInfo contained in the plan */
+	List	   *partPruneInfos;
+
 	/* OIDs of relations the plan depends on */
 	List	   *relationOids;
 
@@ -503,6 +506,9 @@ struct PlannerInfo
 
 	/* Does this query modify any partition key columns? */
 	bool		partColsUpdated;
+
+	/* PartitionPruneInfos added in this query's plan. */
+	List	   *partPruneInfos;
 };
 
 
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 21e642a64c..3eb3e6e527 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -70,6 +70,9 @@ typedef struct PlannedStmt
 
 	struct Plan *planTree;		/* tree of Plan nodes */
 
+	List	   *partPruneInfos;	/* List of PartitionPruneInfo contained in
+								 * the plan */
+
 	List	   *rtable;			/* list of RangeTblEntry nodes */
 
 	/* rtable indexes of target relations for INSERT/UPDATE/DELETE */
@@ -270,8 +273,8 @@ typedef struct Append
 	 */
 	int			first_partial_plan;
 
-	/* Info for run-time subplan pruning; NULL if we're not doing that */
-	struct PartitionPruneInfo *part_prune_info;
+	/* Index to PlannerInfo.partPruneInfos or -1 if no run-time pruning */
+	int			part_prune_index;
 } Append;
 
 /* ----------------
@@ -305,8 +308,8 @@ typedef struct MergeAppend
 	/* NULLS FIRST/LAST directions */
 	bool	   *nullsFirst pg_node_attr(array_size(numCols));
 
-	/* Info for run-time subplan pruning; NULL if we're not doing that */
-	struct PartitionPruneInfo *part_prune_info;
+	/* Index to PlannerInfo.partPruneInfos or -1 if no run-time pruning */
+	int			part_prune_index;
 } MergeAppend;
 
 /* ----------------
diff --git a/src/include/partitioning/partprune.h b/src/include/partitioning/partprune.h
index 90684efa25..ebf0dcff8c 100644
--- a/src/include/partitioning/partprune.h
+++ b/src/include/partitioning/partprune.h
@@ -70,10 +70,10 @@ typedef struct PartitionPruneContext
 #define PruneCxtStateIdx(partnatts, step_id, keyno) \
 	((partnatts) * (step_id) + (keyno))
 
-extern PartitionPruneInfo *make_partition_pruneinfo(struct PlannerInfo *root,
-													struct RelOptInfo *parentrel,
-													List *subpaths,
-													List *prunequal);
+extern int make_partition_pruneinfo(struct PlannerInfo *root,
+									struct RelOptInfo *parentrel,
+									List *subpaths,
+									List *prunequal);
 extern Bitmapset *prune_append_rel_partitions(struct RelOptInfo *rel);
 extern Bitmapset *get_matching_partitions(PartitionPruneContext *context,
 										  List *pruning_steps);
-- 
2.35.3
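
The index bookkeeping in the patch above (a 0-based index returned when
a prune info is appended to the per-query list, later shifted by the
current length of the global list) can be sketched in isolation as
follows.  This is only an illustration under simplifying assumptions:
SketchList is a hypothetical fixed-size array standing in for List, the
items are plain integers rather than PartitionPruneInfo nodes, and none
of the sketch_* names exist in PostgreSQL.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_INFOS 16

/* Hypothetical stand-in for a List of prune infos; items are just ids. */
typedef struct SketchList
{
	int			nitems;
	int			items[SKETCH_MAX_INFOS];
} SketchList;

/*
 * Append an item and return its 0-based index, mirroring what the patched
 * make_partition_pruneinfo() returns for root->partPruneInfos.
 */
static int
sketch_append(SketchList *list, int item)
{
	assert(list->nitems < SKETCH_MAX_INFOS);
	list->items[list->nitems] = item;
	return list->nitems++;
}

/*
 * Translate a per-query index into an index into the global list.  This
 * must run before the per-query list is concatenated onto the global one,
 * because it relies on the global list's current length.
 */
static int
sketch_globalize_index(int local_index, const SketchList *global)
{
	if (local_index < 0)
		return -1;				/* node has no run-time pruning */
	return local_index + global->nitems;
}

/* Copy every per-query item to the tail of the global list, in order. */
static void
sketch_concat(SketchList *global, const SketchList *local)
{
	for (int i = 0; i < local->nitems; i++)
		(void) sketch_append(global, local->items[i]);
}
```

The ordering matters: if the concatenation happened before the indexes
were shifted, every index from that query level would be off by the
number of items just added.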



  [application/octet-stream] v22-0002-Optimize-AcquireExecutorLocks-by-locking-only-un.patch (82.3K, 3-v22-0002-Optimize-AcquireExecutorLocks-by-locking-only-un.patch)
  download | inline diff:
From 5f2d5ca36111f8007a7850fd985c7e965d621149 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Wed, 22 Dec 2021 16:55:17 +0900
Subject: [PATCH v22 2/2] Optimize AcquireExecutorLocks() by locking only
 unpruned partitions

This commit teaches AcquireExecutorLocks() to perform initial
partition pruning to notionally eliminate the subnodes contained in a
generic cached plan that need not be initialized during the actual
execution of the plan, and to skip locking the partitions scanned by
those subnodes.

The result of performing initial partition pruning this way, before the
actual execution has started, is passed to the executor as a List of
PartitionPruneResult nodes, which the callers of the executor that used
plancache.c to get the plan supply along with the PlannedStmt.  The List
is NIL in the cases in which the plan is obtained by calling the planner
directly or if the plan obtained by plancache.c is not a generic one.
---
 src/backend/commands/copyto.c          |   2 +-
 src/backend/commands/createas.c        |   2 +-
 src/backend/commands/explain.c         |   7 +-
 src/backend/commands/extension.c       |   2 +-
 src/backend/commands/matview.c         |   2 +-
 src/backend/commands/prepare.c         |  26 ++-
 src/backend/executor/README            |  32 ++++
 src/backend/executor/execMain.c        |  51 ++++++
 src/backend/executor/execParallel.c    |  26 ++-
 src/backend/executor/execPartition.c   | 241 +++++++++++++++++++++----
 src/backend/executor/execUtils.c       |   1 +
 src/backend/executor/functions.c       |   2 +-
 src/backend/executor/nodeAppend.c      |  11 +-
 src/backend/executor/nodeMergeAppend.c |   5 +-
 src/backend/executor/spi.c             |  27 ++-
 src/backend/nodes/readfuncs.c          |   8 +-
 src/backend/optimizer/plan/planner.c   |   2 +
 src/backend/optimizer/plan/setrefs.c   |  46 +++++
 src/backend/partitioning/partprune.c   |  41 ++++-
 src/backend/tcop/postgres.c            |   8 +-
 src/backend/tcop/pquery.c              |  28 ++-
 src/backend/utils/cache/plancache.c    | 208 ++++++++++++++++++---
 src/backend/utils/mmgr/portalmem.c     |  19 ++
 src/include/commands/explain.h         |   4 +-
 src/include/executor/execPartition.h   |   9 +-
 src/include/executor/execdesc.h        |   3 +
 src/include/executor/executor.h        |   3 +
 src/include/nodes/execnodes.h          |   1 +
 src/include/nodes/nodes.h              |   1 +
 src/include/nodes/pathnodes.h          |  12 ++
 src/include/nodes/plannodes.h          |  46 +++++
 src/include/utils/plancache.h          |   3 +-
 src/include/utils/portal.h             |   3 +
 33 files changed, 782 insertions(+), 100 deletions(-)

diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index 2527e66059..fb8779fec0 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -558,7 +558,7 @@ BeginCopyTo(ParseState *pstate,
 		((DR_copy *) dest)->cstate = cstate;
 
 		/* Create a QueryDesc requesting no output */
-		cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+		cstate->queryDesc = CreateQueryDesc(plan, NIL, pstate->p_sourcetext,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
 											dest, NULL, NULL, 0);
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 152c29b551..942449544c 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -325,7 +325,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
 		UpdateActiveSnapshotCommandId();
 
 		/* Create a QueryDesc, redirecting output to our tuple receiver */
-		queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+		queryDesc = CreateQueryDesc(plan, NIL, pstate->p_sourcetext,
 									GetActiveSnapshot(), InvalidSnapshot,
 									dest, params, queryEnv, 0);
 
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index f86983c660..2f2b558608 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -407,7 +407,7 @@ ExplainOneQuery(Query *query, int cursorOptions,
 		}
 
 		/* run it (if needed) and produce output */
-		ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+		ExplainOnePlan(plan, NIL, into, es, queryString, params, queryEnv,
 					   &planduration, (es->buffers ? &bufusage : NULL));
 	}
 }
@@ -515,7 +515,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
  * to call it.
  */
 void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, List *part_prune_results,
+			   IntoClause *into, ExplainState *es,
 			   const char *queryString, ParamListInfo params,
 			   QueryEnvironment *queryEnv, const instr_time *planduration,
 			   const BufferUsage *bufusage)
@@ -563,7 +564,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
 		dest = None_Receiver;
 
 	/* Create a QueryDesc for the query */
-	queryDesc = CreateQueryDesc(plannedstmt, queryString,
+	queryDesc = CreateQueryDesc(plannedstmt, part_prune_results, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, params, queryEnv, instrument_option);
 
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index 6b6720c690..06dfcd4d84 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -776,7 +776,7 @@ execute_sql_string(const char *sql)
 			{
 				QueryDesc  *qdesc;
 
-				qdesc = CreateQueryDesc(stmt,
+				qdesc = CreateQueryDesc(stmt, NIL,
 										sql,
 										GetActiveSnapshot(), NULL,
 										dest, NULL, NULL, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 9ac0383459..65c8d0aa59 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -408,7 +408,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
 	UpdateActiveSnapshotCommandId();
 
 	/* Create a QueryDesc, redirecting output to our tuple receiver */
-	queryDesc = CreateQueryDesc(plan, queryString,
+	queryDesc = CreateQueryDesc(plan, NIL, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, NULL, NULL, 0);
 
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index c4b54d0547..b469e05672 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -155,6 +155,7 @@ ExecuteQuery(ParseState *pstate,
 	PreparedStatement *entry;
 	CachedPlan *cplan;
 	List	   *plan_list;
+	List	   *part_prune_results_list;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
 	Portal		portal;
@@ -193,7 +194,10 @@ ExecuteQuery(ParseState *pstate,
 									   entry->plansource->query_string);
 
 	/* Replan if needed, and increment plan refcount for portal */
-	cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL);
+	cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL,
+						  &part_prune_results_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_results_list));
 	plan_list = cplan->stmt_list;
 
 	/*
@@ -207,6 +211,9 @@ ExecuteQuery(ParseState *pstate,
 					  plan_list,
 					  cplan);
 
+	/* Copy Lists of PartitionPruneResults into the portal's context. */
+	PortalStorePartitionPruneResults(portal, part_prune_results_list);
+
 	/*
 	 * For CREATE TABLE ... AS EXECUTE, we must verify that the prepared
 	 * statement is one that produces tuples.  Currently we insist that it be
@@ -576,7 +583,9 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	const char *query_string;
 	CachedPlan *cplan;
 	List	   *plan_list;
-	ListCell   *p;
+	List	   *part_prune_results_list;
+	ListCell   *p,
+			   *pp;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
 	instr_time	planstart;
@@ -619,7 +628,10 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 
 	/* Replan if needed, and acquire a transient refcount */
 	cplan = GetCachedPlan(entry->plansource, paramLI,
-						  CurrentResourceOwner, queryEnv);
+						  CurrentResourceOwner, queryEnv,
+						  &part_prune_results_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_results_list));
 
 	INSTR_TIME_SET_CURRENT(planduration);
 	INSTR_TIME_SUBTRACT(planduration, planstart);
@@ -634,13 +646,15 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	plan_list = cplan->stmt_list;
 
 	/* Explain each query */
-	foreach(p, plan_list)
+	forboth(p, plan_list, pp, part_prune_results_list)
 	{
 		PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
+		List *part_prune_results = lfirst_node(List, pp);
 
 		if (pstmt->commandType != CMD_UTILITY)
-			ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv,
-						   &planduration, (es->buffers ? &bufusage : NULL));
+			ExplainOnePlan(pstmt, part_prune_results, into, es, query_string,
+						   paramLI, queryEnv, &planduration,
+						   (es->buffers ? &bufusage : NULL));
 		else
 			ExplainOneUtility(pstmt->utilityStmt, into, es, query_string,
 							  paramLI, queryEnv);
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 0b5183fc4a..f14f9197b5 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -65,6 +65,34 @@ found there.  This currently only occurs for Append and MergeAppend nodes.  In
 this case the non-required subplans are ignored and the executor state's
 subnode array will become out of sequence to the plan's subplan list.
 
+Actually, the so-called execution time pruning may also occur even before the
+execution has started.  One case where that occurs is when a cached generic
+plan is being validated for execution by plancache.c: GetCachedPlan(), which
+works by locking all the relations that will be scanned by that plan.  If the
+generic plan contains nodes that can perform execution time partition pruning
+(that is, contain a PartitionPruneInfo), a subset of pruning steps contained
+in a given node's PartitionPruneInfo that do not depend on the execution
+actually having started (called "initial" pruning steps) are performed to
+figure out the minimal set of child subplans that satisfy those pruning steps.
+AcquireExecutorLocks() looking at a given generic plan will then lock only the
+relations scanned by the child subplans that survived such pruning, along with
+those present in PlannedStmt.minLockRelids.  Note that the subplans are only
+notionally pruned, that is, they are not removed from the plan tree as such.
+
+To prevent the executor and any third party execution code that can look at
+the plan tree from trying to execute the subplans that were pruned as
+described above, the result of pruning is passed to the executor as a List
+of PartitionPruneResult nodes via the QueryDesc.  Each PartitionPruneResult
+consists of the set of indexes of surviving subplans in the respective parent
+plan node's (the one to which the corresponding PartitionPruneInfo belongs)
+list of child subplans, saved as a bitmapset (valid_subplan_offs).  In other
+words, the executor executing a generic plan should not re-evaluate the set of
+initially valid subplans for a given plan node by redoing the initial pruning
+if it was already done by AcquireExecutorLocks() when validating the plan.
+Such re-evaluation of the pruning steps may very well end up resulting in a
+different set of subplans, containing some whose relations were not locked by
+AcquireExecutorLocks().
+
 Each Plan node may have expression trees associated with it, to represent
 its target list, qualification conditions, etc.  These trees are also
 read-only to the executor, but the executor state for expression evaluation
@@ -286,6 +314,10 @@ Query Processing Control Flow
 
 This is a sketch of control flow for full query processing:
 
+  [ ExecutorDoInitialPruning ] --- an optional step to perform initial
+		partition pruning on the plan tree the result of which is passed
+		to the executor via QueryDesc
+
 	CreateQueryDesc
 
 	ExecutorStart
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 32475e33ff..b59474841f 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -49,6 +49,7 @@
 #include "commands/matview.h"
 #include "commands/trigger.h"
 #include "executor/execdebug.h"
+#include "executor/execPartition.h"
 #include "executor/nodeSubplan.h"
 #include "foreign/fdwapi.h"
 #include "jit/jit.h"
@@ -104,6 +105,54 @@ static void EvalPlanQualStart(EPQState *epqstate, Plan *planTree);
 
 /* end of local decls */
 
+/* ----------------------------------------------------------------
+ *		ExecutorDoInitialPruning
+ *
+ *		For each plan tree node that has been assigned a PartitionPruneInfo,
+ *		this performs initial partition pruning using the information contained
+ *		therein to determine the set of child subplans that satisfy the initial
+ *		pruning steps, to be returned as a bitmapset of their indexes in the
+ *		node's list of child subplans (for example, an Append's appendplans).
+ *
+ * Return value is a List of PartitionPruneResult nodes, one for every
+ * PartitionPruneInfo, each containing the bitmapset computed for it.  RT
+ * indexes of all the leaf partitions scanned by those chosen subplans are
+ * added to *scan_leafpart_rtis, which all PartitionPruneInfos share.
+ *
+ * The executor must see exactly the same set of subplans as valid for
+ * execution when doing ExecInitNode() on the plan nodes whose
+ * PartitionPruneInfos are processed here.  So, it must get the set from the
+ * aforementioned PartitionPruneResult, instead of computing it all over
+ * again by redoing the initial pruning.  It's the caller's job to pass the
+ * PartitionPruneResult to the executor.
+ *
+ * Note: Partitioned tables mentioned in PartitionedRelPruneInfo nodes that
+ * drive the pruning will be locked before doing the pruning.
+ * ----------------------------------------------------------------
+ */
+List *
+ExecutorDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
+						 Bitmapset **scan_leafpart_rtis)
+{
+	List	 *part_prune_results = NIL;
+	ListCell *lc;
+
+	/* Only get here if there is any pruning to do. */
+	Assert(plannedstmt->containsInitialPruning);
+
+	foreach(lc, plannedstmt->partPruneInfos)
+	{
+		PartitionPruneInfo *pruneinfo = lfirst(lc);
+		PartitionPruneResult *pruneresult = makeNode(PartitionPruneResult);
+
+		pruneresult->valid_subplan_offs =
+			ExecPartitionDoInitialPruning(plannedstmt, params, pruneinfo,
+										  scan_leafpart_rtis);
+		part_prune_results = lappend(part_prune_results, pruneresult);
+	}
+
+	return part_prune_results;
+}
 
 /* ----------------------------------------------------------------
  *		ExecutorStart
@@ -806,6 +855,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 {
 	CmdType		operation = queryDesc->operation;
 	PlannedStmt *plannedstmt = queryDesc->plannedstmt;
+	List	   *part_prune_results = queryDesc->part_prune_results;
 	Plan	   *plan = plannedstmt->planTree;
 	List	   *rangeTable = plannedstmt->rtable;
 	EState	   *estate = queryDesc->estate;
@@ -826,6 +876,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 
 	estate->es_plannedstmt = plannedstmt;
 	estate->es_part_prune_infos = plannedstmt->partPruneInfos;
+	estate->es_part_prune_results = part_prune_results;
 
 	/*
 	 * Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index aca0c6f323..917079a034 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -66,6 +66,7 @@
 #define PARALLEL_KEY_QUERY_TEXT		UINT64CONST(0xE000000000000008)
 #define PARALLEL_KEY_JIT_INSTRUMENTATION UINT64CONST(0xE000000000000009)
 #define PARALLEL_KEY_WAL_USAGE			UINT64CONST(0xE00000000000000A)
+#define PARALLEL_KEY_PARTITION_PRUNE_RESULTS	UINT64CONST(0xE00000000000000B)
 
 #define PARALLEL_TUPLE_QUEUE_SIZE		65536
 
@@ -182,6 +183,7 @@ ExecSerializePlan(Plan *plan, EState *estate)
 	pstmt->transientPlan = false;
 	pstmt->dependsOnRole = false;
 	pstmt->parallelModeNeeded = false;
+	pstmt->containsInitialPruning = false;
 	pstmt->planTree = plan;
 	pstmt->partPruneInfos = estate->es_part_prune_infos;
 	pstmt->rtable = estate->es_range_table;
@@ -597,12 +599,15 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	FixedParallelExecutorState *fpes;
 	char	   *pstmt_data;
 	char	   *pstmt_space;
+	char	   *part_prune_results_data;
+	char	   *part_prune_results_space;
 	char	   *paramlistinfo_space;
 	BufferUsage *bufusage_space;
 	WalUsage   *walusage_space;
 	SharedExecutorInstrumentation *instrumentation = NULL;
 	SharedJitInstrumentation *jit_instrumentation = NULL;
 	int			pstmt_len;
+	int			part_prune_results_len;
 	int			paramlistinfo_len;
 	int			instrumentation_len = 0;
 	int			jit_instrumentation_len = 0;
@@ -631,6 +636,7 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 
 	/* Fix up and serialize plan to be sent to workers. */
 	pstmt_data = ExecSerializePlan(planstate->plan, estate);
+	part_prune_results_data = nodeToString(estate->es_part_prune_results);
 
 	/* Create a parallel context. */
 	pcxt = CreateParallelContext("postgres", "ParallelQueryMain", nworkers);
@@ -657,6 +663,11 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	shm_toc_estimate_chunk(&pcxt->estimator, pstmt_len);
 	shm_toc_estimate_keys(&pcxt->estimator, 1);
 
+	/* Estimate space for serialized List of PartitionPruneResult. */
+	part_prune_results_len = strlen(part_prune_results_data) + 1;
+	shm_toc_estimate_chunk(&pcxt->estimator, part_prune_results_len);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
 	/* Estimate space for serialized ParamListInfo. */
 	paramlistinfo_len = EstimateParamListSpace(estate->es_param_list_info);
 	shm_toc_estimate_chunk(&pcxt->estimator, paramlistinfo_len);
@@ -751,6 +762,12 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	memcpy(pstmt_space, pstmt_data, pstmt_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PLANNEDSTMT, pstmt_space);
 
+	/* Store serialized List of PartitionPruneResult */
+	part_prune_results_space = shm_toc_allocate(pcxt->toc, part_prune_results_len);
+	memcpy(part_prune_results_space, part_prune_results_data, part_prune_results_len);
+	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARTITION_PRUNE_RESULTS,
+				   part_prune_results_space);
+
 	/* Store serialized ParamListInfo. */
 	paramlistinfo_space = shm_toc_allocate(pcxt->toc, paramlistinfo_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARAMLISTINFO, paramlistinfo_space);
@@ -1232,8 +1249,10 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 						 int instrument_options)
 {
 	char	   *pstmtspace;
+	char	   *part_prune_results_space;
 	char	   *paramspace;
 	PlannedStmt *pstmt;
+	List	   *part_prune_results;
 	ParamListInfo paramLI;
 	char	   *queryString;
 
@@ -1244,12 +1263,17 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 	pstmtspace = shm_toc_lookup(toc, PARALLEL_KEY_PLANNEDSTMT, false);
 	pstmt = (PlannedStmt *) stringToNode(pstmtspace);
 
+	/* Reconstruct leader-supplied List of PartitionPruneResult. */
+	part_prune_results_space =
+		shm_toc_lookup(toc, PARALLEL_KEY_PARTITION_PRUNE_RESULTS, false);
+	part_prune_results = (List *) stringToNode(part_prune_results_space);
+
 	/* Reconstruct ParamListInfo. */
 	paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
 	paramLI = RestoreParamList(&paramspace);
 
 	/* Create a QueryDesc for the query. */
-	return CreateQueryDesc(pstmt,
+	return CreateQueryDesc(pstmt, part_prune_results,
 						   queryString,
 						   GetActiveSnapshot(), InvalidSnapshot,
 						   receiver, paramLI, NULL, instrument_options);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 80197d5141..8728745c44 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -25,6 +25,7 @@
 #include "mb/pg_wchar.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
+#include "parser/parsetree.h"
 #include "partitioning/partbounds.h"
 #include "partitioning/partdesc.h"
 #include "partitioning/partprune.h"
@@ -185,7 +186,11 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
 static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
 static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
 static PartitionPruneState *CreatePartitionPruneState(PlanState *planstate,
-													  PartitionPruneInfo *pruneinfo);
+													  PartitionPruneInfo *pruneinfo,
+													  bool consider_initial_steps,
+													  bool consider_exec_steps,
+													  List *rtable, ExprContext *econtext,
+													  PartitionDirectory partdir);
 static void InitPartitionPruneContext(PartitionPruneContext *context,
 									  List *pruning_steps,
 									  PartitionDesc partdesc,
@@ -198,7 +203,8 @@ static void PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
 static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
 										   PartitionedRelPruningData *pprune,
 										   bool initial_prune,
-										   Bitmapset **validsubplans);
+										   Bitmapset **validsubplans,
+										   Bitmapset **scan_leafpart_rtis);
 
 
 /*
@@ -1746,8 +1752,10 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  * considered to be a stable expression, it can change value from one plan
  * node scan to the next during query execution.  Stable comparison
  * expressions that don't involve such Params allow partition pruning to be
- * done once during executor startup.  Expressions that do involve such Params
- * require us to prune separately for each scan of the parent plan node.
+ * done once, either during executor startup or in ExecutorDoInitialPruning(),
+ * which runs as part of AcquireExecutorLocks() on a given plan tree.
+ * Expressions that do involve such Params require us to prune separately for
+ * each scan of the parent plan node.
  *
  * Note that pruning away unneeded subplans during executor startup has the
  * added benefit of not having to initialize the unneeded subplans at all.
@@ -1764,6 +1772,13 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *		account for initial pruning possibly having eliminated some of the
  *		subplans.
  *
+ * ExecPartitionDoInitialPruning:
+ *		Do initial pruning with the information contained in a given
+ *		PartitionPruneInfo to determine the minimal set of child subplans
+ *		of the parent plan node (the one owning the PartitionPruneInfo) that
+ *		need to be executed, and also the set of RT indexes of the leaf
+ *		partitions that will be scanned by those subplans.
+ *
  * ExecFindMatchingSubPlans:
  *		Returns indexes of matching subplans after evaluating the expressions
  *		that are safe to evaluate at a given point.  This function is first
@@ -1781,8 +1796,9 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *
  * On return, *initially_valid_subplans is assigned the set of indexes of
  * child subplans that must be initialized along with the parent plan node.
- * Initial pruning is performed here if needed and in that case only the
- * surviving subplans' indexes are added.
+ * Initial pruning is performed here if needed (unless it has already been done
+ * by ExecutorDoInitialPruning()), and in that case only the surviving
+ * subplans' indexes are added.
  *
  * If subplans are indeed pruned, subplan_map arrays contained in the returned
  * PartitionPruneState are re-sequenced to not count those, though only if the
@@ -1794,28 +1810,65 @@ ExecInitPartitionPruning(PlanState *planstate,
 						 int part_prune_index,
 						 Bitmapset **initially_valid_subplans)
 {
-	PartitionPruneState *prunestate;
+	PartitionPruneState *prunestate = NULL;
 	EState	   *estate = planstate->state;
 	PartitionPruneInfo *pruneinfo = list_nth(estate->es_part_prune_infos,
 											 part_prune_index);
+	PartitionPruneResult *pruneresult = NULL;
+	bool	do_pruning = (pruneinfo->needs_init_pruning ||
+						  pruneinfo->needs_exec_pruning);
+
+	/*
+	 * No need to do initial pruning if it was done already by
+	 * ExecutorDoInitialPruning(), which it would be if es_part_prune_results
+	 * is set.
+	 */
+	if (estate->es_part_prune_results)
+	{
+		pruneresult = list_nth(estate->es_part_prune_results, part_prune_index);
+		Assert(IsA(pruneresult, PartitionPruneResult));
+		do_pruning = pruneinfo->needs_exec_pruning;
+	}
 
-	/* We may need an expression context to evaluate partition exprs */
-	ExecAssignExprContext(estate, planstate);
+	if (do_pruning)
+	{
+		/* We may need an expression context to evaluate partition exprs */
+		ExecAssignExprContext(estate, planstate);
 
-	/* Create the working data structure for pruning */
-	prunestate = CreatePartitionPruneState(planstate, pruneinfo);
+		/* For data reading, executor always omits detached partitions */
+		if (estate->es_partition_directory == NULL)
+			estate->es_partition_directory =
+				CreatePartitionDirectory(estate->es_query_cxt, false);
+
+		/*
+		 * Create the working data structure for pruning.  No need to consider
+		 * initial pruning steps if we have a PartitionPruneResult.
+		 */
+		prunestate = CreatePartitionPruneState(planstate, pruneinfo,
+											   pruneresult == NULL, true,
+											   NIL, planstate->ps_ExprContext,
+											   estate->es_partition_directory);
+	}
 
 	/*
 	 * Perform an initial partition prune pass, if required.
 	 */
-	if (prunestate->do_initial_prune)
-		*initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true);
+	if (pruneresult)
+	{
+		*initially_valid_subplans = bms_copy(pruneresult->valid_subplan_offs);
+	}
+	else if (prunestate && prunestate->do_initial_prune)
+	{
+		*initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true,
+															 NULL);
+	}
 	else
 	{
-		/* No pruning, so we'll need to initialize all subplans */
+		/* No initial pruning, so we'll need to initialize all subplans */
 		Assert(n_total_subplans > 0);
 		*initially_valid_subplans = bms_add_range(NULL, 0,
 												  n_total_subplans - 1);
+		return prunestate;
 	}
 
 	/*
@@ -1823,7 +1876,8 @@ ExecInitPartitionPruning(PlanState *planstate,
 	 * that were removed above due to initial pruning.  No need to do this if
 	 * no steps were removed.
 	 */
-	if (bms_num_members(*initially_valid_subplans) < n_total_subplans)
+	if (prunestate &&
+		bms_num_members(*initially_valid_subplans) < n_total_subplans)
 	{
 		/*
 		 * We can safely skip this when !do_exec_prune, even though that
@@ -1839,11 +1893,74 @@ ExecInitPartitionPruning(PlanState *planstate,
 	return prunestate;
 }
 
+/*
+ * ExecPartitionDoInitialPruning
+ *		Perform initial pruning using given PartitionPruneInfo to determine
+ *		the minimal set of child subplans that will be executed and also the
+ *		set of RT indexes of the leaf partitions scanned by those subplans.
+ */
+Bitmapset *
+ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
+							  PartitionPruneInfo *pruneinfo,
+							  Bitmapset **scan_leafpart_rtis)
+{
+	List		 *rtable = plannedstmt->rtable;
+	ExprContext	 *econtext;
+	PartitionDirectory pdir;
+	MemoryContext oldcontext,
+				  tmpcontext;
+	PartitionPruneState *prunestate;
+	Bitmapset	 *valid_subplan_offs;
+
+	/*
+	 * A temporary context for memory allocations required while executing
+	 * partition pruning steps.
+	 */
+	tmpcontext = AllocSetContextCreate(CurrentMemoryContext,
+									   "initial pruning working data",
+									   ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(tmpcontext);
+
+	/*
+	 * PartitionDirectory to look up partition descriptors.
+	 * Note that we don't omit detached partitions, just like during
+	 * execution proper.
+	 */
+	pdir = CreatePartitionDirectory(CurrentMemoryContext, false);
+
+	/*
+	 * We don't yet have a PlanState for the parent plan node, so we must
+	 * create a standalone ExprContext to evaluate pruning expressions,
+	 * equipped with the information about the EXTERN parameters that the
+	 * caller passed us.  That's okay because the initial pruning steps do
+	 * not contain anything that requires execution to have started and
+	 * thus needs the information contained in a PlanState.
+	 */
+	econtext = CreateStandaloneExprContext();
+	econtext->ecxt_param_list_info = params;
+	prunestate = CreatePartitionPruneState(NULL, pruneinfo, true, false,
+										   rtable, econtext, pdir);
+	MemoryContextSwitchTo(oldcontext);
+
+	/* Do the initial pruning. */
+	valid_subplan_offs = ExecFindMatchingSubPlans(prunestate, true,
+												  scan_leafpart_rtis);
+
+	FreeExprContext(econtext, true);
+	DestroyPartitionDirectory(pdir);
+	MemoryContextDelete(tmpcontext);
+
+	return valid_subplan_offs;
+}
+
 /*
  * CreatePartitionPruneState
  *		Build the data structure required for calling ExecFindMatchingSubPlans
  *
- * 'planstate' is the parent plan node's execution state.
+ * 'planstate', if not NULL, is the parent plan node's execution state.  It
+ * can be NULL if being called before ExecutorStart(), in which case,
+ * 'rtable' (range table), 'econtext', and 'partdir' must be explicitly
+ * provided.
  *
  * 'pruneinfo' is a PartitionPruneInfo as generated by
  * make_partition_pruneinfo.  Here we build a PartitionPruneState containing a
@@ -1857,19 +1974,21 @@ ExecInitPartitionPruning(PlanState *planstate,
  * PartitionedRelPruneInfo.
  */
 static PartitionPruneState *
-CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
+CreatePartitionPruneState(PlanState *planstate,
+						  PartitionPruneInfo *pruneinfo,
+						  bool consider_initial_steps,
+						  bool consider_exec_steps,
+						  List *rtable, ExprContext *econtext,
+						  PartitionDirectory partdir)
 {
-	EState	   *estate = planstate->state;
+	EState	   *estate = planstate ? planstate->state : NULL;
 	PartitionPruneState *prunestate;
 	int			n_part_hierarchies;
 	ListCell   *lc;
 	int			i;
-	ExprContext *econtext = planstate->ps_ExprContext;
 
-	/* For data reading, executor always omits detached partitions */
-	if (estate->es_partition_directory == NULL)
-		estate->es_partition_directory =
-			CreatePartitionDirectory(estate->es_query_cxt, false);
+	Assert((estate != NULL) ||
+			(partdir != NULL && econtext != NULL && rtable != NIL));
 
 	n_part_hierarchies = list_length(pruneinfo->prune_infos);
 	Assert(n_part_hierarchies > 0);
@@ -1924,15 +2043,42 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 			PartitionKey partkey;
 
 			/*
-			 * We can rely on the copies of the partitioned table's partition
-			 * key and partition descriptor appearing in its relcache entry,
-			 * because that entry will be held open and locked for the
-			 * duration of this executor run.
+			 * Must open the relation by ourselves when called before the
+			 * execution has started, such as when called during
+			 * ExecutorDoInitialPruning() on a cached plan.  In that case,
+			 * sub-partitions must be locked, because AcquirePlannerLocks()
+			 * would not have seen them. (1st relation in a partrelpruneinfos
+			 * list is always the root partitioned table appearing in the
+			 * query, which AcquirePlannerLocks() would have locked; the
+			 * Assert in relation_open() guards that assumption.)
+			 */
+			if (estate == NULL)
+			{
+				RangeTblEntry *rte = rt_fetch(pinfo->rtindex, rtable);
+				int		lockmode = (j == 0) ? NoLock : rte->rellockmode;
+
+				partrel = table_open(rte->relid, lockmode);
+			}
+			else
+				partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
+
+			/*
+			 * We can rely on the copy of the partitioned table's partition
+			 * key in its relcache entry, because it can't change (or
+			 * get destroyed) as long as the relation is locked.  Partition
+			 * descriptor is taken from the PartitionDirectory associated with
+			 * the table that is held open long enough for the descriptor to
+			 * remain valid while it's used to perform the pruning steps.
 			 */
-			partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
 			partkey = RelationGetPartitionKey(partrel);
-			partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
-												partrel);
+			partdesc = PartitionDirectoryLookup(partdir, partrel);
+
+			/*
+			 * Must close partrel, keeping the lock taken, if we're not using
+			 * EState's entry.
+			 */
+			if (estate == NULL)
+				table_close(partrel, NoLock);
 
 			/*
 			 * Initialize the subplan_map and subpart_map.
@@ -1946,6 +2092,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 			Assert(partdesc->nparts >= pinfo->nparts);
 			pprune->nparts = partdesc->nparts;
 			pprune->subplan_map = palloc(sizeof(int) * partdesc->nparts);
+			pprune->rti_map = palloc(sizeof(Index) * partdesc->nparts);
 			if (partdesc->nparts == pinfo->nparts)
 			{
 				/*
@@ -1956,6 +2103,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 				pprune->subpart_map = pinfo->subpart_map;
 				memcpy(pprune->subplan_map, pinfo->subplan_map,
 					   sizeof(int) * pinfo->nparts);
+				memcpy(pprune->rti_map, pinfo->rti_map,
+					   sizeof(Index) * pinfo->nparts);
 
 				/*
 				 * Double-check that the list of unpruned relations has not
@@ -2006,6 +2155,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 							pinfo->subplan_map[pd_idx];
 						pprune->subpart_map[pp_idx] =
 							pinfo->subpart_map[pd_idx];
+						pprune->rti_map[pp_idx] =
+							pinfo->rti_map[pd_idx];
 						pd_idx++;
 					}
 					else
@@ -2013,6 +2164,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 						/* this partdesc entry is not in the plan */
 						pprune->subplan_map[pp_idx] = -1;
 						pprune->subpart_map[pp_idx] = -1;
+						pprune->rti_map[pp_idx] = 0;
 					}
 				}
 
@@ -2034,7 +2186,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 			 * Initialize pruning contexts as needed.
 			 */
 			pprune->initial_pruning_steps = pinfo->initial_pruning_steps;
-			if (pinfo->initial_pruning_steps)
+			if (consider_initial_steps && pinfo->initial_pruning_steps)
 			{
 				InitPartitionPruneContext(&pprune->initial_context,
 										  pinfo->initial_pruning_steps,
@@ -2044,7 +2196,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 				prunestate->do_initial_prune = true;
 			}
 			pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
-			if (pinfo->exec_pruning_steps)
+			if (consider_exec_steps && pinfo->exec_pruning_steps)
 			{
 				InitPartitionPruneContext(&pprune->exec_context,
 										  pinfo->exec_pruning_steps,
@@ -2272,10 +2424,14 @@ PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
  * Pass initial_prune if PARAM_EXEC Params cannot yet be evaluated.  This
  * differentiates the initial executor-time pruning step from later
  * runtime pruning.
+ *
+ * RT indexes of leaf partitions scanned by the chosen subplans are added to
+ * *scan_leafpart_rtis if the pointer is non-NULL.
  */
 Bitmapset *
 ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
-						 bool initial_prune)
+						 bool initial_prune,
+						 Bitmapset **scan_leafpart_rtis)
 {
 	Bitmapset  *result = NULL;
 	MemoryContext oldcontext;
@@ -2310,7 +2466,7 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
 		 */
 		pprune = &prunedata->partrelprunedata[0];
 		find_matching_subplans_recurse(prunedata, pprune, initial_prune,
-									   &result);
+									   &result, scan_leafpart_rtis);
 
 		/* Expression eval may have used space in ExprContext too */
 		if (pprune->exec_pruning_steps)
@@ -2324,6 +2480,8 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
 
 	/* Copy result out of the temp context before we reset it */
 	result = bms_copy(result);
+	if (scan_leafpart_rtis)
+		*scan_leafpart_rtis = bms_copy(*scan_leafpart_rtis);
 
 	MemoryContextReset(prunestate->prune_context);
 
@@ -2334,13 +2492,15 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
  * find_matching_subplans_recurse
  *		Recursive worker function for ExecFindMatchingSubPlans
  *
- * Adds valid (non-prunable) subplan IDs to *validsubplans
+ * Adds valid (non-prunable) subplan IDs to *validsubplans and RT indexes of
+ * the corresponding leaf partitions to *scan_leafpart_rtis (if asked for).
  */
 static void
 find_matching_subplans_recurse(PartitionPruningData *prunedata,
 							   PartitionedRelPruningData *pprune,
 							   bool initial_prune,
-							   Bitmapset **validsubplans)
+							   Bitmapset **validsubplans,
+							   Bitmapset **scan_leafpart_rtis)
 {
 	Bitmapset  *partset;
 	int			i;
@@ -2367,8 +2527,14 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
 	while ((i = bms_next_member(partset, i)) >= 0)
 	{
 		if (pprune->subplan_map[i] >= 0)
+		{
 			*validsubplans = bms_add_member(*validsubplans,
 											pprune->subplan_map[i]);
+			Assert(pprune->rti_map[i] > 0);
+			if (scan_leafpart_rtis)
+				*scan_leafpart_rtis = bms_add_member(*scan_leafpart_rtis,
+													 pprune->rti_map[i]);
+		}
 		else
 		{
 			int			partidx = pprune->subpart_map[i];
@@ -2376,7 +2542,8 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
 			if (partidx >= 0)
 				find_matching_subplans_recurse(prunedata,
 											   &prunedata->partrelprunedata[partidx],
-											   initial_prune, validsubplans);
+											   initial_prune, validsubplans,
+											   scan_leafpart_rtis);
 			else
 			{
 				/*
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 21f4c10937..67a58c7163 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -134,6 +134,7 @@ CreateExecutorState(void)
 	estate->es_param_exec_vals = NULL;
 
 	estate->es_queryEnv = NULL;
+	estate->es_part_prune_results = NIL;
 
 	estate->es_query_cxt = qcontext;
 
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index e134a82ff7..18d3b98cdc 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -842,7 +842,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
 	else
 		dest = None_Receiver;
 
-	es->qd = CreateQueryDesc(es->stmt,
+	es->qd = CreateQueryDesc(es->stmt, NIL,
 							 fcache->src,
 							 GetActiveSnapshot(),
 							 InvalidSnapshot,
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index c6f86a6510..96880e122a 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -155,7 +155,8 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 		 * subplan, we can fill as_valid_subplans immediately, preventing
 		 * later calls to ExecFindMatchingSubPlans.
 		 */
-		if (!prunestate->do_exec_prune && nplans > 0)
+		if (appendstate->as_prune_state == NULL ||
+			(!appendstate->as_prune_state->do_exec_prune && nplans > 0))
 			appendstate->as_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
 	}
 	else
@@ -577,7 +578,7 @@ choose_next_subplan_locally(AppendState *node)
 		}
 		else if (node->as_valid_subplans == NULL)
 			node->as_valid_subplans =
-				ExecFindMatchingSubPlans(node->as_prune_state, false);
+				ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 
 		whichplan = -1;
 	}
@@ -642,7 +643,7 @@ choose_next_subplan_for_leader(AppendState *node)
 		if (node->as_valid_subplans == NULL)
 		{
 			node->as_valid_subplans =
-				ExecFindMatchingSubPlans(node->as_prune_state, false);
+				ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 
 			/*
 			 * Mark each invalid plan as finished to allow the loop below to
@@ -717,7 +718,7 @@ choose_next_subplan_for_worker(AppendState *node)
 	else if (node->as_valid_subplans == NULL)
 	{
 		node->as_valid_subplans =
-			ExecFindMatchingSubPlans(node->as_prune_state, false);
+			ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 		mark_invalid_subplans_as_finished(node);
 	}
 
@@ -868,7 +869,7 @@ ExecAppendAsyncBegin(AppendState *node)
 	if (node->as_valid_subplans == NULL)
 	{
 		node->as_valid_subplans =
-			ExecFindMatchingSubPlans(node->as_prune_state, false);
+			ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 
 		classify_matching_subplans(node);
 	}
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index 8d35860c30..2312e5a633 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -103,7 +103,8 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
 		 * subplan, we can fill ms_valid_subplans immediately, preventing
 		 * later calls to ExecFindMatchingSubPlans.
 		 */
-		if (!prunestate->do_exec_prune && nplans > 0)
+		if (mergestate->ms_prune_state == NULL ||
+			(!mergestate->ms_prune_state->do_exec_prune && nplans > 0))
 			mergestate->ms_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
 	}
 	else
@@ -218,7 +219,7 @@ ExecMergeAppend(PlanState *pstate)
 		 */
 		if (node->ms_valid_subplans == NULL)
 			node->ms_valid_subplans =
-				ExecFindMatchingSubPlans(node->ms_prune_state, false);
+				ExecFindMatchingSubPlans(node->ms_prune_state, false, NULL);
 
 		/*
 		 * First time through: pull the first tuple from each valid subplan,
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index fd5796f1b9..93012a5b3b 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1578,6 +1578,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 	CachedPlanSource *plansource;
 	CachedPlan *cplan;
 	List	   *stmt_list;
+	List	   *part_prune_results_list;
 	char	   *query_string;
 	Snapshot	snapshot;
 	MemoryContext oldcontext;
@@ -1657,7 +1658,10 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 	 */
 
 	/* Replan if needed, and increment plan refcount for portal */
-	cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv);
+	cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv,
+						  &part_prune_results_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_results_list));
 	stmt_list = cplan->stmt_list;
 
 	if (!plan->saved)
@@ -1685,6 +1689,9 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 					  stmt_list,
 					  cplan);
 
+	/* Copy Lists of PartitionPruneResults into the portal's context. */
+	PortalStorePartitionPruneResults(portal, part_prune_results_list);
+
 	/*
 	 * Set up options for portal.  Default SCROLL type is chosen the same way
 	 * as PerformCursorOpen does it.
@@ -2092,7 +2099,8 @@ SPI_plan_get_cached_plan(SPIPlanPtr plan)
 	/* Get the generic plan for the query */
 	cplan = GetCachedPlan(plansource, NULL,
 						  plan->saved ? CurrentResourceOwner : NULL,
-						  _SPI_current->queryEnv);
+						  _SPI_current->queryEnv,
+						  NULL /* Not interested in PartitionPruneResults */);
 	Assert(cplan == plansource->gplan);
 
 	/* Pop the error context stack */
@@ -2473,7 +2481,9 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 	{
 		CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
 		List	   *stmt_list;
-		ListCell   *lc2;
+		List	   *part_prune_results_list;
+		ListCell   *lc2,
+				   *lc3;
 
 		spicallbackarg.query = plansource->query_string;
 
@@ -2549,8 +2559,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 		 * plan, the refcount must be backed by the plan_owner.
 		 */
 		cplan = GetCachedPlan(plansource, options->params,
-							  plan_owner, _SPI_current->queryEnv);
-
+							  plan_owner, _SPI_current->queryEnv,
+							  &part_prune_results_list);
+		Assert(list_length(cplan->stmt_list) ==
+			   list_length(part_prune_results_list));
 		stmt_list = cplan->stmt_list;
 
 		/*
@@ -2589,9 +2601,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 			}
 		}
 
-		foreach(lc2, stmt_list)
+		forboth(lc2, stmt_list, lc3, part_prune_results_list)
 		{
 			PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+			List	   *part_prune_results = lfirst_node(List, lc3);
 			bool		canSetTag = stmt->canSetTag;
 			DestReceiver *dest;
 
@@ -2663,7 +2676,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 				else
 					snap = InvalidSnapshot;
 
-				qdesc = CreateQueryDesc(stmt,
+				qdesc = CreateQueryDesc(stmt, part_prune_results,
 										plansource->query_string,
 										snap, crosscheck_snapshot,
 										dest,
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index b4ff855f7c..77990a2732 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -158,6 +158,11 @@
 	token = pg_strtok(&length);		/* skip :fldname */ \
 	local_node->fldname = readIntCols(len)
 
+/* Read an Index array */
+#define READ_INDEX_ARRAY(fldname, len) \
+	token = pg_strtok(&length);		/* skip :fldname */ \
+	local_node->fldname = readIndexCols(len)
+
 /* Read a bool array */
 #define READ_BOOL_ARRAY(fldname, len) \
 	token = pg_strtok(&length);		/* skip :fldname */ \
@@ -795,7 +800,6 @@ fnname(int numCols) \
  */
 READ_SCALAR_ARRAY(readAttrNumberCols, int16, atoi)
 READ_SCALAR_ARRAY(readOidCols, Oid, atooid)
-/* outfuncs.c has writeIndexCols, but we don't yet need that here */
-/* READ_SCALAR_ARRAY(readIndexCols, Index, atoui) */
+READ_SCALAR_ARRAY(readIndexCols, Index, atoui)
 READ_SCALAR_ARRAY(readIntCols, int, atoi)
 READ_SCALAR_ARRAY(readBoolCols, bool, strtobool)
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 31fff597a7..4097cf7164 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -520,7 +520,9 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 	result->parallelModeNeeded = glob->parallelModeNeeded;
 	result->planTree = top_plan;
 	result->partPruneInfos = glob->partPruneInfos;
+	result->containsInitialPruning = glob->containsInitialPruning;
 	result->rtable = glob->finalrtable;
+	result->minLockRelids = glob->minLockRelids;
 	result->resultRelations = glob->resultRelations;
 	result->appendRelations = glob->appendRelations;
 	result->subplans = glob->subplans;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 720f20f563..61d6934978 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -270,6 +270,16 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 	 */
 	add_rtes_to_flat_rtable(root, false);
 
+	/*
+	 * Add the query's adjusted range of RT indexes to glob->minLockRelids.
+	 * The adjusted RT indexes of prunable relations will be deleted from the
+	 * set below where PartitionPruneInfos are processed.
+	 */
+	glob->minLockRelids =
+		bms_add_range(glob->minLockRelids,
+					  rtoffset + 1,
+					  rtoffset + list_length(root->parse->rtable));
+
 	/*
 	 * Adjust RT indexes of PlanRowMarks and add to final rowmarks list
 	 */
@@ -352,6 +362,7 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 	foreach (lc, root->partPruneInfos)
 	{
 		PartitionPruneInfo *pruneinfo = lfirst(lc);
+		Bitmapset *leafpart_rtis = NULL;
 		ListCell  *l;
 
 		foreach(l, pruneinfo->prune_infos)
@@ -362,15 +373,50 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 			foreach(l2, prune_infos)
 			{
 				PartitionedRelPruneInfo *pinfo = lfirst(l2);
+				int		i;
 
 				/* RT index of the table to which the pinfo belongs. */
 				pinfo->rtindex += rtoffset;
+
+				/* Likewise for leaf partitions that might be scanned. */
+				for (i = 0; i < pinfo->nparts; i++)
+				{
+					if (pinfo->rti_map[i] > 0 && pinfo->subplan_map[i] >= 0)
+					{
+						pinfo->rti_map[i] += rtoffset;
+						leafpart_rtis = bms_add_member(leafpart_rtis,
+													   pinfo->rti_map[i]);
+					}
+				}
 			}
 		}
 
+		if (pruneinfo->needs_init_pruning)
+		{
+			glob->containsInitialPruning = true;
+
+			/*
+			 * Delete the leaf partition RTIs from the global set of relations
+			 * to be locked before executing the plan.  AcquireExecutorLocks()
+			 * will find the ones to add to the set after performing initial
+			 * pruning.
+			 */
+			glob->minLockRelids = bms_del_members(glob->minLockRelids,
+												  leafpart_rtis);
+		}
+
 		glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
 	}
 
+	/*
+	 * It seems worth doing a bms_copy() on glob->minLockRelids if we deleted
+	 * bits from it above to get rid of any empty tail bits.  It seems better
+	 * for the loop over this set in AcquireExecutorLocks() to not have to go
+	 * through those useless bit words.
+	 */
+	if (glob->containsInitialPruning)
+		glob->minLockRelids = bms_copy(glob->minLockRelids);
+
 	return result;
 }
 
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 6565b6ed01..37f3e6af61 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -144,7 +144,9 @@ static List *make_partitionedrel_pruneinfo(PlannerInfo *root,
 										   List *prunequal,
 										   Bitmapset *partrelids,
 										   int *relid_subplan_map,
-										   Bitmapset **matchedsubplans);
+										   Bitmapset **matchedsubplans,
+										   bool *needs_init_pruning,
+										   bool *needs_exec_pruning);
 static void gen_partprune_steps(RelOptInfo *rel, List *clauses,
 								PartClauseTarget target,
 								GeneratePruningStepsContext *context);
@@ -234,6 +236,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	int		   *relid_subplan_map;
 	ListCell   *lc;
 	int			i;
+	bool		needs_init_pruning = false;
+	bool		needs_exec_pruning = false;
 
 	/*
 	 * Scan the subpaths to see which ones are scans of partition child
@@ -313,12 +317,16 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		Bitmapset  *partrelids = (Bitmapset *) lfirst(lc);
 		List	   *pinfolist;
 		Bitmapset  *matchedsubplans = NULL;
+		bool		partrel_needs_init_pruning;
+		bool		partrel_needs_exec_pruning;
 
 		pinfolist = make_partitionedrel_pruneinfo(root, parentrel,
 												  prunequal,
 												  partrelids,
 												  relid_subplan_map,
-												  &matchedsubplans);
+												  &matchedsubplans,
+												  &partrel_needs_init_pruning,
+												  &partrel_needs_exec_pruning);
 
 		/* When pruning is possible, record the matched subplans */
 		if (pinfolist != NIL)
@@ -327,6 +335,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 			allmatchedsubplans = bms_join(matchedsubplans,
 										  allmatchedsubplans);
 		}
+
+		needs_init_pruning |= partrel_needs_init_pruning;
+		needs_exec_pruning |= partrel_needs_exec_pruning;
 	}
 
 	pfree(relid_subplan_map);
@@ -341,6 +352,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	/* Else build the result data structure */
 	pruneinfo = makeNode(PartitionPruneInfo);
 	pruneinfo->prune_infos = prunerelinfos;
+	pruneinfo->needs_init_pruning = needs_init_pruning;
+	pruneinfo->needs_exec_pruning = needs_exec_pruning;
 
 	/*
 	 * Some subplans may not belong to any of the identified partitioned rels.
@@ -441,13 +454,18 @@ add_part_relids(List *allpartrelids, Bitmapset *partrelids)
  * If we cannot find any useful run-time pruning steps, return NIL.
  * However, on success, each rel identified in partrelids will have
  * an element in the result list, even if some of them are useless.
+ * *needs_init_pruning and *needs_exec_pruning are set to indicate that the
+ * returned PartitionedRelPruneInfos contain pruning steps that can be
+ * performed before and after execution begins, respectively.
  */
 static List *
 make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 							  List *prunequal,
 							  Bitmapset *partrelids,
 							  int *relid_subplan_map,
-							  Bitmapset **matchedsubplans)
+							  Bitmapset **matchedsubplans,
+							  bool *needs_init_pruning,
+							  bool *needs_exec_pruning)
 {
 	RelOptInfo *targetpart = NULL;
 	List	   *pinfolist = NIL;
@@ -458,6 +476,10 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	int			rti;
 	int			i;
 
+	/* Will find out below. */
+	*needs_init_pruning = false;
+	*needs_exec_pruning = false;
+
 	/*
 	 * Examine each partitioned rel, constructing a temporary array to map
 	 * from planner relids to index of the partitioned rel, and building a
@@ -545,6 +567,9 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		 * executor per-scan pruning steps.  This first pass creates startup
 		 * pruning steps and detects whether there's any possibly-useful quals
 		 * that would require per-scan pruning.
+		 *
+		 * The first pass also determines whether the 2nd pass is needed, by
+		 * noting the presence of EXEC parameters.
 		 */
 		gen_partprune_steps(subpart, partprunequal, PARTTARGET_INITIAL,
 							&context);
@@ -619,6 +644,12 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		pinfo->execparamids = execparamids;
 		/* Remaining fields will be filled in the next loop */
 
+		/* record which types of pruning steps we've seen so far */
+		if (initial_pruning_steps != NIL)
+			*needs_init_pruning = true;
+		if (exec_pruning_steps != NIL)
+			*needs_exec_pruning = true;
+
 		pinfolist = lappend(pinfolist, pinfo);
 	}
 
@@ -646,6 +677,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		int		   *subplan_map;
 		int		   *subpart_map;
 		Oid		   *relid_map;
+		Index	   *rti_map;
 
 		/*
 		 * Construct the subplan and subpart maps for this partitioning level.
@@ -658,6 +690,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		subpart_map = (int *) palloc(nparts * sizeof(int));
 		memset(subpart_map, -1, nparts * sizeof(int));
 		relid_map = (Oid *) palloc0(nparts * sizeof(Oid));
+		rti_map = (Index *) palloc0(nparts * sizeof(Index));
 		present_parts = NULL;
 
 		i = -1;
@@ -672,6 +705,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 			subplan_map[i] = subplanidx = relid_subplan_map[partrel->relid] - 1;
 			subpart_map[i] = subpartidx = relid_subpart_map[partrel->relid] - 1;
 			relid_map[i] = planner_rt_fetch(partrel->relid, root)->relid;
+			rti_map[i] = partrel->relid;
 			if (subplanidx >= 0)
 			{
 				present_parts = bms_add_member(present_parts, i);
@@ -696,6 +730,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		pinfo->subplan_map = subplan_map;
 		pinfo->subpart_map = subpart_map;
 		pinfo->relid_map = relid_map;
+		pinfo->rti_map = rti_map;
 	}
 
 	pfree(relid_subpart_map);
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index a9a1851c94..a1be8179e8 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1598,6 +1598,7 @@ exec_bind_message(StringInfo input_message)
 	int16	   *rformats = NULL;
 	CachedPlanSource *psrc;
 	CachedPlan *cplan;
+	List	   *part_prune_results_list;
 	Portal		portal;
 	char	   *query_string;
 	char	   *saved_stmt_name;
@@ -1972,7 +1973,9 @@ exec_bind_message(StringInfo input_message)
 	 * will be generated in MessageContext.  The plan refcount will be
 	 * assigned to the Portal, so it will be released at portal destruction.
 	 */
-	cplan = GetCachedPlan(psrc, params, NULL, NULL);
+	cplan = GetCachedPlan(psrc, params, NULL, NULL, &part_prune_results_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_results_list));
 
 	/*
 	 * Now we can define the portal.
@@ -1987,6 +1990,9 @@ exec_bind_message(StringInfo input_message)
 					  cplan->stmt_list,
 					  cplan);
 
+	/* Copy Lists of PartitionPruneResults into the portal's context. */
+	PortalStorePartitionPruneResults(portal, part_prune_results_list);
+
 	/* Done with the snapshot used for parameter I/O and parsing/planning */
 	if (snapshot_set)
 		PopActiveSnapshot();
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 5aa5a350f3..226ee81b63 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -35,7 +35,7 @@
 Portal		ActivePortal = NULL;
 
 
-static void ProcessQuery(PlannedStmt *plan,
+static void ProcessQuery(PlannedStmt *plan, List *part_prune_results,
 						 const char *sourceText,
 						 ParamListInfo params,
 						 QueryEnvironment *queryEnv,
@@ -65,6 +65,7 @@ static void DoPortalRewind(Portal portal);
  */
 QueryDesc *
 CreateQueryDesc(PlannedStmt *plannedstmt,
+				List *part_prune_results,
 				const char *sourceText,
 				Snapshot snapshot,
 				Snapshot crosscheck_snapshot,
@@ -77,6 +78,8 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
 
 	qd->operation = plannedstmt->commandType;	/* operation */
 	qd->plannedstmt = plannedstmt;	/* plan */
+	qd->part_prune_results = part_prune_results; /* ExecutorDoInitialPruning()
+												  * output for plan */
 	qd->sourceText = sourceText;	/* query text */
 	qd->snapshot = RegisterSnapshot(snapshot);	/* snapshot */
 	/* RI check snapshot */
@@ -122,6 +125,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  *		PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
  *
  *	plan: the plan tree for the query
+ *	part_prune_results: ExecutorDoInitialPruning() output for the PlannedStmt
  *	sourceText: the source text of the query
  *	params: any parameters needed
  *	dest: where to send results
@@ -134,6 +138,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  */
 static void
 ProcessQuery(PlannedStmt *plan,
+			 List *part_prune_results,
 			 const char *sourceText,
 			 ParamListInfo params,
 			 QueryEnvironment *queryEnv,
@@ -145,7 +150,7 @@ ProcessQuery(PlannedStmt *plan,
 	/*
 	 * Create the QueryDesc object
 	 */
-	queryDesc = CreateQueryDesc(plan, sourceText,
+	queryDesc = CreateQueryDesc(plan, part_prune_results, sourceText,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, params, queryEnv, 0);
 
@@ -491,8 +496,13 @@ PortalStart(Portal portal, ParamListInfo params,
 				/*
 				 * Create QueryDesc in portal's context; for the moment, set
 				 * the destination to DestNone.
+				 *
+				 * There is no PartitionPruneResult unless the PlannedStmt is
+				 * from a CachedPlan.
 				 */
 				queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+											portal->part_prune_results_list == NIL ? NIL :
+											linitial(portal->part_prune_results_list),
 											portal->sourceText,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
@@ -1225,6 +1235,8 @@ PortalRunMulti(Portal portal,
 
 		if (pstmt->utilityStmt == NULL)
 		{
+			List *part_prune_results = NIL;
+
 			/*
 			 * process a plannable query.
 			 */
@@ -1271,10 +1283,18 @@ PortalRunMulti(Portal portal,
 			else
 				UpdateActiveSnapshotCommandId();
 
+			/*
+			 * Determine if there's a corresponding List of PartitionPruneResult
+			 * for this PlannedStmt.
+			 */
+			if (portal->part_prune_results_list != NIL)
+				part_prune_results = list_nth(portal->part_prune_results_list,
+											  foreach_current_index(stmtlist_item));
+
 			if (pstmt->canSetTag)
 			{
 				/* statement can set tag string */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, part_prune_results,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
@@ -1283,7 +1303,7 @@ PortalRunMulti(Portal portal,
 			else
 			{
 				/* stmt added by rewrite cannot set tag */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, part_prune_results,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 0d6a295674..957221c47e 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -99,14 +99,19 @@ static dlist_head cached_expression_list = DLIST_STATIC_INIT(cached_expression_l
 static void ReleaseGenericPlan(CachedPlanSource *plansource);
 static List *RevalidateCachedQuery(CachedPlanSource *plansource,
 								   QueryEnvironment *queryEnv);
-static bool CheckCachedPlan(CachedPlanSource *plansource);
+static bool CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+							List **part_prune_results_list);
 static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
-								   ParamListInfo boundParams, QueryEnvironment *queryEnv);
+								   ParamListInfo boundParams, QueryEnvironment *queryEnv,
+								   List **part_prune_results_list);
 static bool choose_custom_plan(CachedPlanSource *plansource,
 							   ParamListInfo boundParams);
 static double cached_plan_cost(CachedPlan *plan, bool include_planner);
 static Query *QueryListGetPrimaryStmt(List *stmts);
-static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static void AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+								 List **part_prune_results_list,
+								 List **lockedRelids_per_stmt);
+static void ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt);
 static void AcquirePlannerLocks(List *stmt_list, bool acquire);
 static void ScanQueryForLocks(Query *parsetree, bool acquire);
 static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -782,6 +787,26 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
 	return tlist;
 }
 
+/*
+ * FreePartitionPruneResults
+ *		Frees the List of Lists of PartitionPruneResults for CheckCachedPlan()
+ */
+static void
+FreePartitionPruneResults(List *part_prune_results_list)
+{
+	ListCell *lc;
+
+	foreach(lc, part_prune_results_list)
+	{
+		List *part_prune_results = lfirst(lc);
+
+		/* Free both the PartitionPruneResults and the containing List. */
+		list_free_deep(part_prune_results);
+	}
+
+	list_free(part_prune_results_list);
+}
+
 /*
  * CheckCachedPlan: see if the CachedPlanSource's generic plan is valid.
  *
@@ -790,15 +815,20 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
  *
  * On a "true" return, we have acquired the locks needed to run the plan.
  * (We must do this for the "true" result to be race-condition-free.)
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_results_list.
  */
 static bool
-CheckCachedPlan(CachedPlanSource *plansource)
+CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+				List **part_prune_results_list)
 {
 	CachedPlan *plan = plansource->gplan;
 
 	/* Assert that caller checked the querytree */
 	Assert(plansource->is_valid);
 
+	*part_prune_results_list = NIL;
+
 	/* If there's no generic plan, just say "false" */
 	if (!plan)
 		return false;
@@ -820,13 +850,21 @@ CheckCachedPlan(CachedPlanSource *plansource)
 	 */
 	if (plan->is_valid)
 	{
+		List   *lockedRelids_per_stmt;
+
 		/*
 		 * Plan must have positive refcount because it is referenced by
 		 * plansource; so no need to fear it disappears under us here.
 		 */
 		Assert(plan->refcount > 0);
 
-		AcquireExecutorLocks(plan->stmt_list, true);
+		/*
+		 * Lock relations scanned by the plan.  This is where the pruning
+		 * happens if needed.
+		 */
+		AcquireExecutorLocks(plan->stmt_list, boundParams,
+							 part_prune_results_list,
+							 &lockedRelids_per_stmt);
 
 		/*
 		 * If plan was transient, check to see if TransactionXmin has
@@ -848,7 +886,11 @@ CheckCachedPlan(CachedPlanSource *plansource)
 		}
 
 		/* Oops, the race case happened.  Release useless locks. */
-		AcquireExecutorLocks(plan->stmt_list, false);
+		ReleaseExecutorLocks(plan->stmt_list, lockedRelids_per_stmt);
+
+		/* Release any PartitionPruneResults that may have been created. */
+		FreePartitionPruneResults(*part_prune_results_list);
+		*part_prune_results_list = NIL;
 	}
 
 	/*
@@ -874,10 +916,14 @@ CheckCachedPlan(CachedPlanSource *plansource)
  * Planning work is done in the caller's memory context.  The finished plan
  * is in a child memory context, which typically should get reparented
  * (unless this is a one-shot plan, in which case we don't copy the plan).
+ *
+ * A list of NILs is returned in *part_prune_results_list, meaning that no
+ * partition pruning has been done yet for the plans in stmt_list.
  */
 static CachedPlan *
 BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
-				ParamListInfo boundParams, QueryEnvironment *queryEnv)
+				ParamListInfo boundParams, QueryEnvironment *queryEnv,
+				List **part_prune_results_list)
 {
 	CachedPlan *plan;
 	List	   *plist;
@@ -1007,6 +1053,17 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 
 	MemoryContextSwitchTo(oldcxt);
 
+	/*
+	 * There are no actual PartitionPruneResults to add yet, but the list
+	 * must still have the same number of elements as the list of
+	 * PlannedStmts.
+	 */
+	*part_prune_results_list = NIL;
+	foreach(lc, plist)
+	{
+		*part_prune_results_list = lappend(*part_prune_results_list, NIL);
+	}
+
 	return plan;
 }
 
@@ -1126,6 +1183,19 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
  * plan or a custom plan for the given parameters: the caller does not know
  * which it will get.
  *
+ * For every PlannedStmt found in the returned CachedPlan, an element that
+ * is either a List of PartitionPruneResult or NIL is added to
+ * *part_prune_results_list.  It is the former if the PlannedStmt comes from
+ * the existing CachedPlan that is otherwise valid and has
+ * containsInitialPruning set to true.  Before such a CachedPlan is returned,
+ * its "initial" pruning steps are performed by calling
+ * ExecutorDoInitialPruning(), so that AcquireExecutorLocks() locks only
+ * those leaf partitions whose subplans survive the "initial" pruning
+ * conditions.  For each PartitionPruneInfo found in
+ * PlannedStmt.partPruneInfos, a PartitionPruneResult containing the bitmapset
+ * of the indexes of surviving subplans is added to the List for the
+ * PlannedStmt.
+ *
  * On return, the plan is valid and we have sufficient locks to begin
  * execution.
  *
@@ -1139,11 +1209,13 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
  */
 CachedPlan *
 GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
-			  ResourceOwner owner, QueryEnvironment *queryEnv)
+			  ResourceOwner owner, QueryEnvironment *queryEnv,
+			  List **part_prune_results_list)
 {
 	CachedPlan *plan = NULL;
 	List	   *qlist;
 	bool		customplan;
+	List	   *my_part_prune_results_list;
 
 	/* Assert caller is doing things in a sane order */
 	Assert(plansource->magic == CACHEDPLANSOURCE_MAGIC);
@@ -1160,7 +1232,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 
 	if (!customplan)
 	{
-		if (CheckCachedPlan(plansource))
+		if (CheckCachedPlan(plansource, boundParams,
+							&my_part_prune_results_list))
 		{
 			/* We want a generic plan, and we already have a valid one */
 			plan = plansource->gplan;
@@ -1169,7 +1242,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 		else
 		{
 			/* Build a new generic plan */
-			plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv);
+			plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv,
+								   &my_part_prune_results_list);
 			/* Just make real sure plansource->gplan is clear */
 			ReleaseGenericPlan(plansource);
 			/* Link the new generic plan into the plansource */
@@ -1214,7 +1288,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 	if (customplan)
 	{
 		/* Build a custom plan */
-		plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv);
+		plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv,
+							   &my_part_prune_results_list);
 		/* Accumulate total costs of custom plans */
 		plansource->total_custom_cost += cached_plan_cost(plan, true);
 
@@ -1246,6 +1321,9 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 		plan->is_saved = true;
 	}
 
+	if (part_prune_results_list)
+		*part_prune_results_list = my_part_prune_results_list;
+
 	return plan;
 }
 
@@ -1737,17 +1815,29 @@ QueryListGetPrimaryStmt(List *stmts)
 
 /*
  * AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
- * or release them if acquire is false.
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_results_list.
+ *
+ * On return, *lockedRelids_per_stmt will contain a bitmapset for every
+ * PlannedStmt in stmt_list, containing the RT indexes of relation entries
+ * in its range table that were actually locked, or NULL if the PlannedStmt
+ * contains a utility statement.
  */
 static void
-AcquireExecutorLocks(List *stmt_list, bool acquire)
+AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+					 List **part_prune_results_list,
+					 List **lockedRelids_per_stmt)
 {
 	ListCell   *lc1;
 
+	*part_prune_results_list = *lockedRelids_per_stmt = NIL;
 	foreach(lc1, stmt_list)
 	{
 		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
-		ListCell   *lc2;
+		List	   *part_prune_results = NIL;
+		Bitmapset  *allLockRelids;
+		Bitmapset  *lockedRelids = NULL;
+		int			rti;
 
 		if (plannedstmt->commandType == CMD_UTILITY)
 		{
@@ -1761,13 +1851,41 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 			Query	   *query = UtilityContainsQuery(plannedstmt->utilityStmt);
 
 			if (query)
-				ScanQueryForLocks(query, acquire);
+				ScanQueryForLocks(query, true);
+			*part_prune_results_list = lappend(*part_prune_results_list, NIL);
+			*lockedRelids_per_stmt = lappend(*lockedRelids_per_stmt, NULL);
 			continue;
 		}
 
-		foreach(lc2, plannedstmt->rtable)
+		/*
+		 * Figure out the set of relations that would need to be locked
+		 * before executing the plan.
+		 */
+		if (plannedstmt->containsInitialPruning)
 		{
-			RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
+			Bitmapset *scan_leafpart_rtis = NULL;
+
+			/*
+			 * Obtain the set of leaf partitions to be locked.
+			 *
+			 * The following does initial partition pruning using the
+			 * PartitionPruneInfos found in plannedstmt->partPruneInfos and
+			 * finds leaf partitions that survive that pruning across all the
+			 * nodes in the plan tree.
+			 */
+			part_prune_results = ExecutorDoInitialPruning(plannedstmt,
+														  boundParams,
+														  &scan_leafpart_rtis);
+			allLockRelids = bms_union(plannedstmt->minLockRelids,
+									  scan_leafpart_rtis);
+		}
+		else
+			allLockRelids = plannedstmt->minLockRelids;
+
+		rti = -1;
+		while ((rti = bms_next_member(allLockRelids, rti)) > 0)
+		{
+			RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
 
 			if (rte->rtekind != RTE_RELATION)
 				continue;
@@ -1778,10 +1895,59 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 			 * fail if it's been dropped entirely --- we'll just transiently
 			 * acquire a non-conflicting lock.
 			 */
-			if (acquire)
-				LockRelationOid(rte->relid, rte->rellockmode);
-			else
-				UnlockRelationOid(rte->relid, rte->rellockmode);
+			LockRelationOid(rte->relid, rte->rellockmode);
+			lockedRelids = bms_add_member(lockedRelids, rti);
+		}
+
+		*part_prune_results_list = lappend(*part_prune_results_list,
+										   part_prune_results);
+		*lockedRelids_per_stmt = lappend(*lockedRelids_per_stmt, lockedRelids);
+	}
+}
+
+/*
+ * ReleaseExecutorLocks
+ *		Release locks that were acquired by an earlier call to
+ *		AcquireExecutorLocks()
+ */
+static void
+ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt)
+{
+	ListCell   *lc1,
+			   *lc2;
+
+	forboth(lc1, stmt_list, lc2, lockedRelids_per_stmt)
+	{
+		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
+		Bitmapset  *lockedRelids = lfirst(lc2);
+		int			rti;
+
+		if (plannedstmt->commandType == CMD_UTILITY)
+		{
+			/*
+			 * Ignore utility statements, except those (such as EXPLAIN) that
+			 * contain a parsed-but-not-planned query.  Note: it's okay to use
+			 * ScanQueryForLocks, even though the query hasn't been through
+			 * rule rewriting, because rewriting doesn't change the query
+			 * representation.
+			 */
+			Query	   *query = UtilityContainsQuery(plannedstmt->utilityStmt);
+
+			Assert(lockedRelids == NULL);
+			if (query)
+				ScanQueryForLocks(query, false);
+			continue;
+		}
+
+		rti = -1;
+		while ((rti = bms_next_member(lockedRelids, rti)) >= 0)
+		{
+			RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
+
+			Assert(rte->rtekind == RTE_RELATION);
+
+			/* See the comment in AcquireExecutorLocks(). */
+			UnlockRelationOid(rte->relid, rte->rellockmode);
 		}
 	}
 }
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index 3a161bdb88..4b156de524 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -303,6 +303,25 @@ PortalDefineQuery(Portal portal,
 	portal->status = PORTAL_DEFINED;
 }
 
+/*
+ * PortalStorePartitionPruneResults
+ *		Copy the given List of Lists of PartitionPruneResults into the
+ *		portal's context
+ *
+ * This allows the caller to ensure that the list exists as long as the portal
+ * does.
+ */
+void
+PortalStorePartitionPruneResults(Portal portal, List *part_prune_results_list)
+{
+	MemoryContext	oldcxt;
+
+	AssertArg(PortalIsValid(portal));
+	oldcxt = MemoryContextSwitchTo(portal->portalContext);
+	portal->part_prune_results_list = copyObject(part_prune_results_list);
+	MemoryContextSwitchTo(oldcxt);
+}
+
 /*
  * PortalReleaseCachedPlan
  *		Release a portal's reference to its cached plan, if any.
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 9ebde089ae..269cc4d562 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -87,7 +87,9 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
 							  ExplainState *es, const char *queryString,
 							  ParamListInfo params, QueryEnvironment *queryEnv);
 
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt,
+						   List *part_prune_results,
+						   IntoClause *into,
 						   ExplainState *es, const char *queryString,
 						   ParamListInfo params, QueryEnvironment *queryEnv,
 						   const instr_time *planduration,
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index bf962af7af..bd8776402e 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -45,6 +45,7 @@ extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
  * nparts						Length of subplan_map[] and subpart_map[].
  * subplan_map					Subplan index by partition index, or -1.
  * subpart_map					Subpart index by partition index, or -1.
+ * rti_map						Range table index by partition index, or 0.
  * present_parts				A Bitmapset of the partition indexes that we
  *								have subplans or subparts for.
  * initial_pruning_steps		List of PartitionPruneSteps used to
@@ -61,6 +62,7 @@ typedef struct PartitionedRelPruningData
 	int			nparts;
 	int		   *subplan_map;
 	int		   *subpart_map;
+	Index	   *rti_map;
 	Bitmapset  *present_parts;
 	List	   *initial_pruning_steps;
 	List	   *exec_pruning_steps;
@@ -126,5 +128,10 @@ extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
 													 int part_prune_index,
 													 Bitmapset **initially_valid_subplans);
 extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
-										   bool initial_prune);
+										   bool initial_prune,
+										   Bitmapset **scan_leafpart_rtis);
+extern Bitmapset *ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt,
+								ParamListInfo params,
+								PartitionPruneInfo *pruneinfo,
+								Bitmapset **scan_leafpart_rtis);
 #endif							/* EXECPARTITION_H */
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index e79e2c001f..7d4379da7b 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -35,6 +35,8 @@ typedef struct QueryDesc
 	/* These fields are provided by CreateQueryDesc */
 	CmdType		operation;		/* CMD_SELECT, CMD_UPDATE, etc. */
 	PlannedStmt *plannedstmt;	/* planner's output (could be utility, too) */
+	List		*part_prune_results; /* ExecutorDoInitialPruning()'s
+									  * output for plannedstmt */
 	const char *sourceText;		/* source text of the query */
 	Snapshot	snapshot;		/* snapshot to use for query */
 	Snapshot	crosscheck_snapshot;	/* crosscheck for RI update/delete */
@@ -57,6 +59,7 @@ typedef struct QueryDesc
 
 /* in pquery.c */
 extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
+								  List *part_prune_results,
 								  const char *sourceText,
 								  Snapshot snapshot,
 								  Snapshot crosscheck_snapshot,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index ed95ed1176..c9a5e5fb68 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -185,6 +185,9 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
 /*
  * prototypes from functions in execMain.c
  */
+extern List *ExecutorDoInitialPruning(PlannedStmt *plannedstmt,
+									  ParamListInfo params,
+									  Bitmapset **scan_leafpart_rtis);
 extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
 extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
 extern void ExecutorRun(QueryDesc *queryDesc,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 4a741b053f..521a60b988 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -612,6 +612,7 @@ typedef struct EState
 										 * ExecRowMarks, or NULL if none */
 	PlannedStmt *es_plannedstmt;	/* link to top of plan tree */
 	List		*es_part_prune_infos;	/* PlannedStmt.partPruneInfos */
+	List		*es_part_prune_results; /* QueryDesc.part_prune_results */
 	const char *es_sourceText;	/* Source text from QueryDesc */
 
 	JunkFilter *es_junkFilter;	/* top-level junk filter, if any */
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index a80f43e540..937cc4629d 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -212,6 +212,7 @@ extern struct Bitmapset *readBitmapset(void);
 extern uintptr_t readDatum(bool typbyval);
 extern bool *readBoolCols(int numCols);
 extern int *readIntCols(int numCols);
+extern Index *readIndexCols(int numCols);
 extern Oid *readOidCols(int numCols);
 extern int16 *readAttrNumberCols(int numCols);
 
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index e392fb6fc0..494ae461be 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -125,6 +125,18 @@ typedef struct PlannerGlobal
 	/* List of PartitionPruneInfo contained in the plan */
 	List	   *partPruneInfos;
 
+	/*
+	 * Do any of those PartitionPruneInfos have initial pruning steps in them?
+	 */
+	bool		containsInitialPruning;
+
+	/*
+	 * Indexes of all range table entries minus indexes of range table entries
+	 * of the leaf partitions scanned by prunable subplans; see
+	 * AcquireExecutorLocks()
+	 */
+	Bitmapset  *minLockRelids;
+
 	/* OIDs of relations the plan depends on */
 	List	   *relationOids;
 
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 3eb3e6e527..0bc4c8130a 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -73,8 +73,17 @@ typedef struct PlannedStmt
 	List	   *partPruneInfos;	/* List of PartitionPruneInfo contained in
 								 * the plan */
 
+	bool		containsInitialPruning;	/* Do any of those PartitionPruneInfos
+										 * have initial pruning steps in them?
+										 */
+
 	List	   *rtable;			/* list of RangeTblEntry nodes */
 
+	Bitmapset  *minLockRelids;	/* Indexes of all range table entries minus
+								 * indexes of range table entries of the leaf
+								 * partitions scanned by prunable subplans;
+								 * see AcquireExecutorLocks() */
+
 	/* rtable indexes of target relations for INSERT/UPDATE/DELETE */
 	List	   *resultRelations;	/* integer list of RT indexes, or NIL */
 
@@ -1410,6 +1419,13 @@ typedef struct PlanRowMark
  * prune_infos			List of Lists containing PartitionedRelPruneInfo nodes,
  *						one sublist per run-time-prunable partition hierarchy
  *						appearing in the parent plan node's subplans.
+ *
+ * needs_init_pruning	Does any of the PartitionedRelPruneInfos in
+ *						prune_infos have its initial_pruning_steps set?
+ *
+ * needs_exec_pruning	Does any of the PartitionedRelPruneInfos in
+ *						prune_infos have its exec_pruning_steps set?
+ *
  * other_subplans		Indexes of any subplans that are not accounted for
  *						by any of the PartitionedRelPruneInfo nodes in
  *						"prune_infos".  These subplans must not be pruned.
@@ -1420,6 +1436,8 @@ typedef struct PartitionPruneInfo
 
 	NodeTag		type;
 	List	   *prune_infos;
+	bool		needs_init_pruning;
+	bool		needs_exec_pruning;
 	Bitmapset  *other_subplans;
 } PartitionPruneInfo;
 
@@ -1464,6 +1482,9 @@ typedef struct PartitionedRelPruneInfo
 	/* relation OID by partition index, or 0 */
 	Oid		   *relid_map pg_node_attr(array_size(nparts));
 
+	/* Range table index by partition index, or 0. */
+	Index	   *rti_map pg_node_attr(array_size(nparts));
+
 	/*
 	 * initial_pruning_steps shows how to prune during executor startup (i.e.,
 	 * without use of any PARAM_EXEC Params); it is NIL if no startup pruning
@@ -1548,6 +1569,31 @@ typedef struct PartitionPruneStepCombine
 	List	   *source_stepids;
 } PartitionPruneStepCombine;
 
+/*----------------
+ * PartitionPruneResult
+ *
+ * The result of performing ExecPartitionDoInitialPruning() on a given
+ * PartitionPruneInfo.
+ *
+ * valid_subplan_offs contains the indexes of subplans remaining after
+ * performing initial pruning by calling ExecFindMatchingSubPlans() on the
+ * PartitionPruneInfo.
+ *
+ * This is used to store the result of initial partition pruning that is
+ * performed before the execution has started.  A module that needs to do so
+ * should call ExecutorDoInitialPruning() on a given PlannedStmt, which
+ * returns a List of PartitionPruneResult containing an entry for each
+ * PartitionPruneInfo present in PlannedStmt.partPruneInfos.  The module
+ * should then pass that list, along with the PlannedStmt, to the executor,
+ * so that it can reuse the result of initial partition pruning when
+ * initializing the subplans for execution.
+ */
+typedef struct PartitionPruneResult
+{
+	NodeTag		type;
+
+	Bitmapset	   *valid_subplan_offs;
+} PartitionPruneResult;
 
 /*
  * Plan invalidation info
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 0499635f59..32579d4788 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -220,7 +220,8 @@ extern List *CachedPlanGetTargetList(CachedPlanSource *plansource,
 extern CachedPlan *GetCachedPlan(CachedPlanSource *plansource,
 								 ParamListInfo boundParams,
 								 ResourceOwner owner,
-								 QueryEnvironment *queryEnv);
+								 QueryEnvironment *queryEnv,
+								 List **part_prune_results_list);
 extern void ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner);
 
 extern bool CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index aeddbdafe5..1901fc5f28 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -138,6 +138,7 @@ typedef struct PortalData
 	QueryCompletion qc;			/* command completion data for executed query */
 	List	   *stmts;			/* list of PlannedStmts */
 	CachedPlan *cplan;			/* CachedPlan, if stmts are from one */
+	List	   *part_prune_results_list;	/* List of Lists of PartitionPruneResults */
 
 	ParamListInfo portalParams; /* params to pass to query */
 	QueryEnvironment *queryEnv; /* environment for query */
@@ -242,6 +243,8 @@ extern void PortalDefineQuery(Portal portal,
 							  CommandTag commandTag,
 							  List *stmts,
 							  CachedPlan *cplan);
+extern void PortalStorePartitionPruneResults(Portal portal,
+											 List *part_prune_results_list);
 extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
 extern void PortalCreateHoldStore(Portal portal);
 extern void PortalHashTableDeleteAll(void);
-- 
2.35.3




* Re: generic plans and "initial" pruning
@ 2022-10-27 02:41  Amit Langote <[email protected]>
  parent: Amit Langote <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: Amit Langote @ 2022-10-27 02:41 UTC (permalink / raw)
  To: Robert Haas <[email protected]>; +Cc: Jacob Champion <[email protected]>; Zhihong Yu <[email protected]>; David Rowley <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers

On Mon, Oct 17, 2022 at 6:29 PM Amit Langote <[email protected]> wrote:
> On Wed, Oct 12, 2022 at 4:36 PM Amit Langote <[email protected]> wrote:
> > On Fri, Jul 29, 2022 at 1:20 PM Amit Langote <[email protected]> wrote:
> > > On Thu, Jul 28, 2022 at 1:27 AM Robert Haas <[email protected]> wrote:
> > > > 0001 adds es_part_prune_result but does not use it, so maybe the
> > > > introduction of that field should be deferred until it's needed for
> > > > something.
> > >
> > > Oops, looks like a mistake when breaking the patch.  Will move that bit to 0002.
> >
> > Fixed that and also noticed that I had defined PartitionPruneResult in
> > the wrong header (execnodes.h).  That led to PartitionPruneResult
> > nodes not being able to be written and read, because
> > src/backend/nodes/gen_node_support.pl doesn't create _out* and _read*
> > routines for the nodes defined in execnodes.h.  I moved its definition
> > to plannodes.h, even though it is not actually the planner that
> > instantiates those; no other include/nodes header sounds better.
> >
> > One more thing I realized is that Bitmapsets added to the List
> > PartitionPruneResult.valid_subplan_offs_list are not actually
> > read/write-able.  That's a problem that I also faced in [1], so I
> > proposed a patch there to make Bitmapset a read/write-able Node and
> > mark (only) the Bitmapsets that are added into read/write-able node
> > trees with the corresponding NodeTag.  I'm including that patch here
> > as well (0002) for the main patch to work (pass
> > -DWRITE_READ_PARSE_PLAN_TREES build tests), though it might make sense
> > to discuss it in its own thread?
>
> Had second thoughts on the use of List of Bitmapsets for this, such
> that the make-Bitmapset-Nodes patch is no longer needed.
>
> I had defined PartitionPruneResult such that it stood for the results
> of pruning for all PartitionPruneInfos contained in
> PlannedStmt.partPruneInfos (covering all Append/MergeAppend nodes that
> can use partition pruning in a given plan).  So, it had a List of
> Bitmapset.  I think it's perhaps better for PartitionPruneResult to
> cover only one PartitionPruneInfo and thus need only a Bitmapset and
> not a List thereof, which I have implemented in the attached updated
> patch 0002.  So, instead of needing to pass around a
> PartitionPruneResult with each PlannedStmt, this now passes a List of
> PartitionPruneResult with an entry for each in
> PlannedStmt.partPruneInfos.
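
The one-result-per-PartitionPruneInfo layout described above can be sketched roughly as follows. This is only an illustration, not code from the patch: PruneResultSketch, AppendSketch and subplans_to_init() are hypothetical names, and a 64-bit mask stands in for a Bitmapset. The point is that a node's part_prune_index selects both its PartitionPruneInfo and the matching pruning result.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for PartitionPruneResult: one per PartitionPruneInfo. */
typedef struct PruneResultSketch
{
	uint64_t	valid_subplan_offs; /* bit i set => subplan i survived pruning */
} PruneResultSketch;

/* Hypothetical stand-in for an Append/MergeAppend plan node. */
typedef struct AppendSketch
{
	int			part_prune_index;	/* -1 if no run-time pruning */
	int			nsubplans;			/* at most 64 in this sketch */
} AppendSketch;

/*
 * Return the set of subplans to initialize for 'node'.
 *
 * 'results' is parallel to the list of pruning infos, so the node's
 * part_prune_index is also the index of its precomputed result.  When no
 * results were passed in, this sketch simply keeps every subplan; that
 * fallback is an assumption made for illustration only.
 */
static uint64_t
subplans_to_init(const AppendSketch *node,
				 const PruneResultSketch *results, int nresults)
{
	uint64_t	all;

	assert(node->nsubplans > 0 && node->nsubplans <= 64);
	all = (node->nsubplans == 64) ? UINT64_MAX
		: ((UINT64_C(1) << node->nsubplans) - 1);

	/* Node is not prunable, or nothing was precomputed: keep all subplans. */
	if (node->part_prune_index < 0 || results == NULL)
		return all;

	assert(node->part_prune_index < nresults);
	return results[node->part_prune_index].valid_subplan_offs & all;
}
```

With two results {0x3, 0x5} and a four-subplan node whose part_prune_index is 1, the sketch selects subplans 0 and 2.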

Rebased over 3b2db22fe.

-- 
Thanks, Amit Langote
EDB: http://www.enterprisedb.com


Attachments:

  [application/octet-stream] v23-0001-Move-PartitioPruneInfo-out-of-plan-nodes-into-Pl.patch (17.2K, 2-v23-0001-Move-PartitioPruneInfo-out-of-plan-nodes-into-Pl.patch)
From c805965cadc12217406309221e2c89e3c17be433 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Fri, 27 May 2022 16:00:28 +0900
Subject: [PATCH v23 1/2] Move PartitionPruneInfo out of plan nodes into
 PlannedStmt

The planner will now add a given PartitionPruneInfo to
PlannedStmt.partPruneInfos instead of directly to the
Append/MergeAppend plan node.  What gets set instead in the
latter is an index field which points to the list element
of PlannedStmt.partPruneInfos containing the PartitionPruneInfo
belonging to the plan node.

A later commit will make AcquireExecutorLocks() do the initial
partition pruning to determine a minimal set of partitions to be
locked when validating a plan tree, and it will need to consult the
PartitionPruneInfos referenced therein to do so.  Making the
PartitionPruneInfos directly accessible by iterating over
PlannedStmt.partPruneInfos is simpler than requiring a walk of the
plan tree to find them.
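
For readers following along, here is a rough standalone sketch of the index bookkeeping this implies when a subquery's pruning infos are merged into the global list. It is not code from the patch: GlobalPruneInfos and merge_prune_infos() are hypothetical names, and plain ints stand in for PartitionPruneInfo nodes.  It assumes, as the setrefs.c hunks below suggest, that node indexes are shifted by the global list's current length before the local entries are appended.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PRUNE_INFOS 16

/* Hypothetical stand-in for PlannerGlobal.partPruneInfos (ints replace nodes). */
typedef struct GlobalPruneInfos
{
	int			items[MAX_PRUNE_INFOS];
	int			len;
} GlobalPruneInfos;

/*
 * Merge one query level's pruning infos into the global list.
 *
 * node_indexes[] holds each plan node's index into local_items[], or -1 if
 * the node does no run-time pruning.  On return, every non-negative entry
 * indexes glob->items[] instead.
 */
static void
merge_prune_infos(GlobalPruneInfos *glob,
				  const int *local_items, int nlocal,
				  int *node_indexes, int nnodes)
{
	int			i;

	assert(glob->len + nlocal <= MAX_PRUNE_INFOS);

	/* Step 1: shift each node's local index by the current global length. */
	for (i = 0; i < nnodes; i++)
	{
		if (node_indexes[i] >= 0)
		{
			assert(node_indexes[i] < nlocal);
			node_indexes[i] += glob->len;
		}
	}

	/* Step 2: append the local entries, preserving their order. */
	for (i = 0; i < nlocal; i++)
		glob->items[glob->len++] = local_items[i];
}
```

The order of the two steps matters: shifting after the append would add the wrong offset.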
---
 src/backend/executor/execMain.c         |  1 +
 src/backend/executor/execParallel.c     |  1 +
 src/backend/executor/execPartition.c    |  4 +-
 src/backend/executor/execUtils.c        |  1 +
 src/backend/executor/nodeAppend.c       |  4 +-
 src/backend/executor/nodeMergeAppend.c  |  4 +-
 src/backend/optimizer/plan/createplan.c | 24 ++++-----
 src/backend/optimizer/plan/planner.c    |  1 +
 src/backend/optimizer/plan/setrefs.c    | 65 +++++++++++++------------
 src/backend/partitioning/partprune.c    | 18 ++++---
 src/include/executor/execPartition.h    |  3 +-
 src/include/nodes/execnodes.h           |  1 +
 src/include/nodes/pathnodes.h           |  6 +++
 src/include/nodes/plannodes.h           | 11 +++--
 src/include/partitioning/partprune.h    |  8 +--
 15 files changed, 90 insertions(+), 62 deletions(-)

diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index d78862e660..32475e33ff 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -825,6 +825,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 	ExecInitRangeTable(estate, rangeTable);
 
 	estate->es_plannedstmt = plannedstmt;
+	estate->es_part_prune_infos = plannedstmt->partPruneInfos;
 
 	/*
 	 * Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index 99512826c5..aca0c6f323 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -183,6 +183,7 @@ ExecSerializePlan(Plan *plan, EState *estate)
 	pstmt->dependsOnRole = false;
 	pstmt->parallelModeNeeded = false;
 	pstmt->planTree = plan;
+	pstmt->partPruneInfos = estate->es_part_prune_infos;
 	pstmt->rtable = estate->es_range_table;
 	pstmt->resultRelations = NIL;
 	pstmt->appendRelations = NIL;
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 40e3c07693..80197d5141 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -1791,11 +1791,13 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
 PartitionPruneState *
 ExecInitPartitionPruning(PlanState *planstate,
 						 int n_total_subplans,
-						 PartitionPruneInfo *pruneinfo,
+						 int part_prune_index,
 						 Bitmapset **initially_valid_subplans)
 {
 	PartitionPruneState *prunestate;
 	EState	   *estate = planstate->state;
+	PartitionPruneInfo *pruneinfo = list_nth(estate->es_part_prune_infos,
+											 part_prune_index);
 
 	/* We may need an expression context to evaluate partition exprs */
 	ExecAssignExprContext(estate, planstate);
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 9df1f81ea8..21f4c10937 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -119,6 +119,7 @@ CreateExecutorState(void)
 	estate->es_relations = NULL;
 	estate->es_rowmarks = NULL;
 	estate->es_plannedstmt = NULL;
+	estate->es_part_prune_infos = NIL;
 
 	estate->es_junkFilter = NULL;
 
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 357e10a1d7..c6f86a6510 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -134,7 +134,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 	appendstate->as_begun = false;
 
 	/* If run-time partition pruning is enabled, then set that up now */
-	if (node->part_prune_info != NULL)
+	if (node->part_prune_index >= 0)
 	{
 		PartitionPruneState *prunestate;
 
@@ -145,7 +145,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 		 */
 		prunestate = ExecInitPartitionPruning(&appendstate->ps,
 											  list_length(node->appendplans),
-											  node->part_prune_info,
+											  node->part_prune_index,
 											  &validsubplans);
 		appendstate->as_prune_state = prunestate;
 		nplans = bms_num_members(validsubplans);
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index c5c62fa5c7..8d35860c30 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -82,7 +82,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
 	mergestate->ps.ExecProcNode = ExecMergeAppend;
 
 	/* If run-time partition pruning is enabled, then set that up now */
-	if (node->part_prune_info != NULL)
+	if (node->part_prune_index >= 0)
 	{
 		PartitionPruneState *prunestate;
 
@@ -93,7 +93,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
 		 */
 		prunestate = ExecInitPartitionPruning(&mergestate->ps,
 											  list_length(node->mergeplans),
-											  node->part_prune_info,
+											  node->part_prune_index,
 											  &validsubplans);
 		mergestate->ms_prune_state = prunestate;
 		nplans = bms_num_members(validsubplans);
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index ac86ce9003..50a5719ac6 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -1203,7 +1203,6 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
 	ListCell   *subpaths;
 	int			nasyncplans = 0;
 	RelOptInfo *rel = best_path->path.parent;
-	PartitionPruneInfo *partpruneinfo = NULL;
 	int			nodenumsortkeys = 0;
 	AttrNumber *nodeSortColIdx = NULL;
 	Oid		   *nodeSortOperators = NULL;
@@ -1354,6 +1353,9 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
 		subplans = lappend(subplans, subplan);
 	}
 
+	/* Set below if we find quals that we can use to run-time prune */
+	plan->part_prune_index = -1;
+
 	/*
 	 * If any quals exist, they may be useful to perform further partition
 	 * pruning during execution.  Gather information needed by the executor to
@@ -1377,16 +1379,14 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
 		}
 
 		if (prunequal != NIL)
-			partpruneinfo =
-				make_partition_pruneinfo(root, rel,
-										 best_path->subpaths,
-										 prunequal);
+			plan->part_prune_index = make_partition_pruneinfo(root, rel,
+															  best_path->subpaths,
+															  prunequal);
 	}
 
 	plan->appendplans = subplans;
 	plan->nasyncplans = nasyncplans;
 	plan->first_partial_plan = best_path->first_partial_path;
-	plan->part_prune_info = partpruneinfo;
 
 	copy_generic_path_info(&plan->plan, (Path *) best_path);
 
@@ -1425,7 +1425,6 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
 	List	   *subplans = NIL;
 	ListCell   *subpaths;
 	RelOptInfo *rel = best_path->path.parent;
-	PartitionPruneInfo *partpruneinfo = NULL;
 
 	/*
 	 * We don't have the actual creation of the MergeAppend node split out
@@ -1518,6 +1517,9 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
 		subplans = lappend(subplans, subplan);
 	}
 
+	/* Set below if we find quals that we can use to run-time prune */
+	node->part_prune_index = -1;
+
 	/*
 	 * If any quals exist, they may be useful to perform further partition
 	 * pruning during execution.  Gather information needed by the executor to
@@ -1541,13 +1543,13 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
 		}
 
 		if (prunequal != NIL)
-			partpruneinfo = make_partition_pruneinfo(root, rel,
-													 best_path->subpaths,
-													 prunequal);
+			node->part_prune_index = make_partition_pruneinfo(root, rel,
+															  best_path->subpaths,
+															  prunequal);
 	}
 
 	node->mergeplans = subplans;
-	node->part_prune_info = partpruneinfo;
+
 
 	/*
 	 * If prepare_sort_from_pathkeys added sort columns, but we were told to
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 78a8174534..240d50f1c0 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -519,6 +519,7 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 	result->dependsOnRole = glob->dependsOnRole;
 	result->parallelModeNeeded = glob->parallelModeNeeded;
 	result->planTree = top_plan;
+	result->partPruneInfos = glob->partPruneInfos;
 	result->rtable = glob->finalrtable;
 	result->resultRelations = glob->resultRelations;
 	result->appendRelations = glob->appendRelations;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 1cb0abdbc1..720f20f563 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -348,6 +348,29 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 		}
 	}
 
+	/* Also fix up the information in PartitionPruneInfos. */
+	foreach (lc, root->partPruneInfos)
+	{
+		PartitionPruneInfo *pruneinfo = lfirst(lc);
+		ListCell  *l;
+
+		foreach(l, pruneinfo->prune_infos)
+		{
+			List	   *prune_infos = lfirst(l);
+			ListCell   *l2;
+
+			foreach(l2, prune_infos)
+			{
+				PartitionedRelPruneInfo *pinfo = lfirst(l2);
+
+				/* RT index of the table to which the pinfo belongs. */
+				pinfo->rtindex += rtoffset;
+			}
+		}
+
+		glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
+	}
+
 	return result;
 }
 
@@ -1658,21 +1681,12 @@ set_append_references(PlannerInfo *root,
 
 	aplan->apprelids = offset_relid_set(aplan->apprelids, rtoffset);
 
-	if (aplan->part_prune_info)
-	{
-		foreach(l, aplan->part_prune_info->prune_infos)
-		{
-			List	   *prune_infos = lfirst(l);
-			ListCell   *l2;
-
-			foreach(l2, prune_infos)
-			{
-				PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
-				pinfo->rtindex += rtoffset;
-			}
-		}
-	}
+	/*
+	 * PartitionPruneInfos will be added to a list in PlannerGlobal, so update
+	 * the index.
+	 */
+	if (aplan->part_prune_index >= 0)
+		aplan->part_prune_index += list_length(root->glob->partPruneInfos);
 
 	/* We don't need to recurse to lefttree or righttree ... */
 	Assert(aplan->plan.lefttree == NULL);
@@ -1734,21 +1748,12 @@ set_mergeappend_references(PlannerInfo *root,
 
 	mplan->apprelids = offset_relid_set(mplan->apprelids, rtoffset);
 
-	if (mplan->part_prune_info)
-	{
-		foreach(l, mplan->part_prune_info->prune_infos)
-		{
-			List	   *prune_infos = lfirst(l);
-			ListCell   *l2;
-
-			foreach(l2, prune_infos)
-			{
-				PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
-				pinfo->rtindex += rtoffset;
-			}
-		}
-	}
+	/*
+	 * PartitionPruneInfos will be added to a list in PlannerGlobal, so update
+	 * the index.
+	 */
+	if (mplan->part_prune_index >= 0)
+		mplan->part_prune_index += list_length(root->glob->partPruneInfos);
 
 	/* We don't need to recurse to lefttree or righttree ... */
 	Assert(mplan->plan.lefttree == NULL);
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 6188bf69cb..6565b6ed01 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -209,16 +209,20 @@ static void partkey_datum_from_expr(PartitionPruneContext *context,
 
 /*
  * make_partition_pruneinfo
- *		Builds a PartitionPruneInfo which can be used in the executor to allow
- *		additional partition pruning to take place.  Returns NULL when
- *		partition pruning would be useless.
+ *		Checks if the given set of quals can be used to build pruning steps
+ *		that the executor can use to prune away unneeded partitions.  If
+ *		suitable quals are found then a PartitionPruneInfo is built and tagged
+ *		onto the PlannerInfo's partPruneInfos list.
+ *
+ * The return value is the 0-based index of the item added to the
+ * partPruneInfos list or -1 if nothing was added.
  *
  * 'parentrel' is the RelOptInfo for an appendrel, and 'subpaths' is the list
  * of scan paths for its child rels.
  * 'prunequal' is a list of potential pruning quals (i.e., restriction
  * clauses that are applicable to the appendrel).
  */
-PartitionPruneInfo *
+int
 make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 						 List *subpaths,
 						 List *prunequal)
@@ -332,7 +336,7 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	 * quals, then we can just not bother with run-time pruning.
 	 */
 	if (prunerelinfos == NIL)
-		return NULL;
+		return -1;
 
 	/* Else build the result data structure */
 	pruneinfo = makeNode(PartitionPruneInfo);
@@ -358,7 +362,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	else
 		pruneinfo->other_subplans = NULL;
 
-	return pruneinfo;
+	root->partPruneInfos = lappend(root->partPruneInfos, pruneinfo);
+
+	return list_length(root->partPruneInfos) - 1;
 }
 
 /*
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 708435e952..bf962af7af 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -123,9 +123,8 @@ typedef struct PartitionPruneState
 
 extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
 													 int n_total_subplans,
-													 PartitionPruneInfo *pruneinfo,
+													 int part_prune_index,
 													 Bitmapset **initially_valid_subplans);
 extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
 										   bool initial_prune);
-
 #endif							/* EXECPARTITION_H */
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 01b1727fc0..4a741b053f 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -611,6 +611,7 @@ typedef struct EState
 	struct ExecRowMark **es_rowmarks;	/* Array of per-range-table-entry
 										 * ExecRowMarks, or NULL if none */
 	PlannedStmt *es_plannedstmt;	/* link to top of plan tree */
+	List		*es_part_prune_infos;	/* PlannedStmt.partPruneInfos */
 	const char *es_sourceText;	/* Source text from QueryDesc */
 
 	JunkFilter *es_junkFilter;	/* top-level junk filter, if any */
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 09342d128d..fbe75dca0f 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -122,6 +122,9 @@ typedef struct PlannerGlobal
 	/* "flat" list of AppendRelInfos */
 	List	   *appendRelations;
 
+	/* List of PartitionPruneInfo contained in the plan */
+	List	   *partPruneInfos;
+
 	/* OIDs of relations the plan depends on */
 	List	   *relationOids;
 
@@ -503,6 +506,9 @@ struct PlannerInfo
 
 	/* Does this query modify any partition key columns? */
 	bool		partColsUpdated;
+
+	/* PartitionPruneInfos added in this query's plan. */
+	List	   *partPruneInfos;
 };
 
 
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 5c2ab1b379..2e132afc5a 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -70,6 +70,9 @@ typedef struct PlannedStmt
 
 	struct Plan *planTree;		/* tree of Plan nodes */
 
+	List	   *partPruneInfos;	/* List of PartitionPruneInfo contained in
+								 * the plan */
+
 	List	   *rtable;			/* list of RangeTblEntry nodes */
 
 	/* rtable indexes of target relations for INSERT/UPDATE/DELETE/MERGE */
@@ -270,8 +273,8 @@ typedef struct Append
 	 */
 	int			first_partial_plan;
 
-	/* Info for run-time subplan pruning; NULL if we're not doing that */
-	struct PartitionPruneInfo *part_prune_info;
+	/* Index to PlannerInfo.partPruneInfos or -1 if no run-time pruning */
+	int			part_prune_index;
 } Append;
 
 /* ----------------
@@ -305,8 +308,8 @@ typedef struct MergeAppend
 	/* NULLS FIRST/LAST directions */
 	bool	   *nullsFirst pg_node_attr(array_size(numCols));
 
-	/* Info for run-time subplan pruning; NULL if we're not doing that */
-	struct PartitionPruneInfo *part_prune_info;
+	/* Index to PlannerInfo.partPruneInfos or -1 if no run-time pruning */
+	int			part_prune_index;
 } MergeAppend;
 
 /* ----------------
diff --git a/src/include/partitioning/partprune.h b/src/include/partitioning/partprune.h
index 90684efa25..ebf0dcff8c 100644
--- a/src/include/partitioning/partprune.h
+++ b/src/include/partitioning/partprune.h
@@ -70,10 +70,10 @@ typedef struct PartitionPruneContext
 #define PruneCxtStateIdx(partnatts, step_id, keyno) \
 	((partnatts) * (step_id) + (keyno))
 
-extern PartitionPruneInfo *make_partition_pruneinfo(struct PlannerInfo *root,
-													struct RelOptInfo *parentrel,
-													List *subpaths,
-													List *prunequal);
+extern int make_partition_pruneinfo(struct PlannerInfo *root,
+									struct RelOptInfo *parentrel,
+									List *subpaths,
+									List *prunequal);
 extern Bitmapset *prune_append_rel_partitions(struct RelOptInfo *rel);
 extern Bitmapset *get_matching_partitions(PartitionPruneContext *context,
 										  List *pruning_steps);
-- 
2.35.3



  [application/octet-stream] v23-0002-Optimize-AcquireExecutorLocks-by-locking-only-un.patch (82.3K, 3-v23-0002-Optimize-AcquireExecutorLocks-by-locking-only-un.patch)
From ae9a6b7186c77888fd85dd7e4056dd3cd607617c Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Wed, 22 Dec 2021 16:55:17 +0900
Subject: [PATCH v23 2/2] Optimize AcquireExecutorLocks() by locking only
 unpruned partitions

This commit teaches AcquireExecutorLocks() to perform initial
partition pruning to notionally eliminate the subnodes contained in a
generic cached plan that need not be initialized during the actual
execution of the plan, and to skip locking the partitions scanned by
those subnodes.

The result of performing initial partition pruning this way, before
the actual execution has started, is passed to the executor as a List
of PartitionPruneResult nodes, which callers of the executor that used
plancache.c to get the plan supply along with the PlannedStmt.  The
list is NIL when the plan is obtained by calling the planner directly
or when the plan obtained by plancache.c is not a generic one.
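
As a rough illustration of the locking rule, the set of relations to lock can be thought of as follows. This is a sketch under assumptions, not the patch's AcquireExecutorLocks(): RelidSet and relids_to_lock() are hypothetical, a 64-bit mask stands in for a Bitmapset of range table indexes, and the way the two sets are combined is inferred from the comments on minLockRelids and scan_leafpart_rtis.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a Bitmapset of range table indexes. */
typedef uint64_t RelidSet;

/*
 * Compute which range table entries to lock before reusing a cached plan.
 *
 * all_relids:               every range table index in the plan
 * min_lock_relids:          all_relids minus the leaf partitions scanned by
 *                           prunable subplans
 * contains_initial_pruning: whether any pruning info has initial steps
 * scan_leafpart_rtis:       leaf partitions whose subplans survived initial
 *                           pruning
 */
static RelidSet
relids_to_lock(RelidSet all_relids, RelidSet min_lock_relids,
			   bool contains_initial_pruning, RelidSet scan_leafpart_rtis)
{
	/* Without initial pruning steps nothing can be skipped (assumption). */
	if (!contains_initial_pruning)
		return all_relids;

	/* Both inputs must be subsets of the plan's range table. */
	assert((min_lock_relids & ~all_relids) == 0);
	assert((scan_leafpart_rtis & ~all_relids) == 0);

	/* Always-locked relations plus the partitions that survived pruning. */
	return min_lock_relids | scan_leafpart_rtis;
}
```

For example, with range table indexes 1..5, of which 3..5 are prunable leaf partitions and only 3 survives pruning, the sketch locks indexes 1, 2 and 3.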
---
 src/backend/commands/copyto.c          |   2 +-
 src/backend/commands/createas.c        |   2 +-
 src/backend/commands/explain.c         |   7 +-
 src/backend/commands/extension.c       |   2 +-
 src/backend/commands/matview.c         |   2 +-
 src/backend/commands/prepare.c         |  26 ++-
 src/backend/executor/README            |  32 ++++
 src/backend/executor/execMain.c        |  51 ++++++
 src/backend/executor/execParallel.c    |  26 ++-
 src/backend/executor/execPartition.c   | 241 +++++++++++++++++++++----
 src/backend/executor/execUtils.c       |   1 +
 src/backend/executor/functions.c       |   2 +-
 src/backend/executor/nodeAppend.c      |  11 +-
 src/backend/executor/nodeMergeAppend.c |   5 +-
 src/backend/executor/spi.c             |  27 ++-
 src/backend/nodes/readfuncs.c          |   8 +-
 src/backend/optimizer/plan/planner.c   |   2 +
 src/backend/optimizer/plan/setrefs.c   |  46 +++++
 src/backend/partitioning/partprune.c   |  41 ++++-
 src/backend/tcop/postgres.c            |   8 +-
 src/backend/tcop/pquery.c              |  28 ++-
 src/backend/utils/cache/plancache.c    | 208 ++++++++++++++++++---
 src/backend/utils/mmgr/portalmem.c     |  19 ++
 src/include/commands/explain.h         |   4 +-
 src/include/executor/execPartition.h   |   9 +-
 src/include/executor/execdesc.h        |   3 +
 src/include/executor/executor.h        |   3 +
 src/include/nodes/execnodes.h          |   1 +
 src/include/nodes/nodes.h              |   1 +
 src/include/nodes/pathnodes.h          |  12 ++
 src/include/nodes/plannodes.h          |  46 +++++
 src/include/utils/plancache.h          |   3 +-
 src/include/utils/portal.h             |   3 +
 33 files changed, 782 insertions(+), 100 deletions(-)

diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index 2527e66059..fb8779fec0 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -558,7 +558,7 @@ BeginCopyTo(ParseState *pstate,
 		((DR_copy *) dest)->cstate = cstate;
 
 		/* Create a QueryDesc requesting no output */
-		cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+		cstate->queryDesc = CreateQueryDesc(plan, NIL, pstate->p_sourcetext,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
 											dest, NULL, NULL, 0);
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 152c29b551..942449544c 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -325,7 +325,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
 		UpdateActiveSnapshotCommandId();
 
 		/* Create a QueryDesc, redirecting output to our tuple receiver */
-		queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+		queryDesc = CreateQueryDesc(plan, NIL, pstate->p_sourcetext,
 									GetActiveSnapshot(), InvalidSnapshot,
 									dest, params, queryEnv, 0);
 
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index f86983c660..2f2b558608 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -407,7 +407,7 @@ ExplainOneQuery(Query *query, int cursorOptions,
 		}
 
 		/* run it (if needed) and produce output */
-		ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+		ExplainOnePlan(plan, NIL, into, es, queryString, params, queryEnv,
 					   &planduration, (es->buffers ? &bufusage : NULL));
 	}
 }
@@ -515,7 +515,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
  * to call it.
  */
 void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, List *part_prune_results,
+			   IntoClause *into, ExplainState *es,
 			   const char *queryString, ParamListInfo params,
 			   QueryEnvironment *queryEnv, const instr_time *planduration,
 			   const BufferUsage *bufusage)
@@ -563,7 +564,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
 		dest = None_Receiver;
 
 	/* Create a QueryDesc for the query */
-	queryDesc = CreateQueryDesc(plannedstmt, queryString,
+	queryDesc = CreateQueryDesc(plannedstmt, part_prune_results, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, params, queryEnv, instrument_option);
 
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index 1a62e5dac5..cc36b6fd15 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -776,7 +776,7 @@ execute_sql_string(const char *sql)
 			{
 				QueryDesc  *qdesc;
 
-				qdesc = CreateQueryDesc(stmt,
+				qdesc = CreateQueryDesc(stmt, NIL,
 										sql,
 										GetActiveSnapshot(), NULL,
 										dest, NULL, NULL, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 9ac0383459..65c8d0aa59 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -408,7 +408,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
 	UpdateActiveSnapshotCommandId();
 
 	/* Create a QueryDesc, redirecting output to our tuple receiver */
-	queryDesc = CreateQueryDesc(plan, queryString,
+	queryDesc = CreateQueryDesc(plan, NIL, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, NULL, NULL, 0);
 
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 9e29584d93..29b45539d3 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -155,6 +155,7 @@ ExecuteQuery(ParseState *pstate,
 	PreparedStatement *entry;
 	CachedPlan *cplan;
 	List	   *plan_list;
+	List	   *part_prune_results_list;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
 	Portal		portal;
@@ -193,7 +194,10 @@ ExecuteQuery(ParseState *pstate,
 									   entry->plansource->query_string);
 
 	/* Replan if needed, and increment plan refcount for portal */
-	cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL);
+	cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL,
+						  &part_prune_results_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_results_list));
 	plan_list = cplan->stmt_list;
 
 	/*
@@ -207,6 +211,9 @@ ExecuteQuery(ParseState *pstate,
 					  plan_list,
 					  cplan);
 
+	/* Copy Lists of PartitionPruneResults into the portal's context. */
+	PortalStorePartitionPruneResults(portal, part_prune_results_list);
+
 	/*
 	 * For CREATE TABLE ... AS EXECUTE, we must verify that the prepared
 	 * statement is one that produces tuples.  Currently we insist that it be
@@ -576,7 +583,9 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	const char *query_string;
 	CachedPlan *cplan;
 	List	   *plan_list;
-	ListCell   *p;
+	List	   *part_prune_results_list;
+	ListCell   *p,
+			   *pp;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
 	instr_time	planstart;
@@ -619,7 +628,10 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 
 	/* Replan if needed, and acquire a transient refcount */
 	cplan = GetCachedPlan(entry->plansource, paramLI,
-						  CurrentResourceOwner, queryEnv);
+						  CurrentResourceOwner, queryEnv,
+						  &part_prune_results_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_results_list));
 
 	INSTR_TIME_SET_CURRENT(planduration);
 	INSTR_TIME_SUBTRACT(planduration, planstart);
@@ -634,13 +646,15 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	plan_list = cplan->stmt_list;
 
 	/* Explain each query */
-	foreach(p, plan_list)
+	forboth(p, plan_list, pp, part_prune_results_list)
 	{
 		PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
+		List *part_prune_results = lfirst_node(List, pp);
 
 		if (pstmt->commandType != CMD_UTILITY)
-			ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv,
-						   &planduration, (es->buffers ? &bufusage : NULL));
+			ExplainOnePlan(pstmt, part_prune_results, into, es, query_string,
+						   paramLI, queryEnv, &planduration,
+						   (es->buffers ? &bufusage : NULL));
 		else
 			ExplainOneUtility(pstmt->utilityStmt, into, es, query_string,
 							  paramLI, queryEnv);
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 0b5183fc4a..f14f9197b5 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -65,6 +65,34 @@ found there.  This currently only occurs for Append and MergeAppend nodes.  In
 this case the non-required subplans are ignored and the executor state's
 subnode array will become out of sequence to the plan's subplan list.
 
+Actually, the so-called execution time pruning may also occur even before the
+execution has started.  One case where that occurs is when a cached generic
+plan is being validated for execution by plancache.c: GetCachedPlan(), which
+works by locking all the relations that will be scanned by that plan.  If the
+generic plan contains nodes that can perform execution time partition pruning
+(that is, contain a PartitionPruneInfo), a subset of pruning steps contained
+in a given node's PartitionPruneInfo that do not depend on the execution
+actually having started (called "initial" pruning steps) are performed to
+figure out the minimal set of child subplans that satisfy those pruning steps.
+AcquireExecutorLocks() looking at a given generic plan will then lock only the
+relations scanned by the child subplans that survived such pruning, along with
+those present in PlannedStmt.minLockRelids.  Note that the subplans are only
+notionally pruned, that is, they are not removed from the plan tree as such.
+
+To prevent the executor and any third party execution code that can look at
+the plan tree from trying to execute the subplans that were pruned as
+described above, the result of pruning is passed to the executor as a List
+of PartitionPruneResult nodes via the QueryDesc.  Each PartitionPruneResult
+consists of the set of indexes of surviving subplans in the respective parent
+plan node's (the one to which the corresponding PartitionPruneInfo belongs)
+list of child subplans, saved as a bitmapset (valid_subplan_offs).  In other
+words, the executor executing a generic plan should not re-evaluate the set of
+initially valid subplans for a given plan node by redoing the initial pruning
+if it was already done by AcquireExecutorLocks() when validating the plan.
+Such re-evaluation of the pruning steps may very well end up resulting in a
+different set of subplans, containing some whose relations were not locked by
+AcquireExecutorLocks().
+
 Each Plan node may have expression trees associated with it, to represent
 its target list, qualification conditions, etc.  These trees are also
 read-only to the executor, but the executor state for expression evaluation
@@ -286,6 +314,10 @@ Query Processing Control Flow
 
 This is a sketch of control flow for full query processing:
 
+  [ ExecutorDoInitialPruning ] --- an optional step to perform initial
+		partition pruning on the plan tree, the result of which is passed
+		to the executor via QueryDesc
+
 	CreateQueryDesc
 
 	ExecutorStart
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 32475e33ff..b59474841f 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -49,6 +49,7 @@
 #include "commands/matview.h"
 #include "commands/trigger.h"
 #include "executor/execdebug.h"
+#include "executor/execPartition.h"
 #include "executor/nodeSubplan.h"
 #include "foreign/fdwapi.h"
 #include "jit/jit.h"
@@ -104,6 +105,54 @@ static void EvalPlanQualStart(EPQState *epqstate, Plan *planTree);
 
 /* end of local decls */
 
+/* ----------------------------------------------------------------
+ *		ExecutorDoInitialPruning
+ *
+ *		For each plan tree node that has been assigned a PartitionPruneInfo,
+ *		this performs initial partition pruning using the information contained
+ *		therein to determine the set of child subplans that satisfy the initial
+ *		pruning steps, to be returned as a bitmapset of their indexes in the
+ *		node's list of child subplans (for example, an Append's appendplans).
+ *
+ * Return value is a List of PartitionPruneResult nodes, one for every
+ * PartitionPruneInfo, each containing the bitmapset for its plan node.  RT
+ * indexes of all the leaf partitions scanned by the chosen subplans are added
+ * to *scan_leafpart_rtis, which is shared across all PartitionPruneInfos.
+ *
+ * The executor must see exactly the same set of subplans as valid for
+ * execution when doing ExecInitNode() on the plan nodes whose
+ * PartitionPruneInfos are processed here.  So, it must get the set from the
+ * aforementioned PartitionPruneResult, instead of computing it all over
+ * again by redoing the initial pruning.  It's the caller's job to pass the
+ * PartitionPruneResult to the executor.
+ *
+ * Note: Partitioned tables mentioned in PartitionedRelPruneInfo nodes that
+ * drive the pruning will be locked before doing the pruning.
+ * ----------------------------------------------------------------
+ */
+List *
+ExecutorDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
+						 Bitmapset **scan_leafpart_rtis)
+{
+	List	 *part_prune_results = NIL;
+	ListCell *lc;
+
+	/* Only get here if there is any pruning to do. */
+	Assert(plannedstmt->containsInitialPruning);
+
+	foreach(lc, plannedstmt->partPruneInfos)
+	{
+		PartitionPruneInfo *pruneinfo = lfirst(lc);
+		PartitionPruneResult *pruneresult = makeNode(PartitionPruneResult);
+
+		pruneresult->valid_subplan_offs =
+			ExecPartitionDoInitialPruning(plannedstmt, params, pruneinfo,
+										  scan_leafpart_rtis);
+		part_prune_results = lappend(part_prune_results, pruneresult);
+	}
+
+	return part_prune_results;
+}
 
 /* ----------------------------------------------------------------
  *		ExecutorStart
@@ -806,6 +855,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 {
 	CmdType		operation = queryDesc->operation;
 	PlannedStmt *plannedstmt = queryDesc->plannedstmt;
+	List	   *part_prune_results = queryDesc->part_prune_results;
 	Plan	   *plan = plannedstmt->planTree;
 	List	   *rangeTable = plannedstmt->rtable;
 	EState	   *estate = queryDesc->estate;
@@ -826,6 +876,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 
 	estate->es_plannedstmt = plannedstmt;
 	estate->es_part_prune_infos = plannedstmt->partPruneInfos;
+	estate->es_part_prune_results = part_prune_results;
 
 	/*
 	 * Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index aca0c6f323..917079a034 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -66,6 +66,7 @@
 #define PARALLEL_KEY_QUERY_TEXT		UINT64CONST(0xE000000000000008)
 #define PARALLEL_KEY_JIT_INSTRUMENTATION UINT64CONST(0xE000000000000009)
 #define PARALLEL_KEY_WAL_USAGE			UINT64CONST(0xE00000000000000A)
+#define PARALLEL_KEY_PARTITION_PRUNE_RESULTS	UINT64CONST(0xE00000000000000B)
 
 #define PARALLEL_TUPLE_QUEUE_SIZE		65536
 
@@ -182,6 +183,7 @@ ExecSerializePlan(Plan *plan, EState *estate)
 	pstmt->transientPlan = false;
 	pstmt->dependsOnRole = false;
 	pstmt->parallelModeNeeded = false;
+	pstmt->containsInitialPruning = false;
 	pstmt->planTree = plan;
 	pstmt->partPruneInfos = estate->es_part_prune_infos;
 	pstmt->rtable = estate->es_range_table;
@@ -597,12 +599,15 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	FixedParallelExecutorState *fpes;
 	char	   *pstmt_data;
 	char	   *pstmt_space;
+	char	   *part_prune_results_data;
+	char	   *part_prune_results_space;
 	char	   *paramlistinfo_space;
 	BufferUsage *bufusage_space;
 	WalUsage   *walusage_space;
 	SharedExecutorInstrumentation *instrumentation = NULL;
 	SharedJitInstrumentation *jit_instrumentation = NULL;
 	int			pstmt_len;
+	int			part_prune_results_len;
 	int			paramlistinfo_len;
 	int			instrumentation_len = 0;
 	int			jit_instrumentation_len = 0;
@@ -631,6 +636,7 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 
 	/* Fix up and serialize plan to be sent to workers. */
 	pstmt_data = ExecSerializePlan(planstate->plan, estate);
+	part_prune_results_data = nodeToString(estate->es_part_prune_results);
 
 	/* Create a parallel context. */
 	pcxt = CreateParallelContext("postgres", "ParallelQueryMain", nworkers);
@@ -657,6 +663,11 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	shm_toc_estimate_chunk(&pcxt->estimator, pstmt_len);
 	shm_toc_estimate_keys(&pcxt->estimator, 1);
 
+	/* Estimate space for serialized List of PartitionPruneResult. */
+	part_prune_results_len = strlen(part_prune_results_data) + 1;
+	shm_toc_estimate_chunk(&pcxt->estimator, part_prune_results_len);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
 	/* Estimate space for serialized ParamListInfo. */
 	paramlistinfo_len = EstimateParamListSpace(estate->es_param_list_info);
 	shm_toc_estimate_chunk(&pcxt->estimator, paramlistinfo_len);
@@ -751,6 +762,12 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	memcpy(pstmt_space, pstmt_data, pstmt_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PLANNEDSTMT, pstmt_space);
 
+	/* Store serialized List of PartitionPruneResult */
+	part_prune_results_space = shm_toc_allocate(pcxt->toc, part_prune_results_len);
+	memcpy(part_prune_results_space, part_prune_results_data, part_prune_results_len);
+	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARTITION_PRUNE_RESULTS,
+				   part_prune_results_space);
+
 	/* Store serialized ParamListInfo. */
 	paramlistinfo_space = shm_toc_allocate(pcxt->toc, paramlistinfo_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARAMLISTINFO, paramlistinfo_space);
@@ -1232,8 +1249,10 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 						 int instrument_options)
 {
 	char	   *pstmtspace;
+	char	   *part_prune_results_space;
 	char	   *paramspace;
 	PlannedStmt *pstmt;
+	List	   *part_prune_results;
 	ParamListInfo paramLI;
 	char	   *queryString;
 
@@ -1244,12 +1263,17 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 	pstmtspace = shm_toc_lookup(toc, PARALLEL_KEY_PLANNEDSTMT, false);
 	pstmt = (PlannedStmt *) stringToNode(pstmtspace);
 
+	/* Reconstruct leader-supplied PartitionPruneResult. */
+	part_prune_results_space =
+		shm_toc_lookup(toc, PARALLEL_KEY_PARTITION_PRUNE_RESULTS, false);
+	part_prune_results = (List *) stringToNode(part_prune_results_space);
+
 	/* Reconstruct ParamListInfo. */
 	paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
 	paramLI = RestoreParamList(&paramspace);
 
 	/* Create a QueryDesc for the query. */
-	return CreateQueryDesc(pstmt,
+	return CreateQueryDesc(pstmt, part_prune_results,
 						   queryString,
 						   GetActiveSnapshot(), InvalidSnapshot,
 						   receiver, paramLI, NULL, instrument_options);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 80197d5141..8728745c44 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -25,6 +25,7 @@
 #include "mb/pg_wchar.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
+#include "parser/parsetree.h"
 #include "partitioning/partbounds.h"
 #include "partitioning/partdesc.h"
 #include "partitioning/partprune.h"
@@ -185,7 +186,11 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
 static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
 static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
 static PartitionPruneState *CreatePartitionPruneState(PlanState *planstate,
-													  PartitionPruneInfo *pruneinfo);
+													  PartitionPruneInfo *pruneinfo,
+													  bool consider_initial_steps,
+													  bool consider_exec_steps,
+													  List *rtable, ExprContext *econtext,
+													  PartitionDirectory partdir);
 static void InitPartitionPruneContext(PartitionPruneContext *context,
 									  List *pruning_steps,
 									  PartitionDesc partdesc,
@@ -198,7 +203,8 @@ static void PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
 static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
 										   PartitionedRelPruningData *pprune,
 										   bool initial_prune,
-										   Bitmapset **validsubplans);
+										   Bitmapset **validsubplans,
+										   Bitmapset **scan_leafpart_rtis);
 
 
 /*
@@ -1746,8 +1752,10 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  * considered to be a stable expression, it can change value from one plan
  * node scan to the next during query execution.  Stable comparison
  * expressions that don't involve such Params allow partition pruning to be
- * done once during executor startup.  Expressions that do involve such Params
- * require us to prune separately for each scan of the parent plan node.
+ * done once, either during executor startup or in ExecutorDoInitialPruning(),
+ * which runs as part of AcquireExecutorLocks() on a given plan tree.
+ * Expressions that do involve such Params require us to prune separately for
+ * each scan of the parent plan node.
  *
  * Note that pruning away unneeded subplans during executor startup has the
  * added benefit of not having to initialize the unneeded subplans at all.
@@ -1764,6 +1772,13 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *		account for initial pruning possibly having eliminated some of the
  *		subplans.
  *
+ * ExecPartitionDoInitialPruning:
+ *		Do initial pruning with the information contained in a given
+ *		PartitionPruneInfo to determine the minimal set of child subplans
+ *		of the parent plan node (the one to which the PartitionPruneInfo
+ *		belongs) that must be executed, and also the set of RT indexes of
+ *		the leaf partitions that will be scanned by those subplans.
+ *
  * ExecFindMatchingSubPlans:
  *		Returns indexes of matching subplans after evaluating the expressions
  *		that are safe to evaluate at a given point.  This function is first
@@ -1781,8 +1796,9 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *
  * On return, *initially_valid_subplans is assigned the set of indexes of
  * child subplans that must be initialized along with the parent plan node.
- * Initial pruning is performed here if needed and in that case only the
- * surviving subplans' indexes are added.
+ * Initial pruning is performed here if needed (unless it has already been done
+ * by ExecutorDoInitialPruning()), and in that case only the surviving
+ * subplans' indexes are added.
  *
  * If subplans are indeed pruned, subplan_map arrays contained in the returned
  * PartitionPruneState are re-sequenced to not count those, though only if the
@@ -1794,28 +1810,65 @@ ExecInitPartitionPruning(PlanState *planstate,
 						 int part_prune_index,
 						 Bitmapset **initially_valid_subplans)
 {
-	PartitionPruneState *prunestate;
+	PartitionPruneState *prunestate = NULL;
 	EState	   *estate = planstate->state;
 	PartitionPruneInfo *pruneinfo = list_nth(estate->es_part_prune_infos,
 											 part_prune_index);
+	PartitionPruneResult *pruneresult = NULL;
+	bool	do_pruning = (pruneinfo->needs_init_pruning ||
+						  pruneinfo->needs_exec_pruning);
+
+	/*
+	 * No need to do initial pruning if it was done already by
+	 * ExecutorDoInitialPruning(), which it would be if es_part_prune_results
+	 * is set.
+	 */
+	if (estate->es_part_prune_results)
+	{
+		pruneresult = list_nth(estate->es_part_prune_results, part_prune_index);
+		Assert(IsA(pruneresult, PartitionPruneResult));
+		do_pruning = pruneinfo->needs_exec_pruning;
+	}
 
-	/* We may need an expression context to evaluate partition exprs */
-	ExecAssignExprContext(estate, planstate);
+	if (do_pruning)
+	{
+		/* We may need an expression context to evaluate partition exprs */
+		ExecAssignExprContext(estate, planstate);
 
-	/* Create the working data structure for pruning */
-	prunestate = CreatePartitionPruneState(planstate, pruneinfo);
+		/* For data reading, executor always includes detached partitions */
+		if (estate->es_partition_directory == NULL)
+			estate->es_partition_directory =
+				CreatePartitionDirectory(estate->es_query_cxt, false);
+
+		/*
+		 * Create the working data structure for pruning.  No need to consider
+		 * initial pruning steps if we have a PartitionPruneResult.
+		 */
+		prunestate = CreatePartitionPruneState(planstate, pruneinfo,
+											   pruneresult == NULL, true,
+											   NIL, planstate->ps_ExprContext,
+											   estate->es_partition_directory);
+	}
 
 	/*
 	 * Perform an initial partition prune pass, if required.
 	 */
-	if (prunestate->do_initial_prune)
-		*initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true);
+	if (pruneresult)
+	{
+		*initially_valid_subplans = bms_copy(pruneresult->valid_subplan_offs);
+	}
+	else if (prunestate && prunestate->do_initial_prune)
+	{
+		*initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true,
+															 NULL);
+	}
 	else
 	{
-		/* No pruning, so we'll need to initialize all subplans */
+		/* No initial pruning, so we'll need to initialize all subplans */
 		Assert(n_total_subplans > 0);
 		*initially_valid_subplans = bms_add_range(NULL, 0,
 												  n_total_subplans - 1);
+		return prunestate;
 	}
 
 	/*
@@ -1823,7 +1876,8 @@ ExecInitPartitionPruning(PlanState *planstate,
 	 * that were removed above due to initial pruning.  No need to do this if
 	 * no steps were removed.
 	 */
-	if (bms_num_members(*initially_valid_subplans) < n_total_subplans)
+	if (prunestate &&
+		bms_num_members(*initially_valid_subplans) < n_total_subplans)
 	{
 		/*
 		 * We can safely skip this when !do_exec_prune, even though that
@@ -1839,11 +1893,74 @@ ExecInitPartitionPruning(PlanState *planstate,
 	return prunestate;
 }
 
+/*
+ * ExecPartitionDoInitialPruning
+ *		Perform initial pruning using given PartitionPruneInfo to determine
+ *		the minimal set of child subplans that will be executed and also the
+ *		set of RT indexes of the leaf partitions scanned by those subplans.
+ */
+Bitmapset *
+ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
+							  PartitionPruneInfo *pruneinfo,
+							  Bitmapset **scan_leafpart_rtis)
+{
+	List		 *rtable = plannedstmt->rtable;
+	ExprContext	 *econtext;
+	PartitionDirectory pdir;
+	MemoryContext oldcontext,
+				  tmpcontext;
+	PartitionPruneState *prunestate;
+	Bitmapset	 *valid_subplan_offs;
+
+	/*
+	 * A temporary context for memory allocations required while executing
+	 * partition pruning steps.
+	 */
+	tmpcontext = AllocSetContextCreate(CurrentMemoryContext,
+									   "initial pruning working data",
+									   ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(tmpcontext);
+
+	/*
+	 * PartitionDirectory to look up partition descriptors.
+	 * Note that we don't omit detached partitions, just like during
+	 * execution proper.
+	 */
+	pdir = CreatePartitionDirectory(CurrentMemoryContext, false);
+
+	/*
+	 * We don't yet have a PlanState for the parent plan node, so we must
+	 * create a standalone ExprContext to evaluate pruning expressions,
+	 * equipped with the information about the EXTERN parameters that the
+	 * caller passed us.  That's okay, because the initial pruning steps do
+	 * not contain anything that requires the execution to have started and
+	 * would thus need the information contained in a PlanState.
+	 */
+	econtext = CreateStandaloneExprContext();
+	econtext->ecxt_param_list_info = params;
+	prunestate = CreatePartitionPruneState(NULL, pruneinfo, true, false,
+										   rtable, econtext, pdir);
+	MemoryContextSwitchTo(oldcontext);
+
+	/* Do the initial pruning. */
+	valid_subplan_offs = ExecFindMatchingSubPlans(prunestate, true,
+												  scan_leafpart_rtis);
+
+	FreeExprContext(econtext, true);
+	DestroyPartitionDirectory(pdir);
+	MemoryContextDelete(tmpcontext);
+
+	return valid_subplan_offs;
+}
+
 /*
  * CreatePartitionPruneState
  *		Build the data structure required for calling ExecFindMatchingSubPlans
  *
- * 'planstate' is the parent plan node's execution state.
+ * 'planstate', if not NULL, is the parent plan node's execution state.  It
+ * can be NULL if being called before ExecutorStart(), in which case,
+ * 'rtable' (range table), 'econtext', and 'partdir' must be explicitly
+ * provided.
  *
  * 'pruneinfo' is a PartitionPruneInfo as generated by
  * make_partition_pruneinfo.  Here we build a PartitionPruneState containing a
@@ -1857,19 +1974,21 @@ ExecInitPartitionPruning(PlanState *planstate,
  * PartitionedRelPruneInfo.
  */
 static PartitionPruneState *
-CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
+CreatePartitionPruneState(PlanState *planstate,
+						  PartitionPruneInfo *pruneinfo,
+						  bool consider_initial_steps,
+						  bool consider_exec_steps,
+						  List *rtable, ExprContext *econtext,
+						  PartitionDirectory partdir)
 {
-	EState	   *estate = planstate->state;
+	EState	   *estate = planstate ? planstate->state : NULL;
 	PartitionPruneState *prunestate;
 	int			n_part_hierarchies;
 	ListCell   *lc;
 	int			i;
-	ExprContext *econtext = planstate->ps_ExprContext;
 
-	/* For data reading, executor always omits detached partitions */
-	if (estate->es_partition_directory == NULL)
-		estate->es_partition_directory =
-			CreatePartitionDirectory(estate->es_query_cxt, false);
+	Assert((estate != NULL) ||
+			(partdir != NULL && econtext != NULL && rtable != NIL));
 
 	n_part_hierarchies = list_length(pruneinfo->prune_infos);
 	Assert(n_part_hierarchies > 0);
@@ -1924,15 +2043,42 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 			PartitionKey partkey;
 
 			/*
-			 * We can rely on the copies of the partitioned table's partition
-			 * key and partition descriptor appearing in its relcache entry,
-			 * because that entry will be held open and locked for the
-			 * duration of this executor run.
+			 * Must open the relation ourselves when called before the
+			 * execution has started, such as when called during
+			 * ExecutorDoInitialPruning() on a cached plan.  In that case,
+			 * sub-partitions must be locked, because AcquirePlannerLocks()
+			 * would not have seen them. (1st relation in a partrelpruneinfos
+			 * list is always the root partitioned table appearing in the
+			 * query, which AcquirePlannerLocks() would have locked; the
+			 * Assert in relation_open() guards that assumption.)
+			 */
+			if (estate == NULL)
+			{
+				RangeTblEntry *rte = rt_fetch(pinfo->rtindex, rtable);
+				int		lockmode = (j == 0) ? NoLock : rte->rellockmode;
+
+				partrel = table_open(rte->relid, lockmode);
+			}
+			else
+				partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
+
+			/*
+			 * We can rely on the copy of the partitioned table's partition
+			 * key in its relcache entry, because it can't change (or
+			 * get destroyed) as long as the relation is locked.  Partition
+			 * descriptor is taken from the PartitionDirectory associated with
+			 * the table that is held open long enough for the descriptor to
+			 * remain valid while it's used to perform the pruning steps.
 			 */
-			partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
 			partkey = RelationGetPartitionKey(partrel);
-			partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
-												partrel);
+			partdesc = PartitionDirectoryLookup(partdir, partrel);
+
+			/*
+			 * Must close partrel, keeping the lock taken, if we're not using
+			 * EState's entry.
+			 */
+			if (estate == NULL)
+				table_close(partrel, NoLock);
 
 			/*
 			 * Initialize the subplan_map and subpart_map.
@@ -1946,6 +2092,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 			Assert(partdesc->nparts >= pinfo->nparts);
 			pprune->nparts = partdesc->nparts;
 			pprune->subplan_map = palloc(sizeof(int) * partdesc->nparts);
+			pprune->rti_map = palloc(sizeof(Index) * partdesc->nparts);
 			if (partdesc->nparts == pinfo->nparts)
 			{
 				/*
@@ -1956,6 +2103,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 				pprune->subpart_map = pinfo->subpart_map;
 				memcpy(pprune->subplan_map, pinfo->subplan_map,
 					   sizeof(int) * pinfo->nparts);
+				memcpy(pprune->rti_map, pinfo->rti_map,
+					   sizeof(Index) * pinfo->nparts);
 
 				/*
 				 * Double-check that the list of unpruned relations has not
@@ -2006,6 +2155,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 							pinfo->subplan_map[pd_idx];
 						pprune->subpart_map[pp_idx] =
 							pinfo->subpart_map[pd_idx];
+						pprune->rti_map[pp_idx] =
+							pinfo->rti_map[pd_idx];
 						pd_idx++;
 					}
 					else
@@ -2013,6 +2164,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 						/* this partdesc entry is not in the plan */
 						pprune->subplan_map[pp_idx] = -1;
 						pprune->subpart_map[pp_idx] = -1;
+						pprune->rti_map[pp_idx] = 0;
 					}
 				}
 
@@ -2034,7 +2186,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 			 * Initialize pruning contexts as needed.
 			 */
 			pprune->initial_pruning_steps = pinfo->initial_pruning_steps;
-			if (pinfo->initial_pruning_steps)
+			if (consider_initial_steps && pinfo->initial_pruning_steps)
 			{
 				InitPartitionPruneContext(&pprune->initial_context,
 										  pinfo->initial_pruning_steps,
@@ -2044,7 +2196,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 				prunestate->do_initial_prune = true;
 			}
 			pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
-			if (pinfo->exec_pruning_steps)
+			if (consider_exec_steps && pinfo->exec_pruning_steps)
 			{
 				InitPartitionPruneContext(&pprune->exec_context,
 										  pinfo->exec_pruning_steps,
@@ -2272,10 +2424,14 @@ PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
  * Pass initial_prune if PARAM_EXEC Params cannot yet be evaluated.  This
  * differentiates the initial executor-time pruning step from later
  * runtime pruning.
+ *
+ * RT indexes of leaf partitions scanned by the chosen subplans are added to
+ * *scan_leafpart_rtis if the pointer is non-NULL.
  */
 Bitmapset *
 ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
-						 bool initial_prune)
+						 bool initial_prune,
+						 Bitmapset **scan_leafpart_rtis)
 {
 	Bitmapset  *result = NULL;
 	MemoryContext oldcontext;
@@ -2310,7 +2466,7 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
 		 */
 		pprune = &prunedata->partrelprunedata[0];
 		find_matching_subplans_recurse(prunedata, pprune, initial_prune,
-									   &result);
+									   &result, scan_leafpart_rtis);
 
 		/* Expression eval may have used space in ExprContext too */
 		if (pprune->exec_pruning_steps)
@@ -2324,6 +2480,8 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
 
 	/* Copy result out of the temp context before we reset it */
 	result = bms_copy(result);
+	if (scan_leafpart_rtis)
+		*scan_leafpart_rtis = bms_copy(*scan_leafpart_rtis);
 
 	MemoryContextReset(prunestate->prune_context);
 
@@ -2334,13 +2492,15 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
  * find_matching_subplans_recurse
  *		Recursive worker function for ExecFindMatchingSubPlans
  *
- * Adds valid (non-prunable) subplan IDs to *validsubplans
+ * Adds valid (non-prunable) subplan IDs to *validsubplans and RT indexes of
+ * the corresponding leaf partitions to *scan_leafpart_rtis (if asked for).
  */
 static void
 find_matching_subplans_recurse(PartitionPruningData *prunedata,
 							   PartitionedRelPruningData *pprune,
 							   bool initial_prune,
-							   Bitmapset **validsubplans)
+							   Bitmapset **validsubplans,
+							   Bitmapset **scan_leafpart_rtis)
 {
 	Bitmapset  *partset;
 	int			i;
@@ -2367,8 +2527,14 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
 	while ((i = bms_next_member(partset, i)) >= 0)
 	{
 		if (pprune->subplan_map[i] >= 0)
+		{
 			*validsubplans = bms_add_member(*validsubplans,
 											pprune->subplan_map[i]);
+			Assert(pprune->rti_map[i] > 0);
+			if (scan_leafpart_rtis)
+				*scan_leafpart_rtis = bms_add_member(*scan_leafpart_rtis,
+													 pprune->rti_map[i]);
+		}
 		else
 		{
 			int			partidx = pprune->subpart_map[i];
@@ -2376,7 +2542,8 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
 			if (partidx >= 0)
 				find_matching_subplans_recurse(prunedata,
 											   &prunedata->partrelprunedata[partidx],
-											   initial_prune, validsubplans);
+											   initial_prune, validsubplans,
+											   scan_leafpart_rtis);
 			else
 			{
 				/*
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 21f4c10937..67a58c7163 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -134,6 +134,7 @@ CreateExecutorState(void)
 	estate->es_param_exec_vals = NULL;
 
 	estate->es_queryEnv = NULL;
+	estate->es_part_prune_results = NIL;
 
 	estate->es_query_cxt = qcontext;
 
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index e134a82ff7..18d3b98cdc 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -842,7 +842,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
 	else
 		dest = None_Receiver;
 
-	es->qd = CreateQueryDesc(es->stmt,
+	es->qd = CreateQueryDesc(es->stmt, NIL,
 							 fcache->src,
 							 GetActiveSnapshot(),
 							 InvalidSnapshot,
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index c6f86a6510..96880e122a 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -155,7 +155,8 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 		 * subplan, we can fill as_valid_subplans immediately, preventing
 		 * later calls to ExecFindMatchingSubPlans.
 		 */
-		if (!prunestate->do_exec_prune && nplans > 0)
+		if (appendstate->as_prune_state == NULL ||
+			(!appendstate->as_prune_state->do_exec_prune && nplans > 0))
 			appendstate->as_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
 	}
 	else
@@ -577,7 +578,7 @@ choose_next_subplan_locally(AppendState *node)
 		}
 		else if (node->as_valid_subplans == NULL)
 			node->as_valid_subplans =
-				ExecFindMatchingSubPlans(node->as_prune_state, false);
+				ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 
 		whichplan = -1;
 	}
@@ -642,7 +643,7 @@ choose_next_subplan_for_leader(AppendState *node)
 		if (node->as_valid_subplans == NULL)
 		{
 			node->as_valid_subplans =
-				ExecFindMatchingSubPlans(node->as_prune_state, false);
+				ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 
 			/*
 			 * Mark each invalid plan as finished to allow the loop below to
@@ -717,7 +718,7 @@ choose_next_subplan_for_worker(AppendState *node)
 	else if (node->as_valid_subplans == NULL)
 	{
 		node->as_valid_subplans =
-			ExecFindMatchingSubPlans(node->as_prune_state, false);
+			ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 		mark_invalid_subplans_as_finished(node);
 	}
 
@@ -868,7 +869,7 @@ ExecAppendAsyncBegin(AppendState *node)
 	if (node->as_valid_subplans == NULL)
 	{
 		node->as_valid_subplans =
-			ExecFindMatchingSubPlans(node->as_prune_state, false);
+			ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 
 		classify_matching_subplans(node);
 	}
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index 8d35860c30..2312e5a633 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -103,7 +103,8 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
 		 * subplan, we can fill ms_valid_subplans immediately, preventing
 		 * later calls to ExecFindMatchingSubPlans.
 		 */
-		if (!prunestate->do_exec_prune && nplans > 0)
+		if (mergestate->ms_prune_state == NULL ||
+			(!mergestate->ms_prune_state->do_exec_prune && nplans > 0))
 			mergestate->ms_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
 	}
 	else
@@ -218,7 +219,7 @@ ExecMergeAppend(PlanState *pstate)
 		 */
 		if (node->ms_valid_subplans == NULL)
 			node->ms_valid_subplans =
-				ExecFindMatchingSubPlans(node->ms_prune_state, false);
+				ExecFindMatchingSubPlans(node->ms_prune_state, false, NULL);
 
 		/*
 		 * First time through: pull the first tuple from each valid subplan,
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index fd5796f1b9..93012a5b3b 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1578,6 +1578,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 	CachedPlanSource *plansource;
 	CachedPlan *cplan;
 	List	   *stmt_list;
+	List	   *part_prune_results_list;
 	char	   *query_string;
 	Snapshot	snapshot;
 	MemoryContext oldcontext;
@@ -1657,7 +1658,10 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 	 */
 
 	/* Replan if needed, and increment plan refcount for portal */
-	cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv);
+	cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv,
+						  &part_prune_results_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_results_list));
 	stmt_list = cplan->stmt_list;
 
 	if (!plan->saved)
@@ -1685,6 +1689,9 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 					  stmt_list,
 					  cplan);
 
+	/* Copy Lists of PartitionPruneResults into the portal's context. */
+	PortalStorePartitionPruneResults(portal, part_prune_results_list);
+
 	/*
 	 * Set up options for portal.  Default SCROLL type is chosen the same way
 	 * as PerformCursorOpen does it.
@@ -2092,7 +2099,8 @@ SPI_plan_get_cached_plan(SPIPlanPtr plan)
 	/* Get the generic plan for the query */
 	cplan = GetCachedPlan(plansource, NULL,
 						  plan->saved ? CurrentResourceOwner : NULL,
-						  _SPI_current->queryEnv);
+						  _SPI_current->queryEnv,
+						  NULL /* Not interested in PartitionPruneResults */);
 	Assert(cplan == plansource->gplan);
 
 	/* Pop the error context stack */
@@ -2473,7 +2481,9 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 	{
 		CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
 		List	   *stmt_list;
-		ListCell   *lc2;
+		List	   *part_prune_results_list;
+		ListCell   *lc2,
+				   *lc3;
 
 		spicallbackarg.query = plansource->query_string;
 
@@ -2549,8 +2559,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 		 * plan, the refcount must be backed by the plan_owner.
 		 */
 		cplan = GetCachedPlan(plansource, options->params,
-							  plan_owner, _SPI_current->queryEnv);
-
+							  plan_owner, _SPI_current->queryEnv,
+							  &part_prune_results_list);
+		Assert(list_length(cplan->stmt_list) ==
+			   list_length(part_prune_results_list));
 		stmt_list = cplan->stmt_list;
 
 		/*
@@ -2589,9 +2601,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 			}
 		}
 
-		foreach(lc2, stmt_list)
+		forboth(lc2, stmt_list, lc3, part_prune_results_list)
 		{
 			PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+			List	   *part_prune_results = lfirst_node(List, lc3);
 			bool		canSetTag = stmt->canSetTag;
 			DestReceiver *dest;
 
@@ -2663,7 +2676,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 				else
 					snap = InvalidSnapshot;
 
-				qdesc = CreateQueryDesc(stmt,
+				qdesc = CreateQueryDesc(stmt, part_prune_results,
 										plansource->query_string,
 										snap, crosscheck_snapshot,
 										dest,
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index b4ff855f7c..77990a2732 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -158,6 +158,11 @@
 	token = pg_strtok(&length);		/* skip :fldname */ \
 	local_node->fldname = readIntCols(len)
 
+/* Read an Index array */
+#define READ_INDEX_ARRAY(fldname, len) \
+	token = pg_strtok(&length);		/* skip :fldname */ \
+	local_node->fldname = readIndexCols(len)
+
 /* Read a bool array */
 #define READ_BOOL_ARRAY(fldname, len) \
 	token = pg_strtok(&length);		/* skip :fldname */ \
@@ -795,7 +800,6 @@ fnname(int numCols) \
  */
 READ_SCALAR_ARRAY(readAttrNumberCols, int16, atoi)
 READ_SCALAR_ARRAY(readOidCols, Oid, atooid)
-/* outfuncs.c has writeIndexCols, but we don't yet need that here */
-/* READ_SCALAR_ARRAY(readIndexCols, Index, atoui) */
+READ_SCALAR_ARRAY(readIndexCols, Index, atoui)
 READ_SCALAR_ARRAY(readIntCols, int, atoi)
 READ_SCALAR_ARRAY(readBoolCols, bool, strtobool)
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 240d50f1c0..b7801ea04c 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -520,7 +520,9 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 	result->parallelModeNeeded = glob->parallelModeNeeded;
 	result->planTree = top_plan;
 	result->partPruneInfos = glob->partPruneInfos;
+	result->containsInitialPruning = glob->containsInitialPruning;
 	result->rtable = glob->finalrtable;
+	result->minLockRelids = glob->minLockRelids;
 	result->resultRelations = glob->resultRelations;
 	result->appendRelations = glob->appendRelations;
 	result->subplans = glob->subplans;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 720f20f563..61d6934978 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -270,6 +270,16 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 	 */
 	add_rtes_to_flat_rtable(root, false);
 
+	/*
+	 * Add the query's adjusted range of RT indexes to glob->minLockRelids.
+	 * The adjusted RT indexes of prunable relations will be deleted from the
+	 * set below where PartitionPruneInfos are processed.
+	 */
+	glob->minLockRelids =
+		bms_add_range(glob->minLockRelids,
+					  rtoffset + 1,
+					  rtoffset + list_length(root->parse->rtable));
+
 	/*
 	 * Adjust RT indexes of PlanRowMarks and add to final rowmarks list
 	 */
@@ -352,6 +362,7 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 	foreach (lc, root->partPruneInfos)
 	{
 		PartitionPruneInfo *pruneinfo = lfirst(lc);
+		Bitmapset *leafpart_rtis = NULL;
 		ListCell  *l;
 
 		foreach(l, pruneinfo->prune_infos)
@@ -362,15 +373,50 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 			foreach(l2, prune_infos)
 			{
 				PartitionedRelPruneInfo *pinfo = lfirst(l2);
+				int		i;
 
 				/* RT index of the table to which the pinfo belongs. */
 				pinfo->rtindex += rtoffset;
+
+				/* Also of the leaf partitions that might be scanned. */
+				for (i = 0; i < pinfo->nparts; i++)
+				{
+					if (pinfo->rti_map[i] > 0 && pinfo->subplan_map[i] >= 0)
+					{
+						pinfo->rti_map[i] += rtoffset;
+						leafpart_rtis = bms_add_member(leafpart_rtis,
+													   pinfo->rti_map[i]);
+					}
+				}
 			}
 		}
 
+		if (pruneinfo->needs_init_pruning)
+		{
+			glob->containsInitialPruning = true;
+
+			/*
+			 * Delete the leaf partition RTIs from the global set of relations
+			 * to be locked before executing the plan.  AcquireExecutorLocks()
+			 * will find the ones to add to the set after performing initial
+			 * pruning.
+			 */
+			glob->minLockRelids = bms_del_members(glob->minLockRelids,
+												  leafpart_rtis);
+		}
+
 		glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
 	}
 
+	/*
+	 * It seems worth doing a bms_copy() on glob->minLockRelids if we deleted
+	 * bits from it above to get rid of any empty tail bits.  It seems better
+	 * for the loop over this set in AcquireExecutorLocks() to not have to go
+	 * through those useless bit words.
+	 */
+	if (glob->containsInitialPruning)
+		glob->minLockRelids = bms_copy(glob->minLockRelids);
+
 	return result;
 }
 
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 6565b6ed01..37f3e6af61 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -144,7 +144,9 @@ static List *make_partitionedrel_pruneinfo(PlannerInfo *root,
 										   List *prunequal,
 										   Bitmapset *partrelids,
 										   int *relid_subplan_map,
-										   Bitmapset **matchedsubplans);
+										   Bitmapset **matchedsubplans,
+										   bool *needs_init_pruning,
+										   bool *needs_exec_pruning);
 static void gen_partprune_steps(RelOptInfo *rel, List *clauses,
 								PartClauseTarget target,
 								GeneratePruningStepsContext *context);
@@ -234,6 +236,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	int		   *relid_subplan_map;
 	ListCell   *lc;
 	int			i;
+	bool		needs_init_pruning = false;
+	bool		needs_exec_pruning = false;
 
 	/*
 	 * Scan the subpaths to see which ones are scans of partition child
@@ -313,12 +317,16 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		Bitmapset  *partrelids = (Bitmapset *) lfirst(lc);
 		List	   *pinfolist;
 		Bitmapset  *matchedsubplans = NULL;
+		bool		partrel_needs_init_pruning;
+		bool		partrel_needs_exec_pruning;
 
 		pinfolist = make_partitionedrel_pruneinfo(root, parentrel,
 												  prunequal,
 												  partrelids,
 												  relid_subplan_map,
-												  &matchedsubplans);
+												  &matchedsubplans,
+												  &partrel_needs_init_pruning,
+												  &partrel_needs_exec_pruning);
 
 		/* When pruning is possible, record the matched subplans */
 		if (pinfolist != NIL)
@@ -327,6 +335,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 			allmatchedsubplans = bms_join(matchedsubplans,
 										  allmatchedsubplans);
 		}
+
+		needs_init_pruning |= partrel_needs_init_pruning;
+		needs_exec_pruning |= partrel_needs_exec_pruning;
 	}
 
 	pfree(relid_subplan_map);
@@ -341,6 +352,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	/* Else build the result data structure */
 	pruneinfo = makeNode(PartitionPruneInfo);
 	pruneinfo->prune_infos = prunerelinfos;
+	pruneinfo->needs_init_pruning = needs_init_pruning;
+	pruneinfo->needs_exec_pruning = needs_exec_pruning;
 
 	/*
 	 * Some subplans may not belong to any of the identified partitioned rels.
@@ -441,13 +454,18 @@ add_part_relids(List *allpartrelids, Bitmapset *partrelids)
  * If we cannot find any useful run-time pruning steps, return NIL.
  * However, on success, each rel identified in partrelids will have
  * an element in the result list, even if some of them are useless.
+ * *needs_init_pruning and *needs_exec_pruning are set to indicate that the
+ * returned PartitionedRelPruneInfos contain pruning steps that can be
+ * performed before and after execution begins, respectively.
  */
 static List *
 make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 							  List *prunequal,
 							  Bitmapset *partrelids,
 							  int *relid_subplan_map,
-							  Bitmapset **matchedsubplans)
+							  Bitmapset **matchedsubplans,
+							  bool *needs_init_pruning,
+							  bool *needs_exec_pruning)
 {
 	RelOptInfo *targetpart = NULL;
 	List	   *pinfolist = NIL;
@@ -458,6 +476,10 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	int			rti;
 	int			i;
 
+	/* Will find out below. */
+	*needs_init_pruning = false;
+	*needs_exec_pruning = false;
+
 	/*
 	 * Examine each partitioned rel, constructing a temporary array to map
 	 * from planner relids to index of the partitioned rel, and building a
@@ -545,6 +567,9 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		 * executor per-scan pruning steps.  This first pass creates startup
 		 * pruning steps and detects whether there's any possibly-useful quals
 		 * that would require per-scan pruning.
+		 *
+		 * The first pass also determines whether the 2nd pass is necessary by
+		 * checking for the presence of EXEC parameters.
 		 */
 		gen_partprune_steps(subpart, partprunequal, PARTTARGET_INITIAL,
 							&context);
@@ -619,6 +644,12 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		pinfo->execparamids = execparamids;
 		/* Remaining fields will be filled in the next loop */
 
+		/* record which types of pruning steps we've seen so far */
+		if (initial_pruning_steps != NIL)
+			*needs_init_pruning = true;
+		if (exec_pruning_steps != NIL)
+			*needs_exec_pruning = true;
+
 		pinfolist = lappend(pinfolist, pinfo);
 	}
 
@@ -646,6 +677,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		int		   *subplan_map;
 		int		   *subpart_map;
 		Oid		   *relid_map;
+		Index	   *rti_map;
 
 		/*
 		 * Construct the subplan and subpart maps for this partitioning level.
@@ -658,6 +690,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		subpart_map = (int *) palloc(nparts * sizeof(int));
 		memset(subpart_map, -1, nparts * sizeof(int));
 		relid_map = (Oid *) palloc0(nparts * sizeof(Oid));
+		rti_map = (Index *) palloc0(nparts * sizeof(Index));
 		present_parts = NULL;
 
 		i = -1;
@@ -672,6 +705,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 			subplan_map[i] = subplanidx = relid_subplan_map[partrel->relid] - 1;
 			subpart_map[i] = subpartidx = relid_subpart_map[partrel->relid] - 1;
 			relid_map[i] = planner_rt_fetch(partrel->relid, root)->relid;
+			rti_map[i] = partrel->relid;
 			if (subplanidx >= 0)
 			{
 				present_parts = bms_add_member(present_parts, i);
@@ -696,6 +730,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		pinfo->subplan_map = subplan_map;
 		pinfo->subpart_map = subpart_map;
 		pinfo->relid_map = relid_map;
+		pinfo->rti_map = rti_map;
 	}
 
 	pfree(relid_subpart_map);
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index a9a1851c94..a1be8179e8 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1598,6 +1598,7 @@ exec_bind_message(StringInfo input_message)
 	int16	   *rformats = NULL;
 	CachedPlanSource *psrc;
 	CachedPlan *cplan;
+	List	   *part_prune_results_list;
 	Portal		portal;
 	char	   *query_string;
 	char	   *saved_stmt_name;
@@ -1972,7 +1973,9 @@ exec_bind_message(StringInfo input_message)
 	 * will be generated in MessageContext.  The plan refcount will be
 	 * assigned to the Portal, so it will be released at portal destruction.
 	 */
-	cplan = GetCachedPlan(psrc, params, NULL, NULL);
+	cplan = GetCachedPlan(psrc, params, NULL, NULL, &part_prune_results_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_results_list));
 
 	/*
 	 * Now we can define the portal.
@@ -1987,6 +1990,9 @@ exec_bind_message(StringInfo input_message)
 					  cplan->stmt_list,
 					  cplan);
 
+	/* Copy Lists of PartitionPruneResults into the portal's context. */
+	PortalStorePartitionPruneResults(portal, part_prune_results_list);
+
 	/* Done with the snapshot used for parameter I/O and parsing/planning */
 	if (snapshot_set)
 		PopActiveSnapshot();
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 5aa5a350f3..226ee81b63 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -35,7 +35,7 @@
 Portal		ActivePortal = NULL;
 
 
-static void ProcessQuery(PlannedStmt *plan,
+static void ProcessQuery(PlannedStmt *plan, List *part_prune_results,
 						 const char *sourceText,
 						 ParamListInfo params,
 						 QueryEnvironment *queryEnv,
@@ -65,6 +65,7 @@ static void DoPortalRewind(Portal portal);
  */
 QueryDesc *
 CreateQueryDesc(PlannedStmt *plannedstmt,
+				List *part_prune_results,
 				const char *sourceText,
 				Snapshot snapshot,
 				Snapshot crosscheck_snapshot,
@@ -77,6 +78,8 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
 
 	qd->operation = plannedstmt->commandType;	/* operation */
 	qd->plannedstmt = plannedstmt;	/* plan */
+	qd->part_prune_results = part_prune_results; /* ExecutorDoInitialPruning()
+												  * output for plan */
 	qd->sourceText = sourceText;	/* query text */
 	qd->snapshot = RegisterSnapshot(snapshot);	/* snapshot */
 	/* RI check snapshot */
@@ -122,6 +125,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  *		PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
  *
  *	plan: the plan tree for the query
+ *	part_prune_results: ExecutorDoInitialPruning() output for the PlannedStmt
  *	sourceText: the source text of the query
  *	params: any parameters needed
  *	dest: where to send results
@@ -134,6 +138,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  */
 static void
 ProcessQuery(PlannedStmt *plan,
+			 List *part_prune_results,
 			 const char *sourceText,
 			 ParamListInfo params,
 			 QueryEnvironment *queryEnv,
@@ -145,7 +150,7 @@ ProcessQuery(PlannedStmt *plan,
 	/*
 	 * Create the QueryDesc object
 	 */
-	queryDesc = CreateQueryDesc(plan, sourceText,
+	queryDesc = CreateQueryDesc(plan, part_prune_results, sourceText,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, params, queryEnv, 0);
 
@@ -491,8 +496,13 @@ PortalStart(Portal portal, ParamListInfo params,
 				/*
 				 * Create QueryDesc in portal's context; for the moment, set
 				 * the destination to DestNone.
+				 *
+				 * There is no PartitionPruneResult unless the PlannedStmt is
+				 * from a CachedPlan.
 				 */
 				queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+											portal->part_prune_results_list == NIL ? NIL :
+											linitial(portal->part_prune_results_list),
 											portal->sourceText,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
@@ -1225,6 +1235,8 @@ PortalRunMulti(Portal portal,
 
 		if (pstmt->utilityStmt == NULL)
 		{
+			List *part_prune_results = NIL;
+
 			/*
 			 * process a plannable query.
 			 */
@@ -1271,10 +1283,18 @@ PortalRunMulti(Portal portal,
 			else
 				UpdateActiveSnapshotCommandId();
 
+			/*
+			 * Determine if there's a corresponding List of PartitionPruneResult
+			 * for this PlannedStmt.
+			 */
+			if (portal->part_prune_results_list != NIL)
+				part_prune_results = list_nth(portal->part_prune_results_list,
+											  foreach_current_index(stmtlist_item));
+
 			if (pstmt->canSetTag)
 			{
 				/* statement can set tag string */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, part_prune_results,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
@@ -1283,7 +1303,7 @@ PortalRunMulti(Portal portal,
 			else
 			{
 				/* stmt added by rewrite cannot set tag */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, part_prune_results,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 0d6a295674..957221c47e 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -99,14 +99,19 @@ static dlist_head cached_expression_list = DLIST_STATIC_INIT(cached_expression_l
 static void ReleaseGenericPlan(CachedPlanSource *plansource);
 static List *RevalidateCachedQuery(CachedPlanSource *plansource,
 								   QueryEnvironment *queryEnv);
-static bool CheckCachedPlan(CachedPlanSource *plansource);
+static bool CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+							List **part_prune_results_list);
 static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
-								   ParamListInfo boundParams, QueryEnvironment *queryEnv);
+								   ParamListInfo boundParams, QueryEnvironment *queryEnv,
+								   List **part_prune_results_list);
 static bool choose_custom_plan(CachedPlanSource *plansource,
 							   ParamListInfo boundParams);
 static double cached_plan_cost(CachedPlan *plan, bool include_planner);
 static Query *QueryListGetPrimaryStmt(List *stmts);
-static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static void AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+								 List **part_prune_results_list,
+								 List **lockedRelids_per_stmt);
+static void ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt);
 static void AcquirePlannerLocks(List *stmt_list, bool acquire);
 static void ScanQueryForLocks(Query *parsetree, bool acquire);
 static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -782,6 +787,26 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
 	return tlist;
 }
 
+/*
+ * FreePartitionPruneResults
+ *		Frees the List of Lists of PartitionPruneResults for CheckCachedPlan()
+ */
+static void
+FreePartitionPruneResults(List *part_prune_results_list)
+{
+	ListCell *lc;
+
+	foreach(lc, part_prune_results_list)
+	{
+		List *part_prune_results = lfirst(lc);
+
+		/* Free both the PartitionPruneResults and the containing List. */
+		list_free_deep(part_prune_results);
+	}
+
+	list_free(part_prune_results_list);
+}
+
 /*
  * CheckCachedPlan: see if the CachedPlanSource's generic plan is valid.
  *
@@ -790,15 +815,20 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
  *
  * On a "true" return, we have acquired the locks needed to run the plan.
  * (We must do this for the "true" result to be race-condition-free.)
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_results_list.
  */
 static bool
-CheckCachedPlan(CachedPlanSource *plansource)
+CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+				List **part_prune_results_list)
 {
 	CachedPlan *plan = plansource->gplan;
 
 	/* Assert that caller checked the querytree */
 	Assert(plansource->is_valid);
 
+	*part_prune_results_list = NIL;
+
 	/* If there's no generic plan, just say "false" */
 	if (!plan)
 		return false;
@@ -820,13 +850,21 @@ CheckCachedPlan(CachedPlanSource *plansource)
 	 */
 	if (plan->is_valid)
 	{
+		List   *lockedRelids_per_stmt;
+
 		/*
 		 * Plan must have positive refcount because it is referenced by
 		 * plansource; so no need to fear it disappears under us here.
 		 */
 		Assert(plan->refcount > 0);
 
-		AcquireExecutorLocks(plan->stmt_list, true);
+		/*
+		 * Lock relations scanned by the plan.  This is where the pruning
+		 * happens if needed.
+		 */
+		AcquireExecutorLocks(plan->stmt_list, boundParams,
+							 part_prune_results_list,
+							 &lockedRelids_per_stmt);
 
 		/*
 		 * If plan was transient, check to see if TransactionXmin has
@@ -848,7 +886,11 @@ CheckCachedPlan(CachedPlanSource *plansource)
 		}
 
 		/* Oops, the race case happened.  Release useless locks. */
-		AcquireExecutorLocks(plan->stmt_list, false);
+		ReleaseExecutorLocks(plan->stmt_list, lockedRelids_per_stmt);
+
+		/* Release any PartitionPruneResults that may have been created. */
+		FreePartitionPruneResults(*part_prune_results_list);
+		*part_prune_results_list = NIL;
 	}
 
 	/*
@@ -874,10 +916,14 @@ CheckCachedPlan(CachedPlanSource *plansource)
  * Planning work is done in the caller's memory context.  The finished plan
  * is in a child memory context, which typically should get reparented
  * (unless this is a one-shot plan, in which case we don't copy the plan).
+ *
+ * A list of NILs is returned in *part_prune_results_list, meaning that
+ * no partition pruning has been done yet for the plans in stmt_list.
  */
 static CachedPlan *
 BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
-				ParamListInfo boundParams, QueryEnvironment *queryEnv)
+				ParamListInfo boundParams, QueryEnvironment *queryEnv,
+				List **part_prune_results_list)
 {
 	CachedPlan *plan;
 	List	   *plist;
@@ -1007,6 +1053,17 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 
 	MemoryContextSwitchTo(oldcxt);
 
+	/*
+	 * There are no actual PartitionPruneResults to add yet, but we must
+	 * initialize the list to have the same number of elements as the list of
+	 * PlannedStmts.
+	 */
+	*part_prune_results_list = NIL;
+	foreach(lc, plist)
+	{
+		*part_prune_results_list = lappend(*part_prune_results_list, NIL);
+	}
+
 	return plan;
 }
 
@@ -1126,6 +1183,19 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
  * plan or a custom plan for the given parameters: the caller does not know
  * which it will get.
  *
+ * For every PlannedStmt found in the returned CachedPlan, an element that
+ * is either a List of PartitionPruneResults or NIL is added to
+ * *part_prune_results_list.  It is the former if the PlannedStmt is from
+ * the existing CachedPlan that is otherwise valid and has
+ * containsInitialPruning set to true.  Before returning such a CachedPlan,
+ * its "initial" steps are performed by calling ExecutorDoInitialPruning()
+ * so that AcquireExecutorLocks() needs to lock only those leaf partitions
+ * whose subplans are not pruned away by the "initial" pruning conditions.
+ * For each PartitionPruneInfo found in PlannedStmt.partPruneInfos, a
+ * PartitionPruneResult containing the bitmapset of the indexes of
+ * surviving subplans is added to the List for the
+ * PlannedStmt.
+ *
  * On return, the plan is valid and we have sufficient locks to begin
  * execution.
  *
@@ -1139,11 +1209,13 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
  */
 CachedPlan *
 GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
-			  ResourceOwner owner, QueryEnvironment *queryEnv)
+			  ResourceOwner owner, QueryEnvironment *queryEnv,
+			  List **part_prune_results_list)
 {
 	CachedPlan *plan = NULL;
 	List	   *qlist;
 	bool		customplan;
+	List	   *my_part_prune_results_list;
 
 	/* Assert caller is doing things in a sane order */
 	Assert(plansource->magic == CACHEDPLANSOURCE_MAGIC);
@@ -1160,7 +1232,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 
 	if (!customplan)
 	{
-		if (CheckCachedPlan(plansource))
+		if (CheckCachedPlan(plansource, boundParams,
+							&my_part_prune_results_list))
 		{
 			/* We want a generic plan, and we already have a valid one */
 			plan = plansource->gplan;
@@ -1169,7 +1242,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 		else
 		{
 			/* Build a new generic plan */
-			plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv);
+			plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv,
+								   &my_part_prune_results_list);
 			/* Just make real sure plansource->gplan is clear */
 			ReleaseGenericPlan(plansource);
 			/* Link the new generic plan into the plansource */
@@ -1214,7 +1288,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 	if (customplan)
 	{
 		/* Build a custom plan */
-		plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv);
+		plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv,
+							   &my_part_prune_results_list);
 		/* Accumulate total costs of custom plans */
 		plansource->total_custom_cost += cached_plan_cost(plan, true);
 
@@ -1246,6 +1321,9 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 		plan->is_saved = true;
 	}
 
+	if (part_prune_results_list)
+		*part_prune_results_list = my_part_prune_results_list;
+
 	return plan;
 }
 
@@ -1737,17 +1815,29 @@ QueryListGetPrimaryStmt(List *stmts)
 
 /*
  * AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
- * or release them if acquire is false.
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_results_list.
+ *
+ * On return, *lockedRelids_per_stmt will contain a bitmapset for every
+ * PlannedStmt in stmt_list, containing the RT indexes of relation entries
+ * in its range table that were actually locked, or NULL if the PlannedStmt
+ * contains a utility statement.
  */
 static void
-AcquireExecutorLocks(List *stmt_list, bool acquire)
+AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+					 List **part_prune_results_list,
+					 List **lockedRelids_per_stmt)
 {
 	ListCell   *lc1;
 
+	*part_prune_results_list = *lockedRelids_per_stmt = NIL;
 	foreach(lc1, stmt_list)
 	{
 		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
-		ListCell   *lc2;
+		List	   *part_prune_results = NIL;
+		Bitmapset  *allLockRelids;
+		Bitmapset  *lockedRelids = NULL;
+		int			rti;
 
 		if (plannedstmt->commandType == CMD_UTILITY)
 		{
@@ -1761,13 +1851,40 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 			Query	   *query = UtilityContainsQuery(plannedstmt->utilityStmt);
 
 			if (query)
-				ScanQueryForLocks(query, acquire);
+				ScanQueryForLocks(query, true);
+			*part_prune_results_list = lappend(*part_prune_results_list, NIL);
 			continue;
 		}
 
-		foreach(lc2, plannedstmt->rtable)
+		/*
+		 * Figure out the set of relations that would need to be locked
+		 * before executing the plan.
+		 */
+		if (plannedstmt->containsInitialPruning)
 		{
-			RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
+			Bitmapset *scan_leafpart_rtis = NULL;
+
+			/*
+			 * Obtain the set of leaf partitions to be locked.
+			 *
+			 * The following does initial partition pruning using the
+			 * PartitionPruneInfos found in plannedstmt->partPruneInfos and
+			 * finds leaf partitions that survive that pruning across all the
+			 * nodes in the plan tree.
+			 */
+			part_prune_results = ExecutorDoInitialPruning(plannedstmt,
+														  boundParams,
+														  &scan_leafpart_rtis);
+			allLockRelids = bms_union(plannedstmt->minLockRelids,
+									  scan_leafpart_rtis);
+		}
+		else
+			allLockRelids = plannedstmt->minLockRelids;
+
+		rti = -1;
+		while ((rti = bms_next_member(allLockRelids, rti)) > 0)
+		{
+			RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
 
 			if (rte->rtekind != RTE_RELATION)
 				continue;
@@ -1778,10 +1895,59 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 			 * fail if it's been dropped entirely --- we'll just transiently
 			 * acquire a non-conflicting lock.
 			 */
-			if (acquire)
-				LockRelationOid(rte->relid, rte->rellockmode);
-			else
-				UnlockRelationOid(rte->relid, rte->rellockmode);
+			LockRelationOid(rte->relid, rte->rellockmode);
+			lockedRelids = bms_add_member(lockedRelids, rti);
+		}
+
+		*part_prune_results_list = lappend(*part_prune_results_list,
+										   part_prune_results);
+		*lockedRelids_per_stmt = lappend(*lockedRelids_per_stmt, lockedRelids);
+	}
+}
+
+/*
+ * ReleaseExecutorLocks
+ * 		Release locks that would've been acquired by an earlier call to
+ * 		AcquireExecutorLocks()
+ */
+static void
+ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt)
+{
+	ListCell   *lc1,
+			   *lc2;
+
+	forboth(lc1, stmt_list, lc2, lockedRelids_per_stmt)
+	{
+		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
+		Bitmapset  *lockedRelids = lfirst(lc2);
+		int			rti;
+
+		if (plannedstmt->commandType == CMD_UTILITY)
+		{
+			/*
+			 * Ignore utility statements, except those (such as EXPLAIN) that
+			 * contain a parsed-but-not-planned query.  Note: it's okay to use
+			 * ScanQueryForLocks, even though the query hasn't been through
+			 * rule rewriting, because rewriting doesn't change the query
+			 * representation.
+			 */
+			Query	   *query = UtilityContainsQuery(plannedstmt->utilityStmt);
+
+			Assert(lockedRelids == NULL);
+			if (query)
+				ScanQueryForLocks(query, false);
+			continue;
+		}
+
+		rti = -1;
+		while ((rti = bms_next_member(lockedRelids, rti)) >= 0)
+		{
+			RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
+
+			Assert(rte->rtekind == RTE_RELATION);
+
+			/* See the comment in AcquireExecutorLocks(). */
+			UnlockRelationOid(rte->relid, rte->rellockmode);
 		}
 	}
 }
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index c3e95346b6..74950bd163 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -303,6 +303,25 @@ PortalDefineQuery(Portal portal,
 	portal->status = PORTAL_DEFINED;
 }
 
+/*
+ * PortalStorePartitionPruneResults
+ *		Copy the given List of Lists of PartitionPruneResults into the
+ *		portal's context
+ *
+ * This allows the caller to ensure that the list exists as long as the portal
+ * does.
+ */
+void
+PortalStorePartitionPruneResults(Portal portal, List *part_prune_results_list)
+{
+	MemoryContext	oldcxt;
+
+	AssertArg(PortalIsValid(portal));
+	oldcxt = MemoryContextSwitchTo(portal->portalContext);
+	portal->part_prune_results_list = copyObject(part_prune_results_list);
+	MemoryContextSwitchTo(oldcxt);
+}
+
 /*
  * PortalReleaseCachedPlan
  *		Release a portal's reference to its cached plan, if any.
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 9ebde089ae..269cc4d562 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -87,7 +87,9 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
 							  ExplainState *es, const char *queryString,
 							  ParamListInfo params, QueryEnvironment *queryEnv);
 
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt,
+						   List *part_prune_results,
+						   IntoClause *into,
 						   ExplainState *es, const char *queryString,
 						   ParamListInfo params, QueryEnvironment *queryEnv,
 						   const instr_time *planduration,
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index bf962af7af..bd8776402e 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -45,6 +45,7 @@ extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
  * nparts						Length of subplan_map[] and subpart_map[].
  * subplan_map					Subplan index by partition index, or -1.
  * subpart_map					Subpart index by partition index, or -1.
+ * rti_map						Range table index by partition index, or 0.
  * present_parts				A Bitmapset of the partition indexes that we
  *								have subplans or subparts for.
  * initial_pruning_steps		List of PartitionPruneSteps used to
@@ -61,6 +62,7 @@ typedef struct PartitionedRelPruningData
 	int			nparts;
 	int		   *subplan_map;
 	int		   *subpart_map;
+	Index	   *rti_map;
 	Bitmapset  *present_parts;
 	List	   *initial_pruning_steps;
 	List	   *exec_pruning_steps;
@@ -126,5 +128,10 @@ extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
 													 int part_prune_index,
 													 Bitmapset **initially_valid_subplans);
 extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
-										   bool initial_prune);
+										   bool initial_prune,
+										   Bitmapset **scan_leafpart_rtis);
+extern Bitmapset *ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt,
+								ParamListInfo params,
+								PartitionPruneInfo *pruneinfo,
+								Bitmapset **scan_leafpart_rtis);
 #endif							/* EXECPARTITION_H */
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index e79e2c001f..7d4379da7b 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -35,6 +35,8 @@ typedef struct QueryDesc
 	/* These fields are provided by CreateQueryDesc */
 	CmdType		operation;		/* CMD_SELECT, CMD_UPDATE, etc. */
 	PlannedStmt *plannedstmt;	/* planner's output (could be utility, too) */
+	List		*part_prune_results; /* ExecutorDoInitialPruning()'s
+									  * output for plannedstmt */
 	const char *sourceText;		/* source text of the query */
 	Snapshot	snapshot;		/* snapshot to use for query */
 	Snapshot	crosscheck_snapshot;	/* crosscheck for RI update/delete */
@@ -57,6 +59,7 @@ typedef struct QueryDesc
 
 /* in pquery.c */
 extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
+								  List *part_prune_results,
 								  const char *sourceText,
 								  Snapshot snapshot,
 								  Snapshot crosscheck_snapshot,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index ed95ed1176..c9a5e5fb68 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -185,6 +185,9 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
 /*
  * prototypes from functions in execMain.c
  */
+extern List *ExecutorDoInitialPruning(PlannedStmt *plannedstmt,
+									  ParamListInfo params,
+									  Bitmapset **scan_leafpart_rtis);
 extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
 extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
 extern void ExecutorRun(QueryDesc *queryDesc,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 4a741b053f..521a60b988 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -612,6 +612,7 @@ typedef struct EState
 										 * ExecRowMarks, or NULL if none */
 	PlannedStmt *es_plannedstmt;	/* link to top of plan tree */
 	List		*es_part_prune_infos;	/* PlannedStmt.partPruneInfos */
+	List		*es_part_prune_results; /* QueryDesc.part_prune_results */
 	const char *es_sourceText;	/* Source text from QueryDesc */
 
 	JunkFilter *es_junkFilter;	/* top-level junk filter, if any */
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index a80f43e540..937cc4629d 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -212,6 +212,7 @@ extern struct Bitmapset *readBitmapset(void);
 extern uintptr_t readDatum(bool typbyval);
 extern bool *readBoolCols(int numCols);
 extern int *readIntCols(int numCols);
+extern Index *readIndexCols(int numCols);
 extern Oid *readOidCols(int numCols);
 extern int16 *readAttrNumberCols(int numCols);
 
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index fbe75dca0f..354c2e96c3 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -125,6 +125,18 @@ typedef struct PlannerGlobal
 	/* List of PartitionPruneInfo contained in the plan */
 	List	   *partPruneInfos;
 
+	/*
+	 * Do any of those PartitionPruneInfos have initial pruning steps in them?
+	 */
+	bool		containsInitialPruning;
+
+	/*
+	 * Indexes of all range table entries minus indexes of range table entries
+	 * of the leaf partitions scanned by prunable subplans; see
+	 * AcquireExecutorLocks()
+	 */
+	Bitmapset  *minLockRelids;
+
 	/* OIDs of relations the plan depends on */
 	List	   *relationOids;
 
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 2e132afc5a..c0717bf45e 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -73,8 +73,17 @@ typedef struct PlannedStmt
 	List	   *partPruneInfos;	/* List of PartitionPruneInfo contained in
 								 * the plan */
 
+	bool		containsInitialPruning;	/* Do any of those PartitionPruneInfos
+										 * have initial pruning steps in them?
+										 */
+
 	List	   *rtable;			/* list of RangeTblEntry nodes */
 
+	Bitmapset  *minLockRelids;	/* Indexes of all range table entries minus
+								 * indexes of range table entries of the leaf
+								 * partitions scanned by prunable subplans;
+								 * see AcquireExecutorLocks() */
+
 	/* rtable indexes of target relations for INSERT/UPDATE/DELETE/MERGE */
 	List	   *resultRelations;	/* integer list of RT indexes, or NIL */
 
@@ -1410,6 +1419,13 @@ typedef struct PlanRowMark
  * prune_infos			List of Lists containing PartitionedRelPruneInfo nodes,
  *						one sublist per run-time-prunable partition hierarchy
  *						appearing in the parent plan node's subplans.
+ *
+ * needs_init_pruning	Do any of the PartitionedRelPruneInfos in
+ *						prune_infos have initial_pruning_steps set?
+ *
+ * needs_exec_pruning	Do any of the PartitionedRelPruneInfos in
+ *						prune_infos have exec_pruning_steps set?
+ *
  * other_subplans		Indexes of any subplans that are not accounted for
  *						by any of the PartitionedRelPruneInfo nodes in
  *						"prune_infos".  These subplans must not be pruned.
@@ -1420,6 +1436,8 @@ typedef struct PartitionPruneInfo
 
 	NodeTag		type;
 	List	   *prune_infos;
+	bool		needs_init_pruning;
+	bool		needs_exec_pruning;
 	Bitmapset  *other_subplans;
 } PartitionPruneInfo;
 
@@ -1464,6 +1482,9 @@ typedef struct PartitionedRelPruneInfo
 	/* relation OID by partition index, or 0 */
 	Oid		   *relid_map pg_node_attr(array_size(nparts));
 
+	/* Range table index by partition index, or 0. */
+	Index	   *rti_map pg_node_attr(array_size(nparts));
+
 	/*
 	 * initial_pruning_steps shows how to prune during executor startup (i.e.,
 	 * without use of any PARAM_EXEC Params); it is NIL if no startup pruning
@@ -1548,6 +1569,31 @@ typedef struct PartitionPruneStepCombine
 	List	   *source_stepids;
 } PartitionPruneStepCombine;
 
+/*----------------
+ * PartitionPruneResult
+ *
+ * The result of performing ExecPartitionDoInitialPruning() on a given
+ * PartitionPruneInfo.
+ *
+ * valid_subplan_offs contains the indexes of subplans remaining after
+ * performing initial pruning by calling ExecFindMatchingSubPlans() on the
+ * PartitionPruneInfo.
+ *
+ * This is used to store the result of initial partition pruning that is
+ * performed before the execution has started.  A module that needs to do so
+ * should call ExecutorDoInitialPruning() on a given PlannedStmt, which
+ * returns a List of PartitionPruneResult containing an entry for each
+ * PartitionPruneInfo present in PlannedStmt.partPruneInfos.  The module
+ * should then pass that list, along with the PlannedStmt, to the executor,
+ * so that it can reuse the result of initial partition pruning when
+ * initializing the subplans for execution.
+ */
+typedef struct PartitionPruneResult
+{
+	NodeTag		type;
+
+	Bitmapset	   *valid_subplan_offs;
+} PartitionPruneResult;
 
 /*
  * Plan invalidation info
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 0499635f59..32579d4788 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -220,7 +220,8 @@ extern List *CachedPlanGetTargetList(CachedPlanSource *plansource,
 extern CachedPlan *GetCachedPlan(CachedPlanSource *plansource,
 								 ParamListInfo boundParams,
 								 ResourceOwner owner,
-								 QueryEnvironment *queryEnv);
+								 QueryEnvironment *queryEnv,
+								 List **part_prune_results_list);
 extern void ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner);
 
 extern bool CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index aeddbdafe5..1901fc5f28 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -138,6 +138,7 @@ typedef struct PortalData
 	QueryCompletion qc;			/* command completion data for executed query */
 	List	   *stmts;			/* list of PlannedStmts */
 	CachedPlan *cplan;			/* CachedPlan, if stmts are from one */
+	List	   *part_prune_results_list;	/* List of Lists of PartitionPruneResults */
 
 	ParamListInfo portalParams; /* params to pass to query */
 	QueryEnvironment *queryEnv; /* environment for query */
@@ -242,6 +243,8 @@ extern void PortalDefineQuery(Portal portal,
 							  CommandTag commandTag,
 							  List *stmts,
 							  CachedPlan *cplan);
+extern void PortalStorePartitionPruneResults(Portal portal,
+											 List *part_prune_results_list);
 extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
 extern void PortalCreateHoldStore(Portal portal);
 extern void PortalHashTableDeleteAll(void);
-- 
2.35.3
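
The locking scheme in the patch above (lock the union of minLockRelids and the leaf partitions that survive initial pruning, remember exactly which RT indexes were locked, and later release only those) can be illustrated with a small standalone sketch.  This is not code from the patch: ToyRelids, ToyLockTable, toy_acquire_locks() and toy_release_locks() are hypothetical stand-ins, a 64-bit word takes the place of a Bitmapset, and a counter array takes the place of the lock manager.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for a Bitmapset of RT indexes; supports indexes 1..63 only. */
typedef uint64_t ToyRelids;

#define TOY_MAX_RTI 63

/* Hypothetical lock table: one hold counter per RT index. */
typedef struct ToyLockTable
{
	int			held[TOY_MAX_RTI + 1];
} ToyLockTable;

/*
 * Lock every relation in the union of the always-locked set and the leaf
 * partitions that survived initial pruning.  Returns the set of RT indexes
 * actually locked, so that the caller can release exactly those later.
 */
static ToyRelids
toy_acquire_locks(ToyLockTable *locks, ToyRelids min_lock_relids,
				  ToyRelids surviving_leafpart_rtis)
{
	ToyRelids	all = min_lock_relids | surviving_leafpart_rtis;
	ToyRelids	locked = 0;

	for (int rti = 1; rti <= TOY_MAX_RTI; rti++)
	{
		if (all & (UINT64_C(1) << rti))
		{
			locks->held[rti]++;
			locked |= UINT64_C(1) << rti;
		}
	}
	return locked;
}

/* Release only the locks recorded by an earlier toy_acquire_locks() call. */
static void
toy_release_locks(ToyLockTable *locks, ToyRelids locked)
{
	for (int rti = 1; rti <= TOY_MAX_RTI; rti++)
	{
		if (locked & (UINT64_C(1) << rti))
			locks->held[rti]--;
	}
}
```

Returning the locked set from the acquire step mirrors why the patch tracks lockedRelids per statement: the release path cannot simply walk the whole range table any more, because pruned partitions were never locked.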




* Re: generic plans and "initial" pruning
@ 2022-11-08 06:22  Amit Langote <[email protected]>
  parent: Amit Langote <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: Amit Langote @ 2022-11-08 06:22 UTC (permalink / raw)
  To: Robert Haas <[email protected]>; +Cc: Jacob Champion <[email protected]>; Zhihong Yu <[email protected]>; David Rowley <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers

On Thu, Oct 27, 2022 at 11:41 AM Amit Langote <[email protected]> wrote:
> On Mon, Oct 17, 2022 at 6:29 PM Amit Langote <[email protected]> wrote:
> > On Wed, Oct 12, 2022 at 4:36 PM Amit Langote <[email protected]> wrote:
> > > On Fri, Jul 29, 2022 at 1:20 PM Amit Langote <[email protected]> wrote:
> > > > On Thu, Jul 28, 2022 at 1:27 AM Robert Haas <[email protected]> wrote:
> > > > > 0001 adds es_part_prune_result but does not use it, so maybe the
> > > > > introduction of that field should be deferred until it's needed for
> > > > > something.
> > > >
> > > > Oops, looks like a mistake when breaking the patch.  Will move that bit to 0002.
> > >
> > > Fixed that and also noticed that I had defined PartitionPruneResult in
> > > the wrong header (execnodes.h).  That led to PartitionPruneResult
> > > nodes not being able to be written and read, because
> > > src/backend/nodes/gen_node_support.pl doesn't create _out* and _read*
> > > routines for the nodes defined in execnodes.h.  I moved its definition
> > > to plannodes.h, even though it is not actually the planner that
> > > instantiates those; no other include/nodes header sounds better.
> > >
> > > One more thing I realized is that Bitmapsets added to the List
> > > PartitionPruneResult.valid_subplan_offs_list are not actually
> > > read/write-able.  That's a problem that I also faced in [1], so I
> > > proposed a patch there to make Bitmapset a read/write-able Node and
> > > mark (only) the Bitmapsets that are added into read/write-able node
> > > trees with the corresponding NodeTag.  I'm including that patch here
> > > as well (0002) for the main patch to work (pass
> > > -DWRITE_READ_PARSE_PLAN_TREES build tests), though it might make sense
> > > to discuss it in its own thread?
> >
> > Had second thoughts on the use of List of Bitmapsets for this, such
> > that the make-Bitmapset-Nodes patch is no longer needed.
> >
> > I had defined PartitionPruneResult such that it stood for the results
> > of pruning for all PartitionPruneInfos contained in
> > PlannedStmt.partPruneInfos (covering all Append/MergeAppend nodes that
> > can use partition pruning in a given plan).  So, it had a List of
> > Bitmapset.  I think it's perhaps better for PartitionPruneResult to
> > cover only one PartitionPruneInfo and thus need only a Bitmapset and
> > not a List thereof, which I have implemented in the attached updated
> > patch 0002.  So, instead of needing to pass around a
> > PartitionPruneResult with each PlannedStmt, this now passes a List of
> > PartitionPruneResult with an entry for each in
> > PlannedStmt.partPruneInfos.
>
> Rebased over 3b2db22fe.

Updated 0002 to cope with AssertArg() being removed from the tree.
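
As a rough illustration of the revised design quoted above (one PartitionPruneResult per PartitionPruneInfo, each holding a single bitmapset rather than a List of them), here is a minimal standalone sketch.  It is not code from the patch: ToyBitmap, ToyPruneResult, ToyPruneResultList and both helper functions are hypothetical stand-ins, and a 64-bit word replaces the real Bitmapset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for a Bitmapset; supports subplan offsets 0..63 only. */
typedef uint64_t ToyBitmap;

/* Hypothetical analogue of PartitionPruneResult: one per PartitionPruneInfo. */
typedef struct ToyPruneResult
{
	ToyBitmap	valid_subplan_offs; /* offsets of subplans that survived */
} ToyPruneResult;

/* Hypothetical analogue of the List passed along with one PlannedStmt. */
typedef struct ToyPruneResultList
{
	size_t		nresults;		/* one entry per PartitionPruneInfo */
	const ToyPruneResult *results;
} ToyPruneResultList;

/*
 * Look up the result for the pruning info at part_prune_index.  Returns NULL
 * when no results were supplied, in which case a caller would have to do the
 * initial pruning itself.
 */
static const ToyPruneResult *
toy_get_prune_result(const ToyPruneResultList *list, size_t part_prune_index)
{
	if (list == NULL || part_prune_index >= list->nresults)
		return NULL;
	return &list->results[part_prune_index];
}

/* Did the subplan at the given offset survive initial pruning? */
static bool
toy_subplan_survived(const ToyPruneResult *result, unsigned subplan_off)
{
	if (result == NULL || subplan_off >= 64)
		return false;
	return ((result->valid_subplan_offs >> subplan_off) & 1) != 0;
}
```

Indexing the list by the same position as the pruning info keeps each result small and self-describing, which is the point of moving from one result with a List of bitmapsets to a List of results with one bitmapset each.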

-- 
Thanks, Amit Langote
EDB: http://www.enterprisedb.com


Attachments:

  [application/octet-stream] v24-0002-Optimize-AcquireExecutorLocks-by-locking-only-un.patch (82.3K, 2-v24-0002-Optimize-AcquireExecutorLocks-by-locking-only-un.patch)
From 8f6456d27efb8719a7dd8a52bf0ad3c5033b31a3 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Wed, 22 Dec 2021 16:55:17 +0900
Subject: [PATCH v24 2/2] Optimize AcquireExecutorLocks() by locking only
 unpruned partitions

This commit teaches AcquireExecutorLocks() to perform initial
partition pruning to notionally eliminate the subnodes contained in a
generic cached plan that need not be initialized during the actual
execution of the plan, and to skip locking the partitions scanned by
those subnodes.

The result of performing initial partition pruning this way, before the
actual execution has started, is passed to the executor as
PartitionPruneResult nodes, which the callers of the executor that used
plancache.c to get the plan supply along with the PlannedStmt.  It is NULL
in the cases in which the plan is obtained by calling the planner
directly or if the plan obtained by plancache.c is not a generic one.
---
 src/backend/commands/copyto.c          |   2 +-
 src/backend/commands/createas.c        |   2 +-
 src/backend/commands/explain.c         |   7 +-
 src/backend/commands/extension.c       |   2 +-
 src/backend/commands/matview.c         |   2 +-
 src/backend/commands/prepare.c         |  26 ++-
 src/backend/executor/README            |  32 ++++
 src/backend/executor/execMain.c        |  51 ++++++
 src/backend/executor/execParallel.c    |  26 ++-
 src/backend/executor/execPartition.c   | 241 +++++++++++++++++++++----
 src/backend/executor/execUtils.c       |   1 +
 src/backend/executor/functions.c       |   2 +-
 src/backend/executor/nodeAppend.c      |  11 +-
 src/backend/executor/nodeMergeAppend.c |   5 +-
 src/backend/executor/spi.c             |  27 ++-
 src/backend/nodes/readfuncs.c          |   8 +-
 src/backend/optimizer/plan/planner.c   |   2 +
 src/backend/optimizer/plan/setrefs.c   |  46 +++++
 src/backend/partitioning/partprune.c   |  41 ++++-
 src/backend/tcop/postgres.c            |   8 +-
 src/backend/tcop/pquery.c              |  28 ++-
 src/backend/utils/cache/plancache.c    | 208 ++++++++++++++++++---
 src/backend/utils/mmgr/portalmem.c     |  19 ++
 src/include/commands/explain.h         |   4 +-
 src/include/executor/execPartition.h   |   9 +-
 src/include/executor/execdesc.h        |   3 +
 src/include/executor/executor.h        |   3 +
 src/include/nodes/execnodes.h          |   1 +
 src/include/nodes/nodes.h              |   1 +
 src/include/nodes/pathnodes.h          |  12 ++
 src/include/nodes/plannodes.h          |  46 +++++
 src/include/utils/plancache.h          |   3 +-
 src/include/utils/portal.h             |   3 +
 33 files changed, 782 insertions(+), 100 deletions(-)

diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index f26cc0d162..401a2280a3 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -558,7 +558,7 @@ BeginCopyTo(ParseState *pstate,
 		((DR_copy *) dest)->cstate = cstate;
 
 		/* Create a QueryDesc requesting no output */
-		cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+		cstate->queryDesc = CreateQueryDesc(plan, NIL, pstate->p_sourcetext,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
 											dest, NULL, NULL, 0);
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 152c29b551..942449544c 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -325,7 +325,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
 		UpdateActiveSnapshotCommandId();
 
 		/* Create a QueryDesc, redirecting output to our tuple receiver */
-		queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+		queryDesc = CreateQueryDesc(plan, NIL, pstate->p_sourcetext,
 									GetActiveSnapshot(), InvalidSnapshot,
 									dest, params, queryEnv, 0);
 
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index f86983c660..2f2b558608 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -407,7 +407,7 @@ ExplainOneQuery(Query *query, int cursorOptions,
 		}
 
 		/* run it (if needed) and produce output */
-		ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+		ExplainOnePlan(plan, NIL, into, es, queryString, params, queryEnv,
 					   &planduration, (es->buffers ? &bufusage : NULL));
 	}
 }
@@ -515,7 +515,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
  * to call it.
  */
 void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, List *part_prune_results,
+			   IntoClause *into, ExplainState *es,
 			   const char *queryString, ParamListInfo params,
 			   QueryEnvironment *queryEnv, const instr_time *planduration,
 			   const BufferUsage *bufusage)
@@ -563,7 +564,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
 		dest = None_Receiver;
 
 	/* Create a QueryDesc for the query */
-	queryDesc = CreateQueryDesc(plannedstmt, queryString,
+	queryDesc = CreateQueryDesc(plannedstmt, part_prune_results, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, params, queryEnv, instrument_option);
 
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index 1a62e5dac5..cc36b6fd15 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -776,7 +776,7 @@ execute_sql_string(const char *sql)
 			{
 				QueryDesc  *qdesc;
 
-				qdesc = CreateQueryDesc(stmt,
+				qdesc = CreateQueryDesc(stmt, NIL,
 										sql,
 										GetActiveSnapshot(), NULL,
 										dest, NULL, NULL, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 9ac0383459..65c8d0aa59 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -408,7 +408,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
 	UpdateActiveSnapshotCommandId();
 
 	/* Create a QueryDesc, redirecting output to our tuple receiver */
-	queryDesc = CreateQueryDesc(plan, queryString,
+	queryDesc = CreateQueryDesc(plan, NIL, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, NULL, NULL, 0);
 
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 9e29584d93..29b45539d3 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -155,6 +155,7 @@ ExecuteQuery(ParseState *pstate,
 	PreparedStatement *entry;
 	CachedPlan *cplan;
 	List	   *plan_list;
+	List	   *part_prune_results_list;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
 	Portal		portal;
@@ -193,7 +194,10 @@ ExecuteQuery(ParseState *pstate,
 									   entry->plansource->query_string);
 
 	/* Replan if needed, and increment plan refcount for portal */
-	cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL);
+	cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL,
+						  &part_prune_results_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_results_list));
 	plan_list = cplan->stmt_list;
 
 	/*
@@ -207,6 +211,9 @@ ExecuteQuery(ParseState *pstate,
 					  plan_list,
 					  cplan);
 
+	/* Copy Lists of PartitionPruneResults into the portal's context. */
+	PortalStorePartitionPruneResults(portal, part_prune_results_list);
+
 	/*
 	 * For CREATE TABLE ... AS EXECUTE, we must verify that the prepared
 	 * statement is one that produces tuples.  Currently we insist that it be
@@ -576,7 +583,9 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	const char *query_string;
 	CachedPlan *cplan;
 	List	   *plan_list;
-	ListCell   *p;
+	List	   *part_prune_results_list;
+	ListCell   *p,
+			   *pp;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
 	instr_time	planstart;
@@ -619,7 +628,10 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 
 	/* Replan if needed, and acquire a transient refcount */
 	cplan = GetCachedPlan(entry->plansource, paramLI,
-						  CurrentResourceOwner, queryEnv);
+						  CurrentResourceOwner, queryEnv,
+						  &part_prune_results_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_results_list));
 
 	INSTR_TIME_SET_CURRENT(planduration);
 	INSTR_TIME_SUBTRACT(planduration, planstart);
@@ -634,13 +646,15 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	plan_list = cplan->stmt_list;
 
 	/* Explain each query */
-	foreach(p, plan_list)
+	forboth(p, plan_list, pp, part_prune_results_list)
 	{
 		PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
+		List *part_prune_results = lfirst_node(List, pp);
 
 		if (pstmt->commandType != CMD_UTILITY)
-			ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv,
-						   &planduration, (es->buffers ? &bufusage : NULL));
+			ExplainOnePlan(pstmt, part_prune_results, into, es, query_string,
+						   paramLI, queryEnv, &planduration,
+						   (es->buffers ? &bufusage : NULL));
 		else
 			ExplainOneUtility(pstmt->utilityStmt, into, es, query_string,
 							  paramLI, queryEnv);
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 0b5183fc4a..f14f9197b5 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -65,6 +65,34 @@ found there.  This currently only occurs for Append and MergeAppend nodes.  In
 this case the non-required subplans are ignored and the executor state's
 subnode array will become out of sequence to the plan's subplan list.
 
+Actually, the so-called execution time pruning may also occur before the
+execution has started.  One case where that occurs is when a cached generic
+plan is being validated for execution by plancache.c: GetCachedPlan(), which
+works by locking all the relations that will be scanned by that plan.  If the
+generic plan contains nodes that can perform execution time partition pruning
+(that is, contain a PartitionPruneInfo), a subset of pruning steps contained
+in a given node's PartitionPruneInfo that do not depend on the execution
+actually having started (called "initial" pruning steps) are performed to
+figure out the minimal set of child subplans that satisfy those pruning steps.
+AcquireExecutorLocks() looking at a given generic plan will then lock only the
+relations scanned by the child subplans that survived such pruning, along with
+those present in PlannedStmt.minLockRelids.  Note that the subplans are only
+notionally pruned, that is, they are not removed from the plan tree as such.
+
+To prevent the executor and any third party execution code that can look at
+the plan tree from trying to execute the subplans that were pruned as
+described above, the result of pruning is passed to the executor as a List
+of PartitionPruneResult nodes via the QueryDesc.  Each PartitionPruneResult
+consists of the set of indexes of surviving subplans in the respective parent
+plan node's (the one to which the corresponding PartitionPruneInfo belongs)
+list of child subplans, saved as a bitmapset (valid_subplan_offs).  In other
+words, the executor executing a generic plan should not re-evaluate the set of
+initially valid subplans for a given plan node by redoing the initial pruning
+if it was already done by AcquireExecutorLocks() when validating the plan.
+Such re-evaluation of the pruning steps may very well end up resulting in a
+different set of subplans, containing some whose relations were not locked by
+AcquireExecutorLocks().
+
 Each Plan node may have expression trees associated with it, to represent
 its target list, qualification conditions, etc.  These trees are also
 read-only to the executor, but the executor state for expression evaluation
@@ -286,6 +314,10 @@ Query Processing Control Flow
 
 This is a sketch of control flow for full query processing:
 
+  [ ExecutorDoInitialPruning ] --- an optional step to perform initial
+		partition pruning on the plan tree, the result of which is passed
+		to the executor via QueryDesc
+
 	CreateQueryDesc
 
 	ExecutorStart
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 32475e33ff..b59474841f 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -49,6 +49,7 @@
 #include "commands/matview.h"
 #include "commands/trigger.h"
 #include "executor/execdebug.h"
+#include "executor/execPartition.h"
 #include "executor/nodeSubplan.h"
 #include "foreign/fdwapi.h"
 #include "jit/jit.h"
@@ -104,6 +105,54 @@ static void EvalPlanQualStart(EPQState *epqstate, Plan *planTree);
 
 /* end of local decls */
 
+/* ----------------------------------------------------------------
+ *		ExecutorDoInitialPruning
+ *
+ *		For each plan tree node that has been assigned a PartitionPruneInfo,
+ *		this performs initial partition pruning using the information contained
+ *		therein to determine the set of child subplans that satisfy the initial
+ *		pruning steps, to be returned as a bitmapset of their indexes in the
+ *		node's list of child subplans (for example, an Append's appendplans).
+ *
+ * Return value is a List of PartitionPruneResult nodes, one for every
+ * PartitionPruneInfo, each containing one of those bitmapsets.  RT indexes
+ * of all the leaf partitions scanned by the chosen subplans are added to
+ * *scan_leafpart_rtis, which is shared across all PartitionPruneInfos.
+ *
+ * The executor must see exactly the same set of subplans as valid for
+ * execution when doing ExecInitNode() on the plan nodes whose
+ * PartitionPruneInfos are processed here.  So, it must get the set from the
+ * aforementioned PartitionPruneResult, instead of computing it all over
+ * again by redoing the initial pruning.  It's the caller's job to pass the
+ * PartitionPruneResult to the executor.
+ *
+ * Note: Partitioned tables mentioned in PartitionedRelPruneInfo nodes that
+ * drive the pruning will be locked before doing the pruning.
+ * ----------------------------------------------------------------
+ */
+List *
+ExecutorDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
+						 Bitmapset **scan_leafpart_rtis)
+{
+	List	 *part_prune_results = NIL;
+	ListCell *lc;
+
+	/* Only get here if there is any pruning to do. */
+	Assert(plannedstmt->containsInitialPruning);
+
+	foreach(lc, plannedstmt->partPruneInfos)
+	{
+		PartitionPruneInfo *pruneinfo = lfirst(lc);
+		PartitionPruneResult *pruneresult = makeNode(PartitionPruneResult);
+
+		pruneresult->valid_subplan_offs =
+			ExecPartitionDoInitialPruning(plannedstmt, params, pruneinfo,
+										  scan_leafpart_rtis);
+		part_prune_results = lappend(part_prune_results, pruneresult);
+	}
+
+	return part_prune_results;
+}
 
 /* ----------------------------------------------------------------
  *		ExecutorStart
@@ -806,6 +855,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 {
 	CmdType		operation = queryDesc->operation;
 	PlannedStmt *plannedstmt = queryDesc->plannedstmt;
+	List	   *part_prune_results = queryDesc->part_prune_results;
 	Plan	   *plan = plannedstmt->planTree;
 	List	   *rangeTable = plannedstmt->rtable;
 	EState	   *estate = queryDesc->estate;
@@ -826,6 +876,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 
 	estate->es_plannedstmt = plannedstmt;
 	estate->es_part_prune_infos = plannedstmt->partPruneInfos;
+	estate->es_part_prune_results = part_prune_results;
 
 	/*
 	 * Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index aca0c6f323..917079a034 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -66,6 +66,7 @@
 #define PARALLEL_KEY_QUERY_TEXT		UINT64CONST(0xE000000000000008)
 #define PARALLEL_KEY_JIT_INSTRUMENTATION UINT64CONST(0xE000000000000009)
 #define PARALLEL_KEY_WAL_USAGE			UINT64CONST(0xE00000000000000A)
+#define PARALLEL_KEY_PARTITION_PRUNE_RESULTS	UINT64CONST(0xE00000000000000B)
 
 #define PARALLEL_TUPLE_QUEUE_SIZE		65536
 
@@ -182,6 +183,7 @@ ExecSerializePlan(Plan *plan, EState *estate)
 	pstmt->transientPlan = false;
 	pstmt->dependsOnRole = false;
 	pstmt->parallelModeNeeded = false;
+	pstmt->containsInitialPruning = false;
 	pstmt->planTree = plan;
 	pstmt->partPruneInfos = estate->es_part_prune_infos;
 	pstmt->rtable = estate->es_range_table;
@@ -597,12 +599,15 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	FixedParallelExecutorState *fpes;
 	char	   *pstmt_data;
 	char	   *pstmt_space;
+	char	   *part_prune_results_data;
+	char	   *part_prune_results_space;
 	char	   *paramlistinfo_space;
 	BufferUsage *bufusage_space;
 	WalUsage   *walusage_space;
 	SharedExecutorInstrumentation *instrumentation = NULL;
 	SharedJitInstrumentation *jit_instrumentation = NULL;
 	int			pstmt_len;
+	int			part_prune_results_len;
 	int			paramlistinfo_len;
 	int			instrumentation_len = 0;
 	int			jit_instrumentation_len = 0;
@@ -631,6 +636,7 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 
 	/* Fix up and serialize plan to be sent to workers. */
 	pstmt_data = ExecSerializePlan(planstate->plan, estate);
+	part_prune_results_data = nodeToString(estate->es_part_prune_results);
 
 	/* Create a parallel context. */
 	pcxt = CreateParallelContext("postgres", "ParallelQueryMain", nworkers);
@@ -657,6 +663,11 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	shm_toc_estimate_chunk(&pcxt->estimator, pstmt_len);
 	shm_toc_estimate_keys(&pcxt->estimator, 1);
 
+	/* Estimate space for serialized List of PartitionPruneResult. */
+	part_prune_results_len = strlen(part_prune_results_data) + 1;
+	shm_toc_estimate_chunk(&pcxt->estimator, part_prune_results_len);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
 	/* Estimate space for serialized ParamListInfo. */
 	paramlistinfo_len = EstimateParamListSpace(estate->es_param_list_info);
 	shm_toc_estimate_chunk(&pcxt->estimator, paramlistinfo_len);
@@ -751,6 +762,12 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	memcpy(pstmt_space, pstmt_data, pstmt_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PLANNEDSTMT, pstmt_space);
 
+	/* Store serialized List of PartitionPruneResult */
+	part_prune_results_space = shm_toc_allocate(pcxt->toc, part_prune_results_len);
+	memcpy(part_prune_results_space, part_prune_results_data, part_prune_results_len);
+	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARTITION_PRUNE_RESULTS,
+				   part_prune_results_space);
+
 	/* Store serialized ParamListInfo. */
 	paramlistinfo_space = shm_toc_allocate(pcxt->toc, paramlistinfo_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARAMLISTINFO, paramlistinfo_space);
@@ -1232,8 +1249,10 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 						 int instrument_options)
 {
 	char	   *pstmtspace;
+	char	   *part_prune_results_space;
 	char	   *paramspace;
 	PlannedStmt *pstmt;
+	List	   *part_prune_results;
 	ParamListInfo paramLI;
 	char	   *queryString;
 
@@ -1244,12 +1263,17 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 	pstmtspace = shm_toc_lookup(toc, PARALLEL_KEY_PLANNEDSTMT, false);
 	pstmt = (PlannedStmt *) stringToNode(pstmtspace);
 
+	/* Reconstruct leader-supplied PartitionPruneResult. */
+	part_prune_results_space =
+		shm_toc_lookup(toc, PARALLEL_KEY_PARTITION_PRUNE_RESULTS, false);
+	part_prune_results = (List *) stringToNode(part_prune_results_space);
+
 	/* Reconstruct ParamListInfo. */
 	paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
 	paramLI = RestoreParamList(&paramspace);
 
 	/* Create a QueryDesc for the query. */
-	return CreateQueryDesc(pstmt,
+	return CreateQueryDesc(pstmt, part_prune_results,
 						   queryString,
 						   GetActiveSnapshot(), InvalidSnapshot,
 						   receiver, paramLI, NULL, instrument_options);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 80197d5141..8728745c44 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -25,6 +25,7 @@
 #include "mb/pg_wchar.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
+#include "parser/parsetree.h"
 #include "partitioning/partbounds.h"
 #include "partitioning/partdesc.h"
 #include "partitioning/partprune.h"
@@ -185,7 +186,11 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
 static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
 static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
 static PartitionPruneState *CreatePartitionPruneState(PlanState *planstate,
-													  PartitionPruneInfo *pruneinfo);
+													  PartitionPruneInfo *pruneinfo,
+													  bool consider_initial_steps,
+													  bool consider_exec_steps,
+													  List *rtable, ExprContext *econtext,
+													  PartitionDirectory partdir);
 static void InitPartitionPruneContext(PartitionPruneContext *context,
 									  List *pruning_steps,
 									  PartitionDesc partdesc,
@@ -198,7 +203,8 @@ static void PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
 static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
 										   PartitionedRelPruningData *pprune,
 										   bool initial_prune,
-										   Bitmapset **validsubplans);
+										   Bitmapset **validsubplans,
+										   Bitmapset **scan_leafpart_rtis);
 
 
 /*
@@ -1746,8 +1752,10 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  * considered to be a stable expression, it can change value from one plan
  * node scan to the next during query execution.  Stable comparison
  * expressions that don't involve such Params allow partition pruning to be
- * done once during executor startup.  Expressions that do involve such Params
- * require us to prune separately for each scan of the parent plan node.
+ * done once during executor startup or during ExecutorDoInitialPruning() that
+ * runs as part of performing AcquireExecutorLocks() on a given plan tree.
+ * Expressions that do involve such Params require us to prune separately for
+ * each scan of the parent plan node.
  *
  * Note that pruning away unneeded subplans during executor startup has the
  * added benefit of not having to initialize the unneeded subplans at all.
@@ -1764,6 +1772,13 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *		account for initial pruning possibly having eliminated some of the
  *		subplans.
  *
+ * ExecPartitionDoInitialPruning:
+ *		Do initial pruning with the information contained in a given
+ *		PartitionPruneInfo to determine the minimal set of child subplans
+ *		of the parent plan node (the one to which the PartitionPruneInfo
+ *		belongs) that need to be executed, and also the set of RT indexes of
+ *		the leaf partitions that will be scanned by those subplans.
+ *
  * ExecFindMatchingSubPlans:
  *		Returns indexes of matching subplans after evaluating the expressions
  *		that are safe to evaluate at a given point.  This function is first
@@ -1781,8 +1796,9 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *
  * On return, *initially_valid_subplans is assigned the set of indexes of
  * child subplans that must be initialized along with the parent plan node.
- * Initial pruning is performed here if needed and in that case only the
- * surviving subplans' indexes are added.
+ * Initial pruning is performed here if needed (unless it has already been done
+ * by ExecutorDoInitialPruning()), and in that case only the surviving
+ * subplans' indexes are added.
  *
  * If subplans are indeed pruned, subplan_map arrays contained in the returned
  * PartitionPruneState are re-sequenced to not count those, though only if the
@@ -1794,28 +1810,65 @@ ExecInitPartitionPruning(PlanState *planstate,
 						 int part_prune_index,
 						 Bitmapset **initially_valid_subplans)
 {
-	PartitionPruneState *prunestate;
+	PartitionPruneState *prunestate = NULL;
 	EState	   *estate = planstate->state;
 	PartitionPruneInfo *pruneinfo = list_nth(estate->es_part_prune_infos,
 											 part_prune_index);
+	PartitionPruneResult *pruneresult = NULL;
+	bool	do_pruning = (pruneinfo->needs_init_pruning ||
+						  pruneinfo->needs_exec_pruning);
+
+	/*
+	 * No need to do initial pruning if it was done already by
+	 * ExecutorDoInitialPruning(), which it would be if es_part_prune_results
+	 * is set.
+	 */
+	if (estate->es_part_prune_results)
+	{
+		pruneresult = list_nth(estate->es_part_prune_results, part_prune_index);
+		Assert(IsA(pruneresult, PartitionPruneResult));
+		do_pruning = pruneinfo->needs_exec_pruning;
+	}
 
-	/* We may need an expression context to evaluate partition exprs */
-	ExecAssignExprContext(estate, planstate);
+	if (do_pruning)
+	{
+		/* We may need an expression context to evaluate partition exprs */
+		ExecAssignExprContext(estate, planstate);
 
-	/* Create the working data structure for pruning */
-	prunestate = CreatePartitionPruneState(planstate, pruneinfo);
+		/* For data reading, executor always omits detached partitions */
+		if (estate->es_partition_directory == NULL)
+			estate->es_partition_directory =
+				CreatePartitionDirectory(estate->es_query_cxt, false);
+
+		/*
+		 * Create the working data structure for pruning.  No need to consider
+		 * initial pruning steps if we have a PartitionPruneResult.
+		 */
+		prunestate = CreatePartitionPruneState(planstate, pruneinfo,
+											   pruneresult == NULL, true,
+											   NIL, planstate->ps_ExprContext,
+											   estate->es_partition_directory);
+	}
 
 	/*
 	 * Perform an initial partition prune pass, if required.
 	 */
-	if (prunestate->do_initial_prune)
-		*initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true);
+	if (pruneresult)
+	{
+		*initially_valid_subplans = bms_copy(pruneresult->valid_subplan_offs);
+	}
+	else if (prunestate && prunestate->do_initial_prune)
+	{
+		*initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true,
+															 NULL);
+	}
 	else
 	{
-		/* No pruning, so we'll need to initialize all subplans */
+		/* No initial pruning, so we'll need to initialize all subplans */
 		Assert(n_total_subplans > 0);
 		*initially_valid_subplans = bms_add_range(NULL, 0,
 												  n_total_subplans - 1);
+		return prunestate;
 	}
 
 	/*
@@ -1823,7 +1876,8 @@ ExecInitPartitionPruning(PlanState *planstate,
 	 * that were removed above due to initial pruning.  No need to do this if
 	 * no steps were removed.
 	 */
-	if (bms_num_members(*initially_valid_subplans) < n_total_subplans)
+	if (prunestate &&
+		bms_num_members(*initially_valid_subplans) < n_total_subplans)
 	{
 		/*
 		 * We can safely skip this when !do_exec_prune, even though that
@@ -1839,11 +1893,74 @@ ExecInitPartitionPruning(PlanState *planstate,
 	return prunestate;
 }
 
+/*
+ * ExecPartitionDoInitialPruning
+ *		Perform initial pruning using given PartitionPruneInfo to determine
+ *		the minimal set of child subplans that will be executed and also the
+ *		set of RT indexes of the leaf partitions scanned by those subplans.
+ */
+Bitmapset *
+ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
+							  PartitionPruneInfo *pruneinfo,
+							  Bitmapset **scan_leafpart_rtis)
+{
+	List		 *rtable = plannedstmt->rtable;
+	ExprContext	 *econtext;
+	PartitionDirectory pdir;
+	MemoryContext oldcontext,
+				  tmpcontext;
+	PartitionPruneState *prunestate;
+	Bitmapset	 *valid_subplan_offs;
+
+	/*
+	 * A temporary context for memory allocations required while executing
+	 * partition pruning steps.
+	 */
+	tmpcontext = AllocSetContextCreate(CurrentMemoryContext,
+									   "initial pruning working data",
+									   ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(tmpcontext);
+
+	/*
+	 * PartitionDirectory to look up partition descriptors.
+	 * Note that we don't omit detached partitions, just like during
+	 * execution proper.
+	 */
+	pdir = CreatePartitionDirectory(CurrentMemoryContext, false);
+
+	/*
+	 * We don't yet have a PlanState for the parent plan node, so we must
+	 * create a standalone ExprContext to evaluate pruning expressions,
+	 * equipped with the information about the EXTERN parameters that the
+	 * caller passed us.  That's okay because the initial pruning steps
+	 * contain nothing that requires execution to have started and thus
+	 * nothing that needs the information contained in a PlanState.
+	 */
+	econtext = CreateStandaloneExprContext();
+	econtext->ecxt_param_list_info = params;
+	prunestate = CreatePartitionPruneState(NULL, pruneinfo, true, false,
+										   rtable, econtext, pdir);
+	MemoryContextSwitchTo(oldcontext);
+
+	/* Do the initial pruning. */
+	valid_subplan_offs = ExecFindMatchingSubPlans(prunestate, true,
+												  scan_leafpart_rtis);
+
+	FreeExprContext(econtext, true);
+	DestroyPartitionDirectory(pdir);
+	MemoryContextDelete(tmpcontext);
+
+	return valid_subplan_offs;
+}
+
 /*
  * CreatePartitionPruneState
  *		Build the data structure required for calling ExecFindMatchingSubPlans
  *
- * 'planstate' is the parent plan node's execution state.
+ * 'planstate', if not NULL, is the parent plan node's execution state.  It
+ * can be NULL if being called before ExecutorStart(), in which case,
+ * 'rtable' (range table), 'econtext', and 'partdir' must be explicitly
+ * provided.
  *
  * 'pruneinfo' is a PartitionPruneInfo as generated by
  * make_partition_pruneinfo.  Here we build a PartitionPruneState containing a
@@ -1857,19 +1974,21 @@ ExecInitPartitionPruning(PlanState *planstate,
  * PartitionedRelPruneInfo.
  */
 static PartitionPruneState *
-CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
+CreatePartitionPruneState(PlanState *planstate,
+						  PartitionPruneInfo *pruneinfo,
+						  bool consider_initial_steps,
+						  bool consider_exec_steps,
+						  List *rtable, ExprContext *econtext,
+						  PartitionDirectory partdir)
 {
-	EState	   *estate = planstate->state;
+	EState	   *estate = planstate ? planstate->state : NULL;
 	PartitionPruneState *prunestate;
 	int			n_part_hierarchies;
 	ListCell   *lc;
 	int			i;
-	ExprContext *econtext = planstate->ps_ExprContext;
 
-	/* For data reading, executor always omits detached partitions */
-	if (estate->es_partition_directory == NULL)
-		estate->es_partition_directory =
-			CreatePartitionDirectory(estate->es_query_cxt, false);
+	Assert((estate != NULL) ||
+			(partdir != NULL && econtext != NULL && rtable != NIL));
 
 	n_part_hierarchies = list_length(pruneinfo->prune_infos);
 	Assert(n_part_hierarchies > 0);
@@ -1924,15 +2043,42 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 			PartitionKey partkey;
 
 			/*
-			 * We can rely on the copies of the partitioned table's partition
-			 * key and partition descriptor appearing in its relcache entry,
-			 * because that entry will be held open and locked for the
-			 * duration of this executor run.
+			 * Must open the relation by ourselves when called before the
+			 * execution has started, such as when called during
+			 * ExecutorDoInitialPruning() on a cached plan.  In that case,
+			 * sub-partitions must be locked, because AcquirePlannerLocks()
+			 * would not have seen them. (1st relation in a partrelpruneinfos
+			 * list is always the root partitioned table appearing in the
+			 * query, which AcquirePlannerLocks() would have locked; the
+			 * Assert in relation_open() guards that assumption.)
+			 */
+			if (estate == NULL)
+			{
+				RangeTblEntry *rte = rt_fetch(pinfo->rtindex, rtable);
+				int		lockmode = (j == 0) ? NoLock : rte->rellockmode;
+
+				partrel = table_open(rte->relid, lockmode);
+			}
+			else
+				partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
+
+			/*
+			 * We can rely on the copy of the partitioned table's partition
+			 * key in its relcache entry, because it can't change (or
+			 * get destroyed) as long as the relation is locked.  Partition
+			 * descriptor is taken from the PartitionDirectory associated with
+			 * the table that is held open long enough for the descriptor to
+			 * remain valid while it's used to perform the pruning steps.
 			 */
-			partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
 			partkey = RelationGetPartitionKey(partrel);
-			partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
-												partrel);
+			partdesc = PartitionDirectoryLookup(partdir, partrel);
+
+			/*
+			 * Must close partrel, keeping the lock taken, if we're not using
+			 * EState's entry.
+			 */
+			if (estate == NULL)
+				table_close(partrel, NoLock);
 
 			/*
 			 * Initialize the subplan_map and subpart_map.
@@ -1946,6 +2092,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 			Assert(partdesc->nparts >= pinfo->nparts);
 			pprune->nparts = partdesc->nparts;
 			pprune->subplan_map = palloc(sizeof(int) * partdesc->nparts);
+			pprune->rti_map = palloc(sizeof(Index) * partdesc->nparts);
 			if (partdesc->nparts == pinfo->nparts)
 			{
 				/*
@@ -1956,6 +2103,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 				pprune->subpart_map = pinfo->subpart_map;
 				memcpy(pprune->subplan_map, pinfo->subplan_map,
 					   sizeof(int) * pinfo->nparts);
+				memcpy(pprune->rti_map, pinfo->rti_map,
+					   sizeof(Index) * pinfo->nparts);
 
 				/*
 				 * Double-check that the list of unpruned relations has not
@@ -2006,6 +2155,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 							pinfo->subplan_map[pd_idx];
 						pprune->subpart_map[pp_idx] =
 							pinfo->subpart_map[pd_idx];
+						pprune->rti_map[pp_idx] =
+							pinfo->rti_map[pd_idx];
 						pd_idx++;
 					}
 					else
@@ -2013,6 +2164,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 						/* this partdesc entry is not in the plan */
 						pprune->subplan_map[pp_idx] = -1;
 						pprune->subpart_map[pp_idx] = -1;
+						pprune->rti_map[pp_idx] = 0;
 					}
 				}
 
@@ -2034,7 +2186,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 			 * Initialize pruning contexts as needed.
 			 */
 			pprune->initial_pruning_steps = pinfo->initial_pruning_steps;
-			if (pinfo->initial_pruning_steps)
+			if (consider_initial_steps && pinfo->initial_pruning_steps)
 			{
 				InitPartitionPruneContext(&pprune->initial_context,
 										  pinfo->initial_pruning_steps,
@@ -2044,7 +2196,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 				prunestate->do_initial_prune = true;
 			}
 			pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
-			if (pinfo->exec_pruning_steps)
+			if (consider_exec_steps && pinfo->exec_pruning_steps)
 			{
 				InitPartitionPruneContext(&pprune->exec_context,
 										  pinfo->exec_pruning_steps,
@@ -2272,10 +2424,14 @@ PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
  * Pass initial_prune if PARAM_EXEC Params cannot yet be evaluated.  This
  * differentiates the initial executor-time pruning step from later
  * runtime pruning.
+ *
+ * RT indexes of leaf partitions scanned by the chosen subplans are added to
+ * *scan_leafpart_rtis if the pointer is non-NULL.
  */
 Bitmapset *
 ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
-						 bool initial_prune)
+						 bool initial_prune,
+						 Bitmapset **scan_leafpart_rtis)
 {
 	Bitmapset  *result = NULL;
 	MemoryContext oldcontext;
@@ -2310,7 +2466,7 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
 		 */
 		pprune = &prunedata->partrelprunedata[0];
 		find_matching_subplans_recurse(prunedata, pprune, initial_prune,
-									   &result);
+									   &result, scan_leafpart_rtis);
 
 		/* Expression eval may have used space in ExprContext too */
 		if (pprune->exec_pruning_steps)
@@ -2324,6 +2480,8 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
 
 	/* Copy result out of the temp context before we reset it */
 	result = bms_copy(result);
+	if (scan_leafpart_rtis)
+		*scan_leafpart_rtis = bms_copy(*scan_leafpart_rtis);
 
 	MemoryContextReset(prunestate->prune_context);
 
@@ -2334,13 +2492,15 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
  * find_matching_subplans_recurse
  *		Recursive worker function for ExecFindMatchingSubPlans
  *
- * Adds valid (non-prunable) subplan IDs to *validsubplans
+ * Adds valid (non-prunable) subplan IDs to *validsubplans and RT indexes of
+ * the corresponding leaf partitions to *scan_leafpart_rtis (if asked for).
  */
 static void
 find_matching_subplans_recurse(PartitionPruningData *prunedata,
 							   PartitionedRelPruningData *pprune,
 							   bool initial_prune,
-							   Bitmapset **validsubplans)
+							   Bitmapset **validsubplans,
+							   Bitmapset **scan_leafpart_rtis)
 {
 	Bitmapset  *partset;
 	int			i;
@@ -2367,8 +2527,14 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
 	while ((i = bms_next_member(partset, i)) >= 0)
 	{
 		if (pprune->subplan_map[i] >= 0)
+		{
 			*validsubplans = bms_add_member(*validsubplans,
 											pprune->subplan_map[i]);
+			Assert(pprune->rti_map[i] > 0);
+			if (scan_leafpart_rtis)
+				*scan_leafpart_rtis = bms_add_member(*scan_leafpart_rtis,
+													 pprune->rti_map[i]);
+		}
 		else
 		{
 			int			partidx = pprune->subpart_map[i];
@@ -2376,7 +2542,8 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
 			if (partidx >= 0)
 				find_matching_subplans_recurse(prunedata,
 											   &prunedata->partrelprunedata[partidx],
-											   initial_prune, validsubplans);
+											   initial_prune, validsubplans,
+											   scan_leafpart_rtis);
 			else
 			{
 				/*
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 21f4c10937..67a58c7163 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -134,6 +134,7 @@ CreateExecutorState(void)
 	estate->es_param_exec_vals = NULL;
 
 	estate->es_queryEnv = NULL;
+	estate->es_part_prune_results = NIL;
 
 	estate->es_query_cxt = qcontext;
 
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index e134a82ff7..18d3b98cdc 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -842,7 +842,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
 	else
 		dest = None_Receiver;
 
-	es->qd = CreateQueryDesc(es->stmt,
+	es->qd = CreateQueryDesc(es->stmt, NIL,
 							 fcache->src,
 							 GetActiveSnapshot(),
 							 InvalidSnapshot,
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index c6f86a6510..96880e122a 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -155,7 +155,8 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 		 * subplan, we can fill as_valid_subplans immediately, preventing
 		 * later calls to ExecFindMatchingSubPlans.
 		 */
-		if (!prunestate->do_exec_prune && nplans > 0)
+		if (appendstate->as_prune_state == NULL ||
+			(!appendstate->as_prune_state->do_exec_prune && nplans > 0))
 			appendstate->as_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
 	}
 	else
@@ -577,7 +578,7 @@ choose_next_subplan_locally(AppendState *node)
 		}
 		else if (node->as_valid_subplans == NULL)
 			node->as_valid_subplans =
-				ExecFindMatchingSubPlans(node->as_prune_state, false);
+				ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 
 		whichplan = -1;
 	}
@@ -642,7 +643,7 @@ choose_next_subplan_for_leader(AppendState *node)
 		if (node->as_valid_subplans == NULL)
 		{
 			node->as_valid_subplans =
-				ExecFindMatchingSubPlans(node->as_prune_state, false);
+				ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 
 			/*
 			 * Mark each invalid plan as finished to allow the loop below to
@@ -717,7 +718,7 @@ choose_next_subplan_for_worker(AppendState *node)
 	else if (node->as_valid_subplans == NULL)
 	{
 		node->as_valid_subplans =
-			ExecFindMatchingSubPlans(node->as_prune_state, false);
+			ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 		mark_invalid_subplans_as_finished(node);
 	}
 
@@ -868,7 +869,7 @@ ExecAppendAsyncBegin(AppendState *node)
 	if (node->as_valid_subplans == NULL)
 	{
 		node->as_valid_subplans =
-			ExecFindMatchingSubPlans(node->as_prune_state, false);
+			ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 
 		classify_matching_subplans(node);
 	}
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index 8d35860c30..2312e5a633 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -103,7 +103,8 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
 		 * subplan, we can fill ms_valid_subplans immediately, preventing
 		 * later calls to ExecFindMatchingSubPlans.
 		 */
-		if (!prunestate->do_exec_prune && nplans > 0)
+		if (mergestate->ms_prune_state == NULL ||
+			(!mergestate->ms_prune_state->do_exec_prune && nplans > 0))
 			mergestate->ms_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
 	}
 	else
@@ -218,7 +219,7 @@ ExecMergeAppend(PlanState *pstate)
 		 */
 		if (node->ms_valid_subplans == NULL)
 			node->ms_valid_subplans =
-				ExecFindMatchingSubPlans(node->ms_prune_state, false);
+				ExecFindMatchingSubPlans(node->ms_prune_state, false, NULL);
 
 		/*
 		 * First time through: pull the first tuple from each valid subplan,
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index fd5796f1b9..93012a5b3b 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1578,6 +1578,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 	CachedPlanSource *plansource;
 	CachedPlan *cplan;
 	List	   *stmt_list;
+	List	   *part_prune_results_list;
 	char	   *query_string;
 	Snapshot	snapshot;
 	MemoryContext oldcontext;
@@ -1657,7 +1658,10 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 	 */
 
 	/* Replan if needed, and increment plan refcount for portal */
-	cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv);
+	cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv,
+						  &part_prune_results_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_results_list));
 	stmt_list = cplan->stmt_list;
 
 	if (!plan->saved)
@@ -1685,6 +1689,9 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 					  stmt_list,
 					  cplan);
 
+	/* Copy Lists of PartitionPruneResults into the portal's context. */
+	PortalStorePartitionPruneResults(portal, part_prune_results_list);
+
 	/*
 	 * Set up options for portal.  Default SCROLL type is chosen the same way
 	 * as PerformCursorOpen does it.
@@ -2092,7 +2099,8 @@ SPI_plan_get_cached_plan(SPIPlanPtr plan)
 	/* Get the generic plan for the query */
 	cplan = GetCachedPlan(plansource, NULL,
 						  plan->saved ? CurrentResourceOwner : NULL,
-						  _SPI_current->queryEnv);
+						  _SPI_current->queryEnv,
+						  NULL /* Not interested in PartitionPruneResults */);
 	Assert(cplan == plansource->gplan);
 
 	/* Pop the error context stack */
@@ -2473,7 +2481,9 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 	{
 		CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
 		List	   *stmt_list;
-		ListCell   *lc2;
+		List	   *part_prune_results_list;
+		ListCell   *lc2,
+				   *lc3;
 
 		spicallbackarg.query = plansource->query_string;
 
@@ -2549,8 +2559,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 		 * plan, the refcount must be backed by the plan_owner.
 		 */
 		cplan = GetCachedPlan(plansource, options->params,
-							  plan_owner, _SPI_current->queryEnv);
-
+							  plan_owner, _SPI_current->queryEnv,
+							  &part_prune_results_list);
+		Assert(list_length(cplan->stmt_list) ==
+			   list_length(part_prune_results_list));
 		stmt_list = cplan->stmt_list;
 
 		/*
@@ -2589,9 +2601,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 			}
 		}
 
-		foreach(lc2, stmt_list)
+		forboth(lc2, stmt_list, lc3, part_prune_results_list)
 		{
 			PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+			List	   *part_prune_results = lfirst_node(List, lc3);
 			bool		canSetTag = stmt->canSetTag;
 			DestReceiver *dest;
 
@@ -2663,7 +2676,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 				else
 					snap = InvalidSnapshot;
 
-				qdesc = CreateQueryDesc(stmt,
+				qdesc = CreateQueryDesc(stmt, part_prune_results,
 										plansource->query_string,
 										snap, crosscheck_snapshot,
 										dest,
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index b4ff855f7c..77990a2732 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -158,6 +158,11 @@
 	token = pg_strtok(&length);		/* skip :fldname */ \
 	local_node->fldname = readIntCols(len)
 
+/* Read an Index array */
+#define READ_INDEX_ARRAY(fldname, len) \
+	token = pg_strtok(&length);		/* skip :fldname */ \
+	local_node->fldname = readIndexCols(len)
+
 /* Read a bool array */
 #define READ_BOOL_ARRAY(fldname, len) \
 	token = pg_strtok(&length);		/* skip :fldname */ \
@@ -795,7 +800,6 @@ fnname(int numCols) \
  */
 READ_SCALAR_ARRAY(readAttrNumberCols, int16, atoi)
 READ_SCALAR_ARRAY(readOidCols, Oid, atooid)
-/* outfuncs.c has writeIndexCols, but we don't yet need that here */
-/* READ_SCALAR_ARRAY(readIndexCols, Index, atoui) */
+READ_SCALAR_ARRAY(readIndexCols, Index, atoui)
 READ_SCALAR_ARRAY(readIntCols, int, atoi)
 READ_SCALAR_ARRAY(readBoolCols, bool, strtobool)
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 799602f5ea..a96d316dca 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -520,7 +520,9 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 	result->parallelModeNeeded = glob->parallelModeNeeded;
 	result->planTree = top_plan;
 	result->partPruneInfos = glob->partPruneInfos;
+	result->containsInitialPruning = glob->containsInitialPruning;
 	result->rtable = glob->finalrtable;
+	result->minLockRelids = glob->minLockRelids;
 	result->resultRelations = glob->resultRelations;
 	result->appendRelations = glob->appendRelations;
 	result->subplans = glob->subplans;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 720f20f563..61d6934978 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -270,6 +270,16 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 	 */
 	add_rtes_to_flat_rtable(root, false);
 
+	/*
+	 * Add the query's adjusted range of RT indexes to glob->minLockRelids.
+	 * The adjusted RT indexes of prunable relations will be deleted from the
+	 * set below where PartitionPruneInfos are processed.
+	 */
+	glob->minLockRelids =
+		bms_add_range(glob->minLockRelids,
+					  rtoffset + 1,
+					  rtoffset + list_length(root->parse->rtable));
+
 	/*
 	 * Adjust RT indexes of PlanRowMarks and add to final rowmarks list
 	 */
@@ -352,6 +362,7 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 	foreach (lc, root->partPruneInfos)
 	{
 		PartitionPruneInfo *pruneinfo = lfirst(lc);
+		Bitmapset *leafpart_rtis = NULL;
 		ListCell  *l;
 
 		foreach(l, pruneinfo->prune_infos)
@@ -362,15 +373,50 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 			foreach(l2, prune_infos)
 			{
 				PartitionedRelPruneInfo *pinfo = lfirst(l2);
+				int		i;
 
 				/* RT index of the table to which the pinfo belongs. */
 				pinfo->rtindex += rtoffset;
+
+				/* Also of the leaf partitions that might be scanned. */
+				for (i = 0; i < pinfo->nparts; i++)
+				{
+					if (pinfo->rti_map[i] > 0 && pinfo->subplan_map[i] >= 0)
+					{
+						pinfo->rti_map[i] += rtoffset;
+						leafpart_rtis = bms_add_member(leafpart_rtis,
+													   pinfo->rti_map[i]);
+					}
+				}
 			}
 		}
 
+		if (pruneinfo->needs_init_pruning)
+		{
+			glob->containsInitialPruning = true;
+
+			/*
+			 * Delete the leaf partition RTIs from the global set of relations
+			 * to be locked before executing the plan.  AcquireExecutorLocks()
+			 * will find the ones to add to the set after performing initial
+			 * pruning.
+			 */
+			glob->minLockRelids = bms_del_members(glob->minLockRelids,
+												  leafpart_rtis);
+		}
+
 		glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
 	}
 
+	/*
+	 * It seems worth doing a bms_copy() on glob->minLockRelids if we deleted
+	 * bits from it above to get rid of any empty tail bits.  It seems better
+	 * for the loop over this set in AcquireExecutorLocks() to not have to go
+	 * through those useless bit words.
+	 */
+	if (glob->containsInitialPruning)
+		glob->minLockRelids = bms_copy(glob->minLockRelids);
+
 	return result;
 }
 
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 6565b6ed01..37f3e6af61 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -144,7 +144,9 @@ static List *make_partitionedrel_pruneinfo(PlannerInfo *root,
 										   List *prunequal,
 										   Bitmapset *partrelids,
 										   int *relid_subplan_map,
-										   Bitmapset **matchedsubplans);
+										   Bitmapset **matchedsubplans,
+										   bool *needs_init_pruning,
+										   bool *needs_exec_pruning);
 static void gen_partprune_steps(RelOptInfo *rel, List *clauses,
 								PartClauseTarget target,
 								GeneratePruningStepsContext *context);
@@ -234,6 +236,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	int		   *relid_subplan_map;
 	ListCell   *lc;
 	int			i;
+	bool		needs_init_pruning = false;
+	bool		needs_exec_pruning = false;
 
 	/*
 	 * Scan the subpaths to see which ones are scans of partition child
@@ -313,12 +317,16 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		Bitmapset  *partrelids = (Bitmapset *) lfirst(lc);
 		List	   *pinfolist;
 		Bitmapset  *matchedsubplans = NULL;
+		bool		partrel_needs_init_pruning;
+		bool		partrel_needs_exec_pruning;
 
 		pinfolist = make_partitionedrel_pruneinfo(root, parentrel,
 												  prunequal,
 												  partrelids,
 												  relid_subplan_map,
-												  &matchedsubplans);
+												  &matchedsubplans,
+												  &partrel_needs_init_pruning,
+												  &partrel_needs_exec_pruning);
 
 		/* When pruning is possible, record the matched subplans */
 		if (pinfolist != NIL)
@@ -327,6 +335,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 			allmatchedsubplans = bms_join(matchedsubplans,
 										  allmatchedsubplans);
 		}
+
+		needs_init_pruning |= partrel_needs_init_pruning;
+		needs_exec_pruning |= partrel_needs_exec_pruning;
 	}
 
 	pfree(relid_subplan_map);
@@ -341,6 +352,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	/* Else build the result data structure */
 	pruneinfo = makeNode(PartitionPruneInfo);
 	pruneinfo->prune_infos = prunerelinfos;
+	pruneinfo->needs_init_pruning = needs_init_pruning;
+	pruneinfo->needs_exec_pruning = needs_exec_pruning;
 
 	/*
 	 * Some subplans may not belong to any of the identified partitioned rels.
@@ -441,13 +454,18 @@ add_part_relids(List *allpartrelids, Bitmapset *partrelids)
  * If we cannot find any useful run-time pruning steps, return NIL.
  * However, on success, each rel identified in partrelids will have
  * an element in the result list, even if some of them are useless.
+ * *needs_init_pruning and *needs_exec_pruning are set to indicate that the
+ * returned PartitionedRelPruneInfos contain pruning steps that can be
+ * performed before and after execution begins, respectively.
  */
 static List *
 make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 							  List *prunequal,
 							  Bitmapset *partrelids,
 							  int *relid_subplan_map,
-							  Bitmapset **matchedsubplans)
+							  Bitmapset **matchedsubplans,
+							  bool *needs_init_pruning,
+							  bool *needs_exec_pruning)
 {
 	RelOptInfo *targetpart = NULL;
 	List	   *pinfolist = NIL;
@@ -458,6 +476,10 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	int			rti;
 	int			i;
 
+	/* Will find out below. */
+	*needs_init_pruning = false;
+	*needs_exec_pruning = false;
+
 	/*
 	 * Examine each partitioned rel, constructing a temporary array to map
 	 * from planner relids to index of the partitioned rel, and building a
@@ -545,6 +567,9 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		 * executor per-scan pruning steps.  This first pass creates startup
 		 * pruning steps and detects whether there's any possibly-useful quals
 		 * that would require per-scan pruning.
+		 *
+		 * The first pass also determines whether the second pass is
+		 * necessary by noting the presence of EXEC parameters.
 		 */
 		gen_partprune_steps(subpart, partprunequal, PARTTARGET_INITIAL,
 							&context);
@@ -619,6 +644,12 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		pinfo->execparamids = execparamids;
 		/* Remaining fields will be filled in the next loop */
 
+		/* record which types of pruning steps we've seen so far */
+		if (initial_pruning_steps != NIL)
+			*needs_init_pruning = true;
+		if (exec_pruning_steps != NIL)
+			*needs_exec_pruning = true;
+
 		pinfolist = lappend(pinfolist, pinfo);
 	}
 
@@ -646,6 +677,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		int		   *subplan_map;
 		int		   *subpart_map;
 		Oid		   *relid_map;
+		Index	   *rti_map;
 
 		/*
 		 * Construct the subplan and subpart maps for this partitioning level.
@@ -658,6 +690,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		subpart_map = (int *) palloc(nparts * sizeof(int));
 		memset(subpart_map, -1, nparts * sizeof(int));
 		relid_map = (Oid *) palloc0(nparts * sizeof(Oid));
+		rti_map = (Index *) palloc0(nparts * sizeof(Index));
 		present_parts = NULL;
 
 		i = -1;
@@ -672,6 +705,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 			subplan_map[i] = subplanidx = relid_subplan_map[partrel->relid] - 1;
 			subpart_map[i] = subpartidx = relid_subpart_map[partrel->relid] - 1;
 			relid_map[i] = planner_rt_fetch(partrel->relid, root)->relid;
+			rti_map[i] = partrel->relid;
 			if (subplanidx >= 0)
 			{
 				present_parts = bms_add_member(present_parts, i);
@@ -696,6 +730,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		pinfo->subplan_map = subplan_map;
 		pinfo->subpart_map = subpart_map;
 		pinfo->relid_map = relid_map;
+		pinfo->rti_map = rti_map;
 	}
 
 	pfree(relid_subpart_map);
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 3082093d1e..95ab1d0eef 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1598,6 +1598,7 @@ exec_bind_message(StringInfo input_message)
 	int16	   *rformats = NULL;
 	CachedPlanSource *psrc;
 	CachedPlan *cplan;
+	List	   *part_prune_results_list;
 	Portal		portal;
 	char	   *query_string;
 	char	   *saved_stmt_name;
@@ -1972,7 +1973,9 @@ exec_bind_message(StringInfo input_message)
 	 * will be generated in MessageContext.  The plan refcount will be
 	 * assigned to the Portal, so it will be released at portal destruction.
 	 */
-	cplan = GetCachedPlan(psrc, params, NULL, NULL);
+	cplan = GetCachedPlan(psrc, params, NULL, NULL, &part_prune_results_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_results_list));
 
 	/*
 	 * Now we can define the portal.
@@ -1987,6 +1990,9 @@ exec_bind_message(StringInfo input_message)
 					  cplan->stmt_list,
 					  cplan);
 
+	/* Copy Lists of PartitionPruneResults into the portal's context. */
+	PortalStorePartitionPruneResults(portal, part_prune_results_list);
+
 	/* Done with the snapshot used for parameter I/O and parsing/planning */
 	if (snapshot_set)
 		PopActiveSnapshot();
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 52e2db6452..280ed7d239 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -35,7 +35,7 @@
 Portal		ActivePortal = NULL;
 
 
-static void ProcessQuery(PlannedStmt *plan,
+static void ProcessQuery(PlannedStmt *plan, List *part_prune_results,
 						 const char *sourceText,
 						 ParamListInfo params,
 						 QueryEnvironment *queryEnv,
@@ -65,6 +65,7 @@ static void DoPortalRewind(Portal portal);
  */
 QueryDesc *
 CreateQueryDesc(PlannedStmt *plannedstmt,
+				List *part_prune_results,
 				const char *sourceText,
 				Snapshot snapshot,
 				Snapshot crosscheck_snapshot,
@@ -77,6 +78,8 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
 
 	qd->operation = plannedstmt->commandType;	/* operation */
 	qd->plannedstmt = plannedstmt;	/* plan */
+	qd->part_prune_results = part_prune_results; /* ExecutorDoInitialPruning()
+												  * output for plan */
 	qd->sourceText = sourceText;	/* query text */
 	qd->snapshot = RegisterSnapshot(snapshot);	/* snapshot */
 	/* RI check snapshot */
@@ -122,6 +125,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  *		PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
  *
  *	plan: the plan tree for the query
+ *	part_prune_results: ExecutorDoInitialPruning() output for the PlannedStmt
  *	sourceText: the source text of the query
  *	params: any parameters needed
  *	dest: where to send results
@@ -134,6 +138,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  */
 static void
 ProcessQuery(PlannedStmt *plan,
+			 List *part_prune_results,
 			 const char *sourceText,
 			 ParamListInfo params,
 			 QueryEnvironment *queryEnv,
@@ -145,7 +150,7 @@ ProcessQuery(PlannedStmt *plan,
 	/*
 	 * Create the QueryDesc object
 	 */
-	queryDesc = CreateQueryDesc(plan, sourceText,
+	queryDesc = CreateQueryDesc(plan, part_prune_results, sourceText,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, params, queryEnv, 0);
 
@@ -491,8 +496,13 @@ PortalStart(Portal portal, ParamListInfo params,
 				/*
 				 * Create QueryDesc in portal's context; for the moment, set
 				 * the destination to DestNone.
+				 *
+				 * There is no PartitionPruneResult unless the PlannedStmt is
+				 * from a CachedPlan.
 				 */
 				queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+											portal->part_prune_results_list == NIL ? NIL :
+											linitial(portal->part_prune_results_list),
 											portal->sourceText,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
@@ -1225,6 +1235,8 @@ PortalRunMulti(Portal portal,
 
 		if (pstmt->utilityStmt == NULL)
 		{
+			List *part_prune_results = NIL;
+
 			/*
 			 * process a plannable query.
 			 */
@@ -1271,10 +1283,18 @@ PortalRunMulti(Portal portal,
 			else
 				UpdateActiveSnapshotCommandId();
 
+			/*
+			 * Determine if there's a corresponding List of PartitionPruneResult
+			 * for this PlannedStmt.
+			 */
+			if (portal->part_prune_results_list != NIL)
+				part_prune_results = list_nth(portal->part_prune_results_list,
+											  foreach_current_index(stmtlist_item));
+
 			if (pstmt->canSetTag)
 			{
 				/* statement can set tag string */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, part_prune_results,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
@@ -1283,7 +1303,7 @@ PortalRunMulti(Portal portal,
 			else
 			{
 				/* stmt added by rewrite cannot set tag */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, part_prune_results,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index cc943205d3..af6fae6e3b 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -99,14 +99,19 @@ static dlist_head cached_expression_list = DLIST_STATIC_INIT(cached_expression_l
 static void ReleaseGenericPlan(CachedPlanSource *plansource);
 static List *RevalidateCachedQuery(CachedPlanSource *plansource,
 								   QueryEnvironment *queryEnv);
-static bool CheckCachedPlan(CachedPlanSource *plansource);
+static bool CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+							List **part_prune_results_list);
 static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
-								   ParamListInfo boundParams, QueryEnvironment *queryEnv);
+								   ParamListInfo boundParams, QueryEnvironment *queryEnv,
+								   List **part_prune_results_list);
 static bool choose_custom_plan(CachedPlanSource *plansource,
 							   ParamListInfo boundParams);
 static double cached_plan_cost(CachedPlan *plan, bool include_planner);
 static Query *QueryListGetPrimaryStmt(List *stmts);
-static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static void AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+								 List **part_prune_results_list,
+								 List **lockedRelids_per_stmt);
+static void ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt);
 static void AcquirePlannerLocks(List *stmt_list, bool acquire);
 static void ScanQueryForLocks(Query *parsetree, bool acquire);
 static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -782,6 +787,26 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
 	return tlist;
 }
 
+/*
+ * FreePartitionPruneResults
+ *		Frees the List of Lists of PartitionPruneResults for CheckCachedPlan()
+ */
+static void
+FreePartitionPruneResults(List *part_prune_results_list)
+{
+	ListCell *lc;
+
+	foreach(lc, part_prune_results_list)
+	{
+		List *part_prune_results = lfirst(lc);
+
+		/* Free both the PartitionPruneResults and the containing List. */
+		list_free_deep(part_prune_results);
+	}
+
+	list_free(part_prune_results_list);
+}
+
 /*
  * CheckCachedPlan: see if the CachedPlanSource's generic plan is valid.
  *
@@ -790,15 +815,20 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
  *
  * On a "true" return, we have acquired the locks needed to run the plan.
  * (We must do this for the "true" result to be race-condition-free.)
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_results_list.
  */
 static bool
-CheckCachedPlan(CachedPlanSource *plansource)
+CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+				List **part_prune_results_list)
 {
 	CachedPlan *plan = plansource->gplan;
 
 	/* Assert that caller checked the querytree */
 	Assert(plansource->is_valid);
 
+	*part_prune_results_list = NIL;
+
 	/* If there's no generic plan, just say "false" */
 	if (!plan)
 		return false;
@@ -820,13 +850,21 @@ CheckCachedPlan(CachedPlanSource *plansource)
 	 */
 	if (plan->is_valid)
 	{
+		List   *lockedRelids_per_stmt;
+
 		/*
 		 * Plan must have positive refcount because it is referenced by
 		 * plansource; so no need to fear it disappears under us here.
 		 */
 		Assert(plan->refcount > 0);
 
-		AcquireExecutorLocks(plan->stmt_list, true);
+		/*
+		 * Lock relations scanned by the plan.  This is where the pruning
+		 * happens if needed.
+		 */
+		AcquireExecutorLocks(plan->stmt_list, boundParams,
+							 part_prune_results_list,
+							 &lockedRelids_per_stmt);
 
 		/*
 		 * If plan was transient, check to see if TransactionXmin has
@@ -848,7 +886,11 @@ CheckCachedPlan(CachedPlanSource *plansource)
 		}
 
 		/* Oops, the race case happened.  Release useless locks. */
-		AcquireExecutorLocks(plan->stmt_list, false);
+		ReleaseExecutorLocks(plan->stmt_list, lockedRelids_per_stmt);
+
+		/* Release any PartitionPruneResults that may have been created. */
+		FreePartitionPruneResults(*part_prune_results_list);
+		*part_prune_results_list = NIL;
 	}
 
 	/*
@@ -874,10 +916,14 @@ CheckCachedPlan(CachedPlanSource *plansource)
  * Planning work is done in the caller's memory context.  The finished plan
  * is in a child memory context, which typically should get reparented
  * (unless this is a one-shot plan, in which case we don't copy the plan).
+ *
+ * A list of NILs is returned in *part_prune_results_list, meaning that no
+ * partition pruning has been done yet for the plans in stmt_list.
  */
 static CachedPlan *
 BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
-				ParamListInfo boundParams, QueryEnvironment *queryEnv)
+				ParamListInfo boundParams, QueryEnvironment *queryEnv,
+				List **part_prune_results_list)
 {
 	CachedPlan *plan;
 	List	   *plist;
@@ -1007,6 +1053,17 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 
 	MemoryContextSwitchTo(oldcxt);
 
+	/*
+	 * There are no PartitionPruneResults to add yet, but the list must
+	 * still be initialized to have the same number of elements as the list
+	 * of PlannedStmts.
+	 */
+	*part_prune_results_list = NIL;
+	foreach(lc, plist)
+	{
+		*part_prune_results_list = lappend(*part_prune_results_list, NIL);
+	}
+
 	return plan;
 }
 
@@ -1126,6 +1183,19 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
  * plan or a custom plan for the given parameters: the caller does not know
  * which it will get.
  *
+ * For every PlannedStmt found in the returned CachedPlan, an element that
+ * is either a List of PartitionPruneResults or NIL is added to
+ * *part_prune_results_list.  It is the former if the PlannedStmt is from
+ * an existing CachedPlan that is otherwise valid and has
+ * containsInitialPruning set to true.  Before returning such a CachedPlan,
+ * the "initial" pruning steps are performed by calling
+ * ExecutorDoInitialPruning(), so that AcquireExecutorLocks() locks only
+ * those leaf partitions whose subplans survive the "initial" pruning
+ * conditions.  For each PartitionPruneInfo found in
+ * PlannedStmt.partPruneInfos, a PartitionPruneResult containing the bitmapset
+ * of the indexes of surviving subplans is added to the List for the
+ * PlannedStmt.
+ *
  * On return, the plan is valid and we have sufficient locks to begin
  * execution.
  *
@@ -1139,11 +1209,13 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
  */
 CachedPlan *
 GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
-			  ResourceOwner owner, QueryEnvironment *queryEnv)
+			  ResourceOwner owner, QueryEnvironment *queryEnv,
+			  List **part_prune_results_list)
 {
 	CachedPlan *plan = NULL;
 	List	   *qlist;
 	bool		customplan;
+	List	   *my_part_prune_results_list;
 
 	/* Assert caller is doing things in a sane order */
 	Assert(plansource->magic == CACHEDPLANSOURCE_MAGIC);
@@ -1160,7 +1232,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 
 	if (!customplan)
 	{
-		if (CheckCachedPlan(plansource))
+		if (CheckCachedPlan(plansource, boundParams,
+							&my_part_prune_results_list))
 		{
 			/* We want a generic plan, and we already have a valid one */
 			plan = plansource->gplan;
@@ -1169,7 +1242,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 		else
 		{
 			/* Build a new generic plan */
-			plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv);
+			plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv,
+								   &my_part_prune_results_list);
 			/* Just make real sure plansource->gplan is clear */
 			ReleaseGenericPlan(plansource);
 			/* Link the new generic plan into the plansource */
@@ -1214,7 +1288,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 	if (customplan)
 	{
 		/* Build a custom plan */
-		plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv);
+		plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv,
+							   &my_part_prune_results_list);
 		/* Accumulate total costs of custom plans */
 		plansource->total_custom_cost += cached_plan_cost(plan, true);
 
@@ -1246,6 +1321,9 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 		plan->is_saved = true;
 	}
 
+	if (part_prune_results_list)
+		*part_prune_results_list = my_part_prune_results_list;
+
 	return plan;
 }
 
@@ -1737,17 +1815,29 @@ QueryListGetPrimaryStmt(List *stmts)
 
 /*
  * AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
- * or release them if acquire is false.
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_results_list.
+ *
+ * On return, *lockedRelids_per_stmt will contain a bitmapset for every
+ * PlannedStmt in stmt_list, containing the RT indexes of relation entries
+ * in its range table that were actually locked, or NULL if the PlannedStmt
+ * contains a utility statement.
  */
 static void
-AcquireExecutorLocks(List *stmt_list, bool acquire)
+AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+					 List **part_prune_results_list,
+					 List **lockedRelids_per_stmt)
 {
 	ListCell   *lc1;
 
+	*part_prune_results_list = *lockedRelids_per_stmt = NIL;
 	foreach(lc1, stmt_list)
 	{
 		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
-		ListCell   *lc2;
+		List	   *part_prune_results = NIL;
+		Bitmapset  *allLockRelids;
+		Bitmapset  *lockedRelids = NULL;
+		int			rti;
 
 		if (plannedstmt->commandType == CMD_UTILITY)
 		{
@@ -1761,13 +1851,41 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 			Query	   *query = UtilityContainsQuery(plannedstmt->utilityStmt);
 
 			if (query)
-				ScanQueryForLocks(query, acquire);
+				ScanQueryForLocks(query, true);
+			*part_prune_results_list = lappend(*part_prune_results_list, NIL);
+			*lockedRelids_per_stmt = lappend(*lockedRelids_per_stmt, NULL);
 			continue;
 		}
 
-		foreach(lc2, plannedstmt->rtable)
+		/*
+		 * Figure out the set of relations that would need to be locked
+		 * before executing the plan.
+		 */
+		if (plannedstmt->containsInitialPruning)
 		{
-			RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
+			Bitmapset *scan_leafpart_rtis = NULL;
+
+			/*
+			 * Obtain the set of leaf partitions to be locked.
+			 *
+			 * The following does initial partition pruning using the
+			 * PartitionPruneInfos found in plannedstmt->partPruneInfos and
+			 * finds leaf partitions that survive that pruning across all the
+			 * nodes in the plan tree.
+			 */
+			part_prune_results = ExecutorDoInitialPruning(plannedstmt,
+														  boundParams,
+														  &scan_leafpart_rtis);
+			allLockRelids = bms_union(plannedstmt->minLockRelids,
+									  scan_leafpart_rtis);
+		}
+		else
+			allLockRelids = plannedstmt->minLockRelids;
+
+		rti = -1;
+		while ((rti = bms_next_member(allLockRelids, rti)) >= 0)
+		{
+			RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
 
 			if (rte->rtekind != RTE_RELATION)
 				continue;
@@ -1778,10 +1895,59 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 			 * fail if it's been dropped entirely --- we'll just transiently
 			 * acquire a non-conflicting lock.
 			 */
-			if (acquire)
-				LockRelationOid(rte->relid, rte->rellockmode);
-			else
-				UnlockRelationOid(rte->relid, rte->rellockmode);
+			LockRelationOid(rte->relid, rte->rellockmode);
+			lockedRelids = bms_add_member(lockedRelids, rti);
+		}
+
+		*part_prune_results_list = lappend(*part_prune_results_list,
+										   part_prune_results);
+		*lockedRelids_per_stmt = lappend(*lockedRelids_per_stmt, lockedRelids);
+	}
+}
+
+/*
+ * ReleaseExecutorLocks
+ *		Release locks that were acquired by an earlier call to
+ *		AcquireExecutorLocks()
+ */
+static void
+ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt)
+{
+	ListCell   *lc1,
+			   *lc2;
+
+	forboth(lc1, stmt_list, lc2, lockedRelids_per_stmt)
+	{
+		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
+		Bitmapset  *lockedRelids = lfirst(lc2);
+		int			rti;
+
+		if (plannedstmt->commandType == CMD_UTILITY)
+		{
+			/*
+			 * Ignore utility statements, except those (such as EXPLAIN) that
+			 * contain a parsed-but-not-planned query.  Note: it's okay to use
+			 * ScanQueryForLocks, even though the query hasn't been through
+			 * rule rewriting, because rewriting doesn't change the query
+			 * representation.
+			 */
+			Query	   *query = UtilityContainsQuery(plannedstmt->utilityStmt);
+
+			Assert(lockedRelids == NULL);
+			if (query)
+				ScanQueryForLocks(query, false);
+			continue;
+		}
+
+		rti = -1;
+		while ((rti = bms_next_member(lockedRelids, rti)) >= 0)
+		{
+			RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
+
+			Assert(rte->rtekind == RTE_RELATION);
+
+			/* See the comment in AcquireExecutorLocks(). */
+			UnlockRelationOid(rte->relid, rte->rellockmode);
 		}
 	}
 }
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index 7b1ae6fdcf..5b9098971b 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -303,6 +303,25 @@ PortalDefineQuery(Portal portal,
 	portal->status = PORTAL_DEFINED;
 }
 
+/*
+ * PortalStorePartitionPruneResults
+ *		Copy the given List of Lists of PartitionPruneResults into the
+ *		portal's context
+ *
+ * This allows the caller to ensure that the list exists as long as the portal
+ * does.
+ */
+void
+PortalStorePartitionPruneResults(Portal portal, List *part_prune_results_list)
+{
+	MemoryContext	oldcxt;
+
+	Assert(PortalIsValid(portal));
+	oldcxt = MemoryContextSwitchTo(portal->portalContext);
+	portal->part_prune_results_list = copyObject(part_prune_results_list);
+	MemoryContextSwitchTo(oldcxt);
+}
+
 /*
  * PortalReleaseCachedPlan
  *		Release a portal's reference to its cached plan, if any.
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 9ebde089ae..269cc4d562 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -87,7 +87,9 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
 							  ExplainState *es, const char *queryString,
 							  ParamListInfo params, QueryEnvironment *queryEnv);
 
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt,
+						   List *part_prune_results,
+						   IntoClause *into,
 						   ExplainState *es, const char *queryString,
 						   ParamListInfo params, QueryEnvironment *queryEnv,
 						   const instr_time *planduration,
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index bf962af7af..bd8776402e 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -45,6 +45,7 @@ extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
  * nparts						Length of subplan_map[] and subpart_map[].
  * subplan_map					Subplan index by partition index, or -1.
  * subpart_map					Subpart index by partition index, or -1.
+ * rti_map						Range table index by partition index, or 0.
  * present_parts				A Bitmapset of the partition indexes that we
  *								have subplans or subparts for.
  * initial_pruning_steps		List of PartitionPruneSteps used to
@@ -61,6 +62,7 @@ typedef struct PartitionedRelPruningData
 	int			nparts;
 	int		   *subplan_map;
 	int		   *subpart_map;
+	Index	   *rti_map;
 	Bitmapset  *present_parts;
 	List	   *initial_pruning_steps;
 	List	   *exec_pruning_steps;
@@ -126,5 +128,10 @@ extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
 													 int part_prune_index,
 													 Bitmapset **initially_valid_subplans);
 extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
-										   bool initial_prune);
+										   bool initial_prune,
+										   Bitmapset **scan_leafpart_rtis);
+extern Bitmapset *ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt,
+								ParamListInfo params,
+								PartitionPruneInfo *pruneinfo,
+								Bitmapset **scan_leafpart_rtis);
 #endif							/* EXECPARTITION_H */
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index e79e2c001f..7d4379da7b 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -35,6 +35,8 @@ typedef struct QueryDesc
 	/* These fields are provided by CreateQueryDesc */
 	CmdType		operation;		/* CMD_SELECT, CMD_UPDATE, etc. */
 	PlannedStmt *plannedstmt;	/* planner's output (could be utility, too) */
+	List		*part_prune_results; /* ExecutorDoInitialPruning()'s
+									  * output for plannedstmt */
 	const char *sourceText;		/* source text of the query */
 	Snapshot	snapshot;		/* snapshot to use for query */
 	Snapshot	crosscheck_snapshot;	/* crosscheck for RI update/delete */
@@ -57,6 +59,7 @@ typedef struct QueryDesc
 
 /* in pquery.c */
 extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
+								  List *part_prune_results,
 								  const char *sourceText,
 								  Snapshot snapshot,
 								  Snapshot crosscheck_snapshot,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index ed95ed1176..c9a5e5fb68 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -185,6 +185,9 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
 /*
  * prototypes from functions in execMain.c
  */
+extern List *ExecutorDoInitialPruning(PlannedStmt *plannedstmt,
+									  ParamListInfo params,
+									  Bitmapset **scan_leafpart_rtis);
 extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
 extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
 extern void ExecutorRun(QueryDesc *queryDesc,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 4a741b053f..521a60b988 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -612,6 +612,7 @@ typedef struct EState
 										 * ExecRowMarks, or NULL if none */
 	PlannedStmt *es_plannedstmt;	/* link to top of plan tree */
 	List		*es_part_prune_infos;	/* PlannedStmt.partPruneInfos */
+	List		*es_part_prune_results; /* QueryDesc.part_prune_results */
 	const char *es_sourceText;	/* Source text from QueryDesc */
 
 	JunkFilter *es_junkFilter;	/* top-level junk filter, if any */
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index a80f43e540..937cc4629d 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -212,6 +212,7 @@ extern struct Bitmapset *readBitmapset(void);
 extern uintptr_t readDatum(bool typbyval);
 extern bool *readBoolCols(int numCols);
 extern int *readIntCols(int numCols);
+extern Index *readIndexCols(int numCols);
 extern Oid *readOidCols(int numCols);
 extern int16 *readAttrNumberCols(int numCols);
 
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index fbe75dca0f..354c2e96c3 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -125,6 +125,18 @@ typedef struct PlannerGlobal
 	/* List of PartitionPruneInfo contained in the plan */
 	List	   *partPruneInfos;
 
+	/*
+	 * Do any of those PartitionPruneInfos have initial pruning steps in them?
+	 */
+	bool		containsInitialPruning;
+
+	/*
+	 * Indexes of all range table entries minus indexes of range table entries
+	 * of the leaf partitions scanned by prunable subplans; see
+	 * AcquireExecutorLocks()
+	 */
+	Bitmapset  *minLockRelids;
+
 	/* OIDs of relations the plan depends on */
 	List	   *relationOids;
 
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 2e132afc5a..c0717bf45e 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -73,8 +73,17 @@ typedef struct PlannedStmt
 	List	   *partPruneInfos;	/* List of PartitionPruneInfo contained in
 								 * the plan */
 
+	bool		containsInitialPruning;	/* Do any of those PartitionPruneInfos
+										 * have initial pruning steps in them?
+										 */
+
 	List	   *rtable;			/* list of RangeTblEntry nodes */
 
+	Bitmapset  *minLockRelids;	/* Indexes of all range table entries minus
+								 * indexes of range table entries of the leaf
+								 * partitions scanned by prunable subplans;
+								 * see AcquireExecutorLocks() */
+
 	/* rtable indexes of target relations for INSERT/UPDATE/DELETE/MERGE */
 	List	   *resultRelations;	/* integer list of RT indexes, or NIL */
 
@@ -1410,6 +1419,13 @@ typedef struct PlanRowMark
  * prune_infos			List of Lists containing PartitionedRelPruneInfo nodes,
  *						one sublist per run-time-prunable partition hierarchy
  *						appearing in the parent plan node's subplans.
+ *
+ * needs_init_pruning	Does any of the PartitionedRelPruneInfos in
+ *						prune_infos have its initial_pruning_steps set?
+ *
+ * needs_exec_pruning	Does any of the PartitionedRelPruneInfos in
+ *						prune_infos have its exec_pruning_steps set?
+ *
  * other_subplans		Indexes of any subplans that are not accounted for
  *						by any of the PartitionedRelPruneInfo nodes in
  *						"prune_infos".  These subplans must not be pruned.
@@ -1420,6 +1436,8 @@ typedef struct PartitionPruneInfo
 
 	NodeTag		type;
 	List	   *prune_infos;
+	bool		needs_init_pruning;
+	bool		needs_exec_pruning;
 	Bitmapset  *other_subplans;
 } PartitionPruneInfo;
 
@@ -1464,6 +1482,9 @@ typedef struct PartitionedRelPruneInfo
 	/* relation OID by partition index, or 0 */
 	Oid		   *relid_map pg_node_attr(array_size(nparts));
 
+	/* Range table index by partition index, or 0. */
+	Index	   *rti_map pg_node_attr(array_size(nparts));
+
 	/*
 	 * initial_pruning_steps shows how to prune during executor startup (i.e.,
 	 * without use of any PARAM_EXEC Params); it is NIL if no startup pruning
@@ -1548,6 +1569,31 @@ typedef struct PartitionPruneStepCombine
 	List	   *source_stepids;
 } PartitionPruneStepCombine;
 
+/*----------------
+ * PartitionPruneResult
+ *
+ * The result of performing ExecPartitionDoInitialPruning() on a given
+ * PartitionPruneInfo.
+ *
+ * valid_subplan_offs contains the indexes of subplans remaining after
+ * performing initial pruning by calling ExecFindMatchingSubPlans() on the
+ * PartitionPruneInfo.
+ *
+ * This is used to store the result of initial partition pruning that is
+ * performed before the execution has started.  A module that needs to do so
+ * should call ExecutorDoInitialPruning() on a given PlannedStmt, which
+ * returns a List of PartitionPruneResult containing an entry for each
+ * PartitionPruneInfo present in PlannedStmt.partPruneInfos.  The module
+ * should then pass that list, along with the PlannedStmt, to the executor,
+ * so that it can reuse the result of initial partition pruning when
+ * initializing the subplans for execution.
+ */
+typedef struct PartitionPruneResult
+{
+	NodeTag		type;
+
+	Bitmapset	   *valid_subplan_offs;
+} PartitionPruneResult;
 
 /*
  * Plan invalidation info
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 0499635f59..32579d4788 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -220,7 +220,8 @@ extern List *CachedPlanGetTargetList(CachedPlanSource *plansource,
 extern CachedPlan *GetCachedPlan(CachedPlanSource *plansource,
 								 ParamListInfo boundParams,
 								 ResourceOwner owner,
-								 QueryEnvironment *queryEnv);
+								 QueryEnvironment *queryEnv,
+								 List **part_prune_results_list);
 extern void ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner);
 
 extern bool CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index aeddbdafe5..1901fc5f28 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -138,6 +138,7 @@ typedef struct PortalData
 	QueryCompletion qc;			/* command completion data for executed query */
 	List	   *stmts;			/* list of PlannedStmts */
 	CachedPlan *cplan;			/* CachedPlan, if stmts are from one */
+	List	   *part_prune_results_list;	/* List of Lists of PartitionPruneResults */
 
 	ParamListInfo portalParams; /* params to pass to query */
 	QueryEnvironment *queryEnv; /* environment for query */
@@ -242,6 +243,8 @@ extern void PortalDefineQuery(Portal portal,
 							  CommandTag commandTag,
 							  List *stmts,
 							  CachedPlan *cplan);
+extern void PortalStorePartitionPruneResults(Portal portal,
+											 List *part_prune_results_list);
 extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
 extern void PortalCreateHoldStore(Portal portal);
 extern void PortalHashTableDeleteAll(void);
-- 
2.35.3
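
The PartitionPruneResult comment in the patch above describes a two-step flow: a caller runs initial pruning once for every PartitionPruneInfo in the statement, then hands the results to the executor so that subplan initialization can reuse them. The following is only a toy model of that flow, not PostgreSQL code. ToyPruneInfo, ToyPruneResult, toy_do_initial_pruning and toy_count_valid_subplans are hypothetical names, a uint64_t mask stands in for Bitmapset, the "pruning step" is a made-up equality match, and it is an assumption of this sketch that results are stored in the same order as the prune infos so that the same index addresses both.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for PartitionPruneInfo. */
typedef struct ToyPruneInfo
{
	int			nsubplans;		/* subplans under the Append, 1..64 */
} ToyPruneInfo;

/* Hypothetical stand-in for PartitionPruneResult. */
typedef struct ToyPruneResult
{
	uint64_t	valid_subplan_offs; /* bit s set => subplan s survived */
} ToyPruneResult;

/*
 * Run "initial pruning" once per prune info.  The toy rule keeps exactly
 * one subplan, the one at offset (param % nsubplans), which loosely models
 * an equality qual on the partition key.  results[i] corresponds to
 * infos[i] (assumed ordering, see the lead-in).
 */
static void
toy_do_initial_pruning(const ToyPruneInfo *infos, int ninfos, int param,
					   ToyPruneResult *results)
{
	assert(param >= 0);
	for (int i = 0; i < ninfos; i++)
	{
		assert(infos[i].nsubplans > 0 && infos[i].nsubplans <= 64);
		results[i].valid_subplan_offs =
			UINT64_C(1) << (param % infos[i].nsubplans);
	}
}

/*
 * What node initialization would do with a stored result: count (and then
 * initialize) only the surviving subplans, without pruning a second time.
 */
static int
toy_count_valid_subplans(const ToyPruneResult *result)
{
	int			count = 0;

	for (uint64_t m = result->valid_subplan_offs; m != 0; m >>= 1)
		count += (int) (m & 1);
	return count;
}
```

In this model a caller would invoke toy_do_initial_pruning() once, lock only the relations behind the surviving offsets, and later pass the same results array to whatever plays the role of ExecInit[Merge]Append().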



  [application/octet-stream] v24-0001-Move-PartitioPruneInfo-out-of-plan-nodes-into-Pl.patch (17.2K, 3-v24-0001-Move-PartitioPruneInfo-out-of-plan-nodes-into-Pl.patch)
  download | inline diff:
From 9819109681e87342bf22549f5ea316501f77235d Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Fri, 27 May 2022 16:00:28 +0900
Subject: [PATCH v24 1/2] Move PartitionPruneInfo out of plan nodes into
 PlannedStmt

The planner will now add a given PartitionPruneInfo to
PlannedStmt.partPruneInfos instead of directly to the
Append/MergeAppend plan node.  What gets set instead in the
latter is an index field which points to the list element
of PlannedStmt.partPruneInfos containing the PartitionPruneInfo
belonging to the plan node.

A later commit will make AcquireExecutorLocks() do the initial
partition pruning to determine a minimal set of partitions to be
locked when validating a plan tree and it will need to consult the
PartitionPruneInfos referenced therein to do so.  It is better for
the PartitionPruneInfos to be directly accessible, by simply
iterating over PlannedStmt.partPruneInfos, than to require a walk
of the plan tree to find them.
---
 src/backend/executor/execMain.c         |  1 +
 src/backend/executor/execParallel.c     |  1 +
 src/backend/executor/execPartition.c    |  4 +-
 src/backend/executor/execUtils.c        |  1 +
 src/backend/executor/nodeAppend.c       |  4 +-
 src/backend/executor/nodeMergeAppend.c  |  4 +-
 src/backend/optimizer/plan/createplan.c | 24 ++++-----
 src/backend/optimizer/plan/planner.c    |  1 +
 src/backend/optimizer/plan/setrefs.c    | 65 +++++++++++++------------
 src/backend/partitioning/partprune.c    | 18 ++++---
 src/include/executor/execPartition.h    |  3 +-
 src/include/nodes/execnodes.h           |  1 +
 src/include/nodes/pathnodes.h           |  6 +++
 src/include/nodes/plannodes.h           | 11 +++--
 src/include/partitioning/partprune.h    |  8 +--
 15 files changed, 90 insertions(+), 62 deletions(-)

diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index d78862e660..32475e33ff 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -825,6 +825,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 	ExecInitRangeTable(estate, rangeTable);
 
 	estate->es_plannedstmt = plannedstmt;
+	estate->es_part_prune_infos = plannedstmt->partPruneInfos;
 
 	/*
 	 * Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index 99512826c5..aca0c6f323 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -183,6 +183,7 @@ ExecSerializePlan(Plan *plan, EState *estate)
 	pstmt->dependsOnRole = false;
 	pstmt->parallelModeNeeded = false;
 	pstmt->planTree = plan;
+	pstmt->partPruneInfos = estate->es_part_prune_infos;
 	pstmt->rtable = estate->es_range_table;
 	pstmt->resultRelations = NIL;
 	pstmt->appendRelations = NIL;
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 40e3c07693..80197d5141 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -1791,11 +1791,13 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
 PartitionPruneState *
 ExecInitPartitionPruning(PlanState *planstate,
 						 int n_total_subplans,
-						 PartitionPruneInfo *pruneinfo,
+						 int part_prune_index,
 						 Bitmapset **initially_valid_subplans)
 {
 	PartitionPruneState *prunestate;
 	EState	   *estate = planstate->state;
+	PartitionPruneInfo *pruneinfo = list_nth(estate->es_part_prune_infos,
+											 part_prune_index);
 
 	/* We may need an expression context to evaluate partition exprs */
 	ExecAssignExprContext(estate, planstate);
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 9df1f81ea8..21f4c10937 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -119,6 +119,7 @@ CreateExecutorState(void)
 	estate->es_relations = NULL;
 	estate->es_rowmarks = NULL;
 	estate->es_plannedstmt = NULL;
+	estate->es_part_prune_infos = NIL;
 
 	estate->es_junkFilter = NULL;
 
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 357e10a1d7..c6f86a6510 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -134,7 +134,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 	appendstate->as_begun = false;
 
 	/* If run-time partition pruning is enabled, then set that up now */
-	if (node->part_prune_info != NULL)
+	if (node->part_prune_index >= 0)
 	{
 		PartitionPruneState *prunestate;
 
@@ -145,7 +145,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 		 */
 		prunestate = ExecInitPartitionPruning(&appendstate->ps,
 											  list_length(node->appendplans),
-											  node->part_prune_info,
+											  node->part_prune_index,
 											  &validsubplans);
 		appendstate->as_prune_state = prunestate;
 		nplans = bms_num_members(validsubplans);
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index c5c62fa5c7..8d35860c30 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -82,7 +82,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
 	mergestate->ps.ExecProcNode = ExecMergeAppend;
 
 	/* If run-time partition pruning is enabled, then set that up now */
-	if (node->part_prune_info != NULL)
+	if (node->part_prune_index >= 0)
 	{
 		PartitionPruneState *prunestate;
 
@@ -93,7 +93,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
 		 */
 		prunestate = ExecInitPartitionPruning(&mergestate->ps,
 											  list_length(node->mergeplans),
-											  node->part_prune_info,
+											  node->part_prune_index,
 											  &validsubplans);
 		mergestate->ms_prune_state = prunestate;
 		nplans = bms_num_members(validsubplans);
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index ac86ce9003..50a5719ac6 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -1203,7 +1203,6 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
 	ListCell   *subpaths;
 	int			nasyncplans = 0;
 	RelOptInfo *rel = best_path->path.parent;
-	PartitionPruneInfo *partpruneinfo = NULL;
 	int			nodenumsortkeys = 0;
 	AttrNumber *nodeSortColIdx = NULL;
 	Oid		   *nodeSortOperators = NULL;
@@ -1354,6 +1353,9 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
 		subplans = lappend(subplans, subplan);
 	}
 
+	/* Set below if we find quals that we can use to run-time prune */
+	plan->part_prune_index = -1;
+
 	/*
 	 * If any quals exist, they may be useful to perform further partition
 	 * pruning during execution.  Gather information needed by the executor to
@@ -1377,16 +1379,14 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
 		}
 
 		if (prunequal != NIL)
-			partpruneinfo =
-				make_partition_pruneinfo(root, rel,
-										 best_path->subpaths,
-										 prunequal);
+			plan->part_prune_index = make_partition_pruneinfo(root, rel,
+															  best_path->subpaths,
+															  prunequal);
 	}
 
 	plan->appendplans = subplans;
 	plan->nasyncplans = nasyncplans;
 	plan->first_partial_plan = best_path->first_partial_path;
-	plan->part_prune_info = partpruneinfo;
 
 	copy_generic_path_info(&plan->plan, (Path *) best_path);
 
@@ -1425,7 +1425,6 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
 	List	   *subplans = NIL;
 	ListCell   *subpaths;
 	RelOptInfo *rel = best_path->path.parent;
-	PartitionPruneInfo *partpruneinfo = NULL;
 
 	/*
 	 * We don't have the actual creation of the MergeAppend node split out
@@ -1518,6 +1517,9 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
 		subplans = lappend(subplans, subplan);
 	}
 
+	/* Set below if we find quals that we can use to run-time prune */
+	node->part_prune_index = -1;
+
 	/*
 	 * If any quals exist, they may be useful to perform further partition
 	 * pruning during execution.  Gather information needed by the executor to
@@ -1541,13 +1543,13 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
 		}
 
 		if (prunequal != NIL)
-			partpruneinfo = make_partition_pruneinfo(root, rel,
-													 best_path->subpaths,
-													 prunequal);
+			node->part_prune_index = make_partition_pruneinfo(root, rel,
+															  best_path->subpaths,
+															  prunequal);
 	}
 
 	node->mergeplans = subplans;
-	node->part_prune_info = partpruneinfo;
+
 
 	/*
 	 * If prepare_sort_from_pathkeys added sort columns, but we were told to
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 493a3af0fa..799602f5ea 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -519,6 +519,7 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 	result->dependsOnRole = glob->dependsOnRole;
 	result->parallelModeNeeded = glob->parallelModeNeeded;
 	result->planTree = top_plan;
+	result->partPruneInfos = glob->partPruneInfos;
 	result->rtable = glob->finalrtable;
 	result->resultRelations = glob->resultRelations;
 	result->appendRelations = glob->appendRelations;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 1cb0abdbc1..720f20f563 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -348,6 +348,29 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 		}
 	}
 
+	/* Also fix up the information in PartitionPruneInfos. */
+	foreach (lc, root->partPruneInfos)
+	{
+		PartitionPruneInfo *pruneinfo = lfirst(lc);
+		ListCell  *l;
+
+		foreach(l, pruneinfo->prune_infos)
+		{
+			List	   *prune_infos = lfirst(l);
+			ListCell   *l2;
+
+			foreach(l2, prune_infos)
+			{
+				PartitionedRelPruneInfo *pinfo = lfirst(l2);
+
+				/* RT index of the table to which the pinfo belongs. */
+				pinfo->rtindex += rtoffset;
+			}
+		}
+
+		glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
+	}
+
 	return result;
 }
 
@@ -1658,21 +1681,12 @@ set_append_references(PlannerInfo *root,
 
 	aplan->apprelids = offset_relid_set(aplan->apprelids, rtoffset);
 
-	if (aplan->part_prune_info)
-	{
-		foreach(l, aplan->part_prune_info->prune_infos)
-		{
-			List	   *prune_infos = lfirst(l);
-			ListCell   *l2;
-
-			foreach(l2, prune_infos)
-			{
-				PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
-				pinfo->rtindex += rtoffset;
-			}
-		}
-	}
+	/*
+	 * PartitionPruneInfos will be added to a list in PlannerGlobal, so update
+	 * the index.
+	 */
+	if (aplan->part_prune_index >= 0)
+		aplan->part_prune_index += list_length(root->glob->partPruneInfos);
 
 	/* We don't need to recurse to lefttree or righttree ... */
 	Assert(aplan->plan.lefttree == NULL);
@@ -1734,21 +1748,12 @@ set_mergeappend_references(PlannerInfo *root,
 
 	mplan->apprelids = offset_relid_set(mplan->apprelids, rtoffset);
 
-	if (mplan->part_prune_info)
-	{
-		foreach(l, mplan->part_prune_info->prune_infos)
-		{
-			List	   *prune_infos = lfirst(l);
-			ListCell   *l2;
-
-			foreach(l2, prune_infos)
-			{
-				PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
-				pinfo->rtindex += rtoffset;
-			}
-		}
-	}
+	/*
+	 * PartitionPruneInfos will be added to a list in PlannerGlobal, so update
+	 * the index.
+	 */
+	if (mplan->part_prune_index >= 0)
+		mplan->part_prune_index += list_length(root->glob->partPruneInfos);
 
 	/* We don't need to recurse to lefttree or righttree ... */
 	Assert(mplan->plan.lefttree == NULL);
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 6188bf69cb..6565b6ed01 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -209,16 +209,20 @@ static void partkey_datum_from_expr(PartitionPruneContext *context,
 
 /*
  * make_partition_pruneinfo
- *		Builds a PartitionPruneInfo which can be used in the executor to allow
- *		additional partition pruning to take place.  Returns NULL when
- *		partition pruning would be useless.
+ *		Checks if the given set of quals can be used to build pruning steps
+ *		that the executor can use to prune away unneeded partitions.  If
+ *		suitable quals are found then a PartitionPruneInfo is built and tagged
+ *		onto the PlannerInfo's partPruneInfos list.
+ *
+ * The return value is the 0-based index of the item added to the
+ * partPruneInfos list or -1 if nothing was added.
  *
  * 'parentrel' is the RelOptInfo for an appendrel, and 'subpaths' is the list
  * of scan paths for its child rels.
  * 'prunequal' is a list of potential pruning quals (i.e., restriction
  * clauses that are applicable to the appendrel).
  */
-PartitionPruneInfo *
+int
 make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 						 List *subpaths,
 						 List *prunequal)
@@ -332,7 +336,7 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	 * quals, then we can just not bother with run-time pruning.
 	 */
 	if (prunerelinfos == NIL)
-		return NULL;
+		return -1;
 
 	/* Else build the result data structure */
 	pruneinfo = makeNode(PartitionPruneInfo);
@@ -358,7 +362,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	else
 		pruneinfo->other_subplans = NULL;
 
-	return pruneinfo;
+	root->partPruneInfos = lappend(root->partPruneInfos, pruneinfo);
+
+	return list_length(root->partPruneInfos) - 1;
 }
 
 /*
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 708435e952..bf962af7af 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -123,9 +123,8 @@ typedef struct PartitionPruneState
 
 extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
 													 int n_total_subplans,
-													 PartitionPruneInfo *pruneinfo,
+													 int part_prune_index,
 													 Bitmapset **initially_valid_subplans);
 extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
 										   bool initial_prune);
-
 #endif							/* EXECPARTITION_H */
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 01b1727fc0..4a741b053f 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -611,6 +611,7 @@ typedef struct EState
 	struct ExecRowMark **es_rowmarks;	/* Array of per-range-table-entry
 										 * ExecRowMarks, or NULL if none */
 	PlannedStmt *es_plannedstmt;	/* link to top of plan tree */
+	List		*es_part_prune_infos;	/* PlannedStmt.partPruneInfos */
 	const char *es_sourceText;	/* Source text from QueryDesc */
 
 	JunkFilter *es_junkFilter;	/* top-level junk filter, if any */
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 09342d128d..fbe75dca0f 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -122,6 +122,9 @@ typedef struct PlannerGlobal
 	/* "flat" list of AppendRelInfos */
 	List	   *appendRelations;
 
+	/* List of PartitionPruneInfo contained in the plan */
+	List	   *partPruneInfos;
+
 	/* OIDs of relations the plan depends on */
 	List	   *relationOids;
 
@@ -503,6 +506,9 @@ struct PlannerInfo
 
 	/* Does this query modify any partition key columns? */
 	bool		partColsUpdated;
+
+	/* PartitionPruneInfos added in this query's plan. */
+	List	   *partPruneInfos;
 };
 
 
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 5c2ab1b379..2e132afc5a 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -70,6 +70,9 @@ typedef struct PlannedStmt
 
 	struct Plan *planTree;		/* tree of Plan nodes */
 
+	List	   *partPruneInfos;	/* List of PartitionPruneInfo contained in
+								 * the plan */
+
 	List	   *rtable;			/* list of RangeTblEntry nodes */
 
 	/* rtable indexes of target relations for INSERT/UPDATE/DELETE/MERGE */
@@ -270,8 +273,8 @@ typedef struct Append
 	 */
 	int			first_partial_plan;
 
-	/* Info for run-time subplan pruning; NULL if we're not doing that */
-	struct PartitionPruneInfo *part_prune_info;
+	/* Index to PlannerInfo.partPruneInfos or -1 if no run-time pruning */
+	int			part_prune_index;
 } Append;
 
 /* ----------------
@@ -305,8 +308,8 @@ typedef struct MergeAppend
 	/* NULLS FIRST/LAST directions */
 	bool	   *nullsFirst pg_node_attr(array_size(numCols));
 
-	/* Info for run-time subplan pruning; NULL if we're not doing that */
-	struct PartitionPruneInfo *part_prune_info;
+	/* Index to PlannerInfo.partPruneInfos or -1 if no run-time pruning */
+	int			part_prune_index;
 } MergeAppend;
 
 /* ----------------
diff --git a/src/include/partitioning/partprune.h b/src/include/partitioning/partprune.h
index 90684efa25..ebf0dcff8c 100644
--- a/src/include/partitioning/partprune.h
+++ b/src/include/partitioning/partprune.h
@@ -70,10 +70,10 @@ typedef struct PartitionPruneContext
 #define PruneCxtStateIdx(partnatts, step_id, keyno) \
 	((partnatts) * (step_id) + (keyno))
 
-extern PartitionPruneInfo *make_partition_pruneinfo(struct PlannerInfo *root,
-													struct RelOptInfo *parentrel,
-													List *subpaths,
-													List *prunequal);
+extern int make_partition_pruneinfo(struct PlannerInfo *root,
+									struct RelOptInfo *parentrel,
+									List *subpaths,
+									List *prunequal);
 extern Bitmapset *prune_append_rel_partitions(struct RelOptInfo *rel);
 extern Bitmapset *get_matching_partitions(PartitionPruneContext *context,
 										  List *pruning_steps);
-- 
2.35.3




* Re: generic plans and "initial" pruning
@ 2022-11-30 18:12  Alvaro Herrera <[email protected]>
  parent: Amit Langote <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: Alvaro Herrera @ 2022-11-30 18:12 UTC (permalink / raw)
  To: Amit Langote <[email protected]>; +Cc: Robert Haas <[email protected]>; Jacob Champion <[email protected]>; Zhihong Yu <[email protected]>; David Rowley <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers

Looking at 0001, I wonder if we should have a crosscheck that a
PartitionPruneInfo you got from following an index is indeed constructed
for the relation that you think it is: previously, you were always sure
that the prune struct is for this node, because you followed a pointer
that was set up in the node itself.  Now you only have an index, and you
have to trust that the index is correct.

I'm not sure how to implement this, or even if it's doable at all.
Keeping the OID of the partitioned table in the PartitionPruneInfo
struct is easy, but I don't know how to check it in ExecInitMergeAppend
and ExecInitAppend.
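
To make the concern concrete, here is a minimal, self-contained C sketch of the pointer-to-index change in 0001. It is not PostgreSQL code: ToyPruneInfo, ToyAppend, toy_fix_index and toy_lookup are hypothetical stand-ins. The only ideas taken from the patch are that the plan node stores an index (-1 meaning no run-time pruning) and that the set_*_references() functions shift that index by the current length of the global list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for PartitionPruneInfo: it only remembers its owner. */
typedef struct ToyPruneInfo
{
	int			owner_rti;		/* RT index of the relation it was built for */
} ToyPruneInfo;

/* Hypothetical stand-in for Append/MergeAppend. */
typedef struct ToyAppend
{
	int			parent_rti;		/* RT index of the appendrel being scanned */
	int			part_prune_index;	/* index into the global list, or -1 */
} ToyAppend;

/*
 * Mimics the adjustment made in set_append_references(): a subquery's
 * entries are appended after the n_global entries already in the global
 * list, so a valid local index must be shifted by n_global.
 */
static void
toy_fix_index(ToyAppend *node, int n_global)
{
	assert(n_global >= 0);
	if (node->part_prune_index >= 0)
		node->part_prune_index += n_global;
}

/*
 * Follow the index.  Returns NULL when there is no pruning info or the
 * index is out of range.  Nothing here can tell whether an in-range index
 * points at the entry that was actually built for this node.
 */
static const ToyPruneInfo *
toy_lookup(const ToyAppend *node, const ToyPruneInfo *infos, int ninfos)
{
	if (node->part_prune_index < 0 || node->part_prune_index >= ninfos)
		return NULL;
	return &infos[node->part_prune_index];
}
```

If toy_fix_index() is skipped for a node that came from a subquery, toy_lookup() still returns a valid-looking entry, just one that belongs to a different relation. That silent mismatch is what a crosscheck would catch.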

-- 
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
"Find a bug in a program, and fix it, and the program will work today.
Show the program how to find and fix a bug, and the program
will work forever" (Oliver Silfridge)






* Re: generic plans and "initial" pruning
@ 2022-12-01 07:59  Amit Langote <[email protected]>
  parent: Alvaro Herrera <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: Amit Langote @ 2022-12-01 07:59 UTC (permalink / raw)
  To: Alvaro Herrera <[email protected]>; +Cc: Robert Haas <[email protected]>; Jacob Champion <[email protected]>; Zhihong Yu <[email protected]>; David Rowley <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers

Hi Alvaro,

Thanks for looking at this one.

On Thu, Dec 1, 2022 at 3:12 AM Alvaro Herrera <[email protected]> wrote:
> Looking at 0001, I wonder if we should have a crosscheck that a
> PartitionPruneInfo you got from following an index is indeed constructed
> for the relation that you think it is: previously, you were always sure
> that the prune struct is for this node, because you followed a pointer
> that was set up in the node itself.  Now you only have an index, and you
> have to trust that the index is correct.

Yeah, a crosscheck sounds like a good idea.

> I'm not sure how to implement this, or even if it's doable at all.
> Keeping the OID of the partitioned table in the PartitionPruneInfo
> struct is easy, but I don't know how to check it in ExecInitMergeAppend
> and ExecInitAppend.

Hmm, how about keeping the [Merge]Append's parent relation's RT index
in the PartitionPruneInfo and passing it down to
ExecInitPartitionPruning() from ExecInit[Merge]Append() for
cross-checking?  Both Append and MergeAppend already have an
'apprelids' field that we can save a copy of in the
PartitionPruneInfo.  I tried that in the attached delta patch.
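
The cross-check in the delta patch can be modelled outside PostgreSQL roughly as follows. This is a hedged sketch, not the patch itself: a uint64_t bit mask stands in for Bitmapset, ToyRelids, ToyPruneInfo, toy_offset_relids and toy_get_pruneinfo are hypothetical names, and it assumes all RT indexes stay below 64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t ToyRelids;		/* bit i set => RT index i is a member */

/* Hypothetical stand-in for PartitionPruneInfo. */
typedef struct ToyPruneInfo
{
	ToyRelids	root_parent_relids; /* copied from the parent rel's relids */
} ToyPruneInfo;

/*
 * Shift every member by rtoffset, which is the effect offset_relid_set()
 * has on a Bitmapset of RT indexes.
 */
static ToyRelids
toy_offset_relids(ToyRelids relids, int rtoffset)
{
	assert(rtoffset >= 0 && rtoffset < 64);
	return relids << rtoffset;
}

/*
 * Fetch the entry at part_prune_index and verify that it was built for the
 * same relation(s) as the calling plan node.  Returns NULL on a mismatch or
 * an out-of-range index; the delta patch raises elog(ERROR) on a mismatch.
 */
static const ToyPruneInfo *
toy_get_pruneinfo(const ToyPruneInfo *infos, int ninfos,
				  int part_prune_index, ToyRelids node_apprelids)
{
	const ToyPruneInfo *info;

	if (part_prune_index < 0 || part_prune_index >= ninfos)
		return NULL;
	info = &infos[part_prune_index];
	if (info->root_parent_relids != node_apprelids)
		return NULL;
	return info;
}
```

The check only works if both copies of the relids receive the same rtoffset adjustment, which is why the delta patch offsets root_parent_relids in set_plan_references() just as apprelids is already offset in the plan node.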

-- 
Thanks, Amit Langote
EDB: http://www.enterprisedb.com


Attachments:

  [application/octet-stream] PartitionPruneInfo-relids.patch (5.3K, 2-PartitionPruneInfo-relids.patch)
  download | inline diff:
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 2bd069d889..9a631a9192 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -1791,6 +1791,9 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *		Initialize data structure needed for run-time partition pruning and
  *		do initial pruning if needed
  *
+ * 'root_parent_relids' identifies the relation to which both the parent plan
+ * and the PartitionPruneInfo given by 'part_prune_index' belong.
+ *
  * On return, *initially_valid_subplans is assigned the set of indexes of
  * child subplans that must be initialized along with the parent plan node.
  * Initial pruning is performed here if needed and in that case only the
@@ -1804,6 +1807,7 @@ PartitionPruneState *
 ExecInitPartitionPruning(PlanState *planstate,
 						 int n_total_subplans,
 						 int part_prune_index,
+						 Bitmapset *root_parent_relids,
 						 Bitmapset **initially_valid_subplans)
 {
 	PartitionPruneState *prunestate;
@@ -1811,6 +1815,14 @@ ExecInitPartitionPruning(PlanState *planstate,
 	PartitionPruneInfo *pruneinfo = list_nth(estate->es_part_prune_infos,
 											 part_prune_index);
 
+	/* Sanity: part_prune_index gives the correct PartitionPruneInfo. */
+	if (!bms_equal(root_parent_relids, pruneinfo->root_parent_relids))
+		elog(ERROR, "wrong relids (%s) found in PartitionPruneInfo at part_prune_index=%d which has root_parent_relids=%s",
+			 bmsToString(root_parent_relids),
+			 part_prune_index,
+			 bmsToString(pruneinfo->root_parent_relids));
+
+
 	/* We may need an expression context to evaluate partition exprs */
 	ExecAssignExprContext(estate, planstate);
 
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index c6f86a6510..99830198bd 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -146,6 +146,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 		prunestate = ExecInitPartitionPruning(&appendstate->ps,
 											  list_length(node->appendplans),
 											  node->part_prune_index,
+											  node->apprelids,
 											  &validsubplans);
 		appendstate->as_prune_state = prunestate;
 		nplans = bms_num_members(validsubplans);
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index 8d35860c30..f370f9f287 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -94,6 +94,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
 		prunestate = ExecInitPartitionPruning(&mergestate->ps,
 											  list_length(node->mergeplans),
 											  node->part_prune_index,
+											  node->apprelids,
 											  &validsubplans);
 		mergestate->ms_prune_state = prunestate;
 		nplans = bms_num_members(validsubplans);
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 720f20f563..e67f0e3509 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -354,6 +354,8 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 		PartitionPruneInfo *pruneinfo = lfirst(lc);
 		ListCell  *l;
 
+		pruneinfo->root_parent_relids =
+			offset_relid_set(pruneinfo->root_parent_relids, rtoffset);
 		foreach(l, pruneinfo->prune_infos)
 		{
 			List	   *prune_infos = lfirst(l);
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 6565b6ed01..d48f6784c1 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -340,6 +340,7 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 
 	/* Else build the result data structure */
 	pruneinfo = makeNode(PartitionPruneInfo);
+	pruneinfo->root_parent_relids = parentrel->relids;
 	pruneinfo->prune_infos = prunerelinfos;
 
 	/*
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index bf962af7af..17fabc18c9 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -124,6 +124,7 @@ typedef struct PartitionPruneState
 extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
 													 int n_total_subplans,
 													 int part_prune_index,
+													 Bitmapset *root_parent_relids,
 													 Bitmapset **initially_valid_subplans);
 extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
 										   bool initial_prune);
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 2e132afc5a..b2d6f8fb6e 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -1407,6 +1407,8 @@ typedef struct PlanRowMark
  * Then, since an Append-type node could have multiple partitioning
  * hierarchies among its children, we have an unordered List of those Lists.
  *
+ * root_parent_relids	RelOptInfo.relids of the relation to which the parent
+ *						plan node and this PartitionPruneInfo node belong
  * prune_infos			List of Lists containing PartitionedRelPruneInfo nodes,
  *						one sublist per run-time-prunable partition hierarchy
  *						appearing in the parent plan node's subplans.
@@ -1419,6 +1421,7 @@ typedef struct PartitionPruneInfo
 	pg_node_attr(no_equal)
 
 	NodeTag		type;
+	Bitmapset  *root_parent_relids;
 	List	   *prune_infos;
 	Bitmapset  *other_subplans;
 } PartitionPruneInfo;



* Re: generic plans and "initial" pruning
@ 2022-12-01 11:21  Alvaro Herrera <[email protected]>
  parent: Amit Langote <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: Alvaro Herrera @ 2022-12-01 11:21 UTC (permalink / raw)
  To: Amit Langote <[email protected]>; +Cc: Robert Haas <[email protected]>; Jacob Champion <[email protected]>; Zhihong Yu <[email protected]>; David Rowley <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers

On 2022-Dec-01, Amit Langote wrote:

> Hmm, how about keeping the [Merge]Append's parent relation's RT index
> in the PartitionPruneInfo and passing it down to
> ExecInitPartitionPruning() from ExecInit[Merge]Append() for
> cross-checking?  Both Append and MergeAppend already have a
> 'apprelids' field that we can save a copy of in the
> PartitionPruneInfo.  Tried that in the attached delta patch.

Ah yeah, that sounds about what I was thinking.  I've merged that in and
pushed to github, which had a strange pg_upgrade failure on Windows
mentioning log files that were not captured by the CI tooling.  So I
pushed another one trying to grab those files, in case it wasn't an
a one-off failure.  It's running now:
  https://cirrus-ci.com/task/5857239638999040

If all goes well with this run, I'll get this 0001 pushed.
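
For readers following along, here is a minimal, self-contained sketch of the
cross-check discussed above.  It is not the patch's code: RelidSet,
PruneInfoSketch, relid_set_add() and prune_info_matches() are hypothetical
stand-ins (a 64-bit mask replaces Bitmapset, and the equality test plays the
role of bms_equal()), so it only models RT indexes below 64.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a Bitmapset of RT indexes (bit i = RT index i). */
typedef uint64_t RelidSet;

/* Hypothetical stand-in for the part of PartitionPruneInfo used by the check. */
typedef struct PruneInfoSketch
{
	RelidSet	root_parent_relids; /* saved by the planner */
} PruneInfoSketch;

/* Add one RT index (must be in 0..63) to a set. */
static RelidSet
relid_set_add(RelidSet set, int rti)
{
	assert(rti >= 0 && rti < 64);
	return set | (UINT64_C(1) << rti);
}

/*
 * Return true if the entry at part_prune_index was built for the same parent
 * relation(s) as the plan node that is asking for it.  An out-of-range index
 * is treated as a mismatch.
 */
static bool
prune_info_matches(const PruneInfoSketch *infos, int ninfos,
				   int part_prune_index, RelidSet node_apprelids)
{
	if (part_prune_index < 0 || part_prune_index >= ninfos)
		return false;
	return infos[part_prune_index].root_parent_relids == node_apprelids;
}
```

With two entries whose saved relids are {1} and {3, 4}, a plan node carrying
{1} matches index 0 but not index 1, which is the kind of mix-up the
elog(ERROR) in the delta patch is meant to catch.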

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"Investigación es lo que hago cuando no sé lo que estoy haciendo"
(Wernher von Braun)






* Re: generic plans and "initial" pruning
@ 2022-12-01 12:43  Amit Langote <[email protected]>
  parent: Alvaro Herrera <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: Amit Langote @ 2022-12-01 12:43 UTC (permalink / raw)
  To: Alvaro Herrera <[email protected]>; +Cc: Robert Haas <[email protected]>; Jacob Champion <[email protected]>; Zhihong Yu <[email protected]>; David Rowley <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers

On Thu, Dec 1, 2022 at 8:21 PM Alvaro Herrera <[email protected]> wrote:
> On 2022-Dec-01, Amit Langote wrote:
> > Hmm, how about keeping the [Merge]Append's parent relation's RT index
> > in the PartitionPruneInfo and passing it down to
> > ExecInitPartitionPruning() from ExecInit[Merge]Append() for
> > cross-checking?  Both Append and MergeAppend already have a
> > 'apprelids' field that we can save a copy of in the
> > PartitionPruneInfo.  Tried that in the attached delta patch.
>
> Ah yeah, that sounds about what I was thinking.  I've merged that in and
> pushed to github, which had a strange pg_upgrade failure on Windows
> mentioning log files that were not captured by the CI tooling.  So I
> pushed another one trying to grab those files, in case it wasn't an
> a one-off failure.  It's running now:
>   https://cirrus-ci.com/task/5857239638999040
>
> If all goes well with this run, I'll get this 0001 pushed.

Thanks for pushing 0001.

Rebased 0002 attached.
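
For illustration only, the following sketch models the idea described in the
attached patch's commit message: initial pruning is done once when deciding
which partitions to lock, and the executor reuses that saved result instead of
pruning again.  Everything in it is hypothetical (SubplanSketch,
initial_prune(), subplans_to_init()); range bounds stand in for pruning steps
and a 64-bit mask stands in for the valid_subplan_offs bitmapset, so it
handles at most 64 subplans.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subplan scanning one range partition covering [lo, hi). */
typedef struct SubplanSketch
{
	int			lo;
	int			hi;
} SubplanSketch;

/*
 * Model of initial pruning: return a mask of the subplans whose partition
 * can contain 'param'.  Bit i set means subplan i survives.
 */
static uint64_t
initial_prune(const SubplanSketch *subplans, int n, int param)
{
	uint64_t	valid = 0;

	assert(n >= 0 && n <= 64);
	for (int i = 0; i < n; i++)
	{
		if (param >= subplans[i].lo && param < subplans[i].hi)
			valid |= UINT64_C(1) << i;
	}
	return valid;
}

/*
 * Model of executor initialization: if a pruning result was saved while
 * taking locks, use it as is; otherwise (no saved result) prune now.
 */
static uint64_t
subplans_to_init(const uint64_t *saved_result,
				 const SubplanSketch *subplans, int n, int param)
{
	if (saved_result != NULL)
		return *saved_result;
	return initial_prune(subplans, n, param);
}
```

The point of the second function is that a saved result always wins: even if
redoing the pruning would now select a different subplan, the executor sticks
to the set whose relations were actually locked.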

-- 
Thanks, Amit Langote
EDB: http://www.enterprisedb.com


Attachments:

  [application/octet-stream] v25-0001-Optimize-AcquireExecutorLocks-by-locking-only-un.patch (82.4K, 2-v25-0001-Optimize-AcquireExecutorLocks-by-locking-only-un.patch)
  download | inline diff:
From cff400af6c264d7a2651faec4d963e987797f588 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Wed, 22 Dec 2021 16:55:17 +0900
Subject: [PATCH v25] Optimize AcquireExecutorLocks() by locking only unpruned
 partitions

This commit teaches AcquireExecutorLocks() to perform initial
partition pruning to notionally eliminate the subnodes contained in a
generic cached plan that need not be initialized during the actual
execution of the plan, and to skip locking the partitions scanned by
those subnodes.

The result of performing initial partition pruning this way, before the
actual execution has started, is passed to the executor as a List of
PartitionPruneResult nodes, supplied along with the PlannedStmt by the
callers of the executor that used plancache.c to get the plan.  It is NIL
in the cases in which the plan is obtained by calling the planner
directly or if the plan obtained by plancache.c is not a generic one.
---
 src/backend/commands/copyto.c          |   2 +-
 src/backend/commands/createas.c        |   2 +-
 src/backend/commands/explain.c         |   7 +-
 src/backend/commands/extension.c       |   2 +-
 src/backend/commands/matview.c         |   2 +-
 src/backend/commands/prepare.c         |  26 ++-
 src/backend/executor/README            |  32 ++++
 src/backend/executor/execMain.c        |  51 ++++++
 src/backend/executor/execParallel.c    |  26 ++-
 src/backend/executor/execPartition.c   | 238 +++++++++++++++++++++----
 src/backend/executor/execUtils.c       |   1 +
 src/backend/executor/functions.c       |   2 +-
 src/backend/executor/nodeAppend.c      |  11 +-
 src/backend/executor/nodeMergeAppend.c |   5 +-
 src/backend/executor/spi.c             |  27 ++-
 src/backend/nodes/readfuncs.c          |   8 +-
 src/backend/optimizer/plan/planner.c   |   2 +
 src/backend/optimizer/plan/setrefs.c   |  46 +++++
 src/backend/partitioning/partprune.c   |  41 ++++-
 src/backend/tcop/postgres.c            |   8 +-
 src/backend/tcop/pquery.c              |  28 ++-
 src/backend/utils/cache/plancache.c    | 208 ++++++++++++++++++---
 src/backend/utils/mmgr/portalmem.c     |  19 ++
 src/include/commands/explain.h         |   4 +-
 src/include/executor/execPartition.h   |   9 +-
 src/include/executor/execdesc.h        |   3 +
 src/include/executor/executor.h        |   3 +
 src/include/nodes/execnodes.h          |   1 +
 src/include/nodes/nodes.h              |   1 +
 src/include/nodes/pathnodes.h          |  12 ++
 src/include/nodes/plannodes.h          |  46 +++++
 src/include/utils/plancache.h          |   3 +-
 src/include/utils/portal.h             |   3 +
 33 files changed, 781 insertions(+), 98 deletions(-)

diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index f26cc0d162..401a2280a3 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -558,7 +558,7 @@ BeginCopyTo(ParseState *pstate,
 		((DR_copy *) dest)->cstate = cstate;
 
 		/* Create a QueryDesc requesting no output */
-		cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+		cstate->queryDesc = CreateQueryDesc(plan, NIL, pstate->p_sourcetext,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
 											dest, NULL, NULL, 0);
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 152c29b551..942449544c 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -325,7 +325,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
 		UpdateActiveSnapshotCommandId();
 
 		/* Create a QueryDesc, redirecting output to our tuple receiver */
-		queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+		queryDesc = CreateQueryDesc(plan, NIL, pstate->p_sourcetext,
 									GetActiveSnapshot(), InvalidSnapshot,
 									dest, params, queryEnv, 0);
 
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index f86983c660..2f2b558608 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -407,7 +407,7 @@ ExplainOneQuery(Query *query, int cursorOptions,
 		}
 
 		/* run it (if needed) and produce output */
-		ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+		ExplainOnePlan(plan, NIL, into, es, queryString, params, queryEnv,
 					   &planduration, (es->buffers ? &bufusage : NULL));
 	}
 }
@@ -515,7 +515,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
  * to call it.
  */
 void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, List *part_prune_results,
+			   IntoClause *into, ExplainState *es,
 			   const char *queryString, ParamListInfo params,
 			   QueryEnvironment *queryEnv, const instr_time *planduration,
 			   const BufferUsage *bufusage)
@@ -563,7 +564,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
 		dest = None_Receiver;
 
 	/* Create a QueryDesc for the query */
-	queryDesc = CreateQueryDesc(plannedstmt, queryString,
+	queryDesc = CreateQueryDesc(plannedstmt, part_prune_results, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, params, queryEnv, instrument_option);
 
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index cf1b1ca571..904cbcba4a 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -779,7 +779,7 @@ execute_sql_string(const char *sql)
 			{
 				QueryDesc  *qdesc;
 
-				qdesc = CreateQueryDesc(stmt,
+				qdesc = CreateQueryDesc(stmt, NIL,
 										sql,
 										GetActiveSnapshot(), NULL,
 										dest, NULL, NULL, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 9ac0383459..65c8d0aa59 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -408,7 +408,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
 	UpdateActiveSnapshotCommandId();
 
 	/* Create a QueryDesc, redirecting output to our tuple receiver */
-	queryDesc = CreateQueryDesc(plan, queryString,
+	queryDesc = CreateQueryDesc(plan, NIL, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, NULL, NULL, 0);
 
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 9e29584d93..29b45539d3 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -155,6 +155,7 @@ ExecuteQuery(ParseState *pstate,
 	PreparedStatement *entry;
 	CachedPlan *cplan;
 	List	   *plan_list;
+	List	   *part_prune_results_list;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
 	Portal		portal;
@@ -193,7 +194,10 @@ ExecuteQuery(ParseState *pstate,
 									   entry->plansource->query_string);
 
 	/* Replan if needed, and increment plan refcount for portal */
-	cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL);
+	cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL,
+						  &part_prune_results_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_results_list));
 	plan_list = cplan->stmt_list;
 
 	/*
@@ -207,6 +211,9 @@ ExecuteQuery(ParseState *pstate,
 					  plan_list,
 					  cplan);
 
+	/* Copy Lists of PartitionPruneResults into the portal's context. */
+	PortalStorePartitionPruneResults(portal, part_prune_results_list);
+
 	/*
 	 * For CREATE TABLE ... AS EXECUTE, we must verify that the prepared
 	 * statement is one that produces tuples.  Currently we insist that it be
@@ -576,7 +583,9 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	const char *query_string;
 	CachedPlan *cplan;
 	List	   *plan_list;
-	ListCell   *p;
+	List	   *part_prune_results_list;
+	ListCell   *p,
+			   *pp;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
 	instr_time	planstart;
@@ -619,7 +628,10 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 
 	/* Replan if needed, and acquire a transient refcount */
 	cplan = GetCachedPlan(entry->plansource, paramLI,
-						  CurrentResourceOwner, queryEnv);
+						  CurrentResourceOwner, queryEnv,
+						  &part_prune_results_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_results_list));
 
 	INSTR_TIME_SET_CURRENT(planduration);
 	INSTR_TIME_SUBTRACT(planduration, planstart);
@@ -634,13 +646,15 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	plan_list = cplan->stmt_list;
 
 	/* Explain each query */
-	foreach(p, plan_list)
+	forboth(p, plan_list, pp, part_prune_results_list)
 	{
 		PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
+		List *part_prune_results = lfirst_node(List, pp);
 
 		if (pstmt->commandType != CMD_UTILITY)
-			ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv,
-						   &planduration, (es->buffers ? &bufusage : NULL));
+			ExplainOnePlan(pstmt, part_prune_results, into, es, query_string,
+						   paramLI, queryEnv, &planduration,
+						   (es->buffers ? &bufusage : NULL));
 		else
 			ExplainOneUtility(pstmt->utilityStmt, into, es, query_string,
 							  paramLI, queryEnv);
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 17775a49e2..5c59ac5da7 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -65,6 +65,34 @@ found there.  This currently only occurs for Append and MergeAppend nodes.  In
 this case the non-required subplans are ignored and the executor state's
 subnode array will become out of sequence to the plan's subplan list.
 
+So-called execution-time pruning may also occur even before execution has
+started.  One case where that occurs is when a cached generic
+plan is being validated for execution by plancache.c: GetCachedPlan(), which
+works by locking all the relations that will be scanned by that plan.  If the
+generic plan contains nodes that can perform execution time partition pruning
+(that is, contain a PartitionPruneInfo), a subset of pruning steps contained
+in a given node's PartitionPruneInfo that do not depend on the execution
+actually having started (called "initial" pruning steps) are performed to
+figure out the minimal set of child subplans that satisfy those pruning steps.
+AcquireExecutorLocks() looking at a given generic plan will then lock only the
+relations scanned by the child subplans that survived such pruning, along with
+those present in PlannedStmt.minLockRelids.  Note that the subplans are only
+notionally pruned, that is, they are not removed from the plan tree as such.
+
+To prevent the executor and any third party execution code that can look at
+the plan tree from trying to execute the subplans that were pruned as
+described above, the result of pruning is passed to the executor as a List
+of PartitionPruneResult nodes via the QueryDesc.  Each PartitionPruneResult
+consists of the set of indexes of surviving subplans in the respective parent
+plan node's (the one to which the corresponding PartitionPruneInfo belongs)
+list of child subplans, saved as a bitmapset (valid_subplan_offs).  In other
+words, the executor executing a generic plan should not re-evaluate the set of
+initially valid subplans for a given plan node by redoing the initial pruning
+if it was already done by AcquireExecutorLocks() when validating the plan.
+Such re-evaluation of the pruning steps may very well end up resulting in a
+different set of subplans, containing some whose relations were not locked by
+AcquireExecutorLocks().
+
 Each Plan node may have expression trees associated with it, to represent
 its target list, qualification conditions, etc.  These trees are also
 read-only to the executor, but the executor state for expression evaluation
@@ -286,6 +314,10 @@ Query Processing Control Flow
 
 This is a sketch of control flow for full query processing:
 
+  [ ExecutorDoInitialPruning ] --- an optional step to perform initial
+		partition pruning on the plan tree, the result of which is passed
+		to the executor via QueryDesc
+
 	CreateQueryDesc
 
 	ExecutorStart
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index b6751da574..7a4db80104 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -49,6 +49,7 @@
 #include "commands/matview.h"
 #include "commands/trigger.h"
 #include "executor/execdebug.h"
+#include "executor/execPartition.h"
 #include "executor/nodeSubplan.h"
 #include "foreign/fdwapi.h"
 #include "jit/jit.h"
@@ -104,6 +105,54 @@ static void EvalPlanQualStart(EPQState *epqstate, Plan *planTree);
 
 /* end of local decls */
 
+/* ----------------------------------------------------------------
+ *		ExecutorDoInitialPruning
+ *
+ *		For each plan tree node that has been assigned a PartitionPruneInfo,
+ *		this performs initial partition pruning using the information contained
+ *		therein to determine the set of child subplans that satisfy the initial
+ *		pruning steps, to be returned as a bitmapset of their indexes in the
+ *		node's list of child subplans (for example, an Append's appendplans).
+ *
+ * Return value is a List of PartitionPruneResult nodes, one for every
+ * PartitionPruneInfo, each containing the bitmapset described above.  The RT
+ * indexes of all the leaf partitions scanned by the chosen subplans are added
+ * to *scan_leafpart_rtis, which is shared across all PartitionPruneInfos.
+ *
+ * The executor must see exactly the same set of subplans as valid for
+ * execution when doing ExecInitNode() on the plan nodes whose
+ * PartitionPruneInfos are processed here.  So, it must get the set from the
+ * aforementioned PartitionPruneResult, instead of computing it all over
+ * again by redoing the initial pruning.  It's the caller's job to pass the
+ * PartitionPruneResult to the executor.
+ *
+ * Note: Partitioned tables mentioned in PartitionedRelPruneInfo nodes that
+ * drive the pruning will be locked before doing the pruning.
+ * ----------------------------------------------------------------
+ */
+List *
+ExecutorDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
+						 Bitmapset **scan_leafpart_rtis)
+{
+	List	 *part_prune_results = NIL;
+	ListCell *lc;
+
+	/* Only get here if there is any pruning to do. */
+	Assert(plannedstmt->containsInitialPruning);
+
+	foreach(lc, plannedstmt->partPruneInfos)
+	{
+		PartitionPruneInfo *pruneinfo = lfirst(lc);
+		PartitionPruneResult *pruneresult = makeNode(PartitionPruneResult);
+
+		pruneresult->valid_subplan_offs =
+			ExecPartitionDoInitialPruning(plannedstmt, params, pruneinfo,
+										  scan_leafpart_rtis);
+		part_prune_results = lappend(part_prune_results, pruneresult);
+	}
+
+	return part_prune_results;
+}
 
 /* ----------------------------------------------------------------
  *		ExecutorStart
@@ -806,6 +855,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 {
 	CmdType		operation = queryDesc->operation;
 	PlannedStmt *plannedstmt = queryDesc->plannedstmt;
+	List	   *part_prune_results = queryDesc->part_prune_results;
 	Plan	   *plan = plannedstmt->planTree;
 	List	   *rangeTable = plannedstmt->rtable;
 	EState	   *estate = queryDesc->estate;
@@ -826,6 +876,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 
 	estate->es_plannedstmt = plannedstmt;
 	estate->es_part_prune_infos = plannedstmt->partPruneInfos;
+	estate->es_part_prune_results = part_prune_results;
 
 	/*
 	 * Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index aca0c6f323..917079a034 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -66,6 +66,7 @@
 #define PARALLEL_KEY_QUERY_TEXT		UINT64CONST(0xE000000000000008)
 #define PARALLEL_KEY_JIT_INSTRUMENTATION UINT64CONST(0xE000000000000009)
 #define PARALLEL_KEY_WAL_USAGE			UINT64CONST(0xE00000000000000A)
+#define PARALLEL_KEY_PARTITION_PRUNE_RESULTS	UINT64CONST(0xE00000000000000B)
 
 #define PARALLEL_TUPLE_QUEUE_SIZE		65536
 
@@ -182,6 +183,7 @@ ExecSerializePlan(Plan *plan, EState *estate)
 	pstmt->transientPlan = false;
 	pstmt->dependsOnRole = false;
 	pstmt->parallelModeNeeded = false;
+	pstmt->containsInitialPruning = false;
 	pstmt->planTree = plan;
 	pstmt->partPruneInfos = estate->es_part_prune_infos;
 	pstmt->rtable = estate->es_range_table;
@@ -597,12 +599,15 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	FixedParallelExecutorState *fpes;
 	char	   *pstmt_data;
 	char	   *pstmt_space;
+	char	   *part_prune_results_data;
+	char	   *part_prune_results_space;
 	char	   *paramlistinfo_space;
 	BufferUsage *bufusage_space;
 	WalUsage   *walusage_space;
 	SharedExecutorInstrumentation *instrumentation = NULL;
 	SharedJitInstrumentation *jit_instrumentation = NULL;
 	int			pstmt_len;
+	int			part_prune_results_len;
 	int			paramlistinfo_len;
 	int			instrumentation_len = 0;
 	int			jit_instrumentation_len = 0;
@@ -631,6 +636,7 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 
 	/* Fix up and serialize plan to be sent to workers. */
 	pstmt_data = ExecSerializePlan(planstate->plan, estate);
+	part_prune_results_data = nodeToString(estate->es_part_prune_results);
 
 	/* Create a parallel context. */
 	pcxt = CreateParallelContext("postgres", "ParallelQueryMain", nworkers);
@@ -657,6 +663,11 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	shm_toc_estimate_chunk(&pcxt->estimator, pstmt_len);
 	shm_toc_estimate_keys(&pcxt->estimator, 1);
 
+	/* Estimate space for serialized List of PartitionPruneResult. */
+	part_prune_results_len = strlen(part_prune_results_data) + 1;
+	shm_toc_estimate_chunk(&pcxt->estimator, part_prune_results_len);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
 	/* Estimate space for serialized ParamListInfo. */
 	paramlistinfo_len = EstimateParamListSpace(estate->es_param_list_info);
 	shm_toc_estimate_chunk(&pcxt->estimator, paramlistinfo_len);
@@ -751,6 +762,12 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	memcpy(pstmt_space, pstmt_data, pstmt_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PLANNEDSTMT, pstmt_space);
 
+	/* Store serialized List of PartitionPruneResult */
+	part_prune_results_space = shm_toc_allocate(pcxt->toc, part_prune_results_len);
+	memcpy(part_prune_results_space, part_prune_results_data, part_prune_results_len);
+	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARTITION_PRUNE_RESULTS,
+				   part_prune_results_space);
+
 	/* Store serialized ParamListInfo. */
 	paramlistinfo_space = shm_toc_allocate(pcxt->toc, paramlistinfo_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARAMLISTINFO, paramlistinfo_space);
@@ -1232,8 +1249,10 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 						 int instrument_options)
 {
 	char	   *pstmtspace;
+	char	   *part_prune_results_space;
 	char	   *paramspace;
 	PlannedStmt *pstmt;
+	List	   *part_prune_results;
 	ParamListInfo paramLI;
 	char	   *queryString;
 
@@ -1244,12 +1263,17 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 	pstmtspace = shm_toc_lookup(toc, PARALLEL_KEY_PLANNEDSTMT, false);
 	pstmt = (PlannedStmt *) stringToNode(pstmtspace);
 
+	/* Reconstruct leader-supplied PartitionPruneResult. */
+	part_prune_results_space =
+		shm_toc_lookup(toc, PARALLEL_KEY_PARTITION_PRUNE_RESULTS, false);
+	part_prune_results = (List *) stringToNode(part_prune_results_space);
+
 	/* Reconstruct ParamListInfo. */
 	paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
 	paramLI = RestoreParamList(&paramspace);
 
 	/* Create a QueryDesc for the query. */
-	return CreateQueryDesc(pstmt,
+	return CreateQueryDesc(pstmt, part_prune_results,
 						   queryString,
 						   GetActiveSnapshot(), InvalidSnapshot,
 						   receiver, paramLI, NULL, instrument_options);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 8e6453aec2..13e450c0fa 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -25,6 +25,7 @@
 #include "mb/pg_wchar.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
+#include "parser/parsetree.h"
 #include "partitioning/partbounds.h"
 #include "partitioning/partdesc.h"
 #include "partitioning/partprune.h"
@@ -185,7 +186,11 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
 static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
 static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
 static PartitionPruneState *CreatePartitionPruneState(PlanState *planstate,
-													  PartitionPruneInfo *pruneinfo);
+													  PartitionPruneInfo *pruneinfo,
+													  bool consider_initial_steps,
+													  bool consider_exec_steps,
+													  List *rtable, ExprContext *econtext,
+													  PartitionDirectory partdir);
 static void InitPartitionPruneContext(PartitionPruneContext *context,
 									  List *pruning_steps,
 									  PartitionDesc partdesc,
@@ -198,7 +203,8 @@ static void PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
 static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
 										   PartitionedRelPruningData *pprune,
 										   bool initial_prune,
-										   Bitmapset **validsubplans);
+										   Bitmapset **validsubplans,
+										   Bitmapset **scan_leafpart_rtis);
 
 
 /*
@@ -1758,8 +1764,10 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  * considered to be a stable expression, it can change value from one plan
  * node scan to the next during query execution.  Stable comparison
  * expressions that don't involve such Params allow partition pruning to be
- * done once during executor startup.  Expressions that do involve such Params
- * require us to prune separately for each scan of the parent plan node.
+ * done once during executor startup or during ExecutorDoInitialPruning(),
+ * which runs as part of performing AcquireExecutorLocks() on a given plan tree.
+ * Expressions that do involve such Params require us to prune separately for
+ * each scan of the parent plan node.
  *
  * Note that pruning away unneeded subplans during executor startup has the
  * added benefit of not having to initialize the unneeded subplans at all.
@@ -1776,6 +1784,13 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *		account for initial pruning possibly having eliminated some of the
  *		subplans.
  *
+ * ExecPartitionDoInitialPruning:
+ *		Do initial pruning with the information contained in a given
+ *		PartitionPruneInfo to determine the minimal set of child subplans
+ *		of the parent plan node (the one to which the PartitionPruneInfo
+ *		belongs) that need to be executed, and also the set of RT indexes of
+ *		the leaf partitions that will be scanned by those subplans.
+ *
  * ExecFindMatchingSubPlans:
  *		Returns indexes of matching subplans after evaluating the expressions
  *		that are safe to evaluate at a given point.  This function is first
@@ -1796,8 +1811,9 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *
  * On return, *initially_valid_subplans is assigned the set of indexes of
  * child subplans that must be initialized along with the parent plan node.
- * Initial pruning is performed here if needed and in that case only the
- * surviving subplans' indexes are added.
+ * Initial pruning is performed here if needed (unless it has already been done
+ * by ExecutorDoInitialPruning()), and in that case only the surviving
+ * subplans' indexes are added.
  *
  * If subplans are indeed pruned, subplan_map arrays contained in the returned
  * PartitionPruneState are re-sequenced to not count those, though only if the
@@ -1810,9 +1826,10 @@ ExecInitPartitionPruning(PlanState *planstate,
 						 Bitmapset *root_parent_relids,
 						 Bitmapset **initially_valid_subplans)
 {
-	PartitionPruneState *prunestate;
+	PartitionPruneState *prunestate = NULL;
 	EState	   *estate = planstate->state;
 	PartitionPruneInfo *pruneinfo;
+	PartitionPruneResult *pruneresult = NULL;
 
 	/* Obtain the pruneinfo we need, and make sure it's the right one */
 	pruneinfo = list_nth(estate->es_part_prune_infos, part_prune_index);
@@ -1828,20 +1845,57 @@ ExecInitPartitionPruning(PlanState *planstate,
 	/* We may need an expression context to evaluate partition exprs */
 	ExecAssignExprContext(estate, planstate);
 
-	/* Create the working data structure for pruning */
-	prunestate = CreatePartitionPruneState(planstate, pruneinfo);
+	/*
+	 * No need to do initial pruning if it was done already by
+	 * ExecutorDoInitialPruning(), which it would be if es_part_prune_results
+	 * is set.
+	 */
+	if (estate->es_part_prune_results)
+	{
+		pruneresult = list_nth(estate->es_part_prune_results, part_prune_index);
+		Assert(IsA(pruneresult, PartitionPruneResult));
+	}
+
+	if (pruneresult == NULL || pruneinfo->needs_exec_pruning)
+	{
+		/* We may need an expression context to evaluate partition exprs */
+		ExecAssignExprContext(estate, planstate);
+
+		/* For data reading, executor always omits detached partitions */
+		if (estate->es_partition_directory == NULL)
+			estate->es_partition_directory =
+				CreatePartitionDirectory(estate->es_query_cxt, false);
+
+		/*
+		 * Create the working data structure for pruning.  No need to consider
+		 * initial pruning steps if we have a PartitionPruneResult.
+		 */
+		prunestate = CreatePartitionPruneState(planstate, pruneinfo,
+											   pruneresult == NULL,
+											   pruneinfo->needs_exec_pruning,
+											   NIL, planstate->ps_ExprContext,
+											   estate->es_partition_directory);
+	}
 
 	/*
 	 * Perform an initial partition prune pass, if required.
 	 */
-	if (prunestate->do_initial_prune)
-		*initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true);
+	if (pruneresult)
+	{
+		*initially_valid_subplans = bms_copy(pruneresult->valid_subplan_offs);
+	}
+	else if (prunestate && prunestate->do_initial_prune)
+	{
+		*initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true,
+															 NULL);
+	}
 	else
 	{
-		/* No pruning, so we'll need to initialize all subplans */
+		/* No initial pruning, so we'll need to initialize all subplans */
 		Assert(n_total_subplans > 0);
 		*initially_valid_subplans = bms_add_range(NULL, 0,
 												  n_total_subplans - 1);
+		return prunestate;
 	}
 
 	/*
@@ -1849,7 +1903,8 @@ ExecInitPartitionPruning(PlanState *planstate,
 	 * that were removed above due to initial pruning.  No need to do this if
 	 * no steps were removed.
 	 */
-	if (bms_num_members(*initially_valid_subplans) < n_total_subplans)
+	if (prunestate &&
+		bms_num_members(*initially_valid_subplans) < n_total_subplans)
 	{
 		/*
 		 * We can safely skip this when !do_exec_prune, even though that
@@ -1865,11 +1920,74 @@ ExecInitPartitionPruning(PlanState *planstate,
 	return prunestate;
 }
 
+/*
+ * ExecPartitionDoInitialPruning
+ *		Perform initial pruning using given PartitionPruneInfo to determine
+ *		the minimal set of child subplans that will be executed and also the
+ *		set of RT indexes of the leaf partitions scanned by those subplans.
+ */
+Bitmapset *
+ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
+							  PartitionPruneInfo *pruneinfo,
+							  Bitmapset **scan_leafpart_rtis)
+{
+	List		 *rtable = plannedstmt->rtable;
+	ExprContext	 *econtext;
+	PartitionDirectory pdir;
+	MemoryContext oldcontext,
+				  tmpcontext;
+	PartitionPruneState *prunestate;
+	Bitmapset	 *valid_subplan_offs;
+
+	/*
+	 * A temporary context for memory allocations required while executing
+	 * partition pruning steps.
+	 */
+	tmpcontext = AllocSetContextCreate(CurrentMemoryContext,
+									   "initial pruning working data",
+									   ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(tmpcontext);
+
+	/*
+	 * PartitionDirectory to look up partition descriptors.
+	 * Note that we don't omit detached partitions, just like during
+	 * execution proper.
+	 */
+	pdir = CreatePartitionDirectory(CurrentMemoryContext, false);
+
+	/*
+	 * We don't yet have a PlanState for the parent plan node, so we must
+	 * create a standalone ExprContext to evaluate pruning expressions,
+	 * equipped with the information about the EXTERN parameters that the
+	 * caller passed us.  That's okay because the initial pruning steps do
+	 * not contain anything that requires execution to have started, and
+	 * thus do not need the information contained in a PlanState.
+	 */
+	econtext = CreateStandaloneExprContext();
+	econtext->ecxt_param_list_info = params;
+	prunestate = CreatePartitionPruneState(NULL, pruneinfo, true, false,
+										   rtable, econtext, pdir);
+	MemoryContextSwitchTo(oldcontext);
+
+	/* Do the initial pruning. */
+	valid_subplan_offs = ExecFindMatchingSubPlans(prunestate, true,
+												  scan_leafpart_rtis);
+
+	FreeExprContext(econtext, true);
+	DestroyPartitionDirectory(pdir);
+	MemoryContextDelete(tmpcontext);
+
+	return valid_subplan_offs;
+}
+
 /*
  * CreatePartitionPruneState
  *		Build the data structure required for calling ExecFindMatchingSubPlans
  *
- * 'planstate' is the parent plan node's execution state.
+ * 'planstate', if not NULL, is the parent plan node's execution state.  It
+ * can be NULL if being called before ExecutorStart(), in which case,
+ * 'rtable' (range table), 'econtext', and 'partdir' must be explicitly
+ * provided.
  *
  * 'pruneinfo' is a PartitionPruneInfo as generated by
  * make_partition_pruneinfo.  Here we build a PartitionPruneState containing a
@@ -1883,19 +2001,21 @@ ExecInitPartitionPruning(PlanState *planstate,
  * PartitionedRelPruneInfo.
  */
 static PartitionPruneState *
-CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
+CreatePartitionPruneState(PlanState *planstate,
+						  PartitionPruneInfo *pruneinfo,
+						  bool consider_initial_steps,
+						  bool consider_exec_steps,
+						  List *rtable, ExprContext *econtext,
+						  PartitionDirectory partdir)
 {
-	EState	   *estate = planstate->state;
+	EState	   *estate = planstate ? planstate->state : NULL;
 	PartitionPruneState *prunestate;
 	int			n_part_hierarchies;
 	ListCell   *lc;
 	int			i;
-	ExprContext *econtext = planstate->ps_ExprContext;
 
-	/* For data reading, executor always omits detached partitions */
-	if (estate->es_partition_directory == NULL)
-		estate->es_partition_directory =
-			CreatePartitionDirectory(estate->es_query_cxt, false);
+	Assert((estate != NULL) ||
+			(partdir != NULL && econtext != NULL && rtable != NIL));
 
 	n_part_hierarchies = list_length(pruneinfo->prune_infos);
 	Assert(n_part_hierarchies > 0);
@@ -1950,15 +2070,42 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 			PartitionKey partkey;
 
 			/*
-			 * We can rely on the copies of the partitioned table's partition
-			 * key and partition descriptor appearing in its relcache entry,
-			 * because that entry will be held open and locked for the
-			 * duration of this executor run.
+			 * Must open the relation by ourselves when called before the
+			 * execution has started, such as when called during
+			 * ExecutorDoInitialPruning() on a cached plan.  In that case,
+			 * sub-partitions must be locked, because AcquirePlannerLocks()
+			 * would not have seen them. (1st relation in a partrelpruneinfos
+			 * list is always the root partitioned table appearing in the
+			 * query, which AcquirePlannerLocks() would have locked; the
+			 * Assert in relation_open() guards that assumption.)
+			 */
+			if (estate == NULL)
+			{
+				RangeTblEntry *rte = rt_fetch(pinfo->rtindex, rtable);
+				int		lockmode = (j == 0) ? NoLock : rte->rellockmode;
+
+				partrel = table_open(rte->relid, lockmode);
+			}
+			else
+				partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
+
+			/*
+			 * We can rely on the copy of the partitioned table's partition
+			 * key from its relcache entry, because it can't change (or
+			 * get destroyed) as long as the relation is locked.  Partition
+			 * descriptor is taken from the PartitionDirectory associated with
+			 * the table that is held open long enough for the descriptor to
+			 * remain valid while it's used to perform the pruning steps.
 			 */
-			partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
 			partkey = RelationGetPartitionKey(partrel);
-			partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
-												partrel);
+			partdesc = PartitionDirectoryLookup(partdir, partrel);
+
+			/*
+			 * Must close partrel, keeping the lock taken, if we're not using
+			 * EState's entry.
+			 */
+			if (estate == NULL)
+				table_close(partrel, NoLock);
 
 			/*
 			 * Initialize the subplan_map and subpart_map.
@@ -1972,6 +2119,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 			Assert(partdesc->nparts >= pinfo->nparts);
 			pprune->nparts = partdesc->nparts;
 			pprune->subplan_map = palloc(sizeof(int) * partdesc->nparts);
+			pprune->rti_map = palloc(sizeof(Index) * partdesc->nparts);
 			if (partdesc->nparts == pinfo->nparts)
 			{
 				/*
@@ -1982,6 +2130,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 				pprune->subpart_map = pinfo->subpart_map;
 				memcpy(pprune->subplan_map, pinfo->subplan_map,
 					   sizeof(int) * pinfo->nparts);
+				memcpy(pprune->rti_map, pinfo->rti_map,
+					   sizeof(Index) * pinfo->nparts);
 
 				/*
 				 * Double-check that the list of unpruned relations has not
@@ -2032,6 +2182,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 							pinfo->subplan_map[pd_idx];
 						pprune->subpart_map[pp_idx] =
 							pinfo->subpart_map[pd_idx];
+						pprune->rti_map[pp_idx] =
+							pinfo->rti_map[pd_idx];
 						pd_idx++;
 					}
 					else
@@ -2039,6 +2191,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 						/* this partdesc entry is not in the plan */
 						pprune->subplan_map[pp_idx] = -1;
 						pprune->subpart_map[pp_idx] = -1;
+						pprune->rti_map[pp_idx] = 0;
 					}
 				}
 
@@ -2060,7 +2213,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 			 * Initialize pruning contexts as needed.
 			 */
 			pprune->initial_pruning_steps = pinfo->initial_pruning_steps;
-			if (pinfo->initial_pruning_steps)
+			if (consider_initial_steps && pinfo->initial_pruning_steps)
 			{
 				InitPartitionPruneContext(&pprune->initial_context,
 										  pinfo->initial_pruning_steps,
@@ -2070,7 +2223,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 				prunestate->do_initial_prune = true;
 			}
 			pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
-			if (pinfo->exec_pruning_steps)
+			if (consider_exec_steps && pinfo->exec_pruning_steps)
 			{
 				InitPartitionPruneContext(&pprune->exec_context,
 										  pinfo->exec_pruning_steps,
@@ -2298,10 +2451,14 @@ PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
  * Pass initial_prune if PARAM_EXEC Params cannot yet be evaluated.  This
  * differentiates the initial executor-time pruning step from later
  * runtime pruning.
+ *
+ * RT indexes of leaf partitions scanned by the chosen subplans are added to
+ * *scan_leafpart_rtis if the pointer is non-NULL.
  */
 Bitmapset *
 ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
-						 bool initial_prune)
+						 bool initial_prune,
+						 Bitmapset **scan_leafpart_rtis)
 {
 	Bitmapset  *result = NULL;
 	MemoryContext oldcontext;
@@ -2336,7 +2493,7 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
 		 */
 		pprune = &prunedata->partrelprunedata[0];
 		find_matching_subplans_recurse(prunedata, pprune, initial_prune,
-									   &result);
+									   &result, scan_leafpart_rtis);
 
 		/* Expression eval may have used space in ExprContext too */
 		if (pprune->exec_pruning_steps)
@@ -2350,6 +2507,8 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
 
 	/* Copy result out of the temp context before we reset it */
 	result = bms_copy(result);
+	if (scan_leafpart_rtis)
+		*scan_leafpart_rtis = bms_copy(*scan_leafpart_rtis);
 
 	MemoryContextReset(prunestate->prune_context);
 
@@ -2360,13 +2519,15 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
  * find_matching_subplans_recurse
  *		Recursive worker function for ExecFindMatchingSubPlans
  *
- * Adds valid (non-prunable) subplan IDs to *validsubplans
+ * Adds valid (non-prunable) subplan IDs to *validsubplans and RT indexes of
+ * the corresponding leaf partitions to *scan_leafpart_rtis (if asked for).
  */
 static void
 find_matching_subplans_recurse(PartitionPruningData *prunedata,
 							   PartitionedRelPruningData *pprune,
 							   bool initial_prune,
-							   Bitmapset **validsubplans)
+							   Bitmapset **validsubplans,
+							   Bitmapset **scan_leafpart_rtis)
 {
 	Bitmapset  *partset;
 	int			i;
@@ -2393,8 +2554,14 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
 	while ((i = bms_next_member(partset, i)) >= 0)
 	{
 		if (pprune->subplan_map[i] >= 0)
+		{
 			*validsubplans = bms_add_member(*validsubplans,
 											pprune->subplan_map[i]);
+			Assert(pprune->rti_map[i] > 0);
+			if (scan_leafpart_rtis)
+				*scan_leafpart_rtis = bms_add_member(*scan_leafpart_rtis,
+													 pprune->rti_map[i]);
+		}
 		else
 		{
 			int			partidx = pprune->subpart_map[i];
@@ -2402,7 +2569,8 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
 			if (partidx >= 0)
 				find_matching_subplans_recurse(prunedata,
 											   &prunedata->partrelprunedata[partidx],
-											   initial_prune, validsubplans);
+											   initial_prune, validsubplans,
+											   scan_leafpart_rtis);
 			else
 			{
 				/*
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 9695de85b9..dce93a8c9f 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -135,6 +135,7 @@ CreateExecutorState(void)
 	estate->es_param_exec_vals = NULL;
 
 	estate->es_queryEnv = NULL;
+	estate->es_part_prune_results = NIL;
 
 	estate->es_query_cxt = qcontext;
 
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index dc13625171..bffb42ce71 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -842,7 +842,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
 	else
 		dest = None_Receiver;
 
-	es->qd = CreateQueryDesc(es->stmt,
+	es->qd = CreateQueryDesc(es->stmt, NIL,
 							 fcache->src,
 							 GetActiveSnapshot(),
 							 InvalidSnapshot,
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 99830198bd..3b917584de 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -156,7 +156,8 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 		 * subplan, we can fill as_valid_subplans immediately, preventing
 		 * later calls to ExecFindMatchingSubPlans.
 		 */
-		if (!prunestate->do_exec_prune && nplans > 0)
+		if (appendstate->as_prune_state == NULL ||
+			(!appendstate->as_prune_state->do_exec_prune && nplans > 0))
 			appendstate->as_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
 	}
 	else
@@ -578,7 +579,7 @@ choose_next_subplan_locally(AppendState *node)
 		}
 		else if (node->as_valid_subplans == NULL)
 			node->as_valid_subplans =
-				ExecFindMatchingSubPlans(node->as_prune_state, false);
+				ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 
 		whichplan = -1;
 	}
@@ -643,7 +644,7 @@ choose_next_subplan_for_leader(AppendState *node)
 		if (node->as_valid_subplans == NULL)
 		{
 			node->as_valid_subplans =
-				ExecFindMatchingSubPlans(node->as_prune_state, false);
+				ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 
 			/*
 			 * Mark each invalid plan as finished to allow the loop below to
@@ -718,7 +719,7 @@ choose_next_subplan_for_worker(AppendState *node)
 	else if (node->as_valid_subplans == NULL)
 	{
 		node->as_valid_subplans =
-			ExecFindMatchingSubPlans(node->as_prune_state, false);
+			ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 		mark_invalid_subplans_as_finished(node);
 	}
 
@@ -869,7 +870,7 @@ ExecAppendAsyncBegin(AppendState *node)
 	if (node->as_valid_subplans == NULL)
 	{
 		node->as_valid_subplans =
-			ExecFindMatchingSubPlans(node->as_prune_state, false);
+			ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 
 		classify_matching_subplans(node);
 	}
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index f370f9f287..ccfa083945 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -104,7 +104,8 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
 		 * subplan, we can fill ms_valid_subplans immediately, preventing
 		 * later calls to ExecFindMatchingSubPlans.
 		 */
-		if (!prunestate->do_exec_prune && nplans > 0)
+		if (mergestate->ms_prune_state == NULL ||
+			(!mergestate->ms_prune_state->do_exec_prune && nplans > 0))
 			mergestate->ms_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
 	}
 	else
@@ -219,7 +220,7 @@ ExecMergeAppend(PlanState *pstate)
 		 */
 		if (node->ms_valid_subplans == NULL)
 			node->ms_valid_subplans =
-				ExecFindMatchingSubPlans(node->ms_prune_state, false);
+				ExecFindMatchingSubPlans(node->ms_prune_state, false, NULL);
 
 		/*
 		 * First time through: pull the first tuple from each valid subplan,
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index fd5796f1b9..93012a5b3b 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1578,6 +1578,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 	CachedPlanSource *plansource;
 	CachedPlan *cplan;
 	List	   *stmt_list;
+	List	   *part_prune_results_list;
 	char	   *query_string;
 	Snapshot	snapshot;
 	MemoryContext oldcontext;
@@ -1657,7 +1658,10 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 	 */
 
 	/* Replan if needed, and increment plan refcount for portal */
-	cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv);
+	cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv,
+						  &part_prune_results_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_results_list));
 	stmt_list = cplan->stmt_list;
 
 	if (!plan->saved)
@@ -1685,6 +1689,9 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 					  stmt_list,
 					  cplan);
 
+	/* Copy Lists of PartitionPruneResults into the portal's context. */
+	PortalStorePartitionPruneResults(portal, part_prune_results_list);
+
 	/*
 	 * Set up options for portal.  Default SCROLL type is chosen the same way
 	 * as PerformCursorOpen does it.
@@ -2092,7 +2099,8 @@ SPI_plan_get_cached_plan(SPIPlanPtr plan)
 	/* Get the generic plan for the query */
 	cplan = GetCachedPlan(plansource, NULL,
 						  plan->saved ? CurrentResourceOwner : NULL,
-						  _SPI_current->queryEnv);
+						  _SPI_current->queryEnv,
+						  NULL /* Not interested in PartitionPruneResults */);
 	Assert(cplan == plansource->gplan);
 
 	/* Pop the error context stack */
@@ -2473,7 +2481,9 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 	{
 		CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
 		List	   *stmt_list;
-		ListCell   *lc2;
+		List	   *part_prune_results_list;
+		ListCell   *lc2,
+				   *lc3;
 
 		spicallbackarg.query = plansource->query_string;
 
@@ -2549,8 +2559,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 		 * plan, the refcount must be backed by the plan_owner.
 		 */
 		cplan = GetCachedPlan(plansource, options->params,
-							  plan_owner, _SPI_current->queryEnv);
-
+							  plan_owner, _SPI_current->queryEnv,
+							  &part_prune_results_list);
+		Assert(list_length(cplan->stmt_list) ==
+			   list_length(part_prune_results_list));
 		stmt_list = cplan->stmt_list;
 
 		/*
@@ -2589,9 +2601,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 			}
 		}
 
-		foreach(lc2, stmt_list)
+		forboth(lc2, stmt_list, lc3, part_prune_results_list)
 		{
 			PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+			List	   *part_prune_results = lfirst_node(List, lc3);
 			bool		canSetTag = stmt->canSetTag;
 			DestReceiver *dest;
 
@@ -2663,7 +2676,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 				else
 					snap = InvalidSnapshot;
 
-				qdesc = CreateQueryDesc(stmt,
+				qdesc = CreateQueryDesc(stmt, part_prune_results,
 										plansource->query_string,
 										snap, crosscheck_snapshot,
 										dest,
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 23776367c5..b01f55fb4f 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -158,6 +158,11 @@
 	token = pg_strtok(&length);		/* skip :fldname */ \
 	local_node->fldname = readIntCols(len)
 
+/* Read an Index array */
+#define READ_INDEX_ARRAY(fldname, len) \
+	token = pg_strtok(&length);		/* skip :fldname */ \
+	local_node->fldname = readIndexCols(len)
+
 /* Read a bool array */
 #define READ_BOOL_ARRAY(fldname, len) \
 	token = pg_strtok(&length);		/* skip :fldname */ \
@@ -800,7 +805,6 @@ fnname(int numCols) \
  */
 READ_SCALAR_ARRAY(readAttrNumberCols, int16, atoi)
 READ_SCALAR_ARRAY(readOidCols, Oid, atooid)
-/* outfuncs.c has writeIndexCols, but we don't yet need that here */
-/* READ_SCALAR_ARRAY(readIndexCols, Index, atoui) */
+READ_SCALAR_ARRAY(readIndexCols, Index, atoui)
 READ_SCALAR_ARRAY(readIntCols, int, atoi)
 READ_SCALAR_ARRAY(readBoolCols, bool, strtobool)
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 799602f5ea..a96d316dca 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -520,7 +520,9 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 	result->parallelModeNeeded = glob->parallelModeNeeded;
 	result->planTree = top_plan;
 	result->partPruneInfos = glob->partPruneInfos;
+	result->containsInitialPruning = glob->containsInitialPruning;
 	result->rtable = glob->finalrtable;
+	result->minLockRelids = glob->minLockRelids;
 	result->resultRelations = glob->resultRelations;
 	result->appendRelations = glob->appendRelations;
 	result->subplans = glob->subplans;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index e67f0e3509..5820f26fdb 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -270,6 +270,16 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 	 */
 	add_rtes_to_flat_rtable(root, false);
 
+	/*
+	 * Add the query's adjusted range of RT indexes to glob->minLockRelids.
+	 * The adjusted RT indexes of prunable relations will be deleted from the
+	 * set below where PartitionPruneInfos are processed.
+	 */
+	glob->minLockRelids =
+		bms_add_range(glob->minLockRelids,
+					  rtoffset + 1,
+					  rtoffset + list_length(root->parse->rtable));
+
 	/*
 	 * Adjust RT indexes of PlanRowMarks and add to final rowmarks list
 	 */
@@ -352,6 +362,7 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 	foreach (lc, root->partPruneInfos)
 	{
 		PartitionPruneInfo *pruneinfo = lfirst(lc);
+		Bitmapset *leafpart_rtis = NULL;
 		ListCell  *l;
 
 		pruneinfo->root_parent_relids =
@@ -364,15 +375,50 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 			foreach(l2, prune_infos)
 			{
 				PartitionedRelPruneInfo *pinfo = lfirst(l2);
+				int		i;
 
 				/* RT index of the table to which the pinfo belongs. */
 				pinfo->rtindex += rtoffset;
+
+				/* Also of the leaf partitions that might be scanned. */
+				for (i = 0; i < pinfo->nparts; i++)
+				{
+					if (pinfo->rti_map[i] > 0 && pinfo->subplan_map[i] >= 0)
+					{
+						pinfo->rti_map[i] += rtoffset;
+						leafpart_rtis = bms_add_member(leafpart_rtis,
+													   pinfo->rti_map[i]);
+					}
+				}
 			}
 		}
 
+		if (pruneinfo->needs_init_pruning)
+		{
+			glob->containsInitialPruning = true;
+
+			/*
+			 * Delete the leaf partition RTIs from the global set of relations
+			 * to be locked before executing the plan.  AcquireExecutorLocks()
+			 * will find the ones to add to the set after performing initial
+			 * pruning.
+			 */
+			glob->minLockRelids = bms_del_members(glob->minLockRelids,
+												  leafpart_rtis);
+		}
+
 		glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
 	}
 
+	/*
+	 * It seems worth doing a bms_copy() on glob->minLockRelids if we deleted
+	 * bits from it above to get rid of any empty tail bits.  It seems better
+	 * for the loop over this set in AcquireExecutorLocks() to not have to go
+	 * through those useless bit words.
+	 */
+	if (glob->containsInitialPruning)
+		glob->minLockRelids = bms_copy(glob->minLockRelids);
+
 	return result;
 }
 
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index d48f6784c1..d5556354f7 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -144,7 +144,9 @@ static List *make_partitionedrel_pruneinfo(PlannerInfo *root,
 										   List *prunequal,
 										   Bitmapset *partrelids,
 										   int *relid_subplan_map,
-										   Bitmapset **matchedsubplans);
+										   Bitmapset **matchedsubplans,
+										   bool *needs_init_pruning,
+										   bool *needs_exec_pruning);
 static void gen_partprune_steps(RelOptInfo *rel, List *clauses,
 								PartClauseTarget target,
 								GeneratePruningStepsContext *context);
@@ -234,6 +236,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	int		   *relid_subplan_map;
 	ListCell   *lc;
 	int			i;
+	bool		needs_init_pruning = false;
+	bool		needs_exec_pruning = false;
 
 	/*
 	 * Scan the subpaths to see which ones are scans of partition child
@@ -313,12 +317,16 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		Bitmapset  *partrelids = (Bitmapset *) lfirst(lc);
 		List	   *pinfolist;
 		Bitmapset  *matchedsubplans = NULL;
+		bool		partrel_needs_init_pruning;
+		bool		partrel_needs_exec_pruning;
 
 		pinfolist = make_partitionedrel_pruneinfo(root, parentrel,
 												  prunequal,
 												  partrelids,
 												  relid_subplan_map,
-												  &matchedsubplans);
+												  &matchedsubplans,
+												  &partrel_needs_init_pruning,
+												  &partrel_needs_exec_pruning);
 
 		/* When pruning is possible, record the matched subplans */
 		if (pinfolist != NIL)
@@ -327,6 +335,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 			allmatchedsubplans = bms_join(matchedsubplans,
 										  allmatchedsubplans);
 		}
+
+		needs_init_pruning |= partrel_needs_init_pruning;
+		needs_exec_pruning |= partrel_needs_exec_pruning;
 	}
 
 	pfree(relid_subplan_map);
@@ -342,6 +353,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	pruneinfo = makeNode(PartitionPruneInfo);
 	pruneinfo->root_parent_relids = parentrel->relids;
 	pruneinfo->prune_infos = prunerelinfos;
+	pruneinfo->needs_init_pruning = needs_init_pruning;
+	pruneinfo->needs_exec_pruning = needs_exec_pruning;
 
 	/*
 	 * Some subplans may not belong to any of the identified partitioned rels.
@@ -442,13 +455,18 @@ add_part_relids(List *allpartrelids, Bitmapset *partrelids)
  * If we cannot find any useful run-time pruning steps, return NIL.
  * However, on success, each rel identified in partrelids will have
  * an element in the result list, even if some of them are useless.
+ * *needs_init_pruning and *needs_exec_pruning are set to indicate that the
+ * returned PartitionedRelPruneInfos contain pruning steps that can be
+ * performed before and after execution begins, respectively.
  */
 static List *
 make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 							  List *prunequal,
 							  Bitmapset *partrelids,
 							  int *relid_subplan_map,
-							  Bitmapset **matchedsubplans)
+							  Bitmapset **matchedsubplans,
+							  bool *needs_init_pruning,
+							  bool *needs_exec_pruning)
 {
 	RelOptInfo *targetpart = NULL;
 	List	   *pinfolist = NIL;
@@ -459,6 +477,10 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	int			rti;
 	int			i;
 
+	/* Will find out below. */
+	*needs_init_pruning = false;
+	*needs_exec_pruning = false;
+
 	/*
 	 * Examine each partitioned rel, constructing a temporary array to map
 	 * from planner relids to index of the partitioned rel, and building a
@@ -546,6 +568,9 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		 * executor per-scan pruning steps.  This first pass creates startup
 		 * pruning steps and detects whether there's any possibly-useful quals
 		 * that would require per-scan pruning.
+		 *
+		 * In the first pass, we also determine whether the 2nd pass is
+		 * necessary by checking for the presence of EXEC parameters.
 		 */
 		gen_partprune_steps(subpart, partprunequal, PARTTARGET_INITIAL,
 							&context);
@@ -620,6 +645,12 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		pinfo->execparamids = execparamids;
 		/* Remaining fields will be filled in the next loop */
 
+		/* record which types of pruning steps we've seen so far */
+		if (initial_pruning_steps != NIL)
+			*needs_init_pruning = true;
+		if (exec_pruning_steps != NIL)
+			*needs_exec_pruning = true;
+
 		pinfolist = lappend(pinfolist, pinfo);
 	}
 
@@ -647,6 +678,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		int		   *subplan_map;
 		int		   *subpart_map;
 		Oid		   *relid_map;
+		Index	   *rti_map;
 
 		/*
 		 * Construct the subplan and subpart maps for this partitioning level.
@@ -659,6 +691,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		subpart_map = (int *) palloc(nparts * sizeof(int));
 		memset(subpart_map, -1, nparts * sizeof(int));
 		relid_map = (Oid *) palloc0(nparts * sizeof(Oid));
+		rti_map = (Index *) palloc0(nparts * sizeof(Index));
 		present_parts = NULL;
 
 		i = -1;
@@ -673,6 +706,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 			subplan_map[i] = subplanidx = relid_subplan_map[partrel->relid] - 1;
 			subpart_map[i] = subpartidx = relid_subpart_map[partrel->relid] - 1;
 			relid_map[i] = planner_rt_fetch(partrel->relid, root)->relid;
+			rti_map[i] = partrel->relid;
 			if (subplanidx >= 0)
 			{
 				present_parts = bms_add_member(present_parts, i);
@@ -697,6 +731,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		pinfo->subplan_map = subplan_map;
 		pinfo->subpart_map = subpart_map;
 		pinfo->relid_map = relid_map;
+		pinfo->rti_map = rti_map;
 	}
 
 	pfree(relid_subpart_map);
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 3082093d1e..95ab1d0eef 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1598,6 +1598,7 @@ exec_bind_message(StringInfo input_message)
 	int16	   *rformats = NULL;
 	CachedPlanSource *psrc;
 	CachedPlan *cplan;
+	List	   *part_prune_results_list;
 	Portal		portal;
 	char	   *query_string;
 	char	   *saved_stmt_name;
@@ -1972,7 +1973,9 @@ exec_bind_message(StringInfo input_message)
 	 * will be generated in MessageContext.  The plan refcount will be
 	 * assigned to the Portal, so it will be released at portal destruction.
 	 */
-	cplan = GetCachedPlan(psrc, params, NULL, NULL);
+	cplan = GetCachedPlan(psrc, params, NULL, NULL, &part_prune_results_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_results_list));
 
 	/*
 	 * Now we can define the portal.
@@ -1987,6 +1990,9 @@ exec_bind_message(StringInfo input_message)
 					  cplan->stmt_list,
 					  cplan);
 
+	/* Copy Lists of PartitionPruneResults into the portal's context. */
+	PortalStorePartitionPruneResults(portal, part_prune_results_list);
+
 	/* Done with the snapshot used for parameter I/O and parsing/planning */
 	if (snapshot_set)
 		PopActiveSnapshot();
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 52e2db6452..280ed7d239 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -35,7 +35,7 @@
 Portal		ActivePortal = NULL;
 
 
-static void ProcessQuery(PlannedStmt *plan,
+static void ProcessQuery(PlannedStmt *plan, List *part_prune_results,
 						 const char *sourceText,
 						 ParamListInfo params,
 						 QueryEnvironment *queryEnv,
@@ -65,6 +65,7 @@ static void DoPortalRewind(Portal portal);
  */
 QueryDesc *
 CreateQueryDesc(PlannedStmt *plannedstmt,
+				List *part_prune_results,
 				const char *sourceText,
 				Snapshot snapshot,
 				Snapshot crosscheck_snapshot,
@@ -77,6 +78,8 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
 
 	qd->operation = plannedstmt->commandType;	/* operation */
 	qd->plannedstmt = plannedstmt;	/* plan */
+	qd->part_prune_results = part_prune_results; /* ExecutorDoInitialPruning()
+												  * output for plan */
 	qd->sourceText = sourceText;	/* query text */
 	qd->snapshot = RegisterSnapshot(snapshot);	/* snapshot */
 	/* RI check snapshot */
@@ -122,6 +125,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  *		PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
  *
  *	plan: the plan tree for the query
+ *	part_prune_results: ExecutorDoInitialPruning() output for the PlannedStmt
  *	sourceText: the source text of the query
  *	params: any parameters needed
  *	dest: where to send results
@@ -134,6 +138,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  */
 static void
 ProcessQuery(PlannedStmt *plan,
+			 List *part_prune_results,
 			 const char *sourceText,
 			 ParamListInfo params,
 			 QueryEnvironment *queryEnv,
@@ -145,7 +150,7 @@ ProcessQuery(PlannedStmt *plan,
 	/*
 	 * Create the QueryDesc object
 	 */
-	queryDesc = CreateQueryDesc(plan, sourceText,
+	queryDesc = CreateQueryDesc(plan, part_prune_results, sourceText,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, params, queryEnv, 0);
 
@@ -491,8 +496,13 @@ PortalStart(Portal portal, ParamListInfo params,
 				/*
 				 * Create QueryDesc in portal's context; for the moment, set
 				 * the destination to DestNone.
+				 *
+				 * There is no PartitionPruneResult unless the PlannedStmt is
+				 * from a CachedPlan.
 				 */
 				queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+											portal->part_prune_results_list == NIL ? NIL :
+											linitial(portal->part_prune_results_list),
 											portal->sourceText,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
@@ -1225,6 +1235,8 @@ PortalRunMulti(Portal portal,
 
 		if (pstmt->utilityStmt == NULL)
 		{
+			List *part_prune_results = NIL;
+
 			/*
 			 * process a plannable query.
 			 */
@@ -1271,10 +1283,18 @@ PortalRunMulti(Portal portal,
 			else
 				UpdateActiveSnapshotCommandId();
 
+			/*
+			 * Determine if there's a corresponding List of PartitionPruneResult
+			 * for this PlannedStmt.
+			 */
+			if (portal->part_prune_results_list != NIL)
+				part_prune_results = list_nth(portal->part_prune_results_list,
+											  foreach_current_index(stmtlist_item));
+
 			if (pstmt->canSetTag)
 			{
 				/* statement can set tag string */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, part_prune_results,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
@@ -1283,7 +1303,7 @@ PortalRunMulti(Portal portal,
 			else
 			{
 				/* stmt added by rewrite cannot set tag */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, part_prune_results,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index cc943205d3..af6fae6e3b 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -99,14 +99,19 @@ static dlist_head cached_expression_list = DLIST_STATIC_INIT(cached_expression_l
 static void ReleaseGenericPlan(CachedPlanSource *plansource);
 static List *RevalidateCachedQuery(CachedPlanSource *plansource,
 								   QueryEnvironment *queryEnv);
-static bool CheckCachedPlan(CachedPlanSource *plansource);
+static bool CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+							List **part_prune_results_list);
 static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
-								   ParamListInfo boundParams, QueryEnvironment *queryEnv);
+								   ParamListInfo boundParams, QueryEnvironment *queryEnv,
+								   List **part_prune_results_list);
 static bool choose_custom_plan(CachedPlanSource *plansource,
 							   ParamListInfo boundParams);
 static double cached_plan_cost(CachedPlan *plan, bool include_planner);
 static Query *QueryListGetPrimaryStmt(List *stmts);
-static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static void AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+								 List **part_prune_results_list,
+								 List **lockedRelids_per_stmt);
+static void ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt);
 static void AcquirePlannerLocks(List *stmt_list, bool acquire);
 static void ScanQueryForLocks(Query *parsetree, bool acquire);
 static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -782,6 +787,26 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
 	return tlist;
 }
 
+/* 
+ * FreePartitionPruneResults
+ *		Frees the List of Lists of PartitionPruneResults for CheckCachedPlan()
+ */
+static void
+FreePartitionPruneResults(List *part_prune_results_list)
+{
+	ListCell *lc;
+
+	foreach(lc, part_prune_results_list)
+	{
+		List *part_prune_results = lfirst(lc);
+
+		/* Free both the PartitionPruneResults and the containing List. */
+		list_free_deep(part_prune_results);
+	}
+
+	list_free(part_prune_results_list);
+}
+
 /*
  * CheckCachedPlan: see if the CachedPlanSource's generic plan is valid.
  *
@@ -790,15 +815,20 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
  *
  * On a "true" return, we have acquired the locks needed to run the plan.
  * (We must do this for the "true" result to be race-condition-free.)
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_results_list.
  */
 static bool
-CheckCachedPlan(CachedPlanSource *plansource)
+CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+				List **part_prune_results_list)
 {
 	CachedPlan *plan = plansource->gplan;
 
 	/* Assert that caller checked the querytree */
 	Assert(plansource->is_valid);
 
+	*part_prune_results_list = NIL;
+
 	/* If there's no generic plan, just say "false" */
 	if (!plan)
 		return false;
@@ -820,13 +850,21 @@ CheckCachedPlan(CachedPlanSource *plansource)
 	 */
 	if (plan->is_valid)
 	{
+		List   *lockedRelids_per_stmt;
+
 		/*
 		 * Plan must have positive refcount because it is referenced by
 		 * plansource; so no need to fear it disappears under us here.
 		 */
 		Assert(plan->refcount > 0);
 
-		AcquireExecutorLocks(plan->stmt_list, true);
+		/*
+		 * Lock relations scanned by the plan.  This is where the pruning
+		 * happens if needed.
+		 */
+		AcquireExecutorLocks(plan->stmt_list, boundParams,
+							 part_prune_results_list,
+							 &lockedRelids_per_stmt);
 
 		/*
 		 * If plan was transient, check to see if TransactionXmin has
@@ -848,7 +886,11 @@ CheckCachedPlan(CachedPlanSource *plansource)
 		}
 
 		/* Oops, the race case happened.  Release useless locks. */
-		AcquireExecutorLocks(plan->stmt_list, false);
+		ReleaseExecutorLocks(plan->stmt_list, lockedRelids_per_stmt);
+
+		/* Release any PartitionPruneResults that may have been created. */
+		FreePartitionPruneResults(*part_prune_results_list);
+		*part_prune_results_list = NIL;
 	}
 
 	/*
@@ -874,10 +916,14 @@ CheckCachedPlan(CachedPlanSource *plansource)
  * Planning work is done in the caller's memory context.  The finished plan
  * is in a child memory context, which typically should get reparented
  * (unless this is a one-shot plan, in which case we don't copy the plan).
+ *
+ * A list of NILs is returned in *part_prune_results_list, meaning that no
+ * partition pruning has been done yet for the plans in stmt_list.
  */
 static CachedPlan *
 BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
-				ParamListInfo boundParams, QueryEnvironment *queryEnv)
+				ParamListInfo boundParams, QueryEnvironment *queryEnv,
+				List **part_prune_results_list)
 {
 	CachedPlan *plan;
 	List	   *plist;
@@ -1007,6 +1053,17 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 
 	MemoryContextSwitchTo(oldcxt);
 
+	/*
+	 * No actual PartitionPruneResults yet to add, though must initialize
+	 * the list to have the same number of elements as the list of
+	 * PlannedStmts.
+	 */
+	*part_prune_results_list = NIL;
+	foreach(lc, plist)
+	{
+		*part_prune_results_list = lappend(*part_prune_results_list, NIL);
+	}
+
 	return plan;
 }
 
@@ -1126,6 +1183,19 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
  * plan or a custom plan for the given parameters: the caller does not know
  * which it will get.
  *
+ * For every PlannedStmt found in the returned CachedPlan, an element that
+ * is either a List of PartitionPruneResult or a NIL is added to
+ * *part_prune_results_list.  The former if the PlannedStmt is from
+ * the existing CachedPlan that is otherwise valid and has
+ * containsInitialPruning set to true.  Before returning such a CachedPlan,
+ * those "initial" steps are performed by calling ExecutorDoInitialPruning()
+ * to determine only those leaf partitions that need to be locked by
+ * AcquireExecutorLocks() by pruning away subplans that don't match the
+ * "initial" pruning conditions.  For each PartitionPruneInfo found in
+ * PlannedStmt.partPruneInfos, a PartitionPruneResult containing the bitmapset
+ * of the indexes of surviving subplans is added to the List for the
+ * PlannedStmt.
+ *
  * On return, the plan is valid and we have sufficient locks to begin
  * execution.
  *
@@ -1139,11 +1209,13 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
  */
 CachedPlan *
 GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
-			  ResourceOwner owner, QueryEnvironment *queryEnv)
+			  ResourceOwner owner, QueryEnvironment *queryEnv,
+			  List **part_prune_results_list)
 {
 	CachedPlan *plan = NULL;
 	List	   *qlist;
 	bool		customplan;
+	List	   *my_part_prune_results_list;
 
 	/* Assert caller is doing things in a sane order */
 	Assert(plansource->magic == CACHEDPLANSOURCE_MAGIC);
@@ -1160,7 +1232,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 
 	if (!customplan)
 	{
-		if (CheckCachedPlan(plansource))
+		if (CheckCachedPlan(plansource, boundParams,
+							&my_part_prune_results_list))
 		{
 			/* We want a generic plan, and we already have a valid one */
 			plan = plansource->gplan;
@@ -1169,7 +1242,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 		else
 		{
 			/* Build a new generic plan */
-			plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv);
+			plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv,
+								   &my_part_prune_results_list);
 			/* Just make real sure plansource->gplan is clear */
 			ReleaseGenericPlan(plansource);
 			/* Link the new generic plan into the plansource */
@@ -1214,7 +1288,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 	if (customplan)
 	{
 		/* Build a custom plan */
-		plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv);
+		plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv,
+							   &my_part_prune_results_list);
 		/* Accumulate total costs of custom plans */
 		plansource->total_custom_cost += cached_plan_cost(plan, true);
 
@@ -1246,6 +1321,9 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 		plan->is_saved = true;
 	}
 
+	if (part_prune_results_list)
+		*part_prune_results_list = my_part_prune_results_list;
+
 	return plan;
 }
 
@@ -1737,17 +1815,29 @@ QueryListGetPrimaryStmt(List *stmts)
 
 /*
  * AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
- * or release them if acquire is false.
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_results_list.
+ *
+ * On return, *lockedRelids_per_stmt will contain a bitmapset for every
+ * PlannedStmt in stmt_list, containing the RT indexes of relation entries
+ * in its range table that were actually locked, or NULL if the PlannedStmt
+ * contains a utility statement.
  */
 static void
-AcquireExecutorLocks(List *stmt_list, bool acquire)
+AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+					 List **part_prune_results_list,
+					 List **lockedRelids_per_stmt)
 {
 	ListCell   *lc1;
 
+	*part_prune_results_list = *lockedRelids_per_stmt = NIL;
 	foreach(lc1, stmt_list)
 	{
 		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
-		ListCell   *lc2;
+		List	   *part_prune_results = NIL;
+		Bitmapset  *allLockRelids;
+		Bitmapset  *lockedRelids = NULL;
+		int			rti;
 
 		if (plannedstmt->commandType == CMD_UTILITY)
 		{
@@ -1761,13 +1851,40 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 			Query	   *query = UtilityContainsQuery(plannedstmt->utilityStmt);
 
 			if (query)
-				ScanQueryForLocks(query, acquire);
+				ScanQueryForLocks(query, true);
+			*part_prune_results_list = lappend(*part_prune_results_list, NIL);
 			continue;
 		}
 
-		foreach(lc2, plannedstmt->rtable)
+		/*
+		 * Figure out the set of relations that would need to be locked
+		 * before executing the plan.
+		 */
+		if (plannedstmt->containsInitialPruning)
 		{
-			RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
+			Bitmapset *scan_leafpart_rtis = NULL;
+
+			/*
+			 * Obtain the set of leaf partitions to be locked.
+			 *
+			 * The following does initial partition pruning using the
+			 * PartitionPruneInfos found in plannedstmt->partPruneInfos and
+			 * finds leaf partitions that survive that pruning across all the
+			 * nodes in the plan tree.
+			 */
+			part_prune_results = ExecutorDoInitialPruning(plannedstmt,
+														  boundParams,
+														  &scan_leafpart_rtis);
+			allLockRelids = bms_union(plannedstmt->minLockRelids,
+									  scan_leafpart_rtis);
+		}
+		else
+			allLockRelids = plannedstmt->minLockRelids;
+
+		rti = -1;
+		while ((rti = bms_next_member(allLockRelids, rti)) > 0)
+		{
+			RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
 
 			if (rte->rtekind != RTE_RELATION)
 				continue;
@@ -1778,10 +1895,59 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 			 * fail if it's been dropped entirely --- we'll just transiently
 			 * acquire a non-conflicting lock.
 			 */
-			if (acquire)
-				LockRelationOid(rte->relid, rte->rellockmode);
-			else
-				UnlockRelationOid(rte->relid, rte->rellockmode);
+			LockRelationOid(rte->relid, rte->rellockmode);
+			lockedRelids = bms_add_member(lockedRelids, rti);
+		}
+
+		*part_prune_results_list = lappend(*part_prune_results_list,
+										   part_prune_results);
+		*lockedRelids_per_stmt = lappend(*lockedRelids_per_stmt, lockedRelids);
+	}
+}
+
+/*
+ * ReleaseExecutorLocks
+ * 		Release locks that would've been acquired by an earlier call to
+ * 		AcquireExecutorLocks()
+ */
+static void
+ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt)
+{
+	ListCell   *lc1,
+			   *lc2;
+
+	forboth(lc1, stmt_list, lc2, lockedRelids_per_stmt)
+	{
+		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
+		Bitmapset  *lockedRelids = lfirst(lc2);
+		int			rti;
+
+		if (plannedstmt->commandType == CMD_UTILITY)
+		{
+			/*
+			 * Ignore utility statements, except those (such as EXPLAIN) that
+			 * contain a parsed-but-not-planned query.  Note: it's okay to use
+			 * ScanQueryForLocks, even though the query hasn't been through
+			 * rule rewriting, because rewriting doesn't change the query
+			 * representation.
+			 */
+			Query	   *query = UtilityContainsQuery(plannedstmt->utilityStmt);
+
+			Assert(lockedRelids == NULL);
+			if (query)
+				ScanQueryForLocks(query, false);
+			continue;
+		}
+
+		rti = -1;
+		while ((rti = bms_next_member(lockedRelids, rti)) >= 0)
+		{
+			RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
+
+			Assert(rte->rtekind == RTE_RELATION);
+
+			/* See the comment in AcquireExecutorLocks(). */
+			UnlockRelationOid(rte->relid, rte->rellockmode);
 		}
 	}
 }
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index 7b1ae6fdcf..5b9098971b 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -303,6 +303,25 @@ PortalDefineQuery(Portal portal,
 	portal->status = PORTAL_DEFINED;
 }
 
+/*
+ * PortalStorePartitionPruneResults
+ *		Copy the given List of Lists of PartitionPruneResults into the
+ *		portal's context
+ *
+ * This allows the caller to ensure that the list exists as long as the portal
+ * does.
+ */
+void
+PortalStorePartitionPruneResults(Portal portal, List *part_prune_results_list)
+{
+	MemoryContext	oldcxt;
+
+	Assert(PortalIsValid(portal));
+	oldcxt = MemoryContextSwitchTo(portal->portalContext);
+	portal->part_prune_results_list = copyObject(part_prune_results_list);
+	MemoryContextSwitchTo(oldcxt);
+}
+
 /*
  * PortalReleaseCachedPlan
  *		Release a portal's reference to its cached plan, if any.
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 9ebde089ae..269cc4d562 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -87,7 +87,9 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
 							  ExplainState *es, const char *queryString,
 							  ParamListInfo params, QueryEnvironment *queryEnv);
 
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt,
+						   List *part_prune_results,
+						   IntoClause *into,
 						   ExplainState *es, const char *queryString,
 						   ParamListInfo params, QueryEnvironment *queryEnv,
 						   const instr_time *planduration,
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 17fabc18c9..4b98d0d2ef 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -45,6 +45,7 @@ extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
  * nparts						Length of subplan_map[] and subpart_map[].
  * subplan_map					Subplan index by partition index, or -1.
  * subpart_map					Subpart index by partition index, or -1.
+ * rti_map						Range table index by partition index, or 0.
  * present_parts				A Bitmapset of the partition indexes that we
  *								have subplans or subparts for.
  * initial_pruning_steps		List of PartitionPruneSteps used to
@@ -61,6 +62,7 @@ typedef struct PartitionedRelPruningData
 	int			nparts;
 	int		   *subplan_map;
 	int		   *subpart_map;
+	Index	   *rti_map;
 	Bitmapset  *present_parts;
 	List	   *initial_pruning_steps;
 	List	   *exec_pruning_steps;
@@ -127,5 +129,10 @@ extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
 													 Bitmapset *root_parent_relids,
 													 Bitmapset **initially_valid_subplans);
 extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
-										   bool initial_prune);
+										   bool initial_prune,
+										   Bitmapset **scan_leafpart_rtis);
+extern Bitmapset *ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt,
+								ParamListInfo params,
+								PartitionPruneInfo *pruneinfo,
+								Bitmapset **scan_leafpart_rtis);
 #endif							/* EXECPARTITION_H */
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index e79e2c001f..7d4379da7b 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -35,6 +35,8 @@ typedef struct QueryDesc
 	/* These fields are provided by CreateQueryDesc */
 	CmdType		operation;		/* CMD_SELECT, CMD_UPDATE, etc. */
 	PlannedStmt *plannedstmt;	/* planner's output (could be utility, too) */
+	List		*part_prune_results; /* ExecutorDoInitialPruning()'s
+									  * output for plannedstmt */
 	const char *sourceText;		/* source text of the query */
 	Snapshot	snapshot;		/* snapshot to use for query */
 	Snapshot	crosscheck_snapshot;	/* crosscheck for RI update/delete */
@@ -57,6 +59,7 @@ typedef struct QueryDesc
 
 /* in pquery.c */
 extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
+								  List *part_prune_results,
 								  const char *sourceText,
 								  Snapshot snapshot,
 								  Snapshot crosscheck_snapshot,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index ed95ed1176..c9a5e5fb68 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -185,6 +185,9 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
 /*
  * prototypes from functions in execMain.c
  */
+extern List *ExecutorDoInitialPruning(PlannedStmt *plannedstmt,
+									  ParamListInfo params,
+									  Bitmapset **scan_leafpart_rtis);
 extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
 extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
 extern void ExecutorRun(QueryDesc *queryDesc,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index a2008846c6..369de42caf 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -615,6 +615,7 @@ typedef struct EState
 										 * ExecRowMarks, or NULL if none */
 	PlannedStmt *es_plannedstmt;	/* link to top of plan tree */
 	List		*es_part_prune_infos;	/* PlannedStmt.partPruneInfos */
+	List		*es_part_prune_results; /* QueryDesc.part_prune_results */
 	const char *es_sourceText;	/* Source text from QueryDesc */
 
 	JunkFilter *es_junkFilter;	/* top-level junk filter, if any */
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index a80f43e540..937cc4629d 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -212,6 +212,7 @@ extern struct Bitmapset *readBitmapset(void);
 extern uintptr_t readDatum(bool typbyval);
 extern bool *readBoolCols(int numCols);
 extern int *readIntCols(int numCols);
+extern Index *readIndexCols(int numCols);
 extern Oid *readOidCols(int numCols);
 extern int16 *readAttrNumberCols(int numCols);
 
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index dd4eb8679d..36abe4cf9e 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -125,6 +125,18 @@ typedef struct PlannerGlobal
 	/* List of PartitionPruneInfo contained in the plan */
 	List	   *partPruneInfos;
 
+	/*
+	 * Do any of those PartitionPruneInfos have initial pruning steps in them?
+	 */
+	bool		containsInitialPruning;
+
+	/*
+	 * Indexes of all range table entries minus indexes of range table entries
+	 * of the leaf partitions scanned by prunable subplans; see
+	 * AcquireExecutorLocks()
+	 */
+	Bitmapset  *minLockRelids;
+
 	/* OIDs of relations the plan depends on */
 	List	   *relationOids;
 
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 2e202892a7..0cab6958d7 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -73,8 +73,17 @@ typedef struct PlannedStmt
 	List	   *partPruneInfos;	/* List of PartitionPruneInfo contained in
 								 * the plan */
 
+	bool		containsInitialPruning;	/* Do any of those PartitionPruneInfos
+										 * have initial pruning steps in them?
+										 */
+
 	List	   *rtable;			/* list of RangeTblEntry nodes */
 
+	Bitmapset  *minLockRelids;	/* Indexes of all range table entries minus
+								 * indexes of range table entries of the leaf
+								 * partitions scanned by prunable subplans;
+								 * see AcquireExecutorLocks() */
+
 	/* rtable indexes of target relations for INSERT/UPDATE/DELETE/MERGE */
 	List	   *resultRelations;	/* integer list of RT indexes, or NIL */
 
@@ -1414,6 +1423,13 @@ typedef struct PlanRowMark
  * prune_infos			List of Lists containing PartitionedRelPruneInfo nodes,
  *						one sublist per run-time-prunable partition hierarchy
  *						appearing in the parent plan node's subplans.
+ *
+ * needs_init_pruning	Does any of the PartitionedRelPruneInfos in
+ *						prune_infos have its initial_pruning_steps set?
+ *
+ * needs_exec_pruning	Does any of the PartitionedRelPruneInfos in
+ *						prune_infos have its exec_pruning_steps set?
+ *
  * other_subplans		Indexes of any subplans that are not accounted for
  *						by any of the PartitionedRelPruneInfo nodes in
  *						"prune_infos".  These subplans must not be pruned.
@@ -1425,6 +1441,8 @@ typedef struct PartitionPruneInfo
 	NodeTag		type;
 	Bitmapset  *root_parent_relids;
 	List	   *prune_infos;
+	bool		needs_init_pruning;
+	bool		needs_exec_pruning;
 	Bitmapset  *other_subplans;
 } PartitionPruneInfo;
 
@@ -1469,6 +1487,9 @@ typedef struct PartitionedRelPruneInfo
 	/* relation OID by partition index, or 0 */
 	Oid		   *relid_map pg_node_attr(array_size(nparts));
 
+	/* Range table index by partition index, or 0. */
+	Index	   *rti_map pg_node_attr(array_size(nparts));
+
 	/*
 	 * initial_pruning_steps shows how to prune during executor startup (i.e.,
 	 * without use of any PARAM_EXEC Params); it is NIL if no startup pruning
@@ -1553,6 +1574,31 @@ typedef struct PartitionPruneStepCombine
 	List	   *source_stepids;
 } PartitionPruneStepCombine;
 
+/*----------------
+ * PartitionPruneResult
+ *
+ * The result of performing ExecPartitionDoInitialPruning() on a given
+ * PartitionPruneInfo.
+ *
+ * valid_subplan_offs contains the indexes of subplans remaining after
+ * performing initial pruning by calling ExecFindMatchingSubPlans() on the
+ * PartitionPruneInfo.
+ *
+ * This is used to store the result of initial partition pruning that is
+ * performed before the execution has started.  A module that needs to do so
+ * should call ExecutorDoInitialPruning() on a given PlannedStmt, which
+ * returns a List of PartitionPruneResult containing an entry for each
+ * PartitionPruneInfo present in PlannedStmt.partPruneInfos.  The module
+ * should then pass that list, along with the PlannedStmt, to the executor,
+ * so that it can reuse the result of initial partition pruning when
+ * initializing the subplans for execution.
+ */
+typedef struct PartitionPruneResult
+{
+	NodeTag		type;
+
+	Bitmapset	   *valid_subplan_offs;
+} PartitionPruneResult;
 
 /*
  * Plan invalidation info
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 0499635f59..32579d4788 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -220,7 +220,8 @@ extern List *CachedPlanGetTargetList(CachedPlanSource *plansource,
 extern CachedPlan *GetCachedPlan(CachedPlanSource *plansource,
 								 ParamListInfo boundParams,
 								 ResourceOwner owner,
-								 QueryEnvironment *queryEnv);
+								 QueryEnvironment *queryEnv,
+								 List **part_prune_results_list);
 extern void ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner);
 
 extern bool CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index aeddbdafe5..1901fc5f28 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -138,6 +138,7 @@ typedef struct PortalData
 	QueryCompletion qc;			/* command completion data for executed query */
 	List	   *stmts;			/* list of PlannedStmts */
 	CachedPlan *cplan;			/* CachedPlan, if stmts are from one */
+	List	   *part_prune_results_list;	/* List of Lists of PartitionPruneResults */
 
 	ParamListInfo portalParams; /* params to pass to query */
 	QueryEnvironment *queryEnv; /* environment for query */
@@ -242,6 +243,8 @@ extern void PortalDefineQuery(Portal portal,
 							  CommandTag commandTag,
 							  List *stmts,
 							  CachedPlan *cplan);
+extern void PortalStorePartitionPruneResults(Portal portal,
+											 List *part_prune_results_list);
 extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
 extern void PortalCreateHoldStore(Portal portal);
 extern void PortalHashTableDeleteAll(void);
-- 
2.35.3




* Re: generic plans and "initial" pruning
@ 2022-12-02 10:40  Amit Langote <[email protected]>
  parent: Amit Langote <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: Amit Langote @ 2022-12-02 10:40 UTC (permalink / raw)
  To: Alvaro Herrera <[email protected]>; +Cc: Robert Haas <[email protected]>; Jacob Champion <[email protected]>; Zhihong Yu <[email protected]>; David Rowley <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers

On Thu, Dec 1, 2022 at 9:43 PM Amit Langote <[email protected]> wrote:
> On Thu, Dec 1, 2022 at 8:21 PM Alvaro Herrera <[email protected]> wrote:
> > On 2022-Dec-01, Amit Langote wrote:
> > > Hmm, how about keeping the [Merge]Append's parent relation's RT index
> > > in the PartitionPruneInfo and passing it down to
> > > ExecInitPartitionPruning() from ExecInit[Merge]Append() for
> > > cross-checking?  Both Append and MergeAppend already have a
> > > 'apprelids' field that we can save a copy of in the
> > > PartitionPruneInfo.  Tried that in the attached delta patch.
> >
> > Ah yeah, that sounds about what I was thinking.  I've merged that in and
> > pushed to github, which had a strange pg_upgrade failure on Windows
> > mentioning log files that were not captured by the CI tooling.  So I
> > pushed another one trying to grab those files, in case it wasn't a
> > one-off failure.  It's running now:
> >   https://cirrus-ci.com/task/5857239638999040
> >
> > If all goes well with this run, I'll get this 0001 pushed.
>
> Thanks for pushing 0001.
>
> Rebased 0002 attached.

Thought it might be good for PartitionPruneResult to also have
root_parent_relids that matches with the corresponding
PartitionPruneInfo.  ExecInitPartitionPruning() does a sanity check
that the root_parent_relids of a given pair of PartitionPrune{Info |
Result} match.

Posting the patch separately as the attached 0002, just in case you
might think that the extra cross-checking would be overkill.
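
To illustrate the kind of cross-check I mean, here is a minimal standalone
sketch.  It is not the patch's code: RelidSet, PruneInfo, PruneResult,
make_prune_result() and prune_pair_matches() are hypothetical stand-ins for
Bitmapset, PartitionPruneInfo, PartitionPruneResult and the logic in
ExecutorDoInitialPruning() / ExecInitPartitionPruning(), and a single 64-bit
word is assumed to be enough to hold the relids.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a Bitmapset: bit N set means member N. */
typedef uint64_t RelidSet;

/* Hypothetical stand-in for PartitionPruneInfo. */
typedef struct PruneInfo
{
	RelidSet	root_parent_relids; /* identifies the parent plan node */
} PruneInfo;

/* Hypothetical stand-in for PartitionPruneResult. */
typedef struct PruneResult
{
	RelidSet	root_parent_relids; /* copied from the matching PruneInfo */
	RelidSet	valid_subplan_offs; /* subplans surviving initial pruning */
} PruneResult;

/*
 * Build the result for one PruneInfo, copying the identifying relids so
 * that a consumer can later verify which plan node the result belongs to.
 */
static PruneResult
make_prune_result(const PruneInfo *info, RelidSet surviving)
{
	PruneResult result;

	result.root_parent_relids = info->root_parent_relids;
	result.valid_subplan_offs = surviving;
	return result;
}

/*
 * Return true only if the info and the result found at the same list index
 * carry the same root_parent_relids.
 */
static bool
prune_pair_matches(const PruneInfo *info, const PruneResult *result)
{
	return info->root_parent_relids == result->root_parent_relids;
}
```

In the sketch a mismatch just makes prune_pair_matches() return false; the
attached patch reports an internal error in that case instead.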

-- 
Thanks, Amit Langote
EDB: http://www.enterprisedb.com


Attachments:

  [application/octet-stream] v26-0002-Add-root_parent_relids-to-PartitionPruneResult.patch (3.4K, 2-v26-0002-Add-root_parent_relids-to-PartitionPruneResult.patch)
From f1af32816635254773386630b634835bd26d1227 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Fri, 2 Dec 2022 19:32:14 +0900
Subject: [PATCH v26 2/2] Add root_parent_relids to PartitionPruneResult

It's the same as the corresponding PartitionPruneInfo's root_parent_relids.
Like PartitionPruneInfo.root_parent_relids, it's there for
cross-checking that a PartitionPruneResult found at a given plan node's
part_prune_index actually matches the plan node.
---
 src/backend/executor/execMain.c      |  2 ++
 src/backend/executor/execPartition.c | 13 +++++++++++--
 src/include/nodes/plannodes.h        |  7 +++++++
 3 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 7a4db80104..1e84e47d46 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -145,6 +145,8 @@ ExecutorDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
 		PartitionPruneInfo *pruneinfo = lfirst(lc);
 		PartitionPruneResult *pruneresult = makeNode(PartitionPruneResult);
 
+		pruneresult->root_parent_relids =
+			bms_copy(pruneinfo->root_parent_relids);
 		pruneresult->valid_subplan_offs =
 			ExecPartitionDoInitialPruning(plannedstmt, params, pruneinfo,
 										  scan_leafpart_rtis);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 13e450c0fa..eda14d6241 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -1852,8 +1852,17 @@ ExecInitPartitionPruning(PlanState *planstate,
 	 */
 	if (estate->es_part_prune_results)
 	{
-		pruneresult = list_nth(estate->es_part_prune_results, part_prune_index);
-		Assert(IsA(pruneresult, PartitionPruneResult));
+		pruneresult = list_nth_node(PartitionPruneResult,
+									estate->es_part_prune_results,
+									part_prune_index);
+		if (!bms_equal(pruneresult->root_parent_relids, pruneinfo->root_parent_relids))
+			ereport(ERROR,
+					errcode(ERRCODE_INTERNAL_ERROR),
+					errmsg_internal("mismatching PartitionPruneInfo and PartitionPruneResult at part_prune_index %d",
+									part_prune_index),
+					errdetail_internal("pruneresult relids %s, pruneinfo relids %s",
+									   bmsToString(pruneresult->root_parent_relids),
+									   bmsToString(pruneinfo->root_parent_relids)));
 	}
 
 	if (pruneresult == NULL || pruneinfo->needs_exec_pruning)
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 0cab6958d7..30f51414e9 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -1580,6 +1580,12 @@ typedef struct PartitionPruneStepCombine
  * The result of performing ExecPartitionDoInitialPruning() on a given
  * PartitionPruneInfo.
  *
+ * root_parent_relids is the same as PartitionPruneInfo.root_parent_relids.  It's
+ * there for cross-checking in ExecInitPartitionPruning() that the
+ * PartitionPruneResult and the PartitionPruneInfo at a given index in
+ * EState.es_part_prune_results and EState.es_part_prune_infos, respectively,
+ * belong to the same parent plan node.
+ *
  * valid_subplans_offs contains the indexes of subplans remaining after
  * performing initial pruning by calling ExecFindMatchingSubPlans() on the
  * PartitionPruneInfo.
@@ -1597,6 +1603,7 @@ typedef struct PartitionPruneResult
 {
 	NodeTag		type;
 
+	Bitmapset	   *root_parent_relids;
 	Bitmapset	   *valid_subplan_offs;
 } PartitionPruneResult;
 
-- 
2.35.3
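
The cross-check added to ExecInitPartitionPruning() above can be pictured
with a small standalone model.  This is only a sketch under stated
assumptions, not the patch's code: ToyPruneInfo, ToyPruneResult and
toy_prune_result_matches() are hypothetical names, and a plain unsigned long
bitmask stands in for a Bitmapset.  It shows two parallel arrays indexed by
part_prune_index and a check that the entries at a given index carry the
same root parent relids.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for PartitionPruneInfo: only the field being checked. */
typedef struct ToyPruneInfo
{
	unsigned long root_parent_relids;	/* bitmask standing in for a Bitmapset */
} ToyPruneInfo;

/* Hypothetical stand-in for PartitionPruneResult. */
typedef struct ToyPruneResult
{
	unsigned long root_parent_relids;	/* copied from the pruneinfo it was built from */
	unsigned long valid_subplan_offs;	/* surviving subplan indexes, as a bitmask */
} ToyPruneResult;

/*
 * Return true if the pruneinfo and the pruneresult found at the same index
 * in their respective arrays describe the same parent plan node.
 */
static bool
toy_prune_result_matches(const ToyPruneInfo *infos,
						 const ToyPruneResult *results,
						 size_t nentries, size_t part_prune_index)
{
	assert(part_prune_index < nentries);

	return infos[part_prune_index].root_parent_relids ==
		results[part_prune_index].root_parent_relids;
}
```

In this model a caller would report an error whenever the function returns
false, much as the patch does with ereport(ERROR, ...).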



  [application/octet-stream] v26-0001-Optimize-AcquireExecutorLocks-by-locking-only-un.patch (82.5K, 3-v26-0001-Optimize-AcquireExecutorLocks-by-locking-only-un.patch)
From d8b8185b6ceb2a2a33a6af142f23a59fd93d5cdc Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Wed, 22 Dec 2021 16:55:17 +0900
Subject: [PATCH v26 1/2] Optimize AcquireExecutorLocks() by locking only
 unpruned partitions

This commit teaches AcquireExecutorLocks() to perform initial
partition pruning to notionally eliminate the subnodes contained in a
generic cached plan that need not be initialized during the actual
execution of the plan, and to skip locking the partitions scanned by
those subnodes.

The result of performing initial partition pruning this way, before
the actual execution has started, is passed to the executor as a List
of PartitionPruneResult nodes, which the callers of the executor that
used plancache.c to get the plan supply along with the PlannedStmt.
The List is empty (NIL) if the plan was obtained by calling the planner
directly or if the plan obtained from plancache.c is not a generic one.
---
 src/backend/commands/copyto.c          |   2 +-
 src/backend/commands/createas.c        |   2 +-
 src/backend/commands/explain.c         |   7 +-
 src/backend/commands/extension.c       |   2 +-
 src/backend/commands/matview.c         |   2 +-
 src/backend/commands/prepare.c         |  26 ++-
 src/backend/executor/README            |  32 ++++
 src/backend/executor/execMain.c        |  51 ++++++
 src/backend/executor/execParallel.c    |  26 ++-
 src/backend/executor/execPartition.c   | 238 +++++++++++++++++++++----
 src/backend/executor/execUtils.c       |   1 +
 src/backend/executor/functions.c       |   2 +-
 src/backend/executor/nodeAppend.c      |  11 +-
 src/backend/executor/nodeMergeAppend.c |   5 +-
 src/backend/executor/spi.c             |  27 ++-
 src/backend/nodes/readfuncs.c          |   8 +-
 src/backend/optimizer/plan/planner.c   |   2 +
 src/backend/optimizer/plan/setrefs.c   |  46 +++++
 src/backend/partitioning/partprune.c   |  41 ++++-
 src/backend/tcop/postgres.c            |   8 +-
 src/backend/tcop/pquery.c              |  28 ++-
 src/backend/utils/cache/plancache.c    | 208 ++++++++++++++++++---
 src/backend/utils/mmgr/portalmem.c     |  19 ++
 src/include/commands/explain.h         |   4 +-
 src/include/executor/execPartition.h   |   9 +-
 src/include/executor/execdesc.h        |   3 +
 src/include/executor/executor.h        |   3 +
 src/include/nodes/execnodes.h          |   1 +
 src/include/nodes/nodes.h              |   1 +
 src/include/nodes/pathnodes.h          |  12 ++
 src/include/nodes/plannodes.h          |  46 +++++
 src/include/utils/plancache.h          |   3 +-
 src/include/utils/portal.h             |   3 +
 33 files changed, 781 insertions(+), 98 deletions(-)
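
The idea described in the commit message can be illustrated with a small
standalone model.  This is a hedged sketch, not code from the patch:
ToySubplan, toy_subplan_survives() and toy_collect_rtis_to_lock() are
hypothetical names, and "initial pruning" is reduced to checking an integer
parameter against a range partition's bounds.  It only shows that the
relations to lock are collected from the subplans that survive pruning,
rather than from all of them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one child subplan of an Append node. */
typedef struct ToySubplan
{
	int			rti;		/* range table index of the partition it scans */
	int			lower;		/* inclusive lower bound of the partition */
	int			upper;		/* exclusive upper bound of the partition */
} ToySubplan;

/* Toy "initial pruning": does the subplan survive a "key = param" condition? */
static bool
toy_subplan_survives(const ToySubplan *sp, int param)
{
	return param >= sp->lower && param < sp->upper;
}

/*
 * Store the RT indexes of the partitions that would need to be locked into
 * rtis_out (which must have room for nsubplans entries) and return how many
 * were stored.  Pruned subplans contribute nothing.
 */
static size_t
toy_collect_rtis_to_lock(const ToySubplan *subplans, size_t nsubplans,
						 int param, int *rtis_out)
{
	size_t		nlocked = 0;

	assert(subplans != NULL || nsubplans == 0);

	for (size_t i = 0; i < nsubplans; i++)
	{
		if (toy_subplan_survives(&subplans[i], param))
			rtis_out[nlocked++] = subplans[i].rti;
	}

	return nlocked;
}
```

With three partitions covering [0,100), [100,200) and [200,300), a parameter
value of 150 leaves a single relation to lock in this model, whereas locking
without pruning would take all three.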

diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index f26cc0d162..401a2280a3 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -558,7 +558,7 @@ BeginCopyTo(ParseState *pstate,
 		((DR_copy *) dest)->cstate = cstate;
 
 		/* Create a QueryDesc requesting no output */
-		cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+		cstate->queryDesc = CreateQueryDesc(plan, NIL, pstate->p_sourcetext,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
 											dest, NULL, NULL, 0);
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 152c29b551..942449544c 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -325,7 +325,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
 		UpdateActiveSnapshotCommandId();
 
 		/* Create a QueryDesc, redirecting output to our tuple receiver */
-		queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+		queryDesc = CreateQueryDesc(plan, NIL, pstate->p_sourcetext,
 									GetActiveSnapshot(), InvalidSnapshot,
 									dest, params, queryEnv, 0);
 
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index f86983c660..2f2b558608 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -407,7 +407,7 @@ ExplainOneQuery(Query *query, int cursorOptions,
 		}
 
 		/* run it (if needed) and produce output */
-		ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+		ExplainOnePlan(plan, NIL, into, es, queryString, params, queryEnv,
 					   &planduration, (es->buffers ? &bufusage : NULL));
 	}
 }
@@ -515,7 +515,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
  * to call it.
  */
 void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, List *part_prune_results,
+			   IntoClause *into, ExplainState *es,
 			   const char *queryString, ParamListInfo params,
 			   QueryEnvironment *queryEnv, const instr_time *planduration,
 			   const BufferUsage *bufusage)
@@ -563,7 +564,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
 		dest = None_Receiver;
 
 	/* Create a QueryDesc for the query */
-	queryDesc = CreateQueryDesc(plannedstmt, queryString,
+	queryDesc = CreateQueryDesc(plannedstmt, part_prune_results, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, params, queryEnv, instrument_option);
 
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index cf1b1ca571..904cbcba4a 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -779,7 +779,7 @@ execute_sql_string(const char *sql)
 			{
 				QueryDesc  *qdesc;
 
-				qdesc = CreateQueryDesc(stmt,
+				qdesc = CreateQueryDesc(stmt, NIL,
 										sql,
 										GetActiveSnapshot(), NULL,
 										dest, NULL, NULL, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 9ac0383459..65c8d0aa59 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -408,7 +408,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
 	UpdateActiveSnapshotCommandId();
 
 	/* Create a QueryDesc, redirecting output to our tuple receiver */
-	queryDesc = CreateQueryDesc(plan, queryString,
+	queryDesc = CreateQueryDesc(plan, NIL, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, NULL, NULL, 0);
 
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 9e29584d93..29b45539d3 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -155,6 +155,7 @@ ExecuteQuery(ParseState *pstate,
 	PreparedStatement *entry;
 	CachedPlan *cplan;
 	List	   *plan_list;
+	List	   *part_prune_results_list;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
 	Portal		portal;
@@ -193,7 +194,10 @@ ExecuteQuery(ParseState *pstate,
 									   entry->plansource->query_string);
 
 	/* Replan if needed, and increment plan refcount for portal */
-	cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL);
+	cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL,
+						  &part_prune_results_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_results_list));
 	plan_list = cplan->stmt_list;
 
 	/*
@@ -207,6 +211,9 @@ ExecuteQuery(ParseState *pstate,
 					  plan_list,
 					  cplan);
 
+	/* Copy Lists of PartitionPruneResults into the portal's context. */
+	PortalStorePartitionPruneResults(portal, part_prune_results_list);
+
 	/*
 	 * For CREATE TABLE ... AS EXECUTE, we must verify that the prepared
 	 * statement is one that produces tuples.  Currently we insist that it be
@@ -576,7 +583,9 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	const char *query_string;
 	CachedPlan *cplan;
 	List	   *plan_list;
-	ListCell   *p;
+	List	   *part_prune_results_list;
+	ListCell   *p,
+			   *pp;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
 	instr_time	planstart;
@@ -619,7 +628,10 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 
 	/* Replan if needed, and acquire a transient refcount */
 	cplan = GetCachedPlan(entry->plansource, paramLI,
-						  CurrentResourceOwner, queryEnv);
+						  CurrentResourceOwner, queryEnv,
+						  &part_prune_results_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_results_list));
 
 	INSTR_TIME_SET_CURRENT(planduration);
 	INSTR_TIME_SUBTRACT(planduration, planstart);
@@ -634,13 +646,15 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	plan_list = cplan->stmt_list;
 
 	/* Explain each query */
-	foreach(p, plan_list)
+	forboth(p, plan_list, pp, part_prune_results_list)
 	{
 		PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
+		List *part_prune_results = lfirst_node(List, pp);
 
 		if (pstmt->commandType != CMD_UTILITY)
-			ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv,
-						   &planduration, (es->buffers ? &bufusage : NULL));
+			ExplainOnePlan(pstmt, part_prune_results, into, es, query_string,
+						   paramLI, queryEnv, &planduration,
+						   (es->buffers ? &bufusage : NULL));
 		else
 			ExplainOneUtility(pstmt->utilityStmt, into, es, query_string,
 							  paramLI, queryEnv);
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 17775a49e2..5c59ac5da7 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -65,6 +65,34 @@ found there.  This currently only occurs for Append and MergeAppend nodes.  In
 this case the non-required subplans are ignored and the executor state's
 subnode array will become out of sequence to the plan's subplan list.
 
+Execution-time pruning may in fact occur even before execution has started.
+One case where that happens is when a cached generic plan is being validated
+for execution by plancache.c: GetCachedPlan(), which works by locking all the
+relations that will be scanned by that plan.  If the generic plan contains
+nodes that can perform execution-time partition pruning (that is, nodes that
+contain a PartitionPruneInfo), the subset of pruning steps in a given node's
+PartitionPruneInfo that does not depend on the execution actually having
+started (called "initial" pruning steps) is performed to figure out the
+minimal set of child subplans that satisfy those pruning steps.
+AcquireExecutorLocks() looking at a given generic plan will then lock only the
+relations scanned by the child subplans that survived such pruning, along with
+those present in PlannedStmt.minLockRelids.  Note that the subplans are only
+notionally pruned, that is, they are not removed from the plan tree as such.
+
+To prevent the executor and any third party execution code that can look at
+the plan tree from trying to execute the subplans that were pruned as
+described above, the result of pruning is passed to the executor as a List
+of PartitionPruneResult nodes via the QueryDesc.  Each PartitionPruneResult
+consists of the set of indexes of surviving subplans in the respective parent
+plan node's (the one to which the corresponding PartitionPruneInfo belongs)
+list of child subplans, saved as a bitmapset (valid_subplan_offs).  The
+executor executing a generic plan must therefore not re-evaluate the set of
+initially valid subplans for a given plan node by redoing the initial pruning
+if it was already done by AcquireExecutorLocks() when validating the plan.
+Such re-evaluation of the pruning steps may very well end up resulting in a
+different set of subplans, containing some whose relations were not locked by
+AcquireExecutorLocks().
+
 Each Plan node may have expression trees associated with it, to represent
 its target list, qualification conditions, etc.  These trees are also
 read-only to the executor, but the executor state for expression evaluation
@@ -286,6 +314,10 @@ Query Processing Control Flow
 
 This is a sketch of control flow for full query processing:
 
+  [ ExecutorDoInitialPruning ] --- an optional step to perform initial
+		partition pruning on the plan tree, the result of which is passed
+		to the executor via QueryDesc
+
 	CreateQueryDesc
 
 	ExecutorStart
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index b6751da574..7a4db80104 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -49,6 +49,7 @@
 #include "commands/matview.h"
 #include "commands/trigger.h"
 #include "executor/execdebug.h"
+#include "executor/execPartition.h"
 #include "executor/nodeSubplan.h"
 #include "foreign/fdwapi.h"
 #include "jit/jit.h"
@@ -104,6 +105,54 @@ static void EvalPlanQualStart(EPQState *epqstate, Plan *planTree);
 
 /* end of local decls */
 
+/* ----------------------------------------------------------------
+ *		ExecutorDoInitialPruning
+ *
+ *		For each plan tree node that has been assigned a PartitionPruneInfo,
+ *		this performs initial partition pruning using the information contained
+ *		therein to determine the set of child subplans that satisfy the initial
+ *		pruning steps, to be returned as a bitmapset of their indexes in the
+ *		node's list of child subplans (for example, an Append's appendplans).
+ *
+ * Return value is a List of PartitionPruneResult nodes, one for every
+ * PartitionPruneInfo, each containing the bitmapset for its plan node.  The
+ * RT indexes of all the leaf partitions scanned by the chosen subplans are
+ * collected in *scan_leafpart_rtis, which is shared across all PartitionPruneInfos.
+ *
+ * The executor must see exactly the same set of subplans as valid for
+ * execution when doing ExecInitNode() on the plan nodes whose
+ * PartitionPruneInfos are processed here.  So, it must get the set from the
+ * corresponding PartitionPruneResult, instead of computing it all over
+ * again by redoing the initial pruning.  It's the caller's job to pass the
+ * List of PartitionPruneResults to the executor.
+ *
+ * Note: Partitioned tables mentioned in PartitionedRelPruneInfo nodes that
+ * drive the pruning will be locked before doing the pruning.
+ * ----------------------------------------------------------------
+ */
+List *
+ExecutorDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
+						 Bitmapset **scan_leafpart_rtis)
+{
+	List	 *part_prune_results = NIL;
+	ListCell *lc;
+
+	/* Only get here if there is any pruning to do. */
+	Assert(plannedstmt->containsInitialPruning);
+
+	foreach(lc, plannedstmt->partPruneInfos)
+	{
+		PartitionPruneInfo *pruneinfo = lfirst(lc);
+		PartitionPruneResult *pruneresult = makeNode(PartitionPruneResult);
+
+		pruneresult->valid_subplan_offs =
+			ExecPartitionDoInitialPruning(plannedstmt, params, pruneinfo,
+										  scan_leafpart_rtis);
+		part_prune_results = lappend(part_prune_results, pruneresult);
+	}
+
+	return part_prune_results;
+}
 
 /* ----------------------------------------------------------------
  *		ExecutorStart
@@ -806,6 +855,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 {
 	CmdType		operation = queryDesc->operation;
 	PlannedStmt *plannedstmt = queryDesc->plannedstmt;
+	List	   *part_prune_results = queryDesc->part_prune_results;
 	Plan	   *plan = plannedstmt->planTree;
 	List	   *rangeTable = plannedstmt->rtable;
 	EState	   *estate = queryDesc->estate;
@@ -826,6 +876,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 
 	estate->es_plannedstmt = plannedstmt;
 	estate->es_part_prune_infos = plannedstmt->partPruneInfos;
+	estate->es_part_prune_results = part_prune_results;
 
 	/*
 	 * Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index aca0c6f323..917079a034 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -66,6 +66,7 @@
 #define PARALLEL_KEY_QUERY_TEXT		UINT64CONST(0xE000000000000008)
 #define PARALLEL_KEY_JIT_INSTRUMENTATION UINT64CONST(0xE000000000000009)
 #define PARALLEL_KEY_WAL_USAGE			UINT64CONST(0xE00000000000000A)
+#define PARALLEL_KEY_PARTITION_PRUNE_RESULTS	UINT64CONST(0xE00000000000000B)
 
 #define PARALLEL_TUPLE_QUEUE_SIZE		65536
 
@@ -182,6 +183,7 @@ ExecSerializePlan(Plan *plan, EState *estate)
 	pstmt->transientPlan = false;
 	pstmt->dependsOnRole = false;
 	pstmt->parallelModeNeeded = false;
+	pstmt->containsInitialPruning = false;
 	pstmt->planTree = plan;
 	pstmt->partPruneInfos = estate->es_part_prune_infos;
 	pstmt->rtable = estate->es_range_table;
@@ -597,12 +599,15 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	FixedParallelExecutorState *fpes;
 	char	   *pstmt_data;
 	char	   *pstmt_space;
+	char	   *part_prune_results_data;
+	char	   *part_prune_results_space;
 	char	   *paramlistinfo_space;
 	BufferUsage *bufusage_space;
 	WalUsage   *walusage_space;
 	SharedExecutorInstrumentation *instrumentation = NULL;
 	SharedJitInstrumentation *jit_instrumentation = NULL;
 	int			pstmt_len;
+	int			part_prune_results_len;
 	int			paramlistinfo_len;
 	int			instrumentation_len = 0;
 	int			jit_instrumentation_len = 0;
@@ -631,6 +636,7 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 
 	/* Fix up and serialize plan to be sent to workers. */
 	pstmt_data = ExecSerializePlan(planstate->plan, estate);
+	part_prune_results_data = nodeToString(estate->es_part_prune_results);
 
 	/* Create a parallel context. */
 	pcxt = CreateParallelContext("postgres", "ParallelQueryMain", nworkers);
@@ -657,6 +663,11 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	shm_toc_estimate_chunk(&pcxt->estimator, pstmt_len);
 	shm_toc_estimate_keys(&pcxt->estimator, 1);
 
+	/* Estimate space for serialized List of PartitionPruneResult. */
+	part_prune_results_len = strlen(part_prune_results_data) + 1;
+	shm_toc_estimate_chunk(&pcxt->estimator, part_prune_results_len);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
 	/* Estimate space for serialized ParamListInfo. */
 	paramlistinfo_len = EstimateParamListSpace(estate->es_param_list_info);
 	shm_toc_estimate_chunk(&pcxt->estimator, paramlistinfo_len);
@@ -751,6 +762,12 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	memcpy(pstmt_space, pstmt_data, pstmt_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PLANNEDSTMT, pstmt_space);
 
+	/* Store serialized List of PartitionPruneResult */
+	part_prune_results_space = shm_toc_allocate(pcxt->toc, part_prune_results_len);
+	memcpy(part_prune_results_space, part_prune_results_data, part_prune_results_len);
+	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARTITION_PRUNE_RESULTS,
+				   part_prune_results_space);
+
 	/* Store serialized ParamListInfo. */
 	paramlistinfo_space = shm_toc_allocate(pcxt->toc, paramlistinfo_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARAMLISTINFO, paramlistinfo_space);
@@ -1232,8 +1249,10 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 						 int instrument_options)
 {
 	char	   *pstmtspace;
+	char	   *part_prune_results_space;
 	char	   *paramspace;
 	PlannedStmt *pstmt;
+	List	   *part_prune_results;
 	ParamListInfo paramLI;
 	char	   *queryString;
 
@@ -1244,12 +1263,17 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 	pstmtspace = shm_toc_lookup(toc, PARALLEL_KEY_PLANNEDSTMT, false);
 	pstmt = (PlannedStmt *) stringToNode(pstmtspace);
 
+	/* Reconstruct leader-supplied List of PartitionPruneResult. */
+	part_prune_results_space =
+		shm_toc_lookup(toc, PARALLEL_KEY_PARTITION_PRUNE_RESULTS, false);
+	part_prune_results = (List *) stringToNode(part_prune_results_space);
+
 	/* Reconstruct ParamListInfo. */
 	paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
 	paramLI = RestoreParamList(&paramspace);
 
 	/* Create a QueryDesc for the query. */
-	return CreateQueryDesc(pstmt,
+	return CreateQueryDesc(pstmt, part_prune_results,
 						   queryString,
 						   GetActiveSnapshot(), InvalidSnapshot,
 						   receiver, paramLI, NULL, instrument_options);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 8e6453aec2..13e450c0fa 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -25,6 +25,7 @@
 #include "mb/pg_wchar.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
+#include "parser/parsetree.h"
 #include "partitioning/partbounds.h"
 #include "partitioning/partdesc.h"
 #include "partitioning/partprune.h"
@@ -185,7 +186,11 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
 static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
 static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
 static PartitionPruneState *CreatePartitionPruneState(PlanState *planstate,
-													  PartitionPruneInfo *pruneinfo);
+													  PartitionPruneInfo *pruneinfo,
+													  bool consider_initial_steps,
+													  bool consider_exec_steps,
+													  List *rtable, ExprContext *econtext,
+													  PartitionDirectory partdir);
 static void InitPartitionPruneContext(PartitionPruneContext *context,
 									  List *pruning_steps,
 									  PartitionDesc partdesc,
@@ -198,7 +203,8 @@ static void PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
 static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
 										   PartitionedRelPruningData *pprune,
 										   bool initial_prune,
-										   Bitmapset **validsubplans);
+										   Bitmapset **validsubplans,
+										   Bitmapset **scan_leafpart_rtis);
 
 
 /*
@@ -1758,8 +1764,10 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  * considered to be a stable expression, it can change value from one plan
  * node scan to the next during query execution.  Stable comparison
  * expressions that don't involve such Params allow partition pruning to be
- * done once during executor startup.  Expressions that do involve such Params
- * require us to prune separately for each scan of the parent plan node.
+ * done once during executor startup or during ExecutorDoInitialPruning() that
+ * runs as part of performing AcquireExecutorLocks() on a given plan tree.
+ * Expressions that do involve such Params require us to prune separately for
+ * each scan of the parent plan node.
  *
  * Note that pruning away unneeded subplans during executor startup has the
  * added benefit of not having to initialize the unneeded subplans at all.
@@ -1776,6 +1784,13 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *		account for initial pruning possibly having eliminated some of the
  *		subplans.
  *
+ * ExecPartitionDoInitialPruning:
+ *		Do initial pruning with the information contained in a given
+ *		PartitionPruneInfo to determine the minimal set of child subplans
+ *		to be executed of the parent plan node to which the PartitionPruneInfo
+ *		belongs and also the set of the RT indexes of leaf partitions that will
+ *		be scanned with those subplans.
+ *
  * ExecFindMatchingSubPlans:
  *		Returns indexes of matching subplans after evaluating the expressions
  *		that are safe to evaluate at a given point.  This function is first
@@ -1796,8 +1811,9 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *
  * On return, *initially_valid_subplans is assigned the set of indexes of
  * child subplans that must be initialized along with the parent plan node.
- * Initial pruning is performed here if needed and in that case only the
- * surviving subplans' indexes are added.
+ * Initial pruning is performed here if needed (unless it has already been done
+ * by ExecutorDoInitialPruning()), and in that case only the surviving
+ * subplans' indexes are added.
  *
  * If subplans are indeed pruned, subplan_map arrays contained in the returned
  * PartitionPruneState are re-sequenced to not count those, though only if the
@@ -1810,9 +1826,10 @@ ExecInitPartitionPruning(PlanState *planstate,
 						 Bitmapset *root_parent_relids,
 						 Bitmapset **initially_valid_subplans)
 {
-	PartitionPruneState *prunestate;
+	PartitionPruneState *prunestate = NULL;
 	EState	   *estate = planstate->state;
 	PartitionPruneInfo *pruneinfo;
+	PartitionPruneResult *pruneresult = NULL;
 
 	/* Obtain the pruneinfo we need, and make sure it's the right one */
 	pruneinfo = list_nth(estate->es_part_prune_infos, part_prune_index);
@@ -1828,20 +1845,57 @@ ExecInitPartitionPruning(PlanState *planstate,
 	/* We may need an expression context to evaluate partition exprs */
 	ExecAssignExprContext(estate, planstate);
 
-	/* Create the working data structure for pruning */
-	prunestate = CreatePartitionPruneState(planstate, pruneinfo);
+	/*
+	 * No need to do initial pruning if it was done already by
+	 * ExecutorDoInitialPruning(), which it would be if es_part_prune_results
+	 * is set.
+	 */
+	if (estate->es_part_prune_results)
+	{
+		pruneresult = list_nth(estate->es_part_prune_results, part_prune_index);
+		Assert(IsA(pruneresult, PartitionPruneResult));
+	}
+
+	if (pruneresult == NULL || pruneinfo->needs_exec_pruning)
+	{
+		/* We may need an expression context to evaluate partition exprs */
+		ExecAssignExprContext(estate, planstate);
+
+		/* For data reading, executor always omits detached partitions */
+		if (estate->es_partition_directory == NULL)
+			estate->es_partition_directory =
+				CreatePartitionDirectory(estate->es_query_cxt, false);
+
+		/*
+		 * Create the working data structure for pruning.  No need to consider
+		 * initial pruning steps if we have a PartitionPruneResult.
+		 */
+		prunestate = CreatePartitionPruneState(planstate, pruneinfo,
+											   pruneresult == NULL,
+											   pruneinfo->needs_exec_pruning,
+											   NIL, planstate->ps_ExprContext,
+											   estate->es_partition_directory);
+	}
 
 	/*
 	 * Perform an initial partition prune pass, if required.
 	 */
-	if (prunestate->do_initial_prune)
-		*initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true);
+	if (pruneresult)
+	{
+		*initially_valid_subplans = bms_copy(pruneresult->valid_subplan_offs);
+	}
+	else if (prunestate && prunestate->do_initial_prune)
+	{
+		*initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true,
+															 NULL);
+	}
 	else
 	{
-		/* No pruning, so we'll need to initialize all subplans */
+		/* No initial pruning, so we'll need to initialize all subplans */
 		Assert(n_total_subplans > 0);
 		*initially_valid_subplans = bms_add_range(NULL, 0,
 												  n_total_subplans - 1);
+		return prunestate;
 	}
 
 	/*
@@ -1849,7 +1903,8 @@ ExecInitPartitionPruning(PlanState *planstate,
 	 * that were removed above due to initial pruning.  No need to do this if
 	 * no steps were removed.
 	 */
-	if (bms_num_members(*initially_valid_subplans) < n_total_subplans)
+	if (prunestate &&
+		bms_num_members(*initially_valid_subplans) < n_total_subplans)
 	{
 		/*
 		 * We can safely skip this when !do_exec_prune, even though that
@@ -1865,11 +1920,74 @@ ExecInitPartitionPruning(PlanState *planstate,
 	return prunestate;
 }
 
+/*
+ * ExecPartitionDoInitialPruning
+ *		Perform initial pruning using given PartitionPruneInfo to determine
+ *		the minimal set of child subplans that will be executed and also the
+ *		set of RT indexes of the leaf partitions scanned by those subplans.
+ */
+Bitmapset *
+ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
+							  PartitionPruneInfo *pruneinfo,
+							  Bitmapset **scan_leafpart_rtis)
+{
+	List		 *rtable = plannedstmt->rtable;
+	ExprContext	 *econtext;
+	PartitionDirectory pdir;
+	MemoryContext oldcontext,
+				  tmpcontext;
+	PartitionPruneState *prunestate;
+	Bitmapset	 *valid_subplan_offs;
+
+	/*
+	 * A temporary context for memory allocations required while executing
+	 * partition pruning steps.
+	 */
+	tmpcontext = AllocSetContextCreate(CurrentMemoryContext,
+									   "initial pruning working data",
+									   ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(tmpcontext);
+
+	/*
+	 * PartitionDirectory to look up partition descriptors.
+	 * Note that we don't omit detached partitions, just like during
+	 * execution proper.
+	 */
+	pdir = CreatePartitionDirectory(CurrentMemoryContext, false);
+
+	/*
+	 * We don't yet have a PlanState for the parent plan node, so we must
+	 * create a standalone ExprContext to evaluate pruning expressions,
+	 * equipped with the information about the EXTERN parameters that the
+	 * caller passed us.  Note that that's okay because the initial pruning
+	 * steps do not contain anything that requires the execution to have
+	 * started and thus needs the information contained in a PlanState.
+	 */
+	econtext = CreateStandaloneExprContext();
+	econtext->ecxt_param_list_info = params;
+	prunestate = CreatePartitionPruneState(NULL, pruneinfo, true, false,
+										   rtable, econtext, pdir);
+	MemoryContextSwitchTo(oldcontext);
+
+	/* Do the initial pruning. */
+	valid_subplan_offs = ExecFindMatchingSubPlans(prunestate, true,
+												  scan_leafpart_rtis);
+
+	FreeExprContext(econtext, true);
+	DestroyPartitionDirectory(pdir);
+	MemoryContextDelete(tmpcontext);
+
+	return valid_subplan_offs;
+}
+
 /*
  * CreatePartitionPruneState
  *		Build the data structure required for calling ExecFindMatchingSubPlans
  *
- * 'planstate' is the parent plan node's execution state.
+ * 'planstate', if not NULL, is the parent plan node's execution state.  It
+ * can be NULL if being called before ExecutorStart(), in which case,
+ * 'rtable' (range table), 'econtext', and 'partdir' must be explicitly
+ * provided.
  *
  * 'pruneinfo' is a PartitionPruneInfo as generated by
  * make_partition_pruneinfo.  Here we build a PartitionPruneState containing a
@@ -1883,19 +2001,21 @@ ExecInitPartitionPruning(PlanState *planstate,
  * PartitionedRelPruneInfo.
  */
 static PartitionPruneState *
-CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
+CreatePartitionPruneState(PlanState *planstate,
+						  PartitionPruneInfo *pruneinfo,
+						  bool consider_initial_steps,
+						  bool consider_exec_steps,
+						  List *rtable, ExprContext *econtext,
+						  PartitionDirectory partdir)
 {
-	EState	   *estate = planstate->state;
+	EState	   *estate = planstate ? planstate->state : NULL;
 	PartitionPruneState *prunestate;
 	int			n_part_hierarchies;
 	ListCell   *lc;
 	int			i;
-	ExprContext *econtext = planstate->ps_ExprContext;
 
-	/* For data reading, executor always omits detached partitions */
-	if (estate->es_partition_directory == NULL)
-		estate->es_partition_directory =
-			CreatePartitionDirectory(estate->es_query_cxt, false);
+	Assert((estate != NULL) ||
+			(partdir != NULL && econtext != NULL && rtable != NIL));
 
 	n_part_hierarchies = list_length(pruneinfo->prune_infos);
 	Assert(n_part_hierarchies > 0);
@@ -1950,15 +2070,42 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 			PartitionKey partkey;
 
 			/*
-			 * We can rely on the copies of the partitioned table's partition
-			 * key and partition descriptor appearing in its relcache entry,
-			 * because that entry will be held open and locked for the
-			 * duration of this executor run.
+			 * Must open the relation ourselves when called before the
+			 * execution has started, such as when called during
+			 * ExecutorDoInitialPruning() on a cached plan.  In that case,
+			 * sub-partitions must be locked, because AcquirePlannerLocks()
+			 * would not have seen them. (1st relation in a partrelpruneinfos
+			 * list is always the root partitioned table appearing in the
+			 * query, which AcquirePlannerLocks() would have locked; the
+			 * Assert in relation_open() guards that assumption.)
+			 */
+			if (estate == NULL)
+			{
+				RangeTblEntry *rte = rt_fetch(pinfo->rtindex, rtable);
+				int		lockmode = (j == 0) ? NoLock : rte->rellockmode;
+
+				partrel = table_open(rte->relid, lockmode);
+			}
+			else
+				partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
+
+			/*
+			 * We can rely on the copy of the partitioned table's partition
+			 * key in its relcache entry, because it can't change (or
+			 * get destroyed) as long as the relation is locked.  Partition
+			 * descriptor is taken from the PartitionDirectory associated with
+			 * the table that is held open long enough for the descriptor to
+			 * remain valid while it's used to perform the pruning steps.
 			 */
-			partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
 			partkey = RelationGetPartitionKey(partrel);
-			partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
-												partrel);
+			partdesc = PartitionDirectoryLookup(partdir, partrel);
+
+			/*
+			 * Must close partrel, keeping the lock taken, if we're not using
+			 * EState's entry.
+			 */
+			if (estate == NULL)
+				table_close(partrel, NoLock);
 
 			/*
 			 * Initialize the subplan_map and subpart_map.
@@ -1972,6 +2119,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 			Assert(partdesc->nparts >= pinfo->nparts);
 			pprune->nparts = partdesc->nparts;
 			pprune->subplan_map = palloc(sizeof(int) * partdesc->nparts);
+			pprune->rti_map = palloc(sizeof(Index) * partdesc->nparts);
 			if (partdesc->nparts == pinfo->nparts)
 			{
 				/*
@@ -1982,6 +2130,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 				pprune->subpart_map = pinfo->subpart_map;
 				memcpy(pprune->subplan_map, pinfo->subplan_map,
 					   sizeof(int) * pinfo->nparts);
+				memcpy(pprune->rti_map, pinfo->rti_map,
+					   sizeof(Index) * pinfo->nparts);
 
 				/*
 				 * Double-check that the list of unpruned relations has not
@@ -2032,6 +2182,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 							pinfo->subplan_map[pd_idx];
 						pprune->subpart_map[pp_idx] =
 							pinfo->subpart_map[pd_idx];
+						pprune->rti_map[pp_idx] =
+							pinfo->rti_map[pd_idx];
 						pd_idx++;
 					}
 					else
@@ -2039,6 +2191,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 						/* this partdesc entry is not in the plan */
 						pprune->subplan_map[pp_idx] = -1;
 						pprune->subpart_map[pp_idx] = -1;
+						pprune->rti_map[pp_idx] = 0;
 					}
 				}
 
@@ -2060,7 +2213,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 			 * Initialize pruning contexts as needed.
 			 */
 			pprune->initial_pruning_steps = pinfo->initial_pruning_steps;
-			if (pinfo->initial_pruning_steps)
+			if (consider_initial_steps && pinfo->initial_pruning_steps)
 			{
 				InitPartitionPruneContext(&pprune->initial_context,
 										  pinfo->initial_pruning_steps,
@@ -2070,7 +2223,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 				prunestate->do_initial_prune = true;
 			}
 			pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
-			if (pinfo->exec_pruning_steps)
+			if (consider_exec_steps && pinfo->exec_pruning_steps)
 			{
 				InitPartitionPruneContext(&pprune->exec_context,
 										  pinfo->exec_pruning_steps,
@@ -2298,10 +2451,14 @@ PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
  * Pass initial_prune if PARAM_EXEC Params cannot yet be evaluated.  This
  * differentiates the initial executor-time pruning step from later
  * runtime pruning.
+ *
+ * RT indexes of leaf partitions scanned by the chosen subplans are added to
+ * *scan_leafpart_rtis if the pointer is non-NULL.
  */
 Bitmapset *
 ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
-						 bool initial_prune)
+						 bool initial_prune,
+						 Bitmapset **scan_leafpart_rtis)
 {
 	Bitmapset  *result = NULL;
 	MemoryContext oldcontext;
@@ -2336,7 +2493,7 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
 		 */
 		pprune = &prunedata->partrelprunedata[0];
 		find_matching_subplans_recurse(prunedata, pprune, initial_prune,
-									   &result);
+									   &result, scan_leafpart_rtis);
 
 		/* Expression eval may have used space in ExprContext too */
 		if (pprune->exec_pruning_steps)
@@ -2350,6 +2507,8 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
 
 	/* Copy result out of the temp context before we reset it */
 	result = bms_copy(result);
+	if (scan_leafpart_rtis)
+		*scan_leafpart_rtis = bms_copy(*scan_leafpart_rtis);
 
 	MemoryContextReset(prunestate->prune_context);
 
@@ -2360,13 +2519,15 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
  * find_matching_subplans_recurse
  *		Recursive worker function for ExecFindMatchingSubPlans
  *
- * Adds valid (non-prunable) subplan IDs to *validsubplans
+ * Adds valid (non-prunable) subplan IDs to *validsubplans and RT indexes of
+ * the corresponding leaf partitions to *scan_leafpart_rtis (if asked for).
  */
 static void
 find_matching_subplans_recurse(PartitionPruningData *prunedata,
 							   PartitionedRelPruningData *pprune,
 							   bool initial_prune,
-							   Bitmapset **validsubplans)
+							   Bitmapset **validsubplans,
+							   Bitmapset **scan_leafpart_rtis)
 {
 	Bitmapset  *partset;
 	int			i;
@@ -2393,8 +2554,14 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
 	while ((i = bms_next_member(partset, i)) >= 0)
 	{
 		if (pprune->subplan_map[i] >= 0)
+		{
 			*validsubplans = bms_add_member(*validsubplans,
 											pprune->subplan_map[i]);
+			Assert(pprune->rti_map[i] > 0);
+			if (scan_leafpart_rtis)
+				*scan_leafpart_rtis = bms_add_member(*scan_leafpart_rtis,
+													 pprune->rti_map[i]);
+		}
 		else
 		{
 			int			partidx = pprune->subpart_map[i];
@@ -2402,7 +2569,8 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
 			if (partidx >= 0)
 				find_matching_subplans_recurse(prunedata,
 											   &prunedata->partrelprunedata[partidx],
-											   initial_prune, validsubplans);
+											   initial_prune, validsubplans,
+											   scan_leafpart_rtis);
 			else
 			{
 				/*
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 9695de85b9..dce93a8c9f 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -135,6 +135,7 @@ CreateExecutorState(void)
 	estate->es_param_exec_vals = NULL;
 
 	estate->es_queryEnv = NULL;
+	estate->es_part_prune_results = NIL;
 
 	estate->es_query_cxt = qcontext;
 
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index dc13625171..bffb42ce71 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -842,7 +842,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
 	else
 		dest = None_Receiver;
 
-	es->qd = CreateQueryDesc(es->stmt,
+	es->qd = CreateQueryDesc(es->stmt, NIL,
 							 fcache->src,
 							 GetActiveSnapshot(),
 							 InvalidSnapshot,
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 99830198bd..3b917584de 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -156,7 +156,8 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 		 * subplan, we can fill as_valid_subplans immediately, preventing
 		 * later calls to ExecFindMatchingSubPlans.
 		 */
-		if (!prunestate->do_exec_prune && nplans > 0)
+		if (appendstate->as_prune_state == NULL ||
+			(!appendstate->as_prune_state->do_exec_prune && nplans > 0))
 			appendstate->as_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
 	}
 	else
@@ -578,7 +579,7 @@ choose_next_subplan_locally(AppendState *node)
 		}
 		else if (node->as_valid_subplans == NULL)
 			node->as_valid_subplans =
-				ExecFindMatchingSubPlans(node->as_prune_state, false);
+				ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 
 		whichplan = -1;
 	}
@@ -643,7 +644,7 @@ choose_next_subplan_for_leader(AppendState *node)
 		if (node->as_valid_subplans == NULL)
 		{
 			node->as_valid_subplans =
-				ExecFindMatchingSubPlans(node->as_prune_state, false);
+				ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 
 			/*
 			 * Mark each invalid plan as finished to allow the loop below to
@@ -718,7 +719,7 @@ choose_next_subplan_for_worker(AppendState *node)
 	else if (node->as_valid_subplans == NULL)
 	{
 		node->as_valid_subplans =
-			ExecFindMatchingSubPlans(node->as_prune_state, false);
+			ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 		mark_invalid_subplans_as_finished(node);
 	}
 
@@ -869,7 +870,7 @@ ExecAppendAsyncBegin(AppendState *node)
 	if (node->as_valid_subplans == NULL)
 	{
 		node->as_valid_subplans =
-			ExecFindMatchingSubPlans(node->as_prune_state, false);
+			ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 
 		classify_matching_subplans(node);
 	}
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index f370f9f287..ccfa083945 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -104,7 +104,8 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
 		 * subplan, we can fill ms_valid_subplans immediately, preventing
 		 * later calls to ExecFindMatchingSubPlans.
 		 */
-		if (!prunestate->do_exec_prune && nplans > 0)
+		if (mergestate->ms_prune_state == NULL ||
+			(!mergestate->ms_prune_state->do_exec_prune && nplans > 0))
 			mergestate->ms_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
 	}
 	else
@@ -219,7 +220,7 @@ ExecMergeAppend(PlanState *pstate)
 		 */
 		if (node->ms_valid_subplans == NULL)
 			node->ms_valid_subplans =
-				ExecFindMatchingSubPlans(node->ms_prune_state, false);
+				ExecFindMatchingSubPlans(node->ms_prune_state, false, NULL);
 
 		/*
 		 * First time through: pull the first tuple from each valid subplan,
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index fd5796f1b9..93012a5b3b 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1578,6 +1578,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 	CachedPlanSource *plansource;
 	CachedPlan *cplan;
 	List	   *stmt_list;
+	List	   *part_prune_results_list;
 	char	   *query_string;
 	Snapshot	snapshot;
 	MemoryContext oldcontext;
@@ -1657,7 +1658,10 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 	 */
 
 	/* Replan if needed, and increment plan refcount for portal */
-	cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv);
+	cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv,
+						  &part_prune_results_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_results_list));
 	stmt_list = cplan->stmt_list;
 
 	if (!plan->saved)
@@ -1685,6 +1689,9 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 					  stmt_list,
 					  cplan);
 
+	/* Copy Lists of PartitionPruneResults into the portal's context. */
+	PortalStorePartitionPruneResults(portal, part_prune_results_list);
+
 	/*
 	 * Set up options for portal.  Default SCROLL type is chosen the same way
 	 * as PerformCursorOpen does it.
@@ -2092,7 +2099,8 @@ SPI_plan_get_cached_plan(SPIPlanPtr plan)
 	/* Get the generic plan for the query */
 	cplan = GetCachedPlan(plansource, NULL,
 						  plan->saved ? CurrentResourceOwner : NULL,
-						  _SPI_current->queryEnv);
+						  _SPI_current->queryEnv,
+						  NULL /* Not interested in PartitionPruneResults */);
 	Assert(cplan == plansource->gplan);
 
 	/* Pop the error context stack */
@@ -2473,7 +2481,9 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 	{
 		CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
 		List	   *stmt_list;
-		ListCell   *lc2;
+		List	   *part_prune_results_list;
+		ListCell   *lc2,
+				   *lc3;
 
 		spicallbackarg.query = plansource->query_string;
 
@@ -2549,8 +2559,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 		 * plan, the refcount must be backed by the plan_owner.
 		 */
 		cplan = GetCachedPlan(plansource, options->params,
-							  plan_owner, _SPI_current->queryEnv);
-
+							  plan_owner, _SPI_current->queryEnv,
+							  &part_prune_results_list);
+		Assert(list_length(cplan->stmt_list) ==
+			   list_length(part_prune_results_list));
 		stmt_list = cplan->stmt_list;
 
 		/*
@@ -2589,9 +2601,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 			}
 		}
 
-		foreach(lc2, stmt_list)
+		forboth(lc2, stmt_list, lc3, part_prune_results_list)
 		{
 			PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+			List	   *part_prune_results = lfirst_node(List, lc3);
 			bool		canSetTag = stmt->canSetTag;
 			DestReceiver *dest;
 
@@ -2663,7 +2676,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 				else
 					snap = InvalidSnapshot;
 
-				qdesc = CreateQueryDesc(stmt,
+				qdesc = CreateQueryDesc(stmt, part_prune_results,
 										plansource->query_string,
 										snap, crosscheck_snapshot,
 										dest,
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 23776367c5..b01f55fb4f 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -158,6 +158,11 @@
 	token = pg_strtok(&length);		/* skip :fldname */ \
 	local_node->fldname = readIntCols(len)
 
+/* Read an Index array */
+#define READ_INDEX_ARRAY(fldname, len) \
+	token = pg_strtok(&length);		/* skip :fldname */ \
+	local_node->fldname = readIndexCols(len)
+
 /* Read a bool array */
 #define READ_BOOL_ARRAY(fldname, len) \
 	token = pg_strtok(&length);		/* skip :fldname */ \
@@ -800,7 +805,6 @@ fnname(int numCols) \
  */
 READ_SCALAR_ARRAY(readAttrNumberCols, int16, atoi)
 READ_SCALAR_ARRAY(readOidCols, Oid, atooid)
-/* outfuncs.c has writeIndexCols, but we don't yet need that here */
-/* READ_SCALAR_ARRAY(readIndexCols, Index, atoui) */
+READ_SCALAR_ARRAY(readIndexCols, Index, atoui)
 READ_SCALAR_ARRAY(readIntCols, int, atoi)
 READ_SCALAR_ARRAY(readBoolCols, bool, strtobool)
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 799602f5ea..a96d316dca 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -520,7 +520,9 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 	result->parallelModeNeeded = glob->parallelModeNeeded;
 	result->planTree = top_plan;
 	result->partPruneInfos = glob->partPruneInfos;
+	result->containsInitialPruning = glob->containsInitialPruning;
 	result->rtable = glob->finalrtable;
+	result->minLockRelids = glob->minLockRelids;
 	result->resultRelations = glob->resultRelations;
 	result->appendRelations = glob->appendRelations;
 	result->subplans = glob->subplans;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index e67f0e3509..5820f26fdb 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -270,6 +270,16 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 	 */
 	add_rtes_to_flat_rtable(root, false);
 
+	/*
+	 * Add the query's adjusted range of RT indexes to glob->minLockRelids.
+	 * The adjusted RT indexes of prunable relations will be deleted from the
+	 * set below where PartitionPruneInfos are processed.
+	 */
+	glob->minLockRelids =
+		bms_add_range(glob->minLockRelids,
+					  rtoffset + 1,
+					  rtoffset + list_length(root->parse->rtable));
+
 	/*
 	 * Adjust RT indexes of PlanRowMarks and add to final rowmarks list
 	 */
@@ -352,6 +362,7 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 	foreach (lc, root->partPruneInfos)
 	{
 		PartitionPruneInfo *pruneinfo = lfirst(lc);
+		Bitmapset *leafpart_rtis = NULL;
 		ListCell  *l;
 
 		pruneinfo->root_parent_relids =
@@ -364,15 +375,50 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 			foreach(l2, prune_infos)
 			{
 				PartitionedRelPruneInfo *pinfo = lfirst(l2);
+				int		i;
 
 				/* RT index of the table to which the pinfo belongs. */
 				pinfo->rtindex += rtoffset;
+
+				/* Also of the leaf partitions that might be scanned. */
+				for (i = 0; i < pinfo->nparts; i++)
+				{
+					if (pinfo->rti_map[i] > 0 && pinfo->subplan_map[i] >= 0)
+					{
+						pinfo->rti_map[i] += rtoffset;
+						leafpart_rtis = bms_add_member(leafpart_rtis,
+													   pinfo->rti_map[i]);
+					}
+				}
 			}
 		}
 
+		if (pruneinfo->needs_init_pruning)
+		{
+			glob->containsInitialPruning = true;
+
+			/*
+			 * Delete the leaf partition RTIs from the global set of relations
+			 * to be locked before executing the plan.  AcquireExecutorLocks()
+			 * will find the ones to add to the set after performing initial
+			 * pruning.
+			 */
+			glob->minLockRelids = bms_del_members(glob->minLockRelids,
+												  leafpart_rtis);
+		}
+
 		glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
 	}
 
+	/*
+	 * It seems worth doing a bms_copy() on glob->minLockRelids if we deleted
+	 * bits from it above to get rid of any empty tail bits.  It seems better
+	 * for the loop over this set in AcquireExecutorLocks() to not have to go
+	 * through those useless bit words.
+	 */
+	if (glob->containsInitialPruning)
+		glob->minLockRelids = bms_copy(glob->minLockRelids);
+
 	return result;
 }
 
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index d48f6784c1..d5556354f7 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -144,7 +144,9 @@ static List *make_partitionedrel_pruneinfo(PlannerInfo *root,
 										   List *prunequal,
 										   Bitmapset *partrelids,
 										   int *relid_subplan_map,
-										   Bitmapset **matchedsubplans);
+										   Bitmapset **matchedsubplans,
+										   bool *needs_init_pruning,
+										   bool *needs_exec_pruning);
 static void gen_partprune_steps(RelOptInfo *rel, List *clauses,
 								PartClauseTarget target,
 								GeneratePruningStepsContext *context);
@@ -234,6 +236,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	int		   *relid_subplan_map;
 	ListCell   *lc;
 	int			i;
+	bool		needs_init_pruning = false;
+	bool		needs_exec_pruning = false;
 
 	/*
 	 * Scan the subpaths to see which ones are scans of partition child
@@ -313,12 +317,16 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		Bitmapset  *partrelids = (Bitmapset *) lfirst(lc);
 		List	   *pinfolist;
 		Bitmapset  *matchedsubplans = NULL;
+		bool		partrel_needs_init_pruning;
+		bool		partrel_needs_exec_pruning;
 
 		pinfolist = make_partitionedrel_pruneinfo(root, parentrel,
 												  prunequal,
 												  partrelids,
 												  relid_subplan_map,
-												  &matchedsubplans);
+												  &matchedsubplans,
+												  &partrel_needs_init_pruning,
+												  &partrel_needs_exec_pruning);
 
 		/* When pruning is possible, record the matched subplans */
 		if (pinfolist != NIL)
@@ -327,6 +335,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 			allmatchedsubplans = bms_join(matchedsubplans,
 										  allmatchedsubplans);
 		}
+
+		needs_init_pruning |= partrel_needs_init_pruning;
+		needs_exec_pruning |= partrel_needs_exec_pruning;
 	}
 
 	pfree(relid_subplan_map);
@@ -342,6 +353,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	pruneinfo = makeNode(PartitionPruneInfo);
 	pruneinfo->root_parent_relids = parentrel->relids;
 	pruneinfo->prune_infos = prunerelinfos;
+	pruneinfo->needs_init_pruning = needs_init_pruning;
+	pruneinfo->needs_exec_pruning = needs_exec_pruning;
 
 	/*
 	 * Some subplans may not belong to any of the identified partitioned rels.
@@ -442,13 +455,18 @@ add_part_relids(List *allpartrelids, Bitmapset *partrelids)
  * If we cannot find any useful run-time pruning steps, return NIL.
  * However, on success, each rel identified in partrelids will have
  * an element in the result list, even if some of them are useless.
+ * *needs_init_pruning and *needs_exec_pruning are set to indicate that the
+ * returned PartitionedRelPruneInfos contain pruning steps that can be
+ * performed before and after execution begins, respectively.
  */
 static List *
 make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 							  List *prunequal,
 							  Bitmapset *partrelids,
 							  int *relid_subplan_map,
-							  Bitmapset **matchedsubplans)
+							  Bitmapset **matchedsubplans,
+							  bool *needs_init_pruning,
+							  bool *needs_exec_pruning)
 {
 	RelOptInfo *targetpart = NULL;
 	List	   *pinfolist = NIL;
@@ -459,6 +477,10 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	int			rti;
 	int			i;
 
+	/* Will find out below. */
+	*needs_init_pruning = false;
+	*needs_exec_pruning = false;
+
 	/*
 	 * Examine each partitioned rel, constructing a temporary array to map
 	 * from planner relids to index of the partitioned rel, and building a
@@ -546,6 +568,9 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		 * executor per-scan pruning steps.  This first pass creates startup
 		 * pruning steps and detects whether there's any possibly-useful quals
 		 * that would require per-scan pruning.
+		 *
+		 * In the first pass, we note whether the 2nd pass is necessary by
+		 * noting the presence of EXEC parameters.
 		 */
 		gen_partprune_steps(subpart, partprunequal, PARTTARGET_INITIAL,
 							&context);
@@ -620,6 +645,12 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		pinfo->execparamids = execparamids;
 		/* Remaining fields will be filled in the next loop */
 
+		/* record which types of pruning steps we've seen so far */
+		if (initial_pruning_steps != NIL)
+			*needs_init_pruning = true;
+		if (exec_pruning_steps != NIL)
+			*needs_exec_pruning = true;
+
 		pinfolist = lappend(pinfolist, pinfo);
 	}
 
@@ -647,6 +678,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		int		   *subplan_map;
 		int		   *subpart_map;
 		Oid		   *relid_map;
+		Index	   *rti_map;
 
 		/*
 		 * Construct the subplan and subpart maps for this partitioning level.
@@ -659,6 +691,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		subpart_map = (int *) palloc(nparts * sizeof(int));
 		memset(subpart_map, -1, nparts * sizeof(int));
 		relid_map = (Oid *) palloc0(nparts * sizeof(Oid));
+		rti_map = (Index *) palloc0(nparts * sizeof(Index));
 		present_parts = NULL;
 
 		i = -1;
@@ -673,6 +706,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 			subplan_map[i] = subplanidx = relid_subplan_map[partrel->relid] - 1;
 			subpart_map[i] = subpartidx = relid_subpart_map[partrel->relid] - 1;
 			relid_map[i] = planner_rt_fetch(partrel->relid, root)->relid;
+			rti_map[i] = partrel->relid;
 			if (subplanidx >= 0)
 			{
 				present_parts = bms_add_member(present_parts, i);
@@ -697,6 +731,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		pinfo->subplan_map = subplan_map;
 		pinfo->subpart_map = subpart_map;
 		pinfo->relid_map = relid_map;
+		pinfo->rti_map = rti_map;
 	}
 
 	pfree(relid_subpart_map);
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 3082093d1e..95ab1d0eef 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1598,6 +1598,7 @@ exec_bind_message(StringInfo input_message)
 	int16	   *rformats = NULL;
 	CachedPlanSource *psrc;
 	CachedPlan *cplan;
+	List	   *part_prune_results_list;
 	Portal		portal;
 	char	   *query_string;
 	char	   *saved_stmt_name;
@@ -1972,7 +1973,9 @@ exec_bind_message(StringInfo input_message)
 	 * will be generated in MessageContext.  The plan refcount will be
 	 * assigned to the Portal, so it will be released at portal destruction.
 	 */
-	cplan = GetCachedPlan(psrc, params, NULL, NULL);
+	cplan = GetCachedPlan(psrc, params, NULL, NULL, &part_prune_results_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_results_list));
 
 	/*
 	 * Now we can define the portal.
@@ -1987,6 +1990,9 @@ exec_bind_message(StringInfo input_message)
 					  cplan->stmt_list,
 					  cplan);
 
+	/* Copy Lists of PartitionPruneResults into the portal's context. */
+	PortalStorePartitionPruneResults(portal, part_prune_results_list);
+
 	/* Done with the snapshot used for parameter I/O and parsing/planning */
 	if (snapshot_set)
 		PopActiveSnapshot();
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 52e2db6452..280ed7d239 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -35,7 +35,7 @@
 Portal		ActivePortal = NULL;
 
 
-static void ProcessQuery(PlannedStmt *plan,
+static void ProcessQuery(PlannedStmt *plan, List *part_prune_results,
 						 const char *sourceText,
 						 ParamListInfo params,
 						 QueryEnvironment *queryEnv,
@@ -65,6 +65,7 @@ static void DoPortalRewind(Portal portal);
  */
 QueryDesc *
 CreateQueryDesc(PlannedStmt *plannedstmt,
+				List *part_prune_results,
 				const char *sourceText,
 				Snapshot snapshot,
 				Snapshot crosscheck_snapshot,
@@ -77,6 +78,8 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
 
 	qd->operation = plannedstmt->commandType;	/* operation */
 	qd->plannedstmt = plannedstmt;	/* plan */
+	qd->part_prune_results = part_prune_results; /* ExecutorDoInitialPruning()
+												  * output for plan */
 	qd->sourceText = sourceText;	/* query text */
 	qd->snapshot = RegisterSnapshot(snapshot);	/* snapshot */
 	/* RI check snapshot */
@@ -122,6 +125,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  *		PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
  *
  *	plan: the plan tree for the query
+ *	part_prune_results: ExecutorDoInitialPruning() output for the PlannedStmt
  *	sourceText: the source text of the query
  *	params: any parameters needed
  *	dest: where to send results
@@ -134,6 +138,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  */
 static void
 ProcessQuery(PlannedStmt *plan,
+			 List *part_prune_results,
 			 const char *sourceText,
 			 ParamListInfo params,
 			 QueryEnvironment *queryEnv,
@@ -145,7 +150,7 @@ ProcessQuery(PlannedStmt *plan,
 	/*
 	 * Create the QueryDesc object
 	 */
-	queryDesc = CreateQueryDesc(plan, sourceText,
+	queryDesc = CreateQueryDesc(plan, part_prune_results, sourceText,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, params, queryEnv, 0);
 
@@ -491,8 +496,13 @@ PortalStart(Portal portal, ParamListInfo params,
 				/*
 				 * Create QueryDesc in portal's context; for the moment, set
 				 * the destination to DestNone.
+				 *
+				 * There is no PartitionPruneResult unless the PlannedStmt is
+				 * from a CachedPlan.
 				 */
 				queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+											portal->part_prune_results_list == NIL ? NIL :
+											linitial(portal->part_prune_results_list),
 											portal->sourceText,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
@@ -1225,6 +1235,8 @@ PortalRunMulti(Portal portal,
 
 		if (pstmt->utilityStmt == NULL)
 		{
+			List *part_prune_results = NIL;
+
 			/*
 			 * process a plannable query.
 			 */
@@ -1271,10 +1283,18 @@ PortalRunMulti(Portal portal,
 			else
 				UpdateActiveSnapshotCommandId();
 
+			/*
+			 * Determine if there's a corresponding List of PartitionPruneResult
+			 * for this PlannedStmt.
+			 */
+			if (portal->part_prune_results_list != NIL)
+				part_prune_results = list_nth(portal->part_prune_results_list,
+											  foreach_current_index(stmtlist_item));
+
 			if (pstmt->canSetTag)
 			{
 				/* statement can set tag string */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, part_prune_results,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
@@ -1283,7 +1303,7 @@ PortalRunMulti(Portal portal,
 			else
 			{
 				/* stmt added by rewrite cannot set tag */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, part_prune_results,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index cc943205d3..af6fae6e3b 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -99,14 +99,19 @@ static dlist_head cached_expression_list = DLIST_STATIC_INIT(cached_expression_l
 static void ReleaseGenericPlan(CachedPlanSource *plansource);
 static List *RevalidateCachedQuery(CachedPlanSource *plansource,
 								   QueryEnvironment *queryEnv);
-static bool CheckCachedPlan(CachedPlanSource *plansource);
+static bool CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+							List **part_prune_results_list);
 static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
-								   ParamListInfo boundParams, QueryEnvironment *queryEnv);
+								   ParamListInfo boundParams, QueryEnvironment *queryEnv,
+								   List **part_prune_results_list);
 static bool choose_custom_plan(CachedPlanSource *plansource,
 							   ParamListInfo boundParams);
 static double cached_plan_cost(CachedPlan *plan, bool include_planner);
 static Query *QueryListGetPrimaryStmt(List *stmts);
-static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static void AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+								 List **part_prune_results_list,
+								 List **lockedRelids_per_stmt);
+static void ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt);
 static void AcquirePlannerLocks(List *stmt_list, bool acquire);
 static void ScanQueryForLocks(Query *parsetree, bool acquire);
 static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -782,6 +787,26 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
 	return tlist;
 }
 
+/*
+ * FreePartitionPruneResults
+ *		Frees the List of Lists of PartitionPruneResults for CheckCachedPlan()
+ */
+static void
+FreePartitionPruneResults(List *part_prune_results_list)
+{
+	ListCell *lc;
+
+	foreach(lc, part_prune_results_list)
+	{
+		List *part_prune_results = lfirst(lc);
+
+		/* Free both the PartitionPruneResults and the containing List. */
+		list_free_deep(part_prune_results);
+	}
+
+	list_free(part_prune_results_list);
+}
+
 /*
  * CheckCachedPlan: see if the CachedPlanSource's generic plan is valid.
  *
@@ -790,15 +815,20 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
  *
  * On a "true" return, we have acquired the locks needed to run the plan.
  * (We must do this for the "true" result to be race-condition-free.)
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_results_list.
  */
 static bool
-CheckCachedPlan(CachedPlanSource *plansource)
+CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+				List **part_prune_results_list)
 {
 	CachedPlan *plan = plansource->gplan;
 
 	/* Assert that caller checked the querytree */
 	Assert(plansource->is_valid);
 
+	*part_prune_results_list = NIL;
+
 	/* If there's no generic plan, just say "false" */
 	if (!plan)
 		return false;
@@ -820,13 +850,21 @@ CheckCachedPlan(CachedPlanSource *plansource)
 	 */
 	if (plan->is_valid)
 	{
+		List   *lockedRelids_per_stmt;
+
 		/*
 		 * Plan must have positive refcount because it is referenced by
 		 * plansource; so no need to fear it disappears under us here.
 		 */
 		Assert(plan->refcount > 0);
 
-		AcquireExecutorLocks(plan->stmt_list, true);
+		/*
+		 * Lock relations scanned by the plan.  This is where the pruning
+		 * happens if needed.
+		 */
+		AcquireExecutorLocks(plan->stmt_list, boundParams,
+							 part_prune_results_list,
+							 &lockedRelids_per_stmt);
 
 		/*
 		 * If plan was transient, check to see if TransactionXmin has
@@ -848,7 +886,11 @@ CheckCachedPlan(CachedPlanSource *plansource)
 		}
 
 		/* Oops, the race case happened.  Release useless locks. */
-		AcquireExecutorLocks(plan->stmt_list, false);
+		ReleaseExecutorLocks(plan->stmt_list, lockedRelids_per_stmt);
+
+		/* Release any PartitionPruneResults that may have been created. */
+		FreePartitionPruneResults(*part_prune_results_list);
+		*part_prune_results_list = NIL;
 	}
 
 	/*
@@ -874,10 +916,14 @@ CheckCachedPlan(CachedPlanSource *plansource)
  * Planning work is done in the caller's memory context.  The finished plan
  * is in a child memory context, which typically should get reparented
  * (unless this is a one-shot plan, in which case we don't copy the plan).
+ *
+ * A list of NILs is returned in *part_prune_results_list, meaning that no
+ * partition pruning has been done yet for the plans in stmt_list.
  */
 static CachedPlan *
 BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
-				ParamListInfo boundParams, QueryEnvironment *queryEnv)
+				ParamListInfo boundParams, QueryEnvironment *queryEnv,
+				List **part_prune_results_list)
 {
 	CachedPlan *plan;
 	List	   *plist;
@@ -1007,6 +1053,17 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 
 	MemoryContextSwitchTo(oldcxt);
 
+	/*
+	 * No actual PartitionPruneResults yet to add, though must initialize
+	 * the list to have the same number of elements as the list of
+	 * PlannedStmts.
+	 */
+	*part_prune_results_list = NIL;
+	foreach(lc, plist)
+	{
+		*part_prune_results_list = lappend(*part_prune_results_list, NIL);
+	}
+
 	return plan;
 }
 
@@ -1126,6 +1183,19 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
  * plan or a custom plan for the given parameters: the caller does not know
  * which it will get.
  *
+ * For every PlannedStmt found in the returned CachedPlan, an element that
+ * is either a List of PartitionPruneResult or NIL is added to
+ * *part_prune_results_list.  It is the former if the PlannedStmt is from
+ * the existing CachedPlan that is otherwise valid and has
+ * containsInitialPruning set to true.  Before returning such a CachedPlan,
+ * those "initial" steps are performed by calling ExecutorDoInitialPruning()
+ * to determine only those leaf partitions that need to be locked by
+ * AcquireExecutorLocks() by pruning away subplans that don't match the
+ * "initial" pruning conditions.  For each PartitionPruneInfo found in
+ * PlannedStmt.partPruneInfos, a PartitionPruneResult containing the bitmapset
+ * of the indexes of surviving subplans is added to the List for the
+ * PlannedStmt.
+ *
  * On return, the plan is valid and we have sufficient locks to begin
  * execution.
  *
@@ -1139,11 +1209,13 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
  */
 CachedPlan *
 GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
-			  ResourceOwner owner, QueryEnvironment *queryEnv)
+			  ResourceOwner owner, QueryEnvironment *queryEnv,
+			  List **part_prune_results_list)
 {
 	CachedPlan *plan = NULL;
 	List	   *qlist;
 	bool		customplan;
+	List	   *my_part_prune_results_list;
 
 	/* Assert caller is doing things in a sane order */
 	Assert(plansource->magic == CACHEDPLANSOURCE_MAGIC);
@@ -1160,7 +1232,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 
 	if (!customplan)
 	{
-		if (CheckCachedPlan(plansource))
+		if (CheckCachedPlan(plansource, boundParams,
+							&my_part_prune_results_list))
 		{
 			/* We want a generic plan, and we already have a valid one */
 			plan = plansource->gplan;
@@ -1169,7 +1242,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 		else
 		{
 			/* Build a new generic plan */
-			plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv);
+			plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv,
+								   &my_part_prune_results_list);
 			/* Just make real sure plansource->gplan is clear */
 			ReleaseGenericPlan(plansource);
 			/* Link the new generic plan into the plansource */
@@ -1214,7 +1288,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 	if (customplan)
 	{
 		/* Build a custom plan */
-		plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv);
+		plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv,
+							   &my_part_prune_results_list);
 		/* Accumulate total costs of custom plans */
 		plansource->total_custom_cost += cached_plan_cost(plan, true);
 
@@ -1246,6 +1321,9 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 		plan->is_saved = true;
 	}
 
+	if (part_prune_results_list)
+		*part_prune_results_list = my_part_prune_results_list;
+
 	return plan;
 }
 
@@ -1737,17 +1815,29 @@ QueryListGetPrimaryStmt(List *stmts)
 
 /*
  * AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
- * or release them if acquire is false.
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_results_list.
+ *
+ * On return, *lockedRelids_per_stmt will contain a bitmapset for every
+ * PlannedStmt in stmt_list, containing the RT indexes of relation entries
+ * in its range table that were actually locked, or NULL if the PlannedStmt
+ * contains a utility statement.
  */
 static void
-AcquireExecutorLocks(List *stmt_list, bool acquire)
+AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+					 List **part_prune_results_list,
+					 List **lockedRelids_per_stmt)
 {
 	ListCell   *lc1;
 
+	*part_prune_results_list = *lockedRelids_per_stmt = NIL;
 	foreach(lc1, stmt_list)
 	{
 		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
-		ListCell   *lc2;
+		List	   *part_prune_results = NIL;
+		Bitmapset  *allLockRelids;
+		Bitmapset  *lockedRelids = NULL;
+		int			rti;
 
 		if (plannedstmt->commandType == CMD_UTILITY)
 		{
@@ -1761,13 +1851,41 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 			Query	   *query = UtilityContainsQuery(plannedstmt->utilityStmt);
 
 			if (query)
-				ScanQueryForLocks(query, acquire);
+				ScanQueryForLocks(query, true);
+			*part_prune_results_list = lappend(*part_prune_results_list, NIL);
+			*lockedRelids_per_stmt = lappend(*lockedRelids_per_stmt, NULL);
 			continue;
 		}
 
-		foreach(lc2, plannedstmt->rtable)
+		/*
+		 * Figure out the set of relations that would need to be locked
+		 * before executing the plan.
+		 */
+		if (plannedstmt->containsInitialPruning)
 		{
-			RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
+			Bitmapset *scan_leafpart_rtis = NULL;
+
+			/*
+			 * Obtain the set of leaf partitions to be locked.
+			 *
+			 * The following does initial partition pruning using the
+			 * PartitionPruneInfos found in plannedstmt->partPruneInfos and
+			 * finds leaf partitions that survive that pruning across all the
+			 * nodes in the plan tree.
+			 */
+			part_prune_results = ExecutorDoInitialPruning(plannedstmt,
+														  boundParams,
+														  &scan_leafpart_rtis);
+			allLockRelids = bms_union(plannedstmt->minLockRelids,
+									  scan_leafpart_rtis);
+		}
+		else
+			allLockRelids = plannedstmt->minLockRelids;
+
+		rti = -1;
+		while ((rti = bms_next_member(allLockRelids, rti)) > 0)
+		{
+			RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
 
 			if (rte->rtekind != RTE_RELATION)
 				continue;
@@ -1778,10 +1895,59 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 			 * fail if it's been dropped entirely --- we'll just transiently
 			 * acquire a non-conflicting lock.
 			 */
-			if (acquire)
-				LockRelationOid(rte->relid, rte->rellockmode);
-			else
-				UnlockRelationOid(rte->relid, rte->rellockmode);
+			LockRelationOid(rte->relid, rte->rellockmode);
+			lockedRelids = bms_add_member(lockedRelids, rti);
+		}
+
+		*part_prune_results_list = lappend(*part_prune_results_list,
+										   part_prune_results);
+		*lockedRelids_per_stmt = lappend(*lockedRelids_per_stmt, lockedRelids);
+	}
+}
+
+/*
+ * ReleaseExecutorLocks
+ * 		Release locks that would've been acquired by an earlier call to
+ * 		AcquireExecutorLocks()
+ */
+static void
+ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt)
+{
+	ListCell   *lc1,
+			   *lc2;
+
+	forboth(lc1, stmt_list, lc2, lockedRelids_per_stmt)
+	{
+		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
+		Bitmapset  *lockedRelids = lfirst(lc2);
+		int			rti;
+
+		if (plannedstmt->commandType == CMD_UTILITY)
+		{
+			/*
+			 * Ignore utility statements, except those (such as EXPLAIN) that
+			 * contain a parsed-but-not-planned query.  Note: it's okay to use
+			 * ScanQueryForLocks, even though the query hasn't been through
+			 * rule rewriting, because rewriting doesn't change the query
+			 * representation.
+			 */
+			Query	   *query = UtilityContainsQuery(plannedstmt->utilityStmt);
+
+			Assert(lockedRelids == NULL);
+			if (query)
+				ScanQueryForLocks(query, false);
+			continue;
+		}
+
+		rti = -1;
+		while ((rti = bms_next_member(lockedRelids, rti)) >= 0)
+		{
+			RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
+
+			Assert(rte->rtekind == RTE_RELATION);
+
+			/* See the comment in AcquireExecutorLocks(). */
+			UnlockRelationOid(rte->relid, rte->rellockmode);
 		}
 	}
 }
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index 7b1ae6fdcf..5b9098971b 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -303,6 +303,25 @@ PortalDefineQuery(Portal portal,
 	portal->status = PORTAL_DEFINED;
 }
 
+/*
+ * PortalStorePartitionPruneResults
+ *		Copy the given List of Lists of PartitionPruneResults into the
+ *		portal's context
+ *
+ * This allows the caller to ensure that the list exists as long as the portal
+ * does.
+ */
+void
+PortalStorePartitionPruneResults(Portal portal, List *part_prune_results_list)
+{
+	MemoryContext	oldcxt;
+
+	Assert(PortalIsValid(portal));
+	oldcxt = MemoryContextSwitchTo(portal->portalContext);
+	portal->part_prune_results_list = copyObject(part_prune_results_list);
+	MemoryContextSwitchTo(oldcxt);
+}
+
 /*
  * PortalReleaseCachedPlan
  *		Release a portal's reference to its cached plan, if any.
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 9ebde089ae..269cc4d562 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -87,7 +87,9 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
 							  ExplainState *es, const char *queryString,
 							  ParamListInfo params, QueryEnvironment *queryEnv);
 
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt,
+						   List *part_prune_results,
+						   IntoClause *into,
 						   ExplainState *es, const char *queryString,
 						   ParamListInfo params, QueryEnvironment *queryEnv,
 						   const instr_time *planduration,
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 17fabc18c9..4b98d0d2ef 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -45,6 +45,7 @@ extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
  * nparts						Length of subplan_map[] and subpart_map[].
  * subplan_map					Subplan index by partition index, or -1.
  * subpart_map					Subpart index by partition index, or -1.
+ * rti_map						Range table index by partition index, or 0.
  * present_parts				A Bitmapset of the partition indexes that we
  *								have subplans or subparts for.
  * initial_pruning_steps		List of PartitionPruneSteps used to
@@ -61,6 +62,7 @@ typedef struct PartitionedRelPruningData
 	int			nparts;
 	int		   *subplan_map;
 	int		   *subpart_map;
+	Index	   *rti_map;
 	Bitmapset  *present_parts;
 	List	   *initial_pruning_steps;
 	List	   *exec_pruning_steps;
@@ -127,5 +129,10 @@ extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
 													 Bitmapset *root_parent_relids,
 													 Bitmapset **initially_valid_subplans);
 extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
-										   bool initial_prune);
+										   bool initial_prune,
+										   Bitmapset **scan_leafpart_rtis);
+extern Bitmapset *ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt,
+								ParamListInfo params,
+								PartitionPruneInfo *pruneinfo,
+								Bitmapset **scan_leafpart_rtis);
 #endif							/* EXECPARTITION_H */
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index e79e2c001f..7d4379da7b 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -35,6 +35,8 @@ typedef struct QueryDesc
 	/* These fields are provided by CreateQueryDesc */
 	CmdType		operation;		/* CMD_SELECT, CMD_UPDATE, etc. */
 	PlannedStmt *plannedstmt;	/* planner's output (could be utility, too) */
+	List		*part_prune_results; /* ExecutorDoInitialPruning()'s
+									  * output for plannedstmt */
 	const char *sourceText;		/* source text of the query */
 	Snapshot	snapshot;		/* snapshot to use for query */
 	Snapshot	crosscheck_snapshot;	/* crosscheck for RI update/delete */
@@ -57,6 +59,7 @@ typedef struct QueryDesc
 
 /* in pquery.c */
 extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
+								  List *part_prune_results,
 								  const char *sourceText,
 								  Snapshot snapshot,
 								  Snapshot crosscheck_snapshot,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index ed95ed1176..c9a5e5fb68 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -185,6 +185,9 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
 /*
  * prototypes from functions in execMain.c
  */
+extern List *ExecutorDoInitialPruning(PlannedStmt *plannedstmt,
+									  ParamListInfo params,
+									  Bitmapset **scan_leafpart_rtis);
 extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
 extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
 extern void ExecutorRun(QueryDesc *queryDesc,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index a2008846c6..369de42caf 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -615,6 +615,7 @@ typedef struct EState
 										 * ExecRowMarks, or NULL if none */
 	PlannedStmt *es_plannedstmt;	/* link to top of plan tree */
 	List		*es_part_prune_infos;	/* PlannedStmt.partPruneInfos */
+	List		*es_part_prune_results; /* QueryDesc.part_prune_results */
 	const char *es_sourceText;	/* Source text from QueryDesc */
 
 	JunkFilter *es_junkFilter;	/* top-level junk filter, if any */
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index a80f43e540..937cc4629d 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -212,6 +212,7 @@ extern struct Bitmapset *readBitmapset(void);
 extern uintptr_t readDatum(bool typbyval);
 extern bool *readBoolCols(int numCols);
 extern int *readIntCols(int numCols);
+extern Index *readIndexCols(int numCols);
 extern Oid *readOidCols(int numCols);
 extern int16 *readAttrNumberCols(int numCols);
 
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index dd4eb8679d..36abe4cf9e 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -125,6 +125,18 @@ typedef struct PlannerGlobal
 	/* List of PartitionPruneInfo contained in the plan */
 	List	   *partPruneInfos;
 
+	/*
+	 * Do any of those PartitionPruneInfos have initial pruning steps in them?
+	 */
+	bool		containsInitialPruning;
+
+	/*
+	 * Indexes of all range table entries minus indexes of range table entries
+	 * of the leaf partitions scanned by prunable subplans; see
+	 * AcquireExecutorLocks()
+	 */
+	Bitmapset  *minLockRelids;
+
 	/* OIDs of relations the plan depends on */
 	List	   *relationOids;
 
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 2e202892a7..0cab6958d7 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -73,8 +73,17 @@ typedef struct PlannedStmt
 	List	   *partPruneInfos;	/* List of PartitionPruneInfo contained in
 								 * the plan */
 
+	bool		containsInitialPruning;	/* Do any of those PartitionPruneInfos
+										 * have initial pruning steps in them?
+										 */
+
 	List	   *rtable;			/* list of RangeTblEntry nodes */
 
+	Bitmapset  *minLockRelids;	/* Indexes of all range table entries minus
+								 * indexes of range table entries of the leaf
+								 * partitions scanned by prunable subplans;
+								 * see AcquireExecutorLocks() */
+
 	/* rtable indexes of target relations for INSERT/UPDATE/DELETE/MERGE */
 	List	   *resultRelations;	/* integer list of RT indexes, or NIL */
 
@@ -1414,6 +1423,13 @@ typedef struct PlanRowMark
  * prune_infos			List of Lists containing PartitionedRelPruneInfo nodes,
  *						one sublist per run-time-prunable partition hierarchy
  *						appearing in the parent plan node's subplans.
+ *
+ * needs_init_pruning	Does any of the PartitionedRelPruneInfos in
+ *						prune_infos have its initial_pruning_steps set?
+ *
+ * needs_exec_pruning	Does any of the PartitionedRelPruneInfos in
+ *						prune_infos have its exec_pruning_steps set?
+ *
  * other_subplans		Indexes of any subplans that are not accounted for
  *						by any of the PartitionedRelPruneInfo nodes in
  *						"prune_infos".  These subplans must not be pruned.
@@ -1425,6 +1441,8 @@ typedef struct PartitionPruneInfo
 	NodeTag		type;
 	Bitmapset  *root_parent_relids;
 	List	   *prune_infos;
+	bool		needs_init_pruning;
+	bool		needs_exec_pruning;
 	Bitmapset  *other_subplans;
 } PartitionPruneInfo;
 
@@ -1469,6 +1487,9 @@ typedef struct PartitionedRelPruneInfo
 	/* relation OID by partition index, or 0 */
 	Oid		   *relid_map pg_node_attr(array_size(nparts));
 
+	/* Range table index by partition index, or 0. */
+	Index	   *rti_map pg_node_attr(array_size(nparts));
+
 	/*
 	 * initial_pruning_steps shows how to prune during executor startup (i.e.,
 	 * without use of any PARAM_EXEC Params); it is NIL if no startup pruning
@@ -1553,6 +1574,31 @@ typedef struct PartitionPruneStepCombine
 	List	   *source_stepids;
 } PartitionPruneStepCombine;
 
+/*----------------
+ * PartitionPruneResult
+ *
+ * The result of performing ExecPartitionDoInitialPruning() on a given
+ * PartitionPruneInfo.
+ *
+ * valid_subplan_offs contains the indexes of subplans remaining after
+ * performing initial pruning by calling ExecFindMatchingSubPlans() on the
+ * PartitionPruneInfo.
+ *
+ * This is used to store the result of initial partition pruning that is
+ * performed before the execution has started.  A module that needs to do so
+ * should call ExecutorDoInitialPruning() on a given PlannedStmt, which
+ * returns a List of PartitionPruneResult containing an entry for each
+ * PartitionPruneInfo present in PlannedStmt.partPruneInfos.  The module
+ * should then pass that list, along with the PlannedStmt, to the executor,
+ * so that it can reuse the result of initial partition pruning when
+ * initializing the subplans for execution.
+ */
+typedef struct PartitionPruneResult
+{
+	NodeTag		type;
+
+	Bitmapset	   *valid_subplan_offs;
+} PartitionPruneResult;
 
 /*
  * Plan invalidation info
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 0499635f59..32579d4788 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -220,7 +220,8 @@ extern List *CachedPlanGetTargetList(CachedPlanSource *plansource,
 extern CachedPlan *GetCachedPlan(CachedPlanSource *plansource,
 								 ParamListInfo boundParams,
 								 ResourceOwner owner,
-								 QueryEnvironment *queryEnv);
+								 QueryEnvironment *queryEnv,
+								 List **part_prune_results_list);
 extern void ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner);
 
 extern bool CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index aeddbdafe5..1901fc5f28 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -138,6 +138,7 @@ typedef struct PortalData
 	QueryCompletion qc;			/* command completion data for executed query */
 	List	   *stmts;			/* list of PlannedStmts */
 	CachedPlan *cplan;			/* CachedPlan, if stmts are from one */
+	List	   *part_prune_results_list;	/* List of Lists of PartitionPruneResults */
 
 	ParamListInfo portalParams; /* params to pass to query */
 	QueryEnvironment *queryEnv; /* environment for query */
@@ -242,6 +243,8 @@ extern void PortalDefineQuery(Portal portal,
 							  CommandTag commandTag,
 							  List *stmts,
 							  CachedPlan *cplan);
+extern void PortalStorePartitionPruneResults(Portal portal,
+											 List *part_prune_results_list);
 extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
 extern void PortalCreateHoldStore(Portal portal);
 extern void PortalHashTableDeleteAll(void);
-- 
2.35.3




* Re: generic plans and "initial" pruning
@ 2022-12-05 03:00  Amit Langote <[email protected]>
  parent: Amit Langote <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: Amit Langote @ 2022-12-05 03:00 UTC (permalink / raw)
  To: Alvaro Herrera <[email protected]>; +Cc: Robert Haas <[email protected]>; Jacob Champion <[email protected]>; David Rowley <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers

On Fri, Dec 2, 2022 at 7:40 PM Amit Langote <[email protected]> wrote:
> On Thu, Dec 1, 2022 at 9:43 PM Amit Langote <[email protected]> wrote:
> > On Thu, Dec 1, 2022 at 8:21 PM Alvaro Herrera <[email protected]> wrote:
> > > On 2022-Dec-01, Amit Langote wrote:
> > > > Hmm, how about keeping the [Merge]Append's parent relation's RT index
> > > > in the PartitionPruneInfo and passing it down to
> > > > ExecInitPartitionPruning() from ExecInit[Merge]Append() for
> > > > cross-checking?  Both Append and MergeAppend already have a
> > > > 'apprelids' field that we can save a copy of in the
> > > > PartitionPruneInfo.  Tried that in the attached delta patch.
> > >
> > > Ah yeah, that sounds about what I was thinking.  I've merged that in and
> > > pushed to github, which had a strange pg_upgrade failure on Windows
> > > mentioning log files that were not captured by the CI tooling.  So I
> > > pushed another one trying to grab those files, in case it wasn't a
> > > one-off failure.  It's running now:
> > >   https://cirrus-ci.com/task/5857239638999040
> > >
> > > If all goes well with this run, I'll get this 0001 pushed.
> >
> > Thanks for pushing 0001.
> >
> > Rebased 0002 attached.
>
> Thought it might be good for PartitionPruneResult to also have
> root_parent_relids that matches with the corresponding
> PartitionPruneInfo.  ExecInitPartitionPruning() does a sanity check
> that the root_parent_relids of a given pair of PartitionPrune{Info |
> Result} match.
>
> Posting the patch separately as the attached 0002, just in case you
> might think that the extra cross-checking would be overkill.

Rebased over 92c4dafe1eed and fixed some factual mistakes in the
comment above ExecutorDoInitialPruning().
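
As an illustration of the cross-check described in the quoted message
above, here is a minimal, self-contained sketch.  It is not code from the
patch: the Sketch* types and sketch_prune_result_matches() are
hypothetical names, and a 64-bit mask stands in for the Bitmapset that
the real root_parent_relids fields use.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-ins for PartitionPruneInfo and PartitionPruneResult.
 * The real nodes store these sets as Bitmapset pointers; a 64-bit mask is
 * used here only to keep the sketch self-contained.
 */
typedef struct SketchPruneInfo
{
	uint64_t	root_parent_relids; /* RT indexes of the parent rel(s) */
} SketchPruneInfo;

typedef struct SketchPruneResult
{
	uint64_t	root_parent_relids; /* copied from the matching info */
	uint64_t	valid_subplan_offs; /* subplans surviving initial pruning */
} SketchPruneResult;

/* True only if the result was computed for the same parent rel(s) as info. */
static bool
sketch_prune_result_matches(const SketchPruneInfo *info,
							const SketchPruneResult *result)
{
	return info->root_parent_relids == result->root_parent_relids;
}
```

A caller in the position of ExecInitPartitionPruning() would presumably
report an error on a mismatch rather than reuse the stored result.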

-- 
Thanks, Amit Langote
EDB: http://www.enterprisedb.com


Attachments:

  [application/octet-stream] v27-0001-Optimize-AcquireExecutorLocks-by-locking-only-un.patch (82.9K, 2-v27-0001-Optimize-AcquireExecutorLocks-by-locking-only-un.patch)
From 6c4cf0b0a03bfac62e87f76bb3be9c1e62125a0c Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Wed, 22 Dec 2021 16:55:17 +0900
Subject: [PATCH v27 1/2] Optimize AcquireExecutorLocks() by locking only
 unpruned partitions

This commit teaches AcquireExecutorLocks() to perform initial
partition pruning to notionally eliminate the subnodes contained in a
generic cached plan that need not be initialized during the actual
execution of the plan, and to skip locking the partitions scanned by
those subnodes.

The result of performing initial partition pruning this way, before the
actual execution has started, is passed to the executor as a
PartitionPruneResult, which the callers of the executor that used
plancache.c to get the plan supply along with the PlannedStmt.  It is NULL
in the cases in which the plan is obtained by calling the planner
directly or if the plan obtained by plancache.c is not a generic one.
---
 src/backend/commands/copyto.c          |   2 +-
 src/backend/commands/createas.c        |   2 +-
 src/backend/commands/explain.c         |   7 +-
 src/backend/commands/extension.c       |   2 +-
 src/backend/commands/matview.c         |   2 +-
 src/backend/commands/prepare.c         |  26 ++-
 src/backend/executor/README            |  36 ++++
 src/backend/executor/execMain.c        |  53 ++++++
 src/backend/executor/execParallel.c    |  26 ++-
 src/backend/executor/execPartition.c   | 238 +++++++++++++++++++++----
 src/backend/executor/execUtils.c       |   1 +
 src/backend/executor/functions.c       |   2 +-
 src/backend/executor/nodeAppend.c      |  11 +-
 src/backend/executor/nodeMergeAppend.c |   5 +-
 src/backend/executor/spi.c             |  27 ++-
 src/backend/nodes/readfuncs.c          |   8 +-
 src/backend/optimizer/plan/planner.c   |   2 +
 src/backend/optimizer/plan/setrefs.c   |  46 +++++
 src/backend/partitioning/partprune.c   |  41 ++++-
 src/backend/tcop/postgres.c            |   8 +-
 src/backend/tcop/pquery.c              |  28 ++-
 src/backend/utils/cache/plancache.c    | 208 ++++++++++++++++++---
 src/backend/utils/mmgr/portalmem.c     |  19 ++
 src/include/commands/explain.h         |   4 +-
 src/include/executor/execPartition.h   |   9 +-
 src/include/executor/execdesc.h        |   3 +
 src/include/executor/executor.h        |   3 +
 src/include/nodes/execnodes.h          |   1 +
 src/include/nodes/nodes.h              |   1 +
 src/include/nodes/pathnodes.h          |  12 ++
 src/include/nodes/plannodes.h          |  46 +++++
 src/include/utils/plancache.h          |   3 +-
 src/include/utils/portal.h             |   3 +
 33 files changed, 787 insertions(+), 98 deletions(-)

diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index f26cc0d162..401a2280a3 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -558,7 +558,7 @@ BeginCopyTo(ParseState *pstate,
 		((DR_copy *) dest)->cstate = cstate;
 
 		/* Create a QueryDesc requesting no output */
-		cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+		cstate->queryDesc = CreateQueryDesc(plan, NIL, pstate->p_sourcetext,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
 											dest, NULL, NULL, 0);
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 152c29b551..942449544c 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -325,7 +325,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
 		UpdateActiveSnapshotCommandId();
 
 		/* Create a QueryDesc, redirecting output to our tuple receiver */
-		queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+		queryDesc = CreateQueryDesc(plan, NIL, pstate->p_sourcetext,
 									GetActiveSnapshot(), InvalidSnapshot,
 									dest, params, queryEnv, 0);
 
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index f86983c660..2f2b558608 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -407,7 +407,7 @@ ExplainOneQuery(Query *query, int cursorOptions,
 		}
 
 		/* run it (if needed) and produce output */
-		ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+		ExplainOnePlan(plan, NIL, into, es, queryString, params, queryEnv,
 					   &planduration, (es->buffers ? &bufusage : NULL));
 	}
 }
@@ -515,7 +515,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
  * to call it.
  */
 void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, List *part_prune_results,
+			   IntoClause *into, ExplainState *es,
 			   const char *queryString, ParamListInfo params,
 			   QueryEnvironment *queryEnv, const instr_time *planduration,
 			   const BufferUsage *bufusage)
@@ -563,7 +564,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
 		dest = None_Receiver;
 
 	/* Create a QueryDesc for the query */
-	queryDesc = CreateQueryDesc(plannedstmt, queryString,
+	queryDesc = CreateQueryDesc(plannedstmt, part_prune_results, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, params, queryEnv, instrument_option);
 
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index cf1b1ca571..904cbcba4a 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -779,7 +779,7 @@ execute_sql_string(const char *sql)
 			{
 				QueryDesc  *qdesc;
 
-				qdesc = CreateQueryDesc(stmt,
+				qdesc = CreateQueryDesc(stmt, NIL,
 										sql,
 										GetActiveSnapshot(), NULL,
 										dest, NULL, NULL, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 9ac0383459..65c8d0aa59 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -408,7 +408,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
 	UpdateActiveSnapshotCommandId();
 
 	/* Create a QueryDesc, redirecting output to our tuple receiver */
-	queryDesc = CreateQueryDesc(plan, queryString,
+	queryDesc = CreateQueryDesc(plan, NIL, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, NULL, NULL, 0);
 
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 9e29584d93..29b45539d3 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -155,6 +155,7 @@ ExecuteQuery(ParseState *pstate,
 	PreparedStatement *entry;
 	CachedPlan *cplan;
 	List	   *plan_list;
+	List	   *part_prune_results_list;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
 	Portal		portal;
@@ -193,7 +194,10 @@ ExecuteQuery(ParseState *pstate,
 									   entry->plansource->query_string);
 
 	/* Replan if needed, and increment plan refcount for portal */
-	cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL);
+	cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL,
+						  &part_prune_results_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_results_list));
 	plan_list = cplan->stmt_list;
 
 	/*
@@ -207,6 +211,9 @@ ExecuteQuery(ParseState *pstate,
 					  plan_list,
 					  cplan);
 
+	/* Copy Lists of PartitionPruneResults into the portal's context. */
+	PortalStorePartitionPruneResults(portal, part_prune_results_list);
+
 	/*
 	 * For CREATE TABLE ... AS EXECUTE, we must verify that the prepared
 	 * statement is one that produces tuples.  Currently we insist that it be
@@ -576,7 +583,9 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	const char *query_string;
 	CachedPlan *cplan;
 	List	   *plan_list;
-	ListCell   *p;
+	List	   *part_prune_results_list;
+	ListCell   *p,
+			   *pp;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
 	instr_time	planstart;
@@ -619,7 +628,10 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 
 	/* Replan if needed, and acquire a transient refcount */
 	cplan = GetCachedPlan(entry->plansource, paramLI,
-						  CurrentResourceOwner, queryEnv);
+						  CurrentResourceOwner, queryEnv,
+						  &part_prune_results_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_results_list));
 
 	INSTR_TIME_SET_CURRENT(planduration);
 	INSTR_TIME_SUBTRACT(planduration, planstart);
@@ -634,13 +646,15 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	plan_list = cplan->stmt_list;
 
 	/* Explain each query */
-	foreach(p, plan_list)
+	forboth(p, plan_list, pp, part_prune_results_list)
 	{
 		PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
+		List *part_prune_results = lfirst_node(List, pp);
 
 		if (pstmt->commandType != CMD_UTILITY)
-			ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv,
-						   &planduration, (es->buffers ? &bufusage : NULL));
+			ExplainOnePlan(pstmt, part_prune_results, into, es, query_string,
+						   paramLI, queryEnv, &planduration,
+						   (es->buffers ? &bufusage : NULL));
 		else
 			ExplainOneUtility(pstmt->utilityStmt, into, es, query_string,
 							  paramLI, queryEnv);
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 17775a49e2..7f8cf1494f 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -65,6 +65,38 @@ found there.  This currently only occurs for Append and MergeAppend nodes.  In
 this case the non-required subplans are ignored and the executor state's
 subnode array will become out of sequence to the plan's subplan list.
 
+The so-called execution time pruning may also occur even before the execution
+has actually started.  One case where that occurs is when a cached generic
+plan is being validated for execution by plancache.c:GetCachedPlan(), which
+works by locking all the relations that will be scanned by that plan.  If the
+generic plan contains nodes that can perform execution time partition pruning
+(that is, contain a PartitionPruneInfo), a subset of pruning steps contained
+in a given node's PartitionPruneInfo that do not depend on the execution
+actually having started (called "initial" pruning steps) are performed as part
+of the plan validation step, by calling ExecutorDoInitialPruning().  That
+returns the minimal set of child subplans that satisfy those initial pruning
+steps contained in each PartitionPruneInfo.  AcquireExecutorLocks() will then
+lock only the relations scanned by those subplans, in addition to those present
+in PlannedStmt.minLockRelids.  Note that the pruned subplans are not actually
+removed from the plan tree, so care is needed by the downstream users of a
+plan that has undergone pre-execution initial pruning.
+
+To prevent the executor and any third party execution code that can look at
+the plan tree from trying to execute the subplans that were pruned as
+described above, the result of that pruning is passed to the executor as a
+List of PartitionPruneResult nodes via the QueryDesc, which is subsequently
+assigned to EState.es_part_prune_results.  Each PartitionPruneResult therein
+consists of the set of indexes of surviving subplans in the respective parent
+plan node's (the one to which the corresponding PartitionPruneInfo belongs)
+list of child subplans, saved as a bitmapset valid_subplan_offs.  The executor
+or any third party execution code working on a generic plan should not
+re-evaluate the set of initially valid subplans for a given plan node by
+redoing the initial pruning if a PartitionPruneResult belonging to that plan
+node is present in es_part_prune_results.  Note that that is not simply a
+performance optimization, because such re-evaluation of the pruning steps may
+very well end up resulting in a different set of initially valid subplans,
+containing some whose relations were not locked by AcquireExecutorLocks().
+
 Each Plan node may have expression trees associated with it, to represent
 its target list, qualification conditions, etc.  These trees are also
 read-only to the executor, but the executor state for expression evaluation
@@ -286,6 +318,10 @@ Query Processing Control Flow
 
 This is a sketch of control flow for full query processing:
 
+  [ ExecutorDoInitialPruning ] --- an optional step to perform initial
+		partition pruning on the plan tree, the result of which is passed
+		to the executor via QueryDesc
+
 	CreateQueryDesc
 
 	ExecutorStart
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 12ff4f3de5..4d8c8e2e43 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -49,6 +49,7 @@
 #include "commands/matview.h"
 #include "commands/trigger.h"
 #include "executor/execdebug.h"
+#include "executor/execPartition.h"
 #include "executor/nodeSubplan.h"
 #include "foreign/fdwapi.h"
 #include "jit/jit.h"
@@ -104,6 +105,56 @@ static void EvalPlanQualStart(EPQState *epqstate, Plan *planTree);
 
 /* end of local decls */
 
+/* ----------------------------------------------------------------
+ *		ExecutorDoInitialPruning
+ *
+ *		For each plan tree node that has been assigned a PartitionPruneInfo,
+ *		this performs initial partition pruning using the information contained
+ *		therein to determine the set of child subplans that satisfy the initial
+ *		pruning steps, to be returned as a bitmapset of their indexes in the
+ *		node's list of child subplans (for example, an Append's appendplans).
+ *
+ * Return value is a List of PartitionPruneResult nodes, one for each
+ * PartitionPruneInfo found in plannedstmt->partPruneInfos, each
+ * containing a bitmapset of the indexes of unpruned child subplans.
+ * A bitmapset of the RT indexes of the leaf partitions scanned by those
+ * subplans is returned in *scan_leafpart_rtis, which is shared across all
+ * of those PartitionPruneResults.
+ *
+ * The executor must see exactly the same set of subplans as valid for
+ * execution when doing ExecInitNode() on the plan nodes whose
+ * PartitionPruneInfos are processed here.  So, it must get the set from the
+ * aforementioned PartitionPruneResult, instead of computing it all over
+ * again by redoing the initial pruning.  It's the caller's job to pass the
+ * PartitionPruneResult to the executor.
+ *
+ * Note: Partitioned tables mentioned in PartitionedRelPruneInfo nodes that
+ * drive the pruning will be locked before doing the pruning.
+ * ----------------------------------------------------------------
+ */
+List *
+ExecutorDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
+						 Bitmapset **scan_leafpart_rtis)
+{
+	List	 *part_prune_results = NIL;
+	ListCell *lc;
+
+	/* Only get here if there is any pruning to do. */
+	Assert(plannedstmt->containsInitialPruning);
+
+	foreach(lc, plannedstmt->partPruneInfos)
+	{
+		PartitionPruneInfo *pruneinfo = lfirst(lc);
+		PartitionPruneResult *pruneresult = makeNode(PartitionPruneResult);
+
+		pruneresult->valid_subplan_offs =
+			ExecPartitionDoInitialPruning(plannedstmt, params, pruneinfo,
+										  scan_leafpart_rtis);
+		part_prune_results = lappend(part_prune_results, pruneresult);
+	}
+
+	return part_prune_results;
+}
 
 /* ----------------------------------------------------------------
  *		ExecutorStart
@@ -806,6 +857,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 {
 	CmdType		operation = queryDesc->operation;
 	PlannedStmt *plannedstmt = queryDesc->plannedstmt;
+	List	   *part_prune_results = queryDesc->part_prune_results;
 	Plan	   *plan = plannedstmt->planTree;
 	List	   *rangeTable = plannedstmt->rtable;
 	EState	   *estate = queryDesc->estate;
@@ -826,6 +878,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 
 	estate->es_plannedstmt = plannedstmt;
 	estate->es_part_prune_infos = plannedstmt->partPruneInfos;
+	estate->es_part_prune_results = part_prune_results;
 
 	/*
 	 * Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index aca0c6f323..917079a034 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -66,6 +66,7 @@
 #define PARALLEL_KEY_QUERY_TEXT		UINT64CONST(0xE000000000000008)
 #define PARALLEL_KEY_JIT_INSTRUMENTATION UINT64CONST(0xE000000000000009)
 #define PARALLEL_KEY_WAL_USAGE			UINT64CONST(0xE00000000000000A)
+#define PARALLEL_KEY_PARTITION_PRUNE_RESULTS	UINT64CONST(0xE00000000000000B)
 
 #define PARALLEL_TUPLE_QUEUE_SIZE		65536
 
@@ -182,6 +183,7 @@ ExecSerializePlan(Plan *plan, EState *estate)
 	pstmt->transientPlan = false;
 	pstmt->dependsOnRole = false;
 	pstmt->parallelModeNeeded = false;
+	pstmt->containsInitialPruning = false;
 	pstmt->planTree = plan;
 	pstmt->partPruneInfos = estate->es_part_prune_infos;
 	pstmt->rtable = estate->es_range_table;
@@ -597,12 +599,15 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	FixedParallelExecutorState *fpes;
 	char	   *pstmt_data;
 	char	   *pstmt_space;
+	char	   *part_prune_results_data;
+	char	   *part_prune_results_space;
 	char	   *paramlistinfo_space;
 	BufferUsage *bufusage_space;
 	WalUsage   *walusage_space;
 	SharedExecutorInstrumentation *instrumentation = NULL;
 	SharedJitInstrumentation *jit_instrumentation = NULL;
 	int			pstmt_len;
+	int			part_prune_results_len;
 	int			paramlistinfo_len;
 	int			instrumentation_len = 0;
 	int			jit_instrumentation_len = 0;
@@ -631,6 +636,7 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 
 	/* Fix up and serialize plan to be sent to workers. */
 	pstmt_data = ExecSerializePlan(planstate->plan, estate);
+	part_prune_results_data = nodeToString(estate->es_part_prune_results);
 
 	/* Create a parallel context. */
 	pcxt = CreateParallelContext("postgres", "ParallelQueryMain", nworkers);
@@ -657,6 +663,11 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	shm_toc_estimate_chunk(&pcxt->estimator, pstmt_len);
 	shm_toc_estimate_keys(&pcxt->estimator, 1);
 
+	/* Estimate space for serialized List of PartitionPruneResult. */
+	part_prune_results_len = strlen(part_prune_results_data) + 1;
+	shm_toc_estimate_chunk(&pcxt->estimator, part_prune_results_len);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
 	/* Estimate space for serialized ParamListInfo. */
 	paramlistinfo_len = EstimateParamListSpace(estate->es_param_list_info);
 	shm_toc_estimate_chunk(&pcxt->estimator, paramlistinfo_len);
@@ -751,6 +762,12 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	memcpy(pstmt_space, pstmt_data, pstmt_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PLANNEDSTMT, pstmt_space);
 
+	/* Store serialized List of PartitionPruneResult */
+	part_prune_results_space = shm_toc_allocate(pcxt->toc, part_prune_results_len);
+	memcpy(part_prune_results_space, part_prune_results_data, part_prune_results_len);
+	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARTITION_PRUNE_RESULTS,
+				   part_prune_results_space);
+
 	/* Store serialized ParamListInfo. */
 	paramlistinfo_space = shm_toc_allocate(pcxt->toc, paramlistinfo_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARAMLISTINFO, paramlistinfo_space);
@@ -1232,8 +1249,10 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 						 int instrument_options)
 {
 	char	   *pstmtspace;
+	char	   *part_prune_results_space;
 	char	   *paramspace;
 	PlannedStmt *pstmt;
+	List	   *part_prune_results;
 	ParamListInfo paramLI;
 	char	   *queryString;
 
@@ -1244,12 +1263,17 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 	pstmtspace = shm_toc_lookup(toc, PARALLEL_KEY_PLANNEDSTMT, false);
 	pstmt = (PlannedStmt *) stringToNode(pstmtspace);
 
+	/* Reconstruct leader-supplied PartitionPruneResult. */
+	part_prune_results_space =
+		shm_toc_lookup(toc, PARALLEL_KEY_PARTITION_PRUNE_RESULTS, false);
+	part_prune_results = (List *) stringToNode(part_prune_results_space);
+
 	/* Reconstruct ParamListInfo. */
 	paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
 	paramLI = RestoreParamList(&paramspace);
 
 	/* Create a QueryDesc for the query. */
-	return CreateQueryDesc(pstmt,
+	return CreateQueryDesc(pstmt, part_prune_results,
 						   queryString,
 						   GetActiveSnapshot(), InvalidSnapshot,
 						   receiver, paramLI, NULL, instrument_options);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 88d0ea3adb..b0eb15b982 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -25,6 +25,7 @@
 #include "mb/pg_wchar.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
+#include "parser/parsetree.h"
 #include "partitioning/partbounds.h"
 #include "partitioning/partdesc.h"
 #include "partitioning/partprune.h"
@@ -185,7 +186,11 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
 static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
 static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
 static PartitionPruneState *CreatePartitionPruneState(PlanState *planstate,
-													  PartitionPruneInfo *pruneinfo);
+													  PartitionPruneInfo *pruneinfo,
+													  bool consider_initial_steps,
+													  bool consider_exec_steps,
+													  List *rtable, ExprContext *econtext,
+													  PartitionDirectory partdir);
 static void InitPartitionPruneContext(PartitionPruneContext *context,
 									  List *pruning_steps,
 									  PartitionDesc partdesc,
@@ -198,7 +203,8 @@ static void PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
 static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
 										   PartitionedRelPruningData *pprune,
 										   bool initial_prune,
-										   Bitmapset **validsubplans);
+										   Bitmapset **validsubplans,
+										   Bitmapset **scan_leafpart_rtis);
 
 
 /*
@@ -1749,8 +1755,10 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  * considered to be a stable expression, it can change value from one plan
  * node scan to the next during query execution.  Stable comparison
  * expressions that don't involve such Params allow partition pruning to be
- * done once during executor startup.  Expressions that do involve such Params
- * require us to prune separately for each scan of the parent plan node.
+ * done once during executor startup or during ExecutorDoInitialPruning(), which
+ * runs as part of performing AcquireExecutorLocks() on a given plan tree.
+ * Expressions that do involve such Params require us to prune separately for
+ * each scan of the parent plan node.
  *
  * Note that pruning away unneeded subplans during executor startup has the
  * added benefit of not having to initialize the unneeded subplans at all.
@@ -1767,6 +1775,13 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *		account for initial pruning possibly having eliminated some of the
  *		subplans.
  *
+ * ExecPartitionDoInitialPruning:
+ *		Do initial pruning with the information contained in a given
+ *		PartitionPruneInfo to determine the minimal set of child subplans
+ *		of the parent plan node (the one to which the PartitionPruneInfo
+ *		belongs) that need to be executed, and also the set of RT indexes of
+ *		the leaf partitions that will be scanned by those subplans.
+ *
  * ExecFindMatchingSubPlans:
  *		Returns indexes of matching subplans after evaluating the expressions
  *		that are safe to evaluate at a given point.  This function is first
@@ -1787,8 +1802,9 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *
  * On return, *initially_valid_subplans is assigned the set of indexes of
  * child subplans that must be initialized along with the parent plan node.
- * Initial pruning is performed here if needed and in that case only the
- * surviving subplans' indexes are added.
+ * Initial pruning is performed here if needed (unless it has already been done
+ * by ExecutorDoInitialPruning()), and in that case only the surviving
+ * subplans' indexes are added.
  *
  * If subplans are indeed pruned, subplan_map arrays contained in the returned
  * PartitionPruneState are re-sequenced to not count those, though only if the
@@ -1801,9 +1817,10 @@ ExecInitPartitionPruning(PlanState *planstate,
 						 Bitmapset *root_parent_relids,
 						 Bitmapset **initially_valid_subplans)
 {
-	PartitionPruneState *prunestate;
+	PartitionPruneState *prunestate = NULL;
 	EState	   *estate = planstate->state;
 	PartitionPruneInfo *pruneinfo;
+	PartitionPruneResult *pruneresult = NULL;
 
 	/* Obtain the pruneinfo we need, and make sure it's the right one */
 	pruneinfo = list_nth(estate->es_part_prune_infos, part_prune_index);
@@ -1819,20 +1836,57 @@ ExecInitPartitionPruning(PlanState *planstate,
 	/* We may need an expression context to evaluate partition exprs */
 	ExecAssignExprContext(estate, planstate);
 
-	/* Create the working data structure for pruning */
-	prunestate = CreatePartitionPruneState(planstate, pruneinfo);
+	/*
+	 * No need to do initial pruning if it was done already by
+	 * ExecutorDoInitialPruning(), which it would be if es_part_prune_results
+	 * is set.
+	 */
+	if (estate->es_part_prune_results)
+	{
+		pruneresult = list_nth(estate->es_part_prune_results, part_prune_index);
+		Assert(IsA(pruneresult, PartitionPruneResult));
+	}
+
+	if (pruneresult == NULL || pruneinfo->needs_exec_pruning)
+	{
+		/* We may need an expression context to evaluate partition exprs */
+		ExecAssignExprContext(estate, planstate);
+
+		/* For data reading, executor always omits detached partitions */
+		if (estate->es_partition_directory == NULL)
+			estate->es_partition_directory =
+				CreatePartitionDirectory(estate->es_query_cxt, false);
+
+		/*
+		 * Create the working data structure for pruning.  No need to consider
+		 * initial pruning steps if we have a PartitionPruneResult.
+		 */
+		prunestate = CreatePartitionPruneState(planstate, pruneinfo,
+											   pruneresult == NULL,
+											   pruneinfo->needs_exec_pruning,
+											   NIL, planstate->ps_ExprContext,
+											   estate->es_partition_directory);
+	}
 
 	/*
 	 * Perform an initial partition prune pass, if required.
 	 */
-	if (prunestate->do_initial_prune)
-		*initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true);
+	if (pruneresult)
+	{
+		*initially_valid_subplans = bms_copy(pruneresult->valid_subplan_offs);
+	}
+	else if (prunestate && prunestate->do_initial_prune)
+	{
+		*initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true,
+															 NULL);
+	}
 	else
 	{
-		/* No pruning, so we'll need to initialize all subplans */
+		/* No initial pruning, so we'll need to initialize all subplans */
 		Assert(n_total_subplans > 0);
 		*initially_valid_subplans = bms_add_range(NULL, 0,
 												  n_total_subplans - 1);
+		return prunestate;
 	}
 
 	/*
@@ -1840,7 +1894,8 @@ ExecInitPartitionPruning(PlanState *planstate,
 	 * that were removed above due to initial pruning.  No need to do this if
 	 * no steps were removed.
 	 */
-	if (bms_num_members(*initially_valid_subplans) < n_total_subplans)
+	if (prunestate &&
+		bms_num_members(*initially_valid_subplans) < n_total_subplans)
 	{
 		/*
 		 * We can safely skip this when !do_exec_prune, even though that
@@ -1856,11 +1911,74 @@ ExecInitPartitionPruning(PlanState *planstate,
 	return prunestate;
 }
 
+/*
+ * ExecPartitionDoInitialPruning
+ *		Perform initial pruning using given PartitionPruneInfo to determine
+ *		the minimal set of child subplans that will be executed and also the
+ *		set of RT indexes of the leaf partitions scanned by those subplans.
+ */
+Bitmapset *
+ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
+							  PartitionPruneInfo *pruneinfo,
+							  Bitmapset **scan_leafpart_rtis)
+{
+	List		 *rtable = plannedstmt->rtable;
+	ExprContext	 *econtext;
+	PartitionDirectory pdir;
+	MemoryContext oldcontext,
+				  tmpcontext;
+	PartitionPruneState *prunestate;
+	Bitmapset	 *valid_subplan_offs;
+
+	/*
+	 * A temporary context for memory allocations required while executing
+	 * partition pruning steps.
+	 */
+	tmpcontext = AllocSetContextCreate(CurrentMemoryContext,
+									   "initial pruning working data",
+									   ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(tmpcontext);
+
+	/*
+	 * PartitionDirectory to look up partition descriptors.
+	 * Note that we don't omit detached partitions, just like during
+	 * execution proper.
+	 */
+	pdir = CreatePartitionDirectory(CurrentMemoryContext, false);
+
+	/*
+	 * We don't yet have a PlanState for the parent plan node, so we must
+	 * create a standalone ExprContext to evaluate pruning expressions,
+	 * equipped with the information about the EXTERN parameters that the
+	 * caller passed us.  Note that that's okay because the initial pruning
+	 * steps do not contain anything that requires the execution to have
+	 * started, and so do not need the information contained in a PlanState.
+	 */
+	econtext = CreateStandaloneExprContext();
+	econtext->ecxt_param_list_info = params;
+	prunestate = CreatePartitionPruneState(NULL, pruneinfo, true, false,
+										   rtable, econtext, pdir);
+	MemoryContextSwitchTo(oldcontext);
+
+	/* Do the initial pruning. */
+	valid_subplan_offs = ExecFindMatchingSubPlans(prunestate, true,
+												  scan_leafpart_rtis);
+
+	FreeExprContext(econtext, true);
+	DestroyPartitionDirectory(pdir);
+	MemoryContextDelete(tmpcontext);
+
+	return valid_subplan_offs;
+}
+
 /*
  * CreatePartitionPruneState
  *		Build the data structure required for calling ExecFindMatchingSubPlans
  *
- * 'planstate' is the parent plan node's execution state.
+ * 'planstate', if not NULL, is the parent plan node's execution state.  It
+ * can be NULL if being called before ExecutorStart(), in which case,
+ * 'rtable' (range table), 'econtext', and 'partdir' must be explicitly
+ * provided.
  *
  * 'pruneinfo' is a PartitionPruneInfo as generated by
  * make_partition_pruneinfo.  Here we build a PartitionPruneState containing a
@@ -1874,19 +1992,21 @@ ExecInitPartitionPruning(PlanState *planstate,
  * PartitionedRelPruneInfo.
  */
 static PartitionPruneState *
-CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
+CreatePartitionPruneState(PlanState *planstate,
+						  PartitionPruneInfo *pruneinfo,
+						  bool consider_initial_steps,
+						  bool consider_exec_steps,
+						  List *rtable, ExprContext *econtext,
+						  PartitionDirectory partdir)
 {
-	EState	   *estate = planstate->state;
+	EState	   *estate = planstate ? planstate->state : NULL;
 	PartitionPruneState *prunestate;
 	int			n_part_hierarchies;
 	ListCell   *lc;
 	int			i;
-	ExprContext *econtext = planstate->ps_ExprContext;
 
-	/* For data reading, executor always omits detached partitions */
-	if (estate->es_partition_directory == NULL)
-		estate->es_partition_directory =
-			CreatePartitionDirectory(estate->es_query_cxt, false);
+	Assert((estate != NULL) ||
+			(partdir != NULL && econtext != NULL && rtable != NIL));
 
 	n_part_hierarchies = list_length(pruneinfo->prune_infos);
 	Assert(n_part_hierarchies > 0);
@@ -1941,15 +2061,42 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 			PartitionKey partkey;
 
 			/*
-			 * We can rely on the copies of the partitioned table's partition
-			 * key and partition descriptor appearing in its relcache entry,
-			 * because that entry will be held open and locked for the
-			 * duration of this executor run.
+			 * Must open the relation ourselves when called before the
+			 * execution has started, such as when called during
+			 * ExecutorDoInitialPruning() on a cached plan.  In that case,
+			 * sub-partitions must be locked, because AcquirePlannerLocks()
+			 * would not have seen them.  (The 1st relation in a
+			 * partrelpruneinfos list is always the root partitioned table
+			 * appearing in the query, which AcquirePlannerLocks() would have
+			 * locked; the Assert in relation_open() guards that assumption.)
+			 */
+			if (estate == NULL)
+			{
+				RangeTblEntry *rte = rt_fetch(pinfo->rtindex, rtable);
+				int		lockmode = (j == 0) ? NoLock : rte->rellockmode;
+
+				partrel = table_open(rte->relid, lockmode);
+			}
+			else
+				partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
+
+			/*
+			 * We can rely on the copy of the partitioned table's partition
+			 * key in its relcache entry, because it can't change (or
+			 * get destroyed) as long as the relation is locked.  Partition
+			 * descriptor is taken from the PartitionDirectory associated with
+			 * the table that is held open long enough for the descriptor to
+			 * remain valid while it's used to perform the pruning steps.
 			 */
-			partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
 			partkey = RelationGetPartitionKey(partrel);
-			partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
-												partrel);
+			partdesc = PartitionDirectoryLookup(partdir, partrel);
+
+			/*
+			 * Must close partrel, keeping the lock taken, if we're not using
+			 * EState's entry.
+			 */
+			if (estate == NULL)
+				table_close(partrel, NoLock);
 
 			/*
 			 * Initialize the subplan_map and subpart_map.
@@ -1963,6 +2110,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 			Assert(partdesc->nparts >= pinfo->nparts);
 			pprune->nparts = partdesc->nparts;
 			pprune->subplan_map = palloc(sizeof(int) * partdesc->nparts);
+			pprune->rti_map = palloc(sizeof(Index) * partdesc->nparts);
 			if (partdesc->nparts == pinfo->nparts)
 			{
 				/*
@@ -1973,6 +2121,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 				pprune->subpart_map = pinfo->subpart_map;
 				memcpy(pprune->subplan_map, pinfo->subplan_map,
 					   sizeof(int) * pinfo->nparts);
+				memcpy(pprune->rti_map, pinfo->rti_map,
+					   sizeof(Index) * pinfo->nparts);
 
 				/*
 				 * Double-check that the list of unpruned relations has not
@@ -2023,6 +2173,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 							pinfo->subplan_map[pd_idx];
 						pprune->subpart_map[pp_idx] =
 							pinfo->subpart_map[pd_idx];
+						pprune->rti_map[pp_idx] =
+							pinfo->rti_map[pd_idx];
 						pd_idx++;
 					}
 					else
@@ -2030,6 +2182,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 						/* this partdesc entry is not in the plan */
 						pprune->subplan_map[pp_idx] = -1;
 						pprune->subpart_map[pp_idx] = -1;
+						pprune->rti_map[pp_idx] = 0;
 					}
 				}
 
@@ -2051,7 +2204,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 			 * Initialize pruning contexts as needed.
 			 */
 			pprune->initial_pruning_steps = pinfo->initial_pruning_steps;
-			if (pinfo->initial_pruning_steps)
+			if (consider_initial_steps && pinfo->initial_pruning_steps)
 			{
 				InitPartitionPruneContext(&pprune->initial_context,
 										  pinfo->initial_pruning_steps,
@@ -2061,7 +2214,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 				prunestate->do_initial_prune = true;
 			}
 			pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
-			if (pinfo->exec_pruning_steps)
+			if (consider_exec_steps && pinfo->exec_pruning_steps)
 			{
 				InitPartitionPruneContext(&pprune->exec_context,
 										  pinfo->exec_pruning_steps,
@@ -2289,10 +2442,14 @@ PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
  * Pass initial_prune if PARAM_EXEC Params cannot yet be evaluated.  This
  * differentiates the initial executor-time pruning step from later
  * runtime pruning.
+ *
+ * RT indexes of leaf partitions scanned by the chosen subplans are added to
+ * *scan_leafpart_rtis if the pointer is non-NULL.
  */
 Bitmapset *
 ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
-						 bool initial_prune)
+						 bool initial_prune,
+						 Bitmapset **scan_leafpart_rtis)
 {
 	Bitmapset  *result = NULL;
 	MemoryContext oldcontext;
@@ -2327,7 +2484,7 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
 		 */
 		pprune = &prunedata->partrelprunedata[0];
 		find_matching_subplans_recurse(prunedata, pprune, initial_prune,
-									   &result);
+									   &result, scan_leafpart_rtis);
 
 		/* Expression eval may have used space in ExprContext too */
 		if (pprune->exec_pruning_steps)
@@ -2341,6 +2498,8 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
 
 	/* Copy result out of the temp context before we reset it */
 	result = bms_copy(result);
+	if (scan_leafpart_rtis)
+		*scan_leafpart_rtis = bms_copy(*scan_leafpart_rtis);
 
 	MemoryContextReset(prunestate->prune_context);
 
@@ -2351,13 +2510,15 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
  * find_matching_subplans_recurse
  *		Recursive worker function for ExecFindMatchingSubPlans
  *
- * Adds valid (non-prunable) subplan IDs to *validsubplans
+ * Adds valid (non-prunable) subplan IDs to *validsubplans and RT indexes of
+ * the corresponding leaf partitions to *scan_leafpart_rtis (if asked for).
  */
 static void
 find_matching_subplans_recurse(PartitionPruningData *prunedata,
 							   PartitionedRelPruningData *pprune,
 							   bool initial_prune,
-							   Bitmapset **validsubplans)
+							   Bitmapset **validsubplans,
+							   Bitmapset **scan_leafpart_rtis)
 {
 	Bitmapset  *partset;
 	int			i;
@@ -2384,8 +2545,14 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
 	while ((i = bms_next_member(partset, i)) >= 0)
 	{
 		if (pprune->subplan_map[i] >= 0)
+		{
 			*validsubplans = bms_add_member(*validsubplans,
 											pprune->subplan_map[i]);
+			Assert(pprune->rti_map[i] > 0);
+			if (scan_leafpart_rtis)
+				*scan_leafpart_rtis = bms_add_member(*scan_leafpart_rtis,
+													 pprune->rti_map[i]);
+		}
 		else
 		{
 			int			partidx = pprune->subpart_map[i];
@@ -2393,7 +2560,8 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
 			if (partidx >= 0)
 				find_matching_subplans_recurse(prunedata,
 											   &prunedata->partrelprunedata[partidx],
-											   initial_prune, validsubplans);
+											   initial_prune, validsubplans,
+											   scan_leafpart_rtis);
 			else
 			{
 				/*
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 572c87e453..044bf3f491 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -135,6 +135,7 @@ CreateExecutorState(void)
 	estate->es_param_exec_vals = NULL;
 
 	estate->es_queryEnv = NULL;
+	estate->es_part_prune_results = NIL;
 
 	estate->es_query_cxt = qcontext;
 
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index dc13625171..bffb42ce71 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -842,7 +842,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
 	else
 		dest = None_Receiver;
 
-	es->qd = CreateQueryDesc(es->stmt,
+	es->qd = CreateQueryDesc(es->stmt, NIL,
 							 fcache->src,
 							 GetActiveSnapshot(),
 							 InvalidSnapshot,
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 99830198bd..3b917584de 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -156,7 +156,8 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 		 * subplan, we can fill as_valid_subplans immediately, preventing
 		 * later calls to ExecFindMatchingSubPlans.
 		 */
-		if (!prunestate->do_exec_prune && nplans > 0)
+		if (appendstate->as_prune_state == NULL ||
+			(!appendstate->as_prune_state->do_exec_prune && nplans > 0))
 			appendstate->as_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
 	}
 	else
@@ -578,7 +579,7 @@ choose_next_subplan_locally(AppendState *node)
 		}
 		else if (node->as_valid_subplans == NULL)
 			node->as_valid_subplans =
-				ExecFindMatchingSubPlans(node->as_prune_state, false);
+				ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 
 		whichplan = -1;
 	}
@@ -643,7 +644,7 @@ choose_next_subplan_for_leader(AppendState *node)
 		if (node->as_valid_subplans == NULL)
 		{
 			node->as_valid_subplans =
-				ExecFindMatchingSubPlans(node->as_prune_state, false);
+				ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 
 			/*
 			 * Mark each invalid plan as finished to allow the loop below to
@@ -718,7 +719,7 @@ choose_next_subplan_for_worker(AppendState *node)
 	else if (node->as_valid_subplans == NULL)
 	{
 		node->as_valid_subplans =
-			ExecFindMatchingSubPlans(node->as_prune_state, false);
+			ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 		mark_invalid_subplans_as_finished(node);
 	}
 
@@ -869,7 +870,7 @@ ExecAppendAsyncBegin(AppendState *node)
 	if (node->as_valid_subplans == NULL)
 	{
 		node->as_valid_subplans =
-			ExecFindMatchingSubPlans(node->as_prune_state, false);
+			ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 
 		classify_matching_subplans(node);
 	}
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index f370f9f287..ccfa083945 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -104,7 +104,8 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
 		 * subplan, we can fill ms_valid_subplans immediately, preventing
 		 * later calls to ExecFindMatchingSubPlans.
 		 */
-		if (!prunestate->do_exec_prune && nplans > 0)
+		if (mergestate->ms_prune_state == NULL ||
+			(!mergestate->ms_prune_state->do_exec_prune && nplans > 0))
 			mergestate->ms_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
 	}
 	else
@@ -219,7 +220,7 @@ ExecMergeAppend(PlanState *pstate)
 		 */
 		if (node->ms_valid_subplans == NULL)
 			node->ms_valid_subplans =
-				ExecFindMatchingSubPlans(node->ms_prune_state, false);
+				ExecFindMatchingSubPlans(node->ms_prune_state, false, NULL);
 
 		/*
 		 * First time through: pull the first tuple from each valid subplan,
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index fd5796f1b9..93012a5b3b 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1578,6 +1578,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 	CachedPlanSource *plansource;
 	CachedPlan *cplan;
 	List	   *stmt_list;
+	List	   *part_prune_results_list;
 	char	   *query_string;
 	Snapshot	snapshot;
 	MemoryContext oldcontext;
@@ -1657,7 +1658,10 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 	 */
 
 	/* Replan if needed, and increment plan refcount for portal */
-	cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv);
+	cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv,
+						  &part_prune_results_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_results_list));
 	stmt_list = cplan->stmt_list;
 
 	if (!plan->saved)
@@ -1685,6 +1689,9 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 					  stmt_list,
 					  cplan);
 
+	/* Copy Lists of PartitionPruneResults into the portal's context. */
+	PortalStorePartitionPruneResults(portal, part_prune_results_list);
+
 	/*
 	 * Set up options for portal.  Default SCROLL type is chosen the same way
 	 * as PerformCursorOpen does it.
@@ -2092,7 +2099,8 @@ SPI_plan_get_cached_plan(SPIPlanPtr plan)
 	/* Get the generic plan for the query */
 	cplan = GetCachedPlan(plansource, NULL,
 						  plan->saved ? CurrentResourceOwner : NULL,
-						  _SPI_current->queryEnv);
+						  _SPI_current->queryEnv,
+						  NULL /* Not interested in PartitionPruneResults */);
 	Assert(cplan == plansource->gplan);
 
 	/* Pop the error context stack */
@@ -2473,7 +2481,9 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 	{
 		CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
 		List	   *stmt_list;
-		ListCell   *lc2;
+		List	   *part_prune_results_list;
+		ListCell   *lc2,
+				   *lc3;
 
 		spicallbackarg.query = plansource->query_string;
 
@@ -2549,8 +2559,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 		 * plan, the refcount must be backed by the plan_owner.
 		 */
 		cplan = GetCachedPlan(plansource, options->params,
-							  plan_owner, _SPI_current->queryEnv);
-
+							  plan_owner, _SPI_current->queryEnv,
+							  &part_prune_results_list);
+		Assert(list_length(cplan->stmt_list) ==
+			   list_length(part_prune_results_list));
 		stmt_list = cplan->stmt_list;
 
 		/*
@@ -2589,9 +2601,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 			}
 		}
 
-		foreach(lc2, stmt_list)
+		forboth(lc2, stmt_list, lc3, part_prune_results_list)
 		{
 			PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+			List	   *part_prune_results = lfirst_node(List, lc3);
 			bool		canSetTag = stmt->canSetTag;
 			DestReceiver *dest;
 
@@ -2663,7 +2676,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 				else
 					snap = InvalidSnapshot;
 
-				qdesc = CreateQueryDesc(stmt,
+				qdesc = CreateQueryDesc(stmt, part_prune_results,
 										plansource->query_string,
 										snap, crosscheck_snapshot,
 										dest,
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 23776367c5..b01f55fb4f 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -158,6 +158,11 @@
 	token = pg_strtok(&length);		/* skip :fldname */ \
 	local_node->fldname = readIntCols(len)
 
+/* Read an Index array */
+#define READ_INDEX_ARRAY(fldname, len) \
+	token = pg_strtok(&length);		/* skip :fldname */ \
+	local_node->fldname = readIndexCols(len)
+
 /* Read a bool array */
 #define READ_BOOL_ARRAY(fldname, len) \
 	token = pg_strtok(&length);		/* skip :fldname */ \
@@ -800,7 +805,6 @@ fnname(int numCols) \
  */
 READ_SCALAR_ARRAY(readAttrNumberCols, int16, atoi)
 READ_SCALAR_ARRAY(readOidCols, Oid, atooid)
-/* outfuncs.c has writeIndexCols, but we don't yet need that here */
-/* READ_SCALAR_ARRAY(readIndexCols, Index, atoui) */
+READ_SCALAR_ARRAY(readIndexCols, Index, atoui)
 READ_SCALAR_ARRAY(readIntCols, int, atoi)
 READ_SCALAR_ARRAY(readBoolCols, bool, strtobool)
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 799602f5ea..a96d316dca 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -520,7 +520,9 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 	result->parallelModeNeeded = glob->parallelModeNeeded;
 	result->planTree = top_plan;
 	result->partPruneInfos = glob->partPruneInfos;
+	result->containsInitialPruning = glob->containsInitialPruning;
 	result->rtable = glob->finalrtable;
+	result->minLockRelids = glob->minLockRelids;
 	result->resultRelations = glob->resultRelations;
 	result->appendRelations = glob->appendRelations;
 	result->subplans = glob->subplans;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 399c1812d4..44ffe71c49 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -270,6 +270,16 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 	 */
 	add_rtes_to_flat_rtable(root, false);
 
+	/*
+	 * Add the query's adjusted range of RT indexes to glob->minLockRelids.
+	 * The adjusted RT indexes of prunable relations will be deleted from the
+	 * set below where PartitionPruneInfos are processed.
+	 */
+	glob->minLockRelids =
+		bms_add_range(glob->minLockRelids,
+					  rtoffset + 1,
+					  rtoffset + list_length(root->parse->rtable));
+
 	/*
 	 * Adjust RT indexes of PlanRowMarks and add to final rowmarks list
 	 */
@@ -353,6 +363,7 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 	{
 		PartitionPruneInfo *pruneinfo = lfirst(lc);
 		ListCell   *l;
+		Bitmapset  *leafpart_rtis = NULL;
 
 		pruneinfo->root_parent_relids =
 			offset_relid_set(pruneinfo->root_parent_relids, rtoffset);
@@ -364,15 +375,50 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 			foreach(l2, prune_infos)
 			{
 				PartitionedRelPruneInfo *pinfo = lfirst(l2);
+				int		i;
 
 				/* RT index of the table to which the pinfo belongs. */
 				pinfo->rtindex += rtoffset;
+
+				/* Also of the leaf partitions that might be scanned. */
+				for (i = 0; i < pinfo->nparts; i++)
+				{
+					if (pinfo->rti_map[i] > 0 && pinfo->subplan_map[i] >= 0)
+					{
+						pinfo->rti_map[i] += rtoffset;
+						leafpart_rtis = bms_add_member(leafpart_rtis,
+													   pinfo->rti_map[i]);
+					}
+				}
 			}
 		}
 
+		if (pruneinfo->needs_init_pruning)
+		{
+			glob->containsInitialPruning = true;
+
+			/*
+			 * Delete the leaf partition RTIs from the global set of relations
+			 * to be locked before executing the plan.  AcquireExecutorLocks()
+			 * will find the ones to add to the set after performing initial
+			 * pruning.
+			 */
+			glob->minLockRelids = bms_del_members(glob->minLockRelids,
+												  leafpart_rtis);
+		}
+
 		glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
 	}
 
+	/*
+	 * It seems worth doing a bms_copy() on glob->minLockRelids if we deleted
+	 * bits from it above to get rid of any empty tail bits.  It seems better
+	 * for the loop over this set in AcquireExecutorLocks() to not have to go
+	 * through those useless bit words.
+	 */
+	if (glob->containsInitialPruning)
+		glob->minLockRelids = bms_copy(glob->minLockRelids);
+
 	return result;
 }
 
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index d48f6784c1..d5556354f7 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -144,7 +144,9 @@ static List *make_partitionedrel_pruneinfo(PlannerInfo *root,
 										   List *prunequal,
 										   Bitmapset *partrelids,
 										   int *relid_subplan_map,
-										   Bitmapset **matchedsubplans);
+										   Bitmapset **matchedsubplans,
+										   bool *needs_init_pruning,
+										   bool *needs_exec_pruning);
 static void gen_partprune_steps(RelOptInfo *rel, List *clauses,
 								PartClauseTarget target,
 								GeneratePruningStepsContext *context);
@@ -234,6 +236,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	int		   *relid_subplan_map;
 	ListCell   *lc;
 	int			i;
+	bool		needs_init_pruning = false;
+	bool		needs_exec_pruning = false;
 
 	/*
 	 * Scan the subpaths to see which ones are scans of partition child
@@ -313,12 +317,16 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		Bitmapset  *partrelids = (Bitmapset *) lfirst(lc);
 		List	   *pinfolist;
 		Bitmapset  *matchedsubplans = NULL;
+		bool		partrel_needs_init_pruning;
+		bool		partrel_needs_exec_pruning;
 
 		pinfolist = make_partitionedrel_pruneinfo(root, parentrel,
 												  prunequal,
 												  partrelids,
 												  relid_subplan_map,
-												  &matchedsubplans);
+												  &matchedsubplans,
+												  &partrel_needs_init_pruning,
+												  &partrel_needs_exec_pruning);
 
 		/* When pruning is possible, record the matched subplans */
 		if (pinfolist != NIL)
@@ -327,6 +335,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 			allmatchedsubplans = bms_join(matchedsubplans,
 										  allmatchedsubplans);
 		}
+
+		needs_init_pruning |= partrel_needs_init_pruning;
+		needs_exec_pruning |= partrel_needs_exec_pruning;
 	}
 
 	pfree(relid_subplan_map);
@@ -342,6 +353,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	pruneinfo = makeNode(PartitionPruneInfo);
 	pruneinfo->root_parent_relids = parentrel->relids;
 	pruneinfo->prune_infos = prunerelinfos;
+	pruneinfo->needs_init_pruning = needs_init_pruning;
+	pruneinfo->needs_exec_pruning = needs_exec_pruning;
 
 	/*
 	 * Some subplans may not belong to any of the identified partitioned rels.
@@ -442,13 +455,18 @@ add_part_relids(List *allpartrelids, Bitmapset *partrelids)
  * If we cannot find any useful run-time pruning steps, return NIL.
  * However, on success, each rel identified in partrelids will have
  * an element in the result list, even if some of them are useless.
+ * *needs_init_pruning and *needs_exec_pruning are set to indicate that the
+ * returned PartitionedRelPruneInfos contain pruning steps that can be
+ * performed before and after execution begins, respectively.
  */
 static List *
 make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 							  List *prunequal,
 							  Bitmapset *partrelids,
 							  int *relid_subplan_map,
-							  Bitmapset **matchedsubplans)
+							  Bitmapset **matchedsubplans,
+							  bool *needs_init_pruning,
+							  bool *needs_exec_pruning)
 {
 	RelOptInfo *targetpart = NULL;
 	List	   *pinfolist = NIL;
@@ -459,6 +477,10 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	int			rti;
 	int			i;
 
+	/* Will find out below. */
+	*needs_init_pruning = false;
+	*needs_exec_pruning = false;
+
 	/*
 	 * Examine each partitioned rel, constructing a temporary array to map
 	 * from planner relids to index of the partitioned rel, and building a
@@ -546,6 +568,9 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		 * executor per-scan pruning steps.  This first pass creates startup
 		 * pruning steps and detects whether there's any possibly-useful quals
 		 * that would require per-scan pruning.
+		 *
+		 * The first pass also determines whether the 2nd pass is necessary
+		 * by checking for the presence of EXEC parameters.
 		 */
 		gen_partprune_steps(subpart, partprunequal, PARTTARGET_INITIAL,
 							&context);
@@ -620,6 +645,12 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		pinfo->execparamids = execparamids;
 		/* Remaining fields will be filled in the next loop */
 
+		/* record which types of pruning steps we've seen so far */
+		if (initial_pruning_steps != NIL)
+			*needs_init_pruning = true;
+		if (exec_pruning_steps != NIL)
+			*needs_exec_pruning = true;
+
 		pinfolist = lappend(pinfolist, pinfo);
 	}
 
@@ -647,6 +678,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		int		   *subplan_map;
 		int		   *subpart_map;
 		Oid		   *relid_map;
+		Index	   *rti_map;
 
 		/*
 		 * Construct the subplan and subpart maps for this partitioning level.
@@ -659,6 +691,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		subpart_map = (int *) palloc(nparts * sizeof(int));
 		memset(subpart_map, -1, nparts * sizeof(int));
 		relid_map = (Oid *) palloc0(nparts * sizeof(Oid));
+		rti_map = (Index *) palloc0(nparts * sizeof(Index));
 		present_parts = NULL;
 
 		i = -1;
@@ -673,6 +706,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 			subplan_map[i] = subplanidx = relid_subplan_map[partrel->relid] - 1;
 			subpart_map[i] = subpartidx = relid_subpart_map[partrel->relid] - 1;
 			relid_map[i] = planner_rt_fetch(partrel->relid, root)->relid;
+			rti_map[i] = partrel->relid;
 			if (subplanidx >= 0)
 			{
 				present_parts = bms_add_member(present_parts, i);
@@ -697,6 +731,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		pinfo->subplan_map = subplan_map;
 		pinfo->subpart_map = subpart_map;
 		pinfo->relid_map = relid_map;
+		pinfo->rti_map = rti_map;
 	}
 
 	pfree(relid_subpart_map);
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 3082093d1e..95ab1d0eef 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1598,6 +1598,7 @@ exec_bind_message(StringInfo input_message)
 	int16	   *rformats = NULL;
 	CachedPlanSource *psrc;
 	CachedPlan *cplan;
+	List	   *part_prune_results_list;
 	Portal		portal;
 	char	   *query_string;
 	char	   *saved_stmt_name;
@@ -1972,7 +1973,9 @@ exec_bind_message(StringInfo input_message)
 	 * will be generated in MessageContext.  The plan refcount will be
 	 * assigned to the Portal, so it will be released at portal destruction.
 	 */
-	cplan = GetCachedPlan(psrc, params, NULL, NULL);
+	cplan = GetCachedPlan(psrc, params, NULL, NULL, &part_prune_results_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_results_list));
 
 	/*
 	 * Now we can define the portal.
@@ -1987,6 +1990,9 @@ exec_bind_message(StringInfo input_message)
 					  cplan->stmt_list,
 					  cplan);
 
+	/* Copy Lists of PartitionPruneResults into the portal's context. */
+	PortalStorePartitionPruneResults(portal, part_prune_results_list);
+
 	/* Done with the snapshot used for parameter I/O and parsing/planning */
 	if (snapshot_set)
 		PopActiveSnapshot();
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 52e2db6452..280ed7d239 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -35,7 +35,7 @@
 Portal		ActivePortal = NULL;
 
 
-static void ProcessQuery(PlannedStmt *plan,
+static void ProcessQuery(PlannedStmt *plan, List *part_prune_results,
 						 const char *sourceText,
 						 ParamListInfo params,
 						 QueryEnvironment *queryEnv,
@@ -65,6 +65,7 @@ static void DoPortalRewind(Portal portal);
  */
 QueryDesc *
 CreateQueryDesc(PlannedStmt *plannedstmt,
+				List *part_prune_results,
 				const char *sourceText,
 				Snapshot snapshot,
 				Snapshot crosscheck_snapshot,
@@ -77,6 +78,8 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
 
 	qd->operation = plannedstmt->commandType;	/* operation */
 	qd->plannedstmt = plannedstmt;	/* plan */
+	qd->part_prune_results = part_prune_results; /* ExecutorDoInitialPruning()
+												  * output for plan */
 	qd->sourceText = sourceText;	/* query text */
 	qd->snapshot = RegisterSnapshot(snapshot);	/* snapshot */
 	/* RI check snapshot */
@@ -122,6 +125,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  *		PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
  *
  *	plan: the plan tree for the query
+ *	part_prune_results: ExecutorDoInitialPruning() output for the PlannedStmt
  *	sourceText: the source text of the query
  *	params: any parameters needed
  *	dest: where to send results
@@ -134,6 +138,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  */
 static void
 ProcessQuery(PlannedStmt *plan,
+			 List *part_prune_results,
 			 const char *sourceText,
 			 ParamListInfo params,
 			 QueryEnvironment *queryEnv,
@@ -145,7 +150,7 @@ ProcessQuery(PlannedStmt *plan,
 	/*
 	 * Create the QueryDesc object
 	 */
-	queryDesc = CreateQueryDesc(plan, sourceText,
+	queryDesc = CreateQueryDesc(plan, part_prune_results, sourceText,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, params, queryEnv, 0);
 
@@ -491,8 +496,13 @@ PortalStart(Portal portal, ParamListInfo params,
 				/*
 				 * Create QueryDesc in portal's context; for the moment, set
 				 * the destination to DestNone.
+				 *
+				 * There is no PartitionPruneResult unless the PlannedStmt is
+				 * from a CachedPlan.
 				 */
 				queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+											portal->part_prune_results_list == NIL ? NIL :
+											linitial(portal->part_prune_results_list),
 											portal->sourceText,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
@@ -1225,6 +1235,8 @@ PortalRunMulti(Portal portal,
 
 		if (pstmt->utilityStmt == NULL)
 		{
+			List *part_prune_results = NIL;
+
 			/*
 			 * process a plannable query.
 			 */
@@ -1271,10 +1283,18 @@ PortalRunMulti(Portal portal,
 			else
 				UpdateActiveSnapshotCommandId();
 
+			/*
+			 * Determine if there's a corresponding List of PartitionPruneResult
+			 * for this PlannedStmt.
+			 */
+			if (portal->part_prune_results_list != NIL)
+				part_prune_results = list_nth(portal->part_prune_results_list,
+											  foreach_current_index(stmtlist_item));
+
 			if (pstmt->canSetTag)
 			{
 				/* statement can set tag string */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, part_prune_results,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
@@ -1283,7 +1303,7 @@ PortalRunMulti(Portal portal,
 			else
 			{
 				/* stmt added by rewrite cannot set tag */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, part_prune_results,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index cc943205d3..af6fae6e3b 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -99,14 +99,19 @@ static dlist_head cached_expression_list = DLIST_STATIC_INIT(cached_expression_l
 static void ReleaseGenericPlan(CachedPlanSource *plansource);
 static List *RevalidateCachedQuery(CachedPlanSource *plansource,
 								   QueryEnvironment *queryEnv);
-static bool CheckCachedPlan(CachedPlanSource *plansource);
+static bool CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+							List **part_prune_results_list);
 static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
-								   ParamListInfo boundParams, QueryEnvironment *queryEnv);
+								   ParamListInfo boundParams, QueryEnvironment *queryEnv,
+								   List **part_prune_results_list);
 static bool choose_custom_plan(CachedPlanSource *plansource,
 							   ParamListInfo boundParams);
 static double cached_plan_cost(CachedPlan *plan, bool include_planner);
 static Query *QueryListGetPrimaryStmt(List *stmts);
-static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static void AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+								 List **part_prune_results_list,
+								 List **lockedRelids_per_stmt);
+static void ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt);
 static void AcquirePlannerLocks(List *stmt_list, bool acquire);
 static void ScanQueryForLocks(Query *parsetree, bool acquire);
 static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -782,6 +787,26 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
 	return tlist;
 }
 
+/*
+ * FreePartitionPruneResults
+ *		Frees the List of Lists of PartitionPruneResults for CheckCachedPlan()
+ */
+static void
+FreePartitionPruneResults(List *part_prune_results_list)
+{
+	ListCell *lc;
+
+	foreach(lc, part_prune_results_list)
+	{
+		List *part_prune_results = lfirst(lc);
+
+		/* Free both the PartitionPruneResults and the containing List. */
+		list_free_deep(part_prune_results);
+	}
+
+	list_free(part_prune_results_list);
+}
+
 /*
  * CheckCachedPlan: see if the CachedPlanSource's generic plan is valid.
  *
@@ -790,15 +815,20 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
  *
  * On a "true" return, we have acquired the locks needed to run the plan.
  * (We must do this for the "true" result to be race-condition-free.)
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_results_list.
  */
 static bool
-CheckCachedPlan(CachedPlanSource *plansource)
+CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+				List **part_prune_results_list)
 {
 	CachedPlan *plan = plansource->gplan;
 
 	/* Assert that caller checked the querytree */
 	Assert(plansource->is_valid);
 
+	*part_prune_results_list = NIL;
+
 	/* If there's no generic plan, just say "false" */
 	if (!plan)
 		return false;
@@ -820,13 +850,21 @@ CheckCachedPlan(CachedPlanSource *plansource)
 	 */
 	if (plan->is_valid)
 	{
+		List   *lockedRelids_per_stmt;
+
 		/*
 		 * Plan must have positive refcount because it is referenced by
 		 * plansource; so no need to fear it disappears under us here.
 		 */
 		Assert(plan->refcount > 0);
 
-		AcquireExecutorLocks(plan->stmt_list, true);
+		/*
+		 * Lock relations scanned by the plan.  This is also where "initial"
+		 * pruning is performed, if needed.
+		 */
+		AcquireExecutorLocks(plan->stmt_list, boundParams,
+							 part_prune_results_list,
+							 &lockedRelids_per_stmt);
 
 		/*
 		 * If plan was transient, check to see if TransactionXmin has
@@ -848,7 +886,11 @@ CheckCachedPlan(CachedPlanSource *plansource)
 		}
 
 		/* Oops, the race case happened.  Release useless locks. */
-		AcquireExecutorLocks(plan->stmt_list, false);
+		ReleaseExecutorLocks(plan->stmt_list, lockedRelids_per_stmt);
+
+		/* Release any PartitionPruneResults that may have been created. */
+		FreePartitionPruneResults(*part_prune_results_list);
+		*part_prune_results_list = NIL;
 	}
 
 	/*
@@ -874,10 +916,14 @@ CheckCachedPlan(CachedPlanSource *plansource)
  * Planning work is done in the caller's memory context.  The finished plan
  * is in a child memory context, which typically should get reparented
  * (unless this is a one-shot plan, in which case we don't copy the plan).
+ *
+ * A list of NILs is returned in *part_prune_results_list, meaning that
+ * no partition pruning has been done yet for the plans in stmt_list.
  */
 static CachedPlan *
 BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
-				ParamListInfo boundParams, QueryEnvironment *queryEnv)
+				ParamListInfo boundParams, QueryEnvironment *queryEnv,
+				List **part_prune_results_list)
 {
 	CachedPlan *plan;
 	List	   *plist;
@@ -1007,6 +1053,17 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 
 	MemoryContextSwitchTo(oldcxt);
 
+	/*
+	 * There are no PartitionPruneResults to add yet, but the list must be
+	 * initialized to have the same number of elements as the list of
+	 * PlannedStmts.
+	 */
+	*part_prune_results_list = NIL;
+	foreach(lc, plist)
+	{
+		*part_prune_results_list = lappend(*part_prune_results_list, NIL);
+	}
+
 	return plan;
 }
 
@@ -1126,6 +1183,19 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
  * plan or a custom plan for the given parameters: the caller does not know
  * which it will get.
  *
+ * For every PlannedStmt found in the returned CachedPlan, an element that
+ * is either a List of PartitionPruneResult or NIL is added to
+ * *part_prune_results_list.  It is the former if the PlannedStmt is from
+ * the existing CachedPlan that is otherwise valid and has
+ * containsInitialPruning set to true.  Before returning such a CachedPlan,
+ * those "initial" steps are performed by calling ExecutorDoInitialPruning()
+ * to determine only those leaf partitions that need to be locked by
+ * AcquireExecutorLocks() by pruning away subplans that don't match the
+ * "initial" pruning conditions.  For each PartitionPruneInfo found in
+ * PlannedStmt.partPruneInfos, a PartitionPruneResult containing the bitmapset
+ * of the indexes of surviving subplans is added to the List for the
+ * PlannedStmt.
+ *
  * On return, the plan is valid and we have sufficient locks to begin
  * execution.
  *
@@ -1139,11 +1209,13 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
  */
 CachedPlan *
 GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
-			  ResourceOwner owner, QueryEnvironment *queryEnv)
+			  ResourceOwner owner, QueryEnvironment *queryEnv,
+			  List **part_prune_results_list)
 {
 	CachedPlan *plan = NULL;
 	List	   *qlist;
 	bool		customplan;
+	List	   *my_part_prune_results_list;
 
 	/* Assert caller is doing things in a sane order */
 	Assert(plansource->magic == CACHEDPLANSOURCE_MAGIC);
@@ -1160,7 +1232,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 
 	if (!customplan)
 	{
-		if (CheckCachedPlan(plansource))
+		if (CheckCachedPlan(plansource, boundParams,
+							&my_part_prune_results_list))
 		{
 			/* We want a generic plan, and we already have a valid one */
 			plan = plansource->gplan;
@@ -1169,7 +1242,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 		else
 		{
 			/* Build a new generic plan */
-			plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv);
+			plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv,
+								   &my_part_prune_results_list);
 			/* Just make real sure plansource->gplan is clear */
 			ReleaseGenericPlan(plansource);
 			/* Link the new generic plan into the plansource */
@@ -1214,7 +1288,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 	if (customplan)
 	{
 		/* Build a custom plan */
-		plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv);
+		plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv,
+							   &my_part_prune_results_list);
 		/* Accumulate total costs of custom plans */
 		plansource->total_custom_cost += cached_plan_cost(plan, true);
 
@@ -1246,6 +1321,9 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 		plan->is_saved = true;
 	}
 
+	if (part_prune_results_list)
+		*part_prune_results_list = my_part_prune_results_list;
+
 	return plan;
 }
 
@@ -1737,17 +1815,29 @@ QueryListGetPrimaryStmt(List *stmts)
 
 /*
  * AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
- * or release them if acquire is false.
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_results_list.
+ *
+ * On return, *lockedRelids_per_stmt will contain a bitmapset for every
+ * PlannedStmt in stmt_list, containing the RT indexes of relation entries
+ * in its range table that were actually locked, or NULL if the PlannedStmt
+ * contains a utility statement.
  */
 static void
-AcquireExecutorLocks(List *stmt_list, bool acquire)
+AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+					 List **part_prune_results_list,
+					 List **lockedRelids_per_stmt)
 {
 	ListCell   *lc1;
 
+	*part_prune_results_list = *lockedRelids_per_stmt = NIL;
 	foreach(lc1, stmt_list)
 	{
 		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
-		ListCell   *lc2;
+		List	   *part_prune_results = NIL;
+		Bitmapset  *allLockRelids;
+		Bitmapset  *lockedRelids = NULL;
+		int			rti;
 
 		if (plannedstmt->commandType == CMD_UTILITY)
 		{
@@ -1761,13 +1851,40 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 			Query	   *query = UtilityContainsQuery(plannedstmt->utilityStmt);
 
 			if (query)
-				ScanQueryForLocks(query, acquire);
+				ScanQueryForLocks(query, true);
+			*part_prune_results_list = lappend(*part_prune_results_list, NIL);
 			continue;
 		}
 
-		foreach(lc2, plannedstmt->rtable)
+		/*
+		 * Figure out the set of relations that would need to be locked
+		 * before executing the plan.
+		 */
+		if (plannedstmt->containsInitialPruning)
 		{
-			RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
+			Bitmapset *scan_leafpart_rtis = NULL;
+
+			/*
+			 * Obtain the set of leaf partitions to be locked.
+			 *
+			 * The following does initial partition pruning using the
+			 * PartitionPruneInfos found in plannedstmt->partPruneInfos and
+			 * finds leaf partitions that survive that pruning across all the
+			 * nodes in the plan tree.
+			 */
+			part_prune_results = ExecutorDoInitialPruning(plannedstmt,
+														  boundParams,
+														  &scan_leafpart_rtis);
+			allLockRelids = bms_union(plannedstmt->minLockRelids,
+									  scan_leafpart_rtis);
+		}
+		else
+			allLockRelids = plannedstmt->minLockRelids;
+
+		rti = -1;
+		while ((rti = bms_next_member(allLockRelids, rti)) > 0)
+		{
+			RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
 
 			if (rte->rtekind != RTE_RELATION)
 				continue;
@@ -1778,10 +1895,59 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 			 * fail if it's been dropped entirely --- we'll just transiently
 			 * acquire a non-conflicting lock.
 			 */
-			if (acquire)
-				LockRelationOid(rte->relid, rte->rellockmode);
-			else
-				UnlockRelationOid(rte->relid, rte->rellockmode);
+			LockRelationOid(rte->relid, rte->rellockmode);
+			lockedRelids = bms_add_member(lockedRelids, rti);
+		}
+
+		*part_prune_results_list = lappend(*part_prune_results_list,
+										   part_prune_results);
+		*lockedRelids_per_stmt = lappend(*lockedRelids_per_stmt, lockedRelids);
+	}
+}
+
+/*
+ * ReleaseExecutorLocks
+ *		Release locks that were acquired by an earlier call to
+ *		AcquireExecutorLocks()
+ */
+static void
+ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt)
+{
+	ListCell   *lc1,
+			   *lc2;
+
+	forboth(lc1, stmt_list, lc2, lockedRelids_per_stmt)
+	{
+		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
+		Bitmapset  *lockedRelids = lfirst(lc2);
+		int			rti;
+
+		if (plannedstmt->commandType == CMD_UTILITY)
+		{
+			/*
+			 * Ignore utility statements, except those (such as EXPLAIN) that
+			 * contain a parsed-but-not-planned query.  Note: it's okay to use
+			 * ScanQueryForLocks, even though the query hasn't been through
+			 * rule rewriting, because rewriting doesn't change the query
+			 * representation.
+			 */
+			Query	   *query = UtilityContainsQuery(plannedstmt->utilityStmt);
+
+			Assert(lockedRelids == NULL);
+			if (query)
+				ScanQueryForLocks(query, false);
+			continue;
+		}
+
+		rti = -1;
+		while ((rti = bms_next_member(lockedRelids, rti)) >= 0)
+		{
+			RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
+
+			Assert(rte->rtekind == RTE_RELATION);
+
+			/* See the comment in AcquireExecutorLocks(). */
+			UnlockRelationOid(rte->relid, rte->rellockmode);
 		}
 	}
 }
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index 7b1ae6fdcf..5b9098971b 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -303,6 +303,25 @@ PortalDefineQuery(Portal portal,
 	portal->status = PORTAL_DEFINED;
 }
 
+/*
+ * PortalStorePartitionPruneResults
+ *		Copy the given List of Lists of PartitionPruneResults into the
+ *		portal's context
+ *
+ * This allows the caller to ensure that the list exists as long as the portal
+ * does.
+ */
+void
+PortalStorePartitionPruneResults(Portal portal, List *part_prune_results_list)
+{
+	MemoryContext	oldcxt;
+
+	Assert(PortalIsValid(portal));
+	oldcxt = MemoryContextSwitchTo(portal->portalContext);
+	portal->part_prune_results_list = copyObject(part_prune_results_list);
+	MemoryContextSwitchTo(oldcxt);
+}
+
 /*
  * PortalReleaseCachedPlan
  *		Release a portal's reference to its cached plan, if any.
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 9ebde089ae..269cc4d562 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -87,7 +87,9 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
 							  ExplainState *es, const char *queryString,
 							  ParamListInfo params, QueryEnvironment *queryEnv);
 
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt,
+						   List *part_prune_results,
+						   IntoClause *into,
 						   ExplainState *es, const char *queryString,
 						   ParamListInfo params, QueryEnvironment *queryEnv,
 						   const instr_time *planduration,
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 17fabc18c9..4b98d0d2ef 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -45,6 +45,7 @@ extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
  * nparts						Length of subplan_map[] and subpart_map[].
  * subplan_map					Subplan index by partition index, or -1.
  * subpart_map					Subpart index by partition index, or -1.
+ * rti_map						Range table index by partition index, or 0.
  * present_parts				A Bitmapset of the partition indexes that we
  *								have subplans or subparts for.
  * initial_pruning_steps		List of PartitionPruneSteps used to
@@ -61,6 +62,7 @@ typedef struct PartitionedRelPruningData
 	int			nparts;
 	int		   *subplan_map;
 	int		   *subpart_map;
+	Index	   *rti_map;
 	Bitmapset  *present_parts;
 	List	   *initial_pruning_steps;
 	List	   *exec_pruning_steps;
@@ -127,5 +129,10 @@ extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
 													 Bitmapset *root_parent_relids,
 													 Bitmapset **initially_valid_subplans);
 extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
-										   bool initial_prune);
+										   bool initial_prune,
+										   Bitmapset **scan_leafpart_rtis);
+extern Bitmapset *ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt,
+								ParamListInfo params,
+								PartitionPruneInfo *pruneinfo,
+								Bitmapset **scan_leafpart_rtis);
 #endif							/* EXECPARTITION_H */
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index e79e2c001f..7d4379da7b 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -35,6 +35,8 @@ typedef struct QueryDesc
 	/* These fields are provided by CreateQueryDesc */
 	CmdType		operation;		/* CMD_SELECT, CMD_UPDATE, etc. */
 	PlannedStmt *plannedstmt;	/* planner's output (could be utility, too) */
+	List		*part_prune_results; /* ExecutorDoInitialPruning()'s
+									  * output for plannedstmt */
 	const char *sourceText;		/* source text of the query */
 	Snapshot	snapshot;		/* snapshot to use for query */
 	Snapshot	crosscheck_snapshot;	/* crosscheck for RI update/delete */
@@ -57,6 +59,7 @@ typedef struct QueryDesc
 
 /* in pquery.c */
 extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
+								  List *part_prune_results,
 								  const char *sourceText,
 								  Snapshot snapshot,
 								  Snapshot crosscheck_snapshot,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index aaf2bc78b9..32bbbc5927 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -185,6 +185,9 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
 /*
  * prototypes from functions in execMain.c
  */
+extern List *ExecutorDoInitialPruning(PlannedStmt *plannedstmt,
+									  ParamListInfo params,
+									  Bitmapset **scan_leafpart_rtis);
 extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
 extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
 extern void ExecutorRun(QueryDesc *queryDesc,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 71248a9466..9c6e8f5e13 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -619,6 +619,7 @@ typedef struct EState
 										 * ExecRowMarks, or NULL if none */
 	PlannedStmt *es_plannedstmt;	/* link to top of plan tree */
 	List		*es_part_prune_infos;	/* PlannedStmt.partPruneInfos */
+	List		*es_part_prune_results; /* QueryDesc.part_prune_results */
 	const char *es_sourceText;	/* Source text from QueryDesc */
 
 	JunkFilter *es_junkFilter;	/* top-level junk filter, if any */
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 1f33902947..c2f2544df5 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -218,6 +218,7 @@ extern struct Bitmapset *readBitmapset(void);
 extern uintptr_t readDatum(bool typbyval);
 extern bool *readBoolCols(int numCols);
 extern int *readIntCols(int numCols);
+extern Index *readIndexCols(int numCols);
 extern Oid *readOidCols(int numCols);
 extern int16 *readAttrNumberCols(int numCols);
 
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index dbaa9bb54d..e0e5c15b09 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -125,6 +125,18 @@ typedef struct PlannerGlobal
 	/* List of PartitionPruneInfo contained in the plan */
 	List	   *partPruneInfos;
 
+	/*
+	 * Do any of those PartitionPruneInfos have initial pruning steps in them?
+	 */
+	bool		containsInitialPruning;
+
+	/*
+	 * Indexes of all range table entries minus indexes of range table entries
+	 * of the leaf partitions scanned by prunable subplans; see
+	 * AcquireExecutorLocks()
+	 */
+	Bitmapset  *minLockRelids;
+
 	/* OIDs of relations the plan depends on */
 	List	   *relationOids;
 
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index c36a15bd09..714e2cf2c7 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -73,8 +73,17 @@ typedef struct PlannedStmt
 	List	   *partPruneInfos; /* List of PartitionPruneInfo contained in the
 								 * plan */
 
+	bool		containsInitialPruning;	/* Do any of those PartitionPruneInfos
+										 * have initial pruning steps in them?
+										 */
+
 	List	   *rtable;			/* list of RangeTblEntry nodes */
 
+	Bitmapset  *minLockRelids;	/* Indexes of all range table entries minus
+								 * indexes of range table entries of the leaf
+								 * partitions scanned by prunable subplans;
+								 * see AcquireExecutorLocks() */
+
 	/* rtable indexes of target relations for INSERT/UPDATE/DELETE/MERGE */
 	List	   *resultRelations;	/* integer list of RT indexes, or NIL */
 
@@ -1414,6 +1423,13 @@ typedef struct PlanRowMark
  * prune_infos			List of Lists containing PartitionedRelPruneInfo nodes,
  *						one sublist per run-time-prunable partition hierarchy
  *						appearing in the parent plan node's subplans.
+ *
+ * needs_init_pruning	Does any of the PartitionedRelPruneInfos in
+ *						prune_infos have its initial_pruning_steps set?
+ *
+ * needs_exec_pruning	Does any of the PartitionedRelPruneInfos in
+ *						prune_infos have its exec_pruning_steps set?
+ *
  * other_subplans		Indexes of any subplans that are not accounted for
  *						by any of the PartitionedRelPruneInfo nodes in
  *						"prune_infos".  These subplans must not be pruned.
@@ -1425,6 +1441,8 @@ typedef struct PartitionPruneInfo
 	NodeTag		type;
 	Bitmapset  *root_parent_relids;
 	List	   *prune_infos;
+	bool		needs_init_pruning;
+	bool		needs_exec_pruning;
 	Bitmapset  *other_subplans;
 } PartitionPruneInfo;
 
@@ -1469,6 +1487,9 @@ typedef struct PartitionedRelPruneInfo
 	/* relation OID by partition index, or 0 */
 	Oid		   *relid_map pg_node_attr(array_size(nparts));
 
+	/* Range table index by partition index, or 0. */
+	Index	   *rti_map pg_node_attr(array_size(nparts));
+
 	/*
 	 * initial_pruning_steps shows how to prune during executor startup (i.e.,
 	 * without use of any PARAM_EXEC Params); it is NIL if no startup pruning
@@ -1553,6 +1574,31 @@ typedef struct PartitionPruneStepCombine
 	List	   *source_stepids;
 } PartitionPruneStepCombine;
 
+/*----------------
+ * PartitionPruneResult
+ *
+ * The result of performing ExecPartitionDoInitialPruning() on a given
+ * PartitionPruneInfo.
+ *
+ * valid_subplan_offs contains the indexes of subplans remaining after
+ * performing initial pruning by calling ExecFindMatchingSubPlans() on the
+ * PartitionPruneInfo.
+ *
+ * This is used to store the result of initial partition pruning that is
+ * performed before the execution has started.  A module that needs to do so
+ * should call ExecutorDoInitialPruning() on a given PlannedStmt, which
+ * returns a List of PartitionPruneResult containing an entry for each
+ * PartitionPruneInfo present in PlannedStmt.partPruneInfos.  The module
+ * should then pass that list, along with the PlannedStmt, to the executor,
+ * so that it can reuse the result of initial partition pruning when
+ * initializing the subplans for execution.
+ */
+typedef struct PartitionPruneResult
+{
+	NodeTag		type;
+
+	Bitmapset	   *valid_subplan_offs;
+} PartitionPruneResult;
 
 /*
  * Plan invalidation info
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 0499635f59..32579d4788 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -220,7 +220,8 @@ extern List *CachedPlanGetTargetList(CachedPlanSource *plansource,
 extern CachedPlan *GetCachedPlan(CachedPlanSource *plansource,
 								 ParamListInfo boundParams,
 								 ResourceOwner owner,
-								 QueryEnvironment *queryEnv);
+								 QueryEnvironment *queryEnv,
+								 List **part_prune_results_list);
 extern void ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner);
 
 extern bool CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index aeddbdafe5..1901fc5f28 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -138,6 +138,7 @@ typedef struct PortalData
 	QueryCompletion qc;			/* command completion data for executed query */
 	List	   *stmts;			/* list of PlannedStmts */
 	CachedPlan *cplan;			/* CachedPlan, if stmts are from one */
+	List	   *part_prune_results_list;	/* List of Lists of PartitionPruneResults */
 
 	ParamListInfo portalParams; /* params to pass to query */
 	QueryEnvironment *queryEnv; /* environment for query */
@@ -242,6 +243,8 @@ extern void PortalDefineQuery(Portal portal,
 							  CommandTag commandTag,
 							  List *stmts,
 							  CachedPlan *cplan);
+extern void PortalStorePartitionPruneResults(Portal portal,
+											 List *part_prune_results_list);
 extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
 extern void PortalCreateHoldStore(Portal portal);
 extern void PortalHashTableDeleteAll(void);
-- 
2.35.3
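
The locking loop that the patch above adds to AcquireExecutorLocks() can be illustrated outside PostgreSQL with a toy bitmap. The following is only a sketch under stated assumptions: ToySet and the toy_* functions are hypothetical stand-ins for Bitmapset, bms_union() and bms_next_member(), limited to 64 members, and are not part of the patch. It shows the set of range table indexes to lock being computed as minLockRelids plus the leaf partitions that survive initial pruning.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a Bitmapset: members 0..63 packed in one word. */
typedef uint64_t ToySet;

static ToySet
toy_add(ToySet s, int member)
{
	return s | (UINT64_C(1) << member);
}

static ToySet
toy_union(ToySet a, ToySet b)
{
	return a | b;
}

/*
 * Smallest member greater than prev, or -2 if there is none, mirroring the
 * return convention of bms_next_member().
 */
static int
toy_next_member(ToySet s, int prev)
{
	for (int i = prev + 1; i < 64; i++)
	{
		if (s & (UINT64_C(1) << i))
			return i;
	}
	return -2;
}

/*
 * Write the range table indexes that would be locked into out[] and return
 * how many there are.  When the plan contains initial pruning steps, the
 * set is min_lock plus the surviving leaf partitions; otherwise min_lock
 * alone is used.
 */
static int
toy_lock_set(ToySet min_lock, ToySet surviving_leaf_rtis,
			 int contains_initial_pruning, int *out)
{
	ToySet		all_lock;
	int			n = 0;
	int			rti = -1;

	if (contains_initial_pruning)
		all_lock = toy_union(min_lock, surviving_leaf_rtis);
	else
		all_lock = min_lock;

	while ((rti = toy_next_member(all_lock, rti)) >= 0)
		out[n++] = rti;

	return n;
}
```

With min_lock holding indexes 1 and 2 and a single surviving leaf partition at index 5, the function reports three indexes to lock when pruning applies and two when it does not.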



  [application/octet-stream] v27-0002-Add-root_parent_relids-to-PartitionPruneResult.patch (3.4K, 3-v27-0002-Add-root_parent_relids-to-PartitionPruneResult.patch)
From 4ef1d918405a7c7c63a3e7376ccef57cf844796d Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Fri, 2 Dec 2022 19:32:14 +0900
Subject: [PATCH v27 2/2] Add root_parent_relids to PartitionPruneResult

It's the same as the corresponding PartitionPruneInfo's root_parent_relids.
Like PartitionPruneInfo.root_parent_relids, it's there for
cross-checking that a PartitionPruneResult found at a given plan node's
part_prune_index actually matches the plan node.
---
 src/backend/executor/execMain.c      |  2 ++
 src/backend/executor/execPartition.c | 13 +++++++++++--
 src/include/nodes/plannodes.h        |  7 +++++++
 3 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 4d8c8e2e43..3293a65d15 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -147,6 +147,8 @@ ExecutorDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
 		PartitionPruneInfo *pruneinfo = lfirst(lc);
 		PartitionPruneResult *pruneresult = makeNode(PartitionPruneResult);
 
+		pruneresult->root_parent_relids =
+			bms_copy(pruneinfo->root_parent_relids);
 		pruneresult->valid_subplan_offs =
 			ExecPartitionDoInitialPruning(plannedstmt, params, pruneinfo,
 										  scan_leafpart_rtis);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index b0eb15b982..2eadc30ec8 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -1843,8 +1843,17 @@ ExecInitPartitionPruning(PlanState *planstate,
 	 */
 	if (estate->es_part_prune_results)
 	{
-		pruneresult = list_nth(estate->es_part_prune_results, part_prune_index);
-		Assert(IsA(pruneresult, PartitionPruneResult));
+		pruneresult = list_nth_node(PartitionPruneResult,
+									estate->es_part_prune_results,
+									part_prune_index);
+		if (!bms_equal(pruneresult->root_parent_relids, pruneinfo->root_parent_relids))
+			ereport(ERROR,
+					errcode(ERRCODE_INTERNAL_ERROR),
+					errmsg_internal("mismatching PartitionPruneInfo and PartitionPruneResult at part_prune_index %d",
+									part_prune_index),
+					errdetail_internal("pruneresult relids %s, pruneinfo relids %s",
+									   bmsToString(pruneresult->root_parent_relids),
+									   bmsToString(pruneinfo->root_parent_relids)));
 	}
 
 	if (pruneresult == NULL || pruneinfo->needs_exec_pruning)
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 714e2cf2c7..ed664c5469 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -1580,6 +1580,12 @@ typedef struct PartitionPruneStepCombine
  * The result of performing ExecPartitionDoInitialPruning() on a given
  * PartitionPruneInfo.
  *
+ * root_parent_relids is the same as PartitionPruneInfo.root_parent_relids.  It's
+ * there for cross-checking in ExecInitPartitionPruning() that the
+ * PartitionPruneResult and the PartitionPruneInfo at a given index in
+ * EState.es_part_prune_results and EState.es_part_prune_infos, respectively,
+ * belong to the same parent plan node.
+ *
  * valid_subplans_offs contains the indexes of subplans remaining after
  * performing initial pruning by calling ExecFindMatchingSubPlans() on the
  * PartitionPruneInfo.
@@ -1597,6 +1603,7 @@ typedef struct PartitionPruneResult
 {
 	NodeTag		type;
 
+	Bitmapset	   *root_parent_relids;
 	Bitmapset	   *valid_subplan_offs;
 } PartitionPruneResult;
 
-- 
2.35.3
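
The cross-check described in the 0002 commit message, in which the PartitionPruneResult and the PartitionPruneInfo at the same list position must carry the same root_parent_relids, might be sketched as follows. This is an illustration only: ToyPruneInfo, ToyPruneResult and toy_result_matches() are hypothetical names, a 64-bit word stands in for a Bitmapset, and plain arrays stand in for the Lists kept in EState.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a Bitmapset of relids. */
typedef uint64_t ToySet;

typedef struct ToyPruneInfo
{
	ToySet		root_parent_relids;
} ToyPruneInfo;

typedef struct ToyPruneResult
{
	ToySet		root_parent_relids; /* copied from the matching info */
	ToySet		valid_subplan_offs; /* subplans surviving initial pruning */
} ToyPruneResult;

/*
 * Return true if the result stored at 'index' was computed for the info
 * stored at the same index.  The patch reports a mismatch with
 * ereport(ERROR); this sketch returns false instead.
 */
static bool
toy_result_matches(const ToyPruneInfo *infos, const ToyPruneResult *results,
				   size_t nentries, size_t index)
{
	if (index >= nentries)
		return false;

	return infos[index].root_parent_relids ==
		results[index].root_parent_relids;
}
```

A caller would run this check before trusting valid_subplan_offs, so that a result list that has gone out of step with the plan's pruning info is caught rather than silently used.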




* Re: generic plans and "initial" pruning
@ 2022-12-05 06:08  Amit Langote <[email protected]>
  parent: Amit Langote <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: Amit Langote @ 2022-12-05 06:08 UTC (permalink / raw)
  To: Alvaro Herrera <[email protected]>; +Cc: Robert Haas <[email protected]>; Jacob Champion <[email protected]>; David Rowley <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers

On Mon, Dec 5, 2022 at 12:00 PM Amit Langote <[email protected]> wrote:
> On Fri, Dec 2, 2022 at 7:40 PM Amit Langote <[email protected]> wrote:
> > Thought it might be good for PartitionPruneResult to also have
> > root_parent_relids that matches with the corresponding
> > PartitionPruneInfo.  ExecInitPartitionPruning() does a sanity check
> > that the root_parent_relids of a given pair of PartitionPrune{Info |
> > Result} match.
> >
> > Posting the patch separately as the attached 0002, just in case you
> > might think that the extra cross-checking would be an overkill.
>
> Rebased over 92c4dafe1eed and fixed some factual mistakes in the
> comment above ExecutorDoInitialPruning().

Sorry, I had forgotten to git-add hunks including some cosmetic
changes in that one.  Here's another version.

-- 
Thanks, Amit Langote
EDB: http://www.enterprisedb.com


Attachments:

  [application/octet-stream] v28-0002-Add-root_parent_relids-to-PartitionPruneResult.patch (3.3K, 2-v28-0002-Add-root_parent_relids-to-PartitionPruneResult.patch)
From 04f156396309f8c34a853ce1ad4e293fe4e2c4a2 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Fri, 2 Dec 2022 19:32:14 +0900
Subject: [PATCH v28 2/2] Add root_parent_relids to PartitionPruneResult

It's the same as the corresponding PartitionPruneInfo's root_parent_relids.
Like PartitionPruneInfo.root_parent_relids, it's there for
cross-checking that a PartitionPruneResult found at a given plan node's
part_prune_index actually matches the plan node.
---
 src/backend/executor/execMain.c      |  2 ++
 src/backend/executor/execPartition.c | 10 ++++++++++
 src/include/nodes/plannodes.h        |  7 +++++++
 3 files changed, 19 insertions(+)

diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index f15265716a..554623751b 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -147,6 +147,8 @@ ExecutorDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
 		PartitionPruneInfo *pruneinfo = lfirst_node(PartitionPruneInfo, lc);
 		PartitionPruneResult *pruneresult = makeNode(PartitionPruneResult);
 
+		pruneresult->root_parent_relids =
+			bms_copy(pruneinfo->root_parent_relids);
 		pruneresult->valid_subplan_offs =
 			ExecPartitionDoInitialPruning(plannedstmt, params, pruneinfo,
 										  scan_leafpart_rtis);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index bc8331a222..2eadc30ec8 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -1842,9 +1842,19 @@ ExecInitPartitionPruning(PlanState *planstate,
 	 * is set.
 	 */
 	if (estate->es_part_prune_results)
+	{
 		pruneresult = list_nth_node(PartitionPruneResult,
 									estate->es_part_prune_results,
 									part_prune_index);
+		if (!bms_equal(pruneresult->root_parent_relids, pruneinfo->root_parent_relids))
+			ereport(ERROR,
+					errcode(ERRCODE_INTERNAL_ERROR),
+					errmsg_internal("mismatching PartitionPruneInfo and PartitionPruneResult at part_prune_index %d",
+									part_prune_index),
+					errdetail_internal("pruneresult relids %s, pruneinfo relids %s",
+									   bmsToString(pruneresult->root_parent_relids),
+									   bmsToString(pruneinfo->root_parent_relids)));
+	}
 
 	if (pruneresult == NULL || pruneinfo->needs_exec_pruning)
 	{
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 714e2cf2c7..ed664c5469 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -1580,6 +1580,12 @@ typedef struct PartitionPruneStepCombine
  * The result of performing ExecPartitionDoInitialPruning() on a given
  * PartitionPruneInfo.
  *
+ * root_parent_relids is the same as PartitionPruneInfo.root_parent_relids.  It's
+ * there for cross-checking in ExecInitPartitionPruning() that the
+ * PartitionPruneResult and the PartitionPruneInfo at a given index in
+ * EState.es_part_prune_results and EState.es_part_prune_infos, respectively,
+ * belong to the same parent plan node.
+ *
  * valid_subplans_offs contains the indexes of subplans remaining after
  * performing initial pruning by calling ExecFindMatchingSubPlans() on the
  * PartitionPruneInfo.
@@ -1597,6 +1603,7 @@ typedef struct PartitionPruneResult
 {
 	NodeTag		type;
 
+	Bitmapset	   *root_parent_relids;
 	Bitmapset	   *valid_subplan_offs;
 } PartitionPruneResult;
 
-- 
2.35.3
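
The reason for handing a PartitionPruneResult to the executor is that the executor should reuse the surviving-subplan set computed at lock time instead of redoing initial pruning, which could pick subplans whose relations were never locked. A minimal sketch of that decision follows, assuming hypothetical names throughout (ToyPruneResult, ToyPruneFn, toy_initially_valid_subplans(), toy_keep_first_n()); it is not the patch's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a Bitmapset of subplan offsets. */
typedef uint64_t ToySet;

typedef struct ToyPruneResult
{
	ToySet		valid_subplan_offs;
} ToyPruneResult;

/* Hypothetical callback that performs initial pruning from scratch. */
typedef ToySet (*ToyPruneFn) (void *arg);

/*
 * Example pruning callback, for illustration only: keeps the first n
 * subplans, where n is read from *arg.
 */
static ToySet
toy_keep_first_n(void *arg)
{
	int			n = *(int *) arg;

	if (n >= 64)
		return ~UINT64_C(0);
	return (UINT64_C(1) << n) - 1;
}

/*
 * If a result from pre-execution pruning is available, use it as is;
 * only fall back to running the pruning steps when there is none.
 */
static ToySet
toy_initially_valid_subplans(const ToyPruneResult *result,
							 ToyPruneFn redo_pruning, void *arg)
{
	if (result != NULL)
		return result->valid_subplan_offs;

	return redo_pruning(arg);
}
```

Passing NULL for the result models a plan that did not come from plancache.c as a generic plan, in which case pruning happens during executor startup as before.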



  [application/octet-stream] v28-0001-Optimize-AcquireExecutorLocks-by-locking-only-un.patch (83.0K, 3-v28-0001-Optimize-AcquireExecutorLocks-by-locking-only-un.patch)
From 28bdd07ae15228bc3173257ab5968864455dda16 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Wed, 22 Dec 2021 16:55:17 +0900
Subject: [PATCH v28 1/2] Optimize AcquireExecutorLocks() by locking only
 unpruned partitions

This commit teaches AcquireExecutorLocks() to perform initial
partition pruning to notionally eliminate the subnodes contained in a
generic cached plan that need not be initialized during the actual
execution of the plan, and to skip locking the partitions scanned by
those subnodes.

The result of performing initial partition pruning this way, before the
actual execution has started, is passed to the executor as a
PartitionPruneResult alongside the PlannedStmt by the callers of the
executor that used plancache.c to get the plan.  It is NULL in the cases
in which the plan is obtained by calling the planner directly or if the
plan obtained by plancache.c is not a generic one.
---
 src/backend/commands/copyto.c          |   2 +-
 src/backend/commands/createas.c        |   2 +-
 src/backend/commands/explain.c         |   7 +-
 src/backend/commands/extension.c       |   2 +-
 src/backend/commands/matview.c         |   2 +-
 src/backend/commands/prepare.c         |  26 ++-
 src/backend/executor/README            |  36 ++++
 src/backend/executor/execMain.c        |  53 ++++++
 src/backend/executor/execParallel.c    |  26 ++-
 src/backend/executor/execPartition.c   | 237 +++++++++++++++++++++----
 src/backend/executor/execUtils.c       |   1 +
 src/backend/executor/functions.c       |   2 +-
 src/backend/executor/nodeAppend.c      |  11 +-
 src/backend/executor/nodeMergeAppend.c |   5 +-
 src/backend/executor/spi.c             |  27 ++-
 src/backend/nodes/readfuncs.c          |   8 +-
 src/backend/optimizer/plan/planner.c   |   2 +
 src/backend/optimizer/plan/setrefs.c   |  46 +++++
 src/backend/partitioning/partprune.c   |  41 ++++-
 src/backend/tcop/postgres.c            |   8 +-
 src/backend/tcop/pquery.c              |  29 ++-
 src/backend/utils/cache/plancache.c    | 208 +++++++++++++++++++---
 src/backend/utils/mmgr/portalmem.c     |  19 ++
 src/include/commands/explain.h         |   4 +-
 src/include/executor/execPartition.h   |   9 +-
 src/include/executor/execdesc.h        |   3 +
 src/include/executor/executor.h        |   3 +
 src/include/nodes/execnodes.h          |   1 +
 src/include/nodes/nodes.h              |   1 +
 src/include/nodes/pathnodes.h          |  12 ++
 src/include/nodes/plannodes.h          |  46 +++++
 src/include/utils/plancache.h          |   3 +-
 src/include/utils/portal.h             |   3 +
 33 files changed, 787 insertions(+), 98 deletions(-)

diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index f26cc0d162..401a2280a3 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -558,7 +558,7 @@ BeginCopyTo(ParseState *pstate,
 		((DR_copy *) dest)->cstate = cstate;
 
 		/* Create a QueryDesc requesting no output */
-		cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+		cstate->queryDesc = CreateQueryDesc(plan, NIL, pstate->p_sourcetext,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
 											dest, NULL, NULL, 0);
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 152c29b551..942449544c 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -325,7 +325,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
 		UpdateActiveSnapshotCommandId();
 
 		/* Create a QueryDesc, redirecting output to our tuple receiver */
-		queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+		queryDesc = CreateQueryDesc(plan, NIL, pstate->p_sourcetext,
 									GetActiveSnapshot(), InvalidSnapshot,
 									dest, params, queryEnv, 0);
 
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index f86983c660..2f2b558608 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -407,7 +407,7 @@ ExplainOneQuery(Query *query, int cursorOptions,
 		}
 
 		/* run it (if needed) and produce output */
-		ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+		ExplainOnePlan(plan, NIL, into, es, queryString, params, queryEnv,
 					   &planduration, (es->buffers ? &bufusage : NULL));
 	}
 }
@@ -515,7 +515,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
  * to call it.
  */
 void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, List *part_prune_results,
+			   IntoClause *into, ExplainState *es,
 			   const char *queryString, ParamListInfo params,
 			   QueryEnvironment *queryEnv, const instr_time *planduration,
 			   const BufferUsage *bufusage)
@@ -563,7 +564,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
 		dest = None_Receiver;
 
 	/* Create a QueryDesc for the query */
-	queryDesc = CreateQueryDesc(plannedstmt, queryString,
+	queryDesc = CreateQueryDesc(plannedstmt, part_prune_results, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, params, queryEnv, instrument_option);
 
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index cf1b1ca571..904cbcba4a 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -779,7 +779,7 @@ execute_sql_string(const char *sql)
 			{
 				QueryDesc  *qdesc;
 
-				qdesc = CreateQueryDesc(stmt,
+				qdesc = CreateQueryDesc(stmt, NIL,
 										sql,
 										GetActiveSnapshot(), NULL,
 										dest, NULL, NULL, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 9ac0383459..65c8d0aa59 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -408,7 +408,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
 	UpdateActiveSnapshotCommandId();
 
 	/* Create a QueryDesc, redirecting output to our tuple receiver */
-	queryDesc = CreateQueryDesc(plan, queryString,
+	queryDesc = CreateQueryDesc(plan, NIL, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, NULL, NULL, 0);
 
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 9e29584d93..29b45539d3 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -155,6 +155,7 @@ ExecuteQuery(ParseState *pstate,
 	PreparedStatement *entry;
 	CachedPlan *cplan;
 	List	   *plan_list;
+	List	   *part_prune_results_list;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
 	Portal		portal;
@@ -193,7 +194,10 @@ ExecuteQuery(ParseState *pstate,
 									   entry->plansource->query_string);
 
 	/* Replan if needed, and increment plan refcount for portal */
-	cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL);
+	cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL,
+						  &part_prune_results_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_results_list));
 	plan_list = cplan->stmt_list;
 
 	/*
@@ -207,6 +211,9 @@ ExecuteQuery(ParseState *pstate,
 					  plan_list,
 					  cplan);
 
+	/* Copy Lists of PartitionPruneResults into the portal's context. */
+	PortalStorePartitionPruneResults(portal, part_prune_results_list);
+
 	/*
 	 * For CREATE TABLE ... AS EXECUTE, we must verify that the prepared
 	 * statement is one that produces tuples.  Currently we insist that it be
@@ -576,7 +583,9 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	const char *query_string;
 	CachedPlan *cplan;
 	List	   *plan_list;
-	ListCell   *p;
+	List	   *part_prune_results_list;
+	ListCell   *p,
+			   *pp;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
 	instr_time	planstart;
@@ -619,7 +628,10 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 
 	/* Replan if needed, and acquire a transient refcount */
 	cplan = GetCachedPlan(entry->plansource, paramLI,
-						  CurrentResourceOwner, queryEnv);
+						  CurrentResourceOwner, queryEnv,
+						  &part_prune_results_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_results_list));
 
 	INSTR_TIME_SET_CURRENT(planduration);
 	INSTR_TIME_SUBTRACT(planduration, planstart);
@@ -634,13 +646,15 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	plan_list = cplan->stmt_list;
 
 	/* Explain each query */
-	foreach(p, plan_list)
+	forboth(p, plan_list, pp, part_prune_results_list)
 	{
 		PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
+		List *part_prune_results = lfirst_node(List, pp);
 
 		if (pstmt->commandType != CMD_UTILITY)
-			ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv,
-						   &planduration, (es->buffers ? &bufusage : NULL));
+			ExplainOnePlan(pstmt, part_prune_results, into, es, query_string,
+						   paramLI, queryEnv, &planduration,
+						   (es->buffers ? &bufusage : NULL));
 		else
 			ExplainOneUtility(pstmt->utilityStmt, into, es, query_string,
 							  paramLI, queryEnv);
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 17775a49e2..7f8cf1494f 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -65,6 +65,38 @@ found there.  This currently only occurs for Append and MergeAppend nodes.  In
 this case the non-required subplans are ignored and the executor state's
 subnode array will become out of sequence to the plan's subplan list.
 
+The so-called execution time pruning may also occur even before the execution
+has actually started.  One case where that occurs is when a cached generic
+plan is being validated for execution by plancache.c:GetCachedPlan(), which
+works by locking all the relations that will be scanned by that plan.  If the
+generic plan contains nodes that can perform execution time partition pruning
+(that is, contain a PartitionPruneInfo), a subset of pruning steps contained
+in a given node's PartitionPruneInfo that do not depend on the execution
+actually having started (called "initial" pruning steps) are performed as part
+of the plan validation step, by calling ExecutorDoInitialPruning().  That
+returns the minimal set of child subplans that satisfy the initial pruning
+steps contained in each PartitionPruneInfo.  AcquireExecutorLocks() will then
+lock only the relations scanned by those subplans, in addition to those present
+in PlannedStmt.minLockRelids.  Note that the subplans are not really pruned as
+in being removed from the plan tree, so care is needed by the downstream
+users of such a plan that has undergone pre-execution initial pruning.
+
+To prevent the executor and any third party execution code that can look at
+the plan tree from trying to execute the subplans that were pruned as
+described above, the result of that pruning is passed to the executor as a
+List of PartitionPruneResult nodes via the QueryDesc, which is subsequently
+assigned to EState.es_part_prune_results.  Each PartitionPruneResult therein
+consists of the set of indexes of surviving subplans in the respective parent
+plan node's (the one to which the corresponding PartitionPruneInfo belongs)
+list of child subplans, saved as a bitmapset valid_subplan_offs.  The executor
+or any third party execution code working on a generic plan should not
+re-evaluate the set of initially valid subplans for a given plan node by
+redoing the initial pruning if a PartitionPruneResult belonging to that plan
+node is present in es_part_prune_results.  Note that that is not simply a
+performance optimization, because such re-evaluation of the pruning steps may
+very well end up resulting in a different set of initially valid subplans,
+containing some whose relations were not locked by AcquireExecutorLocks().
+
 Each Plan node may have expression trees associated with it, to represent
 its target list, qualification conditions, etc.  These trees are also
 read-only to the executor, but the executor state for expression evaluation
@@ -286,6 +318,10 @@ Query Processing Control Flow
 
 This is a sketch of control flow for full query processing:
 
+  [ ExecutorDoInitialPruning ] --- an optional step to perform initial
+		partition pruning on the plan tree, the result of which is passed
+		to the executor via QueryDesc
+
 	CreateQueryDesc
 
 	ExecutorStart
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 12ff4f3de5..f15265716a 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -49,6 +49,7 @@
 #include "commands/matview.h"
 #include "commands/trigger.h"
 #include "executor/execdebug.h"
+#include "executor/execPartition.h"
 #include "executor/nodeSubplan.h"
 #include "foreign/fdwapi.h"
 #include "jit/jit.h"
@@ -104,6 +105,56 @@ static void EvalPlanQualStart(EPQState *epqstate, Plan *planTree);
 
 /* end of local decls */
 
+/* ----------------------------------------------------------------
+ *		ExecutorDoInitialPruning
+ *
+ *		For each plan tree node that has been assigned a PartitionPruneInfo,
+ *		this performs initial partition pruning using the information contained
+ *		therein to determine the set of child subplans that satisfy the initial
+ *		pruning steps, to be returned as a bitmapset of their indexes in the
+ *		node's list of child subplans (for example, an Append's appendplans).
+ *
+ * Return value is a List of PartitionPruneResult nodes, one for each
+ * PartitionPruneInfo found in plannedstmt->partPruneInfos, each
+ * containing a bitmapset of the indexes of unpruned child subplans.
+ * A bitmapset of the RT indexes of the leaf partitions scanned by those
+ * subplans is returned in *scan_leafpart_rtis, which is shared across all
+ * of those PartitionPruneResults.
+ *
+ * The executor must see exactly the same set of subplans as valid for
+ * execution when doing ExecInitNode() on the plan nodes whose
+ * PartitionPruneInfos are processed here.  So, it must get the set from the
+ * aforementioned PartitionPruneResult, instead of computing it all over
+ * again by redoing the initial pruning.  It's the caller's job to pass the
+ * PartitionPruneResult to the executor.
+ *
+ * Note: Partitioned tables mentioned in PartitionedRelPruneInfo nodes that
+ * drive the pruning will be locked before doing the pruning.
+ * ----------------------------------------------------------------
+ */
+List *
+ExecutorDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
+						 Bitmapset **scan_leafpart_rtis)
+{
+	List	 *part_prune_results = NIL;
+	ListCell *lc;
+
+	/* Only get here if there is any pruning to do. */
+	Assert(plannedstmt->containsInitialPruning);
+
+	foreach(lc, plannedstmt->partPruneInfos)
+	{
+		PartitionPruneInfo *pruneinfo = lfirst_node(PartitionPruneInfo, lc);
+		PartitionPruneResult *pruneresult = makeNode(PartitionPruneResult);
+
+		pruneresult->valid_subplan_offs =
+			ExecPartitionDoInitialPruning(plannedstmt, params, pruneinfo,
+										  scan_leafpart_rtis);
+		part_prune_results = lappend(part_prune_results, pruneresult);
+	}
+
+	return part_prune_results;
+}
 
 /* ----------------------------------------------------------------
  *		ExecutorStart
@@ -806,6 +857,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 {
 	CmdType		operation = queryDesc->operation;
 	PlannedStmt *plannedstmt = queryDesc->plannedstmt;
+	List	   *part_prune_results = queryDesc->part_prune_results;
 	Plan	   *plan = plannedstmt->planTree;
 	List	   *rangeTable = plannedstmt->rtable;
 	EState	   *estate = queryDesc->estate;
@@ -826,6 +878,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 
 	estate->es_plannedstmt = plannedstmt;
 	estate->es_part_prune_infos = plannedstmt->partPruneInfos;
+	estate->es_part_prune_results = part_prune_results;
 
 	/*
 	 * Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index aca0c6f323..917079a034 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -66,6 +66,7 @@
 #define PARALLEL_KEY_QUERY_TEXT		UINT64CONST(0xE000000000000008)
 #define PARALLEL_KEY_JIT_INSTRUMENTATION UINT64CONST(0xE000000000000009)
 #define PARALLEL_KEY_WAL_USAGE			UINT64CONST(0xE00000000000000A)
+#define PARALLEL_KEY_PARTITION_PRUNE_RESULTS	UINT64CONST(0xE00000000000000B)
 
 #define PARALLEL_TUPLE_QUEUE_SIZE		65536
 
@@ -182,6 +183,7 @@ ExecSerializePlan(Plan *plan, EState *estate)
 	pstmt->transientPlan = false;
 	pstmt->dependsOnRole = false;
 	pstmt->parallelModeNeeded = false;
+	pstmt->containsInitialPruning = false;
 	pstmt->planTree = plan;
 	pstmt->partPruneInfos = estate->es_part_prune_infos;
 	pstmt->rtable = estate->es_range_table;
@@ -597,12 +599,15 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	FixedParallelExecutorState *fpes;
 	char	   *pstmt_data;
 	char	   *pstmt_space;
+	char	   *part_prune_results_data;
+	char	   *part_prune_results_space;
 	char	   *paramlistinfo_space;
 	BufferUsage *bufusage_space;
 	WalUsage   *walusage_space;
 	SharedExecutorInstrumentation *instrumentation = NULL;
 	SharedJitInstrumentation *jit_instrumentation = NULL;
 	int			pstmt_len;
+	int			part_prune_results_len;
 	int			paramlistinfo_len;
 	int			instrumentation_len = 0;
 	int			jit_instrumentation_len = 0;
@@ -631,6 +636,7 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 
 	/* Fix up and serialize plan to be sent to workers. */
 	pstmt_data = ExecSerializePlan(planstate->plan, estate);
+	part_prune_results_data = nodeToString(estate->es_part_prune_results);
 
 	/* Create a parallel context. */
 	pcxt = CreateParallelContext("postgres", "ParallelQueryMain", nworkers);
@@ -657,6 +663,11 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	shm_toc_estimate_chunk(&pcxt->estimator, pstmt_len);
 	shm_toc_estimate_keys(&pcxt->estimator, 1);
 
+	/* Estimate space for serialized List of PartitionPruneResult. */
+	part_prune_results_len = strlen(part_prune_results_data) + 1;
+	shm_toc_estimate_chunk(&pcxt->estimator, part_prune_results_len);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
 	/* Estimate space for serialized ParamListInfo. */
 	paramlistinfo_len = EstimateParamListSpace(estate->es_param_list_info);
 	shm_toc_estimate_chunk(&pcxt->estimator, paramlistinfo_len);
@@ -751,6 +762,12 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	memcpy(pstmt_space, pstmt_data, pstmt_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PLANNEDSTMT, pstmt_space);
 
+	/* Store serialized List of PartitionPruneResult */
+	part_prune_results_space = shm_toc_allocate(pcxt->toc, part_prune_results_len);
+	memcpy(part_prune_results_space, part_prune_results_data, part_prune_results_len);
+	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARTITION_PRUNE_RESULTS,
+				   part_prune_results_space);
+
 	/* Store serialized ParamListInfo. */
 	paramlistinfo_space = shm_toc_allocate(pcxt->toc, paramlistinfo_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARAMLISTINFO, paramlistinfo_space);
@@ -1232,8 +1249,10 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 						 int instrument_options)
 {
 	char	   *pstmtspace;
+	char	   *part_prune_results_space;
 	char	   *paramspace;
 	PlannedStmt *pstmt;
+	List	   *part_prune_results;
 	ParamListInfo paramLI;
 	char	   *queryString;
 
@@ -1244,12 +1263,17 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 	pstmtspace = shm_toc_lookup(toc, PARALLEL_KEY_PLANNEDSTMT, false);
 	pstmt = (PlannedStmt *) stringToNode(pstmtspace);
 
+	/* Reconstruct leader-supplied PartitionPruneResult. */
+	part_prune_results_space =
+		shm_toc_lookup(toc, PARALLEL_KEY_PARTITION_PRUNE_RESULTS, false);
+	part_prune_results = (List *) stringToNode(part_prune_results_space);
+
 	/* Reconstruct ParamListInfo. */
 	paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
 	paramLI = RestoreParamList(&paramspace);
 
 	/* Create a QueryDesc for the query. */
-	return CreateQueryDesc(pstmt,
+	return CreateQueryDesc(pstmt, part_prune_results,
 						   queryString,
 						   GetActiveSnapshot(), InvalidSnapshot,
 						   receiver, paramLI, NULL, instrument_options);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 88d0ea3adb..bc8331a222 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -25,6 +25,7 @@
 #include "mb/pg_wchar.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
+#include "parser/parsetree.h"
 #include "partitioning/partbounds.h"
 #include "partitioning/partdesc.h"
 #include "partitioning/partprune.h"
@@ -185,7 +186,11 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
 static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
 static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
 static PartitionPruneState *CreatePartitionPruneState(PlanState *planstate,
-													  PartitionPruneInfo *pruneinfo);
+													  PartitionPruneInfo *pruneinfo,
+													  bool consider_initial_steps,
+													  bool consider_exec_steps,
+													  List *rtable, ExprContext *econtext,
+													  PartitionDirectory partdir);
 static void InitPartitionPruneContext(PartitionPruneContext *context,
 									  List *pruning_steps,
 									  PartitionDesc partdesc,
@@ -198,7 +203,8 @@ static void PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
 static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
 										   PartitionedRelPruningData *pprune,
 										   bool initial_prune,
-										   Bitmapset **validsubplans);
+										   Bitmapset **validsubplans,
+										   Bitmapset **scan_leafpart_rtis);
 
 
 /*
@@ -1749,8 +1755,10 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  * considered to be a stable expression, it can change value from one plan
  * node scan to the next during query execution.  Stable comparison
  * expressions that don't involve such Params allow partition pruning to be
- * done once during executor startup.  Expressions that do involve such Params
- * require us to prune separately for each scan of the parent plan node.
+ * done once during executor startup or during ExecutorDoInitialPruning() that
+ * runs as part of performing AcquireExecutorLocks() on a given plan tree.
+ * Expressions that do involve such Params require us to prune separately for
+ * each scan of the parent plan node.
  *
  * Note that pruning away unneeded subplans during executor startup has the
  * added benefit of not having to initialize the unneeded subplans at all.
@@ -1767,6 +1775,13 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *		account for initial pruning possibly having eliminated some of the
  *		subplans.
  *
+ * ExecPartitionDoInitialPruning:
+ *		Do initial pruning with the information contained in a given
+ *		PartitionPruneInfo to determine the minimal set of child subplans
+ *		to be executed of the parent plan node to which the PartitionPruneInfo
+ *		belongs and also the set of the RT indexes of leaf partitions that will
+ *		be scanned with those subplans.
+ *
  * ExecFindMatchingSubPlans:
  *		Returns indexes of matching subplans after evaluating the expressions
  *		that are safe to evaluate at a given point.  This function is first
@@ -1787,8 +1802,9 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *
  * On return, *initially_valid_subplans is assigned the set of indexes of
  * child subplans that must be initialized along with the parent plan node.
- * Initial pruning is performed here if needed and in that case only the
- * surviving subplans' indexes are added.
+ * Initial pruning is performed here if needed (unless it has already been done
+ * by ExecutorDoInitialPruning()), and in that case only the surviving
+ * subplans' indexes are added.
  *
  * If subplans are indeed pruned, subplan_map arrays contained in the returned
  * PartitionPruneState are re-sequenced to not count those, though only if the
@@ -1801,9 +1817,10 @@ ExecInitPartitionPruning(PlanState *planstate,
 						 Bitmapset *root_parent_relids,
 						 Bitmapset **initially_valid_subplans)
 {
-	PartitionPruneState *prunestate;
+	PartitionPruneState *prunestate = NULL;
 	EState	   *estate = planstate->state;
 	PartitionPruneInfo *pruneinfo;
+	PartitionPruneResult *pruneresult = NULL;
 
 	/* Obtain the pruneinfo we need, and make sure it's the right one */
 	pruneinfo = list_nth(estate->es_part_prune_infos, part_prune_index);
@@ -1819,20 +1836,56 @@ ExecInitPartitionPruning(PlanState *planstate,
 	/* We may need an expression context to evaluate partition exprs */
 	ExecAssignExprContext(estate, planstate);
 
-	/* Create the working data structure for pruning */
-	prunestate = CreatePartitionPruneState(planstate, pruneinfo);
+	/*
+	 * No need to do initial pruning if it was done already by
+	 * ExecutorDoInitialPruning(), which it would be if es_part_prune_results
+	 * is set.
+	 */
+	if (estate->es_part_prune_results)
+		pruneresult = list_nth_node(PartitionPruneResult,
+									estate->es_part_prune_results,
+									part_prune_index);
+
+	if (pruneresult == NULL || pruneinfo->needs_exec_pruning)
+	{
+		/* We may need an expression context to evaluate partition exprs */
+		ExecAssignExprContext(estate, planstate);
+
+		/* For data reading, executor always omits detached partitions */
+		if (estate->es_partition_directory == NULL)
+			estate->es_partition_directory =
+				CreatePartitionDirectory(estate->es_query_cxt, false);
+
+		/*
+		 * Create the working data structure for pruning.  No need to consider
+		 * initial pruning steps if we have a PartitionPruneResult.
+		 */
+		prunestate = CreatePartitionPruneState(planstate, pruneinfo,
+											   pruneresult == NULL,
+											   pruneinfo->needs_exec_pruning,
+											   NIL, planstate->ps_ExprContext,
+											   estate->es_partition_directory);
+	}
 
 	/*
 	 * Perform an initial partition prune pass, if required.
 	 */
-	if (prunestate->do_initial_prune)
-		*initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true);
+	if (pruneresult)
+	{
+		*initially_valid_subplans = bms_copy(pruneresult->valid_subplan_offs);
+	}
+	else if (prunestate && prunestate->do_initial_prune)
+	{
+		*initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true,
+															 NULL);
+	}
 	else
 	{
-		/* No pruning, so we'll need to initialize all subplans */
+		/* No initial pruning, so we'll need to initialize all subplans */
 		Assert(n_total_subplans > 0);
 		*initially_valid_subplans = bms_add_range(NULL, 0,
 												  n_total_subplans - 1);
+		return prunestate;
 	}
 
 	/*
@@ -1840,7 +1893,8 @@ ExecInitPartitionPruning(PlanState *planstate,
 	 * that were removed above due to initial pruning.  No need to do this if
 	 * no steps were removed.
 	 */
-	if (bms_num_members(*initially_valid_subplans) < n_total_subplans)
+	if (prunestate &&
+		bms_num_members(*initially_valid_subplans) < n_total_subplans)
 	{
 		/*
 		 * We can safely skip this when !do_exec_prune, even though that
@@ -1856,11 +1910,74 @@ ExecInitPartitionPruning(PlanState *planstate,
 	return prunestate;
 }
 
+/*
+ * ExecPartitionDoInitialPruning
+ *		Perform initial pruning using given PartitionPruneInfo to determine
+ *		the minimal set of child subplans that will be executed and also the
+ *		set of RT indexes of the leaf partitions scanned by those subplans.
+ */
+Bitmapset *
+ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
+							  PartitionPruneInfo *pruneinfo,
+							  Bitmapset **scan_leafpart_rtis)
+{
+	List		 *rtable = plannedstmt->rtable;
+	ExprContext	 *econtext;
+	PartitionDirectory pdir;
+	MemoryContext oldcontext,
+				  tmpcontext;
+	PartitionPruneState *prunestate;
+	Bitmapset	 *valid_subplan_offs;
+
+	/*
+	 * A temporary context for memory allocations required while executing
+	 * partition pruning steps.
+	 */
+	tmpcontext = AllocSetContextCreate(CurrentMemoryContext,
+									   "initial pruning working data",
+									   ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(tmpcontext);
+
+	/*
+	 * PartitionDirectory to look up partition descriptors.
+	 * Note that we don't omit detached partitions, just like during
+	 * execution proper.
+	 */
+	pdir = CreatePartitionDirectory(CurrentMemoryContext, false);
+
+	/*
+	 * We don't yet have a PlanState for the parent plan node, so we must
+	 * create a standalone ExprContext to evaluate pruning expressions,
+	 * equipped with the information about the EXTERN parameters that the
+	 * caller passed us.  Note that that's okay because the initial pruning
+	 * steps do not contain anything that requires the execution to have
+	 * started and thus need the information contained in a PlanState.
+	 */
+	econtext = CreateStandaloneExprContext();
+	econtext->ecxt_param_list_info = params;
+	prunestate = CreatePartitionPruneState(NULL, pruneinfo, true, false,
+										   rtable, econtext, pdir);
+	MemoryContextSwitchTo(oldcontext);
+
+	/* Do the initial pruning. */
+	valid_subplan_offs = ExecFindMatchingSubPlans(prunestate, true,
+												  scan_leafpart_rtis);
+
+	FreeExprContext(econtext, true);
+	DestroyPartitionDirectory(pdir);
+	MemoryContextDelete(tmpcontext);
+
+	return valid_subplan_offs;
+}
+
 /*
  * CreatePartitionPruneState
  *		Build the data structure required for calling ExecFindMatchingSubPlans
  *
- * 'planstate' is the parent plan node's execution state.
+ * 'planstate', if not NULL, is the parent plan node's execution state.  It
+ * can be NULL if being called before ExecutorStart(), in which case,
+ * 'rtable' (range table), 'econtext', and 'partdir' must be explicitly
+ * provided.
  *
  * 'pruneinfo' is a PartitionPruneInfo as generated by
  * make_partition_pruneinfo.  Here we build a PartitionPruneState containing a
@@ -1874,19 +1991,21 @@ ExecInitPartitionPruning(PlanState *planstate,
  * PartitionedRelPruneInfo.
  */
 static PartitionPruneState *
-CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
+CreatePartitionPruneState(PlanState *planstate,
+						  PartitionPruneInfo *pruneinfo,
+						  bool consider_initial_steps,
+						  bool consider_exec_steps,
+						  List *rtable, ExprContext *econtext,
+						  PartitionDirectory partdir)
 {
-	EState	   *estate = planstate->state;
+	EState	   *estate = planstate ? planstate->state : NULL;
 	PartitionPruneState *prunestate;
 	int			n_part_hierarchies;
 	ListCell   *lc;
 	int			i;
-	ExprContext *econtext = planstate->ps_ExprContext;
 
-	/* For data reading, executor always omits detached partitions */
-	if (estate->es_partition_directory == NULL)
-		estate->es_partition_directory =
-			CreatePartitionDirectory(estate->es_query_cxt, false);
+	Assert((estate != NULL) ||
+			(partdir != NULL && econtext != NULL && rtable != NIL));
 
 	n_part_hierarchies = list_length(pruneinfo->prune_infos);
 	Assert(n_part_hierarchies > 0);
@@ -1941,15 +2060,42 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 			PartitionKey partkey;
 
 			/*
-			 * We can rely on the copies of the partitioned table's partition
-			 * key and partition descriptor appearing in its relcache entry,
-			 * because that entry will be held open and locked for the
-			 * duration of this executor run.
+			 * Must open the relation by ourselves when called before the
+			 * execution has started, such as, when called during
+			 * ExecutorDoInitialPruning() on a cached plan.  In that case,
+			 * sub-partitions must be locked, because AcquirePlannerLocks()
+			 * would not have seen them. (1st relation in a partrelpruneinfos
+			 * list is always the root partitioned table appearing in the
+			 * query, which AcquirePlannerLocks() would have locked; the
+			 * Assert in relation_open() guards that assumption.)
+			 */
+			if (estate == NULL)
+			{
+				RangeTblEntry *rte = rt_fetch(pinfo->rtindex, rtable);
+				int		lockmode = (j == 0) ? NoLock : rte->rellockmode;
+
+				partrel = table_open(rte->relid, lockmode);
+			}
+			else
+				partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
+
+			/*
+			 * We can rely on the copy of the partitioned table's partition
+			 * key from its relcache entry, because it can't change (or
+			 * get destroyed) as long as the relation is locked.  Partition
+			 * descriptor is taken from the PartitionDirectory associated with
+			 * the table that is held open long enough for the descriptor to
+			 * remain valid while it's used to perform the pruning steps.
 			 */
-			partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
 			partkey = RelationGetPartitionKey(partrel);
-			partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
-												partrel);
+			partdesc = PartitionDirectoryLookup(partdir, partrel);
+
+			/*
+			 * Must close partrel, keeping the lock taken, if we're not using
+			 * EState's entry.
+			 */
+			if (estate == NULL)
+				table_close(partrel, NoLock);
 
 			/*
 			 * Initialize the subplan_map and subpart_map.
@@ -1963,6 +2109,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 			Assert(partdesc->nparts >= pinfo->nparts);
 			pprune->nparts = partdesc->nparts;
 			pprune->subplan_map = palloc(sizeof(int) * partdesc->nparts);
+			pprune->rti_map = palloc(sizeof(Index) * partdesc->nparts);
 			if (partdesc->nparts == pinfo->nparts)
 			{
 				/*
@@ -1973,6 +2120,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 				pprune->subpart_map = pinfo->subpart_map;
 				memcpy(pprune->subplan_map, pinfo->subplan_map,
 					   sizeof(int) * pinfo->nparts);
+				memcpy(pprune->rti_map, pinfo->rti_map,
+					   sizeof(Index) * pinfo->nparts);
 
 				/*
 				 * Double-check that the list of unpruned relations has not
@@ -2023,6 +2172,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 							pinfo->subplan_map[pd_idx];
 						pprune->subpart_map[pp_idx] =
 							pinfo->subpart_map[pd_idx];
+						pprune->rti_map[pp_idx] =
+							pinfo->rti_map[pd_idx];
 						pd_idx++;
 					}
 					else
@@ -2030,6 +2181,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 						/* this partdesc entry is not in the plan */
 						pprune->subplan_map[pp_idx] = -1;
 						pprune->subpart_map[pp_idx] = -1;
+						pprune->rti_map[pp_idx] = 0;
 					}
 				}
 
@@ -2051,7 +2203,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 			 * Initialize pruning contexts as needed.
 			 */
 			pprune->initial_pruning_steps = pinfo->initial_pruning_steps;
-			if (pinfo->initial_pruning_steps)
+			if (consider_initial_steps && pinfo->initial_pruning_steps)
 			{
 				InitPartitionPruneContext(&pprune->initial_context,
 										  pinfo->initial_pruning_steps,
@@ -2061,7 +2213,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 				prunestate->do_initial_prune = true;
 			}
 			pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
-			if (pinfo->exec_pruning_steps)
+			if (consider_exec_steps && pinfo->exec_pruning_steps)
 			{
 				InitPartitionPruneContext(&pprune->exec_context,
 										  pinfo->exec_pruning_steps,
@@ -2289,10 +2441,14 @@ PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
  * Pass initial_prune if PARAM_EXEC Params cannot yet be evaluated.  This
  * differentiates the initial executor-time pruning step from later
  * runtime pruning.
+ *
+ * RT indexes of leaf partitions scanned by the chosen subplans are added to
+ * *scan_leafpart_rtis if the pointer is non-NULL.
  */
 Bitmapset *
 ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
-						 bool initial_prune)
+						 bool initial_prune,
+						 Bitmapset **scan_leafpart_rtis)
 {
 	Bitmapset  *result = NULL;
 	MemoryContext oldcontext;
@@ -2327,7 +2483,7 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
 		 */
 		pprune = &prunedata->partrelprunedata[0];
 		find_matching_subplans_recurse(prunedata, pprune, initial_prune,
-									   &result);
+									   &result, scan_leafpart_rtis);
 
 		/* Expression eval may have used space in ExprContext too */
 		if (pprune->exec_pruning_steps)
@@ -2341,6 +2497,8 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
 
 	/* Copy result out of the temp context before we reset it */
 	result = bms_copy(result);
+	if (scan_leafpart_rtis)
+		*scan_leafpart_rtis = bms_copy(*scan_leafpart_rtis);
 
 	MemoryContextReset(prunestate->prune_context);
 
@@ -2351,13 +2509,15 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
  * find_matching_subplans_recurse
  *		Recursive worker function for ExecFindMatchingSubPlans
  *
- * Adds valid (non-prunable) subplan IDs to *validsubplans
+ * Adds valid (non-prunable) subplan IDs to *validsubplans and RT indexes of
+ * the corresponding leaf partitions to *scan_leafpart_rtis (if asked for).
  */
 static void
 find_matching_subplans_recurse(PartitionPruningData *prunedata,
 							   PartitionedRelPruningData *pprune,
 							   bool initial_prune,
-							   Bitmapset **validsubplans)
+							   Bitmapset **validsubplans,
+							   Bitmapset **scan_leafpart_rtis)
 {
 	Bitmapset  *partset;
 	int			i;
@@ -2384,8 +2544,14 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
 	while ((i = bms_next_member(partset, i)) >= 0)
 	{
 		if (pprune->subplan_map[i] >= 0)
+		{
 			*validsubplans = bms_add_member(*validsubplans,
 											pprune->subplan_map[i]);
+			Assert(pprune->rti_map[i] > 0);
+			if (scan_leafpart_rtis)
+				*scan_leafpart_rtis = bms_add_member(*scan_leafpart_rtis,
+													 pprune->rti_map[i]);
+		}
 		else
 		{
 			int			partidx = pprune->subpart_map[i];
@@ -2393,7 +2559,8 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
 			if (partidx >= 0)
 				find_matching_subplans_recurse(prunedata,
 											   &prunedata->partrelprunedata[partidx],
-											   initial_prune, validsubplans);
+											   initial_prune, validsubplans,
+											   scan_leafpart_rtis);
 			else
 			{
 				/*
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 572c87e453..044bf3f491 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -135,6 +135,7 @@ CreateExecutorState(void)
 	estate->es_param_exec_vals = NULL;
 
 	estate->es_queryEnv = NULL;
+	estate->es_part_prune_results = NIL;
 
 	estate->es_query_cxt = qcontext;
 
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index dc13625171..bffb42ce71 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -842,7 +842,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
 	else
 		dest = None_Receiver;
 
-	es->qd = CreateQueryDesc(es->stmt,
+	es->qd = CreateQueryDesc(es->stmt, NIL,
 							 fcache->src,
 							 GetActiveSnapshot(),
 							 InvalidSnapshot,
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 99830198bd..3b917584de 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -156,7 +156,8 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 		 * subplan, we can fill as_valid_subplans immediately, preventing
 		 * later calls to ExecFindMatchingSubPlans.
 		 */
-		if (!prunestate->do_exec_prune && nplans > 0)
+		if (appendstate->as_prune_state == NULL ||
+			(!appendstate->as_prune_state->do_exec_prune && nplans > 0))
 			appendstate->as_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
 	}
 	else
@@ -578,7 +579,7 @@ choose_next_subplan_locally(AppendState *node)
 		}
 		else if (node->as_valid_subplans == NULL)
 			node->as_valid_subplans =
-				ExecFindMatchingSubPlans(node->as_prune_state, false);
+				ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 
 		whichplan = -1;
 	}
@@ -643,7 +644,7 @@ choose_next_subplan_for_leader(AppendState *node)
 		if (node->as_valid_subplans == NULL)
 		{
 			node->as_valid_subplans =
-				ExecFindMatchingSubPlans(node->as_prune_state, false);
+				ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 
 			/*
 			 * Mark each invalid plan as finished to allow the loop below to
@@ -718,7 +719,7 @@ choose_next_subplan_for_worker(AppendState *node)
 	else if (node->as_valid_subplans == NULL)
 	{
 		node->as_valid_subplans =
-			ExecFindMatchingSubPlans(node->as_prune_state, false);
+			ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 		mark_invalid_subplans_as_finished(node);
 	}
 
@@ -869,7 +870,7 @@ ExecAppendAsyncBegin(AppendState *node)
 	if (node->as_valid_subplans == NULL)
 	{
 		node->as_valid_subplans =
-			ExecFindMatchingSubPlans(node->as_prune_state, false);
+			ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 
 		classify_matching_subplans(node);
 	}
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index f370f9f287..ccfa083945 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -104,7 +104,8 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
 		 * subplan, we can fill ms_valid_subplans immediately, preventing
 		 * later calls to ExecFindMatchingSubPlans.
 		 */
-		if (!prunestate->do_exec_prune && nplans > 0)
+		if (mergestate->ms_prune_state == NULL ||
+			(!mergestate->ms_prune_state->do_exec_prune && nplans > 0))
 			mergestate->ms_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
 	}
 	else
@@ -219,7 +220,7 @@ ExecMergeAppend(PlanState *pstate)
 		 */
 		if (node->ms_valid_subplans == NULL)
 			node->ms_valid_subplans =
-				ExecFindMatchingSubPlans(node->ms_prune_state, false);
+				ExecFindMatchingSubPlans(node->ms_prune_state, false, NULL);
 
 		/*
 		 * First time through: pull the first tuple from each valid subplan,
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index fd5796f1b9..93012a5b3b 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1578,6 +1578,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 	CachedPlanSource *plansource;
 	CachedPlan *cplan;
 	List	   *stmt_list;
+	List	   *part_prune_results_list;
 	char	   *query_string;
 	Snapshot	snapshot;
 	MemoryContext oldcontext;
@@ -1657,7 +1658,10 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 	 */
 
 	/* Replan if needed, and increment plan refcount for portal */
-	cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv);
+	cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv,
+						  &part_prune_results_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_results_list));
 	stmt_list = cplan->stmt_list;
 
 	if (!plan->saved)
@@ -1685,6 +1689,9 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 					  stmt_list,
 					  cplan);
 
+	/* Copy Lists of PartitionPruneResults into the portal's context. */
+	PortalStorePartitionPruneResults(portal, part_prune_results_list);
+
 	/*
 	 * Set up options for portal.  Default SCROLL type is chosen the same way
 	 * as PerformCursorOpen does it.
@@ -2092,7 +2099,8 @@ SPI_plan_get_cached_plan(SPIPlanPtr plan)
 	/* Get the generic plan for the query */
 	cplan = GetCachedPlan(plansource, NULL,
 						  plan->saved ? CurrentResourceOwner : NULL,
-						  _SPI_current->queryEnv);
+						  _SPI_current->queryEnv,
+						  NULL /* Not interested in PartitionPruneResults */);
 	Assert(cplan == plansource->gplan);
 
 	/* Pop the error context stack */
@@ -2473,7 +2481,9 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 	{
 		CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
 		List	   *stmt_list;
-		ListCell   *lc2;
+		List	   *part_prune_results_list;
+		ListCell   *lc2,
+				   *lc3;
 
 		spicallbackarg.query = plansource->query_string;
 
@@ -2549,8 +2559,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 		 * plan, the refcount must be backed by the plan_owner.
 		 */
 		cplan = GetCachedPlan(plansource, options->params,
-							  plan_owner, _SPI_current->queryEnv);
-
+							  plan_owner, _SPI_current->queryEnv,
+							  &part_prune_results_list);
+		Assert(list_length(cplan->stmt_list) ==
+			   list_length(part_prune_results_list));
 		stmt_list = cplan->stmt_list;
 
 		/*
@@ -2589,9 +2601,10 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 			}
 		}
 
-		foreach(lc2, stmt_list)
+		forboth(lc2, stmt_list, lc3, part_prune_results_list)
 		{
 			PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+			List	   *part_prune_results = lfirst_node(List, lc3);
 			bool		canSetTag = stmt->canSetTag;
 			DestReceiver *dest;
 
@@ -2663,7 +2676,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 				else
 					snap = InvalidSnapshot;
 
-				qdesc = CreateQueryDesc(stmt,
+				qdesc = CreateQueryDesc(stmt, part_prune_results,
 										plansource->query_string,
 										snap, crosscheck_snapshot,
 										dest,
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 23776367c5..b01f55fb4f 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -158,6 +158,11 @@
 	token = pg_strtok(&length);		/* skip :fldname */ \
 	local_node->fldname = readIntCols(len)
 
+/* Read an Index array */
+#define READ_INDEX_ARRAY(fldname, len) \
+	token = pg_strtok(&length);		/* skip :fldname */ \
+	local_node->fldname = readIndexCols(len)
+
 /* Read a bool array */
 #define READ_BOOL_ARRAY(fldname, len) \
 	token = pg_strtok(&length);		/* skip :fldname */ \
@@ -800,7 +805,6 @@ fnname(int numCols) \
  */
 READ_SCALAR_ARRAY(readAttrNumberCols, int16, atoi)
 READ_SCALAR_ARRAY(readOidCols, Oid, atooid)
-/* outfuncs.c has writeIndexCols, but we don't yet need that here */
-/* READ_SCALAR_ARRAY(readIndexCols, Index, atoui) */
+READ_SCALAR_ARRAY(readIndexCols, Index, atoui)
 READ_SCALAR_ARRAY(readIntCols, int, atoi)
 READ_SCALAR_ARRAY(readBoolCols, bool, strtobool)
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 799602f5ea..a96d316dca 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -520,7 +520,9 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 	result->parallelModeNeeded = glob->parallelModeNeeded;
 	result->planTree = top_plan;
 	result->partPruneInfos = glob->partPruneInfos;
+	result->containsInitialPruning = glob->containsInitialPruning;
 	result->rtable = glob->finalrtable;
+	result->minLockRelids = glob->minLockRelids;
 	result->resultRelations = glob->resultRelations;
 	result->appendRelations = glob->appendRelations;
 	result->subplans = glob->subplans;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 399c1812d4..44ffe71c49 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -270,6 +270,16 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 	 */
 	add_rtes_to_flat_rtable(root, false);
 
+	/*
+	 * Add the query's adjusted range of RT indexes to glob->minLockRelids.
+	 * The adjusted RT indexes of prunable relations will be deleted from the
+	 * set below where PartitionPruneInfos are processed.
+	 */
+	glob->minLockRelids =
+		bms_add_range(glob->minLockRelids,
+					  rtoffset + 1,
+					  rtoffset + list_length(root->parse->rtable));
+
 	/*
 	 * Adjust RT indexes of PlanRowMarks and add to final rowmarks list
 	 */
@@ -353,6 +363,7 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 	{
 		PartitionPruneInfo *pruneinfo = lfirst(lc);
 		ListCell   *l;
+		Bitmapset  *leafpart_rtis = NULL;
 
 		pruneinfo->root_parent_relids =
 			offset_relid_set(pruneinfo->root_parent_relids, rtoffset);
@@ -364,15 +375,50 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 			foreach(l2, prune_infos)
 			{
 				PartitionedRelPruneInfo *pinfo = lfirst(l2);
+				int		i;
 
 				/* RT index of the table to which the pinfo belongs. */
 				pinfo->rtindex += rtoffset;
+
+				/* Likewise for the leaf partitions that might be scanned. */
+				for (i = 0; i < pinfo->nparts; i++)
+				{
+					if (pinfo->rti_map[i] > 0 && pinfo->subplan_map[i] >= 0)
+					{
+						pinfo->rti_map[i] += rtoffset;
+						leafpart_rtis = bms_add_member(leafpart_rtis,
+													   pinfo->rti_map[i]);
+					}
+				}
 			}
 		}
 
+		if (pruneinfo->needs_init_pruning)
+		{
+			glob->containsInitialPruning = true;
+
+			/*
+			 * Delete the leaf partition RTIs from the global set of relations
+			 * to be locked before executing the plan.  AcquireExecutorLocks()
+			 * will find the ones to add to the set after performing initial
+			 * pruning.
+			 */
+			glob->minLockRelids = bms_del_members(glob->minLockRelids,
+												  leafpart_rtis);
+		}
+
 		glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
 	}
 
+	/*
+	 * It seems worth doing a bms_copy() on glob->minLockRelids if we deleted
+	 * bits from it above to get rid of any empty tail bits.  It seems better
+	 * for the loop over this set in AcquireExecutorLocks() to not have to go
+	 * through those useless bit words.
+	 */
+	if (glob->containsInitialPruning)
+		glob->minLockRelids = bms_copy(glob->minLockRelids);
+
 	return result;
 }
 
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index d48f6784c1..d5556354f7 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -144,7 +144,9 @@ static List *make_partitionedrel_pruneinfo(PlannerInfo *root,
 										   List *prunequal,
 										   Bitmapset *partrelids,
 										   int *relid_subplan_map,
-										   Bitmapset **matchedsubplans);
+										   Bitmapset **matchedsubplans,
+										   bool *needs_init_pruning,
+										   bool *needs_exec_pruning);
 static void gen_partprune_steps(RelOptInfo *rel, List *clauses,
 								PartClauseTarget target,
 								GeneratePruningStepsContext *context);
@@ -234,6 +236,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	int		   *relid_subplan_map;
 	ListCell   *lc;
 	int			i;
+	bool		needs_init_pruning = false;
+	bool		needs_exec_pruning = false;
 
 	/*
 	 * Scan the subpaths to see which ones are scans of partition child
@@ -313,12 +317,16 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		Bitmapset  *partrelids = (Bitmapset *) lfirst(lc);
 		List	   *pinfolist;
 		Bitmapset  *matchedsubplans = NULL;
+		bool		partrel_needs_init_pruning;
+		bool		partrel_needs_exec_pruning;
 
 		pinfolist = make_partitionedrel_pruneinfo(root, parentrel,
 												  prunequal,
 												  partrelids,
 												  relid_subplan_map,
-												  &matchedsubplans);
+												  &matchedsubplans,
+												  &partrel_needs_init_pruning,
+												  &partrel_needs_exec_pruning);
 
 		/* When pruning is possible, record the matched subplans */
 		if (pinfolist != NIL)
@@ -327,6 +335,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 			allmatchedsubplans = bms_join(matchedsubplans,
 										  allmatchedsubplans);
 		}
+
+		needs_init_pruning |= partrel_needs_init_pruning;
+		needs_exec_pruning |= partrel_needs_exec_pruning;
 	}
 
 	pfree(relid_subplan_map);
@@ -342,6 +353,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	pruneinfo = makeNode(PartitionPruneInfo);
 	pruneinfo->root_parent_relids = parentrel->relids;
 	pruneinfo->prune_infos = prunerelinfos;
+	pruneinfo->needs_init_pruning = needs_init_pruning;
+	pruneinfo->needs_exec_pruning = needs_exec_pruning;
 
 	/*
 	 * Some subplans may not belong to any of the identified partitioned rels.
@@ -442,13 +455,18 @@ add_part_relids(List *allpartrelids, Bitmapset *partrelids)
  * If we cannot find any useful run-time pruning steps, return NIL.
  * However, on success, each rel identified in partrelids will have
  * an element in the result list, even if some of them are useless.
+ * *needs_init_pruning and *needs_exec_pruning are set to indicate that the
+ * returned PartitionedRelPruneInfos contain pruning steps that can be
+ * performed before and after execution begins, respectively.
  */
 static List *
 make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 							  List *prunequal,
 							  Bitmapset *partrelids,
 							  int *relid_subplan_map,
-							  Bitmapset **matchedsubplans)
+							  Bitmapset **matchedsubplans,
+							  bool *needs_init_pruning,
+							  bool *needs_exec_pruning)
 {
 	RelOptInfo *targetpart = NULL;
 	List	   *pinfolist = NIL;
@@ -459,6 +477,10 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	int			rti;
 	int			i;
 
+	/* Will find out below. */
+	*needs_init_pruning = false;
+	*needs_exec_pruning = false;
+
 	/*
 	 * Examine each partitioned rel, constructing a temporary array to map
 	 * from planner relids to index of the partitioned rel, and building a
@@ -546,6 +568,9 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		 * executor per-scan pruning steps.  This first pass creates startup
 		 * pruning steps and detects whether there's any possibly-useful quals
 		 * that would require per-scan pruning.
+		 *
+		 * The first pass also determines whether the 2nd pass is necessary
+		 * by checking for the presence of EXEC parameters.
 		 */
 		gen_partprune_steps(subpart, partprunequal, PARTTARGET_INITIAL,
 							&context);
@@ -620,6 +645,12 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		pinfo->execparamids = execparamids;
 		/* Remaining fields will be filled in the next loop */
 
+		/* record which types of pruning steps we've seen so far */
+		if (initial_pruning_steps != NIL)
+			*needs_init_pruning = true;
+		if (exec_pruning_steps != NIL)
+			*needs_exec_pruning = true;
+
 		pinfolist = lappend(pinfolist, pinfo);
 	}
 
@@ -647,6 +678,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		int		   *subplan_map;
 		int		   *subpart_map;
 		Oid		   *relid_map;
+		Index	   *rti_map;
 
 		/*
 		 * Construct the subplan and subpart maps for this partitioning level.
@@ -659,6 +691,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		subpart_map = (int *) palloc(nparts * sizeof(int));
 		memset(subpart_map, -1, nparts * sizeof(int));
 		relid_map = (Oid *) palloc0(nparts * sizeof(Oid));
+		rti_map = (Index *) palloc0(nparts * sizeof(Index));
 		present_parts = NULL;
 
 		i = -1;
@@ -673,6 +706,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 			subplan_map[i] = subplanidx = relid_subplan_map[partrel->relid] - 1;
 			subpart_map[i] = subpartidx = relid_subpart_map[partrel->relid] - 1;
 			relid_map[i] = planner_rt_fetch(partrel->relid, root)->relid;
+			rti_map[i] = partrel->relid;
 			if (subplanidx >= 0)
 			{
 				present_parts = bms_add_member(present_parts, i);
@@ -697,6 +731,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		pinfo->subplan_map = subplan_map;
 		pinfo->subpart_map = subpart_map;
 		pinfo->relid_map = relid_map;
+		pinfo->rti_map = rti_map;
 	}
 
 	pfree(relid_subpart_map);
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 3082093d1e..95ab1d0eef 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1598,6 +1598,7 @@ exec_bind_message(StringInfo input_message)
 	int16	   *rformats = NULL;
 	CachedPlanSource *psrc;
 	CachedPlan *cplan;
+	List	   *part_prune_results_list;
 	Portal		portal;
 	char	   *query_string;
 	char	   *saved_stmt_name;
@@ -1972,7 +1973,9 @@ exec_bind_message(StringInfo input_message)
 	 * will be generated in MessageContext.  The plan refcount will be
 	 * assigned to the Portal, so it will be released at portal destruction.
 	 */
-	cplan = GetCachedPlan(psrc, params, NULL, NULL);
+	cplan = GetCachedPlan(psrc, params, NULL, NULL, &part_prune_results_list);
+	Assert(list_length(cplan->stmt_list) ==
+		   list_length(part_prune_results_list));
 
 	/*
 	 * Now we can define the portal.
@@ -1987,6 +1990,9 @@ exec_bind_message(StringInfo input_message)
 					  cplan->stmt_list,
 					  cplan);
 
+	/* Copy Lists of PartitionPruneResults into the portal's context. */
+	PortalStorePartitionPruneResults(portal, part_prune_results_list);
+
 	/* Done with the snapshot used for parameter I/O and parsing/planning */
 	if (snapshot_set)
 		PopActiveSnapshot();
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 52e2db6452..f582ff177b 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -35,7 +35,7 @@
 Portal		ActivePortal = NULL;
 
 
-static void ProcessQuery(PlannedStmt *plan,
+static void ProcessQuery(PlannedStmt *plan, List *part_prune_results,
 						 const char *sourceText,
 						 ParamListInfo params,
 						 QueryEnvironment *queryEnv,
@@ -65,6 +65,7 @@ static void DoPortalRewind(Portal portal);
  */
 QueryDesc *
 CreateQueryDesc(PlannedStmt *plannedstmt,
+				List *part_prune_results,
 				const char *sourceText,
 				Snapshot snapshot,
 				Snapshot crosscheck_snapshot,
@@ -77,6 +78,8 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
 
 	qd->operation = plannedstmt->commandType;	/* operation */
 	qd->plannedstmt = plannedstmt;	/* plan */
+	qd->part_prune_results = part_prune_results; /* ExecutorDoInitialPruning()
+												  * output for plan */
 	qd->sourceText = sourceText;	/* query text */
 	qd->snapshot = RegisterSnapshot(snapshot);	/* snapshot */
 	/* RI check snapshot */
@@ -122,6 +125,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  *		PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
  *
  *	plan: the plan tree for the query
+ *	part_prune_results: ExecutorDoInitialPruning() output for the PlannedStmt
  *	sourceText: the source text of the query
  *	params: any parameters needed
  *	dest: where to send results
@@ -134,6 +138,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  */
 static void
 ProcessQuery(PlannedStmt *plan,
+			 List *part_prune_results,
 			 const char *sourceText,
 			 ParamListInfo params,
 			 QueryEnvironment *queryEnv,
@@ -145,7 +150,7 @@ ProcessQuery(PlannedStmt *plan,
 	/*
 	 * Create the QueryDesc object
 	 */
-	queryDesc = CreateQueryDesc(plan, sourceText,
+	queryDesc = CreateQueryDesc(plan, part_prune_results, sourceText,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, params, queryEnv, 0);
 
@@ -491,8 +496,13 @@ PortalStart(Portal portal, ParamListInfo params,
 				/*
 				 * Create QueryDesc in portal's context; for the moment, set
 				 * the destination to DestNone.
+				 *
+				 * There is no PartitionPruneResult unless the PlannedStmt is
+				 * from a CachedPlan.
 				 */
 				queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+											portal->part_prune_results_list == NIL ? NIL :
+											linitial(portal->part_prune_results_list),
 											portal->sourceText,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
@@ -1225,6 +1235,8 @@ PortalRunMulti(Portal portal,
 
 		if (pstmt->utilityStmt == NULL)
 		{
+			List *part_prune_results = NIL;
+
 			/*
 			 * process a plannable query.
 			 */
@@ -1271,10 +1283,19 @@ PortalRunMulti(Portal portal,
 			else
 				UpdateActiveSnapshotCommandId();
 
+			/*
+			 * Determine if there's a corresponding List of PartitionPruneResult
+			 * for this PlannedStmt.
+			 */
+			if (portal->part_prune_results_list != NIL)
+				part_prune_results = list_nth_node(List,
+												   portal->part_prune_results_list,
+												   foreach_current_index(stmtlist_item));
+
 			if (pstmt->canSetTag)
 			{
 				/* statement can set tag string */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, part_prune_results,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
@@ -1283,7 +1304,7 @@ PortalRunMulti(Portal portal,
 			else
 			{
 				/* stmt added by rewrite cannot set tag */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, part_prune_results,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index cc943205d3..8ff42153a1 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -99,14 +99,19 @@ static dlist_head cached_expression_list = DLIST_STATIC_INIT(cached_expression_l
 static void ReleaseGenericPlan(CachedPlanSource *plansource);
 static List *RevalidateCachedQuery(CachedPlanSource *plansource,
 								   QueryEnvironment *queryEnv);
-static bool CheckCachedPlan(CachedPlanSource *plansource);
+static bool CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+							List **part_prune_results_list);
 static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
-								   ParamListInfo boundParams, QueryEnvironment *queryEnv);
+								   ParamListInfo boundParams, QueryEnvironment *queryEnv,
+								   List **part_prune_results_list);
 static bool choose_custom_plan(CachedPlanSource *plansource,
 							   ParamListInfo boundParams);
 static double cached_plan_cost(CachedPlan *plan, bool include_planner);
 static Query *QueryListGetPrimaryStmt(List *stmts);
-static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static void AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+								 List **part_prune_results_list,
+								 List **lockedRelids_per_stmt);
+static void ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt);
 static void AcquirePlannerLocks(List *stmt_list, bool acquire);
 static void ScanQueryForLocks(Query *parsetree, bool acquire);
 static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -782,6 +787,26 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
 	return tlist;
 }
 
+/*
+ * FreePartitionPruneResults
+ *		Frees the List of Lists of PartitionPruneResults for CheckCachedPlan()
+ */
+static void
+FreePartitionPruneResults(List *part_prune_results_list)
+{
+	ListCell *lc;
+
+	foreach(lc, part_prune_results_list)
+	{
+		List *part_prune_results = lfirst_node(List, lc);
+
+		/* Free both the PartitionPruneResults and the containing List. */
+		list_free_deep(part_prune_results);
+	}
+
+	list_free(part_prune_results_list);
+}
+
 /*
  * CheckCachedPlan: see if the CachedPlanSource's generic plan is valid.
  *
@@ -790,15 +815,20 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
  *
  * On a "true" return, we have acquired the locks needed to run the plan.
  * (We must do this for the "true" result to be race-condition-free.)
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_results_list.
  */
 static bool
-CheckCachedPlan(CachedPlanSource *plansource)
+CheckCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
+				List **part_prune_results_list)
 {
 	CachedPlan *plan = plansource->gplan;
 
 	/* Assert that caller checked the querytree */
 	Assert(plansource->is_valid);
 
+	*part_prune_results_list = NIL;
+
 	/* If there's no generic plan, just say "false" */
 	if (!plan)
 		return false;
@@ -820,13 +850,21 @@ CheckCachedPlan(CachedPlanSource *plansource)
 	 */
 	if (plan->is_valid)
 	{
+		List   *lockedRelids_per_stmt;
+
 		/*
 		 * Plan must have positive refcount because it is referenced by
 		 * plansource; so no need to fear it disappears under us here.
 		 */
 		Assert(plan->refcount > 0);
 
-		AcquireExecutorLocks(plan->stmt_list, true);
+		/*
+		 * Lock relations scanned by the plan.  This is also where "initial"
+		 * pruning is performed, if the plan requires it.
+		 */
+		AcquireExecutorLocks(plan->stmt_list, boundParams,
+							 part_prune_results_list,
+							 &lockedRelids_per_stmt);
 
 		/*
 		 * If plan was transient, check to see if TransactionXmin has
@@ -848,7 +886,11 @@ CheckCachedPlan(CachedPlanSource *plansource)
 		}
 
 		/* Oops, the race case happened.  Release useless locks. */
-		AcquireExecutorLocks(plan->stmt_list, false);
+		ReleaseExecutorLocks(plan->stmt_list, lockedRelids_per_stmt);
+
+		/* Release any PartitionPruneResults that may have been created. */
+		FreePartitionPruneResults(*part_prune_results_list);
+		*part_prune_results_list = NIL;
 	}
 
 	/*
@@ -874,10 +916,14 @@ CheckCachedPlan(CachedPlanSource *plansource)
  * Planning work is done in the caller's memory context.  The finished plan
  * is in a child memory context, which typically should get reparented
  * (unless this is a one-shot plan, in which case we don't copy the plan).
+ *
+ * A list of NILs is returned in *part_prune_results_list, meaning that no
+ * partition pruning has been done yet for the plans in stmt_list.
  */
 static CachedPlan *
 BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
-				ParamListInfo boundParams, QueryEnvironment *queryEnv)
+				ParamListInfo boundParams, QueryEnvironment *queryEnv,
+				List **part_prune_results_list)
 {
 	CachedPlan *plan;
 	List	   *plist;
@@ -1007,6 +1053,17 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 
 	MemoryContextSwitchTo(oldcxt);
 
+	/*
+	 * No actual PartitionPruneResults yet to add, though must initialize
+	 * the list to have the same number of elements as the list of
+	 * PlannedStmts.
+	 */
+	*part_prune_results_list = NIL;
+	foreach(lc, plist)
+	{
+		*part_prune_results_list = lappend(*part_prune_results_list, NIL);
+	}
+
 	return plan;
 }
 
@@ -1126,6 +1183,19 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
  * plan or a custom plan for the given parameters: the caller does not know
  * which it will get.
  *
+ * For every PlannedStmt found in the returned CachedPlan, an element that
+ * is either a List of PartitionPruneResult or NIL is added to
+ * *part_prune_results_list.  It is the former if the PlannedStmt is from
+ * an existing CachedPlan that is otherwise valid and has
+ * containsInitialPruning set to true.  Before returning such a CachedPlan,
+ * the "initial" pruning steps are performed by calling
+ * ExecutorDoInitialPruning(), so that AcquireExecutorLocks() locks only
+ * those leaf partitions whose subplans match the "initial" pruning
+ * conditions.  For each PartitionPruneInfo found in
+ * PlannedStmt.partPruneInfos, a PartitionPruneResult containing the bitmapset
+ * of the indexes of surviving subplans is added to the List for the
+ * PlannedStmt.
+ *
  * On return, the plan is valid and we have sufficient locks to begin
  * execution.
  *
@@ -1139,11 +1209,13 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
  */
 CachedPlan *
 GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
-			  ResourceOwner owner, QueryEnvironment *queryEnv)
+			  ResourceOwner owner, QueryEnvironment *queryEnv,
+			  List **part_prune_results_list)
 {
 	CachedPlan *plan = NULL;
 	List	   *qlist;
 	bool		customplan;
+	List	   *my_part_prune_results_list;
 
 	/* Assert caller is doing things in a sane order */
 	Assert(plansource->magic == CACHEDPLANSOURCE_MAGIC);
@@ -1160,7 +1232,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 
 	if (!customplan)
 	{
-		if (CheckCachedPlan(plansource))
+		if (CheckCachedPlan(plansource, boundParams,
+							&my_part_prune_results_list))
 		{
 			/* We want a generic plan, and we already have a valid one */
 			plan = plansource->gplan;
@@ -1169,7 +1242,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 		else
 		{
 			/* Build a new generic plan */
-			plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv);
+			plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv,
+								   &my_part_prune_results_list);
 			/* Just make real sure plansource->gplan is clear */
 			ReleaseGenericPlan(plansource);
 			/* Link the new generic plan into the plansource */
@@ -1214,7 +1288,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 	if (customplan)
 	{
 		/* Build a custom plan */
-		plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv);
+		plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv,
+							   &my_part_prune_results_list);
 		/* Accumulate total costs of custom plans */
 		plansource->total_custom_cost += cached_plan_cost(plan, true);
 
@@ -1246,6 +1321,9 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 		plan->is_saved = true;
 	}
 
+	if (part_prune_results_list)
+		*part_prune_results_list = my_part_prune_results_list;
+
 	return plan;
 }
 
@@ -1737,17 +1815,29 @@ QueryListGetPrimaryStmt(List *stmts)
 
 /*
  * AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
- * or release them if acquire is false.
+ *
+ * See GetCachedPlan()'s comment for a description of part_prune_results_list.
+ *
+ * On return, *lockedRelids_per_stmt will contain a bitmapset for every
+ * PlannedStmt in stmt_list, containing the RT indexes of relation entries
+ * in its range table that were actually locked, or NULL if the PlannedStmt
+ * contains a utility statement.
  */
 static void
-AcquireExecutorLocks(List *stmt_list, bool acquire)
+AcquireExecutorLocks(List *stmt_list, ParamListInfo boundParams,
+					 List **part_prune_results_list,
+					 List **lockedRelids_per_stmt)
 {
 	ListCell   *lc1;
 
+	*part_prune_results_list = *lockedRelids_per_stmt = NIL;
 	foreach(lc1, stmt_list)
 	{
 		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
-		ListCell   *lc2;
+		List	   *part_prune_results = NIL;
+		Bitmapset  *allLockRelids;
+		Bitmapset  *lockedRelids = NULL;
+		int			rti;
 
 		if (plannedstmt->commandType == CMD_UTILITY)
 		{
@@ -1761,13 +1851,41 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 			Query	   *query = UtilityContainsQuery(plannedstmt->utilityStmt);
 
 			if (query)
-				ScanQueryForLocks(query, acquire);
+				ScanQueryForLocks(query, true);
+			*part_prune_results_list = lappend(*part_prune_results_list, NIL);
+			*lockedRelids_per_stmt = lappend(*lockedRelids_per_stmt, NULL);
 			continue;
 		}
 
-		foreach(lc2, plannedstmt->rtable)
+		/*
+		 * Figure out the set of relations that would need to be locked
+		 * before executing the plan.
+		 */
+		if (plannedstmt->containsInitialPruning)
 		{
-			RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
+			Bitmapset *scan_leafpart_rtis = NULL;
+
+			/*
+			 * Obtain the set of leaf partitions to be locked.
+			 *
+			 * The following does initial partition pruning using the
+			 * PartitionPruneInfos found in plannedstmt->partPruneInfos and
+			 * finds leaf partitions that survive that pruning across all the
+			 * nodes in the plan tree.
+			 */
+			part_prune_results = ExecutorDoInitialPruning(plannedstmt,
+														  boundParams,
+														  &scan_leafpart_rtis);
+			allLockRelids = bms_union(plannedstmt->minLockRelids,
+									  scan_leafpart_rtis);
+		}
+		else
+			allLockRelids = plannedstmt->minLockRelids;
+
+		rti = -1;
+		while ((rti = bms_next_member(allLockRelids, rti)) > 0)
+		{
+			RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
 
 			if (rte->rtekind != RTE_RELATION)
 				continue;
@@ -1778,10 +1896,59 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 			 * fail if it's been dropped entirely --- we'll just transiently
 			 * acquire a non-conflicting lock.
 			 */
-			if (acquire)
-				LockRelationOid(rte->relid, rte->rellockmode);
-			else
-				UnlockRelationOid(rte->relid, rte->rellockmode);
+			LockRelationOid(rte->relid, rte->rellockmode);
+			lockedRelids = bms_add_member(lockedRelids, rti);
+		}
+
+		*part_prune_results_list = lappend(*part_prune_results_list,
+										   part_prune_results);
+		*lockedRelids_per_stmt = lappend(*lockedRelids_per_stmt, lockedRelids);
+	}
+}
+
+/*
+ * ReleaseExecutorLocks
+ *		Release locks acquired by an earlier call to
+ *		AcquireExecutorLocks()
+ */
+static void
+ReleaseExecutorLocks(List *stmt_list, List *lockedRelids_per_stmt)
+{
+	ListCell   *lc1,
+			   *lc2;
+
+	forboth(lc1, stmt_list, lc2, lockedRelids_per_stmt)
+	{
+		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
+		Bitmapset  *lockedRelids = lfirst_node(Bitmapset, lc2);
+		int			rti;
+
+		if (plannedstmt->commandType == CMD_UTILITY)
+		{
+			/*
+			 * Ignore utility statements, except those (such as EXPLAIN) that
+			 * contain a parsed-but-not-planned query.  Note: it's okay to use
+			 * ScanQueryForLocks, even though the query hasn't been through
+			 * rule rewriting, because rewriting doesn't change the query
+			 * representation.
+			 */
+			Query	   *query = UtilityContainsQuery(plannedstmt->utilityStmt);
+
+			Assert(lockedRelids == NULL);
+			if (query)
+				ScanQueryForLocks(query, false);
+			continue;
+		}
+
+		rti = -1;
+		while ((rti = bms_next_member(lockedRelids, rti)) >= 0)
+		{
+			RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
+
+			Assert(rte->rtekind == RTE_RELATION);
+
+			/* See the comment in AcquireExecutorLocks(). */
+			UnlockRelationOid(rte->relid, rte->rellockmode);
 		}
 	}
 }
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index 7b1ae6fdcf..5b9098971b 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -303,6 +303,25 @@ PortalDefineQuery(Portal portal,
 	portal->status = PORTAL_DEFINED;
 }
 
+/*
+ * PortalStorePartitionPruneResults
+ *		Copy the given List of Lists of PartitionPruneResults into the
+ *		portal's context
+ *
+ * This allows the caller to ensure that the list exists as long as the portal
+ * does.
+ */
+void
+PortalStorePartitionPruneResults(Portal portal, List *part_prune_results_list)
+{
+	MemoryContext	oldcxt;
+
+	Assert(PortalIsValid(portal));
+	oldcxt = MemoryContextSwitchTo(portal->portalContext);
+	portal->part_prune_results_list = copyObject(part_prune_results_list);
+	MemoryContextSwitchTo(oldcxt);
+}
+
 /*
  * PortalReleaseCachedPlan
  *		Release a portal's reference to its cached plan, if any.
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 9ebde089ae..269cc4d562 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -87,7 +87,9 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
 							  ExplainState *es, const char *queryString,
 							  ParamListInfo params, QueryEnvironment *queryEnv);
 
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt,
+						   List *part_prune_results,
+						   IntoClause *into,
 						   ExplainState *es, const char *queryString,
 						   ParamListInfo params, QueryEnvironment *queryEnv,
 						   const instr_time *planduration,
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 17fabc18c9..4b98d0d2ef 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -45,6 +45,7 @@ extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
  * nparts						Length of subplan_map[] and subpart_map[].
  * subplan_map					Subplan index by partition index, or -1.
  * subpart_map					Subpart index by partition index, or -1.
+ * rti_map						Range table index by partition index, or 0.
  * present_parts				A Bitmapset of the partition indexes that we
  *								have subplans or subparts for.
  * initial_pruning_steps		List of PartitionPruneSteps used to
@@ -61,6 +62,7 @@ typedef struct PartitionedRelPruningData
 	int			nparts;
 	int		   *subplan_map;
 	int		   *subpart_map;
+	Index	   *rti_map;
 	Bitmapset  *present_parts;
 	List	   *initial_pruning_steps;
 	List	   *exec_pruning_steps;
@@ -127,5 +129,10 @@ extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
 													 Bitmapset *root_parent_relids,
 													 Bitmapset **initially_valid_subplans);
 extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
-										   bool initial_prune);
+										   bool initial_prune,
+										   Bitmapset **scan_leafpart_rtis);
+extern Bitmapset *ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt,
+								ParamListInfo params,
+								PartitionPruneInfo *pruneinfo,
+								Bitmapset **scan_leafpart_rtis);
 #endif							/* EXECPARTITION_H */
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index e79e2c001f..7d4379da7b 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -35,6 +35,8 @@ typedef struct QueryDesc
 	/* These fields are provided by CreateQueryDesc */
 	CmdType		operation;		/* CMD_SELECT, CMD_UPDATE, etc. */
 	PlannedStmt *plannedstmt;	/* planner's output (could be utility, too) */
+	List		*part_prune_results; /* ExecutorDoInitialPruning()'s
+									  * output for plannedstmt */
 	const char *sourceText;		/* source text of the query */
 	Snapshot	snapshot;		/* snapshot to use for query */
 	Snapshot	crosscheck_snapshot;	/* crosscheck for RI update/delete */
@@ -57,6 +59,7 @@ typedef struct QueryDesc
 
 /* in pquery.c */
 extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
+								  List *part_prune_results,
 								  const char *sourceText,
 								  Snapshot snapshot,
 								  Snapshot crosscheck_snapshot,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index aaf2bc78b9..32bbbc5927 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -185,6 +185,9 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
 /*
  * prototypes from functions in execMain.c
  */
+extern List *ExecutorDoInitialPruning(PlannedStmt *plannedstmt,
+									  ParamListInfo params,
+									  Bitmapset **scan_leafpart_rtis);
 extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
 extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
 extern void ExecutorRun(QueryDesc *queryDesc,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 71248a9466..9c6e8f5e13 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -619,6 +619,7 @@ typedef struct EState
 										 * ExecRowMarks, or NULL if none */
 	PlannedStmt *es_plannedstmt;	/* link to top of plan tree */
 	List		*es_part_prune_infos;	/* PlannedStmt.partPruneInfos */
+	List		*es_part_prune_results; /* QueryDesc.part_prune_results */
 	const char *es_sourceText;	/* Source text from QueryDesc */
 
 	JunkFilter *es_junkFilter;	/* top-level junk filter, if any */
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 1f33902947..c2f2544df5 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -218,6 +218,7 @@ extern struct Bitmapset *readBitmapset(void);
 extern uintptr_t readDatum(bool typbyval);
 extern bool *readBoolCols(int numCols);
 extern int *readIntCols(int numCols);
+extern Index *readIndexCols(int numCols);
 extern Oid *readOidCols(int numCols);
 extern int16 *readAttrNumberCols(int numCols);
 
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index dbaa9bb54d..e0e5c15b09 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -125,6 +125,18 @@ typedef struct PlannerGlobal
 	/* List of PartitionPruneInfo contained in the plan */
 	List	   *partPruneInfos;
 
+	/*
+	 * Do any of those PartitionPruneInfos have initial pruning steps in them?
+	 */
+	bool		containsInitialPruning;
+
+	/*
+	 * Indexes of all range table entries minus indexes of range table entries
+	 * of the leaf partitions scanned by prunable subplans; see
+	 * AcquireExecutorLocks()
+	 */
+	Bitmapset  *minLockRelids;
+
 	/* OIDs of relations the plan depends on */
 	List	   *relationOids;
 
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index c36a15bd09..714e2cf2c7 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -73,8 +73,17 @@ typedef struct PlannedStmt
 	List	   *partPruneInfos; /* List of PartitionPruneInfo contained in the
 								 * plan */
 
+	bool		containsInitialPruning;	/* Do any of those PartitionPruneInfos
+										 * have initial pruning steps in them?
+										 */
+
 	List	   *rtable;			/* list of RangeTblEntry nodes */
 
+	Bitmapset  *minLockRelids;	/* Indexes of all range table entries minus
+								 * indexes of range table entries of the leaf
+								 * partitions scanned by prunable subplans;
+								 * see AcquireExecutorLocks() */
+
 	/* rtable indexes of target relations for INSERT/UPDATE/DELETE/MERGE */
 	List	   *resultRelations;	/* integer list of RT indexes, or NIL */
 
@@ -1414,6 +1423,13 @@ typedef struct PlanRowMark
  * prune_infos			List of Lists containing PartitionedRelPruneInfo nodes,
  *						one sublist per run-time-prunable partition hierarchy
  *						appearing in the parent plan node's subplans.
+ *
+ * needs_init_pruning	Does any of the PartitionedRelPruneInfos in
+ *						prune_infos have its initial_pruning_steps set?
+ *
+ * needs_exec_pruning	Does any of the PartitionedRelPruneInfos in
+ *						prune_infos have its exec_pruning_steps set?
+ *
  * other_subplans		Indexes of any subplans that are not accounted for
  *						by any of the PartitionedRelPruneInfo nodes in
  *						"prune_infos".  These subplans must not be pruned.
@@ -1425,6 +1441,8 @@ typedef struct PartitionPruneInfo
 	NodeTag		type;
 	Bitmapset  *root_parent_relids;
 	List	   *prune_infos;
+	bool		needs_init_pruning;
+	bool		needs_exec_pruning;
 	Bitmapset  *other_subplans;
 } PartitionPruneInfo;
 
@@ -1469,6 +1487,9 @@ typedef struct PartitionedRelPruneInfo
 	/* relation OID by partition index, or 0 */
 	Oid		   *relid_map pg_node_attr(array_size(nparts));
 
+	/* Range table index by partition index, or 0. */
+	Index	   *rti_map pg_node_attr(array_size(nparts));
+
 	/*
 	 * initial_pruning_steps shows how to prune during executor startup (i.e.,
 	 * without use of any PARAM_EXEC Params); it is NIL if no startup pruning
@@ -1553,6 +1574,31 @@ typedef struct PartitionPruneStepCombine
 	List	   *source_stepids;
 } PartitionPruneStepCombine;
 
+/*----------------
+ * PartitionPruneResult
+ *
+ * The result of performing ExecPartitionDoInitialPruning() on a given
+ * PartitionPruneInfo.
+ *
+ * valid_subplan_offs contains the indexes of subplans remaining after
+ * performing initial pruning by calling ExecFindMatchingSubPlans() on the
+ * PartitionPruneInfo.
+ *
+ * This is used to store the result of initial partition pruning that is
+ * performed before the execution has started.  A module that needs to do so
+ * should call ExecutorDoInitialPruning() on a given PlannedStmt, which
+ * returns a List of PartitionPruneResult containing an entry for each
+ * PartitionPruneInfo present in PlannedStmt.partPruneInfos.  The module
+ * should then pass that list, along with the PlannedStmt, to the executor,
+ * so that it can reuse the result of initial partition pruning when
+ * initializing the subplans for execution.
+ */
+typedef struct PartitionPruneResult
+{
+	NodeTag		type;
+
+	Bitmapset	   *valid_subplan_offs;
+} PartitionPruneResult;
 
 /*
  * Plan invalidation info
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 0499635f59..32579d4788 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -220,7 +220,8 @@ extern List *CachedPlanGetTargetList(CachedPlanSource *plansource,
 extern CachedPlan *GetCachedPlan(CachedPlanSource *plansource,
 								 ParamListInfo boundParams,
 								 ResourceOwner owner,
-								 QueryEnvironment *queryEnv);
+								 QueryEnvironment *queryEnv,
+								 List **part_prune_results_list);
 extern void ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner);
 
 extern bool CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index aeddbdafe5..1901fc5f28 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -138,6 +138,7 @@ typedef struct PortalData
 	QueryCompletion qc;			/* command completion data for executed query */
 	List	   *stmts;			/* list of PlannedStmts */
 	CachedPlan *cplan;			/* CachedPlan, if stmts are from one */
+	List	   *part_prune_results_list;	/* List of Lists of PartitionPruneResults */
 
 	ParamListInfo portalParams; /* params to pass to query */
 	QueryEnvironment *queryEnv; /* environment for query */
@@ -242,6 +243,8 @@ extern void PortalDefineQuery(Portal portal,
 							  CommandTag commandTag,
 							  List *stmts,
 							  CachedPlan *cplan);
+extern void PortalStorePartitionPruneResults(Portal portal,
+											 List *part_prune_results_list);
 extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
 extern void PortalCreateHoldStore(Portal portal);
 extern void PortalHashTableDeleteAll(void);
-- 
2.35.3



^ permalink  raw  reply  [nested|flat] 108+ messages in thread

* Re: generic plans and "initial" pruning
@ 2022-12-06 19:00  Alvaro Herrera <[email protected]>
  parent: Amit Langote <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: Alvaro Herrera @ 2022-12-06 19:00 UTC (permalink / raw)
  To: Amit Langote <[email protected]>; +Cc: Robert Haas <[email protected]>; Jacob Champion <[email protected]>; David Rowley <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers

I find the API of GetCachedPlan a little weird after this patch.  I
think it may be better to have it return a pointer to a new struct --
one that contains both the CachedPlan pointer and the list of pruning
results.  (As I understand, the sole caller that isn't interested in the
pruning results, SPI_plan_get_cached_plan, can be explained by the fact
that it knows there won't be any.  So I don't think we need to worry
about this case?)

And I think you should make that struct also be the last argument of
PortalDefineQuery, so you don't need the separate
PortalStorePartitionPruneResults function -- because as far as I can
tell, the callers that pass a non-NULL pointer there are exactly the
same ones that later call PortalStorePartitionPruneResults.
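
To make the shape of that suggestion concrete, here is a rough sketch of
such a container.  It is not taken from the patch: the name
CachedPlanResult, its layout, and the two helper functions are
hypothetical, and CachedPlan and List are reduced to stand-in types so
that the fragment compiles on its own.

```c
#include <stddef.h>

/* Stand-ins for the real PostgreSQL types; only here so the sketch compiles. */
typedef struct CachedPlan { int generation; } CachedPlan;
typedef struct List { int length; } List;

/* Hypothetical container: neither the name nor the layout is from the patch. */
typedef struct CachedPlanResult
{
	CachedPlan *cplan;						/* the plan to execute */
	List	   *part_prune_results_list;	/* NULL if no initial pruning was done */
} CachedPlanResult;

/* A caller that expects no pruning results would only look at the plan. */
static CachedPlan *
result_get_plan(const CachedPlanResult *res)
{
	return res->cplan;
}

/* Callers that do care can check whether any pruning results came back. */
static int
result_has_pruning(const CachedPlanResult *res)
{
	return res->part_prune_results_list != NULL;
}
```

With something like this, the same pointer could be handed on to
PortalDefineQuery, which is the point of the suggestion above.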

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"La primera ley de las demostraciones en vivo es: no trate de usar el sistema.
Escriba un guión que no toque nada para no causar daños." (Jakob Nielsen)





^ permalink  raw  reply  [nested|flat] 108+ messages in thread

* Re: generic plans and "initial" pruning
@ 2022-12-09 08:26  Amit Langote <[email protected]>
  parent: Alvaro Herrera <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: Amit Langote @ 2022-12-09 08:26 UTC (permalink / raw)
  To: Alvaro Herrera <[email protected]>; +Cc: Robert Haas <[email protected]>; Jacob Champion <[email protected]>; David Rowley <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers

Thanks for the review.

On Wed, Dec 7, 2022 at 4:00 AM Alvaro Herrera <[email protected]> wrote:
> I find the API of GetCachedPlan a little weird after this patch.  I
> think it may be better to have it return a pointer to a new struct --
> one that contains both the CachedPlan pointer and the list of pruning
> results.  (As I understand, the sole caller that isn't interested in the
> pruning results, SPI_plan_get_cached_plan, can be explained by the fact
> that it knows there won't be any.  So I don't think we need to worry
> about this case?)

David, in his Apr 7 reply on this thread, also seemed to suggest
something similar.

Hmm, I was / am not so sure if GetCachedPlan() should return something
that is not CachedPlan.  An idea I had today was to replace the
part_prune_results_list output List parameter with, say,
QueryInitPruningResult, or something like that and put the current
list into that struct.   Was looking at QueryEnvironment to come up
with *that* name.  Any thoughts?

> And I think you should make that struct also be the last argument of
> PortalDefineQuery, so you don't need the separate
> PortalStorePartitionPruneResults function -- because as far as I can
> tell, the callers that pass a non-NULL pointer there are exactly the
> same ones that later call PortalStorePartitionPruneResults.

Yes, it would be better to not need PortalStorePartitionPruneResults.


--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com





^ permalink  raw  reply  [nested|flat] 108+ messages in thread

* Re: generic plans and "initial" pruning
@ 2022-12-09 09:52  Alvaro Herrera <[email protected]>
  parent: Amit Langote <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: Alvaro Herrera @ 2022-12-09 09:52 UTC (permalink / raw)
  To: Amit Langote <[email protected]>; +Cc: Robert Haas <[email protected]>; Jacob Champion <[email protected]>; David Rowley <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers

On 2022-Dec-09, Amit Langote wrote:

> On Wed, Dec 7, 2022 at 4:00 AM Alvaro Herrera <[email protected]> wrote:
> > I find the API of GetCachedPlan a little weird after this patch.

> David, in his Apr 7 reply on this thread, also seemed to suggest
> something similar.
> 
> Hmm, I was / am not so sure if GetCachedPlan() should return something
> that is not CachedPlan.  An idea I had today was to replace the
> part_prune_results_list output List parameter with, say,
> QueryInitPruningResult, or something like that and put the current
> list into that struct.   Was looking at QueryEnvironment to come up
> with *that* name.  Any thoughts?

Remind me again why part_prune_results_list is not part of struct
CachedPlan then?  I tried to understand that based on comments upthread,
but I was unable to find anything.

(My first reaction to your above comment was "well, rename GetCachedPlan
then, maybe to GetRunnablePlan", but then I'm wondering if CachedPlan is
in any way a structure that must be "immutable" in the way parser output
is.  Looking at the comment at the top of plancache.c it appears to me
that it isn't, but maybe I'm missing something.)

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"The Postgresql hackers have what I call a "NASA space shot" mentality.
 Quite refreshing in a world of "weekend drag racer" developers."
(Scott Marlowe)





^ permalink  raw  reply  [nested|flat] 108+ messages in thread

* Re: generic plans and "initial" pruning
@ 2022-12-09 10:34  Amit Langote <[email protected]>
  parent: Alvaro Herrera <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: Amit Langote @ 2022-12-09 10:34 UTC (permalink / raw)
  To: Alvaro Herrera <[email protected]>; +Cc: Robert Haas <[email protected]>; Jacob Champion <[email protected]>; David Rowley <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers

On Fri, Dec 9, 2022 at 6:52 PM Alvaro Herrera <[email protected]> wrote:
> On 2022-Dec-09, Amit Langote wrote:
> > On Wed, Dec 7, 2022 at 4:00 AM Alvaro Herrera <[email protected]> wrote:
> > > I find the API of GetCachedPlan a little weird after this patch.
>
> > David, in his Apr 7 reply on this thread, also seemed to suggest
> > something similar.
> >
> > Hmm, I was / am not so sure if GetCachedPlan() should return something
> > that is not CachedPlan.  An idea I had today was to replace the
> > part_prune_results_list output List parameter with, say,
> > QueryInitPruningResult, or something like that and put the current
> > list into that struct.   Was looking at QueryEnvironment to come up
> > with *that* name.  Any thoughts?
>
> Remind me again why part_prune_results_list is not part of struct
> CachedPlan then?  I tried to understand that based on comments upthread,
> but I was unable to find anything.

It used to be part of CachedPlan for a brief period of time (in patch
v12 I posted in [1]), but David, in his reply to [1], said he wasn't
so sure that it belonged there.

> (My first reaction to your above comment was "well, rename GetCachedPlan
> then, maybe to GetRunnablePlan", but then I'm wondering if CachedPlan is
> in any way a structure that must be "immutable" in the way parser output
> is.  Looking at the comment at the top of plancache.c it appears to me
> that it isn't, but maybe I'm missing something.)

CachedPlan *is* supposed to be read-only per the comment above
CachedPlanSource definition:

 * ...If we are using a generic
 * cached plan then it is meant to be re-used across multiple executions, so
 * callers must always treat CachedPlans as read-only.

FYI, there was even an idea of putting a PartitionPruneResults for a
given PlannedStmt into the PlannedStmt itself [2], but PlannedStmt is
supposed to be read-only too [3].

Maybe we need some new overarching context when invoking plancache, if
Portal can't already be it, whose struct can be passed to
GetCachedPlan() to put the pruning results in?  Perhaps,
GetRunnablePlan() that you floated could be a wrapper for
GetCachedPlan(), owning that new context.

-- 
Thanks, Amit Langote
EDB: http://www.enterprisedb.com

[1] https://www.postgresql.org/message-id/CA%2BHiwqH4qQ_YVROr7TY0jSCuGn0oHhH79_DswOdXWN5UnMCBtQ%40mail.g...
[2] https://www.postgresql.org/message-id/CAApHDvp_DjVVkgSV24%2BUF7p_yKWeepgoo%2BW2SWLLhNmjwHTVYQ%40mail...
[3] https://www.postgresql.org/message-id/922566.1648784745%40sss.pgh.pa.us





^ permalink  raw  reply  [nested|flat] 108+ messages in thread

* Re: generic plans and "initial" pruning
@ 2022-12-09 10:49  Alvaro Herrera <[email protected]>
  parent: Amit Langote <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: Alvaro Herrera @ 2022-12-09 10:49 UTC (permalink / raw)
  To: Amit Langote <[email protected]>; +Cc: Robert Haas <[email protected]>; Jacob Champion <[email protected]>; David Rowley <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers

On 2022-Dec-09, Amit Langote wrote:

> On Fri, Dec 9, 2022 at 6:52 PM Alvaro Herrera <[email protected]> wrote:

> > Remind me again why part_prune_results_list is not part of struct
> > CachedPlan then?  I tried to understand that based on comments upthread,
> > but I was unable to find anything.
> 
> It used to be part of CachedPlan for a brief period of time (in patch
> v12 I posted in [1]), but David, in his reply to [1], said he wasn't
> so sure that it belonged there.

I'm not sure I necessarily agree with that.  I'll have a look at v12 to
try and understand what David was so unhappy about.

> > (My first reaction to your above comment was "well, rename GetCachedPlan
> > then, maybe to GetRunnablePlan", but then I'm wondering if CachedPlan is
> > in any way a structure that must be "immutable" in the way parser output
> > is.  Looking at the comment at the top of plancache.c it appears to me
> > that it isn't, but maybe I'm missing something.)
> 
> CachedPlan *is* supposed to be read-only per the comment above
> CachedPlanSource definition:
> 
>  * ...If we are using a generic
>  * cached plan then it is meant to be re-used across multiple executions, so
>  * callers must always treat CachedPlans as read-only.

I read that as implying that the part_prune_results_list must remain
intact as long as no invalidations occur.  Does part_prune_results_list
really change as a result of something other than a sinval event?
Keep in mind that if a sinval message that touches one of the relations
in the plan arrives, then we'll discard it and generate it afresh.  I
don't see that the part_prune_results_list would change otherwise, but
maybe I misunderstand?

> FYI, there was even an idea of putting a PartitionPruneResults for a
> given PlannedStmt into the PlannedStmt itself [2], but PlannedStmt is
> supposed to be read-only too [3].

Hmm, I'm not familiar with PlannedStmt lifetime, but I'm definitely not
betting that Tom is wrong about this.

> Maybe we need some new overarching context when invoking plancache, if
> Portal can't already be it, whose struct can be passed to
> GetCachedPlan() to put the pruning results in?  Perhaps,
> GetRunnablePlan() that you floated could be a wrapper for
> GetCachedPlan(), owning that new context.

Perhaps that is a solution.  I'm not sure.

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"Uno puede defenderse de los ataques; contra los elogios se esta indefenso"





^ permalink  raw  reply  [nested|flat] 108+ messages in thread

* Re: generic plans and "initial" pruning
@ 2022-12-09 11:02  Amit Langote <[email protected]>
  parent: Alvaro Herrera <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: Amit Langote @ 2022-12-09 11:02 UTC (permalink / raw)
  To: Alvaro Herrera <[email protected]>; +Cc: Robert Haas <[email protected]>; Jacob Champion <[email protected]>; David Rowley <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers

On Fri, Dec 9, 2022 at 7:49 PM Alvaro Herrera <[email protected]> wrote:
> On 2022-Dec-09, Amit Langote wrote:
> > On Fri, Dec 9, 2022 at 6:52 PM Alvaro Herrera <[email protected]> wrote:
> > > Remind me again why part_prune_results_list is not part of struct
> > > CachedPlan then?  I tried to understand that based on comments upthread,
> > > but I was unable to find anything.
> >
> > > (My first reaction to your above comment was "well, rename GetCachedPlan
> > > then, maybe to GetRunnablePlan", but then I'm wondering if CachedPlan is
> > > in any way a structure that must be "immutable" in the way parser output
> > > is.  Looking at the comment at the top of plancache.c it appears to me
> > > that it isn't, but maybe I'm missing something.)
> >
> > CachedPlan *is* supposed to be read-only per the comment above
> > CachedPlanSource definition:
> >
> >  * ...If we are using a generic
> >  * cached plan then it is meant to be re-used across multiple executions, so
> >  * callers must always treat CachedPlans as read-only.
>
> I read that as implying that the part_prune_results_list must remain
> intact as long as no invalidations occur.  Does part_prune_results_list
> really change as a result of something other than a sinval event?
> Keep in mind that if a sinval message that touches one of the relations
> in the plan arrives, then we'll discard it and generate it afresh.  I
> don't see that the part_prune_results_list would change otherwise, but
> maybe I misunderstand?

Pruning will be done afresh on every fetch of a given cached plan when
CheckCachedPlan() is called on it, so the part_prune_results_list part
will be discarded and rebuilt as many times as the plan is executed.
You'll find a description around CachedPlanSavePartitionPruneResults()
that's in v12.
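
As a toy illustration of why the result cannot simply live in the
read-only CachedPlan: for one and the same generic plan, the set of
surviving subplans depends on the parameter values of each execution.
The sketch below is not PostgreSQL code; ToyPlan, toy_initial_pruning(),
and the range bounds are all made up, and a plain bitmask stands in for
a Bitmapset.

```c
#include <stdint.h>

#define NPARTS 4

/* Toy "generic plan": partition i holds keys in [bounds[i], bounds[i+1]). */
typedef struct ToyPlan
{
	int			bounds[NPARTS + 1];
} ToyPlan;

/*
 * Toy stand-in for initial pruning of "WHERE key = $1": returns a bitmask of
 * the subplans that survive for this particular parameter value.
 */
static uint32_t
toy_initial_pruning(const ToyPlan *plan, int param)
{
	uint32_t	valid = 0;

	for (int i = 0; i < NPARTS; i++)
	{
		if (param >= plan->bounds[i] && param < plan->bounds[i + 1])
			valid |= (uint32_t) 1 << i;
	}
	return valid;
}
```

Calling toy_initial_pruning() on the same ToyPlan with different values
for the parameter gives different masks, so the result belongs to one
execution rather than to the plan.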

-- 
Thanks, Amit Langote
EDB: http://www.enterprisedb.com





^ permalink  raw  reply  [nested|flat] 108+ messages in thread

* Re: generic plans and "initial" pruning
@ 2022-12-09 11:37  Alvaro Herrera <[email protected]>
  parent: Amit Langote <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: Alvaro Herrera @ 2022-12-09 11:37 UTC (permalink / raw)
  To: Amit Langote <[email protected]>; +Cc: Robert Haas <[email protected]>; Jacob Champion <[email protected]>; David Rowley <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers

On 2022-Dec-09, Amit Langote wrote:

> Pruning will be done afresh on every fetch of a given cached plan when
> CheckCachedPlan() is called on it, so the part_prune_results_list part
> will be discarded and rebuilt as many times as the plan is executed.
> You'll find a description around CachedPlanSavePartitionPruneResults()
> that's in v12.

I see.

In that case, a separate container struct seems warranted.

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"Industry suffers from the managerial dogma that for the sake of stability
and continuity, the company should be independent of the competence of
individual employees."                                      (E. Dijkstra)





^ permalink  raw  reply  [nested|flat] 108+ messages in thread

* Re: generic plans and "initial" pruning
@ 2022-12-12 11:19  Amit Langote <[email protected]>
  parent: Alvaro Herrera <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: Amit Langote @ 2022-12-12 11:19 UTC (permalink / raw)
  To: Alvaro Herrera <[email protected]>; +Cc: Robert Haas <[email protected]>; Jacob Champion <[email protected]>; David Rowley <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers

On Fri, Dec 9, 2022 at 8:37 PM Alvaro Herrera <[email protected]> wrote:
> On 2022-Dec-09, Amit Langote wrote:
>
> > Pruning will be done afresh on every fetch of a given cached plan when
> > CheckCachedPlan() is called on it, so the part_prune_results_list part
> > will be discarded and rebuilt as many times as the plan is executed.
> > You'll find a description around CachedPlanSavePartitionPruneResults()
> > that's in v12.
>
> I see.
>
> In that case, a separate container struct seems warranted.

I thought about this today and played around with some container struct ideas.

Though, I started feeling like putting all the new logic being added
by this patch into plancache.c at the heart of GetCachedPlan() and
tweaking its API in kind of unintuitive ways may not have been such a
good idea to begin with.  So I started thinking again about your
GetRunnablePlan() wrapper idea and thought maybe we could do something
with it.  Let's say we name it GetCachedPlanLockPartitions() and put
the logic that does initial pruning with the new
ExecutorDoInitialPruning() in it, instead of in the normal
GetCachedPlan() path.  Any callers that call GetCachedPlan() instead
call GetCachedPlanLockPartitions() with either the List ** parameter
as now or some container struct if that seems better.  Whether
GetCachedPlanLockPartitions() needs to do anything other than return
the CachedPlan returned by GetCachedPlan() can be decided by the
latter setting, say, CachedPlan.has_unlocked_partitions.  That will be
done by AcquireExecutorLocks() when it sees containsInitialPruning in
any of the PlannedStmts it sees, locking only the
PlannedStmt.minLockRelids set (which is all relations where no pruning
is needed!), leaving the partition locking to
GetCachedPlanLockPartitions().  If the CachedPlan is invalidated
during the partition locking phase, it calls GetCachedPlan() again;
maybe some refactoring is needed to avoid too much useless work in
such cases.

Thoughts?

-- 
Thanks, Amit Langote
EDB: http://www.enterprisedb.com





^ permalink  raw  reply  [nested|flat] 108+ messages in thread

* Re: generic plans and "initial" pruning
@ 2022-12-12 17:24  Alvaro Herrera <[email protected]>
  parent: Amit Langote <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: Alvaro Herrera @ 2022-12-12 17:24 UTC (permalink / raw)
  To: Amit Langote <[email protected]>; +Cc: Robert Haas <[email protected]>; Jacob Champion <[email protected]>; David Rowley <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers

On 2022-Dec-12, Amit Langote wrote:

> I started feeling like putting all the new logic being added
> by this patch into plancache.c at the heart of GetCachedPlan() and
> tweaking its API in kind of unintuitive ways may not have been such a
> good idea to begin with.  So I started thinking again about your
> GetRunnablePlan() wrapper idea and thought maybe we could do something
> with it.  Let's say we name it GetCachedPlanLockPartitions() and put
> the logic that does initial pruning with the new
> ExecutorDoInitialPruning() in it, instead of in the normal
> GetCachedPlan() path.  Any callers that call GetCachedPlan() instead
> call GetCachedPlanLockPartitions() with either the List ** parameter
> as now or some container struct if that seems better.  Whether
> GetCachedPlanLockPartitions() needs to do anything other than return
> the CachedPlan returned by GetCachedPlan() can be decided by the
> latter setting, say, CachedPlan.has_unlocked_partitions.  That will be
> done by AcquireExecutorLocks() when it sees containsInitialPruning in
> any of the PlannedStmts it sees, locking only the
> PlannedStmt.minLockRelids set (which is all relations where no pruning
> is needed!), leaving the partition locking to
> GetCachedPlanLockPartitions().

Hmm.  This doesn't sound totally unreasonable, except to the point David
was making that perhaps we may want this container struct to accommodate
other things in the future than just the partition pruning results, so I
think its name (and that of the function that produces it) ought to be a
little more generic than that.

(I think this also answers your question on whether a List ** is better
than a container struct.)

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"Las cosas son buenas o malas segun las hace nuestra opinión" (Lisias)





^ permalink  raw  reply  [nested|flat] 108+ messages in thread

* Re: generic plans and "initial" pruning
@ 2022-12-14 08:35  Amit Langote <[email protected]>
  parent: Alvaro Herrera <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: Amit Langote @ 2022-12-14 08:35 UTC (permalink / raw)
  To: Alvaro Herrera <[email protected]>; +Cc: Robert Haas <[email protected]>; Jacob Champion <[email protected]>; David Rowley <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers

On Tue, Dec 13, 2022 at 2:24 AM Alvaro Herrera <[email protected]> wrote:
> On 2022-Dec-12, Amit Langote wrote:
> > I started feeling like putting all the new logic being added
> > by this patch into plancache.c at the heart of GetCachedPlan() and
> > tweaking its API in kind of unintuitive ways may not have been such a
> > good idea to begin with.  So I started thinking again about your
> > GetRunnablePlan() wrapper idea and thought maybe we could do something
> > with it.  Let's say we name it GetCachedPlanLockPartitions() and put
> > the logic that does initial pruning with the new
> > ExecutorDoInitialPruning() in it, instead of in the normal
> > GetCachedPlan() path.  Any callers that call GetCachedPlan() instead
> > call GetCachedPlanLockPartitions() with either the List ** parameter
> > as now or some container struct if that seems better.  Whether
> > GetCachedPlanLockPartitions() needs to do anything other than return
> > the CachedPlan returned by GetCachedPlan() can be decided by the
> > latter setting, say, CachedPlan.has_unlocked_partitions.  That will be
> > done by AcquireExecutorLocks() when it sees containsInitialPruning in
> > any of the PlannedStmts it sees, locking only the
> > PlannedStmt.minLockRelids set (which is all relations where no pruning
> > is needed!), leaving the partition locking to
> > GetCachedPlanLockPartitions().
>
> Hmm.  This doesn't sound totally unreasonable, except to the point David
> was making that perhaps we may want this container struct to accommodate
> other things in the future than just the partition pruning results, so I
> think its name (and that of the function that produces it) ought to be a
> little more generic than that.
>
> (I think this also answers your question on whether a List ** is better
> than a container struct.)

OK, so here's a WIP attempt at that.

I have moved the original functionality of GetCachedPlan() to
GetCachedPlanInternal(), turning the former into a sort of controller,
as described shortly.  The latter's CheckCachedPlan() part now locks
only the "minimal" set of non-prunable relations, making a note of
whether the plan contains any prunable subnodes and thus prunable
relations whose locking is deferred to the caller, GetCachedPlan().
GetCachedPlan(), acting as that controller, does the pruning if needed
on the minimally valid plan returned by GetCachedPlanInternal(), locks
the partitions that survive, and redoes the whole thing if locking the
partitions invalidates the plan.
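
The control flow described above can be sketched roughly as follows.
This is only an illustration under stated assumptions: every Sketch*
type and sketch_* function is a hypothetical stand-in invented here,
not code from the patch or from plancache.c, and the "invalidation" is
simulated with a counter rather than with real locks.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a cached plan; not a PostgreSQL type. */
typedef struct SketchPlan
{
	bool		has_prunable_rels;	/* noted by the "internal" step */
	int			generation;			/* which (re)build produced this plan */
} SketchPlan;

/* Hypothetical environment that simulates concurrent invalidations. */
typedef struct SketchEnv
{
	int			invalidations_left; /* times locking will invalidate the plan */
	int			builds;				/* how often the plan was (re)built */
} SketchEnv;

/* Step 1: get a plan that is valid for the non-prunable relations only. */
static SketchPlan
sketch_get_plan_internal(SketchEnv *env, bool has_prunable_rels)
{
	SketchPlan	plan;

	env->builds++;
	plan.has_prunable_rels = has_prunable_rels;
	plan.generation = env->builds;
	return plan;
}

/*
 * Step 2: prune, then lock the surviving partitions.  Returns false if
 * taking those locks revealed that the plan has been invalidated.
 */
static bool
sketch_prune_and_lock(SketchEnv *env, const SketchPlan *plan)
{
	assert(plan->has_prunable_rels);
	if (env->invalidations_left > 0)
	{
		env->invalidations_left--;
		return false;
	}
	return true;
}

/* The "controller": retry until the plan survives the deferred locking. */
static SketchPlan
sketch_get_cached_plan(SketchEnv *env, bool has_prunable_rels)
{
	for (;;)
	{
		SketchPlan	plan = sketch_get_plan_internal(env, has_prunable_rels);

		/* Nothing was deferred, so the minimally valid plan is fully valid. */
		if (!plan.has_prunable_rels)
			return plan;

		if (sketch_prune_and_lock(env, &plan))
			return plan;

		/* Locking the partitions invalidated the plan; redo the whole thing. */
	}
}
```

With two simulated invalidations, the controller builds the plan three
times before returning it; a plan with no prunable relations is
returned after a single build, whatever the counter says.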

The pruning results are returned through the new output parameter of
GetCachedPlan() of type CachedPlanExtra.  I named it so after much
consideration, because all the new logic that produces stuff to put
into it is a part of the plancache module and has to do with
manipulating a CachedPlan.  (I had considered CachedPlanExecInfo to
indicate that it contains information that is to be forwarded to the
executor, though that just didn't seem to fit in plancache.h.)
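
As a rough illustration of how a caller might consume such a
container: the 0002 patch asserts that the container's list has one
entry per PlannedStmt and lets callers pass NIL when no container was
returned.  The sketch below mimics that with plain arrays.  The
SketchPlanExtra and SketchPruneResults types and
sketch_results_for_stmt() are hypothetical names made up for this
example; the real CachedPlanExtra holds a List.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-statement pruning outcome; not the patch's node type. */
typedef struct SketchPruneResults
{
	int			nsurvivors;		/* subnodes that survived initial pruning */
} SketchPruneResults;

/* Hypothetical container, one entry per statement in the cached plan. */
typedef struct SketchPlanExtra
{
	int			nstmts;			/* must match the plan's statement count */
	const SketchPruneResults **per_stmt;
} SketchPlanExtra;

/*
 * Return the pruning results for the given statement, or NULL when no
 * container was produced (i.e. no pruning was done ahead of execution).
 */
static const SketchPruneResults *
sketch_results_for_stmt(const SketchPlanExtra *extra, int stmt_index)
{
	if (extra == NULL)
		return NULL;
	assert(stmt_index >= 0 && stmt_index < extra->nstmts);
	return extra->per_stmt[stmt_index];
}
```

Keeping the lookup behind one function means that callers never index
into the container directly, so more members could be added to it
later without touching them.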

I have broken out a few things into a preparatory patch 0001.  Mainly,
it invents PlannedStmt.minLockRelids to replace
AcquireExecutorLocks()'s current loop over the range table to figure
out the relations to lock.  I also threw in a couple of pruning-related
non-functional changes to make 0002, the main patch, easier to read.
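
To make the minLockRelids idea concrete, here is a small sketch using
a 64-bit mask in place of a Bitmapset.  It follows the scheme in the
patches: start with the full range of RT indexes, remove the ones
subject to initial pruning, then walk what is left to decide which
relations to lock up front.  The SketchRelidSet type and the sketch_*
helpers are assumptions made for this example and are not the
bitmapset.c API; in particular, returning -1 for "no more members" is
just a convention chosen here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit i set means RT index i is in the set; supports RT indexes 1..63. */
typedef uint64_t SketchRelidSet;

/* Add every index in [lower, upper] to the set. */
static SketchRelidSet
sketch_add_range(SketchRelidSet set, int lower, int upper)
{
	for (int i = lower; i <= upper; i++)
		set |= (UINT64_C(1) << i);
	return set;
}

/* Remove one index from the set. */
static SketchRelidSet
sketch_del_member(SketchRelidSet set, int member)
{
	return set & ~(UINT64_C(1) << member);
}

/* Smallest member greater than prev, or -1 if there is none. */
static int
sketch_next_member(SketchRelidSet set, int prev)
{
	for (int i = prev + 1; i < 64; i++)
	{
		if (set & (UINT64_C(1) << i))
			return i;
	}
	return -1;
}

/*
 * Count the relations that would be locked up front: all RT indexes
 * 1..nrels minus those listed in prunable[].
 */
static int
sketch_count_min_locks(int nrels, const int *prunable, int nprunable)
{
	SketchRelidSet min_lock;
	int			count = 0;
	int			rti = -1;

	assert(nrels >= 0 && nrels <= 63);
	min_lock = sketch_add_range(0, 1, nrels);

	for (int i = 0; i < nprunable; i++)
	{
		assert(prunable[i] >= 1 && prunable[i] <= nrels);
		min_lock = sketch_del_member(min_lock, prunable[i]);
	}

	/* Walk only the members of the set instead of the whole range table. */
	while ((rti = sketch_next_member(min_lock, rti)) >= 0)
		count++;

	return count;
}
```

For a range table of six entries in which entries 3, 4 and 5 are
prunable partitions, only three relations remain to be locked up
front.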



--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com


Attachments:

  [application/octet-stream] v29-0001-Preparatory-refactoring-before-reworking-CachedP.patch (17.2K, 2-v29-0001-Preparatory-refactoring-before-reworking-CachedP.patch)
  download | inline diff:
From 14a1198bdaad007b1dc835f24caa42d3667c7048 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Tue, 13 Dec 2022 11:58:07 +0900
Subject: [PATCH v29 1/2] Preparatory refactoring before reworking CachedPlan
 locking

Remember the RT indexes of RTEs that AcquireExecutorLocks() must
look at to consider locking in a bitmapset, so that instead of looping
over the range table to find those RTEs, it can look them up using
the RT indexes set in the bitmapset.

This also adds some extra information related to execution-time
pruning to the relevant plan nodes.
---
 src/backend/executor/execParallel.c  |  1 +
 src/backend/executor/execPartition.c |  6 ++++
 src/backend/nodes/readfuncs.c        |  8 ++++--
 src/backend/optimizer/plan/planner.c |  2 ++
 src/backend/optimizer/plan/setrefs.c | 12 ++++++++
 src/backend/partitioning/partprune.c | 42 ++++++++++++++++++++++++++--
 src/backend/utils/cache/plancache.c  | 10 +++++--
 src/include/executor/execPartition.h |  2 ++
 src/include/nodes/nodes.h            |  1 +
 src/include/nodes/pathnodes.h        | 11 ++++++++
 src/include/nodes/plannodes.h        | 19 +++++++++++++
 11 files changed, 106 insertions(+), 8 deletions(-)

diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index a5b8e43ec5..65c4b63bbd 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -182,6 +182,7 @@ ExecSerializePlan(Plan *plan, EState *estate)
 	pstmt->transientPlan = false;
 	pstmt->dependsOnRole = false;
 	pstmt->parallelModeNeeded = false;
+	pstmt->containsInitialPruning = false;	/* workers need not know! */
 	pstmt->planTree = plan;
 	pstmt->partPruneInfos = estate->es_part_prune_infos;
 	pstmt->rtable = estate->es_range_table;
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 76d79b9741..5b62157712 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -1956,6 +1956,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 			Assert(partdesc->nparts >= pinfo->nparts);
 			pprune->nparts = partdesc->nparts;
 			pprune->subplan_map = palloc(sizeof(int) * partdesc->nparts);
+			pprune->rti_map = palloc(sizeof(Index) * partdesc->nparts);
 			if (partdesc->nparts == pinfo->nparts)
 			{
 				/*
@@ -1966,6 +1967,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 				pprune->subpart_map = pinfo->subpart_map;
 				memcpy(pprune->subplan_map, pinfo->subplan_map,
 					   sizeof(int) * pinfo->nparts);
+				memcpy(pprune->rti_map, pinfo->rti_map,
+					   sizeof(Index) * pinfo->nparts);
 
 				/*
 				 * Double-check that the list of unpruned relations has not
@@ -2016,6 +2019,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 							pinfo->subplan_map[pd_idx];
 						pprune->subpart_map[pp_idx] =
 							pinfo->subpart_map[pd_idx];
+						pprune->rti_map[pp_idx] =
+							pinfo->rti_map[pd_idx];
 						pd_idx++;
 					}
 					else
@@ -2023,6 +2028,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 						/* this partdesc entry is not in the plan */
 						pprune->subplan_map[pp_idx] = -1;
 						pprune->subpart_map[pp_idx] = -1;
+						pprune->rti_map[pp_idx] = 0;
 					}
 				}
 
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 966b75f5a6..1161671fa4 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -158,6 +158,11 @@
 	token = pg_strtok(&length);		/* skip :fldname */ \
 	local_node->fldname = readIntCols(len)
 
+/* Read an Index array */
+#define READ_INDEX_ARRAY(fldname, len) \
+	token = pg_strtok(&length);		/* skip :fldname */ \
+	local_node->fldname = readIndexCols(len)
+
 /* Read a bool array */
 #define READ_BOOL_ARRAY(fldname, len) \
 	token = pg_strtok(&length);		/* skip :fldname */ \
@@ -796,7 +801,6 @@ fnname(int numCols) \
  */
 READ_SCALAR_ARRAY(readAttrNumberCols, int16, atoi)
 READ_SCALAR_ARRAY(readOidCols, Oid, atooid)
-/* outfuncs.c has writeIndexCols, but we don't yet need that here */
-/* READ_SCALAR_ARRAY(readIndexCols, Index, atoui) */
+READ_SCALAR_ARRAY(readIndexCols, Index, atoui)
 READ_SCALAR_ARRAY(readIntCols, int, atoi)
 READ_SCALAR_ARRAY(readBoolCols, bool, strtobool)
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 5dd4f92720..620b163ef9 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -523,8 +523,10 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 	result->parallelModeNeeded = glob->parallelModeNeeded;
 	result->planTree = top_plan;
 	result->partPruneInfos = glob->partPruneInfos;
+	result->containsInitialPruning = glob->containsInitialPruning;
 	result->rtable = glob->finalrtable;
 	result->permInfos = glob->finalrteperminfos;
+	result->minLockRelids = glob->minLockRelids;
 	result->resultRelations = glob->resultRelations;
 	result->appendRelations = glob->appendRelations;
 	result->subplans = glob->subplans;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 596f1fbc8e..ed43d5936d 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -279,6 +279,16 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 	 */
 	add_rtes_to_flat_rtable(root, false);
 
+	/*
+	 * Add the query's adjusted range of RT indexes to glob->minLockRelids.
+	 * The adjusted RT indexes of prunable relations will be deleted from the
+	 * set below where PartitionPruneInfos are processed.
+	 */
+	glob->minLockRelids =
+		bms_add_range(glob->minLockRelids,
+					  rtoffset + 1,
+					  rtoffset + list_length(root->parse->rtable));
+
 	/*
 	 * Adjust RT indexes of PlanRowMarks and add to final rowmarks list
 	 */
@@ -377,9 +387,11 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 				/* RT index of the table to which the pinfo belongs. */
 				pinfo->rtindex += rtoffset;
 			}
+
 		}
 
 		glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
+		glob->containsInitialPruning |= pruneinfo->needs_init_pruning;
 	}
 
 	return result;
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index d48f6784c1..56270d7670 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -144,7 +144,9 @@ static List *make_partitionedrel_pruneinfo(PlannerInfo *root,
 										   List *prunequal,
 										   Bitmapset *partrelids,
 										   int *relid_subplan_map,
-										   Bitmapset **matchedsubplans);
+										   Bitmapset **matchedsubplans,
+										   bool *needs_init_pruning,
+										   bool *needs_exec_pruning);
 static void gen_partprune_steps(RelOptInfo *rel, List *clauses,
 								PartClauseTarget target,
 								GeneratePruningStepsContext *context);
@@ -234,6 +236,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	int		   *relid_subplan_map;
 	ListCell   *lc;
 	int			i;
+	bool		needs_init_pruning = false;
+	bool		needs_exec_pruning = false;
 
 	/*
 	 * Scan the subpaths to see which ones are scans of partition child
@@ -313,12 +317,16 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		Bitmapset  *partrelids = (Bitmapset *) lfirst(lc);
 		List	   *pinfolist;
 		Bitmapset  *matchedsubplans = NULL;
+		bool		partrel_needs_init_pruning;
+		bool		partrel_needs_exec_pruning;
 
 		pinfolist = make_partitionedrel_pruneinfo(root, parentrel,
 												  prunequal,
 												  partrelids,
 												  relid_subplan_map,
-												  &matchedsubplans);
+												  &matchedsubplans,
+												  &partrel_needs_init_pruning,
+												  &partrel_needs_exec_pruning);
 
 		/* When pruning is possible, record the matched subplans */
 		if (pinfolist != NIL)
@@ -327,6 +335,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 			allmatchedsubplans = bms_join(matchedsubplans,
 										  allmatchedsubplans);
 		}
+
+		needs_init_pruning |= partrel_needs_init_pruning;
+		needs_exec_pruning |= partrel_needs_exec_pruning;
 	}
 
 	pfree(relid_subplan_map);
@@ -342,6 +353,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	pruneinfo = makeNode(PartitionPruneInfo);
 	pruneinfo->root_parent_relids = parentrel->relids;
 	pruneinfo->prune_infos = prunerelinfos;
+	pruneinfo->needs_init_pruning = needs_init_pruning;
+	pruneinfo->needs_exec_pruning = needs_exec_pruning;
 
 	/*
 	 * Some subplans may not belong to any of the identified partitioned rels.
@@ -442,13 +455,19 @@ add_part_relids(List *allpartrelids, Bitmapset *partrelids)
  * If we cannot find any useful run-time pruning steps, return NIL.
  * However, on success, each rel identified in partrelids will have
  * an element in the result list, even if some of them are useless.
+ * *needs_init_pruning and *needs_exec_pruning are set to indicate whether
+ * the pruning steps contained in the returned PartitionedRelPruneInfos
+ * can be performed during executor startup and during execution,
+ * respectively.
  */
 static List *
 make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 							  List *prunequal,
 							  Bitmapset *partrelids,
 							  int *relid_subplan_map,
-							  Bitmapset **matchedsubplans)
+							  Bitmapset **matchedsubplans,
+							  bool *needs_init_pruning,
+							  bool *needs_exec_pruning)
 {
 	RelOptInfo *targetpart = NULL;
 	List	   *pinfolist = NIL;
@@ -459,6 +478,10 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	int			rti;
 	int			i;
 
+	/* Will find out below. */
+	*needs_init_pruning = false;
+	*needs_exec_pruning = false;
+
 	/*
 	 * Examine each partitioned rel, constructing a temporary array to map
 	 * from planner relids to index of the partitioned rel, and building a
@@ -546,6 +569,9 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		 * executor per-scan pruning steps.  This first pass creates startup
 		 * pruning steps and detects whether there's any possibly-useful quals
 		 * that would require per-scan pruning.
+		 *
+		 * In the first pass, we note whether the 2nd pass is necessary by
+		 * noting the presence of EXEC parameters.
 		 */
 		gen_partprune_steps(subpart, partprunequal, PARTTARGET_INITIAL,
 							&context);
@@ -620,6 +646,12 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		pinfo->execparamids = execparamids;
 		/* Remaining fields will be filled in the next loop */
 
+		/* record which types of pruning steps we've seen so far */
+		if (initial_pruning_steps != NIL)
+			*needs_init_pruning = true;
+		if (exec_pruning_steps != NIL)
+			*needs_exec_pruning = true;
+
 		pinfolist = lappend(pinfolist, pinfo);
 	}
 
@@ -647,6 +679,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		int		   *subplan_map;
 		int		   *subpart_map;
 		Oid		   *relid_map;
+		Index	   *rti_map;
 
 		/*
 		 * Construct the subplan and subpart maps for this partitioning level.
@@ -659,6 +692,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		subpart_map = (int *) palloc(nparts * sizeof(int));
 		memset(subpart_map, -1, nparts * sizeof(int));
 		relid_map = (Oid *) palloc0(nparts * sizeof(Oid));
+		rti_map = (Index *) palloc0(nparts * sizeof(Index));
 		present_parts = NULL;
 
 		i = -1;
@@ -673,6 +707,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 			subplan_map[i] = subplanidx = relid_subplan_map[partrel->relid] - 1;
 			subpart_map[i] = subpartidx = relid_subpart_map[partrel->relid] - 1;
 			relid_map[i] = planner_rt_fetch(partrel->relid, root)->relid;
+			rti_map[i] = partrel->relid;
 			if (subplanidx >= 0)
 			{
 				present_parts = bms_add_member(present_parts, i);
@@ -697,6 +732,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		pinfo->subplan_map = subplan_map;
 		pinfo->subpart_map = subpart_map;
 		pinfo->relid_map = relid_map;
+		pinfo->rti_map = rti_map;
 	}
 
 	pfree(relid_subpart_map);
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index cc943205d3..339bb603f7 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -1747,7 +1747,8 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 	foreach(lc1, stmt_list)
 	{
 		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
-		ListCell   *lc2;
+		Bitmapset  *allLockRelids;
+		int			rti;
 
 		if (plannedstmt->commandType == CMD_UTILITY)
 		{
@@ -1760,14 +1761,17 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 			 */
 			Query	   *query = UtilityContainsQuery(plannedstmt->utilityStmt);
 
+			Assert(plannedstmt->minLockRelids == NULL);
 			if (query)
 				ScanQueryForLocks(query, acquire);
 			continue;
 		}
 
-		foreach(lc2, plannedstmt->rtable)
+		allLockRelids = plannedstmt->minLockRelids;
+		rti = -1;
+		while ((rti = bms_next_member(allLockRelids, rti)) > 0)
 		{
-			RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
+			RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
 
 			if (rte->rtekind != RTE_RELATION)
 				continue;
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 17fabc18c9..aeeaeb7884 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -45,6 +45,7 @@ extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
  * nparts						Length of subplan_map[] and subpart_map[].
  * subplan_map					Subplan index by partition index, or -1.
  * subpart_map					Subpart index by partition index, or -1.
+ * rti_map						Range table index by partition index, or 0.
  * present_parts				A Bitmapset of the partition indexes that we
  *								have subplans or subparts for.
  * initial_pruning_steps		List of PartitionPruneSteps used to
@@ -61,6 +62,7 @@ typedef struct PartitionedRelPruningData
 	int			nparts;
 	int		   *subplan_map;
 	int		   *subpart_map;
+	Index	   *rti_map;
 	Bitmapset  *present_parts;
 	List	   *initial_pruning_steps;
 	List	   *exec_pruning_steps;
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 1f33902947..c2f2544df5 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -218,6 +218,7 @@ extern struct Bitmapset *readBitmapset(void);
 extern uintptr_t readDatum(bool typbyval);
 extern bool *readBoolCols(int numCols);
 extern int *readIntCols(int numCols);
+extern Index *readIndexCols(int numCols);
 extern Oid *readOidCols(int numCols);
 extern int16 *readAttrNumberCols(int numCols);
 
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 654dba61aa..4337e7aa34 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -128,6 +128,17 @@ typedef struct PlannerGlobal
 	/* List of PartitionPruneInfo contained in the plan */
 	List	   *partPruneInfos;
 
+	/*
+	 * Do any of those PartitionPruneInfos have initial pruning steps in them?
+	 */
+	bool		containsInitialPruning;
+
+	/*
+	 * Indexes of all range table entries; for AcquireExecutorLocks()'s
+	 * perusal.
+	 */
+	Bitmapset  *minLockRelids;
+
 	/* OIDs of relations the plan depends on */
 	List	   *relationOids;
 
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index bddfe86191..eb0a007946 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -73,11 +73,18 @@ typedef struct PlannedStmt
 	List	   *partPruneInfos; /* List of PartitionPruneInfo contained in the
 								 * plan */
 
+	bool		containsInitialPruning;	/* Do any of those PartitionPruneInfos
+										 * have initial pruning steps in them?
+										 */
+
 	List	   *rtable;			/* list of RangeTblEntry nodes */
 
 	List	   *permInfos;		/* list of RTEPermissionInfo nodes for rtable
 								 * entries needing one */
 
+	Bitmapset  *minLockRelids;	/* Indexes of all range table entries; for
+								 * AcquireExecutorLocks()'s perusal */
+
 	/* rtable indexes of target relations for INSERT/UPDATE/DELETE/MERGE */
 	List	   *resultRelations;	/* integer list of RT indexes, or NIL */
 
@@ -1417,6 +1424,13 @@ typedef struct PlanRowMark
  * prune_infos			List of Lists containing PartitionedRelPruneInfo nodes,
  *						one sublist per run-time-prunable partition hierarchy
  *						appearing in the parent plan node's subplans.
+ *
+ * needs_init_pruning	Does any of the PartitionedRelPruneInfos in
+ *						prune_infos have its initial_pruning_steps set?
+ *
+ * needs_exec_pruning	Does any of the PartitionedRelPruneInfos in
+ *						prune_infos have its exec_pruning_steps set?
+ *
  * other_subplans		Indexes of any subplans that are not accounted for
  *						by any of the PartitionedRelPruneInfo nodes in
  *						"prune_infos".  These subplans must not be pruned.
@@ -1428,6 +1442,8 @@ typedef struct PartitionPruneInfo
 	NodeTag		type;
 	Bitmapset  *root_parent_relids;
 	List	   *prune_infos;
+	bool		needs_init_pruning;
+	bool		needs_exec_pruning;
 	Bitmapset  *other_subplans;
 } PartitionPruneInfo;
 
@@ -1472,6 +1488,9 @@ typedef struct PartitionedRelPruneInfo
 	/* relation OID by partition index, or 0 */
 	Oid		   *relid_map pg_node_attr(array_size(nparts));
 
+	/* Range table index by partition index, or 0. */
+	Index	   *rti_map pg_node_attr(array_size(nparts));
+
 	/*
 	 * initial_pruning_steps shows how to prune during executor startup (i.e.,
 	 * without use of any PARAM_EXEC Params); it is NIL if no startup pruning
-- 
2.35.3



  [application/octet-stream] v29-0002-In-GetCachedPlan-only-lock-unpruned-partitions.patch (67.1K, 3-v29-0002-In-GetCachedPlan-only-lock-unpruned-partitions.patch)
  download | inline diff:
From 69855fffacf69575471beb69da761babadc9f75c Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Wed, 22 Dec 2021 16:55:17 +0900
Subject: [PATCH v29 2/2] In GetCachedPlan(), only lock unpruned partitions

This does two things mainly:

* The planner now removes the RT indexes of "initially prunable"
partitions from PlannedStmt.minLockRelids such that the set only
contains the relations not subject to initial partition pruning.  So,
AcquireExecutorLocks only locks a subset of the relations contained
in a plan, deferring the locking of prunable relations to the caller.

* GetCachedPlan(), if there are prunable relations in the plan,
performs the initial partition pruning using available EXTERN params
and locks the partitions remaining after that, so that the CachedPlan
that's returned is valid in a race-free manner including for any
partitions that will be scanned during execution.

To make the pruning possible before entering ExecutorStart(), this
also adds ExecPartitionDoInitialPruning(), which can be called by
GetCachedPlan() for a given PlannedStmt.

The result of performing initial partition pruning this way is made
available to the actual execution via PartitionPruneResult, of which
there is one for every PartitionPruneInfo contained in the PlannedStmt.
The List of PartitionPruneResults for a given PlannedStmt is returned
to the callers of GetCachedPlan() via its new output parameter of type
CachedPlanExtra, whose members currently only include said List.
---
 src/backend/commands/copyto.c          |   2 +-
 src/backend/commands/createas.c        |   2 +-
 src/backend/commands/explain.c         |   7 +-
 src/backend/commands/extension.c       |   2 +-
 src/backend/commands/matview.c         |   2 +-
 src/backend/commands/prepare.c         |  28 ++-
 src/backend/executor/README            |  31 ++-
 src/backend/executor/execMain.c        |   2 +
 src/backend/executor/execParallel.c    |  25 ++-
 src/backend/executor/execPartition.c   | 215 +++++++++++++++++----
 src/backend/executor/execUtils.c       |   1 +
 src/backend/executor/functions.c       |   2 +-
 src/backend/executor/nodeAppend.c      |  11 +-
 src/backend/executor/nodeMergeAppend.c |   5 +-
 src/backend/executor/spi.c             |  31 ++-
 src/backend/optimizer/plan/setrefs.c   |  36 ++++
 src/backend/tcop/postgres.c            |   9 +-
 src/backend/tcop/pquery.c              |  28 ++-
 src/backend/utils/cache/plancache.c    | 257 +++++++++++++++++++++++--
 src/backend/utils/mmgr/portalmem.c     |  16 ++
 src/include/commands/explain.h         |   4 +-
 src/include/executor/execPartition.h   |   7 +-
 src/include/executor/execdesc.h        |   3 +
 src/include/nodes/execnodes.h          |   1 +
 src/include/nodes/pathnodes.h          |   4 +-
 src/include/nodes/plannodes.h          |  31 ++-
 src/include/utils/plancache.h          |  11 +-
 src/include/utils/portal.h             |   3 +
 28 files changed, 694 insertions(+), 82 deletions(-)

diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index f26cc0d162..401a2280a3 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -558,7 +558,7 @@ BeginCopyTo(ParseState *pstate,
 		((DR_copy *) dest)->cstate = cstate;
 
 		/* Create a QueryDesc requesting no output */
-		cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+		cstate->queryDesc = CreateQueryDesc(plan, NIL, pstate->p_sourcetext,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
 											dest, NULL, NULL, 0);
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 152c29b551..942449544c 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -325,7 +325,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
 		UpdateActiveSnapshotCommandId();
 
 		/* Create a QueryDesc, redirecting output to our tuple receiver */
-		queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+		queryDesc = CreateQueryDesc(plan, NIL, pstate->p_sourcetext,
 									GetActiveSnapshot(), InvalidSnapshot,
 									dest, params, queryEnv, 0);
 
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index f86983c660..2f2b558608 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -407,7 +407,7 @@ ExplainOneQuery(Query *query, int cursorOptions,
 		}
 
 		/* run it (if needed) and produce output */
-		ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+		ExplainOnePlan(plan, NIL, into, es, queryString, params, queryEnv,
 					   &planduration, (es->buffers ? &bufusage : NULL));
 	}
 }
@@ -515,7 +515,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
  * to call it.
  */
 void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, List *part_prune_results,
+			   IntoClause *into, ExplainState *es,
 			   const char *queryString, ParamListInfo params,
 			   QueryEnvironment *queryEnv, const instr_time *planduration,
 			   const BufferUsage *bufusage)
@@ -563,7 +564,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
 		dest = None_Receiver;
 
 	/* Create a QueryDesc for the query */
-	queryDesc = CreateQueryDesc(plannedstmt, queryString,
+	queryDesc = CreateQueryDesc(plannedstmt, part_prune_results, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, params, queryEnv, instrument_option);
 
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index cf1b1ca571..904cbcba4a 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -779,7 +779,7 @@ execute_sql_string(const char *sql)
 			{
 				QueryDesc  *qdesc;
 
-				qdesc = CreateQueryDesc(stmt,
+				qdesc = CreateQueryDesc(stmt, NIL,
 										sql,
 										GetActiveSnapshot(), NULL,
 										dest, NULL, NULL, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 9ac0383459..65c8d0aa59 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -408,7 +408,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
 	UpdateActiveSnapshotCommandId();
 
 	/* Create a QueryDesc, redirecting output to our tuple receiver */
-	queryDesc = CreateQueryDesc(plan, queryString,
+	queryDesc = CreateQueryDesc(plan, NIL, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, NULL, NULL, 0);
 
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 9e29584d93..729384a9a6 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -154,6 +154,7 @@ ExecuteQuery(ParseState *pstate,
 {
 	PreparedStatement *entry;
 	CachedPlan *cplan;
+	CachedPlanExtra *cplan_extra = NULL;
 	List	   *plan_list;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
@@ -193,7 +194,11 @@ ExecuteQuery(ParseState *pstate,
 									   entry->plansource->query_string);
 
 	/* Replan if needed, and increment plan refcount for portal */
-	cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL);
+	cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL,
+						  &cplan_extra);
+	Assert(cplan_extra == NULL ||
+		   (list_length(cplan->stmt_list) ==
+			list_length(cplan_extra->part_prune_results_list)));
 	plan_list = cplan->stmt_list;
 
 	/*
@@ -207,6 +212,9 @@ ExecuteQuery(ParseState *pstate,
 					  plan_list,
 					  cplan);
 
+	if (cplan_extra)
+		PortalSaveCachedPlanExtra(portal, cplan_extra);
+
 	/*
 	 * For CREATE TABLE ... AS EXECUTE, we must verify that the prepared
 	 * statement is one that produces tuples.  Currently we insist that it be
@@ -575,6 +583,7 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	PreparedStatement *entry;
 	const char *query_string;
 	CachedPlan *cplan;
+	CachedPlanExtra *cplan_extra = NULL;
 	List	   *plan_list;
 	ListCell   *p;
 	ParamListInfo paramLI = NULL;
@@ -619,7 +628,11 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 
 	/* Replan if needed, and acquire a transient refcount */
 	cplan = GetCachedPlan(entry->plansource, paramLI,
-						  CurrentResourceOwner, queryEnv);
+						  CurrentResourceOwner, queryEnv,
+						  &cplan_extra);
+	Assert(cplan_extra == NULL ||
+		   (list_length(cplan->stmt_list) ==
+			list_length(cplan_extra->part_prune_results_list)));
 
 	INSTR_TIME_SET_CURRENT(planduration);
 	INSTR_TIME_SUBTRACT(planduration, planstart);
@@ -637,10 +650,17 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	foreach(p, plan_list)
 	{
 		PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
+		List *part_prune_results = NIL;
+
+		if (cplan_extra)
+			part_prune_results = list_nth_node(List,
+											   cplan_extra->part_prune_results_list,
+											   foreach_current_index(p));
 
 		if (pstmt->commandType != CMD_UTILITY)
-			ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv,
-						   &planduration, (es->buffers ? &bufusage : NULL));
+			ExplainOnePlan(pstmt, part_prune_results, into, es, query_string,
+						   paramLI, queryEnv, &planduration,
+						   (es->buffers ? &bufusage : NULL));
 		else
 			ExplainOneUtility(pstmt->utilityStmt, into, es, query_string,
 							  paramLI, queryEnv);
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 17775a49e2..2222b3ed6f 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -63,7 +63,36 @@ if the executor determines that an entire subplan is not required due to
 execution time partition pruning determining that no matching records will be
 found there.  This currently only occurs for Append and MergeAppend nodes.  In
 this case the non-required subplans are ignored and the executor state's
-subnode array will become out of sequence to the plan's subplan list.
+subnode array will become out of sequence to the plan's subplan list.  Note
+that this is referred to as "initial" pruning, because it needs to occur only
+once during the execution startup, and uses a set of pruning steps called
+initial pruning steps (see PartitionedRelPruneInfo.initial_pruning_steps).
+
+Actually, "initial" pruning may occur even before the execution startup in
+some cases.  For example, when a cached generic plan is validated for
+execution, which works by locking all the relations that will be scanned by
+that plan during execution.  If the generic plan contains plan nodes that have
+prunable child subnodes, then this validation locking is performed after
+pruning child subnodes that need not be scanned during execution, that is,
+using initial pruning steps.  When such a generic plan is forwarded for
+execution, it must be accompanied by the set of PartitionPruneResult nodes that
+contain the result of that pruning, which basically consists of a bitmapset of
+child subnode indexes that survived the pruning and thus whose relations would
+have been locked for execution.  This is important, because, unlike the
+plan-time pruning and actual executor-startup pruning, this does not actually
+remove the pruned subnodes from the plan tree, but only marks them as being
+pruned.  So, the executor code (core or third party), especially one that runs
+before ExecutorStart() and thus looks at bare Plan trees (not PlanState trees)
+must beware of plan nodes that may actually have been pruned and thus subject
+to being invalidated by concurrent schema changes.  For plan nodes that can
+have prunable child subnodes and thus contain a PartitionPruneInfo, such code
+must always check if the corresponding PartitionPruneResult exists
+in EState.es_part_prune_results at given part_prune_index and use that to
+decide which subplans are valid for execution instead of redoing the pruning.
+Note that that is not just a performance optimization but also necessary to
+avoid possibly ending up considering a different set of child subnodes as valid
+than the set CachedPlanLockPartitions() would have locked the relations of, if
+the pruning steps produce a different result when executed multiple times.
 
 Each Plan node may have expression trees associated with it, to represent
 its target list, qualification conditions, etc.  These trees are also
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 2c2b3a8874..229f61f72e 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -798,6 +798,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 {
 	CmdType		operation = queryDesc->operation;
 	PlannedStmt *plannedstmt = queryDesc->plannedstmt;
+	List	   *part_prune_results = queryDesc->part_prune_results;
 	Plan	   *plan = plannedstmt->planTree;
 	List	   *rangeTable = plannedstmt->rtable;
 	EState	   *estate = queryDesc->estate;
@@ -819,6 +820,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 
 	estate->es_plannedstmt = plannedstmt;
 	estate->es_part_prune_infos = plannedstmt->partPruneInfos;
+	estate->es_part_prune_results = part_prune_results;
 
 	/*
 	 * Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index 65c4b63bbd..9745eba0af 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -66,6 +66,7 @@
 #define PARALLEL_KEY_QUERY_TEXT		UINT64CONST(0xE000000000000008)
 #define PARALLEL_KEY_JIT_INSTRUMENTATION UINT64CONST(0xE000000000000009)
 #define PARALLEL_KEY_WAL_USAGE			UINT64CONST(0xE00000000000000A)
+#define PARALLEL_KEY_PARTITION_PRUNE_RESULTS	UINT64CONST(0xE00000000000000B)
 
 #define PARALLEL_TUPLE_QUEUE_SIZE		65536
 
@@ -599,12 +600,15 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	FixedParallelExecutorState *fpes;
 	char	   *pstmt_data;
 	char	   *pstmt_space;
+	char	   *part_prune_results_data;
+	char	   *part_prune_results_space;
 	char	   *paramlistinfo_space;
 	BufferUsage *bufusage_space;
 	WalUsage   *walusage_space;
 	SharedExecutorInstrumentation *instrumentation = NULL;
 	SharedJitInstrumentation *jit_instrumentation = NULL;
 	int			pstmt_len;
+	int			part_prune_results_len;
 	int			paramlistinfo_len;
 	int			instrumentation_len = 0;
 	int			jit_instrumentation_len = 0;
@@ -633,6 +637,7 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 
 	/* Fix up and serialize plan to be sent to workers. */
 	pstmt_data = ExecSerializePlan(planstate->plan, estate);
+	part_prune_results_data = nodeToString(estate->es_part_prune_results);
 
 	/* Create a parallel context. */
 	pcxt = CreateParallelContext("postgres", "ParallelQueryMain", nworkers);
@@ -659,6 +664,11 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	shm_toc_estimate_chunk(&pcxt->estimator, pstmt_len);
 	shm_toc_estimate_keys(&pcxt->estimator, 1);
 
+	/* Estimate space for serialized List of PartitionPruneResult. */
+	part_prune_results_len = strlen(part_prune_results_data) + 1;
+	shm_toc_estimate_chunk(&pcxt->estimator, part_prune_results_len);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
 	/* Estimate space for serialized ParamListInfo. */
 	paramlistinfo_len = EstimateParamListSpace(estate->es_param_list_info);
 	shm_toc_estimate_chunk(&pcxt->estimator, paramlistinfo_len);
@@ -753,6 +763,12 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	memcpy(pstmt_space, pstmt_data, pstmt_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PLANNEDSTMT, pstmt_space);
 
+	/* Store serialized List of PartitionPruneResult */
+	part_prune_results_space = shm_toc_allocate(pcxt->toc, part_prune_results_len);
+	memcpy(part_prune_results_space, part_prune_results_data, part_prune_results_len);
+	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARTITION_PRUNE_RESULTS,
+				   part_prune_results_space);
+
 	/* Store serialized ParamListInfo. */
 	paramlistinfo_space = shm_toc_allocate(pcxt->toc, paramlistinfo_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARAMLISTINFO, paramlistinfo_space);
@@ -1234,8 +1250,10 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 						 int instrument_options)
 {
 	char	   *pstmtspace;
+	char	   *part_prune_results_space;
 	char	   *paramspace;
 	PlannedStmt *pstmt;
+	List	   *part_prune_results;
 	ParamListInfo paramLI;
 	char	   *queryString;
 
@@ -1246,12 +1264,17 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 	pstmtspace = shm_toc_lookup(toc, PARALLEL_KEY_PLANNEDSTMT, false);
 	pstmt = (PlannedStmt *) stringToNode(pstmtspace);
 
+	/* Reconstruct leader-supplied PartitionPruneResult. */
+	part_prune_results_space =
+		shm_toc_lookup(toc, PARALLEL_KEY_PARTITION_PRUNE_RESULTS, false);
+	part_prune_results = (List *) stringToNode(part_prune_results_space);
+
 	/* Reconstruct ParamListInfo. */
 	paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
 	paramLI = RestoreParamList(&paramspace);
 
 	/* Create a QueryDesc for the query. */
-	return CreateQueryDesc(pstmt,
+	return CreateQueryDesc(pstmt, part_prune_results,
 						   queryString,
 						   GetActiveSnapshot(), InvalidSnapshot,
 						   receiver, paramLI, NULL, instrument_options);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 5b62157712..dcd2bb0f90 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -25,6 +25,7 @@
 #include "mb/pg_wchar.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
+#include "parser/parsetree.h"
 #include "partitioning/partbounds.h"
 #include "partitioning/partdesc.h"
 #include "partitioning/partprune.h"
@@ -185,7 +186,11 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
 static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
 static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
 static PartitionPruneState *CreatePartitionPruneState(PlanState *planstate,
-													  PartitionPruneInfo *pruneinfo);
+													  PartitionPruneInfo *pruneinfo,
+													  bool consider_initial_steps,
+													  bool consider_exec_steps,
+													  List *rtable, ExprContext *econtext,
+													  PartitionDirectory partdir);
 static void InitPartitionPruneContext(PartitionPruneContext *context,
 									  List *pruning_steps,
 									  PartitionDesc partdesc,
@@ -198,7 +203,8 @@ static void PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
 static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
 										   PartitionedRelPruningData *pprune,
 										   bool initial_prune,
-										   Bitmapset **validsubplans);
+										   Bitmapset **validsubplans,
+										   Bitmapset **scan_leafpart_rtis);
 
 
 /*
@@ -1742,7 +1748,8 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  * considered to be a stable expression, it can change value from one plan
  * node scan to the next during query execution.  Stable comparison
  * expressions that don't involve such Params allow partition pruning to be
- * done once during executor startup.  Expressions that do involve such Params
+ * done once during executor startup or even before that, such as when called
+ * from CachedPlanLockPartitions().  Expressions that do involve such Params
  * require us to prune separately for each scan of the parent plan node.
  *
  * Note that pruning away unneeded subplans during executor startup has the
@@ -1760,6 +1767,12 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *		account for initial pruning possibly having eliminated some of the
  *		subplans.
  *
+ * ExecPartitionDoInitialPruning:
+ *		Do initial pruning with the information contained in a given
+ *		PartitionPruneInfo to determine the set of the parent plan node's
+ *		child subnodes that are valid for execution and also the set of the RT
+ *		indexes of leaf partitions scanned by those subnodes.
+ *
  * ExecFindMatchingSubPlans:
  *		Returns indexes of matching subplans after evaluating the expressions
  *		that are safe to evaluate at a given point.  This function is first
@@ -1780,8 +1793,10 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *
  * On return, *initially_valid_subplans is assigned the set of indexes of
  * child subplans that must be initialized along with the parent plan node.
- * Initial pruning is performed here if needed and in that case only the
- * surviving subplans' indexes are added.
+ * That set is computed either by performing the "initial pruning" here or by
+ * reusing the result found in EState.es_part_prune_results[part_prune_index]
+ * if it has been set, which is the case when CachedPlanLockPartitions() has
+ * already done the initial pruning.
  *
  * If subplans are indeed pruned, subplan_map arrays contained in the returned
  * PartitionPruneState are re-sequenced to not count those, though only if the
@@ -1794,9 +1809,10 @@ ExecInitPartitionPruning(PlanState *planstate,
 						 Bitmapset *root_parent_relids,
 						 Bitmapset **initially_valid_subplans)
 {
-	PartitionPruneState *prunestate;
+	PartitionPruneState *prunestate = NULL;
 	EState	   *estate = planstate->state;
 	PartitionPruneInfo *pruneinfo;
+	PartitionPruneResult *pruneresult = NULL;
 
 	/* Obtain the pruneinfo we need, and make sure it's the right one */
 	pruneinfo = list_nth(estate->es_part_prune_infos, part_prune_index);
@@ -1812,20 +1828,62 @@ ExecInitPartitionPruning(PlanState *planstate,
 	/* We may need an expression context to evaluate partition exprs */
 	ExecAssignExprContext(estate, planstate);
 
-	/* Create the working data structure for pruning */
-	prunestate = CreatePartitionPruneState(planstate, pruneinfo);
+	/* Initial pruning already done if es_part_prune_results has been set. */
+	if (estate->es_part_prune_results)
+	{
+		pruneresult = list_nth_node(PartitionPruneResult,
+									estate->es_part_prune_results,
+									part_prune_index);
+		if (!bms_equal(pruneresult->root_parent_relids, pruneinfo->root_parent_relids))
+			ereport(ERROR,
+					errcode(ERRCODE_INTERNAL_ERROR),
+					errmsg_internal("mismatching PartitionPruneInfo and PartitionPruneResult at part_prune_index %d",
+									part_prune_index),
+					errdetail_internal("pruneresult relids %s, pruneinfo relids %s",
+									   bmsToString(pruneresult->root_parent_relids),
+									   bmsToString(pruneinfo->root_parent_relids)));
+	}
+
+	if (pruneresult == NULL || pruneinfo->needs_exec_pruning)
+	{
+		/* We may need an expression context to evaluate partition exprs */
+		ExecAssignExprContext(estate, planstate);
+
+		/* For data reading, executor always includes detached partitions */
+		if (estate->es_partition_directory == NULL)
+			estate->es_partition_directory =
+				CreatePartitionDirectory(estate->es_query_cxt, false);
+
+		/*
+		 * Create the working data structure for pruning.  No need to consider
+		 * initial pruning steps if we have a PartitionPruneResult.
+		 */
+		prunestate = CreatePartitionPruneState(planstate, pruneinfo,
+											   pruneresult == NULL,
+											   pruneinfo->needs_exec_pruning,
+											   NIL, planstate->ps_ExprContext,
+											   estate->es_partition_directory);
+	}
 
 	/*
 	 * Perform an initial partition prune pass, if required.
 	 */
-	if (prunestate->do_initial_prune)
-		*initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true);
+	if (pruneresult)
+	{
+		*initially_valid_subplans = bms_copy(pruneresult->valid_subplan_offs);
+	}
+	else if (prunestate && prunestate->do_initial_prune)
+	{
+		*initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true,
+															 NULL);
+	}
 	else
 	{
-		/* No pruning, so we'll need to initialize all subplans */
+		/* No initial pruning, so we'll need to initialize all subplans */
 		Assert(n_total_subplans > 0);
 		*initially_valid_subplans = bms_add_range(NULL, 0,
 												  n_total_subplans - 1);
+		return prunestate;
 	}
 
 	/*
@@ -1833,7 +1891,8 @@ ExecInitPartitionPruning(PlanState *planstate,
 	 * that were removed above due to initial pruning.  No need to do this if
 	 * no steps were removed.
 	 */
-	if (bms_num_members(*initially_valid_subplans) < n_total_subplans)
+	if (prunestate &&
+		bms_num_members(*initially_valid_subplans) < n_total_subplans)
 	{
 		/*
 		 * We can safely skip this when !do_exec_prune, even though that
@@ -1849,11 +1908,58 @@ ExecInitPartitionPruning(PlanState *planstate,
 	return prunestate;
 }
 
+/*
+ * ExecPartitionDoInitialPruning
+ *		Perform initial pruning using given PartitionPruneInfo to determine
+ *		the set of the parent plan node's child subnodes that are valid for
+ *		execution
+ *
+ * On return, *scan_leafpart_rtis will contain the RT indexes of leaf
+ * partitions scanned by those valid subnodes.
+ *
+ * Note that this does not share state with the actual execution, so it must
+ * make do with the information present in the PlannedStmt.  For example,
+ * there isn't a PlanState for the parent plan node yet, so we must create a
+ * standalone ExprContext to evaluate pruning expressions, equipped with the
+ * information about the EXTERN parameters that we do have.  That's okay
+ * because the initial pruning steps do not contain anything that would
+ * require the execution to have started.  Likewise, we create our own
+ * PartitionDirectory to look up the PartitionDescs to use.
+ */
+Bitmapset *
+ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
+							  PartitionPruneInfo *pruneinfo,
+							  Bitmapset **scan_leafpart_rtis)
+{
+	List		 *rtable = plannedstmt->rtable;
+	ExprContext	 *econtext;
+	PartitionDirectory pdir;
+	PartitionPruneState *prunestate;
+	Bitmapset	 *valid_subplan_offs;
+
+	/* Don't omit detached partitions, just like during execution proper. */
+	pdir = CreatePartitionDirectory(CurrentMemoryContext, false);
+	econtext = CreateStandaloneExprContext();
+	econtext->ecxt_param_list_info = params;
+	prunestate = CreatePartitionPruneState(NULL, pruneinfo, true, false,
+										   rtable, econtext, pdir);
+	valid_subplan_offs = ExecFindMatchingSubPlans(prunestate, true,
+												  scan_leafpart_rtis);
+
+	FreeExprContext(econtext, true);
+	DestroyPartitionDirectory(pdir);
+
+	return valid_subplan_offs;
+}
+
 /*
  * CreatePartitionPruneState
  *		Build the data structure required for calling ExecFindMatchingSubPlans
  *
- * 'planstate' is the parent plan node's execution state.
+ * 'planstate', if not NULL, is the parent plan node's execution state.  It
+ * can be NULL if being called before ExecutorStart(), in which case,
+ * 'rtable' (range table), 'econtext', and 'partdir' must be explicitly
+ * provided.
  *
  * 'pruneinfo' is a PartitionPruneInfo as generated by
  * make_partition_pruneinfo.  Here we build a PartitionPruneState containing a
@@ -1867,19 +1973,21 @@ ExecInitPartitionPruning(PlanState *planstate,
  * PartitionedRelPruneInfo.
  */
 static PartitionPruneState *
-CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
+CreatePartitionPruneState(PlanState *planstate,
+						  PartitionPruneInfo *pruneinfo,
+						  bool consider_initial_steps,
+						  bool consider_exec_steps,
+						  List *rtable, ExprContext *econtext,
+						  PartitionDirectory partdir)
 {
-	EState	   *estate = planstate->state;
+	EState	   *estate = planstate ? planstate->state : NULL;
 	PartitionPruneState *prunestate;
 	int			n_part_hierarchies;
 	ListCell   *lc;
 	int			i;
-	ExprContext *econtext = planstate->ps_ExprContext;
 
-	/* For data reading, executor always omits detached partitions */
-	if (estate->es_partition_directory == NULL)
-		estate->es_partition_directory =
-			CreatePartitionDirectory(estate->es_query_cxt, false);
+	Assert((estate != NULL) ||
+			(partdir != NULL && econtext != NULL && rtable != NIL));
 
 	n_part_hierarchies = list_length(pruneinfo->prune_infos);
 	Assert(n_part_hierarchies > 0);
@@ -1934,15 +2042,39 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 			PartitionKey partkey;
 
 			/*
-			 * We can rely on the copies of the partitioned table's partition
-			 * key and partition descriptor appearing in its relcache entry,
-			 * because that entry will be held open and locked for the
-			 * duration of this executor run.
+			 * We must open the relation ourselves when called before
+			 * execution has started, such as when called from
+			 * CachedPlanLockPartitions().  In that case, sub-partitions must
+			 * be locked, because AcquirePlannerLocks() would have locked only
+			 * the root parent.
+			 */
+			if (estate == NULL)
+			{
+				RangeTblEntry *rte = rt_fetch(pinfo->rtindex, rtable);
+				int		lockmode = (j == 0) ? NoLock : rte->rellockmode;
+
+				partrel = table_open(rte->relid, lockmode);
+			}
+			else
+				partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
+
+			/*
+			 * We can rely on the copy of the partitioned table's partition
+			 * key in its relcache entry, because it can't change (or get
+			 * destroyed) as long as the relation is locked.  The partition
+			 * descriptor is taken from the PartitionDirectory, which holds
+			 * the table open long enough for the descriptor to remain valid
+			 * while it's used to perform the pruning steps.
 			 */
-			partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
 			partkey = RelationGetPartitionKey(partrel);
-			partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
-												partrel);
+			partdesc = PartitionDirectoryLookup(partdir, partrel);
+
+			/*
+			 * Must close partrel, keeping the lock taken, if we're not using
+			 * EState's entry.
+			 */
+			if (estate == NULL)
+				table_close(partrel, NoLock);
 
 			/*
 			 * Initialize the subplan_map and subpart_map.
@@ -2050,7 +2182,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 			 * Initialize pruning contexts as needed.
 			 */
 			pprune->initial_pruning_steps = pinfo->initial_pruning_steps;
-			if (pinfo->initial_pruning_steps)
+			if (consider_initial_steps && pinfo->initial_pruning_steps)
 			{
 				InitPartitionPruneContext(&pprune->initial_context,
 										  pinfo->initial_pruning_steps,
@@ -2060,7 +2192,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 				prunestate->do_initial_prune = true;
 			}
 			pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
-			if (pinfo->exec_pruning_steps)
+			if (consider_exec_steps && pinfo->exec_pruning_steps)
 			{
 				InitPartitionPruneContext(&pprune->exec_context,
 										  pinfo->exec_pruning_steps,
@@ -2288,10 +2420,14 @@ PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
  * Pass initial_prune if PARAM_EXEC Params cannot yet be evaluated.  This
  * differentiates the initial executor-time pruning step from later
  * runtime pruning.
+ *
+ * RT indexes of leaf partitions scanned by the chosen subplans are added to
+ * *scan_leafpart_rtis if the pointer is non-NULL.
  */
 Bitmapset *
 ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
-						 bool initial_prune)
+						 bool initial_prune,
+						 Bitmapset **scan_leafpart_rtis)
 {
 	Bitmapset  *result = NULL;
 	MemoryContext oldcontext;
@@ -2326,7 +2462,7 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
 		 */
 		pprune = &prunedata->partrelprunedata[0];
 		find_matching_subplans_recurse(prunedata, pprune, initial_prune,
-									   &result);
+									   &result, scan_leafpart_rtis);
 
 		/* Expression eval may have used space in ExprContext too */
 		if (pprune->exec_pruning_steps)
@@ -2340,6 +2476,8 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
 
 	/* Copy result out of the temp context before we reset it */
 	result = bms_copy(result);
+	if (scan_leafpart_rtis)
+		*scan_leafpart_rtis = bms_copy(*scan_leafpart_rtis);
 
 	MemoryContextReset(prunestate->prune_context);
 
@@ -2350,13 +2488,15 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
  * find_matching_subplans_recurse
  *		Recursive worker function for ExecFindMatchingSubPlans
  *
- * Adds valid (non-prunable) subplan IDs to *validsubplans
+ * Adds valid (non-prunable) subplan IDs to *validsubplans and RT indexes of
+ * the corresponding leaf partitions to *scan_leafpart_rtis (if asked for).
  */
 static void
 find_matching_subplans_recurse(PartitionPruningData *prunedata,
 							   PartitionedRelPruningData *pprune,
 							   bool initial_prune,
-							   Bitmapset **validsubplans)
+							   Bitmapset **validsubplans,
+							   Bitmapset **scan_leafpart_rtis)
 {
 	Bitmapset  *partset;
 	int			i;
@@ -2383,8 +2523,14 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
 	while ((i = bms_next_member(partset, i)) >= 0)
 	{
 		if (pprune->subplan_map[i] >= 0)
+		{
 			*validsubplans = bms_add_member(*validsubplans,
 											pprune->subplan_map[i]);
+			Assert(pprune->rti_map[i] > 0);
+			if (scan_leafpart_rtis)
+				*scan_leafpart_rtis = bms_add_member(*scan_leafpart_rtis,
+													 pprune->rti_map[i]);
+		}
 		else
 		{
 			int			partidx = pprune->subpart_map[i];
@@ -2392,7 +2538,8 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
 			if (partidx >= 0)
 				find_matching_subplans_recurse(prunedata,
 											   &prunedata->partrelprunedata[partidx],
-											   initial_prune, validsubplans);
+											   initial_prune, validsubplans,
+											   scan_leafpart_rtis);
 			else
 			{
 				/*
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 87f4d53ca7..7d36c972d3 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -139,6 +139,7 @@ CreateExecutorState(void)
 	estate->es_param_exec_vals = NULL;
 
 	estate->es_queryEnv = NULL;
+	estate->es_part_prune_results = NIL;
 
 	estate->es_query_cxt = qcontext;
 
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index dc13625171..bffb42ce71 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -842,7 +842,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
 	else
 		dest = None_Receiver;
 
-	es->qd = CreateQueryDesc(es->stmt,
+	es->qd = CreateQueryDesc(es->stmt, NIL,
 							 fcache->src,
 							 GetActiveSnapshot(),
 							 InvalidSnapshot,
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 99830198bd..3b917584de 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -156,7 +156,8 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 		 * subplan, we can fill as_valid_subplans immediately, preventing
 		 * later calls to ExecFindMatchingSubPlans.
 		 */
-		if (!prunestate->do_exec_prune && nplans > 0)
+		if (appendstate->as_prune_state == NULL ||
+			(!appendstate->as_prune_state->do_exec_prune && nplans > 0))
 			appendstate->as_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
 	}
 	else
@@ -578,7 +579,7 @@ choose_next_subplan_locally(AppendState *node)
 		}
 		else if (node->as_valid_subplans == NULL)
 			node->as_valid_subplans =
-				ExecFindMatchingSubPlans(node->as_prune_state, false);
+				ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 
 		whichplan = -1;
 	}
@@ -643,7 +644,7 @@ choose_next_subplan_for_leader(AppendState *node)
 		if (node->as_valid_subplans == NULL)
 		{
 			node->as_valid_subplans =
-				ExecFindMatchingSubPlans(node->as_prune_state, false);
+				ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 
 			/*
 			 * Mark each invalid plan as finished to allow the loop below to
@@ -718,7 +719,7 @@ choose_next_subplan_for_worker(AppendState *node)
 	else if (node->as_valid_subplans == NULL)
 	{
 		node->as_valid_subplans =
-			ExecFindMatchingSubPlans(node->as_prune_state, false);
+			ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 		mark_invalid_subplans_as_finished(node);
 	}
 
@@ -869,7 +870,7 @@ ExecAppendAsyncBegin(AppendState *node)
 	if (node->as_valid_subplans == NULL)
 	{
 		node->as_valid_subplans =
-			ExecFindMatchingSubPlans(node->as_prune_state, false);
+			ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 
 		classify_matching_subplans(node);
 	}
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index f370f9f287..ccfa083945 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -104,7 +104,8 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
 		 * subplan, we can fill ms_valid_subplans immediately, preventing
 		 * later calls to ExecFindMatchingSubPlans.
 		 */
-		if (!prunestate->do_exec_prune && nplans > 0)
+		if (mergestate->ms_prune_state == NULL ||
+			(!mergestate->ms_prune_state->do_exec_prune && nplans > 0))
 			mergestate->ms_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
 	}
 	else
@@ -219,7 +220,7 @@ ExecMergeAppend(PlanState *pstate)
 		 */
 		if (node->ms_valid_subplans == NULL)
 			node->ms_valid_subplans =
-				ExecFindMatchingSubPlans(node->ms_prune_state, false);
+				ExecFindMatchingSubPlans(node->ms_prune_state, false, NULL);
 
 		/*
 		 * First time through: pull the first tuple from each valid subplan,
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index fd5796f1b9..2ecb9193aa 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1577,6 +1577,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 {
 	CachedPlanSource *plansource;
 	CachedPlan *cplan;
+	CachedPlanExtra *cplan_extra;
 	List	   *stmt_list;
 	char	   *query_string;
 	Snapshot	snapshot;
@@ -1657,7 +1658,11 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 	 */
 
 	/* Replan if needed, and increment plan refcount for portal */
-	cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv);
+	cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv,
+						  &cplan_extra);
+	Assert(cplan_extra == NULL ||
+		   (list_length(cplan->stmt_list) ==
+			list_length(cplan_extra->part_prune_results_list)));
 	stmt_list = cplan->stmt_list;
 
 	if (!plan->saved)
@@ -1685,6 +1690,9 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 					  stmt_list,
 					  cplan);
 
+	if (cplan_extra)
+		PortalSaveCachedPlanExtra(portal, cplan_extra);
+
 	/*
 	 * Set up options for portal.  Default SCROLL type is chosen the same way
 	 * as PerformCursorOpen does it.
@@ -2067,6 +2075,7 @@ SPI_plan_get_cached_plan(SPIPlanPtr plan)
 {
 	CachedPlanSource *plansource;
 	CachedPlan *cplan;
+	CachedPlanExtra *cplan_extra = NULL;
 	SPICallbackArg spicallbackarg;
 	ErrorContextCallback spierrcontext;
 
@@ -2092,8 +2101,12 @@ SPI_plan_get_cached_plan(SPIPlanPtr plan)
 	/* Get the generic plan for the query */
 	cplan = GetCachedPlan(plansource, NULL,
 						  plan->saved ? CurrentResourceOwner : NULL,
-						  _SPI_current->queryEnv);
+						  _SPI_current->queryEnv,
+						  &cplan_extra);
 	Assert(cplan == plansource->gplan);
+	Assert(cplan_extra == NULL ||
+		   (list_length(cplan->stmt_list) ==
+			list_length(cplan_extra->part_prune_results_list)));
 
 	/* Pop the error context stack */
 	error_context_stack = spierrcontext.previous;
@@ -2399,6 +2412,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 	SPICallbackArg spicallbackarg;
 	ErrorContextCallback spierrcontext;
 	CachedPlan *cplan = NULL;
+	CachedPlanExtra *cplan_extra = NULL;
 	ListCell   *lc1;
 
 	/*
@@ -2549,8 +2563,12 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 		 * plan, the refcount must be backed by the plan_owner.
 		 */
 		cplan = GetCachedPlan(plansource, options->params,
-							  plan_owner, _SPI_current->queryEnv);
+							  plan_owner, _SPI_current->queryEnv,
+							  &cplan_extra);
 
+		Assert(cplan_extra == NULL ||
+			   (list_length(cplan->stmt_list) ==
+				list_length(cplan_extra->part_prune_results_list)));
 		stmt_list = cplan->stmt_list;
 
 		/*
@@ -2592,9 +2610,14 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 		foreach(lc2, stmt_list)
 		{
 			PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+			List	   *part_prune_results = NIL;
 			bool		canSetTag = stmt->canSetTag;
 			DestReceiver *dest;
 
+			if (cplan_extra)
+				part_prune_results = list_nth_node(List,
+												   cplan_extra->part_prune_results_list,
+												   foreach_current_index(lc2));
 			/*
 			 * Reset output state.  (Note that if a non-SPI receiver is used,
 			 * _SPI_current->processed will stay zero, and that's what we'll
@@ -2663,7 +2686,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 				else
 					snap = InvalidSnapshot;
 
-				qdesc = CreateQueryDesc(stmt,
+				qdesc = CreateQueryDesc(stmt, part_prune_results,
 										plansource->query_string,
 										snap, crosscheck_snapshot,
 										dest,
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index ed43d5936d..db27cae297 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -372,6 +372,7 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 	{
 		PartitionPruneInfo *pruneinfo = lfirst(lc);
 		ListCell   *l;
+		Bitmapset  *leafpart_rtis = NULL;
 
 		pruneinfo->root_parent_relids =
 			offset_relid_set(pruneinfo->root_parent_relids, rtoffset);
@@ -383,17 +384,52 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 			foreach(l2, prune_infos)
 			{
 				PartitionedRelPruneInfo *pinfo = lfirst(l2);
+				int		i;
 
 				/* RT index of the table to which the pinfo belongs. */
 				pinfo->rtindex += rtoffset;
+
+				/* Also of the leaf partitions that might be scanned. */
+				for (i = 0; i < pinfo->nparts; i++)
+				{
+					if (pinfo->rti_map[i] > 0 && pinfo->subplan_map[i] >= 0)
+					{
+						pinfo->rti_map[i] += rtoffset;
+						leafpart_rtis = bms_add_member(leafpart_rtis,
+													   pinfo->rti_map[i]);
+					}
+				}
 			}
 
 		}
 
+		if (pruneinfo->needs_init_pruning)
+		{
+			glob->containsInitialPruning = true;
+
+			/*
+			 * Delete the leaf partition RTIs from the set of relations to be
+			 * locked by AcquireExecutorLocks().  The actual set of leaf
+			 * partitions to be locked is computed by
+			 * CachedPlanLockPartitions().
+			 */
+			glob->minLockRelids = bms_del_members(glob->minLockRelids,
+												  leafpart_rtis);
+		}
+
 		glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
 		glob->containsInitialPruning |= pruneinfo->needs_init_pruning;
 	}
 
+	/*
+	 * It seems worth doing a bms_copy() on glob->minLockRelids if we deleted
+	 * bits from it above to get rid of any empty tail bits.  It seems better
+	 * for the loop over this set in AcquireExecutorLocks() to not have to go
+	 * through those useless bit words.
+	 */
+	if (glob->containsInitialPruning)
+		glob->minLockRelids = bms_copy(glob->minLockRelids);
+
 	return result;
 }
 
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index f8808d2191..9c1c7bfa9e 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1598,6 +1598,7 @@ exec_bind_message(StringInfo input_message)
 	int16	   *rformats = NULL;
 	CachedPlanSource *psrc;
 	CachedPlan *cplan;
+	CachedPlanExtra *cplan_extra = NULL;
 	Portal		portal;
 	char	   *query_string;
 	char	   *saved_stmt_name;
@@ -1972,7 +1973,10 @@ exec_bind_message(StringInfo input_message)
 	 * will be generated in MessageContext.  The plan refcount will be
 	 * assigned to the Portal, so it will be released at portal destruction.
 	 */
-	cplan = GetCachedPlan(psrc, params, NULL, NULL);
+	cplan = GetCachedPlan(psrc, params, NULL, NULL, &cplan_extra);
+	Assert(cplan_extra == NULL ||
+		   (list_length(cplan->stmt_list) ==
+			list_length(cplan_extra->part_prune_results_list)));
 
 	/*
 	 * Now we can define the portal.
@@ -1987,6 +1991,9 @@ exec_bind_message(StringInfo input_message)
 					  cplan->stmt_list,
 					  cplan);
 
+	if (cplan_extra)
+		PortalSaveCachedPlanExtra(portal, cplan_extra);
+
 	/* Done with the snapshot used for parameter I/O and parsing/planning */
 	if (snapshot_set)
 		PopActiveSnapshot();
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 52e2db6452..32e6b7b767 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -35,7 +35,7 @@
 Portal		ActivePortal = NULL;
 
 
-static void ProcessQuery(PlannedStmt *plan,
+static void ProcessQuery(PlannedStmt *plan, List *part_prune_results,
 						 const char *sourceText,
 						 ParamListInfo params,
 						 QueryEnvironment *queryEnv,
@@ -65,6 +65,7 @@ static void DoPortalRewind(Portal portal);
  */
 QueryDesc *
 CreateQueryDesc(PlannedStmt *plannedstmt,
+				List *part_prune_results,
 				const char *sourceText,
 				Snapshot snapshot,
 				Snapshot crosscheck_snapshot,
@@ -77,6 +78,7 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
 
 	qd->operation = plannedstmt->commandType;	/* operation */
 	qd->plannedstmt = plannedstmt;	/* plan */
+	qd->part_prune_results = part_prune_results;
 	qd->sourceText = sourceText;	/* query text */
 	qd->snapshot = RegisterSnapshot(snapshot);	/* snapshot */
 	/* RI check snapshot */
@@ -122,6 +124,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  *		PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
  *
  *	plan: the plan tree for the query
+ *	part_prune_results: pruning results returned by CachedPlanLockPartitions()
  *	sourceText: the source text of the query
  *	params: any parameters needed
  *	dest: where to send results
@@ -134,6 +137,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  */
 static void
 ProcessQuery(PlannedStmt *plan,
+			 List *part_prune_results,
 			 const char *sourceText,
 			 ParamListInfo params,
 			 QueryEnvironment *queryEnv,
@@ -145,7 +149,7 @@ ProcessQuery(PlannedStmt *plan,
 	/*
 	 * Create the QueryDesc object
 	 */
-	queryDesc = CreateQueryDesc(plan, sourceText,
+	queryDesc = CreateQueryDesc(plan, part_prune_results, sourceText,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, params, queryEnv, 0);
 
@@ -491,8 +495,13 @@ PortalStart(Portal portal, ParamListInfo params,
 				/*
 				 * Create QueryDesc in portal's context; for the moment, set
 				 * the destination to DestNone.
+				 *
+				 * There is no PartitionPruneResult unless the PlannedStmt is
+				 * from a CachedPlan.
 				 */
 				queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+											portal->cplan_extra == NULL ? NIL :
+											linitial(portal->cplan_extra->part_prune_results_list),
 											portal->sourceText,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
@@ -1225,6 +1234,8 @@ PortalRunMulti(Portal portal,
 
 		if (pstmt->utilityStmt == NULL)
 		{
+			List *part_prune_results = NIL;
+
 			/*
 			 * process a plannable query.
 			 */
@@ -1271,10 +1282,19 @@ PortalRunMulti(Portal portal,
 			else
 				UpdateActiveSnapshotCommandId();
 
+			/*
+			 * Determine if there's a corresponding List of PartitionPruneResult
+			 * for this PlannedStmt.
+			 */
+			if (portal->cplan_extra)
+				part_prune_results = list_nth_node(List,
+												   portal->cplan_extra->part_prune_results_list,
+												   foreach_current_index(stmtlist_item));
+
 			if (pstmt->canSetTag)
 			{
 				/* statement can set tag string */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, part_prune_results,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
@@ -1283,7 +1303,7 @@ PortalRunMulti(Portal portal,
 			else
 			{
 				/* stmt added by rewrite cannot set tag */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, part_prune_results,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 339bb603f7..7bd94e7632 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -59,6 +59,7 @@
 #include "access/transam.h"
 #include "catalog/namespace.h"
 #include "executor/executor.h"
+#include "executor/execPartition.h"
 #include "miscadmin.h"
 #include "nodes/nodeFuncs.h"
 #include "optimizer/optimizer.h"
@@ -96,17 +97,20 @@ static dlist_head saved_plan_list = DLIST_STATIC_INIT(saved_plan_list);
  */
 static dlist_head cached_expression_list = DLIST_STATIC_INIT(cached_expression_list);
 
+static CachedPlan *GetCachedPlanInternal(CachedPlanSource *plansource,
+					  ParamListInfo boundParams, ResourceOwner owner,
+					  QueryEnvironment *queryEnv, bool *hasUnlockedParts);
 static void ReleaseGenericPlan(CachedPlanSource *plansource);
 static List *RevalidateCachedQuery(CachedPlanSource *plansource,
 								   QueryEnvironment *queryEnv);
-static bool CheckCachedPlan(CachedPlanSource *plansource);
+static bool CheckCachedPlan(CachedPlanSource *plansource, bool *hasUnlockedParts);
 static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 								   ParamListInfo boundParams, QueryEnvironment *queryEnv);
 static bool choose_custom_plan(CachedPlanSource *plansource,
 							   ParamListInfo boundParams);
 static double cached_plan_cost(CachedPlan *plan, bool include_planner);
 static Query *QueryListGetPrimaryStmt(List *stmts);
-static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static bool AcquireExecutorLocks(List *stmt_list, bool acquire);
 static void AcquirePlannerLocks(List *stmt_list, bool acquire);
 static void ScanQueryForLocks(Query *parsetree, bool acquire);
 static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -783,16 +787,23 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
 }
 
 /*
- * CheckCachedPlan: see if the CachedPlanSource's generic plan is valid.
+ * CheckCachedPlan: see if the CachedPlanSource's generic plan is valid and
+ * set *hasUnlockedParts if any PlannedStmt contains "initially" prunable
+ * subnodes; partitions are not locked till initial pruning is done.
  *
  * Caller must have already called RevalidateCachedQuery to verify that the
  * querytree is up to date.
  *
- * On a "true" return, we have acquired the locks needed to run the plan.
+ * On a "true" return, we have acquired the minimal set of locks needed to run
+ * the plan, that is, excluding partitions that are subject to being pruned
+ * before execution.  The caller must perform the initial pruning and lock
+ * the partitions that survive it before actually telling the world that the
+ * plan is "valid".
+ *
  * (We must do this for the "true" result to be race-condition-free.)
  */
 static bool
-CheckCachedPlan(CachedPlanSource *plansource)
+CheckCachedPlan(CachedPlanSource *plansource, bool *hasUnlockedParts)
 {
 	CachedPlan *plan = plansource->gplan;
 
@@ -826,7 +837,7 @@ CheckCachedPlan(CachedPlanSource *plansource)
 		 */
 		Assert(plan->refcount > 0);
 
-		AcquireExecutorLocks(plan->stmt_list, true);
+		*hasUnlockedParts = AcquireExecutorLocks(plan->stmt_list, true);
 
 		/*
 		 * If plan was transient, check to see if TransactionXmin has
@@ -848,7 +859,7 @@ CheckCachedPlan(CachedPlanSource *plansource)
 		}
 
 		/* Oops, the race case happened.  Release useless locks. */
-		AcquireExecutorLocks(plan->stmt_list, false);
+		(void) AcquireExecutorLocks(plan->stmt_list, false);
 	}
 
 	/*
@@ -1120,7 +1131,125 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
 }
 
 /*
- * GetCachedPlan: get a cached plan from a CachedPlanSource.
+ * For each PlannedStmt in plan->stmt_list, do initial partition pruning if
+ * needed and lock partitions that survive.
+ *
+ * On return, *part_prune_results_list has the same length as plan->stmt_list
+ * and contains, for each PlannedStmt, either NIL if it has no
+ * PartitionPruneInfos requiring initial pruning or a List of
+ * PartitionPruneResult with one element per entry in stmt->partPruneInfos.
+ *
+ * Likewise, *lockedRelids_per_stmt has the same length as plan->stmt_list and
+ * contains, for each PlannedStmt, either NULL if no additional relations
+ * needed to be locked or a bitmapset of the RT indexes of partitions locked.
+ * Returns true, with both lists set, only if the plan is still valid.
+ */
+static bool
+CachedPlanLockPartitions(CachedPlan *plan,
+						 ParamListInfo boundParams,
+						 ResourceOwner owner,
+						 List **part_prune_results_list,
+						 List **lockedRelids_per_stmt)
+{
+	List	   *my_part_prune_results_list = NIL;
+	List	   *my_lockedRelids_per_stmt = NIL;
+	ListCell   *lc1;
+	MemoryContext oldcontext,
+			tmpcontext;
+
+	*part_prune_results_list = NIL;
+	*lockedRelids_per_stmt = NIL;
+
+	/*
+	 * Create a temporary context for memory allocations required while
+	 * executing partition pruning steps.
+	 */
+	tmpcontext = AllocSetContextCreate(CurrentMemoryContext,
+									   "CachedPlanLockPartitions() working data",
+									   ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(tmpcontext);
+	foreach(lc1, plan->stmt_list)
+	{
+		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
+		Bitmapset  *lockPartRelids = NULL;
+		int			rti;
+		List	   *part_prune_results = NIL;
+		Bitmapset  *lockedRelids = NULL;
+
+		if (plannedstmt->commandType == CMD_UTILITY)
+		{
+			/*
+			 * Ignore utility statements, because AcquireExecutorLocks on the
+			 * parent CachedPlan would have dealt with these.  Though, do let
+			 * the caller know that no pruning is applicable to this statement.
+			 */
+			my_part_prune_results_list = lappend(my_part_prune_results_list,
+												 NIL);
+			my_lockedRelids_per_stmt = lappend(my_lockedRelids_per_stmt, NULL);
+			continue;
+		}
+
+		/* Figure out the partitions that would need to be locked. */
+		if (plannedstmt->containsInitialPruning)
+		{
+			ListCell *lc2;
+
+			foreach(lc2, plannedstmt->partPruneInfos)
+			{
+				PartitionPruneInfo *pruneinfo = lfirst_node(PartitionPruneInfo, lc2);
+				PartitionPruneResult *pruneresult = makeNode(PartitionPruneResult);
+
+				pruneresult->root_parent_relids =
+					bms_copy(pruneinfo->root_parent_relids);
+				pruneresult->valid_subplan_offs =
+					ExecPartitionDoInitialPruning(plannedstmt, boundParams,
+												  pruneinfo,
+												  &lockPartRelids);
+				part_prune_results = lappend(part_prune_results, pruneresult);
+			}
+		}
+
+		rti = -1;
+		while ((rti = bms_next_member(lockPartRelids, rti)) > 0)
+		{
+			RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
+
+			Assert(rte->rtekind == RTE_RELATION);
+
+			/*
+			 * Acquire the appropriate type of lock on each relation OID. Note
+			 * that we don't actually try to open the rel, and hence will not
+			 * fail if it's been dropped entirely --- we'll just transiently
+			 * acquire a non-conflicting lock.
+			 */
+			LockRelationOid(rte->relid, rte->rellockmode);
+			lockedRelids = bms_add_member(lockedRelids, rti);
+		}
+
+		my_part_prune_results_list = lappend(my_part_prune_results_list,
+											 part_prune_results);
+		my_lockedRelids_per_stmt = lappend(my_lockedRelids_per_stmt,
+										   lockedRelids);
+	}
+
+	/*
+	 * If the plan is still valid, copy the prune results and lockRelids
+	 * bitmapsets into the caller's context.
+	 */
+	MemoryContextSwitchTo(oldcontext);
+	if (plan->is_valid)
+	{
+		*part_prune_results_list = copyObject(my_part_prune_results_list);
+		*lockedRelids_per_stmt = copyObject(my_lockedRelids_per_stmt);
+	}
+
+	/* Clear up the temporary context. */
+	MemoryContextDelete(tmpcontext);
+	return plan->is_valid;
+}
+
+/*
+ * GetCachedPlan: get a cached plan from a CachedPlanSource
  *
  * This function hides the logic that decides whether to use a generic
  * plan or a custom plan for the given parameters: the caller does not know
@@ -1139,7 +1268,97 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
  */
 CachedPlan *
 GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
-			  ResourceOwner owner, QueryEnvironment *queryEnv)
+			  ResourceOwner owner, QueryEnvironment *queryEnv,
+			  CachedPlanExtra **extra)
+{
+	CachedPlan *plan;
+
+	Assert(extra != NULL);
+	*extra = NULL;
+	for (;;)
+	{
+		bool	hasUnlockedParts = false;
+
+		/* Actually get the plan. */
+		plan = GetCachedPlanInternal(plansource, boundParams, owner, queryEnv,
+									 &hasUnlockedParts);
+		Assert(plan->is_valid);
+
+		/* Nothing to do if all relations already locked. */
+		if (!hasUnlockedParts)
+			return plan;
+		else
+		{
+			/*
+			 * Do initial pruning to filter out partitions that need not be
+			 * locked for execution.
+			 */
+			ListCell *lc1,
+				   *lc2;
+			List   *part_prune_results_list;
+			List   *lockedRelids_per_stmt;
+
+			/* Only a generic plan can ever have unlocked partitions in it. */
+			Assert(plan == plansource->gplan);
+
+			/*
+			 * This does:
+			 *
+			 * 	1) the pruning, returning in part_prune_results_list the
+			 * 	PartitionPruneResult Lists for all statements
+			 *
+			 * 	2) lock partitions that survive in each statement, returning
+			 * 	in lockedRelids_per_stmt the RT indexes of those locked.
+			 *
+			 * True is returned if the plan is still valid after locking all
+			 * partitions; false otherwise, in which case we must get a new
+			 * plan.
+			 */
+			if (CachedPlanLockPartitions(plan, boundParams, owner,
+										 &part_prune_results_list,
+										 &lockedRelids_per_stmt))
+			{
+				Assert(plan->is_valid);
+				*extra = (CachedPlanExtra *) palloc(sizeof(CachedPlanExtra));
+				(*extra)->part_prune_results_list = part_prune_results_list;
+				return plan;
+			}
+
+			/*
+			 * Release the locks and start over.  This is the same as what
+			 * CheckCachedPlan does when doing AcquireExecutorLocks() causes
+			 * the plan to be invalidated.
+			 */
+			forboth(lc1, plan->stmt_list, lc2, lockedRelids_per_stmt)
+			{
+				PlannedStmt *plannedstmt = lfirst(lc1);
+				Bitmapset *lockedRelids = lfirst(lc2);
+				int		rti;
+
+				if (plannedstmt->commandType == CMD_UTILITY)
+					continue;
+				rti = -1;
+				while ((rti = bms_next_member(lockedRelids, rti)) > 0)
+				{
+					RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
+
+					Assert(rte->rtekind == RTE_RELATION);
+
+					UnlockRelationOid(rte->relid, rte->rellockmode);
+				}
+			}
+		}
+	}
+
+	Assert(false);
+	return NULL;
+}
+
+/* Internal workhorse of GetCachedPlan() */
+static CachedPlan *
+GetCachedPlanInternal(CachedPlanSource *plansource, ParamListInfo boundParams,
+					  ResourceOwner owner, QueryEnvironment *queryEnv,
+					  bool *hasUnlockedParts)
 {
 	CachedPlan *plan = NULL;
 	List	   *qlist;
@@ -1160,7 +1379,7 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 
 	if (!customplan)
 	{
-		if (CheckCachedPlan(plansource))
+		if (CheckCachedPlan(plansource, hasUnlockedParts))
 		{
 			/* We want a generic plan, and we already have a valid one */
 			plan = plansource->gplan;
@@ -1738,11 +1957,16 @@ QueryListGetPrimaryStmt(List *stmts)
 /*
  * AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
  * or release them if acquire is false.
+ *
+ * If some PlannedStmt(s) contain "initially prunable" partitions, they are not
+ * locked here. Instead, the caller is informed of their existence so that it
+ * can lock them after doing the initial pruning.
  */
-static void
+static bool
 AcquireExecutorLocks(List *stmt_list, bool acquire)
 {
 	ListCell   *lc1;
+	bool		hasUnlockedParts = false;
 
 	foreach(lc1, stmt_list)
 	{
@@ -1763,10 +1987,17 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 
 			Assert(plannedstmt->minLockRelids == NULL);
 			if (query)
 				ScanQueryForLocks(query, acquire);
 			continue;
 		}
 
+		/*
+		 * If partitions can be pruned before execution, defer their locking to
+		 * the caller.
+		 */
+		if (plannedstmt->containsInitialPruning)
+			hasUnlockedParts = true;
+
 		allLockRelids = plannedstmt->minLockRelids;
 		rti = -1;
 		while ((rti = bms_next_member(allLockRelids, rti)) > 0)
@@ -1788,6 +2019,8 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 				UnlockRelationOid(rte->relid, rte->rellockmode);
 		}
 	}
+
+	return hasUnlockedParts;
 }
 
 /*
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index 7b1ae6fdcf..94a9db84e3 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -303,6 +303,22 @@ PortalDefineQuery(Portal portal,
 	portal->status = PORTAL_DEFINED;
 }
 
+/*
+ * Copies the given CachedPlanExtra struct into the portal.
+ */
+void
+PortalSaveCachedPlanExtra(Portal portal, CachedPlanExtra *extra)
+{
+	MemoryContext	oldcxt = MemoryContextSwitchTo(portal->portalContext);
+
+	Assert(portal->cplan_extra == NULL && extra != NULL);
+	portal->cplan_extra = (CachedPlanExtra *)
+		palloc(sizeof(CachedPlanExtra));
+	portal->cplan_extra->part_prune_results_list =
+		copyObject(extra->part_prune_results_list);
+	MemoryContextSwitchTo(oldcxt);
+}
+
 /*
  * PortalReleaseCachedPlan
  *		Release a portal's reference to its cached plan, if any.
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 9ebde089ae..269cc4d562 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -87,7 +87,9 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
 							  ExplainState *es, const char *queryString,
 							  ParamListInfo params, QueryEnvironment *queryEnv);
 
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt,
+						   List *part_prune_results,
+						   IntoClause *into,
 						   ExplainState *es, const char *queryString,
 						   ParamListInfo params, QueryEnvironment *queryEnv,
 						   const instr_time *planduration,
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index aeeaeb7884..4b98d0d2ef 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -129,5 +129,10 @@ extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
 													 Bitmapset *root_parent_relids,
 													 Bitmapset **initially_valid_subplans);
 extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
-										   bool initial_prune);
+										   bool initial_prune,
+										   Bitmapset **scan_leafpart_rtis);
+extern Bitmapset *ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt,
+								ParamListInfo params,
+								PartitionPruneInfo *pruneinfo,
+								Bitmapset **scan_leafpart_rtis);
 #endif							/* EXECPARTITION_H */
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index e79e2c001f..5a7d075750 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -35,6 +35,8 @@ typedef struct QueryDesc
 	/* These fields are provided by CreateQueryDesc */
 	CmdType		operation;		/* CMD_SELECT, CMD_UPDATE, etc. */
 	PlannedStmt *plannedstmt;	/* planner's output (could be utility, too) */
+	List		*part_prune_results; /* PartitionPruneResults returned by
+									  * CachedPlanLockPartitions() */
 	const char *sourceText;		/* source text of the query */
 	Snapshot	snapshot;		/* snapshot to use for query */
 	Snapshot	crosscheck_snapshot;	/* crosscheck for RI update/delete */
@@ -57,6 +59,7 @@ typedef struct QueryDesc
 
 /* in pquery.c */
 extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
+								  List *part_prune_results,
 								  const char *sourceText,
 								  Snapshot snapshot,
 								  Snapshot crosscheck_snapshot,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 9a64a830a2..f1374057e5 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -617,6 +617,7 @@ typedef struct EState
 	List	   *es_rteperminfos;	/* List of RTEPermissionInfo */
 	PlannedStmt *es_plannedstmt;	/* link to top of plan tree */
 	List	   *es_part_prune_infos;	/* PlannedStmt.partPruneInfos */
+	List	   *es_part_prune_results; /* QueryDesc.part_prune_results */
 	const char *es_sourceText;	/* Source text from QueryDesc */
 
 	JunkFilter *es_junkFilter;	/* top-level junk filter, if any */
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 4337e7aa34..10f12e780e 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -134,8 +134,8 @@ typedef struct PlannerGlobal
 	bool		containsInitialPruning;
 
 	/*
-	 * Indexes of all range table entries; for AcquireExecutorLocks()'s
-	 * perusal.
+	 * Indexes of all range table entries except those of leaf partitions
+	 * scanned by prunable subplans; for AcquireExecutorLocks()'s perusal.
 	 */
 	Bitmapset  *minLockRelids;
 
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index eb0a007946..ab8bc74e4a 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -82,7 +82,9 @@ typedef struct PlannedStmt
 	List	   *permInfos;		/* list of RTEPermissionInfo nodes for rtable
 								 * entries needing one */
 
-	Bitmapset  *minLockRelids;	/* Indexes of all range table entries; for
+	Bitmapset  *minLockRelids;	/* Indexes of all range table entries except
+								 * those of leaf partitions scanned by
+								 * prunable subplans; for
 								 * AcquireExecutorLocks()'s perusal */
 
 	/* rtable indexes of target relations for INSERT/UPDATE/DELETE/MERGE */
@@ -1575,6 +1577,33 @@ typedef struct PartitionPruneStepCombine
 	List	   *source_stepids;
 } PartitionPruneStepCombine;
 
+/*----------------
+ * PartitionPruneResult
+ *
+ * The result of performing ExecPartitionDoInitialPruning() on a given
+ * PartitionPruneInfo.
+ *
+ * root_parent_relids is the same as PartitionPruneInfo.root_parent_relids.  It's
+ * there for cross-checking in ExecInitPartitionPruning() that the
+ * PartitionPruneResult and the PartitionPruneInfo at a given index in
+ * EState.es_part_prune_results and EState.es_part_prune_infos, respectively,
+ * belong to the same parent plan node.
+ *
+ * valid_subplan_offs contains the indexes of subplans remaining after
+ * performing initial pruning by calling ExecFindMatchingSubPlans() on the
+ * PartitionPruneInfo.
+ *
+ * This is used to store the result of initial partition pruning that is
+ * performed before the execution has started, such as in
+ * CachedPlanLockPartitions().
+ */
+typedef struct PartitionPruneResult
+{
+	NodeTag		type;
+
+	Bitmapset	   *root_parent_relids;
+	Bitmapset	   *valid_subplan_offs;
+} PartitionPruneResult;
 
 /*
  * Plan invalidation info
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 0499635f59..4ac66d2761 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -160,6 +160,14 @@ typedef struct CachedPlan
 	MemoryContext context;		/* context containing this CachedPlan */
 } CachedPlan;
 
+/*
+ * Additional information to pass the executor when executing a CachedPlan.
+ */
+typedef struct CachedPlanExtra
+{
+	List	   *part_prune_results_list;
+} CachedPlanExtra;
+
 /*
  * CachedExpression is a low-overhead mechanism for caching the planned form
  * of standalone scalar expressions.  While such expressions are not usually
@@ -220,7 +228,8 @@ extern List *CachedPlanGetTargetList(CachedPlanSource *plansource,
 extern CachedPlan *GetCachedPlan(CachedPlanSource *plansource,
 								 ParamListInfo boundParams,
 								 ResourceOwner owner,
-								 QueryEnvironment *queryEnv);
+								 QueryEnvironment *queryEnv,
+								 CachedPlanExtra **extra);
 extern void ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner);
 
 extern bool CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index aeddbdafe5..49bb00cda5 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -138,6 +138,8 @@ typedef struct PortalData
 	QueryCompletion qc;			/* command completion data for executed query */
 	List	   *stmts;			/* list of PlannedStmts */
 	CachedPlan *cplan;			/* CachedPlan, if stmts are from one */
+	CachedPlanExtra *cplan_extra;	/* CachedPlanExtra for cplan in Portal's
+									 * memory */
 
 	ParamListInfo portalParams; /* params to pass to query */
 	QueryEnvironment *queryEnv; /* environment for query */
@@ -242,6 +244,7 @@ extern void PortalDefineQuery(Portal portal,
 							  CommandTag commandTag,
 							  List *stmts,
 							  CachedPlan *cplan);
+extern void PortalSaveCachedPlanExtra(Portal portal, CachedPlanExtra *extra);
 extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
 extern void PortalCreateHoldStore(Portal portal);
 extern void PortalHashTableDeleteAll(void);
-- 
2.35.3




* Re: generic plans and "initial" pruning
@ 2022-12-16 02:33  Amit Langote <[email protected]>
  parent: Amit Langote <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: Amit Langote @ 2022-12-16 02:33 UTC (permalink / raw)
  To: Alvaro Herrera <[email protected]>; +Cc: Robert Haas <[email protected]>; Jacob Champion <[email protected]>; David Rowley <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers

On Wed, Dec 14, 2022 at 5:35 PM Amit Langote <[email protected]> wrote:
> I have moved the original functionality of GetCachedPlan() to
> GetCachedPlanInternal(), turning the former into a sort of controller
> as described shortly.  The latter's CheckCachedPlan() part now only
> locks the "minimal" set of, non-prunable, relations, making a note of
> whether the plan contains any prunable subnodes and thus prunable
> relations whose locking is deferred to the caller, GetCachedPlan().
> GetCachedPlan(), as a sort of controller as mentioned before, does the
> pruning if needed on the minimally valid plan returned by
> GetCachedPlanInternal(), locks the partitions that survive, and redoes
> the whole thing if the locking of partitions invalidates the plan.

After sleeping on it, I realized this doesn't have to be that
complicated.  Rather than turn GetCachedPlan() into a wrapper for
handling deferred partition locking as outlined above, I could have
changed it more simply as follows to get the same thing done:

    if (!customplan)
    {
-       if (CheckCachedPlan(plansource))
+       bool    hasUnlockedParts = false;
+
+       if (CheckCachedPlan(plansource, &hasUnlockedParts) &&
+           (!hasUnlockedParts ||
+            CachedPlanLockPartitions(plansource, boundParams, owner, extra)))
        {
            /* We want a generic plan, and we already have a valid one */
            plan = plansource->gplan;

Attached updated patch does it like that.
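
To make the two-phase check easier to follow, here is a minimal
standalone sketch of the control flow.  It is not code from the patch:
ToyPlan and the toy_* functions are hypothetical stand-ins for
CachedPlan, CheckCachedPlan() and CachedPlanLockPartitions(), and the
sketch assumes that a plan with no deferred partitions should be reused
directly without a second locking phase.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a cached generic plan; not the real CachedPlan. */
typedef struct ToyPlan
{
	bool		is_valid;				/* false once invalidated */
	bool		has_initial_pruning;	/* locking of some partitions deferred */
	bool		invalidated_by_lock;	/* simulate a concurrent ALTER noticed
										 * while taking the deferred locks */
	int			partitions_locked;		/* number of deferred locks taken */
} ToyPlan;

/* Phase 1: lock the non-prunable relations, report whether any were deferred. */
bool
toy_check_cached_plan(ToyPlan *plan, bool *hasUnlockedParts)
{
	assert(hasUnlockedParts != NULL);
	*hasUnlockedParts = false;
	if (plan == NULL || !plan->is_valid)
		return false;
	*hasUnlockedParts = plan->has_initial_pruning;
	return true;
}

/* Phase 2: "prune", lock the survivors, then re-check validity. */
bool
toy_lock_partitions(ToyPlan *plan, int nsurvivors)
{
	plan->partitions_locked += nsurvivors;
	if (plan->invalidated_by_lock)
		plan->is_valid = false;
	return plan->is_valid;
}

/* Returns true if the cached generic plan can be reused as it is. */
bool
toy_can_reuse_generic_plan(ToyPlan *plan, int nsurvivors)
{
	bool		hasUnlockedParts = false;

	return toy_check_cached_plan(plan, &hasUnlockedParts) &&
		(!hasUnlockedParts || toy_lock_partitions(plan, nsurvivors));
}
```

A false result corresponds to the case where the caller has to build a
new plan.  The sketch leaves out releasing the locks taken before the
invalidation was noticed, which the real code has to do.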

-- 
Thanks, Amit Langote
EDB: http://www.enterprisedb.com


Attachments:

  [application/octet-stream] v30-0002-In-GetCachedPlan-only-lock-unpruned-partitions.patch (66.2K, 2-v30-0002-In-GetCachedPlan-only-lock-unpruned-partitions.patch)
  download | inline diff:
From 4176843628ef29c1ff173ad0dfbdd13f7d07c225 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Wed, 22 Dec 2021 16:55:17 +0900
Subject: [PATCH v30 2/2] In GetCachedPlan(), only lock unpruned partitions

This does two things mainly:

* The planner now removes the RT indexes of "initially prunable"
partitions from PlannedStmt.minLockRelids such that the set only
contains the relations not subject to initial partition pruning.  So,
AcquireExecutorLocks only locks a subset of the relations contained
in a plan, deferring the locking of prunable relations to the caller.

* GetCachedPlan(), if there are prunable relations in the plan,
performs the initial partition pruning using available EXTERN params
and locks the partitions remaining after that, so that the CachedPlan
that's returned is valid in a race-free manner including for any
partitions that will be scanned during execution.

To make the pruning possible before entering ExecutorStart(), this
also adds ExecPartitionDoInitialPruning(), which can be called by
GetCachedPlan() for a given PlannedStmt.

The result of performing initial partition pruning this way is made
available to the actual execution via PartitionPruneResult, of which
there is one for every PartitionPruneInfo contained in the PlannedStmt.
The List of PartitionPruneResult for a given PlannedStmt is returned
to the callers of GetCachedPlan() via its new output parameter of type
CachedPlanExtra, whose members currently only include said List.
---
 src/backend/commands/copyto.c          |   2 +-
 src/backend/commands/createas.c        |   2 +-
 src/backend/commands/explain.c         |   7 +-
 src/backend/commands/extension.c       |   2 +-
 src/backend/commands/matview.c         |   2 +-
 src/backend/commands/prepare.c         |  28 +++-
 src/backend/executor/README            |  31 +++-
 src/backend/executor/execMain.c        |   2 +
 src/backend/executor/execParallel.c    |  25 ++-
 src/backend/executor/execPartition.c   | 215 +++++++++++++++++++++----
 src/backend/executor/execUtils.c       |   1 +
 src/backend/executor/functions.c       |   2 +-
 src/backend/executor/nodeAppend.c      |  11 +-
 src/backend/executor/nodeMergeAppend.c |   5 +-
 src/backend/executor/spi.c             |  31 +++-
 src/backend/optimizer/plan/setrefs.c   |  36 +++++
 src/backend/tcop/postgres.c            |   9 +-
 src/backend/tcop/pquery.c              |  28 +++-
 src/backend/utils/cache/plancache.c    | 204 +++++++++++++++++++++--
 src/backend/utils/mmgr/portalmem.c     |  16 ++
 src/include/commands/explain.h         |   4 +-
 src/include/executor/execPartition.h   |   7 +-
 src/include/executor/execdesc.h        |   3 +
 src/include/nodes/execnodes.h          |   1 +
 src/include/nodes/pathnodes.h          |   4 +-
 src/include/nodes/plannodes.h          |  31 +++-
 src/include/utils/plancache.h          |  11 +-
 src/include/utils/portal.h             |   3 +
 28 files changed, 640 insertions(+), 83 deletions(-)

diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index f26cc0d162..401a2280a3 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -558,7 +558,7 @@ BeginCopyTo(ParseState *pstate,
 		((DR_copy *) dest)->cstate = cstate;
 
 		/* Create a QueryDesc requesting no output */
-		cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+		cstate->queryDesc = CreateQueryDesc(plan, NIL, pstate->p_sourcetext,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
 											dest, NULL, NULL, 0);
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 152c29b551..942449544c 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -325,7 +325,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
 		UpdateActiveSnapshotCommandId();
 
 		/* Create a QueryDesc, redirecting output to our tuple receiver */
-		queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+		queryDesc = CreateQueryDesc(plan, NIL, pstate->p_sourcetext,
 									GetActiveSnapshot(), InvalidSnapshot,
 									dest, params, queryEnv, 0);
 
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index f86983c660..2f2b558608 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -407,7 +407,7 @@ ExplainOneQuery(Query *query, int cursorOptions,
 		}
 
 		/* run it (if needed) and produce output */
-		ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+		ExplainOnePlan(plan, NIL, into, es, queryString, params, queryEnv,
 					   &planduration, (es->buffers ? &bufusage : NULL));
 	}
 }
@@ -515,7 +515,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
  * to call it.
  */
 void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, List *part_prune_results,
+			   IntoClause *into, ExplainState *es,
 			   const char *queryString, ParamListInfo params,
 			   QueryEnvironment *queryEnv, const instr_time *planduration,
 			   const BufferUsage *bufusage)
@@ -563,7 +564,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
 		dest = None_Receiver;
 
 	/* Create a QueryDesc for the query */
-	queryDesc = CreateQueryDesc(plannedstmt, queryString,
+	queryDesc = CreateQueryDesc(plannedstmt, part_prune_results, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, params, queryEnv, instrument_option);
 
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index cf1b1ca571..904cbcba4a 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -779,7 +779,7 @@ execute_sql_string(const char *sql)
 			{
 				QueryDesc  *qdesc;
 
-				qdesc = CreateQueryDesc(stmt,
+				qdesc = CreateQueryDesc(stmt, NIL,
 										sql,
 										GetActiveSnapshot(), NULL,
 										dest, NULL, NULL, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 8ba2436a71..049a90f49d 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -409,7 +409,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
 	UpdateActiveSnapshotCommandId();
 
 	/* Create a QueryDesc, redirecting output to our tuple receiver */
-	queryDesc = CreateQueryDesc(plan, queryString,
+	queryDesc = CreateQueryDesc(plan, NIL, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, NULL, NULL, 0);
 
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 9e29584d93..729384a9a6 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -154,6 +154,7 @@ ExecuteQuery(ParseState *pstate,
 {
 	PreparedStatement *entry;
 	CachedPlan *cplan;
+	CachedPlanExtra *cplan_extra = NULL;
 	List	   *plan_list;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
@@ -193,7 +194,11 @@ ExecuteQuery(ParseState *pstate,
 									   entry->plansource->query_string);
 
 	/* Replan if needed, and increment plan refcount for portal */
-	cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL);
+	cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL,
+						  &cplan_extra);
+	Assert(cplan_extra == NULL ||
+		   (list_length(cplan->stmt_list) ==
+			list_length(cplan_extra->part_prune_results_list)));
 	plan_list = cplan->stmt_list;
 
 	/*
@@ -207,6 +212,9 @@ ExecuteQuery(ParseState *pstate,
 					  plan_list,
 					  cplan);
 
+	if (cplan_extra)
+		PortalSaveCachedPlanExtra(portal, cplan_extra);
+
 	/*
 	 * For CREATE TABLE ... AS EXECUTE, we must verify that the prepared
 	 * statement is one that produces tuples.  Currently we insist that it be
@@ -575,6 +583,7 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	PreparedStatement *entry;
 	const char *query_string;
 	CachedPlan *cplan;
+	CachedPlanExtra *cplan_extra = NULL;
 	List	   *plan_list;
 	ListCell   *p;
 	ParamListInfo paramLI = NULL;
@@ -619,7 +628,11 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 
 	/* Replan if needed, and acquire a transient refcount */
 	cplan = GetCachedPlan(entry->plansource, paramLI,
-						  CurrentResourceOwner, queryEnv);
+						  CurrentResourceOwner, queryEnv,
+						  &cplan_extra);
+	Assert(cplan_extra == NULL ||
+		   (list_length(cplan->stmt_list) ==
+			list_length(cplan_extra->part_prune_results_list)));
 
 	INSTR_TIME_SET_CURRENT(planduration);
 	INSTR_TIME_SUBTRACT(planduration, planstart);
@@ -637,10 +650,17 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	foreach(p, plan_list)
 	{
 		PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
+		List *part_prune_results = NIL;
+
+		if (cplan_extra)
+			part_prune_results = list_nth_node(List,
+											   cplan_extra->part_prune_results_list,
+											   foreach_current_index(p));
 
 		if (pstmt->commandType != CMD_UTILITY)
-			ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv,
-						   &planduration, (es->buffers ? &bufusage : NULL));
+			ExplainOnePlan(pstmt, part_prune_results, into, es, query_string,
+						   paramLI, queryEnv, &planduration,
+						   (es->buffers ? &bufusage : NULL));
 		else
 			ExplainOneUtility(pstmt->utilityStmt, into, es, query_string,
 							  paramLI, queryEnv);
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 17775a49e2..2222b3ed6f 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -63,7 +63,36 @@ if the executor determines that an entire subplan is not required due to
 execution time partition pruning determining that no matching records will be
 found there.  This currently only occurs for Append and MergeAppend nodes.  In
 this case the non-required subplans are ignored and the executor state's
-subnode array will become out of sequence to the plan's subplan list.
+subnode array will become out of sequence to the plan's subplan list.  Note
+that this is referred to as "initial" pruning, because it needs to occur only
+once during the execution startup, and uses a set of pruning steps called
+initial pruning steps (see PartitionedRelPruneInfo.initial_pruning_steps).
+
+Actually, "initial" pruning may occur even before the execution startup in
+some cases, for example, when a cached generic plan is validated for
+execution, which works by locking all the relations that will be scanned by
+that plan during execution.  If the generic plan contains plan nodes that have
+prunable child subnodes, then this validation locking is performed after
+pruning child subnodes that need not be scanned during execution, that is,
+using initial pruning steps.  When such a generic plan is forwarded for
+execution, it must be accompanied by the set of PartitionPruneResult nodes that
+contain the result of that pruning, which basically consists of a bitmapset of
+child subnode indexes that survived the pruning and whose relations would thus
+have been locked for execution.  This is important, because, unlike the
+plan-time pruning and actual executor-startup pruning, this does not actually
+remove the pruned subnodes from the plan tree, but only marks them as being
+pruned.  So, the executor code (core or third party), especially one that runs
+before ExecutorStart() and thus looks at bare Plan trees (not PlanState trees)
+must beware of plan nodes that may have been pruned and are thus subject
+to being invalidated by concurrent schema changes.  For plan nodes that can
+have prunable child subnodes and thus contain a PartitionPruneInfo, such code
+must always check if the corresponding PartitionPruneResult exists
+in EState.es_part_prune_results at the given part_prune_index and use that to
+decide which subplans are valid for execution instead of redoing the pruning.
+Note that this is not just a performance optimization but is also necessary:
+if the pruning steps produced a different result when executed again, the
+executor could end up treating as valid a set of child subnodes different
+from the set whose relations CachedPlanLockPartitions() locked.
 
 Each Plan node may have expression trees associated with it, to represent
 its target list, qualification conditions, etc.  These trees are also
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 2c2b3a8874..229f61f72e 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -798,6 +798,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 {
 	CmdType		operation = queryDesc->operation;
 	PlannedStmt *plannedstmt = queryDesc->plannedstmt;
+	List	   *part_prune_results = queryDesc->part_prune_results;
 	Plan	   *plan = plannedstmt->planTree;
 	List	   *rangeTable = plannedstmt->rtable;
 	EState	   *estate = queryDesc->estate;
@@ -819,6 +820,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 
 	estate->es_plannedstmt = plannedstmt;
 	estate->es_part_prune_infos = plannedstmt->partPruneInfos;
+	estate->es_part_prune_results = part_prune_results;
 
 	/*
 	 * Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index 65c4b63bbd..9745eba0af 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -66,6 +66,7 @@
 #define PARALLEL_KEY_QUERY_TEXT		UINT64CONST(0xE000000000000008)
 #define PARALLEL_KEY_JIT_INSTRUMENTATION UINT64CONST(0xE000000000000009)
 #define PARALLEL_KEY_WAL_USAGE			UINT64CONST(0xE00000000000000A)
+#define PARALLEL_KEY_PARTITION_PRUNE_RESULTS	UINT64CONST(0xE00000000000000B)
 
 #define PARALLEL_TUPLE_QUEUE_SIZE		65536
 
@@ -599,12 +600,15 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	FixedParallelExecutorState *fpes;
 	char	   *pstmt_data;
 	char	   *pstmt_space;
+	char	   *part_prune_results_data;
+	char	   *part_prune_results_space;
 	char	   *paramlistinfo_space;
 	BufferUsage *bufusage_space;
 	WalUsage   *walusage_space;
 	SharedExecutorInstrumentation *instrumentation = NULL;
 	SharedJitInstrumentation *jit_instrumentation = NULL;
 	int			pstmt_len;
+	int			part_prune_results_len;
 	int			paramlistinfo_len;
 	int			instrumentation_len = 0;
 	int			jit_instrumentation_len = 0;
@@ -633,6 +637,7 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 
 	/* Fix up and serialize plan to be sent to workers. */
 	pstmt_data = ExecSerializePlan(planstate->plan, estate);
+	part_prune_results_data = nodeToString(estate->es_part_prune_results);
 
 	/* Create a parallel context. */
 	pcxt = CreateParallelContext("postgres", "ParallelQueryMain", nworkers);
@@ -659,6 +664,11 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	shm_toc_estimate_chunk(&pcxt->estimator, pstmt_len);
 	shm_toc_estimate_keys(&pcxt->estimator, 1);
 
+	/* Estimate space for serialized List of PartitionPruneResult. */
+	part_prune_results_len = strlen(part_prune_results_data) + 1;
+	shm_toc_estimate_chunk(&pcxt->estimator, part_prune_results_len);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
 	/* Estimate space for serialized ParamListInfo. */
 	paramlistinfo_len = EstimateParamListSpace(estate->es_param_list_info);
 	shm_toc_estimate_chunk(&pcxt->estimator, paramlistinfo_len);
@@ -753,6 +763,12 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	memcpy(pstmt_space, pstmt_data, pstmt_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PLANNEDSTMT, pstmt_space);
 
+	/* Store serialized List of PartitionPruneResult */
+	part_prune_results_space = shm_toc_allocate(pcxt->toc, part_prune_results_len);
+	memcpy(part_prune_results_space, part_prune_results_data, part_prune_results_len);
+	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARTITION_PRUNE_RESULTS,
+				   part_prune_results_space);
+
 	/* Store serialized ParamListInfo. */
 	paramlistinfo_space = shm_toc_allocate(pcxt->toc, paramlistinfo_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARAMLISTINFO, paramlistinfo_space);
@@ -1234,8 +1250,10 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 						 int instrument_options)
 {
 	char	   *pstmtspace;
+	char	   *part_prune_results_space;
 	char	   *paramspace;
 	PlannedStmt *pstmt;
+	List	   *part_prune_results;
 	ParamListInfo paramLI;
 	char	   *queryString;
 
@@ -1246,12 +1264,17 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 	pstmtspace = shm_toc_lookup(toc, PARALLEL_KEY_PLANNEDSTMT, false);
 	pstmt = (PlannedStmt *) stringToNode(pstmtspace);
 
+	/* Reconstruct leader-supplied PartitionPruneResult. */
+	part_prune_results_space =
+		shm_toc_lookup(toc, PARALLEL_KEY_PARTITION_PRUNE_RESULTS, false);
+	part_prune_results = (List *) stringToNode(part_prune_results_space);
+
 	/* Reconstruct ParamListInfo. */
 	paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
 	paramLI = RestoreParamList(&paramspace);
 
 	/* Create a QueryDesc for the query. */
-	return CreateQueryDesc(pstmt,
+	return CreateQueryDesc(pstmt, part_prune_results,
 						   queryString,
 						   GetActiveSnapshot(), InvalidSnapshot,
 						   receiver, paramLI, NULL, instrument_options);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 5b62157712..dcd2bb0f90 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -25,6 +25,7 @@
 #include "mb/pg_wchar.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
+#include "parser/parsetree.h"
 #include "partitioning/partbounds.h"
 #include "partitioning/partdesc.h"
 #include "partitioning/partprune.h"
@@ -185,7 +186,11 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
 static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
 static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
 static PartitionPruneState *CreatePartitionPruneState(PlanState *planstate,
-													  PartitionPruneInfo *pruneinfo);
+													  PartitionPruneInfo *pruneinfo,
+													  bool consider_initial_steps,
+													  bool consider_exec_steps,
+													  List *rtable, ExprContext *econtext,
+													  PartitionDirectory partdir);
 static void InitPartitionPruneContext(PartitionPruneContext *context,
 									  List *pruning_steps,
 									  PartitionDesc partdesc,
@@ -198,7 +203,8 @@ static void PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
 static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
 										   PartitionedRelPruningData *pprune,
 										   bool initial_prune,
-										   Bitmapset **validsubplans);
+										   Bitmapset **validsubplans,
+										   Bitmapset **scan_leafpart_rtis);
 
 
 /*
@@ -1742,7 +1748,8 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  * considered to be a stable expression, it can change value from one plan
  * node scan to the next during query execution.  Stable comparison
  * expressions that don't involve such Params allow partition pruning to be
- * done once during executor startup.  Expressions that do involve such Params
+ * done once during executor startup or even before that, such as when called
+ * from CachedPlanLockPartitions().  Expressions that do involve such Params
  * require us to prune separately for each scan of the parent plan node.
  *
  * Note that pruning away unneeded subplans during executor startup has the
@@ -1760,6 +1767,12 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *		account for initial pruning possibly having eliminated some of the
  *		subplans.
  *
+ * ExecPartitionDoInitialPruning:
+ *		Do initial pruning with the information contained in a given
+ *		PartitionPruneInfo to determine the set of the parent plan node's
+ *		child subnodes that are valid for execution and also the set of the RT
+ *		indexes of leaf partitions scanned by those subnodes.
+ *
  * ExecFindMatchingSubPlans:
  *		Returns indexes of matching subplans after evaluating the expressions
  *		that are safe to evaluate at a given point.  This function is first
@@ -1780,8 +1793,10 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *
  * On return, *initially_valid_subplans is assigned the set of indexes of
  * child subplans that must be initialized along with the parent plan node.
- * Initial pruning is performed here if needed and in that case only the
- * surviving subplans' indexes are added.
+ * That set is computed by either performing the "initial pruning" here or
+ * reusing the one present in EState.es_part_prune_results[part_prune_index]
+ * if it has been set, which it would be if CachedPlanLockPartitions() has
+ * already done the initial pruning.
  *
  * If subplans are indeed pruned, subplan_map arrays contained in the returned
  * PartitionPruneState are re-sequenced to not count those, though only if the
@@ -1794,9 +1809,10 @@ ExecInitPartitionPruning(PlanState *planstate,
 						 Bitmapset *root_parent_relids,
 						 Bitmapset **initially_valid_subplans)
 {
-	PartitionPruneState *prunestate;
+	PartitionPruneState *prunestate = NULL;
 	EState	   *estate = planstate->state;
 	PartitionPruneInfo *pruneinfo;
+	PartitionPruneResult *pruneresult = NULL;
 
 	/* Obtain the pruneinfo we need, and make sure it's the right one */
 	pruneinfo = list_nth(estate->es_part_prune_infos, part_prune_index);
@@ -1812,20 +1828,62 @@ ExecInitPartitionPruning(PlanState *planstate,
 	/* We may need an expression context to evaluate partition exprs */
 	ExecAssignExprContext(estate, planstate);
 
-	/* Create the working data structure for pruning */
-	prunestate = CreatePartitionPruneState(planstate, pruneinfo);
+	/* Initial pruning already done if es_part_prune_results has been set. */
+	if (estate->es_part_prune_results)
+	{
+		pruneresult = list_nth_node(PartitionPruneResult,
+									estate->es_part_prune_results,
+									part_prune_index);
+		if (!bms_equal(pruneresult->root_parent_relids, pruneinfo->root_parent_relids))
+			ereport(ERROR,
+					errcode(ERRCODE_INTERNAL_ERROR),
+					errmsg_internal("mismatching PartitionPruneInfo and PartitionPruneResult at part_prune_index %d",
+									part_prune_index),
+					errdetail_internal("pruneresult relids %s, pruneinfo relids %s",
+									   bmsToString(pruneresult->root_parent_relids),
+									   bmsToString(pruneinfo->root_parent_relids)));
+	}
+
+	if (pruneresult == NULL || pruneinfo->needs_exec_pruning)
+	{
+		/* We may need an expression context to evaluate partition exprs */
+		ExecAssignExprContext(estate, planstate);
+
+		/* For data reading, executor always omits detached partitions */
+		if (estate->es_partition_directory == NULL)
+			estate->es_partition_directory =
+				CreatePartitionDirectory(estate->es_query_cxt, false);
+
+		/*
+		 * Create the working data structure for pruning.  No need to consider
+		 * initial pruning steps if we have a PartitionPruneResult.
+		 */
+		prunestate = CreatePartitionPruneState(planstate, pruneinfo,
+											   pruneresult == NULL,
+											   pruneinfo->needs_exec_pruning,
+											   NIL, planstate->ps_ExprContext,
+											   estate->es_partition_directory);
+	}
 
 	/*
 	 * Perform an initial partition prune pass, if required.
 	 */
-	if (prunestate->do_initial_prune)
-		*initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true);
+	if (pruneresult)
+	{
+		*initially_valid_subplans = bms_copy(pruneresult->valid_subplan_offs);
+	}
+	else if (prunestate && prunestate->do_initial_prune)
+	{
+		*initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true,
+															 NULL);
+	}
 	else
 	{
-		/* No pruning, so we'll need to initialize all subplans */
+		/* No initial pruning, so we'll need to initialize all subplans */
 		Assert(n_total_subplans > 0);
 		*initially_valid_subplans = bms_add_range(NULL, 0,
 												  n_total_subplans - 1);
+		return prunestate;
 	}
 
 	/*
@@ -1833,7 +1891,8 @@ ExecInitPartitionPruning(PlanState *planstate,
 	 * that were removed above due to initial pruning.  No need to do this if
 	 * no steps were removed.
 	 */
-	if (bms_num_members(*initially_valid_subplans) < n_total_subplans)
+	if (prunestate &&
+		bms_num_members(*initially_valid_subplans) < n_total_subplans)
 	{
 		/*
 		 * We can safely skip this when !do_exec_prune, even though that
@@ -1849,11 +1908,58 @@ ExecInitPartitionPruning(PlanState *planstate,
 	return prunestate;
 }
 
+/*
+ * ExecPartitionDoInitialPruning
+ *		Perform initial pruning using given PartitionPruneInfo to determine
+ *		the set of the parent plan node's child subnodes that are valid for
+ *		execution
+ *
+ * On return, *scan_leafpart_rtis will contain the RT indexes of leaf
+ * partitions scanned by those valid subnodes.
+ *
+ * Note that this does not share state with the actual execution, so it must
+ * make do with the information present in the PlannedStmt.  For example,
+ * there isn't a PlanState for the parent plan node yet, so we must create a
+ * standalone ExprContext to evaluate pruning expressions, equipped with the
+ * information about the EXTERN parameters that we do have.  That's okay
+ * because the initial pruning steps do not contain anything that would
+ * require the execution to have started.  Likewise, we create our own
+ * PartitionDirectory to look up the PartitionDescs to use.
+ */
+Bitmapset *
+ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt, ParamListInfo params,
+							  PartitionPruneInfo *pruneinfo,
+							  Bitmapset **scan_leafpart_rtis)
+{
+	List		 *rtable = plannedstmt->rtable;
+	ExprContext	 *econtext;
+	PartitionDirectory pdir;
+	PartitionPruneState *prunestate;
+	Bitmapset	 *valid_subplan_offs;
+
+	/* Don't omit detached partitions, just like during execution proper. */
+	pdir = CreatePartitionDirectory(CurrentMemoryContext, false);
+	econtext = CreateStandaloneExprContext();
+	econtext->ecxt_param_list_info = params;
+	prunestate = CreatePartitionPruneState(NULL, pruneinfo, true, false,
+										   rtable, econtext, pdir);
+	valid_subplan_offs = ExecFindMatchingSubPlans(prunestate, true,
+												  scan_leafpart_rtis);
+
+	FreeExprContext(econtext, true);
+	DestroyPartitionDirectory(pdir);
+
+	return valid_subplan_offs;
+}
+
 /*
  * CreatePartitionPruneState
  *		Build the data structure required for calling ExecFindMatchingSubPlans
  *
- * 'planstate' is the parent plan node's execution state.
+ * 'planstate', if not NULL, is the parent plan node's execution state.  It
+ * can be NULL if being called before ExecutorStart(), in which case,
+ * 'rtable' (range table), 'econtext', and 'partdir' must be explicitly
+ * provided.
  *
  * 'pruneinfo' is a PartitionPruneInfo as generated by
  * make_partition_pruneinfo.  Here we build a PartitionPruneState containing a
@@ -1867,19 +1973,21 @@ ExecInitPartitionPruning(PlanState *planstate,
  * PartitionedRelPruneInfo.
  */
 static PartitionPruneState *
-CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
+CreatePartitionPruneState(PlanState *planstate,
+						  PartitionPruneInfo *pruneinfo,
+						  bool consider_initial_steps,
+						  bool consider_exec_steps,
+						  List *rtable, ExprContext *econtext,
+						  PartitionDirectory partdir)
 {
-	EState	   *estate = planstate->state;
+	EState	   *estate = planstate ? planstate->state : NULL;
 	PartitionPruneState *prunestate;
 	int			n_part_hierarchies;
 	ListCell   *lc;
 	int			i;
-	ExprContext *econtext = planstate->ps_ExprContext;
 
-	/* For data reading, executor always omits detached partitions */
-	if (estate->es_partition_directory == NULL)
-		estate->es_partition_directory =
-			CreatePartitionDirectory(estate->es_query_cxt, false);
+	Assert((estate != NULL) ||
+			(partdir != NULL && econtext != NULL && rtable != NIL));
 
 	n_part_hierarchies = list_length(pruneinfo->prune_infos);
 	Assert(n_part_hierarchies > 0);
@@ -1934,15 +2042,39 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 			PartitionKey partkey;
 
 			/*
-			 * We can rely on the copies of the partitioned table's partition
-			 * key and partition descriptor appearing in its relcache entry,
-			 * because that entry will be held open and locked for the
-			 * duration of this executor run.
+			 * Must open the relation by ourselves when called before the
+			 * execution has started, such as when called from
+			 * CachedPlanLockPartitions().  In that case, sub-partitions must
+			 * be locked, because AcquirePlannerLocks() would have locked only
+			 * the root parent.
+			 */
+			if (estate == NULL)
+			{
+				RangeTblEntry *rte = rt_fetch(pinfo->rtindex, rtable);
+				int		lockmode = (j == 0) ? NoLock : rte->rellockmode;
+
+				partrel = table_open(rte->relid, lockmode);
+			}
+			else
+				partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
+
+			/*
+			 * We can rely on the copy of the partitioned table's partition
+			 * key in its relcache entry, because it can't change (or
+			 * get destroyed) as long as the relation is locked.  Partition
+			 * descriptor is taken from the PartitionDirectory associated with
+			 * the table that is held open long enough for the descriptor to
+			 * remain valid while it's used to perform the pruning steps.
 			 */
-			partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
 			partkey = RelationGetPartitionKey(partrel);
-			partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
-												partrel);
+			partdesc = PartitionDirectoryLookup(partdir, partrel);
+
+			/*
+			 * Must close partrel, keeping the lock taken, if we're not using
+			 * EState's entry.
+			 */
+			if (estate == NULL)
+				table_close(partrel, NoLock);
 
 			/*
 			 * Initialize the subplan_map and subpart_map.
@@ -2050,7 +2182,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 			 * Initialize pruning contexts as needed.
 			 */
 			pprune->initial_pruning_steps = pinfo->initial_pruning_steps;
-			if (pinfo->initial_pruning_steps)
+			if (consider_initial_steps && pinfo->initial_pruning_steps)
 			{
 				InitPartitionPruneContext(&pprune->initial_context,
 										  pinfo->initial_pruning_steps,
@@ -2060,7 +2192,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 				prunestate->do_initial_prune = true;
 			}
 			pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
-			if (pinfo->exec_pruning_steps)
+			if (consider_exec_steps && pinfo->exec_pruning_steps)
 			{
 				InitPartitionPruneContext(&pprune->exec_context,
 										  pinfo->exec_pruning_steps,
@@ -2288,10 +2420,14 @@ PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
  * Pass initial_prune if PARAM_EXEC Params cannot yet be evaluated.  This
  * differentiates the initial executor-time pruning step from later
  * runtime pruning.
+ *
+ * RT indexes of leaf partitions scanned by the chosen subplans are added to
+ * *scan_leafpart_rtis if the pointer is non-NULL.
  */
 Bitmapset *
 ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
-						 bool initial_prune)
+						 bool initial_prune,
+						 Bitmapset **scan_leafpart_rtis)
 {
 	Bitmapset  *result = NULL;
 	MemoryContext oldcontext;
@@ -2326,7 +2462,7 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
 		 */
 		pprune = &prunedata->partrelprunedata[0];
 		find_matching_subplans_recurse(prunedata, pprune, initial_prune,
-									   &result);
+									   &result, scan_leafpart_rtis);
 
 		/* Expression eval may have used space in ExprContext too */
 		if (pprune->exec_pruning_steps)
@@ -2340,6 +2476,8 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
 
 	/* Copy result out of the temp context before we reset it */
 	result = bms_copy(result);
+	if (scan_leafpart_rtis)
+		*scan_leafpart_rtis = bms_copy(*scan_leafpart_rtis);
 
 	MemoryContextReset(prunestate->prune_context);
 
@@ -2350,13 +2488,15 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
  * find_matching_subplans_recurse
  *		Recursive worker function for ExecFindMatchingSubPlans
  *
- * Adds valid (non-prunable) subplan IDs to *validsubplans
+ * Adds valid (non-prunable) subplan IDs to *validsubplans and RT indexes of
+ * the corresponding leaf partitions to *scan_leafpart_rtis (if asked for).
  */
 static void
 find_matching_subplans_recurse(PartitionPruningData *prunedata,
 							   PartitionedRelPruningData *pprune,
 							   bool initial_prune,
-							   Bitmapset **validsubplans)
+							   Bitmapset **validsubplans,
+							   Bitmapset **scan_leafpart_rtis)
 {
 	Bitmapset  *partset;
 	int			i;
@@ -2383,8 +2523,14 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
 	while ((i = bms_next_member(partset, i)) >= 0)
 	{
 		if (pprune->subplan_map[i] >= 0)
+		{
 			*validsubplans = bms_add_member(*validsubplans,
 											pprune->subplan_map[i]);
+			Assert(pprune->rti_map[i] > 0);
+			if (scan_leafpart_rtis)
+				*scan_leafpart_rtis = bms_add_member(*scan_leafpart_rtis,
+													 pprune->rti_map[i]);
+		}
 		else
 		{
 			int			partidx = pprune->subpart_map[i];
@@ -2392,7 +2538,8 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
 			if (partidx >= 0)
 				find_matching_subplans_recurse(prunedata,
 											   &prunedata->partrelprunedata[partidx],
-											   initial_prune, validsubplans);
+											   initial_prune, validsubplans,
+											   scan_leafpart_rtis);
 			else
 			{
 				/*
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 87f4d53ca7..7d36c972d3 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -139,6 +139,7 @@ CreateExecutorState(void)
 	estate->es_param_exec_vals = NULL;
 
 	estate->es_queryEnv = NULL;
+	estate->es_part_prune_results = NIL;
 
 	estate->es_query_cxt = qcontext;
 
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index dc13625171..bffb42ce71 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -842,7 +842,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
 	else
 		dest = None_Receiver;
 
-	es->qd = CreateQueryDesc(es->stmt,
+	es->qd = CreateQueryDesc(es->stmt, NIL,
 							 fcache->src,
 							 GetActiveSnapshot(),
 							 InvalidSnapshot,
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 99830198bd..3b917584de 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -156,7 +156,8 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 		 * subplan, we can fill as_valid_subplans immediately, preventing
 		 * later calls to ExecFindMatchingSubPlans.
 		 */
-		if (!prunestate->do_exec_prune && nplans > 0)
+		if (appendstate->as_prune_state == NULL ||
+			(!appendstate->as_prune_state->do_exec_prune && nplans > 0))
 			appendstate->as_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
 	}
 	else
@@ -578,7 +579,7 @@ choose_next_subplan_locally(AppendState *node)
 		}
 		else if (node->as_valid_subplans == NULL)
 			node->as_valid_subplans =
-				ExecFindMatchingSubPlans(node->as_prune_state, false);
+				ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 
 		whichplan = -1;
 	}
@@ -643,7 +644,7 @@ choose_next_subplan_for_leader(AppendState *node)
 		if (node->as_valid_subplans == NULL)
 		{
 			node->as_valid_subplans =
-				ExecFindMatchingSubPlans(node->as_prune_state, false);
+				ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 
 			/*
 			 * Mark each invalid plan as finished to allow the loop below to
@@ -718,7 +719,7 @@ choose_next_subplan_for_worker(AppendState *node)
 	else if (node->as_valid_subplans == NULL)
 	{
 		node->as_valid_subplans =
-			ExecFindMatchingSubPlans(node->as_prune_state, false);
+			ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 		mark_invalid_subplans_as_finished(node);
 	}
 
@@ -869,7 +870,7 @@ ExecAppendAsyncBegin(AppendState *node)
 	if (node->as_valid_subplans == NULL)
 	{
 		node->as_valid_subplans =
-			ExecFindMatchingSubPlans(node->as_prune_state, false);
+			ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
 
 		classify_matching_subplans(node);
 	}
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index f370f9f287..ccfa083945 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -104,7 +104,8 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
 		 * subplan, we can fill ms_valid_subplans immediately, preventing
 		 * later calls to ExecFindMatchingSubPlans.
 		 */
-		if (!prunestate->do_exec_prune && nplans > 0)
+		if (mergestate->ms_prune_state == NULL ||
+			(!mergestate->ms_prune_state->do_exec_prune && nplans > 0))
 			mergestate->ms_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
 	}
 	else
@@ -219,7 +220,7 @@ ExecMergeAppend(PlanState *pstate)
 		 */
 		if (node->ms_valid_subplans == NULL)
 			node->ms_valid_subplans =
-				ExecFindMatchingSubPlans(node->ms_prune_state, false);
+				ExecFindMatchingSubPlans(node->ms_prune_state, false, NULL);
 
 		/*
 		 * First time through: pull the first tuple from each valid subplan,
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index fd5796f1b9..2ecb9193aa 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1577,6 +1577,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 {
 	CachedPlanSource *plansource;
 	CachedPlan *cplan;
+	CachedPlanExtra *cplan_extra = NULL;
 	List	   *stmt_list;
 	char	   *query_string;
 	Snapshot	snapshot;
@@ -1657,7 +1658,11 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 	 */
 
 	/* Replan if needed, and increment plan refcount for portal */
-	cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv);
+	cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv,
+						  &cplan_extra);
+	Assert(cplan_extra == NULL ||
+		   (list_length(cplan->stmt_list) ==
+			list_length(cplan_extra->part_prune_results_list)));
 	stmt_list = cplan->stmt_list;
 
 	if (!plan->saved)
@@ -1685,6 +1690,9 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 					  stmt_list,
 					  cplan);
 
+	if (cplan_extra)
+		PortalSaveCachedPlanExtra(portal, cplan_extra);
+
 	/*
 	 * Set up options for portal.  Default SCROLL type is chosen the same way
 	 * as PerformCursorOpen does it.
@@ -2067,6 +2075,7 @@ SPI_plan_get_cached_plan(SPIPlanPtr plan)
 {
 	CachedPlanSource *plansource;
 	CachedPlan *cplan;
+	CachedPlanExtra *cplan_extra = NULL;
 	SPICallbackArg spicallbackarg;
 	ErrorContextCallback spierrcontext;
 
@@ -2092,8 +2101,12 @@ SPI_plan_get_cached_plan(SPIPlanPtr plan)
 	/* Get the generic plan for the query */
 	cplan = GetCachedPlan(plansource, NULL,
 						  plan->saved ? CurrentResourceOwner : NULL,
-						  _SPI_current->queryEnv);
+						  _SPI_current->queryEnv,
+						  &cplan_extra);
 	Assert(cplan == plansource->gplan);
+	Assert(cplan_extra == NULL ||
+		   (list_length(cplan->stmt_list) ==
+			list_length(cplan_extra->part_prune_results_list)));
 
 	/* Pop the error context stack */
 	error_context_stack = spierrcontext.previous;
@@ -2399,6 +2412,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 	SPICallbackArg spicallbackarg;
 	ErrorContextCallback spierrcontext;
 	CachedPlan *cplan = NULL;
+	CachedPlanExtra *cplan_extra = NULL;
 	ListCell   *lc1;
 
 	/*
@@ -2549,8 +2563,12 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 		 * plan, the refcount must be backed by the plan_owner.
 		 */
 		cplan = GetCachedPlan(plansource, options->params,
-							  plan_owner, _SPI_current->queryEnv);
+							  plan_owner, _SPI_current->queryEnv,
+							  &cplan_extra);
 
+		Assert(cplan_extra == NULL ||
+			   (list_length(cplan->stmt_list) ==
+				list_length(cplan_extra->part_prune_results_list)));
 		stmt_list = cplan->stmt_list;
 
 		/*
@@ -2592,9 +2610,14 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 		foreach(lc2, stmt_list)
 		{
 			PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+			List	   *part_prune_results = NIL;
 			bool		canSetTag = stmt->canSetTag;
 			DestReceiver *dest;
 
+			if (cplan_extra)
+				part_prune_results = list_nth_node(List,
+												   cplan_extra->part_prune_results_list,
+												   foreach_current_index(lc2));
 			/*
 			 * Reset output state.  (Note that if a non-SPI receiver is used,
 			 * _SPI_current->processed will stay zero, and that's what we'll
@@ -2663,7 +2686,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 				else
 					snap = InvalidSnapshot;
 
-				qdesc = CreateQueryDesc(stmt,
+				qdesc = CreateQueryDesc(stmt, part_prune_results,
 										plansource->query_string,
 										snap, crosscheck_snapshot,
 										dest,
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index ed43d5936d..db27cae297 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -372,6 +372,7 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 	{
 		PartitionPruneInfo *pruneinfo = lfirst(lc);
 		ListCell   *l;
+		Bitmapset  *leafpart_rtis = NULL;
 
 		pruneinfo->root_parent_relids =
 			offset_relid_set(pruneinfo->root_parent_relids, rtoffset);
@@ -383,17 +384,52 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 			foreach(l2, prune_infos)
 			{
 				PartitionedRelPruneInfo *pinfo = lfirst(l2);
+				int		i;
 
 				/* RT index of the table to which the pinfo belongs. */
 				pinfo->rtindex += rtoffset;
+
+				/* Likewise for leaf partitions that might be scanned. */
+				for (i = 0; i < pinfo->nparts; i++)
+				{
+					if (pinfo->rti_map[i] > 0 && pinfo->subplan_map[i] >= 0)
+					{
+						pinfo->rti_map[i] += rtoffset;
+						leafpart_rtis = bms_add_member(leafpart_rtis,
+													   pinfo->rti_map[i]);
+					}
+				}
 			}
 
 		}
 
+		if (pruneinfo->needs_init_pruning)
+		{
+			glob->containsInitialPruning = true;
+
+			/*
+			 * Delete the leaf partition RTIs from the set of relations to be
+			 * locked by AcquireExecutorLocks().  The actual set of leaf
+			 * partitions to be locked is computed by
+			 * CachedPlanLockPartitions().
+			 */
+			glob->minLockRelids = bms_del_members(glob->minLockRelids,
+												  leafpart_rtis);
+		}
+
 		glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
 		glob->containsInitialPruning |= pruneinfo->needs_init_pruning;
 	}
 
+	/*
+	 * It seems worth doing a bms_copy() on glob->minLockRelids if we deleted
+	 * bits from it above to get rid of any empty tail bits.  It seems better
+	 * for the loop over this set in AcquireExecutorLocks() to not have to go
+	 * through those useless bit words.
+	 */
+	if (glob->containsInitialPruning)
+		glob->minLockRelids = bms_copy(glob->minLockRelids);
+
 	return result;
 }
 
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 01d264b5ab..e11e07658d 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1598,6 +1598,7 @@ exec_bind_message(StringInfo input_message)
 	int16	   *rformats = NULL;
 	CachedPlanSource *psrc;
 	CachedPlan *cplan;
+	CachedPlanExtra *cplan_extra = NULL;
 	Portal		portal;
 	char	   *query_string;
 	char	   *saved_stmt_name;
@@ -1972,7 +1973,10 @@ exec_bind_message(StringInfo input_message)
 	 * will be generated in MessageContext.  The plan refcount will be
 	 * assigned to the Portal, so it will be released at portal destruction.
 	 */
-	cplan = GetCachedPlan(psrc, params, NULL, NULL);
+	cplan = GetCachedPlan(psrc, params, NULL, NULL, &cplan_extra);
+	Assert(cplan_extra == NULL ||
+		   (list_length(cplan->stmt_list) ==
+			list_length(cplan_extra->part_prune_results_list)));
 
 	/*
 	 * Now we can define the portal.
@@ -1987,6 +1991,9 @@ exec_bind_message(StringInfo input_message)
 					  cplan->stmt_list,
 					  cplan);
 
+	if (cplan_extra)
+		PortalSaveCachedPlanExtra(portal, cplan_extra);
+
 	/* Done with the snapshot used for parameter I/O and parsing/planning */
 	if (snapshot_set)
 		PopActiveSnapshot();
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 52e2db6452..32e6b7b767 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -35,7 +35,7 @@
 Portal		ActivePortal = NULL;
 
 
-static void ProcessQuery(PlannedStmt *plan,
+static void ProcessQuery(PlannedStmt *plan, List *part_prune_results,
 						 const char *sourceText,
 						 ParamListInfo params,
 						 QueryEnvironment *queryEnv,
@@ -65,6 +65,7 @@ static void DoPortalRewind(Portal portal);
  */
 QueryDesc *
 CreateQueryDesc(PlannedStmt *plannedstmt,
+				List *part_prune_results,
 				const char *sourceText,
 				Snapshot snapshot,
 				Snapshot crosscheck_snapshot,
@@ -77,6 +78,7 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
 
 	qd->operation = plannedstmt->commandType;	/* operation */
 	qd->plannedstmt = plannedstmt;	/* plan */
+	qd->part_prune_results = part_prune_results;
 	qd->sourceText = sourceText;	/* query text */
 	qd->snapshot = RegisterSnapshot(snapshot);	/* snapshot */
 	/* RI check snapshot */
@@ -122,6 +124,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  *		PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
  *
  *	plan: the plan tree for the query
+ *	part_prune_results: pruning results returned by CachedPlanLockPartitions()
  *	sourceText: the source text of the query
  *	params: any parameters needed
  *	dest: where to send results
@@ -134,6 +137,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  */
 static void
 ProcessQuery(PlannedStmt *plan,
+			 List *part_prune_results,
 			 const char *sourceText,
 			 ParamListInfo params,
 			 QueryEnvironment *queryEnv,
@@ -145,7 +149,7 @@ ProcessQuery(PlannedStmt *plan,
 	/*
 	 * Create the QueryDesc object
 	 */
-	queryDesc = CreateQueryDesc(plan, sourceText,
+	queryDesc = CreateQueryDesc(plan, part_prune_results, sourceText,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, params, queryEnv, 0);
 
@@ -491,8 +495,13 @@ PortalStart(Portal portal, ParamListInfo params,
 				/*
 				 * Create QueryDesc in portal's context; for the moment, set
 				 * the destination to DestNone.
+				 *
+				 * There is no PartitionPruneResult unless the PlannedStmt is
+				 * from a CachedPlan.
 				 */
 				queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+											portal->cplan_extra == NULL ? NIL :
+											linitial(portal->cplan_extra->part_prune_results_list),
 											portal->sourceText,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
@@ -1225,6 +1234,8 @@ PortalRunMulti(Portal portal,
 
 		if (pstmt->utilityStmt == NULL)
 		{
+			List *part_prune_results = NIL;
+
 			/*
 			 * process a plannable query.
 			 */
@@ -1271,10 +1282,19 @@ PortalRunMulti(Portal portal,
 			else
 				UpdateActiveSnapshotCommandId();
 
+			/*
+			 * Determine if there's a corresponding List of PartitionPruneResult
+			 * for this PlannedStmt.
+			 */
+			if (portal->cplan_extra)
+				part_prune_results = list_nth_node(List,
+												   portal->cplan_extra->part_prune_results_list,
+												   foreach_current_index(stmtlist_item));
+
 			if (pstmt->canSetTag)
 			{
 				/* statement can set tag string */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, part_prune_results,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
@@ -1283,7 +1303,7 @@ PortalRunMulti(Portal portal,
 			else
 			{
 				/* stmt added by rewrite cannot set tag */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, part_prune_results,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 339bb603f7..16b9869fae 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -59,6 +59,7 @@
 #include "access/transam.h"
 #include "catalog/namespace.h"
 #include "executor/executor.h"
+#include "executor/execPartition.h"
 #include "miscadmin.h"
 #include "nodes/nodeFuncs.h"
 #include "optimizer/optimizer.h"
@@ -99,14 +100,18 @@ static dlist_head cached_expression_list = DLIST_STATIC_INIT(cached_expression_l
 static void ReleaseGenericPlan(CachedPlanSource *plansource);
 static List *RevalidateCachedQuery(CachedPlanSource *plansource,
 								   QueryEnvironment *queryEnv);
-static bool CheckCachedPlan(CachedPlanSource *plansource);
+static bool CheckCachedPlan(CachedPlanSource *plansource, bool *hasUnlockedParts);
 static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 								   ParamListInfo boundParams, QueryEnvironment *queryEnv);
 static bool choose_custom_plan(CachedPlanSource *plansource,
 							   ParamListInfo boundParams);
 static double cached_plan_cost(CachedPlan *plan, bool include_planner);
 static Query *QueryListGetPrimaryStmt(List *stmts);
-static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static bool AcquireExecutorLocks(List *stmt_list, bool acquire);
+static bool CachedPlanLockPartitions(CachedPlanSource *plansource,
+						 ParamListInfo boundParams,
+						 ResourceOwner owner,
+						 CachedPlanExtra **extra);
 static void AcquirePlannerLocks(List *stmt_list, bool acquire);
 static void ScanQueryForLocks(Query *parsetree, bool acquire);
 static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -783,16 +788,23 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
 }
 
 /*
- * CheckCachedPlan: see if the CachedPlanSource's generic plan is valid.
+ * CheckCachedPlan: see if the CachedPlanSource's generic plan is valid and
+ * set *hasUnlockedParts if any PlannedStmt contains "initially" prunable
+ * subnodes; partitions are not locked till initial pruning is done.
  *
  * Caller must have already called RevalidateCachedQuery to verify that the
  * querytree is up to date.
  *
- * On a "true" return, we have acquired the locks needed to run the plan.
+ * On a "true" return, we have acquired the minimal set of locks needed to run
+ * the plan, that is, excluding partitions that are subject to being pruned
+ * before execution.  The caller must perform the pruning and lock the
+ * partitions that remain before actually telling the world that the plan is
+ * "valid".
+ *
  * (We must do this for the "true" result to be race-condition-free.)
  */
 static bool
-CheckCachedPlan(CachedPlanSource *plansource)
+CheckCachedPlan(CachedPlanSource *plansource, bool *hasUnlockedParts)
 {
 	CachedPlan *plan = plansource->gplan;
 
@@ -826,7 +838,7 @@ CheckCachedPlan(CachedPlanSource *plansource)
 		 */
 		Assert(plan->refcount > 0);
 
-		AcquireExecutorLocks(plan->stmt_list, true);
+		*hasUnlockedParts = AcquireExecutorLocks(plan->stmt_list, true);
 
 		/*
 		 * If plan was transient, check to see if TransactionXmin has
@@ -848,7 +860,7 @@ CheckCachedPlan(CachedPlanSource *plansource)
 		}
 
 		/* Oops, the race case happened.  Release useless locks. */
-		AcquireExecutorLocks(plan->stmt_list, false);
+		(void) AcquireExecutorLocks(plan->stmt_list, false);
 	}
 
 	/*
@@ -1120,14 +1132,17 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
 }
 
 /*
- * GetCachedPlan: get a cached plan from a CachedPlanSource.
+ * GetCachedPlan: get a cached plan from a CachedPlanSource
  *
  * This function hides the logic that decides whether to use a generic
  * plan or a custom plan for the given parameters: the caller does not know
  * which it will get.
  *
  * On return, the plan is valid and we have sufficient locks to begin
- * execution.
+ * execution.  If the plan is a generic plan containing prunable partitions,
+ * the locks on partitions are taken after the pruning and the result of that
+ * pruning is saved in *extra->part_prune_results_list for the caller to pass
+ * to the executor, along with plan->stmt_list.
  *
  * On return, the refcount of the plan has been incremented; a later
  * ReleaseCachedPlan() call is expected.  If "owner" is not NULL then
@@ -1139,12 +1154,16 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
  */
 CachedPlan *
 GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
-			  ResourceOwner owner, QueryEnvironment *queryEnv)
+			  ResourceOwner owner, QueryEnvironment *queryEnv,
+			  CachedPlanExtra **extra)
 {
 	CachedPlan *plan = NULL;
 	List	   *qlist;
 	bool		customplan;
 
+	Assert(extra != NULL);
+	*extra = NULL;
+
 	/* Assert caller is doing things in a sane order */
 	Assert(plansource->magic == CACHEDPLANSOURCE_MAGIC);
 	Assert(plansource->is_complete);
@@ -1160,7 +1179,11 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 
 	if (!customplan)
 	{
-		if (CheckCachedPlan(plansource))
+		bool	hasUnlockedParts = false;
+
+		if (CheckCachedPlan(plansource, &hasUnlockedParts) &&
+			(!hasUnlockedParts ||
+			 CachedPlanLockPartitions(plansource, boundParams, owner, extra)))
 		{
 			/* We want a generic plan, and we already have a valid one */
 			plan = plansource->gplan;
@@ -1282,6 +1305,147 @@ ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner)
 	}
 }
 
+/*
+ * For each PlannedStmt in the generic plan, do the "initial" partition pruning
+ * if needed and lock only partitions that survive.
+ *
+ * On return, (*extra)->part_prune_results_list will contain an element for
+ * each PlannedStmt in the generic plan's stmt_list, which is NIL if the
+ * PlannedStmt does not contain any PartitionPruneInfos requiring initial
+ * pruning or a List of PartitionPruneResult containing elements corresponding
+ * to the PartitionPruneInfos in PlannedStmt.partPruneInfos.
+ */
+static bool
+CachedPlanLockPartitions(CachedPlanSource *plansource,
+						 ParamListInfo boundParams,
+						 ResourceOwner owner,
+						 CachedPlanExtra **extra)
+{
+	CachedPlan *plan = plansource->gplan;
+	List	   *part_prune_results_list = NIL;
+	List	   *lockedRelids_per_stmt = NIL;
+	ListCell   *lc1,
+			   *lc2;
+	MemoryContext oldcontext,
+			tmpcontext;
+
+	/*
+	 * Won't be here without CheckCachedPlan() having validated a generic
+	 * plan.
+	 */
+	Assert(plansource->gplan != NULL);
+
+	/*
+	 * Create a temporary context for memory allocations required while
+	 * executing partition pruning steps.
+	 */
+	tmpcontext = AllocSetContextCreate(CurrentMemoryContext,
+									   "CachedPlanLockPartitions() working data",
+									   ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(tmpcontext);
+	foreach(lc1, plan->stmt_list)
+	{
+		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
+		Bitmapset  *lockPartRelids = NULL;
+		int			rti;
+		List	   *part_prune_results = NIL;
+		Bitmapset  *lockedRelids = NULL;
+
+		if (plannedstmt->commandType == CMD_UTILITY)
+		{
+			/*
+			 * Ignore utility statements, because AcquireExecutorLocks on the
+			 * parent CachedPlan would have dealt with these.  Though, do let
+			 * the caller know that no pruning is applicable to this statement.
+			 */
+			part_prune_results_list = lappend(part_prune_results_list, NIL);
+			lockedRelids_per_stmt = lappend(lockedRelids_per_stmt, NULL);
+			continue;
+		}
+
+		/* Figure out the partitions that would need to be locked. */
+		if (plannedstmt->containsInitialPruning)
+		{
+			foreach(lc2, plannedstmt->partPruneInfos)
+			{
+				PartitionPruneInfo *pruneinfo = lfirst_node(PartitionPruneInfo, lc2);
+				PartitionPruneResult *pruneresult = makeNode(PartitionPruneResult);
+
+				pruneresult->root_parent_relids =
+					bms_copy(pruneinfo->root_parent_relids);
+				pruneresult->valid_subplan_offs =
+					ExecPartitionDoInitialPruning(plannedstmt, boundParams,
+												  pruneinfo,
+												  &lockPartRelids);
+				part_prune_results = lappend(part_prune_results, pruneresult);
+			}
+		}
+
+		/* Lock 'em. */
+		rti = -1;
+		while ((rti = bms_next_member(lockPartRelids, rti)) > 0)
+		{
+			RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
+
+			Assert(rte->rtekind == RTE_RELATION);
+
+			/*
+			 * Acquire the appropriate type of lock on each relation OID. Note
+			 * that we don't actually try to open the rel, and hence will not
+			 * fail if it's been dropped entirely --- we'll just transiently
+			 * acquire a non-conflicting lock.
+			 */
+			LockRelationOid(rte->relid, rte->rellockmode);
+			lockedRelids = bms_add_member(lockedRelids, rti);
+		}
+
+		part_prune_results_list = lappend(part_prune_results_list,
+										  part_prune_results);
+		lockedRelids_per_stmt = lappend(lockedRelids_per_stmt,
+										lockedRelids);
+	}
+
+	/*
+	 * If the plan is still valid, set *extra, returning in it a copy of the
+	 * pruning results obtained above, allocated in the caller's context.
+	 */
+	MemoryContextSwitchTo(oldcontext);
+	if (plan->is_valid)
+	{
+		*extra = (CachedPlanExtra *) palloc(sizeof(CachedPlanExtra));
+		(*extra)->part_prune_results_list = copyObject(part_prune_results_list);
+	}
+	else
+	{
+		/*
+		 * Release the now useless locks.  Note that this is the same as what
+		 * CheckCachedPlan() does when the locks taken by
+		 * AcquireExecutorLocks() cause the plan to be invalidated.
+		 */
+		forboth(lc1, plan->stmt_list, lc2, lockedRelids_per_stmt)
+		{
+			PlannedStmt *plannedstmt = lfirst(lc1);
+			Bitmapset *lockedRelids = lfirst(lc2);
+			int		rti;
+
+			if (plannedstmt->commandType == CMD_UTILITY)
+				continue;
+			rti = -1;
+			while ((rti = bms_next_member(lockedRelids, rti)) > 0)
+			{
+				RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
+
+				Assert(rte->rtekind == RTE_RELATION);
+				UnlockRelationOid(rte->relid, rte->rellockmode);
+			}
+		}
+	}
+
+	/* Clear up the temporary context. */
+	MemoryContextDelete(tmpcontext);
+	return plan->is_valid;
+}
+
 /*
  * CachedPlanAllowsSimpleValidityCheck: can we use CachedPlanIsSimplyValid?
  *
@@ -1738,11 +1902,16 @@ QueryListGetPrimaryStmt(List *stmts)
 /*
  * AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
  * or release them if acquire is false.
+ *
+ * If some PlannedStmt(s) contain "initially prunable" partitions, they are not
+ * locked here. Instead, the caller is informed of their existence so that it
+ * can lock them after doing the initial pruning.
  */
-static void
+static bool
 AcquireExecutorLocks(List *stmt_list, bool acquire)
 {
 	ListCell   *lc1;
+	bool		hasUnlockedParts = false;
 
 	foreach(lc1, stmt_list)
 	{
@@ -1763,10 +1932,17 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 
 			Assert(plannedstmt->minLockRelids == NULL);
 			if (query)
-				ScanQueryForLocks(query, acquire);
+				ScanQueryForLocks(query, true);
 			continue;
 		}
 
+		/*
+		 * If partitions can be pruned before execution, defer their locking to
+		 * the caller.
+		 */
+		if (plannedstmt->containsInitialPruning)
+			hasUnlockedParts = true;
+
 		allLockRelids = plannedstmt->minLockRelids;
 		rti = -1;
 		while ((rti = bms_next_member(allLockRelids, rti)) > 0)
@@ -1788,6 +1964,8 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 				UnlockRelationOid(rte->relid, rte->rellockmode);
 		}
 	}
+
+	return hasUnlockedParts;
 }
 
 /*
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index 7b1ae6fdcf..94a9db84e3 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -303,6 +303,22 @@ PortalDefineQuery(Portal portal,
 	portal->status = PORTAL_DEFINED;
 }
 
+/*
+ * Copies the given CachedPlanExtra struct into the portal.
+ */
+void
+PortalSaveCachedPlanExtra(Portal portal, CachedPlanExtra *extra)
+{
+	MemoryContext	oldcxt = MemoryContextSwitchTo(portal->portalContext);
+
+	Assert(portal->cplan_extra == NULL && extra != NULL);
+	portal->cplan_extra = (CachedPlanExtra *)
+		palloc(sizeof(CachedPlanExtra));
+	portal->cplan_extra->part_prune_results_list =
+		copyObject(extra->part_prune_results_list);
+	MemoryContextSwitchTo(oldcxt);
+}
+
 /*
  * PortalReleaseCachedPlan
  *		Release a portal's reference to its cached plan, if any.
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 9ebde089ae..269cc4d562 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -87,7 +87,9 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
 							  ExplainState *es, const char *queryString,
 							  ParamListInfo params, QueryEnvironment *queryEnv);
 
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt,
+						   List *part_prune_results,
+						   IntoClause *into,
 						   ExplainState *es, const char *queryString,
 						   ParamListInfo params, QueryEnvironment *queryEnv,
 						   const instr_time *planduration,
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index aeeaeb7884..4b98d0d2ef 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -129,5 +129,10 @@ extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
 													 Bitmapset *root_parent_relids,
 													 Bitmapset **initially_valid_subplans);
 extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
-										   bool initial_prune);
+										   bool initial_prune,
+										   Bitmapset **scan_leafpart_rtis);
+extern Bitmapset *ExecPartitionDoInitialPruning(PlannedStmt *plannedstmt,
+								ParamListInfo params,
+								PartitionPruneInfo *pruneinfo,
+								Bitmapset **scan_leafpart_rtis);
 #endif							/* EXECPARTITION_H */
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index e79e2c001f..5a7d075750 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -35,6 +35,8 @@ typedef struct QueryDesc
 	/* These fields are provided by CreateQueryDesc */
 	CmdType		operation;		/* CMD_SELECT, CMD_UPDATE, etc. */
 	PlannedStmt *plannedstmt;	/* planner's output (could be utility, too) */
+	List		*part_prune_results; /* PartitionPruneResults returned by
+									  * CachedPlanLockPartitions() */
 	const char *sourceText;		/* source text of the query */
 	Snapshot	snapshot;		/* snapshot to use for query */
 	Snapshot	crosscheck_snapshot;	/* crosscheck for RI update/delete */
@@ -57,6 +59,7 @@ typedef struct QueryDesc
 
 /* in pquery.c */
 extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
+								  List *part_prune_results,
 								  const char *sourceText,
 								  Snapshot snapshot,
 								  Snapshot crosscheck_snapshot,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 9a64a830a2..f1374057e5 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -617,6 +617,7 @@ typedef struct EState
 	List	   *es_rteperminfos;	/* List of RTEPermissionInfo */
 	PlannedStmt *es_plannedstmt;	/* link to top of plan tree */
 	List	   *es_part_prune_infos;	/* PlannedStmt.partPruneInfos */
+	List	   *es_part_prune_results; /* QueryDesc.part_prune_results */
 	const char *es_sourceText;	/* Source text from QueryDesc */
 
 	JunkFilter *es_junkFilter;	/* top-level junk filter, if any */
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 4337e7aa34..10f12e780e 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -134,8 +134,8 @@ typedef struct PlannerGlobal
 	bool		containsInitialPruning;
 
 	/*
-	 * Indexes of all range table entries; for AcquireExecutorLocks()'s
-	 * perusal.
+	 * Indexes of all range table entries except those of leaf partitions
+	 * scanned by prunable subplans; for AcquireExecutorLocks()'s perusal.
 	 */
 	Bitmapset  *minLockRelids;
 
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index eb0a007946..ab8bc74e4a 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -82,7 +82,9 @@ typedef struct PlannedStmt
 	List	   *permInfos;		/* list of RTEPermissionInfo nodes for rtable
 								 * entries needing one */
 
-	Bitmapset  *minLockRelids;	/* Indexes of all range table entries; for
+	Bitmapset  *minLockRelids;	/* Indexes of all range table entries except
+								 * those of leaf partitions scanned by
+								 * prunable subplans; for
 								 * AcquireExecutorLocks()'s perusal */
 
 	/* rtable indexes of target relations for INSERT/UPDATE/DELETE/MERGE */
@@ -1575,6 +1577,33 @@ typedef struct PartitionPruneStepCombine
 	List	   *source_stepids;
 } PartitionPruneStepCombine;
 
+/*----------------
+ * PartitionPruneResult
+ *
+ * The result of performing ExecPartitionDoInitialPruning() on a given
+ * PartitionPruneInfo.
+ *
+ * root_parent_relids is same as PartitionPruneInfo.root_parent_relids.  It's
+ * there for cross-checking in ExecInitPartitionPruning() that the
+ * PartitionPruneResult and the PartitionPruneInfo at a given index in
+ * EState.es_part_prune_results and EState.es_part_prune_infos, respectively,
+ * belong to the same parent plan node.
+ *
+ * valid_subplan_offs contains the indexes of subplans remaining after
+ * performing initial pruning by calling ExecFindMatchingSubPlans() on the
+ * PartitionPruneInfo.
+ *
+ * This is used to store the result of initial partition pruning that is
+ * performed before the execution has started, such as in
+ * CachedPlanLockPartitions().
+ */
+typedef struct PartitionPruneResult
+{
+	NodeTag		type;
+
+	Bitmapset	   *root_parent_relids;
+	Bitmapset	   *valid_subplan_offs;
+} PartitionPruneResult;
 
 /*
  * Plan invalidation info
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 0499635f59..4ac66d2761 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -160,6 +160,14 @@ typedef struct CachedPlan
 	MemoryContext context;		/* context containing this CachedPlan */
 } CachedPlan;
 
+/*
+ * Additional information to pass to the executor when executing a CachedPlan.
+ */
+typedef struct CachedPlanExtra
+{
+	List	   *part_prune_results_list;
+} CachedPlanExtra;
+
 /*
  * CachedExpression is a low-overhead mechanism for caching the planned form
  * of standalone scalar expressions.  While such expressions are not usually
@@ -220,7 +228,8 @@ extern List *CachedPlanGetTargetList(CachedPlanSource *plansource,
 extern CachedPlan *GetCachedPlan(CachedPlanSource *plansource,
 								 ParamListInfo boundParams,
 								 ResourceOwner owner,
-								 QueryEnvironment *queryEnv);
+								 QueryEnvironment *queryEnv,
+								 CachedPlanExtra **extra);
 extern void ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner);
 
 extern bool CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index aeddbdafe5..49bb00cda5 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -138,6 +138,8 @@ typedef struct PortalData
 	QueryCompletion qc;			/* command completion data for executed query */
 	List	   *stmts;			/* list of PlannedStmts */
 	CachedPlan *cplan;			/* CachedPlan, if stmts are from one */
+	CachedPlanExtra *cplan_extra;	/* CachedPlanExtra for cplan in Portal's
+									 * memory */
 
 	ParamListInfo portalParams; /* params to pass to query */
 	QueryEnvironment *queryEnv; /* environment for query */
@@ -242,6 +244,7 @@ extern void PortalDefineQuery(Portal portal,
 							  CommandTag commandTag,
 							  List *stmts,
 							  CachedPlan *cplan);
+extern void PortalSaveCachedPlanExtra(Portal portal, CachedPlanExtra *extra);
 extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
 extern void PortalCreateHoldStore(Portal portal);
 extern void PortalHashTableDeleteAll(void);
-- 
2.35.3
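
The locking scheme in the patch above can be sketched roughly as follows. This
is a toy model only, not PostgreSQL code: RelidSet, ToyPlan and the toy_*
functions are hypothetical names invented for illustration, a 64-bit mask
stands in for a Bitmapset, and the is_valid flag stands in for the plan being
invalidated by a concurrent session while locks are being taken. It shows the
minimal lock set excluding initially prunable leaf partitions, and the second
step locking only the partitions that survive pruning, with everything
released if the plan turns out to be invalid.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for a Bitmapset: bit i set means RT index i is a member. */
typedef uint64_t RelidSet;

/* Hypothetical toy plan; this is not PostgreSQL's PlannedStmt. */
typedef struct ToyPlan
{
	RelidSet	all_relids;			/* every relation the plan references */
	RelidSet	leaf_part_relids;	/* leaf partitions under initially
									 * prunable subplans */
	bool		is_valid;			/* false once the plan is invalidated */
} ToyPlan;

/* Step 1: the minimal lock set leaves out initially prunable leaf partitions. */
static RelidSet
toy_min_lock_relids(const ToyPlan *plan)
{
	return plan->all_relids & ~plan->leaf_part_relids;
}

/*
 * Step 2: add the leaf partitions that survived initial pruning to the set
 * of locked relations.  If the plan is no longer valid, drop all locks and
 * report failure so that the caller can replan.
 */
static bool
toy_lock_for_execution(const ToyPlan *plan, RelidSet survivors,
					   RelidSet *locked)
{
	*locked = toy_min_lock_relids(plan);
	*locked |= survivors & plan->leaf_part_relids;

	if (!plan->is_valid)
	{
		*locked = 0;			/* release the now useless locks */
		return false;
	}
	return true;
}
```

With relations 1..5 in the plan, 2..5 being prunable leaf partitions, and only
partition 3 surviving pruning, the sketch ends up holding locks on relations 1
and 3 rather than on all five.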



  [application/octet-stream] v30-0001-Preparatory-refactoring-before-reworking-CachedP.patch (17.2K, 3-v30-0001-Preparatory-refactoring-before-reworking-CachedP.patch)
  download | inline diff:
From 22c64b3d1ade0cb0f413c17d84a9bb0dd4e6d734 Mon Sep 17 00:00:00 2001
From: amitlan <[email protected]>
Date: Tue, 13 Dec 2022 11:58:07 +0900
Subject: [PATCH v30 1/2] Preparatory refactoring before reworking CachedPlan
 locking

Remember the RT indexes of RTEs that AcquireExecutorLocks() must
look at to consider locking in a bitmapset, so that instead of looping
over the range table to find those RTEs, it can look them up using
the RT indexes set in the bitmapset.

This also adds some extra information related to execution-time
pruning to the relevant plan nodes.
---
 src/backend/executor/execParallel.c  |  1 +
 src/backend/executor/execPartition.c |  6 ++++
 src/backend/nodes/readfuncs.c        |  8 ++++--
 src/backend/optimizer/plan/planner.c |  2 ++
 src/backend/optimizer/plan/setrefs.c | 12 ++++++++
 src/backend/partitioning/partprune.c | 42 ++++++++++++++++++++++++++--
 src/backend/utils/cache/plancache.c  | 10 +++++--
 src/include/executor/execPartition.h |  2 ++
 src/include/nodes/nodes.h            |  1 +
 src/include/nodes/pathnodes.h        | 11 ++++++++
 src/include/nodes/plannodes.h        | 19 +++++++++++++
 11 files changed, 106 insertions(+), 8 deletions(-)

diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index a5b8e43ec5..65c4b63bbd 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -182,6 +182,7 @@ ExecSerializePlan(Plan *plan, EState *estate)
 	pstmt->transientPlan = false;
 	pstmt->dependsOnRole = false;
 	pstmt->parallelModeNeeded = false;
+	pstmt->containsInitialPruning = false;	/* workers need not know! */
 	pstmt->planTree = plan;
 	pstmt->partPruneInfos = estate->es_part_prune_infos;
 	pstmt->rtable = estate->es_range_table;
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 76d79b9741..5b62157712 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -1956,6 +1956,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 			Assert(partdesc->nparts >= pinfo->nparts);
 			pprune->nparts = partdesc->nparts;
 			pprune->subplan_map = palloc(sizeof(int) * partdesc->nparts);
+			pprune->rti_map = palloc(sizeof(Index) * partdesc->nparts);
 			if (partdesc->nparts == pinfo->nparts)
 			{
 				/*
@@ -1966,6 +1967,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 				pprune->subpart_map = pinfo->subpart_map;
 				memcpy(pprune->subplan_map, pinfo->subplan_map,
 					   sizeof(int) * pinfo->nparts);
+				memcpy(pprune->rti_map, pinfo->rti_map,
+					   sizeof(Index) * pinfo->nparts);
 
 				/*
 				 * Double-check that the list of unpruned relations has not
@@ -2016,6 +2019,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 							pinfo->subplan_map[pd_idx];
 						pprune->subpart_map[pp_idx] =
 							pinfo->subpart_map[pd_idx];
+						pprune->rti_map[pp_idx] =
+							pinfo->rti_map[pd_idx];
 						pd_idx++;
 					}
 					else
@@ -2023,6 +2028,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 						/* this partdesc entry is not in the plan */
 						pprune->subplan_map[pp_idx] = -1;
 						pprune->subpart_map[pp_idx] = -1;
+						pprune->rti_map[pp_idx] = 0;
 					}
 				}
 
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 966b75f5a6..1161671fa4 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -158,6 +158,11 @@
 	token = pg_strtok(&length);		/* skip :fldname */ \
 	local_node->fldname = readIntCols(len)
 
+/* Read an Index array */
+#define READ_INDEX_ARRAY(fldname, len) \
+	token = pg_strtok(&length);		/* skip :fldname */ \
+	local_node->fldname = readIndexCols(len)
+
 /* Read a bool array */
 #define READ_BOOL_ARRAY(fldname, len) \
 	token = pg_strtok(&length);		/* skip :fldname */ \
@@ -796,7 +801,6 @@ fnname(int numCols) \
  */
 READ_SCALAR_ARRAY(readAttrNumberCols, int16, atoi)
 READ_SCALAR_ARRAY(readOidCols, Oid, atooid)
-/* outfuncs.c has writeIndexCols, but we don't yet need that here */
-/* READ_SCALAR_ARRAY(readIndexCols, Index, atoui) */
+READ_SCALAR_ARRAY(readIndexCols, Index, atoui)
 READ_SCALAR_ARRAY(readIntCols, int, atoi)
 READ_SCALAR_ARRAY(readBoolCols, bool, strtobool)
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 5dd4f92720..620b163ef9 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -523,8 +523,10 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 	result->parallelModeNeeded = glob->parallelModeNeeded;
 	result->planTree = top_plan;
 	result->partPruneInfos = glob->partPruneInfos;
+	result->containsInitialPruning = glob->containsInitialPruning;
 	result->rtable = glob->finalrtable;
 	result->permInfos = glob->finalrteperminfos;
+	result->minLockRelids = glob->minLockRelids;
 	result->resultRelations = glob->resultRelations;
 	result->appendRelations = glob->appendRelations;
 	result->subplans = glob->subplans;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 596f1fbc8e..ed43d5936d 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -279,6 +279,16 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 	 */
 	add_rtes_to_flat_rtable(root, false);
 
+	/*
+	 * Add the query's adjusted range of RT indexes to glob->minLockRelids.
+	 * The adjusted RT indexes of prunable relations will be deleted from the
+	 * set below where PartitionPruneInfos are processed.
+	 */
+	glob->minLockRelids =
+		bms_add_range(glob->minLockRelids,
+					  rtoffset + 1,
+					  rtoffset + list_length(root->parse->rtable));
+
 	/*
 	 * Adjust RT indexes of PlanRowMarks and add to final rowmarks list
 	 */
@@ -377,9 +387,11 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 				/* RT index of the table to which the pinfo belongs. */
 				pinfo->rtindex += rtoffset;
 			}
+
 		}
 
 		glob->partPruneInfos = lappend(glob->partPruneInfos, pruneinfo);
+		glob->containsInitialPruning |= pruneinfo->needs_init_pruning;
 	}
 
 	return result;
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index d48f6784c1..56270d7670 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -144,7 +144,9 @@ static List *make_partitionedrel_pruneinfo(PlannerInfo *root,
 										   List *prunequal,
 										   Bitmapset *partrelids,
 										   int *relid_subplan_map,
-										   Bitmapset **matchedsubplans);
+										   Bitmapset **matchedsubplans,
+										   bool *needs_init_pruning,
+										   bool *needs_exec_pruning);
 static void gen_partprune_steps(RelOptInfo *rel, List *clauses,
 								PartClauseTarget target,
 								GeneratePruningStepsContext *context);
@@ -234,6 +236,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	int		   *relid_subplan_map;
 	ListCell   *lc;
 	int			i;
+	bool		needs_init_pruning = false;
+	bool		needs_exec_pruning = false;
 
 	/*
 	 * Scan the subpaths to see which ones are scans of partition child
@@ -313,12 +317,16 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		Bitmapset  *partrelids = (Bitmapset *) lfirst(lc);
 		List	   *pinfolist;
 		Bitmapset  *matchedsubplans = NULL;
+		bool		partrel_needs_init_pruning;
+		bool		partrel_needs_exec_pruning;
 
 		pinfolist = make_partitionedrel_pruneinfo(root, parentrel,
 												  prunequal,
 												  partrelids,
 												  relid_subplan_map,
-												  &matchedsubplans);
+												  &matchedsubplans,
+												  &partrel_needs_init_pruning,
+												  &partrel_needs_exec_pruning);
 
 		/* When pruning is possible, record the matched subplans */
 		if (pinfolist != NIL)
@@ -327,6 +335,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 			allmatchedsubplans = bms_join(matchedsubplans,
 										  allmatchedsubplans);
 		}
+
+		needs_init_pruning |= partrel_needs_init_pruning;
+		needs_exec_pruning |= partrel_needs_exec_pruning;
 	}
 
 	pfree(relid_subplan_map);
@@ -342,6 +353,8 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	pruneinfo = makeNode(PartitionPruneInfo);
 	pruneinfo->root_parent_relids = parentrel->relids;
 	pruneinfo->prune_infos = prunerelinfos;
+	pruneinfo->needs_init_pruning = needs_init_pruning;
+	pruneinfo->needs_exec_pruning = needs_exec_pruning;
 
 	/*
 	 * Some subplans may not belong to any of the identified partitioned rels.
@@ -442,13 +455,19 @@ add_part_relids(List *allpartrelids, Bitmapset *partrelids)
  * If we cannot find any useful run-time pruning steps, return NIL.
  * However, on success, each rel identified in partrelids will have
  * an element in the result list, even if some of them are useless.
+ * *needs_init_pruning and *needs_exec_pruning are set to indicate whether
+ * the pruning steps contained in the returned PartitionedRelPruneInfos
+ * can be performed during executor startup and during execution,
+ * respectively.
  */
 static List *
 make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 							  List *prunequal,
 							  Bitmapset *partrelids,
 							  int *relid_subplan_map,
-							  Bitmapset **matchedsubplans)
+							  Bitmapset **matchedsubplans,
+							  bool *needs_init_pruning,
+							  bool *needs_exec_pruning)
 {
 	RelOptInfo *targetpart = NULL;
 	List	   *pinfolist = NIL;
@@ -459,6 +478,10 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 	int			rti;
 	int			i;
 
+	/* Will find out below. */
+	*needs_init_pruning = false;
+	*needs_exec_pruning = false;
+
 	/*
 	 * Examine each partitioned rel, constructing a temporary array to map
 	 * from planner relids to index of the partitioned rel, and building a
@@ -546,6 +569,9 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		 * executor per-scan pruning steps.  This first pass creates startup
 		 * pruning steps and detects whether there's any possibly-useful quals
 		 * that would require per-scan pruning.
+		 *
+		 * In the first pass, we note whether the 2nd pass is necessary by
+		 * noting the presence of EXEC parameters.
 		 */
 		gen_partprune_steps(subpart, partprunequal, PARTTARGET_INITIAL,
 							&context);
@@ -620,6 +646,12 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		pinfo->execparamids = execparamids;
 		/* Remaining fields will be filled in the next loop */
 
+		/* record which types of pruning steps we've seen so far */
+		if (initial_pruning_steps != NIL)
+			*needs_init_pruning = true;
+		if (exec_pruning_steps != NIL)
+			*needs_exec_pruning = true;
+
 		pinfolist = lappend(pinfolist, pinfo);
 	}
 
@@ -647,6 +679,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		int		   *subplan_map;
 		int		   *subpart_map;
 		Oid		   *relid_map;
+		Index	   *rti_map;
 
 		/*
 		 * Construct the subplan and subpart maps for this partitioning level.
@@ -659,6 +692,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		subpart_map = (int *) palloc(nparts * sizeof(int));
 		memset(subpart_map, -1, nparts * sizeof(int));
 		relid_map = (Oid *) palloc0(nparts * sizeof(Oid));
+		rti_map = (Index *) palloc0(nparts * sizeof(Index));
 		present_parts = NULL;
 
 		i = -1;
@@ -673,6 +707,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 			subplan_map[i] = subplanidx = relid_subplan_map[partrel->relid] - 1;
 			subpart_map[i] = subpartidx = relid_subpart_map[partrel->relid] - 1;
 			relid_map[i] = planner_rt_fetch(partrel->relid, root)->relid;
+			rti_map[i] = partrel->relid;
 			if (subplanidx >= 0)
 			{
 				present_parts = bms_add_member(present_parts, i);
@@ -697,6 +732,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
 		pinfo->subplan_map = subplan_map;
 		pinfo->subpart_map = subpart_map;
 		pinfo->relid_map = relid_map;
+		pinfo->rti_map = rti_map;
 	}
 
 	pfree(relid_subpart_map);
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index cc943205d3..339bb603f7 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -1747,7 +1747,8 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 	foreach(lc1, stmt_list)
 	{
 		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
-		ListCell   *lc2;
+		Bitmapset  *allLockRelids;
+		int			rti;
 
 		if (plannedstmt->commandType == CMD_UTILITY)
 		{
@@ -1760,14 +1761,17 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 			 */
 			Query	   *query = UtilityContainsQuery(plannedstmt->utilityStmt);
 
+			Assert(plannedstmt->minLockRelids == NULL);
 			if (query)
 				ScanQueryForLocks(query, acquire);
 			continue;
 		}
 
-		foreach(lc2, plannedstmt->rtable)
+		allLockRelids = plannedstmt->minLockRelids;
+		rti = -1;
+		while ((rti = bms_next_member(allLockRelids, rti)) > 0)
 		{
-			RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
+			RangeTblEntry *rte = rt_fetch(rti, plannedstmt->rtable);
 
 			if (rte->rtekind != RTE_RELATION)
 				continue;
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 17fabc18c9..aeeaeb7884 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -45,6 +45,7 @@ extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
  * nparts						Length of subplan_map[] and subpart_map[].
  * subplan_map					Subplan index by partition index, or -1.
  * subpart_map					Subpart index by partition index, or -1.
+ * rti_map						Range table index by partition index, or 0.
  * present_parts				A Bitmapset of the partition indexes that we
  *								have subplans or subparts for.
  * initial_pruning_steps		List of PartitionPruneSteps used to
@@ -61,6 +62,7 @@ typedef struct PartitionedRelPruningData
 	int			nparts;
 	int		   *subplan_map;
 	int		   *subpart_map;
+	Index	   *rti_map;
 	Bitmapset  *present_parts;
 	List	   *initial_pruning_steps;
 	List	   *exec_pruning_steps;
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 1f33902947..c2f2544df5 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -218,6 +218,7 @@ extern struct Bitmapset *readBitmapset(void);
 extern uintptr_t readDatum(bool typbyval);
 extern bool *readBoolCols(int numCols);
 extern int *readIntCols(int numCols);
+extern Index *readIndexCols(int numCols);
 extern Oid *readOidCols(int numCols);
 extern int16 *readAttrNumberCols(int numCols);
 
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 654dba61aa..4337e7aa34 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -128,6 +128,17 @@ typedef struct PlannerGlobal
 	/* List of PartitionPruneInfo contained in the plan */
 	List	   *partPruneInfos;
 
+	/*
+	 * Do any of those PartitionPruneInfos have initial pruning steps in them?
+	 */
+	bool		containsInitialPruning;
+
+	/*
+	 * Indexes of all range table entries; for AcquireExecutorLocks()'s
+	 * perusal.
+	 */
+	Bitmapset  *minLockRelids;
+
 	/* OIDs of relations the plan depends on */
 	List	   *relationOids;
 
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index bddfe86191..eb0a007946 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -73,11 +73,18 @@ typedef struct PlannedStmt
 	List	   *partPruneInfos; /* List of PartitionPruneInfo contained in the
 								 * plan */
 
+	bool		containsInitialPruning;	/* Do any of those PartitionPruneInfos
+										 * have initial pruning steps in them?
+										 */
+
 	List	   *rtable;			/* list of RangeTblEntry nodes */
 
 	List	   *permInfos;		/* list of RTEPermissionInfo nodes for rtable
 								 * entries needing one */
 
+	Bitmapset  *minLockRelids;	/* Indexes of all range table entries; for
+								 * AcquireExecutorLocks()'s perusal */
+
 	/* rtable indexes of target relations for INSERT/UPDATE/DELETE/MERGE */
 	List	   *resultRelations;	/* integer list of RT indexes, or NIL */
 
@@ -1417,6 +1424,13 @@ typedef struct PlanRowMark
  * prune_infos			List of Lists containing PartitionedRelPruneInfo nodes,
  *						one sublist per run-time-prunable partition hierarchy
  *						appearing in the parent plan node's subplans.
+ *
+ * needs_init_pruning	Does any of the PartitionedRelPruneInfos in
+ *						prune_infos have its initial_pruning_steps set?
+ *
+ * needs_exec_pruning	Does any of the PartitionedRelPruneInfos in
+ *						prune_infos have its exec_pruning_steps set?
+ *
  * other_subplans		Indexes of any subplans that are not accounted for
  *						by any of the PartitionedRelPruneInfo nodes in
  *						"prune_infos".  These subplans must not be pruned.
@@ -1428,6 +1442,8 @@ typedef struct PartitionPruneInfo
 	NodeTag		type;
 	Bitmapset  *root_parent_relids;
 	List	   *prune_infos;
+	bool		needs_init_pruning;
+	bool		needs_exec_pruning;
 	Bitmapset  *other_subplans;
 } PartitionPruneInfo;
 
@@ -1472,6 +1488,9 @@ typedef struct PartitionedRelPruneInfo
 	/* relation OID by partition index, or 0 */
 	Oid		   *relid_map pg_node_attr(array_size(nparts));
 
+	/* Range table index by partition index, or 0. */
+	Index	   *rti_map pg_node_attr(array_size(nparts));
+
 	/*
 	 * initial_pruning_steps shows how to prune during executor startup (i.e.,
 	 * without use of any PARAM_EXEC Params); it is NIL if no startup pruning
-- 
2.35.3
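
The plancache.c hunk in the patch above replaces the loop over the whole range table with a walk over the RT indexes stored in a bitmapset. A minimal sketch of that access pattern follows, assuming a 64-bit word in place of PostgreSQL's Bitmapset; MiniBitmap, mini_next_member() and count_lockable() are hypothetical names used only for illustration, and only the "-2 when exhausted" return convention and the "> 0" loop test are taken from the patch's use of bms_next_member().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 64-entry bitmap standing in for PostgreSQL's Bitmapset. */
typedef uint64_t MiniBitmap;

/*
 * Return the smallest member greater than 'prev', or -2 if there is none.
 * This mirrors the calling convention of bms_next_member().
 */
static int
mini_next_member(MiniBitmap set, int prev)
{
	for (int i = prev + 1; i < 64; i++)
	{
		if (set & (UINT64_C(1) << i))
			return i;
	}
	return -2;
}

/*
 * Count the entries that would be considered for locking.  is_relation[]
 * is indexed by RT index (1-based; slot 0 is unused), and only the
 * indexes present in 'lock_relids' are visited.
 */
static int
count_lockable(const int *is_relation, MiniBitmap lock_relids)
{
	int			rti = -1;
	int			n = 0;

	while ((rti = mini_next_member(lock_relids, rti)) > 0)
	{
		if (is_relation[rti])
			n++;
	}
	return n;
}
```

With a set containing RT indexes 1, 2 and 4, where only entries 1 and 4 are plain relations, count_lockable() visits three entries and counts two, no matter how long the range table is.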



^ permalink  raw  reply  [nested|flat] 108+ messages in thread

* Re: generic plans and "initial" pruning
@ 2022-12-21 10:18  Alvaro Herrera <[email protected]>
  parent: Amit Langote <[email protected]>
  0 siblings, 2 replies; 108+ messages in thread

From: Alvaro Herrera @ 2022-12-21 10:18 UTC (permalink / raw)
  To: Amit Langote <[email protected]>; +Cc: Robert Haas <[email protected]>; Jacob Champion <[email protected]>; David Rowley <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers

This version of the patch looks not entirely unreasonable to me.  I'll
set this as Ready for Committer in case David or Tom or someone else
wants to have a look and potentially commit it.

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/





^ permalink  raw  reply  [nested|flat] 108+ messages in thread

* Re: generic plans and "initial" pruning
@ 2022-12-21 10:47  Amit Langote <[email protected]>
  parent: Alvaro Herrera <[email protected]>
  1 sibling, 0 replies; 108+ messages in thread

From: Amit Langote @ 2022-12-21 10:47 UTC (permalink / raw)
  To: Alvaro Herrera <[email protected]>; +Cc: Robert Haas <[email protected]>; Jacob Champion <[email protected]>; David Rowley <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers

On Wed, Dec 21, 2022 at 7:18 PM Alvaro Herrera <[email protected]> wrote:
> This version of the patch looks not entirely unreasonable to me.  I'll
> set this as Ready for Committer in case David or Tom or someone else
> wants to have a look and potentially commit it.

Thank you, Alvaro.

-- 
Thanks, Amit Langote
EDB: http://www.enterprisedb.com





^ permalink  raw  reply  [nested|flat] 108+ messages in thread

* Re: generic plans and "initial" pruning
@ 2022-12-21 15:18  Tom Lane <[email protected]>
  parent: Alvaro Herrera <[email protected]>
  1 sibling, 0 replies; 108+ messages in thread

From: Tom Lane @ 2022-12-21 15:18 UTC (permalink / raw)
  To: Alvaro Herrera <[email protected]>; +Cc: Amit Langote <[email protected]>; Robert Haas <[email protected]>; Jacob Champion <[email protected]>; David Rowley <[email protected]>; pgsql-hackers

Alvaro Herrera <[email protected]> writes:
> This version of the patch looks not entirely unreasonable to me.  I'll
> set this as Ready for Committer in case David or Tom or someone else
> wants to have a look and potentially commit it.

I will have a look during the January CF.

			regards, tom lane





^ permalink  raw  reply  [nested|flat] 108+ messages in thread

* Re: generic plans and "initial" pruning
@ 2025-05-20 03:06  Tom Lane <[email protected]>
  0 siblings, 2 replies; 108+ messages in thread

From: Tom Lane @ 2025-05-20 03:06 UTC (permalink / raw)
  To: Amit Langote <[email protected]>; +Cc: Tender Wang <[email protected]>; Alexander Lakhin <[email protected]>; Tomas Vondra <[email protected]>; Robert Haas <[email protected]>; Alvaro Herrera <[email protected]>; Andres Freund <[email protected]>; Daniel Gustafsson <[email protected]>; David Rowley <[email protected]>; pgsql-hackers; Thom Brown <[email protected]>

Amit Langote <[email protected]> writes:
> Pushed after some tweaks to comments and the test case.

My attention was drawn to commit 525392d57 after observing that
Valgrind complained about a memory leak in some code that commit added
to BuildCachedPlan().  I tried to make sense of said code so I could
remove the leak, and eventually arrived at the attached patch, which
is part of a series of leak-fixing things hence the high sequence
number.

Unfortunately, the bad things I speculated about in the added comments
seem to be reality.  The second attached file is a test case that
triggers

TRAP: failed Assert("list_length(plan_list) == list_length(plan->stmt_list)"), File: "plancache.c", Line: 1259, PID: 602087

because it adds a DO ALSO rule that causes the rewriter to generate
more PlannedStmts than it did before.

This is quite awful, because it does more than simply break the klugy
(and undocumented) business about keeping the top-level List in a
different context.  What it means is that any outside code that is
busy iterating that List is very fundamentally broken: it's not clear
what List index it ought to resume at, except that "the one it was at"
is demonstrably incorrect.
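
A minimal sketch of the stale-index hazard, assuming a toy statement list rather than the real CachedPlan; MiniStmtList and resume_after_replan() are hypothetical names, and the position of the extra statement in the replanned list is an assumption made for illustration:

```c
#include <assert.h>

#define MAX_STMTS 8

/* Hypothetical stand-in for a cached plan's list of statements. */
typedef struct MiniStmtList
{
	int			length;
	int			stmt_ids[MAX_STMTS];
} MiniStmtList;

/*
 * Simulate a caller that has finished the statement at 'done_index' and
 * simply resumes at done_index + 1 after the list was rebuilt underneath
 * it.  Returns the id of the statement it would run next, or -1 if it
 * believes it has reached the end of the list.
 */
static int
resume_after_replan(const MiniStmtList *replanned, int done_index)
{
	int			next = done_index + 1;

	if (next >= replanned->length)
		return -1;
	return replanned->stmt_ids[next];
}
```

Suppose the old list held statements {1, 2} and the caller had just finished statement 1 at index 0. If the replanned list is {100, 1, 2}, resuming at index 1 runs statement 1 a second time and never runs the new statement 100.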

I also don't really believe the (also undocumented) assumption that
such outside code is in between executions of PlannedStmts of the
List and hence can tolerate those being ripped out and replaced.
I have not attempted to build an example, because the one I have
seems sufficiently damning.  But I bet that a recursive function
could be constructed in such a way that an outer execution is
still in progress when an inner call triggers UpdateCachedPlan.

Another small problem (much more easily fixable than the above,
probably) is that summarily setting "plan->is_valid = true"
at the end is not okay.  We could already have received an
invalidation that should result in marking the plan stale.
(Holding locks on the tables involved is not sufficient to
prevent that, as there are other sources of inval events.)
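
A minimal sketch of one conceivable guard against that race, assuming an invalidation counter that is compared before and after the rebuild. This is not what plancache.c does; MiniPlan, mini_invalidate() and mini_rebuild() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical miniature of a cached plan; not PostgreSQL's CachedPlan. */
typedef struct MiniPlan
{
	bool		is_valid;
	unsigned	inval_count;	/* bumped by every invalidation event */
} MiniPlan;

/* Invalidation callback: may fire at any time, including mid-rebuild. */
static void
mini_invalidate(MiniPlan *plan)
{
	plan->is_valid = false;
	plan->inval_count++;
}

/*
 * Rebuild the plan.  'inval_during_rebuild' simulates an invalidation
 * event arriving while planning is underway.  The plan is marked valid
 * only if no invalidation was seen since the rebuild started, instead of
 * being set to true unconditionally.
 */
static void
mini_rebuild(MiniPlan *plan, bool inval_during_rebuild)
{
	unsigned	start = plan->inval_count;

	/* ... planning work would happen here ... */
	if (inval_during_rebuild)
		mini_invalidate(plan);

	plan->is_valid = (plan->inval_count == start);
}
```

A rebuild with no concurrent event leaves the plan valid, while one that is interrupted by an invalidation leaves it marked stale so that the next use replans.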

It's possible that this code can be fixed, but I fear it's
going to involve some really fundamental redesign, which
probably shouldn't be happening after beta1.  I think there
is no alternative but to revert for v18.

			regards, tom lane


drop table if exists test_table;

CREATE TABLE test_table (a int);

create or replace function doit(r int, a int) returns bool
language plpgsql as $$
begin
  raise notice 'r = %, a = %', r, a;
  if (r = 10) then
    CREATE RULE make_noise AS ON DELETE TO test_table
	DO ALSO INSERT INTO test_table SELECT 2;
    raise notice 'made rule';
  end if;
  if (r = 20 and a = 1) then
    CREATE RULE make_noise_2 AS ON DELETE TO test_table
	DO ALSO INSERT INTO test_table SELECT 3;
    raise notice 'made rule 2';
  end if;
  return true;
end$$;

set plan_cache_mode to force_generic_plan;

DO $$
BEGIN
    FOR r IN 1..30 LOOP
        TRUNCATE test_table;
        INSERT INTO test_table SELECT 1;
        DELETE FROM test_table where doit(r,a);
    END LOOP;
END$$;

table test_table;


Attachments:

  [text/x-diff] v2-0010-Partially-fix-some-extremely-broken-code-from-52.patch (3.7K, 2-v2-0010-Partially-fix-some-extremely-broken-code-from-52.patch)
  download | inline diff:
From a680e6b6885378beb0164e465b50afd81558ebc5 Mon Sep 17 00:00:00 2001
From: Tom Lane <[email protected]>
Date: Mon, 19 May 2025 00:02:20 -0400
Subject: [PATCH v2 10/20] Partially fix some extremely broken code from
 525392d57.

Avoid leaking memory in the stmt_context during BuildCachedPlan.
Sadly, this code has problems a lot worse than that (per the
documentation I added), so I suspect 525392d57 will get reverted
and we won't need this patch.

Author: Tom Lane <[email protected]>
Discussion: https://postgr.es/m/[email protected]
---
 src/backend/utils/cache/plancache.c | 37 ++++++++++++++++++++++-------
 1 file changed, 29 insertions(+), 8 deletions(-)

diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 9bcbc4c3e97..40ba3e9df7c 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -1109,22 +1109,32 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 	 */
 	if (!plansource->is_oneshot)
 	{
+		List	   *stmt_plist;
+
 		plan_context = AllocSetContextCreate(CurrentMemoryContext,
 											 "CachedPlan",
 											 ALLOCSET_START_SMALL_SIZES);
 		MemoryContextCopyAndSetIdentifier(plan_context, plansource->query_string);
 
-		stmt_context = AllocSetContextCreate(CurrentMemoryContext,
+		stmt_context = AllocSetContextCreate(plan_context,
 											 "CachedPlan PlannedStmts",
 											 ALLOCSET_START_SMALL_SIZES);
 		MemoryContextCopyAndSetIdentifier(stmt_context, plansource->query_string);
-		MemoryContextSetParent(stmt_context, plan_context);
 
+		/*
+		 * Copy plans into the stmt_context.
+		 */
 		MemoryContextSwitchTo(stmt_context);
-		plist = copyObject(plist);
+		stmt_plist = copyObject(plist);
 
+		/*
+		 * We actually need the top-level List object to be in the long-lived
+		 * plan_context, in case UpdateCachedPlan wants to update it; see
+		 * comments therein.  Do a shallow copy to make that happen.
+		 */
 		MemoryContextSwitchTo(plan_context);
-		plist = list_copy(plist);
+		plist = list_copy(stmt_plist);
+		list_free(stmt_plist);	/* be tidy */
 	}
 	else
 		plan_context = CurrentMemoryContext;
@@ -1251,12 +1261,22 @@ UpdateCachedPlan(CachedPlanSource *plansource, int query_index,
 
 	/*
 	 * Planning work is done in the caller's memory context.  The resulting
-	 * PlannedStmt is then copied into plan->stmt_context after throwing away
-	 * the old ones.
+	 * PlannedStmt(s) are then copied into plan->stmt_context after throwing
+	 * away the old ones.  But note that we re-use the long-lived
+	 * plan->stmt_list list to hold the pointers to the PlannedStmts.  This
+	 * kluge avoids breaking code that is iterating over that list, so long as
+	 * it's between statements and not currently using one of the contained
+	 * PlannedStmts.
+	 *
+	 * XXX this is, if not actively broken, at least unbelievably fragile.
+	 * Aside from the likelihood that the just-stated assumption doesn't hold
+	 * universally, there is not a good reason to believe that the length of
+	 * the plan list is constant.
 	 */
 	plan_list = pg_plan_queries(query_list, plansource->query_string,
 								plansource->cursor_options, NULL);
-	Assert(list_length(plan_list) == list_length(plan->stmt_list));
+	if (list_length(plan_list) != list_length(plan->stmt_list))
+		elog(ERROR, "UpdateCachedPlan(): plan list length changed");
 
 	MemoryContextReset(plan->stmt_context);
 	oldcxt = MemoryContextSwitchTo(plan->stmt_context);
@@ -1276,7 +1296,8 @@ UpdateCachedPlan(CachedPlanSource *plansource, int query_index,
 
 	/*
 	 * We've updated all the plans that might have been invalidated, so mark
-	 * the CachedPlan as valid.
+	 * the CachedPlan as valid.  XXX wrong: we could already have hit a new
+	 * invalidation event.
 	 */
 	plan->is_valid = true;
 
-- 
2.43.5



  [text/plain] break_cached_plan.sql (778B, 3-break_cached_plan.sql)

^ permalink  raw  reply  [nested|flat] 108+ messages in thread

* Re: generic plans and "initial" pruning
@ 2025-05-20 07:59  Tomas Vondra <[email protected]>
  parent: Tom Lane <[email protected]>
  1 sibling, 1 reply; 108+ messages in thread

From: Tomas Vondra @ 2025-05-20 07:59 UTC (permalink / raw)
  To: Tom Lane <[email protected]>; Amit Langote <[email protected]>; +Cc: Tender Wang <[email protected]>; Alexander Lakhin <[email protected]>; Robert Haas <[email protected]>; Alvaro Herrera <[email protected]>; Andres Freund <[email protected]>; Daniel Gustafsson <[email protected]>; David Rowley <[email protected]>; pgsql-hackers; Thom Brown <[email protected]>


On 5/20/25 05:06, Tom Lane wrote:
> Amit Langote <[email protected]> writes:
>> Pushed after some tweaks to comments and the test case.
> 
> My attention was drawn to commit 525392d57 after observing that
> Valgrind complained about a memory leak in some code that commit added
> to BuildCachedPlan().  I tried to make sense of said code so I could
> remove the leak, and eventually arrived at the attached patch, which
> is part of a series of leak-fixing things hence the high sequence
> number.
> 
> Unfortunately, the bad things I speculated about in the added comments
> seem to be reality.  The second attached file is a test case that
> triggers
> 
> ...

FYI I added this as a PG18 open item:

  https://wiki.postgresql.org/wiki/PostgreSQL_18_Open_Items


regards

-- 
Tomas Vondra






^ permalink  raw  reply  [nested|flat] 108+ messages in thread

* Re: generic plans and "initial" pruning
@ 2025-05-20 13:25  Amit Langote <[email protected]>
  parent: Tom Lane <[email protected]>
  1 sibling, 1 reply; 108+ messages in thread

From: Amit Langote @ 2025-05-20 13:25 UTC (permalink / raw)
  To: Tom Lane <[email protected]>; +Cc: Tender Wang <[email protected]>; Alexander Lakhin <[email protected]>; Tomas Vondra <[email protected]>; Robert Haas <[email protected]>; Alvaro Herrera <[email protected]>; Andres Freund <[email protected]>; Daniel Gustafsson <[email protected]>; David Rowley <[email protected]>; pgsql-hackers; Thom Brown <[email protected]>

Hi Tom,

On Tue, May 20, 2025 at 12:06 PM Tom Lane <[email protected]> wrote:
> My attention was drawn to commit 525392d57 after observing that
> Valgrind complained about a memory leak in some code that commit added
> to BuildCachedPlan().  I tried to make sense of said code so I could
> remove the leak, and eventually arrived at the attached patch, which
> is part of a series of leak-fixing things hence the high sequence
> number.
>
> Unfortunately, the bad things I speculated about in the added comments
> seem to be reality.  The second attached file is a test case that
> triggers
>
> TRAP: failed Assert("list_length(plan_list) == list_length(plan->stmt_list)"), File: "plancache.c", Line: 1259, PID: 602087
>
> because it adds a DO ALSO rule that causes the rewriter to generate
> more PlannedStmts than it did before.
>
> This is quite awful, because it does more than simply break the klugy
> (and undocumented) business about keeping the top-level List in a
> different context.  What it means is that any outside code that is
> busy iterating that List is very fundamentally broken: it's not clear
> what List index it ought to resume at, except that "the one it was at"
> is demonstrably incorrect.
>
> I also don't really believe the (also undocumented) assumption that
> such outside code is in between executions of PlannedStmts of the
> List and hence can tolerate those being ripped out and replaced.
> I have not attempted to build an example, because the one I have
> seems sufficiently damning.  But I bet that a recursive function
> could be constructed in such a way that an outer execution is
> still in progress when an inner call triggers UpdateCachedPlan.
>
> Another small problem (much more easily fixable than the above,
> probably) is that summarily setting "plan->is_valid = true"
> at the end is not okay.  We could already have received an
> invalidation that should result in marking the plan stale.
> (Holding locks on the tables involved is not sufficient to
> prevent that, as there are other sources of inval events.)

Thanks for pointing out the hole in the current handling of
CachedPlan->stmt_list. You're right that the approach of preserving
the list structure while replacing its contents in-place doesn’t hold
up when the rewriter adds or removes statements dynamically. There
might be other cases that neither of us has tried.  I don’t think
that mechanism is salvageable.

To address the issue without needing a full revert, I’m considering
dropping UpdateCachedPlan() and removing the associated MemoryContext
dance that preserves the CachedPlan->stmt_list structure. Instead, the
executor would replan the necessary query into a transient list of
PlannedStmts, leaving the original CachedPlan untouched. That avoids
mutating shared plan state during execution and still enables deferred
locking in the vast majority of cases.
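
A rough sketch of that shape, assuming toy types rather than the real CachedPlan and executor; MiniStmtList, replan_transient() and execute_all() are hypothetical names, and placing the rule actions ahead of the original statement is an assumption made for illustration:

```c
#include <assert.h>

#define MAX_STMTS 8

/* Hypothetical stand-in for a list of PlannedStmts. */
typedef struct MiniStmtList
{
	int			length;
	int			stmt_ids[MAX_STMTS];
} MiniStmtList;

/*
 * Hypothetical replan: build a fresh list holding 'n_rule_actions' new
 * statements (ids 100, 101, ...) followed by the cached ones.  The cached
 * list is taken as const and is never modified.
 */
static MiniStmtList
replan_transient(const MiniStmtList *cached, int n_rule_actions)
{
	MiniStmtList fresh = {0, {0}};

	for (int i = 0; i < n_rule_actions && fresh.length < MAX_STMTS; i++)
		fresh.stmt_ids[fresh.length++] = 100 + i;
	for (int i = 0; i < cached->length && fresh.length < MAX_STMTS; i++)
		fresh.stmt_ids[fresh.length++] = cached->stmt_ids[i];
	return fresh;
}

/* Stand-in for the executor: "runs" each statement, returns how many ran. */
static int
execute_all(const MiniStmtList *list)
{
	int			ran = 0;

	for (int i = 0; i < list->length; i++)
		ran++;
	return ran;
}
```

Because the transient list is a separate value, code that is still iterating the cached list keeps seeing the same length and contents, and the current execution can run whatever the fresh list contains.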

There are two variants of this approach. In the simpler form, the
transient PlannedStmt list exists only in executor-local memory and
isn’t registered with the invalidation machinery. That might be
acceptable in practice, since all referenced relations are locked at
that point -- but it would mean any invalidation events delivered
during execution are ignored. The more robust variant is to build a
one-query standalone CachedPlan using something like
GetTransientCachedPlanForQuery(), which I had proposed back in [1].
This gets added to a standalone_plan_list so that invalidation
callbacks can still reach it. I dropped that design earlier [2] due to
the cleanup overhead, but I’d be happy to bring it back in a
simplified form if that seems preferable.

One open question in either case is what to do if the number of
PlannedStmts in the rewritten plan changes, as in your example. Would
it be reasonable to just go ahead and execute the additional
statements from the transient plan, even though the original
CachedPlan wouldn’t have known about them until the next use? That
would avoid introducing any new failure behavior while still handling
the invalidation correctly for the current execution.

> It's possible that this code can be fixed, but I fear it's
> going to involve some really fundamental redesign, which
> probably shouldn't be happening after beta1.  I think there
> is no alternative but to revert for v18.

...Beyond that, I think I’ve run out of clean options for making
deferred locking executor-local while keeping invalidation safe. I
know you'd previously objected (with good reason) to making
GetCachedPlan() itself run pruning logic to determine which partitions
to lock -- and to the idea of carrying or sharing the result of that
pruning back to the executor via interface changes in the path from
plancache.c through its callers down to ExecutorStart(). So I’ve
steered away from revisiting that direction. But if we’re not
comfortable with either of the transient replanning options, then we
may end up shelving the deferred locking idea entirely -- which would
be unfortunate, given how much it helps workloads that rely on generic
plans over large partitioned tables.

Let me know what you think -- I’ll hold off on posting a revert or a
replacement until we’ve agreed on the path forward.

-- 
Thanks, Amit Langote

[1] https://www.postgresql.org/message-id/CA%2BHiwqGSOge3eT3kcm_nxCSA3Ut%2Bd0jtchi8g8J9uXi-kyC7Jw%40mail...
[2] https://www.postgresql.org/message-id/CA%2BHiwqHRRFQN6yZ54fBydOTM6ncqZBCmewZ6n519RjRdDsO44g%40mail.g...






* Re: generic plans and "initial" pruning
@ 2025-05-20 15:38  Tom Lane <[email protected]>
  parent: Amit Langote <[email protected]>
  0 siblings, 2 replies; 108+ messages in thread

From: Tom Lane @ 2025-05-20 15:38 UTC (permalink / raw)
  To: Amit Langote <[email protected]>; +Cc: Tender Wang <[email protected]>; Alexander Lakhin <[email protected]>; Tomas Vondra <[email protected]>; Robert Haas <[email protected]>; Alvaro Herrera <[email protected]>; Andres Freund <[email protected]>; Daniel Gustafsson <[email protected]>; David Rowley <[email protected]>; pgsql-hackers; Thom Brown <[email protected]>

Amit Langote <[email protected]> writes:
> Thanks for pointing out the hole in the current handling of
> CachedPlan->stmt_list. You're right that the approach of preserving
> the list structure while replacing its contents in-place doesn’t hold
> up when the rewriter adds or removes statements dynamically. There
> might be other cases that neither of us has tried.  I don’t think
> that mechanism is salvageable.

> To address the issue without needing a full revert, I’m considering
> dropping UpdateCachedPlan() and removing the associated MemoryContext
> dance to preserve CachedPlan->stmt_list structure. Instead, the
> executor would replan the necessary query into a transient list of
> PlannedStmts, leaving the original CachedPlan untouched. That avoids
> mutating shared plan state during execution and still enables deferred
> locking in the vast majority of cases.

Yeah, I think messing with the CachedPlan is just fundamentally wrong.
It breaks the invariant that the executor should not scribble on what
it's handed --- maybe not as obviously as some other cases, but it's
still not a good design.

I kind of feel that we ought to take two steps back and think
about what it even means to have a generic plan in this situation.
Perhaps we should simply refuse to use that code path if there are
prunable partitioned tables involved?

> Let me know what you think -- I’ll hold off on posting a revert or a
> replacement until we’ve agreed on the path forward.

I had not looked at 525392d57 in any detail before (the claim in
the commit message that I reviewed it is a figment of someone's
imagination).  Now that I have, I'm still going to argue for revert.
Aside from the points above, I really hate what's been done to the
fundamental executor APIs.  The fact that ExecutorStart callers have
to know about this is as ugly as can be.  I also don't like the
fact that it's added overhead in cases where there can be no benefit
(notice that my test case doesn't even involve a partitioned table).

I still like the core idea of deferring locking, but I don't like
anything about this implementation of it.  It seems like there has
to be a better and simpler way.

			regards, tom lane






* Re: generic plans and "initial" pruning
@ 2025-05-21 10:22  Amit Langote <[email protected]>
  parent: Tom Lane <[email protected]>
  1 sibling, 1 reply; 108+ messages in thread

From: Amit Langote @ 2025-05-21 10:22 UTC (permalink / raw)
  To: Tom Lane <[email protected]>; +Cc: Tender Wang <[email protected]>; Alexander Lakhin <[email protected]>; Tomas Vondra <[email protected]>; Robert Haas <[email protected]>; Alvaro Herrera <[email protected]>; Andres Freund <[email protected]>; Daniel Gustafsson <[email protected]>; David Rowley <[email protected]>; pgsql-hackers; Thom Brown <[email protected]>

On Wed, May 21, 2025 at 12:38 AM Tom Lane <[email protected]> wrote:
> Amit Langote <[email protected]> writes:
> > Thanks for pointing out the hole in the current handling of
> > CachedPlan->stmt_list. You're right that the approach of preserving
> > the list structure while replacing its contents in-place doesn’t hold
> > up when the rewriter adds or removes statements dynamically. There
> > might be other cases that neither of us has tried.  I don’t think
> > that mechanism is salvageable.
>
> > To address the issue without needing a full revert, I’m considering
> > dropping UpdateCachedPlan() and removing the associated MemoryContext
> > dance to preserve CachedPlan->stmt_list structure. Instead, the
> > executor would replan the necessary query into a transient list of
> > PlannedStmts, leaving the original CachedPlan untouched. That avoids
> > mutating shared plan state during execution and still enables deferred
> > locking in the vast majority of cases.
>
> Yeah, I think messing with the CachedPlan is just fundamentally wrong.
> It breaks the invariant that the executor should not scribble on what
> it's handed --- maybe not as obviously as some other cases, but it's
> still not a good design.

Fair enough. I’ll revert this and some related changes shortly.  WIP
patch attached.

> I kind of feel that we ought to take two steps back and think
> about what it even means to have a generic plan in this situation.
> Perhaps we should simply refuse to use that code path if there are
> prunable partitioned tables involved?

Sorry, I’m not sure I fully understand -- especially what you mean by
“that code path.” If you're referring to the generic plan creation and
reuse path in general, I'd point out that initial runtime pruning was
introduced largely to improve the efficiency of generic plan execution
(albeit without addressing the locking bottleneck at the time -- David
Rowley had explored that earlier). So simply disallowing generic plans
when partitions are involved feels like an odd direction, given that a
major motivation for initial pruning was to make those cases faster.

Custom plans can win when parameters are available, of course, but
there's a major use case involving stable expressions like now() with
time-based partitions, where plan_cache_mode = auto will still choose
a generic plan. So I wouldn’t say that optimizing generic plan
execution -- and this project’s goal in particular -- is wasted effort
in practice.

> > Let me know what you think -- I’ll hold off on posting a revert or a
> > replacement until we’ve agreed on the path forward.
>
> I had not looked at 525392d57 in any detail before (the claim in
> the commit message that I reviewed it is a figment of someone's
> imagination).

Apologies if I gave the misleading impression that you were on board
with the current design. I meant only to acknowledge your earlier
engagement with the general idea, which I appreciated. I marked it as
“(old versions)” in the commit metadata to reflect that -- clearly I
should’ve been more precise.  I know that the meaning of Reviewed-by
and other tags is evolving and I clearly haven't kept up.

>  Now that I have, I'm still going to argue for revert.
> Aside from the points above, I really hate what's been done to the
> fundamental executor APIs.  The fact that ExecutorStart callers have
> to know about this is as ugly as can be.  I also don't like the
> fact that it's added overhead in cases where there can be no benefit
> (notice that my test case doesn't even involve a partitioned table).

I tried to keep the overhead low by ensuring that the only additional
thing we'd be doing in the regular path is a CachedPlan->is_valid
boolean check in a couple of places, and that further work would only
happen if invalidation actually occurred. That said, I realize the
patch makes invalidation handling apply in more cases than before,
which may itself be seen as added overhead. But I may have
misunderstood your concern -- perhaps it's more about the layering
violation than the raw cycles?

> I still like the core idea of deferring locking, but I don't like
> anything about this implementation of it.  It seems like there has
> to be a better and simpler way.

It's good to hear that you still like the core idea -- I’d really
appreciate it if you're willing to continue bearing with me as I try
to rework this in a way that's cleaner and better aligned with the
overall design. I'd welcome any thoughts you have along the way. I
know this has been a difficult project, and I don't mean to come
across as taking any of it lightly. I'm still hopeful there's a path
forward, but I completely understand the need to reset here.

-- 
Thanks, Amit Langote


Attachments:

  [application/octet-stream] v1-0001-Revert-Don-t-lock-partitions-pruned-by-initial-pr.patch (66.5K, 2-v1-0001-Revert-Don-t-lock-partitions-pruned-by-initial-pr.patch)
From 260d3fbf4801402f1a2ffd947f1f05fd3cad6878 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Wed, 21 May 2025 18:46:52 +0900
Subject: [PATCH v1] Revert "Don't lock partitions pruned by initial pruning"

As pointed out by Tom Lane, the patch introduced a fragile and invasive
design around plan invalidation handling when locking of prunable
partitions was deferred from plancache.c to the executor. In
particular, it violated assumptions about CachedPlan immutability and
altered executor APIs in ways that are difficult to justify given the
added complexity and overhead.

This also removes the firstResultRels field added to PlannedStmt in
commit 28317de72, which was intended to support deferred locking of
certain ModifyTable result relations.

Reported-by: Tom Lane <[email protected]>
Discussion: https://postgr.es/m/[email protected]
---
 contrib/auto_explain/auto_explain.c           |  16 +-
 .../pg_stat_statements/pg_stat_statements.c   |  16 +-
 src/backend/commands/copyto.c                 |   5 +-
 src/backend/commands/createas.c               |   5 +-
 src/backend/commands/explain.c                |  22 +-
 src/backend/commands/extension.c              |   4 +-
 src/backend/commands/matview.c                |   5 +-
 src/backend/commands/portalcmds.c             |   1 -
 src/backend/commands/prepare.c                |   9 +-
 src/backend/commands/trigger.c                |  15 --
 src/backend/executor/README                   |  35 +---
 src/backend/executor/execMain.c               | 127 +----------
 src/backend/executor/execParallel.c           |  12 +-
 src/backend/executor/execPartition.c          |  67 +-----
 src/backend/executor/execUtils.c              |   1 -
 src/backend/executor/functions.c              |   4 +-
 src/backend/executor/spi.c                    |  29 +--
 src/backend/optimizer/plan/planner.c          |   2 -
 src/backend/optimizer/plan/setrefs.c          |   3 -
 src/backend/tcop/postgres.c                   |   4 +-
 src/backend/tcop/pquery.c                     |  51 +----
 src/backend/utils/cache/plancache.c           | 197 +++---------------
 src/backend/utils/mmgr/portalmem.c            |   4 +-
 src/include/commands/explain.h                |   6 +-
 src/include/commands/trigger.h                |   1 -
 src/include/executor/execdesc.h               |   2 -
 src/include/executor/executor.h               |  33 +--
 src/include/nodes/execnodes.h                 |   3 -
 src/include/nodes/pathnodes.h                 |   3 -
 src/include/nodes/plannodes.h                 |   7 -
 src/include/utils/plancache.h                 |  46 +---
 src/include/utils/portal.h                    |   4 +-
 32 files changed, 88 insertions(+), 651 deletions(-)

diff --git a/contrib/auto_explain/auto_explain.c b/contrib/auto_explain/auto_explain.c
index cd6625020a7..1f4badb4928 100644
--- a/contrib/auto_explain/auto_explain.c
+++ b/contrib/auto_explain/auto_explain.c
@@ -81,7 +81,7 @@ static ExecutorRun_hook_type prev_ExecutorRun = NULL;
 static ExecutorFinish_hook_type prev_ExecutorFinish = NULL;
 static ExecutorEnd_hook_type prev_ExecutorEnd = NULL;
 
-static bool explain_ExecutorStart(QueryDesc *queryDesc, int eflags);
+static void explain_ExecutorStart(QueryDesc *queryDesc, int eflags);
 static void explain_ExecutorRun(QueryDesc *queryDesc,
 								ScanDirection direction,
 								uint64 count);
@@ -261,11 +261,9 @@ _PG_init(void)
 /*
  * ExecutorStart hook: start up logging if needed
  */
-static bool
+static void
 explain_ExecutorStart(QueryDesc *queryDesc, int eflags)
 {
-	bool		plan_valid;
-
 	/*
 	 * At the beginning of each top-level statement, decide whether we'll
 	 * sample this statement.  If nested-statement explaining is enabled,
@@ -301,13 +299,9 @@ explain_ExecutorStart(QueryDesc *queryDesc, int eflags)
 	}
 
 	if (prev_ExecutorStart)
-		plan_valid = prev_ExecutorStart(queryDesc, eflags);
+		prev_ExecutorStart(queryDesc, eflags);
 	else
-		plan_valid = standard_ExecutorStart(queryDesc, eflags);
-
-	/* The plan may have become invalid during standard_ExecutorStart() */
-	if (!plan_valid)
-		return false;
+		standard_ExecutorStart(queryDesc, eflags);
 
 	if (auto_explain_enabled())
 	{
@@ -325,8 +319,6 @@ explain_ExecutorStart(QueryDesc *queryDesc, int eflags)
 			MemoryContextSwitchTo(oldcxt);
 		}
 	}
-
-	return true;
 }
 
 /*
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 9778407cba3..d8fdf42df79 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -335,7 +335,7 @@ static PlannedStmt *pgss_planner(Query *parse,
 								 const char *query_string,
 								 int cursorOptions,
 								 ParamListInfo boundParams);
-static bool pgss_ExecutorStart(QueryDesc *queryDesc, int eflags);
+static void pgss_ExecutorStart(QueryDesc *queryDesc, int eflags);
 static void pgss_ExecutorRun(QueryDesc *queryDesc,
 							 ScanDirection direction,
 							 uint64 count);
@@ -989,19 +989,13 @@ pgss_planner(Query *parse,
 /*
  * ExecutorStart hook: start up tracking if needed
  */
-static bool
+static void
 pgss_ExecutorStart(QueryDesc *queryDesc, int eflags)
 {
-	bool		plan_valid;
-
 	if (prev_ExecutorStart)
-		plan_valid = prev_ExecutorStart(queryDesc, eflags);
+		prev_ExecutorStart(queryDesc, eflags);
 	else
-		plan_valid = standard_ExecutorStart(queryDesc, eflags);
-
-	/* The plan may have become invalid during standard_ExecutorStart() */
-	if (!plan_valid)
-		return false;
+		standard_ExecutorStart(queryDesc, eflags);
 
 	/*
 	 * If query has queryId zero, don't track it.  This prevents double
@@ -1024,8 +1018,6 @@ pgss_ExecutorStart(QueryDesc *queryDesc, int eflags)
 			MemoryContextSwitchTo(oldcxt);
 		}
 	}
-
-	return true;
 }
 
 /*
diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index f87e405351d..ea6f18f2c80 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -835,7 +835,7 @@ BeginCopyTo(ParseState *pstate,
 		((DR_copy *) dest)->cstate = cstate;
 
 		/* Create a QueryDesc requesting no output */
-		cstate->queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
+		cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
 											dest, NULL, NULL, 0);
@@ -845,8 +845,7 @@ BeginCopyTo(ParseState *pstate,
 		 *
 		 * ExecutorStart computes a result tupdesc for us
 		 */
-		if (!ExecutorStart(cstate->queryDesc, 0))
-			elog(ERROR, "ExecutorStart() failed unexpectedly");
+		ExecutorStart(cstate->queryDesc, 0);
 
 		tupDesc = cstate->queryDesc->tupDesc;
 	}
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 0a4155773eb..dfd2ab8e862 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -334,13 +334,12 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
 		UpdateActiveSnapshotCommandId();
 
 		/* Create a QueryDesc, redirecting output to our tuple receiver */
-		queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
+		queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
 									GetActiveSnapshot(), InvalidSnapshot,
 									dest, params, queryEnv, 0);
 
 		/* call ExecutorStart to prepare the plan for execution */
-		if (!ExecutorStart(queryDesc, GetIntoRelEFlags(into)))
-			elog(ERROR, "ExecutorStart() failed unexpectedly");
+		ExecutorStart(queryDesc, GetIntoRelEFlags(into));
 
 		/* run the plan to completion */
 		ExecutorRun(queryDesc, ForwardScanDirection, 0);
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 786ee865f14..09ea30dfb92 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -369,8 +369,7 @@ standard_ExplainOneQuery(Query *query, int cursorOptions,
 	}
 
 	/* run it (if needed) and produce output */
-	ExplainOnePlan(plan, NULL, NULL, -1, into, es, queryString, params,
-				   queryEnv,
+	ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
 				   &planduration, (es->buffers ? &bufusage : NULL),
 				   es->memory ? &mem_counters : NULL);
 }
@@ -492,9 +491,7 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
  * to call it.
  */
 void
-ExplainOnePlan(PlannedStmt *plannedstmt, CachedPlan *cplan,
-			   CachedPlanSource *plansource, int query_index,
-			   IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
 			   const char *queryString, ParamListInfo params,
 			   QueryEnvironment *queryEnv, const instr_time *planduration,
 			   const BufferUsage *bufusage,
@@ -550,7 +547,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, CachedPlan *cplan,
 		dest = None_Receiver;
 
 	/* Create a QueryDesc for the query */
-	queryDesc = CreateQueryDesc(plannedstmt, cplan, queryString,
+	queryDesc = CreateQueryDesc(plannedstmt, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, params, queryEnv, instrument_option);
 
@@ -564,17 +561,8 @@ ExplainOnePlan(PlannedStmt *plannedstmt, CachedPlan *cplan,
 	if (into)
 		eflags |= GetIntoRelEFlags(into);
 
-	/* Prepare the plan for execution. */
-	if (queryDesc->cplan)
-	{
-		ExecutorStartCachedPlan(queryDesc, eflags, plansource, query_index);
-		Assert(queryDesc->planstate);
-	}
-	else
-	{
-		if (!ExecutorStart(queryDesc, eflags))
-			elog(ERROR, "ExecutorStart() failed unexpectedly");
-	}
+	/* call ExecutorStart to prepare the plan for execution */
+	ExecutorStart(queryDesc, eflags);
 
 	/* Execute the plan for statistics if asked for */
 	if (es->analyze)
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index 73c52e970f6..e6f9ab6dfd6 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -993,13 +993,11 @@ execute_sql_string(const char *sql, const char *filename)
 				QueryDesc  *qdesc;
 
 				qdesc = CreateQueryDesc(stmt,
-										NULL,
 										sql,
 										GetActiveSnapshot(), NULL,
 										dest, NULL, NULL, 0);
 
-				if (!ExecutorStart(qdesc, 0))
-					elog(ERROR, "ExecutorStart() failed unexpectedly");
+				ExecutorStart(qdesc, 0);
 				ExecutorRun(qdesc, ForwardScanDirection, 0);
 				ExecutorFinish(qdesc);
 				ExecutorEnd(qdesc);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index e7854add178..27c2cb26ef5 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -438,13 +438,12 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
 	UpdateActiveSnapshotCommandId();
 
 	/* Create a QueryDesc, redirecting output to our tuple receiver */
-	queryDesc = CreateQueryDesc(plan, NULL, queryString,
+	queryDesc = CreateQueryDesc(plan, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, NULL, NULL, 0);
 
 	/* call ExecutorStart to prepare the plan for execution */
-	if (!ExecutorStart(queryDesc, 0))
-		elog(ERROR, "ExecutorStart() failed unexpectedly");
+	ExecutorStart(queryDesc, 0);
 
 	/* run the plan */
 	ExecutorRun(queryDesc, ForwardScanDirection, 0);
diff --git a/src/backend/commands/portalcmds.c b/src/backend/commands/portalcmds.c
index 4c2ac045224..e7c8171c102 100644
--- a/src/backend/commands/portalcmds.c
+++ b/src/backend/commands/portalcmds.c
@@ -117,7 +117,6 @@ PerformCursorOpen(ParseState *pstate, DeclareCursorStmt *cstmt, ParamListInfo pa
 					  queryString,
 					  CMDTAG_SELECT,	/* cursor's query is always a SELECT */
 					  list_make1(plan),
-					  NULL,
 					  NULL);
 
 	/*----------
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index bf7d2b2309f..34b6410d6a2 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -205,8 +205,7 @@ ExecuteQuery(ParseState *pstate,
 					  query_string,
 					  entry->plansource->commandTag,
 					  plan_list,
-					  cplan,
-					  entry->plansource);
+					  cplan);
 
 	/*
 	 * For CREATE TABLE ... AS EXECUTE, we must verify that the prepared
@@ -586,7 +585,6 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	MemoryContextCounters mem_counters;
 	MemoryContext planner_ctx = NULL;
 	MemoryContext saved_ctx = NULL;
-	int			query_index = 0;
 
 	if (es->memory)
 	{
@@ -659,8 +657,7 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 		PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
 
 		if (pstmt->commandType != CMD_UTILITY)
-			ExplainOnePlan(pstmt, cplan, entry->plansource, query_index,
-						   into, es, query_string, paramLI, pstate->p_queryEnv,
+			ExplainOnePlan(pstmt, into, es, query_string, paramLI, pstate->p_queryEnv,
 						   &planduration, (es->buffers ? &bufusage : NULL),
 						   es->memory ? &mem_counters : NULL);
 		else
@@ -671,8 +668,6 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 		/* Separate plans with an appropriate separator */
 		if (lnext(plan_list, p) != NULL)
 			ExplainSeparatePlans(es);
-
-		query_index++;
 	}
 
 	if (estate)
diff --git a/src/backend/commands/trigger.c b/src/backend/commands/trigger.c
index c9f61130c69..67f8e70f9c1 100644
--- a/src/backend/commands/trigger.c
+++ b/src/backend/commands/trigger.c
@@ -5057,21 +5057,6 @@ AfterTriggerBeginQuery(void)
 }
 
 
-/* ----------
- * AfterTriggerAbortQuery()
- *
- * Called by standard_ExecutorEnd() if the query execution was aborted due to
- * the plan becoming invalid during initialization.
- * ----------
- */
-void
-AfterTriggerAbortQuery(void)
-{
-	/* Revert the actions of AfterTriggerBeginQuery(). */
-	afterTriggers.query_depth--;
-}
-
-
 /* ----------
  * AfterTriggerEndQuery()
  *
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 02745c23ed9..54f4782f31b 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -285,28 +285,6 @@ are typically reset to empty once per tuple.  Per-tuple contexts are usually
 associated with ExprContexts, and commonly each PlanState node has its own
 ExprContext to evaluate its qual and targetlist expressions in.
 
-Relation Locking
-----------------
-
-When the executor initializes a plan tree for execution, it doesn't lock
-non-index relations if the plan tree is freshly generated and not derived
-from a CachedPlan. This is because such locks have already been established
-during the query's parsing, rewriting, and planning phases. However, with a
-cached plan tree, some relations may remain unlocked. The function
-AcquireExecutorLocks() only locks unprunable relations in the plan, deferring
-the locking of prunable ones to executor initialization. This avoids
-unnecessary locking of relations that will be pruned during "initial" runtime
-pruning in ExecDoInitialPruning().
-
-This approach creates a window where a cached plan tree with child tables
-could become outdated if another backend modifies these tables before
-ExecDoInitialPruning() locks them. As a result, the executor has the added duty
-to verify the plan tree's validity whenever it locks a child table after
-doing initial pruning. This validation is done by checking the CachedPlan.is_valid
-flag. If the plan tree is outdated (is_valid = false), the executor stops
-further initialization, cleans up anything in EState that would have been
-allocated up to that point, and retries execution after recreating the
-invalid plan in the CachedPlan.  See ExecutorStartCachedPlan().
 
 Query Processing Control Flow
 -----------------------------
@@ -315,13 +293,11 @@ This is a sketch of control flow for full query processing:
 
 	CreateQueryDesc
 
-	ExecutorStart or ExecutorStartCachedPlan
+	ExecutorStart
 		CreateExecutorState
 			creates per-query context
-		switch to per-query context to run ExecDoInitialPruning and ExecInitNode
+		switch to per-query context to run ExecInitNode
 		AfterTriggerBeginQuery
-		ExecDoInitialPruning
-			does initial pruning and locks surviving partitions if needed
 		ExecInitNode --- recursively scans plan tree
 			ExecInitNode
 				recurse into subsidiary nodes
@@ -345,12 +321,7 @@ This is a sketch of control flow for full query processing:
 
 	FreeQueryDesc
 
-As mentioned in the "Relation Locking" section, if the plan tree is found to
-be stale after locking partitions in ExecDoInitialPruning(), the control is
-immediately returned to ExecutorStartCachedPlan(), which will create a new plan
-tree and perform the steps starting from CreateExecutorState() again.
-
-Per above comments, it's not really critical for ExecEndPlan to free any
+Per above comments, it's not really critical for ExecEndNode to free any
 memory; it'll all go away in FreeExecutorState anyway.  However, we do need to
 be careful to close relations, drop buffer pins, etc, so we do need to scan
 the plan state tree to find these sorts of resources.
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 7230f968101..0391798dd2c 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -55,13 +55,11 @@
 #include "parser/parse_relation.h"
 #include "pgstat.h"
 #include "rewrite/rewriteHandler.h"
-#include "storage/lmgr.h"
 #include "tcop/utility.h"
 #include "utils/acl.h"
 #include "utils/backend_status.h"
 #include "utils/lsyscache.h"
 #include "utils/partcache.h"
-#include "utils/plancache.h"
 #include "utils/rls.h"
 #include "utils/snapmgr.h"
 
@@ -119,16 +117,11 @@ static void ReportNotNullViolationError(ResultRelInfo *resultRelInfo,
  * get control when ExecutorStart is called.  Such a plugin would
  * normally call standard_ExecutorStart().
  *
- * Return value indicates if the plan has been initialized successfully so
- * that queryDesc->planstate contains a valid PlanState tree.  It may not
- * if the plan got invalidated during InitPlan().
  * ----------------------------------------------------------------
  */
-bool
+void
 ExecutorStart(QueryDesc *queryDesc, int eflags)
 {
-	bool		plan_valid;
-
 	/*
 	 * In some cases (e.g. an EXECUTE statement or an execute message with the
 	 * extended query protocol) the query_id won't be reported, so do it now.
@@ -140,14 +133,12 @@ ExecutorStart(QueryDesc *queryDesc, int eflags)
 	pgstat_report_query_id(queryDesc->plannedstmt->queryId, false);
 
 	if (ExecutorStart_hook)
-		plan_valid = (*ExecutorStart_hook) (queryDesc, eflags);
+		(*ExecutorStart_hook) (queryDesc, eflags);
 	else
-		plan_valid = standard_ExecutorStart(queryDesc, eflags);
-
-	return plan_valid;
+		standard_ExecutorStart(queryDesc, eflags);
 }
 
-bool
+void
 standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
 {
 	EState	   *estate;
@@ -271,64 +262,6 @@ standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
 	InitPlan(queryDesc, eflags);
 
 	MemoryContextSwitchTo(oldcontext);
-
-	return ExecPlanStillValid(queryDesc->estate);
-}
-
-/*
- * ExecutorStartCachedPlan
- *		Start execution for a given query in the CachedPlanSource, replanning
- *		if the plan is invalidated due to deferred locks taken during the
- *		plan's initialization
- *
- * This function handles cases where the CachedPlan given in queryDesc->cplan
- * might become invalid during the initialization of the plan given in
- * queryDesc->plannedstmt, particularly when prunable relations in it are
- * locked after performing initial pruning. If the locks invalidate the plan,
- * the function calls UpdateCachedPlan() to replan all queries in the
- * CachedPlan, and then retries initialization.
- *
- * The function repeats the process until ExecutorStart() successfully
- * initializes the plan, that is without the CachedPlan becoming invalid.
- */
-void
-ExecutorStartCachedPlan(QueryDesc *queryDesc, int eflags,
-						CachedPlanSource *plansource,
-						int query_index)
-{
-	if (unlikely(queryDesc->cplan == NULL))
-		elog(ERROR, "ExecutorStartCachedPlan(): missing CachedPlan");
-	if (unlikely(plansource == NULL))
-		elog(ERROR, "ExecutorStartCachedPlan(): missing CachedPlanSource");
-
-	/*
-	 * Loop and retry with an updated plan until no further invalidation
-	 * occurs.
-	 */
-	while (1)
-	{
-		if (!ExecutorStart(queryDesc, eflags))
-		{
-			/*
-			 * Clean up the current execution state before creating the new
-			 * plan to retry ExecutorStart().  Mark execution as aborted to
-			 * ensure that AFTER trigger state is properly reset.
-			 */
-			queryDesc->estate->es_aborted = true;
-			ExecutorEnd(queryDesc);
-
-			/* Retry ExecutorStart() with an updated plan tree. */
-			queryDesc->plannedstmt = UpdateCachedPlan(plansource, query_index,
-													  queryDesc->queryEnv);
-		}
-		else
-
-			/*
-			 * Exit the loop if the plan is initialized successfully and no
-			 * sinval messages were received that invalidated the CachedPlan.
-			 */
-			break;
-	}
 }
 
 /* ----------------------------------------------------------------
@@ -387,7 +320,6 @@ standard_ExecutorRun(QueryDesc *queryDesc,
 	estate = queryDesc->estate;
 
 	Assert(estate != NULL);
-	Assert(!estate->es_aborted);
 	Assert(!(estate->es_top_eflags & EXEC_FLAG_EXPLAIN_ONLY));
 
 	/* caller must ensure the query's snapshot is active */
@@ -494,11 +426,8 @@ standard_ExecutorFinish(QueryDesc *queryDesc)
 	Assert(estate != NULL);
 	Assert(!(estate->es_top_eflags & EXEC_FLAG_EXPLAIN_ONLY));
 
-	/*
-	 * This should be run once and only once per Executor instance and never
-	 * if the execution was aborted.
-	 */
-	Assert(!estate->es_finished && !estate->es_aborted);
+	/* This should be run once and only once per Executor instance */
+	Assert(!estate->es_finished);
 
 	/* Switch into per-query memory context */
 	oldcontext = MemoryContextSwitchTo(estate->es_query_cxt);
@@ -561,10 +490,11 @@ standard_ExecutorEnd(QueryDesc *queryDesc)
 											 (PgStat_Counter) estate->es_parallel_workers_launched);
 
 	/*
-	 * Check that ExecutorFinish was called, unless in EXPLAIN-only mode or if
-	 * execution was aborted.
+	 * Check that ExecutorFinish was called, unless in EXPLAIN-only mode. This
+	 * Assert is needed because ExecutorFinish is new as of 9.1, and callers
+	 * might forget to call it.
 	 */
-	Assert(estate->es_finished || estate->es_aborted ||
+	Assert(estate->es_finished ||
 		   (estate->es_top_eflags & EXEC_FLAG_EXPLAIN_ONLY));
 
 	/*
@@ -578,14 +508,6 @@ standard_ExecutorEnd(QueryDesc *queryDesc)
 	UnregisterSnapshot(estate->es_snapshot);
 	UnregisterSnapshot(estate->es_crosscheck_snapshot);
 
-	/*
-	 * Reset AFTER trigger module if the query execution was aborted.
-	 */
-	if (estate->es_aborted &&
-		!(estate->es_top_eflags &
-		  (EXEC_FLAG_SKIP_TRIGGERS | EXEC_FLAG_EXPLAIN_ONLY)))
-		AfterTriggerAbortQuery();
-
 	/*
 	 * Must switch out of context before destroying it
 	 */
@@ -684,21 +606,6 @@ ExecCheckPermissions(List *rangeTable, List *rteperminfos,
 				   (rte->rtekind == RTE_SUBQUERY &&
 					rte->relkind == RELKIND_VIEW));
 
-			/*
-			 * Ensure that we have at least an AccessShareLock on relations
-			 * whose permissions need to be checked.
-			 *
-			 * Skip this check in a parallel worker because locks won't be
-			 * taken until ExecInitNode() performs plan initialization.
-			 *
-			 * XXX: ExecCheckPermissions() in a parallel worker may be
-			 * redundant with the checks done in the leader process, so this
-			 * should be reviewed to ensure it’s necessary.
-			 */
-			Assert(IsParallelWorker() ||
-				   CheckRelationOidLockedByMe(rte->relid, AccessShareLock,
-											  true));
-
 			(void) getRTEPermissionInfo(rteperminfos, rte);
 			/* Many-to-one mapping not allowed */
 			Assert(!bms_is_member(rte->perminfoindex, indexset));
@@ -924,12 +831,6 @@ ExecCheckXactReadOnly(PlannedStmt *plannedstmt)
  *
  *		Initializes the query plan: open files, allocate storage
  *		and start up the rule manager
- *
- *		If the plan originates from a CachedPlan (given in queryDesc->cplan),
- *		it can become invalid during runtime "initial" pruning when the
- *		remaining set of locks is taken.  The function returns early in that
- *		case without initializing the plan, and the caller is expected to
- *		retry with a new valid plan.
  * ----------------------------------------------------------------
  */
 static void
@@ -937,7 +838,6 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 {
 	CmdType		operation = queryDesc->operation;
 	PlannedStmt *plannedstmt = queryDesc->plannedstmt;
-	CachedPlan *cachedplan = queryDesc->cplan;
 	Plan	   *plan = plannedstmt->planTree;
 	List	   *rangeTable = plannedstmt->rtable;
 	EState	   *estate = queryDesc->estate;
@@ -958,7 +858,6 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 					   bms_copy(plannedstmt->unprunableRelids));
 
 	estate->es_plannedstmt = plannedstmt;
-	estate->es_cachedplan = cachedplan;
 	estate->es_part_prune_infos = plannedstmt->partPruneInfos;
 
 	/*
@@ -972,9 +871,6 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 	 */
 	ExecDoInitialPruning(estate);
 
-	if (!ExecPlanStillValid(estate))
-		return;
-
 	/*
 	 * Next, build the ExecRowMark array from the PlanRowMark(s), if any.
 	 */
@@ -3092,9 +2988,6 @@ EvalPlanQualStart(EPQState *epqstate, Plan *planTree)
 	 * the snapshot, rangetable, and external Param info.  They need their own
 	 * copies of local state, including a tuple table, es_param_exec_vals,
 	 * result-rel info, etc.
-	 *
-	 * es_cachedplan is not copied because EPQ plan execution does not acquire
-	 * any new locks that could invalidate the CachedPlan.
 	 */
 	rcestate->es_direction = ForwardScanDirection;
 	rcestate->es_snapshot = parentestate->es_snapshot;
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index 39c990ae638..f3e77bda279 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -1278,15 +1278,8 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 	paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
 	paramLI = RestoreParamList(&paramspace);
 
-	/*
-	 * Create a QueryDesc for the query.  We pass NULL for cachedplan, because
-	 * we don't have a pointer to the CachedPlan in the leader's process. It's
-	 * fine because the only reason the executor needs to see it is to decide
-	 * if it should take locks on certain relations, but parallel workers
-	 * always take locks anyway.
-	 */
+	/* Create a QueryDesc for the query. */
 	return CreateQueryDesc(pstmt,
-						   NULL,
 						   queryString,
 						   GetActiveSnapshot(), InvalidSnapshot,
 						   receiver, paramLI, NULL, instrument_options);
@@ -1471,8 +1464,7 @@ ParallelQueryMain(dsm_segment *seg, shm_toc *toc)
 
 	/* Start up the executor */
 	queryDesc->plannedstmt->jitFlags = fpes->jit_flags;
-	if (!ExecutorStart(queryDesc, fpes->eflags))
-		elog(ERROR, "ExecutorStart() failed unexpectedly");
+	ExecutorStart(queryDesc, fpes->eflags);
 
 	/* Special executor initialization steps for parallel workers */
 	queryDesc->planstate->state->es_query_dsa = area;
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 3f8a4cb5244..3299db22bd5 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -26,7 +26,6 @@
 #include "partitioning/partdesc.h"
 #include "partitioning/partprune.h"
 #include "rewrite/rewriteManip.h"
-#include "storage/lmgr.h"
 #include "utils/acl.h"
 #include "utils/lsyscache.h"
 #include "utils/partcache.h"
@@ -1771,8 +1770,7 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  * ExecDoInitialPruning:
  *		Perform runtime "initial" pruning, if necessary, to determine the set
  *		of child subnodes that need to be initialized during ExecInitNode() for
- *		all plan nodes that contain a PartitionPruneInfo.  This also locks the
- *		leaf partitions whose subnodes will be initialized if needed.
+ *		all plan nodes that contain a PartitionPruneInfo.
  *
  * ExecInitPartitionExecPruning:
  *		Updates the PartitionPruneState found at given part_prune_index in
@@ -1793,13 +1791,11 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *-------------------------------------------------------------------------
  */
 
-
 /*
  * ExecDoInitialPruning
  *		Perform runtime "initial" pruning, if necessary, to determine the set
  *		of child subnodes that need to be initialized during ExecInitNode() for
- *		plan nodes that support partition pruning.  This also locks the leaf
- *		partitions whose subnodes will be initialized if needed.
+ *		plan nodes that support partition pruning.
  *
  * This function iterates over each PartitionPruneInfo entry in
  * estate->es_part_prune_infos. For each entry, it creates a PartitionPruneState
@@ -1821,9 +1817,7 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
 void
 ExecDoInitialPruning(EState *estate)
 {
-	PlannedStmt *stmt = estate->es_plannedstmt;
 	ListCell   *lc;
-	List	   *locked_relids = NIL;
 
 	foreach(lc, estate->es_part_prune_infos)
 	{
@@ -1849,68 +1843,11 @@ ExecDoInitialPruning(EState *estate)
 		else
 			validsubplan_rtis = all_leafpart_rtis;
 
-		if (ExecShouldLockRelations(estate))
-		{
-			int			rtindex = -1;
-
-			while ((rtindex = bms_next_member(validsubplan_rtis,
-											  rtindex)) >= 0)
-			{
-				RangeTblEntry *rte = exec_rt_fetch(rtindex, estate);
-
-				Assert(rte->rtekind == RTE_RELATION &&
-					   rte->rellockmode != NoLock);
-				LockRelationOid(rte->relid, rte->rellockmode);
-				locked_relids = lappend_int(locked_relids, rtindex);
-			}
-		}
 		estate->es_unpruned_relids = bms_add_members(estate->es_unpruned_relids,
 													 validsubplan_rtis);
 		estate->es_part_prune_results = lappend(estate->es_part_prune_results,
 												validsubplans);
 	}
-
-	/*
-	 * Lock the first result relation of each ModifyTable node, even if it was
-	 * pruned.  This is required for ExecInitModifyTable(), which keeps its
-	 * first result relation if all other result relations have been pruned,
-	 * because some executor paths (e.g., in nodeModifyTable.c and
-	 * execPartition.c) rely on there being at least one result relation.
-	 *
-	 * There's room for improvement here --- we actually only need to do this
-	 * if all other result relations of the ModifyTable node were pruned, but
-	 * we don't have an easy way to tell that here.
-	 */
-	if (stmt->resultRelations && ExecShouldLockRelations(estate))
-	{
-		foreach(lc, stmt->firstResultRels)
-		{
-			Index		firstResultRel = lfirst_int(lc);
-
-			if (!bms_is_member(firstResultRel, estate->es_unpruned_relids))
-			{
-				RangeTblEntry *rte = exec_rt_fetch(firstResultRel, estate);
-
-				Assert(rte->rtekind == RTE_RELATION && rte->rellockmode != NoLock);
-				LockRelationOid(rte->relid, rte->rellockmode);
-				locked_relids = lappend_int(locked_relids, firstResultRel);
-			}
-		}
-	}
-
-	/*
-	 * Release the useless locks if the plan won't be executed.  This is the
-	 * same as what CheckCachedPlan() in plancache.c does.
-	 */
-	if (!ExecPlanStillValid(estate))
-	{
-		foreach(lc, locked_relids)
-		{
-			RangeTblEntry *rte = exec_rt_fetch(lfirst_int(lc), estate);
-
-			UnlockRelationOid(rte->relid, rte->rellockmode);
-		}
-	}
 }
 
 /*
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 772c86e70e9..fdc65c2b42b 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -147,7 +147,6 @@ CreateExecutorState(void)
 	estate->es_top_eflags = 0;
 	estate->es_instrument = 0;
 	estate->es_finished = false;
-	estate->es_aborted = false;
 
 	estate->es_exprcontexts = NIL;
 
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index 8d4d062d579..b1f9c17f98a 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -1338,7 +1338,6 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
 		dest = None_Receiver;
 
 	es->qd = CreateQueryDesc(es->stmt,
-							 NULL,
 							 fcache->func->src,
 							 GetActiveSnapshot(),
 							 InvalidSnapshot,
@@ -1363,8 +1362,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
 			eflags = EXEC_FLAG_SKIP_TRIGGERS;
 		else
 			eflags = 0;			/* default run-to-completion flags */
-		if (!ExecutorStart(es->qd, eflags))
-			elog(ERROR, "ExecutorStart() failed unexpectedly");
+		ExecutorStart(es->qd, eflags);
 	}
 
 	es->status = F_EXEC_RUN;
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 3288396def3..ecb2e4ccaa1 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -70,8 +70,7 @@ static int	_SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 static ParamListInfo _SPI_convert_params(int nargs, Oid *argtypes,
 										 Datum *Values, const char *Nulls);
 
-static int	_SPI_pquery(QueryDesc *queryDesc, bool fire_triggers, uint64 tcount,
-						CachedPlanSource *plansource, int query_index);
+static int	_SPI_pquery(QueryDesc *queryDesc, bool fire_triggers, uint64 tcount);
 
 static void _SPI_error_callback(void *arg);
 
@@ -1686,8 +1685,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 					  query_string,
 					  plansource->commandTag,
 					  stmt_list,
-					  cplan,
-					  plansource);
+					  cplan);
 
 	/*
 	 * Set up options for portal.  Default SCROLL type is chosen the same way
@@ -2502,7 +2500,6 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 		CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
 		List	   *stmt_list;
 		ListCell   *lc2;
-		int			query_index = 0;
 
 		spicallbackarg.query = plansource->query_string;
 
@@ -2693,16 +2690,14 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 					snap = InvalidSnapshot;
 
 				qdesc = CreateQueryDesc(stmt,
-										cplan,
 										plansource->query_string,
 										snap, crosscheck_snapshot,
 										dest,
 										options->params,
 										_SPI_current->queryEnv,
 										0);
-
-				res = _SPI_pquery(qdesc, fire_triggers, canSetTag ? options->tcount : 0,
-								  plansource, query_index);
+				res = _SPI_pquery(qdesc, fire_triggers,
+								  canSetTag ? options->tcount : 0);
 				FreeQueryDesc(qdesc);
 			}
 			else
@@ -2799,8 +2794,6 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 				my_res = res;
 				goto fail;
 			}
-
-			query_index++;
 		}
 
 		/* Done with this plan, so release refcount */
@@ -2878,8 +2871,7 @@ _SPI_convert_params(int nargs, Oid *argtypes,
 }
 
 static int
-_SPI_pquery(QueryDesc *queryDesc, bool fire_triggers, uint64 tcount,
-			CachedPlanSource *plansource, int query_index)
+_SPI_pquery(QueryDesc *queryDesc, bool fire_triggers, uint64 tcount)
 {
 	int			operation = queryDesc->operation;
 	int			eflags;
@@ -2935,16 +2927,7 @@ _SPI_pquery(QueryDesc *queryDesc, bool fire_triggers, uint64 tcount,
 	else
 		eflags = EXEC_FLAG_SKIP_TRIGGERS;
 
-	if (queryDesc->cplan)
-	{
-		ExecutorStartCachedPlan(queryDesc, eflags, plansource, query_index);
-		Assert(queryDesc->planstate);
-	}
-	else
-	{
-		if (!ExecutorStart(queryDesc, eflags))
-			elog(ERROR, "ExecutorStart() failed unexpectedly");
-	}
+	ExecutorStart(queryDesc, eflags);
 
 	ExecutorRun(queryDesc, ForwardScanDirection, tcount);
 
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 49ad6e83578..ff65867eebe 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -331,7 +331,6 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 	glob->finalrteperminfos = NIL;
 	glob->finalrowmarks = NIL;
 	glob->resultRelations = NIL;
-	glob->firstResultRels = NIL;
 	glob->appendRelations = NIL;
 	glob->partPruneInfos = NIL;
 	glob->relationOids = NIL;
@@ -571,7 +570,6 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 											  glob->prunableRelids);
 	result->permInfos = glob->finalrteperminfos;
 	result->resultRelations = glob->resultRelations;
-	result->firstResultRels = glob->firstResultRels;
 	result->appendRelations = glob->appendRelations;
 	result->subplans = glob->subplans;
 	result->rewindPlanIDs = glob->rewindPlanIDs;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 150e9f060ee..999a5a8ab5a 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -1248,9 +1248,6 @@ set_plan_refs(PlannerInfo *root, Plan *plan, int rtoffset)
 						lappend_int(root->glob->resultRelations,
 									splan->rootRelation);
 				}
-				root->glob->firstResultRels =
-					lappend_int(root->glob->firstResultRels,
-								linitial_int(splan->resultRelations));
 			}
 			break;
 		case T_Append:
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 1ae51b1b391..92ddeba78fd 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1226,7 +1226,6 @@ exec_simple_query(const char *query_string)
 						  query_string,
 						  commandTag,
 						  plantree_list,
-						  NULL,
 						  NULL);
 
 		/*
@@ -2028,8 +2027,7 @@ exec_bind_message(StringInfo input_message)
 					  query_string,
 					  psrc->commandTag,
 					  cplan->stmt_list,
-					  cplan,
-					  psrc);
+					  cplan);
 
 	/* Portal is defined, set the plan ID based on its contents. */
 	foreach(lc, portal->stmts)
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 8164d0fbb4f..d1593f38b35 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -19,7 +19,6 @@
 
 #include "access/xact.h"
 #include "commands/prepare.h"
-#include "executor/execdesc.h"
 #include "executor/executor.h"
 #include "executor/tstoreReceiver.h"
 #include "miscadmin.h"
@@ -38,9 +37,6 @@ Portal		ActivePortal = NULL;
 
 
 static void ProcessQuery(PlannedStmt *plan,
-						 CachedPlan *cplan,
-						 CachedPlanSource *plansource,
-						 int query_index,
 						 const char *sourceText,
 						 ParamListInfo params,
 						 QueryEnvironment *queryEnv,
@@ -70,7 +66,6 @@ static void DoPortalRewind(Portal portal);
  */
 QueryDesc *
 CreateQueryDesc(PlannedStmt *plannedstmt,
-				CachedPlan *cplan,
 				const char *sourceText,
 				Snapshot snapshot,
 				Snapshot crosscheck_snapshot,
@@ -83,7 +78,6 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
 
 	qd->operation = plannedstmt->commandType;	/* operation */
 	qd->plannedstmt = plannedstmt;	/* plan */
-	qd->cplan = cplan;			/* CachedPlan supplying the plannedstmt */
 	qd->sourceText = sourceText;	/* query text */
 	qd->snapshot = RegisterSnapshot(snapshot);	/* snapshot */
 	/* RI check snapshot */
@@ -129,9 +123,6 @@ FreeQueryDesc(QueryDesc *qdesc)
  *		PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
  *
  *	plan: the plan tree for the query
- *	cplan: CachedPlan supplying the plan
- *	plansource: CachedPlanSource supplying the cplan
- *	query_index: index of the query in plansource->query_list
  *	sourceText: the source text of the query
  *	params: any parameters needed
  *	dest: where to send results
@@ -144,9 +135,6 @@ FreeQueryDesc(QueryDesc *qdesc)
  */
 static void
 ProcessQuery(PlannedStmt *plan,
-			 CachedPlan *cplan,
-			 CachedPlanSource *plansource,
-			 int query_index,
 			 const char *sourceText,
 			 ParamListInfo params,
 			 QueryEnvironment *queryEnv,
@@ -158,23 +146,14 @@ ProcessQuery(PlannedStmt *plan,
 	/*
 	 * Create the QueryDesc object
 	 */
-	queryDesc = CreateQueryDesc(plan, cplan, sourceText,
+	queryDesc = CreateQueryDesc(plan, sourceText,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, params, queryEnv, 0);
 
 	/*
-	 * Prepare the plan for execution
+	 * Call ExecutorStart to prepare the plan for execution
 	 */
-	if (queryDesc->cplan)
-	{
-		ExecutorStartCachedPlan(queryDesc, 0, plansource, query_index);
-		Assert(queryDesc->planstate);
-	}
-	else
-	{
-		if (!ExecutorStart(queryDesc, 0))
-			elog(ERROR, "ExecutorStart() failed unexpectedly");
-	}
+	ExecutorStart(queryDesc, 0);
 
 	/*
 	 * Run the plan to completion.
@@ -515,7 +494,6 @@ PortalStart(Portal portal, ParamListInfo params,
 				 * the destination to DestNone.
 				 */
 				queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
-											portal->cplan,
 											portal->sourceText,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
@@ -535,19 +513,9 @@ PortalStart(Portal portal, ParamListInfo params,
 					myeflags = eflags;
 
 				/*
-				 * Prepare the plan for execution.
+				 * Call ExecutorStart to prepare the plan for execution
 				 */
-				if (portal->cplan)
-				{
-					ExecutorStartCachedPlan(queryDesc, myeflags,
-											portal->plansource, 0);
-					Assert(queryDesc->planstate);
-				}
-				else
-				{
-					if (!ExecutorStart(queryDesc, myeflags))
-						elog(ERROR, "ExecutorStart() failed unexpectedly");
-				}
+				ExecutorStart(queryDesc, myeflags);
 
 				/*
 				 * This tells PortalCleanup to shut down the executor
@@ -1221,7 +1189,6 @@ PortalRunMulti(Portal portal,
 {
 	bool		active_snapshot_set = false;
 	ListCell   *stmtlist_item;
-	int			query_index = 0;
 
 	/*
 	 * If the destination is DestRemoteExecute, change to DestNone.  The
@@ -1303,9 +1270,6 @@ PortalRunMulti(Portal portal,
 			{
 				/* statement can set tag string */
 				ProcessQuery(pstmt,
-							 portal->cplan,
-							 portal->plansource,
-							 query_index,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
@@ -1315,9 +1279,6 @@ PortalRunMulti(Portal portal,
 			{
 				/* stmt added by rewrite cannot set tag */
 				ProcessQuery(pstmt,
-							 portal->cplan,
-							 portal->plansource,
-							 query_index,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
@@ -1382,8 +1343,6 @@ PortalRunMulti(Portal portal,
 		 */
 		if (lnext(portal->stmts, stmtlist_item) != NULL)
 			CommandCounterIncrement();
-
-		query_index++;
 	}
 
 	/* Pop the snapshot if we pushed one. */
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 9bcbc4c3e97..89a1c79e984 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -92,8 +92,7 @@ static void ReleaseGenericPlan(CachedPlanSource *plansource);
 static bool StmtPlanRequiresRevalidation(CachedPlanSource *plansource);
 static bool BuildingPlanRequiresSnapshot(CachedPlanSource *plansource);
 static List *RevalidateCachedQuery(CachedPlanSource *plansource,
-								   QueryEnvironment *queryEnv,
-								   bool release_generic);
+								   QueryEnvironment *queryEnv);
 static bool CheckCachedPlan(CachedPlanSource *plansource);
 static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 								   ParamListInfo boundParams, QueryEnvironment *queryEnv);
@@ -663,17 +662,10 @@ BuildingPlanRequiresSnapshot(CachedPlanSource *plansource)
  * The result value is the transient analyzed-and-rewritten query tree if we
  * had to do re-analysis, and NIL otherwise.  (This is returned just to save
  * a tree copying step in a subsequent BuildCachedPlan call.)
- *
- * This also releases and drops the generic plan (plansource->gplan), if any,
- * as most callers will typically build a new CachedPlan for the plansource
- * right after this. However, when called from UpdateCachedPlan(), the
- * function does not release the generic plan, as UpdateCachedPlan() updates
- * an existing CachedPlan in place.
  */
 static List *
 RevalidateCachedQuery(CachedPlanSource *plansource,
-					  QueryEnvironment *queryEnv,
-					  bool release_generic)
+					  QueryEnvironment *queryEnv)
 {
 	bool		snapshot_set;
 	List	   *tlist;			/* transient query-tree list */
@@ -772,9 +764,8 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
 		MemoryContextDelete(qcxt);
 	}
 
-	/* Drop the generic plan reference, if any, and if requested */
-	if (release_generic)
-		ReleaseGenericPlan(plansource);
+	/* Drop the generic plan reference if any */
+	ReleaseGenericPlan(plansource);
 
 	/*
 	 * Now re-do parse analysis and rewrite.  This not incidentally acquires
@@ -937,10 +928,8 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
  * Caller must have already called RevalidateCachedQuery to verify that the
  * querytree is up to date.
  *
- * On a "true" return, we have acquired locks on the "unprunableRelids" set
- * for all plans in plansource->stmt_list. However, the plans are not fully
- * race-condition-free until the executor acquires locks on the prunable
- * relations that survive initial runtime pruning during InitPlan().
+ * On a "true" return, we have acquired the locks needed to run the plan.
+ * (We must do this for the "true" result to be race-condition-free.)
  */
 static bool
 CheckCachedPlan(CachedPlanSource *plansource)
@@ -1025,8 +1014,6 @@ CheckCachedPlan(CachedPlanSource *plansource)
  * Planning work is done in the caller's memory context.  The finished plan
  * is in a child memory context, which typically should get reparented
  * (unless this is a one-shot plan, in which case we don't copy the plan).
- *
- * Note: When changing this, you should also look at UpdateCachedPlan().
  */
 static CachedPlan *
 BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
@@ -1037,7 +1024,6 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 	bool		snapshot_set;
 	bool		is_transient;
 	MemoryContext plan_context;
-	MemoryContext stmt_context = NULL;
 	MemoryContext oldcxt = CurrentMemoryContext;
 	ListCell   *lc;
 
@@ -1055,7 +1041,7 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 	 * let's treat it as real and redo the RevalidateCachedQuery call.
 	 */
 	if (!plansource->is_valid)
-		qlist = RevalidateCachedQuery(plansource, queryEnv, true);
+		qlist = RevalidateCachedQuery(plansource, queryEnv);
 
 	/*
 	 * If we don't already have a copy of the querytree list that can be
@@ -1093,19 +1079,10 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 		PopActiveSnapshot();
 
 	/*
-	 * Normally, we create a dedicated memory context for the CachedPlan and
-	 * its subsidiary data. Although it's usually not very large, the context
-	 * is designed to allow growth if necessary.
-	 *
-	 * The PlannedStmts are stored in a separate child context (stmt_context)
-	 * of the CachedPlan's memory context. This separation allows
-	 * UpdateCachedPlan() to free and replace the PlannedStmts without
-	 * affecting the CachedPlan structure or its stmt_list List.
-	 *
-	 * For one-shot plans, we instead use the caller's memory context, as the
-	 * CachedPlan will not persist.  stmt_context will be set to NULL in this
-	 * case, because UpdateCachedPlan() should never get called on a one-shot
-	 * plan.
+	 * Normally we make a dedicated memory context for the CachedPlan and its
+	 * subsidiary data.  (It's probably not going to be large, but just in
+	 * case, allow it to grow large.  It's transient for the moment.)  But for
+	 * a one-shot plan, we just leave it in the caller's memory context.
 	 */
 	if (!plansource->is_oneshot)
 	{
@@ -1114,17 +1091,12 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 											 ALLOCSET_START_SMALL_SIZES);
 		MemoryContextCopyAndSetIdentifier(plan_context, plansource->query_string);
 
-		stmt_context = AllocSetContextCreate(CurrentMemoryContext,
-											 "CachedPlan PlannedStmts",
-											 ALLOCSET_START_SMALL_SIZES);
-		MemoryContextCopyAndSetIdentifier(stmt_context, plansource->query_string);
-		MemoryContextSetParent(stmt_context, plan_context);
+		/*
+		 * Copy plan into the new context.
+		 */
+		MemoryContextSwitchTo(plan_context);
 
-		MemoryContextSwitchTo(stmt_context);
 		plist = copyObject(plist);
-
-		MemoryContextSwitchTo(plan_context);
-		plist = list_copy(plist);
 	}
 	else
 		plan_context = CurrentMemoryContext;
@@ -1165,10 +1137,8 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 		plan->saved_xmin = InvalidTransactionId;
 	plan->refcount = 0;
 	plan->context = plan_context;
-	plan->stmt_context = stmt_context;
 	plan->is_oneshot = plansource->is_oneshot;
 	plan->is_saved = false;
-	plan->is_reused = false;
 	plan->is_valid = true;
 
 	/* assign generation number to new plan */
@@ -1179,113 +1149,6 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 	return plan;
 }
 
-/*
- * UpdateCachedPlan
- *		Create fresh plans for all queries in the CachedPlanSource, replacing
- *		those in the generic plan's stmt_list, and return the plan for the
- *		query_index'th query.
- *
- * This function is primarily used by ExecutorStartCachedPlan() to handle
- * cases where the original generic CachedPlan becomes invalid. Such
- * invalidation may occur when prunable relations in the old plan for the
- * query_index'th query are locked in preparation for execution.
- *
- * Note that invalidations received during the execution of the query_index'th
- * query can affect both the queries that have already finished execution
- * (e.g., due to concurrent modifications on prunable relations that were not
- * locked during their execution) and also the queries that have not yet been
- * executed.  As a result, this function updates all plans to ensure
- * CachedPlan.is_valid is safely set to true.
- *
- * The old PlannedStmts in plansource->gplan->stmt_list are freed here, so
- * the caller and any of its callers must not rely on them remaining accessible
- * after this function is called.
- */
-PlannedStmt *
-UpdateCachedPlan(CachedPlanSource *plansource, int query_index,
-				 QueryEnvironment *queryEnv)
-{
-	List	   *query_list = plansource->query_list,
-			   *plan_list;
-	ListCell   *l1,
-			   *l2;
-	CachedPlan *plan = plansource->gplan;
-	MemoryContext oldcxt;
-
-	Assert(ActiveSnapshotSet());
-
-	/* Sanity checks (XXX can be Asserts?) */
-	if (plan == NULL)
-		elog(ERROR, "UpdateCachedPlan() called in the wrong context: plansource->gplan is NULL");
-	else if (plan->is_valid)
-		elog(ERROR, "UpdateCachedPlan() called in the wrong context: plansource->gplan->is_valid is true");
-	else if (plan->is_oneshot)
-		elog(ERROR, "UpdateCachedPlan() called in the wrong context: plansource->gplan->is_oneshot is true");
-
-	/*
-	 * The plansource might have become invalid since GetCachedPlan() returned
-	 * the CachedPlan. See the comment in BuildCachedPlan() for details on why
-	 * this might happen.  Although invalidation is likely a false positive as
-	 * stated there, we make the plan valid to ensure the query list used for
-	 * planning is up to date.
-	 *
-	 * The risk of catching an invalidation is higher here than when
-	 * BuildCachedPlan() is called from GetCachedPlan(), because this function
-	 * is normally called long after GetCachedPlan() returns the CachedPlan,
-	 * so much more processing could have occurred including things that mark
-	 * the CachedPlanSource invalid.
-	 *
-	 * Note: Do not release plansource->gplan, because the upstream callers
-	 * (such as the callers of ExecutorStartCachedPlan()) would still be
-	 * referencing it.
-	 */
-	if (!plansource->is_valid)
-		query_list = RevalidateCachedQuery(plansource, queryEnv, false);
-	Assert(query_list != NIL);
-
-	/*
-	 * Build a new generic plan for all the queries after making a copy to be
-	 * scribbled on by the planner.
-	 */
-	query_list = copyObject(query_list);
-
-	/*
-	 * Planning work is done in the caller's memory context.  The resulting
-	 * PlannedStmt is then copied into plan->stmt_context after throwing away
-	 * the old ones.
-	 */
-	plan_list = pg_plan_queries(query_list, plansource->query_string,
-								plansource->cursor_options, NULL);
-	Assert(list_length(plan_list) == list_length(plan->stmt_list));
-
-	MemoryContextReset(plan->stmt_context);
-	oldcxt = MemoryContextSwitchTo(plan->stmt_context);
-	forboth(l1, plan_list, l2, plan->stmt_list)
-	{
-		PlannedStmt *plannedstmt = lfirst(l1);
-
-		lfirst(l2) = copyObject(plannedstmt);
-	}
-	MemoryContextSwitchTo(oldcxt);
-
-	/*
-	 * XXX Should this also (re)set the properties of the CachedPlan that are
-	 * set in BuildCachedPlan() after creating the fresh plans such as
-	 * planRoleId, dependsOnRole, and saved_xmin?
-	 */
-
-	/*
-	 * We've updated all the plans that might have been invalidated, so mark
-	 * the CachedPlan as valid.
-	 */
-	plan->is_valid = true;
-
-	/* Also update generic_cost because we just created a new generic plan. */
-	plansource->generic_cost = cached_plan_cost(plan, false);
-
-	return list_nth_node(PlannedStmt, plan->stmt_list, query_index);
-}
-
 /*
  * choose_custom_plan: choose whether to use custom or generic plan
  *
@@ -1402,13 +1265,8 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
  * plan or a custom plan for the given parameters: the caller does not know
  * which it will get.
  *
- * On return, the plan is valid, but if it is a reused generic plan, not all
- * locks are acquired. In such cases, CheckCachedPlan() does not take locks
- * on relations subject to initial runtime pruning; instead, these locks are
- * deferred until execution startup, when ExecDoInitialPruning() performs
- * initial pruning.  The plan's "is_reused" flag is set to indicate that
- * CachedPlanRequiresLocking() should return true when called by
- * ExecDoInitialPruning().
+ * On return, the plan is valid and we have sufficient locks to begin
+ * execution.
  *
  * On return, the refcount of the plan has been incremented; a later
  * ReleaseCachedPlan() call is expected.  If "owner" is not NULL then
@@ -1434,7 +1292,7 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 		elog(ERROR, "cannot apply ResourceOwner to non-saved cached plan");
 
 	/* Make sure the querytree list is valid and we have parse-time locks */
-	qlist = RevalidateCachedQuery(plansource, queryEnv, true);
+	qlist = RevalidateCachedQuery(plansource, queryEnv);
 
 	/* Decide whether to use a custom plan */
 	customplan = choose_custom_plan(plansource, boundParams);
@@ -1446,8 +1304,6 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 			/* We want a generic plan, and we already have a valid one */
 			plan = plansource->gplan;
 			Assert(plan->magic == CACHEDPLAN_MAGIC);
-			/* Reusing the existing plan, so not all locks may be acquired. */
-			plan->is_reused = true;
 		}
 		else
 		{
@@ -1913,7 +1769,7 @@ CachedPlanGetTargetList(CachedPlanSource *plansource,
 		return NIL;
 
 	/* Make sure the querytree list is valid and we have parse-time locks */
-	RevalidateCachedQuery(plansource, queryEnv, true);
+	RevalidateCachedQuery(plansource, queryEnv);
 
 	/* Get the primary statement and find out what it returns */
 	pstmt = QueryListGetPrimaryStmt(plansource->query_list);
@@ -2035,7 +1891,7 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 	foreach(lc1, stmt_list)
 	{
 		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
-		int			rtindex;
+		ListCell   *lc2;
 
 		if (plannedstmt->commandType == CMD_UTILITY)
 		{
@@ -2053,16 +1909,13 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 			continue;
 		}
 
-		rtindex = -1;
-		while ((rtindex = bms_next_member(plannedstmt->unprunableRelids,
-										  rtindex)) >= 0)
+		foreach(lc2, plannedstmt->rtable)
 		{
-			RangeTblEntry *rte = list_nth_node(RangeTblEntry,
-											   plannedstmt->rtable,
-											   rtindex - 1);
+			RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
 
-			Assert(rte->rtekind == RTE_RELATION ||
-				   (rte->rtekind == RTE_SUBQUERY && OidIsValid(rte->relid)));
+			if (!(rte->rtekind == RTE_RELATION ||
+				  (rte->rtekind == RTE_SUBQUERY && OidIsValid(rte->relid))))
+				continue;
 
 			/*
 			 * Acquire the appropriate type of lock on each relation OID. Note
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index e3526e78064..0be1c2b0fff 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -284,8 +284,7 @@ PortalDefineQuery(Portal portal,
 				  const char *sourceText,
 				  CommandTag commandTag,
 				  List *stmts,
-				  CachedPlan *cplan,
-				  CachedPlanSource *plansource)
+				  CachedPlan *cplan)
 {
 	Assert(PortalIsValid(portal));
 	Assert(portal->status == PORTAL_NEW);
@@ -300,7 +299,6 @@ PortalDefineQuery(Portal portal,
 	portal->commandTag = commandTag;
 	portal->stmts = stmts;
 	portal->cplan = cplan;
-	portal->plansource = plansource;
 	portal->status = PORTAL_DEFINED;
 }
 
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 03c5b3d73e5..3b122f79ed8 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -63,10 +63,8 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
 							  struct ExplainState *es, ParseState *pstate,
 							  ParamListInfo params);
 
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, CachedPlan *cplan,
-						   CachedPlanSource *plansource, int query_index,
-						   IntoClause *into, struct ExplainState *es,
-						   const char *queryString,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
+						   struct ExplainState *es, const char *queryString,
 						   ParamListInfo params, QueryEnvironment *queryEnv,
 						   const instr_time *planduration,
 						   const BufferUsage *bufusage,
diff --git a/src/include/commands/trigger.h b/src/include/commands/trigger.h
index 4180601dcd4..2ed2c4bb378 100644
--- a/src/include/commands/trigger.h
+++ b/src/include/commands/trigger.h
@@ -258,7 +258,6 @@ extern void ExecASTruncateTriggers(EState *estate,
 extern void AfterTriggerBeginXact(void);
 extern void AfterTriggerBeginQuery(void);
 extern void AfterTriggerEndQuery(EState *estate);
-extern void AfterTriggerAbortQuery(void);
 extern void AfterTriggerFireDeferred(void);
 extern void AfterTriggerEndXact(bool isCommit);
 extern void AfterTriggerBeginSubXact(void);
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index ba53305ad42..86db3dc8d0d 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -35,7 +35,6 @@ typedef struct QueryDesc
 	/* These fields are provided by CreateQueryDesc */
 	CmdType		operation;		/* CMD_SELECT, CMD_UPDATE, etc. */
 	PlannedStmt *plannedstmt;	/* planner's output (could be utility, too) */
-	CachedPlan *cplan;			/* CachedPlan that supplies the plannedstmt */
 	const char *sourceText;		/* source text of the query */
 	Snapshot	snapshot;		/* snapshot to use for query */
 	Snapshot	crosscheck_snapshot;	/* crosscheck for RI update/delete */
@@ -58,7 +57,6 @@ typedef struct QueryDesc
 
 /* in pquery.c */
 extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
-								  CachedPlan *cplan,
 								  const char *sourceText,
 								  Snapshot snapshot,
 								  Snapshot crosscheck_snapshot,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index ae99407db89..fbe4bf081f7 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -73,7 +73,7 @@
 
 
 /* Hook for plugins to get control in ExecutorStart() */
-typedef bool (*ExecutorStart_hook_type) (QueryDesc *queryDesc, int eflags);
+typedef void (*ExecutorStart_hook_type) (QueryDesc *queryDesc, int eflags);
 extern PGDLLIMPORT ExecutorStart_hook_type ExecutorStart_hook;
 
 /* Hook for plugins to get control in ExecutorRun() */
@@ -229,11 +229,8 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
 /*
  * prototypes from functions in execMain.c
  */
-extern bool ExecutorStart(QueryDesc *queryDesc, int eflags);
-extern void ExecutorStartCachedPlan(QueryDesc *queryDesc, int eflags,
-									CachedPlanSource *plansource,
-									int query_index);
-extern bool standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
+extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
+extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
 extern void ExecutorRun(QueryDesc *queryDesc,
 						ScanDirection direction, uint64 count);
 extern void standard_ExecutorRun(QueryDesc *queryDesc,
@@ -300,30 +297,6 @@ extern void ExecEndNode(PlanState *node);
 extern void ExecShutdownNode(PlanState *node);
 extern void ExecSetTupleBound(int64 tuples_needed, PlanState *child_node);
 
-/*
- * Is the CachedPlan in es_cachedplan still valid?
- *
- * Called from InitPlan() because invalidation messages that affect the plan
- * might be received after locks have been taken on runtime-prunable relations.
- * The caller should take appropriate action if the plan has become invalid.
- */
-static inline bool
-ExecPlanStillValid(EState *estate)
-{
-	return estate->es_cachedplan == NULL ? true :
-		CachedPlanValid(estate->es_cachedplan);
-}
-
-/*
- * Locks are needed only if running a cached plan that might contain unlocked
- * relations, such as a reused generic plan.
- */
-static inline bool
-ExecShouldLockRelations(EState *estate)
-{
-	return estate->es_cachedplan == NULL ? false :
-		CachedPlanRequiresLocking(estate->es_cachedplan);
-}
 
 /* ----------------------------------------------------------------
  *		ExecProcNode
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 5b6cadb5a6c..2492282213f 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -42,7 +42,6 @@
 #include "storage/condition_variable.h"
 #include "utils/hsearch.h"
 #include "utils/queryenvironment.h"
-#include "utils/plancache.h"
 #include "utils/reltrigger.h"
 #include "utils/sharedtuplestore.h"
 #include "utils/snapshot.h"
@@ -664,7 +663,6 @@ typedef struct EState
 										 * ExecRowMarks, or NULL if none */
 	List	   *es_rteperminfos;	/* List of RTEPermissionInfo */
 	PlannedStmt *es_plannedstmt;	/* link to top of plan tree */
-	CachedPlan *es_cachedplan;	/* CachedPlan providing the plan tree */
 	List	   *es_part_prune_infos;	/* List of PartitionPruneInfo */
 	List	   *es_part_prune_states;	/* List of PartitionPruneState */
 	List	   *es_part_prune_results;	/* List of Bitmapset */
@@ -717,7 +715,6 @@ typedef struct EState
 	int			es_top_eflags;	/* eflags passed to ExecutorStart */
 	int			es_instrument;	/* OR of InstrumentOption flags */
 	bool		es_finished;	/* true when ExecutorFinish is done */
-	bool		es_aborted;		/* true when execution was aborted */
 
 	List	   *es_exprcontexts;	/* List of ExprContexts within EState */
 
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 1dd2d1560cb..6567759595d 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -138,9 +138,6 @@ typedef struct PlannerGlobal
 	/* "flat" list of integer RT indexes */
 	List	   *resultRelations;
 
-	/* "flat" list of integer RT indexes (one per ModifyTable node) */
-	List	   *firstResultRels;
-
 	/* "flat" list of AppendRelInfos */
 	List	   *appendRelations;
 
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 658d76225e4..f0d514e6e15 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -105,13 +105,6 @@ typedef struct PlannedStmt
 	/* integer list of RT indexes, or NIL */
 	List	   *resultRelations;
 
-	/*
-	 * rtable indexes of first target relation in each ModifyTable node in the
-	 * plan for INSERT/UPDATE/DELETE/MERGE
-	 */
-	/* integer list of RT indexes, or NIL */
-	List	   *firstResultRels;
-
 	/* list of AppendRelInfo nodes */
 	List	   *appendRelations;
 
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 07ec5318db7..1baa6d50bfd 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -18,8 +18,6 @@
 #include "access/tupdesc.h"
 #include "lib/ilist.h"
 #include "nodes/params.h"
-#include "nodes/parsenodes.h"
-#include "nodes/plannodes.h"
 #include "tcop/cmdtag.h"
 #include "utils/queryenvironment.h"
 #include "utils/resowner.h"
@@ -153,11 +151,10 @@ typedef struct CachedPlanSource
  * The reference count includes both the link from the parent CachedPlanSource
  * (if any), and any active plan executions, so the plan can be discarded
  * exactly when refcount goes to zero.  Both the struct itself and the
- * subsidiary data, except the PlannedStmts in stmt_list live in the context
- * denoted by the context field; the PlannedStmts live in the context denoted
- * by stmt_context.  Separate contexts makes it easy to free a no-longer-needed
- * cached plan. (However, if is_oneshot is true, the context does not belong
- * solely to the CachedPlan so no freeing is possible.)
+ * subsidiary data live in the context denoted by the context field.
+ * This makes it easy to free a no-longer-needed cached plan.  (However,
+ * if is_oneshot is true, the context does not belong solely to the CachedPlan
+ * so no freeing is possible.)
  */
 typedef struct CachedPlan
 {
@@ -165,7 +162,6 @@ typedef struct CachedPlan
 	List	   *stmt_list;		/* list of PlannedStmts */
 	bool		is_oneshot;		/* is it a "oneshot" plan? */
 	bool		is_saved;		/* is CachedPlan in a long-lived context? */
-	bool		is_reused;		/* is it a reused generic plan? */
 	bool		is_valid;		/* is the stmt_list currently valid? */
 	Oid			planRoleId;		/* Role ID the plan was created for */
 	bool		dependsOnRole;	/* is plan specific to that role? */
@@ -174,10 +170,6 @@ typedef struct CachedPlan
 	int			generation;		/* parent's generation number for this plan */
 	int			refcount;		/* count of live references to this struct */
 	MemoryContext context;		/* context containing this CachedPlan */
-	MemoryContext stmt_context; /* context containing the PlannedStmts in
-								 * stmt_list, but not the List itself which is
-								 * in the above context; NULL if is_oneshot is
-								 * true. */
 } CachedPlan;
 
 /*
@@ -249,10 +241,6 @@ extern CachedPlan *GetCachedPlan(CachedPlanSource *plansource,
 								 ParamListInfo boundParams,
 								 ResourceOwner owner,
 								 QueryEnvironment *queryEnv);
-extern PlannedStmt *UpdateCachedPlan(CachedPlanSource *plansource,
-									 int query_index,
-									 QueryEnvironment *queryEnv);
-
 extern void ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner);
 
 extern bool CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
@@ -265,30 +253,4 @@ extern bool CachedPlanIsSimplyValid(CachedPlanSource *plansource,
 extern CachedExpression *GetCachedExpression(Node *expr);
 extern void FreeCachedExpression(CachedExpression *cexpr);
 
-/*
- * CachedPlanRequiresLocking: should the executor acquire additional locks?
- *
- * If the plan is a saved generic plan, the executor must acquire locks for
- * relations that are not covered by AcquireExecutorLocks(), such as partitions
- * that are subject to initial runtime pruning.
- */
-static inline bool
-CachedPlanRequiresLocking(CachedPlan *cplan)
-{
-	return !cplan->is_oneshot && cplan->is_reused;
-}
-
-/*
- * CachedPlanValid
- *      Returns whether a cached generic plan is still valid.
- *
- * Invoked by the executor to check if the plan has not been invalidated after
- * taking locks during the initialization of the plan.
- */
-static inline bool
-CachedPlanValid(CachedPlan *cplan)
-{
-	return cplan->is_valid;
-}
-
 #endif							/* PLANCACHE_H */
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index ddee031f551..0b62143af8b 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -138,7 +138,6 @@ typedef struct PortalData
 	QueryCompletion qc;			/* command completion data for executed query */
 	List	   *stmts;			/* list of PlannedStmts */
 	CachedPlan *cplan;			/* CachedPlan, if stmts are from one */
-	CachedPlanSource *plansource;	/* CachedPlanSource, for cplan */
 
 	ParamListInfo portalParams; /* params to pass to query */
 	QueryEnvironment *queryEnv; /* environment for query */
@@ -241,8 +240,7 @@ extern void PortalDefineQuery(Portal portal,
 							  const char *sourceText,
 							  CommandTag commandTag,
 							  List *stmts,
-							  CachedPlan *cplan,
-							  CachedPlanSource *plansource);
+							  CachedPlan *cplan);
 extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
 extern void PortalCreateHoldStore(Portal portal);
 extern void PortalHashTableDeleteAll(void);
-- 
2.43.0



^ permalink  raw  reply  [nested|flat] 108+ messages in thread

* Re: generic plans and "initial" pruning
@ 2025-05-21 10:22  Amit Langote <[email protected]>
  parent: Tomas Vondra <[email protected]>
  0 siblings, 0 replies; 108+ messages in thread

From: Amit Langote @ 2025-05-21 10:22 UTC (permalink / raw)
  To: Tomas Vondra <[email protected]>; +Cc: Tom Lane <[email protected]>; Tender Wang <[email protected]>; Alexander Lakhin <[email protected]>; Robert Haas <[email protected]>; Alvaro Herrera <[email protected]>; Andres Freund <[email protected]>; Daniel Gustafsson <[email protected]>; David Rowley <[email protected]>; pgsql-hackers; Thom Brown <[email protected]>

On Wed, May 21, 2025 at 3:44 AM Tomas Vondra <[email protected]> wrote:
> On 5/20/25 05:06, Tom Lane wrote:
> > Amit Langote <[email protected]> writes:
> >> Pushed after some tweaks to comments and the test case.
> >
> > My attention was drawn to commit 525392d57 after observing that
> > Valgrind complained about a memory leak in some code that commit added
> > to BuildCachedPlan().  I tried to make sense of said code so I could
> > remove the leak, and eventually arrived at the attached patch, which
> > is part of a series of leak-fixing things hence the high sequence
> > number.
> >
> > Unfortunately, the bad things I speculated about in the added comments
> > seem to be reality.  The second attached file is a test case that
> > triggers
> >
> > ...
>
> FYI I added this as a PG18 open item:
>
>   https://wiki.postgresql.org/wiki/PostgreSQL_18_Open_Items

Thanks Tomas.

-- 
Thanks, Amit Langote





^ permalink  raw  reply  [nested|flat] 108+ messages in thread

* Re: generic plans and "initial" pruning
@ 2025-05-22 08:12  Amit Langote <[email protected]>
  parent: Amit Langote <[email protected]>
  0 siblings, 2 replies; 108+ messages in thread

From: Amit Langote @ 2025-05-22 08:12 UTC (permalink / raw)
  To: Tom Lane <[email protected]>; +Cc: Tender Wang <[email protected]>; Alexander Lakhin <[email protected]>; Tomas Vondra <[email protected]>; Robert Haas <[email protected]>; Alvaro Herrera <[email protected]>; Andres Freund <[email protected]>; Daniel Gustafsson <[email protected]>; David Rowley <[email protected]>; pgsql-hackers; Thom Brown <[email protected]>

On Wed, May 21, 2025 at 7:22 PM Amit Langote <[email protected]> wrote:
> Fair enough. I’ll revert this and some related changes shortly.  WIP
> patch attached.

I have pushed out the revert now.

Note that I’ve only reverted the changes related to deferring locks on
prunable partitions. I’m planning to leave the preparatory commits
leading up to that one in place unless anyone objects. For reference,
here they are in chronological order (the last 3 are bug fixes):

bb3ec16e14d Move PartitionPruneInfo out of plan nodes into PlannedStmt
d47cbf474ec Perform runtime initial pruning outside ExecInitNode()
cbc127917e0 Track unpruned relids to avoid processing pruned relations
75dfde13639 Fix an oversight in cbc127917 to handle MERGE correctly
cbb9086c9ef Fix bug in cbc127917 to handle nested Append correctly
28317de723b Ensure first ModifyTable rel initialized if all are pruned

I think separating initial pruning from plan node initialization is
still worthwhile on its own, as evidenced by the improvements in
cbc127917e.

-- 
Thanks, Amit Langote





^ permalink  raw  reply  [nested|flat] 108+ messages in thread

* Re: generic plans and "initial" pruning
@ 2025-05-22 13:04  Tomas Vondra <[email protected]>
  parent: Amit Langote <[email protected]>
  1 sibling, 1 reply; 108+ messages in thread

From: Tomas Vondra @ 2025-05-22 13:04 UTC (permalink / raw)
  To: Amit Langote <[email protected]>; Tom Lane <[email protected]>; +Cc: Tender Wang <[email protected]>; Alexander Lakhin <[email protected]>; Robert Haas <[email protected]>; Alvaro Herrera <[email protected]>; Andres Freund <[email protected]>; Daniel Gustafsson <[email protected]>; David Rowley <[email protected]>; pgsql-hackers; Thom Brown <[email protected]>

On 5/22/25 10:12, Amit Langote wrote:
> On Wed, May 21, 2025 at 7:22 PM Amit Langote <[email protected]> wrote:
>> Fair enough. I’ll revert this and some related changes shortly.  WIP
>> patch attached.
> 
> I have pushed out the revert now.
> 

Thank you.

> Note that I’ve only reverted the changes related to deferring locks on
> prunable partitions. I’m planning to leave the preparatory commits
> leading up to that one in place unless anyone objects. For reference,
> here they are in chronological order (the last 3 are bug fixes):
> 
> bb3ec16e14d Move PartitionPruneInfo out of plan nodes into PlannedStmt
> d47cbf474ec Perform runtime initial pruning outside ExecInitNode()
> cbc127917e0 Track unpruned relids to avoid processing pruned relations
> 75dfde13639 Fix an oversight in cbc127917 to handle MERGE correctly
> cbb9086c9ef Fix bug in cbc127917 to handle nested Append correctly
> 28317de723b Ensure first ModifyTable rel initialized if all are pruned
> 
> I think separating initial pruning from plan node initialization is
> still worthwhile on its own, as evidenced by the improvements in
> cbc127917e.
> 

I'm OK with that in principle, assuming the benefits outweigh the risk
of making backpatching harder. The patches don't seem exceptionally
large / invasive, but I don't know how often we modify these parts.


regards

-- 
Tomas Vondra






^ permalink  raw  reply  [nested|flat] 108+ messages in thread

* Re: generic plans and "initial" pruning
@ 2025-05-22 13:50  Robert Haas <[email protected]>
  parent: Tom Lane <[email protected]>
  1 sibling, 0 replies; 108+ messages in thread

From: Robert Haas @ 2025-05-22 13:50 UTC (permalink / raw)
  To: Tom Lane <[email protected]>; +Cc: Amit Langote <[email protected]>; Tender Wang <[email protected]>; Alexander Lakhin <[email protected]>; Tomas Vondra <[email protected]>; Alvaro Herrera <[email protected]>; Andres Freund <[email protected]>; Daniel Gustafsson <[email protected]>; David Rowley <[email protected]>; pgsql-hackers; Thom Brown <[email protected]>

On Tue, May 20, 2025 at 11:38 AM Tom Lane <[email protected]> wrote:
> I still like the core idea of deferring locking, but I don't like
> anything about this implementation of it.  It seems like there has
> to be a better and simpler way.

Without particularly defending this implementation, and certainly
without defending its bugs, I just want to say that I'm not convinced
by the idea that there has to be a better and simpler way. We --
principally Amit, but also me and you and others -- have been trying
to find the best way of doing this for probably 5 years now. If you do
something during executor startup, you have to be prepared for
executor startup to force a replan, and if you do something before
executor startup, then you're duplicating executor logic into a new
phase that needs to communicate its results forward to execution
proper. Either approach is awkward and that awkwardness seems to
inevitably bleed into the plan cache specifically. I'd be beyond
delighted if you want to help chart a path through the awkwardness
here, since you know this stuff better than anybody, but I am
skeptical that there is a truly marvelous approach which we've just
managed to overlook for all this time.

-- 
Robert Haas
EDB: http://www.enterprisedb.com





^ permalink  raw  reply  [nested|flat] 108+ messages in thread

* Re: generic plans and "initial" pruning
@ 2025-05-23 02:17  Amit Langote <[email protected]>
  parent: Tomas Vondra <[email protected]>
  0 siblings, 0 replies; 108+ messages in thread

From: Amit Langote @ 2025-05-23 02:17 UTC (permalink / raw)
  To: Tomas Vondra <[email protected]>; +Cc: Tom Lane <[email protected]>; Tender Wang <[email protected]>; Alexander Lakhin <[email protected]>; Robert Haas <[email protected]>; Alvaro Herrera <[email protected]>; Andres Freund <[email protected]>; Daniel Gustafsson <[email protected]>; David Rowley <[email protected]>; pgsql-hackers; Thom Brown <[email protected]>

On Thu, May 22, 2025 at 10:04 PM Tomas Vondra <[email protected]> wrote:
> On 5/22/25 10:12, Amit Langote wrote:
> > Note that I’ve only reverted the changes related to deferring locks on
> > prunable partitions. I’m planning to leave the preparatory commits
> > leading up to that one in place unless anyone objects. For reference,
> > here they are in chronological order (the last 3 are bug fixes):
> >
> > bb3ec16e14d Move PartitionPruneInfo out of plan nodes into PlannedStmt
> > d47cbf474ec Perform runtime initial pruning outside ExecInitNode()
> > cbc127917e0 Track unpruned relids to avoid processing pruned relations
> > 75dfde13639 Fix an oversight in cbc127917 to handle MERGE correctly
> > cbb9086c9ef Fix bug in cbc127917 to handle nested Append correctly
> > 28317de723b Ensure first ModifyTable rel initialized if all are pruned
> >
> > I think separating initial pruning from plan node initialization is
> > still worthwhile on its own, as evidenced by the improvements in
> > cbc127917e.
> >
>
> I'm OK with that in principle, assuming the benefits outweigh the risk
> of making backpatching harder. The patches don't seem exceptionally
> large / invasive, but I don't know how often we modify these parts.

Thanks. I agree it's something to be mindful of, but I don’t expect
the reimplementation of the locking deferral to require changes to
this part of the code again. So barring any surprises, it shouldn't be
the case that the pruning code ends up looking significantly different
in v19.

Also, the actual pruning logic hasn’t changed much -- just where it’s
called from.

Let me know if any of that still raises concerns.

-- 
Thanks, Amit Langote





^ permalink  raw  reply  [nested|flat] 108+ messages in thread

* Re: generic plans and "initial" pruning
@ 2025-06-20 12:30  Amit Langote <[email protected]>
  parent: Amit Langote <[email protected]>
  1 sibling, 1 reply; 108+ messages in thread

From: Amit Langote @ 2025-06-20 12:30 UTC (permalink / raw)
  To: Tom Lane <[email protected]>; +Cc: Tender Wang <[email protected]>; Alexander Lakhin <[email protected]>; Tomas Vondra <[email protected]>; Robert Haas <[email protected]>; Alvaro Herrera <[email protected]>; Andres Freund <[email protected]>; Daniel Gustafsson <[email protected]>; David Rowley <[email protected]>; pgsql-hackers; Thom Brown <[email protected]>

On Thu, May 22, 2025 at 5:12 PM Amit Langote <[email protected]> wrote:
> I have pushed out the revert now.
>
> Note that I’ve only reverted the changes related to deferring locks on
> prunable partitions. I’m planning to leave the preparatory commits
> leading up to that one in place unless anyone objects. For reference,
> here they are in chronological order (the last 3 are bug fixes):
>
> bb3ec16e14d Move PartitionPruneInfo out of plan nodes into PlannedStmt
> d47cbf474ec Perform runtime initial pruning outside ExecInitNode()
> cbc127917e0 Track unpruned relids to avoid processing pruned relations
> 75dfde13639 Fix an oversight in cbc127917 to handle MERGE correctly
> cbb9086c9ef Fix bug in cbc127917 to handle nested Append correctly
> 28317de723b Ensure first ModifyTable rel initialized if all are pruned
>
> I think separating initial pruning from plan node initialization is
> still worthwhile on its own, as evidenced by the improvements in
> cbc127917e.

I've been thinking about how to address the concerns Tom raised about
the reverted patch.  Here's a summary of where my thinking currently
stands.

* CachedPlan invalidation handling:

The first issue is the part of the old design where a CachedPlan
invalidated during executor startup -- while locking unpruned
partitions -- was modified in place to replace the stale PlannedStmts
in its stmt_list with new ones obtained by replanning all queries in
the enclosing CachedPlanSource's query_list. I did that mainly to
ensure that replanning happens as soon as the executor discovers the
plan is invalid, instead of returning to the caller and requiring them
to go back to plancache.c to trigger replanning. There were many
issues with making that approach work in practice, because different
callers of the executor have different ways of running plans from a
CachedPlan -- with pquery.c in particular being hard to refactor
cleanly to support that flow.

The first alternative I came up with is to place only the query whose
PlannedStmt is being initialized into a standalone CachedPlanSource
and create a corresponding standalone CachedPlan. "Standalone" here
means that both objects are "saved" independently of the original
CachedPlanSource and CachedPlan, but are still tracked by the
invalidation callbacks.

But thinking about it more recently, what's actually important is not
whether we construct a new CachedPlan at all, but simply that we
replan just the one query that needs to be run, and use the resulting
PlannedStmt directly. The planner will have taken all required locks,
so we don't need to register the plan with the invalidation machinery
-- concurrent invalidations can't affect correctness.

In that case, the replanned PlannedStmt can be treated as transient
executor-local state, with no need to carry any of the plan cache
infrastructure along with it.  To support that, I further assume that,
because replanning and execution happen essentially back-to-back,
there's no opportunity for role-based or xmin-based invalidation (as
is checked for a CachedPlan in CheckCachedPlan()) to affect the plan
in between. If that reasoning holds, then we don't need to register
the replanned statement with the invalidation machinery at all.

Because we wouldn't have touched the original CachedPlan at all, the
stale PlannedStmts in it wouldn't be replaced until the next
GetCachedPlan() call triggers replanning. I'm willing to accept that
as a tradeoff for a less invasive design to handle replanning in the
executor.
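
To make the idea above concrete, here is a minimal sketch, not actual
plancache.c code.  SketchStmt, SketchPlan, sketch_replan_one() and
sketch_stmt_for_execution() are all hypothetical stand-ins invented
for illustration.  It only shows the intended property: when the
cached plan is found invalid, just the one statement is replanned into
caller-owned scratch space, and the cached plan itself is left
untouched.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a PlannedStmt. */
typedef struct SketchStmt
{
	int			query_index;	/* which query in query_list this plans */
	int			plan_generation;	/* bumped each time it is replanned */
} SketchStmt;

/* Hypothetical stand-in for a CachedPlan. */
typedef struct SketchPlan
{
	bool		is_valid;		/* analogue of CachedPlan.is_valid */
	int			nstmts;
	SketchStmt *stmts;			/* analogue of stmt_list */
} SketchPlan;

/* Stand-in for running the planner on a single query. */
static SketchStmt
sketch_replan_one(int query_index, int old_generation)
{
	SketchStmt	fresh = {query_index, old_generation + 1};

	return fresh;
}

/*
 * Pick the statement to execute.  A valid cached plan is used as-is.
 * Otherwise only the requested query is replanned into *transient,
 * which the caller owns; the cached plan is deliberately not modified
 * and stays invalid until the next GetCachedPlan()-style call.
 */
static const SketchStmt *
sketch_stmt_for_execution(const SketchPlan *plan, int query_index,
						  SketchStmt *transient)
{
	assert(query_index >= 0 && query_index < plan->nstmts);

	if (plan->is_valid)
		return &plan->stmts[query_index];

	*transient = sketch_replan_one(query_index,
								   plan->stmts[query_index].plan_generation);
	return transient;
}
```

The scratch statement lives only as long as the caller keeps it
around, which mirrors treating the replanned PlannedStmt as
executor-local state.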

Finally, it's worth noting that the executor is always passed the
entire CachedPlan, regardless of which individual statement is being
executed. Without per-statement validity tracking, it's hard for the
executor to tell whether replanning is actually needed for a given
query when the CachedPlan is marked invalid (is_valid=false), making
it impossible to selectively replan just one. To support that, what I
would need is validity tracking at the level of individual
PlannedStmts -- and perhaps even Querys -- in the source's query_list,
with the current is_valid flag effectively serving as the logical AND
of all the individual flags. We didn't need that in the old design,
because we'd replace all statements to mark the CachedPlan valid again
-- though Tom was right to point out flaws in the assumption that
setting is_valid like that was actually safe.
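
A rough sketch of what per-statement validity tracking could look
like follows.  None of these names exist in PostgreSQL;
SketchTrackedPlan and the sketch_* functions are hypothetical and only
illustrate a plan-level flag derived as the logical AND of
per-statement flags.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a CachedPlan with per-statement validity. */
typedef struct SketchTrackedPlan
{
	int			nstmts;			/* number of statements in the plan */
	bool	   *stmt_valid;		/* one validity flag per statement */
} SketchTrackedPlan;

/* Plan-level validity: the logical AND of all per-statement flags. */
static bool
sketch_plan_is_valid(const SketchTrackedPlan *plan)
{
	for (int i = 0; i < plan->nstmts; i++)
	{
		if (!plan->stmt_valid[i])
			return false;
	}
	return true;
}

/* Does the statement about to be executed need replanning? */
static bool
sketch_stmt_needs_replan(const SketchTrackedPlan *plan, int query_index)
{
	assert(query_index >= 0 && query_index < plan->nstmts);
	return !plan->stmt_valid[query_index];
}

/* Analogue of an invalidation callback hitting a single statement. */
static void
sketch_invalidate_stmt(SketchTrackedPlan *plan, int query_index)
{
	assert(query_index >= 0 && query_index < plan->nstmts);
	plan->stmt_valid[query_index] = false;
}
```

With flags like these the executor could skip replanning for a
statement whose own flag is still set even though the plan as a whole
has been marked invalid.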

* ExecutorStart() interface damage control:

The other aspect I’ve been thinking about is how to contain the
changes required inside ExecutorStart(), and limit the disruption to
ExecutorStart_hooks in particular, while keeping changes for outside
callers narrowly scoped. In the previous patch, pruning, locking, and
invalidation checking were all done inside InitPlan(), which is called
by standard_ExecutorStart() -- an implementation choice that was
potentially disruptive to extensions using ExecutorStart_hook. Since
such hooks are expected to call standard_ExecutorStart() to perform
core plan initialization, they would have to check afterward whether
the plan had actually been initialized successfully, in case an
invalidation occurred during InitPlan(). That wasn’t optional, and it
made it easy for hook authors to miss the fact that
standard_ExecutorStart() could return without initializing the plan,
breaking expectations that were previously reliable.

Separately, for top-level callers of the executor, the patch
introduced a new entry point, ExecutorStartCachedPlan(), to avoid
requiring each caller to implement its own replanning loop. But that
approach was also awkward, since it required switching to a
nonstandard function just to get correct behavior.

What I’m thinking now is that we should instead move the logic for
pruning, deferred locking, and replanning directly into
ExecutorStart() itself, so that callers no longer have to choose
between ExecutorStart() and a separate entry point that exists solely
to handle invalidation and replanning.

In contrast, the design I’m proposing avoids any need for new executor
entry points -- ExecutorStart() retains its original signature and
behavior, with the added benefit that replanning and pruning are now
handled internally before hooks or standard initialization logic are
invoked. The design requires moving some code from
standard_ExecutorStart() -- specifically the code that sets up the
EState and parameters -- and from InitPlan() -- namely, the parts that
initialize the range table, partition pruning state, and perform
ExecDoInitialPruning().

Callers of ExecutorStart() would still need to supply the CachedPlan,
the CachedPlanSource, and the query_index through the QueryDesc built
by CreateQueryDesc(), but the executor's entry points themselves
remain unchanged.

Importantly, this restructuring would not require any behavioral
changes for existing ExecutorStart_hook implementations. From a hook’s
point of view, this is a code motion change only. Hooks are still
invoked at the same point, but they’re now guaranteed to receive a
plan that is valid and ready for execution. This avoids the control
flow surprises introduced by the reverted patch -- specifically, the
need for hooks to detect whether standard_ExecutorStart() had
completed successfully -- while preserving the executor’s API and
execution contract as they exist in master.

I’ll hold off on writing any code for now -- just wanted to lay out
this direction and hear what others think, especially Tom.

--
Thanks, Amit Langote

* Re: generic plans and "initial" pruning
@ 2025-07-17 12:11  Amit Langote <[email protected]>
  parent: Amit Langote <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: Amit Langote @ 2025-07-17 12:11 UTC (permalink / raw)
  To: Tom Lane <[email protected]>; +Cc: Tender Wang <[email protected]>; Alexander Lakhin <[email protected]>; Tomas Vondra <[email protected]>; Robert Haas <[email protected]>; Alvaro Herrera <[email protected]>; Andres Freund <[email protected]>; Daniel Gustafsson <[email protected]>; David Rowley <[email protected]>; pgsql-hackers; Thom Brown <[email protected]>

On Fri, Jun 20, 2025 at 9:30 PM Amit Langote <[email protected]> wrote:
> On Thu, May 22, 2025 at 5:12 PM Amit Langote <[email protected]> wrote:
> > I have pushed out the revert now.
> >
> > Note that I’ve only reverted the changes related to deferring locks on
> > prunable partitions. I’m planning to leave the preparatory commits
> > leading up to that one in place unless anyone objects. For reference,
> > here they are in chronological order (the last 3 are bug fixes):
> >
> > bb3ec16e14d Move PartitionPruneInfo out of plan nodes into PlannedStmt
> > d47cbf474ec Perform runtime initial pruning outside ExecInitNode()
> > cbc127917e0 Track unpruned relids to avoid processing pruned relations
> > 75dfde13639 Fix an oversight in cbc127917 to handle MERGE correctly
> > cbb9086c9ef Fix bug in cbc127917 to handle nested Append correctly
> > 28317de723b Ensure first ModifyTable rel initialized if all are pruned
> >
> > I think separating initial pruning from plan node initialization is
> > still worthwhile on its own, as evidenced by the improvements in
> > cbc127917e.
>
> I've been thinking about how to address the concerns Tom raised about
> the reverted patch.  Here's a summary of where my thinking currently
> stands.
>
> * CachedPlan invalidation handling:
>
> The first issue is the part of the old design where a CachedPlan
> invalidated during executor startup -- while locking unpruned
> partitions -- was modified in place to replace the stale PlannedStmts
> in its stmt_list with new ones obtained by replanning all queries in
> the enclosing CachedPlanSource's query_list. I did that mainly to
> ensure that replanning happens as soon as the executor discovers the
> plan is invalid, instead of returning to the caller and requiring them
> to go back to plancache.c to trigger replanning. There were many
> issues with making that approach work in practice, because different
> callers of the executor have different ways of running plans from a
> CachedPlan -- with pquery.c in particular being hard to refactor
> cleanly to support that flow.
>
> The first alternative I came up with is to place only the query whose
> PlannedStmt is being initialized into a standalone CachedPlanSource
> and create a corresponding standalone CachedPlan. "Standalone" here
> means that both objects are "saved" independently of the original
> CachedPlanSource and CachedPlan, but are still tracked by the
> invalidation callbacks.
>
> But thinking about it more recently, what's actually important is not
> whether we construct a new CachedPlan at all, but simply that we
> replan just the one query that needs to be run, and use the resulting
> PlannedStmt directly. The planner will have taken all required locks,
> so we don't need to register the plan with the invalidation machinery
> -- concurrent invalidations can't affect correctness.
>
> In that case, the replanned PlannedStmt can be treated as transient
> executor-local state, with no need to carry any of the plan cache
> infrastructure along with it.  To support that, I further assume that,
> because replanning and execution happen essentially back-to-back,
> there's no opportunity for role-based or xmin-based invalidation (as
> is checked for a CachedPlan in CheckCachedPlan()) to affect the plan
> in between. If that reasoning holds, then we don't need to register
> the replanned statement with the invalidation machinery at all.
>
> Because we wouldn't have touched the original CachedPlan at all, the
> stale PlannedStmts in it wouldn't be replaced until the next
> GetCachedPlan() call triggers replanning. I'm willing to accept that
> as a tradeoff for a less invasive design to handle replanning in the
> executor.
>
> Finally, it's worth noting that the executor is always passed the
> entire CachedPlan, regardless of which individual statement is being
> executed. Without per-statement validity tracking, it's hard for the
> executor to tell whether replanning is actually needed for a given
> query when the CachedPlan is marked invalid (is_valid=false), making
> it impossible to selectively replan just one. To support that, what I
> would need is validity tracking at the level of individual
> PlannedStmts -- and perhaps even Querys -- in the source's query_list,
> with the current is_valid flag effectively serving as the logical AND
> of all the individual flags. We didn't need that in the old design,
> because we'd replace all statements to mark the CachedPlan valid again
> -- though Tom was right to point out flaws in the assumption that
> setting is_valid like that was actually safe.
>
> * ExecutorStart() interface damage control:
>
> The other aspect I’ve been thinking about is how to contain the
> changes required inside ExecutorStart(), and limit the disruption to
> ExecutorStart_hooks in particular, while keeping changes for outside
> callers narrowly scoped. In the previous patch, pruning, locking, and
> invalidation checking were all done inside InitPlan(), which is called
> by standard_ExecutorStart() -- an implementation choice that was
> potentially disruptive to extensions using ExecutorStart_hook. Since
> such hooks are expected to call standard_ExecutorStart() to perform
> core plan initialization, they would have to check afterward whether
> the plan had actually been initialized successfully, in case an
> invalidation occurred during InitPlan(). That wasn’t optional, and it
> made it easy for hook authors to miss the fact that
> standard_ExecutorStart() could return without initializing the plan,
> breaking expectations that were previously reliable.
>
> Separately, for top-level callers of the executor, the patch
> introduced a new entry point, ExecutorStartCachedPlan(), to avoid
> requiring each caller to implement its own replanning loop. But that
> approach was also awkward, since it required switching to a
> nonstandard function just to get correct behavior.
>
> What I’m thinking now is that we should instead move the logic for
> pruning, deferred locking, and replanning directly into
> ExecutorStart() itself. In the reverted patch, callers were affected
> mainly because they had to choose between ExecutorStart() and a new
> entry point, ExecutorStartCachedPlan(), which existed solely to handle
> invalidation and replanning. That divergence from the standard API
> made things awkward at the call site.
>
> In contrast, the design I’m proposing avoids any need for new executor
> entry points -- ExecutorStart() retains its original signature and
> behavior, with the added benefit that replanning and pruning are now
> handled internally before hooks or standard initialization logic are
> invoked. The design requires moving some code from
> standard_ExecutorStart() -- specifically the code that sets up the
> EState and parameters -- and from InitPlan() -- namely, the parts that
> initialize the range table, partition pruning state, and perform
> ExecDoInitialPruning().
>
> The callers of ExecutorStart() do still need to ensure that they pass
> the CachedPlan, the CachedPlanSource, and the query_index in QueryDesc
> via CreateQueryDesc(). The executor’s external API remains unchanged.
>
> Importantly, this restructuring would not require any behavioral
> changes for existing ExecutorStart_hook implementations. From a hook’s
> point of view, this is a code motion change only. Hooks are still
> invoked at the same point, but they’re now guaranteed to receive a
> plan that is valid and ready for execution. This avoids the control
> flow surprises introduced by the reverted patch -- specifically, the
> need for hooks to detect whether standard_ExecutorStart() had
> completed successfully -- while preserving the executor’s API and
> execution contract as they exist in master.
>
> I’ll hold off on writing any code for now -- just wanted to lay out
> this direction and hear what others think, especially Tom.

The refinements I described in my email above might help mitigate some
of those executor-related issues. However, I'm starting to wonder if
it's worth reconsidering our decision to handle pruning, locking, and
validation entirely at executor startup, which was the approach taken
in the reverted patch.

The alternative approach, doing initial pruning and locking within
plancache.c itself (which I floated a while ago), might be worth
revisiting. It avoids the complications we've discussed around the
executor API and preserves the clear separation of concerns that
plancache.c provides, though it does introduce some new layering
concerns, which I describe further below.

To support this, we'd need a mechanism to pass pruning results to the
executor alongside each PlannedStmt. For each PartitionPruneInfo in
the plan, that would include the corresponding PartitionPruneState and
the bitmapset of surviving relids determined by initial pruning. Given
that a CachedPlan can contain multiple PlannedStmts, this would
effectively be a list of pruning results, one per statement. One
reasonable way to handle that might be to define a parallel data
structure, separate from PlannedStmt, constructed by plancache.c and
carried via QueryDesc. The memory and lifetime management would mirror
how ParamListInfo is handled today, leaving the executor API unchanged
and avoiding intrusive changes to PlannedStmt.
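
As a rough illustration of the "parallel data structure" idea, the sketch below keeps one pruning-result entry per statement, stored next to the plan rather than inside it. Everything here is hypothetical: StmtPruneResult, PlanPruneResults, and the helper functions are made-up names, and a 64-bit mask stands in for a Bitmapset of surviving relids. It only shows the shape of the data, not how plancache.c or QueryDesc would actually carry it.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a Bitmapset of surviving relids (0..63 only). */
typedef uint64_t RelidMask;

/* Hypothetical: initial pruning output for one statement. */
typedef struct StmtPruneResult
{
	int			stmt_index;			/* position of the statement in the plan */
	RelidMask	unpruned_relids;	/* relations that survived initial pruning */
} StmtPruneResult;

/* Hypothetical: one entry per statement, built outside the executor. */
typedef struct PlanPruneResults
{
	int			nstmts;
	StmtPruneResult *results;
} PlanPruneResults;

/* Allocate an empty result list with one slot per statement. */
static PlanPruneResults *
make_prune_results(int nstmts)
{
	PlanPruneResults *r = malloc(sizeof(PlanPruneResults));

	assert(r != NULL && nstmts > 0);
	r->nstmts = nstmts;
	r->results = calloc((size_t) nstmts, sizeof(StmtPruneResult));
	assert(r->results != NULL);
	for (int i = 0; i < nstmts; i++)
		r->results[i].stmt_index = i;
	return r;
}

/* Record that relation 'relid' survived pruning for statement 'stmt'. */
static void
mark_unpruned(PlanPruneResults *r, int stmt, int relid)
{
	assert(stmt >= 0 && stmt < r->nstmts && relid >= 0 && relid < 64);
	r->results[stmt].unpruned_relids |= (UINT64_C(1) << relid);
}

/* What the consumer would ask: does this relation need to be opened? */
static int
is_unpruned(const PlanPruneResults *r, int stmt, int relid)
{
	assert(stmt >= 0 && stmt < r->nstmts && relid >= 0 && relid < 64);
	return (int) ((r->results[stmt].unpruned_relids >> relid) & 1);
}

/* Explicit cleanup, since the structure outlives any single executor call. */
static void
free_prune_results(PlanPruneResults *r)
{
	free(r->results);
	free(r);
}
```

The explicit free_prune_results() call is the toy counterpart of the lifetime question discussed next: whoever builds the structure also has to decide when it is released.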

However, one potentially problematic aspect of this design is managing
the lifecycle of the relations referenced by PartitionPruneState.
Currently, partitioned table relations are opened by the executor
after entering ExecutorStart() and closed automatically by
ExecEndPlan(), allowing cleanup of pruning states implicitly. If we
perform initial pruning earlier, we'd need to keep these relations
open longer, necessitating explicit cleanup calls (e.g., a new
FinishPartitionPruneState()) invoked by the caller of the executor,
such as from ExecutorEnd() or even higher-level callers. This
introduces some questionable layering by shifting responsibility for
relation management tasks, which ideally belong within the executor,
into its callers.

My sense is that the complexity involved in carrying pruning results
via this parallel data structure was one of the concerns Tom raised
previously, alongside the significant pruning code refactoring that
the earlier patch required. The latter, at least, should no longer be
necessary given recent code improvements.

I think that's about as many approaches as I can think of, and would
really appreciate others' thoughts on these alternatives.

-- 
Thanks, Amit Langote

* Re: generic plans and "initial" pruning
@ 2025-07-22 06:43  Amit Langote <[email protected]>
  parent: Amit Langote <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: Amit Langote @ 2025-07-22 06:43 UTC (permalink / raw)
  To: Tom Lane <[email protected]>; +Cc: Tender Wang <[email protected]>; Alexander Lakhin <[email protected]>; Tomas Vondra <[email protected]>; Robert Haas <[email protected]>; Alvaro Herrera <[email protected]>; Andres Freund <[email protected]>; Daniel Gustafsson <[email protected]>; David Rowley <[email protected]>; pgsql-hackers; Thom Brown <[email protected]>

On Thu, Jul 17, 2025 at 9:11 PM Amit Langote <[email protected]> wrote:
> The refinements I described in my email above might help mitigate some
> of those executor-related issues. However, I'm starting to wonder if
> it's worth reconsidering our decision to handle pruning, locking, and
> validation entirely at executor startup, which was the approach taken
> in the reverted patch.
>
> The alternative approach, doing initial pruning and locking within
> plancache.c itself (which I floated a while ago), might be worth
> revisiting. It avoids the complications we've discussed around the
> executor API and preserves the clear separation of concerns that
> plancache.c provides, though it does introduce some new layering
> concerns, which I describe further below.
>
> To support this, we'd need a mechanism to pass pruning results to the
> executor alongside each PlannedStmt. For each PartitionPruneInfo in
> the plan, that would include the corresponding PartitionPruneState and
> the bitmapset of surviving relids determined by initial pruning. Given
> that a CachedPlan can contain multiple PlannedStmts, this would
> effectively be a list of pruning results, one per statement. One
> reasonable way to handle that might be to define a parallel data
> structure, separate from PlannedStmt, constructed by plancache.c and
> carried via QueryDesc. The memory and lifetime management would mirror
> how ParamListInfo is handled today, leaving the executor API unchanged
> and avoiding intrusive changes to PlannedStmt.
>
> However, one potentially problematic aspect of this design is managing
> the lifecycle of the relations referenced by PartitionPruneState.
> Currently, partitioned table relations are opened by the executor
> after entering ExecutorStart() and closed automatically by
> ExecEndPlan(), allowing cleanup of pruning states implicitly. If we
> perform initial pruning earlier, we'd need to keep these relations
> open longer, necessitating explicit cleanup calls (e.g., a new
> FinishPartitionPruneState()) invoked by the caller of the executor,
> such as from ExecutorEnd() or even higher-level callers. This
> introduces some questionable layering by shifting responsibility for
> relation management tasks, which ideally belong within the executor,
> into its callers.
>
> My sense is that the complexity involved in carrying pruning results
> via this parallel data structure was one of the concerns Tom raised
> previously, alongside the significant pruning code refactoring that
> the earlier patch required. The latter, at least, should no longer be
> necessary given recent code improvements.

One point I forgot to mention about this approach is that we'd also
need to ensure permissions on parent relations are checked before
performing initial pruning in plancache.c, since pruning may involve
evaluating user-provided expressions. So in effect, we'd need to
invoke not just ExecDoInitialPruning(), but also
ExecCheckPermissions(), or some variant of it, prior to executor
startup. While manageable, it does add slightly to the complexity.

-- 
Thanks, Amit Langote

* Re: generic plans and "initial" pruning
@ 2025-11-12 14:17  Amit Langote <[email protected]>
  parent: Amit Langote <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: Amit Langote @ 2025-11-12 14:17 UTC (permalink / raw)
  To: Tom Lane <[email protected]>; +Cc: Tender Wang <[email protected]>; Alexander Lakhin <[email protected]>; Tomas Vondra <[email protected]>; Robert Haas <[email protected]>; Alvaro Herrera <[email protected]>; Andres Freund <[email protected]>; Daniel Gustafsson <[email protected]>; David Rowley <[email protected]>; pgsql-hackers; Thom Brown <[email protected]>

Hi,

On Tue, Jul 22, 2025 at 3:43 PM Amit Langote <[email protected]> wrote:
> On Thu, Jul 17, 2025 at 9:11 PM Amit Langote <[email protected]> wrote:
> > The refinements I described in my email above might help mitigate some
> > of those executor-related issues. However, I'm starting to wonder if
> > it's worth reconsidering our decision to handle pruning, locking, and
> > validation entirely at executor startup, which was the approach taken
> > in the reverted patch.
> >
> > The alternative approach, doing initial pruning and locking within
> > plancache.c itself (which I floated a while ago), might be worth
> > revisiting. It avoids the complications we've discussed around the
> > executor API and preserves the clear separation of concerns that
> > plancache.c provides, though it does introduce some new layering
> > concerns, which I describe further below.
> >
> > To support this, we'd need a mechanism to pass pruning results to the
> > executor alongside each PlannedStmt. For each PartitionPruneInfo in
> > the plan, that would include the corresponding PartitionPruneState and
> > the bitmapset of surviving relids determined by initial pruning. Given
> > that a CachedPlan can contain multiple PlannedStmts, this would
> > effectively be a list of pruning results, one per statement. One
> > reasonable way to handle that might be to define a parallel data
> > structure, separate from PlannedStmt, constructed by plancache.c and
> > carried via QueryDesc. The memory and lifetime management would mirror
> > how ParamListInfo is handled today, leaving the executor API unchanged
> > and avoiding intrusive changes to PlannedStmt.
> >
> > However, one potentially problematic aspect of this design is managing
> > the lifecycle of the relations referenced by PartitionPruneState.
> > Currently, partitioned table relations are opened by the executor
> > after entering ExecutorStart() and closed automatically by
> > ExecEndPlan(), allowing cleanup of pruning states implicitly. If we
> > perform initial pruning earlier, we'd need to keep these relations
> > open longer, necessitating explicit cleanup calls (e.g., a new
> > FinishPartitionPruneState()) invoked by the caller of the executor,
> > such as from ExecutorEnd() or even higher-level callers. This
> > introduces some questionable layering by shifting responsibility for
> > relation management tasks, which ideally belong within the executor,
> > into its callers.
> >
> > My sense is that the complexity involved in carrying pruning results
> > via this parallel data structure was one of the concerns Tom raised
> > previously, alongside the significant pruning code refactoring that
> > the earlier patch required. The latter, at least, should no longer be
> > necessary given recent code improvements.
>
> One point I forgot to mention about this approach is that we'd also
> need to ensure permissions on parent relations are checked before
> performing initial pruning in plancache.c, since pruning may involve
> evaluating user-provided expressions. So in effect, we'd need to
> invoke not just ExecDoInitialPruning(), but also
> ExecCheckPermissions(), or some variant of it, prior to executor
> startup. While manageable, it does add slightly to the complexity.

Sorry for the absence. I've now implemented the approach mentioned
above and split it into a series of reasonably isolated patches.

The key idea is to avoid taking unnecessary locks when reusing a
cached plan. To achieve that, we need to perform initial partition
pruning during cached plan reuse in plancache.c so that only surviving
partitions are locked. This requires some plumbing to reuse the result
of this "early" pruning during executor startup, because repeating the
pruning logic would be both inefficient and potentially inconsistent
-- what if you get different results the second time? (I don't have
proof that this can happen, but some earlier emails mention the
theoretical risk, so better to be safe.)
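
To make the flow concrete, here is a toy model of "prune first, lock only the survivors, then reuse the saved result at startup". It is a sketch under simplifying assumptions, not code from the patch: ToyPartition and the toy_* functions are hypothetical names, partitions are plain integer ranges, and a "lock" is just a flag.

```c
#include <assert.h>
#include <stdint.h>

#define NPARTS 8

/* Hypothetical range partition holding keys in [lo, hi). */
typedef struct ToyPartition
{
	int			lo;
	int			hi;
	int			locked;
} ToyPartition;

/* Fill 'parts' with contiguous ranges of width 100, starting at 0. */
static void
toy_make_partitions(ToyPartition *parts, int nparts)
{
	for (int i = 0; i < nparts; i++)
	{
		parts[i].lo = i * 100;
		parts[i].hi = (i + 1) * 100;
		parts[i].locked = 0;
	}
}

/* "Initial pruning": set one bit per partition that can contain 'key'. */
static uint32_t
toy_initial_prune(const ToyPartition *parts, int nparts, int key)
{
	uint32_t	survivors = 0;

	assert(nparts <= 32);
	for (int i = 0; i < nparts; i++)
		if (key >= parts[i].lo && key < parts[i].hi)
			survivors |= (UINT32_C(1) << i);
	return survivors;
}

/* Lock only the surviving partitions; returns the number of locks taken. */
static int
toy_lock_survivors(ToyPartition *parts, int nparts, uint32_t survivors)
{
	int			nlocked = 0;

	for (int i = 0; i < nparts; i++)
	{
		if (survivors & (UINT32_C(1) << i))
		{
			parts[i].locked = 1;
			nlocked++;
		}
	}
	return nlocked;
}

/*
 * "Executor startup": consume the saved survivor set instead of pruning
 * again, and insist that every partition it initializes is already locked.
 * Returns the number of partitions initialized.
 */
static int
toy_executor_start(const ToyPartition *parts, int nparts, uint32_t survivors)
{
	int			ninit = 0;

	for (int i = 0; i < nparts; i++)
	{
		if (survivors & (UINT32_C(1) << i))
		{
			assert(parts[i].locked);
			ninit++;
		}
	}
	return ninit;
}
```

Because toy_executor_start() takes the survivor set as an argument rather than recomputing it, the set of partitions it touches cannot drift away from the set that was locked, which is the consistency property the paragraph above is concerned with.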

So this patch introduces ExecutorPrep(), which allows executor
metadata such as initial pruning results (valid subplan indexes) and
full unpruned_relids to be computed ahead of execution and reused
later by ExecutorStart() and during QueryDesc setup in parallel
workers using the results shared by the leader. The parallel query bit
was discussed previously at [1], though I didn’t have a solution I
liked then.

This revives an idea that was last implemented in the patch (v30)
posted on Dec 16, 2022. In retrospect, I understand the hesitation Tom
might have had about the patch at the time -- its changes to enable
early pruning and then feed the results into ExecutorStart() were less
than pretty. Thanks to the initial pruning code refactoring that I
committed in Postgres 18, those changes now seem much more principled
and modular IMO.

The patch set is structured as follows:

* Refactor partition pruning initialization (0001): separates the
setup of the pruning state from its execution by introducing
ExecCreatePartitionPruneStates(). This makes the pruning logic easier
to reuse and adds flexibility to do only the setup but skip pruning in
some cases.

* Introduce ExecutorPrep infrastructure (0002): adds ExecutorPrep()
and ExecPrep as a formal way to perform executor setup ahead of
execution. This enables caching or transferring pruning results and
other metadata without triggering execution. ExecutorStart() can now
consume precomputed prep state from the EState created during
ExecutorPrep().  ExecPrepCleanup() handles cleanup when the plan is
invalidated during prep and so not executed; the state is cleaned up
in the regular ExecutorEnd() path otherwise.

* Allow parallel workers to reuse leader pruning results (0003): lets
workers reuse the leader’s initial pruning results (valid subplan
indexes) and unpruned_relids via ExecutorPrep().  This adds a
verification step to check that leader and worker decisions match,
throwing an error if they don’t -- so "reuse" is a bit of a lie.
Should that check be debug-only? (Maybe not.) As mentioned above, this
was previously discussed at [1].

* Enable pruning-aware locking in cached / generic plan reuse (0004):
extends GetCachedPlan() and CheckCachedPlan() to call ExecutorPrep()
on each PlannedStmt in the CachedPlan, locking only surviving
partitions. Adds CachedPlanPrepData to pass this through plan cache
APIs and down to execution via QueryDesc. Also reinstates the
firstResultRel locking rule added in 28317de72 but later lost due to
revert of the earlier pruning patch, to ensure correctness when all
target partitions are pruned.
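
For the leader/worker verification step in 0003, the comparison can be pictured as below. This is an illustrative sketch only: it uses 64-bit masks in place of Bitmapsets, the function names are hypothetical, and the choice of exact equality is an assumption made here, not a description of what ExecCheckInitialPruningResults() does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: is there any member of 'a' that is not in 'b'?
 * This is a one-directional test; it says nothing about members of 'b'
 * that are missing from 'a'.
 */
static int
one_way_difference(uint64_t a, uint64_t b)
{
	return (a & ~b) != 0;
}

/*
 * Hypothetical helper: compare the leader's and the worker's surviving
 * subplan sets, one mask per pruning step list.  Returns the index of the
 * first entry that differs in either direction, or -1 if all match.
 */
static int
first_pruning_mismatch(const uint64_t *leader, const uint64_t *worker,
					   size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (leader[i] != worker[i])
			return (int) i;
	return -1;
}
```

Note that a one-directional difference test does not flag the case where one side's set is a strict subset of the other's, whereas the equality test does; which behavior is wanted depends on what kind of divergence the check is meant to catch.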

This approach keeps plan caching and validation logic self-contained
in plancache.c and avoids invasive executor API changes.

Benchmark results:

echo "plan_cache_mode = force_generic_plan" >> $PGDATA/postgresql.conf
for p in 32 64 128 256 512 1024; do
  pgbench -i --partitions=$p > /dev/null 2>&1
  echo -ne "$p\t"
  pgbench -n -S -T10 -Mprepared | grep tps
done

Master

32 tps = 23841.822407 (without initial connection time)
64 tps = 21578.619816 (without initial connection time)
128 tps = 18090.500707 (without initial connection time)
256 tps = 14152.248201 (without initial connection time)
512 tps = 9432.708423 (without initial connection time)
1024 tps = 5873.696475 (without initial connection time)

Patched

32 tps = 24724.245798 (without initial connection time)
64 tps = 24858.206407 (without initial connection time)
128 tps = 24652.655269 (without initial connection time)
256 tps = 23656.756615 (without initial connection time)
512 tps = 22299.865769 (without initial connection time)
1024 tps = 21911.704317 (without initial connection time)
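
As a reading aid for the two tables, the snippet below computes the patched-to-master throughput ratio per partition count. The tps values are copied from the results above; the array and function names are hypothetical. At 1024 partitions the ratio comes out at roughly 3.7x, while at 32 partitions it is only a few percent above 1. These are single 10-second runs, so the ratios should be read as indicative rather than precise.

```c
#include <assert.h>
#include <stddef.h>

#define NRUNS 6

/* Partition counts and tps figures as reported in the benchmark above. */
static const int partition_counts[NRUNS] = {32, 64, 128, 256, 512, 1024};

static const double master_tps[NRUNS] = {
	23841.822407, 21578.619816, 18090.500707,
	14152.248201, 9432.708423, 5873.696475
};

static const double patched_tps[NRUNS] = {
	24724.245798, 24858.206407, 24652.655269,
	23656.756615, 22299.865769, 21911.704317
};

/*
 * Return patched tps divided by master tps for the run with 'nparts'
 * partitions, or -1.0 if that partition count was not benchmarked.
 */
static double
ratio_for_partitions(int nparts)
{
	for (size_t i = 0; i < NRUNS; i++)
		if (partition_counts[i] == nparts)
			return patched_tps[i] / master_tps[i];
	return -1.0;
}
```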

Comments welcome.

[1] https://www.postgresql.org/message-id/CA%2BHiwqFA%3DswkzgGK8AmXUNFtLeEXFJwFyY3E7cTxvL46aa1OTw%40mail...

--
Thanks, Amit Langote


Attachments:

  [application/octet-stream] v1-0003-Reuse-partition-pruning-results-in-parallel-worke.patch (9.0K, 2-v1-0003-Reuse-partition-pruning-results-in-parallel-worke.patch)
From d23a05d6f412dcbfd38a910331527765999d78e9 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Tue, 11 Nov 2025 22:17:47 +0900
Subject: [PATCH v1 3/4] Reuse partition pruning results in parallel workers

Pass the leader's initial partition pruning results and unpruned
relids to parallel workers and reuse them via ExecutorPrep(). This
avoids repeating pruning logic in workers, which is not only
redundant but also risks divergence due to nondeterminism in pruning
steps or parameter evaluation timing.

Introduce ExecCheckInitialPruningResults() to verify that the results
match what the worker would compute. This check helps catch
inconsistencies across leader and worker pruning logic.

While valuable on its own, this change also lays the foundation for
future optimizations where the leader may take locks only on
surviving partitions. Ensuring that workers follow identical pruning
decisions makes such selective locking safe.
---
 src/backend/executor/execParallel.c  | 67 +++++++++++++++++++++++++++-
 src/backend/executor/execPartition.c | 35 +++++++++++++++
 src/include/executor/execPartition.h |  1 +
 3 files changed, 102 insertions(+), 1 deletion(-)

diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index aedbd9566d6..f16ef184c68 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -24,6 +24,7 @@
 #include "postgres.h"
 
 #include "executor/execParallel.h"
+#include "executor/execPartition.h"
 #include "executor/executor.h"
 #include "executor/nodeAgg.h"
 #include "executor/nodeAppend.h"
@@ -65,6 +66,8 @@
 #define PARALLEL_KEY_QUERY_TEXT		UINT64CONST(0xE000000000000008)
 #define PARALLEL_KEY_JIT_INSTRUMENTATION UINT64CONST(0xE000000000000009)
 #define PARALLEL_KEY_WAL_USAGE			UINT64CONST(0xE00000000000000A)
+#define PARALLEL_KEY_PARTITION_PRUNE_RESULTS	UINT64CONST(0xE00000000000000B)
+#define PARALLEL_KEY_UNPRUNED_RELIDS	UINT64CONST(0xE00000000000000C)
 
 #define PARALLEL_TUPLE_QUEUE_SIZE		65536
 
@@ -608,12 +611,18 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	FixedParallelExecutorState *fpes;
 	char	   *pstmt_data;
 	char	   *pstmt_space;
+	char	   *part_prune_results_data;
+	char	   *part_prune_results_space;
+	char	   *unpruned_relids_data;
+	char	   *unpruned_relids_space;
 	char	   *paramlistinfo_space;
 	BufferUsage *bufusage_space;
 	WalUsage   *walusage_space;
 	SharedExecutorInstrumentation *instrumentation = NULL;
 	SharedJitInstrumentation *jit_instrumentation = NULL;
 	int			pstmt_len;
+	int			part_prune_results_len;
+	int			unpruned_relids_len;
 	int			paramlistinfo_len;
 	int			instrumentation_len = 0;
 	int			jit_instrumentation_len = 0;
@@ -642,6 +651,8 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 
 	/* Fix up and serialize plan to be sent to workers. */
 	pstmt_data = ExecSerializePlan(planstate->plan, estate);
+	part_prune_results_data = nodeToString(estate->es_part_prune_results);
+	unpruned_relids_data = nodeToString(estate->es_unpruned_relids);
 
 	/* Create a parallel context. */
 	pcxt = CreateParallelContext("postgres", "ParallelQueryMain", nworkers);
@@ -668,6 +679,16 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	shm_toc_estimate_chunk(&pcxt->estimator, pstmt_len);
 	shm_toc_estimate_keys(&pcxt->estimator, 1);
 
+	/* Estimate space for serialized part_prune_results. */
+	part_prune_results_len = strlen(part_prune_results_data) + 1;
+	shm_toc_estimate_chunk(&pcxt->estimator, part_prune_results_len);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Estimate space for serialized unpruned_relids. */
+	unpruned_relids_len = strlen(unpruned_relids_data) + 1;
+	shm_toc_estimate_chunk(&pcxt->estimator, unpruned_relids_len);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
 	/* Estimate space for serialized ParamListInfo. */
 	paramlistinfo_len = EstimateParamListSpace(estate->es_param_list_info);
 	shm_toc_estimate_chunk(&pcxt->estimator, paramlistinfo_len);
@@ -769,6 +790,16 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	memcpy(pstmt_space, pstmt_data, pstmt_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PLANNEDSTMT, pstmt_space);
 
+	/* Store serialized part_prune_results */
+	part_prune_results_space = shm_toc_allocate(pcxt->toc, part_prune_results_len);
+	memcpy(part_prune_results_space, part_prune_results_data, part_prune_results_len);
+	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARTITION_PRUNE_RESULTS, part_prune_results_space);
+
+	/* Store serialized unpruned_relids */
+	unpruned_relids_space = shm_toc_allocate(pcxt->toc, unpruned_relids_len);
+	memcpy(unpruned_relids_space, unpruned_relids_data, unpruned_relids_len);
+	shm_toc_insert(pcxt->toc, PARALLEL_KEY_UNPRUNED_RELIDS, unpruned_relids_space);
+
 	/* Store serialized ParamListInfo. */
 	paramlistinfo_space = shm_toc_allocate(pcxt->toc, paramlistinfo_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARAMLISTINFO, paramlistinfo_space);
@@ -1263,10 +1294,15 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 						 int instrument_options)
 {
 	char	   *pstmtspace;
+	char	   *part_prune_results_space;
+	char	   *unpruned_relids_space;
 	char	   *paramspace;
 	PlannedStmt *pstmt;
+	List	   *part_prune_results;
+	Bitmapset  *unpruned_relids;
 	ParamListInfo paramLI;
 	char	   *queryString;
+	ExecPrep   *prep = NULL;
 
 	/* Get the query string from shared memory */
 	queryString = shm_toc_lookup(toc, PARALLEL_KEY_QUERY_TEXT, false);
@@ -1279,9 +1315,38 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 	paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
 	paramLI = RestoreParamList(&paramspace);
 
+	/* Reconstruct leader-supplied part_prune_results and unpruned_relids. */
+	part_prune_results_space =
+		shm_toc_lookup(toc, PARALLEL_KEY_PARTITION_PRUNE_RESULTS, false);
+	part_prune_results = (List *) stringToNode(part_prune_results_space);
+	unpruned_relids_space =
+		shm_toc_lookup(toc, PARALLEL_KEY_UNPRUNED_RELIDS, false);
+	unpruned_relids = (Bitmapset *) stringToNode(unpruned_relids_space);
+
+	/*
+	 * If pruning was done in the leader, build a prep estate in the worker
+	 * and inject the leader's pruning results into it for reuse.
+	 */
+	if (pstmt->partPruneInfos)
+	{
+		prep = ExecutorPrep(pstmt, paramLI, CurrentResourceOwner, false);
+		Assert(prep->prep_estate);
+
+		prep->prep_estate->es_part_prune_results = part_prune_results;
+		prep->prep_estate->es_unpruned_relids =
+			bms_add_members(prep->prep_estate->es_unpruned_relids,
+							unpruned_relids);
+
+		/*
+		 * Verify that the pruning results passed from the leader match
+		 * what the worker would independently compute.
+		 */
+		ExecCheckInitialPruningResults(prep->prep_estate);
+	}
+
 	/* Create a QueryDesc for the query. */
 	return CreateQueryDesc(pstmt,
-						   NULL,
+						   prep,
 						   queryString,
 						   GetActiveSnapshot(), InvalidSnapshot,
 						   receiver, paramLI, NULL, instrument_options);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 187a480e508..3b450e3373f 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -1872,6 +1872,41 @@ ExecDoInitialPruning(EState *estate)
 	}
 }
 
+/*
+ * ExecCheckInitialPruningResults
+ *      Verify partition pruning results passed from the leader process.
+ *
+ * This is intended to be called during parallel worker query setup.
+ * It recomputes initial pruning results locally and compares them with
+ * those received from the leader. Any mismatch may indicate a divergence
+ * between leader and worker logic or environment.
+ */
+void
+ExecCheckInitialPruningResults(EState *estate)
+{
+	ListCell   *lc;
+	int			i;
+
+	Assert(estate->es_part_prune_results != NULL);
+	i = 0;
+	foreach(lc, estate->es_part_prune_states)
+	{
+		PartitionPruneState *prunestate = (PartitionPruneState *) lfirst(lc);
+		Bitmapset *reuse_validsubplans =
+				list_nth_node(Bitmapset, estate->es_part_prune_results, i++);
+		Bitmapset  *validsubplans = NULL;
+		Bitmapset  *validsubplan_rtis = NULL;
+
+		if (prunestate->do_initial_prune)
+			validsubplans = ExecFindMatchingSubPlans(prunestate, true,
+													 &validsubplan_rtis);
+		if (bms_nonempty_difference(validsubplans, reuse_validsubplans))
+			elog(ERROR, "different validsubplans in parallel worker");
+		if (bms_nonempty_difference(validsubplan_rtis, estate->es_unpruned_relids))
+			elog(ERROR, "different unpruned_relids in parallel worker");
+	}
+}
+
 /*
  * ExecInitPartitionExecPruning
  *		Initialize the data structures needed for runtime "exec" partition
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index ba8cc594fc9..126efd008e5 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -132,6 +132,7 @@ typedef struct PartitionPruneState
 
 extern void ExecCreatePartitionPruneStates(EState *estate);
 extern void ExecDoInitialPruning(EState *estate);
+extern void ExecCheckInitialPruningResults(EState *estate);
 extern PartitionPruneState *ExecInitPartitionExecPruning(PlanState *planstate,
 														 int n_total_subplans,
 														 int part_prune_index,
-- 
2.47.3
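
The leader/worker comparison in ExecCheckInitialPruningResults() above can be illustrated outside the executor. The following is only a minimal sketch under stated assumptions: a 64-bit mask stands in for a Bitmapset, and the names subplan_mask, mask_nonempty_difference() and first_pruning_mismatch() are hypothetical, not part of the patch. Like the patch, it checks one direction only: a mismatch is reported when the local result contains a subplan that the leader's result lacks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a Bitmapset: bit k set means subplan k survived. */
typedef uint64_t subplan_mask;

/* True if 'a' has a member that 'b' lacks (cf. bms_nonempty_difference). */
static bool
mask_nonempty_difference(subplan_mask a, subplan_mask b)
{
	return (a & ~b) != 0;
}

/*
 * Compare per-node pruning results computed locally with those received
 * from the leader.  Returns the index of the first entry where the local
 * result has a subplan missing from the leader's result, or -1 if none.
 */
static int
first_pruning_mismatch(const subplan_mask *local,
					   const subplan_mask *leader, size_t n)
{
	assert(local != NULL && leader != NULL);

	for (size_t i = 0; i < n; i++)
	{
		/* Each node is compared against its own leader entry. */
		if (mask_nonempty_difference(local[i], leader[i]))
			return (int) i;
	}
	return -1;
}
```

Each node's local result is compared with the leader's entry at the same position, so the index has to advance with the loop.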



  [application/octet-stream] v1-0001-Refactor-partition-pruning-initialization-for-cla.patch (7.7K, 3-v1-0001-Refactor-partition-pruning-initialization-for-cla.patch)
From 243d407de86b0a73b9bd8c8dbc541f630eb33747 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Tue, 11 Nov 2025 21:18:24 +0900
Subject: [PATCH v1 1/4] Refactor partition pruning initialization for clarity
 and modularity

Move the creation of PartitionPruneState structures out of
ExecDoInitialPruning() into a new ExecCreatePartitionPruneStates()
function. This separates the setup of pruning state from the execution
of initial pruning logic, making the code clearer and easier to
maintain.

Also simplify handling of unpruned relids by moving responsibility
for recording them in EState into CreatePartitionPruneState(),
avoiding the need to pass all_leafpart_rtis as an out parameter.

This refactoring allows callers to reuse the pruning setup logic
without always triggering pruning, a capability useful for future use
cases that may only need metadata initialization.
---
 src/backend/executor/execPartition.c | 70 +++++++++++++++++-----------
 src/include/executor/execPartition.h |  1 +
 2 files changed, 43 insertions(+), 28 deletions(-)
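
The split described in the commit message (set up the pruning state first, run initial pruning as a separate step) can be sketched as follows. This is a simplified illustration, not the executor code: DemoPruneInfo, DemoPruneState, demo_create_states() and demo_do_initial_pruning() are hypothetical names, and "pruning" is reduced to capping a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for PartitionPruneInfo. */
typedef struct DemoPruneInfo
{
	bool		has_initial_steps;
	int			nparts;
} DemoPruneInfo;

/* Hypothetical, simplified stand-in for PartitionPruneState. */
typedef struct DemoPruneState
{
	bool		do_initial_prune;
	int			nparts;
	int			nvalid;
} DemoPruneState;

/* Phase 1: build one state per info; no pruning happens here. */
static void
demo_create_states(const DemoPruneInfo *infos, DemoPruneState *states,
				   size_t n)
{
	assert(infos != NULL && states != NULL);

	for (size_t i = 0; i < n; i++)
	{
		states[i].do_initial_prune = infos[i].has_initial_steps;
		states[i].nparts = infos[i].nparts;
		states[i].nvalid = infos[i].nparts; /* nothing pruned yet */
	}
}

/*
 * Phase 2: prune only the states that asked for it, keeping at most 'keep'
 * partitions in each.  Returns the number of states that were pruned.
 */
static size_t
demo_do_initial_pruning(DemoPruneState *states, size_t n, int keep)
{
	size_t		npruned = 0;

	for (size_t i = 0; i < n; i++)
	{
		if (!states[i].do_initial_prune)
			continue;
		states[i].nvalid = (keep < states[i].nparts) ? keep : states[i].nparts;
		npruned++;
	}
	return npruned;
}
```

A caller that only needs the metadata can stop after the first phase, which is the reuse the commit message refers to.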

diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index aa12e9ad2ea..88b150c8d77 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -182,8 +182,7 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
 static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
 static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
 static PartitionPruneState *CreatePartitionPruneState(EState *estate,
-													  PartitionPruneInfo *pruneinfo,
-													  Bitmapset **all_leafpart_rtis);
+													  PartitionPruneInfo *pruneinfo);
 static void InitPartitionPruneContext(PartitionPruneContext *context,
 									  List *pruning_steps,
 									  PartitionDesc partdesc,
@@ -1772,6 +1771,9 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *
  * Functions:
  *
+ * ExecCreatePartitionPruneStates:
+ *		Create PartitionPruneState for all PartitionPruneInfos in the EState
+ *
  * ExecDoInitialPruning:
  *		Perform runtime "initial" pruning, if necessary, to determine the set
  *		of child subnodes that need to be initialized during ExecInitNode() for
@@ -1796,6 +1798,29 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *-------------------------------------------------------------------------
  */
 
+/*
+ * ExecCreatePartitionPruneStates
+ *
+ * Create a PartitionPruneState for each PartitionPruneInfo in the estate,
+ * and save them in estate->es_part_prune_states. This setup is required
+ * before any initial or runtime pruning can occur.
+ */
+void
+ExecCreatePartitionPruneStates(EState *estate)
+{
+	ListCell   *lc;
+
+	foreach(lc, estate->es_part_prune_infos)
+	{
+		PartitionPruneInfo *pruneinfo = lfirst_node(PartitionPruneInfo, lc);
+		PartitionPruneState *prunestate;
+
+		/* Create and save the PartitionPruneState. */
+		prunestate = CreatePartitionPruneState(estate, pruneinfo);
+		estate->es_part_prune_states = lappend(estate->es_part_prune_states,
+											   prunestate);
+	}
+}
 
 /*
  * ExecDoInitialPruning
@@ -1803,11 +1828,11 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *		of child subnodes that need to be initialized during ExecInitNode() for
  *		plan nodes that support partition pruning.
  *
- * This function iterates over each PartitionPruneInfo entry in
- * estate->es_part_prune_infos. For each entry, it creates a PartitionPruneState
- * and adds it to es_part_prune_states.  ExecInitPartitionExecPruning() accesses
+ * This function iterates over each PartitionPruneState in
+ * estate->es_part_prune_states, which must have been populated earlier by
+ * ExecCreatePartitionPruneStates(). ExecInitPartitionExecPruning() accesses
  * these states through their corresponding indexes in es_part_prune_states and
- * assign each state to the parent node's PlanState, from where it will be used
+ * assigns each state to the parent node's PlanState, from where it will be used
  * for "exec" pruning.
  *
  * If initial pruning steps exist for a PartitionPruneInfo entry, this function
@@ -1825,20 +1850,13 @@ ExecDoInitialPruning(EState *estate)
 {
 	ListCell   *lc;
 
-	foreach(lc, estate->es_part_prune_infos)
+	Assert(estate->es_part_prune_results == NULL);
+	foreach(lc, estate->es_part_prune_states)
 	{
-		PartitionPruneInfo *pruneinfo = lfirst_node(PartitionPruneInfo, lc);
-		PartitionPruneState *prunestate;
+		PartitionPruneState *prunestate = (PartitionPruneState *) lfirst(lc);
 		Bitmapset  *validsubplans = NULL;
-		Bitmapset  *all_leafpart_rtis = NULL;
 		Bitmapset  *validsubplan_rtis = NULL;
 
-		/* Create and save the PartitionPruneState. */
-		prunestate = CreatePartitionPruneState(estate, pruneinfo,
-											   &all_leafpart_rtis);
-		estate->es_part_prune_states = lappend(estate->es_part_prune_states,
-											   prunestate);
-
 		/*
 		 * Perform initial pruning steps, if any, and save the result
 		 * bitmapset or NULL as described in the header comment.
@@ -1846,8 +1864,6 @@ ExecDoInitialPruning(EState *estate)
 		if (prunestate->do_initial_prune)
 			validsubplans = ExecFindMatchingSubPlans(prunestate, true,
 													 &validsubplan_rtis);
-		else
-			validsubplan_rtis = all_leafpart_rtis;
 
 		estate->es_unpruned_relids = bms_add_members(estate->es_unpruned_relids,
 													 validsubplan_rtis);
@@ -1965,14 +1981,12 @@ ExecInitPartitionExecPruning(PlanState *planstate,
  * parent plan node's PlanState.
  *
  * If initial pruning steps are to be skipped (e.g., during EXPLAIN
- * (GENERIC_PLAN)), *all_leafpart_rtis will be populated with the RT indexes of
- * all leaf partitions whose scanning subnode is included in the parent plan
- * node's list of child plans. The caller must add these RT indexes to
- * estate->es_unpruned_relids.
+ * (GENERIC_PLAN)), the RT indexes of all leaf partitions whose scanning
+ * subnode is included in the parent plan node's list of child plans are
+ * added to estate->es_unpruned_relids.
  */
 static PartitionPruneState *
-CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo,
-						  Bitmapset **all_leafpart_rtis)
+CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo)
 {
 	PartitionPruneState *prunestate;
 	int			n_part_hierarchies;
@@ -2206,8 +2220,8 @@ CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo,
 													   pinfo->execparamids);
 
 			/*
-			 * Return all leaf partition indexes if we're skipping pruning in
-			 * the EXPLAIN (GENERIC_PLAN) case.
+			 * Add all leaf partition indexes to es_unpruned_relids if we're
+			 * skipping pruning in the EXPLAIN (GENERIC_PLAN) case.
 			 */
 			if (pinfo->initial_pruning_steps && !prunestate->do_initial_prune)
 			{
@@ -2219,8 +2233,8 @@ CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo,
 					Index		rtindex = pprune->leafpart_rti_map[part_index];
 
 					if (rtindex)
-						*all_leafpart_rtis = bms_add_member(*all_leafpart_rtis,
-															rtindex);
+						estate->es_unpruned_relids =
+							bms_add_member(estate->es_unpruned_relids, rtindex);
 				}
 			}
 
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 3b3f46aced0..ba8cc594fc9 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -130,6 +130,7 @@ typedef struct PartitionPruneState
 	PartitionPruningData *partprunedata[FLEXIBLE_ARRAY_MEMBER];
 } PartitionPruneState;
 
+extern void ExecCreatePartitionPruneStates(EState *estate);
 extern void ExecDoInitialPruning(EState *estate);
 extern PartitionPruneState *ExecInitPartitionExecPruning(PlanState *planstate,
 														 int n_total_subplans,
-- 
2.47.3



  [application/octet-stream] v1-0004-Use-pruning-aware-locking-in-cached-plans.patch (25.0K, 4-v1-0004-Use-pruning-aware-locking-in-cached-plans.patch)
From ddffccd68513bb0e68d6cf75810cf64cf9a4d757 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Tue, 11 Nov 2025 22:30:52 +0900
Subject: [PATCH v1 4/4] Use pruning-aware locking in cached plans

Extend GetCachedPlan() to perform ExecutorPrep() on each planned
statement, capturing unpruned relids and initial pruning results.
Use this data to acquire execution locks only on surviving partitions,
avoiding unnecessary locking of pruned tables even when using cached
plans.

Introduce CachedPlanPrepData to carry ExecutorPrep results
through the plan caching layer. Adjust call sites in SPI,
functions, portals, and EXPLAIN to propagate this data.

This ensures pruning decisions made during initial pruning are
consistently reused without redoing pruning logic in executor paths
like parallel workers. It also lays the groundwork for
pruning-dependent lock behavior during plan reuse.

To maintain correctness when all target partitions are pruned, also
reinstate the firstResultRel locking behavior lost in commit
28317de72. That commit required the first ModifyTable target to
remain initialized for executor assumptions to hold. We now
explicitly track these relids in PlannerGlobal and PlannedStmt so they
are locked even if pruned, preserving that rule across cached plan
reuse.
---
 src/backend/commands/prepare.c         |  15 +-
 src/backend/executor/functions.c       |  14 +-
 src/backend/executor/nodeModifyTable.c |   4 +-
 src/backend/executor/spi.c             |  22 ++-
 src/backend/optimizer/plan/planner.c   |   1 +
 src/backend/optimizer/plan/setrefs.c   |   3 +
 src/backend/tcop/postgres.c            |   7 +-
 src/backend/utils/cache/plancache.c    | 223 ++++++++++++++++++++++++-
 src/include/nodes/pathnodes.h          |   3 +
 src/include/nodes/plannodes.h          |   7 +
 src/include/utils/plancache.h          |  23 ++-
 11 files changed, 299 insertions(+), 23 deletions(-)
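
The set of relations locked beyond unprunableRelids, as described in the commit message, can be sketched as a small set computation. This is an illustration under stated assumptions: a 64-bit mask (bit k = RT index k) stands in for a Bitmapset, and relid_mask and demo_extra_lock_relids() are hypothetical names that do not appear in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a Bitmapset of RT indexes (bit k = RT index k). */
typedef uint64_t relid_mask;

/*
 * Compute the relations to lock in addition to the unprunable ones:
 * the relations that survived initial pruning, minus those already locked
 * as unprunable, plus the first result relation of each ModifyTable node
 * (which must be locked even if it was pruned).
 */
static relid_mask
demo_extra_lock_relids(relid_mask unpruned, relid_mask unprunable,
					   const int *first_result_rels, size_t n)
{
	relid_mask	result = unpruned & ~unprunable;

	for (size_t i = 0; i < n; i++)
	{
		int			rti = first_result_rels[i];

		/* The 64-bit mask limits this sketch to RT indexes 0..63. */
		assert(rti >= 0 && rti < 64);
		result |= (relid_mask) 1 << rti;
	}
	return result;
}
```

For example, with RT indexes {1, 2, 5} unpruned, {1, 2} unprunable, and a first result relation at RT index 4, the extra lock set is {4, 5}.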

diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index afd449c73ba..10fdff403b9 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -154,6 +154,7 @@ ExecuteQuery(ParseState *pstate,
 {
 	PreparedStatement *entry;
 	CachedPlan *cplan;
+	CachedPlanPrepData cprep = {0};
 	List	   *plan_list;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
@@ -193,7 +194,9 @@ ExecuteQuery(ParseState *pstate,
 									   entry->plansource->query_string);
 
 	/* Replan if needed, and increment plan refcount for portal */
-	cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL);
+	cprep.context = portal->portalContext;
+	cprep.owner = portal->resowner;
+	cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL, &cprep);
 	plan_list = cplan->stmt_list;
 
 	/*
@@ -205,7 +208,7 @@ ExecuteQuery(ParseState *pstate,
 					  query_string,
 					  entry->plansource->commandTag,
 					  plan_list,
-					  NIL,
+					  cprep.prep_list,
 					  cplan);
 
 	/*
@@ -575,6 +578,7 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	PreparedStatement *entry;
 	const char *query_string;
 	CachedPlan *cplan;
+	CachedPlanPrepData cprep = {0};
 	List	   *plan_list;
 	List	   *prep_list;
 	ListCell   *p;
@@ -633,8 +637,11 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	}
 
 	/* Replan if needed, and acquire a transient refcount */
+	cprep.context = CurrentMemoryContext;
+	cprep.owner = CurrentResourceOwner;
 	cplan = GetCachedPlan(entry->plansource, paramLI,
-						  CurrentResourceOwner, pstate->p_queryEnv);
+						  CurrentResourceOwner, pstate->p_queryEnv,
+						  &cprep);
 
 	INSTR_TIME_SET_CURRENT(planduration);
 	INSTR_TIME_SUBTRACT(planduration, planstart);
@@ -653,7 +660,7 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	}
 
 	plan_list = cplan->stmt_list;
-	prep_list = NIL;
+	prep_list = cprep.prep_list;
 
 	/* Explain each query */
 	i = 0;
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index 633310c5f5b..8fc22fbd283 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -72,6 +72,7 @@ typedef struct execution_state
 	bool		setsResult;		/* true if this query produces func's result */
 	bool		lazyEval;		/* true if should fetch one row at a time */
 	PlannedStmt *stmt;			/* plan for this query */
+	ExecPrep   *prep;			/* ExecutorPrep() output for this plan */
 	QueryDesc  *qd;				/* null unless status == RUN */
 } execution_state;
 
@@ -657,6 +658,8 @@ init_execution_state(SQLFunctionCachePtr fcache)
 	execution_state *lasttages = NULL;
 	int			nstmts;
 	ListCell   *lc;
+	CachedPlanPrepData cprep = {0};
+	int			i;
 
 	/*
 	 * Clean up after previous query, if there was one.
@@ -695,10 +698,13 @@ init_execution_state(SQLFunctionCachePtr fcache)
 	 * CurrentResourceOwner will be the same when ShutdownSQLFunction runs.)
 	 */
 	fcache->cowner = CurrentResourceOwner;
+	cprep.context = fcache->fcontext;
+	cprep.owner = fcache->cowner;
 	fcache->cplan = GetCachedPlan(plansource,
 								  fcache->paramLI,
 								  fcache->cowner,
-								  NULL);
+								  NULL,
+								  &cprep);
 
 	/*
 	 * If necessary, make esarray[] bigger to hold the needed state.
@@ -719,9 +725,12 @@ init_execution_state(SQLFunctionCachePtr fcache)
 	/*
 	 * Build execution_state list to match the number of contained plans.
 	 */
+	i = 0;
 	foreach(lc, fcache->cplan->stmt_list)
 	{
 		PlannedStmt *stmt = lfirst_node(PlannedStmt, lc);
+		ExecPrep *prep = cprep.prep_list ? list_nth(cprep.prep_list, i++) :
+			NULL;
 		execution_state *newes;
 
 		/*
@@ -763,6 +772,7 @@ init_execution_state(SQLFunctionCachePtr fcache)
 		newes->setsResult = false;	/* might change below */
 		newes->lazyEval = false;	/* might change below */
 		newes->stmt = stmt;
+		newes->prep = prep;
 		newes->qd = NULL;
 
 		if (stmt->canSetTag)
@@ -1362,7 +1372,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
 		dest = None_Receiver;
 
 	es->qd = CreateQueryDesc(es->stmt,
-							 NULL,
+							 es->prep,
 							 fcache->func->src,
 							 GetActiveSnapshot(),
 							 InvalidSnapshot,
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 4c5647ac38a..c5812612f8d 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -4648,8 +4648,8 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
 	 * as a reference for building the ResultRelInfo of the target partition.
 	 * In either case, it doesn't matter which result relation is kept, so we
 	 * just keep the first one, if all others have been pruned.  See also,
-	 * ExecDoInitialPruning(), which ensures that this first result relation
-	 * has been locked.
+	 * AcquireExecutorLocksUnpruned(), which ensures that this first result
+	 * relation has been locked.
 	 */
 	i = 0;
 	foreach(l, node->resultRelations)
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 7a3cb944d6f..72d52baff4b 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1579,6 +1579,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 {
 	CachedPlanSource *plansource;
 	CachedPlan *cplan;
+	CachedPlanPrepData cprep = {0};
 	List	   *stmt_list;
 	char	   *query_string;
 	Snapshot	snapshot;
@@ -1659,7 +1660,10 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 	 */
 
 	/* Replan if needed, and increment plan refcount for portal */
-	cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv);
+	cprep.context = portal->portalContext;
+	cprep.owner = portal->resowner;
+	cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv,
+						  &cprep);
 	stmt_list = cplan->stmt_list;
 
 	if (!plan->saved)
@@ -1685,7 +1689,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 					  query_string,
 					  plansource->commandTag,
 					  stmt_list,
-					  NIL,
+					  cprep.prep_list,	/* XXX - need copy? */
 					  cplan);
 
 	/*
@@ -2078,6 +2082,7 @@ SPI_plan_get_cached_plan(SPIPlanPtr plan)
 {
 	CachedPlanSource *plansource;
 	CachedPlan *cplan;
+	CachedPlanPrepData cprep = {0};
 	SPICallbackArg spicallbackarg;
 	ErrorContextCallback spierrcontext;
 
@@ -2101,9 +2106,12 @@ SPI_plan_get_cached_plan(SPIPlanPtr plan)
 	error_context_stack = &spierrcontext;
 
 	/* Get the generic plan for the query */
+	cprep.context = CurrentMemoryContext;
+	cprep.owner = CurrentResourceOwner;
 	cplan = GetCachedPlan(plansource, NULL,
 						  plan->saved ? CurrentResourceOwner : NULL,
-						  _SPI_current->queryEnv);
+						  _SPI_current->queryEnv,
+						  &cprep);
 	Assert(cplan == plansource->gplan);
 
 	/* Pop the error context stack */
@@ -2501,6 +2509,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 		CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
 		List	   *stmt_list;
 		ListCell   *lc2;
+		CachedPlanPrepData cprep = {0};
 		List	   *prep_list;
 		int			i;
 
@@ -2577,11 +2586,14 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 		 * Replan if needed, and increment plan refcount.  If it's a saved
 		 * plan, the refcount must be backed by the plan_owner.
 		 */
+		cprep.context = CurrentMemoryContext;
+		cprep.owner = CurrentResourceOwner;
 		cplan = GetCachedPlan(plansource, options->params,
-							  plan_owner, _SPI_current->queryEnv);
+							  plan_owner, _SPI_current->queryEnv,
+							  &cprep);
 
 		stmt_list = cplan->stmt_list;
-		prep_list = NIL;
+		prep_list = cprep.prep_list;
 
 		/*
 		 * If we weren't given a specific snapshot to use, and the statement
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index c4fd646b999..4c76e78c1da 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -608,6 +608,7 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 											  glob->prunableRelids);
 	result->permInfos = glob->finalrteperminfos;
 	result->resultRelations = glob->resultRelations;
+	result->firstResultRels = glob->firstResultRels;
 	result->appendRelations = glob->appendRelations;
 	result->subplans = glob->subplans;
 	result->rewindPlanIDs = glob->rewindPlanIDs;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index ccdc9bc264a..229b39060ae 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -1274,6 +1274,9 @@ set_plan_refs(PlannerInfo *root, Plan *plan, int rtoffset)
 						lappend_int(root->glob->resultRelations,
 									splan->rootRelation);
 				}
+				root->glob->firstResultRels =
+					lappend_int(root->glob->firstResultRels,
+								linitial_int(splan->resultRelations));
 			}
 			break;
 		case T_Append:
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index d3964a12a14..82972beee70 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1639,6 +1639,7 @@ exec_bind_message(StringInfo input_message)
 	int16	   *rformats = NULL;
 	CachedPlanSource *psrc;
 	CachedPlan *cplan;
+	CachedPlanPrepData cprep = {0};
 	Portal		portal;
 	char	   *query_string;
 	char	   *saved_stmt_name;
@@ -2021,7 +2022,9 @@ exec_bind_message(StringInfo input_message)
 	 * will be generated in MessageContext.  The plan refcount will be
 	 * assigned to the Portal, so it will be released at portal destruction.
 	 */
-	cplan = GetCachedPlan(psrc, params, NULL, NULL);
+	cprep.context = portal->portalContext;
+	cprep.owner = portal->resowner;
+	cplan = GetCachedPlan(psrc, params, NULL, NULL, &cprep);
 
 	/*
 	 * Now we can define the portal.
@@ -2034,7 +2037,7 @@ exec_bind_message(StringInfo input_message)
 					  query_string,
 					  psrc->commandTag,
 					  cplan->stmt_list,
-					  NIL,
+					  cprep.prep_list,
 					  cplan);
 
 	/* Portal is defined, set the plan ID based on its contents. */
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 6661d2c6b73..ebcf601fce7 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -93,7 +93,7 @@ static bool StmtPlanRequiresRevalidation(CachedPlanSource *plansource);
 static bool BuildingPlanRequiresSnapshot(CachedPlanSource *plansource);
 static List *RevalidateCachedQuery(CachedPlanSource *plansource,
 								   QueryEnvironment *queryEnv);
-static bool CheckCachedPlan(CachedPlanSource *plansource);
+static bool PrepAndCheckCachedPlan(CachedPlanSource *plansource, CachedPlanPrepData *cprep);
 static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 								   ParamListInfo boundParams, QueryEnvironment *queryEnv);
 static bool choose_custom_plan(CachedPlanSource *plansource,
@@ -101,6 +101,8 @@ static bool choose_custom_plan(CachedPlanSource *plansource,
 static double cached_plan_cost(CachedPlan *plan, bool include_planner);
 static Query *QueryListGetPrimaryStmt(List *stmts);
 static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static void AcquireExecutorLocksUnpruned(List *stmt_list, bool acquire,
+										 CachedPlanPrepData *cprep);
 static void AcquirePlannerLocks(List *stmt_list, bool acquire);
 static void ScanQueryForLocks(Query *parsetree, bool acquire);
 static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -137,6 +139,26 @@ ResourceOwnerForgetPlanCacheRef(ResourceOwner owner, CachedPlan *plan)
 /* GUC parameter */
 int			plan_cache_mode = PLAN_CACHE_MODE_AUTO;
 
+/*
+ * Lock acquisition policy for execution locks.
+ *
+ * LOCK_ALL acquires locks on all relations mentioned in the plan,
+ * reproducing the behavior of AcquireExecutorLocks().
+ *
+ * LOCK_UNPRUNED restricts locking to only the unpruned relations. That
+ * includes those mentioned in PlannedStmt.unprunableRelids and the leaf
+ * partitions remaining after performing initial pruning.
+ */
+typedef enum LockPolicy
+{
+	LOCK_ALL,
+	LOCK_UNPRUNED,
+} LockPolicy;
+
+static void AcquireExecutorLocksWithPolicy(List *stmt_list,
+										   LockPolicy policy, bool acquire,
+										   CachedPlanPrepData *cprep);
+
 /*
  * InitPlanCache: initialize module during InitPostgres.
  *
@@ -938,7 +960,12 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
 }
 
 /*
- * CheckCachedPlan: see if the CachedPlanSource's generic plan is valid.
+ * PrepAndCheckCachedPlan: see if the CachedPlanSource's generic plan is valid.
+ *
+ * If 'cprep' is not NULL, ExecutorPrep() is applied to each PlannedStmt to
+ * compute the set of partitions that survive initial runtime pruning in order
+ * to only lock them.  The resulting ExecPrep structures are saved in cprep for
+ * later reuse by ExecutorStart().
  *
  * Caller must have already called RevalidateCachedQuery to verify that the
  * querytree is up to date.
@@ -947,7 +974,7 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
  * (We must do this for the "true" result to be race-condition-free.)
  */
 static bool
-CheckCachedPlan(CachedPlanSource *plansource)
+PrepAndCheckCachedPlan(CachedPlanSource *plansource, CachedPlanPrepData *cprep)
 {
 	CachedPlan *plan = plansource->gplan;
 
@@ -975,13 +1002,15 @@ CheckCachedPlan(CachedPlanSource *plansource)
 	 */
 	if (plan->is_valid)
 	{
+		LockPolicy policy = !cprep ? LOCK_ALL : LOCK_UNPRUNED;
+
 		/*
 		 * Plan must have positive refcount because it is referenced by
 		 * plansource; so no need to fear it disappears under us here.
 		 */
 		Assert(plan->refcount > 0);
 
-		AcquireExecutorLocks(plan->stmt_list, true);
+		AcquireExecutorLocksWithPolicy(plan->stmt_list, policy, true, cprep);
 
 		/*
 		 * If plan was transient, check to see if TransactionXmin has
@@ -1003,7 +1032,7 @@ CheckCachedPlan(CachedPlanSource *plansource)
 		}
 
 		/* Oops, the race case happened.  Release useless locks. */
-		AcquireExecutorLocks(plan->stmt_list, false);
+		AcquireExecutorLocksWithPolicy(plan->stmt_list, policy, false, cprep);
 	}
 
 	/*
@@ -1283,6 +1312,10 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
  * On return, the plan is valid and we have sufficient locks to begin
  * execution.
  *
+ * If 'cprep' is not NULL and a generic plan is reused, the function prepares
+ * each PlannedStmt via ExecutorPrep() and stores the results in
+ * cprep->prep_list.  These are intended to be passed later to ExecutorStart().
+ *
  * On return, the refcount of the plan has been incremented; a later
  * ReleaseCachedPlan() call is expected.  If "owner" is not NULL then
  * the refcount has been reported to that ResourceOwner (note that this
@@ -1293,7 +1326,8 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
  */
 CachedPlan *
 GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
-			  ResourceOwner owner, QueryEnvironment *queryEnv)
+			  ResourceOwner owner, QueryEnvironment *queryEnv,
+			  CachedPlanPrepData *cprep)
 {
 	CachedPlan *plan = NULL;
 	List	   *qlist;
@@ -1315,7 +1349,9 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 
 	if (!customplan)
 	{
-		if (CheckCachedPlan(plansource))
+		if (cprep)
+			cprep->params = boundParams;
+		if (PrepAndCheckCachedPlan(plansource, cprep))
 		{
 			/* We want a generic plan, and we already have a valid one */
 			plan = plansource->gplan;
@@ -1902,6 +1938,32 @@ QueryListGetPrimaryStmt(List *stmts)
 	return NULL;
 }
 
+/*
+ * AcquireExecutorLocksWithPolicy
+ *      Acquire or release execution locks for a plan according to
+ *      the specified policy.
+ *
+ * The policy determines whether all relations or only unpruned ones are locked.
+ * For LOCK_UNPRUNED, ExecutorPrep is invoked to identify surviving partitions
+ * and its result is populated in cprep.
+ */
+static void
+AcquireExecutorLocksWithPolicy(List *stmt_list, LockPolicy policy, bool acquire,
+							   CachedPlanPrepData *cprep)
+{
+	switch (policy)
+	{
+		case LOCK_ALL:
+			AcquireExecutorLocks(stmt_list, acquire);
+			break;
+		case LOCK_UNPRUNED:
+			AcquireExecutorLocksUnpruned(stmt_list, acquire, cprep);
+			break;
+		default:
+			elog(ERROR, "invalid LockPolicy");
+	}
+}
+
 /*
  * AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
  * or release them if acquire is false.
@@ -1954,6 +2016,153 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 	}
 }
 
+/*
+ * LockRelids
+ * 		Acquire or release locks on the specified relids, which reference
+ * 		entries in the provided range table.
+ *
+ * Helper for AcquireExecutorLocksUnpruned().
+ */
+static void
+LockRelids(List *rtable, Bitmapset *relids, bool acquire)
+{
+	int	rtindex = -1;
+
+	while ((rtindex = bms_next_member(relids, rtindex)) >= 0)
+	{
+		RangeTblEntry *rte = list_nth_node(RangeTblEntry, rtable, rtindex - 1);
+
+		Assert(rte->rtekind == RTE_RELATION ||
+			   (rte->rtekind == RTE_SUBQUERY && OidIsValid(rte->relid)));
+
+		/*
+		 * Acquire the appropriate type of lock on each relation OID. Note
+		 * that we don't actually try to open the rel, and hence will not
+		 * fail if it's been dropped entirely --- we'll just transiently
+		 * acquire a non-conflicting lock.
+		 */
+		if (acquire)
+			LockRelationOid(rte->relid, rte->rellockmode);
+		else
+			UnlockRelationOid(rte->relid, rte->rellockmode);
+	}
+}
+
+/*
+ * AcquireExecutorLocksUnpruned
+ *		Acquire or release execution locks for only unpruned relations
+ *		referenced by the given PlannedStmts.
+ *
+ * This function uses ExecutorPrep to identify which partitions survive
+ * initial runtime pruning and locks only those, along with any unprunable
+ * base relations. During acquire, the resulting ExecPrep objects are stored
+ * in cprep->prep_list for later reuse. During release, those same ExecPrep
+ * objects are used to identify what to unlock.
+ *
+ * Unlike AcquireExecutorLocks(), which locks all relations listed in the
+ * PlannedStmt's rtable (LOCK_ALL policy), this function selectively locks
+ * only those rels that may be referenced during execution.
+ *
+ * prep_list is extended during acquire and must match stmt_list during
+ * release. Memory allocation happens in cprep->context.
+ */
+static void
+AcquireExecutorLocksUnpruned(List *stmt_list, bool acquire,
+							 CachedPlanPrepData *cprep)
+{
+	MemoryContext oldcontext;
+	ListCell   *lc1;
+	List	   *prep_list;
+	int			i;
+
+	Assert(cprep);
+	oldcontext = MemoryContextSwitchTo(cprep->context);
+	/*
+	 * When releasing locks, use the ExecPrep list (if any) created during
+	 * acquisition to determine which relids to unlock. The list must match
+	 * the PlannedStmt list one-to-one.
+	 */
+	prep_list = cprep->prep_list;
+	Assert(acquire || list_length(prep_list) == list_length(stmt_list));
+
+	i = 0;
+	foreach(lc1, stmt_list)
+	{
+		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
+		ExecPrep *prep;
+
+		if (plannedstmt->commandType == CMD_UTILITY)
+		{
+			/* Same as AcquireExecutorLocks(). */
+			Query	   *query = UtilityContainsQuery(plannedstmt->utilityStmt);
+
+			if (query)
+				ScanQueryForLocks(query, acquire);
+
+			/* Keep the list one-to-one with stmt_list. */
+			if (acquire)
+				cprep->prep_list = lappend(cprep->prep_list, NULL);
+			continue;
+		}
+
+		/*
+		 * Lock tables mentioned in the original query and other unprunable
+		 * relations that were added to the plan via inheritance expansion.
+		 */
+		LockRelids(plannedstmt->rtable, plannedstmt->unprunableRelids, acquire);
+
+		/* Lock partitions surviving runtime initial pruning. */
+		if (acquire)
+		{
+			prep = ExecutorPrep(plannedstmt, cprep->params, cprep->owner, true);
+			Assert(prep || plannedstmt->partPruneInfos == NULL);
+			cprep->prep_list = lappend(cprep->prep_list, prep);
+		}
+		else
+			prep = list_nth(prep_list, foreach_current_index(lc1));
+
+		Assert(prep == NULL || prep->prep_estate);
+		if (prep)
+		{
+			EState *prep_estate = prep->prep_estate;
+
+			/*
+			 * es_unpruned_relids includes plannedstmt->unprunableRelids,
+			 * which we've already locked. Filter them out to avoid double-locking.
+			 */
+			Bitmapset *lock_relids = bms_difference(prep_estate->es_unpruned_relids,
+													plannedstmt->unprunableRelids);
+
+			/*
+			 * firstResultRels may contain pruned partitions that must still be
+			 * locked to satisfy executor assumptions (see comments in
+			 * ExecInitModifyTable()). Ensure they are included here.
+			 */
+			if (plannedstmt->resultRelations)
+			{
+				ListCell *lc2;
+
+				foreach(lc2, plannedstmt->firstResultRels)
+				{
+					Index       firstResultRel = lfirst_int(lc2);
+
+					if (!bms_is_member(firstResultRel, lock_relids))
+						lock_relids = bms_add_member(lock_relids, firstResultRel);
+				}
+			}
+
+			LockRelids(plannedstmt->rtable, lock_relids, acquire);
+			bms_free(lock_relids);
+		}
+
+		/* Clean up prep if releasing locks. */
+		if (!acquire)
+			ExecPrepCleanup(prep);
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+}
+
 /*
  * AcquirePlannerLocks: acquire locks needed for planning of a querytree list;
  * or release them if acquire is false.
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 30d889b54c5..6fb86dc05f6 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -141,6 +141,9 @@ typedef struct PlannerGlobal
 	/* "flat" list of integer RT indexes */
 	List	   *resultRelations;
 
+	/* "flat" list of integer RT indexes (one per ModifyTable node) */
+	List	   *firstResultRels;
+
 	/* "flat" list of AppendRelInfos */
 	List	   *appendRelations;
 
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index c4393a94321..42b51299ece 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -123,6 +123,13 @@ typedef struct PlannedStmt
 	/* integer list of RT indexes, or NIL */
 	List	   *resultRelations;
 
+	/*
+	 * rtable indexes of first target relation in each ModifyTable node in the
+	 * plan for INSERT/UPDATE/DELETE/MERGE
+	 */
+	/* integer list of RT indexes, or NIL */
+	List	   *firstResultRels;
+
 	/* list of AppendRelInfo nodes */
 	List	   *appendRelations;
 
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index a82b66d4bc2..59f0b0fc4a4 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -197,6 +197,26 @@ typedef struct CachedExpression
 } CachedExpression;
 
 
+/*
+ * CachedPlanPrepData
+ *	Carries ExecutorPrep results for each PlannedStmt in a CachedPlan,
+ *	along with context and owner information needed to allocate them.
+ *
+ * Populated by GetCachedPlan() when ExecutorPrep is run on a generic plan.
+ *
+ * prep_list: results from ExecutorPrep(), one per PlannedStmt
+ * params: parameters that may be used during ExecutorPrep (e.g., pruning)
+ * context: memory context to allocate ExecutorPrep results in
+ * owner: resource owner to associate ExecutorPrep resources with
+ */
+typedef struct CachedPlanPrepData
+{
+	List   *prep_list;		/* List of ExecPrep */
+	ParamListInfo params;
+	MemoryContext context;
+	ResourceOwner owner;
+} CachedPlanPrepData;
+
 extern void InitPlanCache(void);
 extern void ResetPlanCache(void);
 
@@ -240,7 +260,8 @@ extern List *CachedPlanGetTargetList(CachedPlanSource *plansource,
 extern CachedPlan *GetCachedPlan(CachedPlanSource *plansource,
 								 ParamListInfo boundParams,
 								 ResourceOwner owner,
-								 QueryEnvironment *queryEnv);
+								 QueryEnvironment *queryEnv,
+								 CachedPlanPrepData *cprep);
 extern void ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner);
 
 extern bool CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
-- 
2.47.3
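
For readers following the locking logic in AcquireExecutorLocksUnpruned()
above, here is a small standalone model of how the second batch of lock
targets is derived once the unprunable relations have been locked. It is
only a sketch under stated assumptions: RelidSet, relid_set_add(),
relid_set_has() and second_lock_batch() are made-up names, a 64-bit mask
stands in for Bitmapset, and none of this is PostgreSQL API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for Bitmapset: bit N set means RT index N is a
 * member.  Only RT indexes 0..63 can be represented in this model.
 */
typedef uint64_t RelidSet;

static RelidSet
relid_set_add(RelidSet set, int rti)
{
	assert(rti >= 0 && rti < 64);
	return set | (UINT64_C(1) << rti);
}

static bool
relid_set_has(RelidSet set, int rti)
{
	assert(rti >= 0 && rti < 64);
	return ((set >> rti) & 1) != 0;
}

/*
 * Model of the second locking step: start from the relids that survived
 * initial pruning, drop the unprunable ones (assumed to be locked
 * already), then add the first result relation of each ModifyTable node
 * whether or not it was pruned.
 */
static RelidSet
second_lock_batch(RelidSet unpruned, RelidSet unprunable,
				  const int *first_result_rels, int n_first_result_rels)
{
	RelidSet	lock = unpruned & ~unprunable;	/* set difference */

	for (int i = 0; i < n_first_result_rels; i++)
		lock = relid_set_add(lock, first_result_rels[i]);

	return lock;
}
```

With unprunable = {1}, unpruned = {1, 3, 5} and a single first result
relation 4, the second batch comes out as {3, 4, 5}: relation 1 is skipped
because it was covered by the first step, and relation 4 is locked even
though pruning eliminated it.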



  [application/octet-stream] v1-0002-Introduce-ExecutorPrep-infrastructure-for-pre-exe.patch (29.9K, 5-v1-0002-Introduce-ExecutorPrep-infrastructure-for-pre-exe.patch)
  download | inline diff:
From e9689618f2889f224eb62e9ff4fb5251285ecdb3 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Tue, 11 Nov 2025 21:47:46 +0900
Subject: [PATCH v1 2/4] Introduce ExecutorPrep infrastructure for
 pre-execution setup

Add ExecutorPrep() and ExecPrep to support setting up executor
metadata like range table initialization and partition pruning
ahead of actual execution. This enables execution paths to
perform setup independently of running the plan.

For example, plan validation can compute and consume this
metadata without executing the query. Parallel query workers
can receive pre-initialized state from the leader and pass it
to ExecutorStart, avoiding redundant setup.

ExecutorStart now accepts a prep-estate from QueryDesc to skip
repeating initialization. The ExecPrep wrapper manages cleanup
and signals ownership of the estate. PrepPlan() encapsulates
shared setup logic.

Call sites, including Portal, SPI, and EXPLAIN, are updated to
support passing down the prep data. These changes are mostly
mechanical and clarify the separation between setup and actual
execution.
---
 src/backend/commands/copyto.c        |   2 +-
 src/backend/commands/createas.c      |   2 +-
 src/backend/commands/explain.c       |   7 +-
 src/backend/commands/extension.c     |   1 +
 src/backend/commands/matview.c       |   2 +-
 src/backend/commands/portalcmds.c    |   1 +
 src/backend/commands/prepare.c       |  11 +-
 src/backend/executor/README          |   9 +-
 src/backend/executor/execMain.c      | 192 +++++++++++++++++++++++----
 src/backend/executor/execParallel.c  |   1 +
 src/backend/executor/execPartition.c |   3 +
 src/backend/executor/functions.c     |   1 +
 src/backend/executor/spi.c           |  10 ++
 src/backend/tcop/postgres.c          |   2 +
 src/backend/tcop/pquery.c            |  27 +++-
 src/backend/utils/mmgr/portalmem.c   |   2 +
 src/include/commands/explain.h       |   3 +-
 src/include/executor/execdesc.h      |   3 +-
 src/include/executor/executor.h      |  10 ++
 src/include/nodes/execnodes.h        |  55 ++++++++
 src/include/utils/portal.h           |   2 +
 21 files changed, 308 insertions(+), 38 deletions(-)
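
The EState hand-off described in the commit message (ExecutorStart
adopting the prep's EState, and ExecPrepCleanup() then skipping it) can be
modelled in isolation roughly as follows. This is a sketch, not the
patch's code: every Mock*/mock_* name is hypothetical, and calloc/free
stand in for memory contexts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for EState and ExecPrep; not the real structs. */
typedef struct MockEState
{
	int			dummy;
} MockEState;

typedef struct MockPrep
{
	MockEState *estate;
	bool		owns_estate;	/* true until an executor adopts estate */
} MockPrep;

static int	mock_estates_freed = 0; /* counts frees, for demonstration only */

static void *
mock_alloc(size_t size)
{
	void	   *p = calloc(1, size);

	if (p == NULL)
		abort();
	return p;
}

/* Model of prep creation: the wrapper starts out owning its state. */
static MockPrep *
mock_prep_create(void)
{
	MockPrep   *prep = mock_alloc(sizeof(MockPrep));

	prep->estate = mock_alloc(sizeof(MockEState));
	prep->owns_estate = true;
	return prep;
}

/* Model of executor startup: adopt the prepared state if there is one. */
static MockEState *
mock_executor_start(MockPrep *prep)
{
	if (prep != NULL)
	{
		prep->owns_estate = false;	/* executor frees it from now on */
		return prep->estate;
	}
	return mock_alloc(sizeof(MockEState));
}

/* Model of executor shutdown: frees whatever state the executor holds. */
static void
mock_executor_end(MockEState *estate)
{
	free(estate);
	mock_estates_freed++;
}

/*
 * Model of prep cleanup: free the state only if nobody adopted it.  The
 * wrapper itself is freed here because this model has no memory context
 * to do that for us.
 */
static void
mock_prep_cleanup(MockPrep *prep)
{
	if (prep == NULL)
		return;
	if (prep->estate != NULL && prep->owns_estate)
	{
		free(prep->estate);
		mock_estates_freed++;
	}
	free(prep);
}
```

The point of the flag is that the state is freed exactly once on either
path: by the executor when it adopted the state, or by the cleanup routine
when the prep was never handed to an executor.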

diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index cef452584e5..5efbb0949c2 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -870,7 +870,7 @@ BeginCopyTo(ParseState *pstate,
 		((DR_copy *) dest)->cstate = cstate;
 
 		/* Create a QueryDesc requesting no output */
-		cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+		cstate->queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
 											dest, NULL, NULL, 0);
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 1ccc2e55c64..9eabe4920cd 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -334,7 +334,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
 		UpdateActiveSnapshotCommandId();
 
 		/* Create a QueryDesc, redirecting output to our tuple receiver */
-		queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+		queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
 									GetActiveSnapshot(), InvalidSnapshot,
 									dest, params, queryEnv, 0);
 
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 7e699f8595e..d6ab3697dd9 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -370,7 +370,7 @@ standard_ExplainOneQuery(Query *query, int cursorOptions,
 	}
 
 	/* run it (if needed) and produce output */
-	ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+	ExplainOnePlan(plan, NULL, into, es, queryString, params, queryEnv,
 				   &planduration, (es->buffers ? &bufusage : NULL),
 				   es->memory ? &mem_counters : NULL);
 }
@@ -492,7 +492,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
  * to call it.
  */
 void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, ExecPrep *prep,
+			   IntoClause *into, ExplainState *es,
 			   const char *queryString, ParamListInfo params,
 			   QueryEnvironment *queryEnv, const instr_time *planduration,
 			   const BufferUsage *bufusage,
@@ -548,7 +549,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
 		dest = None_Receiver;
 
 	/* Create a QueryDesc for the query */
-	queryDesc = CreateQueryDesc(plannedstmt, queryString,
+	queryDesc = CreateQueryDesc(plannedstmt, prep, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, params, queryEnv, instrument_option);
 
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index 93ef1ad106f..3cca6d45ec1 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -993,6 +993,7 @@ execute_sql_string(const char *sql, const char *filename)
 				QueryDesc  *qdesc;
 
 				qdesc = CreateQueryDesc(stmt,
+										NULL,
 										sql,
 										GetActiveSnapshot(), NULL,
 										dest, NULL, NULL, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index ef7c0d624f1..30cbf9f264f 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -437,7 +437,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
 	UpdateActiveSnapshotCommandId();
 
 	/* Create a QueryDesc, redirecting output to our tuple receiver */
-	queryDesc = CreateQueryDesc(plan, queryString,
+	queryDesc = CreateQueryDesc(plan, NULL, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, NULL, NULL, 0);
 
diff --git a/src/backend/commands/portalcmds.c b/src/backend/commands/portalcmds.c
index ec96c2efcd3..ac1ddd25aba 100644
--- a/src/backend/commands/portalcmds.c
+++ b/src/backend/commands/portalcmds.c
@@ -118,6 +118,7 @@ PerformCursorOpen(ParseState *pstate, DeclareCursorStmt *cstmt, ParamListInfo pa
 					  queryString,
 					  CMDTAG_SELECT,	/* cursor's query is always a SELECT */
 					  list_make1(plan),
+					  list_make1(NULL),
 					  NULL);
 
 	/*----------
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 34b6410d6a2..afd449c73ba 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -205,6 +205,7 @@ ExecuteQuery(ParseState *pstate,
 					  query_string,
 					  entry->plansource->commandTag,
 					  plan_list,
+					  NIL,
 					  cplan);
 
 	/*
@@ -575,6 +576,7 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	const char *query_string;
 	CachedPlan *cplan;
 	List	   *plan_list;
+	List	   *prep_list;
 	ListCell   *p;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
@@ -585,6 +587,7 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	MemoryContextCounters mem_counters;
 	MemoryContext planner_ctx = NULL;
 	MemoryContext saved_ctx = NULL;
+	int			i;
 
 	if (es->memory)
 	{
@@ -650,14 +653,20 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	}
 
 	plan_list = cplan->stmt_list;
+	prep_list = NIL;
 
 	/* Explain each query */
+	i = 0;
 	foreach(p, plan_list)
 	{
 		PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
+		ExecPrep *prep = prep_list ?
+			(ExecPrep *) list_nth(prep_list, i) : NULL;
 
+		i++;
 		if (pstmt->commandType != CMD_UTILITY)
-			ExplainOnePlan(pstmt, into, es, query_string, paramLI, pstate->p_queryEnv,
+			ExplainOnePlan(pstmt, prep,
+						   into, es, query_string, paramLI, pstate->p_queryEnv,
 						   &planduration, (es->buffers ? &bufusage : NULL),
 						   es->memory ? &mem_counters : NULL);
 		else
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 54f4782f31b..6e481398f18 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -291,10 +291,17 @@ Query Processing Control Flow
 
 This is a sketch of control flow for full query processing:
 
+	[Optional] ExecutorPrep
+		- May be run before ExecutorStart (e.g., for plan validation).
+		- Performs range table initialization, permission checks, and
+		  initial partition pruning.
+		- Returns an ExecPrep wrapper with EState that ExecutorStart may
+		  reuse.
+
 	CreateQueryDesc
 
 	ExecutorStart
-		CreateExecutorState
+		CreateExecutorState (or reuse one from ExecPrep if present)
 			creates per-query context
 		switch to per-query context to run ExecInitNode
 		AfterTriggerBeginQuery
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 27c9eec697b..1b96b251c34 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -75,6 +75,7 @@ ExecutorCheckPerms_hook_type ExecutorCheckPerms_hook = NULL;
 
 /* decls for local routines only used within this module */
 static void InitPlan(QueryDesc *queryDesc, int eflags);
+static void PrepPlan(EState *estate, bool do_initial_pruning);
 static void CheckValidRowMarkRel(Relation rel, RowMarkType markType);
 static void ExecPostprocessPlan(EState *estate);
 static void ExecEndPlan(PlanState *planstate, EState *estate);
@@ -171,8 +172,24 @@ standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
 
 	/*
 	 * Build EState, switch into per-query memory context for startup.
+	 *
+	 * If ExecutorPrep() ran earlier (e.g., to do initial pruning during plan
+	 * validity checking), reuse its EState to avoid redoing range table setup
+	 * and pruning. Otherwise, create a fresh EState as usual.
 	 */
-	estate = CreateExecutorState();
+	if (queryDesc->prep)
+	{
+		estate = queryDesc->prep->prep_estate;
+
+		/*
+		 * Executor is adopting the prep's EState. Mark it so ExecPrepCleanup()
+		 * doesn't try to free it redundantly.
+		 */
+		queryDesc->prep->owns_estate = false;
+	}
+	else
+		estate = CreateExecutorState();
+
 	queryDesc->estate = estate;
 
 	oldcontext = MemoryContextSwitchTo(estate->es_query_cxt);
@@ -263,6 +280,143 @@ standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
 	MemoryContextSwitchTo(oldcontext);
 }
 
+/*
+ * ExecutorPrep: prepare executor state for a PlannedStmt outside ExecutorStart.
+ *
+ * Performs range table initialization, permission checks, and initial
+ * partition pruning if partPruneInfos are present and do_initial_pruning is
+ * true.
+ *
+ * This is intended for callers that need executor metadata ahead of actual
+ * execution. Typical use cases include:
+ *	- determining which relations must be locked during plan cache validation;
+ *	- initializing unpruned relids and valid subplans in parallel workers
+ *	  using state copied from the leader.
+ *
+ * The executor can reuse the resulting state to avoid redundant setup during
+ * ExecutorStart(); see InitPlan().
+ *
+ * Returns an ExecPrep wrapper that owns the EState and can be reused
+ * or cleaned up later. Returns NULL if no prep is needed (e.g. no pruning).
+ */
+ExecPrep *
+ExecutorPrep(PlannedStmt *pstmt, ParamListInfo params, ResourceOwner owner,
+			 bool do_initial_pruning)
+{
+	ResourceOwner oldowner;
+	EState *estate;
+
+	Assert(pstmt->commandType != CMD_UTILITY);
+
+	/* No pruning needed -- let normal ExecutorStart handle setup later. */
+	if (pstmt->partPruneInfos == NIL)
+		return NULL;
+
+	estate = CreateExecutorState();
+	estate->es_plannedstmt = pstmt;
+	estate->es_part_prune_infos = pstmt->partPruneInfos;
+	estate->es_param_list_info = params;
+
+	/*
+	 * Ensure locks taken during initial pruning are tracked under the given
+	 * ResourceOwner (e.g., one associated with CachedPlan validation).
+	 */
+	oldowner = CurrentResourceOwner;
+	CurrentResourceOwner = owner;
+
+	PrepPlan(estate, do_initial_pruning);
+
+	CurrentResourceOwner = oldowner;
+
+	return CreateExecPrep(estate, CurrentMemoryContext, NULL, NULL);
+}
+
+/*
+ * PrepPlan: initialize executor metadata needed before plan execution.
+ *
+ * Sets up permissions, range table, and partition pruning infrastructure.
+ * If do_initial_pruning is true, performs initial pruning and stores the
+ * resulting subplan indexes in es_part_prune_results. Otherwise, this step
+ * is skipped, typically when results are provided externally (e.g., in
+ * parallel workers).
+ *
+ * Called from both ExecutorPrep() and InitPlan().
+ */
+static void
+PrepPlan(EState *estate, bool do_initial_pruning)
+{
+	PlannedStmt *pstmt = estate->es_plannedstmt;
+
+	/*
+	 * Do permissions checks.
+	 */
+	ExecCheckPermissions(pstmt->rtable, pstmt->permInfos, true);
+
+	/*
+	 * Initialize range table.
+	 */
+	ExecInitRangeTable(estate, pstmt->rtable, pstmt->permInfos,
+					   bms_copy(pstmt->unprunableRelids));
+
+	/*
+	 * Set up PartitionPruneState structures needed for both initial and
+	 * runtime partition pruning. These structures are built from the
+	 * PartitionPruneInfo entries in the plan tree.
+	 *
+	 * If do_initial_pruning is true, also perform initial pruning to compute
+	 * the subset of child subplans that will be executed. The results,
+	 * which are bitmapsets of selected child indexes, are saved in
+	 * es_part_prune_results. This list is parallel to es_part_prune_infos.
+	 *
+	 * In parallel workers, do_initial_pruning should be false; they receive
+	 * es_part_prune_results from the leader process and should only initialize
+	 * the PartitionPruneStates.
+	 */
+	ExecCreatePartitionPruneStates(estate);
+	if (do_initial_pruning)
+		ExecDoInitialPruning(estate);
+}
+
+/*
+ * CreateExecPrep: initialize ExecPrep wrapper with optional cleanup metadata.
+ */
+ExecPrep *
+CreateExecPrep(EState *estate, MemoryContext context,
+			   execprep_cleanup_fn cleanup, void *cleanup_arg)
+{
+	ExecPrep *prep = palloc0(sizeof(ExecPrep));
+
+	prep->prep_estate = estate;
+	prep->context = context;
+	prep->cleanup = cleanup;
+	prep->cleanup_arg = cleanup_arg;
+	prep->owns_estate = true;
+
+	return prep;
+}
+
+/*
+ * ExecPrepCleanup: free ExecPrep resources not adopted by the executor.
+ *
+ * Only frees the EState if it wasn't taken over by ExecutorStart().
+ * The caller-supplied cleanup callback, if any, is always run.
+ */
+void
+ExecPrepCleanup(ExecPrep *prep)
+{
+	if (prep == NULL)
+		return;
+
+	if (prep->prep_estate && prep->owns_estate)
+	{
+		ExecCloseRangeTableRelations(prep->prep_estate);
+		FreeExecutorState(prep->prep_estate);
+	}
+
+	if (prep->cleanup)
+		prep->cleanup(prep->cleanup_arg);
+}
+
 /* ----------------------------------------------------------------
  *		ExecutorRun
  *
@@ -824,7 +978,6 @@ ExecCheckXactReadOnly(PlannedStmt *plannedstmt)
 		PreventCommandIfParallelMode(CreateCommandName((Node *) plannedstmt));
 }
 
-
 /* ----------------------------------------------------------------
  *		InitPlan
  *
@@ -838,7 +991,6 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 	CmdType		operation = queryDesc->operation;
 	PlannedStmt *plannedstmt = queryDesc->plannedstmt;
 	Plan	   *plan = plannedstmt->planTree;
-	List	   *rangeTable = plannedstmt->rtable;
 	EState	   *estate = queryDesc->estate;
 	PlanState  *planstate;
 	TupleDesc	tupType;
@@ -846,29 +998,19 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 	int			i;
 
 	/*
-	 * Do permissions checks
+	 * If ExecutorPrep() was not run earlier (e.g., during plan validation),
+	 * perform InitPlan setup: init range table, check permissions, and run
+	 * initial pruning. Otherwise, the executor will reuse the same information
+	 * in queryDesc->prep->prep_estate.
 	 */
-	ExecCheckPermissions(rangeTable, plannedstmt->permInfos, true);
-
-	/*
-	 * initialize the node's execution state
-	 */
-	ExecInitRangeTable(estate, rangeTable, plannedstmt->permInfos,
-					   bms_copy(plannedstmt->unprunableRelids));
-
-	estate->es_plannedstmt = plannedstmt;
-	estate->es_part_prune_infos = plannedstmt->partPruneInfos;
-
-	/*
-	 * Perform runtime "initial" pruning to identify which child subplans,
-	 * corresponding to the children of plan nodes that contain
-	 * PartitionPruneInfo such as Append, will not be executed. The results,
-	 * which are bitmapsets of indexes of the child subplans that will be
-	 * executed, are saved in es_part_prune_results.  These results correspond
-	 * to each PartitionPruneInfo entry, and the es_part_prune_results list is
-	 * parallel to es_part_prune_infos.
-	 */
-	ExecDoInitialPruning(estate);
+	if (queryDesc->prep == NULL)
+	{
+		estate->es_plannedstmt = plannedstmt;
+		estate->es_part_prune_infos = plannedstmt->partPruneInfos;
+		PrepPlan(estate, true);
+	}
+	else
+		Assert(estate == queryDesc->prep->prep_estate);
 
 	/*
 	 * Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index f098a5557cf..aedbd9566d6 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -1281,6 +1281,7 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 
 	/* Create a QueryDesc for the query. */
 	return CreateQueryDesc(pstmt,
+						   NULL,
 						   queryString,
 						   GetActiveSnapshot(), InvalidSnapshot,
 						   receiver, paramLI, NULL, instrument_options);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 88b150c8d77..187a480e508 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -2368,6 +2368,9 @@ InitExecPartitionPruneContexts(PartitionPruneState *prunestate,
 	Assert(parent_plan != NULL);
 	estate = parent_plan->state;
 
+	/* es_param_exec_vals is not available at ExecutorPrep() time; set it now. */
+	prunestate->econtext->ecxt_param_exec_vals = estate->es_param_exec_vals;
+
 	/*
 	 * No need to fix subplans maps if initial pruning didn't eliminate any
 	 * subplans.
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index 630d708d2a3..633310c5f5b 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -1362,6 +1362,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
 		dest = None_Receiver;
 
 	es->qd = CreateQueryDesc(es->stmt,
+							 NULL,
 							 fcache->func->src,
 							 GetActiveSnapshot(),
 							 InvalidSnapshot,
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 653500b38dc..7a3cb944d6f 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1685,6 +1685,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 					  query_string,
 					  plansource->commandTag,
 					  stmt_list,
+					  NIL,
 					  cplan);
 
 	/*
@@ -2500,6 +2501,8 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 		CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
 		List	   *stmt_list;
 		ListCell   *lc2;
+		List	   *prep_list;
+		int			i;
 
 		spicallbackarg.query = plansource->query_string;
 
@@ -2578,6 +2581,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 							  plan_owner, _SPI_current->queryEnv);
 
 		stmt_list = cplan->stmt_list;
+		prep_list = NIL;
 
 		/*
 		 * If we weren't given a specific snapshot to use, and the statement
@@ -2615,12 +2619,17 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 			}
 		}
 
+		i = 0;
 		foreach(lc2, stmt_list)
 		{
 			PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+			ExecPrep *prep = prep_list ?
+				list_nth(prep_list, i) : NULL;
 			bool		canSetTag = stmt->canSetTag;
 			DestReceiver *dest;
 
+			i++;
+
 			/*
 			 * Reset output state.  (Note that if a non-SPI receiver is used,
 			 * _SPI_current->processed will stay zero, and that's what we'll
@@ -2690,6 +2699,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 					snap = InvalidSnapshot;
 
 				qdesc = CreateQueryDesc(stmt,
+										prep,
 										plansource->query_string,
 										snap, crosscheck_snapshot,
 										dest,
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 2bd89102686..d3964a12a14 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1232,6 +1232,7 @@ exec_simple_query(const char *query_string)
 						  query_string,
 						  commandTag,
 						  plantree_list,
+						  NIL,
 						  NULL);
 
 		/*
@@ -2033,6 +2034,7 @@ exec_bind_message(StringInfo input_message)
 					  query_string,
 					  psrc->commandTag,
 					  cplan->stmt_list,
+					  NIL,
 					  cplan);
 
 	/* Portal is defined, set the plan ID based on its contents. */
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index fde78c55160..82c295502b0 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -37,6 +37,7 @@ Portal		ActivePortal = NULL;
 
 
 static void ProcessQuery(PlannedStmt *plan,
+						 ExecPrep *prep,
 						 const char *sourceText,
 						 ParamListInfo params,
 						 QueryEnvironment *queryEnv,
@@ -66,6 +67,7 @@ static void DoPortalRewind(Portal portal);
  */
 QueryDesc *
 CreateQueryDesc(PlannedStmt *plannedstmt,
+				ExecPrep *prep,
 				const char *sourceText,
 				Snapshot snapshot,
 				Snapshot crosscheck_snapshot,
@@ -78,6 +80,7 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
 
 	qd->operation = plannedstmt->commandType;	/* operation */
 	qd->plannedstmt = plannedstmt;	/* plan */
+	qd->prep = prep;		/* executor prep output */
 	qd->sourceText = sourceText;	/* query text */
 	qd->snapshot = RegisterSnapshot(snapshot);	/* snapshot */
 	/* RI check snapshot */
@@ -112,6 +115,13 @@ FreeQueryDesc(QueryDesc *qdesc)
 	UnregisterSnapshot(qdesc->snapshot);
 	UnregisterSnapshot(qdesc->crosscheck_snapshot);
 
+	/* ExecPrep cleanup if necessary */
+	if (qdesc->prep)
+	{
+		ExecPrepCleanup(qdesc->prep);
+		qdesc->prep = NULL;
+	}
+
 	/* Only the QueryDesc itself need be freed */
 	pfree(qdesc);
 }
@@ -123,6 +133,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  *		PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
  *
  *	plan: the plan tree for the query
+ *	prep: ExecPrep for the plan (output of ExecutorPrep())
  *	sourceText: the source text of the query
  *	params: any parameters needed
  *	dest: where to send results
@@ -135,6 +146,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  */
 static void
 ProcessQuery(PlannedStmt *plan,
+			 ExecPrep *prep,
 			 const char *sourceText,
 			 ParamListInfo params,
 			 QueryEnvironment *queryEnv,
@@ -146,7 +158,7 @@ ProcessQuery(PlannedStmt *plan,
 	/*
 	 * Create the QueryDesc object
 	 */
-	queryDesc = CreateQueryDesc(plan, sourceText,
+	queryDesc = CreateQueryDesc(plan, prep, sourceText,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, params, queryEnv, 0);
 
@@ -489,6 +501,9 @@ PortalStart(Portal portal, ParamListInfo params,
 				 * the destination to DestNone.
 				 */
 				queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+											portal->preps ?
+											(ExecPrep *) linitial(portal->preps) :
+											NULL,
 											portal->sourceText,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
@@ -1185,6 +1200,7 @@ PortalRunMulti(Portal portal,
 {
 	bool		active_snapshot_set = false;
 	ListCell   *stmtlist_item;
+	int			i;
 
 	/*
 	 * If the destination is DestRemoteExecute, change to DestNone.  The
@@ -1205,9 +1221,14 @@ PortalRunMulti(Portal portal,
 	 * Loop to handle the individual queries generated from a single parsetree
 	 * by analysis and rewrite.
 	 */
+	i = 0;
 	foreach(stmtlist_item, portal->stmts)
 	{
 		PlannedStmt *pstmt = lfirst_node(PlannedStmt, stmtlist_item);
+		ExecPrep *prep = portal->preps ?
+			list_nth(portal->preps, i) : NULL;
+
+		i++;
 
 		/*
 		 * If we got a cancel signal in prior command, quit
@@ -1265,7 +1286,7 @@ PortalRunMulti(Portal portal,
 			if (pstmt->canSetTag)
 			{
 				/* statement can set tag string */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, prep,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
@@ -1274,7 +1295,7 @@ PortalRunMulti(Portal portal,
 			else
 			{
 				/* stmt added by rewrite cannot set tag */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, prep,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index 943da087c9f..313f8ef2fdc 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -284,6 +284,7 @@ PortalDefineQuery(Portal portal,
 				  const char *sourceText,
 				  CommandTag commandTag,
 				  List *stmts,
+				  List *preps,
 				  CachedPlan *cplan)
 {
 	Assert(PortalIsValid(portal));
@@ -298,6 +299,7 @@ PortalDefineQuery(Portal portal,
 	portal->qc.nprocessed = 0;
 	portal->commandTag = commandTag;
 	portal->stmts = stmts;
+	portal->preps = preps;
 	portal->cplan = cplan;
 	portal->status = PORTAL_DEFINED;
 }
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 6e51d50efc7..6aa8b275aa2 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -63,7 +63,8 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
 							  ExplainState *es, ParseState *pstate,
 							  ParamListInfo params);
 
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt, ExecPrep *prep,
+						   IntoClause *into,
 						   ExplainState *es, const char *queryString,
 						   ParamListInfo params, QueryEnvironment *queryEnv,
 						   const instr_time *planduration,
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index 86db3dc8d0d..c18530f5d11 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -18,7 +18,6 @@
 #include "nodes/execnodes.h"
 #include "tcop/dest.h"
 
-
 /* ----------------
  *		query descriptor:
  *
@@ -35,6 +34,7 @@ typedef struct QueryDesc
 	/* These fields are provided by CreateQueryDesc */
 	CmdType		operation;		/* CMD_SELECT, CMD_UPDATE, etc. */
 	PlannedStmt *plannedstmt;	/* planner's output (could be utility, too) */
+	ExecPrep *prep;				/* output of ExecutorPrep() or NULL */
 	const char *sourceText;		/* source text of the query */
 	Snapshot	snapshot;		/* snapshot to use for query */
 	Snapshot	crosscheck_snapshot;	/* crosscheck for RI update/delete */
@@ -57,6 +57,7 @@ typedef struct QueryDesc
 
 /* in pquery.c */
 extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
+								  ExecPrep *prep,
 								  const char *sourceText,
 								  Snapshot snapshot,
 								  Snapshot crosscheck_snapshot,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index fa2b657fb2f..bc90d0ea7ee 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -20,6 +20,7 @@
 #include "nodes/lockoptions.h"
 #include "nodes/parsenodes.h"
 #include "utils/memutils.h"
+#include "utils/resowner.h"
 
 
 /*
@@ -234,6 +235,15 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
  */
 extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
 extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
+
+extern ExecPrep *ExecutorPrep(PlannedStmt *pstmt,
+							  ParamListInfo params,
+							  ResourceOwner owner,
+							  bool do_initial_pruning);
+extern ExecPrep *CreateExecPrep(EState *estate, MemoryContext context,
+								execprep_cleanup_fn cleanup, void *cleanup_arg);
+extern void ExecPrepCleanup(ExecPrep *prep);
+
 extern void ExecutorRun(QueryDesc *queryDesc,
 						ScanDirection direction, uint64 count);
 extern void standard_ExecutorRun(QueryDesc *queryDesc,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 18ae8f0d4bb..f569be3853f 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -772,6 +772,61 @@ typedef struct EState
 	List	   *es_insert_pending_modifytables;
 } EState;
 
+/*
+ * ExecPrep: encapsulates executor preparation results for a PlannedStmt.
+ *
+ * This is used when we want to perform executor setup steps -- such as
+ * initializing the range table, checking permissions, and executing initial
+ * partition pruning -- ahead of actual plan execution. A typical use case is
+ * in plan validation logic (e.g., when deciding whether to reuse a generic
+ * cached plan), where we need to determine exactly which partitions will be
+ * scanned and locked, without executing the full plan.
+ *
+ * The executor may later adopt the prepared EState (via ExecutorStart),
+ * avoiding redundant setup. In that case, the executor is responsible for
+ * freeing the state and ExecPrepCleanup() will skip it.
+ */
+struct ExecPrep;
+
+/*
+ * Optional callback to clean up user-specific resources associated with
+ * ExecPrep.
+ */
+typedef void (*execprep_cleanup_fn)(struct ExecPrep *prep);
+
+/* ExecutorPrep output */
+typedef struct ExecPrep
+{
+	/*
+	 * Context in which this struct and all subsidiary allocations were made.
+	 * This context must remain alive until ExecPrepCleanup is called.
+	 */
+	MemoryContext context;
+
+	/*
+	 * Partially-initialized executor state used for permission checks and
+	 * pruning. May be adopted directly by ExecutorStart(), in which case
+	 * ExecPrepCleanup will skip freeing it.
+	 */
+	EState	   *prep_estate;
+
+	/*
+	 * True if ExecPrepCleanup() must free the EState.  If the executor adopts
+	 * prep_estate, this is set to false to avoid double-free.
+	 */
+	bool		owns_estate;
+
+	/*
+	 * Optional caller-supplied cleanup hook to run during ExecPrepCleanup.
+	 * Useful for releasing external resources associated with the prep.
+	 */
+	execprep_cleanup_fn cleanup;
+
+	/*
+	 * Opaque pointer to pass to the cleanup hook.
+	 */
+	void	   *cleanup_arg;
+} ExecPrep;
 
 /*
  * ExecRowMark -
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index 5ffa6fd5cc8..013bcc3bd8e 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -137,6 +137,7 @@ typedef struct PortalData
 	CommandTag	commandTag;		/* command tag for original query */
 	QueryCompletion qc;			/* command completion data for executed query */
 	List	   *stmts;			/* list of PlannedStmts */
+	List	   *preps;			/* list of ExecPreps where needed */
 	CachedPlan *cplan;			/* CachedPlan, if stmts are from one */
 
 	ParamListInfo portalParams; /* params to pass to query */
@@ -240,6 +241,7 @@ extern void PortalDefineQuery(Portal portal,
 							  const char *sourceText,
 							  CommandTag commandTag,
 							  List *stmts,
+							  List *preps,
 							  CachedPlan *cplan);
 extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
 extern void PortalCreateHoldStore(Portal portal);
-- 
2.47.3




* Re: generic plans and "initial" pruning
@ 2025-11-17 12:50  Amit Langote <[email protected]>
  parent: Amit Langote <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: Amit Langote @ 2025-11-17 12:50 UTC (permalink / raw)
  To: Tom Lane <[email protected]>; +Cc: Tender Wang <[email protected]>; Alexander Lakhin <[email protected]>; Tomas Vondra <[email protected]>; Robert Haas <[email protected]>; Alvaro Herrera <[email protected]>; Andres Freund <[email protected]>; Daniel Gustafsson <[email protected]>; David Rowley <[email protected]>; pgsql-hackers; Thom Brown <[email protected]>

On Wed, Nov 12, 2025 at 11:17 PM Amit Langote <[email protected]> wrote:
> The key idea is to avoid taking unnecessary locks when reusing a
> cached plan. To achieve that, we need to perform initial partition
> pruning during cached plan reuse in plancache.c so that only surviving
> partitions are locked. This requires some plumbing to reuse the result
> of this "early" pruning during executor startup, because repeating the
> pruning logic would be both inefficient and potentially inconsistent
> -- what if you get different results the second time? (I don't have
> proof that this can happen, but some earlier emails mention the
> theoretical risk, so better to be safe.)
>
> So this patch introduces ExecutorPrep(), which allows executor
> metadata such as initial pruning results (valid subplan indexes) and
> full unpruned_relids to be computed ahead of execution and reused
> later by ExecutorStart() and during QueryDesc setup in parallel
> workers using the results shared by the leader. The parallel query bit
> was discussed previously at [1], though I didn’t have a solution I
> liked then.
>
...
> The patch set is structured as follows:
>
> * Refactor partition pruning initialization (0001): separates the
> setup of the pruning state from its execution by introducing
> ExecCreatePartitionPruneStates(). This makes the pruning logic easier
> to reuse and adds flexibility to do only the setup but skip pruning in
> some cases.
>
> * Introduce ExecutorPrep infrastructure (0002): adds ExecutorPrep()
> and ExecPrep as a formal way to perform executor setup ahead of
> execution. This enables caching or transferring pruning results and
> other metadata without triggering execution. ExecutorStart() can now
> consume precomputed prep state from the EState created during
> ExecutorPrep().  ExecPrepCleanup() handles cleanup when the plan is
> invalidated during prep and so not executed; the state is cleaned up
> in the regular ExecutorEnd() path otherwise.

In the v1 patch, ExecutorStart() did not call ExecutorPrep() to do
the prep work (creating the EState, setting up es_relations, checking
permissions) when the QueryDesc did not carry the results of
ExecutorPrep() from some earlier stage. Instead, InitPlan() would
detect that prep was absent and perform the missing setup itself. On
second thought, it is cleaner for ExecutorStart() to detect the absence
of prep and call ExecutorPrep() directly, matching how prep is
created when coming from plancache et al.

v2 changes the patch to do that.
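
To illustrate the control flow described above, here is a toy model.
It is only a sketch: the Toy* types and toy_* functions are made up for
this example and are not the patch's actual code. It shows a start
routine that runs prep itself when none was supplied, then adopts the
prepared state so that the cleanup routine skips freeing it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy stand-ins for EState, ExecPrep and QueryDesc; not the real structs. */
typedef struct ToyEState
{
	bool		relations_ready;	/* stands in for es_relations setup */
	bool		perms_checked;		/* stands in for permission checks */
} ToyEState;

typedef struct ToyPrep
{
	ToyEState  *prep_estate;
	bool		owns_estate;	/* true until the executor adopts the EState */
} ToyPrep;

typedef struct ToyQueryDesc
{
	ToyPrep    *prep;			/* NULL if no earlier stage did the prep */
	ToyEState  *estate;
} ToyQueryDesc;

/* Hypothetical analogue of ExecutorPrep(): do the setup ahead of execution. */
ToyPrep *
toy_executor_prep(void)
{
	ToyPrep    *prep = malloc(sizeof(ToyPrep));
	ToyEState  *estate = malloc(sizeof(ToyEState));

	if (prep == NULL || estate == NULL)
		abort();
	estate->relations_ready = true;
	estate->perms_checked = true;
	prep->prep_estate = estate;
	prep->owns_estate = true;
	return prep;
}

/* Hypothetical analogue of ExecutorStart(): run prep if absent, then adopt. */
void
toy_executor_start(ToyQueryDesc *qd)
{
	assert(qd != NULL);
	if (qd->prep == NULL)
		qd->prep = toy_executor_prep();
	qd->estate = qd->prep->prep_estate;
	qd->prep->owns_estate = false;	/* the executor now frees the EState */
}

/* Hypothetical analogue of ExecutorEnd(): free the adopted EState. */
void
toy_executor_end(ToyQueryDesc *qd)
{
	assert(qd != NULL && qd->prep != NULL);
	free(qd->estate);
	qd->estate = NULL;
	qd->prep->prep_estate = NULL;
}

/* Hypothetical analogue of ExecPrepCleanup(): skip an adopted EState. */
void
toy_prep_cleanup(ToyPrep *prep)
{
	if (prep->owns_estate)
		free(prep->prep_estate);
	free(prep);
}
```

A caller that never got prep from the plan cache just passes a
descriptor with prep == NULL to toy_executor_start(). A prep that is
built but never executed is released with toy_prep_cleanup() alone.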

> * Enable pruning-aware locking in cached / generic plan reuse (0004):
> extends GetCachedPlan() and CheckCachedPlan() to call ExecutorPrep()
> on each PlannedStmt in the CachedPlan, locking only surviving
> partitions. Adds CachedPlanPrepData to pass this through plan cache
> APIs and down to execution via QueryDesc. Also reinstates the
> firstResultRel locking rule added in 28317de72 but later lost due to
> revert of the earlier pruning patch, to ensure correctness when all
> target partitions are pruned.

Looking at the changes to executor/functions.c, I also noticed that I
had mistakenly allocated the ExecutorPrep state in
SQLFunctionCache.fcontext, whereas the correct context for
execution-related state is SQLFunctionCache.subcontext.  In the updated
patch, I've made postquel_start() reparent the prep EState's
es_query_cxt to subcontext from fcontext. I also did not have a test
case that exercised cached plan reuse for SQL functions, so I added
one. I split the functions.c GetCachedPlan() + CachedPlanPrepData
plumbing into a new patch 0005 so it can be reviewed separately, since
it is the only non-mechanical call-site change.

> Benchmark results:
>
> echo "plan_cache_mode = force_generic_plan" >> $PGDATA/postgresql.conf
> for p in 32 64 128 256 512 1024; do pgbench -i --partitions=$p >
> /dev/null 2>&1; echo -ne "$p\t"; pgbench -n -S -T10 -Mprepared | grep
> tps; done
>
> Master
>
> 32 tps = 23841.822407 (without initial connection time)
> 64 tps = 21578.619816 (without initial connection time)
> 128 tps = 18090.500707 (without initial connection time)
> 256 tps = 14152.248201 (without initial connection time)
> 512 tps = 9432.708423 (without initial connection time)
> 1024 tps = 5873.696475 (without initial connection time)
>
> Patched
>
> 32 tps = 24724.245798 (without initial connection time)
> 64 tps = 24858.206407 (without initial connection time)
> 128 tps = 24652.655269 (without initial connection time)
> 256 tps = 23656.756615 (without initial connection time)
> 512 tps = 22299.865769 (without initial connection time)
> 1024 tps = 21911.704317 (without initial connection time)

I re-ran the benchmark to include the 0-partition case and partition
counts above 1024:

echo "plan_cache_mode = force_generic_plan" >> $PGDATA/postgresql.conf
for p in 0 8 16 32 64 128 256 512 1024 2048 4096; do pgbench -i
--partitions=$p > /dev/null 2>&1; echo -ne "$p\t"; pgbench -n -S -T10
-Mprepared | grep tps; done

Master

0 tps = 23869.906654 (without initial connection time)
8 tps = 22682.498914 (without initial connection time)
16 tps = 22714.445711 (without initial connection time)
32 tps = 21653.589371 (without initial connection time)
64 tps = 20571.267545 (without initial connection time)
128 tps = 17138.088269 (without initial connection time)
256 tps = 13027.168426 (without initial connection time)
512 tps = 8689.486966 (without initial connection time)
1024 tps = 5450.525617 (without initial connection time)
2048 tps = 3034.383108 (without initial connection time)
4096 tps = 1560.110609 (without initial connection time)

Patched

0 tps = 23600.068719 (without initial connection time)
8 tps = 22548.439906 (without initial connection time)
16 tps = 22807.337363 (without initial connection time)
32 tps = 22837.789996 (without initial connection time)
64 tps = 22915.846820 (without initial connection time)
128 tps = 22958.472655 (without initial connection time)
256 tps = 22432.432730 (without initial connection time)
512 tps = 20327.618690 (without initial connection time)
1024 tps = 20554.932475 (without initial connection time)
2048 tps = 19947.061061 (without initial connection time)
4096 tps = 17294.369829 (without initial connection time)

In tabular format (a positive pct_change means patched is faster):

 partitions    master        patched       pct_change
 ----------------------------------------------------
 0             23869.91      23600.07       -1.1%
 8             22682.50      22548.44       -0.6%
 16            22714.45      22807.34       +0.4%
 32            21653.59      22837.79       +5.5%
 64            20571.27      22915.85      +11.4%
 128           17138.09      22958.47      +34.0%
 256           13027.17      22432.43      +72.2%
 512            8689.49      20327.62     +133.9%
 1024           5450.53      20554.93     +277.1%
 2048           3034.38      19947.06     +557.4%
 4096           1560.11      17294.37    +1008.5%
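
For reference, pct_change in the table above is the relative change of
the patched tps over the master tps, i.e. (patched - master) / master *
100. A minimal sketch of that arithmetic follows; the function names are
illustrative only and are not part of the patch or of pgbench.

```c
#include <assert.h>
#include <stdio.h>

/* Relative change of patched over master, in percent. */
double
pct_change(double master_tps, double patched_tps)
{
	assert(master_tps > 0.0);
	return (patched_tps - master_tps) / master_tps * 100.0;
}

/* Print one row: partition count, both tps values, and the signed change. */
void
print_row(int partitions, double master_tps, double patched_tps)
{
	printf(" %-13d %-13.2f %-13.2f %+.1f%%\n",
		   partitions, master_tps, patched_tps,
		   pct_change(master_tps, patched_tps));
}
```

For example, pct_change(5450.53, 20554.93) gives roughly +277.1, which
matches the 1024-partition row.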

I also did some runs for custom plans. The custom plan path should
behave about the same on master and patched since the early
ExecutorPrep() business only applies to generic plan reuse cases.

echo "plan_cache_mode = force_custom_plan" >> $PGDATA/postgresql.conf
for p in 0 8 16 32 64 128 256 512 1024 2048 4096; do pgbench -i
--partitions=$p > /dev/null 2>&1; echo -ne "$p\t"; pgbench -n -S -T10
-Mprepared | grep tps; done

Master

0 tps = 22346.419557 (without initial connection time)
8 tps = 20959.115560 (without initial connection time)
16 tps = 21390.573290 (without initial connection time)
32 tps = 21358.292393 (without initial connection time)
64 tps = 21288.742635 (without initial connection time)
128 tps = 21167.721447 (without initial connection time)
256 tps = 21256.618661 (without initial connection time)
512 tps = 19401.261197 (without initial connection time)
1024 tps = 19169.135145 (without initial connection time)
2048 tps = 19504.102179 (without initial connection time)
4096 tps = 18880.855783 (without initial connection time)

Patched

0 tps = 22852.634752 (without initial connection time)
8 tps = 21596.432690 (without initial connection time)
16 tps = 21428.779996 (without initial connection time)
32 tps = 20629.225272 (without initial connection time)
64 tps = 21301.644733 (without initial connection time)
128 tps = 21098.543942 (without initial connection time)
256 tps = 21394.364662 (without initial connection time)
512 tps = 19475.152170 (without initial connection time)
1024 tps = 19585.768438 (without initial connection time)
2048 tps = 19810.211969 (without initial connection time)
4096 tps = 19160.981608 (without initial connection time)

In tabular format:

 partitions    master        patched       pct_change
 ----------------------------------------------------
 0             22346.42      22852.63      +2.3%
 8             20959.12      21596.43      +3.0%
 16            21390.57      21428.78      +0.2%
 32            21358.29      20629.23      -3.4%
 64            21288.74      21301.64      +0.1%
 128           21167.72      21098.54      -0.3%
 256           21256.62      21394.36      +0.6%
 512           19401.26      19475.15      +0.4%
 1024          19169.14      19585.77      +2.2%
 2048          19504.10      19810.21      +1.6%
 4096          18880.86      19160.98      +1.5%

The differences are within the noise range, as expected.

-- 
Thanks, Amit Langote


Attachments:

  [application/octet-stream] v2-0005-Make-SQL-function-executor-track-ExecutorPrep-sta.patch (6.5K, 2-v2-0005-Make-SQL-function-executor-track-ExecutorPrep-sta.patch)
From eef8d1af46ca8deefbf8eb95428d37fc900a0944 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Mon, 17 Nov 2025 17:40:26 +0900
Subject: [PATCH v2 5/5] Make SQL function executor track ExecutorPrep state

Extend the SQL function executor to use the ExecutorPrep results
returned by GetCachedPlan().  init_execution_state() now passes a
CachedPlanPrepData to GetCachedPlan() and stores the per-statement
ExecPrep pointers in the execution_state nodes.

At execution time, postquel_start() reparents the prep EState's
es_query_cxt under the function's subcontext so that prep state
follows the usual per-call context hierarchy.

This allows SQL-language functions to participate in the same
ExecutorPrep machinery as other plan cache users, which a later
patch will use to support pruning-aware locking.

Add a regression test where rule rewrite expands a single UPDATE
into multiple PlannedStmts, exercising the SQL function plan cache
and the generic plan reuse path that now invokes ExecutorPrep.
---
 src/backend/executor/functions.c        | 33 +++++++++++++++++++++++--
 src/test/regress/expected/plancache.out | 31 +++++++++++++++++++++++
 src/test/regress/sql/plancache.sql      | 29 ++++++++++++++++++++++
 3 files changed, 91 insertions(+), 2 deletions(-)

diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index 633310c5f5b..ed7352fce61 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -72,6 +72,7 @@ typedef struct execution_state
 	bool		setsResult;		/* true if this query produces func's result */
 	bool		lazyEval;		/* true if should fetch one row at a time */
 	PlannedStmt *stmt;			/* plan for this query */
+	ExecPrep   *prep;			/* ExecutorPrep() output for this plan */
 	QueryDesc  *qd;				/* null unless status == RUN */
 } execution_state;
 
@@ -657,6 +658,8 @@ init_execution_state(SQLFunctionCachePtr fcache)
 	execution_state *lasttages = NULL;
 	int			nstmts;
 	ListCell   *lc;
+	CachedPlanPrepData cprep = {0};
+	int			i;
 
 	/*
 	 * Clean up after previous query, if there was one.
@@ -695,10 +698,20 @@ init_execution_state(SQLFunctionCachePtr fcache)
 	 * CurrentResourceOwner will be the same when ShutdownSQLFunction runs.)
 	 */
 	fcache->cowner = CurrentResourceOwner;
+
+	/*
+	 * Have ExecutorPrep() allocate under fcache->fcontext.  The prep
+	 * EStates it creates will initially live there; postquel_start()
+	 * will later reparent their es_query_cxt into fcache->subcontext
+	 * when using them for execution.
+	 */
+	cprep.context = fcache->fcontext;
+	cprep.owner = fcache->cowner;
 	fcache->cplan = GetCachedPlan(plansource,
 								  fcache->paramLI,
 								  fcache->cowner,
-								  NULL);
+								  NULL,
+								  &cprep);
 
 	/*
 	 * If necessary, make esarray[] bigger to hold the needed state.
@@ -719,9 +732,12 @@ init_execution_state(SQLFunctionCachePtr fcache)
 	/*
 	 * Build execution_state list to match the number of contained plans.
 	 */
+	i = 0;
 	foreach(lc, fcache->cplan->stmt_list)
 	{
 		PlannedStmt *stmt = lfirst_node(PlannedStmt, lc);
+		ExecPrep *prep = cprep.prep_list ? list_nth(cprep.prep_list, i++) :
+			NULL;
 		execution_state *newes;
 
 		/*
@@ -763,6 +779,7 @@ init_execution_state(SQLFunctionCachePtr fcache)
 		newes->setsResult = false;	/* might change below */
 		newes->lazyEval = false;	/* might change below */
 		newes->stmt = stmt;
+		newes->prep = prep;
 		newes->qd = NULL;
 
 		if (stmt->canSetTag)
@@ -1361,8 +1378,20 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
 	else
 		dest = None_Receiver;
 
+	if (es->prep)
+	{
+		/*
+		 * Prep EStates were built under fcache->fcontext.  For execution,
+		 * make their es_query_cxt a child of fcache->subcontext so they
+		 * follow the usual per call lifetime.
+		 */
+		EState *prep_estate = es->prep->prep_estate;
+
+		MemoryContextSetParent(prep_estate->es_query_cxt, fcache->subcontext);
+	}
+
 	es->qd = CreateQueryDesc(es->stmt,
-							 NULL,
+							 es->prep,
 							 fcache->func->src,
 							 GetActiveSnapshot(),
 							 InvalidSnapshot,
diff --git a/src/test/regress/expected/plancache.out b/src/test/regress/expected/plancache.out
index 4e59188196c..8c68691df91 100644
--- a/src/test/regress/expected/plancache.out
+++ b/src/test/regress/expected/plancache.out
@@ -398,3 +398,34 @@ select name, generic_plans, custom_plans from pg_prepared_statements
 (1 row)
 
 drop table test_mode;
+-- exercise sql-function plan cache when rewrite expands a single statement
+-- into multiple planned statements. this forces cachedplan->stmt_list to
+-- contain more than one entry and checks that executor state for the first
+-- rewritten statement does not destroy state needed by the second one.
+set plan_cache_mode = force_generic_plan;
+create table sqlf_base(id int, val int);
+create table sqlf_log(id int, note text);
+insert into sqlf_base values (1, 10);
+create rule sqlf_base_upd_log as
+on update to sqlf_base do also
+    insert into sqlf_log(id, note)
+    values (new.id, 'logged by rule');
+create or replace function sqlf_execprep_test(a int, v int)
+returns void
+language sql
+as $$
+    update sqlf_base set val = v where id = a;
+$$;
+select sqlf_execprep_test(1, 20);
+ sqlf_execprep_test 
+--------------------
+ 
+(1 row)
+
+select sqlf_execprep_test(1, 30);
+ sqlf_execprep_test 
+--------------------
+ 
+(1 row)
+
+reset plan_cache_mode;
diff --git a/src/test/regress/sql/plancache.sql b/src/test/regress/sql/plancache.sql
index 4b2f11dcc64..56ebbbdecd2 100644
--- a/src/test/regress/sql/plancache.sql
+++ b/src/test/regress/sql/plancache.sql
@@ -223,3 +223,32 @@ select name, generic_plans, custom_plans from pg_prepared_statements
   where  name = 'test_mode_pp';
 
 drop table test_mode;
+
+-- exercise sql-function plan cache when rewrite expands a single statement
+-- into multiple planned statements. this forces cachedplan->stmt_list to
+-- contain more than one entry and checks that executor state for the first
+-- rewritten statement does not destroy state needed by the second one.
+
+set plan_cache_mode = force_generic_plan;
+
+create table sqlf_base(id int, val int);
+create table sqlf_log(id int, note text);
+
+insert into sqlf_base values (1, 10);
+
+create rule sqlf_base_upd_log as
+on update to sqlf_base do also
+    insert into sqlf_log(id, note)
+    values (new.id, 'logged by rule');
+
+create or replace function sqlf_execprep_test(a int, v int)
+returns void
+language sql
+as $$
+    update sqlf_base set val = v where id = a;
+$$;
+
+select sqlf_execprep_test(1, 20);
+select sqlf_execprep_test(1, 30);
+
+reset plan_cache_mode;
-- 
2.47.3



  [application/octet-stream] v2-0001-Refactor-partition-pruning-initialization-for-cla.patch (7.7K, 3-v2-0001-Refactor-partition-pruning-initialization-for-cla.patch)
From 243d407de86b0a73b9bd8c8dbc541f630eb33747 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Tue, 11 Nov 2025 21:18:24 +0900
Subject: [PATCH v2 1/5] Refactor partition pruning initialization for clarity
 and modularity

Move the creation of PartitionPruneState structures out of
ExecDoInitialPruning() into a new ExecCreatePartitionPruneStates()
function. This separates the setup of pruning state from the execution
of initial pruning logic, making the code clearer and easier to
maintain.

Also simplify handling of unpruned relids by moving responsibility
for recording them in EState into CreatePartitionPruneState(),
avoiding the need to pass all_leafpart_rtis as an out parameter.

This refactoring allows callers to reuse the pruning setup logic
without always triggering pruning, a capability useful for future use
cases that may only need metadata initialization.
---
 src/backend/executor/execPartition.c | 70 +++++++++++++++++-----------
 src/include/executor/execPartition.h |  1 +
 2 files changed, 43 insertions(+), 28 deletions(-)

diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index aa12e9ad2ea..88b150c8d77 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -182,8 +182,7 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
 static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
 static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
 static PartitionPruneState *CreatePartitionPruneState(EState *estate,
-													  PartitionPruneInfo *pruneinfo,
-													  Bitmapset **all_leafpart_rtis);
+													  PartitionPruneInfo *pruneinfo);
 static void InitPartitionPruneContext(PartitionPruneContext *context,
 									  List *pruning_steps,
 									  PartitionDesc partdesc,
@@ -1772,6 +1771,9 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *
  * Functions:
  *
+ * ExecCreatePartitionPruneStates
+ *		Create PartitionPruneState for all PartitionPruneInfos in the EState
+ *
  * ExecDoInitialPruning:
  *		Perform runtime "initial" pruning, if necessary, to determine the set
  *		of child subnodes that need to be initialized during ExecInitNode() for
@@ -1796,6 +1798,29 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *-------------------------------------------------------------------------
  */
 
+/*
+ * ExecCreatePartitionPruneStates
+ *
+ * Create a PartitionPruneState for each PartitionPruneInfo in the estate,
+ * and save them in estate->es_part_prune_states. This setup is required
+ * before any initial or runtime pruning can occur.
+ */
+void
+ExecCreatePartitionPruneStates(EState *estate)
+{
+	ListCell   *lc;
+
+	foreach(lc, estate->es_part_prune_infos)
+	{
+		PartitionPruneInfo *pruneinfo = lfirst_node(PartitionPruneInfo, lc);
+		PartitionPruneState *prunestate;
+
+		/* Create and save the PartitionPruneState. */
+		prunestate = CreatePartitionPruneState(estate, pruneinfo);
+		estate->es_part_prune_states = lappend(estate->es_part_prune_states,
+											   prunestate);
+	}
+}
 
 /*
  * ExecDoInitialPruning
@@ -1803,11 +1828,11 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *		of child subnodes that need to be initialized during ExecInitNode() for
  *		plan nodes that support partition pruning.
  *
- * This function iterates over each PartitionPruneInfo entry in
- * estate->es_part_prune_infos. For each entry, it creates a PartitionPruneState
- * and adds it to es_part_prune_states.  ExecInitPartitionExecPruning() accesses
+ * This function iterates over each PartitionPruneState in
+ * estate->es_part_prune_states, which must have been populated earlier by
+ * ExecCreatePartitionPruneStates(). ExecInitPartitionExecPruning() accesses
  * these states through their corresponding indexes in es_part_prune_states and
- * assign each state to the parent node's PlanState, from where it will be used
+ * assigns each state to the parent node's PlanState, from where it will be used
  * for "exec" pruning.
  *
  * If initial pruning steps exist for a PartitionPruneInfo entry, this function
@@ -1825,20 +1850,13 @@ ExecDoInitialPruning(EState *estate)
 {
 	ListCell   *lc;
 
-	foreach(lc, estate->es_part_prune_infos)
+	Assert(estate->es_part_prune_results == NULL);
+	foreach(lc, estate->es_part_prune_states)
 	{
-		PartitionPruneInfo *pruneinfo = lfirst_node(PartitionPruneInfo, lc);
-		PartitionPruneState *prunestate;
+		PartitionPruneState *prunestate = (PartitionPruneState *) lfirst(lc);
 		Bitmapset  *validsubplans = NULL;
-		Bitmapset  *all_leafpart_rtis = NULL;
 		Bitmapset  *validsubplan_rtis = NULL;
 
-		/* Create and save the PartitionPruneState. */
-		prunestate = CreatePartitionPruneState(estate, pruneinfo,
-											   &all_leafpart_rtis);
-		estate->es_part_prune_states = lappend(estate->es_part_prune_states,
-											   prunestate);
-
 		/*
 		 * Perform initial pruning steps, if any, and save the result
 		 * bitmapset or NULL as described in the header comment.
@@ -1846,8 +1864,6 @@ ExecDoInitialPruning(EState *estate)
 		if (prunestate->do_initial_prune)
 			validsubplans = ExecFindMatchingSubPlans(prunestate, true,
 													 &validsubplan_rtis);
-		else
-			validsubplan_rtis = all_leafpart_rtis;
 
 		estate->es_unpruned_relids = bms_add_members(estate->es_unpruned_relids,
 													 validsubplan_rtis);
@@ -1965,14 +1981,12 @@ ExecInitPartitionExecPruning(PlanState *planstate,
  * parent plan node's PlanState.
  *
  * If initial pruning steps are to be skipped (e.g., during EXPLAIN
- * (GENERIC_PLAN)), *all_leafpart_rtis will be populated with the RT indexes of
- * all leaf partitions whose scanning subnode is included in the parent plan
- * node's list of child plans. The caller must add these RT indexes to
- * estate->es_unpruned_relids.
+ * (GENERIC_PLAN)), the RT indexes of all leaf partitions whose scanning
+ * subnode is included in the parent plan node's list of child plans are
+ * added to estate->es_unpruned_relids.
  */
 static PartitionPruneState *
-CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo,
-						  Bitmapset **all_leafpart_rtis)
+CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo)
 {
 	PartitionPruneState *prunestate;
 	int			n_part_hierarchies;
@@ -2206,8 +2220,8 @@ CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo,
 													   pinfo->execparamids);
 
 			/*
-			 * Return all leaf partition indexes if we're skipping pruning in
-			 * the EXPLAIN (GENERIC_PLAN) case.
+			 * Add all leaf partition indexes to es_unpruned_relids if we're
+			 * skipping pruning in the EXPLAIN (GENERIC_PLAN) case.
 			 */
 			if (pinfo->initial_pruning_steps && !prunestate->do_initial_prune)
 			{
@@ -2219,8 +2233,8 @@ CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo,
 					Index		rtindex = pprune->leafpart_rti_map[part_index];
 
 					if (rtindex)
-						*all_leafpart_rtis = bms_add_member(*all_leafpart_rtis,
-															rtindex);
+						estate->es_unpruned_relids =
+							bms_add_member(estate->es_unpruned_relids, rtindex);
 				}
 			}
 
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 3b3f46aced0..ba8cc594fc9 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -130,6 +130,7 @@ typedef struct PartitionPruneState
 	PartitionPruningData *partprunedata[FLEXIBLE_ARRAY_MEMBER];
 } PartitionPruneState;
 
+extern void ExecCreatePartitionPruneStates(EState *estate);
 extern void ExecDoInitialPruning(EState *estate);
 extern PartitionPruneState *ExecInitPartitionExecPruning(PlanState *planstate,
 														 int n_total_subplans,
-- 
2.47.3



  [application/octet-stream] v2-0004-Use-pruning-aware-locking-in-cached-plans.patch (24.0K, 4-v2-0004-Use-pruning-aware-locking-in-cached-plans.patch)
From 74dc075dc8f844e036fc38e005fc512b6dd54bc9 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Tue, 11 Nov 2025 22:30:52 +0900
Subject: [PATCH v2 4/5] Use pruning-aware locking in cached plans

Extend GetCachedPlan() to perform ExecutorPrep() on each planned
statement, capturing unpruned relids and initial pruning results.
Use this data to acquire execution locks only on surviving partitions,
avoiding unnecessary locking of pruned tables even when using cached
plans.

Introduce CachedPlanPrepData to carry ExecutorPrep results
through the plan caching layer. Adjust call sites in SPI,
functions, portals, and EXPLAIN to propagate this data.

This ensures pruning decisions made during initial pruning are
consistently reused without redoing pruning logic in executor paths
like parallel workers. It also lays the groundwork for
pruning-dependent lock behavior during plan reuse.

To maintain correctness when all target partitions are pruned, also
reinstate the firstResultRel locking behavior that was added in commit
28317de72 but lost when the earlier pruning patch was reverted. That
commit required the first ModifyTable target to remain initialized for
executor assumptions to hold. We now explicitly track these relids in
PlannerGlobal and PlannedStmt so they are locked even if pruned,
preserving that rule across cached plan reuse.
---
 src/backend/commands/prepare.c         |  19 +-
 src/backend/executor/nodeModifyTable.c |   4 +-
 src/backend/executor/spi.c             |  26 ++-
 src/backend/optimizer/plan/planner.c   |   1 +
 src/backend/optimizer/plan/setrefs.c   |   3 +
 src/backend/tcop/postgres.c            |   9 +-
 src/backend/utils/cache/plancache.c    | 234 ++++++++++++++++++++++++-
 src/include/nodes/pathnodes.h          |   3 +
 src/include/nodes/plannodes.h          |  10 ++
 src/include/utils/plancache.h          |  24 ++-
 10 files changed, 312 insertions(+), 21 deletions(-)

diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index afd449c73ba..23332d19b37 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -154,6 +154,7 @@ ExecuteQuery(ParseState *pstate,
 {
 	PreparedStatement *entry;
 	CachedPlan *cplan;
+	CachedPlanPrepData cprep = {0};
 	List	   *plan_list;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
@@ -193,7 +194,10 @@ ExecuteQuery(ParseState *pstate,
 									   entry->plansource->query_string);
 
 	/* Replan if needed, and increment plan refcount for portal */
-	cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL);
+	/* Keep ExecutorPrep state with the portal and its resowner. */
+	cprep.context = portal->portalContext;
+	cprep.owner = portal->resowner;
+	cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL, &cprep);
 	plan_list = cplan->stmt_list;
 
 	/*
@@ -205,7 +209,7 @@ ExecuteQuery(ParseState *pstate,
 					  query_string,
 					  entry->plansource->commandTag,
 					  plan_list,
-					  NIL,
+					  cprep.prep_list,
 					  cplan);
 
 	/*
@@ -575,6 +579,7 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	PreparedStatement *entry;
 	const char *query_string;
 	CachedPlan *cplan;
+	CachedPlanPrepData cprep = {0};
 	List	   *plan_list;
 	List	   *prep_list;
 	ListCell   *p;
@@ -633,8 +638,14 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	}
 
 	/* Replan if needed, and acquire a transient refcount */
+	/* ExecutorPrep state is local to this EXPLAIN EXECUTE call. */
+	cprep.context = CurrentMemoryContext;
+	cprep.owner = CurrentResourceOwner;
+	if (es->generic)
+		cprep.eflags = EXEC_FLAG_EXPLAIN_GENERIC;
 	cplan = GetCachedPlan(entry->plansource, paramLI,
-						  CurrentResourceOwner, pstate->p_queryEnv);
+						  CurrentResourceOwner, pstate->p_queryEnv,
+						  &cprep);
 
 	INSTR_TIME_SET_CURRENT(planduration);
 	INSTR_TIME_SUBTRACT(planduration, planstart);
@@ -653,7 +664,7 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	}
 
 	plan_list = cplan->stmt_list;
-	prep_list = NIL;
+	prep_list = cprep.prep_list;
 
 	/* Explain each query */
 	i = 0;
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 4c5647ac38a..c5812612f8d 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -4648,8 +4648,8 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
 	 * as a reference for building the ResultRelInfo of the target partition.
 	 * In either case, it doesn't matter which result relation is kept, so we
 	 * just keep the first one, if all others have been pruned.  See also,
-	 * ExecDoInitialPruning(), which ensures that this first result relation
-	 * has been locked.
+	 * AcquireExecutorLocksUnpruned(), which ensures that this first result
+	 * relation has been locked.
 	 */
 	i = 0;
 	foreach(l, node->resultRelations)
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 7a3cb944d6f..d580f1e0425 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1579,6 +1579,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 {
 	CachedPlanSource *plansource;
 	CachedPlan *cplan;
+	CachedPlanPrepData cprep = {0};
 	List	   *stmt_list;
 	char	   *query_string;
 	Snapshot	snapshot;
@@ -1659,7 +1660,11 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 	 */
 
 	/* Replan if needed, and increment plan refcount for portal */
-	cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv);
+	/* ExecutorPrep state lives in this portal's context. */
+	cprep.context = portal->portalContext;
+	cprep.owner = portal->resowner;
+	cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv,
+						  &cprep);
 	stmt_list = cplan->stmt_list;
 
 	if (!plan->saved)
@@ -1685,7 +1690,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 					  query_string,
 					  plansource->commandTag,
 					  stmt_list,
-					  NIL,
+					  cprep.prep_list,	/* lives in portalContext */
 					  cplan);
 
 	/*
@@ -2078,6 +2083,7 @@ SPI_plan_get_cached_plan(SPIPlanPtr plan)
 {
 	CachedPlanSource *plansource;
 	CachedPlan *cplan;
+	CachedPlanPrepData cprep = {0};
 	SPICallbackArg spicallbackarg;
 	ErrorContextCallback spierrcontext;
 
@@ -2101,9 +2107,13 @@ SPI_plan_get_cached_plan(SPIPlanPtr plan)
 	error_context_stack = &spierrcontext;
 
 	/* Get the generic plan for the query */
+	/* ExecutorPrep() state lives in caller's active context. */
+	cprep.context = CurrentMemoryContext;
+	cprep.owner = CurrentResourceOwner;
 	cplan = GetCachedPlan(plansource, NULL,
 						  plan->saved ? CurrentResourceOwner : NULL,
-						  _SPI_current->queryEnv);
+						  _SPI_current->queryEnv,
+						  &cprep);
 	Assert(cplan == plansource->gplan);
 
 	/* Pop the error context stack */
@@ -2501,6 +2511,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 		CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
 		List	   *stmt_list;
 		ListCell   *lc2;
+		CachedPlanPrepData cprep = {0};
 		List	   *prep_list;
 		int			i;
 
@@ -2577,11 +2588,16 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 		 * Replan if needed, and increment plan refcount.  If it's a saved
 		 * plan, the refcount must be backed by the plan_owner.
 		 */
+
+		/* ExecutorPrep state is per _SPI_execute_plan call. */
+		cprep.context = CurrentMemoryContext;
+		cprep.owner = CurrentResourceOwner;
 		cplan = GetCachedPlan(plansource, options->params,
-							  plan_owner, _SPI_current->queryEnv);
+							  plan_owner, _SPI_current->queryEnv,
+							  &cprep);
 
 		stmt_list = cplan->stmt_list;
-		prep_list = NIL;
+		prep_list = cprep.prep_list;
 
 		/*
 		 * If we weren't given a specific snapshot to use, and the statement
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index c4fd646b999..4c76e78c1da 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -608,6 +608,7 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 											  glob->prunableRelids);
 	result->permInfos = glob->finalrteperminfos;
 	result->resultRelations = glob->resultRelations;
+	result->firstResultRels = glob->firstResultRels;
 	result->appendRelations = glob->appendRelations;
 	result->subplans = glob->subplans;
 	result->rewindPlanIDs = glob->rewindPlanIDs;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index ccdc9bc264a..229b39060ae 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -1274,6 +1274,9 @@ set_plan_refs(PlannerInfo *root, Plan *plan, int rtoffset)
 						lappend_int(root->glob->resultRelations,
 									splan->rootRelation);
 				}
+				root->glob->firstResultRels =
+					lappend_int(root->glob->firstResultRels,
+								linitial_int(splan->resultRelations));
 			}
 			break;
 		case T_Append:
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index d3964a12a14..249829f59a0 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1639,6 +1639,7 @@ exec_bind_message(StringInfo input_message)
 	int16	   *rformats = NULL;
 	CachedPlanSource *psrc;
 	CachedPlan *cplan;
+	CachedPlanPrepData cprep = {0};
 	Portal		portal;
 	char	   *query_string;
 	char	   *saved_stmt_name;
@@ -2021,7 +2022,11 @@ exec_bind_message(StringInfo input_message)
 	 * will be generated in MessageContext.  The plan refcount will be
 	 * assigned to the Portal, so it will be released at portal destruction.
 	 */
-	cplan = GetCachedPlan(psrc, params, NULL, NULL);
+
+	/* ExecutorPrep() state lives in portal context. */
+	cprep.context = portal->portalContext;
+	cprep.owner = portal->resowner;
+	cplan = GetCachedPlan(psrc, params, NULL, NULL, &cprep);
 
 	/*
 	 * Now we can define the portal.
@@ -2034,7 +2039,7 @@ exec_bind_message(StringInfo input_message)
 					  query_string,
 					  psrc->commandTag,
 					  cplan->stmt_list,
-					  NIL,
+					  cprep.prep_list,
 					  cplan);
 
 	/* Portal is defined, set the plan ID based on its contents. */
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 6661d2c6b73..c1cfd47422c 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -93,7 +93,7 @@ static bool StmtPlanRequiresRevalidation(CachedPlanSource *plansource);
 static bool BuildingPlanRequiresSnapshot(CachedPlanSource *plansource);
 static List *RevalidateCachedQuery(CachedPlanSource *plansource,
 								   QueryEnvironment *queryEnv);
-static bool CheckCachedPlan(CachedPlanSource *plansource);
+static bool PrepAndCheckCachedPlan(CachedPlanSource *plansource, CachedPlanPrepData *cprep);
 static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 								   ParamListInfo boundParams, QueryEnvironment *queryEnv);
 static bool choose_custom_plan(CachedPlanSource *plansource,
@@ -101,6 +101,8 @@ static bool choose_custom_plan(CachedPlanSource *plansource,
 static double cached_plan_cost(CachedPlan *plan, bool include_planner);
 static Query *QueryListGetPrimaryStmt(List *stmts);
 static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static void AcquireExecutorLocksUnpruned(List *stmt_list, bool acquire,
+										 CachedPlanPrepData *cprep);
 static void AcquirePlannerLocks(List *stmt_list, bool acquire);
 static void ScanQueryForLocks(Query *parsetree, bool acquire);
 static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -137,6 +139,26 @@ ResourceOwnerForgetPlanCacheRef(ResourceOwner owner, CachedPlan *plan)
 /* GUC parameter */
 int			plan_cache_mode = PLAN_CACHE_MODE_AUTO;
 
+/*
+ * Lock acquisition policy for execution locks.
+ *
+ * LOCK_ALL acquires locks on all relations mentioned in the plan,
+ * reproducing the behavior of AcquireExecutorLocks().
+ *
+ * LOCK_UNPRUNED restricts locking to only the unpruned relations. That
+ * includes those mentioned in PlannedStmt.unprunableRelids and the leaf
+ * partitions remaining after performing initial pruning.
+ */
+typedef enum LockPolicy
+{
+	LOCK_ALL,
+	LOCK_UNPRUNED,
+} LockPolicy;
+
+static void AcquireExecutorLocksWithPolicy(List *stmt_list,
+										   LockPolicy policy, bool acquire,
+										   CachedPlanPrepData *cprep);
+
 /*
  * InitPlanCache: initialize module during InitPostgres.
  *
@@ -938,7 +960,12 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
 }
 
 /*
- * CheckCachedPlan: see if the CachedPlanSource's generic plan is valid.
+ * PrepAndCheckCachedPlan: see if the CachedPlanSource's generic plan is valid.
+ *
+ * If 'cprep' is not NULL, ExecutorPrep() is applied to each PlannedStmt to
+ * compute the set of partitions that survive initial runtime pruning in order
+ * to only lock them.  The resulting ExecPrep structures are saved in cprep for
+ * later reuse by ExecutorStart().
  *
  * Caller must have already called RevalidateCachedQuery to verify that the
  * querytree is up to date.
@@ -947,7 +974,7 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
  * (We must do this for the "true" result to be race-condition-free.)
  */
 static bool
-CheckCachedPlan(CachedPlanSource *plansource)
+PrepAndCheckCachedPlan(CachedPlanSource *plansource, CachedPlanPrepData *cprep)
 {
 	CachedPlan *plan = plansource->gplan;
 
@@ -975,13 +1002,15 @@ CheckCachedPlan(CachedPlanSource *plansource)
 	 */
 	if (plan->is_valid)
 	{
+		LockPolicy policy = !cprep ? LOCK_ALL : LOCK_UNPRUNED;
+
 		/*
 		 * Plan must have positive refcount because it is referenced by
 		 * plansource; so no need to fear it disappears under us here.
 		 */
 		Assert(plan->refcount > 0);
 
-		AcquireExecutorLocks(plan->stmt_list, true);
+		AcquireExecutorLocksWithPolicy(plan->stmt_list, policy, true, cprep);
 
 		/*
 		 * If plan was transient, check to see if TransactionXmin has
@@ -1003,7 +1032,7 @@ CheckCachedPlan(CachedPlanSource *plansource)
 		}
 
 		/* Oops, the race case happened.  Release useless locks. */
-		AcquireExecutorLocks(plan->stmt_list, false);
+		AcquireExecutorLocksWithPolicy(plan->stmt_list, policy, false, cprep);
 	}
 
 	/*
@@ -1283,6 +1312,10 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
  * On return, the plan is valid and we have sufficient locks to begin
  * execution.
  *
+ * If 'cprep' is not NULL and a generic plan is reused, the function prepares
+ * each PlannedStmt via ExecutorPrep() and stores the results in
+ * cprep->prep_list.  These are intended to be passed later to ExecutorStart().
+ *
  * On return, the refcount of the plan has been incremented; a later
  * ReleaseCachedPlan() call is expected.  If "owner" is not NULL then
  * the refcount has been reported to that ResourceOwner (note that this
@@ -1293,7 +1326,8 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
  */
 CachedPlan *
 GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
-			  ResourceOwner owner, QueryEnvironment *queryEnv)
+			  ResourceOwner owner, QueryEnvironment *queryEnv,
+			  CachedPlanPrepData *cprep)
 {
 	CachedPlan *plan = NULL;
 	List	   *qlist;
@@ -1315,7 +1349,9 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 
 	if (!customplan)
 	{
-		if (CheckCachedPlan(plansource))
+		if (cprep)
+			cprep->params = boundParams;
+		if (PrepAndCheckCachedPlan(plansource, cprep))
 		{
 			/* We want a generic plan, and we already have a valid one */
 			plan = plansource->gplan;
@@ -1902,6 +1938,38 @@ QueryListGetPrimaryStmt(List *stmts)
 	return NULL;
 }
 
+/*
+ * AcquireExecutorLocksWithPolicy
+ *		Acquire or release execution locks for a cached plan according to
+ *		the specified policy.
+ *
+ * LOCK_ALL reproduces AcquireExecutorLocks(), locking every relation in
+ * each PlannedStmt's rtable.  LOCK_UNPRUNED restricts locking to the
+ * unprunable rels and partitions that survive initial runtime pruning.
+ *
+ * When LOCK_UNPRUNED is used on acquire, ExecutorPrep() is invoked for
+ * each PlannedStmt and the resulting ExecPrep pointers are appended to
+ * cprep->prep_list in cprep->context.  On release, the same ExecPrep
+ * list is consulted to determine which relations to unlock and is then
+ * cleaned up with ExecPrepCleanup().
+ */
+static void
+AcquireExecutorLocksWithPolicy(List *stmt_list, LockPolicy policy, bool acquire,
+							   CachedPlanPrepData *cprep)
+{
+	switch (policy)
+	{
+		case LOCK_ALL:
+			AcquireExecutorLocks(stmt_list, acquire);
+			break;
+		case LOCK_UNPRUNED:
+			AcquireExecutorLocksUnpruned(stmt_list, acquire, cprep);
+			break;
+		default:
+			elog(ERROR, "invalid LockPolicy");
+	}
+}
+
 /*
  * AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
  * or release them if acquire is false.
@@ -1954,6 +2022,158 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 	}
 }
 
+/*
+ * LockRelids
+ * 		Acquire or release locks on the specified relids, which reference
+ * 		entries in the provided range table.
+ *
+ * Helper for AcquireExecutorLocksUnpruned().
+ */
+static void
+LockRelids(List *rtable, Bitmapset *relids, bool acquire)
+{
+	int	rtindex = -1;
+
+	while ((rtindex = bms_next_member(relids, rtindex)) >= 0)
+	{
+		RangeTblEntry *rte = list_nth_node(RangeTblEntry, rtable, rtindex - 1);
+
+		Assert(rte->rtekind == RTE_RELATION ||
+			   (rte->rtekind == RTE_SUBQUERY && OidIsValid(rte->relid)));
+
+		/*
+		 * Acquire the appropriate type of lock on each relation OID. Note
+		 * that we don't actually try to open the rel, and hence will not
+		 * fail if it's been dropped entirely --- we'll just transiently
+		 * acquire a non-conflicting lock.
+		 */
+		if (acquire)
+			LockRelationOid(rte->relid, rte->rellockmode);
+		else
+			UnlockRelationOid(rte->relid, rte->rellockmode);
+	}
+}
+
+/*
+ * AcquireExecutorLocksUnpruned
+ *		Acquire or release execution locks for only unpruned relations
+ *		referenced by the given PlannedStmts.
+ *
+ * On acquire, this:
+ *	- locks unprunable rels listed in PlannedStmt.unprunableRelids
+ *	- runs ExecutorPrep() to perform initial runtime pruning
+ *	- locks the surviving partitions reported in the prep estate
+ *	- appends the ExecPrep pointer for each PlannedStmt to cprep->prep_list
+ *
+ * On release, it:
+ *	- looks up the ExecPrep object for each PlannedStmt from cprep->prep_list
+ *	  (which must already be populated)
+ *	- unlocks the same relations identified during acquire
+ *	- calls ExecPrepCleanup() on each ExecPrep
+ *
+ * prep_list is extended during acquire and must match stmt_list one-to-one
+ * when releasing locks.  Memory allocation for ExecPrep happens in
+ * cprep->context.  Locks are acquired using cprep->owner.
+ */
+
+static void
+AcquireExecutorLocksUnpruned(List *stmt_list, bool acquire,
+							 CachedPlanPrepData *cprep)
+{
+	MemoryContext oldcontext;
+	ListCell   *lc1;
+	List	   *prep_list;
+	int			i;
+
+	Assert(cprep);
+	oldcontext = MemoryContextSwitchTo(cprep->context);
+	/*
+	 * When releasing locks, use the ExecPrep list (if any) created during
+	 * acquisition to determine which relids to unlock. The list must match
+	 * the PlannedStmt list one-to-one.
+	 */
+	prep_list = cprep->prep_list;
+	Assert(acquire || list_length(prep_list) == list_length(stmt_list));
+
+	i = 0;
+	foreach(lc1, stmt_list)
+	{
+		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
+		ExecPrep *prep;
+
+		if (plannedstmt->commandType == CMD_UTILITY)
+		{
+			/* Same as AcquireExecutorLocks(). */
+			Query	   *query = UtilityContainsQuery(plannedstmt->utilityStmt);
+
+			if (query)
+				ScanQueryForLocks(query, acquire);
+
+			/* Keep the list one-to-one with stmt_list. */
+			if (acquire)
+				cprep->prep_list = lappend(cprep->prep_list, NULL);
+			continue;
+		}
+
+		/*
+		 * Lock tables mentioned in the original query and other unprunable
+		 * relations that were added to the plan via inheritance expansion.
+		 */
+		LockRelids(plannedstmt->rtable, plannedstmt->unprunableRelids, acquire);
+
+		/* Lock partitions surviving runtime initial pruning. */
+		if (acquire)
+		{
+			prep = ExecutorPrep(plannedstmt, cprep->params, cprep->owner, true,
+								cprep->eflags);
+			Assert(prep || plannedstmt->partPruneInfos == NULL);
+			cprep->prep_list = lappend(cprep->prep_list, prep);
+		}
+		else
+			prep = list_nth(prep_list, i++);
+
+		Assert(prep == NULL || prep->prep_estate);
+		if (prep)
+		{
+			EState *prep_estate = prep->prep_estate;
+
+			/*
+			 * es_unpruned_relids includes plannedstmt->unprunableRelids,
+			 * which we've already locked. Filter them out to avoid double-locking.
+			 */
+			Bitmapset *lock_relids = bms_difference(prep_estate->es_unpruned_relids,
+													plannedstmt->unprunableRelids);
+
+			/*
+			 * firstResultRels may contain pruned partitions that must still be
+			 * locked to satisfy executor assumptions (see comments in
+			 * ExecInitModifyTable()).  Ensure they're included here.
+			 */
+			if (plannedstmt->resultRelations)
+			{
+				ListCell *lc2;
+
+				foreach(lc2, plannedstmt->firstResultRels)
+				{
+					Index       firstResultRel = lfirst_int(lc2);
+
+					if (!bms_is_member(firstResultRel, lock_relids))
+						lock_relids = bms_add_member(lock_relids, firstResultRel);
+				}
+			}
+
+			LockRelids(plannedstmt->rtable, lock_relids, acquire);
+			bms_free(lock_relids);
+		}
+
+		/* Clean up prep if releasing locks. */
+		if (!acquire)
+			ExecPrepCleanup(prep);
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+}
+
 /*
  * AcquirePlannerLocks: acquire locks needed for planning of a querytree list;
  * or release them if acquire is false.
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 30d889b54c5..6fb86dc05f6 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -141,6 +141,9 @@ typedef struct PlannerGlobal
 	/* "flat" list of integer RT indexes */
 	List	   *resultRelations;
 
+	/* "flat" list of integer RT indexes (one per ModifyTable node) */
+	List	   *firstResultRels;
+
 	/* "flat" list of AppendRelInfos */
 	List	   *appendRelations;
 
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index c4393a94321..eb211f1ba56 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -123,6 +123,16 @@ typedef struct PlannedStmt
 	/* integer list of RT indexes, or NIL */
 	List	   *resultRelations;
 
+	/*
+	 * rtable indexes of first target relation in each ModifyTable node in the
+	 * plan for INSERT/UPDATE/DELETE/MERGE.  NIL if resultRelations is NIL.
+	 *
+	 * These are used by AcquireExecutorLocksUnpruned() to ensure that the
+	 * first result rel for each ModifyTable remains locked even if pruned;
+	 * see ExecInitModifyTable() for the executor side assumptions.
+	 */
+	List	   *firstResultRels;
+
 	/* list of AppendRelInfo nodes */
 	List	   *appendRelations;
 
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index a82b66d4bc2..c7b8ec4be39 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -197,6 +197,27 @@ typedef struct CachedExpression
 } CachedExpression;
 
 
+/*
+ * CachedPlanPrepData
+ *      Carries ExecutorPrep results for each PlannedStmt in a CachedPlan,
+ *      along with context and owner information needed to allocate them.
+ *
+ * prep_list is indexed one-to-one with CachedPlan->stmt_list, and is
+ * populated when GetCachedPlan() prepares a reused generic plan.  The
+ * same list is later used to determine which relations to unlock when
+ * releasing execution locks.
+ *
+ * ExecutorPrep state is allocated in 'context' and owned by 'owner'.
+ */
+typedef struct CachedPlanPrepData
+{
+	List   *prep_list;		/* one ExecPrep per PlannedStmt, or NULL */
+	ParamListInfo params;	/* params visible to ExecutorPrep */
+	MemoryContext context;	/* where to allocate ExecPrep objects */
+	ResourceOwner owner;	/* ResourceOwner for ExecutorPrep state */
+	int		eflags;			/* executor flags to pass to ExecutorPrep */
+} CachedPlanPrepData;
+
 extern void InitPlanCache(void);
 extern void ResetPlanCache(void);
 
@@ -240,7 +261,8 @@ extern List *CachedPlanGetTargetList(CachedPlanSource *plansource,
 extern CachedPlan *GetCachedPlan(CachedPlanSource *plansource,
 								 ParamListInfo boundParams,
 								 ResourceOwner owner,
-								 QueryEnvironment *queryEnv);
+								 QueryEnvironment *queryEnv,
+								 CachedPlanPrepData *cprep);
 extern void ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner);
 
 extern bool CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
-- 
2.47.3



  [application/octet-stream] v2-0003-Reuse-partition-pruning-results-in-parallel-worke.patch (9.1K, 5-v2-0003-Reuse-partition-pruning-results-in-parallel-worke.patch)
  download | inline diff:
From d9d95e09961dcb8236e5fe7b2da4a37fda8e5944 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Tue, 11 Nov 2025 22:17:47 +0900
Subject: [PATCH v2 3/5] Reuse partition pruning results in parallel workers

Pass the leader's initial partition pruning results and unpruned
relids to parallel workers and reuse them via ExecutorPrep(). This
avoids repeating pruning logic in workers, which is not only
redundant but also risks divergence due to nondeterminism in pruning
steps or parameter evaluation timing.

Introduce ExecCheckInitialPruningResults() to verify that the results
match what the worker would compute. This check helps catch
inconsistencies across leader and worker pruning logic.

While valuable on its own, this change also lays the foundation for
future optimizations where the leader may take locks only on
surviving partitions. Ensuring that workers follow identical pruning
decisions makes such selective locking safe.
---
 src/backend/executor/execParallel.c  | 67 +++++++++++++++++++++++++++-
 src/backend/executor/execPartition.c | 35 +++++++++++++++
 src/include/executor/execPartition.h |  1 +
 3 files changed, 102 insertions(+), 1 deletion(-)

diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index aedbd9566d6..751590adcc9 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -24,6 +24,7 @@
 #include "postgres.h"
 
 #include "executor/execParallel.h"
+#include "executor/execPartition.h"
 #include "executor/executor.h"
 #include "executor/nodeAgg.h"
 #include "executor/nodeAppend.h"
@@ -65,6 +66,8 @@
 #define PARALLEL_KEY_QUERY_TEXT		UINT64CONST(0xE000000000000008)
 #define PARALLEL_KEY_JIT_INSTRUMENTATION UINT64CONST(0xE000000000000009)
 #define PARALLEL_KEY_WAL_USAGE			UINT64CONST(0xE00000000000000A)
+#define PARALLEL_KEY_PARTITION_PRUNE_RESULTS	UINT64CONST(0xE00000000000000B)
+#define PARALLEL_KEY_UNPRUNED_RELIDS	UINT64CONST(0xE00000000000000C)
 
 #define PARALLEL_TUPLE_QUEUE_SIZE		65536
 
@@ -608,12 +611,18 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	FixedParallelExecutorState *fpes;
 	char	   *pstmt_data;
 	char	   *pstmt_space;
+	char	   *part_prune_results_data;
+	char	   *part_prune_results_space;
+	char	   *unpruned_relids_data;
+	char	   *unpruned_relids_space;
 	char	   *paramlistinfo_space;
 	BufferUsage *bufusage_space;
 	WalUsage   *walusage_space;
 	SharedExecutorInstrumentation *instrumentation = NULL;
 	SharedJitInstrumentation *jit_instrumentation = NULL;
 	int			pstmt_len;
+	int			part_prune_results_len;
+	int			unpruned_relids_len;
 	int			paramlistinfo_len;
 	int			instrumentation_len = 0;
 	int			jit_instrumentation_len = 0;
@@ -642,6 +651,8 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 
 	/* Fix up and serialize plan to be sent to workers. */
 	pstmt_data = ExecSerializePlan(planstate->plan, estate);
+	part_prune_results_data = nodeToString(estate->es_part_prune_results);
+	unpruned_relids_data = nodeToString(estate->es_unpruned_relids);
 
 	/* Create a parallel context. */
 	pcxt = CreateParallelContext("postgres", "ParallelQueryMain", nworkers);
@@ -668,6 +679,16 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	shm_toc_estimate_chunk(&pcxt->estimator, pstmt_len);
 	shm_toc_estimate_keys(&pcxt->estimator, 1);
 
+	/* Estimate space for serialized part_prune_results. */
+	part_prune_results_len = strlen(part_prune_results_data) + 1;
+	shm_toc_estimate_chunk(&pcxt->estimator, part_prune_results_len);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Estimate space for serialized unpruned_relids. */
+	unpruned_relids_len = strlen(unpruned_relids_data) + 1;
+	shm_toc_estimate_chunk(&pcxt->estimator, unpruned_relids_len);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
 	/* Estimate space for serialized ParamListInfo. */
 	paramlistinfo_len = EstimateParamListSpace(estate->es_param_list_info);
 	shm_toc_estimate_chunk(&pcxt->estimator, paramlistinfo_len);
@@ -769,6 +790,16 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	memcpy(pstmt_space, pstmt_data, pstmt_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PLANNEDSTMT, pstmt_space);
 
+	/* Store serialized part_prune_results */
+	part_prune_results_space = shm_toc_allocate(pcxt->toc, part_prune_results_len);
+	memcpy(part_prune_results_space, part_prune_results_data, part_prune_results_len);
+	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARTITION_PRUNE_RESULTS, part_prune_results_space);
+
+	/* Store serialized unpruned_relids */
+	unpruned_relids_space = shm_toc_allocate(pcxt->toc, unpruned_relids_len);
+	memcpy(unpruned_relids_space, unpruned_relids_data, unpruned_relids_len);
+	shm_toc_insert(pcxt->toc, PARALLEL_KEY_UNPRUNED_RELIDS, unpruned_relids_space);
+
 	/* Store serialized ParamListInfo. */
 	paramlistinfo_space = shm_toc_allocate(pcxt->toc, paramlistinfo_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARAMLISTINFO, paramlistinfo_space);
@@ -1263,10 +1294,15 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 						 int instrument_options)
 {
 	char	   *pstmtspace;
+	char	   *part_prune_results_space;
+	char	   *unpruned_relids_space;
 	char	   *paramspace;
 	PlannedStmt *pstmt;
+	List	   *part_prune_results;
+	Bitmapset  *unpruned_relids;
 	ParamListInfo paramLI;
 	char	   *queryString;
+	ExecPrep   *prep = NULL;
 
 	/* Get the query string from shared memory */
 	queryString = shm_toc_lookup(toc, PARALLEL_KEY_QUERY_TEXT, false);
@@ -1279,9 +1315,38 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 	paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
 	paramLI = RestoreParamList(&paramspace);
 
+	/* Reconstruct leader-supplied part_prune_results and unpruned_relids. */
+	part_prune_results_space =
+		shm_toc_lookup(toc, PARALLEL_KEY_PARTITION_PRUNE_RESULTS, false);
+	part_prune_results = (List *) stringToNode(part_prune_results_space);
+	unpruned_relids_space =
+		shm_toc_lookup(toc, PARALLEL_KEY_UNPRUNED_RELIDS, false);
+	unpruned_relids = (Bitmapset *) stringToNode(unpruned_relids_space);
+
+	/*
+	 * If pruning was done in the leader, build a prep estate in the worker
+	 * and inject the leader's pruning results into it for reuse.
+	 */
+	if (pstmt->partPruneInfos)
+	{
+		prep = ExecutorPrep(pstmt, paramLI, CurrentResourceOwner, false, 0);
+		Assert(prep->prep_estate);
+
+		prep->prep_estate->es_part_prune_results = part_prune_results;
+		prep->prep_estate->es_unpruned_relids =
+			bms_add_members(prep->prep_estate->es_unpruned_relids,
+							unpruned_relids);
+
+		/*
+		 * Verify that the pruning results passed from the leader match
+		 * what the worker would independently compute.
+		 */
+		ExecCheckInitialPruningResults(prep->prep_estate);
+	}
+
 	/* Create a QueryDesc for the query. */
 	return CreateQueryDesc(pstmt,
-						   NULL,
+						   prep,
 						   queryString,
 						   GetActiveSnapshot(), InvalidSnapshot,
 						   receiver, paramLI, NULL, instrument_options);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 187a480e508..3b450e3373f 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -1872,6 +1872,41 @@ ExecDoInitialPruning(EState *estate)
 	}
 }
 
+/*
+ * ExecCheckInitialPruningResults
+ *      Verify partition pruning results passed from the leader process.
+ *
+ * This is intended to be called during parallel worker query setup.
+ * It recomputes initial pruning results locally and compares them with
+ * those received from the leader. Any mismatch may indicate a divergence
+ * between leader and worker logic or environment.
+ */
+void
+ExecCheckInitialPruningResults(EState *estate)
+{
+	ListCell   *lc;
+	int			i;
+
+	Assert(estate->es_part_prune_results != NULL);
+	i = 0;
+	foreach(lc, estate->es_part_prune_states)
+	{
+		PartitionPruneState *prunestate = (PartitionPruneState *) lfirst(lc);
+		Bitmapset *reuse_validsubplans =
+				list_nth_node(Bitmapset, estate->es_part_prune_results, i++);
+		Bitmapset  *validsubplans = NULL;
+		Bitmapset  *validsubplan_rtis = NULL;
+
+		if (prunestate->do_initial_prune)
+			validsubplans = ExecFindMatchingSubPlans(prunestate, true,
+													 &validsubplan_rtis);
+		if (bms_nonempty_difference(validsubplans, reuse_validsubplans))
+			elog(ERROR, "different validsubplans in parallel worker");
+		if (bms_nonempty_difference(validsubplan_rtis, estate->es_unpruned_relids))
+			elog(ERROR, "different unpruned_relids in parallel worker");
+	}
+}
+
 /*
  * ExecInitPartitionExecPruning
  *		Initialize the data structures needed for runtime "exec" partition
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index ba8cc594fc9..126efd008e5 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -132,6 +132,7 @@ typedef struct PartitionPruneState
 
 extern void ExecCreatePartitionPruneStates(EState *estate);
 extern void ExecDoInitialPruning(EState *estate);
+extern void ExecCheckInitialPruningResults(EState *estate);
 extern PartitionPruneState *ExecInitPartitionExecPruning(PlanState *planstate,
 														 int n_total_subplans,
 														 int part_prune_index,
-- 
2.47.3



  [application/octet-stream] v2-0002-Introduce-ExecutorPrep-and-refactor-executor-star.patch (28.7K, 6-v2-0002-Introduce-ExecutorPrep-and-refactor-executor-star.patch)
  download | inline diff:
From 11e0262e31e35539f50e96531559db6cd7e32160 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Tue, 11 Nov 2025 21:47:46 +0900
Subject: [PATCH v2 2/5] Introduce ExecutorPrep and refactor executor startup

Factor permission checks, range table initialization, and initial
partition pruning out of InitPlan() into a new ExecutorPrep()
helper.  ExecutorPrep builds an EState containing the executor
metadata needed before plan execution, including partition
pruning state where partPruneInfos are present.

ExecutorStart() now expects QueryDesc->prep to point at such an
ExecPrep object.  If no prep was supplied by the caller, it
invokes ExecutorPrep() itself and adopts the resulting EState
for the duration of the query.  This keeps the executor startup
behaviour unchanged while making the setup work callable
separately when needed.

CreateQueryDesc() grows a prep argument and stores it in the
QueryDesc.  Portals, SPI, SQL functions, and EXPLAIN are wired
to carry an optional ExecPrep pointer alongside the PlannedStmt
list, but most callers still pass NULL and let ExecutorStart()
perform the setup lazily.

Add the ExecPrep struct and ExecPrepCleanup() to encapsulate
ownership of the prepared EState and any caller-specific
cleanup hook.  Update executor/README and related comments to
document the new control flow and the separation between
preparation and execution.
---
 src/backend/commands/copyto.c        |   2 +-
 src/backend/commands/createas.c      |   2 +-
 src/backend/commands/explain.c       |   7 +-
 src/backend/commands/extension.c     |   1 +
 src/backend/commands/matview.c       |   2 +-
 src/backend/commands/portalcmds.c    |   1 +
 src/backend/commands/prepare.c       |  11 +-
 src/backend/executor/README          |   8 +-
 src/backend/executor/execMain.c      | 179 +++++++++++++++++++++++----
 src/backend/executor/execParallel.c  |   1 +
 src/backend/executor/execPartition.c |   3 +
 src/backend/executor/functions.c     |   1 +
 src/backend/executor/spi.c           |  10 ++
 src/backend/tcop/postgres.c          |   2 +
 src/backend/tcop/pquery.c            |  27 +++-
 src/backend/utils/mmgr/portalmem.c   |   2 +
 src/include/commands/explain.h       |   3 +-
 src/include/executor/execdesc.h      |   3 +-
 src/include/executor/executor.h      |  11 ++
 src/include/nodes/execnodes.h        |  48 +++++++
 src/include/utils/portal.h           |   2 +
 21 files changed, 286 insertions(+), 40 deletions(-)

diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index cef452584e5..5efbb0949c2 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -870,7 +870,7 @@ BeginCopyTo(ParseState *pstate,
 		((DR_copy *) dest)->cstate = cstate;
 
 		/* Create a QueryDesc requesting no output */
-		cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+		cstate->queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
 											dest, NULL, NULL, 0);
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 1ccc2e55c64..9eabe4920cd 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -334,7 +334,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
 		UpdateActiveSnapshotCommandId();
 
 		/* Create a QueryDesc, redirecting output to our tuple receiver */
-		queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+		queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
 									GetActiveSnapshot(), InvalidSnapshot,
 									dest, params, queryEnv, 0);
 
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 7e699f8595e..d6ab3697dd9 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -370,7 +370,7 @@ standard_ExplainOneQuery(Query *query, int cursorOptions,
 	}
 
 	/* run it (if needed) and produce output */
-	ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+	ExplainOnePlan(plan, NULL, into, es, queryString, params, queryEnv,
 				   &planduration, (es->buffers ? &bufusage : NULL),
 				   es->memory ? &mem_counters : NULL);
 }
@@ -492,7 +492,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
  * to call it.
  */
 void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, ExecPrep *prep,
+			   IntoClause *into, ExplainState *es,
 			   const char *queryString, ParamListInfo params,
 			   QueryEnvironment *queryEnv, const instr_time *planduration,
 			   const BufferUsage *bufusage,
@@ -548,7 +549,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
 		dest = None_Receiver;
 
 	/* Create a QueryDesc for the query */
-	queryDesc = CreateQueryDesc(plannedstmt, queryString,
+	queryDesc = CreateQueryDesc(plannedstmt, prep, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, params, queryEnv, instrument_option);
 
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index 93ef1ad106f..3cca6d45ec1 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -993,6 +993,7 @@ execute_sql_string(const char *sql, const char *filename)
 				QueryDesc  *qdesc;
 
 				qdesc = CreateQueryDesc(stmt,
+										NULL,
 										sql,
 										GetActiveSnapshot(), NULL,
 										dest, NULL, NULL, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index ef7c0d624f1..30cbf9f264f 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -437,7 +437,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
 	UpdateActiveSnapshotCommandId();
 
 	/* Create a QueryDesc, redirecting output to our tuple receiver */
-	queryDesc = CreateQueryDesc(plan, queryString,
+	queryDesc = CreateQueryDesc(plan, NULL, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, NULL, NULL, 0);
 
diff --git a/src/backend/commands/portalcmds.c b/src/backend/commands/portalcmds.c
index ec96c2efcd3..ac1ddd25aba 100644
--- a/src/backend/commands/portalcmds.c
+++ b/src/backend/commands/portalcmds.c
@@ -118,6 +118,7 @@ PerformCursorOpen(ParseState *pstate, DeclareCursorStmt *cstmt, ParamListInfo pa
 					  queryString,
 					  CMDTAG_SELECT,	/* cursor's query is always a SELECT */
 					  list_make1(plan),
+					  list_make1(NULL),
 					  NULL);
 
 	/*----------
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 34b6410d6a2..afd449c73ba 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -205,6 +205,7 @@ ExecuteQuery(ParseState *pstate,
 					  query_string,
 					  entry->plansource->commandTag,
 					  plan_list,
+					  NIL,
 					  cplan);
 
 	/*
@@ -575,6 +576,7 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	const char *query_string;
 	CachedPlan *cplan;
 	List	   *plan_list;
+	List	   *prep_list;
 	ListCell   *p;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
@@ -585,6 +587,7 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	MemoryContextCounters mem_counters;
 	MemoryContext planner_ctx = NULL;
 	MemoryContext saved_ctx = NULL;
+	int			i;
 
 	if (es->memory)
 	{
@@ -650,14 +653,20 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	}
 
 	plan_list = cplan->stmt_list;
+	prep_list = NIL;
 
 	/* Explain each query */
+	i = 0;
 	foreach(p, plan_list)
 	{
 		PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
+		ExecPrep *prep = prep_list ?
+			(ExecPrep *) list_nth(prep_list, i) : NULL;
 
+		i++;
 		if (pstmt->commandType != CMD_UTILITY)
-			ExplainOnePlan(pstmt, into, es, query_string, paramLI, pstate->p_queryEnv,
+			ExplainOnePlan(pstmt, prep,
+						   into, es, query_string, paramLI, pstate->p_queryEnv,
 						   &planduration, (es->buffers ? &bufusage : NULL),
 						   es->memory ? &mem_counters : NULL);
 		else
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 54f4782f31b..95b5ec58c55 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -291,10 +291,16 @@ Query Processing Control Flow
 
 This is a sketch of control flow for full query processing:
 
+    ExecutorPrep
+		May be run before ExecutorStart (e.g., for plan validation), or
+		implicitly from ExecutorStart if not done earlier.  Performs range
+		table initialization, permission checks, and initial partition pruning.
+		Returns an ExecPrep wrapper with EState that ExecutorStart may reuse.
+
 	CreateQueryDesc
 
 	ExecutorStart
-		CreateExecutorState
+		CreateExecutorState (or reuse one from ExecPrep if present)
 			creates per-query context
 		switch to per-query context to run ExecInitNode
 		AfterTriggerBeginQuery
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 27c9eec697b..39de0b93a1c 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -171,8 +171,26 @@ standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
 
 	/*
 	 * Build EState, switch into per-query memory context for startup.
+	 *
+	 * If ExecutorPrep() ran earlier (e.g., to do initial pruning during plan
+	 * validity checking), reuse its EState to avoid redoing range table setup
+	 * and pruning. Otherwise, create a fresh EState as usual.
 	 */
-	estate = CreateExecutorState();
+	if (queryDesc->prep == NULL)
+		queryDesc->prep = ExecutorPrep(queryDesc->plannedstmt,
+									   queryDesc->params,
+									   CurrentResourceOwner,
+									   true,
+									   eflags);
+	Assert(queryDesc->prep);
+	estate = queryDesc->prep->prep_estate;
+
+	/*
+	 * Executor is adopting the prep's EState. Mark it so ExecPrepCleanup()
+	 * doesn't try to free it redundantly.
+	 */
+	queryDesc->prep->owns_estate = false;
+
 	queryDesc->estate = estate;
 
 	oldcontext = MemoryContextSwitchTo(estate->es_query_cxt);
@@ -263,6 +281,136 @@ standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
 	MemoryContextSwitchTo(oldcontext);
 }
 
+/*
+ * ExecutorPrep: prepare executor state for a PlannedStmt outside ExecutorStart.
+ *
+ * Performs range table initialization, permission checks, and initial
+ * partition pruning if partPruneInfos are present and do_initial_pruning is
+ * true.
+ *
+ * This is intended for callers that need executor metadata ahead of actual
+ * execution. Typical use cases include:
+ *	- determining which relations must be locked during plan cache validation;
+ *	- initializing unpruned relids and valid subplans in parallel workers
+ *	  using state copied from the leader.
+ *
+ * The executor can reuse the resulting state to avoid redundant setup during
+ * ExecutorStart().
+ *
+ * Returns an ExecPrep wrapper that owns the EState and can be reused
+ * or cleaned up later.
+ */
+ExecPrep *
+ExecutorPrep(PlannedStmt *pstmt, ParamListInfo params, ResourceOwner owner,
+			 bool do_initial_pruning, int eflags)
+{
+	ResourceOwner oldowner;
+	EState *estate;
+	bool	snapshot_set;
+
+	if (pstmt->commandType == CMD_UTILITY)
+		return NULL;
+
+	/* Pruning may use expressions that require an active snapshot. */
+	snapshot_set = false;
+	if (!ActiveSnapshotSet())
+	{
+		PushActiveSnapshot(GetTransactionSnapshot());
+		snapshot_set = true;
+	}
+	Assert(ActiveSnapshotSet());
+
+	estate = CreateExecutorState();
+	estate->es_plannedstmt = pstmt;
+	estate->es_part_prune_infos = pstmt->partPruneInfos;
+	estate->es_param_list_info = params;
+	estate->es_top_eflags = eflags;
+
+	/*
+	 * Do permissions checks.
+	 */
+	ExecCheckPermissions(pstmt->rtable, pstmt->permInfos, true);
+
+	/*
+	 * Initialize range table.
+	 */
+	ExecInitRangeTable(estate, pstmt->rtable, pstmt->permInfos,
+					   bms_copy(pstmt->unprunableRelids));
+
+	/*
+	 * Ensure locks taken during initial pruning are tracked under the given
+	 * ResourceOwner (e.g., one associated with CachedPlan validation).
+	 */
+	oldowner = CurrentResourceOwner;
+	CurrentResourceOwner = owner;
+
+	/*
+	 * Set up PartitionPruneState structures needed for both initial and
+	 * runtime partition pruning. These structures are built from the
+	 * PartitionPruneInfo entries in the plan tree.
+	 *
+	 * If do_initial_pruning is true, also perform initial pruning to compute
+	 * the subset of child subplans that will be executed. The results,
+	 * which are bitmapsets of selected child indexes, are saved in
+	 * es_part_prune_results. This list is parallel to es_part_prune_infos.
+	 *
+	 * In parallel workers, do_initial_pruning should be false -- they receive
+	 * es_part_prune_results from the leader process and should only initialize
+	 * the PartitionPruneStates.
+	 */
+	ExecCreatePartitionPruneStates(estate);
+	if (do_initial_pruning)
+		ExecDoInitialPruning(estate);
+
+	CurrentResourceOwner = oldowner;
+
+	/* Release snapshot if we got one */
+	if (snapshot_set)
+		PopActiveSnapshot();
+
+	return CreateExecPrep(estate, CurrentMemoryContext, NULL, NULL);
+}
+
+/*
+ * CreateExecPrep: initialize ExecPrep wrapper with optional cleanup metadata.
+ */
+ExecPrep *
+CreateExecPrep(EState *estate, MemoryContext context,
+			   execprep_cleanup_fn cleanup, void *cleanup_arg)
+{
+	ExecPrep *prep = palloc0(sizeof(ExecPrep));
+
+	prep->prep_estate = estate;
+	prep->context = context;
+	prep->cleanup = cleanup;
+	prep->cleanup_arg = cleanup_arg;
+	prep->owns_estate = true;
+
+	return prep;
+}
+
+/*
+ * ExecPrepCleanup: free ExecPrep resources not adopted by the executor.
+ *
+ * Only frees the EState if it wasn't taken over by ExecutorStart().
+ * Always runs the optional user-defined cleanup callback.
+ */
+void
+ExecPrepCleanup(ExecPrep *prep)
+{
+	if (prep == NULL)
+		return;
+
+	if (prep->prep_estate && prep->owns_estate)
+	{
+		ExecCloseRangeTableRelations(prep->prep_estate);
+		FreeExecutorState(prep->prep_estate);
+	}
+
+	if (prep->cleanup)
+		prep->cleanup(prep->cleanup_arg);
+}
+
 /* ----------------------------------------------------------------
  *		ExecutorRun
  *
@@ -824,7 +972,6 @@ ExecCheckXactReadOnly(PlannedStmt *plannedstmt)
 		PreventCommandIfParallelMode(CreateCommandName((Node *) plannedstmt));
 }
 
-
 /* ----------------------------------------------------------------
  *		InitPlan
  *
@@ -838,37 +985,15 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 	CmdType		operation = queryDesc->operation;
 	PlannedStmt *plannedstmt = queryDesc->plannedstmt;
 	Plan	   *plan = plannedstmt->planTree;
-	List	   *rangeTable = plannedstmt->rtable;
 	EState	   *estate = queryDesc->estate;
 	PlanState  *planstate;
 	TupleDesc	tupType;
 	ListCell   *l;
 	int			i;
 
-	/*
-	 * Do permissions checks
-	 */
-	ExecCheckPermissions(rangeTable, plannedstmt->permInfos, true);
-
-	/*
-	 * initialize the node's execution state
-	 */
-	ExecInitRangeTable(estate, rangeTable, plannedstmt->permInfos,
-					   bms_copy(plannedstmt->unprunableRelids));
-
-	estate->es_plannedstmt = plannedstmt;
-	estate->es_part_prune_infos = plannedstmt->partPruneInfos;
-
-	/*
-	 * Perform runtime "initial" pruning to identify which child subplans,
-	 * corresponding to the children of plan nodes that contain
-	 * PartitionPruneInfo such as Append, will not be executed. The results,
-	 * which are bitmapsets of indexes of the child subplans that will be
-	 * executed, are saved in es_part_prune_results.  These results correspond
-	 * to each PartitionPruneInfo entry, and the es_part_prune_results list is
-	 * parallel to es_part_prune_infos.
-	 */
-	ExecDoInitialPruning(estate);
+	/* ExecutorPrep() must have been done. */
+	Assert(queryDesc->prep);
+	Assert(estate == queryDesc->prep->prep_estate);
 
 	/*
 	 * Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index f098a5557cf..aedbd9566d6 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -1281,6 +1281,7 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 
 	/* Create a QueryDesc for the query. */
 	return CreateQueryDesc(pstmt,
+						   NULL,
 						   queryString,
 						   GetActiveSnapshot(), InvalidSnapshot,
 						   receiver, paramLI, NULL, instrument_options);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 88b150c8d77..187a480e508 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -2368,6 +2368,9 @@ InitExecPartitionPruneContexts(PartitionPruneState *prunestate,
 	Assert(parent_plan != NULL);
 	estate = parent_plan->state;
 
+	/* Wouldn't be available at ExecutorPrep() time. */
+	prunestate->econtext->ecxt_param_exec_vals = estate->es_param_exec_vals;
+
 	/*
 	 * No need to fix subplans maps if initial pruning didn't eliminate any
 	 * subplans.
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index 630d708d2a3..633310c5f5b 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -1362,6 +1362,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
 		dest = None_Receiver;
 
 	es->qd = CreateQueryDesc(es->stmt,
+							 NULL,
 							 fcache->func->src,
 							 GetActiveSnapshot(),
 							 InvalidSnapshot,
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 653500b38dc..7a3cb944d6f 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1685,6 +1685,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 					  query_string,
 					  plansource->commandTag,
 					  stmt_list,
+					  NIL,
 					  cplan);
 
 	/*
@@ -2500,6 +2501,8 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 		CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
 		List	   *stmt_list;
 		ListCell   *lc2;
+		List	   *prep_list;
+		int			i;
 
 		spicallbackarg.query = plansource->query_string;
 
@@ -2578,6 +2581,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 							  plan_owner, _SPI_current->queryEnv);
 
 		stmt_list = cplan->stmt_list;
+		prep_list = NIL;
 
 		/*
 		 * If we weren't given a specific snapshot to use, and the statement
@@ -2615,12 +2619,17 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 			}
 		}
 
+		i = 0;
 		foreach(lc2, stmt_list)
 		{
 			PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+			ExecPrep *prep = prep_list ?
+				list_nth(prep_list, i) : NULL;
 			bool		canSetTag = stmt->canSetTag;
 			DestReceiver *dest;
 
+			i++;
+
 			/*
 			 * Reset output state.  (Note that if a non-SPI receiver is used,
 			 * _SPI_current->processed will stay zero, and that's what we'll
@@ -2690,6 +2699,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 					snap = InvalidSnapshot;
 
 				qdesc = CreateQueryDesc(stmt,
+										prep,
 										plansource->query_string,
 										snap, crosscheck_snapshot,
 										dest,
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 2bd89102686..d3964a12a14 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1232,6 +1232,7 @@ exec_simple_query(const char *query_string)
 						  query_string,
 						  commandTag,
 						  plantree_list,
+						  NIL,
 						  NULL);
 
 		/*
@@ -2033,6 +2034,7 @@ exec_bind_message(StringInfo input_message)
 					  query_string,
 					  psrc->commandTag,
 					  cplan->stmt_list,
+					  NIL,
 					  cplan);
 
 	/* Portal is defined, set the plan ID based on its contents. */
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index fde78c55160..82c295502b0 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -37,6 +37,7 @@ Portal		ActivePortal = NULL;
 
 
 static void ProcessQuery(PlannedStmt *plan,
+						 ExecPrep *prep,
 						 const char *sourceText,
 						 ParamListInfo params,
 						 QueryEnvironment *queryEnv,
@@ -66,6 +67,7 @@ static void DoPortalRewind(Portal portal);
  */
 QueryDesc *
 CreateQueryDesc(PlannedStmt *plannedstmt,
+				ExecPrep *prep,
 				const char *sourceText,
 				Snapshot snapshot,
 				Snapshot crosscheck_snapshot,
@@ -78,6 +80,7 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
 
 	qd->operation = plannedstmt->commandType;	/* operation */
 	qd->plannedstmt = plannedstmt;	/* plan */
+	qd->prep = prep;		/* executor prep output */
 	qd->sourceText = sourceText;	/* query text */
 	qd->snapshot = RegisterSnapshot(snapshot);	/* snapshot */
 	/* RI check snapshot */
@@ -112,6 +115,13 @@ FreeQueryDesc(QueryDesc *qdesc)
 	UnregisterSnapshot(qdesc->snapshot);
 	UnregisterSnapshot(qdesc->crosscheck_snapshot);
 
+	/* ExecPrep cleanup if necessary */
+	if (qdesc->prep)
+	{
+		ExecPrepCleanup(qdesc->prep);
+		qdesc->prep = NULL;
+	}
+
 	/* Only the QueryDesc itself need be freed */
 	pfree(qdesc);
 }
@@ -123,6 +133,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  *		PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
  *
  *	plan: the plan tree for the query
+ *	prep: ExecPrep for the plan (output of ExecutorPrep())
  *	sourceText: the source text of the query
  *	params: any parameters needed
  *	dest: where to send results
@@ -135,6 +146,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  */
 static void
 ProcessQuery(PlannedStmt *plan,
+			 ExecPrep *prep,
 			 const char *sourceText,
 			 ParamListInfo params,
 			 QueryEnvironment *queryEnv,
@@ -146,7 +158,7 @@ ProcessQuery(PlannedStmt *plan,
 	/*
 	 * Create the QueryDesc object
 	 */
-	queryDesc = CreateQueryDesc(plan, sourceText,
+	queryDesc = CreateQueryDesc(plan, prep, sourceText,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, params, queryEnv, 0);
 
@@ -489,6 +501,9 @@ PortalStart(Portal portal, ParamListInfo params,
 				 * the destination to DestNone.
 				 */
 				queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+											portal->preps ?
+											(ExecPrep *) linitial(portal->preps) :
+											NULL,
 											portal->sourceText,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
@@ -1185,6 +1200,7 @@ PortalRunMulti(Portal portal,
 {
 	bool		active_snapshot_set = false;
 	ListCell   *stmtlist_item;
+	int			i;
 
 	/*
 	 * If the destination is DestRemoteExecute, change to DestNone.  The
@@ -1205,9 +1221,14 @@ PortalRunMulti(Portal portal,
 	 * Loop to handle the individual queries generated from a single parsetree
 	 * by analysis and rewrite.
 	 */
+	i = 0;
 	foreach(stmtlist_item, portal->stmts)
 	{
 		PlannedStmt *pstmt = lfirst_node(PlannedStmt, stmtlist_item);
+		ExecPrep *prep = portal->preps ?
+			list_nth(portal->preps, i) : NULL;
+
+		i++;
 
 		/*
 		 * If we got a cancel signal in prior command, quit
@@ -1265,7 +1286,7 @@ PortalRunMulti(Portal portal,
 			if (pstmt->canSetTag)
 			{
 				/* statement can set tag string */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, prep,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
@@ -1274,7 +1295,7 @@ PortalRunMulti(Portal portal,
 			else
 			{
 				/* stmt added by rewrite cannot set tag */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, prep,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index 943da087c9f..313f8ef2fdc 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -284,6 +284,7 @@ PortalDefineQuery(Portal portal,
 				  const char *sourceText,
 				  CommandTag commandTag,
 				  List *stmts,
+				  List *preps,
 				  CachedPlan *cplan)
 {
 	Assert(PortalIsValid(portal));
@@ -298,6 +299,7 @@ PortalDefineQuery(Portal portal,
 	portal->qc.nprocessed = 0;
 	portal->commandTag = commandTag;
 	portal->stmts = stmts;
+	portal->preps = preps;
 	portal->cplan = cplan;
 	portal->status = PORTAL_DEFINED;
 }
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 6e51d50efc7..6aa8b275aa2 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -63,7 +63,8 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
 							  ExplainState *es, ParseState *pstate,
 							  ParamListInfo params);
 
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt, ExecPrep *prep,
+						   IntoClause *into,
 						   ExplainState *es, const char *queryString,
 						   ParamListInfo params, QueryEnvironment *queryEnv,
 						   const instr_time *planduration,
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index 86db3dc8d0d..c18530f5d11 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -18,7 +18,6 @@
 #include "nodes/execnodes.h"
 #include "tcop/dest.h"
 
-
 /* ----------------
  *		query descriptor:
  *
@@ -35,6 +34,7 @@ typedef struct QueryDesc
 	/* These fields are provided by CreateQueryDesc */
 	CmdType		operation;		/* CMD_SELECT, CMD_UPDATE, etc. */
 	PlannedStmt *plannedstmt;	/* planner's output (could be utility, too) */
+	ExecPrep *prep;				/* output of ExecutorPrep() or NULL */
 	const char *sourceText;		/* source text of the query */
 	Snapshot	snapshot;		/* snapshot to use for query */
 	Snapshot	crosscheck_snapshot;	/* crosscheck for RI update/delete */
@@ -57,6 +57,7 @@ typedef struct QueryDesc
 
 /* in pquery.c */
 extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
+								  ExecPrep *prep,
 								  const char *sourceText,
 								  Snapshot snapshot,
 								  Snapshot crosscheck_snapshot,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index fa2b657fb2f..3579926d4e8 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -20,6 +20,7 @@
 #include "nodes/lockoptions.h"
 #include "nodes/parsenodes.h"
 #include "utils/memutils.h"
+#include "utils/resowner.h"
 
 
 /*
@@ -234,6 +235,16 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
  */
 extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
 extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
+
+extern ExecPrep *ExecutorPrep(PlannedStmt *pstmt,
+							  ParamListInfo params,
+							  ResourceOwner owner,
+							  bool do_initial_pruning,
+							  int eflags);
+extern ExecPrep *CreateExecPrep(EState *estate, MemoryContext context,
+								execprep_cleanup_fn cleanup, void *cleanup_arg);
+extern void ExecPrepCleanup(ExecPrep *prep);
+
 extern void ExecutorRun(QueryDesc *queryDesc,
 						ScanDirection direction, uint64 count);
 extern void standard_ExecutorRun(QueryDesc *queryDesc,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 18ae8f0d4bb..8bdecd631bf 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -772,6 +772,54 @@ typedef struct EState
 	List	   *es_insert_pending_modifytables;
 } EState;
 
+/*
+ * ExecPrep: encapsulates executor preparation results for a PlannedStmt.
+ *
+ * ExecutorPrep() factors out executor setup steps such as initializing the
+ * range table, checking permissions, and executing initial partition pruning.
+ * ExecutorStart() can reuse the prepared EState instead of repeating that
+ * work, and other callers (such as plan cache validation) can use it without
+ * running the full plan.
+ */
+
+/*
+ * Optional callback to clean up user-specific resources associated with
+ * ExecPrep.
+ */
+typedef void (*execprep_cleanup_fn)(void *prep);
+
+typedef struct ExecPrep
+{
+	/*
+	 * Context in which this struct and all subsidiary allocations were made.
+	 * This context must remain alive until ExecPrepCleanup is called.
+	 */
+	MemoryContext context;
+
+	/*
+	 * Partially-initialized executor state used for permission checks and
+	 * pruning. May be adopted directly by ExecutorStart(), in which case
+	 * ExecPrepCleanup will skip freeing it.
+	 */
+	EState	   *prep_estate;
+
+	/*
+	 * True if ExecPrepCleanup() must free the EState.  If the executor adopts
+	 * prep_estate, this is set to false to avoid double-free.
+	 */
+	bool		owns_estate;
+
+	/*
+	 * Optional caller-supplied cleanup hook to run during ExecPrepCleanup.
+	 * Useful for releasing external resources associated with the prep.
+	 */
+	execprep_cleanup_fn cleanup;
+
+	/*
+	 * Opaque pointer to pass to the cleanup hook.
+	 */
+	void	   *cleanup_arg;
+} ExecPrep;
 
 /*
  * ExecRowMark -
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index 5ffa6fd5cc8..013bcc3bd8e 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -137,6 +137,7 @@ typedef struct PortalData
 	CommandTag	commandTag;		/* command tag for original query */
 	QueryCompletion qc;			/* command completion data for executed query */
 	List	   *stmts;			/* list of PlannedStmts */
+	List	   *preps;			/* list of ExecPreps where needed */
 	CachedPlan *cplan;			/* CachedPlan, if stmts are from one */
 
 	ParamListInfo portalParams; /* params to pass to query */
@@ -240,6 +241,7 @@ extern void PortalDefineQuery(Portal portal,
 							  const char *sourceText,
 							  CommandTag commandTag,
 							  List *stmts,
+							  List *preps,
 							  CachedPlan *cplan);
 extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
 extern void PortalCreateHoldStore(Portal portal);
-- 
2.47.3



^ permalink  raw  reply  [nested|flat] 108+ messages in thread

* Re: generic plans and "initial" pruning
@ 2025-11-20 07:30  Amit Langote <[email protected]>
  parent: Amit Langote <[email protected]>
  0 siblings, 2 replies; 108+ messages in thread

From: Amit Langote @ 2025-11-20 07:30 UTC (permalink / raw)
  To: Tom Lane <[email protected]>; +Cc: Tender Wang <[email protected]>; Alexander Lakhin <[email protected]>; Tomas Vondra <[email protected]>; Robert Haas <[email protected]>; Alvaro Herrera <[email protected]>; Andres Freund <[email protected]>; Daniel Gustafsson <[email protected]>; David Rowley <[email protected]>; pgsql-hackers; Thom Brown <[email protected]>

On Mon, Nov 17, 2025 at 9:50 PM Amit Langote <[email protected]> wrote:
> On Wed, Nov 12, 2025 at 11:17 PM Amit Langote <[email protected]> wrote:
> > * Enable pruning-aware locking in cached / generic plan reuse (0004):
> > extends GetCachedPlan() and CheckCachedPlan() to call ExecutorPrep()
> > on each PlannedStmt in the CachedPlan, locking only surviving
> > partitions. Adds CachedPlanPrepData to pass this through plan cache
> > APIs and down to execution via QueryDesc. Also reinstates the
> > firstResultRel locking rule added in 28317de72 but later lost due to
> > revert of the earlier pruning patch, to ensure correctness when all
> > target partitions are pruned.
>
> Looking at the changes to executor/functions.c, I also noticed that I
> had mistakenly allocated the ExecutorPrep state in
> SQLFunctionCache.fcontext, whereas the correct context for
> execution-related state is SQLFunctionCache.subcontext.  In the updated
> patch, I've made postquel_start() reparent the prep EState's
> es_query_cxt from fcontext to subcontext. I also did not have a test
> case that exercised cached plan reuse for SQL functions, so I added
> one. I split the functions.c GetCachedPlan() + CachedPlanPrepData
> plumbing into a new patch 0005 so it can be reviewed separately, since
> it is the only non-mechanical call-site change.

I also noticed a bug in the prep cleanup logic that runs when a cached
plan becomes invalid during the prep phase. Patch 0005 fixes that and
adds a regression test that exercises the invalidation path. This will
be folded into 0004 later.

-- 
Thanks, Amit Langote


Attachments:

  [application/octet-stream] v3-0004-Use-pruning-aware-locking-in-cached-plans.patch (24.5K, 2-v3-0004-Use-pruning-aware-locking-in-cached-plans.patch)
  download | inline diff:
From dc0de03510539ddc3bd33327158785279356821f Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Tue, 11 Nov 2025 22:30:52 +0900
Subject: [PATCH v3 4/6] Use pruning-aware locking in cached plans

Extend GetCachedPlan() to perform ExecutorPrep() on each planned
statement, capturing unpruned relids and initial pruning results.
Use this data to acquire execution locks only on surviving partitions,
avoiding unnecessary locking of pruned tables even when using cached
plans.

Introduce CachedPlanPrepData to carry ExecutorPrep results
through the plan caching layer. Adjust call sites in SPI,
functions, portals, and EXPLAIN to propagate this data.

This ensures pruning decisions made during initial pruning are
consistently reused without redoing pruning logic in executor paths
like parallel workers. It also lays the groundwork for
pruning-dependent lock behavior during plan reuse.

To maintain correctness when all target partitions are pruned, also
reinstate the firstResultRel locking behavior that was added in commit
28317de72 but lost when the earlier pruning patch was reverted.  That
commit required the first ModifyTable target to remain initialized for
executor assumptions to hold. We now explicitly track these relids in
PlannerGlobal and PlannedStmt so they are locked even if pruned,
preserving that rule across cached plan reuse.
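
As a rough illustration of the locking rule described above, the set of
relations to lock can be thought of as a union of three sets.  The
sketch below is a toy model under that assumption, not the patch's
code: RelSet, toy_relations_to_lock() and toy_locks_skipped() are
hypothetical names, and a 32-bit mask stands in for the Bitmapsets of
range table indexes that the real code uses.

```c
#include <assert.h>
#include <stdint.h>

/* Toy relation set: bit i set means range-table index i (max 32 rels). */
typedef uint32_t RelSet;

/*
 * Relations that must be locked before reusing a cached plan, assuming
 * the rule is: everything unprunable, every partition that survived
 * initial pruning, and the first result relation of each ModifyTable
 * (locked even if pruning removed it).
 */
RelSet
toy_relations_to_lock(RelSet unprunable,
					  RelSet surviving_partitions,
					  RelSet first_result_rels)
{
	return unprunable | surviving_partitions | first_result_rels;
}

/* Number of locks avoided compared with locking every relation in the plan. */
int
toy_locks_skipped(RelSet all_rels, RelSet to_lock)
{
	RelSet		skipped = all_rels & ~to_lock;
	int			n = 0;

	while (skipped)
	{
		skipped &= skipped - 1;	/* clear lowest set bit */
		n++;
	}
	return n;
}
```

For example, with a parent at index 0, partitions at indexes 1 to 5,
only partition 3 surviving pruning, and partition 1 as the first result
relation, the model locks indexes 0, 1 and 3 and skips the other three.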
---
 src/backend/commands/prepare.c         |  19 +-
 src/backend/executor/functions.c       |   1 +
 src/backend/executor/nodeModifyTable.c |   4 +-
 src/backend/executor/spi.c             |  26 ++-
 src/backend/optimizer/plan/planner.c   |   1 +
 src/backend/optimizer/plan/setrefs.c   |   3 +
 src/backend/tcop/postgres.c            |   9 +-
 src/backend/utils/cache/plancache.c    | 234 ++++++++++++++++++++++++-
 src/include/nodes/pathnodes.h          |   3 +
 src/include/nodes/plannodes.h          |  10 ++
 src/include/utils/plancache.h          |  24 ++-
 11 files changed, 313 insertions(+), 21 deletions(-)

diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index afd449c73ba..23332d19b37 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -154,6 +154,7 @@ ExecuteQuery(ParseState *pstate,
 {
 	PreparedStatement *entry;
 	CachedPlan *cplan;
+	CachedPlanPrepData cprep = {0};
 	List	   *plan_list;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
@@ -193,7 +194,10 @@ ExecuteQuery(ParseState *pstate,
 									   entry->plansource->query_string);
 
 	/* Replan if needed, and increment plan refcount for portal */
-	cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL);
+	/* Keep ExecutorPrep state with the portal and its resowner. */
+	cprep.context = portal->portalContext;
+	cprep.owner = portal->resowner;
+	cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL, &cprep);
 	plan_list = cplan->stmt_list;
 
 	/*
@@ -205,7 +209,7 @@ ExecuteQuery(ParseState *pstate,
 					  query_string,
 					  entry->plansource->commandTag,
 					  plan_list,
-					  NIL,
+					  cprep.prep_list,
 					  cplan);
 
 	/*
@@ -575,6 +579,7 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	PreparedStatement *entry;
 	const char *query_string;
 	CachedPlan *cplan;
+	CachedPlanPrepData cprep = {0};
 	List	   *plan_list;
 	List	   *prep_list;
 	ListCell   *p;
@@ -633,8 +638,14 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	}
 
 	/* Replan if needed, and acquire a transient refcount */
+	/* ExecutorPrep state is local to this EXPLAIN EXECUTE call. */
+	cprep.context = CurrentMemoryContext;
+	cprep.owner = CurrentResourceOwner;
+	if (es->generic)
+		cprep.eflags = EXEC_FLAG_EXPLAIN_GENERIC;
 	cplan = GetCachedPlan(entry->plansource, paramLI,
-						  CurrentResourceOwner, pstate->p_queryEnv);
+						  CurrentResourceOwner, pstate->p_queryEnv,
+						  &cprep);
 
 	INSTR_TIME_SET_CURRENT(planduration);
 	INSTR_TIME_SUBTRACT(planduration, planstart);
@@ -653,7 +664,7 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	}
 
 	plan_list = cplan->stmt_list;
-	prep_list = NIL;
+	prep_list = cprep.prep_list;
 
 	/* Explain each query */
 	i = 0;
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index 633310c5f5b..d81718ea84e 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -698,6 +698,7 @@ init_execution_state(SQLFunctionCachePtr fcache)
 	fcache->cplan = GetCachedPlan(plansource,
 								  fcache->paramLI,
 								  fcache->cowner,
+								  NULL,
 								  NULL);
 
 	/*
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 4c5647ac38a..c5812612f8d 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -4648,8 +4648,8 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
 	 * as a reference for building the ResultRelInfo of the target partition.
 	 * In either case, it doesn't matter which result relation is kept, so we
 	 * just keep the first one, if all others have been pruned.  See also,
-	 * ExecDoInitialPruning(), which ensures that this first result relation
-	 * has been locked.
+	 * AcquireExecutorLocksUnpruned(), which ensures that this first result
+	 * relation has been locked.
 	 */
 	i = 0;
 	foreach(l, node->resultRelations)
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 7a3cb944d6f..d580f1e0425 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1579,6 +1579,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 {
 	CachedPlanSource *plansource;
 	CachedPlan *cplan;
+	CachedPlanPrepData cprep = {0};
 	List	   *stmt_list;
 	char	   *query_string;
 	Snapshot	snapshot;
@@ -1659,7 +1660,11 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 	 */
 
 	/* Replan if needed, and increment plan refcount for portal */
-	cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv);
+	/* ExecutorPrep state lives in this portal's context. */
+	cprep.context = portal->portalContext;
+	cprep.owner = portal->resowner;
+	cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv,
+						  &cprep);
 	stmt_list = cplan->stmt_list;
 
 	if (!plan->saved)
@@ -1685,7 +1690,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 					  query_string,
 					  plansource->commandTag,
 					  stmt_list,
-					  NIL,
+					  cprep.prep_list,	/* lives in portalContext */
 					  cplan);
 
 	/*
@@ -2078,6 +2083,7 @@ SPI_plan_get_cached_plan(SPIPlanPtr plan)
 {
 	CachedPlanSource *plansource;
 	CachedPlan *cplan;
+	CachedPlanPrepData cprep = {0};
 	SPICallbackArg spicallbackarg;
 	ErrorContextCallback spierrcontext;
 
@@ -2101,9 +2107,13 @@ SPI_plan_get_cached_plan(SPIPlanPtr plan)
 	error_context_stack = &spierrcontext;
 
 	/* Get the generic plan for the query */
+	/* ExecutorPrep() state lives in caller's active context. */
+	cprep.context = CurrentMemoryContext;
+	cprep.owner = CurrentResourceOwner;
 	cplan = GetCachedPlan(plansource, NULL,
 						  plan->saved ? CurrentResourceOwner : NULL,
-						  _SPI_current->queryEnv);
+						  _SPI_current->queryEnv,
+						  &cprep);
 	Assert(cplan == plansource->gplan);
 
 	/* Pop the error context stack */
@@ -2501,6 +2511,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 		CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
 		List	   *stmt_list;
 		ListCell   *lc2;
+		CachedPlanPrepData cprep = {0};
 		List	   *prep_list;
 		int			i;
 
@@ -2577,11 +2588,16 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 		 * Replan if needed, and increment plan refcount.  If it's a saved
 		 * plan, the refcount must be backed by the plan_owner.
 		 */
+
+		/* ExecutorPrep state is per _SPI_execute_plan call. */
+		cprep.context = CurrentMemoryContext;
+		cprep.owner = CurrentResourceOwner;
 		cplan = GetCachedPlan(plansource, options->params,
-							  plan_owner, _SPI_current->queryEnv);
+							  plan_owner, _SPI_current->queryEnv,
+							  &cprep);
 
 		stmt_list = cplan->stmt_list;
-		prep_list = NIL;
+		prep_list = cprep.prep_list;
 
 		/*
 		 * If we weren't given a specific snapshot to use, and the statement
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index c4fd646b999..4c76e78c1da 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -608,6 +608,7 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 											  glob->prunableRelids);
 	result->permInfos = glob->finalrteperminfos;
 	result->resultRelations = glob->resultRelations;
+	result->firstResultRels = glob->firstResultRels;
 	result->appendRelations = glob->appendRelations;
 	result->subplans = glob->subplans;
 	result->rewindPlanIDs = glob->rewindPlanIDs;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index ccdc9bc264a..229b39060ae 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -1274,6 +1274,9 @@ set_plan_refs(PlannerInfo *root, Plan *plan, int rtoffset)
 						lappend_int(root->glob->resultRelations,
 									splan->rootRelation);
 				}
+				root->glob->firstResultRels =
+					lappend_int(root->glob->firstResultRels,
+								linitial_int(splan->resultRelations));
 			}
 			break;
 		case T_Append:
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index d3964a12a14..249829f59a0 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1639,6 +1639,7 @@ exec_bind_message(StringInfo input_message)
 	int16	   *rformats = NULL;
 	CachedPlanSource *psrc;
 	CachedPlan *cplan;
+	CachedPlanPrepData cprep = {0};
 	Portal		portal;
 	char	   *query_string;
 	char	   *saved_stmt_name;
@@ -2021,7 +2022,11 @@ exec_bind_message(StringInfo input_message)
 	 * will be generated in MessageContext.  The plan refcount will be
 	 * assigned to the Portal, so it will be released at portal destruction.
 	 */
-	cplan = GetCachedPlan(psrc, params, NULL, NULL);
+
+	/* ExecutorPrep() state lives in portal context. */
+	cprep.context = portal->portalContext;
+	cprep.owner = portal->resowner;
+	cplan = GetCachedPlan(psrc, params, NULL, NULL, &cprep);
 
 	/*
 	 * Now we can define the portal.
@@ -2034,7 +2039,7 @@ exec_bind_message(StringInfo input_message)
 					  query_string,
 					  psrc->commandTag,
 					  cplan->stmt_list,
-					  NIL,
+					  cprep.prep_list,
 					  cplan);
 
 	/* Portal is defined, set the plan ID based on its contents. */
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 6661d2c6b73..c1cfd47422c 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -93,7 +93,7 @@ static bool StmtPlanRequiresRevalidation(CachedPlanSource *plansource);
 static bool BuildingPlanRequiresSnapshot(CachedPlanSource *plansource);
 static List *RevalidateCachedQuery(CachedPlanSource *plansource,
 								   QueryEnvironment *queryEnv);
-static bool CheckCachedPlan(CachedPlanSource *plansource);
+static bool PrepAndCheckCachedPlan(CachedPlanSource *plansource, CachedPlanPrepData *cprep);
 static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 								   ParamListInfo boundParams, QueryEnvironment *queryEnv);
 static bool choose_custom_plan(CachedPlanSource *plansource,
@@ -101,6 +101,8 @@ static bool choose_custom_plan(CachedPlanSource *plansource,
 static double cached_plan_cost(CachedPlan *plan, bool include_planner);
 static Query *QueryListGetPrimaryStmt(List *stmts);
 static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static void AcquireExecutorLocksUnpruned(List *stmt_list, bool acquire,
+										 CachedPlanPrepData *cprep);
 static void AcquirePlannerLocks(List *stmt_list, bool acquire);
 static void ScanQueryForLocks(Query *parsetree, bool acquire);
 static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -137,6 +139,26 @@ ResourceOwnerForgetPlanCacheRef(ResourceOwner owner, CachedPlan *plan)
 /* GUC parameter */
 int			plan_cache_mode = PLAN_CACHE_MODE_AUTO;
 
+/*
+ * Lock acquisition policy for execution locks.
+ *
+ * LOCK_ALL acquires locks on all relations mentioned in the plan,
+ * reproducing the behavior of AcquireExecutorLocks().
+ *
+ * LOCK_UNPRUNED restricts locking to only the unpruned relations. That
+ * includes those mentioned in PlannedStmt.unprunableRelids and the leaf
+ * partitions remaining after performing initial pruning.
+ */
+typedef enum LockPolicy
+{
+	LOCK_ALL,
+	LOCK_UNPRUNED,
+} LockPolicy;
+
+static void AcquireExecutorLocksWithPolicy(List *stmt_list,
+										   LockPolicy policy, bool acquire,
+										   CachedPlanPrepData *cprep);
+
 /*
  * InitPlanCache: initialize module during InitPostgres.
  *
@@ -938,7 +960,12 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
 }
 
 /*
- * CheckCachedPlan: see if the CachedPlanSource's generic plan is valid.
+ * PrepAndCheckCachedPlan: see if the CachedPlanSource's generic plan is valid.
+ *
+ * If 'cprep' is not NULL, ExecutorPrep() is applied to each PlannedStmt to
+ * compute the set of partitions that survive initial runtime pruning in order
+ * to only lock them.  The resulting ExecPrep structures are saved in cprep for
+ * later reuse by ExecutorStart().
  *
  * Caller must have already called RevalidateCachedQuery to verify that the
  * querytree is up to date.
@@ -947,7 +974,7 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
  * (We must do this for the "true" result to be race-condition-free.)
  */
 static bool
-CheckCachedPlan(CachedPlanSource *plansource)
+PrepAndCheckCachedPlan(CachedPlanSource *plansource, CachedPlanPrepData *cprep)
 {
 	CachedPlan *plan = plansource->gplan;
 
@@ -975,13 +1002,15 @@ CheckCachedPlan(CachedPlanSource *plansource)
 	 */
 	if (plan->is_valid)
 	{
+		LockPolicy policy = !cprep ? LOCK_ALL : LOCK_UNPRUNED;
+
 		/*
 		 * Plan must have positive refcount because it is referenced by
 		 * plansource; so no need to fear it disappears under us here.
 		 */
 		Assert(plan->refcount > 0);
 
-		AcquireExecutorLocks(plan->stmt_list, true);
+		AcquireExecutorLocksWithPolicy(plan->stmt_list, policy, true, cprep);
 
 		/*
 		 * If plan was transient, check to see if TransactionXmin has
@@ -1003,7 +1032,7 @@ CheckCachedPlan(CachedPlanSource *plansource)
 		}
 
 		/* Oops, the race case happened.  Release useless locks. */
-		AcquireExecutorLocks(plan->stmt_list, false);
+		AcquireExecutorLocksWithPolicy(plan->stmt_list, policy, false, cprep);
 	}
 
 	/*
@@ -1283,6 +1312,10 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
  * On return, the plan is valid and we have sufficient locks to begin
  * execution.
  *
+ * If 'cprep' is not NULL and a generic plan is reused, the function prepares
+ * each PlannedStmt via ExecutorPrep() and stores the results in
+ * cprep->prep_list.  These are intended to be passed later to ExecutorStart().
+ *
  * On return, the refcount of the plan has been incremented; a later
  * ReleaseCachedPlan() call is expected.  If "owner" is not NULL then
  * the refcount has been reported to that ResourceOwner (note that this
@@ -1293,7 +1326,8 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
  */
 CachedPlan *
 GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
-			  ResourceOwner owner, QueryEnvironment *queryEnv)
+			  ResourceOwner owner, QueryEnvironment *queryEnv,
+			  CachedPlanPrepData *cprep)
 {
 	CachedPlan *plan = NULL;
 	List	   *qlist;
@@ -1315,7 +1349,9 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 
 	if (!customplan)
 	{
-		if (CheckCachedPlan(plansource))
+		if (cprep)
+			cprep->params = boundParams;
+		if (PrepAndCheckCachedPlan(plansource, cprep))
 		{
 			/* We want a generic plan, and we already have a valid one */
 			plan = plansource->gplan;
@@ -1902,6 +1938,38 @@ QueryListGetPrimaryStmt(List *stmts)
 	return NULL;
 }
 
+/*
+ * AcquireExecutorLocksWithPolicy
+ *		Acquire or release execution locks for a cached plan according to
+ *		the specified policy.
+ *
+ * LOCK_ALL reproduces AcquireExecutorLocks(), locking every relation in
+ * each PlannedStmt's rtable.  LOCK_UNPRUNED restricts locking to the
+ * unprunable rels and partitions that survive initial runtime pruning.
+ *
+ * When LOCK_UNPRUNED is used on acquire, ExecutorPrep() is invoked for
+ * each PlannedStmt and the resulting ExecPrep pointers are appended to
+ * cprep->prep_list in cprep->context.  On release, the same ExecPrep
+ * list is consulted to determine which relations to unlock and is then
+ * cleaned up with ExecPrepCleanup().
+ */
+static void
+AcquireExecutorLocksWithPolicy(List *stmt_list, LockPolicy policy, bool acquire,
+							   CachedPlanPrepData *cprep)
+{
+	switch (policy)
+	{
+		case LOCK_ALL:
+			AcquireExecutorLocks(stmt_list, acquire);
+			break;
+		case LOCK_UNPRUNED:
+			AcquireExecutorLocksUnpruned(stmt_list, acquire, cprep);
+			break;
+		default:
+			elog(ERROR, "invalid LockPolicy");
+	}
+}
+
 /*
  * AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
  * or release them if acquire is false.
@@ -1954,6 +2022,158 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 	}
 }
 
+/*
+ * LockRelids
+ *		Acquire or release locks on the specified relids, which reference
+ *		entries in the provided range table.
+ *
+ * Helper for AcquireExecutorLocksUnpruned().
+ */
+static void
+LockRelids(List *rtable, Bitmapset *relids, bool acquire)
+{
+	int	rtindex = -1;
+
+	while ((rtindex = bms_next_member(relids, rtindex)) >= 0)
+	{
+		RangeTblEntry *rte = list_nth_node(RangeTblEntry, rtable, rtindex - 1);
+
+		Assert(rte->rtekind == RTE_RELATION ||
+			   (rte->rtekind == RTE_SUBQUERY && OidIsValid(rte->relid)));
+
+		/*
+		 * Acquire the appropriate type of lock on each relation OID. Note
+		 * that we don't actually try to open the rel, and hence will not
+		 * fail if it's been dropped entirely --- we'll just transiently
+		 * acquire a non-conflicting lock.
+		 */
+		if (acquire)
+			LockRelationOid(rte->relid, rte->rellockmode);
+		else
+			UnlockRelationOid(rte->relid, rte->rellockmode);
+	}
+}
+
+/*
+ * AcquireExecutorLocksUnpruned
+ *		Acquire or release execution locks for only unpruned relations
+ *		referenced by the given PlannedStmts.
+ *
+ * On acquire, this:
+ *	- locks unprunable rels listed in PlannedStmt.unprunableRelids
+ *	- runs ExecutorPrep() to perform initial runtime pruning
+ *	- locks the surviving partitions reported in the prep estate
+ *	- appends the ExecPrep pointer for each PlannedStmt to cprep->prep_list
+ *
+ * On release, it:
+ *	- looks up the ExecPrep object for each PlannedStmt from cprep->prep_list
+ *	  (which must already be populated)
+ *	- unlocks the same relations identified during acquire
+ *	- calls ExecPrepCleanup() on each ExecPrep
+ *
+ * prep_list is extended during acquire and must match stmt_list one-to-one
+ * when releasing locks.  Memory allocation for ExecPrep happens in
+ * cprep->context.  Locks are acquired using cprep->owner.
+ */
+
+static void
+AcquireExecutorLocksUnpruned(List *stmt_list, bool acquire,
+							 CachedPlanPrepData *cprep)
+{
+	MemoryContext oldcontext;
+	ListCell   *lc1;
+	List	   *prep_list;
+	int			i;
+
+	Assert(cprep);
+	oldcontext = MemoryContextSwitchTo(cprep->context);
+	/*
+	 * When releasing locks, use the ExecPrep list (if any) created during
+	 * acquisition to determine which relids to unlock. The list must match
+	 * the PlannedStmt list one-to-one.
+	 */
+	prep_list = cprep->prep_list;
+	Assert(acquire || list_length(prep_list) == list_length(stmt_list));
+
+	i = 0;
+	foreach(lc1, stmt_list)
+	{
+		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
+		ExecPrep *prep;
+
+		if (plannedstmt->commandType == CMD_UTILITY)
+		{
+			/* Same as AcquireExecutorLocks(). */
+			Query	   *query = UtilityContainsQuery(plannedstmt->utilityStmt);
+
+			if (query)
+				ScanQueryForLocks(query, acquire);
+
+			/* Keep the list one-to-one with stmt_list. */
+			if (acquire)
+				cprep->prep_list = lappend(cprep->prep_list, NULL);
+			continue;
+		}
+
+		/*
+		 * Lock tables mentioned in the original query and other unprunable
+		 * relations that were added to the plan via inheritance expansion.
+		 */
+		LockRelids(plannedstmt->rtable, plannedstmt->unprunableRelids, acquire);
+
+		/* Lock partitions surviving runtime initial pruning. */
+		if (acquire)
+		{
+			prep = ExecutorPrep(plannedstmt, cprep->params, cprep->owner, true,
+								cprep->eflags);
+			Assert(prep || plannedstmt->partPruneInfos == NULL);
+			cprep->prep_list = lappend(cprep->prep_list, prep);
+		}
+		else
+			prep = list_nth(prep_list, i++);
+
+		Assert(prep == NULL || prep->prep_estate);
+		if (prep)
+		{
+			EState *prep_estate = prep->prep_estate;
+
+			/*
+			 * es_unpruned_relids includes plannedstmt->unprunableRelids, which
+			 * we've already locked.  Filter them out to avoid double-locking.
+			 */
+			Bitmapset *lock_relids = bms_difference(prep_estate->es_unpruned_relids,
+													plannedstmt->unprunableRelids);
+
+			/*
+			 * firstResultRels may contain pruned partitions that must still be
+			 * locked to satisfy executor assumptions (see comments in
+			 * ExecInitModifyTable()).  Ensure they are included here.
+			 */
+			if (plannedstmt->resultRelations)
+			{
+				ListCell *lc2;
+
+				foreach(lc2, plannedstmt->firstResultRels)
+				{
+					Index       firstResultRel = lfirst_int(lc2);
+
+					if (!bms_is_member(firstResultRel, lock_relids))
+						lock_relids = bms_add_member(lock_relids, firstResultRel);
+				}
+			}
+
+			LockRelids(plannedstmt->rtable, lock_relids, acquire);
+			bms_free(lock_relids);
+		}
+
+		/* Clean up prep if releasing locks. */
+		if (!acquire)
+			ExecPrepCleanup(prep);
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+}
+
 /*
  * AcquirePlannerLocks: acquire locks needed for planning of a querytree list;
  * or release them if acquire is false.
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 30d889b54c5..6fb86dc05f6 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -141,6 +141,9 @@ typedef struct PlannerGlobal
 	/* "flat" list of integer RT indexes */
 	List	   *resultRelations;
 
+	/* "flat" list of integer RT indexes (one per ModifyTable node) */
+	List	   *firstResultRels;
+
 	/* "flat" list of AppendRelInfos */
 	List	   *appendRelations;
 
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index c4393a94321..eb211f1ba56 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -123,6 +123,16 @@ typedef struct PlannedStmt
 	/* integer list of RT indexes, or NIL */
 	List	   *resultRelations;
 
+	/*
+	 * rtable indexes of first target relation in each ModifyTable node in the
+	 * plan for INSERT/UPDATE/DELETE/MERGE.  NIL if resultRelations is NIL.
+	 *
+	 * These are used by AcquireExecutorLocksUnpruned() to ensure that the
+	 * first result rel for each ModifyTable remains locked even if pruned;
+	 * see ExecInitModifyTable() for the executor side assumptions.
+	 */
+	List	   *firstResultRels;
+
 	/* list of AppendRelInfo nodes */
 	List	   *appendRelations;
 
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index a82b66d4bc2..c7b8ec4be39 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -197,6 +197,27 @@ typedef struct CachedExpression
 } CachedExpression;
 
 
+/*
+ * CachedPlanPrepData
+ *      Carries ExecutorPrep results for each PlannedStmt in a CachedPlan,
+ *      along with context and owner information needed to allocate them.
+ *
+ * prep_list is indexed one-to-one with CachedPlan->stmt_list, and is
+ * populated when GetCachedPlan() prepares a reused generic plan.  The
+ * same list is later used to determine which relations to unlock when
+ * releasing execution locks.
+ *
+ * ExecutorPrep state is allocated in 'context' and owned by 'owner'.
+ */
+typedef struct CachedPlanPrepData
+{
+	List   *prep_list;		/* one ExecPrep per PlannedStmt, or NULL */
+	ParamListInfo params;	/* params visible to ExecutorPrep */
+	MemoryContext context;	/* where to allocate ExecPrep objects */
+	ResourceOwner owner;	/* ResourceOwner for ExecutorPrep state */
+	int		eflags;			/* executor flags to pass to ExecutorPrep */
+} CachedPlanPrepData;
+
 extern void InitPlanCache(void);
 extern void ResetPlanCache(void);
 
@@ -240,7 +261,8 @@ extern List *CachedPlanGetTargetList(CachedPlanSource *plansource,
 extern CachedPlan *GetCachedPlan(CachedPlanSource *plansource,
 								 ParamListInfo boundParams,
 								 ResourceOwner owner,
-								 QueryEnvironment *queryEnv);
+								 QueryEnvironment *queryEnv,
+								 CachedPlanPrepData *cprep);
 extern void ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner);
 
 extern bool CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
-- 
2.47.3



  [application/octet-stream] v3-0005-Add-test-exercising-prep-cleanup-on-cached-plan-i.patch (9.3K, 3-v3-0005-Add-test-exercising-prep-cleanup-on-cached-plan-i.patch)
From 052ab8fe38493ca106d749f4e2426a86d0267d59 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Thu, 20 Nov 2025 15:35:47 +0900
Subject: [PATCH v3 5/6] Add test exercising prep cleanup on cached-plan
 invalidation

Add a regression test that causes a generic plan to become invalid
while pruning-aware setup is running. The pruning expression calls a
function that can perform DDL on a partition, making the plan stale
during reuse.

The test's purpose is to drive execution through the invalidation
path that discards any ExecutorPrep state created before the plan was
found invalid, providing coverage for that cleanup logic.
---
 src/backend/utils/cache/plancache.c     | 38 +++++++++++++--
 src/test/regress/expected/plancache.out | 61 +++++++++++++++++++++++++
 src/test/regress/sql/plancache.sql      | 50 ++++++++++++++++++++
 3 files changed, 144 insertions(+), 5 deletions(-)

diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index c1cfd47422c..a9a4e11d1a5 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -103,6 +103,7 @@ static Query *QueryListGetPrimaryStmt(List *stmts);
 static void AcquireExecutorLocks(List *stmt_list, bool acquire);
 static void AcquireExecutorLocksUnpruned(List *stmt_list, bool acquire,
 										 CachedPlanPrepData *cprep);
+static void CachedPlanPrepCleanup(CachedPlanPrepData *cprep);
 static void AcquirePlannerLocks(List *stmt_list, bool acquire);
 static void ScanQueryForLocks(Query *parsetree, bool acquire);
 static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -1033,6 +1034,9 @@ PrepAndCheckCachedPlan(CachedPlanSource *plansource, CachedPlanPrepData *cprep)
 
 		/* Oops, the race case happened.  Release useless locks. */
 		AcquireExecutorLocksWithPolicy(plan->stmt_list, policy, false, cprep);
+
+		/* Also clean up ExecutorPrep() state, if necessary. */
+		CachedPlanPrepCleanup(cprep);
 	}
 
 	/*
@@ -2069,7 +2073,6 @@ LockRelids(List *rtable, Bitmapset *relids, bool acquire)
  *	- looks up the ExecPrep object for each PlannedStmt from cprep->prep_list
  *	  (which must already be populated)
  *	- unlocks the same relations identified during acquire
- *	- calls ExecPrepCleanup() on each ExecPrep
  *
  * prep_list is extended during acquire and must match stmt_list one-to-one
  * when releasing locks.  Memory allocation for ExecPrep happens in
@@ -2165,15 +2168,40 @@ AcquireExecutorLocksUnpruned(List *stmt_list, bool acquire,
 			LockRelids(plannedstmt->rtable, lock_relids, acquire);
 			bms_free(lock_relids);
 		}
-
-		/* Clean up prep if releasing locks. */
-		if (!acquire)
-			ExecPrepCleanup(prep);
 	}
 
 	MemoryContextSwitchTo(oldcontext);
 }
 
+/*
+ * CachedPlanPrepCleanup
+ *		Clean up ExecPrep state built for a generic plan.
+ *
+ * This is used in the corner case where PrepAndCheckCachedPlan() discovers
+ * that a CachedPlan has become invalid after AcquireExecutorLocksUnpruned()
+ * has already run.  In that case we must both release the execution locks
+ * and dispose of the ExecPrep list stored in CachedPlanPrepData, since the
+ * executor will never see or clean it up.
+ */
+static void
+CachedPlanPrepCleanup(CachedPlanPrepData *cprep)
+{
+	ListCell   *lc;
+
+	if (cprep == NULL)
+		return;
+
+	foreach(lc, cprep->prep_list)
+	{
+		ExecPrep *prep = (ExecPrep *) lfirst(lc);
+
+		ExecPrepCleanup(prep);
+	}
+
+	list_free(cprep->prep_list);
+	cprep->prep_list = NIL;
+}
+
 /*
  * AcquirePlannerLocks: acquire locks needed for planning of a querytree list;
  * or release them if acquire is false.
diff --git a/src/test/regress/expected/plancache.out b/src/test/regress/expected/plancache.out
index 4e59188196c..26c4c5e10fd 100644
--- a/src/test/regress/expected/plancache.out
+++ b/src/test/regress/expected/plancache.out
@@ -398,3 +398,64 @@ select name, generic_plans, custom_plans from pg_prepared_statements
 (1 row)
 
 drop table test_mode;
+-- Test invalidation of a generic plan during pruning-aware lock setup.
+-- The pruning expression uses a stable SQL function that calls a volatile
+-- plpgsql function.  That function performs DDL on a partition when a
+-- separate "signal" table says to do so.  The second EXECUTE should
+-- replan cleanly after the DDL.
+set plan_cache_mode to force_generic_plan;
+create table inval_during_pruning_p (a int) partition by list (a);
+create table inval_during_pruning_p1 partition of inval_during_pruning_p for values in (1);
+create table inval_during_pruning_p2 partition of inval_during_pruning_p for values in (2);
+insert into inval_during_pruning_p values (1), (2);
+create table inval_during_pruning_signal (create_idx bool not null);
+insert into inval_during_pruning_signal values (false);
+create or replace function invalidate_plancache_func() returns int
+as $$
+declare
+	create_index bool;
+begin
+	-- Perform DDL on a partition if asked to
+    select create_idx into create_index from inval_during_pruning_signal for update;
+    if create_index = true then
+		raise notice 'creating index on partition inval_during_pruning_p1';
+        create index on inval_during_pruning_p1 (a);
+		update inval_during_pruning_signal set create_idx = false;
+    end if;
+	-- pruning parameter
+    return 1;
+end;
+$$ language plpgsql volatile;
+create or replace function stable_pruning_val() returns int as $$
+	select invalidate_plancache_func();
+$$ language sql stable;
+prepare inval_during_pruning_q as select * from inval_during_pruning_p where a = stable_pruning_val();
+-- Build a generic plan and run pruning once, but don't set the signal
+-- for invalidate_plancache_func() to perform the DDL.
+explain (verbose, costs off) execute inval_during_pruning_q;
+                                QUERY PLAN                                 
+---------------------------------------------------------------------------
+ Append
+   Subplans Removed: 1
+   ->  Seq Scan on public.inval_during_pruning_p1 inval_during_pruning_p_1
+         Output: inval_during_pruning_p_1.a
+         Filter: (inval_during_pruning_p_1.a = stable_pruning_val())
+(5 rows)
+
+-- Reuse the generic plan.  Make invalidate_plancache_func() perform DDL
+-- during this execution, which should force replanning without errors.
+update inval_during_pruning_signal set create_idx = true;
+explain (verbose, costs off) execute inval_during_pruning_q;
+NOTICE:  creating index on partition inval_during_pruning_p1
+                                QUERY PLAN                                 
+---------------------------------------------------------------------------
+ Append
+   Subplans Removed: 1
+   ->  Seq Scan on public.inval_during_pruning_p1 inval_during_pruning_p_1
+         Output: inval_during_pruning_p_1.a
+         Filter: (inval_during_pruning_p_1.a = stable_pruning_val())
+(5 rows)
+
+drop table inval_during_pruning_p, inval_during_pruning_signal;
+drop function invalidate_plancache_func, stable_pruning_val;
+reset plan_cache_mode;
diff --git a/src/test/regress/sql/plancache.sql b/src/test/regress/sql/plancache.sql
index 4b2f11dcc64..cc7eb4da4d3 100644
--- a/src/test/regress/sql/plancache.sql
+++ b/src/test/regress/sql/plancache.sql
@@ -223,3 +223,53 @@ select name, generic_plans, custom_plans from pg_prepared_statements
   where  name = 'test_mode_pp';
 
 drop table test_mode;
+
+-- Test invalidation of a generic plan during pruning-aware lock setup.
+-- The pruning expression uses a stable SQL function that calls a volatile
+-- plpgsql function.  That function performs DDL on a partition when a
+-- separate "signal" table says to do so.  The second EXECUTE should
+-- replan cleanly after the DDL.
+set plan_cache_mode to force_generic_plan;
+create table inval_during_pruning_p (a int) partition by list (a);
+create table inval_during_pruning_p1 partition of inval_during_pruning_p for values in (1);
+create table inval_during_pruning_p2 partition of inval_during_pruning_p for values in (2);
+insert into inval_during_pruning_p values (1), (2);
+
+create table inval_during_pruning_signal (create_idx bool not null);
+insert into inval_during_pruning_signal values (false);
+create or replace function invalidate_plancache_func() returns int
+as $$
+declare
+	create_index bool;
+begin
+	-- Perform DDL on a partition if asked to
+    select create_idx into create_index from inval_during_pruning_signal for update;
+    if create_index = true then
+		raise notice 'creating index on partition inval_during_pruning_p1';
+        create index on inval_during_pruning_p1 (a);
+		update inval_during_pruning_signal set create_idx = false;
+    end if;
+	-- pruning parameter
+    return 1;
+end;
+$$ language plpgsql volatile;
+
+create or replace function stable_pruning_val() returns int as $$
+	select invalidate_plancache_func();
+$$ language sql stable;
+
+prepare inval_during_pruning_q as select * from inval_during_pruning_p where a = stable_pruning_val();
+
+-- Build a generic plan and run pruning once, but don't set the signal
+-- for invalidate_plancache_func() to perform the DDL.
+explain (verbose, costs off) execute inval_during_pruning_q;
+
+-- Reuse the generic plan.  Make invalidate_plancache_func() perform DDL
+-- during this execution, which should force replanning without errors.
+update inval_during_pruning_signal set create_idx = true;
+explain (verbose, costs off) execute inval_during_pruning_q;
+
+drop table inval_during_pruning_p, inval_during_pruning_signal;
+drop function invalidate_plancache_func, stable_pruning_val;
+
+reset plan_cache_mode;
-- 
2.47.3



  [application/octet-stream] v3-0002-Introduce-ExecutorPrep-and-refactor-executor-star.patch (28.7K, 4-v3-0002-Introduce-ExecutorPrep-and-refactor-executor-star.patch)
  download | inline diff:
From 11e0262e31e35539f50e96531559db6cd7e32160 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Tue, 11 Nov 2025 21:47:46 +0900
Subject: [PATCH v3 2/6] Introduce ExecutorPrep and refactor executor startup

Factor permission checks, range table initialization, and initial
partition pruning out of InitPlan() into a new ExecutorPrep()
helper.  ExecutorPrep() builds an EState containing the executor
metadata needed before plan execution, including partition
pruning state where partPruneInfos are present, and returns it
wrapped in an ExecPrep object.

ExecutorStart() now takes its EState from the ExecPrep object in
QueryDesc->prep.  If the caller supplied no prep, it invokes
ExecutorPrep() itself and adopts the resulting EState for the
duration of the query.  This keeps executor startup behaviour
unchanged while making the setup work callable separately when
needed.

CreateQueryDesc() grows a prep argument and stores it in the
QueryDesc.  Portals, SPI, SQL functions, and EXPLAIN are wired
to carry an optional ExecPrep pointer alongside the PlannedStmt
list, but most callers still pass NULL and let ExecutorStart()
perform the setup lazily.

Add the ExecPrep struct and ExecPrepCleanup() to encapsulate
ownership of the prepared EState and any caller-specific
cleanup hook.  Update executor/README and related comments to
document the new control flow and the separation between
preparation and execution.
---
 src/backend/commands/copyto.c        |   2 +-
 src/backend/commands/createas.c      |   2 +-
 src/backend/commands/explain.c       |   7 +-
 src/backend/commands/extension.c     |   1 +
 src/backend/commands/matview.c       |   2 +-
 src/backend/commands/portalcmds.c    |   1 +
 src/backend/commands/prepare.c       |  11 +-
 src/backend/executor/README          |   8 +-
 src/backend/executor/execMain.c      | 179 +++++++++++++++++++++++----
 src/backend/executor/execParallel.c  |   1 +
 src/backend/executor/execPartition.c |   3 +
 src/backend/executor/functions.c     |   1 +
 src/backend/executor/spi.c           |  10 ++
 src/backend/tcop/postgres.c          |   2 +
 src/backend/tcop/pquery.c            |  27 +++-
 src/backend/utils/mmgr/portalmem.c   |   2 +
 src/include/commands/explain.h       |   3 +-
 src/include/executor/execdesc.h      |   3 +-
 src/include/executor/executor.h      |  11 ++
 src/include/nodes/execnodes.h        |  48 +++++++
 src/include/utils/portal.h           |   2 +
 21 files changed, 286 insertions(+), 40 deletions(-)

diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index cef452584e5..5efbb0949c2 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -870,7 +870,7 @@ BeginCopyTo(ParseState *pstate,
 		((DR_copy *) dest)->cstate = cstate;
 
 		/* Create a QueryDesc requesting no output */
-		cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+		cstate->queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
 											dest, NULL, NULL, 0);
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 1ccc2e55c64..9eabe4920cd 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -334,7 +334,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
 		UpdateActiveSnapshotCommandId();
 
 		/* Create a QueryDesc, redirecting output to our tuple receiver */
-		queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+		queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
 									GetActiveSnapshot(), InvalidSnapshot,
 									dest, params, queryEnv, 0);
 
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 7e699f8595e..d6ab3697dd9 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -370,7 +370,7 @@ standard_ExplainOneQuery(Query *query, int cursorOptions,
 	}
 
 	/* run it (if needed) and produce output */
-	ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+	ExplainOnePlan(plan, NULL, into, es, queryString, params, queryEnv,
 				   &planduration, (es->buffers ? &bufusage : NULL),
 				   es->memory ? &mem_counters : NULL);
 }
@@ -492,7 +492,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
  * to call it.
  */
 void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, ExecPrep *prep,
+			   IntoClause *into, ExplainState *es,
 			   const char *queryString, ParamListInfo params,
 			   QueryEnvironment *queryEnv, const instr_time *planduration,
 			   const BufferUsage *bufusage,
@@ -548,7 +549,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
 		dest = None_Receiver;
 
 	/* Create a QueryDesc for the query */
-	queryDesc = CreateQueryDesc(plannedstmt, queryString,
+	queryDesc = CreateQueryDesc(plannedstmt, prep, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, params, queryEnv, instrument_option);
 
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index 93ef1ad106f..3cca6d45ec1 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -993,6 +993,7 @@ execute_sql_string(const char *sql, const char *filename)
 				QueryDesc  *qdesc;
 
 				qdesc = CreateQueryDesc(stmt,
+										NULL,
 										sql,
 										GetActiveSnapshot(), NULL,
 										dest, NULL, NULL, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index ef7c0d624f1..30cbf9f264f 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -437,7 +437,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
 	UpdateActiveSnapshotCommandId();
 
 	/* Create a QueryDesc, redirecting output to our tuple receiver */
-	queryDesc = CreateQueryDesc(plan, queryString,
+	queryDesc = CreateQueryDesc(plan, NULL, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, NULL, NULL, 0);
 
diff --git a/src/backend/commands/portalcmds.c b/src/backend/commands/portalcmds.c
index ec96c2efcd3..ac1ddd25aba 100644
--- a/src/backend/commands/portalcmds.c
+++ b/src/backend/commands/portalcmds.c
@@ -118,6 +118,7 @@ PerformCursorOpen(ParseState *pstate, DeclareCursorStmt *cstmt, ParamListInfo pa
 					  queryString,
 					  CMDTAG_SELECT,	/* cursor's query is always a SELECT */
 					  list_make1(plan),
+					  list_make1(NULL),
 					  NULL);
 
 	/*----------
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 34b6410d6a2..afd449c73ba 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -205,6 +205,7 @@ ExecuteQuery(ParseState *pstate,
 					  query_string,
 					  entry->plansource->commandTag,
 					  plan_list,
+					  NIL,
 					  cplan);
 
 	/*
@@ -575,6 +576,7 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	const char *query_string;
 	CachedPlan *cplan;
 	List	   *plan_list;
+	List	   *prep_list;
 	ListCell   *p;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
@@ -585,6 +587,7 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	MemoryContextCounters mem_counters;
 	MemoryContext planner_ctx = NULL;
 	MemoryContext saved_ctx = NULL;
+	int			i;
 
 	if (es->memory)
 	{
@@ -650,14 +653,20 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	}
 
 	plan_list = cplan->stmt_list;
+	prep_list = NIL;
 
 	/* Explain each query */
+	i = 0;
 	foreach(p, plan_list)
 	{
 		PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
+		ExecPrep *prep = prep_list ?
+			(ExecPrep *) list_nth(prep_list, i) : NULL;
 
+		i++;
 		if (pstmt->commandType != CMD_UTILITY)
-			ExplainOnePlan(pstmt, into, es, query_string, paramLI, pstate->p_queryEnv,
+			ExplainOnePlan(pstmt, prep,
+						   into, es, query_string, paramLI, pstate->p_queryEnv,
 						   &planduration, (es->buffers ? &bufusage : NULL),
 						   es->memory ? &mem_counters : NULL);
 		else
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 54f4782f31b..95b5ec58c55 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -291,10 +291,16 @@ Query Processing Control Flow
 
 This is a sketch of control flow for full query processing:
 
+	ExecutorPrep
+		May be run before ExecutorStart (e.g., for plan validation), or
+		implicitly from ExecutorStart if not done earlier.  Performs range
+		table initialization, permission checks, and initial partition pruning.
+		Returns an ExecPrep wrapper with EState that ExecutorStart may reuse.
+
 	CreateQueryDesc
 
 	ExecutorStart
-		CreateExecutorState
+		CreateExecutorState (or reuse one from ExecPrep if present)
 			creates per-query context
 		switch to per-query context to run ExecInitNode
 		AfterTriggerBeginQuery
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 27c9eec697b..39de0b93a1c 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -171,8 +171,26 @@ standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
 
 	/*
 	 * Build EState, switch into per-query memory context for startup.
+	 *
+	 * If ExecutorPrep() ran earlier (e.g., to do initial pruning during plan
+	 * validity checking), reuse its EState to avoid redoing range table setup
+	 * and pruning. Otherwise, run ExecutorPrep() here to build the EState.
 	 */
-	estate = CreateExecutorState();
+	if (queryDesc->prep == NULL)
+		queryDesc->prep = ExecutorPrep(queryDesc->plannedstmt,
+									   queryDesc->params,
+									   CurrentResourceOwner,
+									   true,
+									   eflags);
+	Assert(queryDesc->prep);
+	estate = queryDesc->prep->prep_estate;
+
+	/*
+	 * Executor is adopting the prep's EState. Mark it so ExecPrepCleanup()
+	 * doesn't try to free it redundantly.
+	 */
+	queryDesc->prep->owns_estate = false;
+
 	queryDesc->estate = estate;
 
 	oldcontext = MemoryContextSwitchTo(estate->es_query_cxt);
@@ -263,6 +281,136 @@ standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
 	MemoryContextSwitchTo(oldcontext);
 }
 
+/*
+ * ExecutorPrep: prepare executor state for a PlannedStmt outside ExecutorStart.
+ *
+ * Performs range table initialization, permission checks, and initial
+ * partition pruning if partPruneInfos are present and do_initial_pruning is
+ * true.
+ *
+ * This is intended for callers that need executor metadata ahead of actual
+ * execution. Typical use cases include:
+ *	- determining which relations must be locked during plan cache validation;
+ *	- initializing unpruned relids and valid subplans in parallel workers
+ *	  using state copied from the leader.
+ *
+ * The executor can reuse the resulting state to avoid redundant setup during
+ * ExecutorStart().
+ *
+ * Returns an ExecPrep wrapper that owns the EState and can be reused
+ * or cleaned up later.
+ */
+ExecPrep *
+ExecutorPrep(PlannedStmt *pstmt, ParamListInfo params, ResourceOwner owner,
+			 bool do_initial_pruning, int eflags)
+{
+	ResourceOwner oldowner;
+	EState *estate;
+	bool	snapshot_set;
+
+	if (pstmt->commandType == CMD_UTILITY)
+		return NULL;
+
+	/* Pruning may use expressions that require an active snapshot. */
+	snapshot_set = false;
+	if (!ActiveSnapshotSet())
+	{
+		PushActiveSnapshot(GetTransactionSnapshot());
+		snapshot_set = true;
+	}
+	Assert(ActiveSnapshotSet());
+
+	estate = CreateExecutorState();
+	estate->es_plannedstmt = pstmt;
+	estate->es_part_prune_infos = pstmt->partPruneInfos;
+	estate->es_param_list_info = params;
+	estate->es_top_eflags = eflags;
+
+	/*
+	 * Do permissions checks.
+	 */
+	ExecCheckPermissions(pstmt->rtable, pstmt->permInfos, true);
+
+	/*
+	 * Initialize range table.
+	 */
+	ExecInitRangeTable(estate, pstmt->rtable, pstmt->permInfos,
+					   bms_copy(pstmt->unprunableRelids));
+
+	/*
+	 * Ensure locks taken during initial pruning are tracked under the given
+	 * ResourceOwner (e.g., one associated with CachedPlan validation).
+	 */
+	oldowner = CurrentResourceOwner;
+	CurrentResourceOwner = owner;
+
+	/*
+	 * Set up PartitionPruneState structures needed for both initial and
+	 * runtime partition pruning. These structures are built from the
+	 * PartitionPruneInfo entries in the plan tree.
+	 *
+	 * If do_initial_pruning is true, also perform initial pruning to compute
+	 * the subset of child subplans that will be executed. The results,
+	 * which are bitmapsets of selected child indexes, are saved in
+	 * es_part_prune_results. This list is parallel to es_part_prune_infos.
+	 *
+	 * In parallel workers, do_initial_pruning should be false -- they receive
+	 * es_part_prune_results from the leader process and should only initialize
+	 * the PartitionPruneStates.
+	 */
+	ExecCreatePartitionPruneStates(estate);
+	if (do_initial_pruning)
+		ExecDoInitialPruning(estate);
+
+	CurrentResourceOwner = oldowner;
+
+	/* Release snapshot if we got one */
+	if (snapshot_set)
+		PopActiveSnapshot();
+
+	return CreateExecPrep(estate, CurrentMemoryContext, NULL, NULL);
+}
+
+/*
+ * CreateExecPrep: initialize ExecPrep wrapper with optional cleanup metadata.
+ */
+ExecPrep *
+CreateExecPrep(EState *estate, MemoryContext context,
+			   execprep_cleanup_fn cleanup, void *cleanup_arg)
+{
+	ExecPrep *prep = palloc0(sizeof(ExecPrep));
+
+	prep->prep_estate = estate;
+	prep->context = context;
+	prep->cleanup = cleanup;
+	prep->cleanup_arg = cleanup_arg;
+	prep->owns_estate = true;
+
+	return prep;
+}
+
+/*
+ * ExecPrepCleanup: free ExecPrep resources not adopted by the executor.
+ *
+ * Only frees the EState if it wasn't taken over by ExecutorStart().
+ * Always runs the caller-supplied cleanup callback, if one was set.
+ */
+void
+ExecPrepCleanup(ExecPrep *prep)
+{
+	if (prep == NULL)
+		return;
+
+	if (prep->prep_estate && prep->owns_estate)
+	{
+		ExecCloseRangeTableRelations(prep->prep_estate);
+		FreeExecutorState(prep->prep_estate);
+	}
+
+	if (prep->cleanup)
+		prep->cleanup(prep->cleanup_arg);
+}
+
 /* ----------------------------------------------------------------
  *		ExecutorRun
  *
@@ -824,7 +972,6 @@ ExecCheckXactReadOnly(PlannedStmt *plannedstmt)
 		PreventCommandIfParallelMode(CreateCommandName((Node *) plannedstmt));
 }
 
-
 /* ----------------------------------------------------------------
  *		InitPlan
  *
@@ -838,37 +985,15 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 	CmdType		operation = queryDesc->operation;
 	PlannedStmt *plannedstmt = queryDesc->plannedstmt;
 	Plan	   *plan = plannedstmt->planTree;
-	List	   *rangeTable = plannedstmt->rtable;
 	EState	   *estate = queryDesc->estate;
 	PlanState  *planstate;
 	TupleDesc	tupType;
 	ListCell   *l;
 	int			i;
 
-	/*
-	 * Do permissions checks
-	 */
-	ExecCheckPermissions(rangeTable, plannedstmt->permInfos, true);
-
-	/*
-	 * initialize the node's execution state
-	 */
-	ExecInitRangeTable(estate, rangeTable, plannedstmt->permInfos,
-					   bms_copy(plannedstmt->unprunableRelids));
-
-	estate->es_plannedstmt = plannedstmt;
-	estate->es_part_prune_infos = plannedstmt->partPruneInfos;
-
-	/*
-	 * Perform runtime "initial" pruning to identify which child subplans,
-	 * corresponding to the children of plan nodes that contain
-	 * PartitionPruneInfo such as Append, will not be executed. The results,
-	 * which are bitmapsets of indexes of the child subplans that will be
-	 * executed, are saved in es_part_prune_results.  These results correspond
-	 * to each PartitionPruneInfo entry, and the es_part_prune_results list is
-	 * parallel to es_part_prune_infos.
-	 */
-	ExecDoInitialPruning(estate);
+	/* ExecutorPrep() must have been done. */
+	Assert(queryDesc->prep);
+	Assert(estate == queryDesc->prep->prep_estate);
 
 	/*
 	 * Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index f098a5557cf..aedbd9566d6 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -1281,6 +1281,7 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 
 	/* Create a QueryDesc for the query. */
 	return CreateQueryDesc(pstmt,
+						   NULL,
 						   queryString,
 						   GetActiveSnapshot(), InvalidSnapshot,
 						   receiver, paramLI, NULL, instrument_options);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 88b150c8d77..187a480e508 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -2368,6 +2368,9 @@ InitExecPartitionPruneContexts(PartitionPruneState *prunestate,
 	Assert(parent_plan != NULL);
 	estate = parent_plan->state;
 
+	/* es_param_exec_vals wouldn't be available at ExecutorPrep() time. */
+	prunestate->econtext->ecxt_param_exec_vals = estate->es_param_exec_vals;
+
 	/*
 	 * No need to fix subplans maps if initial pruning didn't eliminate any
 	 * subplans.
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index 630d708d2a3..633310c5f5b 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -1362,6 +1362,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
 		dest = None_Receiver;
 
 	es->qd = CreateQueryDesc(es->stmt,
+							 NULL,
 							 fcache->func->src,
 							 GetActiveSnapshot(),
 							 InvalidSnapshot,
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 653500b38dc..7a3cb944d6f 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1685,6 +1685,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 					  query_string,
 					  plansource->commandTag,
 					  stmt_list,
+					  NIL,
 					  cplan);
 
 	/*
@@ -2500,6 +2501,8 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 		CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
 		List	   *stmt_list;
 		ListCell   *lc2;
+		List	   *prep_list;
+		int			i;
 
 		spicallbackarg.query = plansource->query_string;
 
@@ -2578,6 +2581,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 							  plan_owner, _SPI_current->queryEnv);
 
 		stmt_list = cplan->stmt_list;
+		prep_list = NIL;
 
 		/*
 		 * If we weren't given a specific snapshot to use, and the statement
@@ -2615,12 +2619,17 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 			}
 		}
 
+		i = 0;
 		foreach(lc2, stmt_list)
 		{
 			PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+			ExecPrep *prep = prep_list ?
+				list_nth(prep_list, i) : NULL;
 			bool		canSetTag = stmt->canSetTag;
 			DestReceiver *dest;
 
+			i++;
+
 			/*
 			 * Reset output state.  (Note that if a non-SPI receiver is used,
 			 * _SPI_current->processed will stay zero, and that's what we'll
@@ -2690,6 +2699,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 					snap = InvalidSnapshot;
 
 				qdesc = CreateQueryDesc(stmt,
+										prep,
 										plansource->query_string,
 										snap, crosscheck_snapshot,
 										dest,
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 2bd89102686..d3964a12a14 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1232,6 +1232,7 @@ exec_simple_query(const char *query_string)
 						  query_string,
 						  commandTag,
 						  plantree_list,
+						  NIL,
 						  NULL);
 
 		/*
@@ -2033,6 +2034,7 @@ exec_bind_message(StringInfo input_message)
 					  query_string,
 					  psrc->commandTag,
 					  cplan->stmt_list,
+					  NIL,
 					  cplan);
 
 	/* Portal is defined, set the plan ID based on its contents. */
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index fde78c55160..82c295502b0 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -37,6 +37,7 @@ Portal		ActivePortal = NULL;
 
 
 static void ProcessQuery(PlannedStmt *plan,
+						 ExecPrep *prep,
 						 const char *sourceText,
 						 ParamListInfo params,
 						 QueryEnvironment *queryEnv,
@@ -66,6 +67,7 @@ static void DoPortalRewind(Portal portal);
  */
 QueryDesc *
 CreateQueryDesc(PlannedStmt *plannedstmt,
+				ExecPrep *prep,
 				const char *sourceText,
 				Snapshot snapshot,
 				Snapshot crosscheck_snapshot,
@@ -78,6 +80,7 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
 
 	qd->operation = plannedstmt->commandType;	/* operation */
 	qd->plannedstmt = plannedstmt;	/* plan */
+	qd->prep = prep;		/* executor prep output */
 	qd->sourceText = sourceText;	/* query text */
 	qd->snapshot = RegisterSnapshot(snapshot);	/* snapshot */
 	/* RI check snapshot */
@@ -112,6 +115,13 @@ FreeQueryDesc(QueryDesc *qdesc)
 	UnregisterSnapshot(qdesc->snapshot);
 	UnregisterSnapshot(qdesc->crosscheck_snapshot);
 
+	/* ExecPrep cleanup if necessary */
+	if (qdesc->prep)
+	{
+		ExecPrepCleanup(qdesc->prep);
+		qdesc->prep = NULL;
+	}
+
 	/* Only the QueryDesc itself need be freed */
 	pfree(qdesc);
 }
@@ -123,6 +133,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  *		PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
  *
  *	plan: the plan tree for the query
+ *	prep: ExecPrep for the plan (output of ExecutorPrep())
  *	sourceText: the source text of the query
  *	params: any parameters needed
  *	dest: where to send results
@@ -135,6 +146,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  */
 static void
 ProcessQuery(PlannedStmt *plan,
+			 ExecPrep *prep,
 			 const char *sourceText,
 			 ParamListInfo params,
 			 QueryEnvironment *queryEnv,
@@ -146,7 +158,7 @@ ProcessQuery(PlannedStmt *plan,
 	/*
 	 * Create the QueryDesc object
 	 */
-	queryDesc = CreateQueryDesc(plan, sourceText,
+	queryDesc = CreateQueryDesc(plan, prep, sourceText,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, params, queryEnv, 0);
 
@@ -489,6 +501,9 @@ PortalStart(Portal portal, ParamListInfo params,
 				 * the destination to DestNone.
 				 */
 				queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+											portal->preps ?
+											(ExecPrep *) linitial(portal->preps) :
+											NULL,
 											portal->sourceText,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
@@ -1185,6 +1200,7 @@ PortalRunMulti(Portal portal,
 {
 	bool		active_snapshot_set = false;
 	ListCell   *stmtlist_item;
+	int			i;
 
 	/*
 	 * If the destination is DestRemoteExecute, change to DestNone.  The
@@ -1205,9 +1221,14 @@ PortalRunMulti(Portal portal,
 	 * Loop to handle the individual queries generated from a single parsetree
 	 * by analysis and rewrite.
 	 */
+	i = 0;
 	foreach(stmtlist_item, portal->stmts)
 	{
 		PlannedStmt *pstmt = lfirst_node(PlannedStmt, stmtlist_item);
+		ExecPrep *prep = portal->preps ?
+			list_nth(portal->preps, i) : NULL;
+
+		i++;
 
 		/*
 		 * If we got a cancel signal in prior command, quit
@@ -1265,7 +1286,7 @@ PortalRunMulti(Portal portal,
 			if (pstmt->canSetTag)
 			{
 				/* statement can set tag string */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, prep,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
@@ -1274,7 +1295,7 @@ PortalRunMulti(Portal portal,
 			else
 			{
 				/* stmt added by rewrite cannot set tag */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, prep,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index 943da087c9f..313f8ef2fdc 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -284,6 +284,7 @@ PortalDefineQuery(Portal portal,
 				  const char *sourceText,
 				  CommandTag commandTag,
 				  List *stmts,
+				  List *preps,
 				  CachedPlan *cplan)
 {
 	Assert(PortalIsValid(portal));
@@ -298,6 +299,7 @@ PortalDefineQuery(Portal portal,
 	portal->qc.nprocessed = 0;
 	portal->commandTag = commandTag;
 	portal->stmts = stmts;
+	portal->preps = preps;
 	portal->cplan = cplan;
 	portal->status = PORTAL_DEFINED;
 }
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 6e51d50efc7..6aa8b275aa2 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -63,7 +63,8 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
 							  ExplainState *es, ParseState *pstate,
 							  ParamListInfo params);
 
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt, ExecPrep *prep,
+						   IntoClause *into,
 						   ExplainState *es, const char *queryString,
 						   ParamListInfo params, QueryEnvironment *queryEnv,
 						   const instr_time *planduration,
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index 86db3dc8d0d..c18530f5d11 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -18,7 +18,6 @@
 #include "nodes/execnodes.h"
 #include "tcop/dest.h"
 
-
 /* ----------------
  *		query descriptor:
  *
@@ -35,6 +34,7 @@ typedef struct QueryDesc
 	/* These fields are provided by CreateQueryDesc */
 	CmdType		operation;		/* CMD_SELECT, CMD_UPDATE, etc. */
 	PlannedStmt *plannedstmt;	/* planner's output (could be utility, too) */
+	ExecPrep *prep;				/* output of ExecutorPrep() or NULL */
 	const char *sourceText;		/* source text of the query */
 	Snapshot	snapshot;		/* snapshot to use for query */
 	Snapshot	crosscheck_snapshot;	/* crosscheck for RI update/delete */
@@ -57,6 +57,7 @@ typedef struct QueryDesc
 
 /* in pquery.c */
 extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
+								  ExecPrep *prep,
 								  const char *sourceText,
 								  Snapshot snapshot,
 								  Snapshot crosscheck_snapshot,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index fa2b657fb2f..3579926d4e8 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -20,6 +20,7 @@
 #include "nodes/lockoptions.h"
 #include "nodes/parsenodes.h"
 #include "utils/memutils.h"
+#include "utils/resowner.h"
 
 
 /*
@@ -234,6 +235,16 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
  */
 extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
 extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
+
+extern ExecPrep *ExecutorPrep(PlannedStmt *pstmt,
+							  ParamListInfo params,
+							  ResourceOwner owner,
+							  bool do_initial_pruning,
+							  int eflags);
+extern ExecPrep *CreateExecPrep(EState *estate, MemoryContext context,
+								execprep_cleanup_fn cleanup, void *cleanup_arg);
+extern void ExecPrepCleanup(ExecPrep *prep);
+
 extern void ExecutorRun(QueryDesc *queryDesc,
 						ScanDirection direction, uint64 count);
 extern void standard_ExecutorRun(QueryDesc *queryDesc,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 18ae8f0d4bb..8bdecd631bf 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -772,6 +772,54 @@ typedef struct EState
 	List	   *es_insert_pending_modifytables;
 } EState;
 
+/*
+ * ExecPrep: encapsulates executor preparation results for a PlannedStmt.
+ *
+ * ExecutorPrep() factors out executor setup steps such as initializing the
+ * range table, checking permissions, and executing initial partition pruning.
+ * ExecutorStart() can reuse the prepared EState instead of repeating that
+ * work, and other callers (such as plan cache validation) can use it without
+ * running the full plan.
+ */
+
+/*
+ * Optional callback to clean up caller-specific resources associated with
+ * ExecPrep.  It is invoked with the ExecPrep's cleanup_arg.
+ */
+typedef void (*execprep_cleanup_fn)(void *arg);
+
+typedef struct ExecPrep
+{
+	/*
+	 * Context in which this struct and all subsidiary allocations were made.
+	 * This context must remain alive until ExecPrepCleanup is called.
+	 */
+	MemoryContext context;
+
+	/*
+	 * Partially-initialized executor state used for permission checks and
+	 * pruning. May be adopted directly by ExecutorStart(), in which case
+	 * ExecPrepCleanup will skip freeing it.
+	 */
+	EState	   *prep_estate;
+
+	/*
+	 * True if ExecPrepCleanup() must free the EState.  If the executor adopts
+	 * prep_estate, this is set to false to avoid double-free.
+	 */
+	bool		owns_estate;
+
+	/*
+	 * Optional caller-supplied cleanup hook to run during ExecPrepCleanup.
+	 * Useful for releasing external resources associated with the prep.
+	 */
+	execprep_cleanup_fn cleanup;
+
+	/*
+	 * Opaque pointer to pass to the cleanup hook.
+	 */
+	void	   *cleanup_arg;
+} ExecPrep;
 
 /*
  * ExecRowMark -
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index 5ffa6fd5cc8..013bcc3bd8e 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -137,6 +137,7 @@ typedef struct PortalData
 	CommandTag	commandTag;		/* command tag for original query */
 	QueryCompletion qc;			/* command completion data for executed query */
 	List	   *stmts;			/* list of PlannedStmts */
+	List	   *preps;			/* list of ExecPreps where needed */
 	CachedPlan *cplan;			/* CachedPlan, if stmts are from one */
 
 	ParamListInfo portalParams; /* params to pass to query */
@@ -240,6 +241,7 @@ extern void PortalDefineQuery(Portal portal,
 							  const char *sourceText,
 							  CommandTag commandTag,
 							  List *stmts,
+							  List *preps,
 							  CachedPlan *cplan);
 extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
 extern void PortalCreateHoldStore(Portal portal);
-- 
2.47.3



  [application/octet-stream] v3-0006-Make-SQL-function-executor-track-ExecutorPrep-sta.patch (6.7K, 5-v3-0006-Make-SQL-function-executor-track-ExecutorPrep-sta.patch)
  download | inline diff:
From 733e3c712ec59b75da031694155c98476f290f37 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Mon, 17 Nov 2025 17:40:26 +0900
Subject: [PATCH v3 6/6] Make SQL function executor track ExecutorPrep state

Extend the SQL function executor to use the ExecutorPrep results
returned by GetCachedPlan().  init_execution_state() now passes a
CachedPlanPrepData to GetCachedPlan() and stores the per-statement
ExecPrep pointers in the execution_state nodes.

At execution time, postquel_start() reparents the prep estate's
es_query_cxt under the function's subcontext so that prep state
follows the usual per-call context hierarchy.

This allows SQL language functions to participate in the same
ExecutorPrep machinery as other plan cache users, which a later
patch will use to support pruning-aware locking.

Add a regression test where rule rewrite expands a single UPDATE
into multiple PlannedStmts, exercising the SQL function plan cache
and the generic plan reuse path that now invokes ExecutorPrep.
---
 src/backend/executor/functions.c        | 32 +++++++++++++++++++++++--
 src/test/regress/expected/plancache.out | 30 +++++++++++++++++++++++
 src/test/regress/sql/plancache.sql      | 27 +++++++++++++++++++++
 3 files changed, 87 insertions(+), 2 deletions(-)

diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index d81718ea84e..ed7352fce61 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -72,6 +72,7 @@ typedef struct execution_state
 	bool		setsResult;		/* true if this query produces func's result */
 	bool		lazyEval;		/* true if should fetch one row at a time */
 	PlannedStmt *stmt;			/* plan for this query */
+	ExecPrep   *prep;			/* ExecutorPrep() output for this plan */
 	QueryDesc  *qd;				/* null unless status == RUN */
 } execution_state;
 
@@ -657,6 +658,8 @@ init_execution_state(SQLFunctionCachePtr fcache)
 	execution_state *lasttages = NULL;
 	int			nstmts;
 	ListCell   *lc;
+	CachedPlanPrepData cprep = {0};
+	int			i;
 
 	/*
 	 * Clean up after previous query, if there was one.
@@ -695,11 +698,20 @@ init_execution_state(SQLFunctionCachePtr fcache)
 	 * CurrentResourceOwner will be the same when ShutdownSQLFunction runs.)
 	 */
 	fcache->cowner = CurrentResourceOwner;
+
+	/*
+	 * Have ExecutorPrep() allocate under fcache->fcontext.  The prep
+	 * EStates it creates will initially live there; postquel_start()
+	 * will later reparent their es_query_cxt into fcache->subcontext
+	 * when using them for execution.
+	 */
+	cprep.context = fcache->fcontext;
+	cprep.owner = fcache->cowner;
 	fcache->cplan = GetCachedPlan(plansource,
 								  fcache->paramLI,
 								  fcache->cowner,
 								  NULL,
-								  NULL);
+								  &cprep);
 
 	/*
 	 * If necessary, make esarray[] bigger to hold the needed state.
@@ -720,9 +732,12 @@ init_execution_state(SQLFunctionCachePtr fcache)
 	/*
 	 * Build execution_state list to match the number of contained plans.
 	 */
+	i = 0;
 	foreach(lc, fcache->cplan->stmt_list)
 	{
 		PlannedStmt *stmt = lfirst_node(PlannedStmt, lc);
+		ExecPrep *prep = cprep.prep_list ? list_nth(cprep.prep_list, i++) :
+			NULL;
 		execution_state *newes;
 
 		/*
@@ -764,6 +779,7 @@ init_execution_state(SQLFunctionCachePtr fcache)
 		newes->setsResult = false;	/* might change below */
 		newes->lazyEval = false;	/* might change below */
 		newes->stmt = stmt;
+		newes->prep = prep;
 		newes->qd = NULL;
 
 		if (stmt->canSetTag)
@@ -1362,8 +1378,20 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
 	else
 		dest = None_Receiver;
 
+	if (es->prep)
+	{
+		/*
+		 * Prep EStates were built under fcache->fcontext.  For execution,
+		 * make their es_query_cxt a child of fcache->subcontext so they
+		 * follow the usual per call lifetime.
+		 */
+		EState *prep_estate = es->prep->prep_estate;
+
+		MemoryContextSetParent(prep_estate->es_query_cxt, fcache->subcontext);
+	}
+
 	es->qd = CreateQueryDesc(es->stmt,
-							 NULL,
+							 es->prep,
 							 fcache->func->src,
 							 GetActiveSnapshot(),
 							 InvalidSnapshot,
diff --git a/src/test/regress/expected/plancache.out b/src/test/regress/expected/plancache.out
index 26c4c5e10fd..bf937364716 100644
--- a/src/test/regress/expected/plancache.out
+++ b/src/test/regress/expected/plancache.out
@@ -458,4 +458,34 @@ NOTICE:  creating index on partition inval_during_pruning_p1
 
 drop table inval_during_pruning_p, inval_during_pruning_signal;
 drop function invalidate_plancache_func, stable_pruning_val;
+-- exercise sql-function plan cache when rewrite expands a single statement
+-- into multiple planned statements. this forces cachedplan->stmt_list to
+-- contain more than one entry and checks that executor state for the first
+-- rewritten statement does not destroy state needed by the second one.
+set plan_cache_mode = force_generic_plan;
+create table sqlf_base(id int, val int);
+create table sqlf_log(id int, note text);
+insert into sqlf_base values (1, 10);
+create rule sqlf_base_upd_log as
+on update to sqlf_base do also
+    insert into sqlf_log(id, note)
+    values (new.id, 'logged by rule');
+create or replace function sqlf_execprep_test(a int, v int)
+returns void
+language sql
+as $$
+    update sqlf_base set val = v where id = a;
+$$;
+select sqlf_execprep_test(1, 20);
+ sqlf_execprep_test 
+--------------------
+ 
+(1 row)
+
+select sqlf_execprep_test(1, 30);
+ sqlf_execprep_test 
+--------------------
+ 
+(1 row)
+
 reset plan_cache_mode;
diff --git a/src/test/regress/sql/plancache.sql b/src/test/regress/sql/plancache.sql
index cc7eb4da4d3..71320799040 100644
--- a/src/test/regress/sql/plancache.sql
+++ b/src/test/regress/sql/plancache.sql
@@ -272,4 +272,31 @@ explain (verbose, costs off) execute inval_during_pruning_q;
 drop table inval_during_pruning_p, inval_during_pruning_signal;
 drop function invalidate_plancache_func, stable_pruning_val;
 
+-- exercise sql-function plan cache when rewrite expands a single statement
+-- into multiple planned statements. this forces cachedplan->stmt_list to
+-- contain more than one entry and checks that executor state for the first
+-- rewritten statement does not destroy state needed by the second one.
+
+set plan_cache_mode = force_generic_plan;
+
+create table sqlf_base(id int, val int);
+create table sqlf_log(id int, note text);
+
+insert into sqlf_base values (1, 10);
+
+create rule sqlf_base_upd_log as
+on update to sqlf_base do also
+    insert into sqlf_log(id, note)
+    values (new.id, 'logged by rule');
+
+create or replace function sqlf_execprep_test(a int, v int)
+returns void
+language sql
+as $$
+    update sqlf_base set val = v where id = a;
+$$;
+
+select sqlf_execprep_test(1, 20);
+select sqlf_execprep_test(1, 30);
+
 reset plan_cache_mode;
-- 
2.47.3



  [application/octet-stream] v3-0003-Reuse-partition-pruning-results-in-parallel-worke.patch (9.1K, 6-v3-0003-Reuse-partition-pruning-results-in-parallel-worke.patch)
  download | inline diff:
From d9d95e09961dcb8236e5fe7b2da4a37fda8e5944 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Tue, 11 Nov 2025 22:17:47 +0900
Subject: [PATCH v3 3/6] Reuse partition pruning results in parallel workers

Pass the leader's initial partition pruning results and unpruned
relids to parallel workers and reuse them via ExecutorPrep(). This
avoids repeating pruning logic in workers, which is not only
redundant but also risks divergence due to nondeterminism in pruning
steps or parameter evaluation timing.

Introduce ExecCheckInitialPruningResults() to verify that the results
match what the worker would compute. This check helps catch
inconsistencies across leader and worker pruning logic.

While valuable on its own, this change also lays the foundation for
future optimizations where the leader may take locks only on
surviving partitions. Ensuring that workers follow identical pruning
decisions makes such selective locking safe.
---
 src/backend/executor/execParallel.c  | 67 +++++++++++++++++++++++++++-
 src/backend/executor/execPartition.c | 35 +++++++++++++++
 src/include/executor/execPartition.h |  1 +
 3 files changed, 102 insertions(+), 1 deletion(-)

diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index aedbd9566d6..751590adcc9 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -24,6 +24,7 @@
 #include "postgres.h"
 
 #include "executor/execParallel.h"
+#include "executor/execPartition.h"
 #include "executor/executor.h"
 #include "executor/nodeAgg.h"
 #include "executor/nodeAppend.h"
@@ -65,6 +66,8 @@
 #define PARALLEL_KEY_QUERY_TEXT		UINT64CONST(0xE000000000000008)
 #define PARALLEL_KEY_JIT_INSTRUMENTATION UINT64CONST(0xE000000000000009)
 #define PARALLEL_KEY_WAL_USAGE			UINT64CONST(0xE00000000000000A)
+#define PARALLEL_KEY_PARTITION_PRUNE_RESULTS	UINT64CONST(0xE00000000000000B)
+#define PARALLEL_KEY_UNPRUNED_RELIDS	UINT64CONST(0xE00000000000000C)
 
 #define PARALLEL_TUPLE_QUEUE_SIZE		65536
 
@@ -608,12 +611,18 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	FixedParallelExecutorState *fpes;
 	char	   *pstmt_data;
 	char	   *pstmt_space;
+	char	   *part_prune_results_data;
+	char	   *part_prune_results_space;
+	char	   *unpruned_relids_data;
+	char	   *unpruned_relids_space;
 	char	   *paramlistinfo_space;
 	BufferUsage *bufusage_space;
 	WalUsage   *walusage_space;
 	SharedExecutorInstrumentation *instrumentation = NULL;
 	SharedJitInstrumentation *jit_instrumentation = NULL;
 	int			pstmt_len;
+	int			part_prune_results_len;
+	int			unpruned_relids_len;
 	int			paramlistinfo_len;
 	int			instrumentation_len = 0;
 	int			jit_instrumentation_len = 0;
@@ -642,6 +651,8 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 
 	/* Fix up and serialize plan to be sent to workers. */
 	pstmt_data = ExecSerializePlan(planstate->plan, estate);
+	part_prune_results_data = nodeToString(estate->es_part_prune_results);
+	unpruned_relids_data = nodeToString(estate->es_unpruned_relids);
 
 	/* Create a parallel context. */
 	pcxt = CreateParallelContext("postgres", "ParallelQueryMain", nworkers);
@@ -668,6 +679,16 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	shm_toc_estimate_chunk(&pcxt->estimator, pstmt_len);
 	shm_toc_estimate_keys(&pcxt->estimator, 1);
 
+	/* Estimate space for serialized part_prune_results. */
+	part_prune_results_len = strlen(part_prune_results_data) + 1;
+	shm_toc_estimate_chunk(&pcxt->estimator, part_prune_results_len);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Estimate space for serialized unpruned_relids. */
+	unpruned_relids_len = strlen(unpruned_relids_data) + 1;
+	shm_toc_estimate_chunk(&pcxt->estimator, unpruned_relids_len);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
 	/* Estimate space for serialized ParamListInfo. */
 	paramlistinfo_len = EstimateParamListSpace(estate->es_param_list_info);
 	shm_toc_estimate_chunk(&pcxt->estimator, paramlistinfo_len);
@@ -769,6 +790,16 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	memcpy(pstmt_space, pstmt_data, pstmt_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PLANNEDSTMT, pstmt_space);
 
+	/* Store serialized part_prune_results */
+	part_prune_results_space = shm_toc_allocate(pcxt->toc, part_prune_results_len);
+	memcpy(part_prune_results_space, part_prune_results_data, part_prune_results_len);
+	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARTITION_PRUNE_RESULTS, part_prune_results_space);
+
+	/* Store serialized unpruned_relids */
+	unpruned_relids_space = shm_toc_allocate(pcxt->toc, unpruned_relids_len);
+	memcpy(unpruned_relids_space, unpruned_relids_data, unpruned_relids_len);
+	shm_toc_insert(pcxt->toc, PARALLEL_KEY_UNPRUNED_RELIDS, unpruned_relids_space);
+
 	/* Store serialized ParamListInfo. */
 	paramlistinfo_space = shm_toc_allocate(pcxt->toc, paramlistinfo_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARAMLISTINFO, paramlistinfo_space);
@@ -1263,10 +1294,15 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 						 int instrument_options)
 {
 	char	   *pstmtspace;
+	char	   *part_prune_results_space;
+	char	   *unpruned_relids_space;
 	char	   *paramspace;
 	PlannedStmt *pstmt;
+	List	   *part_prune_results;
+	Bitmapset  *unpruned_relids;
 	ParamListInfo paramLI;
 	char	   *queryString;
+	ExecPrep   *prep = NULL;
 
 	/* Get the query string from shared memory */
 	queryString = shm_toc_lookup(toc, PARALLEL_KEY_QUERY_TEXT, false);
@@ -1279,9 +1315,38 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 	paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
 	paramLI = RestoreParamList(&paramspace);
 
+	/* Reconstruct leader-supplied part_prune_results and unpruned_relids. */
+	part_prune_results_space =
+		shm_toc_lookup(toc, PARALLEL_KEY_PARTITION_PRUNE_RESULTS, false);
+	part_prune_results = (List *) stringToNode(part_prune_results_space);
+	unpruned_relids_space =
+		shm_toc_lookup(toc, PARALLEL_KEY_UNPRUNED_RELIDS, false);
+	unpruned_relids = (Bitmapset *) stringToNode(unpruned_relids_space);
+
+	/*
+	 * If pruning was done in the leader, build a prep estate in the worker
+	 * and inject the leader's pruning results into it for reuse.
+	 */
+	if (pstmt->partPruneInfos)
+	{
+		prep = ExecutorPrep(pstmt, paramLI, CurrentResourceOwner, false, 0);
+		Assert(prep->prep_estate);
+
+		prep->prep_estate->es_part_prune_results = part_prune_results;
+		prep->prep_estate->es_unpruned_relids =
+			bms_add_members(prep->prep_estate->es_unpruned_relids,
+							unpruned_relids);
+
+		/*
+		 * Verify that the pruning results passed from the leader match
+		 * what the worker would independently compute.
+		 */
+		ExecCheckInitialPruningResults(prep->prep_estate);
+	}
+
 	/* Create a QueryDesc for the query. */
 	return CreateQueryDesc(pstmt,
-						   NULL,
+						   prep,
 						   queryString,
 						   GetActiveSnapshot(), InvalidSnapshot,
 						   receiver, paramLI, NULL, instrument_options);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 187a480e508..3b450e3373f 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -1872,6 +1872,41 @@ ExecDoInitialPruning(EState *estate)
 	}
 }
 
+/*
+ * ExecCheckInitialPruningResults
+ *      Verify partition pruning results passed from the leader process.
+ *
+ * This is intended to be called during parallel worker query setup.
+ * It recomputes initial pruning results locally and compares them with
+ * those received from the leader. Any mismatch may indicate a divergence
+ * between leader and worker logic or environment.
+ */
+void
+ExecCheckInitialPruningResults(EState *estate)
+{
+	ListCell   *lc;
+	int			i;
+
+	Assert(estate->es_part_prune_results != NULL);
+	i = 0;
+	foreach(lc, estate->es_part_prune_states)
+	{
+		PartitionPruneState *prunestate = (PartitionPruneState *) lfirst(lc);
+		Bitmapset *reuse_validsubplans =
+				list_nth_node(Bitmapset, estate->es_part_prune_results, i++);
+		Bitmapset  *validsubplans = NULL;
+		Bitmapset  *validsubplan_rtis = NULL;
+
+		if (prunestate->do_initial_prune)
+			validsubplans = ExecFindMatchingSubPlans(prunestate, true,
+													 &validsubplan_rtis);
+		if (bms_nonempty_difference(validsubplans, reuse_validsubplans))
+			elog(ERROR, "different validsubplans in parallel worker");
+		if (bms_nonempty_difference(validsubplan_rtis, estate->es_unpruned_relids))
+			elog(ERROR, "different unprunable_relids in parallel worker");
+	}
+}
+
 /*
  * ExecInitPartitionExecPruning
  *		Initialize the data structures needed for runtime "exec" partition
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index ba8cc594fc9..126efd008e5 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -132,6 +132,7 @@ typedef struct PartitionPruneState
 
 extern void ExecCreatePartitionPruneStates(EState *estate);
 extern void ExecDoInitialPruning(EState *estate);
+extern void ExecCheckInitialPruningResults(EState *estate);
 extern PartitionPruneState *ExecInitPartitionExecPruning(PlanState *planstate,
 														 int n_total_subplans,
 														 int part_prune_index,
-- 
2.47.3



  [application/octet-stream] v3-0001-Refactor-partition-pruning-initialization-for-cla.patch (7.7K, 7-v3-0001-Refactor-partition-pruning-initialization-for-cla.patch)
  download | inline diff:
From 243d407de86b0a73b9bd8c8dbc541f630eb33747 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Tue, 11 Nov 2025 21:18:24 +0900
Subject: [PATCH v3 1/6] Refactor partition pruning initialization for clarity
 and modularity

Move the creation of PartitionPruneState structures out of
ExecDoInitialPruning() into a new ExecCreatePartitionPruneStates()
function. This separates the setup of pruning state from the execution
of initial pruning logic, making the code clearer and easier to
maintain.

Also simplify handling of unpruned relids by moving responsibility
for recording them in EState into CreatePartitionPruneState(),
avoiding the need to pass all_leafpart_rtis as an out parameter.

This refactoring allows callers to reuse the pruning setup logic
without always triggering pruning, a capability useful for future use
cases that may only need metadata initialization.
---
 src/backend/executor/execPartition.c | 70 +++++++++++++++++-----------
 src/include/executor/execPartition.h |  1 +
 2 files changed, 43 insertions(+), 28 deletions(-)

diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index aa12e9ad2ea..88b150c8d77 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -182,8 +182,7 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
 static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
 static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
 static PartitionPruneState *CreatePartitionPruneState(EState *estate,
-													  PartitionPruneInfo *pruneinfo,
-													  Bitmapset **all_leafpart_rtis);
+													  PartitionPruneInfo *pruneinfo);
 static void InitPartitionPruneContext(PartitionPruneContext *context,
 									  List *pruning_steps,
 									  PartitionDesc partdesc,
@@ -1772,6 +1771,9 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *
  * Functions:
  *
+ * ExecCreatePartitionPruneStates
+ *		Create PartitionPruneState for all PartitionPruneInfos in the EState
+ *
  * ExecDoInitialPruning:
  *		Perform runtime "initial" pruning, if necessary, to determine the set
  *		of child subnodes that need to be initialized during ExecInitNode() for
@@ -1796,6 +1798,29 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *-------------------------------------------------------------------------
  */
 
+/*
+ * ExecCreatePartitionPruneStates
+ *
+ * Create a PartitionPruneState for each PartitionPruneInfo in the estate,
+ * and save them in estate->es_part_prune_states. This setup is required
+ * before any initial or runtime pruning can occur.
+ */
+void
+ExecCreatePartitionPruneStates(EState *estate)
+{
+	ListCell   *lc;
+
+	foreach(lc, estate->es_part_prune_infos)
+	{
+		PartitionPruneInfo *pruneinfo = lfirst_node(PartitionPruneInfo, lc);
+		PartitionPruneState *prunestate;
+
+		/* Create and save the PartitionPruneState. */
+		prunestate = CreatePartitionPruneState(estate, pruneinfo);
+		estate->es_part_prune_states = lappend(estate->es_part_prune_states,
+											   prunestate);
+	}
+}
 
 /*
  * ExecDoInitialPruning
@@ -1803,11 +1828,11 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *		of child subnodes that need to be initialized during ExecInitNode() for
  *		plan nodes that support partition pruning.
  *
- * This function iterates over each PartitionPruneInfo entry in
- * estate->es_part_prune_infos. For each entry, it creates a PartitionPruneState
- * and adds it to es_part_prune_states.  ExecInitPartitionExecPruning() accesses
+ * This function iterates over each PartitionPruneState in
+ * estate->es_part_prune_states, which must have been populated earlier by
+ * ExecCreatePartitionPruneStates(). ExecInitPartitionExecPruning() accesses
  * these states through their corresponding indexes in es_part_prune_states and
- * assign each state to the parent node's PlanState, from where it will be used
+ * assigns each state to the parent node's PlanState, from where it will be used
  * for "exec" pruning.
  *
  * If initial pruning steps exist for a PartitionPruneInfo entry, this function
@@ -1825,20 +1850,13 @@ ExecDoInitialPruning(EState *estate)
 {
 	ListCell   *lc;
 
-	foreach(lc, estate->es_part_prune_infos)
+	Assert(estate->es_part_prune_results == NULL);
+	foreach(lc, estate->es_part_prune_states)
 	{
-		PartitionPruneInfo *pruneinfo = lfirst_node(PartitionPruneInfo, lc);
-		PartitionPruneState *prunestate;
+		PartitionPruneState *prunestate = (PartitionPruneState *) lfirst(lc);
 		Bitmapset  *validsubplans = NULL;
-		Bitmapset  *all_leafpart_rtis = NULL;
 		Bitmapset  *validsubplan_rtis = NULL;
 
-		/* Create and save the PartitionPruneState. */
-		prunestate = CreatePartitionPruneState(estate, pruneinfo,
-											   &all_leafpart_rtis);
-		estate->es_part_prune_states = lappend(estate->es_part_prune_states,
-											   prunestate);
-
 		/*
 		 * Perform initial pruning steps, if any, and save the result
 		 * bitmapset or NULL as described in the header comment.
@@ -1846,8 +1864,6 @@ ExecDoInitialPruning(EState *estate)
 		if (prunestate->do_initial_prune)
 			validsubplans = ExecFindMatchingSubPlans(prunestate, true,
 													 &validsubplan_rtis);
-		else
-			validsubplan_rtis = all_leafpart_rtis;
 
 		estate->es_unpruned_relids = bms_add_members(estate->es_unpruned_relids,
 													 validsubplan_rtis);
@@ -1965,14 +1981,12 @@ ExecInitPartitionExecPruning(PlanState *planstate,
  * parent plan node's PlanState.
  *
  * If initial pruning steps are to be skipped (e.g., during EXPLAIN
- * (GENERIC_PLAN)), *all_leafpart_rtis will be populated with the RT indexes of
- * all leaf partitions whose scanning subnode is included in the parent plan
- * node's list of child plans. The caller must add these RT indexes to
- * estate->es_unpruned_relids.
+ * (GENERIC_PLAN)), the RT indexes of all leaf partitions whose scanning
+ * subnode is included in the parent plan node's list of child plans are
+ * added to estate->es_unpruned_relids.
  */
 static PartitionPruneState *
-CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo,
-						  Bitmapset **all_leafpart_rtis)
+CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo)
 {
 	PartitionPruneState *prunestate;
 	int			n_part_hierarchies;
@@ -2206,8 +2220,8 @@ CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo,
 													   pinfo->execparamids);
 
 			/*
-			 * Return all leaf partition indexes if we're skipping pruning in
-			 * the EXPLAIN (GENERIC_PLAN) case.
+			 * Add all leaf partition indexes to es_unpruned_relids if we're
+			 * skipping pruning in the EXPLAIN (GENERIC_PLAN) case.
 			 */
 			if (pinfo->initial_pruning_steps && !prunestate->do_initial_prune)
 			{
@@ -2219,8 +2233,8 @@ CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo,
 					Index		rtindex = pprune->leafpart_rti_map[part_index];
 
 					if (rtindex)
-						*all_leafpart_rtis = bms_add_member(*all_leafpart_rtis,
-															rtindex);
+						estate->es_unpruned_relids =
+							bms_add_member(estate->es_unpruned_relids, rtindex);
 				}
 			}
 
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 3b3f46aced0..ba8cc594fc9 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -130,6 +130,7 @@ typedef struct PartitionPruneState
 	PartitionPruningData *partprunedata[FLEXIBLE_ARRAY_MEMBER];
 } PartitionPruneState;
 
+extern void ExecCreatePartitionPruneStates(EState *estate);
 extern void ExecDoInitialPruning(EState *estate);
 extern PartitionPruneState *ExecInitPartitionExecPruning(PlanState *planstate,
 														 int n_total_subplans,
-- 
2.47.3



^ permalink  raw  reply  [nested|flat] 108+ messages in thread

* Re: generic plans and "initial" pruning
@ 2025-11-23 12:17  Tender Wang <[email protected]>
  parent: Amit Langote <[email protected]>
  1 sibling, 1 reply; 108+ messages in thread

From: Tender Wang @ 2025-11-23 12:17 UTC (permalink / raw)
  To: Amit Langote <[email protected]>; +Cc: Tom Lane <[email protected]>; Alexander Lakhin <[email protected]>; Tomas Vondra <[email protected]>; Robert Haas <[email protected]>; Alvaro Herrera <[email protected]>; Andres Freund <[email protected]>; Daniel Gustafsson <[email protected]>; David Rowley <[email protected]>; pgsql-hackers; Thom Brown <[email protected]>

Amit Langote <[email protected]> wrote on Thu, Nov 20, 2025 at 15:30:

> On Mon, Nov 17, 2025 at 9:50 PM Amit Langote <[email protected]> wrote:
> > On Wed, Nov 12, 2025 at 11:17 PM Amit Langote <[email protected]> wrote:
> > > * Enable pruning-aware locking in cached / generic plan reuse (0004):
> > > extends GetCachedPlan() and CheckCachedPlan() to call ExecutorPrep()
> > > on each PlannedStmt in the CachedPlan, locking only surviving
> > > partitions. Adds CachedPlanPrepData to pass this through plan cache
> > > APIs and down to execution via QueryDesc. Also reinstates the
> > > firstResultRel locking rule added in 28317de72 but later lost due to
> > > revert of the earlier pruning patch, to ensure correctness when all
> > > target partitions are pruned.
> >
> > Looking at the changes to executor/functions.c, I also noticed that I
> > had mistakenly allocated the ExecutorPrep state in
> > SQLFunctionCache.fcontext whereas the correct context for execution
> > related state is SQLFunctionCache.subcontext.  In the updated patch,
> > I've made postquel_start() reparent the prep EState's es_query_cxt to
> > subcontext from fcontext. I also did not have a test case that
> > exercised cached plan reuse for SQL functions, so I added one. I split
> > the functions.c GetCachedPlan() + CachedPlanPrepData plumbing into a
> > new patch 0005 so it can be reviewed separately, since it is the only
> > non-mechanical call-site change.
>
> I also noticed a bug in the prep cleanup logic that runs when a cached
> plan becomes invalid during the prep phase. Patch 0005 fixes that and
> adds a regression test that exercises the invalidation path. This will
> be folded into 0004 later.
>

I spent time looking at these patches.

I searched all the places that call GetCachedPlan(), and they always pass
&cprep (a CachedPlanPrepData) to it.
In PrepAndCheckCachedPlan(), if plan_cache_mode is force_generic_plan,
the LockPolicy is always LOCK_UNPRUNED, because cprep is never NULL.
So it seems that the LockPolicy can never be LOCK_ALL. Am I missing
something here?
-- 
Thanks,
Tender Wang


^ permalink  raw  reply  [nested|flat] 108+ messages in thread

* Re: generic plans and "initial" pruning
@ 2025-11-24 03:29  Chao Li <[email protected]>
  parent: Amit Langote <[email protected]>
  1 sibling, 1 reply; 108+ messages in thread

From: Chao Li @ 2025-11-24 03:29 UTC (permalink / raw)
  To: Amit Langote <[email protected]>; +Cc: Tom Lane <[email protected]>; Tender Wang <[email protected]>; Alexander Lakhin <[email protected]>; Tomas Vondra <[email protected]>; Robert Haas <[email protected]>; Alvaro Herrera <[email protected]>; Andres Freund <[email protected]>; Daniel Gustafsson <[email protected]>; David Rowley <[email protected]>; pgsql-hackers; Thom Brown <[email protected]>

Hi, Amit,

Locking only surviving partitions sounds like a good optimization. I have started reviewing this patch set, but I cannot finish in one day, so I will post my comments as I finish each commit.

> On Nov 20, 2025, at 15:30, Amit Langote <[email protected]> wrote:
> 
> <v3-0004-Use-pruning-aware-locking-in-cached-plans.patch><v3-0005-Add-test-exercising-prep-cleanup-on-cached-plan-i.patch><v3-0002-Introduce-ExecutorPrep-and-refactor-executor-star.patch><v3-0006-Make-SQL-function-executor-track-ExecutorPrep-sta.patch><v3-0003-Reuse-partition-pruning-results-in-parallel-worke.patch><v3-0001-Refactor-partition-pruning-initialization-for-cla.patch>


0001 splits the creation of es_part_prune_states out into a new function, ExecCreatePartitionPruneStates(), to make the code clearer, as the commit message states. However, the new function is never called, so 0001 is not self-contained, which is unusual compared with the patches I have reviewed so far. I would suggest having ExecDoInitialPruning() call ExecCreatePartitionPruneStates() when es_part_prune_states is still NIL, so that the current logic is unchanged and 0001 can be pushed independently.
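
To illustrate that suggestion, here is a minimal, self-contained sketch. It does not use the real executor types: ToyEState and the toy_* functions are hypothetical stand-ins, with a NULL prune_states pointer playing the role of a NIL es_part_prune_states list. The point is only that the pruning step creates the states lazily if no caller has done so yet, so both old and new call orders keep working.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for EState; NOT the real PostgreSQL struct. */
typedef struct ToyEState
{
	int			n_prune_infos;	/* stands in for es_part_prune_infos */
	int		   *prune_states;	/* stands in for es_part_prune_states; NULL ~ NIL */
	int			create_calls;	/* counts state creation, for illustration */
} ToyEState;

/* Hypothetical analogue of ExecCreatePartitionPruneStates(). */
static void
toy_create_prune_states(ToyEState *estate)
{
	int			n = estate->n_prune_infos;

	/* States must not have been created already. */
	assert(estate->prune_states == NULL);

	/* Allocate at least one slot so a successful result is never NULL. */
	estate->prune_states = calloc(n > 0 ? n : 1, sizeof(int));
	if (estate->prune_states == NULL)
		abort();
	estate->create_calls++;
}

/*
 * Hypothetical analogue of ExecDoInitialPruning(): create the prune states
 * only if an earlier caller has not done so, then "prune" each one.
 * Returns the number of states processed.
 */
static int
toy_do_initial_pruning(ToyEState *estate)
{
	int			i;

	if (estate->prune_states == NULL)
		toy_create_prune_states(estate);

	for (i = 0; i < estate->n_prune_infos; i++)
		estate->prune_states[i] = 1;	/* mark as pruned */

	return estate->n_prune_infos;
}
```

With this shape, a caller that never invokes the creation step still gets correct behaviour, while a caller that creates the states up front does not trigger a second creation.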

0002 moves the permission checks and related logic from InitPlan() to the new function ExecutorPrep(). The commit message says “executor setup logic unchanged”. However, the old code did not call PushActiveSnapshot() before the permission check, whereas the patch does, which may introduce different behavior. Why is PushActiveSnapshot() added?

Actually, I am still trying to understand 0002-0004, and it will take me some time to fully understand them. I wanted to raise the above comments first; I will continue reviewing tomorrow.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/









^ permalink  raw  reply  [nested|flat] 108+ messages in thread

* Re: generic plans and "initial" pruning
@ 2025-11-25 01:56  Amit Langote <[email protected]>
  parent: Tender Wang <[email protected]>
  0 siblings, 0 replies; 108+ messages in thread

From: Amit Langote @ 2025-11-25 01:56 UTC (permalink / raw)
  To: Tender Wang <[email protected]>; +Cc: Tom Lane <[email protected]>; Alexander Lakhin <[email protected]>; Tomas Vondra <[email protected]>; Robert Haas <[email protected]>; Alvaro Herrera <[email protected]>; Andres Freund <[email protected]>; Daniel Gustafsson <[email protected]>; David Rowley <[email protected]>; pgsql-hackers; Thom Brown <[email protected]>

On Sun, Nov 23, 2025 at 9:17 PM Tender Wang <[email protected]> wrote:
> Amit Langote <[email protected]> wrote on Thu, Nov 20, 2025 at 15:30:
>>
>> On Mon, Nov 17, 2025 at 9:50 PM Amit Langote <[email protected]> wrote:
>> > On Wed, Nov 12, 2025 at 11:17 PM Amit Langote <[email protected]> wrote:
>> > > * Enable pruning-aware locking in cached / generic plan reuse (0004):
>> > > extends GetCachedPlan() and CheckCachedPlan() to call ExecutorPrep()
>> > > on each PlannedStmt in the CachedPlan, locking only surviving
>> > > partitions. Adds CachedPlanPrepData to pass this through plan cache
>> > > APIs and down to execution via QueryDesc. Also reinstates the
>> > > firstResultRel locking rule added in 28317de72 but later lost due to
>> > > revert of the earlier pruning patch, to ensure correctness when all
>> > > target partitions are pruned.
>> >
>> > Looking at the changes to executor/functions.c, I also noticed that I
>> > had mistakenly allocated the ExecutorPrep state in
>> > SQLFunctionCache.fcontext whereas the correct context for execution
>> > related state is SQLFunctionCache.subcontext.  In the updated patch,
>> > I've made postquel_start() reparent the prep EState's es_query_cxt to
>> > subcontext from fcontext. I also did not have a test case that
>> > exercised cached plan reuse for SQL functions, so I added one. I split
>> > the functions.c GetCachedPlan() + CachedPlanPrepData plumbing into a
>> > new patch 0005 so it can be reviewed separately, since it is the only
>> > non-mechanical call-site change.
>>
>> I also noticed a bug in the prep cleanup logic that runs when a cached
>> plan becomes invalid during the prep phase. Patch 0005 fixes that and
>> adds a regression test that exercises the invalidation path. This will
>> be folded into 0004 later.
>
> I spent time looking at these patches.
>
> I searched all the places that call GetCachedPlan(), and they always pass &cprep (a CachedPlanPrepData) to it.
> In PrepAndCheckCachedPlan(), if plan_cache_mode is force_generic_plan, the LockPolicy is always LOCK_UNPRUNED, because cprep is never NULL.
> So it seems that the LockPolicy can never be LOCK_ALL. Am I missing something here?

Yes, eventually LockPolicy may end up redundant and we might not need
AcquireExecutorLocksPolicy() at all, with a single locking path
covering both cases.

My goal initially was to stage the changes across call sites: keep a
LOCK_ALL path for callers that still use the old
lock-everything-up-front behaviour, and gradually convert other
callers to pass a non-NULL CachedPlanPrepData and handle the prep_list
it may return, so that GetCachedPlan() can perform LOCK_UNPRUNED
locking internally. That is why GetCachedPlan() accepts a possibly
NULL cprep and why LockPolicy exists as a separate knob.

For example, I decided to split out the functions.c refactoring of
plan cache usage into its own patch. That made me realise that new
users of GetCachedPlan() may appear that first adopt the simpler
LOCK_ALL behaviour and only later switch to LOCK_UNPRUNED when
pruning-aware locking becomes useful for them. Keeping the two paths
preserves that incremental route and avoids forcing every new user to
adopt CachedPlanPrepData and LOCK_UNPRUNED locking up front. I am
still undecided whether that two-path structure is a good idea, but I
am inclined to keep it for now. I would be happy to hear opinions on
this.
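
To make the intended rule concrete, here is a rough, self-contained sketch of how the policy would follow from whether the caller supplies prep data. It is not the code in the patch: ToyLockPolicy, ToyCachedPlanPrepData and the toy_* functions are hypothetical stand-ins that only mirror the LOCK_ALL / LOCK_UNPRUNED distinction discussed above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real definitions live in the patch set. */
typedef enum ToyLockPolicy
{
	TOY_LOCK_ALL,				/* lock every relation in the plan up front */
	TOY_LOCK_UNPRUNED			/* lock only relations surviving initial pruning */
} ToyLockPolicy;

typedef struct ToyCachedPlanPrepData
{
	void	   *prep_list;		/* per-statement prep results, filled by the callee */
} ToyCachedPlanPrepData;

/*
 * A caller that passes no prep data cannot consume pruning results, so it
 * gets the old lock-everything behaviour; a caller that passes prep data
 * opts in to pruning-aware locking.
 */
static ToyLockPolicy
toy_choose_lock_policy(const ToyCachedPlanPrepData *cprep)
{
	return (cprep != NULL) ? TOY_LOCK_UNPRUNED : TOY_LOCK_ALL;
}

/* Printable name for a policy, for debugging output. */
static const char *
toy_lock_policy_name(ToyLockPolicy policy)
{
	assert(policy == TOY_LOCK_ALL || policy == TOY_LOCK_UNPRUNED);
	return (policy == TOY_LOCK_ALL) ? "LOCK_ALL" : "LOCK_UNPRUNED";
}
```

If every caller ends up passing non-NULL prep data, the first branch is the only one ever taken, which is the redundancy pointed out above.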

-- 
Thanks, Amit Langote





^ permalink  raw  reply  [nested|flat] 108+ messages in thread

* Re: generic plans and "initial" pruning
@ 2025-11-25 08:31  Amit Langote <[email protected]>
  parent: Chao Li <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: Amit Langote @ 2025-11-25 08:31 UTC (permalink / raw)
  To: Chao Li <[email protected]>; +Cc: Tom Lane <[email protected]>; Tender Wang <[email protected]>; Alexander Lakhin <[email protected]>; Tomas Vondra <[email protected]>; Robert Haas <[email protected]>; Alvaro Herrera <[email protected]>; Andres Freund <[email protected]>; Daniel Gustafsson <[email protected]>; David Rowley <[email protected]>; pgsql-hackers; Thom Brown <[email protected]>

Hi Evan,

On Mon, Nov 24, 2025 at 12:30 PM Chao Li <[email protected]> wrote:
>
> Hi, Amit,
>
> Locking only surviving partitions sounds like a good optimization. I have started reviewing this patch, but I cannot finish the review in one day. I will post my comments as soon as I have finished some of the commits.

Thank you very much for taking the time to review.

> > On Nov 20, 2025, at 15:30, Amit Langote <[email protected]> wrote:
> >
> > <v3-0004-Use-pruning-aware-locking-in-cached-plans.patch><v3-0005-Add-test-exercising-prep-cleanup-on-cached-plan-i.patch><v3-0002-Introduce-ExecutorPrep-and-refactor-executor-star.patch><v3-0006-Make-SQL-function-executor-track-ExecutorPrep-sta.patch><v3-0003-Reuse-partition-pruning-results-in-parallel-worke.patch><v3-0001-Refactor-partition-pruning-initialization-for-cla.patch>
>
>
> 0001 splits the creation of es_part_prune_states into a new function, ExecCreatePartitionPruneStates(). With that, you are trying to make the code clearer, as you state in the commit message. However, the new function is never called, meaning 0001 is not self-contained, which feels unusual to me judging by the patches I have reviewed so far.

Oops, that is not intentional.

> I would suggest having ExecDoInitialPruning() call ExecCreatePartitionPruneStates() when es_part_prune_states is still NIL, so that the current logic is unchanged and 0001 can be pushed independently.

0002 adds a call to ExecDoInitialPruning() in ExecutorPrep(), preceded
by a call to ExecCreatePartitionPruneStates(), and that is how I think
it should be. So in the attached updated 0001, I have made InitPlan()
call ExecCreatePartitionPruneStates() before calling
ExecDoInitialPruning().
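
As a rough illustration of that ordering, here is a standalone toy
model. It is not the patch code: ToyEState and the toy_* functions are
hypothetical stand-ins for the EState and for
ExecCreatePartitionPruneStates() / ExecDoInitialPruning(). It only shows
that the pruning step assumes the prune states already exist.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the parts of EState relevant to this ordering. */
typedef struct ToyEState
{
    bool prune_states_created;  /* set by the "create" step */
    bool initial_pruning_done;  /* set by the "prune" step */
} ToyEState;

/* Stand-in for ExecCreatePartitionPruneStates(). */
static void
toy_create_prune_states(ToyEState *estate)
{
    estate->prune_states_created = true;
}

/* Stand-in for ExecDoInitialPruning(); relies on the states existing. */
static void
toy_do_initial_pruning(ToyEState *estate)
{
    assert(estate->prune_states_created);
    estate->initial_pruning_done = true;
}

/* Stand-in for the caller (InitPlan() in 0001): create first, then prune. */
static void
toy_init_plan(ToyEState *estate)
{
    toy_create_prune_states(estate);
    toy_do_initial_pruning(estate);
}
```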

> 0002 moves the permission check etc. from InitPlan() to the new function ExecutorPrep(). The commit message says “executor setup logic unchanged”. But in the old code there was no PushActiveSnapshot() before the permission check, whereas in the patch PushActiveSnapshot() is done before the permission check, which may introduce different behavior. I just wonder why PushActiveSnapshot() was added?

That is a valid concern.

I found it necessary because the initial pruning code (which runs in
ExecDoInitialPruning()) may require ActiveSnapshot to be valid if
pruning expressions end up calling code that invokes
EnsurePortalSnapshotExists(). That requirement already existed when
ExecDoInitialPruning() was driven from ExecutorStart(), but
ExecutorPrep() can now be called from places that do not otherwise
push a snapshot. The snapshot push is only there to cover those
callers. It does not change permission checking itself, it just
ensures ExecutorPrep() runs with the same preconditions that
ExecutorStart() always had.
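
The push-only-if-needed pattern can be sketched as a standalone toy
model. This is an illustration, not the patch: the depth counter and the
toy_* functions are hypothetical stand-ins for the active snapshot stack
and for ActiveSnapshotSet(), PushActiveSnapshot() and
PopActiveSnapshot().

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the active snapshot stack: only its depth is tracked. */
static int toy_snapshot_depth = 0;

static bool
toy_active_snapshot_set(void)
{
    return toy_snapshot_depth > 0;
}

static void
toy_push_snapshot(void)
{
    toy_snapshot_depth++;
}

static void
toy_pop_snapshot(void)
{
    assert(toy_snapshot_depth > 0);
    toy_snapshot_depth--;
}

/*
 * Toy version of the prep step: push a snapshot only if the caller did
 * not supply one, and pop only what was pushed here, so the stack depth
 * is the same on exit as on entry.  Returns true if it had to push.
 */
static bool
toy_executor_prep(void)
{
    bool snapshot_set = false;

    if (!toy_active_snapshot_set())
    {
        toy_push_snapshot();
        snapshot_set = true;
    }
    assert(toy_active_snapshot_set());

    /* permission checks, range table setup and pruning would run here */

    if (snapshot_set)
        toy_pop_snapshot();

    return snapshot_set;
}
```

A caller that already has an active snapshot sees no push or pop at all,
so the snapshot stack looks the same to it before and after the call.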

> Actually, I am still trying to understand 0002-0004; it will take me some time to fully understand the patches. I am raising the above comments first and will continue reviewing this patch tomorrow.

Thanks, I appreciate your review.

-- 
Thanks, Amit Langote


Attachments:

  [application/octet-stream] v4-0002-Introduce-ExecutorPrep-and-refactor-executor-star.patch (28.8K, 2-v4-0002-Introduce-ExecutorPrep-and-refactor-executor-star.patch)
  download | inline diff:
From a004aab1ce9418a2f6273d1a67673b3d4a7c218b Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Tue, 11 Nov 2025 21:47:46 +0900
Subject: [PATCH v4 2/6] Introduce ExecutorPrep and refactor executor startup

Factor permission checks, range table initialization, and initial
partition pruning out of InitPlan() into a new ExecutorPrep()
helper.  ExecutorPrep builds an EState containing the executor
metadata needed before plan execution, including partition
pruning state where partPruneInfos are present.

ExecutorStart() now expects QueryDesc->prep to point at such an
ExecPrep object.  If no prep was supplied by the caller, it
invokes ExecutorPrep() itself and adopts the resulting EState
for the duration of the query.  This keeps the executor startup
behaviour unchanged while making the setup work callable
separately when needed.

CreateQueryDesc() grows a prep argument and stores it in the
QueryDesc.  Portals, SPI, SQL functions, and EXPLAIN are wired
to carry an optional ExecPrep pointer alongside the PlannedStmt
list, but most callers still pass NULL and let ExecutorStart()
perform the setup lazily.

Add the ExecPrep struct and ExecPrepCleanup() to encapsulate
ownership of the prepared EState and any caller specific
cleanup hook.  Update executor/README and related comments to
document the new control flow and the separation between
preparation and execution.
---
 src/backend/commands/copyto.c        |   2 +-
 src/backend/commands/createas.c      |   2 +-
 src/backend/commands/explain.c       |   7 +-
 src/backend/commands/extension.c     |   1 +
 src/backend/commands/matview.c       |   2 +-
 src/backend/commands/portalcmds.c    |   1 +
 src/backend/commands/prepare.c       |  11 +-
 src/backend/executor/README          |   8 +-
 src/backend/executor/execMain.c      | 180 ++++++++++++++++++++++-----
 src/backend/executor/execParallel.c  |   1 +
 src/backend/executor/execPartition.c |   3 +
 src/backend/executor/functions.c     |   1 +
 src/backend/executor/spi.c           |  10 ++
 src/backend/tcop/postgres.c          |   2 +
 src/backend/tcop/pquery.c            |  27 +++-
 src/backend/utils/mmgr/portalmem.c   |   2 +
 src/include/commands/explain.h       |   3 +-
 src/include/executor/execdesc.h      |   3 +-
 src/include/executor/executor.h      |  11 ++
 src/include/nodes/execnodes.h        |  48 +++++++
 src/include/utils/portal.h           |   2 +
 21 files changed, 286 insertions(+), 41 deletions(-)

diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index cef452584e5..5efbb0949c2 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -870,7 +870,7 @@ BeginCopyTo(ParseState *pstate,
 		((DR_copy *) dest)->cstate = cstate;
 
 		/* Create a QueryDesc requesting no output */
-		cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+		cstate->queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
 											dest, NULL, NULL, 0);
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 1ccc2e55c64..9eabe4920cd 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -334,7 +334,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
 		UpdateActiveSnapshotCommandId();
 
 		/* Create a QueryDesc, redirecting output to our tuple receiver */
-		queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+		queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
 									GetActiveSnapshot(), InvalidSnapshot,
 									dest, params, queryEnv, 0);
 
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 7e699f8595e..d6ab3697dd9 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -370,7 +370,7 @@ standard_ExplainOneQuery(Query *query, int cursorOptions,
 	}
 
 	/* run it (if needed) and produce output */
-	ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+	ExplainOnePlan(plan, NULL, into, es, queryString, params, queryEnv,
 				   &planduration, (es->buffers ? &bufusage : NULL),
 				   es->memory ? &mem_counters : NULL);
 }
@@ -492,7 +492,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
  * to call it.
  */
 void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, ExecPrep *prep,
+			   IntoClause *into, ExplainState *es,
 			   const char *queryString, ParamListInfo params,
 			   QueryEnvironment *queryEnv, const instr_time *planduration,
 			   const BufferUsage *bufusage,
@@ -548,7 +549,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
 		dest = None_Receiver;
 
 	/* Create a QueryDesc for the query */
-	queryDesc = CreateQueryDesc(plannedstmt, queryString,
+	queryDesc = CreateQueryDesc(plannedstmt, prep, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, params, queryEnv, instrument_option);
 
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index ebc204c4462..9429fc2d17d 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -993,6 +993,7 @@ execute_sql_string(const char *sql, const char *filename)
 				QueryDesc  *qdesc;
 
 				qdesc = CreateQueryDesc(stmt,
+										NULL,
 										sql,
 										GetActiveSnapshot(), NULL,
 										dest, NULL, NULL, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index ef7c0d624f1..30cbf9f264f 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -437,7 +437,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
 	UpdateActiveSnapshotCommandId();
 
 	/* Create a QueryDesc, redirecting output to our tuple receiver */
-	queryDesc = CreateQueryDesc(plan, queryString,
+	queryDesc = CreateQueryDesc(plan, NULL, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, NULL, NULL, 0);
 
diff --git a/src/backend/commands/portalcmds.c b/src/backend/commands/portalcmds.c
index ec96c2efcd3..ac1ddd25aba 100644
--- a/src/backend/commands/portalcmds.c
+++ b/src/backend/commands/portalcmds.c
@@ -118,6 +118,7 @@ PerformCursorOpen(ParseState *pstate, DeclareCursorStmt *cstmt, ParamListInfo pa
 					  queryString,
 					  CMDTAG_SELECT,	/* cursor's query is always a SELECT */
 					  list_make1(plan),
+					  list_make1(NULL),
 					  NULL);
 
 	/*----------
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 34b6410d6a2..afd449c73ba 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -205,6 +205,7 @@ ExecuteQuery(ParseState *pstate,
 					  query_string,
 					  entry->plansource->commandTag,
 					  plan_list,
+					  NIL,
 					  cplan);
 
 	/*
@@ -575,6 +576,7 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	const char *query_string;
 	CachedPlan *cplan;
 	List	   *plan_list;
+	List	   *prep_list;
 	ListCell   *p;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
@@ -585,6 +587,7 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	MemoryContextCounters mem_counters;
 	MemoryContext planner_ctx = NULL;
 	MemoryContext saved_ctx = NULL;
+	int			i;
 
 	if (es->memory)
 	{
@@ -650,14 +653,20 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	}
 
 	plan_list = cplan->stmt_list;
+	prep_list = NIL;
 
 	/* Explain each query */
+	i = 0;
 	foreach(p, plan_list)
 	{
 		PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
+		ExecPrep *prep = prep_list ?
+			(ExecPrep *) list_nth(prep_list, i) : NULL;
 
+		i++;
 		if (pstmt->commandType != CMD_UTILITY)
-			ExplainOnePlan(pstmt, into, es, query_string, paramLI, pstate->p_queryEnv,
+			ExplainOnePlan(pstmt, prep,
+						   into, es, query_string, paramLI, pstate->p_queryEnv,
 						   &planduration, (es->buffers ? &bufusage : NULL),
 						   es->memory ? &mem_counters : NULL);
 		else
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 54f4782f31b..95b5ec58c55 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -291,10 +291,16 @@ Query Processing Control Flow
 
 This is a sketch of control flow for full query processing:
 
+    ExecutorPrep
+		May be run before ExecutorStart (e.g., for plan validation), or
+		implicitly from ExecutorStart if not done earlier.  Performs range
+		table initialization, permission checks, and initial partition pruning.
+		Returns an ExecPrep wrapper with EState that ExecutorStart may reuse.
+
 	CreateQueryDesc
 
 	ExecutorStart
-		CreateExecutorState
+		CreateExecutorState (or reuse one from ExecPrep if present)
 			creates per-query context
 		switch to per-query context to run ExecInitNode
 		AfterTriggerBeginQuery
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index f5f4986383d..39de0b93a1c 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -171,8 +171,26 @@ standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
 
 	/*
 	 * Build EState, switch into per-query memory context for startup.
+	 *
+	 * If ExecutorPrep() ran earlier (e.g., to do initial pruning during plan
+	 * validity checking), reuse its EState to avoid redoing range table setup
+	 * and pruning. Otherwise, create a fresh EState as usual.
 	 */
-	estate = CreateExecutorState();
+	if (queryDesc->prep == NULL)
+		queryDesc->prep = ExecutorPrep(queryDesc->plannedstmt,
+									   queryDesc->params,
+									   CurrentResourceOwner,
+									   true,
+									   eflags);
+	Assert(queryDesc->prep);
+	estate = queryDesc->prep->prep_estate;
+
+	/*
+	 * Executor is adopting the prep's EState. Mark it so ExecPrepCleanup()
+	 * doesn't try to free it redundantly.
+	 */
+	queryDesc->prep->owns_estate = false;
+
 	queryDesc->estate = estate;
 
 	oldcontext = MemoryContextSwitchTo(estate->es_query_cxt);
@@ -263,6 +281,136 @@ standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
 	MemoryContextSwitchTo(oldcontext);
 }
 
+/*
+ * ExecutorPrep: prepare executor state for a PlannedStmt outside ExecutorStart.
+ *
+ * Performs range table initialization, permission checks, and initial
+ * partition pruning if partPruneInfos are present and do_initial_pruning is
+ * true.
+ *
+ * This is intended for callers that need executor metadata ahead of actual
+ * execution. Typical use cases include:
+ *	- determining which relations must be locked during plan cache validation;
+ *	- initializing unpruned relids and valid subplans in parallel workers
+ *	  using state copied from the leader.
+ *
+ * The executor can reuse the resulting state to avoid redundant setup during
+ * ExecutorStart().
+ *
+ * Returns an ExecPrep wrapper that owns the EState and can be reused
+ * or cleaned up later.
+ */
+ExecPrep *
+ExecutorPrep(PlannedStmt *pstmt, ParamListInfo params, ResourceOwner owner,
+			 bool do_initial_pruning, int eflags)
+{
+	ResourceOwner oldowner;
+	EState *estate;
+	bool	snapshot_set;
+
+	if (pstmt->commandType == CMD_UTILITY)
+		return NULL;
+
+	/* Pruning may use expressions that require an active snapshot. */
+	snapshot_set = false;
+	if (!ActiveSnapshotSet())
+	{
+		PushActiveSnapshot(GetTransactionSnapshot());
+		snapshot_set = true;
+	}
+	Assert(ActiveSnapshotSet());
+
+	estate = CreateExecutorState();
+	estate->es_plannedstmt = pstmt;
+	estate->es_part_prune_infos = pstmt->partPruneInfos;
+	estate->es_param_list_info = params;
+	estate->es_top_eflags = eflags;
+
+	/*
+	 * Do permissions checks.
+	 */
+	ExecCheckPermissions(pstmt->rtable, pstmt->permInfos, true);
+
+	/*
+	 * Initialize range table.
+	 */
+	ExecInitRangeTable(estate, pstmt->rtable, pstmt->permInfos,
+					   bms_copy(pstmt->unprunableRelids));
+
+	/*
+	 * Ensure locks taken during initial pruning are tracked under the given
+	 * ResourceOwner (e.g., one associated with CachedPlan validation).
+	 */
+	oldowner = CurrentResourceOwner;
+	CurrentResourceOwner = owner;
+
+	/*
+	 * Set up PartitionPruneState structures needed for both initial and
+	 * runtime partition pruning. These structures are built from the
+	 * PartitionPruneInfo entries in the plan tree.
+	 *
+	 * If do_initial_pruning is true, also perform initial pruning to compute
+	 * the subset of child subplans that will be executed. The results,
+	 * which are bitmapsets of selected child indexes, are saved in
+	 * es_part_prune_results. This list is parallel to es_part_prune_infos.
+	 *
+	 * In parallel workers, do_initial_pruning should be false -- they receive
+	 * es_part_prune_results from the leader process and should only initialize
+	 * the PartitionPruneStates.
+	 */
+	ExecCreatePartitionPruneStates(estate);
+	if (do_initial_pruning)
+		ExecDoInitialPruning(estate);
+
+	CurrentResourceOwner = oldowner;
+
+	/* Release snapshot if we got one */
+	if (snapshot_set)
+		PopActiveSnapshot();
+
+	return CreateExecPrep(estate, CurrentMemoryContext, NULL, NULL);
+}
+
+/*
+ * CreateExecPrep: initialize ExecPrep wrapper with optional cleanup metadata.
+ */
+ExecPrep *
+CreateExecPrep(EState *estate, MemoryContext context,
+			   execprep_cleanup_fn cleanup, void *cleanup_arg)
+{
+	ExecPrep *prep = palloc0(sizeof(ExecPrep));
+
+	prep->prep_estate = estate;
+	prep->context = context;
+	prep->cleanup = cleanup;
+	prep->cleanup_arg = cleanup_arg;
+	prep->owns_estate = true;
+
+	return prep;
+}
+
+/*
+ * ExecPrepCleanup: free ExecPrep resources not adopted by the executor.
+ *
+ * Only frees the EState if it wasn't taken over by ExecutorStart().
+ * Always runs the optional user-defined cleanup callback.
+ */
+void
+ExecPrepCleanup(ExecPrep *prep)
+{
+	if (prep == NULL)
+		return;
+
+	if (prep->prep_estate && prep->owns_estate)
+	{
+		ExecCloseRangeTableRelations(prep->prep_estate);
+		FreeExecutorState(prep->prep_estate);
+	}
+
+	if (prep->cleanup)
+		prep->cleanup(prep->cleanup_arg);
+}
+
 /* ----------------------------------------------------------------
  *		ExecutorRun
  *
@@ -824,7 +972,6 @@ ExecCheckXactReadOnly(PlannedStmt *plannedstmt)
 		PreventCommandIfParallelMode(CreateCommandName((Node *) plannedstmt));
 }
 
-
 /* ----------------------------------------------------------------
  *		InitPlan
  *
@@ -838,38 +985,15 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 	CmdType		operation = queryDesc->operation;
 	PlannedStmt *plannedstmt = queryDesc->plannedstmt;
 	Plan	   *plan = plannedstmt->planTree;
-	List	   *rangeTable = plannedstmt->rtable;
 	EState	   *estate = queryDesc->estate;
 	PlanState  *planstate;
 	TupleDesc	tupType;
 	ListCell   *l;
 	int			i;
 
-	/*
-	 * Do permissions checks
-	 */
-	ExecCheckPermissions(rangeTable, plannedstmt->permInfos, true);
-
-	/*
-	 * initialize the node's execution state
-	 */
-	ExecInitRangeTable(estate, rangeTable, plannedstmt->permInfos,
-					   bms_copy(plannedstmt->unprunableRelids));
-
-	estate->es_plannedstmt = plannedstmt;
-	estate->es_part_prune_infos = plannedstmt->partPruneInfos;
-
-	/*
-	 * Perform runtime "initial" pruning to identify which child subplans,
-	 * corresponding to the children of plan nodes that contain
-	 * PartitionPruneInfo such as Append, will not be executed. The results,
-	 * which are bitmapsets of indexes of the child subplans that will be
-	 * executed, are saved in es_part_prune_results.  These results correspond
-	 * to each PartitionPruneInfo entry, and the es_part_prune_results list is
-	 * parallel to es_part_prune_infos.
-	 */
-	ExecCreatePartitionPruneStates(estate);
-	ExecDoInitialPruning(estate);
+	/* ExecutorPrep() must have been done. */
+	Assert(queryDesc->prep);
+	Assert(estate == queryDesc->prep->prep_estate);
 
 	/*
 	 * Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index f098a5557cf..aedbd9566d6 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -1281,6 +1281,7 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 
 	/* Create a QueryDesc for the query. */
 	return CreateQueryDesc(pstmt,
+						   NULL,
 						   queryString,
 						   GetActiveSnapshot(), InvalidSnapshot,
 						   receiver, paramLI, NULL, instrument_options);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 61559642662..ac5e2ebee72 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -2369,6 +2369,9 @@ InitExecPartitionPruneContexts(PartitionPruneState *prunestate,
 	Assert(parent_plan != NULL);
 	estate = parent_plan->state;
 
+	/* Wouldn't be available at ExecutorPrep() time. */
+	prunestate->econtext->ecxt_param_exec_vals = estate->es_param_exec_vals;
+
 	/*
 	 * No need to fix subplans maps if initial pruning didn't eliminate any
 	 * subplans.
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index 630d708d2a3..633310c5f5b 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -1362,6 +1362,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
 		dest = None_Receiver;
 
 	es->qd = CreateQueryDesc(es->stmt,
+							 NULL,
 							 fcache->func->src,
 							 GetActiveSnapshot(),
 							 InvalidSnapshot,
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 653500b38dc..7a3cb944d6f 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1685,6 +1685,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 					  query_string,
 					  plansource->commandTag,
 					  stmt_list,
+					  NIL,
 					  cplan);
 
 	/*
@@ -2500,6 +2501,8 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 		CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
 		List	   *stmt_list;
 		ListCell   *lc2;
+		List	   *prep_list;
+		int			i;
 
 		spicallbackarg.query = plansource->query_string;
 
@@ -2578,6 +2581,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 							  plan_owner, _SPI_current->queryEnv);
 
 		stmt_list = cplan->stmt_list;
+		prep_list = NIL;
 
 		/*
 		 * If we weren't given a specific snapshot to use, and the statement
@@ -2615,12 +2619,17 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 			}
 		}
 
+		i = 0;
 		foreach(lc2, stmt_list)
 		{
 			PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+			ExecPrep *prep = prep_list ?
+				list_nth(prep_list, i) : NULL;
 			bool		canSetTag = stmt->canSetTag;
 			DestReceiver *dest;
 
+			i++;
+
 			/*
 			 * Reset output state.  (Note that if a non-SPI receiver is used,
 			 * _SPI_current->processed will stay zero, and that's what we'll
@@ -2690,6 +2699,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 					snap = InvalidSnapshot;
 
 				qdesc = CreateQueryDesc(stmt,
+										prep,
 										plansource->query_string,
 										snap, crosscheck_snapshot,
 										dest,
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 7dd75a490aa..5880a574a06 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1232,6 +1232,7 @@ exec_simple_query(const char *query_string)
 						  query_string,
 						  commandTag,
 						  plantree_list,
+						  NIL,
 						  NULL);
 
 		/*
@@ -2033,6 +2034,7 @@ exec_bind_message(StringInfo input_message)
 					  query_string,
 					  psrc->commandTag,
 					  cplan->stmt_list,
+					  NIL,
 					  cplan);
 
 	/* Portal is defined, set the plan ID based on its contents. */
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index fde78c55160..82c295502b0 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -37,6 +37,7 @@ Portal		ActivePortal = NULL;
 
 
 static void ProcessQuery(PlannedStmt *plan,
+						 ExecPrep *prep,
 						 const char *sourceText,
 						 ParamListInfo params,
 						 QueryEnvironment *queryEnv,
@@ -66,6 +67,7 @@ static void DoPortalRewind(Portal portal);
  */
 QueryDesc *
 CreateQueryDesc(PlannedStmt *plannedstmt,
+				ExecPrep *prep,
 				const char *sourceText,
 				Snapshot snapshot,
 				Snapshot crosscheck_snapshot,
@@ -78,6 +80,7 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
 
 	qd->operation = plannedstmt->commandType;	/* operation */
 	qd->plannedstmt = plannedstmt;	/* plan */
+	qd->prep = prep;		/* executor prep output */
 	qd->sourceText = sourceText;	/* query text */
 	qd->snapshot = RegisterSnapshot(snapshot);	/* snapshot */
 	/* RI check snapshot */
@@ -112,6 +115,13 @@ FreeQueryDesc(QueryDesc *qdesc)
 	UnregisterSnapshot(qdesc->snapshot);
 	UnregisterSnapshot(qdesc->crosscheck_snapshot);
 
+	/* ExecPrep cleanup if necessary */
+	if (qdesc->prep)
+	{
+		ExecPrepCleanup(qdesc->prep);
+		qdesc->prep = NULL;
+	}
+
 	/* Only the QueryDesc itself need be freed */
 	pfree(qdesc);
 }
@@ -123,6 +133,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  *		PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
  *
  *	plan: the plan tree for the query
+ *	prep: ExecPrep for the plan (output of ExecutorPrep())
  *	sourceText: the source text of the query
  *	params: any parameters needed
  *	dest: where to send results
@@ -135,6 +146,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  */
 static void
 ProcessQuery(PlannedStmt *plan,
+			 ExecPrep *prep,
 			 const char *sourceText,
 			 ParamListInfo params,
 			 QueryEnvironment *queryEnv,
@@ -146,7 +158,7 @@ ProcessQuery(PlannedStmt *plan,
 	/*
 	 * Create the QueryDesc object
 	 */
-	queryDesc = CreateQueryDesc(plan, sourceText,
+	queryDesc = CreateQueryDesc(plan, prep, sourceText,
 								GetActiveSnapshot(), InvalidSnapshot,
 								dest, params, queryEnv, 0);
 
@@ -489,6 +501,9 @@ PortalStart(Portal portal, ParamListInfo params,
 				 * the destination to DestNone.
 				 */
 				queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+											portal->preps ?
+											(ExecPrep *) linitial(portal->preps) :
+											NULL,
 											portal->sourceText,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
@@ -1185,6 +1200,7 @@ PortalRunMulti(Portal portal,
 {
 	bool		active_snapshot_set = false;
 	ListCell   *stmtlist_item;
+	int			i;
 
 	/*
 	 * If the destination is DestRemoteExecute, change to DestNone.  The
@@ -1205,9 +1221,14 @@ PortalRunMulti(Portal portal,
 	 * Loop to handle the individual queries generated from a single parsetree
 	 * by analysis and rewrite.
 	 */
+	i = 0;
 	foreach(stmtlist_item, portal->stmts)
 	{
 		PlannedStmt *pstmt = lfirst_node(PlannedStmt, stmtlist_item);
+		ExecPrep *prep = portal->preps ?
+			list_nth(portal->preps, i) : NULL;
+
+		i++;
 
 		/*
 		 * If we got a cancel signal in prior command, quit
@@ -1265,7 +1286,7 @@ PortalRunMulti(Portal portal,
 			if (pstmt->canSetTag)
 			{
 				/* statement can set tag string */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, prep,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
@@ -1274,7 +1295,7 @@ PortalRunMulti(Portal portal,
 			else
 			{
 				/* stmt added by rewrite cannot set tag */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, prep,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index 943da087c9f..313f8ef2fdc 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -284,6 +284,7 @@ PortalDefineQuery(Portal portal,
 				  const char *sourceText,
 				  CommandTag commandTag,
 				  List *stmts,
+				  List *preps,
 				  CachedPlan *cplan)
 {
 	Assert(PortalIsValid(portal));
@@ -298,6 +299,7 @@ PortalDefineQuery(Portal portal,
 	portal->qc.nprocessed = 0;
 	portal->commandTag = commandTag;
 	portal->stmts = stmts;
+	portal->preps = preps;
 	portal->cplan = cplan;
 	portal->status = PORTAL_DEFINED;
 }
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 6e51d50efc7..6aa8b275aa2 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -63,7 +63,8 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
 							  ExplainState *es, ParseState *pstate,
 							  ParamListInfo params);
 
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt, ExecPrep *prep,
+						   IntoClause *into,
 						   ExplainState *es, const char *queryString,
 						   ParamListInfo params, QueryEnvironment *queryEnv,
 						   const instr_time *planduration,
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index 86db3dc8d0d..c18530f5d11 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -18,7 +18,6 @@
 #include "nodes/execnodes.h"
 #include "tcop/dest.h"
 
-
 /* ----------------
  *		query descriptor:
  *
@@ -35,6 +34,7 @@ typedef struct QueryDesc
 	/* These fields are provided by CreateQueryDesc */
 	CmdType		operation;		/* CMD_SELECT, CMD_UPDATE, etc. */
 	PlannedStmt *plannedstmt;	/* planner's output (could be utility, too) */
+	ExecPrep *prep;				/* output of ExecutorPrep() or NULL */
 	const char *sourceText;		/* source text of the query */
 	Snapshot	snapshot;		/* snapshot to use for query */
 	Snapshot	crosscheck_snapshot;	/* crosscheck for RI update/delete */
@@ -57,6 +57,7 @@ typedef struct QueryDesc
 
 /* in pquery.c */
 extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
+								  ExecPrep *prep,
 								  const char *sourceText,
 								  Snapshot snapshot,
 								  Snapshot crosscheck_snapshot,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index fa2b657fb2f..3579926d4e8 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -20,6 +20,7 @@
 #include "nodes/lockoptions.h"
 #include "nodes/parsenodes.h"
 #include "utils/memutils.h"
+#include "utils/resowner.h"
 
 
 /*
@@ -234,6 +235,16 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
  */
 extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
 extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
+
+extern ExecPrep *ExecutorPrep(PlannedStmt *pstmt,
+							  ParamListInfo params,
+							  ResourceOwner owner,
+							  bool do_initial_pruning,
+							  int eflags);
+extern ExecPrep *CreateExecPrep(EState *estate, MemoryContext context,
+								execprep_cleanup_fn cleanup, void *cleanup_arg);
+extern void ExecPrepCleanup(ExecPrep *prep);
+
 extern void ExecutorRun(QueryDesc *queryDesc,
 						ScanDirection direction, uint64 count);
 extern void standard_ExecutorRun(QueryDesc *queryDesc,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 18ae8f0d4bb..8bdecd631bf 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -772,6 +772,54 @@ typedef struct EState
 	List	   *es_insert_pending_modifytables;
 } EState;
 
+/*
+ * ExecPrep: encapsulates executor preparation results for a PlannedStmt.
+ *
+ * ExecutorPrep() factors out executor setup steps such as initializing the
+ * range table, checking permissions, and executing initial partition pruning.
+ * ExecutorStart() can reuse the prepared EState instead of repeating that
+ * work, and other callers (such as plan cache validation) can use it without
+ * running the full plan.
+ */
+
+/*
+ * Optional callback to clean up user-specific resources associated with
+ * ExecPrep.
+ */
+typedef void (*execprep_cleanup_fn)(void *prep);
+
+typedef struct ExecPrep
+{
+	/*
+	 * Context in which this struct and all subsidiary allocations were made.
+	 * This context must remain alive until ExecPrepCleanup is called.
+	 */
+	MemoryContext context;
+
+	/*
+	 * Partially-initialized executor state used for permission checks and
+	 * pruning. May be adopted directly by ExecutorStart(), in which case
+	 * ExecPrepCleanup will skip freeing it.
+	 */
+	EState	   *prep_estate;
+
+	/*
+	 * True if ExecPrepCleanup() must free the EState.  If the executor adopts
+	 * prep_estate, this is set to false to avoid double-free.
+	 */
+	bool		owns_estate;
+
+	/*
+	 * Optional caller-supplied cleanup hook to run during ExecPrepCleanup.
+	 * Useful for releasing external resources associated with the prep.
+	 */
+	execprep_cleanup_fn cleanup;
+
+	/*
+	 * Opaque pointer to pass to the cleanup hook.
+	 */
+	void	   *cleanup_arg;
+} ExecPrep;
 
 /*
  * ExecRowMark -
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index 5ffa6fd5cc8..013bcc3bd8e 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -137,6 +137,7 @@ typedef struct PortalData
 	CommandTag	commandTag;		/* command tag for original query */
 	QueryCompletion qc;			/* command completion data for executed query */
 	List	   *stmts;			/* list of PlannedStmts */
+	List	   *preps;			/* list of ExecPreps where needed */
 	CachedPlan *cplan;			/* CachedPlan, if stmts are from one */
 
 	ParamListInfo portalParams; /* params to pass to query */
@@ -240,6 +241,7 @@ extern void PortalDefineQuery(Portal portal,
 							  const char *sourceText,
 							  CommandTag commandTag,
 							  List *stmts,
+							  List *preps,
 							  CachedPlan *cplan);
 extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
 extern void PortalCreateHoldStore(Portal portal);
-- 
2.47.3
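
The ownership rule described in the ExecPrep comments above (the cleanup routine frees the EState only while owns_estate is true, and runs the optional hook with cleanup_arg) can be sketched in isolation. This is a minimal sketch under stated assumptions, not the patch's implementation: MockEState, MockPrep, and every mock_* function are hypothetical stand-ins, plain malloc/free replaces PostgreSQL memory contexts, and the order of hook versus free is a guess.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for EState; not the PostgreSQL type. */
typedef struct MockEState
{
	int			unused;
} MockEState;

/* Hypothetical hook type; receives the opaque cleanup_arg. */
typedef void (*mock_cleanup_fn) (void *arg);

typedef struct MockPrep
{
	MockEState *estate;			/* partially initialized state */
	bool		owns_estate;	/* true until someone adopts 'estate' */
	mock_cleanup_fn cleanup;	/* optional hook, may be NULL */
	void	   *cleanup_arg;	/* passed to 'cleanup' */
} MockPrep;

/* Build a prep that initially owns its state. */
static MockPrep *
mock_prep_create(mock_cleanup_fn cleanup, void *cleanup_arg)
{
	MockPrep   *prep = malloc(sizeof(MockPrep));

	if (prep == NULL)
		abort();
	prep->estate = calloc(1, sizeof(MockEState));
	if (prep->estate == NULL)
		abort();
	prep->owns_estate = true;
	prep->cleanup = cleanup;
	prep->cleanup_arg = cleanup_arg;
	return prep;
}

/* The "executor" takes over the state; cleanup must no longer free it. */
static MockEState *
mock_prep_adopt_estate(MockPrep *prep)
{
	assert(prep != NULL && prep->owns_estate);
	prep->owns_estate = false;
	return prep->estate;
}

/* Run the hook if present, then free only what the prep still owns. */
static void
mock_prep_cleanup(MockPrep *prep)
{
	if (prep->cleanup != NULL)
		prep->cleanup(prep->cleanup_arg);
	if (prep->owns_estate)
		free(prep->estate);
	free(prep);
}

/* Example hook: counts how many times it was invoked. */
static void
mock_count_cleanup(void *arg)
{
	(*(int *) arg)++;
}
```

After mock_prep_adopt_estate(), the adopter is responsible for freeing the state, which is the double-free hazard the owns_estate flag is meant to avoid.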



  [application/octet-stream] v4-0003-Reuse-partition-pruning-results-in-parallel-worke.patch (9.1K, 3-v4-0003-Reuse-partition-pruning-results-in-parallel-worke.patch)
From 695b2d630d1e0812de9e3d227a56fadf21a8b61a Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Tue, 11 Nov 2025 22:17:47 +0900
Subject: [PATCH v4 3/6] Reuse partition pruning results in parallel workers

Pass the leader's initial partition pruning results and unpruned
relids to parallel workers and reuse them via ExecutorPrep(). This
avoids repeating pruning logic in workers, which is not only
redundant but also risks divergence due to nondeterminism in pruning
steps or parameter evaluation timing.

Introduce ExecCheckInitialPruningResults() to verify that the results
match what the worker would compute. This check helps catch
inconsistencies across leader and worker pruning logic.

While valuable on its own, this change also lays the foundation for
future optimizations where the leader may take locks only on
surviving partitions. Ensuring that workers follow identical pruning
decisions makes such selective locking safe.
---
 src/backend/executor/execParallel.c  | 67 +++++++++++++++++++++++++++-
 src/backend/executor/execPartition.c | 35 +++++++++++++++
 src/include/executor/execPartition.h |  1 +
 3 files changed, 102 insertions(+), 1 deletion(-)

diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index aedbd9566d6..751590adcc9 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -24,6 +24,7 @@
 #include "postgres.h"
 
 #include "executor/execParallel.h"
+#include "executor/execPartition.h"
 #include "executor/executor.h"
 #include "executor/nodeAgg.h"
 #include "executor/nodeAppend.h"
@@ -65,6 +66,8 @@
 #define PARALLEL_KEY_QUERY_TEXT		UINT64CONST(0xE000000000000008)
 #define PARALLEL_KEY_JIT_INSTRUMENTATION UINT64CONST(0xE000000000000009)
 #define PARALLEL_KEY_WAL_USAGE			UINT64CONST(0xE00000000000000A)
+#define PARALLEL_KEY_PARTITION_PRUNE_RESULTS	UINT64CONST(0xE00000000000000B)
+#define PARALLEL_KEY_UNPRUNED_RELIDS	UINT64CONST(0xE00000000000000C)
 
 #define PARALLEL_TUPLE_QUEUE_SIZE		65536
 
@@ -608,12 +611,18 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	FixedParallelExecutorState *fpes;
 	char	   *pstmt_data;
 	char	   *pstmt_space;
+	char	   *part_prune_results_data;
+	char	   *part_prune_results_space;
+	char	   *unpruned_relids_data;
+	char	   *unpruned_relids_space;
 	char	   *paramlistinfo_space;
 	BufferUsage *bufusage_space;
 	WalUsage   *walusage_space;
 	SharedExecutorInstrumentation *instrumentation = NULL;
 	SharedJitInstrumentation *jit_instrumentation = NULL;
 	int			pstmt_len;
+	int			part_prune_results_len;
+	int			unpruned_relids_len;
 	int			paramlistinfo_len;
 	int			instrumentation_len = 0;
 	int			jit_instrumentation_len = 0;
@@ -642,6 +651,8 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 
 	/* Fix up and serialize plan to be sent to workers. */
 	pstmt_data = ExecSerializePlan(planstate->plan, estate);
+	part_prune_results_data = nodeToString(estate->es_part_prune_results);
+	unpruned_relids_data = nodeToString(estate->es_unpruned_relids);
 
 	/* Create a parallel context. */
 	pcxt = CreateParallelContext("postgres", "ParallelQueryMain", nworkers);
@@ -668,6 +679,16 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	shm_toc_estimate_chunk(&pcxt->estimator, pstmt_len);
 	shm_toc_estimate_keys(&pcxt->estimator, 1);
 
+	/* Estimate space for serialized part_prune_results. */
+	part_prune_results_len = strlen(part_prune_results_data) + 1;
+	shm_toc_estimate_chunk(&pcxt->estimator, part_prune_results_len);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Estimate space for serialized unpruned_relids. */
+	unpruned_relids_len = strlen(unpruned_relids_data) + 1;
+	shm_toc_estimate_chunk(&pcxt->estimator, unpruned_relids_len);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
 	/* Estimate space for serialized ParamListInfo. */
 	paramlistinfo_len = EstimateParamListSpace(estate->es_param_list_info);
 	shm_toc_estimate_chunk(&pcxt->estimator, paramlistinfo_len);
@@ -769,6 +790,16 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	memcpy(pstmt_space, pstmt_data, pstmt_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PLANNEDSTMT, pstmt_space);
 
+	/* Store serialized part_prune_results */
+	part_prune_results_space = shm_toc_allocate(pcxt->toc, part_prune_results_len);
+	memcpy(part_prune_results_space, part_prune_results_data, part_prune_results_len);
+	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARTITION_PRUNE_RESULTS, part_prune_results_space);
+
+	/* Store serialized unpruned_relids */
+	unpruned_relids_space = shm_toc_allocate(pcxt->toc, unpruned_relids_len);
+	memcpy(unpruned_relids_space, unpruned_relids_data, unpruned_relids_len);
+	shm_toc_insert(pcxt->toc, PARALLEL_KEY_UNPRUNED_RELIDS, unpruned_relids_space);
+
 	/* Store serialized ParamListInfo. */
 	paramlistinfo_space = shm_toc_allocate(pcxt->toc, paramlistinfo_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARAMLISTINFO, paramlistinfo_space);
@@ -1263,10 +1294,15 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 						 int instrument_options)
 {
 	char	   *pstmtspace;
+	char	   *part_prune_results_space;
+	char	   *unpruned_relids_space;
 	char	   *paramspace;
 	PlannedStmt *pstmt;
+	List	   *part_prune_results;
+	Bitmapset  *unpruned_relids;
 	ParamListInfo paramLI;
 	char	   *queryString;
+	ExecPrep   *prep = NULL;
 
 	/* Get the query string from shared memory */
 	queryString = shm_toc_lookup(toc, PARALLEL_KEY_QUERY_TEXT, false);
@@ -1279,9 +1315,38 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 	paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
 	paramLI = RestoreParamList(&paramspace);
 
+	/* Reconstruct leader-supplied part_prune_results and unpruned_relids. */
+	part_prune_results_space =
+		shm_toc_lookup(toc, PARALLEL_KEY_PARTITION_PRUNE_RESULTS, false);
+	part_prune_results = (List *) stringToNode(part_prune_results_space);
+	unpruned_relids_space =
+		shm_toc_lookup(toc, PARALLEL_KEY_UNPRUNED_RELIDS, false);
+	unpruned_relids = (Bitmapset *) stringToNode(unpruned_relids_space);
+
+	/*
+	 * If pruning was done in the leader, build a prep estate in the worker
+	 * and inject the leader's pruning results into it for reuse.
+	 */
+	if (pstmt->partPruneInfos)
+	{
+		prep = ExecutorPrep(pstmt, paramLI, CurrentResourceOwner, false, 0);
+		Assert(prep->prep_estate);
+
+		prep->prep_estate->es_part_prune_results = part_prune_results;
+		prep->prep_estate->es_unpruned_relids =
+			bms_add_members(prep->prep_estate->es_unpruned_relids,
+							unpruned_relids);
+
+		/*
+		 * Verify that the pruning results passed from the leader match
+		 * what the worker would independently compute.
+		 */
+		ExecCheckInitialPruningResults(prep->prep_estate);
+	}
+
 	/* Create a QueryDesc for the query. */
 	return CreateQueryDesc(pstmt,
-						   NULL,
+						   prep,
 						   queryString,
 						   GetActiveSnapshot(), InvalidSnapshot,
 						   receiver, paramLI, NULL, instrument_options);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index ac5e2ebee72..dc4eac8a0a7 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -1873,6 +1873,41 @@ ExecDoInitialPruning(EState *estate)
 	}
 }
 
+/*
+ * ExecCheckInitialPruningResults
+ *      Verify partition pruning results passed from the leader process.
+ *
+ * This is intended to be called during parallel worker query setup.
+ * It recomputes initial pruning results locally and compares them with
+ * those received from the leader. Any mismatch may indicate a divergence
+ * between leader and worker logic or environment.
+ */
+void
+ExecCheckInitialPruningResults(EState *estate)
+{
+	ListCell   *lc;
+	int			i;
+
+	Assert(estate->es_part_prune_results != NULL);
+	i = 0;
+	foreach(lc, estate->es_part_prune_states)
+	{
+		PartitionPruneState *prunestate = (PartitionPruneState *) lfirst(lc);
+		Bitmapset *reuse_validsubplans =
+				list_nth_node(Bitmapset, estate->es_part_prune_results, i++);
+		Bitmapset  *validsubplans = NULL;
+		Bitmapset  *validsubplan_rtis = NULL;
+
+		if (prunestate->do_initial_prune)
+			validsubplans = ExecFindMatchingSubPlans(prunestate, true,
+													 &validsubplan_rtis);
+		if (bms_nonempty_difference(validsubplans, reuse_validsubplans))
+			elog(ERROR, "different validsubplans in parallel worker");
+		if (bms_nonempty_difference(validsubplan_rtis, estate->es_unpruned_relids))
+			elog(ERROR, "different unpruned_relids in parallel worker");
+	}
+}
+
 /*
  * ExecInitPartitionExecPruning
  *		Initialize the data structures needed for runtime "exec" partition
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index ba8cc594fc9..126efd008e5 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -132,6 +132,7 @@ typedef struct PartitionPruneState
 
 extern void ExecCreatePartitionPruneStates(EState *estate);
 extern void ExecDoInitialPruning(EState *estate);
+extern void ExecCheckInitialPruningResults(EState *estate);
 extern PartitionPruneState *ExecInitPartitionExecPruning(PlanState *planstate,
 														 int n_total_subplans,
 														 int part_prune_index,
-- 
2.47.3
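
One property of the check in ExecCheckInitialPruningResults() is worth spelling out: bms_nonempty_difference(a, b) reports whether a has members that are not in b, so each test is one-directional. The sketch below illustrates that with 64-bit masks; mask_of, mask_nonempty_difference, and mask_equal are hypothetical helpers for illustration only, not PostgreSQL functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: a one-member set for member numbers 0..63. */
static uint64_t
mask_of(int member)
{
	assert(member >= 0 && member < 64);
	return UINT64_C(1) << member;
}

/* True if 'a' has at least one member that 'b' lacks (a \ b is nonempty). */
static bool
mask_nonempty_difference(uint64_t a, uint64_t b)
{
	return (a & ~b) != 0;
}

/* True only if both sets have exactly the same members. */
static bool
mask_equal(uint64_t a, uint64_t b)
{
	return a == b;
}
```

With a worker set {0, 2} and a leader set {0, 1, 2}, mask_nonempty_difference(worker, leader) is false even though the sets differ, so a check written this way catches subplans the worker keeps that the leader pruned, but not the reverse. Whether that asymmetry is intended in the patch is not stated in the commit message.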



  [application/octet-stream] v4-0006-Make-SQL-function-executor-track-ExecutorPrep-sta.patch (6.7K, 4-v4-0006-Make-SQL-function-executor-track-ExecutorPrep-sta.patch)
From 5dc90ce54c7108d5335003da4f247a65803e42e7 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Mon, 17 Nov 2025 17:40:26 +0900
Subject: [PATCH v4 6/6] Make SQL function executor track ExecutorPrep state

Extend the SQL function executor to use the ExecutorPrep results
returned by GetCachedPlan().  init_execution_state() now passes a
CachedPlanPrepData to GetCachedPlan() and stores the per-statement
ExecPrep pointers in the execution_state nodes.

At execution time, postquel_start() reparents the prep estate's
es_query_cxt under the function's subcontext so that prep state
follows the usual per-call context hierarchy.

This allows SQL language functions to participate in the same
ExecutorPrep machinery as other plan cache users, which a later
patch will use to support pruning-aware locking.

Add a regression test where rule rewrite expands a single UPDATE
into multiple PlannedStmts, exercising the SQL function plan cache
and the generic plan reuse path that now invokes ExecutorPrep.
---
 src/backend/executor/functions.c        | 32 +++++++++++++++++++++++--
 src/test/regress/expected/plancache.out | 30 +++++++++++++++++++++++
 src/test/regress/sql/plancache.sql      | 27 +++++++++++++++++++++
 3 files changed, 87 insertions(+), 2 deletions(-)

diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index d81718ea84e..ed7352fce61 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -72,6 +72,7 @@ typedef struct execution_state
 	bool		setsResult;		/* true if this query produces func's result */
 	bool		lazyEval;		/* true if should fetch one row at a time */
 	PlannedStmt *stmt;			/* plan for this query */
+	ExecPrep   *prep;			/* ExecutorPrep() output for this plan */
 	QueryDesc  *qd;				/* null unless status == RUN */
 } execution_state;
 
@@ -657,6 +658,8 @@ init_execution_state(SQLFunctionCachePtr fcache)
 	execution_state *lasttages = NULL;
 	int			nstmts;
 	ListCell   *lc;
+	CachedPlanPrepData cprep = {0};
+	int			i;
 
 	/*
 	 * Clean up after previous query, if there was one.
@@ -695,11 +698,20 @@ init_execution_state(SQLFunctionCachePtr fcache)
 	 * CurrentResourceOwner will be the same when ShutdownSQLFunction runs.)
 	 */
 	fcache->cowner = CurrentResourceOwner;
+
+	/*
+	 * Have ExecutorPrep() allocate under fcache->fcontext.  The prep
+	 * EStates it creates will initially live there; postquel_start()
+	 * will later reparent their es_query_cxt into fcache->subcontext
+	 * when using them for execution.
+	 */
+	cprep.context = fcache->fcontext;
+	cprep.owner = fcache->cowner;
 	fcache->cplan = GetCachedPlan(plansource,
 								  fcache->paramLI,
 								  fcache->cowner,
 								  NULL,
-								  NULL);
+								  &cprep);
 
 	/*
 	 * If necessary, make esarray[] bigger to hold the needed state.
@@ -720,9 +732,12 @@ init_execution_state(SQLFunctionCachePtr fcache)
 	/*
 	 * Build execution_state list to match the number of contained plans.
 	 */
+	i = 0;
 	foreach(lc, fcache->cplan->stmt_list)
 	{
 		PlannedStmt *stmt = lfirst_node(PlannedStmt, lc);
+		ExecPrep *prep = cprep.prep_list ? list_nth(cprep.prep_list, i++) :
+			NULL;
 		execution_state *newes;
 
 		/*
@@ -764,6 +779,7 @@ init_execution_state(SQLFunctionCachePtr fcache)
 		newes->setsResult = false;	/* might change below */
 		newes->lazyEval = false;	/* might change below */
 		newes->stmt = stmt;
+		newes->prep = prep;
 		newes->qd = NULL;
 
 		if (stmt->canSetTag)
@@ -1362,8 +1378,20 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
 	else
 		dest = None_Receiver;
 
+	if (es->prep)
+	{
+		/*
+		 * Prep EStates were built under fcache->fcontext.  For execution,
+		 * make their es_query_cxt a child of fcache->subcontext so they
+		 * follow the usual per-call lifetime.
+		 */
+		EState *prep_estate = es->prep->prep_estate;
+
+		MemoryContextSetParent(prep_estate->es_query_cxt, fcache->subcontext);
+	}
+
 	es->qd = CreateQueryDesc(es->stmt,
-							 NULL,
+							 es->prep,
 							 fcache->func->src,
 							 GetActiveSnapshot(),
 							 InvalidSnapshot,
diff --git a/src/test/regress/expected/plancache.out b/src/test/regress/expected/plancache.out
index 26c4c5e10fd..bf937364716 100644
--- a/src/test/regress/expected/plancache.out
+++ b/src/test/regress/expected/plancache.out
@@ -458,4 +458,34 @@ NOTICE:  creating index on partition inval_during_pruning_p1
 
 drop table inval_during_pruning_p, inval_during_pruning_signal;
 drop function invalidate_plancache_func, stable_pruning_val;
+-- exercise sql-function plan cache when rewrite expands a single statement
+-- into multiple planned statements. this forces cachedplan->stmt_list to
+-- contain more than one entry and checks that executor state for the first
+-- rewritten statement does not destroy state needed by the second one.
+set plan_cache_mode = force_generic_plan;
+create table sqlf_base(id int, val int);
+create table sqlf_log(id int, note text);
+insert into sqlf_base values (1, 10);
+create rule sqlf_base_upd_log as
+on update to sqlf_base do also
+    insert into sqlf_log(id, note)
+    values (new.id, 'logged by rule');
+create or replace function sqlf_execprep_test(a int, v int)
+returns void
+language sql
+as $$
+    update sqlf_base set val = v where id = a;
+$$;
+select sqlf_execprep_test(1, 20);
+ sqlf_execprep_test 
+--------------------
+ 
+(1 row)
+
+select sqlf_execprep_test(1, 30);
+ sqlf_execprep_test 
+--------------------
+ 
+(1 row)
+
 reset plan_cache_mode;
diff --git a/src/test/regress/sql/plancache.sql b/src/test/regress/sql/plancache.sql
index cc7eb4da4d3..71320799040 100644
--- a/src/test/regress/sql/plancache.sql
+++ b/src/test/regress/sql/plancache.sql
@@ -272,4 +272,31 @@ explain (verbose, costs off) execute inval_during_pruning_q;
 drop table inval_during_pruning_p, inval_during_pruning_signal;
 drop function invalidate_plancache_func, stable_pruning_val;
 
+-- exercise sql-function plan cache when rewrite expands a single statement
+-- into multiple planned statements. this forces cachedplan->stmt_list to
+-- contain more than one entry and checks that executor state for the first
+-- rewritten statement does not destroy state needed by the second one.
+
+set plan_cache_mode = force_generic_plan;
+
+create table sqlf_base(id int, val int);
+create table sqlf_log(id int, note text);
+
+insert into sqlf_base values (1, 10);
+
+create rule sqlf_base_upd_log as
+on update to sqlf_base do also
+    insert into sqlf_log(id, note)
+    values (new.id, 'logged by rule');
+
+create or replace function sqlf_execprep_test(a int, v int)
+returns void
+language sql
+as $$
+    update sqlf_base set val = v where id = a;
+$$;
+
+select sqlf_execprep_test(1, 20);
+select sqlf_execprep_test(1, 30);
+
 reset plan_cache_mode;
-- 
2.47.3
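
The reparenting step in postquel_start() relies on a child memory context being destroyed together with its parent. The following is a minimal sketch of that idea using a hand-rolled context tree; MockContext and the mock_* functions are hypothetical and only imitate the parent/child bookkeeping, not PostgreSQL's allocator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a memory context node. */
typedef struct MockContext
{
	struct MockContext *parent;
	struct MockContext *firstchild;
	struct MockContext *nextsibling;
	bool		deleted;
} MockContext;

/* Detach 'cxt' from its current parent's child list, if it has a parent. */
static void
mock_unlink(MockContext *cxt)
{
	MockContext **link;

	if (cxt->parent == NULL)
		return;
	link = &cxt->parent->firstchild;
	while (*link != cxt)
		link = &(*link)->nextsibling;
	*link = cxt->nextsibling;
	cxt->parent = NULL;
	cxt->nextsibling = NULL;
}

/* Move 'cxt' under 'new_parent' (or make it parentless if NULL). */
static void
mock_set_parent(MockContext *cxt, MockContext *new_parent)
{
	assert(cxt != new_parent);
	mock_unlink(cxt);
	if (new_parent != NULL)
	{
		cxt->parent = new_parent;
		cxt->nextsibling = new_parent->firstchild;
		new_parent->firstchild = cxt;
	}
}

/* Mark 'cxt' and all of its descendants as deleted. */
static void
mock_delete(MockContext *cxt)
{
	MockContext *child;

	for (child = cxt->firstchild; child != NULL; child = child->nextsibling)
		mock_delete(child);
	cxt->deleted = true;
}
```

In this model, a query context created under the function-lifetime context and then moved under the per-call subcontext is torn down when the subcontext is deleted, while the function-lifetime context survives.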



  [application/octet-stream] v4-0004-Use-pruning-aware-locking-in-cached-plans.patch (24.5K, 5-v4-0004-Use-pruning-aware-locking-in-cached-plans.patch)
From f3c07bcc5a14a0b751d82771c97c95775cea2758 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Tue, 11 Nov 2025 22:30:52 +0900
Subject: [PATCH v4 4/6] Use pruning-aware locking in cached plans

Extend GetCachedPlan() to perform ExecutorPrep() on each planned
statement, capturing unpruned relids and initial pruning results.
Use this data to acquire execution locks only on surviving partitions,
avoiding unnecessary locking of pruned tables even when using cached
plans.

Introduce CachedPlanPrepData to carry ExecutorPrep results
through the plan caching layer. Adjust call sites in SPI,
functions, portals, and EXPLAIN to propagate this data.

This ensures pruning decisions made during initial pruning are
consistently reused without redoing pruning logic in executor paths
like parallel workers. It also lays the groundwork for
pruning-dependent lock behavior during plan reuse.

To maintain correctness when all target partitions are pruned, also
reinstate the firstResultRel locking behavior lost in commit
28317de72. That commit required the first ModifyTable target to
remain initialized for executor assumptions to hold. We now
explicitly track these relids in PlannerGlobal and PlannedStmt so they
are locked even if pruned, preserving that rule across cached plan
reuse.
---
 src/backend/commands/prepare.c         |  19 +-
 src/backend/executor/functions.c       |   1 +
 src/backend/executor/nodeModifyTable.c |   4 +-
 src/backend/executor/spi.c             |  26 ++-
 src/backend/optimizer/plan/planner.c   |   1 +
 src/backend/optimizer/plan/setrefs.c   |   3 +
 src/backend/tcop/postgres.c            |   9 +-
 src/backend/utils/cache/plancache.c    | 234 ++++++++++++++++++++++++-
 src/include/nodes/pathnodes.h          |   3 +
 src/include/nodes/plannodes.h          |  10 ++
 src/include/utils/plancache.h          |  24 ++-
 11 files changed, 313 insertions(+), 21 deletions(-)

diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index afd449c73ba..23332d19b37 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -154,6 +154,7 @@ ExecuteQuery(ParseState *pstate,
 {
 	PreparedStatement *entry;
 	CachedPlan *cplan;
+	CachedPlanPrepData cprep = {0};
 	List	   *plan_list;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
@@ -193,7 +194,10 @@ ExecuteQuery(ParseState *pstate,
 									   entry->plansource->query_string);
 
 	/* Replan if needed, and increment plan refcount for portal */
-	cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL);
+	/* Keep ExecutorPrep state with the portal and its resowner. */
+	cprep.context = portal->portalContext;
+	cprep.owner = portal->resowner;
+	cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL, &cprep);
 	plan_list = cplan->stmt_list;
 
 	/*
@@ -205,7 +209,7 @@ ExecuteQuery(ParseState *pstate,
 					  query_string,
 					  entry->plansource->commandTag,
 					  plan_list,
-					  NIL,
+					  cprep.prep_list,
 					  cplan);
 
 	/*
@@ -575,6 +579,7 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	PreparedStatement *entry;
 	const char *query_string;
 	CachedPlan *cplan;
+	CachedPlanPrepData cprep = {0};
 	List	   *plan_list;
 	List	   *prep_list;
 	ListCell   *p;
@@ -633,8 +638,14 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	}
 
 	/* Replan if needed, and acquire a transient refcount */
+	/* ExecutorPrep state is local to this EXPLAIN EXECUTE call. */
+	cprep.context = CurrentMemoryContext;
+	cprep.owner = CurrentResourceOwner;
+	if (es->generic)
+		cprep.eflags = EXEC_FLAG_EXPLAIN_GENERIC;
 	cplan = GetCachedPlan(entry->plansource, paramLI,
-						  CurrentResourceOwner, pstate->p_queryEnv);
+						  CurrentResourceOwner, pstate->p_queryEnv,
+						  &cprep);
 
 	INSTR_TIME_SET_CURRENT(planduration);
 	INSTR_TIME_SUBTRACT(planduration, planstart);
@@ -653,7 +664,7 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	}
 
 	plan_list = cplan->stmt_list;
-	prep_list = NIL;
+	prep_list = cprep.prep_list;
 
 	/* Explain each query */
 	i = 0;
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index 633310c5f5b..d81718ea84e 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -698,6 +698,7 @@ init_execution_state(SQLFunctionCachePtr fcache)
 	fcache->cplan = GetCachedPlan(plansource,
 								  fcache->paramLI,
 								  fcache->cowner,
+								  NULL,
 								  NULL);
 
 	/*
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index e44f1223886..7de2328021b 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -4671,8 +4671,8 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
 	 * as a reference for building the ResultRelInfo of the target partition.
 	 * In either case, it doesn't matter which result relation is kept, so we
 	 * just keep the first one, if all others have been pruned.  See also,
-	 * ExecDoInitialPruning(), which ensures that this first result relation
-	 * has been locked.
+	 * AcquireExecutorLocksUnpruned(), which ensures that this first result
+	 * relation has been locked.
 	 */
 	i = 0;
 	foreach(l, node->resultRelations)
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 7a3cb944d6f..d580f1e0425 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1579,6 +1579,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 {
 	CachedPlanSource *plansource;
 	CachedPlan *cplan;
+	CachedPlanPrepData cprep = {0};
 	List	   *stmt_list;
 	char	   *query_string;
 	Snapshot	snapshot;
@@ -1659,7 +1660,11 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 	 */
 
 	/* Replan if needed, and increment plan refcount for portal */
-	cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv);
+	/* ExecutorPrep state lives in this portal's context. */
+	cprep.context = portal->portalContext;
+	cprep.owner = portal->resowner;
+	cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv,
+						  &cprep);
 	stmt_list = cplan->stmt_list;
 
 	if (!plan->saved)
@@ -1685,7 +1690,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 					  query_string,
 					  plansource->commandTag,
 					  stmt_list,
-					  NIL,
+					  cprep.prep_list,	/* lives in portalContext */
 					  cplan);
 
 	/*
@@ -2078,6 +2083,7 @@ SPI_plan_get_cached_plan(SPIPlanPtr plan)
 {
 	CachedPlanSource *plansource;
 	CachedPlan *cplan;
+	CachedPlanPrepData cprep = {0};
 	SPICallbackArg spicallbackarg;
 	ErrorContextCallback spierrcontext;
 
@@ -2101,9 +2107,13 @@ SPI_plan_get_cached_plan(SPIPlanPtr plan)
 	error_context_stack = &spierrcontext;
 
 	/* Get the generic plan for the query */
+	/* ExecutorPrep() state lives in caller's active context. */
+	cprep.context = CurrentMemoryContext;
+	cprep.owner = CurrentResourceOwner;
 	cplan = GetCachedPlan(plansource, NULL,
 						  plan->saved ? CurrentResourceOwner : NULL,
-						  _SPI_current->queryEnv);
+						  _SPI_current->queryEnv,
+						  &cprep);
 	Assert(cplan == plansource->gplan);
 
 	/* Pop the error context stack */
@@ -2501,6 +2511,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 		CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
 		List	   *stmt_list;
 		ListCell   *lc2;
+		CachedPlanPrepData cprep = {0};
 		List	   *prep_list;
 		int			i;
 
@@ -2577,11 +2588,16 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 		 * Replan if needed, and increment plan refcount.  If it's a saved
 		 * plan, the refcount must be backed by the plan_owner.
 		 */
+
+		/* ExecutorPrep state is per _SPI_execute_plan call. */
+		cprep.context = CurrentMemoryContext;
+		cprep.owner = CurrentResourceOwner;
 		cplan = GetCachedPlan(plansource, options->params,
-							  plan_owner, _SPI_current->queryEnv);
+							  plan_owner, _SPI_current->queryEnv,
+							  &cprep);
 
 		stmt_list = cplan->stmt_list;
-		prep_list = NIL;
+		prep_list = cprep.prep_list;
 
 		/*
 		 * If we weren't given a specific snapshot to use, and the statement
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index c4fd646b999..4c76e78c1da 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -608,6 +608,7 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 											  glob->prunableRelids);
 	result->permInfos = glob->finalrteperminfos;
 	result->resultRelations = glob->resultRelations;
+	result->firstResultRels = glob->firstResultRels;
 	result->appendRelations = glob->appendRelations;
 	result->subplans = glob->subplans;
 	result->rewindPlanIDs = glob->rewindPlanIDs;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index ccdc9bc264a..229b39060ae 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -1274,6 +1274,9 @@ set_plan_refs(PlannerInfo *root, Plan *plan, int rtoffset)
 						lappend_int(root->glob->resultRelations,
 									splan->rootRelation);
 				}
+				root->glob->firstResultRels =
+					lappend_int(root->glob->firstResultRels,
+								linitial_int(splan->resultRelations));
 			}
 			break;
 		case T_Append:
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 5880a574a06..a96419edcbe 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1639,6 +1639,7 @@ exec_bind_message(StringInfo input_message)
 	int16	   *rformats = NULL;
 	CachedPlanSource *psrc;
 	CachedPlan *cplan;
+	CachedPlanPrepData cprep = {0};
 	Portal		portal;
 	char	   *query_string;
 	char	   *saved_stmt_name;
@@ -2021,7 +2022,11 @@ exec_bind_message(StringInfo input_message)
 	 * will be generated in MessageContext.  The plan refcount will be
 	 * assigned to the Portal, so it will be released at portal destruction.
 	 */
-	cplan = GetCachedPlan(psrc, params, NULL, NULL);
+
+	/* ExecutorPrep() state lives in portal context. */
+	cprep.context = portal->portalContext;
+	cprep.owner = portal->resowner;
+	cplan = GetCachedPlan(psrc, params, NULL, NULL, &cprep);
 
 	/*
 	 * Now we can define the portal.
@@ -2034,7 +2039,7 @@ exec_bind_message(StringInfo input_message)
 					  query_string,
 					  psrc->commandTag,
 					  cplan->stmt_list,
-					  NIL,
+					  cprep.prep_list,
 					  cplan);
 
 	/* Portal is defined, set the plan ID based on its contents. */
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 6661d2c6b73..c1cfd47422c 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -93,7 +93,7 @@ static bool StmtPlanRequiresRevalidation(CachedPlanSource *plansource);
 static bool BuildingPlanRequiresSnapshot(CachedPlanSource *plansource);
 static List *RevalidateCachedQuery(CachedPlanSource *plansource,
 								   QueryEnvironment *queryEnv);
-static bool CheckCachedPlan(CachedPlanSource *plansource);
+static bool PrepAndCheckCachedPlan(CachedPlanSource *plansource, CachedPlanPrepData *cprep);
 static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 								   ParamListInfo boundParams, QueryEnvironment *queryEnv);
 static bool choose_custom_plan(CachedPlanSource *plansource,
@@ -101,6 +101,8 @@ static bool choose_custom_plan(CachedPlanSource *plansource,
 static double cached_plan_cost(CachedPlan *plan, bool include_planner);
 static Query *QueryListGetPrimaryStmt(List *stmts);
 static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static void AcquireExecutorLocksUnpruned(List *stmt_list, bool acquire,
+										 CachedPlanPrepData *cprep);
 static void AcquirePlannerLocks(List *stmt_list, bool acquire);
 static void ScanQueryForLocks(Query *parsetree, bool acquire);
 static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -137,6 +139,26 @@ ResourceOwnerForgetPlanCacheRef(ResourceOwner owner, CachedPlan *plan)
 /* GUC parameter */
 int			plan_cache_mode = PLAN_CACHE_MODE_AUTO;
 
+/*
+ * Lock acquisition policy for execution locks.
+ *
+ * LOCK_ALL acquires locks on all relations mentioned in the plan,
+ * reproducing the behavior of AcquireExecutorLocks().
+ *
+ * LOCK_UNPRUNED restricts locking to only the unpruned relations. That
+ * includes those mentioned in PlannedStmt.unprunableRelids and the leaf
+ * partitions remaining after performing initial pruning.
+ */
+typedef enum LockPolicy
+{
+	LOCK_ALL,
+	LOCK_UNPRUNED,
+} LockPolicy;
+
+static void AcquireExecutorLocksWithPolicy(List *stmt_list,
+										   LockPolicy policy, bool acquire,
+										   CachedPlanPrepData *cprep);
+
 /*
  * InitPlanCache: initialize module during InitPostgres.
  *
@@ -938,7 +960,12 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
 }
 
 /*
- * CheckCachedPlan: see if the CachedPlanSource's generic plan is valid.
+ * PrepAndCheckCachedPlan: see if the CachedPlanSource's generic plan is valid.
+ *
+ * If 'cprep' is not NULL, ExecutorPrep() is applied to each PlannedStmt to
+ * compute the set of partitions that survive initial runtime pruning, so
+ * that only those partitions are locked.  The resulting ExecPrep structures
+ * are saved in cprep for later reuse by ExecutorStart().
  *
  * Caller must have already called RevalidateCachedQuery to verify that the
  * querytree is up to date.
@@ -947,7 +974,7 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
  * (We must do this for the "true" result to be race-condition-free.)
  */
 static bool
-CheckCachedPlan(CachedPlanSource *plansource)
+PrepAndCheckCachedPlan(CachedPlanSource *plansource, CachedPlanPrepData *cprep)
 {
 	CachedPlan *plan = plansource->gplan;
 
@@ -975,13 +1002,15 @@ CheckCachedPlan(CachedPlanSource *plansource)
 	 */
 	if (plan->is_valid)
 	{
+		LockPolicy policy = !cprep ? LOCK_ALL : LOCK_UNPRUNED;
+
 		/*
 		 * Plan must have positive refcount because it is referenced by
 		 * plansource; so no need to fear it disappears under us here.
 		 */
 		Assert(plan->refcount > 0);
 
-		AcquireExecutorLocks(plan->stmt_list, true);
+		AcquireExecutorLocksWithPolicy(plan->stmt_list, policy, true, cprep);
 
 		/*
 		 * If plan was transient, check to see if TransactionXmin has
@@ -1003,7 +1032,7 @@ CheckCachedPlan(CachedPlanSource *plansource)
 		}
 
 		/* Oops, the race case happened.  Release useless locks. */
-		AcquireExecutorLocks(plan->stmt_list, false);
+		AcquireExecutorLocksWithPolicy(plan->stmt_list, policy, false, cprep);
 	}
 
 	/*
@@ -1283,6 +1312,10 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
  * On return, the plan is valid and we have sufficient locks to begin
  * execution.
  *
+ * If 'cprep' is not NULL and a generic plan is reused, the function prepares
+ * each PlannedStmt via ExecutorPrep() and stores the results in
+ * cprep->prep_list.  These are intended to be passed later to ExecutorStart().
+ *
  * On return, the refcount of the plan has been incremented; a later
  * ReleaseCachedPlan() call is expected.  If "owner" is not NULL then
  * the refcount has been reported to that ResourceOwner (note that this
@@ -1293,7 +1326,8 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
  */
 CachedPlan *
 GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
-			  ResourceOwner owner, QueryEnvironment *queryEnv)
+			  ResourceOwner owner, QueryEnvironment *queryEnv,
+			  CachedPlanPrepData *cprep)
 {
 	CachedPlan *plan = NULL;
 	List	   *qlist;
@@ -1315,7 +1349,9 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 
 	if (!customplan)
 	{
-		if (CheckCachedPlan(plansource))
+		if (cprep)
+			cprep->params = boundParams;
+		if (PrepAndCheckCachedPlan(plansource, cprep))
 		{
 			/* We want a generic plan, and we already have a valid one */
 			plan = plansource->gplan;
@@ -1902,6 +1938,38 @@ QueryListGetPrimaryStmt(List *stmts)
 	return NULL;
 }
 
+/*
+ * AcquireExecutorLocksWithPolicy
+ *		Acquire or release execution locks for a cached plan according to
+ *		the specified policy.
+ *
+ * LOCK_ALL reproduces AcquireExecutorLocks(), locking every relation in
+ * each PlannedStmt's rtable.  LOCK_UNPRUNED restricts locking to the
+ * unprunable rels and partitions that survive initial runtime pruning.
+ *
+ * When LOCK_UNPRUNED is used on acquire, ExecutorPrep() is invoked for
+ * each PlannedStmt and the resulting ExecPrep pointers are appended to
+ * cprep->prep_list in cprep->context.  On release, the same ExecPrep
+ * list is consulted to determine which relations to unlock and is then
+ * cleaned up with ExecPrepCleanup().
+ */
+static void
+AcquireExecutorLocksWithPolicy(List *stmt_list, LockPolicy policy, bool acquire,
+							   CachedPlanPrepData *cprep)
+{
+	switch (policy)
+	{
+		case LOCK_ALL:
+			AcquireExecutorLocks(stmt_list, acquire);
+			break;
+		case LOCK_UNPRUNED:
+			AcquireExecutorLocksUnpruned(stmt_list, acquire, cprep);
+			break;
+		default:
+			elog(ERROR, "invalid LockPolicy");
+	}
+}
+
 /*
  * AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
  * or release them if acquire is false.
@@ -1954,6 +2022,158 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 	}
 }
 
+/*
+ * LockRelids
+ * 		Acquire or release locks on the specified relids, which reference
+ * 		entries in the provided range table.
+ *
+ * Helper for AcquireExecutorLocksUnpruned().
+ */
+static void
+LockRelids(List *rtable, Bitmapset *relids, bool acquire)
+{
+	int	rtindex = -1;
+
+	while ((rtindex = bms_next_member(relids, rtindex)) >= 0)
+	{
+		RangeTblEntry *rte = list_nth_node(RangeTblEntry, rtable, rtindex - 1);
+
+		Assert(rte->rtekind == RTE_RELATION ||
+			   (rte->rtekind == RTE_SUBQUERY && OidIsValid(rte->relid)));
+
+		/*
+		 * Acquire the appropriate type of lock on each relation OID. Note
+		 * that we don't actually try to open the rel, and hence will not
+		 * fail if it's been dropped entirely --- we'll just transiently
+		 * acquire a non-conflicting lock.
+		 */
+		if (acquire)
+			LockRelationOid(rte->relid, rte->rellockmode);
+		else
+			UnlockRelationOid(rte->relid, rte->rellockmode);
+	}
+}
+
+/*
+ * AcquireExecutorLocksUnpruned
+ *		Acquire or release execution locks for only unpruned relations
+ *		referenced by the given PlannedStmts.
+ *
+ * On acquire, this:
+ *	- locks unprunable rels listed in PlannedStmt.unprunableRelids
+ *	- runs ExecutorPrep() to perform initial runtime pruning
+ *	- locks the surviving partitions reported in the prep estate
+ *	- appends the ExecPrep pointer for each PlannedStmt to cprep->prep_list
+ *
+ * On release, it:
+ *	- looks up the ExecPrep object for each PlannedStmt from cprep->prep_list
+ *	  (which must already be populated)
+ *	- unlocks the same relations identified during acquire
+ *	- calls ExecPrepCleanup() on each ExecPrep
+ *
+ * prep_list is extended during acquire and must match stmt_list one-to-one
+ * when releasing locks.  Memory allocation for ExecPrep happens in
+ * cprep->context.  Locks are acquired using cprep->owner.
+ */
+
+static void
+AcquireExecutorLocksUnpruned(List *stmt_list, bool acquire,
+							 CachedPlanPrepData *cprep)
+{
+	MemoryContext oldcontext = MemoryContextSwitchTo(cprep->context);
+	ListCell   *lc1;
+	List	   *prep_list;
+	int			i;
+
+	Assert(cprep);
+
+	/*
+	 * When releasing locks, use the ExecPrep list (if any) created during
+	 * acquisition to determine which relids to unlock. The list must match
+	 * the PlannedStmt list one-to-one.
+	 */
+	prep_list = cprep->prep_list;
+	Assert(acquire || list_length(prep_list) == list_length(stmt_list));
+
+	i = 0;
+	foreach(lc1, stmt_list)
+	{
+		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
+		ExecPrep *prep;
+
+		if (plannedstmt->commandType == CMD_UTILITY)
+		{
+			/* Same as AcquireExecutorLocks(). */
+			Query	   *query = UtilityContainsQuery(plannedstmt->utilityStmt);
+
+			if (query)
+				ScanQueryForLocks(query, acquire);
+
+			/* Keep the list one-to-one with stmt_list. */
+			if (acquire)
+				cprep->prep_list = lappend(cprep->prep_list, NULL);
+			continue;
+		}
+
+		/*
+		 * Lock tables mentioned in the original query and other unprunable
+		 * relations that were added to the plan via inheritance expansion.
+		 */
+		LockRelids(plannedstmt->rtable, plannedstmt->unprunableRelids, acquire);
+
+		/* Lock partitions surviving runtime initial pruning. */
+		if (acquire)
+		{
+			prep = ExecutorPrep(plannedstmt, cprep->params, cprep->owner, true,
+								cprep->eflags);
+			Assert(prep || plannedstmt->partPruneInfos == NULL);
+			cprep->prep_list = lappend(cprep->prep_list, prep);
+		}
+		else
+			prep = list_nth(prep_list, i++);
+
+		Assert(prep == NULL || prep->prep_estate);
+		if (prep)
+		{
+			EState *prep_estate = prep->prep_estate;
+
+			/*
+			 * es_unpruned_relids includes plannedstmt->unprunableRelids,
+			 * which we've already locked. Filter them out to avoid double-locking.
+			 */
+			Bitmapset *lock_relids = bms_difference(prep_estate->es_unpruned_relids,
+													plannedstmt->unprunableRelids);
+
+			/*
+			 * firstResultRels may contain pruned partitions that must still be
+			 * locked to satisfy executor assumptions (see comments in
+			 * ExecInitModifyTable()).  Ensure they're included here.
+			 */
+			if (plannedstmt->resultRelations)
+			{
+				ListCell *lc2;
+
+				foreach(lc2, plannedstmt->firstResultRels)
+				{
+					Index       firstResultRel = lfirst_int(lc2);
+
+					if (!bms_is_member(firstResultRel, lock_relids))
+						lock_relids = bms_add_member(lock_relids, firstResultRel);
+				}
+			}
+
+			LockRelids(plannedstmt->rtable, lock_relids, acquire);
+			bms_free(lock_relids);
+		}
+
+		/* Clean up prep if releasing locks. */
+		if (!acquire)
+			ExecPrepCleanup(prep);
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+}
+
 /*
  * AcquirePlannerLocks: acquire locks needed for planning of a querytree list;
  * or release them if acquire is false.
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 46a8655621d..5af4c31f53a 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -141,6 +141,9 @@ typedef struct PlannerGlobal
 	/* "flat" list of integer RT indexes */
 	List	   *resultRelations;
 
+	/* "flat" list of integer RT indexes (one per ModifyTable node) */
+	List	   *firstResultRels;
+
 	/* "flat" list of AppendRelInfos */
 	List	   *appendRelations;
 
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index c4393a94321..eb211f1ba56 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -123,6 +123,16 @@ typedef struct PlannedStmt
 	/* integer list of RT indexes, or NIL */
 	List	   *resultRelations;
 
+	/*
+	 * rtable indexes of first target relation in each ModifyTable node in the
+	 * plan for INSERT/UPDATE/DELETE/MERGE.  NIL if resultRelations is NIL.
+	 *
+	 * These are used by AcquireExecutorLocksUnpruned() to ensure that the
+	 * first result rel for each ModifyTable remains locked even if pruned;
+	 * see ExecInitModifyTable() for the executor side assumptions.
+	 */
+	List	   *firstResultRels;
+
 	/* list of AppendRelInfo nodes */
 	List	   *appendRelations;
 
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index a82b66d4bc2..c7b8ec4be39 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -197,6 +197,27 @@ typedef struct CachedExpression
 } CachedExpression;
 
 
+/*
+ * CachedPlanPrepData
+ *      Carries ExecutorPrep results for each PlannedStmt in a CachedPlan,
+ *      along with context and owner information needed to allocate them.
+ *
+ * prep_list is indexed one-to-one with CachedPlan->stmt_list, and is
+ * populated when GetCachedPlan() prepares a reused generic plan.  The
+ * same list is later used to determine which relations to unlock when
+ * releasing execution locks.
+ *
+ * ExecutorPrep state is allocated in 'context' and owned by 'owner'.
+ */
+typedef struct CachedPlanPrepData
+{
+	List   *prep_list;		/* one ExecPrep per PlannedStmt, or NULL */
+	ParamListInfo params;	/* params visible to ExecutorPrep */
+	MemoryContext context;	/* where to allocate ExecPrep objects */
+	ResourceOwner owner;	/* ResourceOwner for ExecutorPrep state */
+	int		eflags;			/* executor flags to pass to ExecutorPrep */
+} CachedPlanPrepData;
+
 extern void InitPlanCache(void);
 extern void ResetPlanCache(void);
 
@@ -240,7 +261,8 @@ extern List *CachedPlanGetTargetList(CachedPlanSource *plansource,
 extern CachedPlan *GetCachedPlan(CachedPlanSource *plansource,
 								 ParamListInfo boundParams,
 								 ResourceOwner owner,
-								 QueryEnvironment *queryEnv);
+								 QueryEnvironment *queryEnv,
+								 CachedPlanPrepData *cprep);
 extern void ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner);
 
 extern bool CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
-- 
2.47.3



  [application/octet-stream] v4-0005-Add-test-exercising-prep-cleanup-on-cached-plan-i.patch (9.3K, 6-v4-0005-Add-test-exercising-prep-cleanup-on-cached-plan-i.patch)
  download | inline diff:
From 774853b8d3c0f8d4ee1afc8329526e7d22987cab Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Thu, 20 Nov 2025 15:35:47 +0900
Subject: [PATCH v4 5/6] Add test exercising prep cleanup on cached-plan
 invalidation

Add a regression test that causes a generic plan to become invalid
while pruning-aware setup is running. The pruning expression calls a
function that can perform DDL on a partition, making the plan stale
during reuse.

The test's purpose is to drive execution through the invalidation
path that discards any ExecutorPrep state created before the plan was
found invalid, providing coverage for that cleanup logic.
---
 src/backend/utils/cache/plancache.c     | 38 +++++++++++++--
 src/test/regress/expected/plancache.out | 61 +++++++++++++++++++++++++
 src/test/regress/sql/plancache.sql      | 50 ++++++++++++++++++++
 3 files changed, 144 insertions(+), 5 deletions(-)

diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index c1cfd47422c..a9a4e11d1a5 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -103,6 +103,7 @@ static Query *QueryListGetPrimaryStmt(List *stmts);
 static void AcquireExecutorLocks(List *stmt_list, bool acquire);
 static void AcquireExecutorLocksUnpruned(List *stmt_list, bool acquire,
 										 CachedPlanPrepData *cprep);
+static void CachedPlanPrepCleanup(CachedPlanPrepData *cprep);
 static void AcquirePlannerLocks(List *stmt_list, bool acquire);
 static void ScanQueryForLocks(Query *parsetree, bool acquire);
 static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -1033,6 +1034,9 @@ PrepAndCheckCachedPlan(CachedPlanSource *plansource, CachedPlanPrepData *cprep)
 
 		/* Oops, the race case happened.  Release useless locks. */
 		AcquireExecutorLocksWithPolicy(plan->stmt_list, policy, false, cprep);
+
+		/* Also clean up ExecutorPrep() state, if necessary. */
+		CachedPlanPrepCleanup(cprep);
 	}
 
 	/*
@@ -2069,7 +2073,6 @@ LockRelids(List *rtable, Bitmapset *relids, bool acquire)
  *	- looks up the ExecPrep object for each PlannedStmt from cprep->prep_list
  *	  (which must already be populated)
  *	- unlocks the same relations identified during acquire
- *	- calls ExecPrepCleanup() on each ExecPrep
  *
  * prep_list is extended during acquire and must match stmt_list one-to-one
  * when releasing locks.  Memory allocation for ExecPrep happens in
@@ -2165,15 +2168,40 @@ AcquireExecutorLocksUnpruned(List *stmt_list, bool acquire,
 			LockRelids(plannedstmt->rtable, lock_relids, acquire);
 			bms_free(lock_relids);
 		}
-
-		/* Clean up prep if releasing locks. */
-		if (!acquire)
-			ExecPrepCleanup(prep);
 	}
 
 	MemoryContextSwitchTo(oldcontext);
 }
 
+/*
+ * CachedPlanPrepCleanup
+ *		Clean up ExecPrep state built for a generic plan.
+ *
+ * This is used in the corner case where PrepAndCheckCachedPlan() discovers
+ * that a CachedPlan has become invalid after AcquireExecutorLocksUnpruned()
+ * has already run.  In that case we must both release the execution locks
+ * and dispose of the ExecPrep list stored in CachedPlanPrepData, since the
+ * executor will never see or clean it up.
+ */
+static void
+CachedPlanPrepCleanup(CachedPlanPrepData *cprep)
+{
+	ListCell   *lc;
+
+	if (cprep == NULL)
+		return;
+
+	foreach(lc, cprep->prep_list)
+	{
+		ExecPrep *prep = (ExecPrep *) lfirst(lc);
+
+		ExecPrepCleanup(prep);
+	}
+
+	list_free(cprep->prep_list);
+	cprep->prep_list = NIL;
+}
+
 /*
  * AcquirePlannerLocks: acquire locks needed for planning of a querytree list;
  * or release them if acquire is false.
diff --git a/src/test/regress/expected/plancache.out b/src/test/regress/expected/plancache.out
index 4e59188196c..26c4c5e10fd 100644
--- a/src/test/regress/expected/plancache.out
+++ b/src/test/regress/expected/plancache.out
@@ -398,3 +398,64 @@ select name, generic_plans, custom_plans from pg_prepared_statements
 (1 row)
 
 drop table test_mode;
+-- Test invalidation of a generic plan during pruning-aware lock setup.
+-- The pruning expression uses a stable SQL function that calls a volatile
+-- plpgsql function.  That function performs DDL on a partition when a
+-- separate "signal" table says to do so.  The second EXECUTE should
+-- replan cleanly after the DDL.
+set plan_cache_mode to force_generic_plan;
+create table inval_during_pruning_p (a int) partition by list (a);
+create table inval_during_pruning_p1 partition of inval_during_pruning_p for values in (1);
+create table inval_during_pruning_p2 partition of inval_during_pruning_p for values in (2);
+insert into inval_during_pruning_p values (1), (2);
+create table inval_during_pruning_signal (create_idx bool not null);
+insert into inval_during_pruning_signal values (false);
+create or replace function invalidate_plancache_func() returns int
+as $$
+declare
+	create_index bool;
+begin
+	-- Perform DDL on a partition if asked to
+    select create_idx into create_index from inval_during_pruning_signal for update;
+    if create_index = true then
+		raise notice 'creating index on partition inval_during_pruning_p1';
+        create index on inval_during_pruning_p1 (a);
+		update inval_during_pruning_signal set create_idx = false;
+    end if;
+	-- pruning parameter
+    return 1;
+end;
+$$ language plpgsql volatile;
+create or replace function stable_pruning_val() returns int as $$
+	select invalidate_plancache_func();
+$$ language sql stable;
+prepare inval_during_pruning_q as select * from inval_during_pruning_p where a = stable_pruning_val();
+-- Build a generic plan and run pruning once, but don't set the signal
+-- for invalidate_plancache_func() to perform the DDL.
+explain (verbose, costs off) execute inval_during_pruning_q;
+                                QUERY PLAN                                 
+---------------------------------------------------------------------------
+ Append
+   Subplans Removed: 1
+   ->  Seq Scan on public.inval_during_pruning_p1 inval_during_pruning_p_1
+         Output: inval_during_pruning_p_1.a
+         Filter: (inval_during_pruning_p_1.a = stable_pruning_val())
+(5 rows)
+
+-- Reuse the generic plan.  Make invalidate_plancache_func() perform DDL
+-- during this execution, which should force replanning without errors.
+update inval_during_pruning_signal set create_idx = true;
+explain (verbose, costs off) execute inval_during_pruning_q;
+NOTICE:  creating index on partition inval_during_pruning_p1
+                                QUERY PLAN                                 
+---------------------------------------------------------------------------
+ Append
+   Subplans Removed: 1
+   ->  Seq Scan on public.inval_during_pruning_p1 inval_during_pruning_p_1
+         Output: inval_during_pruning_p_1.a
+         Filter: (inval_during_pruning_p_1.a = stable_pruning_val())
+(5 rows)
+
+drop table inval_during_pruning_p, inval_during_pruning_signal;
+drop function invalidate_plancache_func, stable_pruning_val;
+reset plan_cache_mode;
diff --git a/src/test/regress/sql/plancache.sql b/src/test/regress/sql/plancache.sql
index 4b2f11dcc64..cc7eb4da4d3 100644
--- a/src/test/regress/sql/plancache.sql
+++ b/src/test/regress/sql/plancache.sql
@@ -223,3 +223,53 @@ select name, generic_plans, custom_plans from pg_prepared_statements
   where  name = 'test_mode_pp';
 
 drop table test_mode;
+
+-- Test invalidation of a generic plan during pruning-aware lock setup.
+-- The pruning expression uses a stable SQL function that calls a volatile
+-- plpgsql function.  That function performs DDL on a partition when a
+-- separate "signal" table says to do so.  The second EXECUTE should
+-- replan cleanly after the DDL.
+set plan_cache_mode to force_generic_plan;
+create table inval_during_pruning_p (a int) partition by list (a);
+create table inval_during_pruning_p1 partition of inval_during_pruning_p for values in (1);
+create table inval_during_pruning_p2 partition of inval_during_pruning_p for values in (2);
+insert into inval_during_pruning_p values (1), (2);
+
+create table inval_during_pruning_signal (create_idx bool not null);
+insert into inval_during_pruning_signal values (false);
+create or replace function invalidate_plancache_func() returns int
+as $$
+declare
+	create_index bool;
+begin
+	-- Perform DDL on a partition if asked to
+    select create_idx into create_index from inval_during_pruning_signal for update;
+    if create_index = true then
+		raise notice 'creating index on partition inval_during_pruning_p1';
+        create index on inval_during_pruning_p1 (a);
+		update inval_during_pruning_signal set create_idx = false;
+    end if;
+	-- pruning parameter
+    return 1;
+end;
+$$ language plpgsql volatile;
+
+create or replace function stable_pruning_val() returns int as $$
+	select invalidate_plancache_func();
+$$ language sql stable;
+
+prepare inval_during_pruning_q as select * from inval_during_pruning_p where a = stable_pruning_val();
+
+-- Build a generic plan and run pruning once, but don't set the signal
+-- for invalidate_plancache_func() to perform the DDL.
+explain (verbose, costs off) execute inval_during_pruning_q;
+
+-- Reuse the generic plan.  Make invalidate_plancache_func() perform DDL
+-- during this execution, which should force replanning without errors.
+update inval_during_pruning_signal set create_idx = true;
+explain (verbose, costs off) execute inval_during_pruning_q;
+
+drop table inval_during_pruning_p, inval_during_pruning_signal;
+drop function invalidate_plancache_func, stable_pruning_val;
+
+reset plan_cache_mode;
-- 
2.47.3



  [application/octet-stream] v4-0001-Refactor-partition-pruning-initialization-for-cla.patch (8.2K, 7-v4-0001-Refactor-partition-pruning-initialization-for-cla.patch)
  download | inline diff:
From 2d7e972bf0e772b55674d6c390682777dc8c99a3 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Tue, 11 Nov 2025 21:18:24 +0900
Subject: [PATCH v4 1/6] Refactor partition pruning initialization for clarity
 and modularity

Move the creation of PartitionPruneState structures out of
ExecDoInitialPruning() into a new ExecCreatePartitionPruneStates()
function. This separates the setup of pruning state from the execution
of initial pruning logic, making the code clearer and easier to
maintain.

Also simplify handling of unpruned relids by moving responsibility
for recording them in EState into CreatePartitionPruneState(),
avoiding the need to pass all_leafpart_rtis as an out parameter.

This refactoring allows callers to reuse the pruning setup logic
without always triggering pruning, a capability useful for future use
cases that may only need metadata initialization.
---
 src/backend/executor/execMain.c      |  1 +
 src/backend/executor/execPartition.c | 70 +++++++++++++++++-----------
 src/include/executor/execPartition.h |  1 +
 3 files changed, 44 insertions(+), 28 deletions(-)

diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 27c9eec697b..f5f4986383d 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -868,6 +868,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 	 * to each PartitionPruneInfo entry, and the es_part_prune_results list is
 	 * parallel to es_part_prune_infos.
 	 */
+	ExecCreatePartitionPruneStates(estate);
 	ExecDoInitialPruning(estate);
 
 	/*
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 0dcce181f09..61559642662 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -182,8 +182,7 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
 static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
 static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
 static PartitionPruneState *CreatePartitionPruneState(EState *estate,
-													  PartitionPruneInfo *pruneinfo,
-													  Bitmapset **all_leafpart_rtis);
+													  PartitionPruneInfo *pruneinfo);
 static void InitPartitionPruneContext(PartitionPruneContext *context,
 									  List *pruning_steps,
 									  PartitionDesc partdesc,
@@ -1773,6 +1772,9 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *
  * Functions:
  *
+ * ExecCreatePartitionPruneStates
+ *		Create PartitionPruneState for all PartitionPruneInfos in the EState
+ *
  * ExecDoInitialPruning:
  *		Perform runtime "initial" pruning, if necessary, to determine the set
  *		of child subnodes that need to be initialized during ExecInitNode() for
@@ -1797,6 +1799,29 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *-------------------------------------------------------------------------
  */
 
+/*
+ * ExecCreatePartitionPruneStates
+ *
+ * Create a PartitionPruneState for each PartitionPruneInfo in the estate,
+ * and save them in estate->es_part_prune_states. This setup is required
+ * before any initial or runtime pruning can occur.
+ */
+void
+ExecCreatePartitionPruneStates(EState *estate)
+{
+	ListCell   *lc;
+
+	foreach(lc, estate->es_part_prune_infos)
+	{
+		PartitionPruneInfo *pruneinfo = lfirst_node(PartitionPruneInfo, lc);
+		PartitionPruneState *prunestate;
+
+		/* Create and save the PartitionPruneState. */
+		prunestate = CreatePartitionPruneState(estate, pruneinfo);
+		estate->es_part_prune_states = lappend(estate->es_part_prune_states,
+											   prunestate);
+	}
+}
 
 /*
  * ExecDoInitialPruning
@@ -1804,11 +1829,11 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *		of child subnodes that need to be initialized during ExecInitNode() for
  *		plan nodes that support partition pruning.
  *
- * This function iterates over each PartitionPruneInfo entry in
- * estate->es_part_prune_infos. For each entry, it creates a PartitionPruneState
- * and adds it to es_part_prune_states.  ExecInitPartitionExecPruning() accesses
+ * This function iterates over each PartitionPruneState in
+ * estate->es_part_prune_states, which must have been populated earlier by
+ * ExecCreatePartitionPruneStates(). ExecInitPartitionExecPruning() accesses
  * these states through their corresponding indexes in es_part_prune_states and
- * assign each state to the parent node's PlanState, from where it will be used
+ * assigns each state to the parent node's PlanState, from where it will be used
  * for "exec" pruning.
  *
  * If initial pruning steps exist for a PartitionPruneInfo entry, this function
@@ -1826,20 +1851,13 @@ ExecDoInitialPruning(EState *estate)
 {
 	ListCell   *lc;
 
-	foreach(lc, estate->es_part_prune_infos)
+	Assert(estate->es_part_prune_results == NULL);
+	foreach(lc, estate->es_part_prune_states)
 	{
-		PartitionPruneInfo *pruneinfo = lfirst_node(PartitionPruneInfo, lc);
-		PartitionPruneState *prunestate;
+		PartitionPruneState *prunestate = (PartitionPruneState *) lfirst(lc);
 		Bitmapset  *validsubplans = NULL;
-		Bitmapset  *all_leafpart_rtis = NULL;
 		Bitmapset  *validsubplan_rtis = NULL;
 
-		/* Create and save the PartitionPruneState. */
-		prunestate = CreatePartitionPruneState(estate, pruneinfo,
-											   &all_leafpart_rtis);
-		estate->es_part_prune_states = lappend(estate->es_part_prune_states,
-											   prunestate);
-
 		/*
 		 * Perform initial pruning steps, if any, and save the result
 		 * bitmapset or NULL as described in the header comment.
@@ -1847,8 +1865,6 @@ ExecDoInitialPruning(EState *estate)
 		if (prunestate->do_initial_prune)
 			validsubplans = ExecFindMatchingSubPlans(prunestate, true,
 													 &validsubplan_rtis);
-		else
-			validsubplan_rtis = all_leafpart_rtis;
 
 		estate->es_unpruned_relids = bms_add_members(estate->es_unpruned_relids,
 													 validsubplan_rtis);
@@ -1966,14 +1982,12 @@ ExecInitPartitionExecPruning(PlanState *planstate,
  * parent plan node's PlanState.
  *
  * If initial pruning steps are to be skipped (e.g., during EXPLAIN
- * (GENERIC_PLAN)), *all_leafpart_rtis will be populated with the RT indexes of
- * all leaf partitions whose scanning subnode is included in the parent plan
- * node's list of child plans. The caller must add these RT indexes to
- * estate->es_unpruned_relids.
+ * (GENERIC_PLAN)), the RT indexes of all leaf partitions whose scanning
+ * subnode is included in the parent plan node's list of child plans are
+ * added to estate->es_unpruned_relids.
  */
 static PartitionPruneState *
-CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo,
-						  Bitmapset **all_leafpart_rtis)
+CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo)
 {
 	PartitionPruneState *prunestate;
 	int			n_part_hierarchies;
@@ -2207,8 +2221,8 @@ CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo,
 													   pinfo->execparamids);
 
 			/*
-			 * Return all leaf partition indexes if we're skipping pruning in
-			 * the EXPLAIN (GENERIC_PLAN) case.
+			 * Add all leaf partition indexes to es_unpruned_relids if we're
+			 * skipping pruning in the EXPLAIN (GENERIC_PLAN) case.
 			 */
 			if (pinfo->initial_pruning_steps && !prunestate->do_initial_prune)
 			{
@@ -2220,8 +2234,8 @@ CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo,
 					Index		rtindex = pprune->leafpart_rti_map[part_index];
 
 					if (rtindex)
-						*all_leafpart_rtis = bms_add_member(*all_leafpart_rtis,
-															rtindex);
+						estate->es_unpruned_relids =
+							bms_add_member(estate->es_unpruned_relids, rtindex);
 				}
 			}
 
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 3b3f46aced0..ba8cc594fc9 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -130,6 +130,7 @@ typedef struct PartitionPruneState
 	PartitionPruningData *partprunedata[FLEXIBLE_ARRAY_MEMBER];
 } PartitionPruneState;
 
+extern void ExecCreatePartitionPruneStates(EState *estate);
 extern void ExecDoInitialPruning(EState *estate);
 extern PartitionPruneState *ExecInitPartitionExecPruning(PlanState *planstate,
 														 int n_total_subplans,
-- 
2.47.3




* Re: generic plans and "initial" pruning
@ 2026-03-07 09:54  Amit Langote <[email protected]>
  parent: Amit Langote <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: Amit Langote @ 2026-03-07 09:54 UTC (permalink / raw)
  To: Chao Li <[email protected]>; +Cc: Tom Lane <[email protected]>; Tender Wang <[email protected]>; Alexander Lakhin <[email protected]>; Tomas Vondra <[email protected]>; Robert Haas <[email protected]>; Alvaro Herrera <[email protected]>; Andres Freund <[email protected]>; Daniel Gustafsson <[email protected]>; David Rowley <[email protected]>; pgsql-hackers; Thom Brown <[email protected]>

Hi,

Attached is v6 of the patch series. I've been working toward
committing this, so I wanted to lay out the ExecutorPrep() design and
the key trade-offs before doing so.

When a cached generic plan references a partitioned table,
GetCachedPlan() locks all partitions upfront via
AcquireExecutorLocks(), even those that initial pruning will
eliminate.  But initial partition pruning only runs later during
ExecutorStart(). Moving pruning earlier requires some executor setup
(range table, permissions, pruning state), and ExecutorPrep() is the
vehicle for that.  Unlike the approach reverted last May, this
keeps the CachedPlan itself unchanged -- all per-execution state flows
through a separate CachedPlanPrepData that the caller provides.

The approach also keeps GetCachedPlan()'s interface
backward-compatible: the new CachedPlanPrepData argument is optional.
If a caller passes NULL, all partitions are locked as before and
nothing changes. This means existing callers and any new code that
calls GetCachedPlan() without caring about pruning-aware locking just
works.
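
To make the opt-in concrete, here is a toy model in C.  It is not
PostgreSQL code: relation sets are plain bitmasks (bit i stands for RT
index i), and RelidSet, ToyPlan, toy_lock_set() and toy_count() are
hypothetical names made up for illustration.  It assumes, following
AcquireExecutorLocksUnpruned() in the patch quoted earlier in the
thread, that the pruning-aware lock set is the union of the unprunable
relations, the partitions that survive initial pruning, and the first
result relation of each ModifyTable node.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for a Bitmapset of RT indexes: bit i means RT index i. */
typedef uint64_t RelidSet;

/* Hypothetical, heavily simplified view of a PlannedStmt. */
typedef struct ToyPlan
{
	RelidSet	all_rels;		/* every relation in the range table */
	RelidSet	unprunable;		/* relations pruning can never remove */
	RelidSet	first_result;	/* first result rel of each ModifyTable */
} ToyPlan;

/*
 * Return the set of relations to lock.
 *
 * prune_aware = false models a caller passing NULL for the prep data:
 * everything is locked, as before.  prune_aware = true models the opt-in
 * path; 'survivors' is the set of leaf partitions left after initial
 * pruning and is ignored otherwise.
 */
RelidSet
toy_lock_set(const ToyPlan *plan, bool prune_aware, RelidSet survivors)
{
	/* Survivors must be relations the plan actually mentions. */
	assert((survivors & ~plan->all_rels) == 0);

	if (!prune_aware)
		return plan->all_rels;

	return plan->unprunable | survivors | plan->first_result;
}

/* Count the members of a set (number of locks that would be taken). */
int
toy_count(RelidSet set)
{
	int			n = 0;

	while (set != 0)
	{
		set &= set - 1;			/* clear the lowest set bit */
		n++;
	}
	return n;
}
```

With a plan whose range table holds a parent at RT index 1 and four
partitions at indexes 2 through 5, and pruning that leaves only index
3, the non-opt-in path locks all five relations while the opt-in path
locks just the parent and the one surviving partition.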

The risk is on the other side: if a caller does pass a
CachedPlanPrepData, GetCachedPlan() will lock only the surviving
partitions and populate prep_estates with the EStates that
ExecutorPrep() created. The caller then must make those EStates
available to ExecutorStart() -- via QueryDesc->estate,
portal->prep_estates, or the equivalent path for SPI and SQL
functions. If it fails to do so, ExecutorStart() will call
ExecutorPrep() again, which may compute different pruning results than
the original call, potentially expecting locks on relations that were
never acquired. The executor would then operate on relations it
doesn't hold locks on.

So the contract is: if you opt in to pruning-aware locking by passing
CachedPlanPrepData, you must complete the pipeline by delivering the
prep EStates to the executor. In the current patch, all the call sites
that pass a CachedPlanPrepData (portals, SPI, EXECUTE, SQL functions,
EXPLAIN) do thread the EStates through correctly, and I've tried to
make the plumbing straightforward enough that it's hard to get wrong.
But it is a new invariant that didn't exist before, and a caller that
gets it wrong would fail silently rather than with an obvious error.
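
The failure mode can be sketched with the same kind of toy model.
Again this is not PostgreSQL code: RelidSet, ToyPrep,
toy_executor_rels() and toy_locks_cover() are hypothetical names, and
the "recomputed" argument simply stands for whatever a second pruning
pass would return if the caller dropped the prep state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for a Bitmapset of RT indexes: bit i means RT index i. */
typedef uint64_t RelidSet;

/* Hypothetical stand-in for the prep state handed back to the caller. */
typedef struct ToyPrep
{
	RelidSet	unpruned;		/* relations that survived initial pruning */
} ToyPrep;

/*
 * Relations the executor will operate on.  If the caller delivered the
 * prep state, its pruning result is reused as is.  If the caller dropped
 * it (NULL), pruning runs again and 'recomputed' is used instead; that
 * result may differ from the one used for locking.
 */
RelidSet
toy_executor_rels(const ToyPrep *delivered, RelidSet recomputed)
{
	return delivered != NULL ? delivered->unpruned : recomputed;
}

/* True if every relation in 'needed' is also in 'locked'. */
bool
toy_locks_cover(RelidSet locked, RelidSet needed)
{
	return (needed & ~locked) == 0;
}
```

If locking was based on survivors {1, 3} and the prep state is
delivered, the executor uses {1, 3} and the check passes.  If the prep
state is dropped and the second pruning pass yields {1, 4}, the check
fails because relation 4 was never locked.  It passes again if
everything was locked upfront, which is the non-opt-in case.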

To catch such violations, I've added a debug-only check in
standard_ExecutorStart() that fires when no prep EState was provided.
It iterates over the plan's rtable and verifies that every lockable
relation is actually locked.  That should always hold if
AcquireExecutorLocks() locked everything, but would fail if
pruning-aware locking happened upstream and the caller dropped the
prep EState. The check is skipped in parallel workers, which acquire
relation locks lazily in ExecGetRangeTableRelation().

+    if (queryDesc->estate == NULL)
+    {
+#ifdef USE_ASSERT_CHECKING
+        if (!IsParallelWorker())
+        {
+            ListCell   *lc;
+
+            foreach(lc, queryDesc->plannedstmt->rtable)
+            {
+                RangeTblEntry *rte = lfirst_node(RangeTblEntry, lc);
+
+                if (rte->rtekind == RTE_RELATION ||
+                    (rte->rtekind == RTE_SUBQUERY && rte->relid != InvalidOid))
+                    Assert(CheckRelationOidLockedByMe(rte->relid,
+                                                      rte->rellockmode,
+                                                      true));
+            }
+        }
+#endif
+        queryDesc->estate = ExecutorPrep(queryDesc->plannedstmt,
+                                         queryDesc->params,
+                                         CurrentResourceOwner,
+                                         true,
+                                         eflags);
+    }
+#ifdef USE_ASSERT_CHECKING
+    else
+    {
+        /*
+         * A prep EState was provided, meaning pruning-aware locking
+         * should have locked at least the unpruned relations.
+         */
+        if (!IsParallelWorker())
+        {
+            int     rtindex = -1;
+
+            while ((rtindex = bms_next_member(queryDesc->estate->es_unpruned_relids,
+                                              rtindex)) >= 0)
+            {
+                RangeTblEntry *rte = exec_rt_fetch(rtindex, queryDesc->estate);
+
+                Assert(rte->rtekind == RTE_RELATION ||
+                       (rte->rtekind == RTE_SUBQUERY &&
+                        rte->relid != InvalidOid));
+                Assert(CheckRelationOidLockedByMe(rte->relid,
+                                                  rte->rellockmode, true));
+            }
+        }
+    }
+#endif

So the invariant is: if no prep EState was provided, every relation in
the plan is locked; if one was provided, at least the unpruned
relations are locked. Both are checked in assert builds.
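
The two-sided invariant can also be written as a single predicate.  The
following is a standalone sketch, not the patch's code: ToyRte,
ToyRteKind and toy_lock_invariant_holds() are hypothetical names, the
range table is a plain array of at most 32 entries, and a NULL
'unpruned' pointer stands for "no prep EState was provided".  Unlike the
real check, which asserts that every unpruned entry is a relation, the
sketch simply skips non-relation entries.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef enum ToyRteKind
{
    TOY_RTE_RELATION,           /* lockable entry */
    TOY_RTE_FUNCTION            /* nothing to lock */
} ToyRteKind;

typedef struct ToyRte
{
    ToyRteKind  kind;
    bool        locked;         /* does this backend hold the lock? */
} ToyRte;

/*
 * unpruned == NULL: every lockable entry must be locked.
 * unpruned != NULL: only entries whose bit is set must be locked
 * (bit i corresponds to rtable[i]; n must not exceed 32).
 */
static bool
toy_lock_invariant_holds(const ToyRte *rtable, size_t n,
                         const uint32_t *unpruned)
{
    for (size_t i = 0; i < n; i++)
    {
        if (rtable[i].kind != TOY_RTE_RELATION)
            continue;           /* not a lockable relation */
        if (unpruned != NULL && (*unpruned & (UINT32_C(1) << i)) == 0)
            continue;           /* pruned, so no lock is required */
        if (!rtable[i].locked)
            return false;
    }
    return true;
}
```

With one locked and one unlocked partition, the predicate holds when
only the locked one is marked unpruned, and fails when no unpruned set
is supplied, since every relation is then expected to be locked.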

I think this covers the main concerns, but I may be missing something.
If anyone sees a problem with this approach, I'd like to hear about
it.

--
Thanks,
Amit Langote


Attachments:

  [application/octet-stream] v6-0004-Use-pruning-aware-locking-in-cached-plans.patch (37.7K, 2-v6-0004-Use-pruning-aware-locking-in-cached-plans.patch)
  download | inline diff:
From 800949bf7a327a7b8bfc5b9fbcdbf0ac39106056 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Tue, 11 Nov 2025 22:30:52 +0900
Subject: [PATCH v6 4/6] Use pruning-aware locking in cached plans

Extend GetCachedPlan() to perform ExecutorPrep() on each planned
statement, capturing unpruned relids and initial pruning results.
Use this data to acquire execution locks only on surviving partitions,
avoiding unnecessary locking of pruned tables even when using cached
plans.

Introduce CachedPlanPrepData to carry the EStates created by
ExecutorPrep() through the plan caching layer. The prep_estates
list is indexed one-to-one with CachedPlan->stmt_list and is
populated when GetCachedPlan() prepares a reused generic plan.
Adjust call sites in SPI, functions, portals, and EXPLAIN to
propagate this data.

Partition pruning expressions may call PL functions that require
an active snapshot (e.g., via EnsurePortalSnapshotExists()).
AcquireExecutorLocksUnpruned() establishes one before calling
ExecutorPrep() if needed, ensuring these expressions can execute
correctly during plan cache validation.

To maintain correctness when all target partitions are pruned, also
reinstate the firstResultRel locking behavior lost in commit
28317de72. That commit required the first ModifyTable target to
remain initialized for executor assumptions to hold. We now
explicitly track these relids in PlannerGlobal and PlannedStmt so they
are locked even if pruned, preserving that rule across cached plan
reuse.

Add a regression test that causes a generic plan to become invalid
while pruning-aware setup is running. The pruning expression calls a
function that can perform DDL on a partition, making the plan stale
during reuse.

The test's purpose is to drive execution through the invalidation
path that discards any ExecutorPrep state created before the plan was
found invalid, providing coverage for that cleanup logic.
---
 src/backend/commands/prepare.c                |  19 +-
 src/backend/executor/functions.c              |   1 +
 src/backend/executor/nodeModifyTable.c        |   5 +-
 src/backend/executor/spi.c                    |  26 +-
 src/backend/optimizer/plan/planner.c          |   1 +
 src/backend/optimizer/plan/setrefs.c          |  20 ++
 src/backend/tcop/postgres.c                   |   9 +-
 src/backend/utils/cache/plancache.c           | 292 +++++++++++++++++-
 src/include/nodes/pathnodes.h                 |   3 +
 src/include/nodes/plannodes.h                 |  10 +
 src/include/utils/plancache.h                 |  29 +-
 src/test/regress/expected/partition_prune.out |  50 ++-
 src/test/regress/expected/plancache.out       |  62 ++++
 src/test/regress/sql/partition_prune.sql      |  24 +-
 src/test/regress/sql/plancache.sql            |  51 +++
 15 files changed, 576 insertions(+), 26 deletions(-)

diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 005fbb48aa5..e8cd47131ce 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -154,6 +154,7 @@ ExecuteQuery(ParseState *pstate,
 {
 	PreparedStatement *entry;
 	CachedPlan *cplan;
+	CachedPlanPrepData cprep = {0};
 	List	   *plan_list;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
@@ -193,7 +194,10 @@ ExecuteQuery(ParseState *pstate,
 									   entry->plansource->query_string);
 
 	/* Replan if needed, and increment plan refcount for portal */
-	cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL);
+	/* Keep ExecutorPrep state with the portal and its resowner. */
+	cprep.context = portal->portalContext;
+	cprep.owner = portal->resowner;
+	cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL, &cprep);
 	plan_list = cplan->stmt_list;
 
 	/*
@@ -205,7 +209,7 @@ ExecuteQuery(ParseState *pstate,
 					  query_string,
 					  entry->plansource->commandTag,
 					  plan_list,
-					  NIL,
+					  cprep.prep_estates,
 					  cplan);
 
 	/*
@@ -575,6 +579,7 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	PreparedStatement *entry;
 	const char *query_string;
 	CachedPlan *cplan;
+	CachedPlanPrepData cprep = {0};
 	List	   *plan_list;
 	List	   *prep_estates;
 	ListCell   *p;
@@ -633,8 +638,14 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	}
 
 	/* Replan if needed, and acquire a transient refcount */
+	/* ExecutorPrep state is local to this EXPLAIN EXECUTE call. */
+	cprep.context = CurrentMemoryContext;
+	cprep.owner = CurrentResourceOwner;
+	if (es->generic)
+		cprep.eflags = EXEC_FLAG_EXPLAIN_GENERIC;
 	cplan = GetCachedPlan(entry->plansource, paramLI,
-						  CurrentResourceOwner, pstate->p_queryEnv);
+						  CurrentResourceOwner, pstate->p_queryEnv,
+						  &cprep);
 
 	INSTR_TIME_SET_CURRENT(planduration);
 	INSTR_TIME_SUBTRACT(planduration, planstart);
@@ -653,7 +664,7 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	}
 
 	plan_list = cplan->stmt_list;
-	prep_estates = NIL;
+	prep_estates = cprep.prep_estates;
 
 	/* Explain each query */
 	prep_lc = list_head(prep_estates);
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index c93e2664cfd..65dfae58dcf 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -698,6 +698,7 @@ init_execution_state(SQLFunctionCachePtr fcache)
 	fcache->cplan = GetCachedPlan(plansource,
 								  fcache->paramLI,
 								  fcache->cowner,
+								  NULL,
 								  NULL);
 
 	/*
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 793c76d4f82..a7a4baaf8af 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -4858,8 +4858,8 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
 	 * as a reference for building the ResultRelInfo of the target partition.
 	 * In either case, it doesn't matter which result relation is kept, so we
 	 * just keep the first one, if all others have been pruned.  See also,
-	 * ExecDoInitialPruning(), which ensures that this first result relation
-	 * has been locked.
+	 * AcquireExecutorLocksUnpruned(), which ensures that this first result
+	 * relation has been locked.
 	 */
 	i = 0;
 	foreach(l, node->resultRelations)
@@ -4873,6 +4873,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
 			/* all result relations pruned; keep the first one */
 			keep_rel = true;
 			rti = linitial_int(node->resultRelations);
+			Assert(list_member_int(estate->es_plannedstmt->firstResultRels, rti));
 			i = 0;
 		}
 
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 994a69a1c8e..13703969dd8 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1579,6 +1579,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 {
 	CachedPlanSource *plansource;
 	CachedPlan *cplan;
+	CachedPlanPrepData cprep = {0};
 	List	   *stmt_list;
 	char	   *query_string;
 	Snapshot	snapshot;
@@ -1659,7 +1660,11 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 	 */
 
 	/* Replan if needed, and increment plan refcount for portal */
-	cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv);
+	/* ExecutorPrep state lives in this portal's context. */
+	cprep.context = portal->portalContext;
+	cprep.owner = portal->resowner;
+	cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv,
+						  &cprep);
 	stmt_list = cplan->stmt_list;
 
 	if (!plan->saved)
@@ -1685,7 +1690,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 					  query_string,
 					  plansource->commandTag,
 					  stmt_list,
-					  NIL,
+					  cprep.prep_estates,	/* lives in portalContext */
 					  cplan);
 
 	/*
@@ -2078,6 +2083,7 @@ SPI_plan_get_cached_plan(SPIPlanPtr plan)
 {
 	CachedPlanSource *plansource;
 	CachedPlan *cplan;
+	CachedPlanPrepData cprep = {0};
 	SPICallbackArg spicallbackarg;
 	ErrorContextCallback spierrcontext;
 
@@ -2101,9 +2107,13 @@ SPI_plan_get_cached_plan(SPIPlanPtr plan)
 	error_context_stack = &spierrcontext;
 
 	/* Get the generic plan for the query */
+	/* ExecutorPrep() state lives in caller's active context. */
+	cprep.context = CurrentMemoryContext;
+	cprep.owner = CurrentResourceOwner;
 	cplan = GetCachedPlan(plansource, NULL,
 						  plan->saved ? CurrentResourceOwner : NULL,
-						  _SPI_current->queryEnv);
+						  _SPI_current->queryEnv,
+						  &cprep);
 	Assert(cplan == plansource->gplan);
 
 	/* Pop the error context stack */
@@ -2502,6 +2512,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 		ListCell   *lc2;
 		List	   *prep_estates;
 		ListCell   *prep_lc;
+		CachedPlanPrepData cprep = {0};
 
 		spicallbackarg.query = plansource->query_string;
 
@@ -2576,11 +2587,16 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 		 * Replan if needed, and increment plan refcount.  If it's a saved
 		 * plan, the refcount must be backed by the plan_owner.
 		 */
+
+		/* ExecutorPrep state is per _SPI_execute_plan call. */
+		cprep.context = CurrentMemoryContext;
+		cprep.owner = CurrentResourceOwner;
 		cplan = GetCachedPlan(plansource, options->params,
-							  plan_owner, _SPI_current->queryEnv);
+							  plan_owner, _SPI_current->queryEnv,
+							  &cprep);
 
 		stmt_list = cplan->stmt_list;
-		prep_estates = NIL;
+		prep_estates = cprep.prep_estates;
 
 		/*
 		 * If we weren't given a specific snapshot to use, and the statement
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 42604a0f75c..afa61d357c5 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -657,6 +657,7 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 	result->permInfos = glob->finalrteperminfos;
 	result->subrtinfos = glob->subrtinfos;
 	result->resultRelations = glob->resultRelations;
+	result->firstResultRels = glob->firstResultRels;
 	result->appendRelations = glob->appendRelations;
 	result->subplans = glob->subplans;
 	result->rewindPlanIDs = glob->rewindPlanIDs;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 1b5b9b5ed9c..ddb7902bc89 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -384,6 +384,26 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 		}
 	}
 
+	/*
+	 * Record the first result relation if it belongs to the set of
+	 * initially prunable relations.  We use bms_next_member() to get
+	 * the lowest-numbered leaf result rel, which matches
+	 * linitial_int(ModifyTable.resultRelations) because partition
+	 * expansion preserves RT index order.  There is one ModifyTable
+	 * per query level, so this captures exactly one entry per level.
+	 * ExecInitModifyTable() asserts that the recorded index matches
+	 * what it actually needs.
+	 */
+	if (root->leaf_result_relids)
+	{
+		Index	firstResultRel = bms_next_member(root->leaf_result_relids, -1);
+
+		firstResultRel += rtoffset;
+		if (bms_is_member(firstResultRel, root->glob->prunableRelids))
+			root->glob->firstResultRels =
+				lappend_int(root->glob->firstResultRels, firstResultRel);
+	}
+
 	return result;
 }
 
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index cd1e429ceed..5c145a31274 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1636,6 +1636,7 @@ exec_bind_message(StringInfo input_message)
 	int16	   *rformats = NULL;
 	CachedPlanSource *psrc;
 	CachedPlan *cplan;
+	CachedPlanPrepData cprep = {0};
 	Portal		portal;
 	char	   *query_string;
 	char	   *saved_stmt_name;
@@ -2017,7 +2018,11 @@ exec_bind_message(StringInfo input_message)
 	 * will be generated in MessageContext.  The plan refcount will be
 	 * assigned to the Portal, so it will be released at portal destruction.
 	 */
-	cplan = GetCachedPlan(psrc, params, NULL, NULL);
+
+	/* ExecutorPrep() state lives in portal context. */
+	cprep.context = portal->portalContext;
+	cprep.owner = portal->resowner;
+	cplan = GetCachedPlan(psrc, params, NULL, NULL, &cprep);
 
 	/*
 	 * Now we can define the portal.
@@ -2030,7 +2035,7 @@ exec_bind_message(StringInfo input_message)
 					  query_string,
 					  psrc->commandTag,
 					  cplan->stmt_list,
-					  NIL,
+					  cprep.prep_estates,
 					  cplan);
 
 	/* Portal is defined, set the plan ID based on its contents. */
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 812e2265734..be2a961a918 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -93,7 +93,7 @@ static bool StmtPlanRequiresRevalidation(CachedPlanSource *plansource);
 static bool BuildingPlanRequiresSnapshot(CachedPlanSource *plansource);
 static List *RevalidateCachedQuery(CachedPlanSource *plansource,
 								   QueryEnvironment *queryEnv);
-static bool CheckCachedPlan(CachedPlanSource *plansource);
+static bool PrepAndCheckCachedPlan(CachedPlanSource *plansource, CachedPlanPrepData *cprep);
 static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 								   ParamListInfo boundParams, QueryEnvironment *queryEnv);
 static bool choose_custom_plan(CachedPlanSource *plansource,
@@ -101,6 +101,9 @@ static bool choose_custom_plan(CachedPlanSource *plansource,
 static double cached_plan_cost(CachedPlan *plan, bool include_planner);
 static Query *QueryListGetPrimaryStmt(List *stmts);
 static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static void AcquireExecutorLocksUnpruned(List *stmt_list, bool acquire,
+										 CachedPlanPrepData *cprep);
+static void CachedPlanPrepCleanup(CachedPlanPrepData *cprep);
 static void AcquirePlannerLocks(List *stmt_list, bool acquire);
 static void ScanQueryForLocks(Query *parsetree, bool acquire);
 static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -139,6 +142,26 @@ ResourceOwnerForgetPlanCacheRef(ResourceOwner owner, CachedPlan *plan)
 /* GUC parameter */
 int			plan_cache_mode = PLAN_CACHE_MODE_AUTO;
 
+/*
+ * Lock acquisition policy for execution locks.
+ *
+ * LOCK_ALL acquires locks on all relations mentioned in the plan,
+ * reproducing the behavior of AcquireExecutorLocks().
+ *
+ * LOCK_UNPRUNED restricts locking to only the unpruned relations. That
+ * includes those mentioned in PlannedStmt.unprunableRelids and the leaf
+ * partitions remaining after performing initial pruning.
+ */
+typedef enum LockPolicy
+{
+	LOCK_ALL,
+	LOCK_UNPRUNED,
+} LockPolicy;
+
+static void AcquireExecutorLocksWithPolicy(List *stmt_list,
+										   LockPolicy policy, bool acquire,
+										   CachedPlanPrepData *cprep);
+
 /*
  * InitPlanCache: initialize module during InitPostgres.
  *
@@ -940,7 +963,12 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
 }
 
 /*
- * CheckCachedPlan: see if the CachedPlanSource's generic plan is valid.
+ * PrepAndCheckCachedPlan: see if the CachedPlanSource's generic plan is valid.
+ *
+ * If 'cprep' is not NULL, ExecutorPrep() is applied to each PlannedStmt to
+ * compute the set of partitions that survive initial runtime pruning in order
+ * to only lock them.  The EStates created to do so are saved in cprep for
+ * later reuse by ExecutorStart().
  *
  * Caller must have already called RevalidateCachedQuery to verify that the
  * querytree is up to date.
@@ -949,7 +977,7 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
  * (We must do this for the "true" result to be race-condition-free.)
  */
 static bool
-CheckCachedPlan(CachedPlanSource *plansource)
+PrepAndCheckCachedPlan(CachedPlanSource *plansource, CachedPlanPrepData *cprep)
 {
 	CachedPlan *plan = plansource->gplan;
 
@@ -977,13 +1005,15 @@ CheckCachedPlan(CachedPlanSource *plansource)
 	 */
 	if (plan->is_valid)
 	{
+		LockPolicy policy = !cprep ? LOCK_ALL : LOCK_UNPRUNED;
+
 		/*
 		 * Plan must have positive refcount because it is referenced by
 		 * plansource; so no need to fear it disappears under us here.
 		 */
 		Assert(plan->refcount > 0);
 
-		AcquireExecutorLocks(plan->stmt_list, true);
+		AcquireExecutorLocksWithPolicy(plan->stmt_list, policy, true, cprep);
 
 		/*
 		 * If plan was transient, check to see if TransactionXmin has
@@ -1005,7 +1035,10 @@ CheckCachedPlan(CachedPlanSource *plansource)
 		}
 
 		/* Oops, the race case happened.  Release useless locks. */
-		AcquireExecutorLocks(plan->stmt_list, false);
+		AcquireExecutorLocksWithPolicy(plan->stmt_list, policy, false, cprep);
+
+		/* Also clean up ExecutorPrep() state, if necessary. */
+		CachedPlanPrepCleanup(cprep);
 	}
 
 	/*
@@ -1285,6 +1318,11 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
  * On return, the plan is valid and we have sufficient locks to begin
  * execution.
  *
+ * If 'cprep' is not NULL and a generic plan is reused, the function prepares
+ * each PlannedStmt via ExecutorPrep() and stores the EStates in
+ * cprep->prep_estates.  These are intended to be passed later to
+ * ExecutorStart().
+ *
  * On return, the refcount of the plan has been incremented; a later
  * ReleaseCachedPlan() call is expected.  If "owner" is not NULL then
  * the refcount has been reported to that ResourceOwner (note that this
@@ -1295,7 +1333,8 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
  */
 CachedPlan *
 GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
-			  ResourceOwner owner, QueryEnvironment *queryEnv)
+			  ResourceOwner owner, QueryEnvironment *queryEnv,
+			  CachedPlanPrepData *cprep)
 {
 	CachedPlan *plan = NULL;
 	List	   *qlist;
@@ -1317,7 +1356,9 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 
 	if (!customplan)
 	{
-		if (CheckCachedPlan(plansource))
+		if (cprep)
+			cprep->params = boundParams;
+		if (PrepAndCheckCachedPlan(plansource, cprep))
 		{
 			/* We want a generic plan, and we already have a valid one */
 			plan = plansource->gplan;
@@ -1903,6 +1944,38 @@ QueryListGetPrimaryStmt(List *stmts)
 	return NULL;
 }
 
+/*
+ * AcquireExecutorLocksWithPolicy
+ *		Acquire or release execution locks for a cached plan according to
+ *		the specified policy.
+ *
+ * LOCK_ALL reproduces AcquireExecutorLocks(), locking every relation in
+ * each PlannedStmt's rtable.  LOCK_UNPRUNED restricts locking to the
+ * unprunable rels and partitions that survive initial runtime pruning.
+ *
+ * When LOCK_UNPRUNED is used on acquire, ExecutorPrep() is invoked for
+ * each PlannedStmt and the resulting EStates are appended to
+ * cprep->prep_estates in cprep->context.  On release, the same EState
+ * list is consulted to determine which relations to unlock and each
+ * EState is released.
+ */
+static void
+AcquireExecutorLocksWithPolicy(List *stmt_list, LockPolicy policy, bool acquire,
+							   CachedPlanPrepData *cprep)
+{
+	switch (policy)
+	{
+		case LOCK_ALL:
+			AcquireExecutorLocks(stmt_list, acquire);
+			break;
+		case LOCK_UNPRUNED:
+			AcquireExecutorLocksUnpruned(stmt_list, acquire, cprep);
+			break;
+		default:
+			elog(ERROR, "invalid LockPolicy");
+	}
+}
+
 /*
  * AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
  * or release them if acquire is false.
@@ -1955,6 +2028,211 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 	}
 }
 
+/*
+ * LockRelids
+ * 		Acquire or release locks on the specified relids, which reference
+ * 		entries in the provided range table.
+ *
+ * Helper for AcquireExecutorLocksUnpruned().
+ */
+static void
+LockRelids(List *rtable, Bitmapset *relids, bool acquire)
+{
+	int	rtindex = -1;
+
+	while ((rtindex = bms_next_member(relids, rtindex)) >= 0)
+	{
+		RangeTblEntry *rte = list_nth_node(RangeTblEntry, rtable, rtindex - 1);
+
+		if (!(rte->rtekind == RTE_RELATION ||
+			  (rte->rtekind == RTE_SUBQUERY && OidIsValid(rte->relid))))
+			elog(ERROR, "LockRelids(): cannot lock relation at RT index %d",
+				 rtindex);
+
+		/*
+		 * Acquire the appropriate type of lock on each relation OID. Note
+		 * that we don't actually try to open the rel, and hence will not
+		 * fail if it's been dropped entirely --- we'll just transiently
+		 * acquire a non-conflicting lock.
+		 */
+		if (acquire)
+			LockRelationOid(rte->relid, rte->rellockmode);
+		else
+			UnlockRelationOid(rte->relid, rte->rellockmode);
+	}
+}
+
+/*
+ * AcquireExecutorLocksUnpruned
+ *		Acquire or release execution locks for only unpruned relations
+ *		referenced by the given PlannedStmts.
+ *
+ * On acquire, this:
+ *	- locks unprunable rels listed in PlannedStmt.unprunableRelids
+ *	- runs ExecutorPrep() to perform initial runtime pruning
+ *	- locks the surviving partitions reported in the prep estate
+ *	- appends the EState pointer for each PlannedStmt to cprep->prep_estates
+ *
+ * On release, it:
+ *	- looks up the EState for each PlannedStmt from cprep->prep_estates
+ *	  (which must already be populated)
+ *	- unlocks the same relations identified during acquire
+ *	- cleans up each EState
+ *
+ * prep_estates is extended during acquire and must match stmt_list one-to-one
+ * when releasing locks.  Memory allocation for EState happens in
+ * cprep->context.  Locks are acquired using cprep->owner.
+ */
+
+static void
+AcquireExecutorLocksUnpruned(List *stmt_list, bool acquire,
+							 CachedPlanPrepData *cprep)
+{
+	MemoryContext oldcontext;
+	ListCell   *lc1;
+	List	   *prep_estates;
+	ListCell   *prep_lc;
+
+	Assert(cprep);
+	oldcontext = MemoryContextSwitchTo(cprep->context);
+	/*
+	 * When releasing locks, use the EState list (if any) created during
+	 * acquisition to determine which relids to unlock. The list must match
+	 * the PlannedStmt list one-to-one.
+	 */
+	prep_estates = cprep->prep_estates;
+	Assert(acquire || list_length(prep_estates) == list_length(stmt_list));
+
+	prep_lc = list_head(prep_estates);
+	foreach(lc1, stmt_list)
+	{
+		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
+		EState *prep_estate;
+
+		if (plannedstmt->commandType == CMD_UTILITY)
+		{
+			/* Same as AcquireExecutorLocks(). */
+			Query	   *query = UtilityContainsQuery(plannedstmt->utilityStmt);
+
+			if (query)
+				ScanQueryForLocks(query, acquire);
+
+			/* Keep the list one-to-one with stmt_list. */
+			if (acquire)
+				cprep->prep_estates = lappend(cprep->prep_estates, NULL);
+			else
+				(void) next_prep_estate(prep_estates, &prep_lc);
+			continue;
+		}
+
+		/*
+		 * Lock tables mentioned in the original query and other unprunable
+		 * relations that were added to the plan via inheritance expansion.
+		 */
+		LockRelids(plannedstmt->rtable, plannedstmt->unprunableRelids, acquire);
+
+		/* Lock partitions surviving runtime initial pruning. */
+		if (acquire)
+		{
+			/*
+			 * Pruning expressions may call PL functions that require an active
+			 * snapshot (e.g., via EnsurePortalSnapshotExists()). Establish one
+			 * if needed.
+			 */
+			bool		snap_pushed = false;
+
+			if (!ActiveSnapshotSet())
+			{
+				PushActiveSnapshot(GetTransactionSnapshot());
+				snap_pushed = true;
+			}
+
+			prep_estate = ExecutorPrep(plannedstmt, cprep->params, cprep->owner, true,
+									   cprep->eflags);
+			Assert(prep_estate);
+			cprep->prep_estates = lappend(cprep->prep_estates, prep_estate);
+
+			if (snap_pushed)
+				PopActiveSnapshot();
+		}
+		else
+			prep_estate = next_prep_estate(prep_estates, &prep_lc);
+
+		if (prep_estate)
+		{
+			/*
+			 * es_unpruned_relids includes plannedstmt->unprunableRelids,
+			 * which we've already locked. Filter them out to avoid double-locking.
+			 */
+			Bitmapset *lock_relids = bms_difference(prep_estate->es_unpruned_relids,
+													plannedstmt->unprunableRelids);
+
+			/*
+			 * We must always include the first result relation of each
+			 * ModifyTable node in the plan, that is, the one mentioned in
+			 * plannedstmt->firstResultRels in the set of relations to be
+			 * locked to satisfy executor assumptions described
+			 * in ExecInitModifyTable().  This can be wasteful, because we
+			 * may not need to use the first result relation at all if other
+			 * result relations are unpruned and thus sufficient for the
+			 * ModifyTable node's needs.  Unfortunately, we don't have per-node
+			 * unpruned_relids set to determine that other result relations
+			 * are included.
+			 */
+			if (plannedstmt->resultRelations)
+			{
+				ListCell *lc2;
+
+				foreach(lc2, plannedstmt->firstResultRels)
+				{
+					Index       firstResultRel = lfirst_int(lc2);
+
+					if (!bms_is_member(firstResultRel, lock_relids))
+						lock_relids = bms_add_member(lock_relids, firstResultRel);
+				}
+			}
+
+			LockRelids(plannedstmt->rtable, lock_relids, acquire);
+			bms_free(lock_relids);
+		}
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+}
+
+/*
+ * CachedPlanPrepCleanup
+ *		Clean up EState built for a generic plan.
+ *
+ * This is used in the corner case where PrepAndCheckCachedPlan() discovers
+ * that a CachedPlan has become invalid after AcquireExecutorLocksUnpruned()
+ * has already run.  In that case we must both release the execution locks
+ * and dispose of the ExecPrep list stored in CachedPlanPrepData, since the
+ * executor will never see or clean it up.
+ */
+static void
+CachedPlanPrepCleanup(CachedPlanPrepData *cprep)
+{
+	ListCell   *lc;
+
+	if (cprep == NULL)
+		return;
+
+	foreach(lc, cprep->prep_estates)
+	{
+		EState *prep_estate = (EState *) lfirst(lc);
+
+		if (prep_estate == NULL)
+			continue;
+
+		ExecCloseRangeTableRelations(prep_estate);
+		FreeExecutorState(prep_estate);
+	}
+
+	list_free(cprep->prep_estates);
+	cprep->prep_estates = NIL;
+}
+
 /*
  * AcquirePlannerLocks: acquire locks needed for planning of a querytree list;
  * or release them if acquire is false.
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index c175ee95b68..989b3c73691 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -217,6 +217,9 @@ typedef struct PlannerGlobal
 	/* "flat" list of integer RT indexes */
 	List	   *resultRelations;
 
+	/* "flat" list of integer RT indexes (one per ModifyTable node) */
+	List	   *firstResultRels;
+
 	/* "flat" list of AppendRelInfos */
 	List	   *appendRelations;
 
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 8c9321aab8c..1431f12a6e8 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -123,6 +123,16 @@ typedef struct PlannedStmt
 	/* integer list of RT indexes, or NIL */
 	List	   *resultRelations;
 
+	/*
+	 * rtable indexes of first target relation in each ModifyTable node in the
+	 * plan for INSERT/UPDATE/DELETE/MERGE.  NIL if resultRelations is NIL.
+	 *
+	 * These are used by AcquireExecutorLocksUnpruned() to ensure that the
+	 * first result rel for each ModifyTable remains locked even if pruned;
+	 * see ExecInitModifyTable() for the executor side assumptions.
+	 */
+	List	   *firstResultRels;
+
 	/* list of AppendRelInfo nodes */
 	List	   *appendRelations;
 
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 984c51515c6..da3ce9f3177 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -197,6 +197,32 @@ typedef struct CachedExpression
 } CachedExpression;
 
 
+/*
+ * CachedPlanPrepData
+ *      Carries ExecutorPrep results for each PlannedStmt in a CachedPlan,
+ *      along with context and owner information needed to allocate them.
+ *
+ * prep_estates is indexed one-to-one with CachedPlan->stmt_list, and is
+ * populated when GetCachedPlan() prepares a reused generic plan.  If the
+ * plan is found invalid after locking, the same list is used to determine
+ * which relations to unlock before retrying.
+ *
+ * ExecutorPrep state is allocated in 'context' and owned by 'owner'.
+ *
+ * eflags controls ExecutorPrep() behavior during initial pruning.
+ * Normally zero; set EXEC_FLAG_EXPLAIN_GENERIC to suppress pruning
+ * in EXPLAIN (GENERIC_PLAN).  Need not match the eflags later passed
+ * to ExecutorStart().
+ */
+typedef struct CachedPlanPrepData
+{
+	List   *prep_estates;	/* one EState per PlannedStmt, or NULL */
+	ParamListInfo params;	/* params visible to ExecutorPrep */
+	MemoryContext context;	/* where to allocate EState and its fields */
+	ResourceOwner owner;	/* ResourceOwner for ExecutorPrep state */
+	int		eflags;			/* executor flags to control ExecutorPrep */
+} CachedPlanPrepData;
+
 extern void InitPlanCache(void);
 extern void ResetPlanCache(void);
 
@@ -240,7 +266,8 @@ extern List *CachedPlanGetTargetList(CachedPlanSource *plansource,
 extern CachedPlan *GetCachedPlan(CachedPlanSource *plansource,
 								 ParamListInfo boundParams,
 								 ResourceOwner owner,
-								 QueryEnvironment *queryEnv);
+								 QueryEnvironment *queryEnv,
+								 CachedPlanPrepData *cprep);
 extern void ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner);
 
 extern bool CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
diff --git a/src/test/regress/expected/partition_prune.out b/src/test/regress/expected/partition_prune.out
index 39dab8fcc05..39770f3b6d6 100644
--- a/src/test/regress/expected/partition_prune.out
+++ b/src/test/regress/expected/partition_prune.out
@@ -4860,9 +4860,7 @@ select c.relname
    relname    
 --------------
  prunelock_p1
- prunelock_p2
- prunelock_p3
-(3 rows)
+(1 row)
 
 commit;
 deallocate prunelock_q;
@@ -4904,6 +4902,50 @@ select c.relname
 
 commit;
 deallocate prunelock_q;
+reset enable_partition_pruning;
+--
+-- Verify firstResultRels handling with multiple ModifyTable nodes
+-- (writable CTEs) targeting a partitioned table.  When a pruning
+-- parameter matches no partition, all result relations are pruned
+-- and the executor must still find a usable first result relation
+-- for each ModifyTable node.
+--
+prepare prunelock_mt_q (int, int) as
+  with upd1 as (update prunelock_p set a = a),
+       upd2 as (update prunelock_p set a = a where a = $2)
+  update prunelock_p set a = a where a = $1;
+-- Force generic plan creation
+explain (costs off) execute prunelock_mt_q(1, 2);
+                         QUERY PLAN                         
+------------------------------------------------------------
+ Update on prunelock_p
+   Update on prunelock_p1 prunelock_p_1
+   CTE upd1
+     ->  Update on prunelock_p prunelock_p_3
+           Update on prunelock_p1 prunelock_p_4
+           Update on prunelock_p2 prunelock_p_5
+           Update on prunelock_p3 prunelock_p_6
+           ->  Append
+                 ->  Seq Scan on prunelock_p1 prunelock_p_4
+                 ->  Seq Scan on prunelock_p2 prunelock_p_5
+                 ->  Seq Scan on prunelock_p3 prunelock_p_6
+   CTE upd2
+     ->  Update on prunelock_p prunelock_p_7
+           Update on prunelock_p2 prunelock_p_8
+           ->  Append
+                 Subplans Removed: 2
+                 ->  Seq Scan on prunelock_p2 prunelock_p_8
+                       Filter: (a = $2)
+   ->  Append
+         Subplans Removed: 2
+         ->  Seq Scan on prunelock_p1 prunelock_p_1
+               Filter: (a = $1)
+(22 rows)
+
+-- All partitions pruned: value 4 matches no partition, so each
+-- ModifyTable must still initialize correctly with no matching
+-- result relations.
+execute prunelock_mt_q(4, 5);
+deallocate prunelock_mt_q;
 drop table prunelock_p;
 reset plan_cache_mode;
-reset enable_partition_pruning;
diff --git a/src/test/regress/expected/plancache.out b/src/test/regress/expected/plancache.out
index 4e59188196c..1d69ab0a1c2 100644
--- a/src/test/regress/expected/plancache.out
+++ b/src/test/regress/expected/plancache.out
@@ -398,3 +398,65 @@ select name, generic_plans, custom_plans from pg_prepared_statements
 (1 row)
 
 drop table test_mode;
+-- Test invalidation of a generic plan during pruning-aware lock setup.
+-- The pruning expression uses a stable SQL function that calls a volatile
+-- plpgsql function.  That function performs DDL on a partition when a
+-- separate "signal" table says to do so.  The second EXECUTE should
+-- replan cleanly after the DDL.
+set plan_cache_mode to force_generic_plan;
+create table inval_during_pruning_p (a int) partition by list (a);
+create table inval_during_pruning_p1 partition of inval_during_pruning_p for values in (1);
+create table inval_during_pruning_p2 partition of inval_during_pruning_p for values in (2);
+insert into inval_during_pruning_p values (1), (2);
+create table inval_during_pruning_signal (create_idx bool not null);
+insert into inval_during_pruning_signal values (false);
+create or replace function invalidate_plancache_func() returns int
+as $$
+declare
+	create_index bool;
+begin
+	-- Perform DDL on a partition if asked to
+	select create_idx into create_index from inval_during_pruning_signal for update;
+	if create_index = true then
+		raise notice 'creating index on partition inval_during_pruning_p1';
+		create index on inval_during_pruning_p1 (a);
+		update inval_during_pruning_signal set create_idx = false;
+	end if;
+	-- pruning parameter
+	return 1;
+end;
+$$ language plpgsql volatile;
+create or replace function stable_pruning_val() returns int as $$
+	select invalidate_plancache_func();
+$$ language sql stable;
+prepare inval_during_pruning_q as select * from inval_during_pruning_p where a = stable_pruning_val();
+-- Build a generic plan and run pruning once, but don't set the signal
+-- for invalidate_plancache_func() to perform the DDL.
+explain (verbose, costs off) execute inval_during_pruning_q;
+                                QUERY PLAN                                 
+---------------------------------------------------------------------------
+ Append
+   Subplans Removed: 1
+   ->  Seq Scan on public.inval_during_pruning_p1 inval_during_pruning_p_1
+         Output: inval_during_pruning_p_1.a
+         Filter: (inval_during_pruning_p_1.a = stable_pruning_val())
+(5 rows)
+
+-- Reuse the generic plan.  Make invalidate_plancache_func() perform DDL
+-- during this execution, which should force replanning without errors.
+update inval_during_pruning_signal set create_idx = true;
+explain (verbose, costs off) execute inval_during_pruning_q;
+NOTICE:  creating index on partition inval_during_pruning_p1
+                                QUERY PLAN                                 
+---------------------------------------------------------------------------
+ Append
+   Subplans Removed: 1
+   ->  Seq Scan on public.inval_during_pruning_p1 inval_during_pruning_p_1
+         Output: inval_during_pruning_p_1.a
+         Filter: (inval_during_pruning_p_1.a = stable_pruning_val())
+(5 rows)
+
+drop table inval_during_pruning_p, inval_during_pruning_signal;
+drop function invalidate_plancache_func, stable_pruning_val;
+deallocate inval_during_pruning_q;
+reset plan_cache_mode;
diff --git a/src/test/regress/sql/partition_prune.sql b/src/test/regress/sql/partition_prune.sql
index 229c5eb370c..87672ad40f7 100644
--- a/src/test/regress/sql/partition_prune.sql
+++ b/src/test/regress/sql/partition_prune.sql
@@ -1499,6 +1499,28 @@ select c.relname
 commit;
 
 deallocate prunelock_q;
+reset enable_partition_pruning;
+
+--
+-- Verify firstResultRels handling with multiple ModifyTable nodes
+-- (writable CTEs) targeting a partitioned table.  When a pruning
+-- parameter matches no partition, all result relations are pruned
+-- and the executor must still find a usable first result relation
+-- for each ModifyTable node.
+--
+prepare prunelock_mt_q (int, int) as
+  with upd1 as (update prunelock_p set a = a),
+       upd2 as (update prunelock_p set a = a where a = $2)
+  update prunelock_p set a = a where a = $1;
+
+-- Force generic plan creation
+explain (costs off) execute prunelock_mt_q(1, 2);
+
+-- All partitions pruned: value 4 matches no partition, so each
+-- ModifyTable must still initialize correctly with no matching
+-- result relations.
+execute prunelock_mt_q(4, 5);
+
+deallocate prunelock_mt_q;
 drop table prunelock_p;
 reset plan_cache_mode;
-reset enable_partition_pruning;
diff --git a/src/test/regress/sql/plancache.sql b/src/test/regress/sql/plancache.sql
index 4b2f11dcc64..139b4688fd6 100644
--- a/src/test/regress/sql/plancache.sql
+++ b/src/test/regress/sql/plancache.sql
@@ -223,3 +223,54 @@ select name, generic_plans, custom_plans from pg_prepared_statements
   where  name = 'test_mode_pp';
 
 drop table test_mode;
+
+-- Test invalidation of a generic plan during pruning-aware lock setup.
+-- The pruning expression uses a stable SQL function that calls a volatile
+-- plpgsql function.  That function performs DDL on a partition when a
+-- separate "signal" table says to do so.  The second EXECUTE should
+-- replan cleanly after the DDL.
+set plan_cache_mode to force_generic_plan;
+create table inval_during_pruning_p (a int) partition by list (a);
+create table inval_during_pruning_p1 partition of inval_during_pruning_p for values in (1);
+create table inval_during_pruning_p2 partition of inval_during_pruning_p for values in (2);
+insert into inval_during_pruning_p values (1), (2);
+
+create table inval_during_pruning_signal (create_idx bool not null);
+insert into inval_during_pruning_signal values (false);
+create or replace function invalidate_plancache_func() returns int
+as $$
+declare
+	create_index bool;
+begin
+	-- Perform DDL on a partition if asked to
+	select create_idx into create_index from inval_during_pruning_signal for update;
+	if create_index = true then
+		raise notice 'creating index on partition inval_during_pruning_p1';
+		create index on inval_during_pruning_p1 (a);
+		update inval_during_pruning_signal set create_idx = false;
+	end if;
+	-- pruning parameter
+	return 1;
+end;
+$$ language plpgsql volatile;
+
+create or replace function stable_pruning_val() returns int as $$
+	select invalidate_plancache_func();
+$$ language sql stable;
+
+prepare inval_during_pruning_q as select * from inval_during_pruning_p where a = stable_pruning_val();
+
+-- Build a generic plan and run pruning once, but don't set the signal
+-- for invalidate_plancache_func() to perform the DDL.
+explain (verbose, costs off) execute inval_during_pruning_q;
+
+-- Reuse the generic plan.  Make invalidate_plancache_func() perform DDL
+-- during this execution, which should force replanning without errors.
+update inval_during_pruning_signal set create_idx = true;
+explain (verbose, costs off) execute inval_during_pruning_q;
+
+drop table inval_during_pruning_p, inval_during_pruning_signal;
+drop function invalidate_plancache_func, stable_pruning_val;
+deallocate inval_during_pruning_q;
+
+reset plan_cache_mode;
-- 
2.47.3
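
The CachedPlanPrepData comment and the inval_during_pruning test above describe a prune, lock, recheck sequence: if the plan turns out to be invalid once the locks are held, exactly the locks just taken are released and the plan is rebuilt. A minimal standalone sketch of that control flow follows. It is a toy model only: ToyCachedPlan and toy_get_cached_plan() are hypothetical names, not PostgreSQL types or functions, and real lock management and planning are not modeled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy stand-in for a cached generic plan; not a PostgreSQL type. */
typedef struct ToyCachedPlan
{
	bool		is_valid;		/* cleared when an invalidation arrives */
	int			generation;		/* bumped each time the plan is rebuilt */
	int			nlocks_held;	/* partition locks currently held */
} ToyCachedPlan;

/*
 * Try to reuse the cached plan.  'nsurvivors' is how many partitions
 * survive initial pruning; 'ddl_during_pruning' simulates a pruning
 * expression whose evaluation invalidates the plan.
 *
 * Returns true if the cached plan was reused, false if it was rebuilt.
 */
bool
toy_get_cached_plan(ToyCachedPlan *plan, int nsurvivors,
					bool ddl_during_pruning)
{
	assert(nsurvivors >= 0);

	if (plan->is_valid)
	{
		int			nlocked = nsurvivors;

		/* Initial pruning runs first; evaluating it may invalidate the plan. */
		if (ddl_during_pruning)
			plan->is_valid = false;

		/* Lock only the partitions that survived pruning. */
		plan->nlocks_held += nlocked;

		/* Recheck validity now that the locks are held. */
		if (plan->is_valid)
			return true;

		/* Race case: release exactly the locks taken above. */
		plan->nlocks_held -= nlocked;
	}

	/* Rebuild; locks taken by planning itself are not modeled here. */
	plan->generation++;
	plan->is_valid = true;
	return false;
}
```

Calling toy_get_cached_plan() with ddl_during_pruning set to false reuses the plan and leaves the survivor locks held; calling it with true drops those locks again and bumps the generation, which mirrors the second EXPLAIN EXECUTE in the test replanning cleanly.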



  [application/octet-stream] v6-0003-Add-test-for-partition-lock-behavior-with-generic.patch (5.3K, 3-v6-0003-Add-test-for-partition-lock-behavior-with-generic.patch)
From 58179bd0d3730dbd1fdbb0bd9c624dc7ae770830 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Tue, 10 Feb 2026 22:00:32 +0900
Subject: [PATCH v6 3/6] Add test for partition lock behavior with generic
 cached plans

Add a regression test that inspects pg_locks to verify which child
partitions are locked when executing a prepared statement that uses
a generic cached plan.

Two cases are tested: one with enable_partition_pruning on and one
with it off.  Currently both cases lock all child partitions, because
GetCachedPlan() acquires execution locks on every relation in the
plan regardless of pruning.

A subsequent commit that adds pruning-aware locking will update the
expected output for the pruning-enabled case, showing that only the
surviving partition is locked.
---
 src/test/regress/expected/partition_prune.out | 83 +++++++++++++++++++
 src/test/regress/sql/partition_prune.sql      | 55 ++++++++++++
 2 files changed, 138 insertions(+)

diff --git a/src/test/regress/expected/partition_prune.out b/src/test/regress/expected/partition_prune.out
index deacdd75807..39dab8fcc05 100644
--- a/src/test/regress/expected/partition_prune.out
+++ b/src/test/regress/expected/partition_prune.out
@@ -4824,3 +4824,86 @@ select min(a) over (partition by a order by a) from part_abc where a >= stable_o
 
 drop view part_abc_view;
 drop table part_abc;
+--
+-- Verify that pruning-aware locking skips pruned partitions
+-- when reusing a generic cached plan.
+--
+set plan_cache_mode to force_generic_plan;
+create table prunelock_p (a int) partition by list (a);
+create table prunelock_p1 partition of prunelock_p for values in (1);
+create table prunelock_p2 partition of prunelock_p for values in (2);
+create table prunelock_p3 partition of prunelock_p for values in (3);
+prepare prunelock_q (int) as select * from prunelock_p where a = $1;
+-- Force generic plan creation
+explain (costs off) execute prunelock_q(1);
+                  QUERY PLAN                  
+----------------------------------------------
+ Append
+   Subplans Removed: 2
+   ->  Seq Scan on prunelock_p1 prunelock_p_1
+         Filter: (a = $1)
+(4 rows)
+
+-- Execute and check which child partitions are locked
+begin;
+execute prunelock_q(1);
+ a 
+---
+(0 rows)
+
+select c.relname
+  from pg_locks l
+  join pg_class c on c.oid = l.relation
+ where l.pid = pg_backend_pid()
+   and c.relname like 'prunelock_p_'
+ order by c.relname;
+   relname    
+--------------
+ prunelock_p1
+ prunelock_p2
+ prunelock_p3
+(3 rows)
+
+commit;
+deallocate prunelock_q;
+-- Turn pruning off
+set enable_partition_pruning to off;
+prepare prunelock_q (int) as select * from prunelock_p where a = $1;
+-- Force generic plan creation
+explain (costs off) execute prunelock_q(1);
+                  QUERY PLAN                  
+----------------------------------------------
+ Append
+   ->  Seq Scan on prunelock_p1 prunelock_p_1
+         Filter: (a = $1)
+   ->  Seq Scan on prunelock_p2 prunelock_p_2
+         Filter: (a = $1)
+   ->  Seq Scan on prunelock_p3 prunelock_p_3
+         Filter: (a = $1)
+(7 rows)
+
+-- Execute and check which child partitions are locked
+begin;
+execute prunelock_q(1);
+ a 
+---
+(0 rows)
+
+select c.relname
+  from pg_locks l
+  join pg_class c on c.oid = l.relation
+ where l.pid = pg_backend_pid()
+   and c.relname like 'prunelock_p_'
+ order by c.relname;
+   relname    
+--------------
+ prunelock_p1
+ prunelock_p2
+ prunelock_p3
+(3 rows)
+
+commit;
+deallocate prunelock_q;
+drop table prunelock_p;
+reset plan_cache_mode;
+reset enable_partition_pruning;
diff --git a/src/test/regress/sql/partition_prune.sql b/src/test/regress/sql/partition_prune.sql
index d93c0c03bab..229c5eb370c 100644
--- a/src/test/regress/sql/partition_prune.sql
+++ b/src/test/regress/sql/partition_prune.sql
@@ -1447,3 +1447,58 @@ select min(a) over (partition by a order by a) from part_abc where a >= stable_o
 
 drop view part_abc_view;
 drop table part_abc;
+
+--
+-- Verify that pruning-aware locking skips pruned partitions
+-- when reusing a generic cached plan.
+--
+set plan_cache_mode to force_generic_plan;
+
+create table prunelock_p (a int) partition by list (a);
+create table prunelock_p1 partition of prunelock_p for values in (1);
+create table prunelock_p2 partition of prunelock_p for values in (2);
+create table prunelock_p3 partition of prunelock_p for values in (3);
+
+prepare prunelock_q (int) as select * from prunelock_p where a = $1;
+
+-- Force generic plan creation
+explain (costs off) execute prunelock_q(1);
+
+-- Execute and check which child partitions are locked
+begin;
+execute prunelock_q(1);
+
+select c.relname
+  from pg_locks l
+  join pg_class c on c.oid = l.relation
+ where l.pid = pg_backend_pid()
+   and c.relname like 'prunelock_p_'
+ order by c.relname;
+commit;
+
+deallocate prunelock_q;
+
+-- Turn pruning off
+set enable_partition_pruning to off;
+
+prepare prunelock_q (int) as select * from prunelock_p where a = $1;
+
+-- Force generic plan creation
+explain (costs off) execute prunelock_q(1);
+
+-- Execute and check which child partitions are locked
+begin;
+execute prunelock_q(1);
+
+select c.relname
+  from pg_locks l
+  join pg_class c on c.oid = l.relation
+ where l.pid = pg_backend_pid()
+   and c.relname like 'prunelock_p_'
+ order by c.relname;
+commit;
+
+deallocate prunelock_q;
+drop table prunelock_p;
+reset plan_cache_mode;
+reset enable_partition_pruning;
-- 
2.47.3
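
The regression test in this patch counts pg_locks rows for the child partitions: three before pruning-aware locking, and (per the commit message) only the surviving one afterwards. The difference between the two policies can be sketched with a small standalone model. Everything here is hypothetical (ToyPartition, toy_lock_all(), toy_lock_unpruned(), toy_unlock_all()); it assumes a list-partitioned table where each partition accepts exactly one value, as in the prunelock_p setup, and it does not touch any real lock manager.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model: one list partition accepting a single value. */
typedef struct ToyPartition
{
	const char *name;
	int			accepted_value;
	bool		locked;
} ToyPartition;

/* Lock every partition, regardless of pruning.  Returns locks taken. */
int
toy_lock_all(ToyPartition *parts, size_t nparts)
{
	int			nlocked = 0;

	assert(parts != NULL || nparts == 0);
	for (size_t i = 0; i < nparts; i++)
	{
		parts[i].locked = true;
		nlocked++;
	}
	return nlocked;
}

/*
 * Lock only the partitions that survive "initial" pruning for the
 * condition a = param.  Returns locks taken, which may be zero.
 */
int
toy_lock_unpruned(ToyPartition *parts, size_t nparts, int param)
{
	int			nlocked = 0;

	assert(parts != NULL || nparts == 0);
	for (size_t i = 0; i < nparts; i++)
	{
		if (parts[i].accepted_value == param)
		{
			parts[i].locked = true;
			nlocked++;
		}
	}
	return nlocked;
}

/* Release all locks, as a transaction end would. */
void
toy_unlock_all(ToyPartition *parts, size_t nparts)
{
	for (size_t i = 0; i < nparts; i++)
		parts[i].locked = false;
}
```

With three partitions accepting 1, 2 and 3, toy_lock_all() takes three locks while toy_lock_unpruned() with param 1 takes one, and with a param that matches no partition it takes none.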



  [application/octet-stream] v6-0006-Reuse-partition-pruning-results-in-parallel-worke.patch (15.9K, 4-v6-0006-Reuse-partition-pruning-results-in-parallel-worke.patch)
From dc2cfc32410792b3f00422c07623f989901ee34b Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Tue, 11 Nov 2025 22:17:47 +0900
Subject: [PATCH v6 6/6] Reuse partition pruning results in parallel workers

Pass the leader's initial partition pruning results and unpruned
relids to parallel workers and reuse them via ExecutorPrep(). This
avoids repeating pruning logic in workers, which is not only
redundant but also risks divergence due to nondeterminism in pruning
steps or parameter evaluation timing.

Introduce CheckInitialPruningResultsInWorker() (debug builds only)
to verify that the results match what the worker would compute. This
check helps catch inconsistencies across leader and worker pruning
logic.
---
 src/backend/executor/execParallel.c | 108 +++++++++++++++++++++++++++-
 src/backend/utils/cache/plancache.c |  95 +++++++-----------------
 2 files changed, 133 insertions(+), 70 deletions(-)

diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index 024780d3516..d337bf8c081 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -24,6 +24,7 @@
 #include "postgres.h"
 
 #include "executor/execParallel.h"
+#include "executor/execPartition.h"
 #include "executor/executor.h"
 #include "executor/nodeAgg.h"
 #include "executor/nodeAppend.h"
@@ -67,6 +68,8 @@
 #define PARALLEL_KEY_QUERY_TEXT		UINT64CONST(0xE000000000000008)
 #define PARALLEL_KEY_JIT_INSTRUMENTATION UINT64CONST(0xE000000000000009)
 #define PARALLEL_KEY_WAL_USAGE			UINT64CONST(0xE00000000000000A)
+#define PARALLEL_KEY_PARTITION_PRUNE_RESULTS	UINT64CONST(0xE00000000000000B)
+#define PARALLEL_KEY_UNPRUNED_RELIDS	UINT64CONST(0xE00000000000000C)
 
 #define PARALLEL_TUPLE_QUEUE_SIZE		65536
 
@@ -141,6 +144,8 @@ static bool ExecParallelRetrieveInstrumentation(PlanState *planstate,
 /* Helper function that runs in the parallel worker. */
 static DestReceiver *ExecParallelGetReceiver(dsm_segment *seg, shm_toc *toc);
 
+static void CheckInitialPruningResultsInWorker(EState *estate);
+
 /*
  * Create a serialized representation of the plan to be sent to each worker.
  */
@@ -620,12 +625,18 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	FixedParallelExecutorState *fpes;
 	char	   *pstmt_data;
 	char	   *pstmt_space;
+	char	   *part_prune_results_data;
+	char	   *part_prune_results_space;
+	char	   *unpruned_relids_data;
+	char	   *unpruned_relids_space;
 	char	   *paramlistinfo_space;
 	BufferUsage *bufusage_space;
 	WalUsage   *walusage_space;
 	SharedExecutorInstrumentation *instrumentation = NULL;
 	SharedJitInstrumentation *jit_instrumentation = NULL;
 	int			pstmt_len;
+	int			part_prune_results_len;
+	int			unpruned_relids_len;
 	int			paramlistinfo_len;
 	int			instrumentation_len = 0;
 	int			jit_instrumentation_len = 0;
@@ -654,6 +665,8 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 
 	/* Fix up and serialize plan to be sent to workers. */
 	pstmt_data = ExecSerializePlan(planstate->plan, estate);
+	part_prune_results_data = nodeToString(estate->es_part_prune_results);
+	unpruned_relids_data = nodeToString(estate->es_unpruned_relids);
 
 	/* Create a parallel context. */
 	pcxt = CreateParallelContext("postgres", "ParallelQueryMain", nworkers);
@@ -680,6 +693,16 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	shm_toc_estimate_chunk(&pcxt->estimator, pstmt_len);
 	shm_toc_estimate_keys(&pcxt->estimator, 1);
 
+	/* Estimate space for serialized part_prune_results. */
+	part_prune_results_len = strlen(part_prune_results_data) + 1;
+	shm_toc_estimate_chunk(&pcxt->estimator, part_prune_results_len);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Estimate space for serialized unpruned_relids. */
+	unpruned_relids_len = strlen(unpruned_relids_data) + 1;
+	shm_toc_estimate_chunk(&pcxt->estimator, unpruned_relids_len);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
 	/* Estimate space for serialized ParamListInfo. */
 	paramlistinfo_len = EstimateParamListSpace(estate->es_param_list_info);
 	shm_toc_estimate_chunk(&pcxt->estimator, paramlistinfo_len);
@@ -781,6 +804,16 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	memcpy(pstmt_space, pstmt_data, pstmt_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PLANNEDSTMT, pstmt_space);
 
+	/* Store serialized part_prune_results */
+	part_prune_results_space = shm_toc_allocate(pcxt->toc, part_prune_results_len);
+	memcpy(part_prune_results_space, part_prune_results_data, part_prune_results_len);
+	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARTITION_PRUNE_RESULTS, part_prune_results_space);
+
+	/* Store serialized unpruned_relids */
+	unpruned_relids_space = shm_toc_allocate(pcxt->toc, unpruned_relids_len);
+	memcpy(unpruned_relids_space, unpruned_relids_data, unpruned_relids_len);
+	shm_toc_insert(pcxt->toc, PARALLEL_KEY_UNPRUNED_RELIDS, unpruned_relids_space);
+
 	/* Store serialized ParamListInfo. */
 	paramlistinfo_space = shm_toc_allocate(pcxt->toc, paramlistinfo_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARAMLISTINFO, paramlistinfo_space);
@@ -1280,10 +1313,15 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 						 int instrument_options)
 {
 	char	   *pstmtspace;
+	char	   *part_prune_results_space;
+	char	   *unpruned_relids_space;
 	char	   *paramspace;
 	PlannedStmt *pstmt;
+	List	   *part_prune_results;
+	Bitmapset  *unpruned_relids;
 	ParamListInfo paramLI;
 	char	   *queryString;
+	EState	   *prep_estate = NULL;
 
 	/* Get the query string from shared memory */
 	queryString = shm_toc_lookup(toc, PARALLEL_KEY_QUERY_TEXT, false);
@@ -1296,12 +1334,80 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 	paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
 	paramLI = RestoreParamList(&paramspace);
 
+	/* Reconstruct leader-supplied part_prune_results and unpruned_relids. */
+	part_prune_results_space =
+		shm_toc_lookup(toc, PARALLEL_KEY_PARTITION_PRUNE_RESULTS, false);
+	part_prune_results = (List *) stringToNode(part_prune_results_space);
+	unpruned_relids_space =
+		shm_toc_lookup(toc, PARALLEL_KEY_UNPRUNED_RELIDS, false);
+	unpruned_relids = (Bitmapset *) stringToNode(unpruned_relids_space);
+
+	/*
+	 * If pruning was done in the leader, build a prep estate in the worker
+	 * and inject the leader's pruning results into it for reuse.
+	 */
+	if (pstmt->partPruneInfos)
+	{
+		prep_estate = ExecutorPrep(pstmt, paramLI, CurrentResourceOwner, false, 0);
+		Assert(prep_estate);
+
+		prep_estate->es_part_prune_results = part_prune_results;
+		prep_estate->es_unpruned_relids =
+			bms_add_members(prep_estate->es_unpruned_relids,
+							unpruned_relids);
+
+		/*
+		 * A debug-build-only check that the pruning results passed from the
+		 * leader match what the worker would independently compute.
+		 */
+		CheckInitialPruningResultsInWorker(prep_estate);
+	}
+
 	/* Create a QueryDesc for the query. */
 	return CreateQueryDesc(pstmt,
 						   queryString,
 						   GetActiveSnapshot(), InvalidSnapshot,
 						   receiver, paramLI, NULL, instrument_options,
-						   NULL);
+						   prep_estate);
+}
+
+/*
+ * CheckInitialPruningResultsInWorker
+ *		Verify partition pruning results passed from the leader process.
+ *
+ * This is intended to be called during parallel worker query setup.
+ * It recomputes initial pruning results locally and compares them with
+ * those received from the leader. Any mismatch may indicate a divergence
+ * between leader and worker logic or environment.
+ *
+ * Only performed in debug builds.
+ */
+static void
+CheckInitialPruningResultsInWorker(EState *estate)
+{
+#ifdef USE_ASSERT_CHECKING
+	ListCell   *lc;
+	int			i;
+
+	Assert(estate->es_part_prune_results != NULL);
+	i = 0;
+	foreach(lc, estate->es_part_prune_states)
+	{
+		PartitionPruneState *prunestate = (PartitionPruneState *) lfirst(lc);
+		Bitmapset *reuse_validsubplans =
+				list_nth_node(Bitmapset, estate->es_part_prune_results, i++);
+		Bitmapset  *validsubplans = NULL;
+		Bitmapset  *validsubplan_rtis = NULL;
+
+		if (prunestate->do_initial_prune)
+			validsubplans = ExecFindMatchingSubPlans(prunestate, true,
+													 &validsubplan_rtis);
+		if (!bms_equal(validsubplans, reuse_validsubplans))
+			elog(ERROR, "different validsubplans in parallel worker");
+		if (bms_nonempty_difference(validsubplan_rtis, estate->es_unpruned_relids))
+			elog(ERROR, "different unpruned_relids in parallel worker");
+	}
+#endif
 }
 
 /*
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index be2a961a918..1d3244307da 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -93,14 +93,14 @@ static bool StmtPlanRequiresRevalidation(CachedPlanSource *plansource);
 static bool BuildingPlanRequiresSnapshot(CachedPlanSource *plansource);
 static List *RevalidateCachedQuery(CachedPlanSource *plansource,
 								   QueryEnvironment *queryEnv);
-static bool PrepAndCheckCachedPlan(CachedPlanSource *plansource, CachedPlanPrepData *cprep);
+static bool CheckCachedPlan(CachedPlanSource *plansource, CachedPlanPrepData *cprep);
 static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 								   ParamListInfo boundParams, QueryEnvironment *queryEnv);
 static bool choose_custom_plan(CachedPlanSource *plansource,
 							   ParamListInfo boundParams);
 static double cached_plan_cost(CachedPlan *plan, bool include_planner);
 static Query *QueryListGetPrimaryStmt(List *stmts);
-static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static void AcquireExecutorLocksAll(List *stmt_list, bool acquire);
 static void AcquireExecutorLocksUnpruned(List *stmt_list, bool acquire,
 										 CachedPlanPrepData *cprep);
 static void CachedPlanPrepCleanup(CachedPlanPrepData *cprep);
@@ -142,26 +142,6 @@ ResourceOwnerForgetPlanCacheRef(ResourceOwner owner, CachedPlan *plan)
 /* GUC parameter */
 int			plan_cache_mode = PLAN_CACHE_MODE_AUTO;
 
-/*
- * Lock acquisition policy for execution locks.
- *
- * LOCK_ALL acquires locks on all relations mentioned in the plan,
- * reproducing the behavior of AcquireExecutorLocks().
- *
- * LOCK_UNPRUNED restricts locking to only the unpruned relations. That
- * includes those mentioned in PlannedStmt.unprunableRelids and the leaf
- * partitions remaining after performing initial pruning.
- */
-typedef enum LockPolicy
-{
-	LOCK_ALL,
-	LOCK_UNPRUNED,
-} LockPolicy;
-
-static void AcquireExecutorLocksWithPolicy(List *stmt_list,
-										   LockPolicy policy, bool acquire,
-										   CachedPlanPrepData *cprep);
-
 /*
  * InitPlanCache: initialize module during InitPostgres.
  *
@@ -963,7 +943,7 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
 }
 
 /*
- * PrepAndCheckCachedPlan: see if the CachedPlanSource's generic plan is valid.
+ * CheckCachedPlan: see if the CachedPlanSource's generic plan is valid.
  *
  * If 'cprep' is not NULL, ExecutorPrep() is applied to each PlannedStmt to
  * compute the set of partitions that survive initial runtime pruning in order
@@ -977,7 +957,7 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
  * (We must do this for the "true" result to be race-condition-free.)
  */
 static bool
-PrepAndCheckCachedPlan(CachedPlanSource *plansource, CachedPlanPrepData *cprep)
+CheckCachedPlan(CachedPlanSource *plansource, CachedPlanPrepData *cprep)
 {
 	CachedPlan *plan = plansource->gplan;
 
@@ -1005,15 +985,16 @@ PrepAndCheckCachedPlan(CachedPlanSource *plansource, CachedPlanPrepData *cprep)
 	 */
 	if (plan->is_valid)
 	{
-		LockPolicy policy = !cprep ? LOCK_ALL : LOCK_UNPRUNED;
-
 		/*
 		 * Plan must have positive refcount because it is referenced by
 		 * plansource; so no need to fear it disappears under us here.
 		 */
 		Assert(plan->refcount > 0);
 
-		AcquireExecutorLocksWithPolicy(plan->stmt_list, policy, true, cprep);
+		if (cprep)
+			AcquireExecutorLocksUnpruned(plan->stmt_list, true, cprep);
+		else
+			AcquireExecutorLocksAll(plan->stmt_list, true);
 
 		/*
 		 * If plan was transient, check to see if TransactionXmin has
@@ -1035,7 +1016,10 @@ PrepAndCheckCachedPlan(CachedPlanSource *plansource, CachedPlanPrepData *cprep)
 		}
 
 		/* Oops, the race case happened.  Release useless locks. */
-		AcquireExecutorLocksWithPolicy(plan->stmt_list, policy, false, cprep);
+		if (cprep)
+			AcquireExecutorLocksUnpruned(plan->stmt_list, false, cprep);
+		else
+			AcquireExecutorLocksAll(plan->stmt_list, false);
 
 		/* Also clean up ExecutorPrep() state, if necessary. */
 		CachedPlanPrepCleanup(cprep);
@@ -1358,7 +1342,7 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 	{
 		if (cprep)
 			cprep->params = boundParams;
-		if (PrepAndCheckCachedPlan(plansource, cprep))
+		if (CheckCachedPlan(plansource, cprep))
 		{
 			/* We want a generic plan, and we already have a valid one */
 			plan = plansource->gplan;
@@ -1945,43 +1929,13 @@ QueryListGetPrimaryStmt(List *stmts)
 }
 
 /*
- * AcquireExecutorLocksWithPolicy
- *		Acquire or release execution locks for a cached plan according to
- *		the specified policy.
- *
- * LOCK_ALL reproduces AcquireExecutorLocks(), locking every relation in
- * each PlannedStmt's rtable.  LOCK_UNPRUNED restricts locking to the
- * unprunable rels and partitions that survive initial runtime pruning.
- *
- * When LOCK_UNPRUNED is used on acquire, ExecutorPrep() is invoked for
- * each PlannedStmt and the resulting EStates are appended to
- * cprep->prep_estates in cprep->context.  On release, the same EState
- * list is consulted to determine which relations to unlock and each
- * EState is released.
- */
-static void
-AcquireExecutorLocksWithPolicy(List *stmt_list, LockPolicy policy, bool acquire,
-							   CachedPlanPrepData *cprep)
-{
-	switch (policy)
-	{
-		case LOCK_ALL:
-			AcquireExecutorLocks(stmt_list, acquire);
-			break;
-		case LOCK_UNPRUNED:
-			AcquireExecutorLocksUnpruned(stmt_list, acquire, cprep);
-			break;
-		default:
-			elog(ERROR, "invalid LockPolicy");
-	}
-}
-
-/*
- * AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
- * or release them if acquire is false.
+ * AcquireExecutorLocksAll: acquire locks needed for execution of a cached
+ * plan; or release them if acquire is false.
+ *
+ * This locks all relations in a given PlannedStmt's range table.
  */
 static void
-AcquireExecutorLocks(List *stmt_list, bool acquire)
+AcquireExecutorLocksAll(List *stmt_list, bool acquire)
 {
 	ListCell   *lc1;
 
@@ -2044,10 +1998,8 @@ LockRelids(List *rtable, Bitmapset *relids, bool acquire)
 	{
 		RangeTblEntry *rte = list_nth_node(RangeTblEntry, rtable, rtindex - 1);
 
-		if (!(rte->rtekind == RTE_RELATION ||
-			  (rte->rtekind == RTE_SUBQUERY && OidIsValid(rte->relid))))
-			elog(ERROR, "LockRelids(): cannot lock relation at RT index %d",
-				 rtindex);
+		Assert(rte->rtekind == RTE_RELATION ||
+			   (rte->rtekind == RTE_SUBQUERY && OidIsValid(rte->relid)));
 
 		/*
 		 * Acquire the appropriate type of lock on each relation OID. Note
@@ -2204,7 +2156,7 @@ AcquireExecutorLocksUnpruned(List *stmt_list, bool acquire,
  * CachedPlanPrepCleanup
  *		Clean up EState built for a generic plan.
  *
- * This is used in the corner case where PrepAndCheckCachedPlan() discovers
+ * This is used in the corner case where CheckCachedPlan() discovers
  * that a CachedPlan has become invalid after AcquireExecutorLocksUnpruned()
  * has already run.  In that case we must both release the execution locks
  * and dispose of the ExecPrep list stored in CachedPlanPrepData, since the
@@ -2214,10 +2166,14 @@ static void
 CachedPlanPrepCleanup(CachedPlanPrepData *cprep)
 {
 	ListCell   *lc;
+	ResourceOwner oldowner;
 
 	if (cprep == NULL)
 		return;
 
+	/* Switch to owner that ExecutorPrep() would have used. */
+	oldowner = CurrentResourceOwner;
+	CurrentResourceOwner = cprep->owner;
 	foreach(lc, cprep->prep_estates)
 	{
 		EState *prep_estate = (EState *) lfirst(lc);
@@ -2228,6 +2184,7 @@ CachedPlanPrepCleanup(CachedPlanPrepData *cprep)
 		ExecCloseRangeTableRelations(prep_estate);
 		FreeExecutorState(prep_estate);
 	}
+	CurrentResourceOwner = oldowner;
 
 	list_free(cprep->prep_estates);
 	cprep->prep_estates = NIL;
-- 
2.47.3
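
The parallel-worker patch above has the leader serialize its initial pruning results, the worker deserialize and reuse them, and a debug-only check recompute them for comparison. A standalone sketch of that round trip follows, under loud assumptions: the set of valid subplans is modeled as a 32-bit mask and serialized as a hex string, whereas the patch uses nodeToString()/stringToNode() on node trees. All toy_* names are hypothetical.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* "Initial" pruning: bit i is set if partition i accepts 'param'. */
uint32_t
toy_initial_prune(const int *accepted, int nparts, int param)
{
	uint32_t	valid = 0;

	assert(nparts >= 0 && nparts <= 32);
	for (int i = 0; i < nparts; i++)
	{
		if (accepted[i] == param)
			valid |= (UINT32_C(1) << i);
	}
	return valid;
}

/* Leader side: serialize the result into a caller-supplied buffer. */
int
toy_serialize(uint32_t valid, char *buf, size_t buflen)
{
	return snprintf(buf, buflen, "%08" PRIx32, valid);
}

/* Worker side: reconstruct the leader's result from the shared string. */
uint32_t
toy_deserialize(const char *buf)
{
	return (uint32_t) strtoul(buf, NULL, 16);
}

/*
 * Debug-style cross-check: recompute pruning in the worker and compare
 * with what the leader sent.
 */
bool
toy_worker_results_match(uint32_t from_leader, const int *accepted,
						 int nparts, int param)
{
	return toy_initial_prune(accepted, nparts, param) == from_leader;
}
```

If the worker evaluated the pruning parameter differently from the leader, toy_worker_results_match() would return false; in the patch the corresponding mismatch raises an error in assert-enabled builds.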



  [application/octet-stream] v6-0005-Make-SQL-function-executor-track-ExecutorPrep-sta.patch (7.8K, 5-v6-0005-Make-SQL-function-executor-track-ExecutorPrep-sta.patch)
From 836f0b63ced2546b594643043b7d0055ffaa7b66 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Tue, 10 Feb 2026 22:09:23 +0900
Subject: [PATCH v6 5/6] Make SQL function executor track ExecutorPrep state

Extend the SQL function executor to use the ExecutorPrep results
returned by GetCachedPlan().  init_execution_state() now passes a
CachedPlanPrepData to GetCachedPlan() and stores the per-statement
ExecPrep pointers in the execution_state nodes.

At execution time, postquel_start() reparents the prep estate's
es_query_cxt under the function's subcontext so that prep state
follows the usual per-call context hierarchy.

This allows SQL language functions to participate in the same
ExecutorPrep machinery as other plan cache users.

Add a regression test where rule rewrite expands a single UPDATE
into multiple PlannedStmts, exercising the SQL function plan cache
and the generic plan reuse path that now invokes ExecutorPrep.
---
 src/backend/executor/functions.c        | 29 +++++++++++++--
 src/test/regress/expected/plancache.out | 48 +++++++++++++++++++++++++
 src/test/regress/sql/plancache.sql      | 34 ++++++++++++++++++
 3 files changed, 109 insertions(+), 2 deletions(-)

diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index 65dfae58dcf..c70e06d8886 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -72,6 +72,7 @@ typedef struct execution_state
 	bool		setsResult;		/* true if this query produces func's result */
 	bool		lazyEval;		/* true if should fetch one row at a time */
 	PlannedStmt *stmt;			/* plan for this query */
+	EState	   *prep_estate;	/* EState created in ExecutorPrep() for this plan */
 	QueryDesc  *qd;				/* null unless status == RUN */
 } execution_state;
 
@@ -657,6 +658,8 @@ init_execution_state(SQLFunctionCachePtr fcache)
 	execution_state *lasttages = NULL;
 	int			nstmts;
 	ListCell   *lc;
+	CachedPlanPrepData cprep = {0};
+	ListCell   *prep_lc;
 
 	/*
 	 * Clean up after previous query, if there was one.
@@ -695,11 +698,20 @@ init_execution_state(SQLFunctionCachePtr fcache)
 	 * CurrentResourceOwner will be the same when ShutdownSQLFunction runs.)
 	 */
 	fcache->cowner = CurrentResourceOwner;
+
+	/*
+	 * Have ExecutorPrep() allocate under fcache->fcontext.  The prep
+	 * EStates it creates will initially live there; postquel_start()
+	 * will later reparent their es_query_cxt into fcache->subcontext
+	 * when using them for execution.
+	 */
+	cprep.context = fcache->fcontext;
+	cprep.owner = fcache->cowner;
 	fcache->cplan = GetCachedPlan(plansource,
 								  fcache->paramLI,
 								  fcache->cowner,
 								  NULL,
-								  NULL);
+								  &cprep);
 
 	/*
 	 * If necessary, make esarray[] bigger to hold the needed state.
@@ -720,9 +732,11 @@ init_execution_state(SQLFunctionCachePtr fcache)
 	/*
 	 * Build execution_state list to match the number of contained plans.
 	 */
+	prep_lc = list_head(cprep.prep_estates);
 	foreach(lc, fcache->cplan->stmt_list)
 	{
 		PlannedStmt *stmt = lfirst_node(PlannedStmt, lc);
+		EState *prep_estate = next_prep_estate(cprep.prep_estates, &prep_lc);
 		execution_state *newes;
 
 		/*
@@ -764,6 +778,7 @@ init_execution_state(SQLFunctionCachePtr fcache)
 		newes->setsResult = false;	/* might change below */
 		newes->lazyEval = false;	/* might change below */
 		newes->stmt = stmt;
+		newes->prep_estate = prep_estate;
 		newes->qd = NULL;
 
 		if (stmt->canSetTag)
@@ -1362,6 +1377,15 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
 	else
 		dest = None_Receiver;
 
+	/*
+	 * Prep EStates were built under fcache->fcontext.  For execution,
+	 * make their es_query_cxt a child of fcache->subcontext so they
+	 * follow the usual per call lifetime.
+	 */
+	if (es->prep_estate)
+		MemoryContextSetParent(es->prep_estate->es_query_cxt,
+							   fcache->subcontext);
+
 	es->qd = CreateQueryDesc(es->stmt,
 							 fcache->func->src,
 							 GetActiveSnapshot(),
@@ -1370,7 +1394,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
 							 fcache->paramLI,
 							 es->qd ? es->qd->queryEnv : NULL,
 							 0,
-							 NULL);
+							 es->prep_estate);
 
 	/* Utility commands don't need Executor. */
 	if (es->qd->operation != CMD_UTILITY)
@@ -1461,6 +1485,7 @@ postquel_end(execution_state *es, SQLFunctionCachePtr fcache)
 
 	FreeQueryDesc(es->qd);
 	es->qd = NULL;
+	es->prep_estate = NULL;
 
 	MemoryContextSwitchTo(oldcontext);
 
diff --git a/src/test/regress/expected/plancache.out b/src/test/regress/expected/plancache.out
index 1d69ab0a1c2..371673a6e96 100644
--- a/src/test/regress/expected/plancache.out
+++ b/src/test/regress/expected/plancache.out
@@ -459,4 +459,52 @@ NOTICE:  creating index on partition inval_during_pruning_p1
 drop table inval_during_pruning_p, inval_during_pruning_signal;
 drop function invalidate_plancache_func, stable_pruning_val;
 deallocate inval_during_pruning_q;
+-- exercise sql-function plan cache when rewrite expands a single statement
+-- into multiple planned statements. this forces cachedplan->stmt_list to
+-- contain more than one entry and checks that executor state for the first
+-- rewritten statement does not destroy state needed by the second one.
+set plan_cache_mode = force_generic_plan;
+create table sqlf_base(id int, val int) partition by list (id);
+create table sqlf_base_1 partition of sqlf_base for values in (1);
+create table sqlf_base_2 partition of sqlf_base for values in (2);
+create table sqlf_log(id int, note text);
+insert into sqlf_base values (1, 10);
+create rule sqlf_base_upd_log as
+on update to sqlf_base do also
+	insert into sqlf_log(id, note)
+	values (new.id, 'logged by rule');
+create or replace function sqlf_execprep_test(a int, v int)
+returns void
+language sql
+as $$
+	update sqlf_base set val = v where id = a;
+$$;
+select sqlf_execprep_test(1, 20);
+ sqlf_execprep_test 
+--------------------
+ 
+(1 row)
+
+select sqlf_execprep_test(1, 30);
+ sqlf_execprep_test 
+--------------------
+ 
+(1 row)
+
+select * from sqlf_base order by 1;
+ id | val 
+----+-----
+  1 |  30
+(1 row)
+
+select * from sqlf_log order by 1;
+ id |      note      
+----+----------------
+  1 | logged by rule
+  1 | logged by rule
+(2 rows)
+
+drop rule sqlf_base_upd_log on sqlf_base;
+drop table sqlf_base, sqlf_log;
+drop function sqlf_execprep_test;
 reset plan_cache_mode;
diff --git a/src/test/regress/sql/plancache.sql b/src/test/regress/sql/plancache.sql
index 139b4688fd6..b89c9ad69a4 100644
--- a/src/test/regress/sql/plancache.sql
+++ b/src/test/regress/sql/plancache.sql
@@ -273,4 +273,38 @@ drop table inval_during_pruning_p, inval_during_pruning_signal;
 drop function invalidate_plancache_func, stable_pruning_val;
 deallocate inval_during_pruning_q;
 
+-- exercise sql-function plan cache when rewrite expands a single statement
+-- into multiple planned statements. this forces cachedplan->stmt_list to
+-- contain more than one entry and checks that executor state for the first
+-- rewritten statement does not destroy state needed by the second one.
+
+set plan_cache_mode = force_generic_plan;
+
+create table sqlf_base(id int, val int) partition by list (id);
+create table sqlf_base_1 partition of sqlf_base for values in (1);
+create table sqlf_base_2 partition of sqlf_base for values in (2);
+create table sqlf_log(id int, note text);
+
+insert into sqlf_base values (1, 10);
+
+create rule sqlf_base_upd_log as
+on update to sqlf_base do also
+	insert into sqlf_log(id, note)
+	values (new.id, 'logged by rule');
+
+create or replace function sqlf_execprep_test(a int, v int)
+returns void
+language sql
+as $$
+	update sqlf_base set val = v where id = a;
+$$;
+
+select sqlf_execprep_test(1, 20);
+select sqlf_execprep_test(1, 30);
+select * from sqlf_base order by 1;
+select * from sqlf_log order by 1;
+
+drop rule sqlf_base_upd_log on sqlf_base;
+drop table sqlf_base, sqlf_log;
+drop function sqlf_execprep_test;
 reset plan_cache_mode;
-- 
2.47.3



  [application/octet-stream] v6-0002-Introduce-ExecutorPrep-and-refactor-executor-star.patch (27.6K, 6-v6-0002-Introduce-ExecutorPrep-and-refactor-executor-star.patch)
From aeaaa5059a7be06c301b1372c16829225b2770fb Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Tue, 11 Nov 2025 21:47:46 +0900
Subject: [PATCH v6 2/6] Introduce ExecutorPrep and refactor executor startup

Factor permission checks, range table initialization, and initial
partition pruning out of InitPlan() into a new ExecutorPrep()
helper. ExecutorPrep() builds an EState containing the executor
metadata needed before plan execution, including partition
pruning state where partPruneInfos are present, and returns it
directly to the caller.

ExecutorStart() now checks if QueryDesc->estate is already set
(indicating ExecutorPrep() was called earlier). If so, it reuses
the EState to avoid redoing range table setup and pruning.
Otherwise, it invokes ExecutorPrep() itself and adopts the
resulting EState for the duration of the query. This keeps the
executor startup behavior unchanged while making the setup work
callable separately when needed.

CreateQueryDesc() grows a prep_estate argument to accept an
optionally pre-created EState and stores it in the QueryDesc.
Portals, SPI, SQL functions, and EXPLAIN are wired to carry
optional EState pointers alongside the PlannedStmt list, but most
callers still pass NULL and let ExecutorStart() perform the setup
lazily.

ExecutorPrep() requires the caller to have established an active
snapshot, as partition pruning expressions may call PL functions
that internally require one (e.g., via EnsurePortalSnapshotExists()).

Update executor/README and related comments to document the new
control flow and the separation between preparation and execution.

Note that as of this commit, ExecutorStart() is the only caller of
ExecutorPrep(), so there is no semantic change in behavior. Later
commits will add specialized callers that invoke ExecutorPrep()
earlier to enable pruning-aware locking in cached plans.
---
 src/backend/commands/copyto.c       |   2 +-
 src/backend/commands/createas.c     |   2 +-
 src/backend/commands/explain.c      |   8 +-
 src/backend/commands/extension.c    |   2 +-
 src/backend/commands/matview.c      |   2 +-
 src/backend/commands/portalcmds.c   |   1 +
 src/backend/commands/prepare.c      |   9 +-
 src/backend/executor/README         |  11 +-
 src/backend/executor/execMain.c     | 176 +++++++++++++++++++++++-----
 src/backend/executor/execParallel.c |   3 +-
 src/backend/executor/functions.c    |   3 +-
 src/backend/executor/spi.c          |   9 +-
 src/backend/tcop/postgres.c         |   2 +
 src/backend/tcop/pquery.c           |  24 +++-
 src/backend/utils/mmgr/portalmem.c  |   2 +
 src/include/commands/explain.h      |   3 +-
 src/include/executor/execdesc.h     |   5 +-
 src/include/executor/executor.h     |  26 ++++
 src/include/nodes/execnodes.h       |   1 -
 src/include/utils/portal.h          |   2 +
 20 files changed, 241 insertions(+), 52 deletions(-)

diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index 9ceeff6d99e..ef1ee2568c6 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -875,7 +875,7 @@ BeginCopyTo(ParseState *pstate,
 		cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
-											dest, NULL, NULL, 0);
+											dest, NULL, NULL, 0, NULL);
 
 		/*
 		 * Call ExecutorStart to prepare the plan for execution.
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 270e9bf3110..b4a9808955a 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -336,7 +336,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
 		/* Create a QueryDesc, redirecting output to our tuple receiver */
 		queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
 									GetActiveSnapshot(), InvalidSnapshot,
-									dest, params, queryEnv, 0);
+									dest, params, queryEnv, 0, NULL);
 
 		/* call ExecutorStart to prepare the plan for execution */
 		ExecutorStart(queryDesc, GetIntoRelEFlags(into));
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 93918a223b8..40564d4dff9 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -370,7 +370,7 @@ standard_ExplainOneQuery(Query *query, int cursorOptions,
 	}
 
 	/* run it (if needed) and produce output */
-	ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+	ExplainOnePlan(plan, NULL, into, es, queryString, params, queryEnv,
 				   &planduration, (es->buffers ? &bufusage : NULL),
 				   es->memory ? &mem_counters : NULL);
 }
@@ -492,7 +492,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
  * to call it.
  */
 void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, EState *prep_estate,
+			   IntoClause *into, ExplainState *es,
 			   const char *queryString, ParamListInfo params,
 			   QueryEnvironment *queryEnv, const instr_time *planduration,
 			   const BufferUsage *bufusage,
@@ -550,7 +551,8 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
 	/* Create a QueryDesc for the query */
 	queryDesc = CreateQueryDesc(plannedstmt, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
-								dest, params, queryEnv, instrument_option);
+								dest, params, queryEnv, instrument_option,
+								prep_estate);
 
 	/* Select execution options */
 	if (es->analyze)
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index 963618a64c4..ff759ddd07c 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -1173,7 +1173,7 @@ execute_sql_string(const char *sql, const char *filename)
 				qdesc = CreateQueryDesc(stmt,
 										sql,
 										GetActiveSnapshot(), NULL,
-										dest, NULL, NULL, 0);
+										dest, NULL, NULL, 0, NULL);
 
 				ExecutorStart(qdesc, 0);
 				ExecutorRun(qdesc, ForwardScanDirection, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 81a55a33ef2..2cdfdcf984b 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -439,7 +439,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
 	/* Create a QueryDesc, redirecting output to our tuple receiver */
 	queryDesc = CreateQueryDesc(plan, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
-								dest, NULL, NULL, 0);
+								dest, NULL, NULL, 0, NULL);
 
 	/* call ExecutorStart to prepare the plan for execution */
 	ExecutorStart(queryDesc, 0);
diff --git a/src/backend/commands/portalcmds.c b/src/backend/commands/portalcmds.c
index 01efac3319e..1e880a6d7c9 100644
--- a/src/backend/commands/portalcmds.c
+++ b/src/backend/commands/portalcmds.c
@@ -118,6 +118,7 @@ PerformCursorOpen(ParseState *pstate, DeclareCursorStmt *cstmt, ParamListInfo pa
 					  queryString,
 					  CMDTAG_SELECT,	/* cursor's query is always a SELECT */
 					  list_make1(plan),
+					  NIL,
 					  NULL);
 
 	/*----------
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 5b86a727587..005fbb48aa5 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -205,6 +205,7 @@ ExecuteQuery(ParseState *pstate,
 					  query_string,
 					  entry->plansource->commandTag,
 					  plan_list,
+					  NIL,
 					  cplan);
 
 	/*
@@ -575,7 +576,9 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	const char *query_string;
 	CachedPlan *cplan;
 	List	   *plan_list;
+	List	   *prep_estates;
 	ListCell   *p;
+	ListCell   *prep_lc;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
 	instr_time	planstart;
@@ -650,14 +653,18 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	}
 
 	plan_list = cplan->stmt_list;
+	prep_estates = NIL;
 
 	/* Explain each query */
+	prep_lc = list_head(prep_estates);
 	foreach(p, plan_list)
 	{
 		PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
+		EState *prep_estate = next_prep_estate(prep_estates, &prep_lc);
 
 		if (pstmt->commandType != CMD_UTILITY)
-			ExplainOnePlan(pstmt, into, es, query_string, paramLI, pstate->p_queryEnv,
+			ExplainOnePlan(pstmt, prep_estate,
+						   into, es, query_string, paramLI, pstate->p_queryEnv,
 						   &planduration, (es->buffers ? &bufusage : NULL),
 						   es->memory ? &mem_counters : NULL);
 		else
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 54f4782f31b..d749ceb6687 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -291,11 +291,18 @@ Query Processing Control Flow
 
 This is a sketch of control flow for full query processing:
 
+    ExecutorPrep
+		May be run before ExecutorStart (e.g., for plan validation), or
+		implicitly from ExecutorStart if not done earlier.  Creates EState,
+		performs range table initialization, permission checks, and initial
+		partition pruning.  Returns the EState that ExecutorStart() should
+		reuse.
+
 	CreateQueryDesc
 
 	ExecutorStart
-		CreateExecutorState
-			creates per-query context
+		ExecutorPrep (if not already done, indicated by NULL QueryDesc.estate)
+			creates EState and per-query context
 		switch to per-query context to run ExecInitNode
 		AfterTriggerBeginQuery
 		ExecInitNode --- recursively scans plan tree
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 654f9246ad0..d7e99690c7f 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -55,6 +55,7 @@
 #include "parser/parse_relation.h"
 #include "pgstat.h"
 #include "rewrite/rewriteHandler.h"
+#include "storage/lmgr.h"
 #include "tcop/utility.h"
 #include "utils/acl.h"
 #include "utils/backend_status.h"
@@ -145,7 +146,6 @@ standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
 
 	/* sanity checks: queryDesc must not be started already */
 	Assert(queryDesc != NULL);
-	Assert(queryDesc->estate == NULL);
 
 	/* caller must ensure the query's snapshot is active */
 	Assert(GetActiveSnapshot() == queryDesc->snapshot);
@@ -171,9 +171,71 @@ standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
 
 	/*
 	 * Build EState, switch into per-query memory context for startup.
-	 */
-	estate = CreateExecutorState();
-	queryDesc->estate = estate;
+	 *
+	 * If ExecutorPrep() ran earlier (e.g., to do initial pruning during plan
+	 * validity checking), reuse its EState to avoid redoing range table setup
+	 * and pruning. Otherwise, create a fresh EState as usual.
+	 *
+	 * In assert builds, verify that the expected locks are held.  When
+	 * no prep EState was provided, AcquireExecutorLocks() should have
+	 * locked every relation in the plan.  When one was provided,
+	 * pruning-aware locking should have locked at least the unpruned
+	 * relations.  Both checks are skipped in parallel workers, which
+	 * acquire relation locks lazily in ExecGetRangeTableRelation().
+	 */
+	if (queryDesc->estate == NULL)
+	{
+#ifdef USE_ASSERT_CHECKING
+		if (!IsParallelWorker())
+		{
+			ListCell   *lc;
+
+			foreach(lc, queryDesc->plannedstmt->rtable)
+			{
+				RangeTblEntry *rte = lfirst_node(RangeTblEntry, lc);
+
+				if (rte->rtekind == RTE_RELATION ||
+					(rte->rtekind == RTE_SUBQUERY && rte->relid != InvalidOid))
+					Assert(CheckRelationOidLockedByMe(rte->relid,
+													  rte->rellockmode,
+													  true));
+			}
+		}
+#endif
+		queryDesc->estate = ExecutorPrep(queryDesc->plannedstmt,
+										 queryDesc->params,
+										 CurrentResourceOwner,
+										 true,
+										 eflags);
+	}
+#ifdef USE_ASSERT_CHECKING
+	else
+	{
+		/*
+		 * A prep EState was provided, meaning pruning-aware locking
+		 * should have locked at least the unpruned relations.
+		 */
+		if (!IsParallelWorker())
+		{
+			int		rtindex = -1;
+
+			while ((rtindex = bms_next_member(queryDesc->estate->es_unpruned_relids,
+											  rtindex)) >= 0)
+			{
+				RangeTblEntry *rte = exec_rt_fetch(rtindex, queryDesc->estate);
+
+				Assert(rte->rtekind == RTE_RELATION ||
+					   (rte->rtekind == RTE_SUBQUERY &&
+						rte->relid != InvalidOid));
+				Assert(CheckRelationOidLockedByMe(rte->relid,
+												  rte->rellockmode, true));
+			}
+		}
+	}
+#endif
+
+	estate = queryDesc->estate;
+	Assert(estate);
 
 	oldcontext = MemoryContextSwitchTo(estate->es_query_cxt);
 
@@ -263,6 +325,84 @@ standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
 	MemoryContextSwitchTo(oldcontext);
 }
 
+/*
+ * ExecutorPrep: prepare executor state for a PlannedStmt outside ExecutorStart.
+ *
+ * Performs range table initialization, permission checks, and initial
+ * partition pruning if partPruneInfos are present and do_initial_pruning is
+ * true.
+ *
+ * This is intended for callers that need executor metadata ahead of actual
+ * execution. Typical use cases include:
+ *	- determining which relations must be locked during plan cache validation;
+ *	- initializing unpruned relids and valid subplans in parallel workers
+ *	  using state copied from the leader.
+ *
+ * The executor can reuse the resulting state to avoid redundant setup during
+ * ExecutorStart().
+ *
+ * Returns an EState that can be reused later.
+ */
+EState *
+ExecutorPrep(PlannedStmt *pstmt, ParamListInfo params, ResourceOwner owner,
+			 bool do_initial_pruning, int eflags)
+{
+	ResourceOwner oldowner;
+	EState *estate;
+
+	if (pstmt->commandType == CMD_UTILITY)
+		return NULL;
+
+	/* Caller must have established an active snapshot. */
+	Assert(ActiveSnapshotSet());
+
+	estate = CreateExecutorState();
+	estate->es_plannedstmt = pstmt;
+	estate->es_part_prune_infos = pstmt->partPruneInfos;
+	estate->es_param_list_info = params;
+	estate->es_top_eflags = eflags;
+
+	/*
+	 * Do permissions checks.
+	 */
+	ExecCheckPermissions(pstmt->rtable, pstmt->permInfos, true);
+
+	/*
+	 * Initialize range table.
+	 */
+	ExecInitRangeTable(estate, pstmt->rtable, pstmt->permInfos,
+					   bms_copy(pstmt->unprunableRelids));
+
+	/*
+	 * Ensure locks taken during initial pruning are tracked under the given
+	 * ResourceOwner (e.g., one associated with CachedPlan validation).
+	 */
+	oldowner = CurrentResourceOwner;
+	CurrentResourceOwner = owner;
+
+	/*
+	 * Set up PartitionPruneState structures needed for both initial and
+	 * runtime partition pruning. These structures are built from the
+	 * PartitionPruneInfo entries in the plan tree.
+	 *
+	 * If do_initial_pruning is true, also perform initial pruning to compute
+	 * the subset of child subplans that will be executed. The results,
+	 * which are bitmapsets of selected child indexes, are saved in
+	 * es_part_prune_results. This list is parallel to es_part_prune_infos.
+	 *
+	 * In parallel workers, do_initial_pruning should be false -- they receive
+	 * es_part_prune_results from the leader process and should only initialize
+	 * the PartitionPruneStates.
+	 */
+	ExecCreatePartitionPruneStates(estate);
+	if (do_initial_pruning)
+		ExecDoInitialPruning(estate);
+
+	CurrentResourceOwner = oldowner;
+
+	return estate;
+}
+
 /* ----------------------------------------------------------------
  *		ExecutorRun
  *
@@ -838,38 +978,14 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 	CmdType		operation = queryDesc->operation;
 	PlannedStmt *plannedstmt = queryDesc->plannedstmt;
 	Plan	   *plan = plannedstmt->planTree;
-	List	   *rangeTable = plannedstmt->rtable;
 	EState	   *estate = queryDesc->estate;
 	PlanState  *planstate;
 	TupleDesc	tupType;
 	ListCell   *l;
 	int			i;
 
-	/*
-	 * Do permissions checks
-	 */
-	ExecCheckPermissions(rangeTable, plannedstmt->permInfos, true);
-
-	/*
-	 * initialize the node's execution state
-	 */
-	ExecInitRangeTable(estate, rangeTable, plannedstmt->permInfos,
-					   bms_copy(plannedstmt->unprunableRelids));
-
-	estate->es_plannedstmt = plannedstmt;
-	estate->es_part_prune_infos = plannedstmt->partPruneInfos;
-
-	/*
-	 * Perform runtime "initial" pruning to identify which child subplans,
-	 * corresponding to the children of plan nodes that contain
-	 * PartitionPruneInfo such as Append, will not be executed. The results,
-	 * which are bitmapsets of indexes of the child subplans that will be
-	 * executed, are saved in es_part_prune_results.  These results correspond
-	 * to each PartitionPruneInfo entry, and the es_part_prune_results list is
-	 * parallel to es_part_prune_infos.
-	 */
-	ExecCreatePartitionPruneStates(estate);
-	ExecDoInitialPruning(estate);
+	/* ExecutorPrep() must have been done. */
+	Assert(queryDesc->estate);
 
 	/*
 	 * Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index ac84af294c9..024780d3516 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -1300,7 +1300,8 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 	return CreateQueryDesc(pstmt,
 						   queryString,
 						   GetActiveSnapshot(), InvalidSnapshot,
-						   receiver, paramLI, NULL, instrument_options);
+						   receiver, paramLI, NULL, instrument_options,
+						   NULL);
 }
 
 /*
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index 4ca342a43ef..c93e2664cfd 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -1368,7 +1368,8 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
 							 dest,
 							 fcache->paramLI,
 							 es->qd ? es->qd->queryEnv : NULL,
-							 0);
+							 0,
+							 NULL);
 
 	/* Utility commands don't need Executor. */
 	if (es->qd->operation != CMD_UTILITY)
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 3019a3b2b97..994a69a1c8e 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1685,6 +1685,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 					  query_string,
 					  plansource->commandTag,
 					  stmt_list,
+					  NIL,
 					  cplan);
 
 	/*
@@ -2499,6 +2500,8 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 		CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
 		List	   *stmt_list;
 		ListCell   *lc2;
+		List	   *prep_estates;
+		ListCell   *prep_lc;
 
 		spicallbackarg.query = plansource->query_string;
 
@@ -2577,6 +2580,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 							  plan_owner, _SPI_current->queryEnv);
 
 		stmt_list = cplan->stmt_list;
+		prep_estates = NIL;
 
 		/*
 		 * If we weren't given a specific snapshot to use, and the statement
@@ -2614,9 +2618,11 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 			}
 		}
 
+		prep_lc = list_head(prep_estates);
 		foreach(lc2, stmt_list)
 		{
 			PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+			EState *prep_estate = next_prep_estate(prep_estates, &prep_lc);
 			bool		canSetTag = stmt->canSetTag;
 			DestReceiver *dest;
 
@@ -2694,7 +2700,8 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 										dest,
 										options->params,
 										_SPI_current->queryEnv,
-										0);
+										0,
+										prep_estate);
 				res = _SPI_pquery(qdesc, fire_triggers,
 								  canSetTag ? options->tcount : 0);
 				FreeQueryDesc(qdesc);
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index d01a09dd0c4..cd1e429ceed 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1230,6 +1230,7 @@ exec_simple_query(const char *query_string)
 						  query_string,
 						  commandTag,
 						  plantree_list,
+						  NIL,
 						  NULL);
 
 		/*
@@ -2029,6 +2030,7 @@ exec_bind_message(StringInfo input_message)
 					  query_string,
 					  psrc->commandTag,
 					  cplan->stmt_list,
+					  NIL,
 					  cplan);
 
 	/* Portal is defined, set the plan ID based on its contents. */
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index d8fc75d0bb9..b18266487bb 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -37,6 +37,7 @@ Portal		ActivePortal = NULL;
 
 
 static void ProcessQuery(PlannedStmt *plan,
+						 EState *prep_estate,
 						 const char *sourceText,
 						 ParamListInfo params,
 						 QueryEnvironment *queryEnv,
@@ -72,7 +73,8 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
 				DestReceiver *dest,
 				ParamListInfo params,
 				QueryEnvironment *queryEnv,
-				int instrument_options)
+				int instrument_options,
+				EState *prep_estate)
 {
 	QueryDesc  *qd = palloc_object(QueryDesc);
 
@@ -93,6 +95,9 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
 	qd->planstate = NULL;
 	qd->totaltime = NULL;
 
+	/* Use the EState created by ExecutorPrep() if already done. */
+	qd->estate = prep_estate;
+
 	/* not yet executed */
 	qd->already_executed = false;
 
@@ -123,6 +128,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  *		PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
  *
  *	plan: the plan tree for the query
+ *	prep_estate: EState created in ExecutorPrep() for the query, if any
  *	sourceText: the source text of the query
  *	params: any parameters needed
  *	dest: where to send results
@@ -135,6 +141,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  */
 static void
 ProcessQuery(PlannedStmt *plan,
+			 EState *prep_estate,
 			 const char *sourceText,
 			 ParamListInfo params,
 			 QueryEnvironment *queryEnv,
@@ -148,7 +155,8 @@ ProcessQuery(PlannedStmt *plan,
 	 */
 	queryDesc = CreateQueryDesc(plan, sourceText,
 								GetActiveSnapshot(), InvalidSnapshot,
-								dest, params, queryEnv, 0);
+								dest, params, queryEnv, 0,
+								prep_estate);
 
 	/*
 	 * Call ExecutorStart to prepare the plan for execution
@@ -495,7 +503,10 @@ PortalStart(Portal portal, ParamListInfo params,
 											None_Receiver,
 											params,
 											portal->queryEnv,
-											0);
+											0,
+											portal->prep_estates ?
+											(EState *) linitial(portal->prep_estates) :
+											NULL);
 
 				/*
 				 * If it's a scrollable cursor, executor needs to support
@@ -1185,6 +1196,7 @@ PortalRunMulti(Portal portal,
 {
 	bool		active_snapshot_set = false;
 	ListCell   *stmtlist_item;
+	ListCell   *prep_lc;
 
 	/*
 	 * If the destination is DestRemoteExecute, change to DestNone.  The
@@ -1205,9 +1217,11 @@ PortalRunMulti(Portal portal,
 	 * Loop to handle the individual queries generated from a single parsetree
 	 * by analysis and rewrite.
 	 */
+	prep_lc = list_head(portal->prep_estates);
 	foreach(stmtlist_item, portal->stmts)
 	{
 		PlannedStmt *pstmt = lfirst_node(PlannedStmt, stmtlist_item);
+		EState *prep_estate = next_prep_estate(portal->prep_estates, &prep_lc);
 
 		/*
 		 * If we got a cancel signal in prior command, quit
@@ -1265,7 +1279,7 @@ PortalRunMulti(Portal portal,
 			if (pstmt->canSetTag)
 			{
 				/* statement can set tag string */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, prep_estate,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
@@ -1274,7 +1288,7 @@ PortalRunMulti(Portal portal,
 			else
 			{
 				/* stmt added by rewrite cannot set tag */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, prep_estate,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index c1a53e658cb..941e95010c3 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -284,6 +284,7 @@ PortalDefineQuery(Portal portal,
 				  const char *sourceText,
 				  CommandTag commandTag,
 				  List *stmts,
+				  List *prep_estates,
 				  CachedPlan *cplan)
 {
 	Assert(PortalIsValid(portal));
@@ -297,6 +298,7 @@ PortalDefineQuery(Portal portal,
 	portal->commandTag = commandTag;
 	SetQueryCompletion(&portal->qc, commandTag, 0);
 	portal->stmts = stmts;
+	portal->prep_estates = prep_estates;
 	portal->cplan = cplan;
 	portal->status = PORTAL_DEFINED;
 }
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 86226f8db70..3756a11345f 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -63,7 +63,8 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
 							  ExplainState *es, ParseState *pstate,
 							  ParamListInfo params);
 
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt, EState *prep_estate,
+						   IntoClause *into,
 						   ExplainState *es, const char *queryString,
 						   ParamListInfo params, QueryEnvironment *queryEnv,
 						   const instr_time *planduration,
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index d3a57242844..3a2169c9613 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -43,7 +43,7 @@ typedef struct QueryDesc
 	QueryEnvironment *queryEnv; /* query environment passed in */
 	int			instrument_options; /* OR of InstrumentOption flags */
 
-	/* These fields are set by ExecutorStart */
+	/* These fields are set by ExecutorStart or ExecutorPrep */
 	TupleDesc	tupDesc;		/* descriptor for result tuples */
 	EState	   *estate;			/* executor's query-wide state */
 	PlanState  *planstate;		/* tree of per-plan-node state */
@@ -63,7 +63,8 @@ extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
 								  DestReceiver *dest,
 								  ParamListInfo params,
 								  QueryEnvironment *queryEnv,
-								  int instrument_options);
+								  int instrument_options,
+								  EState *prep_estate);
 
 extern void FreeQueryDesc(QueryDesc *qdesc);
 
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index d46ba59895d..e6fa122e6e4 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -20,6 +20,7 @@
 #include "nodes/lockoptions.h"
 #include "nodes/parsenodes.h"
 #include "utils/memutils.h"
+#include "utils/resowner.h"
 
 
 /*
@@ -234,6 +235,31 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
  */
 extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
 extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
+
+extern EState *ExecutorPrep(PlannedStmt *pstmt,
+							ParamListInfo params,
+							ResourceOwner owner,
+							bool do_initial_pruning,
+							int eflags);
+
+/*
+ * Walk a prep_estates list in step with a parallel stmt_list iteration.
+ * Returns the next EState (or NULL) and advances *lc.  Safe when
+ * prep_estates is NIL; just returns NULL for every call.
+ */
+static inline EState *
+next_prep_estate(List *prep_estates, ListCell **lc)
+{
+	EState *result = NULL;
+
+	if (*lc != NULL)
+	{
+		result = (EState *) lfirst(*lc);
+		*lc = lnext(prep_estates, *lc);
+	}
+	return result;
+}
+
 extern void ExecutorRun(QueryDesc *queryDesc,
 						ScanDirection direction, uint64 count);
 extern void standard_ExecutorRun(QueryDesc *queryDesc,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 63c067d5aae..84d80e3ab0d 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -775,7 +775,6 @@ typedef struct EState
 	List	   *es_insert_pending_modifytables;
 } EState;
 
-
 /*
  * ExecRowMark -
  *	   runtime representation of FOR [KEY] UPDATE/SHARE clauses
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index a7bedb12c18..f69b4b9b479 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -137,6 +137,7 @@ typedef struct PortalData
 	CommandTag	commandTag;		/* command tag for original query */
 	QueryCompletion qc;			/* command completion data for executed query */
 	List	   *stmts;			/* list of PlannedStmts */
+	List	   *prep_estates;	/* list of EStates where needed */
 	CachedPlan *cplan;			/* CachedPlan, if stmts are from one */
 
 	ParamListInfo portalParams; /* params to pass to query */
@@ -240,6 +241,7 @@ extern void PortalDefineQuery(Portal portal,
 							  const char *sourceText,
 							  CommandTag commandTag,
 							  List *stmts,
+							  List *prep_estates,
 							  CachedPlan *cplan);
 extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
 extern void PortalCreateHoldStore(Portal portal);
-- 
2.47.3
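
Note on next_prep_estate() in the patch above: the helper lets a caller
walk prep_estates in step with the statement list, and keeps returning
NULL when prep_estates is NIL or has run out.  A minimal standalone
sketch of that contract follows.  ToyEState, ToyList and
toy_next_prep_estate() are hypothetical array-backed stand-ins invented
for illustration; they are not PostgreSQL's EState, List or the patch's
helper.

```c
#include <stddef.h>

/* Hypothetical stand-ins; these are not PostgreSQL's EState or List. */
typedef struct ToyEState
{
    int         id;
} ToyEState;

typedef struct ToyList
{
    size_t      length;
    ToyEState **items;          /* entries may themselves be NULL */
} ToyList;

/*
 * Return the next prep state and advance *pos.  Return NULL, leaving
 * *pos alone, when the list is absent or exhausted, so a caller can
 * invoke this once per statement without checking for an empty list.
 */
static ToyEState *
toy_next_prep_estate(const ToyList *prep_estates, size_t *pos)
{
    if (prep_estates == NULL || *pos >= prep_estates->length)
        return NULL;
    return prep_estates->items[(*pos)++];
}
```

A caller would initialize the position to 0 before its statement loop
and call toy_next_prep_estate() once per iteration, much as
PortalRunMulti() does with prep_lc in the patch.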



  [application/octet-stream] v6-0001-Refactor-partition-pruning-initialization-for-cla.patch (10.2K, 7-v6-0001-Refactor-partition-pruning-initialization-for-cla.patch)
From 6f2c9cc7a30d38cb2606595f62b62c77e2aba6e9 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Tue, 10 Feb 2026 15:08:52 +0900
Subject: [PATCH v6 1/6] Refactor partition pruning initialization for clarity
 and modularity

Move the creation of PartitionPruneState structures out of
ExecDoInitialPruning() into a new ExecCreatePartitionPruneStates()
function. This separates the setup of pruning state from the execution
of initial pruning logic, making the code clearer and easier to
maintain.

Simplify handling of unpruned relids by moving responsibility
for recording them in EState into CreatePartitionPruneState(),
avoiding the need to pass all_leafpart_rtis as an out parameter.

Also move the setting of ecxt_param_exec_vals from
CreatePartitionPruneState() to InitExecPartitionPruneContexts(),
to allow the former to be called at a time when the PARAM_EXEC
parameters have not yet been set up.

This refactoring allows callers to reuse the pruning setup logic
without always triggering pruning, a capability useful for future use
cases that may only need metadata initialization.
---
 src/backend/executor/execMain.c      |   1 +
 src/backend/executor/execPartition.c | 103 +++++++++++++++++++--------
 src/include/executor/execPartition.h |   1 +
 3 files changed, 74 insertions(+), 31 deletions(-)

diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index bfd3ebc601e..654f9246ad0 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -868,6 +868,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 	 * to each PartitionPruneInfo entry, and the es_part_prune_results list is
 	 * parallel to es_part_prune_infos.
 	 */
+	ExecCreatePartitionPruneStates(estate);
 	ExecDoInitialPruning(estate);
 
 	/*
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index bab294f5e91..20c3513fabe 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -184,8 +184,7 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
 static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
 static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
 static PartitionPruneState *CreatePartitionPruneState(EState *estate,
-													  PartitionPruneInfo *pruneinfo,
-													  Bitmapset **all_leafpart_rtis);
+													  PartitionPruneInfo *pruneinfo);
 static void InitPartitionPruneContext(PartitionPruneContext *context,
 									  List *pruning_steps,
 									  PartitionDesc partdesc,
@@ -1942,6 +1941,9 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *
  * Functions:
  *
+ * ExecCreatePartitionPruneStates
+ *		Create PartitionPruneState for all PartitionPruneInfos in the EState
+ *
  * ExecDoInitialPruning:
  *		Perform runtime "initial" pruning, if necessary, to determine the set
  *		of child subnodes that need to be initialized during ExecInitNode() for
@@ -1966,6 +1968,29 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *-------------------------------------------------------------------------
  */
 
+/*
+ * ExecCreatePartitionPruneStates
+ *
+ * Create a PartitionPruneState for each PartitionPruneInfo in the estate,
+ * and save them in estate->es_part_prune_states. This setup is required
+ * before any initial or runtime pruning can occur.
+ */
+void
+ExecCreatePartitionPruneStates(EState *estate)
+{
+	ListCell   *lc;
+
+	foreach(lc, estate->es_part_prune_infos)
+	{
+		PartitionPruneInfo *pruneinfo = lfirst_node(PartitionPruneInfo, lc);
+		PartitionPruneState *prunestate;
+
+		/* Create and save the PartitionPruneState. */
+		prunestate = CreatePartitionPruneState(estate, pruneinfo);
+		estate->es_part_prune_states = lappend(estate->es_part_prune_states,
+											   prunestate);
+	}
+}
 
 /*
  * ExecDoInitialPruning
@@ -1973,11 +1998,11 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *		of child subnodes that need to be initialized during ExecInitNode() for
  *		plan nodes that support partition pruning.
  *
- * This function iterates over each PartitionPruneInfo entry in
- * estate->es_part_prune_infos. For each entry, it creates a PartitionPruneState
- * and adds it to es_part_prune_states.  ExecInitPartitionExecPruning() accesses
+ * This function iterates over each PartitionPruneState in
+ * estate->es_part_prune_states, which must have been populated earlier by
+ * ExecCreatePartitionPruneStates(). ExecInitPartitionExecPruning() accesses
  * these states through their corresponding indexes in es_part_prune_states and
- * assign each state to the parent node's PlanState, from where it will be used
+ * assigns each state to the parent node's PlanState, from where it will be used
  * for "exec" pruning.
  *
  * If initial pruning steps exist for a PartitionPruneInfo entry, this function
@@ -1995,20 +2020,13 @@ ExecDoInitialPruning(EState *estate)
 {
 	ListCell   *lc;
 
-	foreach(lc, estate->es_part_prune_infos)
+	Assert(estate->es_part_prune_results == NULL);
+	foreach(lc, estate->es_part_prune_states)
 	{
-		PartitionPruneInfo *pruneinfo = lfirst_node(PartitionPruneInfo, lc);
-		PartitionPruneState *prunestate;
+		PartitionPruneState *prunestate = (PartitionPruneState *) lfirst(lc);
 		Bitmapset  *validsubplans = NULL;
-		Bitmapset  *all_leafpart_rtis = NULL;
 		Bitmapset  *validsubplan_rtis = NULL;
 
-		/* Create and save the PartitionPruneState. */
-		prunestate = CreatePartitionPruneState(estate, pruneinfo,
-											   &all_leafpart_rtis);
-		estate->es_part_prune_states = lappend(estate->es_part_prune_states,
-											   prunestate);
-
 		/*
 		 * Perform initial pruning steps, if any, and save the result
 		 * bitmapset or NULL as described in the header comment.
@@ -2016,8 +2034,6 @@ ExecDoInitialPruning(EState *estate)
 		if (prunestate->do_initial_prune)
 			validsubplans = ExecFindMatchingSubPlans(prunestate, true,
 													 &validsubplan_rtis);
-		else
-			validsubplan_rtis = all_leafpart_rtis;
 
 		estate->es_unpruned_relids = bms_add_members(estate->es_unpruned_relids,
 													 validsubplan_rtis);
@@ -2135,14 +2151,12 @@ ExecInitPartitionExecPruning(PlanState *planstate,
  * parent plan node's PlanState.
  *
  * If initial pruning steps are to be skipped (e.g., during EXPLAIN
- * (GENERIC_PLAN)), *all_leafpart_rtis will be populated with the RT indexes of
- * all leaf partitions whose scanning subnode is included in the parent plan
- * node's list of child plans. The caller must add these RT indexes to
- * estate->es_unpruned_relids.
+ * (GENERIC_PLAN)), the RT indexes of all leaf partitions whose scanning
+ * subnode is included in the parent plan node's list of child plans are
+ * added to estate->es_unpruned_relids.
  */
 static PartitionPruneState *
-CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo,
-						  Bitmapset **all_leafpart_rtis)
+CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo)
 {
 	PartitionPruneState *prunestate;
 	int			n_part_hierarchies;
@@ -2376,8 +2390,8 @@ CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo,
 													   pinfo->execparamids);
 
 			/*
-			 * Return all leaf partition indexes if we're skipping pruning in
-			 * the EXPLAIN (GENERIC_PLAN) case.
+			 * Add all leaf partition indexes to es_unpruned_relids if we're
+			 * skipping pruning in the EXPLAIN (GENERIC_PLAN) case.
 			 */
 			if (pinfo->initial_pruning_steps && !prunestate->do_initial_prune)
 			{
@@ -2389,10 +2403,29 @@ CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo,
 					Index		rtindex = pprune->leafpart_rti_map[part_index];
 
 					if (rtindex)
-						*all_leafpart_rtis = bms_add_member(*all_leafpart_rtis,
-															rtindex);
+						estate->es_unpruned_relids =
+							bms_add_member(estate->es_unpruned_relids, rtindex);
 				}
 			}
+			else if (pinfo->initial_pruning_steps == NIL)
+			{
+				/*
+				 * All partitions better be present in es_unpruned_relids when
+				 * none are initially prunable.
+				 */
+#ifdef USE_ASSERT_CHECKING
+				int			part_index = -1;
+
+				while ((part_index = bms_next_member(pprune->present_parts,
+													 part_index)) >= 0)
+				{
+					Index		rtindex = pprune->leafpart_rti_map[part_index];
+
+					if (rtindex)
+						Assert(bms_is_member(rtindex, estate->es_unpruned_relids));
+				}
+#endif
+			}
 
 			j++;
 		}
@@ -2489,9 +2522,10 @@ InitPartitionPruneContext(PartitionPruneContext *context,
  *		Initialize exec pruning contexts deferred by CreatePartitionPruneState()
  *
  * This function finalizes exec pruning setup for a PartitionPruneState by
- * initializing contexts for pruning steps that require the parent plan's
- * PlanState. It iterates over PartitionPruningData entries and sets up the
- * necessary execution contexts for pruning during query execution.
+ * initializing contexts for pruning steps that require PARAM_EXEC parameters
+ * and the parent plan's PlanState. It iterates over PartitionPruningData
+ * entries and sets up the necessary execution contexts for pruning during
+ * query execution.
  *
  * Also fix the mapping of partition indexes to subplan indexes contained in
  * prunestate by considering the new list of subplans that survived initial
@@ -2519,9 +2553,16 @@ InitExecPartitionPruneContexts(PartitionPruneState *prunestate,
 	bool		fix_subplan_map = false;
 
 	Assert(prunestate->do_exec_prune);
+	Assert(prunestate->econtext);
 	Assert(parent_plan != NULL);
 	estate = parent_plan->state;
 
+	/*
+	 * These might not be available when ExecCreatePartitionPruneState() is
+	 * called.
+	 */
+	prunestate->econtext->ecxt_param_exec_vals = estate->es_param_exec_vals;
+
 	/*
 	 * No need to fix subplans maps if initial pruning didn't eliminate any
 	 * subplans.
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 82063ec2a16..4c96808c376 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -130,6 +130,7 @@ typedef struct PartitionPruneState
 	PartitionPruningData *partprunedata[FLEXIBLE_ARRAY_MEMBER];
 } PartitionPruneState;
 
+extern void ExecCreatePartitionPruneStates(EState *estate);
 extern void ExecDoInitialPruning(EState *estate);
 extern PartitionPruneState *ExecInitPartitionExecPruning(PlanState *planstate,
 														 int n_total_subplans,
-- 
2.47.3




* Re: generic plans and "initial" pruning
@ 2026-03-09 04:41  Amit Langote <[email protected]>
  parent: Amit Langote <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: Amit Langote @ 2026-03-09 04:41 UTC (permalink / raw)
  To: Chao Li <[email protected]>; +Cc: Tom Lane <[email protected]>; Tender Wang <[email protected]>; Alexander Lakhin <[email protected]>; Tomas Vondra <[email protected]>; Robert Haas <[email protected]>; Alvaro Herrera <[email protected]>; Andres Freund <[email protected]>; Daniel Gustafsson <[email protected]>; David Rowley <[email protected]>; pgsql-hackers; Thom Brown <[email protected]>

On Sat, Mar 7, 2026 at 6:54 PM Amit Langote <[email protected]> wrote:
> Attached is v6 of the patch series. I've been working toward
> committing this, so I wanted to lay out the ExecutorPrep() design and
> the key trade-offs before doing so.
>
> When a cached generic plan references a partitioned table,
> GetCachedPlan() locks all partitions upfront via
> AcquireExecutorLocks(), even those that initial pruning will
> eliminate.  But initial partition pruning only runs later during
> ExecutorStart(). Moving pruning earlier requires some executor setup
> (range table, permissions, pruning state), and ExecutorPrep() is the
> vehicle for that.  Unlike the approach reverted last May, this
> keeps the CachedPlan itself unchanged -- all per-execution state flows
> through a separate CachedPlanPrepData that the caller provides.
>
> The approach also keeps GetCachedPlan()'s interface
> backward-compatible: the new CachedPlanPrepData argument is optional.
> If a caller passes NULL, all partitions are locked as before and
> nothing changes. This means existing callers and any new code that
> calls GetCachedPlan() without caring about pruning-aware locking just
> works.
>
> The risk is on the other side: if a caller does pass a
> CachedPlanPrepData, GetCachedPlan() will lock only the surviving
> partitions and populate prep_estates with the EStates that
> ExecutorPrep() created. The caller then must make those EStates
> available to ExecutorStart() -- via QueryDesc->estate,
> portal->prep_estates, or the equivalent path for SPI and SQL
> functions. If it fails to do so, ExecutorStart() will call
> ExecutorPrep() again, which may compute different pruning results than
> the original call and so select partitions whose locks were never
> acquired. The executor would then operate on relations it
> doesn't hold locks on.
>
> So the contract is: if you opt in to pruning-aware locking by passing
> CachedPlanPrepData, you must complete the pipeline by delivering the
> prep EStates to the executor. In the current patch, all the call sites
> that pass a CachedPlanPrepData (portals, SPI, EXECUTE, SQL functions,
> EXPLAIN) do thread the EStates through correctly, and I've tried to
> make the plumbing straightforward enough that it's hard to get wrong.
> But it is a new invariant that didn't exist before, and a caller that
> gets it wrong would fail silently rather than with an obvious error.
>
> To catch such violations, I've added a debug-only check in
> standard_ExecutorStart() that fires when no prep EState was provided.
> It iterates over the plan's rtable and verifies that every lockable
> relation is actually locked.  It should always be true if
> AcquireExecutorLocks() locked everything, but would fail if
> pruning-aware locking happened upstream and the caller dropped the
> prep EState. The check is skipped in parallel workers, which acquire
> relation locks lazily in ExecGetRangeTableRelation().
>
> +    if (queryDesc->estate == NULL)
> +    {
> +#ifdef USE_ASSERT_CHECKING
> +        if (!IsParallelWorker())
> +        {
> +            ListCell   *lc;
> +
> +            foreach(lc, queryDesc->plannedstmt->rtable)
> +            {
> +                RangeTblEntry *rte = lfirst_node(RangeTblEntry, lc);
> +
> +                if (rte->rtekind == RTE_RELATION ||
> +                    (rte->rtekind == RTE_SUBQUERY && rte->relid != InvalidOid))
> +                    Assert(CheckRelationOidLockedByMe(rte->relid,
> +                                                      rte->rellockmode,
> +                                                      true));
> +            }
> +        }
> +#endif
> +        queryDesc->estate = ExecutorPrep(queryDesc->plannedstmt,
> +                                         queryDesc->params,
> +                                         CurrentResourceOwner,
> +                                         true,
> +                                         eflags);
> +    }
> +#ifdef USE_ASSERT_CHECKING
> +    else
> +    {
> +        /*
> +         * A prep EState was provided, meaning pruning-aware locking
> +         * should have locked at least the unpruned relations.
> +         */
> +        if (!IsParallelWorker())
> +        {
> +            int     rtindex = -1;
> +
> +            while ((rtindex = bms_next_member(queryDesc->estate->es_unpruned_relids,
> +                                              rtindex)) >= 0)
> +            {
> +                RangeTblEntry *rte = exec_rt_fetch(rtindex, queryDesc->estate);
> +
> +                Assert(rte->rtekind == RTE_RELATION ||
> +                       (rte->rtekind == RTE_SUBQUERY &&
> +                        rte->relid != InvalidOid));
> +                Assert(CheckRelationOidLockedByMe(rte->relid,
> +                                                  rte->rellockmode, true));
> +            }
> +        }
> +    }
> +#endif
>
> So the invariant is: if no prep EState was provided, every relation in
> the plan is locked; if one was provided, at least the unpruned
> relations are locked. Both are checked in assert builds.
>
> I think this covers the main concerns, but I may be missing something.
> If anyone sees a problem with this approach, I'd like to hear about
> it.
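
To make the two assert-build checks quoted above concrete, here is a
minimal standalone sketch of the invariant.  It is a toy model, not
executor code: ToyRel and toy_locks_ok() are hypothetical names, and
the two boolean fields merely stand in for what
CheckRelationOidLockedByMe() and es_unpruned_relids report.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a lockable range table entry. */
typedef struct ToyRel
{
    bool        locked;         /* does this backend hold the lock? */
    bool        unpruned;       /* did it survive initial pruning? */
} ToyRel;

/*
 * Model of the check described for standard_ExecutorStart():
 *   - without a prep EState, every relation must be locked;
 *   - with one, only the unpruned relations must be locked.
 */
static bool
toy_locks_ok(const ToyRel *rels, size_t nrels, bool have_prep_estate)
{
    for (size_t i = 0; i < nrels; i++)
    {
        bool        must_be_locked = !have_prep_estate || rels[i].unpruned;

        if (must_be_locked && !rels[i].locked)
            return false;
    }
    return true;
}
```

With three partitions of which only the first is unpruned and locked,
toy_locks_ok() returns true when the prep EState is delivered and false
when the caller drops it, which is the otherwise silent failure the
Assert is meant to expose.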

Here's v7. Some plancache.c changes that I'd made were in the wrong
patch in v6; this version puts them where they belong.

-- 
Thanks, Amit Langote


Attachments:

  [application/octet-stream] v7-0003-Add-test-for-partition-lock-behavior-with-generic.patch (5.3K, 2-v7-0003-Add-test-for-partition-lock-behavior-with-generic.patch)
From 58179bd0d3730dbd1fdbb0bd9c624dc7ae770830 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Tue, 10 Feb 2026 22:00:32 +0900
Subject: [PATCH v7 3/6] Add test for partition lock behavior with generic
 cached plans

Add a regression test that inspects pg_locks to verify which child
partitions are locked when executing a prepared statement that uses
a generic cached plan.

Two cases are tested: one with enable_partition_pruning on and one
with it off.  Currently both cases lock all child partitions, because
GetCachedPlan() acquires execution locks on every relation in the
plan regardless of pruning.

A subsequent commit that adds pruning-aware locking will update the
expected output for the pruning-enabled case, showing that only the
surviving partition is locked.
---
 src/test/regress/expected/partition_prune.out | 83 +++++++++++++++++++
 src/test/regress/sql/partition_prune.sql      | 55 ++++++++++++
 2 files changed, 138 insertions(+)

diff --git a/src/test/regress/expected/partition_prune.out b/src/test/regress/expected/partition_prune.out
index deacdd75807..39dab8fcc05 100644
--- a/src/test/regress/expected/partition_prune.out
+++ b/src/test/regress/expected/partition_prune.out
@@ -4824,3 +4824,86 @@ select min(a) over (partition by a order by a) from part_abc where a >= stable_o
 
 drop view part_abc_view;
 drop table part_abc;
+--
+-- Verify that pruning-aware locking skips pruned partitions
+-- when reusing a generic cached plan.
+--
+set plan_cache_mode to force_generic_plan;
+create table prunelock_p (a int) partition by list (a);
+create table prunelock_p1 partition of prunelock_p for values in (1);
+create table prunelock_p2 partition of prunelock_p for values in (2);
+create table prunelock_p3 partition of prunelock_p for values in (3);
+prepare prunelock_q (int) as select * from prunelock_p where a = $1;
+-- Force generic plan creation
+explain (costs off) execute prunelock_q(1);
+                  QUERY PLAN                  
+----------------------------------------------
+ Append
+   Subplans Removed: 2
+   ->  Seq Scan on prunelock_p1 prunelock_p_1
+         Filter: (a = $1)
+(4 rows)
+
+-- Execute and check which child partitions are locked
+begin;
+execute prunelock_q(1);
+ a 
+---
+(0 rows)
+
+select c.relname
+  from pg_locks l
+  join pg_class c on c.oid = l.relation
+ where l.pid = pg_backend_pid()
+   and c.relname like 'prunelock_p_'
+ order by c.relname;
+   relname    
+--------------
+ prunelock_p1
+ prunelock_p2
+ prunelock_p3
+(3 rows)
+
+commit;
+deallocate prunelock_q;
+-- Turn pruning off
+set enable_partition_pruning to off;
+prepare prunelock_q (int) as select * from prunelock_p where a = $1;
+-- Force generic plan creation
+explain (costs off) execute prunelock_q(1);
+                  QUERY PLAN                  
+----------------------------------------------
+ Append
+   ->  Seq Scan on prunelock_p1 prunelock_p_1
+         Filter: (a = $1)
+   ->  Seq Scan on prunelock_p2 prunelock_p_2
+         Filter: (a = $1)
+   ->  Seq Scan on prunelock_p3 prunelock_p_3
+         Filter: (a = $1)
+(7 rows)
+
+-- Execute and check which child partitions are locked
+begin;
+execute prunelock_q(1);
+ a 
+---
+(0 rows)
+
+select c.relname
+  from pg_locks l
+  join pg_class c on c.oid = l.relation
+ where l.pid = pg_backend_pid()
+   and c.relname like 'prunelock_p_'
+ order by c.relname;
+   relname    
+--------------
+ prunelock_p1
+ prunelock_p2
+ prunelock_p3
+(3 rows)
+
+commit;
+deallocate prunelock_q;
+drop table prunelock_p;
+reset plan_cache_mode;
+reset enable_partition_pruning;
diff --git a/src/test/regress/sql/partition_prune.sql b/src/test/regress/sql/partition_prune.sql
index d93c0c03bab..229c5eb370c 100644
--- a/src/test/regress/sql/partition_prune.sql
+++ b/src/test/regress/sql/partition_prune.sql
@@ -1447,3 +1447,58 @@ select min(a) over (partition by a order by a) from part_abc where a >= stable_o
 
 drop view part_abc_view;
 drop table part_abc;
+
+--
+-- Verify that pruning-aware locking skips pruned partitions
+-- when reusing a generic cached plan.
+--
+set plan_cache_mode to force_generic_plan;
+
+create table prunelock_p (a int) partition by list (a);
+create table prunelock_p1 partition of prunelock_p for values in (1);
+create table prunelock_p2 partition of prunelock_p for values in (2);
+create table prunelock_p3 partition of prunelock_p for values in (3);
+
+prepare prunelock_q (int) as select * from prunelock_p where a = $1;
+
+-- Force generic plan creation
+explain (costs off) execute prunelock_q(1);
+
+-- Execute and check which child partitions are locked
+begin;
+execute prunelock_q(1);
+
+select c.relname
+  from pg_locks l
+  join pg_class c on c.oid = l.relation
+ where l.pid = pg_backend_pid()
+   and c.relname like 'prunelock_p_'
+ order by c.relname;
+commit;
+
+deallocate prunelock_q;
+
+-- Turn pruning off
+set enable_partition_pruning to off;
+
+prepare prunelock_q (int) as select * from prunelock_p where a = $1;
+
+-- Force generic plan creation
+explain (costs off) execute prunelock_q(1);
+
+-- Execute and check which child partitions are locked
+begin;
+execute prunelock_q(1);
+
+select c.relname
+  from pg_locks l
+  join pg_class c on c.oid = l.relation
+ where l.pid = pg_backend_pid()
+   and c.relname like 'prunelock_p_'
+ order by c.relname;
+commit;
+
+deallocate prunelock_q;
+drop table prunelock_p;
+reset plan_cache_mode;
+reset enable_partition_pruning;
-- 
2.47.3



  [application/octet-stream] v7-0005-Make-SQL-function-executor-track-ExecutorPrep-sta.patch (7.8K, 3-v7-0005-Make-SQL-function-executor-track-ExecutorPrep-sta.patch)
From c67ec5cc6bbe20d7ad14fb99cd1696939c6ec70f Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Tue, 10 Feb 2026 22:09:23 +0900
Subject: [PATCH v7 5/6] Make SQL function executor track ExecutorPrep state

Extend the SQL function executor to use the ExecutorPrep results
returned by GetCachedPlan().  init_execution_state() now passes a
CachedPlanPrepData to GetCachedPlan() and stores the per-statement
prep EState pointers in the execution_state nodes.

At execution time, postquel_start() reparents the prep estate's
es_query_cxt under the function's subcontext so that prep state
follows the usual per-call context hierarchy.

This allows SQL language functions to participate in the same
ExecutorPrep machinery as other plan cache users.

Add a regression test where rule rewrite expands a single UPDATE
into multiple PlannedStmts, exercising the SQL function plan cache
and the generic plan reuse path that now invokes ExecutorPrep.
---
 src/backend/executor/functions.c        | 29 +++++++++++++--
 src/test/regress/expected/plancache.out | 48 +++++++++++++++++++++++++
 src/test/regress/sql/plancache.sql      | 34 ++++++++++++++++++
 3 files changed, 109 insertions(+), 2 deletions(-)

diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index 65dfae58dcf..c70e06d8886 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -72,6 +72,7 @@ typedef struct execution_state
 	bool		setsResult;		/* true if this query produces func's result */
 	bool		lazyEval;		/* true if should fetch one row at a time */
 	PlannedStmt *stmt;			/* plan for this query */
+	EState	   *prep_estate;	/* EState created in ExecutorPrep() for this plan */
 	QueryDesc  *qd;				/* null unless status == RUN */
 } execution_state;
 
@@ -657,6 +658,8 @@ init_execution_state(SQLFunctionCachePtr fcache)
 	execution_state *lasttages = NULL;
 	int			nstmts;
 	ListCell   *lc;
+	CachedPlanPrepData cprep = {0};
+	ListCell   *prep_lc;
 
 	/*
 	 * Clean up after previous query, if there was one.
@@ -695,11 +698,20 @@ init_execution_state(SQLFunctionCachePtr fcache)
 	 * CurrentResourceOwner will be the same when ShutdownSQLFunction runs.)
 	 */
 	fcache->cowner = CurrentResourceOwner;
+
+	/*
+	 * Have ExecutorPrep() allocate under fcache->fcontext.  The prep
+	 * EStates it creates will initially live there; postquel_start()
+	 * will later reparent their es_query_cxt into fcache->subcontext
+	 * when using them for execution.
+	 */
+	cprep.context = fcache->fcontext;
+	cprep.owner = fcache->cowner;
 	fcache->cplan = GetCachedPlan(plansource,
 								  fcache->paramLI,
 								  fcache->cowner,
 								  NULL,
-								  NULL);
+								  &cprep);
 
 	/*
 	 * If necessary, make esarray[] bigger to hold the needed state.
@@ -720,9 +732,11 @@ init_execution_state(SQLFunctionCachePtr fcache)
 	/*
 	 * Build execution_state list to match the number of contained plans.
 	 */
+	prep_lc = list_head(cprep.prep_estates);
 	foreach(lc, fcache->cplan->stmt_list)
 	{
 		PlannedStmt *stmt = lfirst_node(PlannedStmt, lc);
+		EState *prep_estate = next_prep_estate(cprep.prep_estates, &prep_lc);
 		execution_state *newes;
 
 		/*
@@ -764,6 +778,7 @@ init_execution_state(SQLFunctionCachePtr fcache)
 		newes->setsResult = false;	/* might change below */
 		newes->lazyEval = false;	/* might change below */
 		newes->stmt = stmt;
+		newes->prep_estate = prep_estate;
 		newes->qd = NULL;
 
 		if (stmt->canSetTag)
@@ -1362,6 +1377,15 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
 	else
 		dest = None_Receiver;
 
+	/*
+	 * Prep EStates were built under fcache->fcontext.  For execution,
+	 * make their es_query_cxt a child of fcache->subcontext so they
+	 * follow the usual per call lifetime.
+	 */
+	if (es->prep_estate)
+		MemoryContextSetParent(es->prep_estate->es_query_cxt,
+							   fcache->subcontext);
+
 	es->qd = CreateQueryDesc(es->stmt,
 							 fcache->func->src,
 							 GetActiveSnapshot(),
@@ -1370,7 +1394,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
 							 fcache->paramLI,
 							 es->qd ? es->qd->queryEnv : NULL,
 							 0,
-							 NULL);
+							 es->prep_estate);
 
 	/* Utility commands don't need Executor. */
 	if (es->qd->operation != CMD_UTILITY)
@@ -1461,6 +1485,7 @@ postquel_end(execution_state *es, SQLFunctionCachePtr fcache)
 
 	FreeQueryDesc(es->qd);
 	es->qd = NULL;
+	es->prep_estate = NULL;
 
 	MemoryContextSwitchTo(oldcontext);
 
diff --git a/src/test/regress/expected/plancache.out b/src/test/regress/expected/plancache.out
index 1d69ab0a1c2..371673a6e96 100644
--- a/src/test/regress/expected/plancache.out
+++ b/src/test/regress/expected/plancache.out
@@ -459,4 +459,52 @@ NOTICE:  creating index on partition inval_during_pruning_p1
 drop table inval_during_pruning_p, inval_during_pruning_signal;
 drop function invalidate_plancache_func, stable_pruning_val;
 deallocate inval_during_pruning_q;
+-- exercise sql-function plan cache when rewrite expands a single statement
+-- into multiple planned statements. this forces cachedplan->stmt_list to
+-- contain more than one entry and checks that executor state for the first
+-- rewritten statement does not destroy state needed by the second one.
+set plan_cache_mode = force_generic_plan;
+create table sqlf_base(id int, val int) partition by list (id);
+create table sqlf_base_1 partition of sqlf_base for values in (1);
+create table sqlf_base_2 partition of sqlf_base for values in (2);
+create table sqlf_log(id int, note text);
+insert into sqlf_base values (1, 10);
+create rule sqlf_base_upd_log as
+on update to sqlf_base do also
+	insert into sqlf_log(id, note)
+	values (new.id, 'logged by rule');
+create or replace function sqlf_execprep_test(a int, v int)
+returns void
+language sql
+as $$
+	update sqlf_base set val = v where id = a;
+$$;
+select sqlf_execprep_test(1, 20);
+ sqlf_execprep_test 
+--------------------
+ 
+(1 row)
+
+select sqlf_execprep_test(1, 30);
+ sqlf_execprep_test 
+--------------------
+ 
+(1 row)
+
+select * from sqlf_base order by 1;
+ id | val 
+----+-----
+  1 |  30
+(1 row)
+
+select * from sqlf_log order by 1;
+ id |      note      
+----+----------------
+  1 | logged by rule
+  1 | logged by rule
+(2 rows)
+
+drop rule sqlf_base_upd_log on sqlf_base;
+drop table sqlf_base, sqlf_log;
+drop function sqlf_execprep_test;
 reset plan_cache_mode;
diff --git a/src/test/regress/sql/plancache.sql b/src/test/regress/sql/plancache.sql
index 139b4688fd6..b89c9ad69a4 100644
--- a/src/test/regress/sql/plancache.sql
+++ b/src/test/regress/sql/plancache.sql
@@ -273,4 +273,38 @@ drop table inval_during_pruning_p, inval_during_pruning_signal;
 drop function invalidate_plancache_func, stable_pruning_val;
 deallocate inval_during_pruning_q;
 
+-- exercise sql-function plan cache when rewrite expands a single statement
+-- into multiple planned statements. this forces cachedplan->stmt_list to
+-- contain more than one entry and checks that executor state for the first
+-- rewritten statement does not destroy state needed by the second one.
+
+set plan_cache_mode = force_generic_plan;
+
+create table sqlf_base(id int, val int) partition by list (id);
+create table sqlf_base_1 partition of sqlf_base for values in (1);
+create table sqlf_base_2 partition of sqlf_base for values in (2);
+create table sqlf_log(id int, note text);
+
+insert into sqlf_base values (1, 10);
+
+create rule sqlf_base_upd_log as
+on update to sqlf_base do also
+	insert into sqlf_log(id, note)
+	values (new.id, 'logged by rule');
+
+create or replace function sqlf_execprep_test(a int, v int)
+returns void
+language sql
+as $$
+	update sqlf_base set val = v where id = a;
+$$;
+
+select sqlf_execprep_test(1, 20);
+select sqlf_execprep_test(1, 30);
+select * from sqlf_base order by 1;
+select * from sqlf_log order by 1;
+
+drop rule sqlf_base_upd_log on sqlf_base;
+drop table sqlf_base, sqlf_log;
+drop function sqlf_execprep_test;
 reset plan_cache_mode;
-- 
2.47.3
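
The postquel_start() hunk in the patch above reparents the prep EState's
es_query_cxt under fcache->subcontext so that it is released with the rest of
the per-call state.  As a rough illustration of why reparenting changes the
lifetime, here is a toy model of a context tree.  It is only a sketch: the
ToyContext type and every toy_* function are hypothetical stand-ins written for
this note, not PostgreSQL's MemoryContext API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy context node; NOT PostgreSQL's MemoryContextData. */
typedef struct ToyContext
{
	struct ToyContext *parent;
	struct ToyContext *firstchild;
	struct ToyContext *nextsibling;
	int		   *freed_flag;		/* set to 1 when this context is destroyed */
} ToyContext;

/* Detach cxt from its current parent's child list, if it has a parent. */
void
toy_unlink(ToyContext *cxt)
{
	ToyContext **link;

	if (cxt->parent == NULL)
		return;
	for (link = &cxt->parent->firstchild; *link != NULL;
		 link = &(*link)->nextsibling)
	{
		if (*link == cxt)
		{
			*link = cxt->nextsibling;
			break;
		}
	}
	cxt->parent = NULL;
	cxt->nextsibling = NULL;
}

/* Move cxt under new_parent (hypothetical analogue of reparenting). */
void
toy_set_parent(ToyContext *cxt, ToyContext *new_parent)
{
	toy_unlink(cxt);
	if (new_parent != NULL)
	{
		cxt->parent = new_parent;
		cxt->nextsibling = new_parent->firstchild;
		new_parent->firstchild = cxt;
	}
}

ToyContext *
toy_create(ToyContext *parent, int *freed_flag)
{
	ToyContext *cxt = calloc(1, sizeof(ToyContext));

	assert(cxt != NULL);
	cxt->freed_flag = freed_flag;
	toy_set_parent(cxt, parent);
	return cxt;
}

/* Destroy every descendant of cxt, but not cxt itself. */
void
toy_delete_children(ToyContext *cxt)
{
	while (cxt->firstchild != NULL)
	{
		ToyContext *child = cxt->firstchild;

		toy_delete_children(child);
		toy_unlink(child);
		if (child->freed_flag != NULL)
			*child->freed_flag = 1;
		free(child);
	}
}

/* Returns 1 if the prep context dies only after being reparented. */
int
toy_demo_reparent(void)
{
	int			prep_freed = 0;
	int			survived_first_reset;
	ToyContext *fcontext = toy_create(NULL, NULL);		/* long-lived */
	ToyContext *subcontext = toy_create(fcontext, NULL);	/* per call */
	ToyContext *prep = toy_create(fcontext, &prep_freed);

	toy_delete_children(subcontext);	/* prep is a sibling: untouched */
	survived_first_reset = (prep_freed == 0);

	toy_set_parent(prep, subcontext);
	toy_delete_children(subcontext);	/* prep now goes with the call */

	toy_delete_children(fcontext);
	free(fcontext);
	return survived_first_reset && prep_freed == 1;
}
```

Calling toy_demo_reparent() from a small main() should return 1 in this model:
the context created under the long-lived parent survives a per-call cleanup
until it is moved under the per-call parent.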



  [application/octet-stream] v7-0002-Introduce-ExecutorPrep-and-refactor-executor-star.patch (27.6K, 4-v7-0002-Introduce-ExecutorPrep-and-refactor-executor-star.patch)
From aeaaa5059a7be06c301b1372c16829225b2770fb Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Tue, 11 Nov 2025 21:47:46 +0900
Subject: [PATCH v7 2/6] Introduce ExecutorPrep and refactor executor startup

Factor permission checks, range table initialization, and initial
partition pruning out of InitPlan() into a new ExecutorPrep()
helper. ExecutorPrep() builds an EState containing the executor
metadata needed before plan execution, including partition
pruning state where partPruneInfos are present, and returns it
directly to the caller.

ExecutorStart() now checks if QueryDesc->estate is already set
(indicating ExecutorPrep() was called earlier). If so, it reuses
the EState to avoid redoing range table setup and pruning.
Otherwise, it invokes ExecutorPrep() itself and adopts the
resulting EState for the duration of the query. This keeps the
executor startup behavior unchanged while making the setup work
callable separately when needed.

CreateQueryDesc() grows a prep_estate argument to accept an
optionally pre-created EState and stores it in the QueryDesc.
Portals, SPI, SQL functions, and EXPLAIN are wired to carry
optional EState pointers alongside the PlannedStmt list, but most
callers still pass NULL and let ExecutorStart() perform the setup
lazily.

ExecutorPrep() requires the caller to have established an active
snapshot, as partition pruning expressions may call PL functions
that internally require one (e.g., via EnsurePortalSnapshotExists()).

Update executor/README and related comments to document the new
control flow and the separation between preparation and execution.

Note that as of this commit, ExecutorStart() is the only caller of
ExecutorPrep(), so there is no semantic change in behavior. Later
commits will add specialized callers that invoke ExecutorPrep()
earlier to enable pruning-aware locking in cached plans.
---
 src/backend/commands/copyto.c       |   2 +-
 src/backend/commands/createas.c     |   2 +-
 src/backend/commands/explain.c      |   8 +-
 src/backend/commands/extension.c    |   2 +-
 src/backend/commands/matview.c      |   2 +-
 src/backend/commands/portalcmds.c   |   1 +
 src/backend/commands/prepare.c      |   9 +-
 src/backend/executor/README         |  11 +-
 src/backend/executor/execMain.c     | 176 +++++++++++++++++++++++-----
 src/backend/executor/execParallel.c |   3 +-
 src/backend/executor/functions.c    |   3 +-
 src/backend/executor/spi.c          |   9 +-
 src/backend/tcop/postgres.c         |   2 +
 src/backend/tcop/pquery.c           |  24 +++-
 src/backend/utils/mmgr/portalmem.c  |   2 +
 src/include/commands/explain.h      |   3 +-
 src/include/executor/execdesc.h     |   5 +-
 src/include/executor/executor.h     |  26 ++++
 src/include/nodes/execnodes.h       |   1 -
 src/include/utils/portal.h          |   2 +
 20 files changed, 241 insertions(+), 52 deletions(-)

diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index 9ceeff6d99e..ef1ee2568c6 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -875,7 +875,7 @@ BeginCopyTo(ParseState *pstate,
 		cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
-											dest, NULL, NULL, 0);
+											dest, NULL, NULL, 0, NULL);
 
 		/*
 		 * Call ExecutorStart to prepare the plan for execution.
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 270e9bf3110..b4a9808955a 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -336,7 +336,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
 		/* Create a QueryDesc, redirecting output to our tuple receiver */
 		queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
 									GetActiveSnapshot(), InvalidSnapshot,
-									dest, params, queryEnv, 0);
+									dest, params, queryEnv, 0, NULL);
 
 		/* call ExecutorStart to prepare the plan for execution */
 		ExecutorStart(queryDesc, GetIntoRelEFlags(into));
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 93918a223b8..40564d4dff9 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -370,7 +370,7 @@ standard_ExplainOneQuery(Query *query, int cursorOptions,
 	}
 
 	/* run it (if needed) and produce output */
-	ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+	ExplainOnePlan(plan, NULL, into, es, queryString, params, queryEnv,
 				   &planduration, (es->buffers ? &bufusage : NULL),
 				   es->memory ? &mem_counters : NULL);
 }
@@ -492,7 +492,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
  * to call it.
  */
 void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, EState *prep_estate,
+			   IntoClause *into, ExplainState *es,
 			   const char *queryString, ParamListInfo params,
 			   QueryEnvironment *queryEnv, const instr_time *planduration,
 			   const BufferUsage *bufusage,
@@ -550,7 +551,8 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
 	/* Create a QueryDesc for the query */
 	queryDesc = CreateQueryDesc(plannedstmt, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
-								dest, params, queryEnv, instrument_option);
+								dest, params, queryEnv, instrument_option,
+								prep_estate);
 
 	/* Select execution options */
 	if (es->analyze)
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index 963618a64c4..ff759ddd07c 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -1173,7 +1173,7 @@ execute_sql_string(const char *sql, const char *filename)
 				qdesc = CreateQueryDesc(stmt,
 										sql,
 										GetActiveSnapshot(), NULL,
-										dest, NULL, NULL, 0);
+										dest, NULL, NULL, 0, NULL);
 
 				ExecutorStart(qdesc, 0);
 				ExecutorRun(qdesc, ForwardScanDirection, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 81a55a33ef2..2cdfdcf984b 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -439,7 +439,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
 	/* Create a QueryDesc, redirecting output to our tuple receiver */
 	queryDesc = CreateQueryDesc(plan, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
-								dest, NULL, NULL, 0);
+								dest, NULL, NULL, 0, NULL);
 
 	/* call ExecutorStart to prepare the plan for execution */
 	ExecutorStart(queryDesc, 0);
diff --git a/src/backend/commands/portalcmds.c b/src/backend/commands/portalcmds.c
index 01efac3319e..1e880a6d7c9 100644
--- a/src/backend/commands/portalcmds.c
+++ b/src/backend/commands/portalcmds.c
@@ -118,6 +118,7 @@ PerformCursorOpen(ParseState *pstate, DeclareCursorStmt *cstmt, ParamListInfo pa
 					  queryString,
 					  CMDTAG_SELECT,	/* cursor's query is always a SELECT */
 					  list_make1(plan),
+					  NIL,
 					  NULL);
 
 	/*----------
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 5b86a727587..005fbb48aa5 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -205,6 +205,7 @@ ExecuteQuery(ParseState *pstate,
 					  query_string,
 					  entry->plansource->commandTag,
 					  plan_list,
+					  NIL,
 					  cplan);
 
 	/*
@@ -575,7 +576,9 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	const char *query_string;
 	CachedPlan *cplan;
 	List	   *plan_list;
+	List	   *prep_estates;
 	ListCell   *p;
+	ListCell   *prep_lc;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
 	instr_time	planstart;
@@ -650,14 +653,18 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	}
 
 	plan_list = cplan->stmt_list;
+	prep_estates = NIL;
 
 	/* Explain each query */
+	prep_lc = list_head(prep_estates);
 	foreach(p, plan_list)
 	{
 		PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
+		EState *prep_estate = next_prep_estate(prep_estates, &prep_lc);
 
 		if (pstmt->commandType != CMD_UTILITY)
-			ExplainOnePlan(pstmt, into, es, query_string, paramLI, pstate->p_queryEnv,
+			ExplainOnePlan(pstmt, prep_estate,
+						   into, es, query_string, paramLI, pstate->p_queryEnv,
 						   &planduration, (es->buffers ? &bufusage : NULL),
 						   es->memory ? &mem_counters : NULL);
 		else
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 54f4782f31b..d749ceb6687 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -291,11 +291,18 @@ Query Processing Control Flow
 
 This is a sketch of control flow for full query processing:
 
+	ExecutorPrep
+		May be run before ExecutorStart (e.g., for plan validation), or
+		implicitly from ExecutorStart if not done earlier.  Creates EState,
+		performs range table initialization, permission checks, and initial
+		partition pruning.  Returns the EState that ExecutorStart() should
+		reuse.
+
 	CreateQueryDesc
 
 	ExecutorStart
-		CreateExecutorState
-			creates per-query context
+		ExecutorPrep (if not already done, indicated by NULL QueryDesc.estate)
+			creates EState and per-query context
 		switch to per-query context to run ExecInitNode
 		AfterTriggerBeginQuery
 		ExecInitNode --- recursively scans plan tree
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 654f9246ad0..d7e99690c7f 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -55,6 +55,7 @@
 #include "parser/parse_relation.h"
 #include "pgstat.h"
 #include "rewrite/rewriteHandler.h"
+#include "storage/lmgr.h"
 #include "tcop/utility.h"
 #include "utils/acl.h"
 #include "utils/backend_status.h"
@@ -145,7 +146,6 @@ standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
 
 	/* sanity checks: queryDesc must not be started already */
 	Assert(queryDesc != NULL);
-	Assert(queryDesc->estate == NULL);
 
 	/* caller must ensure the query's snapshot is active */
 	Assert(GetActiveSnapshot() == queryDesc->snapshot);
@@ -171,9 +171,71 @@ standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
 
 	/*
 	 * Build EState, switch into per-query memory context for startup.
-	 */
-	estate = CreateExecutorState();
-	queryDesc->estate = estate;
+	 *
+	 * If ExecutorPrep() ran earlier (e.g., to do initial pruning during plan
+	 * validity checking), reuse its EState to avoid redoing range table setup
+	 * and pruning. Otherwise, create a fresh EState as usual.
+	 *
+	 * In assert builds, verify that the expected locks are held.  When
+	 * no prep EState was provided, AcquireExecutorLocks() should have
+	 * locked every relation in the plan.  When one was provided,
+	 * pruning-aware locking should have locked at least the unpruned
+	 * relations.  Both checks are skipped in parallel workers, which
+	 * acquire relation locks lazily in ExecGetRangeTableRelation().
+	 */
+	if (queryDesc->estate == NULL)
+	{
+#ifdef USE_ASSERT_CHECKING
+		if (!IsParallelWorker())
+		{
+			ListCell   *lc;
+
+			foreach(lc, queryDesc->plannedstmt->rtable)
+			{
+				RangeTblEntry *rte = lfirst_node(RangeTblEntry, lc);
+
+				if (rte->rtekind == RTE_RELATION ||
+					(rte->rtekind == RTE_SUBQUERY && rte->relid != InvalidOid))
+					Assert(CheckRelationOidLockedByMe(rte->relid,
+													  rte->rellockmode,
+													  true));
+			}
+		}
+#endif
+		queryDesc->estate = ExecutorPrep(queryDesc->plannedstmt,
+										 queryDesc->params,
+										 CurrentResourceOwner,
+										 true,
+										 eflags);
+	}
+#ifdef USE_ASSERT_CHECKING
+	else
+	{
+		/*
+		 * A prep EState was provided, meaning pruning-aware locking
+		 * should have locked at least the unpruned relations.
+		 */
+		if (!IsParallelWorker())
+		{
+			int		rtindex = -1;
+
+			while ((rtindex = bms_next_member(queryDesc->estate->es_unpruned_relids,
+											  rtindex)) >= 0)
+			{
+				RangeTblEntry *rte = exec_rt_fetch(rtindex, queryDesc->estate);
+
+				Assert(rte->rtekind == RTE_RELATION ||
+					   (rte->rtekind == RTE_SUBQUERY &&
+						rte->relid != InvalidOid));
+				Assert(CheckRelationOidLockedByMe(rte->relid,
+												  rte->rellockmode, true));
+			}
+		}
+	}
+#endif
+
+	estate = queryDesc->estate;
+	Assert(estate);
 
 	oldcontext = MemoryContextSwitchTo(estate->es_query_cxt);
 
@@ -263,6 +325,84 @@ standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
 	MemoryContextSwitchTo(oldcontext);
 }
 
+/*
+ * ExecutorPrep: prepare executor state for a PlannedStmt outside ExecutorStart.
+ *
+ * Performs range table initialization, permission checks, and initial
+ * partition pruning if partPruneInfos are present and do_initial_pruning is
+ * true.
+ *
+ * This is intended for callers that need executor metadata ahead of actual
+ * execution. Typical use cases include:
+ *	- determining which relations must be locked during plan cache validation;
+ *	- initializing unpruned relids and valid subplans in parallel workers
+ *	  using state copied from the leader.
+ *
+ * The executor can reuse the resulting state to avoid redundant setup during
+ * ExecutorStart().
+ *
+ * Returns an EState that can be reused later.
+ */
+EState *
+ExecutorPrep(PlannedStmt *pstmt, ParamListInfo params, ResourceOwner owner,
+			 bool do_initial_pruning, int eflags)
+{
+	ResourceOwner oldowner;
+	EState *estate;
+
+	if (pstmt->commandType == CMD_UTILITY)
+		return NULL;
+
+	/* Caller must have established an active snapshot. */
+	Assert(ActiveSnapshotSet());
+
+	estate = CreateExecutorState();
+	estate->es_plannedstmt = pstmt;
+	estate->es_part_prune_infos = pstmt->partPruneInfos;
+	estate->es_param_list_info = params;
+	estate->es_top_eflags = eflags;
+
+	/*
+	 * Do permissions checks.
+	 */
+	ExecCheckPermissions(pstmt->rtable, pstmt->permInfos, true);
+
+	/*
+	 * Initialize range table.
+	 */
+	ExecInitRangeTable(estate, pstmt->rtable, pstmt->permInfos,
+					   bms_copy(pstmt->unprunableRelids));
+
+	/*
+	 * Ensure locks taken during initial pruning are tracked under the given
+	 * ResourceOwner (e.g., one associated with CachedPlan validation).
+	 */
+	oldowner = CurrentResourceOwner;
+	CurrentResourceOwner = owner;
+
+	/*
+	 * Set up PartitionPruneState structures needed for both initial and
+	 * runtime partition pruning. These structures are built from the
+	 * PartitionPruneInfo entries in the plan tree.
+	 *
+	 * If do_initial_pruning is true, also perform initial pruning to compute
+	 * the subset of child subplans that will be executed. The results,
+	 * which are bitmapsets of selected child indexes, are saved in
+	 * es_part_prune_results. This list is parallel to es_part_prune_infos.
+	 *
+	 * In parallel workers, do_initial_pruning should be false -- they receive
+	 * es_part_prune_results from the leader process and should only initialize
+	 * the PartitionPruneStates.
+	 */
+	ExecCreatePartitionPruneStates(estate);
+	if (do_initial_pruning)
+		ExecDoInitialPruning(estate);
+
+	CurrentResourceOwner = oldowner;
+
+	return estate;
+}
+
 /* ----------------------------------------------------------------
  *		ExecutorRun
  *
@@ -838,38 +978,14 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 	CmdType		operation = queryDesc->operation;
 	PlannedStmt *plannedstmt = queryDesc->plannedstmt;
 	Plan	   *plan = plannedstmt->planTree;
-	List	   *rangeTable = plannedstmt->rtable;
 	EState	   *estate = queryDesc->estate;
 	PlanState  *planstate;
 	TupleDesc	tupType;
 	ListCell   *l;
 	int			i;
 
-	/*
-	 * Do permissions checks
-	 */
-	ExecCheckPermissions(rangeTable, plannedstmt->permInfos, true);
-
-	/*
-	 * initialize the node's execution state
-	 */
-	ExecInitRangeTable(estate, rangeTable, plannedstmt->permInfos,
-					   bms_copy(plannedstmt->unprunableRelids));
-
-	estate->es_plannedstmt = plannedstmt;
-	estate->es_part_prune_infos = plannedstmt->partPruneInfos;
-
-	/*
-	 * Perform runtime "initial" pruning to identify which child subplans,
-	 * corresponding to the children of plan nodes that contain
-	 * PartitionPruneInfo such as Append, will not be executed. The results,
-	 * which are bitmapsets of indexes of the child subplans that will be
-	 * executed, are saved in es_part_prune_results.  These results correspond
-	 * to each PartitionPruneInfo entry, and the es_part_prune_results list is
-	 * parallel to es_part_prune_infos.
-	 */
-	ExecCreatePartitionPruneStates(estate);
-	ExecDoInitialPruning(estate);
+	/* ExecutorPrep() must have been done. */
+	Assert(queryDesc->estate);
 
 	/*
 	 * Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index ac84af294c9..024780d3516 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -1300,7 +1300,8 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 	return CreateQueryDesc(pstmt,
 						   queryString,
 						   GetActiveSnapshot(), InvalidSnapshot,
-						   receiver, paramLI, NULL, instrument_options);
+						   receiver, paramLI, NULL, instrument_options,
+						   NULL);
 }
 
 /*
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index 4ca342a43ef..c93e2664cfd 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -1368,7 +1368,8 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
 							 dest,
 							 fcache->paramLI,
 							 es->qd ? es->qd->queryEnv : NULL,
-							 0);
+							 0,
+							 NULL);
 
 	/* Utility commands don't need Executor. */
 	if (es->qd->operation != CMD_UTILITY)
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 3019a3b2b97..994a69a1c8e 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1685,6 +1685,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 					  query_string,
 					  plansource->commandTag,
 					  stmt_list,
+					  NIL,
 					  cplan);
 
 	/*
@@ -2499,6 +2500,8 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 		CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
 		List	   *stmt_list;
 		ListCell   *lc2;
+		List	   *prep_estates;
+		ListCell   *prep_lc;
 
 		spicallbackarg.query = plansource->query_string;
 
@@ -2577,6 +2580,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 							  plan_owner, _SPI_current->queryEnv);
 
 		stmt_list = cplan->stmt_list;
+		prep_estates = NIL;
 
 		/*
 		 * If we weren't given a specific snapshot to use, and the statement
@@ -2614,9 +2618,11 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 			}
 		}
 
+		prep_lc = list_head(prep_estates);
 		foreach(lc2, stmt_list)
 		{
 			PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+			EState *prep_estate = next_prep_estate(prep_estates, &prep_lc);
 			bool		canSetTag = stmt->canSetTag;
 			DestReceiver *dest;
 
@@ -2694,7 +2700,8 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 										dest,
 										options->params,
 										_SPI_current->queryEnv,
-										0);
+										0,
+										prep_estate);
 				res = _SPI_pquery(qdesc, fire_triggers,
 								  canSetTag ? options->tcount : 0);
 				FreeQueryDesc(qdesc);
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index d01a09dd0c4..cd1e429ceed 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1230,6 +1230,7 @@ exec_simple_query(const char *query_string)
 						  query_string,
 						  commandTag,
 						  plantree_list,
+						  NIL,
 						  NULL);
 
 		/*
@@ -2029,6 +2030,7 @@ exec_bind_message(StringInfo input_message)
 					  query_string,
 					  psrc->commandTag,
 					  cplan->stmt_list,
+					  NIL,
 					  cplan);
 
 	/* Portal is defined, set the plan ID based on its contents. */
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index d8fc75d0bb9..b18266487bb 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -37,6 +37,7 @@ Portal		ActivePortal = NULL;
 
 
 static void ProcessQuery(PlannedStmt *plan,
+						 EState *prep_estate,
 						 const char *sourceText,
 						 ParamListInfo params,
 						 QueryEnvironment *queryEnv,
@@ -72,7 +73,8 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
 				DestReceiver *dest,
 				ParamListInfo params,
 				QueryEnvironment *queryEnv,
-				int instrument_options)
+				int instrument_options,
+				EState *prep_estate)
 {
 	QueryDesc  *qd = palloc_object(QueryDesc);
 
@@ -93,6 +95,9 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
 	qd->planstate = NULL;
 	qd->totaltime = NULL;
 
+	/* Use the EState created by ExecutorPrep() if already done. */
+	qd->estate = prep_estate;
+
 	/* not yet executed */
 	qd->already_executed = false;
 
@@ -123,6 +128,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  *		PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
  *
  *	plan: the plan tree for the query
+ *	prep_estate: EState created in ExecutorPrep() for the query, if any
  *	sourceText: the source text of the query
  *	params: any parameters needed
  *	dest: where to send results
@@ -135,6 +141,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  */
 static void
 ProcessQuery(PlannedStmt *plan,
+			 EState *prep_estate,
 			 const char *sourceText,
 			 ParamListInfo params,
 			 QueryEnvironment *queryEnv,
@@ -148,7 +155,8 @@ ProcessQuery(PlannedStmt *plan,
 	 */
 	queryDesc = CreateQueryDesc(plan, sourceText,
 								GetActiveSnapshot(), InvalidSnapshot,
-								dest, params, queryEnv, 0);
+								dest, params, queryEnv, 0,
+								prep_estate);
 
 	/*
 	 * Call ExecutorStart to prepare the plan for execution
@@ -495,7 +503,10 @@ PortalStart(Portal portal, ParamListInfo params,
 											None_Receiver,
 											params,
 											portal->queryEnv,
-											0);
+											0,
+											portal->prep_estates ?
+											(EState *) linitial(portal->prep_estates) :
+											NULL);
 
 				/*
 				 * If it's a scrollable cursor, executor needs to support
@@ -1185,6 +1196,7 @@ PortalRunMulti(Portal portal,
 {
 	bool		active_snapshot_set = false;
 	ListCell   *stmtlist_item;
+	ListCell   *prep_lc;
 
 	/*
 	 * If the destination is DestRemoteExecute, change to DestNone.  The
@@ -1205,9 +1217,11 @@ PortalRunMulti(Portal portal,
 	 * Loop to handle the individual queries generated from a single parsetree
 	 * by analysis and rewrite.
 	 */
+	prep_lc = list_head(portal->prep_estates);
 	foreach(stmtlist_item, portal->stmts)
 	{
 		PlannedStmt *pstmt = lfirst_node(PlannedStmt, stmtlist_item);
+		EState *prep_estate = next_prep_estate(portal->prep_estates, &prep_lc);
 
 		/*
 		 * If we got a cancel signal in prior command, quit
@@ -1265,7 +1279,7 @@ PortalRunMulti(Portal portal,
 			if (pstmt->canSetTag)
 			{
 				/* statement can set tag string */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, prep_estate,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
@@ -1274,7 +1288,7 @@ PortalRunMulti(Portal portal,
 			else
 			{
 				/* stmt added by rewrite cannot set tag */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, prep_estate,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index c1a53e658cb..941e95010c3 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -284,6 +284,7 @@ PortalDefineQuery(Portal portal,
 				  const char *sourceText,
 				  CommandTag commandTag,
 				  List *stmts,
+				  List *prep_estates,
 				  CachedPlan *cplan)
 {
 	Assert(PortalIsValid(portal));
@@ -297,6 +298,7 @@ PortalDefineQuery(Portal portal,
 	portal->commandTag = commandTag;
 	SetQueryCompletion(&portal->qc, commandTag, 0);
 	portal->stmts = stmts;
+	portal->prep_estates = prep_estates;
 	portal->cplan = cplan;
 	portal->status = PORTAL_DEFINED;
 }
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 86226f8db70..3756a11345f 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -63,7 +63,8 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
 							  ExplainState *es, ParseState *pstate,
 							  ParamListInfo params);
 
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt, EState *prep_estate,
+						   IntoClause *into,
 						   ExplainState *es, const char *queryString,
 						   ParamListInfo params, QueryEnvironment *queryEnv,
 						   const instr_time *planduration,
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index d3a57242844..3a2169c9613 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -43,7 +43,7 @@ typedef struct QueryDesc
 	QueryEnvironment *queryEnv; /* query environment passed in */
 	int			instrument_options; /* OR of InstrumentOption flags */
 
-	/* These fields are set by ExecutorStart */
+	/* These fields are set by ExecutorStart or ExecutorPrep */
 	TupleDesc	tupDesc;		/* descriptor for result tuples */
 	EState	   *estate;			/* executor's query-wide state */
 	PlanState  *planstate;		/* tree of per-plan-node state */
@@ -63,7 +63,8 @@ extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
 								  DestReceiver *dest,
 								  ParamListInfo params,
 								  QueryEnvironment *queryEnv,
-								  int instrument_options);
+								  int instrument_options,
+								  EState *prep_estate);
 
 extern void FreeQueryDesc(QueryDesc *qdesc);
 
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index d46ba59895d..e6fa122e6e4 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -20,6 +20,7 @@
 #include "nodes/lockoptions.h"
 #include "nodes/parsenodes.h"
 #include "utils/memutils.h"
+#include "utils/resowner.h"
 
 
 /*
@@ -234,6 +235,31 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
  */
 extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
 extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
+
+extern EState *ExecutorPrep(PlannedStmt *pstmt,
+							ParamListInfo params,
+							ResourceOwner owner,
+							bool do_initial_pruning,
+							int eflags);
+
+/*
+ * Walk a prep_estates list in step with a parallel stmt_list iteration.
+ * Returns the next EState (or NULL) and advances *lc.  Safe when
+ * prep_estates is NIL; just returns NULL for every call.
+ */
+static inline EState *
+next_prep_estate(List *prep_estates, ListCell **lc)
+{
+	EState *result = NULL;
+
+	if (*lc != NULL)
+	{
+		result = (EState *) lfirst(*lc);
+		*lc = lnext(prep_estates, *lc);
+	}
+	return result;
+}
+
 extern void ExecutorRun(QueryDesc *queryDesc,
 						ScanDirection direction, uint64 count);
 extern void standard_ExecutorRun(QueryDesc *queryDesc,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 63c067d5aae..84d80e3ab0d 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -775,7 +775,6 @@ typedef struct EState
 	List	   *es_insert_pending_modifytables;
 } EState;
 
-
 /*
  * ExecRowMark -
  *	   runtime representation of FOR [KEY] UPDATE/SHARE clauses
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index a7bedb12c18..f69b4b9b479 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -137,6 +137,7 @@ typedef struct PortalData
 	CommandTag	commandTag;		/* command tag for original query */
 	QueryCompletion qc;			/* command completion data for executed query */
 	List	   *stmts;			/* list of PlannedStmts */
+	List	   *prep_estates;	/* list of EStates where needed */
 	CachedPlan *cplan;			/* CachedPlan, if stmts are from one */
 
 	ParamListInfo portalParams; /* params to pass to query */
@@ -240,6 +241,7 @@ extern void PortalDefineQuery(Portal portal,
 							  const char *sourceText,
 							  CommandTag commandTag,
 							  List *stmts,
+							  List *prep_estates,
 							  CachedPlan *cplan);
 extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
 extern void PortalCreateHoldStore(Portal portal);
-- 
2.47.3
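
To make the control flow described in the 0002 commit message a bit more
concrete, here is a self-contained toy model of the two pieces: ExecutorStart()
reusing an EState when one was handed in and doing the prep work itself
otherwise, and the next_prep_estate() style walk over a list kept in step with
the statement list.  This is a sketch only.  The Toy* types and toy_* functions
are hypothetical, use plain arrays instead of PostgreSQL Lists, and none of it
is the patch's actual code.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins; NOT the real EState or QueryDesc. */
typedef struct ToyEState
{
	int			stmt_id;		/* statement this state was prepared for */
} ToyEState;

typedef struct ToyQueryDesc
{
	int			stmt_id;
	ToyEState  *estate;			/* NULL until prep has happened */
} ToyQueryDesc;

/* Counts how often the (pretend) setup work ran. */
int			toy_prep_calls = 0;

/* Models the prep step: fill caller-provided storage and return it. */
ToyEState *
toy_executor_prep(int stmt_id, ToyEState *storage)
{
	toy_prep_calls++;
	storage->stmt_id = stmt_id;
	return storage;
}

/* Models the start step: reuse a provided state, else prep lazily. */
void
toy_executor_start(ToyQueryDesc *qd, ToyEState *scratch)
{
	if (qd->estate == NULL)
		qd->estate = toy_executor_prep(qd->stmt_id, scratch);
	assert(qd->estate->stmt_id == qd->stmt_id);
}

/*
 * Array-based analogue of walking a prep list in step with the statement
 * list.  Returns NULL for every call when no prep list was supplied.
 */
ToyEState *
toy_next_prep_estate(ToyEState **prep_estates, int nprep, int *pos)
{
	if (prep_estates == NULL || *pos >= nprep)
		return NULL;
	return prep_estates[(*pos)++];
}

/* Runs all statements; returns how many preps happened inside "start". */
int
toy_run_all(const int *stmt_ids, int nstmts,
			ToyEState **prep_estates, int nprep)
{
	int			before = toy_prep_calls;
	int			pos = 0;
	int			i;

	for (i = 0; i < nstmts; i++)
	{
		ToyEState	scratch;
		ToyQueryDesc qd;

		qd.stmt_id = stmt_ids[i];
		qd.estate = toy_next_prep_estate(prep_estates, nprep, &pos);
		toy_executor_start(&qd, &scratch);
	}
	return toy_prep_calls - before;
}
```

In this model, calling toy_run_all() with no prep list makes every statement do
its own prep inside the start step, which corresponds to the "most callers
still pass NULL" case in the commit message.  Passing states prepared ahead of
time makes the start step skip that work, and a NULL entry in the list falls
back to the lazy path for just that statement.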



  [application/octet-stream] v7-0004-Use-pruning-aware-locking-in-cached-plans.patch (36.1K, 5-v7-0004-Use-pruning-aware-locking-in-cached-plans.patch)
From e0130ef11bfb97dba5afce22370cba5f3741ab0a Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Tue, 11 Nov 2025 22:30:52 +0900
Subject: [PATCH v7 4/6] Use pruning-aware locking in cached plans

Extend GetCachedPlan() to perform ExecutorPrep() on each planned
statement, capturing unpruned relids and initial pruning results.
Use this data to acquire execution locks only on surviving partitions,
avoiding unnecessary locking of pruned tables even when using cached
plans.

Introduce CachedPlanPrepData to carry the EStates created by
ExecutorPrep() through the plan caching layer. The prep_estates
list is indexed one-to-one with CachedPlan->stmt_list and is
populated when GetCachedPlan() prepares a reused generic plan.
Adjust call sites in SPI, functions, portals, and EXPLAIN to
propagate this data.

Partition pruning expressions may call PL functions that require
an active snapshot (e.g., via EnsurePortalSnapshotExists()).
AcquireExecutorLocksUnpruned() establishes one before calling
ExecutorPrep() if needed, ensuring these expressions can execute
correctly during plan cache validation.

To maintain correctness when all target partitions are pruned, also
reinstate the firstResultRel locking behavior lost in commit
28317de72. That commit required the first ModifyTable target to
remain initialized for executor assumptions to hold. We now
explicitly track these relids in PlannerGlobal and PlannedStmt so they
are locked even if pruned, preserving that rule across cached plan
reuse.

Add a regression test that causes a generic plan to become invalid
while pruning-aware setup is running. The pruning expression calls a
function that can perform DDL on a partition, making the plan stale
during reuse.

The test's purpose is to drive execution through the invalidation
path that discards any ExecutorPrep state created before the plan was
found invalid, providing coverage for that cleanup logic.
---
 src/backend/commands/prepare.c                |  19 +-
 src/backend/executor/functions.c              |   1 +
 src/backend/executor/nodeModifyTable.c        |   5 +-
 src/backend/executor/spi.c                    |  26 +-
 src/backend/optimizer/plan/planner.c          |   1 +
 src/backend/optimizer/plan/setrefs.c          |  20 ++
 src/backend/tcop/postgres.c                   |   9 +-
 src/backend/utils/cache/plancache.c           | 255 +++++++++++++++++-
 src/include/nodes/pathnodes.h                 |   3 +
 src/include/nodes/plannodes.h                 |  10 +
 src/include/utils/plancache.h                 |  29 +-
 src/test/regress/expected/partition_prune.out |  50 +++-
 src/test/regress/expected/plancache.out       |  62 +++++
 src/test/regress/sql/partition_prune.sql      |  24 +-
 src/test/regress/sql/plancache.sql            |  51 ++++
 15 files changed, 536 insertions(+), 29 deletions(-)
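
The lock-set rule that AcquireExecutorLocksUnpruned() applies per
PlannedStmt (lock the unprunable rels first, then whatever survives
initial pruning, plus every first result rel) can be sketched in
isolation as below.  This is only an illustration under simplifying
assumptions, not code from the patch: RelidSet is a hypothetical 64-bit
stand-in for Bitmapset, and relid_bit() and
relids_to_lock_after_pruning() are made-up names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for Bitmapset: bit i set means RT index i is a member. */
typedef uint64_t RelidSet;

/* Build a one-member set for the given 1-based range table index. */
static RelidSet
relid_bit(unsigned rti)
{
	assert(rti >= 1 && rti < 64);
	return (RelidSet) 1 << rti;
}

/*
 * Compute the relations still to be locked once the unprunable ones are
 * locked: the unpruned set minus the unprunable set (to avoid locking
 * twice), plus each ModifyTable node's first result relation, whether or
 * not pruning removed it.
 */
static RelidSet
relids_to_lock_after_pruning(RelidSet unpruned, RelidSet unprunable,
							 const unsigned *first_result_rels,
							 size_t nfirst)
{
	RelidSet	lock_relids = unpruned & ~unprunable;

	for (size_t i = 0; i < nfirst; i++)
		lock_relids |= relid_bit(first_result_rels[i]);

	return lock_relids;
}
```

With RT index 1 unprunable, partition 4 surviving pruning and partition
3 recorded as a first result rel, the second pass would lock 3 and 4
only; index 1 is skipped because the first pass already covers it.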

diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 005fbb48aa5..e8cd47131ce 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -154,6 +154,7 @@ ExecuteQuery(ParseState *pstate,
 {
 	PreparedStatement *entry;
 	CachedPlan *cplan;
+	CachedPlanPrepData cprep = {0};
 	List	   *plan_list;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
@@ -193,7 +194,10 @@ ExecuteQuery(ParseState *pstate,
 									   entry->plansource->query_string);
 
 	/* Replan if needed, and increment plan refcount for portal */
-	cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL);
+	/* Keep ExecutorPrep state with the portal and its resowner. */
+	cprep.context = portal->portalContext;
+	cprep.owner = portal->resowner;
+	cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL, &cprep);
 	plan_list = cplan->stmt_list;
 
 	/*
@@ -205,7 +209,7 @@ ExecuteQuery(ParseState *pstate,
 					  query_string,
 					  entry->plansource->commandTag,
 					  plan_list,
-					  NIL,
+					  cprep.prep_estates,
 					  cplan);
 
 	/*
@@ -575,6 +579,7 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	PreparedStatement *entry;
 	const char *query_string;
 	CachedPlan *cplan;
+	CachedPlanPrepData cprep = {0};
 	List	   *plan_list;
 	List	   *prep_estates;
 	ListCell   *p;
@@ -633,8 +638,14 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	}
 
 	/* Replan if needed, and acquire a transient refcount */
+	/* ExecutorPrep state is local to this EXPLAIN EXECUTE call. */
+	cprep.context = CurrentMemoryContext;
+	cprep.owner = CurrentResourceOwner;
+	if (es->generic)
+		cprep.eflags = EXEC_FLAG_EXPLAIN_GENERIC;
 	cplan = GetCachedPlan(entry->plansource, paramLI,
-						  CurrentResourceOwner, pstate->p_queryEnv);
+						  CurrentResourceOwner, pstate->p_queryEnv,
+						  &cprep);
 
 	INSTR_TIME_SET_CURRENT(planduration);
 	INSTR_TIME_SUBTRACT(planduration, planstart);
@@ -653,7 +664,7 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	}
 
 	plan_list = cplan->stmt_list;
-	prep_estates = NIL;
+	prep_estates = cprep.prep_estates;
 
 	/* Explain each query */
 	prep_lc = list_head(prep_estates);
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index c93e2664cfd..65dfae58dcf 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -698,6 +698,7 @@ init_execution_state(SQLFunctionCachePtr fcache)
 	fcache->cplan = GetCachedPlan(plansource,
 								  fcache->paramLI,
 								  fcache->cowner,
+								  NULL,
 								  NULL);
 
 	/*
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 793c76d4f82..a7a4baaf8af 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -4858,8 +4858,8 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
 	 * as a reference for building the ResultRelInfo of the target partition.
 	 * In either case, it doesn't matter which result relation is kept, so we
 	 * just keep the first one, if all others have been pruned.  See also,
-	 * ExecDoInitialPruning(), which ensures that this first result relation
-	 * has been locked.
+	 * AcquireExecutorLocksUnpruned(), which ensures that this first result
+	 * relation has been locked.
 	 */
 	i = 0;
 	foreach(l, node->resultRelations)
@@ -4873,6 +4873,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
 			/* all result relations pruned; keep the first one */
 			keep_rel = true;
 			rti = linitial_int(node->resultRelations);
+			Assert(list_member_int(estate->es_plannedstmt->firstResultRels, rti));
 			i = 0;
 		}
 
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 994a69a1c8e..13703969dd8 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1579,6 +1579,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 {
 	CachedPlanSource *plansource;
 	CachedPlan *cplan;
+	CachedPlanPrepData cprep = {0};
 	List	   *stmt_list;
 	char	   *query_string;
 	Snapshot	snapshot;
@@ -1659,7 +1660,11 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 	 */
 
 	/* Replan if needed, and increment plan refcount for portal */
-	cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv);
+	/* ExecutorPrep state lives in this portal's context. */
+	cprep.context = portal->portalContext;
+	cprep.owner = portal->resowner;
+	cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv,
+						  &cprep);
 	stmt_list = cplan->stmt_list;
 
 	if (!plan->saved)
@@ -1685,7 +1690,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 					  query_string,
 					  plansource->commandTag,
 					  stmt_list,
-					  NIL,
+					  cprep.prep_estates,	/* lives in portalContext */
 					  cplan);
 
 	/*
@@ -2078,6 +2083,7 @@ SPI_plan_get_cached_plan(SPIPlanPtr plan)
 {
 	CachedPlanSource *plansource;
 	CachedPlan *cplan;
+	CachedPlanPrepData cprep = {0};
 	SPICallbackArg spicallbackarg;
 	ErrorContextCallback spierrcontext;
 
@@ -2101,9 +2107,13 @@ SPI_plan_get_cached_plan(SPIPlanPtr plan)
 	error_context_stack = &spierrcontext;
 
 	/* Get the generic plan for the query */
+	/* ExecutorPrep() state lives in caller's active context. */
+	cprep.context = CurrentMemoryContext;
+	cprep.owner = CurrentResourceOwner;
 	cplan = GetCachedPlan(plansource, NULL,
 						  plan->saved ? CurrentResourceOwner : NULL,
-						  _SPI_current->queryEnv);
+						  _SPI_current->queryEnv,
+						  &cprep);
 	Assert(cplan == plansource->gplan);
 
 	/* Pop the error context stack */
@@ -2502,6 +2512,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 		ListCell   *lc2;
 		List	   *prep_estates;
 		ListCell   *prep_lc;
+		CachedPlanPrepData cprep = {0};
 
 		spicallbackarg.query = plansource->query_string;
 
@@ -2576,11 +2587,16 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 		 * Replan if needed, and increment plan refcount.  If it's a saved
 		 * plan, the refcount must be backed by the plan_owner.
 		 */
+
+		/* ExecutorPrep state is per _SPI_execute_plan call. */
+		cprep.context = CurrentMemoryContext;
+		cprep.owner = CurrentResourceOwner;
 		cplan = GetCachedPlan(plansource, options->params,
-							  plan_owner, _SPI_current->queryEnv);
+							  plan_owner, _SPI_current->queryEnv,
+							  &cprep);
 
 		stmt_list = cplan->stmt_list;
-		prep_estates = NIL;
+		prep_estates = cprep.prep_estates;
 
 		/*
 		 * If we weren't given a specific snapshot to use, and the statement
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 42604a0f75c..afa61d357c5 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -657,6 +657,7 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 	result->permInfos = glob->finalrteperminfos;
 	result->subrtinfos = glob->subrtinfos;
 	result->resultRelations = glob->resultRelations;
+	result->firstResultRels = glob->firstResultRels;
 	result->appendRelations = glob->appendRelations;
 	result->subplans = glob->subplans;
 	result->rewindPlanIDs = glob->rewindPlanIDs;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 1b5b9b5ed9c..ddb7902bc89 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -384,6 +384,26 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 		}
 	}
 
+	/*
+	 * Record the first result relation if it belongs to the set of
+	 * initially prunable relations.  We use bms_next_member() to get
+	 * the lowest-numbered leaf result rel, which matches
+	 * linitial_int(ModifyTable.resultRelations) because partition
+	 * expansion preserves RT index order.  There is one ModifyTable
+	 * per query level, so this captures exactly one entry per level.
+	 * ExecInitModifyTable() asserts that the recorded index matches
+	 * what it actually needs.
+	 */
+	if (root->leaf_result_relids)
+	{
+		Index	firstResultRel = bms_next_member(root->leaf_result_relids, -1);
+
+		firstResultRel += rtoffset;
+		if (bms_is_member(firstResultRel, root->glob->prunableRelids))
+			root->glob->firstResultRels =
+				lappend_int(root->glob->firstResultRels, firstResultRel);
+	}
+
 	return result;
 }
 
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index cd1e429ceed..5c145a31274 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1636,6 +1636,7 @@ exec_bind_message(StringInfo input_message)
 	int16	   *rformats = NULL;
 	CachedPlanSource *psrc;
 	CachedPlan *cplan;
+	CachedPlanPrepData cprep = {0};
 	Portal		portal;
 	char	   *query_string;
 	char	   *saved_stmt_name;
@@ -2017,7 +2018,11 @@ exec_bind_message(StringInfo input_message)
 	 * will be generated in MessageContext.  The plan refcount will be
 	 * assigned to the Portal, so it will be released at portal destruction.
 	 */
-	cplan = GetCachedPlan(psrc, params, NULL, NULL);
+
+	/* ExecutorPrep() state lives in portal context. */
+	cprep.context = portal->portalContext;
+	cprep.owner = portal->resowner;
+	cplan = GetCachedPlan(psrc, params, NULL, NULL, &cprep);
 
 	/*
 	 * Now we can define the portal.
@@ -2030,7 +2035,7 @@ exec_bind_message(StringInfo input_message)
 					  query_string,
 					  psrc->commandTag,
 					  cplan->stmt_list,
-					  NIL,
+					  cprep.prep_estates,
 					  cplan);
 
 	/* Portal is defined, set the plan ID based on its contents. */
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 812e2265734..1d3244307da 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -93,14 +93,17 @@ static bool StmtPlanRequiresRevalidation(CachedPlanSource *plansource);
 static bool BuildingPlanRequiresSnapshot(CachedPlanSource *plansource);
 static List *RevalidateCachedQuery(CachedPlanSource *plansource,
 								   QueryEnvironment *queryEnv);
-static bool CheckCachedPlan(CachedPlanSource *plansource);
+static bool CheckCachedPlan(CachedPlanSource *plansource, CachedPlanPrepData *cprep);
 static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 								   ParamListInfo boundParams, QueryEnvironment *queryEnv);
 static bool choose_custom_plan(CachedPlanSource *plansource,
 							   ParamListInfo boundParams);
 static double cached_plan_cost(CachedPlan *plan, bool include_planner);
 static Query *QueryListGetPrimaryStmt(List *stmts);
-static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static void AcquireExecutorLocksAll(List *stmt_list, bool acquire);
+static void AcquireExecutorLocksUnpruned(List *stmt_list, bool acquire,
+										 CachedPlanPrepData *cprep);
+static void CachedPlanPrepCleanup(CachedPlanPrepData *cprep);
 static void AcquirePlannerLocks(List *stmt_list, bool acquire);
 static void ScanQueryForLocks(Query *parsetree, bool acquire);
 static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -942,6 +945,11 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
 /*
  * CheckCachedPlan: see if the CachedPlanSource's generic plan is valid.
  *
+ * If 'cprep' is not NULL, ExecutorPrep() is applied to each PlannedStmt to
+ * compute the set of partitions that survive initial runtime pruning in order
+ * to only lock them.  The EStates created to do so are saved in cprep for
+ * later reuse by ExecutorStart().
+ *
  * Caller must have already called RevalidateCachedQuery to verify that the
  * querytree is up to date.
  *
@@ -949,7 +957,7 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
  * (We must do this for the "true" result to be race-condition-free.)
  */
 static bool
-CheckCachedPlan(CachedPlanSource *plansource)
+CheckCachedPlan(CachedPlanSource *plansource, CachedPlanPrepData *cprep)
 {
 	CachedPlan *plan = plansource->gplan;
 
@@ -983,7 +991,10 @@ CheckCachedPlan(CachedPlanSource *plansource)
 		 */
 		Assert(plan->refcount > 0);
 
-		AcquireExecutorLocks(plan->stmt_list, true);
+		if (cprep)
+			AcquireExecutorLocksUnpruned(plan->stmt_list, true, cprep);
+		else
+			AcquireExecutorLocksAll(plan->stmt_list, true);
 
 		/*
 		 * If plan was transient, check to see if TransactionXmin has
@@ -1005,7 +1016,13 @@ CheckCachedPlan(CachedPlanSource *plansource)
 		}
 
 		/* Oops, the race case happened.  Release useless locks. */
-		AcquireExecutorLocks(plan->stmt_list, false);
+		if (cprep)
+			AcquireExecutorLocksUnpruned(plan->stmt_list, false, cprep);
+		else
+			AcquireExecutorLocksAll(plan->stmt_list, false);
+
+		/* Also clean up ExecutorPrep() state, if necessary. */
+		CachedPlanPrepCleanup(cprep);
 	}
 
 	/*
@@ -1285,6 +1302,11 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
  * On return, the plan is valid and we have sufficient locks to begin
  * execution.
  *
+ * If 'cprep' is not NULL and a generic plan is reused, the function prepares
+ * each PlannedStmt via ExecutorPrep() and stores the EStates in
+ * cprep->prep_estates.  These are intended to be passed later to
+ * ExecutorStart().
+ *
  * On return, the refcount of the plan has been incremented; a later
  * ReleaseCachedPlan() call is expected.  If "owner" is not NULL then
  * the refcount has been reported to that ResourceOwner (note that this
@@ -1295,7 +1317,8 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
  */
 CachedPlan *
 GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
-			  ResourceOwner owner, QueryEnvironment *queryEnv)
+			  ResourceOwner owner, QueryEnvironment *queryEnv,
+			  CachedPlanPrepData *cprep)
 {
 	CachedPlan *plan = NULL;
 	List	   *qlist;
@@ -1317,7 +1340,9 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 
 	if (!customplan)
 	{
-		if (CheckCachedPlan(plansource))
+		if (cprep)
+			cprep->params = boundParams;
+		if (CheckCachedPlan(plansource, cprep))
 		{
 			/* We want a generic plan, and we already have a valid one */
 			plan = plansource->gplan;
@@ -1904,11 +1929,13 @@ QueryListGetPrimaryStmt(List *stmts)
 }
 
 /*
- * AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
- * or release them if acquire is false.
+ * AcquireExecutorLocksAll: acquire locks needed for execution of a cached
+ * plan; or release them if acquire is false.
+ *
+ * This locks all relations in a given PlannedStmt's range table.
  */
 static void
-AcquireExecutorLocks(List *stmt_list, bool acquire)
+AcquireExecutorLocksAll(List *stmt_list, bool acquire)
 {
 	ListCell   *lc1;
 
@@ -1955,6 +1982,214 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 	}
 }
 
+/*
+ * LockRelids
+ * 		Acquire or release locks on the specified relids, which reference
+ * 		entries in the provided range table.
+ *
+ * Helper for AcquireExecutorLocksUnpruned().
+ */
+static void
+LockRelids(List *rtable, Bitmapset *relids, bool acquire)
+{
+	int	rtindex = -1;
+
+	while ((rtindex = bms_next_member(relids, rtindex)) >= 0)
+	{
+		RangeTblEntry *rte = list_nth_node(RangeTblEntry, rtable, rtindex - 1);
+
+		Assert(rte->rtekind == RTE_RELATION ||
+			   (rte->rtekind == RTE_SUBQUERY && OidIsValid(rte->relid)));
+
+		/*
+		 * Acquire the appropriate type of lock on each relation OID. Note
+		 * that we don't actually try to open the rel, and hence will not
+		 * fail if it's been dropped entirely --- we'll just transiently
+		 * acquire a non-conflicting lock.
+		 */
+		if (acquire)
+			LockRelationOid(rte->relid, rte->rellockmode);
+		else
+			UnlockRelationOid(rte->relid, rte->rellockmode);
+	}
+}
+
+/*
+ * AcquireExecutorLocksUnpruned
+ *		Acquire or release execution locks for only unpruned relations
+ *		referenced by the given PlannedStmts.
+ *
+ * On acquire, this:
+ *	- locks unprunable rels listed in PlannedStmt.unprunableRelids
+ *	- runs ExecutorPrep() to perform initial runtime pruning
+ *	- locks the surviving partitions reported in the prep estate
+ *	- appends the EState pointer for each PlannedStmt to cprep->prep_estates
+ *
+ * On release, it:
+ *	- looks up the EState for each PlannedStmt from cprep->prep_estates
+ *	  (which must already be populated)
+ *	- unlocks the same relations identified during acquire
+ *	- cleans up each EState
+ *
+ * prep_estates is extended during acquire and must match stmt_list one-to-one
+ * when releasing locks.  Memory allocation for EState happens in
+ * cprep->context.  Locks are acquired using cprep->owner.
+ */
+static void
+AcquireExecutorLocksUnpruned(List *stmt_list, bool acquire,
+							 CachedPlanPrepData *cprep)
+{
+	MemoryContext oldcontext;
+	ListCell   *lc1;
+	List	   *prep_estates;
+	ListCell   *prep_lc;
+
+	Assert(cprep);
+	oldcontext = MemoryContextSwitchTo(cprep->context);
+
+	/*
+	 * When releasing locks, use the EState list (if any) created during
+	 * acquisition to determine which relids to unlock. The list must match
+	 * the PlannedStmt list one-to-one.
+	 */
+	prep_estates = cprep->prep_estates;
+	Assert(acquire || list_length(prep_estates) == list_length(stmt_list));
+
+	prep_lc = list_head(prep_estates);
+	foreach(lc1, stmt_list)
+	{
+		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
+		EState *prep_estate;
+
+		if (plannedstmt->commandType == CMD_UTILITY)
+		{
+			/* Same as AcquireExecutorLocksAll(). */
+			Query	   *query = UtilityContainsQuery(plannedstmt->utilityStmt);
+
+			if (query)
+				ScanQueryForLocks(query, acquire);
+
+			/* Keep the list one-to-one with stmt_list. */
+			if (acquire)
+				cprep->prep_estates = lappend(cprep->prep_estates, NULL);
+			else
+				(void) next_prep_estate(prep_estates, &prep_lc);
+			continue;
+		}
+
+		/*
+		 * Lock tables mentioned in the original query and other unprunable
+		 * relations that were added to the plan via inheritance expansion.
+		 */
+		LockRelids(plannedstmt->rtable, plannedstmt->unprunableRelids, acquire);
+
+		/* Lock partitions surviving initial runtime pruning. */
+		if (acquire)
+		{
+			/*
+			 * Pruning expressions may call PL functions that require an active
+			 * snapshot (e.g., via EnsurePortalSnapshotExists()). Establish one
+			 * if needed.
+			 */
+			bool		snap_pushed = false;
+
+			if (!ActiveSnapshotSet())
+			{
+				PushActiveSnapshot(GetTransactionSnapshot());
+				snap_pushed = true;
+			}
+
+			prep_estate = ExecutorPrep(plannedstmt, cprep->params, cprep->owner, true,
+									   cprep->eflags);
+			Assert(prep_estate);
+			cprep->prep_estates = lappend(cprep->prep_estates, prep_estate);
+
+			if (snap_pushed)
+				PopActiveSnapshot();
+		}
+		else
+			prep_estate = next_prep_estate(prep_estates, &prep_lc);
+
+		if (prep_estate)
+		{
+			/*
+			 * es_unpruned_relids includes plannedstmt->unprunableRelids,
+			 * which we've already locked. Filter them out to avoid double-locking.
+			 */
+			Bitmapset *lock_relids = bms_difference(prep_estate->es_unpruned_relids,
+													plannedstmt->unprunableRelids);
+
+			/*
+			 * We must always lock the first result relation of each
+			 * ModifyTable node in the plan, that is, each relation listed in
+			 * plannedstmt->firstResultRels, even if it was pruned, to
+			 * satisfy the executor assumptions described in
+			 * ExecInitModifyTable().  This can be wasteful, because we
+			 * may not need to use the first result relation at all if other
+			 * result relations are unpruned and thus sufficient for the
+			 * ModifyTable node's needs.  Unfortunately, we don't have a
+			 * per-node unpruned_relids set to determine whether other
+			 * result relations are included.
+			 */
+			if (plannedstmt->resultRelations)
+			{
+				ListCell *lc2;
+
+				foreach(lc2, plannedstmt->firstResultRels)
+				{
+					Index       firstResultRel = lfirst_int(lc2);
+
+					if (!bms_is_member(firstResultRel, lock_relids))
+						lock_relids = bms_add_member(lock_relids, firstResultRel);
+				}
+			}
+
+			LockRelids(plannedstmt->rtable, lock_relids, acquire);
+			bms_free(lock_relids);
+		}
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+}
+
+/*
+ * CachedPlanPrepCleanup
+ *		Clean up the EStates built for a generic plan by ExecutorPrep().
+ *
+ * This is used in the corner case where CheckCachedPlan() discovers
+ * that a CachedPlan has become invalid after AcquireExecutorLocksUnpruned()
+ * has already run.  In that case we must both release the execution locks
+ * and dispose of the EState list stored in CachedPlanPrepData, since the
+ * executor will never see or clean it up.
+ */
+static void
+CachedPlanPrepCleanup(CachedPlanPrepData *cprep)
+{
+	ListCell   *lc;
+	ResourceOwner oldowner;
+
+	if (cprep == NULL)
+		return;
+
+	/* Switch to owner that ExecutorPrep() would have used. */
+	oldowner = CurrentResourceOwner;
+	CurrentResourceOwner = cprep->owner;
+	foreach(lc, cprep->prep_estates)
+	{
+		EState *prep_estate = (EState *) lfirst(lc);
+
+		if (prep_estate == NULL)
+			continue;
+
+		ExecCloseRangeTableRelations(prep_estate);
+		FreeExecutorState(prep_estate);
+	}
+	CurrentResourceOwner = oldowner;
+
+	list_free(cprep->prep_estates);
+	cprep->prep_estates = NIL;
+}
+
 /*
  * AcquirePlannerLocks: acquire locks needed for planning of a querytree list;
  * or release them if acquire is false.
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index c175ee95b68..989b3c73691 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -217,6 +217,9 @@ typedef struct PlannerGlobal
 	/* "flat" list of integer RT indexes */
 	List	   *resultRelations;
 
+	/* "flat" list of integer RT indexes (one per ModifyTable node) */
+	List	   *firstResultRels;
+
 	/* "flat" list of AppendRelInfos */
 	List	   *appendRelations;
 
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 8c9321aab8c..1431f12a6e8 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -123,6 +123,16 @@ typedef struct PlannedStmt
 	/* integer list of RT indexes, or NIL */
 	List	   *resultRelations;
 
+	/*
+	 * rtable indexes of first target relation in each ModifyTable node in the
+	 * plan for INSERT/UPDATE/DELETE/MERGE.  NIL if resultRelations is NIL.
+	 *
+	 * These are used by AcquireExecutorLocksUnpruned() to ensure that the
+	 * first result rel for each ModifyTable remains locked even if pruned;
+	 * see ExecInitModifyTable() for the executor side assumptions.
+	 */
+	List	   *firstResultRels;
+
 	/* list of AppendRelInfo nodes */
 	List	   *appendRelations;
 
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 984c51515c6..da3ce9f3177 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -197,6 +197,32 @@ typedef struct CachedExpression
 } CachedExpression;
 
 
+/*
+ * CachedPlanPrepData
+ *      Carries ExecutorPrep results for each PlannedStmt in a CachedPlan,
+ *      along with context and owner information needed to allocate them.
+ *
+ * prep_estates is indexed one-to-one with CachedPlan->stmt_list, and is
+ * populated when GetCachedPlan() prepares a reused generic plan.  If the
+ * plan is found invalid after locking, the same list is used to determine
+ * which relations to unlock before retrying.
+ *
+ * ExecutorPrep state is allocated in 'context' and owned by 'owner'.
+ *
+ * eflags controls ExecutorPrep() behavior during initial pruning.
+ * Normally zero; set EXEC_FLAG_EXPLAIN_GENERIC to suppress pruning
+ * in EXPLAIN (GENERIC_PLAN).  Need not match the eflags later passed
+ * to ExecutorStart().
+ */
+typedef struct CachedPlanPrepData
+{
+	List   *prep_estates;	/* one EState per PlannedStmt, or NULL */
+	ParamListInfo params;	/* params visible to ExecutorPrep */
+	MemoryContext context;	/* where to allocate EState and its fields */
+	ResourceOwner owner;	/* ResourceOwner for ExecutorPrep state */
+	int		eflags;			/* executor flags to control ExecutorPrep */
+} CachedPlanPrepData;
+
 extern void InitPlanCache(void);
 extern void ResetPlanCache(void);
 
@@ -240,7 +266,8 @@ extern List *CachedPlanGetTargetList(CachedPlanSource *plansource,
 extern CachedPlan *GetCachedPlan(CachedPlanSource *plansource,
 								 ParamListInfo boundParams,
 								 ResourceOwner owner,
-								 QueryEnvironment *queryEnv);
+								 QueryEnvironment *queryEnv,
+								 CachedPlanPrepData *cprep);
 extern void ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner);
 
 extern bool CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
diff --git a/src/test/regress/expected/partition_prune.out b/src/test/regress/expected/partition_prune.out
index 39dab8fcc05..39770f3b6d6 100644
--- a/src/test/regress/expected/partition_prune.out
+++ b/src/test/regress/expected/partition_prune.out
@@ -4860,9 +4860,7 @@ select c.relname
    relname    
 --------------
  prunelock_p1
- prunelock_p2
- prunelock_p3
-(3 rows)
+(1 row)
 
 commit;
 deallocate prunelock_q;
@@ -4904,6 +4902,50 @@ select c.relname
 
 commit;
 deallocate prunelock_q;
+reset enable_partition_pruning;
+--
+-- Verify firstResultRels handling with multiple ModifyTable nodes
+-- (writable CTEs) targeting a partitioned table.  When a pruning
+-- parameter matches no partition, all result relations are pruned
+-- and the executor must still find a usable first result relation
+-- for each ModifyTable node.
+--
+prepare prunelock_mt_q (int, int) as
+  with upd1 as (update prunelock_p set a = a),
+       upd2 as (update prunelock_p set a = a where a = $2)
+  update prunelock_p set a = a where a = $1;
+-- Force generic plan creation
+explain (costs off) execute prunelock_mt_q(1, 2);
+                         QUERY PLAN                         
+------------------------------------------------------------
+ Update on prunelock_p
+   Update on prunelock_p1 prunelock_p_1
+   CTE upd1
+     ->  Update on prunelock_p prunelock_p_3
+           Update on prunelock_p1 prunelock_p_4
+           Update on prunelock_p2 prunelock_p_5
+           Update on prunelock_p3 prunelock_p_6
+           ->  Append
+                 ->  Seq Scan on prunelock_p1 prunelock_p_4
+                 ->  Seq Scan on prunelock_p2 prunelock_p_5
+                 ->  Seq Scan on prunelock_p3 prunelock_p_6
+   CTE upd2
+     ->  Update on prunelock_p prunelock_p_7
+           Update on prunelock_p2 prunelock_p_8
+           ->  Append
+                 Subplans Removed: 2
+                 ->  Seq Scan on prunelock_p2 prunelock_p_8
+                       Filter: (a = $2)
+   ->  Append
+         Subplans Removed: 2
+         ->  Seq Scan on prunelock_p1 prunelock_p_1
+               Filter: (a = $1)
+(22 rows)
+
+-- All partitions pruned: value 4 matches no partition, so each
+-- ModifyTable must still initialize correctly with no matching
+-- result relations.
+execute prunelock_mt_q(4, 5);
+deallocate prunelock_mt_q;
 drop table prunelock_p;
 reset plan_cache_mode;
-reset enable_partition_pruning;
diff --git a/src/test/regress/expected/plancache.out b/src/test/regress/expected/plancache.out
index 4e59188196c..1d69ab0a1c2 100644
--- a/src/test/regress/expected/plancache.out
+++ b/src/test/regress/expected/plancache.out
@@ -398,3 +398,65 @@ select name, generic_plans, custom_plans from pg_prepared_statements
 (1 row)
 
 drop table test_mode;
+-- Test invalidation of a generic plan during pruning-aware lock setup.
+-- The pruning expression uses a stable SQL function that calls a volatile
+-- plpgsql function.  That function performs DDL on a partition when a
+-- separate "signal" table says to do so.  The second EXECUTE should
+-- replan cleanly after the DDL.
+set plan_cache_mode to force_generic_plan;
+create table inval_during_pruning_p (a int) partition by list (a);
+create table inval_during_pruning_p1 partition of inval_during_pruning_p for values in (1);
+create table inval_during_pruning_p2 partition of inval_during_pruning_p for values in (2);
+insert into inval_during_pruning_p values (1), (2);
+create table inval_during_pruning_signal (create_idx bool not null);
+insert into inval_during_pruning_signal values (false);
+create or replace function invalidate_plancache_func() returns int
+as $$
+declare
+	create_index bool;
+begin
+	-- Perform DDL on a partition if asked to
+	select create_idx into create_index from inval_during_pruning_signal for update;
+	if create_index = true then
+		raise notice 'creating index on partition inval_during_pruning_p1';
+		create index on inval_during_pruning_p1 (a);
+		update inval_during_pruning_signal set create_idx = false;
+	end if;
+	-- pruning parameter
+	return 1;
+end;
+$$ language plpgsql volatile;
+create or replace function stable_pruning_val() returns int as $$
+	select invalidate_plancache_func();
+$$ language sql stable;
+prepare inval_during_pruning_q as select * from inval_during_pruning_p where a = stable_pruning_val();
+-- Build a generic plan and run pruning once, but don't set the signal
+-- for invalidate_plancache_func() to perform the DDL.
+explain (verbose, costs off) execute inval_during_pruning_q;
+                                QUERY PLAN                                 
+---------------------------------------------------------------------------
+ Append
+   Subplans Removed: 1
+   ->  Seq Scan on public.inval_during_pruning_p1 inval_during_pruning_p_1
+         Output: inval_during_pruning_p_1.a
+         Filter: (inval_during_pruning_p_1.a = stable_pruning_val())
+(5 rows)
+
+-- Reuse the generic plan.  Make invalidate_plancache_func() perform DDL
+-- during this execution, which should force replanning without errors.
+update inval_during_pruning_signal set create_idx = true;
+explain (verbose, costs off) execute inval_during_pruning_q;
+NOTICE:  creating index on partition inval_during_pruning_p1
+                                QUERY PLAN                                 
+---------------------------------------------------------------------------
+ Append
+   Subplans Removed: 1
+   ->  Seq Scan on public.inval_during_pruning_p1 inval_during_pruning_p_1
+         Output: inval_during_pruning_p_1.a
+         Filter: (inval_during_pruning_p_1.a = stable_pruning_val())
+(5 rows)
+
+drop table inval_during_pruning_p, inval_during_pruning_signal;
+drop function invalidate_plancache_func, stable_pruning_val;
+deallocate inval_during_pruning_q;
+reset plan_cache_mode;
diff --git a/src/test/regress/sql/partition_prune.sql b/src/test/regress/sql/partition_prune.sql
index 229c5eb370c..87672ad40f7 100644
--- a/src/test/regress/sql/partition_prune.sql
+++ b/src/test/regress/sql/partition_prune.sql
@@ -1499,6 +1499,28 @@ select c.relname
 commit;
 
 deallocate prunelock_q;
+reset enable_partition_pruning;
+
+--
+-- Verify firstResultRels handling with multiple ModifyTable nodes
+-- (writable CTEs) targeting a partitioned table.  When a pruning
+-- parameter matches no partition, all result relations are pruned
+-- and the executor must still find a usable first result relation
+-- for each ModifyTable node.
+--
+prepare prunelock_mt_q (int, int) as
+  with upd1 as (update prunelock_p set a = a),
+       upd2 as (update prunelock_p set a = a where a = $2)
+  update prunelock_p set a = a where a = $1;
+
+-- Force generic plan creation
+explain (costs off) execute prunelock_mt_q(1, 2);
+
+-- All partitions pruned: value 4 matches no partition, so each
+-- ModifyTable must still initialize correctly with no matching
+-- result relations.
+execute prunelock_mt_q(4, 5);
+
+deallocate prunelock_mt_q;
 drop table prunelock_p;
 reset plan_cache_mode;
-reset enable_partition_pruning;
diff --git a/src/test/regress/sql/plancache.sql b/src/test/regress/sql/plancache.sql
index 4b2f11dcc64..139b4688fd6 100644
--- a/src/test/regress/sql/plancache.sql
+++ b/src/test/regress/sql/plancache.sql
@@ -223,3 +223,54 @@ select name, generic_plans, custom_plans from pg_prepared_statements
   where  name = 'test_mode_pp';
 
 drop table test_mode;
+
+-- Test invalidation of a generic plan during pruning-aware lock setup.
+-- The pruning expression uses a stable SQL function that calls a volatile
+-- plpgsql function.  That function performs DDL on a partition when a
+-- separate "signal" table says to do so.  The second EXECUTE should
+-- replan cleanly after the DDL.
+set plan_cache_mode to force_generic_plan;
+create table inval_during_pruning_p (a int) partition by list (a);
+create table inval_during_pruning_p1 partition of inval_during_pruning_p for values in (1);
+create table inval_during_pruning_p2 partition of inval_during_pruning_p for values in (2);
+insert into inval_during_pruning_p values (1), (2);
+
+create table inval_during_pruning_signal (create_idx bool not null);
+insert into inval_during_pruning_signal values (false);
+create or replace function invalidate_plancache_func() returns int
+as $$
+declare
+	create_index bool;
+begin
+	-- Perform DDL on a partition if asked to
+	select create_idx into create_index from inval_during_pruning_signal for update;
+	if create_index = true then
+		raise notice 'creating index on partition inval_during_pruning_p1';
+		create index on inval_during_pruning_p1 (a);
+		update inval_during_pruning_signal set create_idx = false;
+	end if;
+	-- pruning parameter
+	return 1;
+end;
+$$ language plpgsql volatile;
+
+create or replace function stable_pruning_val() returns int as $$
+	select invalidate_plancache_func();
+$$ language sql stable;
+
+prepare inval_during_pruning_q as select * from inval_during_pruning_p where a = stable_pruning_val();
+
+-- Build a generic plan and run pruning once, but don't set the signal
+-- for invalidate_plancache_func() to perform the DDL.
+explain (verbose, costs off) execute inval_during_pruning_q;
+
+-- Reuse the generic plan.  Make invalidate_plancache_func() perform DDL
+-- during this execution, which should force replanning without errors.
+update inval_during_pruning_signal set create_idx = true;
+explain (verbose, costs off) execute inval_during_pruning_q;
+
+drop table inval_during_pruning_p, inval_during_pruning_signal;
+drop function invalidate_plancache_func, stable_pruning_val;
+deallocate inval_during_pruning_q;
+
+reset plan_cache_mode;
-- 
2.47.3



  [application/octet-stream] v7-0006-Reuse-partition-pruning-results-in-parallel-worke.patch (8.2K, 6-v7-0006-Reuse-partition-pruning-results-in-parallel-worke.patch)
From 9c94b3751ae0c9decc337e33de2750a954a88d6f Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Tue, 11 Nov 2025 22:17:47 +0900
Subject: [PATCH v7 6/6] Reuse partition pruning results in parallel workers

Pass the leader's initial partition pruning results and unpruned
relids to parallel workers and reuse them via ExecutorPrep(). This
avoids repeating pruning logic in workers, which is not only
redundant but also risks divergence due to nondeterminism in pruning
steps or parameter evaluation timing.

Introduce CheckInitialPruningResultsInWorker() (debug-builds only)
to verify that the results match what the worker would compute. This
check helps catch inconsistencies across leader and worker pruning
logic.
---
 src/backend/executor/execParallel.c | 108 +++++++++++++++++++++++++++-
 1 file changed, 107 insertions(+), 1 deletion(-)

diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index 024780d3516..d337bf8c081 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -24,6 +24,7 @@
 #include "postgres.h"
 
 #include "executor/execParallel.h"
+#include "executor/execPartition.h"
 #include "executor/executor.h"
 #include "executor/nodeAgg.h"
 #include "executor/nodeAppend.h"
@@ -67,6 +68,8 @@
 #define PARALLEL_KEY_QUERY_TEXT		UINT64CONST(0xE000000000000008)
 #define PARALLEL_KEY_JIT_INSTRUMENTATION UINT64CONST(0xE000000000000009)
 #define PARALLEL_KEY_WAL_USAGE			UINT64CONST(0xE00000000000000A)
+#define PARALLEL_KEY_PARTITION_PRUNE_RESULTS	UINT64CONST(0xE00000000000000B)
+#define PARALLEL_KEY_UNPRUNED_RELIDS	UINT64CONST(0xE00000000000000C)
 
 #define PARALLEL_TUPLE_QUEUE_SIZE		65536
 
@@ -141,6 +144,8 @@ static bool ExecParallelRetrieveInstrumentation(PlanState *planstate,
 /* Helper function that runs in the parallel worker. */
 static DestReceiver *ExecParallelGetReceiver(dsm_segment *seg, shm_toc *toc);
 
+static void CheckInitialPruningResultsInWorker(EState *estate);
+
 /*
  * Create a serialized representation of the plan to be sent to each worker.
  */
@@ -620,12 +625,18 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	FixedParallelExecutorState *fpes;
 	char	   *pstmt_data;
 	char	   *pstmt_space;
+	char	   *part_prune_results_data;
+	char	   *part_prune_results_space;
+	char	   *unpruned_relids_data;
+	char	   *unpruned_relids_space;
 	char	   *paramlistinfo_space;
 	BufferUsage *bufusage_space;
 	WalUsage   *walusage_space;
 	SharedExecutorInstrumentation *instrumentation = NULL;
 	SharedJitInstrumentation *jit_instrumentation = NULL;
 	int			pstmt_len;
+	int			part_prune_results_len;
+	int			unpruned_relids_len;
 	int			paramlistinfo_len;
 	int			instrumentation_len = 0;
 	int			jit_instrumentation_len = 0;
@@ -654,6 +665,8 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 
 	/* Fix up and serialize plan to be sent to workers. */
 	pstmt_data = ExecSerializePlan(planstate->plan, estate);
+	part_prune_results_data = nodeToString(estate->es_part_prune_results);
+	unpruned_relids_data = nodeToString(estate->es_unpruned_relids);
 
 	/* Create a parallel context. */
 	pcxt = CreateParallelContext("postgres", "ParallelQueryMain", nworkers);
@@ -680,6 +693,16 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	shm_toc_estimate_chunk(&pcxt->estimator, pstmt_len);
 	shm_toc_estimate_keys(&pcxt->estimator, 1);
 
+	/* Estimate space for serialized part_prune_results. */
+	part_prune_results_len = strlen(part_prune_results_data) + 1;
+	shm_toc_estimate_chunk(&pcxt->estimator, part_prune_results_len);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Estimate space for serialized unpruned_relids. */
+	unpruned_relids_len = strlen(unpruned_relids_data) + 1;
+	shm_toc_estimate_chunk(&pcxt->estimator, unpruned_relids_len);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
 	/* Estimate space for serialized ParamListInfo. */
 	paramlistinfo_len = EstimateParamListSpace(estate->es_param_list_info);
 	shm_toc_estimate_chunk(&pcxt->estimator, paramlistinfo_len);
@@ -781,6 +804,16 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	memcpy(pstmt_space, pstmt_data, pstmt_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PLANNEDSTMT, pstmt_space);
 
+	/* Store serialized part_prune_results */
+	part_prune_results_space = shm_toc_allocate(pcxt->toc, part_prune_results_len);
+	memcpy(part_prune_results_space, part_prune_results_data, part_prune_results_len);
+	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARTITION_PRUNE_RESULTS, part_prune_results_space);
+
+	/* Store serialized unpruned_relids */
+	unpruned_relids_space = shm_toc_allocate(pcxt->toc, unpruned_relids_len);
+	memcpy(unpruned_relids_space, unpruned_relids_data, unpruned_relids_len);
+	shm_toc_insert(pcxt->toc, PARALLEL_KEY_UNPRUNED_RELIDS, unpruned_relids_space);
+
 	/* Store serialized ParamListInfo. */
 	paramlistinfo_space = shm_toc_allocate(pcxt->toc, paramlistinfo_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARAMLISTINFO, paramlistinfo_space);
@@ -1280,10 +1313,15 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 						 int instrument_options)
 {
 	char	   *pstmtspace;
+	char	   *part_prune_results_space;
+	char	   *unpruned_relids_space;
 	char	   *paramspace;
 	PlannedStmt *pstmt;
+	List	   *part_prune_results;
+	Bitmapset  *unpruned_relids;
 	ParamListInfo paramLI;
 	char	   *queryString;
+	EState	   *prep_estate = NULL;
 
 	/* Get the query string from shared memory */
 	queryString = shm_toc_lookup(toc, PARALLEL_KEY_QUERY_TEXT, false);
@@ -1296,12 +1334,80 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 	paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
 	paramLI = RestoreParamList(&paramspace);
 
+	/* Reconstruct leader-supplied part_prune_results and unpruned_relids. */
+	part_prune_results_space =
+		shm_toc_lookup(toc, PARALLEL_KEY_PARTITION_PRUNE_RESULTS, false);
+	part_prune_results = (List *) stringToNode(part_prune_results_space);
+	unpruned_relids_space =
+		shm_toc_lookup(toc, PARALLEL_KEY_UNPRUNED_RELIDS, false);
+	unpruned_relids = (Bitmapset *) stringToNode(unpruned_relids_space);
+
+	/*
+	 * If pruning was done in the leader, build a prep estate in the worker
+	 * and inject the leader's pruning results into it for reuse.
+	 */
+	if (pstmt->partPruneInfos)
+	{
+		prep_estate = ExecutorPrep(pstmt, paramLI, CurrentResourceOwner, false, 0);
+		Assert(prep_estate);
+
+		prep_estate->es_part_prune_results = part_prune_results;
+		prep_estate->es_unpruned_relids =
+			bms_add_members(prep_estate->es_unpruned_relids,
+							unpruned_relids);
+
+		/*
+		 * A debug-build-only check that the pruning results passed from the
+		 * leader match what the worker would independently compute.
+		 */
+		CheckInitialPruningResultsInWorker(prep_estate);
+	}
+
 	/* Create a QueryDesc for the query. */
 	return CreateQueryDesc(pstmt,
 						   queryString,
 						   GetActiveSnapshot(), InvalidSnapshot,
 						   receiver, paramLI, NULL, instrument_options,
-						   NULL);
+						   prep_estate);
+}
+
+/*
+ * CheckInitialPruningResultsInWorker
+ *		Verify partition pruning results passed from the leader process.
+ *
+ * This is intended to be called during parallel worker query setup.
+ * It recomputes initial pruning results locally and compares them with
+ * those received from the leader. Any mismatch may indicate a divergence
+ * between leader and worker logic or environment.
+ *
+ * Only performed in debug builds.
+ */
+static void
+CheckInitialPruningResultsInWorker(EState *estate)
+{
+#ifdef USE_ASSERT_CHECKING
+	ListCell   *lc;
+	int			i;
+
+	Assert(estate->es_part_prune_results != NULL);
+	i = 0;
+	foreach(lc, estate->es_part_prune_states)
+	{
+		PartitionPruneState *prunestate = (PartitionPruneState *) lfirst(lc);
+		Bitmapset *reuse_validsubplans =
+				list_nth_node(Bitmapset, estate->es_part_prune_results, i++);
+		Bitmapset  *validsubplans = NULL;
+		Bitmapset  *validsubplan_rtis = NULL;
+
+		if (prunestate->do_initial_prune)
+			validsubplans = ExecFindMatchingSubPlans(prunestate, true,
+													 &validsubplan_rtis);
+		if (!bms_equal(validsubplans, reuse_validsubplans))
+			elog(ERROR, "different validsubplans in parallel worker");
+		if (bms_nonempty_difference(validsubplan_rtis, estate->es_unpruned_relids))
+			elog(ERROR, "different unprunable_relids in parallel worker");
+	}
+#endif
 }
 
 /*
-- 
2.47.3



  [application/octet-stream] v7-0001-Refactor-partition-pruning-initialization-for-cla.patch (10.2K, 7-v7-0001-Refactor-partition-pruning-initialization-for-cla.patch)
From 6f2c9cc7a30d38cb2606595f62b62c77e2aba6e9 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Tue, 10 Feb 2026 15:08:52 +0900
Subject: [PATCH v7 1/6] Refactor partition pruning initialization for clarity
 and modularity

Move the creation of PartitionPruneState structures out of
ExecDoInitialPruning() into a new ExecCreatePartitionPruneStates()
function. This separates the setup of pruning state from the execution
of initial pruning logic, making the code clearer and easier to
maintain.

Simplify handling of unpruned relids by moving responsibility
for recording them in EState into CreatePartitionPruneState(),
avoiding the need to pass all_leafpart_rtis as an out parameter.

Also move the setting of ecxt_param_exec_vals from
CreatePartitionPruneState() to InitExecPartitionPruneContexts(),
to allow the former to be called at a time when the PARAM_EXEC
parameters have not yet been set up.

This refactoring allows callers to reuse the pruning setup logic
without always triggering pruning, a capability useful for future use
cases that may only need metadata initialization.
---
 src/backend/executor/execMain.c      |   1 +
 src/backend/executor/execPartition.c | 103 +++++++++++++++++++--------
 src/include/executor/execPartition.h |   1 +
 3 files changed, 74 insertions(+), 31 deletions(-)

diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index bfd3ebc601e..654f9246ad0 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -868,6 +868,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 	 * to each PartitionPruneInfo entry, and the es_part_prune_results list is
 	 * parallel to es_part_prune_infos.
 	 */
+	ExecCreatePartitionPruneStates(estate);
 	ExecDoInitialPruning(estate);
 
 	/*
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index bab294f5e91..20c3513fabe 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -184,8 +184,7 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
 static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
 static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
 static PartitionPruneState *CreatePartitionPruneState(EState *estate,
-													  PartitionPruneInfo *pruneinfo,
-													  Bitmapset **all_leafpart_rtis);
+													  PartitionPruneInfo *pruneinfo);
 static void InitPartitionPruneContext(PartitionPruneContext *context,
 									  List *pruning_steps,
 									  PartitionDesc partdesc,
@@ -1942,6 +1941,9 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *
  * Functions:
  *
+ * ExecCreatePartitionPruneStates
+ *		Create PartitionPruneState for all PartitionPruneInfos in the EState
+ *
  * ExecDoInitialPruning:
  *		Perform runtime "initial" pruning, if necessary, to determine the set
  *		of child subnodes that need to be initialized during ExecInitNode() for
@@ -1966,6 +1968,29 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *-------------------------------------------------------------------------
  */
 
+/*
+ * ExecCreatePartitionPruneStates
+ *
+ * Create a PartitionPruneState for each PartitionPruneInfo in the estate,
+ * and save them in estate->es_part_prune_states. This setup is required
+ * before any initial or runtime pruning can occur.
+ */
+void
+ExecCreatePartitionPruneStates(EState *estate)
+{
+	ListCell   *lc;
+
+	foreach(lc, estate->es_part_prune_infos)
+	{
+		PartitionPruneInfo *pruneinfo = lfirst_node(PartitionPruneInfo, lc);
+		PartitionPruneState *prunestate;
+
+		/* Create and save the PartitionPruneState. */
+		prunestate = CreatePartitionPruneState(estate, pruneinfo);
+		estate->es_part_prune_states = lappend(estate->es_part_prune_states,
+											   prunestate);
+	}
+}
 
 /*
  * ExecDoInitialPruning
@@ -1973,11 +1998,11 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *		of child subnodes that need to be initialized during ExecInitNode() for
  *		plan nodes that support partition pruning.
  *
- * This function iterates over each PartitionPruneInfo entry in
- * estate->es_part_prune_infos. For each entry, it creates a PartitionPruneState
- * and adds it to es_part_prune_states.  ExecInitPartitionExecPruning() accesses
+ * This function iterates over each PartitionPruneState in
+ * estate->es_part_prune_states, which must have been populated earlier by
+ * ExecCreatePartitionPruneStates(). ExecInitPartitionExecPruning() accesses
  * these states through their corresponding indexes in es_part_prune_states and
- * assign each state to the parent node's PlanState, from where it will be used
+ * assigns each state to the parent node's PlanState, from where it will be used
  * for "exec" pruning.
  *
  * If initial pruning steps exist for a PartitionPruneInfo entry, this function
@@ -1995,20 +2020,13 @@ ExecDoInitialPruning(EState *estate)
 {
 	ListCell   *lc;
 
-	foreach(lc, estate->es_part_prune_infos)
+	Assert(estate->es_part_prune_results == NULL);
+	foreach(lc, estate->es_part_prune_states)
 	{
-		PartitionPruneInfo *pruneinfo = lfirst_node(PartitionPruneInfo, lc);
-		PartitionPruneState *prunestate;
+		PartitionPruneState *prunestate = (PartitionPruneState *) lfirst(lc);
 		Bitmapset  *validsubplans = NULL;
-		Bitmapset  *all_leafpart_rtis = NULL;
 		Bitmapset  *validsubplan_rtis = NULL;
 
-		/* Create and save the PartitionPruneState. */
-		prunestate = CreatePartitionPruneState(estate, pruneinfo,
-											   &all_leafpart_rtis);
-		estate->es_part_prune_states = lappend(estate->es_part_prune_states,
-											   prunestate);
-
 		/*
 		 * Perform initial pruning steps, if any, and save the result
 		 * bitmapset or NULL as described in the header comment.
@@ -2016,8 +2034,6 @@ ExecDoInitialPruning(EState *estate)
 		if (prunestate->do_initial_prune)
 			validsubplans = ExecFindMatchingSubPlans(prunestate, true,
 													 &validsubplan_rtis);
-		else
-			validsubplan_rtis = all_leafpart_rtis;
 
 		estate->es_unpruned_relids = bms_add_members(estate->es_unpruned_relids,
 													 validsubplan_rtis);
@@ -2135,14 +2151,12 @@ ExecInitPartitionExecPruning(PlanState *planstate,
  * parent plan node's PlanState.
  *
  * If initial pruning steps are to be skipped (e.g., during EXPLAIN
- * (GENERIC_PLAN)), *all_leafpart_rtis will be populated with the RT indexes of
- * all leaf partitions whose scanning subnode is included in the parent plan
- * node's list of child plans. The caller must add these RT indexes to
- * estate->es_unpruned_relids.
+ * (GENERIC_PLAN)), the RT indexes of all leaf partitions whose scanning
+ * subnode is included in the parent plan node's list of child plans are
+ * added to estate->es_unpruned_relids.
  */
 static PartitionPruneState *
-CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo,
-						  Bitmapset **all_leafpart_rtis)
+CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo)
 {
 	PartitionPruneState *prunestate;
 	int			n_part_hierarchies;
@@ -2376,8 +2390,8 @@ CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo,
 													   pinfo->execparamids);
 
 			/*
-			 * Return all leaf partition indexes if we're skipping pruning in
-			 * the EXPLAIN (GENERIC_PLAN) case.
+			 * Add all leaf partition indexes to es_unpruned_relids if we're
+			 * skipping pruning in the EXPLAIN (GENERIC_PLAN) case.
 			 */
 			if (pinfo->initial_pruning_steps && !prunestate->do_initial_prune)
 			{
@@ -2389,10 +2403,29 @@ CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo,
 					Index		rtindex = pprune->leafpart_rti_map[part_index];
 
 					if (rtindex)
-						*all_leafpart_rtis = bms_add_member(*all_leafpart_rtis,
-															rtindex);
+						estate->es_unpruned_relids =
+							bms_add_member(estate->es_unpruned_relids, rtindex);
 				}
 			}
+			else if (pinfo->initial_pruning_steps == NIL)
+			{
+				/*
+				 * All partitions better be present in es_unpruned_relids when
+				 * none are initially prunable.
+				 */
+#ifdef USE_ASSERT_CHECKING
+				int			part_index = -1;
+
+				while ((part_index = bms_next_member(pprune->present_parts,
+													 part_index)) >= 0)
+				{
+					Index		rtindex = pprune->leafpart_rti_map[part_index];
+
+					if (rtindex)
+						Assert(bms_is_member(rtindex, estate->es_unpruned_relids));
+				}
+#endif
+			}
 
 			j++;
 		}
@@ -2489,9 +2522,10 @@ InitPartitionPruneContext(PartitionPruneContext *context,
  *		Initialize exec pruning contexts deferred by CreatePartitionPruneState()
  *
  * This function finalizes exec pruning setup for a PartitionPruneState by
- * initializing contexts for pruning steps that require the parent plan's
- * PlanState. It iterates over PartitionPruningData entries and sets up the
- * necessary execution contexts for pruning during query execution.
+ * initializing contexts for pruning steps that require PARAM_EXEC parameters
+ * and the parent plan's PlanState. It iterates over PartitionPruningData
+ * entries and sets up the necessary execution contexts for pruning during
+ * query execution.
  *
  * Also fix the mapping of partition indexes to subplan indexes contained in
  * prunestate by considering the new list of subplans that survived initial
@@ -2519,9 +2553,16 @@ InitExecPartitionPruneContexts(PartitionPruneState *prunestate,
 	bool		fix_subplan_map = false;
 
 	Assert(prunestate->do_exec_prune);
+	Assert(prunestate->econtext);
 	Assert(parent_plan != NULL);
 	estate = parent_plan->state;
 
+	/*
+	 * These might not be available when CreatePartitionPruneState() is
+	 * called.
+	 */
+	prunestate->econtext->ecxt_param_exec_vals = estate->es_param_exec_vals;
+
 	/*
 	 * No need to fix subplans maps if initial pruning didn't eliminate any
 	 * subplans.
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 82063ec2a16..4c96808c376 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -130,6 +130,7 @@ typedef struct PartitionPruneState
 	PartitionPruningData *partprunedata[FLEXIBLE_ARRAY_MEMBER];
 } PartitionPruneState;
 
+extern void ExecCreatePartitionPruneStates(EState *estate);
 extern void ExecDoInitialPruning(EState *estate);
 extern PartitionPruneState *ExecInitPartitionExecPruning(PlanState *planstate,
 														 int n_total_subplans,
-- 
2.47.3




* Re: generic plans and "initial" pruning
@ 2026-03-19 17:20  Amit Langote <[email protected]>
  parent: Amit Langote <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: Amit Langote @ 2026-03-19 17:20 UTC (permalink / raw)
  To: Chao Li <[email protected]>; +Cc: Tom Lane <[email protected]>; Tender Wang <[email protected]>; Alexander Lakhin <[email protected]>; Tomas Vondra <[email protected]>; Robert Haas <[email protected]>; Alvaro Herrera <[email protected]>; Andres Freund <[email protected]>; Daniel Gustafsson <[email protected]>; David Rowley <[email protected]>; pgsql-hackers; Thom Brown <[email protected]>

On Mon, Mar 9, 2026 at 1:41 PM Amit Langote <[email protected]> wrote:
> On Sat, Mar 7, 2026 at 6:54 PM Amit Langote <[email protected]> wrote:
> > Attached is v6 of the patch series. I've been working toward
> > committing this, so I wanted to lay out the ExecutorPrep() design and
> > the key trade-offs before doing so.
> >
> > When a cached generic plan references a partitioned table,
> > GetCachedPlan() locks all partitions upfront via
> > AcquireExecutorLocks(), even those that initial pruning will
> > eliminate.  But initial partition pruning only runs later during
> > ExecutorStart(). Moving pruning earlier requires some executor setup
> > (range table, permissions, pruning state), and ExecutorPrep() is the
> > vehicle for that.  Unlike the approach reverted last May, this
> > keeps the CachedPlan itself unchanged -- all per-execution state flows
> > through a separate CachedPlanPrepData that the caller provides.
> >
> > The approach also keeps GetCachedPlan()'s interface
> > backward-compatible: the new CachedPlanPrepData argument is optional.
> > If a caller passes NULL, all partitions are locked as before and
> > nothing changes. This means existing callers and any new code that
> > calls GetCachedPlan() without caring about pruning-aware locking just
> > works.
> >
> > The risk is on the other side: if a caller does pass a
> > CachedPlanPrepData, GetCachedPlan() will lock only the surviving
> > partitions and populate prep_estates with the EStates that
> > ExecutorPrep() created. The caller then must make those EStates
> > available to ExecutorStart() -- via QueryDesc->estate,
> > portal->prep_estates, or the equivalent path for SPI and SQL
> > functions. If it fails to do so, ExecutorStart() will call
> > ExecutorPrep() again, which may compute different pruning results than
> > the original call, potentially expecting locks on relations that were
> > never acquired. The executor would then operate on relations it
> > doesn't hold locks on.
> >
> > So the contract is: if you opt in to pruning-aware locking by passing
> > CachedPlanPrepData, you must complete the pipeline by delivering the
> > prep EStates to the executor. In the current patch, all the call sites
> > that pass a CachedPlanPrepData (portals, SPI, EXECUTE, SQL functions,
> > EXPLAIN) do thread the EStates through correctly, and I've tried to
> > make the plumbing straightforward enough that it's hard to get wrong.
> > But it is a new invariant that didn't exist before, and a caller that
> > gets it wrong would fail silently rather than with an obvious error.
> >
> > To catch such violations, I've added a debug-only check in
> > standard_ExecutorStart() that fires when no prep EState was provided.
> > It iterates over the plan's rtable and verifies that every lockable
> > relation is actually locked.  It should always be true if
> > AcquireExecutorLocks() locked everything, but would fail if
> > pruning-aware locking happened upstream and the caller dropped the
> > prep EState. The check is skipped in parallel workers, which acquire
> > relation locks lazily in ExecGetRangeTableRelation().
> >
> > +    if (queryDesc->estate == NULL)
> > +    {
> > +#ifdef USE_ASSERT_CHECKING
> > +        if (!IsParallelWorker())
> > +        {
> > +            ListCell   *lc;
> > +
> > +            foreach(lc, queryDesc->plannedstmt->rtable)
> > +            {
> > +                RangeTblEntry *rte = lfirst_node(RangeTblEntry, lc);
> > +
> > +                if (rte->rtekind == RTE_RELATION ||
> > +                    (rte->rtekind == RTE_SUBQUERY && rte->relid != InvalidOid))
> > +                    Assert(CheckRelationOidLockedByMe(rte->relid,
> > +                                                      rte->rellockmode,
> > +                                                      true));
> > +            }
> > +        }
> > +#endif
> > +        queryDesc->estate = ExecutorPrep(queryDesc->plannedstmt,
> > +                                         queryDesc->params,
> > +                                         CurrentResourceOwner,
> > +                                         true,
> > +                                         eflags);
> > +    }
> > +#ifdef USE_ASSERT_CHECKING
> > +    else
> > +    {
> > +        /*
> > +         * A prep EState was provided, meaning pruning-aware locking
> > +         * should have locked at least the unpruned relations.
> > +         */
> > +        if (!IsParallelWorker())
> > +        {
> > +            int     rtindex = -1;
> > +
> > +            while ((rtindex = bms_next_member(queryDesc->estate->es_unpruned_relids,
> > +                                              rtindex)) >= 0)
> > +            {
> > +                RangeTblEntry *rte = exec_rt_fetch(rtindex, queryDesc->estate);
> > +
> > +                Assert(rte->rtekind == RTE_RELATION ||
> > +                       (rte->rtekind == RTE_SUBQUERY &&
> > +                        rte->relid != InvalidOid));
> > +                Assert(CheckRelationOidLockedByMe(rte->relid,
> > +                                                  rte->rellockmode, true));
> > +            }
> > +        }
> > +    }
> > +#endif
> >
> > So the invariant is: if no prep EState was provided, every relation in
> > the plan is locked; if one was provided, at least the unpruned
> > relations are locked. Both are checked in assert builds.
> >
> > I think this covers the main concerns, but I may be missing something.
> > If anyone sees a problem with this approach, I'd like to hear about
> > it.
>
> Here's v7. Some plancache.c changes that I'd made were in the wrong
> patch in v6; this version puts them where they belong.

Attached is an updated set. One more fix: I added an Assert in
SPI_cursor_open_internal()'s !plan->saved path to verify that
prep_estates is NIL. Unsaved plans always take the custom plan path,
so pruning-aware locking never applies, but it's worth guarding
explicitly since the copyObject/ReleaseCachedPlan sequence that
follows would not be safe otherwise. Also changed
SPI_plan_get_cached_plan() to pass NULL for cprep, since it only
returns the CachedPlan pointer and has no way to deliver prep_estates
to anyone.

Stepping back -- the core question is whether running executor logic
(pruning) inside GetCachedPlan() is acceptable at all. The plan cache
and executor have always had a clean boundary: plan cache locks
everything, executor runs. This optimization necessarily crosses that
line, because the information needed to decide which locks to skip
(pruning results) can only come from executor machinery.

The proposed approach has GetCachedPlan() call ExecutorPrep() to do a
limited subset of executor work (range table init, permissions,
pruning), carry the results out through CachedPlanPrepData, and leave
the CachedPlan itself untouched. The executor already has a multi-step
protocol: start/run/end. Making it prep/start/run/end is just a finer
decomposition of what InitPlan() was already doing inside
ExecutorStart().
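
To make the opt-in contract concrete, here is a toy model. It is not
PostgreSQL code: every toy_* name is hypothetical, partitions are bits
in an unsigned mask, and "pruning" just keeps the partition whose index
equals the parameter. It only illustrates the invariant described
above, namely that without a prep state everything must be locked, and
with one at least the unpruned partitions must be.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model; none of these names exist in PostgreSQL. */
#define TOY_NPARTS		4
#define TOY_ALL_PARTS	((1u << TOY_NPARTS) - 1u)

typedef struct ToyPrepState
{
	unsigned	unpruned;		/* partitions that survived initial pruning */
} ToyPrepState;

typedef struct ToySession
{
	unsigned	locked;			/* partitions this session holds locks on */
} ToySession;

/* "Initial" pruning: only the partition matching the parameter survives. */
static unsigned
toy_prune(int param)
{
	return (param >= 0 && param < TOY_NPARTS) ? (1u << param) : 0u;
}

/*
 * Stand-in for GetCachedPlan().  With prep == NULL every partition is
 * locked, as before.  Otherwise pruning runs first, only the survivors
 * are locked, and the result is handed back through *prep.
 */
static void
toy_get_cached_plan(ToySession *s, int param, ToyPrepState *prep)
{
	assert(s != NULL);
	if (prep == NULL)
		s->locked = TOY_ALL_PARTS;
	else
	{
		prep->unpruned = toy_prune(param);
		s->locked = prep->unpruned;
	}
}

/*
 * Stand-in for the checks in standard_ExecutorStart().  Returns false when
 * the locking invariant is violated; on success *scan receives the
 * partitions to scan.
 */
static bool
toy_executor_start(const ToySession *s, int param, const ToyPrepState *prep,
				   unsigned *scan)
{
	if (prep == NULL)
	{
		/* No prep state: every partition must already be locked. */
		if (s->locked != TOY_ALL_PARTS)
			return false;
		*scan = toy_prune(param);	/* prune now, as ExecutorStart() would */
		return true;
	}

	/* Prep state supplied: at least the unpruned partitions are locked. */
	if ((prep->unpruned & ~s->locked) != 0)
		return false;
	*scan = prep->unpruned;		/* reuse the earlier result, do not recompute */
	return true;
}
```

Calling toy_get_cached_plan() with a prep state and then
toy_executor_start() with NULL returns false. That is the "caller
dropped the prep EState" case the debug-only check is meant to catch.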

Of the attached patches, I'm targeting 0001-0003 for commit. 0004 (SQL
function support) and 0005 (parallel worker reuse) are useful
follow-ons but not essential.  The optimization works without them for
most cases, and they can be reviewed and committed separately.
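
For the parallel-worker piece (0005), the hand-off has a simple shape:
the leader serializes its pruning result as a NUL-terminated string,
copies strlen + 1 bytes into shared memory, and the worker parses it
back and cross-checks it in assert builds. A standalone sketch of that
shape follows. It makes several assumptions: all toy_* names are
hypothetical, a hex string stands in for nodeToString()/stringToNode(),
and a plain char buffer stands in for the shm_toc chunk.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Leader: serialize the surviving-subplan bitmask as a hex string. */
static int
toy_serialize(unsigned validsubplans, char *buf, size_t buflen)
{
	return snprintf(buf, buflen, "%x", validsubplans);
}

/*
 * Leader: copy the string, terminator included, into the shared area.
 * Returns the number of bytes copied, or 0 if it does not fit.
 */
static size_t
toy_publish(char *shared, size_t shared_len, const char *data)
{
	size_t		len = strlen(data) + 1;

	if (len > shared_len)
		return 0;
	memcpy(shared, data, len);
	return len;
}

/*
 * Worker: restore the leader's result.  The assert() is compiled out when
 * NDEBUG is defined, so the cross-check only runs in debug builds.
 */
static unsigned
toy_worker_restore(const char *shared, unsigned recomputed)
{
	unsigned	from_leader = (unsigned) strtoul(shared, NULL, 16);

	assert(from_leader == recomputed);
	(void) recomputed;			/* unused when NDEBUG is defined */
	return from_leader;
}
```

In a non-debug build the worker simply trusts the leader's copy, which
is the point of passing it along.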

If there's a cleaner way to avoid locking pruned partitions without
the plumbing this patch adds, I haven't found it in the year since the
revert.  I'd welcome a pointer if you see one.  Failing that, I think
this is the right trade-off, but it's a judgment call about where to
hold your nose.

Tom, I'd value your opinion on whether this approach is something
you'd be comfortable seeing in the tree.

--
Thanks, Amit Langote


Attachments:

  [application/octet-stream] v8-0005-Reuse-partition-pruning-results-in-parallel-worke.patch (11.0K, 2-v8-0005-Reuse-partition-pruning-results-in-parallel-worke.patch)
From 4c12c380b75b8684e9c41c80d0c77027cf592e17 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Thu, 19 Mar 2026 20:03:58 +0900
Subject: [PATCH v8 5/5] Reuse partition pruning results in parallel workers

Pass the leader's initial partition pruning results and unpruned
relids to parallel workers and reuse them via ExecutorPrep(). This
avoids repeating pruning logic in workers, which is not only
redundant but also risks divergence due to nondeterminism in pruning
steps or parameter evaluation timing.

Introduce CheckInitialPruningResultsInWorker() (debug builds only)
to verify that the results match what the worker would compute. This
check helps catch inconsistencies across leader and worker pruning
logic.
---
 src/backend/executor/execMain.c     |  10 +--
 src/backend/executor/execParallel.c | 108 +++++++++++++++++++++++++++-
 src/backend/utils/cache/plancache.c |   2 +-
 src/include/executor/executor.h     |   3 +-
 4 files changed, 116 insertions(+), 7 deletions(-)

diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 0f95ad88497..9a3700e672f 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -207,7 +207,7 @@ standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
 		queryDesc->estate = ExecutorPrep(queryDesc->plannedstmt,
 										 queryDesc->params,
 										 CurrentResourceOwner,
-										 eflags);
+										 eflags, true);
 	}
 #ifdef USE_ASSERT_CHECKING
 	else
@@ -330,7 +330,8 @@ standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
  * ExecutorPrep: prepare executor state for a PlannedStmt outside ExecutorStart.
  *
  * Performs range table initialization, permission checks, and initial
- * partition pruning if partPruneInfos are present.
+ * partition pruning if partPruneInfos are present and do_initial_pruning is
+ * true.  Parallel workers pass false and reuse the leader's pruning results.
  *
  * Returns an EState that the caller must either pass to ExecutorStart()
  * for reuse or free via FreeExecutorState() if execution will not proceed.
@@ -340,7 +341,7 @@ standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
  */
 EState *
 ExecutorPrep(PlannedStmt *pstmt, ParamListInfo params, ResourceOwner owner,
-			 int eflags)
+			 int eflags, bool do_initial_pruning)
 {
 	ResourceOwner oldowner;
 	EState *estate;
@@ -386,7 +387,8 @@ ExecutorPrep(PlannedStmt *pstmt, ParamListInfo params, ResourceOwner owner,
 	 * to es_part_prune_infos.
 	 */
 	ExecCreatePartitionPruneStates(estate);
-	ExecDoInitialPruning(estate);
+	if (do_initial_pruning)
+		ExecDoInitialPruning(estate);
 
 	CurrentResourceOwner = oldowner;
 
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index 024780d3516..2de4b35a16e 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -24,6 +24,7 @@
 #include "postgres.h"
 
 #include "executor/execParallel.h"
+#include "executor/execPartition.h"
 #include "executor/executor.h"
 #include "executor/nodeAgg.h"
 #include "executor/nodeAppend.h"
@@ -67,6 +68,8 @@
 #define PARALLEL_KEY_QUERY_TEXT		UINT64CONST(0xE000000000000008)
 #define PARALLEL_KEY_JIT_INSTRUMENTATION UINT64CONST(0xE000000000000009)
 #define PARALLEL_KEY_WAL_USAGE			UINT64CONST(0xE00000000000000A)
+#define PARALLEL_KEY_PARTITION_PRUNE_RESULTS	UINT64CONST(0xE00000000000000B)
+#define PARALLEL_KEY_UNPRUNED_RELIDS	UINT64CONST(0xE00000000000000C)
 
 #define PARALLEL_TUPLE_QUEUE_SIZE		65536
 
@@ -141,6 +144,8 @@ static bool ExecParallelRetrieveInstrumentation(PlanState *planstate,
 /* Helper function that runs in the parallel worker. */
 static DestReceiver *ExecParallelGetReceiver(dsm_segment *seg, shm_toc *toc);
 
+static void CheckInitialPruningResultsInWorker(EState *estate);
+
 /*
  * Create a serialized representation of the plan to be sent to each worker.
  */
@@ -620,12 +625,18 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	FixedParallelExecutorState *fpes;
 	char	   *pstmt_data;
 	char	   *pstmt_space;
+	char	   *part_prune_results_data;
+	char	   *part_prune_results_space;
+	char	   *unpruned_relids_data;
+	char	   *unpruned_relids_space;
 	char	   *paramlistinfo_space;
 	BufferUsage *bufusage_space;
 	WalUsage   *walusage_space;
 	SharedExecutorInstrumentation *instrumentation = NULL;
 	SharedJitInstrumentation *jit_instrumentation = NULL;
 	int			pstmt_len;
+	int			part_prune_results_len;
+	int			unpruned_relids_len;
 	int			paramlistinfo_len;
 	int			instrumentation_len = 0;
 	int			jit_instrumentation_len = 0;
@@ -654,6 +665,8 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 
 	/* Fix up and serialize plan to be sent to workers. */
 	pstmt_data = ExecSerializePlan(planstate->plan, estate);
+	part_prune_results_data = nodeToString(estate->es_part_prune_results);
+	unpruned_relids_data = nodeToString(estate->es_unpruned_relids);
 
 	/* Create a parallel context. */
 	pcxt = CreateParallelContext("postgres", "ParallelQueryMain", nworkers);
@@ -680,6 +693,16 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	shm_toc_estimate_chunk(&pcxt->estimator, pstmt_len);
 	shm_toc_estimate_keys(&pcxt->estimator, 1);
 
+	/* Estimate space for serialized part_prune_results. */
+	part_prune_results_len = strlen(part_prune_results_data) + 1;
+	shm_toc_estimate_chunk(&pcxt->estimator, part_prune_results_len);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Estimate space for serialized unpruned_relids. */
+	unpruned_relids_len = strlen(unpruned_relids_data) + 1;
+	shm_toc_estimate_chunk(&pcxt->estimator, unpruned_relids_len);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
 	/* Estimate space for serialized ParamListInfo. */
 	paramlistinfo_len = EstimateParamListSpace(estate->es_param_list_info);
 	shm_toc_estimate_chunk(&pcxt->estimator, paramlistinfo_len);
@@ -781,6 +804,16 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	memcpy(pstmt_space, pstmt_data, pstmt_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PLANNEDSTMT, pstmt_space);
 
+	/* Store serialized part_prune_results */
+	part_prune_results_space = shm_toc_allocate(pcxt->toc, part_prune_results_len);
+	memcpy(part_prune_results_space, part_prune_results_data, part_prune_results_len);
+	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARTITION_PRUNE_RESULTS, part_prune_results_space);
+
+	/* Store serialized unpruned_relids */
+	unpruned_relids_space = shm_toc_allocate(pcxt->toc, unpruned_relids_len);
+	memcpy(unpruned_relids_space, unpruned_relids_data, unpruned_relids_len);
+	shm_toc_insert(pcxt->toc, PARALLEL_KEY_UNPRUNED_RELIDS, unpruned_relids_space);
+
 	/* Store serialized ParamListInfo. */
 	paramlistinfo_space = shm_toc_allocate(pcxt->toc, paramlistinfo_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARAMLISTINFO, paramlistinfo_space);
@@ -1280,10 +1313,15 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 						 int instrument_options)
 {
 	char	   *pstmtspace;
+	char	   *part_prune_results_space;
+	char	   *unpruned_relids_space;
 	char	   *paramspace;
 	PlannedStmt *pstmt;
+	List	   *part_prune_results;
+	Bitmapset  *unpruned_relids;
 	ParamListInfo paramLI;
 	char	   *queryString;
+	EState	   *prep_estate = NULL;
 
 	/* Get the query string from shared memory */
 	queryString = shm_toc_lookup(toc, PARALLEL_KEY_QUERY_TEXT, false);
@@ -1296,12 +1334,80 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 	paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
 	paramLI = RestoreParamList(&paramspace);
 
+	/* Reconstruct leader-supplied part_prune_results and unpruned_relids. */
+	part_prune_results_space =
+		shm_toc_lookup(toc, PARALLEL_KEY_PARTITION_PRUNE_RESULTS, false);
+	part_prune_results = (List *) stringToNode(part_prune_results_space);
+	unpruned_relids_space =
+		shm_toc_lookup(toc, PARALLEL_KEY_UNPRUNED_RELIDS, false);
+	unpruned_relids = (Bitmapset *) stringToNode(unpruned_relids_space);
+
+	/*
+	 * If pruning was done in the leader, build a prep estate in the worker
+	 * and inject the leader's pruning results into it for reuse.
+	 */
+	if (pstmt->partPruneInfos)
+	{
+		prep_estate = ExecutorPrep(pstmt, paramLI, CurrentResourceOwner, 0, false);
+		Assert(prep_estate);
+
+		prep_estate->es_part_prune_results = part_prune_results;
+		prep_estate->es_unpruned_relids =
+			bms_add_members(prep_estate->es_unpruned_relids,
+							unpruned_relids);
+
+		/*
+		 * A debug-build-only check that the pruning results passed from the
+		 * leader match what the worker would independently compute.
+		 */
+		CheckInitialPruningResultsInWorker(prep_estate);
+	}
+
 	/* Create a QueryDesc for the query. */
 	return CreateQueryDesc(pstmt,
 						   queryString,
 						   GetActiveSnapshot(), InvalidSnapshot,
 						   receiver, paramLI, NULL, instrument_options,
-						   NULL);
+						   prep_estate);
+}
+
+/*
+ * CheckInitialPruningResultsInWorker
+ *		Verify partition pruning results passed from the leader process.
+ *
+ * This is intended to be called during parallel worker query setup.
+ * It recomputes initial pruning results locally and compares them with
+ * those received from the leader. Any mismatch may indicate a divergence
+ * between leader and worker logic or environment.
+ *
+ * Only performed in debug builds.
+ */
+static void
+CheckInitialPruningResultsInWorker(EState *estate)
+{
+#ifdef USE_ASSERT_CHECKING
+	ListCell   *lc;
+	int			i;
+
+	Assert(estate->es_part_prune_results != NULL);
+	i = 0;
+	foreach(lc, estate->es_part_prune_states)
+	{
+		PartitionPruneState *prunestate = (PartitionPruneState *) lfirst(lc);
+		Bitmapset *reuse_validsubplans =
+				list_nth_node(Bitmapset, estate->es_part_prune_results, i++);
+		Bitmapset  *validsubplans = NULL;
+		Bitmapset  *validsubplan_rtis = NULL;
+
+		if (prunestate->do_initial_prune)
+			validsubplans = ExecFindMatchingSubPlans(prunestate, true,
+													 &validsubplan_rtis);
+		if (!bms_equal(validsubplans, reuse_validsubplans))
+			elog(ERROR, "different validsubplans in parallel worker");
+		if (bms_nonempty_difference(validsubplan_rtis, estate->es_unpruned_relids))
+			elog(ERROR, "different unprunable_relids in parallel worker");
+	}
+#endif
 }
 
 /*
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 2d4c57d3deb..0dd4f40c964 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -2102,7 +2102,7 @@ AcquireExecutorLocksUnpruned(List *stmt_list, bool acquire,
 			}
 
 			prep_estate = ExecutorPrep(plannedstmt, cprep->params,
-									   cprep->owner, cprep->eflags);
+									   cprep->owner, cprep->eflags, true);
 			Assert(prep_estate);
 			cprep->prep_estates = lappend(cprep->prep_estates, prep_estate);
 
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index 24604120c27..38848ba0651 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -240,7 +240,8 @@ extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
 extern EState *ExecutorPrep(PlannedStmt *pstmt,
 							ParamListInfo params,
 							ResourceOwner owner,
-							int eflags);
+							int eflags,
+							bool do_initial_pruning);
 
 /*
  * Walk a prep_estates list in step with a parallel stmt_list iteration.
-- 
2.47.3



  [application/octet-stream] v8-0003-Use-pruning-aware-locking-in-cached-plans.patch (41.1K, 3-v8-0003-Use-pruning-aware-locking-in-cached-plans.patch)
From 2e637cbc71a14775e161bde21e1036eca2644a2b Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Thu, 19 Mar 2026 19:02:04 +0900
Subject: [PATCH v8 3/5] Use pruning-aware locking in cached plans

Extend GetCachedPlan() to perform ExecutorPrep() on each planned
statement, capturing unpruned relids and initial pruning results.
Use this data to acquire execution locks only on surviving partitions,
avoiding unnecessary locking of pruned tables even when using cached
plans.

Introduce CachedPlanPrepData to carry the EStates created by
ExecutorPrep() through the plan caching layer. The prep_estates
list is indexed one-to-one with CachedPlan->stmt_list and is
populated when GetCachedPlan() prepares a reused generic plan.
Adjust call sites in SPI, functions, portals, and EXPLAIN to
propagate this data.

Partition pruning expressions may call PL functions that require
an active snapshot (e.g., via EnsurePortalSnapshotExists()).
AcquireExecutorLocksUnpruned() establishes one before calling
ExecutorPrep() if needed, ensuring these expressions can execute
correctly during plan cache validation.

To maintain correctness when all target partitions are pruned, also
reinstate the firstResultRel locking behavior lost in commit
28317de72. That commit required the first ModifyTable target to
remain initialized for executor assumptions to hold. We now
explicitly track these relids in PlannerGlobal and PlannedStmt so they
are locked even if pruned, preserving that rule across cached plan
reuse.

Regression tests are included to verify:

- Only surviving partitions are locked when pruning is enabled, and
  all partitions are locked when it is disabled (pg_locks inspection).
- Multiple ModifyTable nodes (via writable CTEs) handle the case where
  all target partitions are pruned, exercising firstResultRels.
- Plan invalidation during pruning-aware lock setup (DDL triggered by
  a pruning expression) discards the prep state and replans cleanly.

Note for extension authors: code that accesses partition relations
through EState must check that the RT index is a member of
es_unpruned_relids before opening the relation.  Previously this was
an optimization (avoid processing pruned partitions); it is now a
correctness requirement, because pruned partitions may not be locked.
ExecGetRangeTableRelation() already enforces this with an error when
called on a pruned relation.
---
 src/backend/commands/prepare.c                |  17 +-
 src/backend/executor/functions.c              |   1 +
 src/backend/executor/nodeModifyTable.c        |   5 +-
 src/backend/executor/spi.c                    |  22 +-
 src/backend/optimizer/plan/planner.c          |   1 +
 src/backend/optimizer/plan/setrefs.c          |  20 ++
 src/backend/tcop/postgres.c                   |   7 +-
 src/backend/utils/cache/plancache.c           | 257 +++++++++++++++++-
 src/include/nodes/pathnodes.h                 |   3 +
 src/include/nodes/plannodes.h                 |  10 +
 src/include/utils/plancache.h                 |  35 ++-
 src/test/regress/expected/partition_prune.out | 145 ++++++++++
 src/test/regress/expected/plancache.out       |  62 +++++
 src/test/regress/sql/partition_prune.sql      |  77 ++++++
 src/test/regress/sql/plancache.sql            |  51 ++++
 15 files changed, 689 insertions(+), 24 deletions(-)

diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index c7bab14b633..fec83cc6fd4 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -156,6 +156,7 @@ ExecuteQuery(ParseState *pstate,
 {
 	PreparedStatement *entry;
 	CachedPlan *cplan;
+	CachedPlanPrepData cprep = {0};
 	List	   *plan_list;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
@@ -195,7 +196,9 @@ ExecuteQuery(ParseState *pstate,
 									   entry->plansource->query_string);
 
 	/* Replan if needed, and increment plan refcount for portal */
-	cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL);
+	cprep.context = portal->portalContext;
+	cprep.owner = portal->resowner;
+	cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL, &cprep);
 	plan_list = cplan->stmt_list;
 
 	/*
@@ -207,7 +210,7 @@ ExecuteQuery(ParseState *pstate,
 					  query_string,
 					  entry->plansource->commandTag,
 					  plan_list,
-					  NIL,
+					  cprep.prep_estates,
 					  cplan);
 
 	/*
@@ -577,6 +580,7 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	PreparedStatement *entry;
 	const char *query_string;
 	CachedPlan *cplan;
+	CachedPlanPrepData cprep = {0};
 	List	   *plan_list;
 	List	   *prep_estates;
 	ListCell   *p;
@@ -635,8 +639,13 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	}
 
 	/* Replan if needed, and acquire a transient refcount */
+	cprep.context = CurrentMemoryContext;
+	cprep.owner = CurrentResourceOwner;
+	if (es->generic)
+		cprep.eflags = EXEC_FLAG_EXPLAIN_GENERIC;
 	cplan = GetCachedPlan(entry->plansource, paramLI,
-						  CurrentResourceOwner, pstate->p_queryEnv);
+						  CurrentResourceOwner, pstate->p_queryEnv,
+						  &cprep);
 
 	INSTR_TIME_SET_CURRENT(planduration);
 	INSTR_TIME_SUBTRACT(planduration, planstart);
@@ -655,7 +664,7 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	}
 
 	plan_list = cplan->stmt_list;
-	prep_estates = NIL;
+	prep_estates = cprep.prep_estates;
 
 	/* Explain each query */
 	prep_lc = list_head(prep_estates);
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index 952a784c924..c0ca72b38dd 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -699,6 +699,7 @@ init_execution_state(SQLFunctionCachePtr fcache)
 	fcache->cplan = GetCachedPlan(plansource,
 								  fcache->paramLI,
 								  fcache->cowner,
+								  NULL,
 								  NULL);
 
 	/*
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 4cd5e262e0f..9230f2b554f 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -4865,8 +4865,8 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
 	 * as a reference for building the ResultRelInfo of the target partition.
 	 * In either case, it doesn't matter which result relation is kept, so we
 	 * just keep the first one, if all others have been pruned.  See also,
-	 * ExecDoInitialPruning(), which ensures that this first result relation
-	 * has been locked.
+	 * AcquireExecutorLocksUnpruned(), which ensures that this first result
+	 * relation has been locked.
 	 */
 	i = 0;
 	foreach(l, node->resultRelations)
@@ -4880,6 +4880,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
 			/* all result relations pruned; keep the first one */
 			keep_rel = true;
 			rti = linitial_int(node->resultRelations);
+			Assert(list_member_int(estate->es_plannedstmt->firstResultRels, rti));
 			i = 0;
 		}
 
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 380bbc44e97..f1d84f7a350 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1580,6 +1580,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 {
 	CachedPlanSource *plansource;
 	CachedPlan *cplan;
+	CachedPlanPrepData cprep = {0};
 	List	   *stmt_list;
 	char	   *query_string;
 	Snapshot	snapshot;
@@ -1660,7 +1661,10 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 	 */
 
 	/* Replan if needed, and increment plan refcount for portal */
-	cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv);
+	cprep.context = portal->portalContext;
+	cprep.owner = portal->resowner;
+	cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv,
+						  &cprep);
 	stmt_list = cplan->stmt_list;
 
 	if (!plan->saved)
@@ -1670,7 +1674,10 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 		 * so must copy the plan into the portal's context.  An error here
 		 * will result in leaking our refcount on the plan, but it doesn't
 		 * matter because the plan is unsaved and hence transient anyway.
+		 *
+		 * Unsaved plans use custom plans, so prep should be a no-op.
 		 */
+		Assert(cprep.prep_estates == NIL);
 		oldcontext = MemoryContextSwitchTo(portal->portalContext);
 		stmt_list = copyObject(stmt_list);
 		MemoryContextSwitchTo(oldcontext);
@@ -1686,7 +1693,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 					  query_string,
 					  plansource->commandTag,
 					  stmt_list,
-					  NIL,
+					  cprep.prep_estates,
 					  cplan);
 
 	/*
@@ -2104,7 +2111,8 @@ SPI_plan_get_cached_plan(SPIPlanPtr plan)
 	/* Get the generic plan for the query */
 	cplan = GetCachedPlan(plansource, NULL,
 						  plan->saved ? CurrentResourceOwner : NULL,
-						  _SPI_current->queryEnv);
+						  _SPI_current->queryEnv,
+						  NULL);
 	Assert(cplan == plansource->gplan);
 
 	/* Pop the error context stack */
@@ -2503,6 +2511,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 		ListCell   *lc2;
 		List	   *prep_estates;
 		ListCell   *prep_lc;
+		CachedPlanPrepData cprep = {0};
 
 		spicallbackarg.query = plansource->query_string;
 
@@ -2577,11 +2586,14 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 		 * Replan if needed, and increment plan refcount.  If it's a saved
 		 * plan, the refcount must be backed by the plan_owner.
 		 */
+		cprep.context = CurrentMemoryContext;
+		cprep.owner = CurrentResourceOwner;
 		cplan = GetCachedPlan(plansource, options->params,
-							  plan_owner, _SPI_current->queryEnv);
+							  plan_owner, _SPI_current->queryEnv,
+							  &cprep);
 
 		stmt_list = cplan->stmt_list;
-		prep_estates = NIL;
+		prep_estates = cprep.prep_estates;
 
 		/*
 		 * If we weren't given a specific snapshot to use, and the statement
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 42604a0f75c..afa61d357c5 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -657,6 +657,7 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 	result->permInfos = glob->finalrteperminfos;
 	result->subrtinfos = glob->subrtinfos;
 	result->resultRelations = glob->resultRelations;
+	result->firstResultRels = glob->firstResultRels;
 	result->appendRelations = glob->appendRelations;
 	result->subplans = glob->subplans;
 	result->rewindPlanIDs = glob->rewindPlanIDs;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 1b5b9b5ed9c..ddb7902bc89 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -384,6 +384,26 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 		}
 	}
 
+	/*
+	 * Record the first result relation if it belongs to the set of
+	 * initially prunable relations.  We use bms_next_member() to get
+	 * the lowest-numbered leaf result rel, which matches
+	 * linitial_int(ModifyTable.resultRelations) because partition
+	 * expansion preserves RT index order.  There is one ModifyTable
+	 * per query level, so this captures exactly one entry per level.
+	 * ExecInitModifyTable() asserts that the recorded index matches
+	 * what it actually needs.
+	 */
+	if (root->leaf_result_relids)
+	{
+		Index	firstResultRel = bms_next_member(root->leaf_result_relids, -1);
+
+		firstResultRel += rtoffset;
+		if (bms_is_member(firstResultRel, root->glob->prunableRelids))
+			root->glob->firstResultRels =
+				lappend_int(root->glob->firstResultRels, firstResultRel);
+	}
+
 	return result;
 }
 
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 355a490cde9..de362ff1672 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1637,6 +1637,7 @@ exec_bind_message(StringInfo input_message)
 	int16	   *rformats = NULL;
 	CachedPlanSource *psrc;
 	CachedPlan *cplan;
+	CachedPlanPrepData cprep = {0};
 	Portal		portal;
 	char	   *query_string;
 	char	   *saved_stmt_name;
@@ -2018,7 +2019,9 @@ exec_bind_message(StringInfo input_message)
 	 * will be generated in MessageContext.  The plan refcount will be
 	 * assigned to the Portal, so it will be released at portal destruction.
 	 */
-	cplan = GetCachedPlan(psrc, params, NULL, NULL);
+	cprep.context = portal->portalContext;
+	cprep.owner = portal->resowner;
+	cplan = GetCachedPlan(psrc, params, NULL, NULL, &cprep);
 
 	/*
 	 * Now we can define the portal.
@@ -2031,7 +2034,7 @@ exec_bind_message(StringInfo input_message)
 					  query_string,
 					  psrc->commandTag,
 					  cplan->stmt_list,
-					  NIL,
+					  cprep.prep_estates,
 					  cplan);
 
 	/* Portal is defined, set the plan ID based on its contents. */
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 182c16e9b9a..2d4c57d3deb 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -93,14 +93,17 @@ static bool StmtPlanRequiresRevalidation(CachedPlanSource *plansource);
 static bool BuildingPlanRequiresSnapshot(CachedPlanSource *plansource);
 static List *RevalidateCachedQuery(CachedPlanSource *plansource,
 								   QueryEnvironment *queryEnv);
-static bool CheckCachedPlan(CachedPlanSource *plansource);
+static bool CheckCachedPlan(CachedPlanSource *plansource, CachedPlanPrepData *cprep);
 static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 								   ParamListInfo boundParams, QueryEnvironment *queryEnv);
 static bool choose_custom_plan(CachedPlanSource *plansource,
 							   ParamListInfo boundParams);
 static double cached_plan_cost(CachedPlan *plan, bool include_planner);
 static Query *QueryListGetPrimaryStmt(List *stmts);
-static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static void AcquireExecutorLocksAll(List *stmt_list, bool acquire);
+static void AcquireExecutorLocksUnpruned(List *stmt_list, bool acquire,
+										 CachedPlanPrepData *cprep);
+static void CachedPlanPrepCleanup(CachedPlanPrepData *cprep);
 static void AcquirePlannerLocks(List *stmt_list, bool acquire);
 static void ScanQueryForLocks(Query *parsetree, bool acquire);
 static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -942,6 +945,11 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
 /*
  * CheckCachedPlan: see if the CachedPlanSource's generic plan is valid.
  *
+ * If 'cprep' is not NULL, ExecutorPrep() is applied to each PlannedStmt to
+ * compute the set of partitions that survive initial runtime pruning in order
+ * to only lock them.  The EStates created to do so are saved in cprep for
+ * later reuse by ExecutorStart().
+ *
  * Caller must have already called RevalidateCachedQuery to verify that the
  * querytree is up to date.
  *
@@ -949,7 +957,7 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
  * (We must do this for the "true" result to be race-condition-free.)
  */
 static bool
-CheckCachedPlan(CachedPlanSource *plansource)
+CheckCachedPlan(CachedPlanSource *plansource, CachedPlanPrepData *cprep)
 {
 	CachedPlan *plan = plansource->gplan;
 
@@ -983,7 +991,10 @@ CheckCachedPlan(CachedPlanSource *plansource)
 		 */
 		Assert(plan->refcount > 0);
 
-		AcquireExecutorLocks(plan->stmt_list, true);
+		if (cprep)
+			AcquireExecutorLocksUnpruned(plan->stmt_list, true, cprep);
+		else
+			AcquireExecutorLocksAll(plan->stmt_list, true);
 
 		/*
 		 * If plan was transient, check to see if TransactionXmin has
@@ -1005,7 +1016,13 @@ CheckCachedPlan(CachedPlanSource *plansource)
 		}
 
 		/* Oops, the race case happened.  Release useless locks. */
-		AcquireExecutorLocks(plan->stmt_list, false);
+		if (cprep)
+			AcquireExecutorLocksUnpruned(plan->stmt_list, false, cprep);
+		else
+			AcquireExecutorLocksAll(plan->stmt_list, false);
+
+		/* Also clean up ExecutorPrep() state, if necessary. */
+		CachedPlanPrepCleanup(cprep);
 	}
 
 	/*
@@ -1285,6 +1302,15 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
  * On return, the plan is valid and we have sufficient locks to begin
  * execution.
  *
+ * If 'cprep' is not NULL and a generic plan is reused, the function
+ * performs initial pruning via ExecutorPrep() and locks only the
+ * surviving partitions.  The resulting EStates are stored in
+ * cprep->prep_estates and must be delivered to ExecutorStart() via
+ * QueryDesc->estate (or the equivalent portal/SPI path).  Failure
+ * to do so means the executor will operate on relations for which
+ * locks were never acquired.  Passing NULL for cprep is always safe;
+ * all partitions are locked as before.
+ *
  * On return, the refcount of the plan has been incremented; a later
  * ReleaseCachedPlan() call is expected.  If "owner" is not NULL then
  * the refcount has been reported to that ResourceOwner (note that this
@@ -1295,7 +1321,8 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
  */
 CachedPlan *
 GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
-			  ResourceOwner owner, QueryEnvironment *queryEnv)
+			  ResourceOwner owner, QueryEnvironment *queryEnv,
+			  CachedPlanPrepData *cprep)
 {
 	CachedPlan *plan = NULL;
 	List	   *qlist;
@@ -1317,7 +1344,9 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 
 	if (!customplan)
 	{
-		if (CheckCachedPlan(plansource))
+		if (cprep)
+			cprep->params = boundParams;
+		if (CheckCachedPlan(plansource, cprep))
 		{
 			/* We want a generic plan, and we already have a valid one */
 			plan = plansource->gplan;
@@ -1904,11 +1933,13 @@ QueryListGetPrimaryStmt(List *stmts)
 }
 
 /*
- * AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
- * or release them if acquire is false.
+ * AcquireExecutorLocksAll: acquire locks needed for execution of a cached
+ * plan; or release them if acquire is false.
+ *
+ * This locks all relations in a given PlannedStmt's range table.
  */
 static void
-AcquireExecutorLocks(List *stmt_list, bool acquire)
+AcquireExecutorLocksAll(List *stmt_list, bool acquire)
 {
 	ListCell   *lc1;
 
@@ -1955,6 +1986,212 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 	}
 }
 
+/*
+ * LockRelids
+ * 		Acquire or release locks on the specified relids, which reference
+ * 		entries in the provided range table.
+ *
+ * Helper for AcquireExecutorLocksUnpruned().
+ */
+static void
+LockRelids(List *rtable, Bitmapset *relids, bool acquire)
+{
+	int	rtindex = -1;
+
+	while ((rtindex = bms_next_member(relids, rtindex)) >= 0)
+	{
+		RangeTblEntry *rte = list_nth_node(RangeTblEntry, rtable, rtindex - 1);
+
+		Assert(rte->rtekind == RTE_RELATION ||
+			   (rte->rtekind == RTE_SUBQUERY && OidIsValid(rte->relid)));
+
+		/*
+		 * Acquire the appropriate type of lock on each relation OID. Note
+		 * that we don't actually try to open the rel, and hence will not
+		 * fail if it's been dropped entirely --- we'll just transiently
+		 * acquire a non-conflicting lock.
+		 */
+		if (acquire)
+			LockRelationOid(rte->relid, rte->rellockmode);
+		else
+			UnlockRelationOid(rte->relid, rte->rellockmode);
+	}
+}
+
+/*
+ * AcquireExecutorLocksUnpruned
+ *		Acquire or release execution locks for only unpruned relations
+ *		referenced by the given PlannedStmts.
+ *
+ * On acquire, this:
+ *	- locks unprunable rels listed in PlannedStmt.unprunableRelids
+ *	- runs ExecutorPrep() to perform initial runtime pruning
+ *	- locks the surviving partitions reported in the prep estate
+ *	- appends the EState pointer for each PlannedStmt to cprep->prep_estates
+ *
+ * On release, it:
+ *	- looks up the EState for each PlannedStmt from cprep->prep_estates
+ *	  (which must already be populated)
+ *	- unlocks the same relations identified during acquire
+ *
+ * prep_estates is extended during acquire and must match stmt_list one-to-one
+ * when releasing locks.  Memory allocation for EState happens in
+ * cprep->context.  Locks are acquired using cprep->owner.
+ */
+static void
+AcquireExecutorLocksUnpruned(List *stmt_list, bool acquire,
+							 CachedPlanPrepData *cprep)
+{
+	MemoryContext oldcontext = MemoryContextSwitchTo(cprep->context);
+	ListCell   *lc1;
+	List	   *prep_estates;
+	ListCell   *prep_lc;
+
+	Assert(cprep);
+
+	/*
+	 * When releasing locks, use the EState list (if any) created during
+	 * acquisition to determine which relids to unlock. The list must match
+	 * the PlannedStmt list one-to-one.
+	 */
+	prep_estates = cprep->prep_estates;
+	Assert(acquire || list_length(prep_estates) == list_length(stmt_list));
+
+	prep_lc = list_head(prep_estates);
+	foreach(lc1, stmt_list)
+	{
+		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
+		EState *prep_estate;
+
+		if (plannedstmt->commandType == CMD_UTILITY)
+		{
+			/* Same as AcquireExecutorLocks(). */
+			Query	   *query = UtilityContainsQuery(plannedstmt->utilityStmt);
+
+			if (query)
+				ScanQueryForLocks(query, acquire);
+
+			/* Keep the list one-to-one with stmt_list. */
+			if (acquire)
+				cprep->prep_estates = lappend(cprep->prep_estates, NULL);
+			else
+				(void) next_prep_estate(prep_estates, &prep_lc);
+			continue;
+		}
+
+		/*
+		 * Lock tables mentioned in the original query and other unprunable
+		 * relations that were added to the plan via inheritance expansion.
+		 */
+		LockRelids(plannedstmt->rtable, plannedstmt->unprunableRelids, acquire);
+
+		/* Lock partitions surviving runtime initial pruning. */
+		if (acquire)
+		{
+			/*
+			 * Pruning expressions may call PL functions that require an active
+			 * snapshot (e.g., via EnsurePortalSnapshotExists()). Establish one
+			 * if needed.
+			 */
+			bool		snap_pushed = false;
+
+			if (!ActiveSnapshotSet())
+			{
+				PushActiveSnapshot(GetTransactionSnapshot());
+				snap_pushed = true;
+			}
+
+			prep_estate = ExecutorPrep(plannedstmt, cprep->params,
+									   cprep->owner, cprep->eflags);
+			Assert(prep_estate);
+			cprep->prep_estates = lappend(cprep->prep_estates, prep_estate);
+
+			if (snap_pushed)
+				PopActiveSnapshot();
+		}
+		else
+			prep_estate = next_prep_estate(prep_estates, &prep_lc);
+
+		if (prep_estate)
+		{
+			/*
+			 * es_unpruned_relids includes plannedstmt->unprunableRelids,
+			 * which we've already locked. Filter them out to avoid double-locking.
+			 */
+			Bitmapset *lock_relids = bms_difference(prep_estate->es_unpruned_relids,
+													plannedstmt->unprunableRelids);
+
+			/*
+			 * We must always lock the first result relation of each
+			 * ModifyTable node in the plan, that is, the ones listed in
+			 * plannedstmt->firstResultRels, even if pruned, to satisfy the
+			 * executor assumptions described in ExecInitModifyTable().
+			 * This can be wasteful, because we may not need the first
+			 * result relation at all if other result relations are
+			 * unpruned and thus sufficient for the ModifyTable node's
+			 * needs.  Unfortunately, we don't have a per-node
+			 * unpruned_relids set to determine whether other result
+			 * relations are included.
+			 */
+			if (plannedstmt->resultRelations)
+			{
+				ListCell *lc2;
+
+				foreach(lc2, plannedstmt->firstResultRels)
+				{
+					Index       firstResultRel = lfirst_int(lc2);
+
+					if (!bms_is_member(firstResultRel, lock_relids))
+						lock_relids = bms_add_member(lock_relids, firstResultRel);
+				}
+			}
+
+			LockRelids(plannedstmt->rtable, lock_relids, acquire);
+			bms_free(lock_relids);
+		}
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+}
+
+/*
+ * CachedPlanPrepCleanup
+ *		Clean up EStates built for a generic plan.
+ *
+ * This is used in the corner case where CheckCachedPlan() discovers
+ * that a CachedPlan has become invalid after AcquireExecutorLocksUnpruned()
+ * has already run.  In that case we must both release the execution locks
+ * and dispose of the EState list stored in CachedPlanPrepData, since the
+ * executor will never see or clean it up.
+ */
+static void
+CachedPlanPrepCleanup(CachedPlanPrepData *cprep)
+{
+	ListCell   *lc;
+	ResourceOwner oldowner;
+
+	if (cprep == NULL)
+		return;
+
+	/* Switch to owner that ExecutorPrep() would have used. */
+	oldowner = CurrentResourceOwner;
+	CurrentResourceOwner = cprep->owner;
+	foreach(lc, cprep->prep_estates)
+	{
+		EState *prep_estate = (EState *) lfirst(lc);
+
+		if (prep_estate == NULL)
+			continue;
+
+		ExecCloseRangeTableRelations(prep_estate);
+		FreeExecutorState(prep_estate);
+	}
+	CurrentResourceOwner = oldowner;
+
+	list_free(cprep->prep_estates);
+	cprep->prep_estates = NIL;
+}
+
 /*
  * AcquirePlannerLocks: acquire locks needed for planning of a querytree list;
  * or release them if acquire is false.
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 27758ec16fe..4fd9d9bcc56 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -217,6 +217,9 @@ typedef struct PlannerGlobal
 	/* "flat" list of integer RT indexes */
 	List	   *resultRelations;
 
+	/* "flat" list of integer RT indexes (one per ModifyTable node) */
+	List	   *firstResultRels;
+
 	/* "flat" list of AppendRelInfos */
 	List	   *appendRelations;
 
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index b6185825fcb..55279cbbda8 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -121,6 +121,16 @@ typedef struct PlannedStmt
 	/* integer list of RT indexes, or NIL */
 	List	   *resultRelations;
 
+	/*
+	 * rtable indexes of first target relation in each ModifyTable node in the
+	 * plan for INSERT/UPDATE/DELETE/MERGE.  NIL if resultRelations is NIL.
+	 *
+	 * These are used by AcquireExecutorLocksUnpruned() to ensure that the
+	 * first result rel for each ModifyTable remains locked even if pruned;
+	 * see ExecInitModifyTable() for the executor side assumptions.
+	 */
+	List	   *firstResultRels;
+
 	/* list of AppendRelInfo nodes */
 	List	   *appendRelations;
 
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 984c51515c6..c22f832d0b1 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -197,6 +197,38 @@ typedef struct CachedExpression
 } CachedExpression;
 
 
+/*
+ * CachedPlanPrepData
+ *      Carries ExecutorPrep results for each PlannedStmt in a CachedPlan,
+ *      along with context and owner information needed to allocate them.
+ *
+ * prep_estates is indexed one-to-one with CachedPlan->stmt_list, and is
+ * populated when GetCachedPlan() prepares a reused generic plan.  If the
+ * plan is found invalid after locking, the same list is used to determine
+ * which relations to unlock before retrying.
+ *
+ * ExecutorPrep state is allocated in 'context' and owned by 'owner'.
+ *
+ * eflags controls ExecutorPrep() behavior during initial pruning.
+ * Normally zero; set EXEC_FLAG_EXPLAIN_GENERIC to suppress pruning
+ * in EXPLAIN (GENERIC_PLAN).  Need not match the eflags later passed
+ * to ExecutorStart().
+ *
+ * prep_estates must reach ExecutorStart() to be adopted for execution.
+ * If the plan is invalidated before that happens, CachedPlanPrepCleanup()
+ * frees them instead.  The EStates are allocated in 'context' and their
+ * resources tracked under 'owner', which the caller sets to match the
+ * execution environment (e.g., portal context and resowner).
+ */
+typedef struct CachedPlanPrepData
+{
+	List   *prep_estates;	/* one EState per PlannedStmt, or NULL */
+	ParamListInfo params;	/* params visible to ExecutorPrep */
+	MemoryContext context;	/* where to allocate EState and its fields */
+	ResourceOwner owner;	/* ResourceOwner for ExecutorPrep state */
+	int		eflags;			/* executor flags to control ExecutorPrep */
+} CachedPlanPrepData;
+
 extern void InitPlanCache(void);
 extern void ResetPlanCache(void);
 
@@ -240,7 +272,8 @@ extern List *CachedPlanGetTargetList(CachedPlanSource *plansource,
 extern CachedPlan *GetCachedPlan(CachedPlanSource *plansource,
 								 ParamListInfo boundParams,
 								 ResourceOwner owner,
-								 QueryEnvironment *queryEnv);
+								 QueryEnvironment *queryEnv,
+								 CachedPlanPrepData *cprep);
 extern void ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner);
 
 extern bool CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
diff --git a/src/test/regress/expected/partition_prune.out b/src/test/regress/expected/partition_prune.out
index deacdd75807..8e0cc98baca 100644
--- a/src/test/regress/expected/partition_prune.out
+++ b/src/test/regress/expected/partition_prune.out
@@ -4824,3 +4824,148 @@ select min(a) over (partition by a order by a) from part_abc where a >= stable_o
 
 drop view part_abc_view;
 drop table part_abc;
+--
+-- Verify that pruning-aware locking skips pruned partitions
+-- when reusing a generic cached plan.
+--
+set plan_cache_mode to force_generic_plan;
+create table prunelock_p (a int) partition by list (a);
+create table prunelock_p1 partition of prunelock_p for values in (1);
+create table prunelock_p2 partition of prunelock_p for values in (2);
+create table prunelock_p3 partition of prunelock_p for values in (3);
+prepare prunelock_q (int) as select * from prunelock_p where a = $1;
+-- Force generic plan creation
+explain (costs off) execute prunelock_q(1);
+                  QUERY PLAN                  
+----------------------------------------------
+ Append
+   Subplans Removed: 2
+   ->  Seq Scan on prunelock_p1 prunelock_p_1
+         Filter: (a = $1)
+(4 rows)
+
+-- Execute and check which child partitions are locked
+begin;
+execute prunelock_q(1);
+ a 
+---
+(0 rows)
+
+select c.relname
+  from pg_locks l
+  join pg_class c on c.oid = l.relation
+ where l.pid = pg_backend_pid()
+   and c.relname like 'prunelock_p_'
+ order by c.relname;
+   relname    
+--------------
+ prunelock_p1
+(1 row)
+
+commit;
+deallocate prunelock_q;
+-- Turn pruning off
+set enable_partition_pruning to off;
+prepare prunelock_q (int) as select * from prunelock_p where a = $1;
+-- Force generic plan creation
+explain (costs off) execute prunelock_q(1);
+                  QUERY PLAN                  
+----------------------------------------------
+ Append
+   ->  Seq Scan on prunelock_p1 prunelock_p_1
+         Filter: (a = $1)
+   ->  Seq Scan on prunelock_p2 prunelock_p_2
+         Filter: (a = $1)
+   ->  Seq Scan on prunelock_p3 prunelock_p_3
+         Filter: (a = $1)
+(7 rows)
+
+-- Execute and check which child partitions are locked
+begin;
+execute prunelock_q(1);
+ a 
+---
+(0 rows)
+
+select c.relname
+  from pg_locks l
+  join pg_class c on c.oid = l.relation
+ where l.pid = pg_backend_pid()
+   and c.relname like 'prunelock_p_'
+ order by c.relname;
+   relname    
+--------------
+ prunelock_p1
+ prunelock_p2
+ prunelock_p3
+(3 rows)
+
+commit;
+deallocate prunelock_q;
+reset enable_partition_pruning;
+--
+-- Verify firstResultRels handling with multiple ModifyTable nodes
+-- (writable CTEs) targeting a partitioned table.  When a pruning
+-- parameter matches no partition, all result relations are pruned
+-- and the executor must still find a usable first result relation
+-- for each ModifyTable node.
+--
+prepare prunelock_mt_q (int, int) as
+  with upd1 as (update prunelock_p set a = a),
+       upd2 as (update prunelock_p set a = a where a = $2)
+  update prunelock_p set a = a where a = $1;
+-- Force generic plan creation
+explain (costs off) execute prunelock_mt_q(1, 2);
+                         QUERY PLAN                         
+------------------------------------------------------------
+ Update on prunelock_p
+   Update on prunelock_p1 prunelock_p_1
+   CTE upd1
+     ->  Update on prunelock_p prunelock_p_3
+           Update on prunelock_p1 prunelock_p_4
+           Update on prunelock_p2 prunelock_p_5
+           Update on prunelock_p3 prunelock_p_6
+           ->  Append
+                 ->  Seq Scan on prunelock_p1 prunelock_p_4
+                 ->  Seq Scan on prunelock_p2 prunelock_p_5
+                 ->  Seq Scan on prunelock_p3 prunelock_p_6
+   CTE upd2
+     ->  Update on prunelock_p prunelock_p_7
+           Update on prunelock_p2 prunelock_p_8
+           ->  Append
+                 Subplans Removed: 2
+                 ->  Seq Scan on prunelock_p2 prunelock_p_8
+                       Filter: (a = $2)
+   ->  Append
+         Subplans Removed: 2
+         ->  Seq Scan on prunelock_p1 prunelock_p_1
+               Filter: (a = $1)
+(22 rows)
+
+-- All partitions pruned: value 4 matches no partition, so each
+-- ModifyTable must still initialize correctly with no matching
+-- result relations.
+explain (costs off) execute prunelock_mt_q(4, 5);
+                         QUERY PLAN                         
+------------------------------------------------------------
+ Update on prunelock_p
+   CTE upd1
+     ->  Update on prunelock_p prunelock_p_2
+           Update on prunelock_p1 prunelock_p_3
+           Update on prunelock_p2 prunelock_p_4
+           Update on prunelock_p3 prunelock_p_5
+           ->  Append
+                 ->  Seq Scan on prunelock_p1 prunelock_p_3
+                 ->  Seq Scan on prunelock_p2 prunelock_p_4
+                 ->  Seq Scan on prunelock_p3 prunelock_p_5
+   CTE upd2
+     ->  Update on prunelock_p prunelock_p_6
+           ->  Append
+                 Subplans Removed: 3
+   ->  Append
+         Subplans Removed: 3
+(16 rows)
+
+deallocate prunelock_mt_q;
+drop table prunelock_p;
+reset plan_cache_mode;
diff --git a/src/test/regress/expected/plancache.out b/src/test/regress/expected/plancache.out
index 4e59188196c..1d69ab0a1c2 100644
--- a/src/test/regress/expected/plancache.out
+++ b/src/test/regress/expected/plancache.out
@@ -398,3 +398,65 @@ select name, generic_plans, custom_plans from pg_prepared_statements
 (1 row)
 
 drop table test_mode;
+-- Test invalidation of a generic plan during pruning-aware lock setup.
+-- The pruning expression uses a stable SQL function that calls a volatile
+-- plpgsql function.  That function performs DDL on a partition when a
+-- separate "signal" table says to do so.  The second EXECUTE should
+-- replan cleanly after the DDL.
+set plan_cache_mode to force_generic_plan;
+create table inval_during_pruning_p (a int) partition by list (a);
+create table inval_during_pruning_p1 partition of inval_during_pruning_p for values in (1);
+create table inval_during_pruning_p2 partition of inval_during_pruning_p for values in (2);
+insert into inval_during_pruning_p values (1), (2);
+create table inval_during_pruning_signal (create_idx bool not null);
+insert into inval_during_pruning_signal values (false);
+create or replace function invalidate_plancache_func() returns int
+as $$
+declare
+	create_index bool;
+begin
+	-- Perform DDL on a partition if asked to
+	select create_idx into create_index from inval_during_pruning_signal for update;
+	if create_index = true then
+		raise notice 'creating index on partition inval_during_pruning_p1';
+		create index on inval_during_pruning_p1 (a);
+		update inval_during_pruning_signal set create_idx = false;
+	end if;
+	-- pruning parameter
+	return 1;
+end;
+$$ language plpgsql volatile;
+create or replace function stable_pruning_val() returns int as $$
+	select invalidate_plancache_func();
+$$ language sql stable;
+prepare inval_during_pruning_q as select * from inval_during_pruning_p where a = stable_pruning_val();
+-- Build a generic plan and run pruning once, but don't set the signal
+-- for invalidate_plancache_func() to perform the DDL.
+explain (verbose, costs off) execute inval_during_pruning_q;
+                                QUERY PLAN                                 
+---------------------------------------------------------------------------
+ Append
+   Subplans Removed: 1
+   ->  Seq Scan on public.inval_during_pruning_p1 inval_during_pruning_p_1
+         Output: inval_during_pruning_p_1.a
+         Filter: (inval_during_pruning_p_1.a = stable_pruning_val())
+(5 rows)
+
+-- Reuse the generic plan.  Make invalidate_plancache_func() perform DDL
+-- during this execution, which should force replanning without errors.
+update inval_during_pruning_signal set create_idx = true;
+explain (verbose, costs off) execute inval_during_pruning_q;
+NOTICE:  creating index on partition inval_during_pruning_p1
+                                QUERY PLAN                                 
+---------------------------------------------------------------------------
+ Append
+   Subplans Removed: 1
+   ->  Seq Scan on public.inval_during_pruning_p1 inval_during_pruning_p_1
+         Output: inval_during_pruning_p_1.a
+         Filter: (inval_during_pruning_p_1.a = stable_pruning_val())
+(5 rows)
+
+drop table inval_during_pruning_p, inval_during_pruning_signal;
+drop function invalidate_plancache_func, stable_pruning_val;
+deallocate inval_during_pruning_q;
+reset plan_cache_mode;
diff --git a/src/test/regress/sql/partition_prune.sql b/src/test/regress/sql/partition_prune.sql
index d93c0c03bab..804dd3c8f4e 100644
--- a/src/test/regress/sql/partition_prune.sql
+++ b/src/test/regress/sql/partition_prune.sql
@@ -1447,3 +1447,80 @@ select min(a) over (partition by a order by a) from part_abc where a >= stable_o
 
 drop view part_abc_view;
 drop table part_abc;
+
+--
+-- Verify that pruning-aware locking skips pruned partitions
+-- when reusing a generic cached plan.
+--
+set plan_cache_mode to force_generic_plan;
+
+create table prunelock_p (a int) partition by list (a);
+create table prunelock_p1 partition of prunelock_p for values in (1);
+create table prunelock_p2 partition of prunelock_p for values in (2);
+create table prunelock_p3 partition of prunelock_p for values in (3);
+
+prepare prunelock_q (int) as select * from prunelock_p where a = $1;
+
+-- Force generic plan creation
+explain (costs off) execute prunelock_q(1);
+
+-- Execute and check which child partitions are locked
+begin;
+execute prunelock_q(1);
+
+select c.relname
+  from pg_locks l
+  join pg_class c on c.oid = l.relation
+ where l.pid = pg_backend_pid()
+   and c.relname like 'prunelock_p_'
+ order by c.relname;
+commit;
+
+deallocate prunelock_q;
+
+-- Turn pruning off
+set enable_partition_pruning to off;
+
+prepare prunelock_q (int) as select * from prunelock_p where a = $1;
+
+-- Force generic plan creation
+explain (costs off) execute prunelock_q(1);
+
+-- Execute and check which child partitions are locked
+begin;
+execute prunelock_q(1);
+
+select c.relname
+  from pg_locks l
+  join pg_class c on c.oid = l.relation
+ where l.pid = pg_backend_pid()
+   and c.relname like 'prunelock_p_'
+ order by c.relname;
+commit;
+
+deallocate prunelock_q;
+reset enable_partition_pruning;
+
+--
+-- Verify firstResultRels handling with multiple ModifyTable nodes
+-- (writable CTEs) targeting a partitioned table.  When a pruning
+-- parameter matches no partition, all result relations are pruned
+-- and the executor must still find a usable first result relation
+-- for each ModifyTable node.
+--
+prepare prunelock_mt_q (int, int) as
+  with upd1 as (update prunelock_p set a = a),
+       upd2 as (update prunelock_p set a = a where a = $2)
+  update prunelock_p set a = a where a = $1;
+
+-- Force generic plan creation
+explain (costs off) execute prunelock_mt_q(1, 2);
+
+-- All partitions pruned: value 4 matches no partition, so each
+-- ModifyTable must still initialize correctly with no matching
+-- result relations.
+explain (costs off) execute prunelock_mt_q(4, 5);
+
+deallocate prunelock_mt_q;
+drop table prunelock_p;
+reset plan_cache_mode;
diff --git a/src/test/regress/sql/plancache.sql b/src/test/regress/sql/plancache.sql
index 4b2f11dcc64..139b4688fd6 100644
--- a/src/test/regress/sql/plancache.sql
+++ b/src/test/regress/sql/plancache.sql
@@ -223,3 +223,54 @@ select name, generic_plans, custom_plans from pg_prepared_statements
   where  name = 'test_mode_pp';
 
 drop table test_mode;
+
+-- Test invalidation of a generic plan during pruning-aware lock setup.
+-- The pruning expression uses a stable SQL function that calls a volatile
+-- plpgsql function.  That function performs DDL on a partition when a
+-- separate "signal" table says to do so.  The second EXECUTE should
+-- replan cleanly after the DDL.
+set plan_cache_mode to force_generic_plan;
+create table inval_during_pruning_p (a int) partition by list (a);
+create table inval_during_pruning_p1 partition of inval_during_pruning_p for values in (1);
+create table inval_during_pruning_p2 partition of inval_during_pruning_p for values in (2);
+insert into inval_during_pruning_p values (1), (2);
+
+create table inval_during_pruning_signal (create_idx bool not null);
+insert into inval_during_pruning_signal values (false);
+create or replace function invalidate_plancache_func() returns int
+as $$
+declare
+	create_index bool;
+begin
+	-- Perform DDL on a partition if asked to
+	select create_idx into create_index from inval_during_pruning_signal for update;
+	if create_index = true then
+		raise notice 'creating index on partition inval_during_pruning_p1';
+		create index on inval_during_pruning_p1 (a);
+		update inval_during_pruning_signal set create_idx = false;
+	end if;
+	-- pruning parameter
+	return 1;
+end;
+$$ language plpgsql volatile;
+
+create or replace function stable_pruning_val() returns int as $$
+	select invalidate_plancache_func();
+$$ language sql stable;
+
+prepare inval_during_pruning_q as select * from inval_during_pruning_p where a = stable_pruning_val();
+
+-- Build a generic plan and run pruning once, but don't set the signal
+-- for invalidate_plancache_func() to perform the DDL.
+explain (verbose, costs off) execute inval_during_pruning_q;
+
+-- Reuse the generic plan.  Make invalidate_plancache_func() perform DDL
+-- during this execution, which should force replanning without errors.
+update inval_during_pruning_signal set create_idx = true;
+explain (verbose, costs off) execute inval_during_pruning_q;
+
+drop table inval_during_pruning_p, inval_during_pruning_signal;
+drop function invalidate_plancache_func, stable_pruning_val;
+deallocate inval_during_pruning_q;
+
+reset plan_cache_mode;
-- 
2.47.3
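
Editor's sketch of the second-pass lock set that AcquireExecutorLocksUnpruned() computes in the patch above: survivors of initial pruning, minus the unprunable relations already locked, plus every first result relation. This is a simplified model, not the patch's code. RelidSet, relid_bit() and second_pass_lock_set() are hypothetical names, and a 64-bit mask stands in for PostgreSQL's Bitmapset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for Bitmapset: bit i set means RT index i is a member. */
typedef uint64_t RelidSet;

static RelidSet
relid_bit(unsigned rtindex)
{
	assert(rtindex < 64);
	return (RelidSet) 1 << rtindex;
}

/*
 * Compute the set of relations to lock in the second pass: relations that
 * survived initial pruning, minus those already locked in the first pass
 * (the unprunable ones), plus the first result relation of each ModifyTable
 * node whether or not it survived pruning.
 */
static RelidSet
second_pass_lock_set(RelidSet unpruned, RelidSet unprunable,
					 const unsigned *first_result_rels, size_t nfirst)
{
	RelidSet	lock_relids = unpruned & ~unprunable;

	for (size_t i = 0; i < nfirst; i++)
		lock_relids |= relid_bit(first_result_rels[i]);

	return lock_relids;
}
```

With RT index 1 unprunable, indexes 1 and 3 unpruned, and index 2 as a first result relation, the function returns the set {2, 3}: index 1 is filtered out as already locked, and index 2 is added even though pruning removed it.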



  [application/octet-stream] v8-0004-Make-SQL-function-executor-track-ExecutorPrep-sta.patch (7.8K, 4-v8-0004-Make-SQL-function-executor-track-ExecutorPrep-sta.patch)
From 2ab5fefb9644118a1f1528a53b9a6af90e063edb Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Tue, 10 Feb 2026 22:09:23 +0900
Subject: [PATCH v8 4/5] Make SQL function executor track ExecutorPrep state

Extend the SQL function executor to use the ExecutorPrep results
returned by GetCachedPlan().  init_execution_state() now passes a
CachedPlanPrepData to GetCachedPlan() and stores the per-statement
prep EState pointers in the execution_state nodes.

At execution time, postquel_start() reparents the prep estate's
es_query_cxt under the function's subcontext so that prep state
follows the usual per-call context hierarchy.

This allows SQL language functions to participate in the same
ExecutorPrep machinery as other plan cache users.

Add a regression test where rule rewrite expands a single UPDATE
into multiple PlannedStmts, exercising the SQL function plan cache
and the generic plan reuse path that now invokes ExecutorPrep.
---
 src/backend/executor/functions.c        | 29 +++++++++++++--
 src/test/regress/expected/plancache.out | 48 +++++++++++++++++++++++++
 src/test/regress/sql/plancache.sql      | 34 ++++++++++++++++++
 3 files changed, 109 insertions(+), 2 deletions(-)

diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index c0ca72b38dd..f246f051c25 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -73,6 +73,7 @@ typedef struct execution_state
 	bool		setsResult;		/* true if this query produces func's result */
 	bool		lazyEval;		/* true if should fetch one row at a time */
 	PlannedStmt *stmt;			/* plan for this query */
+	EState	   *prep_estate;	/* EState created in ExecutorPrep() for this plan */
 	QueryDesc  *qd;				/* null unless status == RUN */
 } execution_state;
 
@@ -658,6 +659,8 @@ init_execution_state(SQLFunctionCachePtr fcache)
 	execution_state *lasttages = NULL;
 	int			nstmts;
 	ListCell   *lc;
+	CachedPlanPrepData cprep = {0};
+	ListCell   *prep_lc;
 
 	/*
 	 * Clean up after previous query, if there was one.
@@ -696,11 +699,20 @@ init_execution_state(SQLFunctionCachePtr fcache)
 	 * CurrentResourceOwner will be the same when ShutdownSQLFunction runs.)
 	 */
 	fcache->cowner = CurrentResourceOwner;
+
+	/*
+	 * Have ExecutorPrep() allocate under fcache->fcontext.  The prep
+	 * EStates it creates will initially live there; postquel_start()
+	 * will later reparent their es_query_cxt into fcache->subcontext
+	 * when using them for execution.
+	 */
+	cprep.context = fcache->fcontext;
+	cprep.owner = fcache->cowner;
 	fcache->cplan = GetCachedPlan(plansource,
 								  fcache->paramLI,
 								  fcache->cowner,
 								  NULL,
-								  NULL);
+								  &cprep);
 
 	/*
 	 * If necessary, make esarray[] bigger to hold the needed state.
@@ -721,9 +733,11 @@ init_execution_state(SQLFunctionCachePtr fcache)
 	/*
 	 * Build execution_state list to match the number of contained plans.
 	 */
+	prep_lc = list_head(cprep.prep_estates);
 	foreach(lc, fcache->cplan->stmt_list)
 	{
 		PlannedStmt *stmt = lfirst_node(PlannedStmt, lc);
+		EState *prep_estate = next_prep_estate(cprep.prep_estates, &prep_lc);
 		execution_state *newes;
 
 		/*
@@ -765,6 +779,7 @@ init_execution_state(SQLFunctionCachePtr fcache)
 		newes->setsResult = false;	/* might change below */
 		newes->lazyEval = false;	/* might change below */
 		newes->stmt = stmt;
+		newes->prep_estate = prep_estate;
 		newes->qd = NULL;
 
 		if (stmt->canSetTag)
@@ -1363,6 +1378,15 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
 	else
 		dest = None_Receiver;
 
+	/*
+	 * Prep EStates were built under fcache->fcontext.  For execution,
+	 * make their es_query_cxt a child of fcache->subcontext so they
+	 * follow the usual per-call lifetime.
+	 */
+	if (es->prep_estate)
+		MemoryContextSetParent(es->prep_estate->es_query_cxt,
+							   fcache->subcontext);
+
 	es->qd = CreateQueryDesc(es->stmt,
 							 fcache->func->src,
 							 GetActiveSnapshot(),
@@ -1371,7 +1395,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
 							 fcache->paramLI,
 							 es->qd ? es->qd->queryEnv : NULL,
 							 0,
-							 NULL);
+							 es->prep_estate);
 
 	/* Utility commands don't need Executor. */
 	if (es->qd->operation != CMD_UTILITY)
@@ -1462,6 +1486,7 @@ postquel_end(execution_state *es, SQLFunctionCachePtr fcache)
 
 	FreeQueryDesc(es->qd);
 	es->qd = NULL;
+	es->prep_estate = NULL;
 
 	MemoryContextSwitchTo(oldcontext);
 
diff --git a/src/test/regress/expected/plancache.out b/src/test/regress/expected/plancache.out
index 1d69ab0a1c2..371673a6e96 100644
--- a/src/test/regress/expected/plancache.out
+++ b/src/test/regress/expected/plancache.out
@@ -459,4 +459,52 @@ NOTICE:  creating index on partition inval_during_pruning_p1
 drop table inval_during_pruning_p, inval_during_pruning_signal;
 drop function invalidate_plancache_func, stable_pruning_val;
 deallocate inval_during_pruning_q;
+-- exercise sql-function plan cache when rewrite expands a single statement
+-- into multiple planned statements. this forces cachedplan->stmt_list to
+-- contain more than one entry and checks that executor state for the first
+-- rewritten statement does not destroy state needed by the second one.
+set plan_cache_mode = force_generic_plan;
+create table sqlf_base(id int, val int) partition by list (id);
+create table sqlf_base_1 partition of sqlf_base for values in (1);
+create table sqlf_base_2 partition of sqlf_base for values in (2);
+create table sqlf_log(id int, note text);
+insert into sqlf_base values (1, 10);
+create rule sqlf_base_upd_log as
+on update to sqlf_base do also
+	insert into sqlf_log(id, note)
+	values (new.id, 'logged by rule');
+create or replace function sqlf_execprep_test(a int, v int)
+returns void
+language sql
+as $$
+	update sqlf_base set val = v where id = a;
+$$;
+select sqlf_execprep_test(1, 20);
+ sqlf_execprep_test 
+--------------------
+ 
+(1 row)
+
+select sqlf_execprep_test(1, 30);
+ sqlf_execprep_test 
+--------------------
+ 
+(1 row)
+
+select * from sqlf_base order by 1;
+ id | val 
+----+-----
+  1 |  30
+(1 row)
+
+select * from sqlf_log order by 1;
+ id |      note      
+----+----------------
+  1 | logged by rule
+  1 | logged by rule
+(2 rows)
+
+drop rule sqlf_base_upd_log on sqlf_base;
+drop table sqlf_base, sqlf_log;
+drop function sqlf_execprep_test;
 reset plan_cache_mode;
diff --git a/src/test/regress/sql/plancache.sql b/src/test/regress/sql/plancache.sql
index 139b4688fd6..b89c9ad69a4 100644
--- a/src/test/regress/sql/plancache.sql
+++ b/src/test/regress/sql/plancache.sql
@@ -273,4 +273,38 @@ drop table inval_during_pruning_p, inval_during_pruning_signal;
 drop function invalidate_plancache_func, stable_pruning_val;
 deallocate inval_during_pruning_q;
 
+-- exercise sql-function plan cache when rewrite expands a single statement
+-- into multiple planned statements. this forces cachedplan->stmt_list to
+-- contain more than one entry and checks that executor state for the first
+-- rewritten statement does not destroy state needed by the second one.
+
+set plan_cache_mode = force_generic_plan;
+
+create table sqlf_base(id int, val int) partition by list (id);
+create table sqlf_base_1 partition of sqlf_base for values in (1);
+create table sqlf_base_2 partition of sqlf_base for values in (2);
+create table sqlf_log(id int, note text);
+
+insert into sqlf_base values (1, 10);
+
+create rule sqlf_base_upd_log as
+on update to sqlf_base do also
+	insert into sqlf_log(id, note)
+	values (new.id, 'logged by rule');
+
+create or replace function sqlf_execprep_test(a int, v int)
+returns void
+language sql
+as $$
+	update sqlf_base set val = v where id = a;
+$$;
+
+select sqlf_execprep_test(1, 20);
+select sqlf_execprep_test(1, 30);
+select * from sqlf_base order by 1;
+select * from sqlf_log order by 1;
+
+drop rule sqlf_base_upd_log on sqlf_base;
+drop table sqlf_base, sqlf_log;
+drop function sqlf_execprep_test;
 reset plan_cache_mode;
-- 
2.47.3
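
Editor's sketch of the lockstep walk that both init_execution_state() and the release path of AcquireExecutorLocksUnpruned() rely on: one prep slot per statement, with NULL placeholders for statements that have no prep state. The definition of next_prep_estate() is not part of this excerpt, so the behavior modeled here (return the current slot and advance, NULL once exhausted) is an assumption. PrepState, PrepCursor, prep_cursor_next() and count_prepared() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an EState produced by the prep step. */
typedef struct PrepState
{
	int			id;
} PrepState;

/* Cursor over an array kept one-to-one with the statement list. */
typedef struct PrepCursor
{
	PrepState **items;		/* one slot per statement; NULL if no prep state */
	size_t		len;
	size_t		pos;
} PrepCursor;

/* Return the current slot (possibly NULL) and advance; NULL once exhausted. */
static PrepState *
prep_cursor_next(PrepCursor *cur)
{
	if (cur->pos >= cur->len)
		return NULL;
	return cur->items[cur->pos++];
}

/* Walk nstmts statements in lockstep and count those with prep state. */
static size_t
count_prepared(size_t nstmts, PrepCursor *cur)
{
	size_t		n = 0;

	for (size_t i = 0; i < nstmts; i++)
	{
		if (prep_cursor_next(cur) != NULL)
			n++;
	}
	return n;
}
```

The cursor is advanced exactly once per statement, including for NULL slots. That is what keeps the two lists aligned, and it is why the patch appends NULL for utility statements rather than skipping them.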



  [application/octet-stream] v8-0002-Introduce-ExecutorPrep-and-refactor-executor-star.patch (27.1K, 5-v8-0002-Introduce-ExecutorPrep-and-refactor-executor-star.patch)
From a2a0befc44d25df8b549644a7e179923270a0fc6 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Tue, 11 Nov 2025 21:47:46 +0900
Subject: [PATCH v8 2/5] Introduce ExecutorPrep and refactor executor startup

Factor permission checks, range table initialization, and initial
partition pruning out of InitPlan() into a new ExecutorPrep()
helper. ExecutorPrep() builds an EState containing the executor
metadata needed before plan execution, including partition
pruning state where partPruneInfos are present, and returns it
directly to the caller.

ExecutorStart() now checks if QueryDesc->estate is already set
(indicating ExecutorPrep() was called earlier). If so, it reuses
the EState to avoid redoing range table setup and pruning.
Otherwise, it invokes ExecutorPrep() itself and adopts the
resulting EState for the duration of the query. This keeps the
executor startup behavior unchanged while making the setup work
callable separately when needed.

CreateQueryDesc() grows a prep_estate argument to accept an
optionally pre-created EState and stores it in the QueryDesc.
Portals, SPI, SQL functions, and EXPLAIN are wired to carry
optional EState pointers alongside the PlannedStmt list, but most
callers still pass NULL and let ExecutorStart() perform the setup
lazily.

ExecutorPrep() requires the caller to have established an active
snapshot, as partition pruning expressions may call PL functions
that internally require one (e.g., via EnsurePortalSnapshotExists()).

Update executor/README and related comments to document the new
control flow and the separation between preparation and execution.

Note that as of this commit, ExecutorStart() is the only caller of
ExecutorPrep(), so there is no change in behavior. Later commits
will add callers that invoke ExecutorPrep() earlier to enable
pruning-aware locking in cached plans.
---
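Reviewer note (not part of the commit): the reuse-or-create behavior that the
commit message describes for ExecutorStart() can be sketched in isolation as
below. This is only a minimal model under stated assumptions: ToyEState,
ToyQueryDesc, and every toy_* function are hypothetical stand-ins invented for
illustration, not PostgreSQL APIs, and the real ExecutorPrep() does far more
(permission checks, range table setup, initial pruning).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are NOT the PostgreSQL structs. */
typedef struct ToyEState
{
    int         prepared;       /* set to 1 once preparation has run */
} ToyEState;

typedef struct ToyQueryDesc
{
    ToyEState  *estate;         /* NULL until some caller has prepared it */
} ToyQueryDesc;

/* Counts how often preparation work was actually performed. */
static int  toy_prep_count = 0;

/* Stand-in for ExecutorPrep(): does the one-time setup work. */
static ToyEState *
toy_executor_prep(void)
{
    ToyEState  *estate = calloc(1, sizeof(ToyEState));

    if (estate == NULL)
        abort();
    estate->prepared = 1;
    toy_prep_count++;
    return estate;
}

/* Stand-in for CreateQueryDesc() with the added prep_estate argument. */
static ToyQueryDesc
toy_create_query_desc(ToyEState *prep_estate)
{
    ToyQueryDesc qd;

    qd.estate = prep_estate;    /* may be NULL: prepare lazily at start */
    return qd;
}

/* Stand-in for ExecutorStart(): prepare only if nobody did so earlier. */
static void
toy_executor_start(ToyQueryDesc *qd)
{
    if (qd->estate == NULL)
        qd->estate = toy_executor_prep();
    assert(qd->estate != NULL && qd->estate->prepared);
}
```

In this model, a caller that passes NULL gets the old lazy behavior, while a
caller that prepared earlier has its state adopted without the setup being
repeated.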
 src/backend/commands/copyto.c       |   2 +-
 src/backend/commands/createas.c     |   2 +-
 src/backend/commands/explain.c      |   8 +-
 src/backend/commands/extension.c    |   2 +-
 src/backend/commands/matview.c      |   2 +-
 src/backend/commands/portalcmds.c   |   1 +
 src/backend/commands/prepare.c      |   9 +-
 src/backend/executor/README         |  11 +-
 src/backend/executor/execMain.c     | 164 +++++++++++++++++++++++-----
 src/backend/executor/execParallel.c |   3 +-
 src/backend/executor/functions.c    |   3 +-
 src/backend/executor/spi.c          |   9 +-
 src/backend/tcop/postgres.c         |   2 +
 src/backend/tcop/pquery.c           |  24 +++-
 src/backend/utils/mmgr/portalmem.c  |   2 +
 src/include/commands/explain.h      |   3 +-
 src/include/executor/execdesc.h     |   5 +-
 src/include/executor/executor.h     |  26 +++++
 src/include/nodes/execnodes.h       |   1 -
 src/include/utils/portal.h          |   2 +
 20 files changed, 229 insertions(+), 52 deletions(-)

diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index 499ce9ad3db..e09303491d2 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -877,7 +877,7 @@ BeginCopyTo(ParseState *pstate,
 		cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
-											dest, NULL, NULL, 0);
+											dest, NULL, NULL, 0, NULL);
 
 		/*
 		 * Call ExecutorStart to prepare the plan for execution.
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 270e9bf3110..b4a9808955a 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -336,7 +336,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
 		/* Create a QueryDesc, redirecting output to our tuple receiver */
 		queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
 									GetActiveSnapshot(), InvalidSnapshot,
-									dest, params, queryEnv, 0);
+									dest, params, queryEnv, 0, NULL);
 
 		/* call ExecutorStart to prepare the plan for execution */
 		ExecutorStart(queryDesc, GetIntoRelEFlags(into));
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 296ea8a1ed2..02027c429e1 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -372,7 +372,7 @@ standard_ExplainOneQuery(Query *query, int cursorOptions,
 	}
 
 	/* run it (if needed) and produce output */
-	ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+	ExplainOnePlan(plan, NULL, into, es, queryString, params, queryEnv,
 				   &planduration, (es->buffers ? &bufusage : NULL),
 				   es->memory ? &mem_counters : NULL);
 }
@@ -494,7 +494,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
  * to call it.
  */
 void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, EState *prep_estate,
+			   IntoClause *into, ExplainState *es,
 			   const char *queryString, ParamListInfo params,
 			   QueryEnvironment *queryEnv, const instr_time *planduration,
 			   const BufferUsage *bufusage,
@@ -552,7 +553,8 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
 	/* Create a QueryDesc for the query */
 	queryDesc = CreateQueryDesc(plannedstmt, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
-								dest, params, queryEnv, instrument_option);
+								dest, params, queryEnv, instrument_option,
+								prep_estate);
 
 	/* Select execution options */
 	if (es->analyze)
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index b98801d08f2..939e7a632f0 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -1174,7 +1174,7 @@ execute_sql_string(const char *sql, const char *filename)
 				qdesc = CreateQueryDesc(stmt,
 										sql,
 										GetActiveSnapshot(), NULL,
-										dest, NULL, NULL, 0);
+										dest, NULL, NULL, 0, NULL);
 
 				ExecutorStart(qdesc, 0);
 				ExecutorRun(qdesc, ForwardScanDirection, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 81a55a33ef2..2cdfdcf984b 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -439,7 +439,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
 	/* Create a QueryDesc, redirecting output to our tuple receiver */
 	queryDesc = CreateQueryDesc(plan, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
-								dest, NULL, NULL, 0);
+								dest, NULL, NULL, 0, NULL);
 
 	/* call ExecutorStart to prepare the plan for execution */
 	ExecutorStart(queryDesc, 0);
diff --git a/src/backend/commands/portalcmds.c b/src/backend/commands/portalcmds.c
index 01efac3319e..1e880a6d7c9 100644
--- a/src/backend/commands/portalcmds.c
+++ b/src/backend/commands/portalcmds.c
@@ -118,6 +118,7 @@ PerformCursorOpen(ParseState *pstate, DeclareCursorStmt *cstmt, ParamListInfo pa
 					  queryString,
 					  CMDTAG_SELECT,	/* cursor's query is always a SELECT */
 					  list_make1(plan),
+					  NIL,
 					  NULL);
 
 	/*----------
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 876aad2100a..c7bab14b633 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -207,6 +207,7 @@ ExecuteQuery(ParseState *pstate,
 					  query_string,
 					  entry->plansource->commandTag,
 					  plan_list,
+					  NIL,
 					  cplan);
 
 	/*
@@ -577,7 +578,9 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	const char *query_string;
 	CachedPlan *cplan;
 	List	   *plan_list;
+	List	   *prep_estates;
 	ListCell   *p;
+	ListCell   *prep_lc;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
 	instr_time	planstart;
@@ -652,14 +655,18 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	}
 
 	plan_list = cplan->stmt_list;
+	prep_estates = NIL;
 
 	/* Explain each query */
+	prep_lc = list_head(prep_estates);
 	foreach(p, plan_list)
 	{
 		PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
+		EState *prep_estate = next_prep_estate(prep_estates, &prep_lc);
 
 		if (pstmt->commandType != CMD_UTILITY)
-			ExplainOnePlan(pstmt, into, es, query_string, paramLI, pstate->p_queryEnv,
+			ExplainOnePlan(pstmt, prep_estate,
+						   into, es, query_string, paramLI, pstate->p_queryEnv,
 						   &planduration, (es->buffers ? &bufusage : NULL),
 						   es->memory ? &mem_counters : NULL);
 		else
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 54f4782f31b..d749ceb6687 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -291,11 +291,18 @@ Query Processing Control Flow
 
 This is a sketch of control flow for full query processing:
 
+	ExecutorPrep
+		May be run before ExecutorStart (e.g., for plan validation), or
+		implicitly from ExecutorStart if not done earlier.  Creates EState,
+		performs range table initialization, permission checks, and initial
+		partition pruning.  Returns the EState that ExecutorStart() should
+		reuse.
+
 	CreateQueryDesc
 
 	ExecutorStart
-		CreateExecutorState
-			creates per-query context
+		ExecutorPrep (if not already done, indicated by NULL QueryDesc.estate)
+			creates EState and per-query context
 		switch to per-query context to run ExecInitNode
 		AfterTriggerBeginQuery
 		ExecInitNode --- recursively scans plan tree
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index c58a2abe9a7..0f95ad88497 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -57,6 +57,7 @@
 #include "parser/parse_relation.h"
 #include "pgstat.h"
 #include "rewrite/rewriteHandler.h"
+#include "storage/lmgr.h"
 #include "tcop/utility.h"
 #include "utils/acl.h"
 #include "utils/backend_status.h"
@@ -147,7 +148,6 @@ standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
 
 	/* sanity checks: queryDesc must not be started already */
 	Assert(queryDesc != NULL);
-	Assert(queryDesc->estate == NULL);
 
 	/* caller must ensure the query's snapshot is active */
 	Assert(GetActiveSnapshot() == queryDesc->snapshot);
@@ -173,9 +173,70 @@ standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
 
 	/*
 	 * Build EState, switch into per-query memory context for startup.
-	 */
-	estate = CreateExecutorState();
-	queryDesc->estate = estate;
+	 *
+	 * If ExecutorPrep() ran earlier (e.g., to do initial pruning during plan
+	 * validity checking), reuse its EState to avoid redoing range table setup
+	 * and pruning. Otherwise, create a fresh EState as usual.
+	 *
+	 * In assert builds, verify that the expected locks are held.  When
+	 * no prep EState was provided, AcquireExecutorLocks() should have
+	 * locked every relation in the plan.  When one was provided,
+	 * pruning-aware locking should have locked at least the unpruned
+	 * relations.  Both checks are skipped in parallel workers, which
+	 * acquire relation locks lazily in ExecGetRangeTableRelation().
+	 */
+	if (queryDesc->estate == NULL)
+	{
+#ifdef USE_ASSERT_CHECKING
+		if (!IsParallelWorker())
+		{
+			ListCell   *lc;
+
+			foreach(lc, queryDesc->plannedstmt->rtable)
+			{
+				RangeTblEntry *rte = lfirst_node(RangeTblEntry, lc);
+
+				if (rte->rtekind == RTE_RELATION ||
+					(rte->rtekind == RTE_SUBQUERY && rte->relid != InvalidOid))
+					Assert(CheckRelationOidLockedByMe(rte->relid,
+													  rte->rellockmode,
+													  true));
+			}
+		}
+#endif
+		queryDesc->estate = ExecutorPrep(queryDesc->plannedstmt,
+										 queryDesc->params,
+										 CurrentResourceOwner,
+										 eflags);
+	}
+#ifdef USE_ASSERT_CHECKING
+	else
+	{
+		/*
+		 * A prep EState was provided, meaning pruning-aware locking
+		 * should have locked at least the unpruned relations.
+		 */
+		if (!IsParallelWorker())
+		{
+			int		rtindex = -1;
+
+			while ((rtindex = bms_next_member(queryDesc->estate->es_unpruned_relids,
+											  rtindex)) >= 0)
+			{
+				RangeTblEntry *rte = exec_rt_fetch(rtindex, queryDesc->estate);
+
+				Assert(rte->rtekind == RTE_RELATION ||
+					   (rte->rtekind == RTE_SUBQUERY &&
+						rte->relid != InvalidOid));
+				Assert(CheckRelationOidLockedByMe(rte->relid,
+												  rte->rellockmode, true));
+			}
+		}
+	}
+#endif
+
+	estate = queryDesc->estate;
+	Assert(estate);
 
 	oldcontext = MemoryContextSwitchTo(estate->es_query_cxt);
 
@@ -265,6 +326,73 @@ standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
 	MemoryContextSwitchTo(oldcontext);
 }
 
+/*
+ * ExecutorPrep: prepare executor state for a PlannedStmt outside ExecutorStart.
+ *
+ * Performs range table initialization, permission checks, and initial
+ * partition pruning if partPruneInfos are present.
+ *
+ * Returns NULL for utility statements.  Otherwise returns an EState that the
+ * caller must either pass to ExecutorStart() for reuse or free with
+ * FreeExecutorState() if execution will not proceed.  GetCachedPlan() uses it
+ * to decide which partitions to lock after pruning; if that EState is not
+ * delivered to ExecutorStart(), the executor would run on unlocked relations.
+ */
+EState *
+ExecutorPrep(PlannedStmt *pstmt, ParamListInfo params, ResourceOwner owner,
+			 int eflags)
+{
+	ResourceOwner oldowner;
+	EState *estate;
+
+	if (pstmt->commandType == CMD_UTILITY)
+		return NULL;
+
+	/* Caller must have established an active snapshot. */
+	Assert(ActiveSnapshotSet());
+
+	estate = CreateExecutorState();
+	estate->es_plannedstmt = pstmt;
+	estate->es_part_prune_infos = pstmt->partPruneInfos;
+	estate->es_param_list_info = params;
+	estate->es_top_eflags = eflags;
+
+	/*
+	 * Do permissions checks.
+	 */
+	ExecCheckPermissions(pstmt->rtable, pstmt->permInfos, true);
+
+	/*
+	 * Initialize range table.
+	 */
+	ExecInitRangeTable(estate, pstmt->rtable, pstmt->permInfos,
+					   bms_copy(pstmt->unprunableRelids));
+
+	/*
+	 * Ensure locks taken during initial pruning are tracked under the given
+	 * ResourceOwner (e.g., one associated with CachedPlan validation).
+	 */
+	oldowner = CurrentResourceOwner;
+	CurrentResourceOwner = owner;
+
+	/*
+	 * Set up PartitionPruneState structures needed for both initial and
+	 * runtime partition pruning. These structures are built from the
+	 * PartitionPruneInfo entries in the plan tree.
+	 *
+	 * Also perform initial pruning to compute the subset of child subplans
+	 * that will be executed. The results, which are bitmapsets of selected
+	 * child indexes, are saved in es_part_prune_results. This list is parallel
+	 * to es_part_prune_infos.
+	 */
+	ExecCreatePartitionPruneStates(estate);
+	ExecDoInitialPruning(estate);
+
+	CurrentResourceOwner = oldowner;
+
+	return estate;
+}
+
 /* ----------------------------------------------------------------
  *		ExecutorRun
  *
@@ -840,38 +968,14 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 	CmdType		operation = queryDesc->operation;
 	PlannedStmt *plannedstmt = queryDesc->plannedstmt;
 	Plan	   *plan = plannedstmt->planTree;
-	List	   *rangeTable = plannedstmt->rtable;
 	EState	   *estate = queryDesc->estate;
 	PlanState  *planstate;
 	TupleDesc	tupType;
 	ListCell   *l;
 	int			i;
 
-	/*
-	 * Do permissions checks
-	 */
-	ExecCheckPermissions(rangeTable, plannedstmt->permInfos, true);
-
-	/*
-	 * initialize the node's execution state
-	 */
-	ExecInitRangeTable(estate, rangeTable, plannedstmt->permInfos,
-					   bms_copy(plannedstmt->unprunableRelids));
-
-	estate->es_plannedstmt = plannedstmt;
-	estate->es_part_prune_infos = plannedstmt->partPruneInfos;
-
-	/*
-	 * Perform runtime "initial" pruning to identify which child subplans,
-	 * corresponding to the children of plan nodes that contain
-	 * PartitionPruneInfo such as Append, will not be executed. The results,
-	 * which are bitmapsets of indexes of the child subplans that will be
-	 * executed, are saved in es_part_prune_results.  These results correspond
-	 * to each PartitionPruneInfo entry, and the es_part_prune_results list is
-	 * parallel to es_part_prune_infos.
-	 */
-	ExecCreatePartitionPruneStates(estate);
-	ExecDoInitialPruning(estate);
+	/* ExecutorPrep() must have been done. */
+	Assert(queryDesc->estate);
 
 	/*
 	 * Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index ac84af294c9..024780d3516 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -1300,7 +1300,8 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 	return CreateQueryDesc(pstmt,
 						   queryString,
 						   GetActiveSnapshot(), InvalidSnapshot,
-						   receiver, paramLI, NULL, instrument_options);
+						   receiver, paramLI, NULL, instrument_options,
+						   NULL);
 }
 
 /*
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index 88109348817..952a784c924 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -1369,7 +1369,8 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
 							 dest,
 							 fcache->paramLI,
 							 es->qd ? es->qd->queryEnv : NULL,
-							 0);
+							 0,
+							 NULL);
 
 	/* Utility commands don't need Executor. */
 	if (es->qd->operation != CMD_UTILITY)
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 52f3b11301c..380bbc44e97 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1686,6 +1686,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 					  query_string,
 					  plansource->commandTag,
 					  stmt_list,
+					  NIL,
 					  cplan);
 
 	/*
@@ -2500,6 +2501,8 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 		CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
 		List	   *stmt_list;
 		ListCell   *lc2;
+		List	   *prep_estates;
+		ListCell   *prep_lc;
 
 		spicallbackarg.query = plansource->query_string;
 
@@ -2578,6 +2581,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 							  plan_owner, _SPI_current->queryEnv);
 
 		stmt_list = cplan->stmt_list;
+		prep_estates = NIL;
 
 		/*
 		 * If we weren't given a specific snapshot to use, and the statement
@@ -2615,9 +2619,11 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 			}
 		}
 
+		prep_lc = list_head(prep_estates);
 		foreach(lc2, stmt_list)
 		{
 			PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+			EState *prep_estate = next_prep_estate(prep_estates, &prep_lc);
 			bool		canSetTag = stmt->canSetTag;
 			DestReceiver *dest;
 
@@ -2695,7 +2701,8 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 										dest,
 										options->params,
 										_SPI_current->queryEnv,
-										0);
+										0,
+										prep_estate);
 				res = _SPI_pquery(qdesc, fire_triggers,
 								  canSetTag ? options->tcount : 0);
 				FreeQueryDesc(qdesc);
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index b3563113219..355a490cde9 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1231,6 +1231,7 @@ exec_simple_query(const char *query_string)
 						  query_string,
 						  commandTag,
 						  plantree_list,
+						  NIL,
 						  NULL);
 
 		/*
@@ -2030,6 +2031,7 @@ exec_bind_message(StringInfo input_message)
 					  query_string,
 					  psrc->commandTag,
 					  cplan->stmt_list,
+					  NIL,
 					  cplan);
 
 	/* Portal is defined, set the plan ID based on its contents. */
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index d8fc75d0bb9..b18266487bb 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -37,6 +37,7 @@ Portal		ActivePortal = NULL;
 
 
 static void ProcessQuery(PlannedStmt *plan,
+						 EState *prep_estate,
 						 const char *sourceText,
 						 ParamListInfo params,
 						 QueryEnvironment *queryEnv,
@@ -72,7 +73,8 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
 				DestReceiver *dest,
 				ParamListInfo params,
 				QueryEnvironment *queryEnv,
-				int instrument_options)
+				int instrument_options,
+				EState *prep_estate)
 {
 	QueryDesc  *qd = palloc_object(QueryDesc);
 
@@ -93,6 +95,9 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
 	qd->planstate = NULL;
 	qd->totaltime = NULL;
 
+	/* Use the EState created by ExecutorPrep() if already done. */
+	qd->estate = prep_estate;
+
 	/* not yet executed */
 	qd->already_executed = false;
 
@@ -123,6 +128,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  *		PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
  *
  *	plan: the plan tree for the query
+ *	prep_estate: EState created in ExecutorPrep() for the query, if any
  *	sourceText: the source text of the query
  *	params: any parameters needed
  *	dest: where to send results
@@ -135,6 +141,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  */
 static void
 ProcessQuery(PlannedStmt *plan,
+			 EState *prep_estate,
 			 const char *sourceText,
 			 ParamListInfo params,
 			 QueryEnvironment *queryEnv,
@@ -148,7 +155,8 @@ ProcessQuery(PlannedStmt *plan,
 	 */
 	queryDesc = CreateQueryDesc(plan, sourceText,
 								GetActiveSnapshot(), InvalidSnapshot,
-								dest, params, queryEnv, 0);
+								dest, params, queryEnv, 0,
+								prep_estate);
 
 	/*
 	 * Call ExecutorStart to prepare the plan for execution
@@ -495,7 +503,10 @@ PortalStart(Portal portal, ParamListInfo params,
 											None_Receiver,
 											params,
 											portal->queryEnv,
-											0);
+											0,
+											portal->prep_estates ?
+											(EState *) linitial(portal->prep_estates) :
+											NULL);
 
 				/*
 				 * If it's a scrollable cursor, executor needs to support
@@ -1185,6 +1196,7 @@ PortalRunMulti(Portal portal,
 {
 	bool		active_snapshot_set = false;
 	ListCell   *stmtlist_item;
+	ListCell   *prep_lc;
 
 	/*
 	 * If the destination is DestRemoteExecute, change to DestNone.  The
@@ -1205,9 +1217,11 @@ PortalRunMulti(Portal portal,
 	 * Loop to handle the individual queries generated from a single parsetree
 	 * by analysis and rewrite.
 	 */
+	prep_lc = list_head(portal->prep_estates);
 	foreach(stmtlist_item, portal->stmts)
 	{
 		PlannedStmt *pstmt = lfirst_node(PlannedStmt, stmtlist_item);
+		EState *prep_estate = next_prep_estate(portal->prep_estates, &prep_lc);
 
 		/*
 		 * If we got a cancel signal in prior command, quit
@@ -1265,7 +1279,7 @@ PortalRunMulti(Portal portal,
 			if (pstmt->canSetTag)
 			{
 				/* statement can set tag string */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, prep_estate,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
@@ -1274,7 +1288,7 @@ PortalRunMulti(Portal portal,
 			else
 			{
 				/* stmt added by rewrite cannot set tag */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, prep_estate,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index 493f9b0ee19..443b583637c 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -286,6 +286,7 @@ PortalDefineQuery(Portal portal,
 				  const char *sourceText,
 				  CommandTag commandTag,
 				  List *stmts,
+				  List *prep_estates,
 				  CachedPlan *cplan)
 {
 	Assert(PortalIsValid(portal));
@@ -299,6 +300,7 @@ PortalDefineQuery(Portal portal,
 	portal->commandTag = commandTag;
 	SetQueryCompletion(&portal->qc, commandTag, 0);
 	portal->stmts = stmts;
+	portal->prep_estates = prep_estates;
 	portal->cplan = cplan;
 	portal->status = PORTAL_DEFINED;
 }
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 472e141bba3..71ebe38bc86 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -64,7 +64,8 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
 							  ExplainState *es, ParseState *pstate,
 							  ParamListInfo params);
 
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt, EState *prep_estate,
+						   IntoClause *into,
 						   ExplainState *es, const char *queryString,
 						   ParamListInfo params, QueryEnvironment *queryEnv,
 						   const instr_time *planduration,
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index d3a57242844..3a2169c9613 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -43,7 +43,7 @@ typedef struct QueryDesc
 	QueryEnvironment *queryEnv; /* query environment passed in */
 	int			instrument_options; /* OR of InstrumentOption flags */
 
-	/* These fields are set by ExecutorStart */
+	/* These fields are set by ExecutorStart or ExecutorPrep */
 	TupleDesc	tupDesc;		/* descriptor for result tuples */
 	EState	   *estate;			/* executor's query-wide state */
 	PlanState  *planstate;		/* tree of per-plan-node state */
@@ -63,7 +63,8 @@ extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
 								  DestReceiver *dest,
 								  ParamListInfo params,
 								  QueryEnvironment *queryEnv,
-								  int instrument_options);
+								  int instrument_options,
+								  EState *prep_estate);
 
 extern void FreeQueryDesc(QueryDesc *qdesc);
 
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index 064df01811e..24604120c27 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -21,6 +21,7 @@
 #include "nodes/lockoptions.h"
 #include "nodes/parsenodes.h"
 #include "utils/memutils.h"
+#include "utils/resowner.h"
 
 
 /*
@@ -235,6 +236,31 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
  */
 extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
 extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
+
+extern EState *ExecutorPrep(PlannedStmt *pstmt,
+							ParamListInfo params,
+							ResourceOwner owner,
+							int eflags);
+
+/*
+ * Walk a prep_estates list in step with a parallel stmt_list iteration.
+ * Returns the next EState (or NULL) and advances *lc.
+ *
+ * Safe when prep_estates is NIL; just returns NULL for every call.
+ */
+static inline EState *
+next_prep_estate(List *prep_estates, ListCell **lc)
+{
+	EState *result = NULL;
+
+	if (*lc != NULL)
+	{
+		result = (EState *) lfirst(*lc);
+		*lc = lnext(prep_estates, *lc);
+	}
+	return result;
+}
+
 extern void ExecutorRun(QueryDesc *queryDesc,
 						ScanDirection direction, uint64 count);
 extern void standard_ExecutorRun(QueryDesc *queryDesc,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 0716c5a9aed..42d75693d43 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -784,7 +784,6 @@ typedef struct EState
 	List	   *es_insert_pending_modifytables;
 } EState;
 
-
 /*
  * ExecRowMark -
  *	   runtime representation of FOR [KEY] UPDATE/SHARE clauses
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index a7bedb12c18..f69b4b9b479 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -137,6 +137,7 @@ typedef struct PortalData
 	CommandTag	commandTag;		/* command tag for original query */
 	QueryCompletion qc;			/* command completion data for executed query */
 	List	   *stmts;			/* list of PlannedStmts */
+	List	   *prep_estates;	/* list of EStates where needed */
 	CachedPlan *cplan;			/* CachedPlan, if stmts are from one */
 
 	ParamListInfo portalParams; /* params to pass to query */
@@ -240,6 +241,7 @@ extern void PortalDefineQuery(Portal portal,
 							  const char *sourceText,
 							  CommandTag commandTag,
 							  List *stmts,
+							  List *prep_estates,
 							  CachedPlan *cplan);
 extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
 extern void PortalCreateHoldStore(Portal portal);
-- 
2.47.3
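
Reviewer note: the next_prep_estate() helper added to executor.h walks an
optional list of prepared states in step with the statement list, yielding
NULL once the optional list is empty or exhausted. The sketch below models
that pattern on its own. It assumes a simple singly linked ToyCell instead of
PostgreSQL's List, and toy_next_prep() and toy_count_prepared() are
hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical singly linked cell; PostgreSQL's List is array-based. */
typedef struct ToyCell
{
    void       *value;
    struct ToyCell *next;
} ToyCell;

/*
 * Return the value at *cell and advance the cursor, or return NULL when the
 * cursor is already NULL (empty or exhausted list).
 */
static void *
toy_next_prep(ToyCell **cell)
{
    void       *result = NULL;

    if (*cell != NULL)
    {
        result = (*cell)->value;
        *cell = (*cell)->next;
    }
    return result;
}

/*
 * Iterate over nstmts statements while pulling from the optional prepared
 * list; count how many statements received a non-NULL prepared state.
 */
static int
toy_count_prepared(int nstmts, ToyCell *prep_head)
{
    ToyCell    *cursor = prep_head;
    int         found = 0;

    for (int i = 0; i < nstmts; i++)
    {
        if (toy_next_prep(&cursor) != NULL)
            found++;
    }
    return found;
}
```

The point of the design is that callers passing an empty list need no special
case: every statement simply sees NULL and falls back to lazy preparation.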



  [application/octet-stream] v8-0001-Refactor-partition-pruning-initialization-for-cla.patch (10.2K, 6-v8-0001-Refactor-partition-pruning-initialization-for-cla.patch)
From a79af61882f1ff696d46f612a5b3a8ce50ee75d6 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Tue, 10 Feb 2026 15:08:52 +0900
Subject: [PATCH v8 1/5] Refactor partition pruning initialization for clarity
 and modularity

Move the creation of PartitionPruneState structures out of
ExecDoInitialPruning() into a new ExecCreatePartitionPruneStates()
function. This separates the setup of pruning state from the execution
of initial pruning logic, making the code clearer and easier to
maintain.

Simplify handling of unpruned relids by moving responsibility
for recording them in EState into CreatePartitionPruneState(),
avoiding the need to pass all_leafpart_rtis as an out parameter.

Also move the setting of ecxt_param_exec_vals from
CreatePartitionPruneState() to InitExecPartitionPruneContexts(),
so that the former can be called before the PARAM_EXEC
parameters have been set up.

This refactoring allows callers to reuse the pruning setup logic
without always triggering pruning, a capability useful for future use
cases that may only need metadata initialization.
---
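Reviewer note (not part of the commit): the split described above, with state
creation in one function and initial pruning in another, can be modeled as two
phases over the same state list. The sketch is a toy under stated assumptions:
ToyPruneState, ToyEState, and the toy_* functions are hypothetical, bitmasks
stand in for Bitmapsets, and the "keep everything when not pruning" result is
a simplification rather than what ExecDoInitialPruning() stores.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_INFOS 8

/* Hypothetical per-node pruning state; bitmasks stand in for Bitmapsets. */
typedef struct ToyPruneState
{
    unsigned    all_parts;          /* every child subplan of the node */
    int         do_initial_prune;   /* nonzero if initial pruning applies */
    unsigned    keep_mask;          /* children that survive pruning */
} ToyPruneState;

typedef struct ToyEState
{
    ToyPruneState states[TOY_MAX_INFOS];
    int         nstates;
    unsigned    results[TOY_MAX_INFOS];
    int         nresults;
} ToyEState;

/* Phase 1: build the states only; no pruning is performed here. */
static void
toy_create_prune_states(ToyEState *estate,
                        const ToyPruneState *infos, int ninfos)
{
    for (int i = 0; i < ninfos && estate->nstates < TOY_MAX_INFOS; i++)
        estate->states[estate->nstates++] = infos[i];
}

/* Phase 2: walk the states built earlier and record one result per state. */
static void
toy_do_initial_pruning(ToyEState *estate)
{
    /* Results must not have been computed already. */
    assert(estate->nresults == 0);

    for (int i = 0; i < estate->nstates; i++)
    {
        const ToyPruneState *ps = &estate->states[i];

        estate->results[estate->nresults++] =
            ps->do_initial_prune ? (ps->all_parts & ps->keep_mask)
                                 : ps->all_parts;
    }
}
```

A caller that needs only the metadata can stop after the first phase, which is
the capability the last paragraph of the commit message refers to.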
 src/backend/executor/execMain.c      |   1 +
 src/backend/executor/execPartition.c | 103 +++++++++++++++++++--------
 src/include/executor/execPartition.h |   1 +
 3 files changed, 74 insertions(+), 31 deletions(-)

diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 58b84955c2b..c58a2abe9a7 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -870,6 +870,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 	 * to each PartitionPruneInfo entry, and the es_part_prune_results list is
 	 * parallel to es_part_prune_infos.
 	 */
+	ExecCreatePartitionPruneStates(estate);
 	ExecDoInitialPruning(estate);
 
 	/*
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index d96d4f9947b..feea9fdfde0 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -185,8 +185,7 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
 static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
 static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
 static PartitionPruneState *CreatePartitionPruneState(EState *estate,
-													  PartitionPruneInfo *pruneinfo,
-													  Bitmapset **all_leafpart_rtis);
+													  PartitionPruneInfo *pruneinfo);
 static void InitPartitionPruneContext(PartitionPruneContext *context,
 									  List *pruning_steps,
 									  PartitionDesc partdesc,
@@ -1943,6 +1942,9 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *
  * Functions:
  *
+ * ExecCreatePartitionPruneStates:
+ *		Create PartitionPruneState for all PartitionPruneInfos in the EState
+ *
  * ExecDoInitialPruning:
  *		Perform runtime "initial" pruning, if necessary, to determine the set
  *		of child subnodes that need to be initialized during ExecInitNode() for
@@ -1967,6 +1969,29 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *-------------------------------------------------------------------------
  */
 
+/*
+ * ExecCreatePartitionPruneStates
+ *
+ * Create a PartitionPruneState for each PartitionPruneInfo in the estate,
+ * and save them in estate->es_part_prune_states. This setup is required
+ * before any initial or runtime pruning can occur.
+ */
+void
+ExecCreatePartitionPruneStates(EState *estate)
+{
+	ListCell   *lc;
+
+	foreach(lc, estate->es_part_prune_infos)
+	{
+		PartitionPruneInfo *pruneinfo = lfirst_node(PartitionPruneInfo, lc);
+		PartitionPruneState *prunestate;
+
+		/* Create and save the PartitionPruneState. */
+		prunestate = CreatePartitionPruneState(estate, pruneinfo);
+		estate->es_part_prune_states = lappend(estate->es_part_prune_states,
+											   prunestate);
+	}
+}
 
 /*
  * ExecDoInitialPruning
@@ -1974,11 +1999,11 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *		of child subnodes that need to be initialized during ExecInitNode() for
  *		plan nodes that support partition pruning.
  *
- * This function iterates over each PartitionPruneInfo entry in
- * estate->es_part_prune_infos. For each entry, it creates a PartitionPruneState
- * and adds it to es_part_prune_states.  ExecInitPartitionExecPruning() accesses
+ * This function iterates over each PartitionPruneState in
+ * estate->es_part_prune_states, which must have been populated earlier by
+ * ExecCreatePartitionPruneStates(). ExecInitPartitionExecPruning() accesses
  * these states through their corresponding indexes in es_part_prune_states and
- * assign each state to the parent node's PlanState, from where it will be used
+ * assigns each state to the parent node's PlanState, from where it will be used
  * for "exec" pruning.
  *
  * If initial pruning steps exist for a PartitionPruneInfo entry, this function
@@ -1996,20 +2021,13 @@ ExecDoInitialPruning(EState *estate)
 {
 	ListCell   *lc;
 
-	foreach(lc, estate->es_part_prune_infos)
+	Assert(estate->es_part_prune_results == NULL);
+	foreach(lc, estate->es_part_prune_states)
 	{
-		PartitionPruneInfo *pruneinfo = lfirst_node(PartitionPruneInfo, lc);
-		PartitionPruneState *prunestate;
+		PartitionPruneState *prunestate = (PartitionPruneState *) lfirst(lc);
 		Bitmapset  *validsubplans = NULL;
-		Bitmapset  *all_leafpart_rtis = NULL;
 		Bitmapset  *validsubplan_rtis = NULL;
 
-		/* Create and save the PartitionPruneState. */
-		prunestate = CreatePartitionPruneState(estate, pruneinfo,
-											   &all_leafpart_rtis);
-		estate->es_part_prune_states = lappend(estate->es_part_prune_states,
-											   prunestate);
-
 		/*
 		 * Perform initial pruning steps, if any, and save the result
 		 * bitmapset or NULL as described in the header comment.
@@ -2017,8 +2035,6 @@ ExecDoInitialPruning(EState *estate)
 		if (prunestate->do_initial_prune)
 			validsubplans = ExecFindMatchingSubPlans(prunestate, true,
 													 &validsubplan_rtis);
-		else
-			validsubplan_rtis = all_leafpart_rtis;
 
 		estate->es_unpruned_relids = bms_add_members(estate->es_unpruned_relids,
 													 validsubplan_rtis);
@@ -2136,14 +2152,12 @@ ExecInitPartitionExecPruning(PlanState *planstate,
  * parent plan node's PlanState.
  *
  * If initial pruning steps are to be skipped (e.g., during EXPLAIN
- * (GENERIC_PLAN)), *all_leafpart_rtis will be populated with the RT indexes of
- * all leaf partitions whose scanning subnode is included in the parent plan
- * node's list of child plans. The caller must add these RT indexes to
- * estate->es_unpruned_relids.
+ * (GENERIC_PLAN)), the RT indexes of all leaf partitions whose scanning
+ * subnode is included in the parent plan node's list of child plans are
+ * added to estate->es_unpruned_relids.
  */
 static PartitionPruneState *
-CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo,
-						  Bitmapset **all_leafpart_rtis)
+CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo)
 {
 	PartitionPruneState *prunestate;
 	int			n_part_hierarchies;
@@ -2377,8 +2391,8 @@ CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo,
 													   pinfo->execparamids);
 
 			/*
-			 * Return all leaf partition indexes if we're skipping pruning in
-			 * the EXPLAIN (GENERIC_PLAN) case.
+			 * Add all leaf partition indexes to es_unpruned_relids if we're
+			 * skipping pruning in the EXPLAIN (GENERIC_PLAN) case.
 			 */
 			if (pinfo->initial_pruning_steps && !prunestate->do_initial_prune)
 			{
@@ -2390,10 +2404,29 @@ CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo,
 					Index		rtindex = pprune->leafpart_rti_map[part_index];
 
 					if (rtindex)
-						*all_leafpart_rtis = bms_add_member(*all_leafpart_rtis,
-															rtindex);
+						estate->es_unpruned_relids =
+							bms_add_member(estate->es_unpruned_relids, rtindex);
 				}
 			}
+			else if (pinfo->initial_pruning_steps == NIL)
+			{
+				/*
+				 * All partitions better be present in es_unpruned_relids when
+				 * none are initially prunable.
+				 */
+#ifdef USE_ASSERT_CHECKING
+				int			part_index = -1;
+
+				while ((part_index = bms_next_member(pprune->present_parts,
+													 part_index)) >= 0)
+				{
+					Index		rtindex = pprune->leafpart_rti_map[part_index];
+
+					if (rtindex)
+						Assert(bms_is_member(rtindex, estate->es_unpruned_relids));
+				}
+#endif
+			}
 
 			j++;
 		}
@@ -2490,9 +2523,10 @@ InitPartitionPruneContext(PartitionPruneContext *context,
  *		Initialize exec pruning contexts deferred by CreatePartitionPruneState()
  *
  * This function finalizes exec pruning setup for a PartitionPruneState by
- * initializing contexts for pruning steps that require the parent plan's
- * PlanState. It iterates over PartitionPruningData entries and sets up the
- * necessary execution contexts for pruning during query execution.
+ * initializing contexts for pruning steps that require PARAM_EXEC parameters
+ * and the parent plan's PlanState. It iterates over PartitionPruningData
+ * entries and sets up the necessary execution contexts for pruning during
+ * query execution.
  *
  * Also fix the mapping of partition indexes to subplan indexes contained in
  * prunestate by considering the new list of subplans that survived initial
@@ -2520,9 +2554,16 @@ InitExecPartitionPruneContexts(PartitionPruneState *prunestate,
 	bool		fix_subplan_map = false;
 
 	Assert(prunestate->do_exec_prune);
+	Assert(prunestate->econtext);
 	Assert(parent_plan != NULL);
 	estate = parent_plan->state;
 
+	/*
+	 * These might not be available when ExecCreatePartitionPruneStates() is
+	 * called.
+	 */
+	prunestate->econtext->ecxt_param_exec_vals = estate->es_param_exec_vals;
+
 	/*
 	 * No need to fix subplans maps if initial pruning didn't eliminate any
 	 * subplans.
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 82063ec2a16..4c96808c376 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -130,6 +130,7 @@ typedef struct PartitionPruneState
 	PartitionPruningData *partprunedata[FLEXIBLE_ARRAY_MEMBER];
 } PartitionPruneState;
 
+extern void ExecCreatePartitionPruneStates(EState *estate);
 extern void ExecDoInitialPruning(EState *estate);
 extern PartitionPruneState *ExecInitPartitionExecPruning(PlanState *planstate,
 														 int n_total_subplans,
-- 
2.47.3




* Re: generic plans and "initial" pruning
@ 2026-03-25 07:39  Amit Langote <[email protected]>
  parent: Amit Langote <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: Amit Langote @ 2026-03-25 07:39 UTC (permalink / raw)
  To: Chao Li <[email protected]>; +Cc: Tom Lane <[email protected]>; Tender Wang <[email protected]>; Alexander Lakhin <[email protected]>; Tomas Vondra <[email protected]>; Robert Haas <[email protected]>; Alvaro Herrera <[email protected]>; Andres Freund <[email protected]>; Daniel Gustafsson <[email protected]>; David Rowley <[email protected]>; pgsql-hackers; Thom Brown <[email protected]>

On Fri, Mar 20, 2026 at 2:20 AM Amit Langote <[email protected]> wrote:
> On Mon, Mar 9, 2026 at 1:41 PM Amit Langote <[email protected]> wrote:
> > On Sat, Mar 7, 2026 at 6:54 PM Amit Langote <[email protected]> wrote:
> > > Attached is v6 of the patch series. I've been working toward
> > > committing this, so I wanted to lay out the ExecutorPrep() design and
> > > the key trade-offs before doing so.
> > >
> > > When a cached generic plan references a partitioned table,
> > > GetCachedPlan() locks all partitions upfront via
> > > AcquireExecutorLocks(), even those that initial pruning will
> > > eliminate.  But initial partition pruning only runs later during
> > > ExecutorStart(). Moving pruning earlier requires some executor setup
> > > (range table, permissions, pruning state), and ExecutorPrep() is the
> > > vehicle for that.  Unlike the approach reverted last May, this
> > > keeps the CachedPlan itself unchanged -- all per-execution state flows
> > > through a separate CachedPlanPrepData that the caller provides.
> > >
> > > The approach also keeps GetCachedPlan()'s interface
> > > backward-compatible: the new CachedPlanPrepData argument is optional.
> > > If a caller passes NULL, all partitions are locked as before and
> > > nothing changes. This means existing callers and any new code that
> > > calls GetCachedPlan() without caring about pruning-aware locking just
> > > works.
> > >
> > > The risk is on the other side: if a caller does pass a
> > > CachedPlanPrepData, GetCachedPlan() will lock only the surviving
> > > partitions and populate prep_estates with the EStates that
> > > ExecutorPrep() created. The caller then must make those EStates
> > > available to ExecutorStart() -- via QueryDesc->estate,
> > > portal->prep_estates, or the equivalent path for SPI and SQL
> > > functions. If it fails to do so, ExecutorStart() will call
> > > ExecutorPrep() again, which may compute different pruning results than
> > > the original call, potentially expecting locks on relations that were
> > > never acquired. The executor would then operate on relations it
> > > doesn't hold locks on.
> > >
> > > So the contract is: if you opt in to pruning-aware locking by passing
> > > CachedPlanPrepData, you must complete the pipeline by delivering the
> > > prep EStates to the executor. In the current patch, all the call sites
> > > that pass a CachedPlanPrepData (portals, SPI, EXECUTE, SQL functions,
> > > EXPLAIN) do thread the EStates through correctly, and I've tried to
> > > make the plumbing straightforward enough that it's hard to get wrong.
> > > But it is a new invariant that didn't exist before, and a caller that
> > > gets it wrong would fail silently rather than with an obvious error.
> > >
> > > To catch such violations, I've added a debug-only check in
> > > standard_ExecutorStart() that fires when no prep EState was provided.
> > > It iterates over the plan's rtable and verifies that every lockable
> > > relation is actually locked.  It should always be true if
> > > AcquireExecutorLocks() locked everything, but would fail if
> > > pruning-aware locking happened upstream and the caller dropped the
> > > prep EState. The check is skipped in parallel workers, which acquire
> > > relation locks lazily in ExecGetRangeTableRelation().
> > >
> > > +    if (queryDesc->estate == NULL)
> > > +    {
> > > +#ifdef USE_ASSERT_CHECKING
> > > +        if (!IsParallelWorker())
> > > +        {
> > > +            ListCell   *lc;
> > > +
> > > +            foreach(lc, queryDesc->plannedstmt->rtable)
> > > +            {
> > > +                RangeTblEntry *rte = lfirst_node(RangeTblEntry, lc);
> > > +
> > > +                if (rte->rtekind == RTE_RELATION ||
> > > +                    (rte->rtekind == RTE_SUBQUERY && rte->relid != InvalidOid))
> > > +                    Assert(CheckRelationOidLockedByMe(rte->relid,
> > > +                                                      rte->rellockmode,
> > > +                                                      true));
> > > +            }
> > > +        }
> > > +#endif
> > > +        queryDesc->estate = ExecutorPrep(queryDesc->plannedstmt,
> > > +                                         queryDesc->params,
> > > +                                         CurrentResourceOwner,
> > > +                                         true,
> > > +                                         eflags);
> > > +    }
> > > +#ifdef USE_ASSERT_CHECKING
> > > +    else
> > > +    {
> > > +        /*
> > > +         * A prep EState was provided, meaning pruning-aware locking
> > > +         * should have locked at least the unpruned relations.
> > > +         */
> > > +        if (!IsParallelWorker())
> > > +        {
> > > +            int     rtindex = -1;
> > > +
> > > +            while ((rtindex = bms_next_member(queryDesc->estate->es_unpruned_relids,
> > > +                                              rtindex)) >= 0)
> > > +            {
> > > +                RangeTblEntry *rte = exec_rt_fetch(rtindex, queryDesc->estate);
> > > +
> > > +                Assert(rte->rtekind == RTE_RELATION ||
> > > +                       (rte->rtekind == RTE_SUBQUERY &&
> > > +                        rte->relid != InvalidOid));
> > > +                Assert(CheckRelationOidLockedByMe(rte->relid,
> > > +                                                  rte->rellockmode, true));
> > > +            }
> > > +        }
> > > +    }
> > > +#endif
> > >
> > > So the invariant is: if no prep EState was provided, every relation in
> > > the plan is locked; if one was provided, at least the unpruned
> > > relations are locked. Both are checked in assert builds.
> > >
> > > I think this covers the main concerns, but I may be missing something.
> > > If anyone sees a problem with this approach, I'd like to hear about
> > > it.
> >
> > Here's v7. Some plancache.c changes that I'd made were in the wrong
> > patch in v6; this version puts them where they belong.
>
> Attached is an updated set. One more fix: I added an Assert in
> SPI_cursor_open_internal()'s !plan->saved path to verify that
> prep_estates is NIL. Unsaved plans always take the custom plan path,
> so pruning-aware locking never applies, but it's worth guarding
> explicitly since the copyObject/ReleaseCachedPlan sequence that
> follows would not be safe otherwise. Also changed
> SPI_plan_get_cached_plan() to pass NULL for cprep, since it only
> returns the CachedPlan pointer and has no way to deliver prep_estates
> to anyone.
>
> Stepping back -- the core question is whether running executor logic
> (pruning) inside GetCachedPlan() is acceptable at all. The plan cache
> and executor have always had a clean boundary: plan cache locks
> everything, executor runs. This optimization necessarily crosses that
> line, because the information needed to decide which locks to skip
> (pruning results) can only come from executor machinery.
>
> The proposed approach has GetCachedPlan() call ExecutorPrep() to do a
> limited subset of executor work (range table init, permissions,
> pruning), carry the results out through CachedPlanPrepData, and leave
> the CachedPlan itself untouched. The executor already has a multi-step
> protocol: start/run/end. prep/start/run/end is just a finer
> decomposition of what InitPlan() was already doing inside
> ExecutorStart().
>
> Of the attached patches, I'm targeting 0001-0003 for commit. 0004 (SQL
> function support) and 0005 (parallel worker reuse) are useful
> follow-ons but not essential.  The optimization works without them for
> most cases, and they can be reviewed and committed separately.
>
> If there's a cleaner way to avoid locking pruned partitions without
> the plumbing this patch adds, I haven't found it in the year since the
> revert.  I'd welcome a pointer if you see one.  Failing that, I think
> this is the right trade-off, but it's a judgment call about where to
> hold your nose.
>
> Tom, I'd value your opinion on whether this approach is something
> you'd be comfortable seeing in the tree.

Attached is an updated set with some cleanup after another pass.

- Removed ExecCreatePartitionPruneStates() from 0001. In 0001-0003,
ExecDoInitialPruning() handles both setup and pruning internally; the
split isn't needed yet.

- Tightened commit messages to describe what each commit does now, not
what later commits will use it for. In particular, 0002 is upfront
that the portal/SPI/EXPLAIN plumbing is scaffolding that 0003 lights
up.

- Updated setrefs.c comment for firstResultRels to drop a blanket
claim about one ModifyTable per query level.
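
For anyone following the first item: the setup/pruning split only
matters once a parallel worker wants the setup without the pruning,
which is what 0005 does via the do_initial_pruning flag.  Below is a
toy, standalone C model of that control flow.  It is not PostgreSQL
code: ToyEState, the toy_* functions, and the bitmask representation
are made up for illustration, and real pruning works on
PartitionPruneState lists rather than a single mask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for the parts of EState that matter here (hypothetical). */
typedef struct ToyEState
{
	bool		states_created;	/* models es_part_prune_states being built */
	bool		results_valid;	/* models es_part_prune_results being set */
	uint32_t	valid_subplans;	/* bit i set => subplan i survived pruning */
} ToyEState;

/* Setup phase only: build pruning state, compute nothing. */
static void
toy_create_prune_states(ToyEState *estate)
{
	estate->states_created = true;
}

/* Pruning phase: requires setup; keeps subplans present in 'matching'. */
static void
toy_do_initial_pruning(ToyEState *estate, uint32_t all, uint32_t matching)
{
	assert(estate->states_created);
	estate->valid_subplans = all & matching;
	estate->results_valid = true;
}

/* Models ExecutorPrep(): setup always, pruning only when requested. */
static ToyEState
toy_executor_prep(bool do_initial_pruning, uint32_t all, uint32_t matching)
{
	ToyEState	estate = {false, false, 0};

	toy_create_prune_states(&estate);
	if (do_initial_pruning)
		toy_do_initial_pruning(&estate, all, matching);
	return estate;
}

/* Models a worker adopting the leader's results instead of recomputing. */
static void
toy_worker_adopt(ToyEState *worker, const ToyEState *leader)
{
	worker->valid_subplans = leader->valid_subplans;
	worker->results_valid = true;
}
```

In this model the leader calls toy_executor_prep(true, ...) and a
worker calls toy_executor_prep(false, ...) followed by
toy_worker_adopt(), so the worker ends up with pruning state set up but
never evaluates the pruning steps itself.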

As before, 0001-0003 are the focus, plus possibly 0004, which teaches
the new pruning-aware GetCachedPlan() contract to its relatively new
user in functions.c.
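
Since the pruning-aware contract keeps coming up, here is a toy,
standalone C model of the invariant from the quoted v6 description: no
prep EState means every relation must be locked, a prep EState means at
least the unpruned relations must be locked.  This is only a sketch
under simplifying assumptions.  Relations are bits in a mask, and
ToyPlan, ToyPrep, toy_acquire_locks() and toy_locks_ok() are
hypothetical names, not anything in the patches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t relset;		/* bit i set => relation i is a member */

typedef struct ToyPlan
{
	relset		all_rels;		/* every lockable relation in the plan */
} ToyPlan;

typedef struct ToyPrep
{
	relset		unpruned_rels;	/* models es_unpruned_relids */
} ToyPrep;

/*
 * Models the locking step (hypothetical): with no prep data, lock
 * everything; with prep data, lock only what survived initial pruning.
 * Returns the set of relations that are now locked.
 */
static relset
toy_acquire_locks(const ToyPlan *plan, const ToyPrep *prep)
{
	return prep == NULL ? plan->all_rels : prep->unpruned_rels;
}

/*
 * Models the assert-build check at executor start (hypothetical).
 * 'prep' is what the caller actually delivered to the executor and
 * 'locked' is what the locking step actually acquired.
 */
static bool
toy_locks_ok(const ToyPlan *plan, const ToyPrep *prep, relset locked)
{
	relset		required = prep == NULL ? plan->all_rels : prep->unpruned_rels;

	/* Every required relation must be in the locked set. */
	return (required & ~locked) == 0;
}
```

The interesting case is a caller that locks with a ToyPrep but then
hands NULL to the check: toy_locks_ok() returns false, which is the
"dropped prep EState" mistake the debug-only check is meant to catch.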

-- 
Thanks, Amit Langote


Attachments:

  [application/octet-stream] v9-0004-Make-SQL-function-executor-track-ExecutorPrep-sta.patch (7.8K, 2-v9-0004-Make-SQL-function-executor-track-ExecutorPrep-sta.patch)
  download | inline diff:
From 3aedeffabed40d317f1f7e2bb80bce8063429795 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Tue, 10 Feb 2026 22:09:23 +0900
Subject: [PATCH v9 4/5] Make SQL function executor track ExecutorPrep state

Extend the SQL function executor to use the ExecutorPrep results
returned by GetCachedPlan().  init_execution_state() now passes a
CachedPlanPrepData to GetCachedPlan() and stores the per-statement
ExecPrep pointers in the execution_state nodes.

At execution time, postquel_start() reparents the prep estate's
es_query_cxt under the function's subcontext so that prep state
follows the usual per-call context hierarchy.

This allows SQL language functions to participate in the same
ExecutorPrep machinery as other plan cache users.

Add a regression test where rule rewrite expands a single UPDATE
into multiple PlannedStmts, exercising the SQL function plan cache
and the generic plan reuse path that now invokes ExecutorPrep.
---
 src/backend/executor/functions.c        | 29 +++++++++++++--
 src/test/regress/expected/plancache.out | 48 +++++++++++++++++++++++++
 src/test/regress/sql/plancache.sql      | 34 ++++++++++++++++++
 3 files changed, 109 insertions(+), 2 deletions(-)

diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index c0ca72b38dd..f246f051c25 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -73,6 +73,7 @@ typedef struct execution_state
 	bool		setsResult;		/* true if this query produces func's result */
 	bool		lazyEval;		/* true if should fetch one row at a time */
 	PlannedStmt *stmt;			/* plan for this query */
+	EState	   *prep_estate;	/* EState created in ExecutorPrep() for this plan */
 	QueryDesc  *qd;				/* null unless status == RUN */
 } execution_state;
 
@@ -658,6 +659,8 @@ init_execution_state(SQLFunctionCachePtr fcache)
 	execution_state *lasttages = NULL;
 	int			nstmts;
 	ListCell   *lc;
+	CachedPlanPrepData cprep = {0};
+	ListCell   *prep_lc;
 
 	/*
 	 * Clean up after previous query, if there was one.
@@ -696,11 +699,20 @@ init_execution_state(SQLFunctionCachePtr fcache)
 	 * CurrentResourceOwner will be the same when ShutdownSQLFunction runs.)
 	 */
 	fcache->cowner = CurrentResourceOwner;
+
+	/*
+	 * Have ExecutorPrep() allocate under fcache->fcontext.  The prep
+	 * EStates it creates will initially live there; postquel_start()
+	 * will later reparent their es_query_cxt into fcache->subcontext
+	 * when using them for execution.
+	 */
+	cprep.context = fcache->fcontext;
+	cprep.owner = fcache->cowner;
 	fcache->cplan = GetCachedPlan(plansource,
 								  fcache->paramLI,
 								  fcache->cowner,
 								  NULL,
-								  NULL);
+								  &cprep);
 
 	/*
 	 * If necessary, make esarray[] bigger to hold the needed state.
@@ -721,9 +733,11 @@ init_execution_state(SQLFunctionCachePtr fcache)
 	/*
 	 * Build execution_state list to match the number of contained plans.
 	 */
+	prep_lc = list_head(cprep.prep_estates);
 	foreach(lc, fcache->cplan->stmt_list)
 	{
 		PlannedStmt *stmt = lfirst_node(PlannedStmt, lc);
+		EState *prep_estate = next_prep_estate(cprep.prep_estates, &prep_lc);
 		execution_state *newes;
 
 		/*
@@ -765,6 +779,7 @@ init_execution_state(SQLFunctionCachePtr fcache)
 		newes->setsResult = false;	/* might change below */
 		newes->lazyEval = false;	/* might change below */
 		newes->stmt = stmt;
+		newes->prep_estate = prep_estate;
 		newes->qd = NULL;
 
 		if (stmt->canSetTag)
@@ -1363,6 +1378,15 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
 	else
 		dest = None_Receiver;
 
+	/*
+	 * Prep EStates were built under fcache->fcontext.  For execution,
+	 * make their es_query_cxt a child of fcache->subcontext so they
+	 * follow the usual per call lifetime.
+	 */
+	if (es->prep_estate)
+		MemoryContextSetParent(es->prep_estate->es_query_cxt,
+							   fcache->subcontext);
+
 	es->qd = CreateQueryDesc(es->stmt,
 							 fcache->func->src,
 							 GetActiveSnapshot(),
@@ -1371,7 +1395,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
 							 fcache->paramLI,
 							 es->qd ? es->qd->queryEnv : NULL,
 							 0,
-							 NULL);
+							 es->prep_estate);
 
 	/* Utility commands don't need Executor. */
 	if (es->qd->operation != CMD_UTILITY)
@@ -1462,6 +1486,7 @@ postquel_end(execution_state *es, SQLFunctionCachePtr fcache)
 
 	FreeQueryDesc(es->qd);
 	es->qd = NULL;
+	es->prep_estate = NULL;
 
 	MemoryContextSwitchTo(oldcontext);
 
diff --git a/src/test/regress/expected/plancache.out b/src/test/regress/expected/plancache.out
index 1d69ab0a1c2..371673a6e96 100644
--- a/src/test/regress/expected/plancache.out
+++ b/src/test/regress/expected/plancache.out
@@ -459,4 +459,52 @@ NOTICE:  creating index on partition inval_during_pruning_p1
 drop table inval_during_pruning_p, inval_during_pruning_signal;
 drop function invalidate_plancache_func, stable_pruning_val;
 deallocate inval_during_pruning_q;
+-- exercise sql-function plan cache when rewrite expands a single statement
+-- into multiple planned statements. this forces cachedplan->stmt_list to
+-- contain more than one entry and checks that executor state for the first
+-- rewritten statement does not destroy state needed by the second one.
+set plan_cache_mode = force_generic_plan;
+create table sqlf_base(id int, val int) partition by list (id);
+create table sqlf_base_1 partition of sqlf_base for values in (1);
+create table sqlf_base_2 partition of sqlf_base for values in (2);
+create table sqlf_log(id int, note text);
+insert into sqlf_base values (1, 10);
+create rule sqlf_base_upd_log as
+on update to sqlf_base do also
+	insert into sqlf_log(id, note)
+	values (new.id, 'logged by rule');
+create or replace function sqlf_execprep_test(a int, v int)
+returns void
+language sql
+as $$
+	update sqlf_base set val = v where id = a;
+$$;
+select sqlf_execprep_test(1, 20);
+ sqlf_execprep_test 
+--------------------
+ 
+(1 row)
+
+select sqlf_execprep_test(1, 30);
+ sqlf_execprep_test 
+--------------------
+ 
+(1 row)
+
+select * from sqlf_base order by 1;
+ id | val 
+----+-----
+  1 |  30
+(1 row)
+
+select * from sqlf_log order by 1;
+ id |      note      
+----+----------------
+  1 | logged by rule
+  1 | logged by rule
+(2 rows)
+
+drop rule sqlf_base_upd_log on sqlf_base;
+drop table sqlf_base, sqlf_log;
+drop function sqlf_execprep_test;
 reset plan_cache_mode;
diff --git a/src/test/regress/sql/plancache.sql b/src/test/regress/sql/plancache.sql
index 139b4688fd6..b89c9ad69a4 100644
--- a/src/test/regress/sql/plancache.sql
+++ b/src/test/regress/sql/plancache.sql
@@ -273,4 +273,38 @@ drop table inval_during_pruning_p, inval_during_pruning_signal;
 drop function invalidate_plancache_func, stable_pruning_val;
 deallocate inval_during_pruning_q;
 
+-- exercise sql-function plan cache when rewrite expands a single statement
+-- into multiple planned statements. this forces cachedplan->stmt_list to
+-- contain more than one entry and checks that executor state for the first
+-- rewritten statement does not destroy state needed by the second one.
+
+set plan_cache_mode = force_generic_plan;
+
+create table sqlf_base(id int, val int) partition by list (id);
+create table sqlf_base_1 partition of sqlf_base for values in (1);
+create table sqlf_base_2 partition of sqlf_base for values in (2);
+create table sqlf_log(id int, note text);
+
+insert into sqlf_base values (1, 10);
+
+create rule sqlf_base_upd_log as
+on update to sqlf_base do also
+	insert into sqlf_log(id, note)
+	values (new.id, 'logged by rule');
+
+create or replace function sqlf_execprep_test(a int, v int)
+returns void
+language sql
+as $$
+	update sqlf_base set val = v where id = a;
+$$;
+
+select sqlf_execprep_test(1, 20);
+select sqlf_execprep_test(1, 30);
+select * from sqlf_base order by 1;
+select * from sqlf_log order by 1;
+
+drop rule sqlf_base_upd_log on sqlf_base;
+drop table sqlf_base, sqlf_log;
+drop function sqlf_execprep_test;
 reset plan_cache_mode;
-- 
2.47.3



  [application/octet-stream] v9-0005-Reuse-partition-pruning-results-in-parallel-worke.patch (15.8K, 3-v9-0005-Reuse-partition-pruning-results-in-parallel-worke.patch)
  download | inline diff:
From ddcbd693f9aa8498c06b4f20fe4df20ff98974c5 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Wed, 25 Mar 2026 16:06:57 +0900
Subject: [PATCH v9 5/5] Reuse partition pruning results in parallel workers

Pass the leader's initial partition pruning results and unpruned
relids to parallel workers and reuse them via ExecutorPrep().  This
avoids repeating pruning logic in workers, which is not only
redundant but also risks divergence due to nondeterminism in pruning
steps or parameter evaluation timing.

Factor the creation of PartitionPruneState structures out of
ExecDoInitialPruning() into a new ExecCreatePartitionPruneStates()
function.  Parallel workers need to set up pruning state without
performing initial pruning, since they receive the leader's results
instead.

Introduce CheckInitialPruningResultsInWorker() (debug-builds only)
to verify that the results match what the worker would compute.
This check helps catch inconsistencies across leader and worker
pruning logic.
---
 src/backend/executor/execMain.c      |  25 +++++--
 src/backend/executor/execParallel.c  | 108 ++++++++++++++++++++++++++-
 src/backend/executor/execPartition.c |  44 ++++++++---
 src/backend/utils/cache/plancache.c  |   2 +-
 src/include/executor/execPartition.h |   1 +
 src/include/executor/executor.h      |   3 +-
 6 files changed, 161 insertions(+), 22 deletions(-)

diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 336bd4d09b3..5fa312436fb 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -207,7 +207,7 @@ standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
 		queryDesc->estate = ExecutorPrep(queryDesc->plannedstmt,
 										 queryDesc->params,
 										 CurrentResourceOwner,
-										 eflags);
+										 eflags, true);
 	}
 #ifdef USE_ASSERT_CHECKING
 	else
@@ -330,7 +330,8 @@ standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
  * ExecutorPrep: prepare executor state for a PlannedStmt outside ExecutorStart.
  *
  * Performs range table initialization, permission checks, and initial
- * partition pruning if partPruneInfos are present.
+ * partition pruning if partPruneInfos are present and do_initial_pruning is
+ * true.  Parallel workers pass false.
  *
  * Returns an EState that the caller must either pass to ExecutorStart()
  * for reuse or free via FreeExecutorState() if execution will not proceed.
@@ -341,7 +342,7 @@ standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
  */
 EState *
 ExecutorPrep(PlannedStmt *pstmt, ParamListInfo params, ResourceOwner owner,
-			 int eflags)
+			 int eflags, bool do_initial_pruning)
 {
 	ResourceOwner oldowner;
 	EState *estate;
@@ -377,14 +378,22 @@ ExecutorPrep(PlannedStmt *pstmt, ParamListInfo params, ResourceOwner owner,
 	CurrentResourceOwner = owner;
 
 	/*
-	 * Set up PartitionPruneState structures and perform initial partition
-	 * pruning to compute the subset of child subplans that will be
-	 * executed.  The results, which are bitmapsets of selected child
-	 * indexes, are saved in es_part_prune_results, parallel to
+	 * Set up PartitionPruneState structures needed for initial
+	 * partition pruning.
+	 *
+	 * If do_initial_pruning is true, also perform initial pruning to
+	 * compute the subset of child subplans that will be executed.
+	 * The results, which are bitmapsets of selected child indexes,
+	 * are saved in es_part_prune_results, parallel to
 	 * es_part_prune_infos.  RT indexes of surviving partitions are
 	 * added to es_unpruned_relids.
+	 *
+	 * Parallel workers pass false here and instead receive the
+	 * leader's pruning results via shared memory.
 	 */
-	ExecDoInitialPruning(estate);
+	ExecCreatePartitionPruneStates(estate);
+	if (do_initial_pruning)
+		ExecDoInitialPruning(estate);
 
 	CurrentResourceOwner = oldowner;
 
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index 024780d3516..2de4b35a16e 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -24,6 +24,7 @@
 #include "postgres.h"
 
 #include "executor/execParallel.h"
+#include "executor/execPartition.h"
 #include "executor/executor.h"
 #include "executor/nodeAgg.h"
 #include "executor/nodeAppend.h"
@@ -67,6 +68,8 @@
 #define PARALLEL_KEY_QUERY_TEXT		UINT64CONST(0xE000000000000008)
 #define PARALLEL_KEY_JIT_INSTRUMENTATION UINT64CONST(0xE000000000000009)
 #define PARALLEL_KEY_WAL_USAGE			UINT64CONST(0xE00000000000000A)
+#define PARALLEL_KEY_PARTITION_PRUNE_RESULTS	UINT64CONST(0xE00000000000000B)
+#define PARALLEL_KEY_UNPRUNED_RELIDS	UINT64CONST(0xE00000000000000C)
 
 #define PARALLEL_TUPLE_QUEUE_SIZE		65536
 
@@ -141,6 +144,8 @@ static bool ExecParallelRetrieveInstrumentation(PlanState *planstate,
 /* Helper function that runs in the parallel worker. */
 static DestReceiver *ExecParallelGetReceiver(dsm_segment *seg, shm_toc *toc);
 
+static void CheckInitialPruningResultsInWorker(EState *estate);
+
 /*
  * Create a serialized representation of the plan to be sent to each worker.
  */
@@ -620,12 +625,18 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	FixedParallelExecutorState *fpes;
 	char	   *pstmt_data;
 	char	   *pstmt_space;
+	char	   *part_prune_results_data;
+	char	   *part_prune_results_space;
+	char	   *unpruned_relids_data;
+	char	   *unpruned_relids_space;
 	char	   *paramlistinfo_space;
 	BufferUsage *bufusage_space;
 	WalUsage   *walusage_space;
 	SharedExecutorInstrumentation *instrumentation = NULL;
 	SharedJitInstrumentation *jit_instrumentation = NULL;
 	int			pstmt_len;
+	int			part_prune_results_len;
+	int			unpruned_relids_len;
 	int			paramlistinfo_len;
 	int			instrumentation_len = 0;
 	int			jit_instrumentation_len = 0;
@@ -654,6 +665,8 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 
 	/* Fix up and serialize plan to be sent to workers. */
 	pstmt_data = ExecSerializePlan(planstate->plan, estate);
+	part_prune_results_data = nodeToString(estate->es_part_prune_results);
+	unpruned_relids_data = nodeToString(estate->es_unpruned_relids);
 
 	/* Create a parallel context. */
 	pcxt = CreateParallelContext("postgres", "ParallelQueryMain", nworkers);
@@ -680,6 +693,16 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	shm_toc_estimate_chunk(&pcxt->estimator, pstmt_len);
 	shm_toc_estimate_keys(&pcxt->estimator, 1);
 
+	/* Estimate space for serialized part_prune_results. */
+	part_prune_results_len = strlen(part_prune_results_data) + 1;
+	shm_toc_estimate_chunk(&pcxt->estimator, part_prune_results_len);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Estimate space for serialized unpruned_relids. */
+	unpruned_relids_len = strlen(unpruned_relids_data) + 1;
+	shm_toc_estimate_chunk(&pcxt->estimator, unpruned_relids_len);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
 	/* Estimate space for serialized ParamListInfo. */
 	paramlistinfo_len = EstimateParamListSpace(estate->es_param_list_info);
 	shm_toc_estimate_chunk(&pcxt->estimator, paramlistinfo_len);
@@ -781,6 +804,16 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	memcpy(pstmt_space, pstmt_data, pstmt_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PLANNEDSTMT, pstmt_space);
 
+	/* Store serialized part_prune_results */
+	part_prune_results_space = shm_toc_allocate(pcxt->toc, part_prune_results_len);
+	memcpy(part_prune_results_space, part_prune_results_data, part_prune_results_len);
+	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARTITION_PRUNE_RESULTS, part_prune_results_space);
+
+	/* Store serialized unpruned_relids */
+	unpruned_relids_space = shm_toc_allocate(pcxt->toc, unpruned_relids_len);
+	memcpy(unpruned_relids_space, unpruned_relids_data, unpruned_relids_len);
+	shm_toc_insert(pcxt->toc, PARALLEL_KEY_UNPRUNED_RELIDS, unpruned_relids_space);
+
 	/* Store serialized ParamListInfo. */
 	paramlistinfo_space = shm_toc_allocate(pcxt->toc, paramlistinfo_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARAMLISTINFO, paramlistinfo_space);
@@ -1280,10 +1313,15 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 						 int instrument_options)
 {
 	char	   *pstmtspace;
+	char	   *part_prune_results_space;
+	char	   *unpruned_relids_space;
 	char	   *paramspace;
 	PlannedStmt *pstmt;
+	List	   *part_prune_results;
+	Bitmapset  *unpruned_relids;
 	ParamListInfo paramLI;
 	char	   *queryString;
+	EState	   *prep_estate = NULL;
 
 	/* Get the query string from shared memory */
 	queryString = shm_toc_lookup(toc, PARALLEL_KEY_QUERY_TEXT, false);
@@ -1296,12 +1334,80 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 	paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
 	paramLI = RestoreParamList(&paramspace);
 
+	/* Reconstruct leader-supplied part_prune_results and unpruned_relids. */
+	part_prune_results_space =
+		shm_toc_lookup(toc, PARALLEL_KEY_PARTITION_PRUNE_RESULTS, false);
+	part_prune_results = (List *) stringToNode(part_prune_results_space);
+	unpruned_relids_space =
+		shm_toc_lookup(toc, PARALLEL_KEY_UNPRUNED_RELIDS, false);
+	unpruned_relids = (Bitmapset *) stringToNode(unpruned_relids_space);
+
+	/*
+	 * If pruning was done in the leader, build a prep estate in the worker
+	 * and inject the leader's pruning results into it for reuse.
+	 */
+	if (pstmt->partPruneInfos)
+	{
+		prep_estate = ExecutorPrep(pstmt, paramLI, CurrentResourceOwner, 0, false);
+		Assert(prep_estate);
+
+		prep_estate->es_part_prune_results = part_prune_results;
+		prep_estate->es_unpruned_relids =
+			bms_add_members(prep_estate->es_unpruned_relids,
+							unpruned_relids);
+
+		/*
+		 * In assert-enabled builds, verify that the pruning results passed
+		 * from the leader match what the worker would independently compute.
+		 */
+		CheckInitialPruningResultsInWorker(prep_estate);
+	}
+
 	/* Create a QueryDesc for the query. */
 	return CreateQueryDesc(pstmt,
 						   queryString,
 						   GetActiveSnapshot(), InvalidSnapshot,
 						   receiver, paramLI, NULL, instrument_options,
-						   NULL);
+						   prep_estate);
+}
+
+/*
+ * CheckInitialPruningResultsInWorker
+ *		Verify partition pruning results passed from the leader process.
+ *
+ * This is intended to be called during parallel worker query setup.
+ * It recomputes initial pruning results locally and compares them with
+ * those received from the leader. Any mismatch may indicate a divergence
+ * between leader and worker logic or environment.
+ *
+ * Only performed in assert-enabled builds (USE_ASSERT_CHECKING).
+ */
+static void
+CheckInitialPruningResultsInWorker(EState *estate)
+{
+#ifdef USE_ASSERT_CHECKING
+	ListCell   *lc;
+	int			i;
+
+	Assert(estate->es_part_prune_results != NULL);
+	i = 0;
+	foreach(lc, estate->es_part_prune_states)
+	{
+		PartitionPruneState *prunestate = (PartitionPruneState *) lfirst(lc);
+		Bitmapset *reuse_validsubplans =
+				list_nth_node(Bitmapset, estate->es_part_prune_results, i++);
+		Bitmapset  *validsubplans = NULL;
+		Bitmapset  *validsubplan_rtis = NULL;
+
+		if (prunestate->do_initial_prune)
+			validsubplans = ExecFindMatchingSubPlans(prunestate, true,
+													 &validsubplan_rtis);
+		if (!bms_equal(validsubplans, reuse_validsubplans))
+			elog(ERROR, "different validsubplans in parallel worker");
+		if (bms_nonempty_difference(validsubplan_rtis, estate->es_unpruned_relids))
+			elog(ERROR, "different unpruned_relids in parallel worker");
+	}
+#endif
 }
 
 /*
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 2a3af006f77..47322614aad 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -1942,6 +1942,9 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *
  * Functions:
  *
+ * ExecCreatePartitionPruneStates:
+ *		Create a PartitionPruneState for each PartitionPruneInfo in the EState
+ *
  * ExecDoInitialPruning:
  *		Perform runtime "initial" pruning, if necessary, to determine the set
  *		of child subnodes that need to be initialized during ExecInitNode() for
@@ -1967,15 +1970,40 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  */
 
 
+/*
+ * ExecCreatePartitionPruneStates
+ *
+ * Create a PartitionPruneState for each PartitionPruneInfo in the estate,
+ * and save them in estate->es_part_prune_states. This setup is required
+ * before any initial or runtime pruning can occur.
+ */
+void
+ExecCreatePartitionPruneStates(EState *estate)
+{
+	ListCell   *lc;
+
+	foreach(lc, estate->es_part_prune_infos)
+	{
+		PartitionPruneInfo *pruneinfo = lfirst_node(PartitionPruneInfo, lc);
+		PartitionPruneState *prunestate;
+
+		/* Create and save the PartitionPruneState. */
+		prunestate = CreatePartitionPruneState(estate, pruneinfo);
+		estate->es_part_prune_states = lappend(estate->es_part_prune_states,
+											   prunestate);
+	}
+}
+
 /*
  * ExecDoInitialPruning
  *		Perform runtime "initial" pruning, if necessary, to determine the set
  *		of child subnodes that need to be initialized during ExecInitNode() for
  *		plan nodes that support partition pruning.
  *
- * This function iterates over each PartitionPruneInfo entry in
- * estate->es_part_prune_infos. For each entry, it creates a PartitionPruneState
- * and adds it to es_part_prune_states.  ExecInitPartitionExecPruning() accesses
+ *
+ * This function iterates over each PartitionPruneState in
+ * estate->es_part_prune_states, which must have been populated earlier by
+ * ExecCreatePartitionPruneStates(). ExecInitPartitionExecPruning() accesses
  * these states through their corresponding indexes in es_part_prune_states and
  * assigns each state to the parent node's PlanState, from where it will be used
  * for "exec" pruning.
@@ -1996,18 +2024,12 @@ ExecDoInitialPruning(EState *estate)
 	ListCell   *lc;
 
 	Assert(estate->es_part_prune_results == NULL);
-	foreach(lc, estate->es_part_prune_infos)
+	foreach(lc, estate->es_part_prune_states)
 	{
-		PartitionPruneInfo *pruneinfo = lfirst_node(PartitionPruneInfo, lc);
-		PartitionPruneState *prunestate;
+		PartitionPruneState *prunestate = (PartitionPruneState *) lfirst(lc);
 		Bitmapset  *validsubplans = NULL;
 		Bitmapset  *validsubplan_rtis = NULL;
 
-		/* Create and save the PartitionPruneState. */
-		prunestate = CreatePartitionPruneState(estate, pruneinfo);
-		estate->es_part_prune_states = lappend(estate->es_part_prune_states,
-											   prunestate);
-
 		/*
 		 * Perform initial pruning steps, if any, and save the result
 		 * bitmapset or NULL as described in the header comment.  RT indexes
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index bb62c648899..879b2d012a1 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -2102,7 +2102,7 @@ AcquireExecutorLocksUnpruned(List *stmt_list, bool acquire,
 			}
 
 			prep_estate = ExecutorPrep(plannedstmt, cprep->params,
-									   cprep->owner, cprep->eflags);
+									   cprep->owner, cprep->eflags, true);
 			Assert(prep_estate);
 			cprep->prep_estates = lappend(cprep->prep_estates, prep_estate);
 
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 82063ec2a16..4c96808c376 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -130,6 +130,7 @@ typedef struct PartitionPruneState
 	PartitionPruningData *partprunedata[FLEXIBLE_ARRAY_MEMBER];
 } PartitionPruneState;
 
+extern void ExecCreatePartitionPruneStates(EState *estate);
 extern void ExecDoInitialPruning(EState *estate);
 extern PartitionPruneState *ExecInitPartitionExecPruning(PlanState *planstate,
 														 int n_total_subplans,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index 4505ceaca3c..8e5fde965ed 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -240,7 +240,8 @@ extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
 extern EState *ExecutorPrep(PlannedStmt *pstmt,
 							ParamListInfo params,
 							ResourceOwner owner,
-							int eflags);
+							int eflags,
+							bool do_initial_pruning);
 
 /*
  * Walk a prep_estates list in step with a parallel stmt_list iteration.
-- 
2.47.3



  [application/octet-stream] v9-0003-Use-pruning-aware-locking-in-cached-plans.patch (41.8K, 4-v9-0003-Use-pruning-aware-locking-in-cached-plans.patch)
From a5cbee90d2f57c0b775ecc9d959bdcf9fe864075 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Thu, 19 Mar 2026 19:02:04 +0900
Subject: [PATCH v9 3/5] Use pruning-aware locking in cached plans

Extend GetCachedPlan() to perform ExecutorPrep() on each planned
statement, capturing unpruned relids and initial pruning results.
Use this data to acquire execution locks only on surviving partitions,
avoiding unnecessary locking of pruned tables even when using cached
plans.

Introduce CachedPlanPrepData to carry the EStates created by
ExecutorPrep() through the plan caching layer. The prep_estates
list is indexed one-to-one with CachedPlan->stmt_list and is
populated when GetCachedPlan() prepares a reused generic plan.
Adjust call sites in SPI, functions, portals, and EXPLAIN to
propagate this data.

Partition pruning expressions may call PL functions that require
an active snapshot (e.g., via EnsurePortalSnapshotExists()).
AcquireExecutorLocksUnpruned() establishes one before calling
ExecutorPrep() if needed, ensuring these expressions can execute
correctly during plan cache validation.

To maintain correctness when all target partitions are pruned, also
reinstate the firstResultRel locking behavior lost in commit
28317de72. That commit required the first ModifyTable target to
remain initialized for executor assumptions to hold. We now
explicitly track these relids in PlannerGlobal and PlannedStmt so they
are locked even if pruned, preserving that rule across cached plan
reuse.

Regression tests are included to verify:

- Only surviving partitions are locked when pruning is enabled, and
  all partitions are locked when it is disabled (pg_locks inspection).
- Multiple ModifyTable nodes (via writable CTEs) handle the case where
  all target partitions are pruned, exercising firstResultRels.
- Plan invalidation during pruning-aware lock setup (DDL triggered by
  a pruning expression) discards the prep state and replans cleanly.

Note for extension authors: code that accesses partition relations
through EState must check that the RT index is a member of
es_unpruned_relids before opening the relation.  Previously this was
an optimization (avoid processing pruned partitions); it is now a
correctness requirement, because pruned partitions may not be locked.
ExecGetRangeTableRelation() already enforces this with an error when
called on a pruned relation.
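
As a rough illustration of the rule in the note above, the check can be
modeled with stand-in types.  This is only a sketch, not PostgreSQL code:
ToyEState, toy_relation_is_unpruned() and toy_count_openable() are
hypothetical names, and a 64-bit mask stands in for the
es_unpruned_relids Bitmapset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the part of EState that matters here. */
typedef struct ToyEState
{
    uint64_t    unpruned_relids;    /* bit N set => RT index N is unpruned */
} ToyEState;

/* Return true if the 1-based RT index survived initial pruning. */
static bool
toy_relation_is_unpruned(const ToyEState *estate, unsigned rti)
{
    assert(estate != NULL);

    /* RT indexes start at 1; this toy mask only covers indexes below 64. */
    if (rti == 0 || rti >= 64)
        return false;
    return (estate->unpruned_relids & (UINT64_C(1) << rti)) != 0;
}

/*
 * Count the RT indexes in rtis[] that may be opened.  Pruned relations are
 * skipped, because in this model they are not locked.
 */
static int
toy_count_openable(const ToyEState *estate, const unsigned *rtis, size_t n)
{
    int         count = 0;

    for (size_t i = 0; i < n; i++)
    {
        if (toy_relation_is_unpruned(estate, rtis[i]))
            count++;
    }
    return count;
}
```

The point of the sketch is the ordering: membership is tested first, and
only relations that pass the test are touched at all.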
---
 src/backend/commands/prepare.c                |  17 +-
 src/backend/executor/execMain.c               |   4 +
 src/backend/executor/functions.c              |   1 +
 src/backend/executor/nodeModifyTable.c        |   5 +-
 src/backend/executor/spi.c                    |  22 +-
 src/backend/optimizer/plan/planner.c          |   1 +
 src/backend/optimizer/plan/setrefs.c          |  18 ++
 src/backend/tcop/postgres.c                   |   7 +-
 src/backend/utils/cache/plancache.c           | 257 +++++++++++++++++-
 src/include/nodes/pathnodes.h                 |   3 +
 src/include/nodes/plannodes.h                 |  10 +
 src/include/utils/plancache.h                 |  35 ++-
 src/test/regress/expected/partition_prune.out | 145 ++++++++++
 src/test/regress/expected/plancache.out       |  62 +++++
 src/test/regress/sql/partition_prune.sql      |  77 ++++++
 src/test/regress/sql/plancache.sql            |  51 ++++
 16 files changed, 691 insertions(+), 24 deletions(-)

diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index c7bab14b633..fec83cc6fd4 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -156,6 +156,7 @@ ExecuteQuery(ParseState *pstate,
 {
 	PreparedStatement *entry;
 	CachedPlan *cplan;
+	CachedPlanPrepData cprep = {0};
 	List	   *plan_list;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
@@ -195,7 +196,9 @@ ExecuteQuery(ParseState *pstate,
 									   entry->plansource->query_string);
 
 	/* Replan if needed, and increment plan refcount for portal */
-	cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL);
+	cprep.context = portal->portalContext;
+	cprep.owner = portal->resowner;
+	cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL, &cprep);
 	plan_list = cplan->stmt_list;
 
 	/*
@@ -207,7 +210,7 @@ ExecuteQuery(ParseState *pstate,
 					  query_string,
 					  entry->plansource->commandTag,
 					  plan_list,
-					  NIL,
+					  cprep.prep_estates,
 					  cplan);
 
 	/*
@@ -577,6 +580,7 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	PreparedStatement *entry;
 	const char *query_string;
 	CachedPlan *cplan;
+	CachedPlanPrepData cprep = {0};
 	List	   *plan_list;
 	List	   *prep_estates;
 	ListCell   *p;
@@ -635,8 +639,13 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	}
 
 	/* Replan if needed, and acquire a transient refcount */
+	cprep.context = CurrentMemoryContext;
+	cprep.owner = CurrentResourceOwner;
+	if (es->generic)
+		cprep.eflags = EXEC_FLAG_EXPLAIN_GENERIC;
 	cplan = GetCachedPlan(entry->plansource, paramLI,
-						  CurrentResourceOwner, pstate->p_queryEnv);
+						  CurrentResourceOwner, pstate->p_queryEnv,
+						  &cprep);
 
 	INSTR_TIME_SET_CURRENT(planduration);
 	INSTR_TIME_SUBTRACT(planduration, planstart);
@@ -655,7 +664,7 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	}
 
 	plan_list = cplan->stmt_list;
-	prep_estates = NIL;
+	prep_estates = cprep.prep_estates;
 
 	/* Explain each query */
 	prep_lc = list_head(prep_estates);
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 282c9871de0..336bd4d09b3 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -334,6 +334,10 @@ standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
  *
  * Returns an EState that the caller must either pass to ExecutorStart()
  * for reuse or free via FreeExecutorState() if execution will not proceed.
+ * GetCachedPlan() uses this to determine, based on initial pruning
+ * results, which partitions to lock; if the resulting EState is not
+ * delivered to ExecutorStart(), the executor would operate on unlocked
+ * relations.  See the assert checks in standard_ExecutorStart().
  */
 EState *
 ExecutorPrep(PlannedStmt *pstmt, ParamListInfo params, ResourceOwner owner,
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index 952a784c924..c0ca72b38dd 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -699,6 +699,7 @@ init_execution_state(SQLFunctionCachePtr fcache)
 	fcache->cplan = GetCachedPlan(plansource,
 								  fcache->paramLI,
 								  fcache->cowner,
+								  NULL,
 								  NULL);
 
 	/*
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 4cd5e262e0f..9230f2b554f 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -4865,8 +4865,8 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
 	 * as a reference for building the ResultRelInfo of the target partition.
 	 * In either case, it doesn't matter which result relation is kept, so we
 	 * just keep the first one, if all others have been pruned.  See also,
-	 * ExecDoInitialPruning(), which ensures that this first result relation
-	 * has been locked.
+	 * AcquireExecutorLocksUnpruned(), which ensures that this first result
+	 * relation has been locked.
 	 */
 	i = 0;
 	foreach(l, node->resultRelations)
@@ -4880,6 +4880,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
 			/* all result relations pruned; keep the first one */
 			keep_rel = true;
 			rti = linitial_int(node->resultRelations);
+			Assert(list_member_int(estate->es_plannedstmt->firstResultRels, rti));
 			i = 0;
 		}
 
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 380bbc44e97..f1d84f7a350 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1580,6 +1580,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 {
 	CachedPlanSource *plansource;
 	CachedPlan *cplan;
+	CachedPlanPrepData cprep = {0};
 	List	   *stmt_list;
 	char	   *query_string;
 	Snapshot	snapshot;
@@ -1660,7 +1661,10 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 	 */
 
 	/* Replan if needed, and increment plan refcount for portal */
-	cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv);
+	cprep.context = portal->portalContext;
+	cprep.owner = portal->resowner;
+	cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv,
+						  &cprep);
 	stmt_list = cplan->stmt_list;
 
 	if (!plan->saved)
@@ -1670,7 +1674,10 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 		 * so must copy the plan into the portal's context.  An error here
 		 * will result in leaking our refcount on the plan, but it doesn't
 		 * matter because the plan is unsaved and hence transient anyway.
+		 *
+		 * Unsaved plans use custom plans, so no prep EStates are expected.
 		 */
+		Assert(cprep.prep_estates == NIL);
 		oldcontext = MemoryContextSwitchTo(portal->portalContext);
 		stmt_list = copyObject(stmt_list);
 		MemoryContextSwitchTo(oldcontext);
@@ -1686,7 +1693,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 					  query_string,
 					  plansource->commandTag,
 					  stmt_list,
-					  NIL,
+					  cprep.prep_estates,
 					  cplan);
 
 	/*
@@ -2104,7 +2111,8 @@ SPI_plan_get_cached_plan(SPIPlanPtr plan)
 	/* Get the generic plan for the query */
 	cplan = GetCachedPlan(plansource, NULL,
 						  plan->saved ? CurrentResourceOwner : NULL,
-						  _SPI_current->queryEnv);
+						  _SPI_current->queryEnv,
+						  NULL);
 	Assert(cplan == plansource->gplan);
 
 	/* Pop the error context stack */
@@ -2503,6 +2511,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 		ListCell   *lc2;
 		List	   *prep_estates;
 		ListCell   *prep_lc;
+		CachedPlanPrepData cprep = {0};
 
 		spicallbackarg.query = plansource->query_string;
 
@@ -2577,11 +2586,14 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 		 * Replan if needed, and increment plan refcount.  If it's a saved
 		 * plan, the refcount must be backed by the plan_owner.
 		 */
+		cprep.context = CurrentMemoryContext;
+		cprep.owner = CurrentResourceOwner;
 		cplan = GetCachedPlan(plansource, options->params,
-							  plan_owner, _SPI_current->queryEnv);
+							  plan_owner, _SPI_current->queryEnv,
+							  &cprep);
 
 		stmt_list = cplan->stmt_list;
-		prep_estates = NIL;
+		prep_estates = cprep.prep_estates;
 
 		/*
 		 * If we weren't given a specific snapshot to use, and the statement
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 42604a0f75c..afa61d357c5 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -657,6 +657,7 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 	result->permInfos = glob->finalrteperminfos;
 	result->subrtinfos = glob->subrtinfos;
 	result->resultRelations = glob->resultRelations;
+	result->firstResultRels = glob->firstResultRels;
 	result->appendRelations = glob->appendRelations;
 	result->subplans = glob->subplans;
 	result->rewindPlanIDs = glob->rewindPlanIDs;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 1b5b9b5ed9c..8c9956e687e 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -384,6 +384,24 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 		}
 	}
 
+	/*
+	 * Record the first result relation if it belongs to the set of
+	 * initially prunable relations.  We use bms_next_member() to get
+	 * the lowest-numbered leaf result rel, which matches
+	 * linitial_int(ModifyTable.resultRelations) because partition
+	 * expansion preserves RT index order.  ExecInitModifyTable() asserts
+	 * that the recorded index matches what it actually needs.
+	 */
+	if (root->leaf_result_relids)
+	{
+		Index	firstResultRel = bms_next_member(root->leaf_result_relids, -1);
+
+		firstResultRel += rtoffset;
+		if (bms_is_member(firstResultRel, root->glob->prunableRelids))
+			root->glob->firstResultRels =
+				lappend_int(root->glob->firstResultRels, firstResultRel);
+	}
+
 	return result;
 }
 
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 355a490cde9..de362ff1672 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1637,6 +1637,7 @@ exec_bind_message(StringInfo input_message)
 	int16	   *rformats = NULL;
 	CachedPlanSource *psrc;
 	CachedPlan *cplan;
+	CachedPlanPrepData cprep = {0};
 	Portal		portal;
 	char	   *query_string;
 	char	   *saved_stmt_name;
@@ -2018,7 +2019,9 @@ exec_bind_message(StringInfo input_message)
 	 * will be generated in MessageContext.  The plan refcount will be
 	 * assigned to the Portal, so it will be released at portal destruction.
 	 */
-	cplan = GetCachedPlan(psrc, params, NULL, NULL);
+	cprep.context = portal->portalContext;
+	cprep.owner = portal->resowner;
+	cplan = GetCachedPlan(psrc, params, NULL, NULL, &cprep);
 
 	/*
 	 * Now we can define the portal.
@@ -2031,7 +2034,7 @@ exec_bind_message(StringInfo input_message)
 					  query_string,
 					  psrc->commandTag,
 					  cplan->stmt_list,
-					  NIL,
+					  cprep.prep_estates,
 					  cplan);
 
 	/* Portal is defined, set the plan ID based on its contents. */
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 698e7c1aa22..bb62c648899 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -93,14 +93,17 @@ static bool StmtPlanRequiresRevalidation(CachedPlanSource *plansource);
 static bool BuildingPlanRequiresSnapshot(CachedPlanSource *plansource);
 static List *RevalidateCachedQuery(CachedPlanSource *plansource,
 								   QueryEnvironment *queryEnv);
-static bool CheckCachedPlan(CachedPlanSource *plansource);
+static bool CheckCachedPlan(CachedPlanSource *plansource, CachedPlanPrepData *cprep);
 static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 								   ParamListInfo boundParams, QueryEnvironment *queryEnv);
 static bool choose_custom_plan(CachedPlanSource *plansource,
 							   ParamListInfo boundParams);
 static double cached_plan_cost(CachedPlan *plan, bool include_planner);
 static Query *QueryListGetPrimaryStmt(List *stmts);
-static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static void AcquireExecutorLocksAll(List *stmt_list, bool acquire);
+static void AcquireExecutorLocksUnpruned(List *stmt_list, bool acquire,
+										 CachedPlanPrepData *cprep);
+static void CachedPlanPrepCleanup(CachedPlanPrepData *cprep);
 static void AcquirePlannerLocks(List *stmt_list, bool acquire);
 static void ScanQueryForLocks(Query *parsetree, bool acquire);
 static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -942,6 +945,11 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
 /*
  * CheckCachedPlan: see if the CachedPlanSource's generic plan is valid.
  *
+ * If 'cprep' is not NULL, ExecutorPrep() is applied to each PlannedStmt to
+ * compute the set of partitions that survive initial runtime pruning in order
+ * to only lock them.  The EStates created to do so are saved in cprep for
+ * later reuse by ExecutorStart().
+ *
  * Caller must have already called RevalidateCachedQuery to verify that the
  * querytree is up to date.
  *
@@ -949,7 +957,7 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
  * (We must do this for the "true" result to be race-condition-free.)
  */
 static bool
-CheckCachedPlan(CachedPlanSource *plansource)
+CheckCachedPlan(CachedPlanSource *plansource, CachedPlanPrepData *cprep)
 {
 	CachedPlan *plan = plansource->gplan;
 
@@ -983,7 +991,10 @@ CheckCachedPlan(CachedPlanSource *plansource)
 		 */
 		Assert(plan->refcount > 0);
 
-		AcquireExecutorLocks(plan->stmt_list, true);
+		if (cprep)
+			AcquireExecutorLocksUnpruned(plan->stmt_list, true, cprep);
+		else
+			AcquireExecutorLocksAll(plan->stmt_list, true);
 
 		/*
 		 * If plan was transient, check to see if TransactionXmin has
@@ -1005,7 +1016,13 @@ CheckCachedPlan(CachedPlanSource *plansource)
 		}
 
 		/* Oops, the race case happened.  Release useless locks. */
-		AcquireExecutorLocks(plan->stmt_list, false);
+		if (cprep)
+			AcquireExecutorLocksUnpruned(plan->stmt_list, false, cprep);
+		else
+			AcquireExecutorLocksAll(plan->stmt_list, false);
+
+		/* Also clean up ExecutorPrep() state, if necessary. */
+		CachedPlanPrepCleanup(cprep);
 	}
 
 	/*
@@ -1285,6 +1302,15 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
  * On return, the plan is valid and we have sufficient locks to begin
  * execution.
  *
+ * If 'cprep' is not NULL and a generic plan is reused, the function
+ * performs initial pruning via ExecutorPrep() and locks only the
+ * surviving partitions.  The resulting EStates are stored in
+ * cprep->prep_estates and must be delivered to ExecutorStart() via
+ * QueryDesc->estate (or the equivalent portal/SPI path).  Failure
+ * to do so means the executor will operate on relations for which
+ * locks were never acquired.  Passing NULL for cprep is always safe;
+ * all partitions are locked as before.
+ *
  * On return, the refcount of the plan has been incremented; a later
  * ReleaseCachedPlan() call is expected.  If "owner" is not NULL then
  * the refcount has been reported to that ResourceOwner (note that this
@@ -1295,7 +1321,8 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
  */
 CachedPlan *
 GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
-			  ResourceOwner owner, QueryEnvironment *queryEnv)
+			  ResourceOwner owner, QueryEnvironment *queryEnv,
+			  CachedPlanPrepData *cprep)
 {
 	CachedPlan *plan = NULL;
 	List	   *qlist;
@@ -1317,7 +1344,9 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 
 	if (!customplan)
 	{
-		if (CheckCachedPlan(plansource))
+		if (cprep)
+			cprep->params = boundParams;
+		if (CheckCachedPlan(plansource, cprep))
 		{
 			/* We want a generic plan, and we already have a valid one */
 			plan = plansource->gplan;
@@ -1904,11 +1933,13 @@ QueryListGetPrimaryStmt(List *stmts)
 }
 
 /*
- * AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
- * or release them if acquire is false.
+ * AcquireExecutorLocksAll: acquire locks needed for execution of a cached
+ * plan; or release them if acquire is false.
+ *
+ * This locks all relations in a given PlannedStmt's range table.
  */
 static void
-AcquireExecutorLocks(List *stmt_list, bool acquire)
+AcquireExecutorLocksAll(List *stmt_list, bool acquire)
 {
 	ListCell   *lc1;
 
@@ -1955,6 +1986,212 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 	}
 }
 
+/*
+ * LockRelids
+ * 		Acquire or release locks on the specified relids, which reference
+ * 		entries in the provided range table.
+ *
+ * Helper for AcquireExecutorLocksUnpruned().
+ */
+static void
+LockRelids(List *rtable, Bitmapset *relids, bool acquire)
+{
+	int	rtindex = -1;
+
+	while ((rtindex = bms_next_member(relids, rtindex)) >= 0)
+	{
+		RangeTblEntry *rte = list_nth_node(RangeTblEntry, rtable, rtindex - 1);
+
+		Assert(rte->rtekind == RTE_RELATION ||
+			   (rte->rtekind == RTE_SUBQUERY && OidIsValid(rte->relid)));
+
+		/*
+		 * Acquire the appropriate type of lock on each relation OID. Note
+		 * that we don't actually try to open the rel, and hence will not
+		 * fail if it's been dropped entirely --- we'll just transiently
+		 * acquire a non-conflicting lock.
+		 */
+		if (acquire)
+			LockRelationOid(rte->relid, rte->rellockmode);
+		else
+			UnlockRelationOid(rte->relid, rte->rellockmode);
+	}
+}
+
+/*
+ * AcquireExecutorLocksUnpruned
+ *		Acquire or release execution locks for only unpruned relations
+ *		referenced by the given PlannedStmts.
+ *
+ * On acquire, this:
+ *	- locks unprunable rels listed in PlannedStmt.unprunableRelids
+ *	- runs ExecutorPrep() to perform initial runtime pruning
+ *	- locks the surviving partitions reported in the prep estate
+ *	- appends the EState pointer for each PlannedStmt to cprep->prep_estates
+ *
+ * On release, it:
+ *	- looks up the EState for each PlannedStmt from cprep->prep_estates
+ *	  (which must already be populated)
+ *	- unlocks the same relations identified during acquire
+ *
+ * prep_estates is extended during acquire and must match stmt_list one-to-one
+ * when releasing locks.  Memory allocation for EState happens in
+ * cprep->context.  Locks are acquired using cprep->owner.
+ */
+static void
+AcquireExecutorLocksUnpruned(List *stmt_list, bool acquire,
+							 CachedPlanPrepData *cprep)
+{
+	MemoryContext oldcontext;
+	ListCell   *lc1;
+	List	   *prep_estates;
+	ListCell   *prep_lc;
+
+	Assert(cprep);
+	oldcontext = MemoryContextSwitchTo(cprep->context);
+	/*
+	 * When releasing locks, use the EState list (if any) created during
+	 * acquisition to determine which relids to unlock. The list must match
+	 * the PlannedStmt list one-to-one.
+	 */
+	prep_estates = cprep->prep_estates;
+	Assert(acquire || list_length(prep_estates) == list_length(stmt_list));
+
+	prep_lc = list_head(prep_estates);
+	foreach(lc1, stmt_list)
+	{
+		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
+		EState *prep_estate;
+
+		if (plannedstmt->commandType == CMD_UTILITY)
+		{
+			/* Same as AcquireExecutorLocksAll(). */
+			Query	   *query = UtilityContainsQuery(plannedstmt->utilityStmt);
+
+			if (query)
+				ScanQueryForLocks(query, acquire);
+
+			/* Keep the list one-to-one with stmt_list. */
+			if (acquire)
+				cprep->prep_estates = lappend(cprep->prep_estates, NULL);
+			else
+				(void) next_prep_estate(prep_estates, &prep_lc);
+			continue;
+		}
+
+		/*
+		 * Lock tables mentioned in the original query and other unprunable
+		 * relations that were added to the plan via inheritance expansion.
+		 */
+		LockRelids(plannedstmt->rtable, plannedstmt->unprunableRelids, acquire);
+
+		/* Lock partitions surviving runtime initial pruning. */
+		if (acquire)
+		{
+			/*
+			 * Pruning expressions may call PL functions that require an active
+			 * snapshot (e.g., via EnsurePortalSnapshotExists()). Establish one
+			 * if needed.
+			 */
+			bool		snap_pushed = false;
+
+			if (!ActiveSnapshotSet())
+			{
+				PushActiveSnapshot(GetTransactionSnapshot());
+				snap_pushed = true;
+			}
+
+			prep_estate = ExecutorPrep(plannedstmt, cprep->params,
+									   cprep->owner, cprep->eflags);
+			Assert(prep_estate);
+			cprep->prep_estates = lappend(cprep->prep_estates, prep_estate);
+
+			if (snap_pushed)
+				PopActiveSnapshot();
+		}
+		else
+			prep_estate = next_prep_estate(prep_estates, &prep_lc);
+
+		if (prep_estate)
+		{
+			/*
+			 * es_unpruned_relids includes plannedstmt->unprunableRelids,
+			 * which we've already locked. Filter them out to avoid double-locking.
+			 */
+			Bitmapset *lock_relids = bms_difference(prep_estate->es_unpruned_relids,
+													plannedstmt->unprunableRelids);
+
+			/*
+			 * We must always include the first result relation of each
+			 * ModifyTable node in the plan, that is, each one listed in
+			 * plannedstmt->firstResultRels, in the set of relations to be
+			 * locked, to satisfy the executor assumptions described in
+			 * ExecInitModifyTable().  This can be wasteful, because we
+			 * may not need to use the first result relation at all if other
+			 * result relations are unpruned and thus sufficient for the
+			 * ModifyTable node's needs.  Unfortunately, we don't have a
+			 * per-node unpruned_relids set to determine whether other result
+			 * relations are included.
+			 */
+			if (plannedstmt->resultRelations)
+			{
+				ListCell *lc2;
+
+				foreach(lc2, plannedstmt->firstResultRels)
+				{
+					Index       firstResultRel = lfirst_int(lc2);
+
+					if (!bms_is_member(firstResultRel, lock_relids))
+						lock_relids = bms_add_member(lock_relids, firstResultRel);
+				}
+			}
+
+			LockRelids(plannedstmt->rtable, lock_relids, acquire);
+			bms_free(lock_relids);
+		}
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+}
+
+/*
+ * CachedPlanPrepCleanup
+ *		Clean up the EStates built for a generic plan.
+ *
+ * This is used in the corner case where CheckCachedPlan() discovers
+ * that a CachedPlan has become invalid after AcquireExecutorLocksUnpruned()
+ * has already run.  In that case we must both release the execution locks
+ * and dispose of the prep_estates list stored in CachedPlanPrepData, since
+ * the executor will never see or clean it up.
+ */
+static void
+CachedPlanPrepCleanup(CachedPlanPrepData *cprep)
+{
+	ListCell   *lc;
+	ResourceOwner oldowner;
+
+	if (cprep == NULL)
+		return;
+
+	/* Switch to owner that ExecutorPrep() would have used. */
+	oldowner = CurrentResourceOwner;
+	CurrentResourceOwner = cprep->owner;
+	foreach(lc, cprep->prep_estates)
+	{
+		EState *prep_estate = (EState *) lfirst(lc);
+
+		if (prep_estate == NULL)
+			continue;
+
+		ExecCloseRangeTableRelations(prep_estate);
+		FreeExecutorState(prep_estate);
+	}
+	CurrentResourceOwner = oldowner;
+
+	list_free(cprep->prep_estates);
+	cprep->prep_estates = NIL;
+}
+
 /*
  * AcquirePlannerLocks: acquire locks needed for planning of a querytree list;
  * or release them if acquire is false.
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 27758ec16fe..4fd9d9bcc56 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -217,6 +217,9 @@ typedef struct PlannerGlobal
 	/* "flat" list of integer RT indexes */
 	List	   *resultRelations;
 
+	/* "flat" list of integer RT indexes (one per ModifyTable node) */
+	List	   *firstResultRels;
+
 	/* "flat" list of AppendRelInfos */
 	List	   *appendRelations;
 
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index b6185825fcb..55279cbbda8 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -121,6 +121,16 @@ typedef struct PlannedStmt
 	/* integer list of RT indexes, or NIL */
 	List	   *resultRelations;
 
+	/*
+	 * rtable indexes of first target relation in each ModifyTable node in the
+	 * plan for INSERT/UPDATE/DELETE/MERGE.  NIL if resultRelations is NIL.
+	 *
+	 * These are used by AcquireExecutorLocksUnpruned() to ensure that the
+	 * first result rel for each ModifyTable remains locked even if pruned;
+	 * see ExecInitModifyTable() for the executor side assumptions.
+	 */
+	List	   *firstResultRels;
+
 	/* list of AppendRelInfo nodes */
 	List	   *appendRelations;
 
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 7a4a85c8038..177150a5848 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -197,6 +197,38 @@ typedef struct CachedExpression
 } CachedExpression;
 
 
+/*
+ * CachedPlanPrepData
+ *      Carries ExecutorPrep results for each PlannedStmt in a CachedPlan,
+ *      along with context and owner information needed to allocate them.
+ *
+ * prep_estates is indexed one-to-one with CachedPlan->stmt_list, and is
+ * populated when GetCachedPlan() prepares a reused generic plan.  If the
+ * plan is found invalid after locking, the same list is used to determine
+ * which relations to unlock before retrying.
+ *
+ * ExecutorPrep state is allocated in 'context' and owned by 'owner'.
+ *
+ * eflags controls ExecutorPrep() behavior during initial pruning.
+ * Normally zero; set EXEC_FLAG_EXPLAIN_GENERIC to suppress pruning
+ * in EXPLAIN (GENERIC_PLAN).  Need not match the eflags later passed
+ * to ExecutorStart().
+ *
+ * prep_estates must reach ExecutorStart() to be adopted for execution.
+ * If the plan is invalidated before that happens, CachedPlanPrepCleanup()
+ * frees them instead.  The EStates are allocated in 'context' and their
+ * resources tracked under 'owner', which the caller sets to match the
+ * execution environment (e.g., portal context and resowner).
+ */
+typedef struct CachedPlanPrepData
+{
+	List   *prep_estates;	/* one EState per PlannedStmt, or NULL */
+	ParamListInfo params;	/* params visible to ExecutorPrep */
+	MemoryContext context;	/* where to allocate EState and its fields */
+	ResourceOwner owner;	/* ResourceOwner for ExecutorPrep state */
+	int		eflags;			/* executor flags to control ExecutorPrep */
+} CachedPlanPrepData;
+
 extern void InitPlanCache(void);
 extern void ResetPlanCache(void);
 
@@ -240,7 +272,8 @@ extern List *CachedPlanGetTargetList(CachedPlanSource *plansource,
 extern CachedPlan *GetCachedPlan(CachedPlanSource *plansource,
 								 ParamListInfo boundParams,
 								 ResourceOwner owner,
-								 QueryEnvironment *queryEnv);
+								 QueryEnvironment *queryEnv,
+								 CachedPlanPrepData *cprep);
 extern void ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner);
 
 extern bool CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
diff --git a/src/test/regress/expected/partition_prune.out b/src/test/regress/expected/partition_prune.out
index deacdd75807..8e0cc98baca 100644
--- a/src/test/regress/expected/partition_prune.out
+++ b/src/test/regress/expected/partition_prune.out
@@ -4824,3 +4824,148 @@ select min(a) over (partition by a order by a) from part_abc where a >= stable_o
 
 drop view part_abc_view;
 drop table part_abc;
+--
+-- Verify that pruning-aware locking skips pruned partitions
+-- when reusing a generic cached plan.
+--
+set plan_cache_mode to force_generic_plan;
+create table prunelock_p (a int) partition by list (a);
+create table prunelock_p1 partition of prunelock_p for values in (1);
+create table prunelock_p2 partition of prunelock_p for values in (2);
+create table prunelock_p3 partition of prunelock_p for values in (3);
+prepare prunelock_q (int) as select * from prunelock_p where a = $1;
+-- Force generic plan creation
+explain (costs off) execute prunelock_q(1);
+                  QUERY PLAN                  
+----------------------------------------------
+ Append
+   Subplans Removed: 2
+   ->  Seq Scan on prunelock_p1 prunelock_p_1
+         Filter: (a = $1)
+(4 rows)
+
+-- Execute and check which child partitions are locked
+begin;
+execute prunelock_q(1);
+ a 
+---
+(0 rows)
+
+select c.relname
+  from pg_locks l
+  join pg_class c on c.oid = l.relation
+ where l.pid = pg_backend_pid()
+   and c.relname like 'prunelock_p_'
+ order by c.relname;
+   relname    
+--------------
+ prunelock_p1
+(1 row)
+
+commit;
+deallocate prunelock_q;
+-- Turn pruning off
+set enable_partition_pruning to off;
+prepare prunelock_q (int) as select * from prunelock_p where a = $1;
+-- Force generic plan creation
+explain (costs off) execute prunelock_q(1);
+                  QUERY PLAN                  
+----------------------------------------------
+ Append
+   ->  Seq Scan on prunelock_p1 prunelock_p_1
+         Filter: (a = $1)
+   ->  Seq Scan on prunelock_p2 prunelock_p_2
+         Filter: (a = $1)
+   ->  Seq Scan on prunelock_p3 prunelock_p_3
+         Filter: (a = $1)
+(7 rows)
+
+-- Execute and check which child partitions are locked
+begin;
+execute prunelock_q(1);
+ a 
+---
+(0 rows)
+
+select c.relname
+  from pg_locks l
+  join pg_class c on c.oid = l.relation
+ where l.pid = pg_backend_pid()
+   and c.relname like 'prunelock_p_'
+ order by c.relname;
+   relname    
+--------------
+ prunelock_p1
+ prunelock_p2
+ prunelock_p3
+(3 rows)
+
+commit;
+deallocate prunelock_q;
+reset enable_partition_pruning;
+--
+-- Verify firstResultRels handling with multiple ModifyTable nodes
+-- (writable CTEs) targeting a partitioned table.  When a pruning
+-- parameter matches no partition, all result relations are pruned
+-- and the executor must still find a usable first result relation
+-- for each ModifyTable node.
+--
+prepare prunelock_mt_q (int, int) as
+  with upd1 as (update prunelock_p set a = a),
+       upd2 as (update prunelock_p set a = a where a = $2)
+  update prunelock_p set a = a where a = $1;
+-- Force generic plan creation
+explain (costs off) execute prunelock_mt_q(1, 2);
+                         QUERY PLAN                         
+------------------------------------------------------------
+ Update on prunelock_p
+   Update on prunelock_p1 prunelock_p_1
+   CTE upd1
+     ->  Update on prunelock_p prunelock_p_3
+           Update on prunelock_p1 prunelock_p_4
+           Update on prunelock_p2 prunelock_p_5
+           Update on prunelock_p3 prunelock_p_6
+           ->  Append
+                 ->  Seq Scan on prunelock_p1 prunelock_p_4
+                 ->  Seq Scan on prunelock_p2 prunelock_p_5
+                 ->  Seq Scan on prunelock_p3 prunelock_p_6
+   CTE upd2
+     ->  Update on prunelock_p prunelock_p_7
+           Update on prunelock_p2 prunelock_p_8
+           ->  Append
+                 Subplans Removed: 2
+                 ->  Seq Scan on prunelock_p2 prunelock_p_8
+                       Filter: (a = $2)
+   ->  Append
+         Subplans Removed: 2
+         ->  Seq Scan on prunelock_p1 prunelock_p_1
+               Filter: (a = $1)
+(22 rows)
+
+-- All partitions pruned: value 4 matches no partition, so each
+-- ModifyTable must still initialize correctly with no matching
+-- result relations.
+explain (costs off) execute prunelock_mt_q(4, 5);
+                         QUERY PLAN                         
+------------------------------------------------------------
+ Update on prunelock_p
+   CTE upd1
+     ->  Update on prunelock_p prunelock_p_2
+           Update on prunelock_p1 prunelock_p_3
+           Update on prunelock_p2 prunelock_p_4
+           Update on prunelock_p3 prunelock_p_5
+           ->  Append
+                 ->  Seq Scan on prunelock_p1 prunelock_p_3
+                 ->  Seq Scan on prunelock_p2 prunelock_p_4
+                 ->  Seq Scan on prunelock_p3 prunelock_p_5
+   CTE upd2
+     ->  Update on prunelock_p prunelock_p_6
+           ->  Append
+                 Subplans Removed: 3
+   ->  Append
+         Subplans Removed: 3
+(16 rows)
+
+deallocate prunelock_mt_q;
+drop table prunelock_p;
+reset plan_cache_mode;
diff --git a/src/test/regress/expected/plancache.out b/src/test/regress/expected/plancache.out
index 4e59188196c..1d69ab0a1c2 100644
--- a/src/test/regress/expected/plancache.out
+++ b/src/test/regress/expected/plancache.out
@@ -398,3 +398,65 @@ select name, generic_plans, custom_plans from pg_prepared_statements
 (1 row)
 
 drop table test_mode;
+-- Test invalidation of a generic plan during pruning-aware lock setup.
+-- The pruning expression uses a stable SQL function that calls a volatile
+-- plpgsql function.  That function performs DDL on a partition when a
+-- separate "signal" table says to do so.  The second EXECUTE should
+-- replan cleanly after the DDL.
+set plan_cache_mode to force_generic_plan;
+create table inval_during_pruning_p (a int) partition by list (a);
+create table inval_during_pruning_p1 partition of inval_during_pruning_p for values in (1);
+create table inval_during_pruning_p2 partition of inval_during_pruning_p for values in (2);
+insert into inval_during_pruning_p values (1), (2);
+create table inval_during_pruning_signal (create_idx bool not null);
+insert into inval_during_pruning_signal values (false);
+create or replace function invalidate_plancache_func() returns int
+as $$
+declare
+	create_index bool;
+begin
+	-- Perform DDL on a partition if asked to
+	select create_idx into create_index from inval_during_pruning_signal for update;
+	if create_index = true then
+		raise notice 'creating index on partition inval_during_pruning_p1';
+		create index on inval_during_pruning_p1 (a);
+		update inval_during_pruning_signal set create_idx = false;
+	end if;
+	-- pruning parameter
+	return 1;
+end;
+$$ language plpgsql volatile;
+create or replace function stable_pruning_val() returns int as $$
+	select invalidate_plancache_func();
+$$ language sql stable;
+prepare inval_during_pruning_q as select * from inval_during_pruning_p where a = stable_pruning_val();
+-- Build a generic plan and run pruning once, but don't set the signal
+-- for invalidate_plancache_func() to perform the DDL.
+explain (verbose, costs off) execute inval_during_pruning_q;
+                                QUERY PLAN                                 
+---------------------------------------------------------------------------
+ Append
+   Subplans Removed: 1
+   ->  Seq Scan on public.inval_during_pruning_p1 inval_during_pruning_p_1
+         Output: inval_during_pruning_p_1.a
+         Filter: (inval_during_pruning_p_1.a = stable_pruning_val())
+(5 rows)
+
+-- Reuse the generic plan.  Make invalidate_plancache_func() perform DDL
+-- during this execution, which should force replanning without errors.
+update inval_during_pruning_signal set create_idx = true;
+explain (verbose, costs off) execute inval_during_pruning_q;
+NOTICE:  creating index on partition inval_during_pruning_p1
+                                QUERY PLAN                                 
+---------------------------------------------------------------------------
+ Append
+   Subplans Removed: 1
+   ->  Seq Scan on public.inval_during_pruning_p1 inval_during_pruning_p_1
+         Output: inval_during_pruning_p_1.a
+         Filter: (inval_during_pruning_p_1.a = stable_pruning_val())
+(5 rows)
+
+drop table inval_during_pruning_p, inval_during_pruning_signal;
+drop function invalidate_plancache_func, stable_pruning_val;
+deallocate inval_during_pruning_q;
+reset plan_cache_mode;
diff --git a/src/test/regress/sql/partition_prune.sql b/src/test/regress/sql/partition_prune.sql
index d93c0c03bab..804dd3c8f4e 100644
--- a/src/test/regress/sql/partition_prune.sql
+++ b/src/test/regress/sql/partition_prune.sql
@@ -1447,3 +1447,80 @@ select min(a) over (partition by a order by a) from part_abc where a >= stable_o
 
 drop view part_abc_view;
 drop table part_abc;
+
+--
+-- Verify that pruning-aware locking skips pruned partitions
+-- when reusing a generic cached plan.
+--
+set plan_cache_mode to force_generic_plan;
+
+create table prunelock_p (a int) partition by list (a);
+create table prunelock_p1 partition of prunelock_p for values in (1);
+create table prunelock_p2 partition of prunelock_p for values in (2);
+create table prunelock_p3 partition of prunelock_p for values in (3);
+
+prepare prunelock_q (int) as select * from prunelock_p where a = $1;
+
+-- Force generic plan creation
+explain (costs off) execute prunelock_q(1);
+
+-- Execute and check which child partitions are locked
+begin;
+execute prunelock_q(1);
+
+select c.relname
+  from pg_locks l
+  join pg_class c on c.oid = l.relation
+ where l.pid = pg_backend_pid()
+   and c.relname like 'prunelock_p_'
+ order by c.relname;
+commit;
+
+deallocate prunelock_q;
+
+-- Turn pruning off
+set enable_partition_pruning to off;
+
+prepare prunelock_q (int) as select * from prunelock_p where a = $1;
+
+-- Force generic plan creation
+explain (costs off) execute prunelock_q(1);
+
+-- Execute and check which child partitions are locked
+begin;
+execute prunelock_q(1);
+
+select c.relname
+  from pg_locks l
+  join pg_class c on c.oid = l.relation
+ where l.pid = pg_backend_pid()
+   and c.relname like 'prunelock_p_'
+ order by c.relname;
+commit;
+
+deallocate prunelock_q;
+reset enable_partition_pruning;
+
+--
+-- Verify firstResultRels handling with multiple ModifyTable nodes
+-- (writable CTEs) targeting a partitioned table.  When a pruning
+-- parameter matches no partition, all result relations are pruned
+-- and the executor must still find a usable first result relation
+-- for each ModifyTable node.
+--
+prepare prunelock_mt_q (int, int) as
+  with upd1 as (update prunelock_p set a = a),
+       upd2 as (update prunelock_p set a = a where a = $2)
+  update prunelock_p set a = a where a = $1;
+
+-- Force generic plan creation
+explain (costs off) execute prunelock_mt_q(1, 2);
+
+-- All partitions pruned: value 4 matches no partition, so each
+-- ModifyTable must still initialize correctly with no matching
+-- result relations.
+explain (costs off) execute prunelock_mt_q(4, 5);
+
+deallocate prunelock_mt_q;
+drop table prunelock_p;
+reset plan_cache_mode;
diff --git a/src/test/regress/sql/plancache.sql b/src/test/regress/sql/plancache.sql
index 4b2f11dcc64..139b4688fd6 100644
--- a/src/test/regress/sql/plancache.sql
+++ b/src/test/regress/sql/plancache.sql
@@ -223,3 +223,54 @@ select name, generic_plans, custom_plans from pg_prepared_statements
   where  name = 'test_mode_pp';
 
 drop table test_mode;
+
+-- Test invalidation of a generic plan during pruning-aware lock setup.
+-- The pruning expression uses a stable SQL function that calls a volatile
+-- plpgsql function.  That function performs DDL on a partition when a
+-- separate "signal" table says to do so.  The second EXECUTE should
+-- replan cleanly after the DDL.
+set plan_cache_mode to force_generic_plan;
+create table inval_during_pruning_p (a int) partition by list (a);
+create table inval_during_pruning_p1 partition of inval_during_pruning_p for values in (1);
+create table inval_during_pruning_p2 partition of inval_during_pruning_p for values in (2);
+insert into inval_during_pruning_p values (1), (2);
+
+create table inval_during_pruning_signal (create_idx bool not null);
+insert into inval_during_pruning_signal values (false);
+create or replace function invalidate_plancache_func() returns int
+as $$
+declare
+	create_index bool;
+begin
+	-- Perform DDL on a partition if asked to
+	select create_idx into create_index from inval_during_pruning_signal for update;
+	if create_index = true then
+		raise notice 'creating index on partition inval_during_pruning_p1';
+		create index on inval_during_pruning_p1 (a);
+		update inval_during_pruning_signal set create_idx = false;
+	end if;
+	-- pruning parameter
+	return 1;
+end;
+$$ language plpgsql volatile;
+
+create or replace function stable_pruning_val() returns int as $$
+	select invalidate_plancache_func();
+$$ language sql stable;
+
+prepare inval_during_pruning_q as select * from inval_during_pruning_p where a = stable_pruning_val();
+
+-- Build a generic plan and run pruning once, but don't set the signal
+-- for invalidate_plancache_func() to perform the DDL.
+explain (verbose, costs off) execute inval_during_pruning_q;
+
+-- Reuse the generic plan.  Make invalidate_plancache_func() perform DDL
+-- during this execution, which should force replanning without errors.
+update inval_during_pruning_signal set create_idx = true;
+explain (verbose, costs off) execute inval_during_pruning_q;
+
+drop table inval_during_pruning_p, inval_during_pruning_signal;
+drop function invalidate_plancache_func, stable_pruning_val;
+deallocate inval_during_pruning_q;
+
+reset plan_cache_mode;
-- 
2.47.3
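
For readers skimming the patch above, the lock-set rule applied just before
LockRelids() can be looked at in isolation.  What follows is only a
standalone sketch, not code from the patch: a 64-bit mask stands in for
Bitmapset (so RT indexes must stay below 64), and the names relid_set,
set_add, set_has and build_lock_set are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for Bitmapset: bit i set means RT index i is a member. */
typedef uint64_t relid_set;

static relid_set
set_add(relid_set s, int rti)
{
	return s | (UINT64_C(1) << rti);
}

static int
set_has(relid_set s, int rti)
{
	return (int) ((s >> rti) & 1);
}

/*
 * Start from the relations that survived initial pruning, then force in the
 * first result relation of every ModifyTable node, pruned or not.
 */
static relid_set
build_lock_set(relid_set unpruned, const int *first_result_rels, size_t n)
{
	relid_set	lock_relids = unpruned;

	for (size_t i = 0; i < n; i++)
		lock_relids = set_add(lock_relids, first_result_rels[i]);
	return lock_relids;
}
```

Adding a member that is already present is a no-op in this sketch, so no
membership test is needed before the add.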



  [application/octet-stream] v9-0001-Refactor-executor-s-initial-partition-pruning-set.patch (7.3K, 5-v9-0001-Refactor-executor-s-initial-partition-pruning-set.patch)
From 6b2a9740b49a5238569cfeeb11fa632225ec2cfb Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Wed, 25 Mar 2026 16:06:38 +0900
Subject: [PATCH v9 1/5] Refactor executor's initial partition pruning setup

Simplify handling of unpruned relids by moving responsibility
for recording them in EState into CreatePartitionPruneState(),
avoiding the need to pass all_leafpart_rtis as an out parameter.

Also move the setting of ecxt_param_exec_vals from
CreatePartitionPruneState() to InitExecPartitionPruneContexts(),
so that the former can be called before PARAM_EXEC parameters are
set up.  A later commit needs this when running pruning state setup
outside of InitPlan().

No behavioral change.
---
 src/backend/executor/execPartition.c | 70 +++++++++++++++++++---------
 1 file changed, 48 insertions(+), 22 deletions(-)
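
The change in who records the unpruned RT indexes can be pictured with a
small standalone sketch.  This is an illustration under assumptions, not
the patch's code: PruneSketchState and sketch_create_prune_state are
hypothetical names, a 64-bit mask replaces Bitmapset, and an RT index of 0
is treated as "no entry", mirroring the if (rtindex) check in the diff
below.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the EState field the patch writes to. */
typedef struct PruneSketchState
{
	uint64_t	unpruned_relids;	/* bit i set means RT index i is unpruned */
} PruneSketchState;

/*
 * Plays the role of CreatePartitionPruneState(): when initial pruning steps
 * exist but are being skipped, record every leaf partition directly in the
 * state instead of handing the set back through an out parameter.
 */
static void
sketch_create_prune_state(PruneSketchState *state,
						  const int *leaf_rtis, size_t nleaf,
						  int has_initial_steps, int do_initial_prune)
{
	if (has_initial_steps && !do_initial_prune)
	{
		for (size_t i = 0; i < nleaf; i++)
		{
			if (leaf_rtis[i] != 0)	/* skip slots with no RT index */
				state->unpruned_relids |= UINT64_C(1) << leaf_rtis[i];
		}
	}
}
```

When pruning does run, this function leaves the set alone and the caller
adds only the survivors, which is the job ExecDoInitialPruning() keeps.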

diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index d96d4f9947b..2a3af006f77 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -185,8 +185,7 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
 static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
 static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
 static PartitionPruneState *CreatePartitionPruneState(EState *estate,
-													  PartitionPruneInfo *pruneinfo,
-													  Bitmapset **all_leafpart_rtis);
+													  PartitionPruneInfo *pruneinfo);
 static void InitPartitionPruneContext(PartitionPruneContext *context,
 									  List *pruning_steps,
 									  PartitionDesc partdesc,
@@ -1978,7 +1977,7 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  * estate->es_part_prune_infos. For each entry, it creates a PartitionPruneState
  * and adds it to es_part_prune_states.  ExecInitPartitionExecPruning() accesses
  * these states through their corresponding indexes in es_part_prune_states and
- * assign each state to the parent node's PlanState, from where it will be used
+ * assigns each state to the parent node's PlanState, from where it will be used
  * for "exec" pruning.
  *
  * If initial pruning steps exist for a PartitionPruneInfo entry, this function
@@ -1996,29 +1995,31 @@ ExecDoInitialPruning(EState *estate)
 {
 	ListCell   *lc;
 
+	Assert(estate->es_part_prune_results == NULL);
 	foreach(lc, estate->es_part_prune_infos)
 	{
 		PartitionPruneInfo *pruneinfo = lfirst_node(PartitionPruneInfo, lc);
 		PartitionPruneState *prunestate;
 		Bitmapset  *validsubplans = NULL;
-		Bitmapset  *all_leafpart_rtis = NULL;
 		Bitmapset  *validsubplan_rtis = NULL;
 
 		/* Create and save the PartitionPruneState. */
-		prunestate = CreatePartitionPruneState(estate, pruneinfo,
-											   &all_leafpart_rtis);
+		prunestate = CreatePartitionPruneState(estate, pruneinfo);
 		estate->es_part_prune_states = lappend(estate->es_part_prune_states,
 											   prunestate);
 
 		/*
 		 * Perform initial pruning steps, if any, and save the result
-		 * bitmapset or NULL as described in the header comment.
+		 * bitmapset or NULL as described in the header comment.  RT indexes
+		 * of surviving partitions are added to validsubplan_rtis.
+		 *
+		 * Note that when do_initial_prune is false,
+		 * CreatePartitionPruneState() will have already added the RT indexes
+		 * of all leaf partitions to es_unpruned_relids directly.
 		 */
 		if (prunestate->do_initial_prune)
 			validsubplans = ExecFindMatchingSubPlans(prunestate, true,
 													 &validsubplan_rtis);
-		else
-			validsubplan_rtis = all_leafpart_rtis;
 
 		estate->es_unpruned_relids = bms_add_members(estate->es_unpruned_relids,
 													 validsubplan_rtis);
@@ -2136,14 +2137,12 @@ ExecInitPartitionExecPruning(PlanState *planstate,
  * parent plan node's PlanState.
  *
  * If initial pruning steps are to be skipped (e.g., during EXPLAIN
- * (GENERIC_PLAN)), *all_leafpart_rtis will be populated with the RT indexes of
- * all leaf partitions whose scanning subnode is included in the parent plan
- * node's list of child plans. The caller must add these RT indexes to
- * estate->es_unpruned_relids.
+ * (GENERIC_PLAN)), the RT indexes of all leaf partitions whose scanning
+ * subnode is included in the parent plan node's list of child plans are
+ * added to estate->es_unpruned_relids.
  */
 static PartitionPruneState *
-CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo,
-						  Bitmapset **all_leafpart_rtis)
+CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo)
 {
 	PartitionPruneState *prunestate;
 	int			n_part_hierarchies;
@@ -2377,8 +2376,8 @@ CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo,
 													   pinfo->execparamids);
 
 			/*
-			 * Return all leaf partition indexes if we're skipping pruning in
-			 * the EXPLAIN (GENERIC_PLAN) case.
+			 * Add all leaf partition indexes to es_unpruned_relids if we're
+			 * skipping pruning in the EXPLAIN (GENERIC_PLAN) case.
 			 */
 			if (pinfo->initial_pruning_steps && !prunestate->do_initial_prune)
 			{
@@ -2390,9 +2389,28 @@ CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo,
 					Index		rtindex = pprune->leafpart_rti_map[part_index];
 
 					if (rtindex)
-						*all_leafpart_rtis = bms_add_member(*all_leafpart_rtis,
-															rtindex);
+						estate->es_unpruned_relids =
+							bms_add_member(estate->es_unpruned_relids, rtindex);
+				}
+			}
+			else if (pinfo->initial_pruning_steps == NIL)
+			{
+				/*
+				 * All partitions must already be present in es_unpruned_relids
+				 * when none are initially prunable.
+				 */
+#ifdef USE_ASSERT_CHECKING
+				int			part_index = -1;
+
+				while ((part_index = bms_next_member(pprune->present_parts,
+													 part_index)) >= 0)
+				{
+					Index		rtindex = pprune->leafpart_rti_map[part_index];
+
+					if (rtindex)
+						Assert(bms_is_member(rtindex, estate->es_unpruned_relids));
 				}
+#endif
 			}
 
 			j++;
@@ -2490,9 +2508,10 @@ InitPartitionPruneContext(PartitionPruneContext *context,
  *		Initialize exec pruning contexts deferred by CreatePartitionPruneState()
  *
  * This function finalizes exec pruning setup for a PartitionPruneState by
- * initializing contexts for pruning steps that require the parent plan's
- * PlanState. It iterates over PartitionPruningData entries and sets up the
- * necessary execution contexts for pruning during query execution.
+ * initializing contexts for pruning steps that require PARAM_EXEC parameters
+ * and the parent plan's PlanState. It iterates over PartitionPruningData
+ * entries and sets up the necessary execution contexts for pruning during
+ * query execution.
  *
  * Also fix the mapping of partition indexes to subplan indexes contained in
  * prunestate by considering the new list of subplans that survived initial
@@ -2520,9 +2539,16 @@ InitExecPartitionPruneContexts(PartitionPruneState *prunestate,
 	bool		fix_subplan_map = false;
 
 	Assert(prunestate->do_exec_prune);
+	Assert(prunestate->econtext);
 	Assert(parent_plan != NULL);
 	estate = parent_plan->state;
 
+	/*
+	 * PARAM_EXEC values might not have been set up yet when
+	 * CreatePartitionPruneState() is called.
+	 */
+	prunestate->econtext->ecxt_param_exec_vals = estate->es_param_exec_vals;
+
 	/*
 	 * No need to fix subplans maps if initial pruning didn't eliminate any
 	 * subplans.
-- 
2.47.3



  [application/octet-stream] v9-0002-Introduce-ExecutorPrep-and-refactor-executor-star.patch (25.5K, 6-v9-0002-Introduce-ExecutorPrep-and-refactor-executor-star.patch)
From 32267b58bdf9db56a716abde9fcc3e4e8fac6fee Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Wed, 25 Mar 2026 16:07:18 +0900
Subject: [PATCH v9 2/5] Introduce ExecutorPrep and refactor executor startup

Factor permission checks, range table initialization, and initial
partition pruning out of InitPlan() into a new ExecutorPrep()
helper.  ExecutorStart() calls it to build the EState, keeping
behavior unchanged.

If QueryDesc->estate is already set when ExecutorStart() is called,
the existing EState is reused and ExecutorPrep() is skipped.  This
allows a later commit to supply a pre-built EState from outside
the executor.

Add scaffolding for carrying an optional prep EState through
CreateQueryDesc, PortalDefineQuery, and SPI.  All callers currently
pass NULL/NIL; the next commit populates these to enable
pruning-aware locking in cached plans.

In assert builds, verify that the expected relation locks are held
when entering ExecutorStart().
---
 src/backend/commands/copyto.c       |   2 +-
 src/backend/commands/createas.c     |   2 +-
 src/backend/commands/explain.c      |   8 +-
 src/backend/commands/extension.c    |   2 +-
 src/backend/commands/matview.c      |   2 +-
 src/backend/commands/portalcmds.c   |   1 +
 src/backend/commands/prepare.c      |   9 +-
 src/backend/executor/README         |  11 +-
 src/backend/executor/execMain.c     | 157 +++++++++++++++++++++++-----
 src/backend/executor/execParallel.c |   3 +-
 src/backend/executor/functions.c    |   3 +-
 src/backend/executor/spi.c          |   9 +-
 src/backend/tcop/postgres.c         |   2 +
 src/backend/tcop/pquery.c           |  24 ++++-
 src/backend/utils/mmgr/portalmem.c  |   2 +
 src/include/commands/explain.h      |   3 +-
 src/include/executor/execdesc.h     |   5 +-
 src/include/executor/executor.h     |  26 +++++
 src/include/utils/portal.h          |   2 +
 19 files changed, 223 insertions(+), 50 deletions(-)
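
The reuse rule described in the commit message (skip ExecutorPrep() when
QueryDesc->estate is already set) can be sketched on its own.  The code
below is a heavily reduced illustration, not the patch's implementation:
SketchEState, SketchQueryDesc, sketch_prep and sketch_start are
hypothetical stand-ins for EState, QueryDesc, ExecutorPrep() and
ExecutorStart().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, heavily reduced stand-ins for EState and QueryDesc. */
typedef struct SketchEState
{
	int			prep_calls;		/* how many times prep built this state */
} SketchEState;

typedef struct SketchQueryDesc
{
	SketchEState *estate;		/* NULL until prep has run */
} SketchQueryDesc;

/* Plays the role of ExecutorPrep(): builds a fresh state. */
static SketchEState *
sketch_prep(void)
{
	SketchEState *estate = calloc(1, sizeof(SketchEState));

	if (estate != NULL)
		estate->prep_calls = 1;
	return estate;
}

/* Plays the role of ExecutorStart(): reuse a supplied state, else build one. */
static void
sketch_start(SketchQueryDesc *qd)
{
	if (qd->estate == NULL)
		qd->estate = sketch_prep();
	/* ... plan-tree initialization would follow here ... */
}
```

A caller that ran the prep step early passes its state in through the
descriptor, and the start step then leaves that pointer untouched.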

diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index faf62d959b4..b9bd5ba7078 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -1011,7 +1011,7 @@ BeginCopyTo(ParseState *pstate,
 		cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
-											dest, NULL, NULL, 0);
+											dest, NULL, NULL, 0, NULL);
 
 		/*
 		 * Call ExecutorStart to prepare the plan for execution.
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 270e9bf3110..b4a9808955a 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -336,7 +336,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
 		/* Create a QueryDesc, redirecting output to our tuple receiver */
 		queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
 									GetActiveSnapshot(), InvalidSnapshot,
-									dest, params, queryEnv, 0);
+									dest, params, queryEnv, 0, NULL);
 
 		/* call ExecutorStart to prepare the plan for execution */
 		ExecutorStart(queryDesc, GetIntoRelEFlags(into));
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index e4b70166b0e..24c0c235fd3 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -372,7 +372,7 @@ standard_ExplainOneQuery(Query *query, int cursorOptions,
 	}
 
 	/* run it (if needed) and produce output */
-	ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+	ExplainOnePlan(plan, NULL, into, es, queryString, params, queryEnv,
 				   &planduration, (es->buffers ? &bufusage : NULL),
 				   es->memory ? &mem_counters : NULL);
 }
@@ -494,7 +494,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
  * to call it.
  */
 void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, EState *prep_estate,
+			   IntoClause *into, ExplainState *es,
 			   const char *queryString, ParamListInfo params,
 			   QueryEnvironment *queryEnv, const instr_time *planduration,
 			   const BufferUsage *bufusage,
@@ -552,7 +553,8 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
 	/* Create a QueryDesc for the query */
 	queryDesc = CreateQueryDesc(plannedstmt, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
-								dest, params, queryEnv, instrument_option);
+								dest, params, queryEnv, instrument_option,
+								prep_estate);
 
 	/* Select execution options */
 	if (es->analyze)
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index b98801d08f2..939e7a632f0 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -1174,7 +1174,7 @@ execute_sql_string(const char *sql, const char *filename)
 				qdesc = CreateQueryDesc(stmt,
 										sql,
 										GetActiveSnapshot(), NULL,
-										dest, NULL, NULL, 0);
+										dest, NULL, NULL, 0, NULL);
 
 				ExecutorStart(qdesc, 0);
 				ExecutorRun(qdesc, ForwardScanDirection, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 81a55a33ef2..2cdfdcf984b 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -439,7 +439,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
 	/* Create a QueryDesc, redirecting output to our tuple receiver */
 	queryDesc = CreateQueryDesc(plan, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
-								dest, NULL, NULL, 0);
+								dest, NULL, NULL, 0, NULL);
 
 	/* call ExecutorStart to prepare the plan for execution */
 	ExecutorStart(queryDesc, 0);
diff --git a/src/backend/commands/portalcmds.c b/src/backend/commands/portalcmds.c
index 01efac3319e..1e880a6d7c9 100644
--- a/src/backend/commands/portalcmds.c
+++ b/src/backend/commands/portalcmds.c
@@ -118,6 +118,7 @@ PerformCursorOpen(ParseState *pstate, DeclareCursorStmt *cstmt, ParamListInfo pa
 					  queryString,
 					  CMDTAG_SELECT,	/* cursor's query is always a SELECT */
 					  list_make1(plan),
+					  NIL,
 					  NULL);
 
 	/*----------
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 876aad2100a..c7bab14b633 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -207,6 +207,7 @@ ExecuteQuery(ParseState *pstate,
 					  query_string,
 					  entry->plansource->commandTag,
 					  plan_list,
+					  NIL,
 					  cplan);
 
 	/*
@@ -577,7 +578,9 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	const char *query_string;
 	CachedPlan *cplan;
 	List	   *plan_list;
+	List	   *prep_estates;
 	ListCell   *p;
+	ListCell   *prep_lc;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
 	instr_time	planstart;
@@ -652,14 +655,18 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	}
 
 	plan_list = cplan->stmt_list;
+	prep_estates = NIL;
 
 	/* Explain each query */
+	prep_lc = list_head(prep_estates);
 	foreach(p, plan_list)
 	{
 		PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
+		EState *prep_estate = next_prep_estate(prep_estates, &prep_lc);
 
 		if (pstmt->commandType != CMD_UTILITY)
-			ExplainOnePlan(pstmt, into, es, query_string, paramLI, pstate->p_queryEnv,
+			ExplainOnePlan(pstmt, prep_estate,
+						   into, es, query_string, paramLI, pstate->p_queryEnv,
 						   &planduration, (es->buffers ? &bufusage : NULL),
 						   es->memory ? &mem_counters : NULL);
 		else
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 54f4782f31b..d749ceb6687 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -291,11 +291,18 @@ Query Processing Control Flow
 
 This is a sketch of control flow for full query processing:
 
+    ExecutorPrep
+		May be run before ExecutorStart (e.g., for plan validation), or
+		implicitly from ExecutorStart if not done earlier.  Creates EState,
+		performs range table initialization, permission checks, and initial
+		partition pruning.  Returns the EState that ExecutorStart() should
+		reuse.
+
 	CreateQueryDesc
 
 	ExecutorStart
-		CreateExecutorState
-			creates per-query context
+		ExecutorPrep (if not already done, indicated by NULL QueryDesc.estate)
+			creates EState and per-query context
 		switch to per-query context to run ExecInitNode
 		AfterTriggerBeginQuery
 		ExecInitNode --- recursively scans plan tree
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 58b84955c2b..282c9871de0 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -57,6 +57,7 @@
 #include "parser/parse_relation.h"
 #include "pgstat.h"
 #include "rewrite/rewriteHandler.h"
+#include "storage/lmgr.h"
 #include "tcop/utility.h"
 #include "utils/acl.h"
 #include "utils/backend_status.h"
@@ -147,7 +148,6 @@ standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
 
 	/* sanity checks: queryDesc must not be started already */
 	Assert(queryDesc != NULL);
-	Assert(queryDesc->estate == NULL);
 
 	/* caller must ensure the query's snapshot is active */
 	Assert(GetActiveSnapshot() == queryDesc->snapshot);
@@ -173,9 +173,70 @@ standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
 
 	/*
 	 * Build EState, switch into per-query memory context for startup.
-	 */
-	estate = CreateExecutorState();
-	queryDesc->estate = estate;
+	 *
+	 * If ExecutorPrep() ran earlier (e.g., to do initial pruning during plan
+	 * validity checking), reuse its EState to avoid redoing range table setup
+	 * and pruning. Otherwise, create a fresh EState as usual.
+	 *
+	 * In assert builds, verify that the expected locks are held.  When
+	 * no prep EState was provided, AcquireExecutorLocks() should have
+	 * locked every relation in the plan.  When one was provided,
+	 * pruning-aware locking should have locked at least the unpruned
+	 * relations.  Both checks are skipped in parallel workers, which
+	 * acquire relation locks lazily in ExecGetRangeTableRelation().
+	 */
+	if (queryDesc->estate == NULL)
+	{
+#ifdef USE_ASSERT_CHECKING
+		if (!IsParallelWorker())
+		{
+			ListCell   *lc;
+
+			foreach(lc, queryDesc->plannedstmt->rtable)
+			{
+				RangeTblEntry *rte = lfirst_node(RangeTblEntry, lc);
+
+				if (rte->rtekind == RTE_RELATION ||
+					(rte->rtekind == RTE_SUBQUERY && rte->relid != InvalidOid))
+					Assert(CheckRelationOidLockedByMe(rte->relid,
+													  rte->rellockmode,
+													  true));
+			}
+		}
+#endif
+		queryDesc->estate = ExecutorPrep(queryDesc->plannedstmt,
+										 queryDesc->params,
+										 CurrentResourceOwner,
+										 eflags);
+	}
+#ifdef USE_ASSERT_CHECKING
+	else
+	{
+		/*
+		 * A prep EState was provided, meaning pruning-aware locking
+		 * should have locked at least the unpruned relations.
+		 */
+		if (!IsParallelWorker())
+		{
+			int		rtindex = -1;
+
+			while ((rtindex = bms_next_member(queryDesc->estate->es_unpruned_relids,
+											  rtindex)) >= 0)
+			{
+				RangeTblEntry *rte = exec_rt_fetch(rtindex, queryDesc->estate);
+
+				Assert(rte->rtekind == RTE_RELATION ||
+					   (rte->rtekind == RTE_SUBQUERY &&
+						rte->relid != InvalidOid));
+				Assert(CheckRelationOidLockedByMe(rte->relid,
+												  rte->rellockmode, true));
+			}
+		}
+	}
+#endif
+
+	estate = queryDesc->estate;
+	Assert(estate);
 
 	oldcontext = MemoryContextSwitchTo(estate->es_query_cxt);
 
@@ -265,6 +326,67 @@ standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
 	MemoryContextSwitchTo(oldcontext);
 }
 
+/*
+ * ExecutorPrep: prepare executor state for a PlannedStmt outside ExecutorStart.
+ *
+ * Performs range table initialization, permission checks, and initial
+ * partition pruning if partPruneInfos are present.
+ *
+ * Returns an EState that the caller must either pass to ExecutorStart()
+ * for reuse or free via FreeExecutorState() if execution will not proceed.
+ */
+EState *
+ExecutorPrep(PlannedStmt *pstmt, ParamListInfo params, ResourceOwner owner,
+			 int eflags)
+{
+	ResourceOwner oldowner;
+	EState *estate;
+
+	if (pstmt->commandType == CMD_UTILITY)
+		return NULL;
+
+	/* Caller must have established an active snapshot. */
+	Assert(ActiveSnapshotSet());
+
+	estate = CreateExecutorState();
+	estate->es_plannedstmt = pstmt;
+	estate->es_part_prune_infos = pstmt->partPruneInfos;
+	estate->es_param_list_info = params;
+	estate->es_top_eflags = eflags;
+
+	/*
+	 * Do permissions checks.
+	 */
+	ExecCheckPermissions(pstmt->rtable, pstmt->permInfos, true);
+
+	/*
+	 * Initialize range table.
+	 */
+	ExecInitRangeTable(estate, pstmt->rtable, pstmt->permInfos,
+					   bms_copy(pstmt->unprunableRelids));
+
+	/*
+	 * Ensure locks taken during initial pruning are tracked under the given
+	 * ResourceOwner (e.g., one associated with CachedPlan validation).
+	 */
+	oldowner = CurrentResourceOwner;
+	CurrentResourceOwner = owner;
+
+	/*
+	 * Set up PartitionPruneState structures and perform initial partition
+	 * pruning to compute the subset of child subplans that will be
+	 * executed.  The results, which are bitmapsets of selected child
+	 * indexes, are saved in es_part_prune_results, parallel to
+	 * es_part_prune_infos.  RT indexes of surviving partitions are
+	 * added to es_unpruned_relids.
+	 */
+	ExecDoInitialPruning(estate);
+
+	CurrentResourceOwner = oldowner;
+
+	return estate;
+}
+
 /* ----------------------------------------------------------------
  *		ExecutorRun
  *
@@ -840,37 +962,14 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 	CmdType		operation = queryDesc->operation;
 	PlannedStmt *plannedstmt = queryDesc->plannedstmt;
 	Plan	   *plan = plannedstmt->planTree;
-	List	   *rangeTable = plannedstmt->rtable;
 	EState	   *estate = queryDesc->estate;
 	PlanState  *planstate;
 	TupleDesc	tupType;
 	ListCell   *l;
 	int			i;
 
-	/*
-	 * Do permissions checks
-	 */
-	ExecCheckPermissions(rangeTable, plannedstmt->permInfos, true);
-
-	/*
-	 * initialize the node's execution state
-	 */
-	ExecInitRangeTable(estate, rangeTable, plannedstmt->permInfos,
-					   bms_copy(plannedstmt->unprunableRelids));
-
-	estate->es_plannedstmt = plannedstmt;
-	estate->es_part_prune_infos = plannedstmt->partPruneInfos;
-
-	/*
-	 * Perform runtime "initial" pruning to identify which child subplans,
-	 * corresponding to the children of plan nodes that contain
-	 * PartitionPruneInfo such as Append, will not be executed. The results,
-	 * which are bitmapsets of indexes of the child subplans that will be
-	 * executed, are saved in es_part_prune_results.  These results correspond
-	 * to each PartitionPruneInfo entry, and the es_part_prune_results list is
-	 * parallel to es_part_prune_infos.
-	 */
-	ExecDoInitialPruning(estate);
+	/* ExecutorPrep() must have been done. */
+	Assert(queryDesc->estate);
 
 	/*
 	 * Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index ac84af294c9..024780d3516 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -1300,7 +1300,8 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 	return CreateQueryDesc(pstmt,
 						   queryString,
 						   GetActiveSnapshot(), InvalidSnapshot,
-						   receiver, paramLI, NULL, instrument_options);
+						   receiver, paramLI, NULL, instrument_options,
+						   NULL);
 }
 
 /*
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index 88109348817..952a784c924 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -1369,7 +1369,8 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
 							 dest,
 							 fcache->paramLI,
 							 es->qd ? es->qd->queryEnv : NULL,
-							 0);
+							 0,
+							 NULL);
 
 	/* Utility commands don't need Executor. */
 	if (es->qd->operation != CMD_UTILITY)
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 52f3b11301c..380bbc44e97 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1686,6 +1686,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 					  query_string,
 					  plansource->commandTag,
 					  stmt_list,
+					  NIL,
 					  cplan);
 
 	/*
@@ -2500,6 +2501,8 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 		CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
 		List	   *stmt_list;
 		ListCell   *lc2;
+		List	   *prep_estates;
+		ListCell   *prep_lc;
 
 		spicallbackarg.query = plansource->query_string;
 
@@ -2578,6 +2581,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 							  plan_owner, _SPI_current->queryEnv);
 
 		stmt_list = cplan->stmt_list;
+		prep_estates = NIL;
 
 		/*
 		 * If we weren't given a specific snapshot to use, and the statement
@@ -2615,9 +2619,11 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 			}
 		}
 
+		prep_lc = list_head(prep_estates);
 		foreach(lc2, stmt_list)
 		{
 			PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
+			EState *prep_estate = next_prep_estate(prep_estates, &prep_lc);
 			bool		canSetTag = stmt->canSetTag;
 			DestReceiver *dest;
 
@@ -2695,7 +2701,8 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 										dest,
 										options->params,
 										_SPI_current->queryEnv,
-										0);
+										0,
+										prep_estate);
 				res = _SPI_pquery(qdesc, fire_triggers,
 								  canSetTag ? options->tcount : 0);
 				FreeQueryDesc(qdesc);
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index b3563113219..355a490cde9 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1231,6 +1231,7 @@ exec_simple_query(const char *query_string)
 						  query_string,
 						  commandTag,
 						  plantree_list,
+						  NIL,
 						  NULL);
 
 		/*
@@ -2030,6 +2031,7 @@ exec_bind_message(StringInfo input_message)
 					  query_string,
 					  psrc->commandTag,
 					  cplan->stmt_list,
+					  NIL,
 					  cplan);
 
 	/* Portal is defined, set the plan ID based on its contents. */
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index d8fc75d0bb9..b18266487bb 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -37,6 +37,7 @@ Portal		ActivePortal = NULL;
 
 
 static void ProcessQuery(PlannedStmt *plan,
+						 EState *prep_estate,
 						 const char *sourceText,
 						 ParamListInfo params,
 						 QueryEnvironment *queryEnv,
@@ -72,7 +73,8 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
 				DestReceiver *dest,
 				ParamListInfo params,
 				QueryEnvironment *queryEnv,
-				int instrument_options)
+				int instrument_options,
+				EState *prep_estate)
 {
 	QueryDesc  *qd = palloc_object(QueryDesc);
 
@@ -93,6 +95,9 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
 	qd->planstate = NULL;
 	qd->totaltime = NULL;
 
+	/* Use the EState created by ExecutorPrep() if already done. */
+	qd->estate = prep_estate;
+
 	/* not yet executed */
 	qd->already_executed = false;
 
@@ -123,6 +128,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  *		PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
  *
  *	plan: the plan tree for the query
+ *	prep_estate: EState created in ExecutorPrep() for the query, if any
  *	sourceText: the source text of the query
  *	params: any parameters needed
  *	dest: where to send results
@@ -135,6 +141,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  */
 static void
 ProcessQuery(PlannedStmt *plan,
+			 EState *prep_estate,
 			 const char *sourceText,
 			 ParamListInfo params,
 			 QueryEnvironment *queryEnv,
@@ -148,7 +155,8 @@ ProcessQuery(PlannedStmt *plan,
 	 */
 	queryDesc = CreateQueryDesc(plan, sourceText,
 								GetActiveSnapshot(), InvalidSnapshot,
-								dest, params, queryEnv, 0);
+								dest, params, queryEnv, 0,
+								prep_estate);
 
 	/*
 	 * Call ExecutorStart to prepare the plan for execution
@@ -495,7 +503,10 @@ PortalStart(Portal portal, ParamListInfo params,
 											None_Receiver,
 											params,
 											portal->queryEnv,
-											0);
+											0,
+											portal->prep_estates ?
+											(EState *) linitial(portal->prep_estates) :
+											NULL);
 
 				/*
 				 * If it's a scrollable cursor, executor needs to support
@@ -1185,6 +1196,7 @@ PortalRunMulti(Portal portal,
 {
 	bool		active_snapshot_set = false;
 	ListCell   *stmtlist_item;
+	ListCell   *prep_lc;
 
 	/*
 	 * If the destination is DestRemoteExecute, change to DestNone.  The
@@ -1205,9 +1217,11 @@ PortalRunMulti(Portal portal,
 	 * Loop to handle the individual queries generated from a single parsetree
 	 * by analysis and rewrite.
 	 */
+	prep_lc = list_head(portal->prep_estates);
 	foreach(stmtlist_item, portal->stmts)
 	{
 		PlannedStmt *pstmt = lfirst_node(PlannedStmt, stmtlist_item);
+		EState *prep_estate = next_prep_estate(portal->prep_estates, &prep_lc);
 
 		/*
 		 * If we got a cancel signal in prior command, quit
@@ -1265,7 +1279,7 @@ PortalRunMulti(Portal portal,
 			if (pstmt->canSetTag)
 			{
 				/* statement can set tag string */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, prep_estate,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
@@ -1274,7 +1288,7 @@ PortalRunMulti(Portal portal,
 			else
 			{
 				/* stmt added by rewrite cannot set tag */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, prep_estate,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index 493f9b0ee19..443b583637c 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -286,6 +286,7 @@ PortalDefineQuery(Portal portal,
 				  const char *sourceText,
 				  CommandTag commandTag,
 				  List *stmts,
+				  List *prep_estates,
 				  CachedPlan *cplan)
 {
 	Assert(PortalIsValid(portal));
@@ -299,6 +300,7 @@ PortalDefineQuery(Portal portal,
 	portal->commandTag = commandTag;
 	SetQueryCompletion(&portal->qc, commandTag, 0);
 	portal->stmts = stmts;
+	portal->prep_estates = prep_estates;
 	portal->cplan = cplan;
 	portal->status = PORTAL_DEFINED;
 }
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 472e141bba3..71ebe38bc86 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -64,7 +64,8 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
 							  ExplainState *es, ParseState *pstate,
 							  ParamListInfo params);
 
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt, EState *prep_estate,
+						   IntoClause *into,
 						   ExplainState *es, const char *queryString,
 						   ParamListInfo params, QueryEnvironment *queryEnv,
 						   const instr_time *planduration,
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index d3a57242844..3a2169c9613 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -43,7 +43,7 @@ typedef struct QueryDesc
 	QueryEnvironment *queryEnv; /* query environment passed in */
 	int			instrument_options; /* OR of InstrumentOption flags */
 
-	/* These fields are set by ExecutorStart */
+	/* These fields are set by ExecutorStart or ExecutorPrep */
 	TupleDesc	tupDesc;		/* descriptor for result tuples */
 	EState	   *estate;			/* executor's query-wide state */
 	PlanState  *planstate;		/* tree of per-plan-node state */
@@ -63,7 +63,8 @@ extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
 								  DestReceiver *dest,
 								  ParamListInfo params,
 								  QueryEnvironment *queryEnv,
-								  int instrument_options);
+								  int instrument_options,
+								  EState *prep_estate);
 
 extern void FreeQueryDesc(QueryDesc *qdesc);
 
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index 07f4b1f7490..4505ceaca3c 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -21,6 +21,7 @@
 #include "nodes/lockoptions.h"
 #include "nodes/parsenodes.h"
 #include "utils/memutils.h"
+#include "utils/resowner.h"
 
 
 /*
@@ -235,6 +236,31 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
  */
 extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
 extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
+
+extern EState *ExecutorPrep(PlannedStmt *pstmt,
+							ParamListInfo params,
+							ResourceOwner owner,
+							int eflags);
+
+/*
+ * Walk a prep_estates list in step with a parallel stmt_list iteration.
+ * Returns the next EState (or NULL) and advances *lc.
+ *
+ * Safe when prep_estates is NIL; just returns NULL for every call.
+ */
+static inline EState *
+next_prep_estate(List *prep_estates, ListCell **lc)
+{
+	EState *result = NULL;
+
+	if (*lc != NULL)
+	{
+		result = (EState *) lfirst(*lc);
+		*lc = lnext(prep_estates, *lc);
+	}
+	return result;
+}
+
 extern void ExecutorRun(QueryDesc *queryDesc,
 						ScanDirection direction, uint64 count);
 extern void standard_ExecutorRun(QueryDesc *queryDesc,
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index a7bedb12c18..f69b4b9b479 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -137,6 +137,7 @@ typedef struct PortalData
 	CommandTag	commandTag;		/* command tag for original query */
 	QueryCompletion qc;			/* command completion data for executed query */
 	List	   *stmts;			/* list of PlannedStmts */
+	List	   *prep_estates;	/* list of EStates where needed */
 	CachedPlan *cplan;			/* CachedPlan, if stmts are from one */
 
 	ParamListInfo portalParams; /* params to pass to query */
@@ -240,6 +241,7 @@ extern void PortalDefineQuery(Portal portal,
 							  const char *sourceText,
 							  CommandTag commandTag,
 							  List *stmts,
+							  List *prep_estates,
 							  CachedPlan *cplan);
 extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
 extern void PortalCreateHoldStore(Portal portal);
-- 
2.47.3




* Re: generic plans and "initial" pruning
@ 2026-03-26 09:24  Amit Langote <[email protected]>
  parent: Amit Langote <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: Amit Langote @ 2026-03-26 09:24 UTC (permalink / raw)
  To: Chao Li <[email protected]>; +Cc: Tom Lane <[email protected]>; Tender Wang <[email protected]>; Alexander Lakhin <[email protected]>; Tomas Vondra <[email protected]>; Robert Haas <[email protected]>; Alvaro Herrera <[email protected]>; Andres Freund <[email protected]>; Daniel Gustafsson <[email protected]>; David Rowley <[email protected]>; pgsql-hackers; Thom Brown <[email protected]>

On Wed, Mar 25, 2026 at 4:39 PM Amit Langote <[email protected]> wrote:
> On Fri, Mar 20, 2026 at 2:20 AM Amit Langote <[email protected]> wrote:
> > On Mon, Mar 9, 2026 at 1:41 PM Amit Langote <[email protected]> wrote:
> > Stepping back -- the core question is whether running executor logic
> > (pruning) inside GetCachedPlan() is acceptable at all. The plan cache
> > and executor have always had a clean boundary: plan cache locks
> > everything, executor runs. This optimization necessarily crosses that
> > line, because the information needed to decide which locks to skip
> > (pruning results) can only come from executor machinery.
> >
> > The proposed approach has GetCachedPlan() call ExecutorPrep() to do a
> > limited subset of executor work (range table init, permissions,
> > pruning), carry the results out through CachedPlanPrepData, and leave
> > the CachedPlan itself untouched. The executor already has a multi-step
> > protocol: start/run/end. prep/start/run/end is just a finer
> > decomposition of what InitPlan() was already doing inside
> > ExecutorStart().
> >
> > Of the attached patches, I'm targeting 0001-0003 for commit. 0004 (SQL
> > function support) and 0005 (parallel worker reuse) are useful
> > follow-ons but not essential.  The optimization works without them for
> > most cases, and they can be reviewed and committed separately.
> >
> > If there's a cleaner way to avoid locking pruned partitions without
> > the plumbing this patch adds, I haven't found it in the year since the
> > revert.  I'd welcome a pointer if you see one.  Failing that, I think
> > this is the right trade-off, but it's a judgment call about where to
> > hold your nose.
> >
> > Tom, I'd value your opinion on whether this approach is something
> > you'd be comfortable seeing in the tree.
>
> Attached is an updated set with some cleanup after another pass.
>
> - Removed ExecCreatePartitionPruneStates() from 0001. In 0001-0003,
> ExecDoInitialPruning() handles both setup and pruning internally; the
> split isn't needed yet.
>
> - Tightened commit messages to describe what each commit does now, not
> what later commits will use it for. In particular, 0002 is upfront
> that the portal/SPI/EXPLAIN plumbing is scaffolding that 0003 lights
> up.
>
> - Updated setrefs.c comment for firstResultRels to drop a blanket
> claim about one ModifyTable per query level.
>
> As before, 0001-0003 is the focus, and maybe 0004, which teaches the
> new GetCachedPlan() pruning-aware contract to its relatively new user
> in functions.c.

While reviewing the patch more carefully, I realized there's a
correctness issue when rule rewriting causes a single statement to
expand into multiple PlannedStmts in one CachedPlan.

PortalRunMulti() executes those statements sequentially, with
CommandCounterIncrement() between them, so Q2's ExecutorStart()
normally sees the effects of Q1.

With the patch, though, AcquireExecutorLocksUnpruned() runs
ExecutorPrep() on all PlannedStmts in one pass during GetCachedPlan(),
before any statement executes. If a later statement has
initial-pruning expressions that read data modified by an earlier one,
pruning can see stale results.

There's also a memory lifetime issue: PortalRunMulti() calls
MemoryContextDeleteChildren(portalContext) between statements, which
destroys EStates prepared for later statements.

Here's a concrete case demonstrating the semantic issue:

  create table multistmt_pt (a int, b int) partition by list (a);
  create table multistmt_pt_1 partition of multistmt_pt for values in (1);
  create table multistmt_pt_2 partition of multistmt_pt for values in (2);
  insert into multistmt_pt values (1, 0), (2, 0);

  create table prune_config (val int);
  insert into prune_config values (1);

  create function get_prune_val() returns int as $$
    select val from prune_config;
  $$ language sql stable;

  -- rule action runs first, updating prune_config before the
  -- original statement's pruning would normally be evaluated
  create rule config_upd_rule as on update to multistmt_pt
    do also update prune_config set val = 2;

  set plan_cache_mode to force_generic_plan;
  prepare multi_q as
    update multistmt_pt set b = b + 1 where a = get_prune_val();
  execute multi_q;  -- creates the generic plan

  -- reset for the real test
  update prune_config set val = 1;
  update multistmt_pt set b = 0;

  -- second execute reuses the plan
  execute multi_q;
  select * from multistmt_pt order by a;

Without the patch: the rule action updates prune_config to val=2
first, then after CCI the original statement's initial pruning calls
get_prune_val(), gets 2, prunes to multistmt_pt_2, and updates it
correctly: (1, 0), (2, 1).

With the patch as it stood: both statements' pruning runs during
GetCachedPlan() before either executes. The original statement's
pruning sees val=1, prunes to multistmt_pt_1, and multistmt_pt_2 is
never touched.
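
The ordering difference can be sketched with a small toy model in C.
This is not PostgreSQL code: prune_config_val and every toy_* function
are hypothetical stand-ins for the prune_config table, the rule action
(Q1), and the original statement's initial pruning (Q2).  It only
illustrates why the result depends on when pruning runs.

```c
#include <assert.h>

/* Toy stand-in for prune_config.val; not a PostgreSQL structure. */
int prune_config_val;

/* Toy "initial pruning": the surviving partition is whatever value is
 * visible at the moment pruning runs (1 or 2 in this example). */
int
toy_initial_prune(void)
{
	assert(prune_config_val == 1 || prune_config_val == 2);
	return prune_config_val;
}

/* Toy rule action (Q1): update prune_config set val = 2. */
void
toy_run_rule_action(void)
{
	prune_config_val = 2;
}

/* Unpatched order: Q2 prunes at its own start, after Q1 has run. */
int
toy_prune_at_start(void)
{
	prune_config_val = 1;
	toy_run_rule_action();		/* Q1 executes */
	return toy_initial_prune();	/* Q2 prunes afterwards */
}

/* Problematic order: Q2 prunes up front, before Q1 has run. */
int
toy_prune_up_front(void)
{
	int			surviving;

	prune_config_val = 1;
	surviving = toy_initial_prune();	/* Q2 prunes first */
	toy_run_rule_action();				/* Q1 runs too late to matter */
	return surviving;
}
```

In this model toy_prune_at_start() keeps partition 2, matching the
unpatched behavior, while toy_prune_up_front() keeps partition 1,
matching the stale result described above.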

The fix is to skip pruning-aware locking for CachedPlans containing
multiple PlannedStmts, falling back to locking all partitions.
Single-statement plans are unchanged.
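
As a rough illustration of that fallback, here is a self-contained
toy in C.  ToyStmt and the toy_* functions are hypothetical and are
not taken from the patch, and the actual condition in plancache.c may
involve more than the statement count.  The sketch only shows the
shape of the decision: one statement means only unpruned partitions
are locked, anything else means all partitions are locked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a PlannedStmt; only what the sketch needs. */
typedef struct ToyStmt
{
	bool		is_utility;		/* utility statements take no such locks */
	int			n_partitions;	/* all partitions referenced by the plan */
	int			n_unpruned;		/* partitions surviving initial pruning */
} ToyStmt;

/* Pruning-aware locking is used only for single-statement plans. */
bool
toy_can_use_pruning_aware_locking(size_t nstmts)
{
	return nstmts == 1;
}

/* Count the partition locks the toy "plan cache" would take. */
int
toy_count_partition_locks(const ToyStmt *stmts, size_t nstmts)
{
	bool		pruning_aware = toy_can_use_pruning_aware_locking(nstmts);
	int			nlocks = 0;

	assert(stmts != NULL || nstmts == 0);

	for (size_t i = 0; i < nstmts; i++)
	{
		if (stmts[i].is_utility)
			continue;
		/* Fall back to locking everything for multi-statement plans. */
		nlocks += pruning_aware ? stmts[i].n_unpruned
								: stmts[i].n_partitions;
	}
	return nlocks;
}
```

With one statement over 100 partitions of which 1 survives pruning,
the toy takes 1 lock; add a second statement and it takes a lock on
every partition of both.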

Since multi-statement plans are now excluded, CachedPlanPrepData no
longer needs a list of EStates -- it carries a single EState pointer.
This simplifies the plumbing throughout: PortalData,
PortalDefineQuery, SPI, and EXPLAIN all pass a single optional EState
instead of walking parallel lists. The next_prep_estate() helper is
gone.
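
To make the "single optional EState" idea concrete, here is a toy
model in C of the prep/start handoff.  ToyEState, ToyQueryDesc and
the toy_* functions are hypothetical simplifications, not the real
EState, QueryDesc, ExecutorPrep() or ExecutorStart(); the only point
is that start reuses a prepared state when one was passed in and
creates one otherwise.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, heavily simplified executor state. */
typedef struct ToyEState
{
	int			prepared_early; /* 1 if built before "ExecutorStart" */
} ToyEState;

/* Hypothetical query descriptor carrying an optional prepared state. */
typedef struct ToyQueryDesc
{
	ToyEState  *estate;			/* NULL if prep has not been done yet */
} ToyQueryDesc;

/* Toy "ExecutorPrep": build the state; caller owns the result. */
ToyEState *
toy_executor_prep(int early)
{
	ToyEState  *estate = malloc(sizeof(ToyEState));

	if (estate != NULL)
		estate->prepared_early = early;
	return estate;
}

/* Toy "CreateQueryDesc": just remember the optional prepared state. */
ToyQueryDesc
toy_create_query_desc(ToyEState *prep_estate)
{
	ToyQueryDesc qd;

	qd.estate = prep_estate;
	return qd;
}

/* Toy "ExecutorStart": reuse the prepared state, or do prep now. */
void
toy_executor_start(ToyQueryDesc *qd)
{
	assert(qd != NULL);
	if (qd->estate == NULL)
		qd->estate = toy_executor_prep(0);
}
```

A caller that never prepared anything passes NULL and gets a fresh
state at start; a caller that prepared one during plan validation
passes that one pointer through, with no parallel list to walk.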

Attached is the updated set.


--
Thanks, Amit Langote


Attachments:

  [application/octet-stream] v10-0005-Reuse-partition-pruning-results-in-parallel-work.patch (15.8K, 2-v10-0005-Reuse-partition-pruning-results-in-parallel-work.patch)
From 33fff6e090d9c713413a68ef2bdf9721f7e7f95b Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Wed, 25 Mar 2026 16:06:57 +0900
Subject: [PATCH v10 5/5] Reuse partition pruning results in parallel workers

Pass the leader's initial partition pruning results and unpruned
relids to parallel workers and reuse them via ExecutorPrep().  This
avoids repeating pruning logic in workers, which is not only
redundant but also risks divergence due to nondeterminism in pruning
steps or parameter evaluation timing.

Factor the creation of PartitionPruneState structures out of
ExecDoInitialPruning() into a new ExecCreatePartitionPruneStates()
function.  Parallel workers need to set up pruning state without
performing initial pruning, since they receive the leader's results
instead.

Introduce CheckInitialPruningResultsInWorker() (debug-builds only)
to verify that the results match what the worker would compute.
This check helps catch inconsistencies across leader and worker
pruning logic.
---
 src/backend/executor/execMain.c      |  25 +++++--
 src/backend/executor/execParallel.c  | 108 ++++++++++++++++++++++++++-
 src/backend/executor/execPartition.c |  44 ++++++++---
 src/backend/utils/cache/plancache.c  |   2 +-
 src/include/executor/execPartition.h |   1 +
 src/include/executor/executor.h      |   3 +-
 6 files changed, 161 insertions(+), 22 deletions(-)

diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 051b5d7bfcf..659557189ce 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -207,7 +207,7 @@ standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
 		queryDesc->estate = ExecutorPrep(queryDesc->plannedstmt,
 										 queryDesc->params,
 										 CurrentResourceOwner,
-										 eflags);
+										 eflags, true);
 	}
 #ifdef USE_ASSERT_CHECKING
 	else
@@ -330,7 +330,8 @@ standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
  * ExecutorPrep: build initial executor state for a PlannedStmt.
  *
  * Performs range table initialization, permission checks, and initial
- * partition pruning if partPruneInfos are present.
+ * partition pruning if partPruneInfos are present and do_initial_pruning is
+ * true; false in a parallel worker.
  *
  * Returns an EState that the caller must either pass to ExecutorStart()
  * for reuse or free via FreeExecutorState() if execution will not proceed.
@@ -341,7 +342,7 @@ standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
  */
 EState *
 ExecutorPrep(PlannedStmt *pstmt, ParamListInfo params, ResourceOwner owner,
-			 int eflags)
+			 int eflags, bool do_initial_pruning)
 {
 	ResourceOwner oldowner;
 	EState *estate;
@@ -378,14 +379,22 @@ ExecutorPrep(PlannedStmt *pstmt, ParamListInfo params, ResourceOwner owner,
 	CurrentResourceOwner = owner;
 
 	/*
-	 * Set up PartitionPruneState structures and perform initial partition
-	 * pruning to compute the subset of child subplans that will be
-	 * executed.  The results, which are bitmapsets of selected child
-	 * indexes, are saved in es_part_prune_results, parallel to
+	 * Set up PartitionPruneState structures needed for initial
+	 * partition pruning.
+	 *
+	 * If do_initial_pruning is true, also perform initial pruning to
+	 * compute the subset of child subplans that will be executed.
+	 * The results, which are bitmapsets of selected child indexes,
+	 * are saved in es_part_prune_results, parallel to
 	 * es_part_prune_infos.  RT indexes of surviving partitions are
 	 * added to es_unpruned_relids.
+	 *
+	 * Parallel workers pass false here and instead receive the
+	 * leader's pruning results via shared memory.
 	 */
-	ExecDoInitialPruning(estate);
+	ExecCreatePartitionPruneStates(estate);
+	if (do_initial_pruning)
+		ExecDoInitialPruning(estate);
 
 	CurrentResourceOwner = oldowner;
 
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index 024780d3516..2de4b35a16e 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -24,6 +24,7 @@
 #include "postgres.h"
 
 #include "executor/execParallel.h"
+#include "executor/execPartition.h"
 #include "executor/executor.h"
 #include "executor/nodeAgg.h"
 #include "executor/nodeAppend.h"
@@ -67,6 +68,8 @@
 #define PARALLEL_KEY_QUERY_TEXT		UINT64CONST(0xE000000000000008)
 #define PARALLEL_KEY_JIT_INSTRUMENTATION UINT64CONST(0xE000000000000009)
 #define PARALLEL_KEY_WAL_USAGE			UINT64CONST(0xE00000000000000A)
+#define PARALLEL_KEY_PARTITION_PRUNE_RESULTS	UINT64CONST(0xE00000000000000B)
+#define PARALLEL_KEY_UNPRUNED_RELIDS	UINT64CONST(0xE00000000000000C)
 
 #define PARALLEL_TUPLE_QUEUE_SIZE		65536
 
@@ -141,6 +144,8 @@ static bool ExecParallelRetrieveInstrumentation(PlanState *planstate,
 /* Helper function that runs in the parallel worker. */
 static DestReceiver *ExecParallelGetReceiver(dsm_segment *seg, shm_toc *toc);
 
+static void CheckInitialPruningResultsInWorker(EState *estate);
+
 /*
  * Create a serialized representation of the plan to be sent to each worker.
  */
@@ -620,12 +625,18 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	FixedParallelExecutorState *fpes;
 	char	   *pstmt_data;
 	char	   *pstmt_space;
+	char	   *part_prune_results_data;
+	char	   *part_prune_results_space;
+	char	   *unpruned_relids_data;
+	char	   *unpruned_relids_space;
 	char	   *paramlistinfo_space;
 	BufferUsage *bufusage_space;
 	WalUsage   *walusage_space;
 	SharedExecutorInstrumentation *instrumentation = NULL;
 	SharedJitInstrumentation *jit_instrumentation = NULL;
 	int			pstmt_len;
+	int			part_prune_results_len;
+	int			unpruned_relids_len;
 	int			paramlistinfo_len;
 	int			instrumentation_len = 0;
 	int			jit_instrumentation_len = 0;
@@ -654,6 +665,8 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 
 	/* Fix up and serialize plan to be sent to workers. */
 	pstmt_data = ExecSerializePlan(planstate->plan, estate);
+	part_prune_results_data = nodeToString(estate->es_part_prune_results);
+	unpruned_relids_data = nodeToString(estate->es_unpruned_relids);
 
 	/* Create a parallel context. */
 	pcxt = CreateParallelContext("postgres", "ParallelQueryMain", nworkers);
@@ -680,6 +693,16 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	shm_toc_estimate_chunk(&pcxt->estimator, pstmt_len);
 	shm_toc_estimate_keys(&pcxt->estimator, 1);
 
+	/* Estimate space for serialized part_prune_results. */
+	part_prune_results_len = strlen(part_prune_results_data) + 1;
+	shm_toc_estimate_chunk(&pcxt->estimator, part_prune_results_len);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/* Estimate space for serialized unpruned_relids. */
+	unpruned_relids_len = strlen(unpruned_relids_data) + 1;
+	shm_toc_estimate_chunk(&pcxt->estimator, unpruned_relids_len);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
 	/* Estimate space for serialized ParamListInfo. */
 	paramlistinfo_len = EstimateParamListSpace(estate->es_param_list_info);
 	shm_toc_estimate_chunk(&pcxt->estimator, paramlistinfo_len);
@@ -781,6 +804,16 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
 	memcpy(pstmt_space, pstmt_data, pstmt_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PLANNEDSTMT, pstmt_space);
 
+	/* Store serialized part_prune_results */
+	part_prune_results_space = shm_toc_allocate(pcxt->toc, part_prune_results_len);
+	memcpy(part_prune_results_space, part_prune_results_data, part_prune_results_len);
+	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARTITION_PRUNE_RESULTS, part_prune_results_space);
+
+	/* Store serialized unpruned_relids */
+	unpruned_relids_space = shm_toc_allocate(pcxt->toc, unpruned_relids_len);
+	memcpy(unpruned_relids_space, unpruned_relids_data, unpruned_relids_len);
+	shm_toc_insert(pcxt->toc, PARALLEL_KEY_UNPRUNED_RELIDS, unpruned_relids_space);
+
 	/* Store serialized ParamListInfo. */
 	paramlistinfo_space = shm_toc_allocate(pcxt->toc, paramlistinfo_len);
 	shm_toc_insert(pcxt->toc, PARALLEL_KEY_PARAMLISTINFO, paramlistinfo_space);
@@ -1280,10 +1313,15 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 						 int instrument_options)
 {
 	char	   *pstmtspace;
+	char	   *part_prune_results_space;
+	char	   *unpruned_relids_space;
 	char	   *paramspace;
 	PlannedStmt *pstmt;
+	List	   *part_prune_results;
+	Bitmapset  *unpruned_relids;
 	ParamListInfo paramLI;
 	char	   *queryString;
+	EState	   *prep_estate = NULL;
 
 	/* Get the query string from shared memory */
 	queryString = shm_toc_lookup(toc, PARALLEL_KEY_QUERY_TEXT, false);
@@ -1296,12 +1334,80 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 	paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
 	paramLI = RestoreParamList(&paramspace);
 
+	/* Reconstruct leader-supplied part_prune_results and unpruned_relids. */
+	part_prune_results_space =
+		shm_toc_lookup(toc, PARALLEL_KEY_PARTITION_PRUNE_RESULTS, false);
+	part_prune_results = (List *) stringToNode(part_prune_results_space);
+	unpruned_relids_space =
+		shm_toc_lookup(toc, PARALLEL_KEY_UNPRUNED_RELIDS, false);
+	unpruned_relids = (Bitmapset *) stringToNode(unpruned_relids_space);
+
+	/*
+	 * If pruning was done in the leader, build a prep estate in the worker
+	 * and inject the leader's pruning results into it for reuse.
+	 */
+	if (pstmt->partPruneInfos)
+	{
+		prep_estate = ExecutorPrep(pstmt, paramLI, CurrentResourceOwner, 0, false);
+		Assert(prep_estate);
+
+		prep_estate->es_part_prune_results = part_prune_results;
+		prep_estate->es_unpruned_relids =
+			bms_add_members(prep_estate->es_unpruned_relids,
+							unpruned_relids);
+
+		/*
+		 * A debug-build-only check that the pruning results passed from the
+		 * leader match what the worker would independently compute.
+		 */
+		CheckInitialPruningResultsInWorker(prep_estate);
+	}
+
 	/* Create a QueryDesc for the query. */
 	return CreateQueryDesc(pstmt,
 						   queryString,
 						   GetActiveSnapshot(), InvalidSnapshot,
 						   receiver, paramLI, NULL, instrument_options,
-						   NULL);
+						   prep_estate);
+}
+
+/*
+ * CheckInitialPruningResultsInWorker
+ *		Verify partition pruning results passed from the leader process.
+ *
+ * This is intended to be called during parallel worker query setup.
+ * It recomputes initial pruning results locally and compares them with
+ * those received from the leader. Any mismatch may indicate a divergence
+ * between leader and worker logic or environment.
+ *
+ * Only performed in assert-enabled builds (USE_ASSERT_CHECKING).
+ */
+static void
+CheckInitialPruningResultsInWorker(EState *estate)
+{
+#ifdef USE_ASSERT_CHECKING
+	ListCell   *lc;
+	int			i;
+
+	Assert(estate->es_part_prune_results != NULL);
+	i = 0;
+	foreach(lc, estate->es_part_prune_states)
+	{
+		PartitionPruneState *prunestate = (PartitionPruneState *) lfirst(lc);
+		Bitmapset *reuse_validsubplans =
+				list_nth_node(Bitmapset, estate->es_part_prune_results, i++);
+		Bitmapset  *validsubplans = NULL;
+		Bitmapset  *validsubplan_rtis = NULL;
+
+		if (prunestate->do_initial_prune)
+			validsubplans = ExecFindMatchingSubPlans(prunestate, true,
+													 &validsubplan_rtis);
+		if (!bms_equal(validsubplans, reuse_validsubplans))
+			elog(ERROR, "different validsubplans in parallel worker");
+		if (bms_nonempty_difference(validsubplan_rtis, estate->es_unpruned_relids))
+			elog(ERROR, "different unpruned_relids in parallel worker");
+	}
+#endif
 }
 
 /*
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 2a3af006f77..47322614aad 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -1942,6 +1942,9 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  *
  * Functions:
  *
+ * ExecCreatePartitionPruneStates:
+ *		Create a PartitionPruneState for each PartitionPruneInfo in the EState
+ *
  * ExecDoInitialPruning:
  *		Perform runtime "initial" pruning, if necessary, to determine the set
  *		of child subnodes that need to be initialized during ExecInitNode() for
@@ -1967,15 +1970,40 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  */
 
 
+/*
+ * ExecCreatePartitionPruneStates
+ *
+ * Create a PartitionPruneState for each PartitionPruneInfo in the estate,
+ * and save them in estate->es_part_prune_states. This setup is required
+ * before any initial or runtime pruning can occur.
+ */
+void
+ExecCreatePartitionPruneStates(EState *estate)
+{
+	ListCell   *lc;
+
+	foreach(lc, estate->es_part_prune_infos)
+	{
+		PartitionPruneInfo *pruneinfo = lfirst_node(PartitionPruneInfo, lc);
+		PartitionPruneState *prunestate;
+
+		/* Create and save the PartitionPruneState. */
+		prunestate = CreatePartitionPruneState(estate, pruneinfo);
+		estate->es_part_prune_states = lappend(estate->es_part_prune_states,
+											   prunestate);
+	}
+}
+
 /*
  * ExecDoInitialPruning
  *		Perform runtime "initial" pruning, if necessary, to determine the set
  *		of child subnodes that need to be initialized during ExecInitNode() for
  *		plan nodes that support partition pruning.
  *
- * This function iterates over each PartitionPruneInfo entry in
- * estate->es_part_prune_infos. For each entry, it creates a PartitionPruneState
- * and adds it to es_part_prune_states.  ExecInitPartitionExecPruning() accesses
+ *
+ * This function iterates over each PartitionPruneState in
+ * estate->es_part_prune_states, which must have been populated earlier by
+ * ExecCreatePartitionPruneStates(). ExecInitPartitionExecPruning() accesses
  * these states through their corresponding indexes in es_part_prune_states and
  * assigns each state to the parent node's PlanState, from where it will be used
  * for "exec" pruning.
@@ -1996,18 +2024,12 @@ ExecDoInitialPruning(EState *estate)
 	ListCell   *lc;
 
 	Assert(estate->es_part_prune_results == NULL);
-	foreach(lc, estate->es_part_prune_infos)
+	foreach(lc, estate->es_part_prune_states)
 	{
-		PartitionPruneInfo *pruneinfo = lfirst_node(PartitionPruneInfo, lc);
-		PartitionPruneState *prunestate;
+		PartitionPruneState *prunestate = (PartitionPruneState *) lfirst(lc);
 		Bitmapset  *validsubplans = NULL;
 		Bitmapset  *validsubplan_rtis = NULL;
 
-		/* Create and save the PartitionPruneState. */
-		prunestate = CreatePartitionPruneState(estate, pruneinfo);
-		estate->es_part_prune_states = lappend(estate->es_part_prune_states,
-											   prunestate);
-
 		/*
 		 * Perform initial pruning steps, if any, and save the result
 		 * bitmapset or NULL as described in the header comment.  RT indexes
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index b0c4d62564d..6c178c461a7 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -2100,7 +2100,7 @@ AcquireExecutorLocksUnpruned(List *stmt_list, bool acquire,
 			}
 
 			prep_estate = ExecutorPrep(plannedstmt, cprep->params,
-									   cprep->owner, cprep->eflags);
+									   cprep->owner, cprep->eflags, true);
 			Assert(prep_estate);
 			cprep->prep_estate = prep_estate;
 
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 82063ec2a16..4c96808c376 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -130,6 +130,7 @@ typedef struct PartitionPruneState
 	PartitionPruningData *partprunedata[FLEXIBLE_ARRAY_MEMBER];
 } PartitionPruneState;
 
+extern void ExecCreatePartitionPruneStates(EState *estate);
 extern void ExecDoInitialPruning(EState *estate);
 extern PartitionPruneState *ExecInitPartitionExecPruning(PlanState *planstate,
 														 int n_total_subplans,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index fac5bef1384..37195312bce 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -240,7 +240,8 @@ extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
 extern EState *ExecutorPrep(PlannedStmt *pstmt,
 							ParamListInfo params,
 							ResourceOwner owner,
-							int eflags);
+							int eflags,
+							bool do_initial_pruning);
 
 extern void ExecutorRun(QueryDesc *queryDesc,
 						ScanDirection direction, uint64 count);
-- 
2.47.3
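
Editor's sketch, not part of the patch: a toy model of the leader-to-worker
handoff added above.  The leader flattens its pruning result into a
NUL-terminated string, and the worker parses it back and compares it with
what it would compute locally, in the spirit of
CheckInitialPruningResultsInWorker().  Every name here (toy_prune,
toy_serialize, toy_deserialize, toy_check_in_worker) is hypothetical, and a
32-bit mask stands in for a Bitmapset; nodeToString(), stringToNode() and
the shm_toc machinery are not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a Bitmapset of surviving subplan indexes. */
typedef uint32_t toy_subplan_mask;

/*
 * Toy "initial pruning": keep subplan i when the parameter is below that
 * partition's exclusive upper bound.  Not PostgreSQL's pruning logic.
 */
static toy_subplan_mask
toy_prune(const int *bounds, int nsubplans, int param)
{
	toy_subplan_mask mask = 0;

	for (int i = 0; i < nsubplans && i < 32; i++)
	{
		if (param < bounds[i])
			mask |= (toy_subplan_mask) 1 << i;
	}
	return mask;
}

/* Leader side: write the mask as hex text; returns bytes used incl. NUL. */
static size_t
toy_serialize(toy_subplan_mask mask, char *buf, size_t buflen)
{
	int			n = snprintf(buf, buflen, "%08x", (unsigned int) mask);

	if (n <= 0 || (size_t) n >= buflen)
		abort();				/* buffer too small for this toy */
	return (size_t) n + 1;
}

/* Worker side: parse the leader's text back into a mask. */
static toy_subplan_mask
toy_deserialize(const char *buf)
{
	return (toy_subplan_mask) strtoul(buf, NULL, 16);
}

/* Worker check: true when the leader's result matches a local recomputation. */
static bool
toy_check_in_worker(const char *leader_buf, const int *bounds,
					int nsubplans, int param)
{
	return toy_deserialize(leader_buf) == toy_prune(bounds, nsubplans, param);
}
```

The length returned by toy_serialize() plays the role of the
strlen(...) + 1 estimate in the patch: the terminating NUL is copied along
with the text so the reader can treat the chunk as a C string.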



  [application/octet-stream] v10-0001-Refactor-executor-s-initial-partition-pruning-se.patch (7.3K, 3-v10-0001-Refactor-executor-s-initial-partition-pruning-se.patch)
  download | inline diff:
From 6b2a9740b49a5238569cfeeb11fa632225ec2cfb Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Wed, 25 Mar 2026 16:06:38 +0900
Subject: [PATCH v10 1/5] Refactor executor's initial partition pruning setup

Simplify handling of unpruned relids by moving responsibility
for recording them in EState into CreatePartitionPruneState(),
avoiding the need to pass all_leafpart_rtis as an out parameter.

Also move the setting of ecxt_param_exec_vals from
CreatePartitionPruneState() to InitExecPartitionPruneContexts(),
to allow the former to be called before PARAM_EXEC parameters are
set up.  A later commit needs this when running pruning state setup
outside of InitPlan().

No behavioral change.
---
 src/backend/executor/execPartition.c | 70 +++++++++++++++++++---------
 1 file changed, 48 insertions(+), 22 deletions(-)
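
Editor's sketch, not part of the patch: the refactoring above replaces an
out parameter with a direct write into executor state.  The toy below
mirrors only that idea, namely that when initial pruning steps exist but
are skipped, the state-creation routine itself records every leaf
partition's RT index.  ToyEState, ToyPruneState and
toy_create_prune_state() are hypothetical names, and a 32-bit mask stands
in for the es_unpruned_relids Bitmapset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, heavily reduced executor state. */
typedef struct ToyEState
{
	uint32_t	unpruned_relids;	/* bit N set => RT index N is unpruned */
} ToyEState;

typedef struct ToyPruneState
{
	bool		do_initial_prune;
} ToyPruneState;

/*
 * leafpart_rti_map[i] is the RT index of partition i, or 0 when that
 * partition has no scanning subnode in the plan.  When initial pruning
 * steps exist but are skipped, all mapped RT indexes are recorded in the
 * estate directly instead of being handed back to the caller.
 */
static ToyPruneState
toy_create_prune_state(ToyEState *estate, const unsigned *leafpart_rti_map,
					   int nparts, bool has_initial_steps, bool skip_pruning)
{
	ToyPruneState ps;

	ps.do_initial_prune = has_initial_steps && !skip_pruning;

	if (has_initial_steps && !ps.do_initial_prune)
	{
		for (int i = 0; i < nparts; i++)
		{
			unsigned	rti = leafpart_rti_map[i];

			if (rti != 0 && rti < 32)
				estate->unpruned_relids |= UINT32_C(1) << rti;
		}
	}
	return ps;
}
```

With this shape the caller no longer needs an else branch to copy the
"all leaf partitions" set, which is the simplification visible in the
ExecDoInitialPruning() hunk.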

diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index d96d4f9947b..2a3af006f77 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -185,8 +185,7 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
 static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
 static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
 static PartitionPruneState *CreatePartitionPruneState(EState *estate,
-													  PartitionPruneInfo *pruneinfo,
-													  Bitmapset **all_leafpart_rtis);
+													  PartitionPruneInfo *pruneinfo);
 static void InitPartitionPruneContext(PartitionPruneContext *context,
 									  List *pruning_steps,
 									  PartitionDesc partdesc,
@@ -1978,7 +1977,7 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  * estate->es_part_prune_infos. For each entry, it creates a PartitionPruneState
  * and adds it to es_part_prune_states.  ExecInitPartitionExecPruning() accesses
  * these states through their corresponding indexes in es_part_prune_states and
- * assign each state to the parent node's PlanState, from where it will be used
+ * assigns each state to the parent node's PlanState, from where it will be used
  * for "exec" pruning.
  *
  * If initial pruning steps exist for a PartitionPruneInfo entry, this function
@@ -1996,29 +1995,31 @@ ExecDoInitialPruning(EState *estate)
 {
 	ListCell   *lc;
 
+	Assert(estate->es_part_prune_results == NULL);
 	foreach(lc, estate->es_part_prune_infos)
 	{
 		PartitionPruneInfo *pruneinfo = lfirst_node(PartitionPruneInfo, lc);
 		PartitionPruneState *prunestate;
 		Bitmapset  *validsubplans = NULL;
-		Bitmapset  *all_leafpart_rtis = NULL;
 		Bitmapset  *validsubplan_rtis = NULL;
 
 		/* Create and save the PartitionPruneState. */
-		prunestate = CreatePartitionPruneState(estate, pruneinfo,
-											   &all_leafpart_rtis);
+		prunestate = CreatePartitionPruneState(estate, pruneinfo);
 		estate->es_part_prune_states = lappend(estate->es_part_prune_states,
 											   prunestate);
 
 		/*
 		 * Perform initial pruning steps, if any, and save the result
-		 * bitmapset or NULL as described in the header comment.
+		 * bitmapset or NULL as described in the header comment.  RT indexes
+		 * of surviving partitions would be added to validsubplan_rtis.
+		 *
+		 * Note that when do_initial_prune is false,
+		 * CreatePartitionPruneState() would have already added the RT indexes
+		 * of all leaf partitions to es_unpruned_relids directly.
 		 */
 		if (prunestate->do_initial_prune)
 			validsubplans = ExecFindMatchingSubPlans(prunestate, true,
 													 &validsubplan_rtis);
-		else
-			validsubplan_rtis = all_leafpart_rtis;
 
 		estate->es_unpruned_relids = bms_add_members(estate->es_unpruned_relids,
 													 validsubplan_rtis);
@@ -2136,14 +2137,12 @@ ExecInitPartitionExecPruning(PlanState *planstate,
  * parent plan node's PlanState.
  *
  * If initial pruning steps are to be skipped (e.g., during EXPLAIN
- * (GENERIC_PLAN)), *all_leafpart_rtis will be populated with the RT indexes of
- * all leaf partitions whose scanning subnode is included in the parent plan
- * node's list of child plans. The caller must add these RT indexes to
- * estate->es_unpruned_relids.
+ * (GENERIC_PLAN)), the RT indexes of all leaf partitions whose scanning
+ * subnode is included in the parent plan node's list of child plans are
+ * added to estate->es_unpruned_relids.
  */
 static PartitionPruneState *
-CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo,
-						  Bitmapset **all_leafpart_rtis)
+CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo)
 {
 	PartitionPruneState *prunestate;
 	int			n_part_hierarchies;
@@ -2377,8 +2376,8 @@ CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo,
 													   pinfo->execparamids);
 
 			/*
-			 * Return all leaf partition indexes if we're skipping pruning in
-			 * the EXPLAIN (GENERIC_PLAN) case.
+			 * Add all leaf partition indexes to es_unpruned_relids if we're
+			 * skipping pruning in the EXPLAIN (GENERIC_PLAN) case.
 			 */
 			if (pinfo->initial_pruning_steps && !prunestate->do_initial_prune)
 			{
@@ -2390,9 +2389,28 @@ CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo,
 					Index		rtindex = pprune->leafpart_rti_map[part_index];
 
 					if (rtindex)
-						*all_leafpart_rtis = bms_add_member(*all_leafpart_rtis,
-															rtindex);
+						estate->es_unpruned_relids =
+							bms_add_member(estate->es_unpruned_relids, rtindex);
+				}
+			}
+			else if (pinfo->initial_pruning_steps == NIL)
+			{
+				/*
+				 * All partitions must already be present in es_unpruned_relids
+				 * when none are initially prunable.
+				 */
+#ifdef USE_ASSERT_CHECKING
+				int			part_index = -1;
+
+				while ((part_index = bms_next_member(pprune->present_parts,
+													 part_index)) >= 0)
+				{
+					Index		rtindex = pprune->leafpart_rti_map[part_index];
+
+					if (rtindex)
+						Assert(bms_is_member(rtindex, estate->es_unpruned_relids));
 				}
+#endif
 			}
 
 			j++;
@@ -2490,9 +2508,10 @@ InitPartitionPruneContext(PartitionPruneContext *context,
  *		Initialize exec pruning contexts deferred by CreatePartitionPruneState()
  *
  * This function finalizes exec pruning setup for a PartitionPruneState by
- * initializing contexts for pruning steps that require the parent plan's
- * PlanState. It iterates over PartitionPruningData entries and sets up the
- * necessary execution contexts for pruning during query execution.
+ * initializing contexts for pruning steps that require PARAM_EXEC parameters
+ * and the parent plan's PlanState. It iterates over PartitionPruningData
+ * entries and sets up the necessary execution contexts for pruning during
+ * query execution.
  *
  * Also fix the mapping of partition indexes to subplan indexes contained in
  * prunestate by considering the new list of subplans that survived initial
@@ -2520,9 +2539,16 @@ InitExecPartitionPruneContexts(PartitionPruneState *prunestate,
 	bool		fix_subplan_map = false;
 
 	Assert(prunestate->do_exec_prune);
+	Assert(prunestate->econtext);
 	Assert(parent_plan != NULL);
 	estate = parent_plan->state;
 
+	/*
+	 * These might not be available when CreatePartitionPruneState() is
+	 * called.
+	 */
+	prunestate->econtext->ecxt_param_exec_vals = estate->es_param_exec_vals;
+
 	/*
 	 * No need to fix subplans maps if initial pruning didn't eliminate any
 	 * subplans.
-- 
2.47.3



  [application/octet-stream] v10-0002-Introduce-ExecutorPrep-and-refactor-executor-sta.patch (23.5K, 4-v10-0002-Introduce-ExecutorPrep-and-refactor-executor-sta.patch)
  download | inline diff:
From 4e849ce0af12963ee2040f187f4cb0bad1c2851e Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Thu, 26 Mar 2026 16:08:46 +0900
Subject: [PATCH v10 2/5] Introduce ExecutorPrep and refactor executor startup

Factor permission checks, range table initialization, and initial
partition pruning out of InitPlan() into a new ExecutorPrep()
helper.  ExecutorStart() calls it to build the EState, keeping
behavior unchanged.

If QueryDesc->estate is already set when ExecutorStart() is called,
the existing EState is reused and ExecutorPrep() is skipped.  This
allows a later commit to supply a pre-built EState from outside
the executor.

Add scaffolding for carrying an optional prep EState through
CreateQueryDesc, PortalDefineQuery, and SPI.  All callers currently
pass NULL; the next commit populates these to enable pruning-aware
locking in cached plans.

In assert builds, verify that the expected relation locks are held
when entering ExecutorStart().
---
 src/backend/commands/copyto.c       |   2 +-
 src/backend/commands/createas.c     |   2 +-
 src/backend/commands/explain.c      |   8 +-
 src/backend/commands/extension.c    |   2 +-
 src/backend/commands/matview.c      |   2 +-
 src/backend/commands/portalcmds.c   |   1 +
 src/backend/commands/prepare.c      |   4 +-
 src/backend/executor/README         |  11 +-
 src/backend/executor/execMain.c     | 158 +++++++++++++++++++++++-----
 src/backend/executor/execParallel.c |   3 +-
 src/backend/executor/functions.c    |   3 +-
 src/backend/executor/spi.c          |   4 +-
 src/backend/tcop/postgres.c         |   2 +
 src/backend/tcop/pquery.c           |  19 +++-
 src/backend/utils/mmgr/portalmem.c  |   7 ++
 src/include/commands/explain.h      |   3 +-
 src/include/executor/execdesc.h     |   5 +-
 src/include/executor/executor.h     |   7 ++
 src/include/utils/portal.h          |   2 +
 19 files changed, 195 insertions(+), 50 deletions(-)
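
Editor's sketch, not part of the patch: the control flow described in the
commit message, reduced to a toy.  The start routine reuses a prep state
when the caller supplies one and otherwise builds it on the spot, so the
prep work runs once either way.  ToyEState, ToyQueryDesc,
toy_executor_prep(), toy_executor_start() and toy_free_estate() are
hypothetical stand-ins, not the real EState, QueryDesc, ExecutorPrep(),
ExecutorStart() or FreeExecutorState().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

typedef struct ToyEState
{
	bool		pruning_done;	/* set once the prep work has run */
} ToyEState;

typedef struct ToyQueryDesc
{
	ToyEState  *estate;			/* NULL unless the caller ran prep earlier */
} ToyQueryDesc;

static int	toy_prep_count = 0;	/* how many times prep work has run */

/* Toy prep step: stands in for range table setup, permission checks, pruning. */
static ToyEState *
toy_executor_prep(void)
{
	ToyEState  *estate = calloc(1, sizeof(ToyEState));

	if (estate == NULL)
		abort();
	estate->pruning_done = true;
	toy_prep_count++;
	return estate;
}

/* Toy start step: run prep only when no prep state was supplied. */
static void
toy_executor_start(ToyQueryDesc *qd)
{
	if (qd->estate == NULL)
		qd->estate = toy_executor_prep();

	assert(qd->estate->pruning_done);
}

/* A prep state that is never started must still be released by its owner. */
static void
toy_free_estate(ToyEState *estate)
{
	free(estate);
}
```

The last helper reflects the contract stated in the ExecutorPrep() header
comment: the returned state is either passed on for reuse or freed by the
caller if execution will not proceed.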

diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index faf62d959b4..b9bd5ba7078 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -1011,7 +1011,7 @@ BeginCopyTo(ParseState *pstate,
 		cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
 											GetActiveSnapshot(),
 											InvalidSnapshot,
-											dest, NULL, NULL, 0);
+											dest, NULL, NULL, 0, NULL);
 
 		/*
 		 * Call ExecutorStart to prepare the plan for execution.
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 270e9bf3110..b4a9808955a 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -336,7 +336,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
 		/* Create a QueryDesc, redirecting output to our tuple receiver */
 		queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
 									GetActiveSnapshot(), InvalidSnapshot,
-									dest, params, queryEnv, 0);
+									dest, params, queryEnv, 0, NULL);
 
 		/* call ExecutorStart to prepare the plan for execution */
 		ExecutorStart(queryDesc, GetIntoRelEFlags(into));
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index e4b70166b0e..24c0c235fd3 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -372,7 +372,7 @@ standard_ExplainOneQuery(Query *query, int cursorOptions,
 	}
 
 	/* run it (if needed) and produce output */
-	ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+	ExplainOnePlan(plan, NULL, into, es, queryString, params, queryEnv,
 				   &planduration, (es->buffers ? &bufusage : NULL),
 				   es->memory ? &mem_counters : NULL);
 }
@@ -494,7 +494,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
  * to call it.
  */
 void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, EState *prep_estate,
+			   IntoClause *into, ExplainState *es,
 			   const char *queryString, ParamListInfo params,
 			   QueryEnvironment *queryEnv, const instr_time *planduration,
 			   const BufferUsage *bufusage,
@@ -552,7 +553,8 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
 	/* Create a QueryDesc for the query */
 	queryDesc = CreateQueryDesc(plannedstmt, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
-								dest, params, queryEnv, instrument_option);
+								dest, params, queryEnv, instrument_option,
+								prep_estate);
 
 	/* Select execution options */
 	if (es->analyze)
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index b98801d08f2..939e7a632f0 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -1174,7 +1174,7 @@ execute_sql_string(const char *sql, const char *filename)
 				qdesc = CreateQueryDesc(stmt,
 										sql,
 										GetActiveSnapshot(), NULL,
-										dest, NULL, NULL, 0);
+										dest, NULL, NULL, 0, NULL);
 
 				ExecutorStart(qdesc, 0);
 				ExecutorRun(qdesc, ForwardScanDirection, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 81a55a33ef2..2cdfdcf984b 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -439,7 +439,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
 	/* Create a QueryDesc, redirecting output to our tuple receiver */
 	queryDesc = CreateQueryDesc(plan, queryString,
 								GetActiveSnapshot(), InvalidSnapshot,
-								dest, NULL, NULL, 0);
+								dest, NULL, NULL, 0, NULL);
 
 	/* call ExecutorStart to prepare the plan for execution */
 	ExecutorStart(queryDesc, 0);
diff --git a/src/backend/commands/portalcmds.c b/src/backend/commands/portalcmds.c
index 01efac3319e..cf5deec4943 100644
--- a/src/backend/commands/portalcmds.c
+++ b/src/backend/commands/portalcmds.c
@@ -118,6 +118,7 @@ PerformCursorOpen(ParseState *pstate, DeclareCursorStmt *cstmt, ParamListInfo pa
 					  queryString,
 					  CMDTAG_SELECT,	/* cursor's query is always a SELECT */
 					  list_make1(plan),
+					  NULL,
 					  NULL);
 
 	/*----------
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 876aad2100a..c24d97f7e5a 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -207,6 +207,7 @@ ExecuteQuery(ParseState *pstate,
 					  query_string,
 					  entry->plansource->commandTag,
 					  plan_list,
+					  NULL,
 					  cplan);
 
 	/*
@@ -659,7 +660,8 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 		PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
 
 		if (pstmt->commandType != CMD_UTILITY)
-			ExplainOnePlan(pstmt, into, es, query_string, paramLI, pstate->p_queryEnv,
+			ExplainOnePlan(pstmt, NULL,
+						   into, es, query_string, paramLI, pstate->p_queryEnv,
 						   &planduration, (es->buffers ? &bufusage : NULL),
 						   es->memory ? &mem_counters : NULL);
 		else
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 54f4782f31b..d749ceb6687 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -291,11 +291,18 @@ Query Processing Control Flow
 
 This is a sketch of control flow for full query processing:
 
+	ExecutorPrep
+		May be run before ExecutorStart (e.g., for plan validation), or
+		implicitly from ExecutorStart if not done earlier.  Creates EState,
+		performs range table initialization, permission checks, and initial
+		partition pruning.  Returns the EState that ExecutorStart() should
+		reuse.
+
 	CreateQueryDesc
 
 	ExecutorStart
-		CreateExecutorState
-			creates per-query context
+		ExecutorPrep (if not already done, indicated by NULL QueryDesc.estate)
+			creates EState and per-query context
 		switch to per-query context to run ExecInitNode
 		AfterTriggerBeginQuery
 		ExecInitNode --- recursively scans plan tree
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 58b84955c2b..cc7794f58db 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -57,6 +57,7 @@
 #include "parser/parse_relation.h"
 #include "pgstat.h"
 #include "rewrite/rewriteHandler.h"
+#include "storage/lmgr.h"
 #include "tcop/utility.h"
 #include "utils/acl.h"
 #include "utils/backend_status.h"
@@ -147,7 +148,6 @@ standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
 
 	/* sanity checks: queryDesc must not be started already */
 	Assert(queryDesc != NULL);
-	Assert(queryDesc->estate == NULL);
 
 	/* caller must ensure the query's snapshot is active */
 	Assert(GetActiveSnapshot() == queryDesc->snapshot);
@@ -173,9 +173,70 @@ standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
 
 	/*
 	 * Build EState, switch into per-query memory context for startup.
-	 */
-	estate = CreateExecutorState();
-	queryDesc->estate = estate;
+	 *
+	 * If ExecutorPrep() ran earlier (e.g., to do initial pruning during plan
+	 * validity checking), reuse its EState to avoid redoing range table setup
+	 * and pruning. Otherwise, create a fresh EState as usual.
+	 *
+	 * In assert builds, verify that the expected locks are held.  When
+	 * no prep EState was provided, AcquireExecutorLocks() should have
+	 * locked every relation in the plan.  When one was provided,
+	 * pruning-aware locking should have locked at least the unpruned
+	 * relations.  Both checks are skipped in parallel workers, which
+	 * acquire relation locks lazily in ExecGetRangeTableRelation().
+	 */
+	if (queryDesc->estate == NULL)
+	{
+#ifdef USE_ASSERT_CHECKING
+		if (!IsParallelWorker())
+		{
+			ListCell   *lc;
+
+			foreach(lc, queryDesc->plannedstmt->rtable)
+			{
+				RangeTblEntry *rte = lfirst_node(RangeTblEntry, lc);
+
+				if (rte->rtekind == RTE_RELATION ||
+					(rte->rtekind == RTE_SUBQUERY && rte->relid != InvalidOid))
+					Assert(CheckRelationOidLockedByMe(rte->relid,
+													  rte->rellockmode,
+													  true));
+			}
+		}
+#endif
+		queryDesc->estate = ExecutorPrep(queryDesc->plannedstmt,
+										 queryDesc->params,
+										 CurrentResourceOwner,
+										 eflags);
+	}
+#ifdef USE_ASSERT_CHECKING
+	else
+	{
+		/*
+		 * A prep EState was provided, meaning pruning-aware locking
+		 * should have locked at least the unpruned relations.
+		 */
+		if (!IsParallelWorker())
+		{
+			int		rtindex = -1;
+
+			while ((rtindex = bms_next_member(queryDesc->estate->es_unpruned_relids,
+											  rtindex)) >= 0)
+			{
+				RangeTblEntry *rte = exec_rt_fetch(rtindex, queryDesc->estate);
+
+				Assert(rte->rtekind == RTE_RELATION ||
+					   (rte->rtekind == RTE_SUBQUERY &&
+						rte->relid != InvalidOid));
+				Assert(CheckRelationOidLockedByMe(rte->relid,
+												  rte->rellockmode, true));
+			}
+		}
+	}
+#endif
+
+	estate = queryDesc->estate;
+	Assert(estate);
 
 	oldcontext = MemoryContextSwitchTo(estate->es_query_cxt);
 
@@ -265,6 +326,68 @@ standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
 	MemoryContextSwitchTo(oldcontext);
 }
 
+/*
+ * ExecutorPrep: build initial executor state for a PlannedStmt.
+ *
+ * Performs range table initialization, permission checks, and initial
+ * partition pruning if partPruneInfos are present.
+ *
+ * Returns an EState that the caller must either pass to ExecutorStart()
+ * for reuse or free via FreeExecutorState() if execution will not proceed.
+ */
+EState *
+ExecutorPrep(PlannedStmt *pstmt, ParamListInfo params, ResourceOwner owner,
+			 int eflags)
+{
+	ResourceOwner oldowner;
+	EState *estate;
+
+	if (pstmt->commandType == CMD_UTILITY)
+		return NULL;
+
+	/* Caller must have established an active snapshot. */
+	Assert(ActiveSnapshotSet());
+
+	estate = CreateExecutorState();
+	estate->es_plannedstmt = pstmt;
+	estate->es_part_prune_infos = pstmt->partPruneInfos;
+	estate->es_param_list_info = params;
+	estate->es_top_eflags = eflags;
+
+	/*
+	 * Do permissions checks.
+	 */
+	ExecCheckPermissions(pstmt->rtable, pstmt->permInfos, true);
+
+	/*
+	 * Initialize range table.
+	 */
+	ExecInitRangeTable(estate, pstmt->rtable, pstmt->permInfos,
+					   bms_copy(pstmt->unprunableRelids));
+
+	/*
+	 * Track resources acquired during pruning under the given
+	 * ResourceOwner, which may differ from CurrentResourceOwner
+	 * when ExecutorPrep() is called outside ExecutorStart().
+	 */
+	oldowner = CurrentResourceOwner;
+	CurrentResourceOwner = owner;
+
+	/*
+	 * Set up PartitionPruneState structures and perform initial partition
+	 * pruning to compute the subset of child subplans that will be
+	 * executed.  The results, which are bitmapsets of selected child
+	 * indexes, are saved in es_part_prune_results, parallel to
+	 * es_part_prune_infos.  RT indexes of surviving partitions are
+	 * added to es_unpruned_relids.
+	 */
+	ExecDoInitialPruning(estate);
+
+	CurrentResourceOwner = oldowner;
+
+	return estate;
+}
+
 /* ----------------------------------------------------------------
  *		ExecutorRun
  *
@@ -840,37 +963,14 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 	CmdType		operation = queryDesc->operation;
 	PlannedStmt *plannedstmt = queryDesc->plannedstmt;
 	Plan	   *plan = plannedstmt->planTree;
-	List	   *rangeTable = plannedstmt->rtable;
 	EState	   *estate = queryDesc->estate;
 	PlanState  *planstate;
 	TupleDesc	tupType;
 	ListCell   *l;
 	int			i;
 
-	/*
-	 * Do permissions checks
-	 */
-	ExecCheckPermissions(rangeTable, plannedstmt->permInfos, true);
-
-	/*
-	 * initialize the node's execution state
-	 */
-	ExecInitRangeTable(estate, rangeTable, plannedstmt->permInfos,
-					   bms_copy(plannedstmt->unprunableRelids));
-
-	estate->es_plannedstmt = plannedstmt;
-	estate->es_part_prune_infos = plannedstmt->partPruneInfos;
-
-	/*
-	 * Perform runtime "initial" pruning to identify which child subplans,
-	 * corresponding to the children of plan nodes that contain
-	 * PartitionPruneInfo such as Append, will not be executed. The results,
-	 * which are bitmapsets of indexes of the child subplans that will be
-	 * executed, are saved in es_part_prune_results.  These results correspond
-	 * to each PartitionPruneInfo entry, and the es_part_prune_results list is
-	 * parallel to es_part_prune_infos.
-	 */
-	ExecDoInitialPruning(estate);
+	/* ExecutorPrep() must have been done. */
+	Assert(queryDesc->estate);
 
 	/*
 	 * Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index ac84af294c9..024780d3516 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -1300,7 +1300,8 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
 	return CreateQueryDesc(pstmt,
 						   queryString,
 						   GetActiveSnapshot(), InvalidSnapshot,
-						   receiver, paramLI, NULL, instrument_options);
+						   receiver, paramLI, NULL, instrument_options,
+						   NULL);
 }
 
 /*
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index 88109348817..952a784c924 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -1369,7 +1369,8 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
 							 dest,
 							 fcache->paramLI,
 							 es->qd ? es->qd->queryEnv : NULL,
-							 0);
+							 0,
+							 NULL);
 
 	/* Utility commands don't need Executor. */
 	if (es->qd->operation != CMD_UTILITY)
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 52f3b11301c..32c9d987c59 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1686,6 +1686,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 					  query_string,
 					  plansource->commandTag,
 					  stmt_list,
+					  NULL,
 					  cplan);
 
 	/*
@@ -2695,7 +2696,8 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 										dest,
 										options->params,
 										_SPI_current->queryEnv,
-										0);
+										0,
+										NULL);
 				res = _SPI_pquery(qdesc, fire_triggers,
 								  canSetTag ? options->tcount : 0);
 				FreeQueryDesc(qdesc);
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index b3563113219..ccdb6c01071 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1231,6 +1231,7 @@ exec_simple_query(const char *query_string)
 						  query_string,
 						  commandTag,
 						  plantree_list,
+						  NULL,
 						  NULL);
 
 		/*
@@ -2030,6 +2031,7 @@ exec_bind_message(StringInfo input_message)
 					  query_string,
 					  psrc->commandTag,
 					  cplan->stmt_list,
+					  NULL,
 					  cplan);
 
 	/* Portal is defined, set the plan ID based on its contents. */
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index d8fc75d0bb9..42ef3e82f82 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -37,6 +37,7 @@ Portal		ActivePortal = NULL;
 
 
 static void ProcessQuery(PlannedStmt *plan,
+						 EState *prep_estate,
 						 const char *sourceText,
 						 ParamListInfo params,
 						 QueryEnvironment *queryEnv,
@@ -72,7 +73,8 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
 				DestReceiver *dest,
 				ParamListInfo params,
 				QueryEnvironment *queryEnv,
-				int instrument_options)
+				int instrument_options,
+				EState *prep_estate)
 {
 	QueryDesc  *qd = palloc_object(QueryDesc);
 
@@ -93,6 +95,9 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
 	qd->planstate = NULL;
 	qd->totaltime = NULL;
 
+	/* Use the EState created by ExecutorPrep() if already done. */
+	qd->estate = prep_estate;
+
 	/* not yet executed */
 	qd->already_executed = false;
 
@@ -123,6 +128,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  *		PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
  *
  *	plan: the plan tree for the query
+ *	prep_estate: EState created in ExecutorPrep() for the query, if any
  *	sourceText: the source text of the query
  *	params: any parameters needed
  *	dest: where to send results
@@ -135,6 +141,7 @@ FreeQueryDesc(QueryDesc *qdesc)
  */
 static void
 ProcessQuery(PlannedStmt *plan,
+			 EState *prep_estate,
 			 const char *sourceText,
 			 ParamListInfo params,
 			 QueryEnvironment *queryEnv,
@@ -148,7 +155,8 @@ ProcessQuery(PlannedStmt *plan,
 	 */
 	queryDesc = CreateQueryDesc(plan, sourceText,
 								GetActiveSnapshot(), InvalidSnapshot,
-								dest, params, queryEnv, 0);
+								dest, params, queryEnv, 0,
+								prep_estate);
 
 	/*
 	 * Call ExecutorStart to prepare the plan for execution
@@ -495,7 +503,8 @@ PortalStart(Portal portal, ParamListInfo params,
 											None_Receiver,
 											params,
 											portal->queryEnv,
-											0);
+											0,
+											portal->prep_estate);
 
 				/*
 				 * If it's a scrollable cursor, executor needs to support
@@ -1265,7 +1274,7 @@ PortalRunMulti(Portal portal,
 			if (pstmt->canSetTag)
 			{
 				/* statement can set tag string */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, portal->prep_estate,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
@@ -1274,7 +1283,7 @@ PortalRunMulti(Portal portal,
 			else
 			{
 				/* stmt added by rewrite cannot set tag */
-				ProcessQuery(pstmt,
+				ProcessQuery(pstmt, portal->prep_estate,
 							 portal->sourceText,
 							 portal->portalParams,
 							 portal->queryEnv,
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index 493f9b0ee19..0ecda763d21 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -272,6 +272,11 @@ CreateNewPortal(void)
  * the passed plan trees have adequate lifetime.  Typically this is done by
  * copying them into the portal's context.
  *
+ * If prep_estate is not NULL, it is an EState created by ExecutorPrep()
+ * during GetCachedPlan().  It will be passed to ExecutorStart() to avoid
+ * redoing range table setup and pruning.  The portal takes ownership;
+ * the EState must have been allocated in the portal's memory context.
+ *
  * The caller is also responsible for ensuring that the passed prepStmtName
  * (if not NULL) and sourceText have adequate lifetime.
  *
@@ -286,6 +291,7 @@ PortalDefineQuery(Portal portal,
 				  const char *sourceText,
 				  CommandTag commandTag,
 				  List *stmts,
+				  EState *prep_estate,
 				  CachedPlan *cplan)
 {
 	Assert(PortalIsValid(portal));
@@ -299,6 +305,7 @@ PortalDefineQuery(Portal portal,
 	portal->commandTag = commandTag;
 	SetQueryCompletion(&portal->qc, commandTag, 0);
 	portal->stmts = stmts;
+	portal->prep_estate = prep_estate;
 	portal->cplan = cplan;
 	portal->status = PORTAL_DEFINED;
 }
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 472e141bba3..71ebe38bc86 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -64,7 +64,8 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
 							  ExplainState *es, ParseState *pstate,
 							  ParamListInfo params);
 
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt, EState *prep_estate,
+						   IntoClause *into,
 						   ExplainState *es, const char *queryString,
 						   ParamListInfo params, QueryEnvironment *queryEnv,
 						   const instr_time *planduration,
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index d3a57242844..3a2169c9613 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -43,7 +43,7 @@ typedef struct QueryDesc
 	QueryEnvironment *queryEnv; /* query environment passed in */
 	int			instrument_options; /* OR of InstrumentOption flags */
 
-	/* These fields are set by ExecutorStart */
+	/* These fields are set by ExecutorStart or ExecutorPrep */
 	TupleDesc	tupDesc;		/* descriptor for result tuples */
 	EState	   *estate;			/* executor's query-wide state */
 	PlanState  *planstate;		/* tree of per-plan-node state */
@@ -63,7 +63,8 @@ extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
 								  DestReceiver *dest,
 								  ParamListInfo params,
 								  QueryEnvironment *queryEnv,
-								  int instrument_options);
+								  int instrument_options,
+								  EState *prep_estate);
 
 extern void FreeQueryDesc(QueryDesc *qdesc);
 
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index 07f4b1f7490..fac5bef1384 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -21,6 +21,7 @@
 #include "nodes/lockoptions.h"
 #include "nodes/parsenodes.h"
 #include "utils/memutils.h"
+#include "utils/resowner.h"
 
 
 /*
@@ -235,6 +236,12 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
  */
 extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
 extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
+
+extern EState *ExecutorPrep(PlannedStmt *pstmt,
+							ParamListInfo params,
+							ResourceOwner owner,
+							int eflags);
+
 extern void ExecutorRun(QueryDesc *queryDesc,
 						ScanDirection direction, uint64 count);
 extern void standard_ExecutorRun(QueryDesc *queryDesc,
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index a7bedb12c18..a59e96fa11e 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -137,6 +137,7 @@ typedef struct PortalData
 	CommandTag	commandTag;		/* command tag for original query */
 	QueryCompletion qc;			/* command completion data for executed query */
 	List	   *stmts;			/* list of PlannedStmts */
+	EState	   *prep_estate;	/* EState from ExecutorPrep() if any */
 	CachedPlan *cplan;			/* CachedPlan, if stmts are from one */
 
 	ParamListInfo portalParams; /* params to pass to query */
@@ -240,6 +241,7 @@ extern void PortalDefineQuery(Portal portal,
 							  const char *sourceText,
 							  CommandTag commandTag,
 							  List *stmts,
+							  EState *prep_estate,
 							  CachedPlan *cplan);
 extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
 extern void PortalCreateHoldStore(Portal portal);
-- 
2.47.3



  [application/octet-stream] v10-0003-Use-pruning-aware-locking-in-cached-plans.patch (47.3K, 5-v10-0003-Use-pruning-aware-locking-in-cached-plans.patch)
From 648b9f5c89069692bbb46cf579576be50a9147f2 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Thu, 26 Mar 2026 18:15:39 +0900
Subject: [PATCH v10 3/5] Use pruning-aware locking in cached plans

Extend GetCachedPlan()'s lock acquisition to perform initial
partition pruning via ExecutorPrep(), then lock only the surviving
partitions.  This avoids unnecessary locking of pruned partitions
when reusing a generic cached plan.

Introduce CachedPlanPrepData to carry the EState created by
ExecutorPrep() through the plan caching layer.  The prep_estate
field is populated when GetCachedPlan() prepares a reused
single-statement generic plan.  Adjust call sites in SPI,
portals, and EXPLAIN to propagate this to ExecutorStart().

Disable pruning-aware locking for multi-statement CachedPlans, which
arise from rule rewriting.  PortalRunMulti() executes such statements
sequentially with CommandCounterIncrement() between them, so later
statements' pruning expressions may see different results depending
on when they are evaluated.  Evaluating all statements' pruning
upfront during GetCachedPlan() would produce stale results for later
statements.  Additionally, PortalRunMulti() calls
MemoryContextDeleteChildren(portalContext) between statements, which
would destroy EStates prepared for later statements.  The fallback
to locking all partitions is safe and sufficient here; multi-statement
plans from rule rewriting are uncommon.
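
As a rough illustration of the locking decision described so far (not the
patch's actual code), the following standalone sketch models relation sets as
64-bit masks instead of Bitmapsets.  MockStmt, mock_initial_pruning() and
relids_to_lock() are hypothetical names invented for this example, and the
"matching" argument merely stands in for the outcome of initial pruning.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a PlannedStmt: bit N set means RT index N. */
typedef struct MockStmt
{
	uint64_t	unprunable_relids;	/* always locked */
	uint64_t	prunable_relids;	/* partitions subject to initial pruning */
	uint64_t	first_result_rels;	/* locked even if pruned */
} MockStmt;

/*
 * Hypothetical initial pruning: keep only the prunable partitions whose bit
 * is also set in 'matching', which stands in for the pruning result.
 */
static uint64_t
mock_initial_pruning(const MockStmt *stmt, uint64_t matching)
{
	return stmt->prunable_relids & matching;
}

/*
 * Compute the set of relations to lock.  If pruning-aware locking is not
 * requested, or the plan has more than one statement, fall back to locking
 * every relation.
 */
static uint64_t
relids_to_lock(const MockStmt *stmt, int nstmts, bool pruning_aware,
			   uint64_t matching)
{
	assert(nstmts >= 1);

	if (!pruning_aware || nstmts > 1)
		return stmt->unprunable_relids | stmt->prunable_relids;

	return stmt->unprunable_relids |
		mock_initial_pruning(stmt, matching) |
		stmt->first_result_rels;
}
```

In this model a single-statement plan locks the unprunable relations, the
partitions that survive pruning, and the first result relation; any
multi-statement plan takes the lock-everything path.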

Partition pruning expressions may call PL functions that require
an active snapshot (e.g., via EnsurePortalSnapshotExists()).
AcquireExecutorLocksUnpruned() establishes one before calling
ExecutorPrep() if needed, ensuring these expressions can execute
correctly during plan cache validation.

To maintain correctness when all target partitions are pruned, also
reinstate the firstResultRel locking behavior lost in commit
28317de72. That commit required the first ModifyTable target to
remain initialized for executor assumptions to hold. We now
explicitly track these relids in PlannerGlobal and PlannedStmt so they
are locked even if pruned, preserving that rule across cached plan
reuse.

Regression tests are included to verify:

- Only surviving partitions are locked when pruning is enabled, and
  all partitions are locked when it is disabled (pg_locks inspection).
- Multiple ModifyTable nodes (via writable CTEs) handle the case where
  all target partitions are pruned, exercising firstResultRels.
- Plan invalidation during pruning-aware lock setup (DDL triggered by
  a pruning expression) discards the prep state and replans cleanly.
- Multi-statement CachedPlans (from rule rewriting) fall back to
  locking all partitions, avoiding stale pruning and use-after-free.

Note for extension authors: code that accesses partition relations
through EState must check that the RT index is a member of
es_unpruned_relids before opening the relation.  Previously this was
an optimization (avoid processing pruned partitions); it is now a
correctness requirement, because pruned partitions may not be locked.
ExecGetRangeTableRelation() already enforces this with an error when
called on a pruned relation.
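
A minimal standalone sketch of the check described in the note above, under
the assumption that the set of unpruned RT indexes can be modelled as a
64-bit mask.  MockEState, mock_rel_is_unpruned() and mock_open_unpruned() are
hypothetical; in the backend the membership test would presumably be
bms_is_member(rti, estate->es_unpruned_relids).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for EState; bit N set means RT index N is unpruned. */
typedef struct MockEState
{
	uint64_t	es_unpruned_relids;
} MockEState;

/* Return true only if the relation at 'rti' survived initial pruning. */
static bool
mock_rel_is_unpruned(const MockEState *estate, unsigned rti)
{
	assert(rti >= 1 && rti < 64);
	return (estate->es_unpruned_relids & (UINT64_C(1) << rti)) != 0;
}

/*
 * Walk RT indexes 1..nrels (nrels must be below 64), skipping pruned ones,
 * and return how many relations would have been opened.
 */
static int
mock_open_unpruned(const MockEState *estate, unsigned nrels)
{
	int			nopened = 0;

	for (unsigned rti = 1; rti <= nrels; rti++)
	{
		if (!mock_rel_is_unpruned(estate, rti))
			continue;			/* pruned: may be unlocked, do not open */
		nopened++;				/* real code would open the relation here */
	}
	return nopened;
}
```

The point of the sketch is only the ordering: test membership first, and touch
the relation only when the test passes.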
---
 src/backend/commands/prepare.c                |  19 +-
 src/backend/executor/execMain.c               |   4 +
 src/backend/executor/functions.c              |   1 +
 src/backend/executor/nodeModifyTable.c        |   5 +-
 src/backend/executor/spi.c                    |  24 +-
 src/backend/optimizer/plan/planner.c          |   1 +
 src/backend/optimizer/plan/setrefs.c          |  18 ++
 src/backend/tcop/postgres.c                   |   8 +-
 src/backend/tcop/pquery.c                     |   1 +
 src/backend/utils/cache/plancache.c           | 246 +++++++++++++++++-
 src/include/nodes/pathnodes.h                 |   3 +
 src/include/nodes/plannodes.h                 |  10 +
 src/include/utils/plancache.h                 |  38 ++-
 src/test/regress/expected/partition_prune.out | 184 +++++++++++++
 src/test/regress/expected/plancache.out       |  63 +++++
 src/test/regress/sql/partition_prune.sql      | 116 +++++++++
 src/test/regress/sql/plancache.sql            |  52 ++++
 17 files changed, 769 insertions(+), 24 deletions(-)

diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index c24d97f7e5a..621fd30fd5e 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -156,6 +156,7 @@ ExecuteQuery(ParseState *pstate,
 {
 	PreparedStatement *entry;
 	CachedPlan *cplan;
+	CachedPlanPrepData cprep = {0};
 	List	   *plan_list;
 	ParamListInfo paramLI = NULL;
 	EState	   *estate = NULL;
@@ -195,8 +196,11 @@ ExecuteQuery(ParseState *pstate,
 									   entry->plansource->query_string);
 
 	/* Replan if needed, and increment plan refcount for portal */
-	cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL);
+	cprep.context = portal->portalContext;
+	cprep.owner = portal->resowner;
+	cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL, &cprep);
 	plan_list = cplan->stmt_list;
+	Assert(cprep.prep_estate == NULL || list_length(plan_list) == 1);
 
 	/*
 	 * DO NOT add any logic that could possibly throw an error between
@@ -207,7 +211,7 @@ ExecuteQuery(ParseState *pstate,
 					  query_string,
 					  entry->plansource->commandTag,
 					  plan_list,
-					  NULL,
+					  cprep.prep_estate,
 					  cplan);
 
 	/*
@@ -577,6 +581,7 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	PreparedStatement *entry;
 	const char *query_string;
 	CachedPlan *cplan;
+	CachedPlanPrepData cprep = {0};
 	List	   *plan_list;
 	ListCell   *p;
 	ParamListInfo paramLI = NULL;
@@ -633,8 +638,13 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	}
 
 	/* Replan if needed, and acquire a transient refcount */
+	cprep.context = CurrentMemoryContext;
+	cprep.owner = CurrentResourceOwner;
+	if (es->generic)
+		cprep.eflags = EXEC_FLAG_EXPLAIN_GENERIC;
 	cplan = GetCachedPlan(entry->plansource, paramLI,
-						  CurrentResourceOwner, pstate->p_queryEnv);
+						  CurrentResourceOwner, pstate->p_queryEnv,
+						  &cprep);
 
 	INSTR_TIME_SET_CURRENT(planduration);
 	INSTR_TIME_SUBTRACT(planduration, planstart);
@@ -655,12 +665,13 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	plan_list = cplan->stmt_list;
 
 	/* Explain each query */
+	Assert(cprep.prep_estate == NULL || list_length(plan_list) == 1);
 	foreach(p, plan_list)
 	{
 		PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
 
 		if (pstmt->commandType != CMD_UTILITY)
-			ExplainOnePlan(pstmt, NULL,
+			ExplainOnePlan(pstmt, cprep.prep_estate,
 						   into, es, query_string, paramLI, pstate->p_queryEnv,
 						   &planduration, (es->buffers ? &bufusage : NULL),
 						   es->memory ? &mem_counters : NULL);
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index cc7794f58db..051b5d7bfcf 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -334,6 +334,10 @@ standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
  *
  * Returns an EState that the caller must either pass to ExecutorStart()
  * for reuse or free via FreeExecutorState() if execution will not proceed.
+ * GetCachedPlan() uses this to determine, based on initial pruning
+ * results, which partitions to lock; if the resulting EState is not
+ * delivered to ExecutorStart(), the executor would operate on unlocked
+ * relations.  See the assert checks in standard_ExecutorStart().
  */
 EState *
 ExecutorPrep(PlannedStmt *pstmt, ParamListInfo params, ResourceOwner owner,
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index 952a784c924..c0ca72b38dd 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -699,6 +699,7 @@ init_execution_state(SQLFunctionCachePtr fcache)
 	fcache->cplan = GetCachedPlan(plansource,
 								  fcache->paramLI,
 								  fcache->cowner,
+								  NULL,
 								  NULL);
 
 	/*
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 4cd5e262e0f..9230f2b554f 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -4865,8 +4865,8 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
 	 * as a reference for building the ResultRelInfo of the target partition.
 	 * In either case, it doesn't matter which result relation is kept, so we
 	 * just keep the first one, if all others have been pruned.  See also,
-	 * ExecDoInitialPruning(), which ensures that this first result relation
-	 * has been locked.
+	 * AcquireExecutorLocksUnpruned(), which ensures that this first result
+	 * relation has been locked.
 	 */
 	i = 0;
 	foreach(l, node->resultRelations)
@@ -4880,6 +4880,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
 			/* all result relations pruned; keep the first one */
 			keep_rel = true;
 			rti = linitial_int(node->resultRelations);
+			Assert(list_member_int(estate->es_plannedstmt->firstResultRels, rti));
 			i = 0;
 		}
 
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 32c9d987c59..eb9552f85db 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1580,6 +1580,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 {
 	CachedPlanSource *plansource;
 	CachedPlan *cplan;
+	CachedPlanPrepData cprep = {0};
 	List	   *stmt_list;
 	char	   *query_string;
 	Snapshot	snapshot;
@@ -1660,8 +1661,12 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 	 */
 
 	/* Replan if needed, and increment plan refcount for portal */
-	cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv);
+	cprep.context = portal->portalContext;
+	cprep.owner = portal->resowner;
+	cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv,
+						  &cprep);
 	stmt_list = cplan->stmt_list;
+	Assert(cprep.prep_estate == NULL || list_length(stmt_list) == 1);
 
 	if (!plan->saved)
 	{
@@ -1670,7 +1675,10 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 		 * so must copy the plan into the portal's context.  An error here
 		 * will result in leaking our refcount on the plan, but it doesn't
 		 * matter because the plan is unsaved and hence transient anyway.
+		 *
+		 * Unsaved plans use custom plans, so prep should be a no-op.
 		 */
+		Assert(cprep.prep_estate == NULL);
 		oldcontext = MemoryContextSwitchTo(portal->portalContext);
 		stmt_list = copyObject(stmt_list);
 		MemoryContextSwitchTo(oldcontext);
@@ -1686,7 +1694,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 					  query_string,
 					  plansource->commandTag,
 					  stmt_list,
-					  NULL,
+					  cprep.prep_estate,
 					  cplan);
 
 	/*
@@ -2104,7 +2112,8 @@ SPI_plan_get_cached_plan(SPIPlanPtr plan)
 	/* Get the generic plan for the query */
 	cplan = GetCachedPlan(plansource, NULL,
 						  plan->saved ? CurrentResourceOwner : NULL,
-						  _SPI_current->queryEnv);
+						  _SPI_current->queryEnv,
+						  NULL);
 	Assert(cplan == plansource->gplan);
 
 	/* Pop the error context stack */
@@ -2501,6 +2510,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 		CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
 		List	   *stmt_list;
 		ListCell   *lc2;
+		CachedPlanPrepData cprep = {0};
 
 		spicallbackarg.query = plansource->query_string;
 
@@ -2575,8 +2585,11 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 		 * Replan if needed, and increment plan refcount.  If it's a saved
 		 * plan, the refcount must be backed by the plan_owner.
 		 */
+		cprep.context = CurrentMemoryContext;
+		cprep.owner = CurrentResourceOwner;
 		cplan = GetCachedPlan(plansource, options->params,
-							  plan_owner, _SPI_current->queryEnv);
+							  plan_owner, _SPI_current->queryEnv,
+							  &cprep);
 
 		stmt_list = cplan->stmt_list;
 
@@ -2616,6 +2629,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 			}
 		}
 
+		Assert(cprep.prep_estate == NULL || list_length(stmt_list) == 1);
 		foreach(lc2, stmt_list)
 		{
 			PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
@@ -2697,7 +2711,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 										options->params,
 										_SPI_current->queryEnv,
 										0,
-										NULL);
+										cprep.prep_estate);
 				res = _SPI_pquery(qdesc, fire_triggers,
 								  canSetTag ? options->tcount : 0);
 				FreeQueryDesc(qdesc);
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 42604a0f75c..afa61d357c5 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -657,6 +657,7 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 	result->permInfos = glob->finalrteperminfos;
 	result->subrtinfos = glob->subrtinfos;
 	result->resultRelations = glob->resultRelations;
+	result->firstResultRels = glob->firstResultRels;
 	result->appendRelations = glob->appendRelations;
 	result->subplans = glob->subplans;
 	result->rewindPlanIDs = glob->rewindPlanIDs;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 1b5b9b5ed9c..8c9956e687e 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -384,6 +384,24 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 		}
 	}
 
+	/*
+	 * Record the first result relation if it belongs to the set of
+	 * initially prunable relations.  We use bms_next_member() to get
+	 * the lowest-numbered leaf result rel, which matches
+	 * linitial_int(ModifyTable.resultRelations) because partition
+	 * expansion preserves RT index order.  ExecInitModifyTable() asserts
+	 * that the recorded index matches what it actually needs.
+	 */
+	if (root->leaf_result_relids)
+	{
+		Index	firstResultRel = bms_next_member(root->leaf_result_relids, -1);
+
+		firstResultRel += rtoffset;
+		if (bms_is_member(firstResultRel, root->glob->prunableRelids))
+			root->glob->firstResultRels =
+				lappend_int(root->glob->firstResultRels, firstResultRel);
+	}
+
 	return result;
 }
 
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index ccdb6c01071..487258641a5 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1637,6 +1637,7 @@ exec_bind_message(StringInfo input_message)
 	int16	   *rformats = NULL;
 	CachedPlanSource *psrc;
 	CachedPlan *cplan;
+	CachedPlanPrepData cprep = {0};
 	Portal		portal;
 	char	   *query_string;
 	char	   *saved_stmt_name;
@@ -2018,7 +2019,10 @@ exec_bind_message(StringInfo input_message)
 	 * will be generated in MessageContext.  The plan refcount will be
 	 * assigned to the Portal, so it will be released at portal destruction.
 	 */
-	cplan = GetCachedPlan(psrc, params, NULL, NULL);
+	cprep.context = portal->portalContext;
+	cprep.owner = portal->resowner;
+	cplan = GetCachedPlan(psrc, params, NULL, NULL, &cprep);
+	Assert(cprep.prep_estate == NULL || list_length(cplan->stmt_list) == 1);
 
 	/*
 	 * Now we can define the portal.
@@ -2031,7 +2035,7 @@ exec_bind_message(StringInfo input_message)
 					  query_string,
 					  psrc->commandTag,
 					  cplan->stmt_list,
-					  NULL,
+					  cprep.prep_estate,
 					  cplan);
 
 	/* Portal is defined, set the plan ID based on its contents. */
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 42ef3e82f82..b52c4c619ee 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -1214,6 +1214,7 @@ PortalRunMulti(Portal portal,
 	 * Loop to handle the individual queries generated from a single parsetree
 	 * by analysis and rewrite.
 	 */
+	Assert(portal->prep_estate == NULL || list_length(portal->stmts) == 1);
 	foreach(stmtlist_item, portal->stmts)
 	{
 		PlannedStmt *pstmt = lfirst_node(PlannedStmt, stmtlist_item);
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 698e7c1aa22..b0c4d62564d 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -93,14 +93,17 @@ static bool StmtPlanRequiresRevalidation(CachedPlanSource *plansource);
 static bool BuildingPlanRequiresSnapshot(CachedPlanSource *plansource);
 static List *RevalidateCachedQuery(CachedPlanSource *plansource,
 								   QueryEnvironment *queryEnv);
-static bool CheckCachedPlan(CachedPlanSource *plansource);
+static bool CheckCachedPlan(CachedPlanSource *plansource, CachedPlanPrepData *cprep);
 static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 								   ParamListInfo boundParams, QueryEnvironment *queryEnv);
 static bool choose_custom_plan(CachedPlanSource *plansource,
 							   ParamListInfo boundParams);
 static double cached_plan_cost(CachedPlan *plan, bool include_planner);
 static Query *QueryListGetPrimaryStmt(List *stmts);
-static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static void AcquireExecutorLocksAll(List *stmt_list, bool acquire);
+static void AcquireExecutorLocksUnpruned(List *stmt_list, bool acquire,
+										 CachedPlanPrepData *cprep);
+static void CachedPlanPrepCleanup(CachedPlanPrepData *cprep);
 static void AcquirePlannerLocks(List *stmt_list, bool acquire);
 static void ScanQueryForLocks(Query *parsetree, bool acquire);
 static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -942,6 +945,12 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
 /*
  * CheckCachedPlan: see if the CachedPlanSource's generic plan is valid.
  *
+ * If 'cprep' is not NULL and the generic plan contains only a single
+ * statement, ExecutorPrep() is applied to that PlannedStmt to compute the set
+ * of partitions that survive initial runtime pruning in order to only lock
+ * them.  The EState is saved in cprep->prep_estate, which must be passed to
+ * ExecutorStart() for reuse.
+ *
  * Caller must have already called RevalidateCachedQuery to verify that the
  * querytree is up to date.
  *
@@ -949,7 +958,7 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
  * (We must do this for the "true" result to be race-condition-free.)
  */
 static bool
-CheckCachedPlan(CachedPlanSource *plansource)
+CheckCachedPlan(CachedPlanSource *plansource, CachedPlanPrepData *cprep)
 {
 	CachedPlan *plan = plansource->gplan;
 
@@ -983,7 +992,19 @@ CheckCachedPlan(CachedPlanSource *plansource)
 		 */
 		Assert(plan->refcount > 0);
 
-		AcquireExecutorLocks(plan->stmt_list, true);
+		/*
+		 * Multi-statement CachedPlans (from rule rewriting) must not
+		 * use pruning-aware locking, because later statements' pruning
+		 * expressions could see stale results if evaluated before
+		 * earlier statements have executed.
+		 */
+		if (cprep && list_length(plan->stmt_list) > 1)
+			cprep = NULL;
+
+		if (cprep)
+			AcquireExecutorLocksUnpruned(plan->stmt_list, true, cprep);
+		else
+			AcquireExecutorLocksAll(plan->stmt_list, true);
 
 		/*
 		 * If plan was transient, check to see if TransactionXmin has
@@ -1005,7 +1026,13 @@ CheckCachedPlan(CachedPlanSource *plansource)
 		}
 
 		/* Oops, the race case happened.  Release useless locks. */
-		AcquireExecutorLocks(plan->stmt_list, false);
+		if (cprep)
+			AcquireExecutorLocksUnpruned(plan->stmt_list, false, cprep);
+		else
+			AcquireExecutorLocksAll(plan->stmt_list, false);
+
+		/* Also clean up ExecutorPrep() state, if necessary. */
+		CachedPlanPrepCleanup(cprep);
 	}
 
 	/*
@@ -1285,6 +1312,16 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
  * On return, the plan is valid and we have sufficient locks to begin
  * execution.
  *
+ * If 'cprep' is not NULL and a single-statement generic plan is reused,
+ * the function performs initial pruning via ExecutorPrep() and locks only
+ * the surviving partitions.  The resulting EState is stored in
+ * cprep->prep_estate and must be delivered to ExecutorStart() via
+ * QueryDesc->estate (or the equivalent portal/SPI path).  Failure
+ * to do so means the executor will operate on relations for which
+ * locks were never acquired.  Passing NULL for cprep is always safe;
+ * all partitions are locked as before.  Multi-statement plans also
+ * fall back to locking all partitions.
+ *
  * On return, the refcount of the plan has been incremented; a later
  * ReleaseCachedPlan() call is expected.  If "owner" is not NULL then
  * the refcount has been reported to that ResourceOwner (note that this
@@ -1295,7 +1332,8 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
  */
 CachedPlan *
 GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
-			  ResourceOwner owner, QueryEnvironment *queryEnv)
+			  ResourceOwner owner, QueryEnvironment *queryEnv,
+			  CachedPlanPrepData *cprep)
 {
 	CachedPlan *plan = NULL;
 	List	   *qlist;
@@ -1317,7 +1355,9 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 
 	if (!customplan)
 	{
-		if (CheckCachedPlan(plansource))
+		if (cprep)
+			cprep->params = boundParams;
+		if (CheckCachedPlan(plansource, cprep))
 		{
 			/* We want a generic plan, and we already have a valid one */
 			plan = plansource->gplan;
@@ -1904,11 +1944,13 @@ QueryListGetPrimaryStmt(List *stmts)
 }
 
 /*
- * AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
- * or release them if acquire is false.
+ * AcquireExecutorLocksAll: acquire locks needed for execution of a cached
+ * plan; or release them if acquire is false.
+ *
+ * This locks all relations in a given PlannedStmt's range table.
  */
 static void
-AcquireExecutorLocks(List *stmt_list, bool acquire)
+AcquireExecutorLocksAll(List *stmt_list, bool acquire)
 {
 	ListCell   *lc1;
 
@@ -1955,6 +1997,190 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 	}
 }
 
+/*
+ * LockRelids
+ * 		Acquire or release locks on the specified relids, which reference
+ * 		entries in the provided range table.
+ *
+ * Helper for AcquireExecutorLocksUnpruned().
+ */
+static void
+LockRelids(List *rtable, Bitmapset *relids, bool acquire)
+{
+	int	rtindex = -1;
+
+	while ((rtindex = bms_next_member(relids, rtindex)) >= 0)
+	{
+		RangeTblEntry *rte = list_nth_node(RangeTblEntry, rtable, rtindex - 1);
+
+		Assert(rte->rtekind == RTE_RELATION ||
+			   (rte->rtekind == RTE_SUBQUERY && OidIsValid(rte->relid)));
+
+		/*
+		 * Acquire the appropriate type of lock on each relation OID. Note
+		 * that we don't actually try to open the rel, and hence will not
+		 * fail if it's been dropped entirely --- we'll just transiently
+		 * acquire a non-conflicting lock.
+		 */
+		if (acquire)
+			LockRelationOid(rte->relid, rte->rellockmode);
+		else
+			UnlockRelationOid(rte->relid, rte->rellockmode);
+	}
+}
+
+/*
+ * AcquireExecutorLocksUnpruned
+ *		Acquire or release execution locks for only unpruned relations
+ *		referenced by the given single-statement PlannedStmt list.
+ *
+ * On acquire, this:
+ *	- locks unprunable rels listed in PlannedStmt.unprunableRelids
+ *	- runs ExecutorPrep() to perform initial runtime pruning
+ *	- locks the surviving partitions reported in the prep estate
+ *	- stores the EState in cprep->prep_estate
+ *
+ * On release, it:
+ *	- uses the EState in cprep->prep_estate to determine which
+ *	  relids to unlock
+ *
+ * Memory allocation for the EState happens in cprep->context.
+ * Locks are acquired using cprep->owner.
+ */
+static void
+AcquireExecutorLocksUnpruned(List *stmt_list, bool acquire,
+							 CachedPlanPrepData *cprep)
+{
+	MemoryContext oldcontext;
+	ListCell   *lc1;
+	EState	   *prep_estate;
+
+	Assert(cprep);
+	oldcontext = MemoryContextSwitchTo(cprep->context);
+	/*
+	 * When releasing locks, use the EState created during acquisition to
+	 * determine which relids to unlock.
+	 */
+	prep_estate = cprep->prep_estate;
+	Assert(!acquire || prep_estate == NULL);
+	foreach(lc1, stmt_list)
+	{
+		PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
+
+		if (plannedstmt->commandType == CMD_UTILITY)
+		{
+			/* Same as AcquireExecutorLocksAll(). */
+			Query	   *query = UtilityContainsQuery(plannedstmt->utilityStmt);
+
+			if (query)
+				ScanQueryForLocks(query, acquire);
+			continue;
+		}
+
+		/*
+		 * Lock tables mentioned in the original query and other unprunable
+		 * relations that were added to the plan via inheritance expansion.
+		 */
+		LockRelids(plannedstmt->rtable, plannedstmt->unprunableRelids, acquire);
+
+		/* Lock partitions surviving runtime initial pruning. */
+		if (acquire)
+		{
+			/*
+			 * Pruning expressions may call PL functions that require an active
+			 * snapshot (e.g., via EnsurePortalSnapshotExists()). Establish one
+			 * if needed.
+			 */
+			bool		snap_pushed = false;
+
+			if (!ActiveSnapshotSet())
+			{
+				PushActiveSnapshot(GetTransactionSnapshot());
+				snap_pushed = true;
+			}
+
+			prep_estate = ExecutorPrep(plannedstmt, cprep->params,
+									   cprep->owner, cprep->eflags);
+			Assert(prep_estate);
+			cprep->prep_estate = prep_estate;
+
+			if (snap_pushed)
+				PopActiveSnapshot();
+		}
+
+		if (prep_estate)
+		{
+			/*
+			 * es_unpruned_relids includes plannedstmt->unprunableRelids,
+			 * which we've already locked. Filter them out to avoid double-locking.
+			 */
+			Bitmapset *lock_relids = bms_difference(prep_estate->es_unpruned_relids,
+													plannedstmt->unprunableRelids);
+
+			/*
+			 * Always lock the first result relation of each ModifyTable
+			 * node in the plan (those listed in
+			 * plannedstmt->firstResultRels), even if it was pruned, to
+			 * satisfy the executor assumptions described in
+			 * ExecInitModifyTable().  This can be wasteful: the first
+			 * result relation may not be needed at all if other result
+			 * relations of the same node are unpruned and thus sufficient
+			 * for the node's needs.  Unfortunately, there is no per-node
+			 * unpruned_relids set from which to determine that other result
+			 * relations are included.
+			 */
+			if (plannedstmt->resultRelations)
+			{
+				ListCell *lc2;
+
+				foreach(lc2, plannedstmt->firstResultRels)
+				{
+					Index       firstResultRel = lfirst_int(lc2);
+
+					if (!bms_is_member(firstResultRel, lock_relids))
+						lock_relids = bms_add_member(lock_relids, firstResultRel);
+				}
+			}
+
+			LockRelids(plannedstmt->rtable, lock_relids, acquire);
+			bms_free(lock_relids);
+		}
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+}
+
+/*
+ * CachedPlanPrepCleanup
+ *		Dispose of EState built during pruning-aware lock acquisition.
+ *
+ * This is used when CheckCachedPlan() discovers that a CachedPlan has
+ * become invalid after AcquireExecutorLocksUnpruned() has already run.
+ * The execution locks have already been released by that point; this
+ * function frees the EState that the executor will never see.
+ */
+static void
+CachedPlanPrepCleanup(CachedPlanPrepData *cprep)
+{
+	EState   *prep_estate;
+	ResourceOwner oldowner;
+
+	if (cprep == NULL)
+		return;
+
+	/* Switch to owner that ExecutorPrep() would have used. */
+	oldowner = CurrentResourceOwner;
+	CurrentResourceOwner = cprep->owner;
+
+	prep_estate = cprep->prep_estate;
+	Assert(prep_estate);
+	ExecCloseRangeTableRelations(prep_estate);
+	FreeExecutorState(prep_estate);
+	CurrentResourceOwner = oldowner;
+
+	cprep->prep_estate = NULL;
+}
+
 /*
  * AcquirePlannerLocks: acquire locks needed for planning of a querytree list;
  * or release them if acquire is false.
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 27758ec16fe..4fd9d9bcc56 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -217,6 +217,9 @@ typedef struct PlannerGlobal
 	/* "flat" list of integer RT indexes */
 	List	   *resultRelations;
 
+	/* "flat" list of integer RT indexes (one per ModifyTable node) */
+	List	   *firstResultRels;
+
 	/* "flat" list of AppendRelInfos */
 	List	   *appendRelations;
 
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index b6185825fcb..55279cbbda8 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -121,6 +121,16 @@ typedef struct PlannedStmt
 	/* integer list of RT indexes, or NIL */
 	List	   *resultRelations;
 
+	/*
+	 * rtable indexes of first target relation in each ModifyTable node in the
+	 * plan for INSERT/UPDATE/DELETE/MERGE.  NIL if resultRelations is NIL.
+	 *
+	 * These are used by AcquireExecutorLocksUnpruned() to ensure that the
+	 * first result rel for each ModifyTable remains locked even if pruned;
+	 * see ExecInitModifyTable() for the executor side assumptions.
+	 */
+	List	   *firstResultRels;
+
 	/* list of AppendRelInfo nodes */
 	List	   *appendRelations;
 
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 7a4a85c8038..1a153b816eb 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -27,6 +27,9 @@
 typedef struct Query Query;
 typedef struct RawStmt RawStmt;
 
+/* to avoid including execnodes.h */
+typedef struct EState EState;
+
 /* possible values for plan_cache_mode */
 typedef enum
 {
@@ -196,6 +199,38 @@ typedef struct CachedExpression
 	dlist_node	node;			/* link in global list of CachedExpressions */
 } CachedExpression;
 
+/*
+ * CachedPlanPrepData
+ *		Carries ExecutorPrep results for a CachedPlan's PlannedStmt,
+ *		along with context and owner information needed to allocate them.
+ *
+ * prep_estate is populated when GetCachedPlan() prepares a reused
+ * single-statement generic plan.  Multi-statement plans (from rule
+ * rewriting) fall back to locking all partitions and leave this NULL.
+ * If the plan is found invalid after locking, the EState is freed
+ * by CachedPlanPrepCleanup() before retrying.
+ *
+ * ExecutorPrep state is allocated in 'context' and owned by 'owner'.
+ *
+ * eflags controls ExecutorPrep() behavior during initial pruning.
+ * Normally zero; set EXEC_FLAG_EXPLAIN_GENERIC to suppress pruning
+ * in EXPLAIN (GENERIC_PLAN).  Need not match the eflags later passed
+ * to ExecutorStart().
+ *
+ * prep_estate must reach ExecutorStart() to be adopted for execution.
+ * If the plan is invalidated before that happens, CachedPlanPrepCleanup()
+ * frees it instead.  The EState is allocated in 'context' and its
+ * resources tracked under 'owner', which the caller sets to match the
+ * execution environment (e.g., portal context and resowner).
+ */
+typedef struct CachedPlanPrepData
+{
+	EState  *prep_estate;	/* EState for the PlannedStmt */
+	ParamListInfo params;	/* params visible to ExecutorPrep */
+	MemoryContext context;	/* where to allocate EState and its fields */
+	ResourceOwner owner;	/* ResourceOwner for ExecutorPrep state */
+	int		eflags;			/* executor flags to control ExecutorPrep */
+} CachedPlanPrepData;
 
 extern void InitPlanCache(void);
 extern void ResetPlanCache(void);
@@ -240,7 +275,8 @@ extern List *CachedPlanGetTargetList(CachedPlanSource *plansource,
 extern CachedPlan *GetCachedPlan(CachedPlanSource *plansource,
 								 ParamListInfo boundParams,
 								 ResourceOwner owner,
-								 QueryEnvironment *queryEnv);
+								 QueryEnvironment *queryEnv,
+								 CachedPlanPrepData *cprep);
 extern void ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner);
 
 extern bool CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
diff --git a/src/test/regress/expected/partition_prune.out b/src/test/regress/expected/partition_prune.out
index deacdd75807..61781389d2f 100644
--- a/src/test/regress/expected/partition_prune.out
+++ b/src/test/regress/expected/partition_prune.out
@@ -4824,3 +4824,187 @@ select min(a) over (partition by a order by a) from part_abc where a >= stable_o
 
 drop view part_abc_view;
 drop table part_abc;
+--
+-- Verify that pruning-aware locking skips pruned partitions
+-- when reusing a generic cached plan.
+--
+set plan_cache_mode to force_generic_plan;
+create table prunelock_p (a int) partition by list (a);
+create table prunelock_p1 partition of prunelock_p for values in (1);
+create table prunelock_p2 partition of prunelock_p for values in (2);
+create table prunelock_p3 partition of prunelock_p for values in (3);
+prepare prunelock_q (int) as select * from prunelock_p where a = $1;
+-- Force generic plan creation
+explain (costs off) execute prunelock_q(1);
+                  QUERY PLAN                  
+----------------------------------------------
+ Append
+   Subplans Removed: 2
+   ->  Seq Scan on prunelock_p1 prunelock_p_1
+         Filter: (a = $1)
+(4 rows)
+
+-- Execute and check which child partitions are locked
+begin;
+execute prunelock_q(1);
+ a 
+---
+(0 rows)
+
+select c.relname
+  from pg_locks l
+  join pg_class c on c.oid = l.relation
+ where l.pid = pg_backend_pid()
+   and c.relname like 'prunelock_p_'
+ order by c.relname;
+   relname    
+--------------
+ prunelock_p1
+(1 row)
+
+commit;
+deallocate prunelock_q;
+-- Turn pruning off
+set enable_partition_pruning to off;
+prepare prunelock_q (int) as select * from prunelock_p where a = $1;
+-- Force generic plan creation
+explain (costs off) execute prunelock_q(1);
+                  QUERY PLAN                  
+----------------------------------------------
+ Append
+   ->  Seq Scan on prunelock_p1 prunelock_p_1
+         Filter: (a = $1)
+   ->  Seq Scan on prunelock_p2 prunelock_p_2
+         Filter: (a = $1)
+   ->  Seq Scan on prunelock_p3 prunelock_p_3
+         Filter: (a = $1)
+(7 rows)
+
+-- Execute and check which child partitions are locked
+begin;
+execute prunelock_q(1);
+ a 
+---
+(0 rows)
+
+select c.relname
+  from pg_locks l
+  join pg_class c on c.oid = l.relation
+ where l.pid = pg_backend_pid()
+   and c.relname like 'prunelock_p_'
+ order by c.relname;
+   relname    
+--------------
+ prunelock_p1
+ prunelock_p2
+ prunelock_p3
+(3 rows)
+
+commit;
+deallocate prunelock_q;
+reset enable_partition_pruning;
+--
+-- Verify firstResultRels handling with multiple ModifyTable nodes
+-- (writable CTEs) targeting a partitioned table.  When a pruning
+-- parameter matches no partition, all result relations are pruned
+-- and the executor must still find a usable first result relation
+-- for each ModifyTable node.
+--
+prepare prunelock_mt_q (int, int) as
+  with upd1 as (update prunelock_p set a = a),
+       upd2 as (update prunelock_p set a = a where a = $2)
+  update prunelock_p set a = a where a = $1;
+-- Force generic plan creation
+explain (costs off) execute prunelock_mt_q(1, 2);
+                         QUERY PLAN                         
+------------------------------------------------------------
+ Update on prunelock_p
+   Update on prunelock_p1 prunelock_p_1
+   CTE upd1
+     ->  Update on prunelock_p prunelock_p_3
+           Update on prunelock_p1 prunelock_p_4
+           Update on prunelock_p2 prunelock_p_5
+           Update on prunelock_p3 prunelock_p_6
+           ->  Append
+                 ->  Seq Scan on prunelock_p1 prunelock_p_4
+                 ->  Seq Scan on prunelock_p2 prunelock_p_5
+                 ->  Seq Scan on prunelock_p3 prunelock_p_6
+   CTE upd2
+     ->  Update on prunelock_p prunelock_p_7
+           Update on prunelock_p2 prunelock_p_8
+           ->  Append
+                 Subplans Removed: 2
+                 ->  Seq Scan on prunelock_p2 prunelock_p_8
+                       Filter: (a = $2)
+   ->  Append
+         Subplans Removed: 2
+         ->  Seq Scan on prunelock_p1 prunelock_p_1
+               Filter: (a = $1)
+(22 rows)
+
+-- All partitions pruned: value 4 matches no partition, so each
+-- ModifyTable must still initialize correctly with no matching
+-- result relations.
+explain (costs off) execute prunelock_mt_q(4, 5);
+                         QUERY PLAN                         
+------------------------------------------------------------
+ Update on prunelock_p
+   CTE upd1
+     ->  Update on prunelock_p prunelock_p_2
+           Update on prunelock_p1 prunelock_p_3
+           Update on prunelock_p2 prunelock_p_4
+           Update on prunelock_p3 prunelock_p_5
+           ->  Append
+                 ->  Seq Scan on prunelock_p1 prunelock_p_3
+                 ->  Seq Scan on prunelock_p2 prunelock_p_4
+                 ->  Seq Scan on prunelock_p3 prunelock_p_5
+   CTE upd2
+     ->  Update on prunelock_p prunelock_p_6
+           ->  Append
+                 Subplans Removed: 3
+   ->  Append
+         Subplans Removed: 3
+(16 rows)
+
+deallocate prunelock_mt_q;
+drop table prunelock_p;
+--
+-- Verify that pruning-aware locking falls back to locking all
+-- partitions for multi-statement CachedPlans.  Rule rewriting can
+-- expand a single statement into multiple PlannedStmts, and later
+-- statements must not have their pruning evaluated before earlier
+-- ones have executed, since CCI between statements can change what
+-- pruning expressions see.
+--
+create table prune_config (val int);
+insert into prune_config values (1);
+create table multistmt_pt (a int, b int) partition by list (a);
+create table multistmt_pt_1 partition of multistmt_pt for values in (1);
+create table multistmt_pt_2 partition of multistmt_pt for values in (2);
+insert into multistmt_pt values (1, 0), (2, 0);
+create function get_prune_val() returns int as $$
+  select val from prune_config;
+$$ language sql stable;
+create rule config_upd_rule as on update to multistmt_pt
+  do also update prune_config set val = 2;
+set plan_cache_mode to force_generic_plan;
+prepare multi_q as update multistmt_pt set b = b + 1 where a = get_prune_val();
+-- first execute creates the generic plan
+execute multi_q;
+-- reset for the real test
+update prune_config set val = 1;
+update multistmt_pt set b = 0;
+-- second execute reuses the plan; pruning-aware locking kicks in
+execute multi_q;
+select * from multistmt_pt order by a;
+ a | b 
+---+---
+ 1 | 0
+ 2 | 1
+(2 rows)
+
+deallocate multi_q;
+drop rule config_upd_rule on multistmt_pt;
+drop function get_prune_val;
+drop table multistmt_pt, prune_config;
+reset plan_cache_mode;
diff --git a/src/test/regress/expected/plancache.out b/src/test/regress/expected/plancache.out
index 4e59188196c..3043dbfac2d 100644
--- a/src/test/regress/expected/plancache.out
+++ b/src/test/regress/expected/plancache.out
@@ -398,3 +398,66 @@ select name, generic_plans, custom_plans from pg_prepared_statements
 (1 row)
 
 drop table test_mode;
+-- This exercises the CachedPlanPrepCleanup() path, which must free
+-- the EState created by ExecutorPrep() when the plan is invalidated
+-- before execution begins.  The pruning expression uses a stable SQL
+-- function that calls a volatile plpgsql function.  That function
+-- performs DDL on a partition when a separate "signal" table says to
+-- do so.  The second EXECUTE should replan cleanly after the DDL.
+set plan_cache_mode to force_generic_plan;
+create table inval_during_pruning_p (a int) partition by list (a);
+create table inval_during_pruning_p1 partition of inval_during_pruning_p for values in (1);
+create table inval_during_pruning_p2 partition of inval_during_pruning_p for values in (2);
+insert into inval_during_pruning_p values (1), (2);
+create table inval_during_pruning_signal (create_idx bool not null);
+insert into inval_during_pruning_signal values (false);
+create or replace function invalidate_plancache_func() returns int
+as $$
+declare
+	create_index bool;
+begin
+	-- Perform DDL on a partition if asked to
+	select create_idx into create_index from inval_during_pruning_signal for update;
+	if create_index = true then
+		raise notice 'creating index on partition inval_during_pruning_p1';
+		create index on inval_during_pruning_p1 (a);
+		update inval_during_pruning_signal set create_idx = false;
+	end if;
+	-- value that pruning will match against partition bounds
+	return 1;
+end;
+$$ language plpgsql volatile;
+create or replace function stable_pruning_val() returns int as $$
+	select invalidate_plancache_func();
+$$ language sql stable;
+prepare inval_during_pruning_q as select * from inval_during_pruning_p where a = stable_pruning_val();
+-- Build a generic plan and run pruning once, but don't set the signal
+-- for invalidate_plancache_func() to perform the DDL.
+explain (verbose, costs off) execute inval_during_pruning_q;
+                                QUERY PLAN                                 
+---------------------------------------------------------------------------
+ Append
+   Subplans Removed: 1
+   ->  Seq Scan on public.inval_during_pruning_p1 inval_during_pruning_p_1
+         Output: inval_during_pruning_p_1.a
+         Filter: (inval_during_pruning_p_1.a = stable_pruning_val())
+(5 rows)
+
+-- Reuse the generic plan.  Make invalidate_plancache_func() perform DDL
+-- during this execution, which should force replanning without errors.
+update inval_during_pruning_signal set create_idx = true;
+explain (verbose, costs off) execute inval_during_pruning_q;
+NOTICE:  creating index on partition inval_during_pruning_p1
+                                QUERY PLAN                                 
+---------------------------------------------------------------------------
+ Append
+   Subplans Removed: 1
+   ->  Seq Scan on public.inval_during_pruning_p1 inval_during_pruning_p_1
+         Output: inval_during_pruning_p_1.a
+         Filter: (inval_during_pruning_p_1.a = stable_pruning_val())
+(5 rows)
+
+deallocate inval_during_pruning_q;
+drop table inval_during_pruning_p, inval_during_pruning_signal;
+drop function invalidate_plancache_func, stable_pruning_val;
+reset plan_cache_mode;
diff --git a/src/test/regress/sql/partition_prune.sql b/src/test/regress/sql/partition_prune.sql
index d93c0c03bab..692415a8d9f 100644
--- a/src/test/regress/sql/partition_prune.sql
+++ b/src/test/regress/sql/partition_prune.sql
@@ -1447,3 +1447,119 @@ select min(a) over (partition by a order by a) from part_abc where a >= stable_o
 
 drop view part_abc_view;
 drop table part_abc;
+
+--
+-- Verify that pruning-aware locking skips pruned partitions
+-- when reusing a generic cached plan.
+--
+set plan_cache_mode to force_generic_plan;
+
+create table prunelock_p (a int) partition by list (a);
+create table prunelock_p1 partition of prunelock_p for values in (1);
+create table prunelock_p2 partition of prunelock_p for values in (2);
+create table prunelock_p3 partition of prunelock_p for values in (3);
+
+prepare prunelock_q (int) as select * from prunelock_p where a = $1;
+
+-- Force generic plan creation
+explain (costs off) execute prunelock_q(1);
+
+-- Execute and check which child partitions are locked
+begin;
+execute prunelock_q(1);
+
+select c.relname
+  from pg_locks l
+  join pg_class c on c.oid = l.relation
+ where l.pid = pg_backend_pid()
+   and c.relname like 'prunelock_p_'
+ order by c.relname;
+commit;
+
+deallocate prunelock_q;
+
+-- Turn pruning off
+set enable_partition_pruning to off;
+
+prepare prunelock_q (int) as select * from prunelock_p where a = $1;
+
+-- Force generic plan creation
+explain (costs off) execute prunelock_q(1);
+
+-- Execute and check which child partitions are locked
+begin;
+execute prunelock_q(1);
+
+select c.relname
+  from pg_locks l
+  join pg_class c on c.oid = l.relation
+ where l.pid = pg_backend_pid()
+   and c.relname like 'prunelock_p_'
+ order by c.relname;
+commit;
+
+deallocate prunelock_q;
+reset enable_partition_pruning;
+
+--
+-- Verify firstResultRels handling with multiple ModifyTable nodes
+-- (writable CTEs) targeting a partitioned table.  When a pruning
+-- parameter matches no partition, all result relations are pruned
+-- and the executor must still find a usable first result relation
+-- for each ModifyTable node.
+--
+prepare prunelock_mt_q (int, int) as
+  with upd1 as (update prunelock_p set a = a),
+       upd2 as (update prunelock_p set a = a where a = $2)
+  update prunelock_p set a = a where a = $1;
+
+-- Force generic plan creation
+explain (costs off) execute prunelock_mt_q(1, 2);
+
+-- All partitions pruned: value 4 matches no partition, so each
+-- ModifyTable must still initialize correctly with no matching
+-- result relations.
+explain (costs off) execute prunelock_mt_q(4, 5);
+
+deallocate prunelock_mt_q;
+drop table prunelock_p;
+
+--
+-- Verify that pruning-aware locking falls back to locking all
+-- partitions for multi-statement CachedPlans.  Rule rewriting can
+-- expand a single statement into multiple PlannedStmts, and later
+-- statements must not have their pruning evaluated before earlier
+-- ones have executed, since CCI between statements can change what
+-- pruning expressions see.
+--
+create table prune_config (val int);
+insert into prune_config values (1);
+
+create table multistmt_pt (a int, b int) partition by list (a);
+create table multistmt_pt_1 partition of multistmt_pt for values in (1);
+create table multistmt_pt_2 partition of multistmt_pt for values in (2);
+insert into multistmt_pt values (1, 0), (2, 0);
+
+create function get_prune_val() returns int as $$
+  select val from prune_config;
+$$ language sql stable;
+
+create rule config_upd_rule as on update to multistmt_pt
+  do also update prune_config set val = 2;
+
+set plan_cache_mode to force_generic_plan;
+prepare multi_q as update multistmt_pt set b = b + 1 where a = get_prune_val();
+-- first execute creates the generic plan
+execute multi_q;
+-- reset for the real test
+update prune_config set val = 1;
+update multistmt_pt set b = 0;
+-- second execute reuses the plan; pruning-aware locking kicks in
+execute multi_q;
+select * from multistmt_pt order by a;
+
+deallocate multi_q;
+drop rule config_upd_rule on multistmt_pt;
+drop function get_prune_val;
+drop table multistmt_pt, prune_config;
+reset plan_cache_mode;
diff --git a/src/test/regress/sql/plancache.sql b/src/test/regress/sql/plancache.sql
index 4b2f11dcc64..6a8b8787de6 100644
--- a/src/test/regress/sql/plancache.sql
+++ b/src/test/regress/sql/plancache.sql
@@ -223,3 +223,55 @@ select name, generic_plans, custom_plans from pg_prepared_statements
   where  name = 'test_mode_pp';
 
 drop table test_mode;
+
+-- This exercises the CachedPlanPrepCleanup() path, which must free
+-- the EState created by ExecutorPrep() when the plan is invalidated
+-- before execution begins.  The pruning expression uses a stable SQL
+-- function that calls a volatile plpgsql function.  That function
+-- performs DDL on a partition when a separate "signal" table says to
+-- do so.  The second EXECUTE should replan cleanly after the DDL.
+set plan_cache_mode to force_generic_plan;
+create table inval_during_pruning_p (a int) partition by list (a);
+create table inval_during_pruning_p1 partition of inval_during_pruning_p for values in (1);
+create table inval_during_pruning_p2 partition of inval_during_pruning_p for values in (2);
+insert into inval_during_pruning_p values (1), (2);
+
+create table inval_during_pruning_signal (create_idx bool not null);
+insert into inval_during_pruning_signal values (false);
+create or replace function invalidate_plancache_func() returns int
+as $$
+declare
+	create_index bool;
+begin
+	-- Perform DDL on a partition if asked to
+	select create_idx into create_index from inval_during_pruning_signal for update;
+	if create_index = true then
+		raise notice 'creating index on partition inval_during_pruning_p1';
+		create index on inval_during_pruning_p1 (a);
+		update inval_during_pruning_signal set create_idx = false;
+	end if;
+	-- value that pruning will match against partition bounds
+	return 1;
+end;
+$$ language plpgsql volatile;
+
+create or replace function stable_pruning_val() returns int as $$
+	select invalidate_plancache_func();
+$$ language sql stable;
+
+prepare inval_during_pruning_q as select * from inval_during_pruning_p where a = stable_pruning_val();
+
+-- Build a generic plan and run pruning once, but don't set the signal
+-- for invalidate_plancache_func() to perform the DDL.
+explain (verbose, costs off) execute inval_during_pruning_q;
+
+-- Reuse the generic plan.  Make invalidate_plancache_func() perform DDL
+-- during this execution, which should force replanning without errors.
+update inval_during_pruning_signal set create_idx = true;
+explain (verbose, costs off) execute inval_during_pruning_q;
+
+deallocate inval_during_pruning_q;
+drop table inval_during_pruning_p, inval_during_pruning_signal;
+drop function invalidate_plancache_func, stable_pruning_val;
+
+reset plan_cache_mode;
-- 
2.47.3
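
For readers following the locking logic in AcquireExecutorLocksUnpruned()
above, the set of relations locked in the second pass can be sketched as
below.  This is only an illustration under stated assumptions, not code
from the patch: relid_set is a hypothetical 64-bit stand-in for Bitmapset,
and relid_set_add(), relid_set_is_member() and second_pass_lock_relids()
are names made up for this sketch.  It mirrors the patch in that
unprunable relids are filtered out first (they were locked in the first
pass) and every first result relation is then added back.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for Bitmapset: bit i set means RT index i is a member. */
typedef uint64_t relid_set;

/* Return 'set' with RT index 'rti' added. */
static relid_set
relid_set_add(relid_set set, unsigned rti)
{
	assert(rti >= 1 && rti < 64);	/* RT indexes are 1-based */
	return set | (UINT64_C(1) << rti);
}

/* Return 1 if RT index 'rti' is a member of 'set', else 0. */
static int
relid_set_is_member(relid_set set, unsigned rti)
{
	assert(rti >= 1 && rti < 64);
	return (int) ((set >> rti) & 1);
}

/*
 * Compute the relids to lock after initial pruning has run:
 *   (es_unpruned_relids minus unprunable_relids) plus each first result rel.
 *
 * The unprunable relids are excluded because the caller is assumed to have
 * locked them already.  As in the patch, a first result rel is added
 * whenever it is not already in the filtered set.
 */
static relid_set
second_pass_lock_relids(relid_set es_unpruned_relids,
						relid_set unprunable_relids,
						const unsigned *first_result_rels,
						size_t nfirst)
{
	relid_set	lock_relids = es_unpruned_relids & ~unprunable_relids;

	for (size_t i = 0; i < nfirst; i++)
		lock_relids = relid_set_add(lock_relids, first_result_rels[i]);

	return lock_relids;
}
```

For example, with RT index 1 as the unprunable parent, partitions at RT
indexes 2 to 4, only partition 2 surviving pruning, and partition 3 being
the first result relation of a ModifyTable node, the second pass would lock
RT indexes 2 and 3, leaving 4 unlocked.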



  [application/octet-stream] v10-0004-Make-SQL-function-executor-track-ExecutorPrep-st.patch (7.7K, 6-v10-0004-Make-SQL-function-executor-track-ExecutorPrep-st.patch)
From 5769f6ca7c9ffcee1b51d27105c780c5d6102f55 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Tue, 10 Feb 2026 22:09:23 +0900
Subject: [PATCH v10 4/5] Make SQL function executor track ExecutorPrep state

Extend the SQL function executor to use the ExecutorPrep results
returned by GetCachedPlan().  init_execution_state() now passes a
CachedPlanPrepData to GetCachedPlan() and stores the resulting prep
EState pointer in the execution_state nodes.

At execution time, postquel_start() reparents the prep EState's
es_query_cxt under the function's subcontext so that the prep state
follows the usual per-call context hierarchy.

This allows SQL language functions to participate in the same
ExecutorPrep machinery as other plan cache users.

Add a regression test where rule rewrite expands a single UPDATE
into multiple PlannedStmts, exercising the SQL function plan cache
and the generic plan reuse path that now invokes ExecutorPrep.
---
 src/backend/executor/functions.c        | 27 ++++++++++++--
 src/test/regress/expected/plancache.out | 48 +++++++++++++++++++++++++
 src/test/regress/sql/plancache.sql      | 34 ++++++++++++++++++
 3 files changed, 107 insertions(+), 2 deletions(-)

diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index c0ca72b38dd..2be816b6a75 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -73,6 +73,7 @@ typedef struct execution_state
 	bool		setsResult;		/* true if this query produces func's result */
 	bool		lazyEval;		/* true if should fetch one row at a time */
 	PlannedStmt *stmt;			/* plan for this query */
+	EState	   *prep_estate;	/* EState created in ExecutorPrep() for this plan */
 	QueryDesc  *qd;				/* null unless status == RUN */
 } execution_state;
 
@@ -658,6 +659,7 @@ init_execution_state(SQLFunctionCachePtr fcache)
 	execution_state *lasttages = NULL;
 	int			nstmts;
 	ListCell   *lc;
+	CachedPlanPrepData cprep = {0};
 
 	/*
 	 * Clean up after previous query, if there was one.
@@ -696,11 +698,20 @@ init_execution_state(SQLFunctionCachePtr fcache)
 	 * CurrentResourceOwner will be the same when ShutdownSQLFunction runs.)
 	 */
 	fcache->cowner = CurrentResourceOwner;
+
+	/*
+	 * Have ExecutorPrep() allocate under fcache->fcontext.  The prep
+	 * EStates it creates will initially live there; postquel_start()
+	 * will later reparent their es_query_cxt into fcache->subcontext
+	 * when using them for execution.
+	 */
+	cprep.context = fcache->fcontext;
+	cprep.owner = fcache->cowner;
 	fcache->cplan = GetCachedPlan(plansource,
 								  fcache->paramLI,
 								  fcache->cowner,
 								  NULL,
-								  NULL);
+								  &cprep);
 
 	/*
 	 * If necessary, make esarray[] bigger to hold the needed state.
@@ -721,6 +732,7 @@ init_execution_state(SQLFunctionCachePtr fcache)
 	/*
 	 * Build execution_state list to match the number of contained plans.
 	 */
+	Assert(cprep.prep_estate == NULL || list_length(fcache->cplan->stmt_list) == 1);
 	foreach(lc, fcache->cplan->stmt_list)
 	{
 		PlannedStmt *stmt = lfirst_node(PlannedStmt, lc);
@@ -765,6 +777,7 @@ init_execution_state(SQLFunctionCachePtr fcache)
 		newes->setsResult = false;	/* might change below */
 		newes->lazyEval = false;	/* might change below */
 		newes->stmt = stmt;
+		newes->prep_estate = cprep.prep_estate;
 		newes->qd = NULL;
 
 		if (stmt->canSetTag)
@@ -1363,6 +1376,15 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
 	else
 		dest = None_Receiver;
 
+	/*
+	 * Prep EStates were built under fcache->fcontext.  For execution,
+	 * make their es_query_cxt a child of fcache->subcontext so they
+	 * follow the usual per-call lifetime.
+	 */
+	if (es->prep_estate)
+		MemoryContextSetParent(es->prep_estate->es_query_cxt,
+							   fcache->subcontext);
+
 	es->qd = CreateQueryDesc(es->stmt,
 							 fcache->func->src,
 							 GetActiveSnapshot(),
@@ -1371,7 +1393,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
 							 fcache->paramLI,
 							 es->qd ? es->qd->queryEnv : NULL,
 							 0,
-							 NULL);
+							 es->prep_estate);
 
 	/* Utility commands don't need Executor. */
 	if (es->qd->operation != CMD_UTILITY)
@@ -1462,6 +1484,7 @@ postquel_end(execution_state *es, SQLFunctionCachePtr fcache)
 
 	FreeQueryDesc(es->qd);
 	es->qd = NULL;
+	es->prep_estate = NULL;
 
 	MemoryContextSwitchTo(oldcontext);
 
diff --git a/src/test/regress/expected/plancache.out b/src/test/regress/expected/plancache.out
index 3043dbfac2d..547846b2945 100644
--- a/src/test/regress/expected/plancache.out
+++ b/src/test/regress/expected/plancache.out
@@ -460,4 +460,52 @@ NOTICE:  creating index on partition inval_during_pruning_p1
 deallocate inval_during_pruning_q;
 drop table inval_during_pruning_p, inval_during_pruning_signal;
 drop function invalidate_plancache_func, stable_pruning_val;
+-- exercise sql-function plan cache when rewrite expands a single statement
+-- into multiple planned statements. this forces cachedplan->stmt_list to
+-- contain more than one entry and checks that executor state for the first
+-- rewritten statement does not destroy state needed by the second one.
+set plan_cache_mode = force_generic_plan;
+create table sqlf_base(id int, val int) partition by list (id);
+create table sqlf_base_1 partition of sqlf_base for values in (1);
+create table sqlf_base_2 partition of sqlf_base for values in (2);
+create table sqlf_log(id int, note text);
+insert into sqlf_base values (1, 10);
+create rule sqlf_base_upd_log as
+on update to sqlf_base do also
+	insert into sqlf_log(id, note)
+	values (new.id, 'logged by rule');
+create or replace function sqlf_execprep_test(a int, v int)
+returns void
+language sql
+as $$
+	update sqlf_base set val = v where id = a;
+$$;
+select sqlf_execprep_test(1, 20);
+ sqlf_execprep_test 
+--------------------
+ 
+(1 row)
+
+select sqlf_execprep_test(1, 30);
+ sqlf_execprep_test 
+--------------------
+ 
+(1 row)
+
+select * from sqlf_base order by 1;
+ id | val 
+----+-----
+  1 |  30
+(1 row)
+
+select * from sqlf_log order by 1;
+ id |      note      
+----+----------------
+  1 | logged by rule
+  1 | logged by rule
+(2 rows)
+
+drop rule sqlf_base_upd_log on sqlf_base;
+drop table sqlf_base, sqlf_log;
+drop function sqlf_execprep_test;
 reset plan_cache_mode;
diff --git a/src/test/regress/sql/plancache.sql b/src/test/regress/sql/plancache.sql
index 6a8b8787de6..532fa58518b 100644
--- a/src/test/regress/sql/plancache.sql
+++ b/src/test/regress/sql/plancache.sql
@@ -274,4 +274,38 @@ deallocate inval_during_pruning_q;
 drop table inval_during_pruning_p, inval_during_pruning_signal;
 drop function invalidate_plancache_func, stable_pruning_val;
 
+-- exercise sql-function plan cache when rewrite expands a single statement
+-- into multiple planned statements. this forces cachedplan->stmt_list to
+-- contain more than one entry and checks that executor state for the first
+-- rewritten statement does not destroy state needed by the second one.
+
+set plan_cache_mode = force_generic_plan;
+
+create table sqlf_base(id int, val int) partition by list (id);
+create table sqlf_base_1 partition of sqlf_base for values in (1);
+create table sqlf_base_2 partition of sqlf_base for values in (2);
+create table sqlf_log(id int, note text);
+
+insert into sqlf_base values (1, 10);
+
+create rule sqlf_base_upd_log as
+on update to sqlf_base do also
+	insert into sqlf_log(id, note)
+	values (new.id, 'logged by rule');
+
+create or replace function sqlf_execprep_test(a int, v int)
+returns void
+language sql
+as $$
+	update sqlf_base set val = v where id = a;
+$$;
+
+select sqlf_execprep_test(1, 20);
+select sqlf_execprep_test(1, 30);
+select * from sqlf_base order by 1;
+select * from sqlf_log order by 1;
+
+drop rule sqlf_base_upd_log on sqlf_base;
+drop table sqlf_base, sqlf_log;
+drop function sqlf_execprep_test;
 reset plan_cache_mode;
-- 
2.47.3




* Re: generic plans and "initial" pruning
@ 2026-03-27 09:00  Amit Langote <[email protected]>
  parent: Amit Langote <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: Amit Langote @ 2026-03-27 09:00 UTC (permalink / raw)
  To: Chao Li <[email protected]>; +Cc: Tom Lane <[email protected]>; Tender Wang <[email protected]>; Alexander Lakhin <[email protected]>; Tomas Vondra <[email protected]>; Robert Haas <[email protected]>; Alvaro Herrera <[email protected]>; Andres Freund <[email protected]>; Daniel Gustafsson <[email protected]>; David Rowley <[email protected]>; pgsql-hackers; Thom Brown <[email protected]>

On Thu, Mar 26, 2026 at 6:24 PM Amit Langote <[email protected]> wrote:
> On Wed, Mar 25, 2026 at 4:39 PM Amit Langote <[email protected]> wrote:
> > On Fri, Mar 20, 2026 at 2:20 AM Amit Langote <[email protected]> wrote:
> > > On Mon, Mar 9, 2026 at 1:41 PM Amit Langote <[email protected]> wrote:
> > > Stepping back -- the core question is whether running executor logic
> > > (pruning) inside GetCachedPlan() is acceptable at all. The plan cache
> > > and executor have always had a clean boundary: plan cache locks
> > > everything, executor runs. This optimization necessarily crosses that
> > > line, because the information needed to decide which locks to skip
> > > (pruning results) can only come from executor machinery.
> > >
> > > The proposed approach has GetCachedPlan() call ExecutorPrep() to do a
> > > limited subset of executor work (range table init, permissions,
> > > pruning), carry the results out through CachedPlanPrepData, and leave
> > > the CachedPlan itself untouched. The executor already has a multi-step
> > > protocol: start/run/end. prep/start/run/end is just a finer
> > > decomposition of what InitPlan() was already doing inside
> > > ExecutorStart().
> > >
> > > Of the attached patches, I'm targeting 0001-0003 for commit. 0004 (SQL
> > > function support) and 0005 (parallel worker reuse) are useful
> > > follow-ons but not essential.  The optimization works without them for
> > > most cases, and they can be reviewed and committed separately.
> > >
> > > If there's a cleaner way to avoid locking pruned partitions without
> > > the plumbing this patch adds, I haven't found it in the year since the
> > > revert.  I'd welcome a pointer if you see one.  Failing that, I think
> > > this is the right trade-off, but it's a judgment call about where to
> > > hold your nose.
> > >
> > > Tom, I'd value your opinion on whether this approach is something
> > > you'd be comfortable seeing in the tree.
> >
> > Attached is an updated set with some cleanup after another pass.
> >
> > - Removed ExecCreatePartitionPruneStates() from 0001. In 0001-0003,
> > ExecDoInitialPruning() handles both setup and pruning internally; the
> > split isn't needed yet.
> >
> > - Tightened commit messages to describe what each commit does now, not
> > what later commits will use it for. In particular, 0002 is upfront
> > that the portal/SPI/EXPLAIN plumbing is scaffolding that 0003 lights
> > up.
> >
> > - Updated setrefs.c comment for firstResultRels to drop a blanket
> > claim about one ModifyTable per query level.
> >
> > As before, 0001-0003 is the focus, maybe 0004 which teaches the new
> > GetCachedPlan() pruning-aware contract to its relatively new user in
> > function.c.
>
> While reviewing the patch more carefully, I realized there's a
> correctness issue when rule rewriting causes a single statement to
> expand into multiple PlannedStmts in one CachedPlan.
>
> PortalRunMulti() executes those statements sequentially, with
> CommandCounterIncrement() between them, so Q2's ExecutorStart()
> normally sees the effects of Q1.
>
> With the patch, though, AcquireExecutorLocksUnpruned() runs
> ExecutorPrep() on all PlannedStmts in one pass during GetCachedPlan(),
> before any statement executes. If a later statement has
> initial-pruning expressions that read data modified by an earlier one,
> pruning can see stale results.
>
> There's also a memory lifetime issue: PortalRunMulti() calls
> MemoryContextDeleteChildren(portalContext) between statements, which
> destroys EStates prepared for later statements.
>
> Here's a concrete case demonstrating the semantic issue:
>
>   create table multistmt_pt (a int, b int) partition by list (a);
>   create table multistmt_pt_1 partition of multistmt_pt for values in (1);
>   create table multistmt_pt_2 partition of multistmt_pt for values in (2);
>   insert into multistmt_pt values (1, 0), (2, 0);
>
>   create table prune_config (val int);
>   insert into prune_config values (1);
>
>   create function get_prune_val() returns int as $$
>     select val from prune_config;
>   $$ language sql stable;
>
>   -- rule action runs first, updating prune_config before the
>   -- original statement's pruning would normally be evaluated
>   create rule config_upd_rule as on update to multistmt_pt
>     do also update prune_config set val = 2;
>
>   set plan_cache_mode to force_generic_plan;
>   prepare multi_q as
>     update multistmt_pt set b = b + 1 where a = get_prune_val();
>   execute multi_q;  -- creates the generic plan
>
>   -- reset for the real test
>   update prune_config set val = 1;
>   update multistmt_pt set b = 0;
>
>   -- second execute reuses the plan
>   execute multi_q;
>   select * from multistmt_pt order by a;
>
> Without the patch: the rule action updates prune_config to val=2
> first, then after CCI the original statement's initial pruning calls
> get_prune_val(), gets 2, prunes to multistmt_pt_2, and updates it
> correctly: (1, 0), (2, 1).
>
> With the patch as it stood: both statements' pruning runs during
> GetCachedPlan() before either executes. The original statement's
> pruning sees val=1, prunes to multistmt_pt_1, and multistmt_pt_2 is
> never touched.
>
> The fix is to skip pruning-aware locking for CachedPlans containing
> multiple PlannedStmts, falling back to locking all partitions.
> Single-statement plans are unchanged.

For good measure, I also verified that Tom's test case from last May
[1] that prompted the revert of the previous commit works correctly
with this patch. When the DO ALSO rule is created mid-execution, the
plan gets invalidated and rebuilt as a multi-statement CachedPlan,
which triggers the fallback to locking all partitions. No assertions,
no crashes.

-- 
Thanks, Amit Langote

[1] https://postgr.es/m/[email protected]






* Re: generic plans and "initial" pruning
@ 2026-04-04 12:10  Amit Langote <[email protected]>
  parent: Amit Langote <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: Amit Langote @ 2026-04-04 12:10 UTC (permalink / raw)
  To: Chao Li <[email protected]>; +Cc: Tom Lane <[email protected]>; Tender Wang <[email protected]>; Alexander Lakhin <[email protected]>; Tomas Vondra <[email protected]>; Robert Haas <[email protected]>; Alvaro Herrera <[email protected]>; Andres Freund <[email protected]>; Daniel Gustafsson <[email protected]>; David Rowley <[email protected]>; pgsql-hackers; Thom Brown <[email protected]>

Attached is a redesigned version. While working on the previous
design, I grew increasingly uncomfortable with CachedPlanPrepData --
it was smuggling executor state out of GetCachedPlan() through an
out-parameter, which papered over the real problem: GetCachedPlan()
was doing too much. The main change in this version is architectural:
GetCachedPlan() no longer acquires execution locks. Callers now own
that responsibility, which is natural because each call site iterates
stmt_list differently and manages execution state in its own way --
and it lets them choose between conservative lock-all and
pruning-aware locking where appropriate.

Non-portal call sites remain on the conservative path for now.
_SPI_execute_plan requires care around snapshot setup, which happens
after plan fetch rather than before. SQL functions have a different
issue: init_execution_state() fetches the plan while postquel_start()
handles execution, with execution_state containers in between, making
it harder to thread a prepped QueryDesc through. The portal path and
EXPLAIN EXECUTE cover the most common
prepared-statement-with-partitions workloads; the remaining sites can
be converted incrementally.

This is now starting to feel closer to what Tom suggested back in
January 2023 [1], where he proposed getting rid of
AcquireExecutorLocks() inside GetCachedPlan() entirely and pushing
lock acquisition out to callers. He noted that "we'd be pushing the
responsibility for looping back and re-planning out to fairly
high-level calling code" and that "we'd definitely be changing some
fundamental APIs." That is the direction I came around to over the
last couple of weeks while wrestling with CachedPlanPrepData.  The
reverted approach also tried to follow Tom's direction but moved
locking into ExecutorStart(), which forced it to handle plan
invalidation from inside the executor by mutating the CachedPlan
in-place. This version moves locking out to the callers instead, so
the executor and plan cache never reach into each other.
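
As a rough illustration of the caller-owned contract described above,
the following is a minimal, self-contained C sketch of the
fetch / lock / recheck / retry loop.  It is a toy model and not code
from the patches: ToyPlan, toy_get_cached_plan(), toy_acquire_locks()
and toy_fetch_and_lock() are hypothetical names, and concurrent
invalidation is simulated with a simple counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a cached plan; not the PostgreSQL struct. */
typedef struct ToyPlan
{
    bool        is_valid;       /* cleared when an invalidation arrives */
    int         generation;     /* bumped on every (re)plan */
    int         locks_held;     /* number of relation locks currently held */
} ToyPlan;

/* Simulated number of invalidations still waiting to be delivered. */
static int  pending_invalidations = 0;

/* Toy plan fetch: hands back a valid plan WITHOUT taking any locks. */
static void
toy_get_cached_plan(ToyPlan *plan)
{
    plan->is_valid = true;
    plan->generation++;
    plan->locks_held = 0;
}

/*
 * Toy lock acquisition: take nrels locks, then recheck validity, since
 * taking a lock is the point at which a pending invalidation is noticed.
 * On failure all locks are released and false is returned.
 */
static bool
toy_acquire_locks(ToyPlan *plan, int nrels)
{
    plan->locks_held = nrels;

    if (pending_invalidations > 0)
    {
        pending_invalidations--;
        plan->is_valid = false;
    }

    if (!plan->is_valid)
    {
        plan->locks_held = 0;   /* release everything before retrying */
        return false;
    }
    return true;
}

/*
 * Caller-owned loop: fetch a plan, lock it, and retry with a fresh plan
 * if it was invalidated while locking.  Returns the number of attempts.
 */
int
toy_fetch_and_lock(ToyPlan *plan, int nrels, int invalidations)
{
    int         attempts = 0;

    pending_invalidations = invalidations;
    for (;;)
    {
        attempts++;
        toy_get_cached_plan(plan);
        if (toy_acquire_locks(plan, nrels))
            break;
    }

    assert(plan->is_valid && plan->locks_held == nrels);
    return attempts;
}
```

The point of the sketch is only that the plan fetch itself takes no
locks; the retry decision lives entirely in the caller.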

The series is now four patches:

0001: Move execution lock acquisition out of GetCachedPlan(). Adds
AcquireExecutorLocks() as a caller-facing function with validity check
and retry. Adds PortalLockCachedPlan() in pquery.c to centralize the
portal retry logic. All callers are converted. No behavioral change.

0002: Refactor executor's initial partition pruning setup. Cleanup
only, no behavioral change.

0003: Introduce ExecutorPrep() and refactor executor startup. Factors
range table init, permission checks, and initial pruning out of
InitPlan(). Scaffolding for 0004; all callers still go through the
normal ExecutorStart() path.

0004: Use pruning-aware locking for single-statement cached plans.
Adds ExecutorPrepAndLock() which locks unprunable relations, runs
ExecutorPrep() to determine surviving partitions, then locks only
those. Extends PortalLockCachedPlan() with a pruning-aware path for
eligible plans. Multi-statement CachedPlans (from rule rewriting)
always use conservative locking. In principle, this could be relaxed
if the planner can prove that no pruning expression reads state
modified by an earlier statement, but that is left for a future patch.
Includes regression tests.
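
The lock sequence described for 0004 can be sketched in a
self-contained toy form as follows.  This is only an illustration
under stated assumptions, not the patch's implementation: relations
are modeled as bits in a mask, ToyStmt, toy_initial_pruning() and
toy_prep_and_lock() are hypothetical names, and plan invalidation is
injected through the invalid_after_step argument rather than detected
from a real is_valid flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a planned statement; bit i is relation i. */
typedef struct ToyStmt
{
    uint32_t    unprunable;     /* relations that are always locked */
    uint32_t    prunable;       /* relations subject to initial pruning */
    uint32_t    first_result;   /* result rels kept locked even if pruned */
} ToyStmt;

/* Toy "initial pruning": keep the prunable relations selected by keep. */
static uint32_t
toy_initial_pruning(const ToyStmt *stmt, uint32_t keep)
{
    return stmt->prunable & keep;
}

/*
 * Two-step pruning-aware locking.
 *
 * invalid_after_step simulates invalidation: 0 = never, 1 = noticed after
 * locking unprunable relations, 2 = noticed after locking survivors.
 *
 * Returns true with *locked set to the held locks, or false with all
 * locks released (*locked == 0) so the caller can retry with a new plan.
 */
bool
toy_prep_and_lock(const ToyStmt *stmt, uint32_t keep,
                  int invalid_after_step, uint32_t *locked)
{
    uint32_t    survivors;

    /* A relation is either prunable or unprunable, never both. */
    assert((stmt->unprunable & stmt->prunable) == 0);

    /* Step 1: lock unprunable relations before pruning can touch them. */
    *locked = stmt->unprunable;
    if (invalid_after_step == 1)
    {
        *locked = 0;
        return false;
    }

    /* Step 2: run pruning, then lock survivors plus first result rels. */
    survivors = toy_initial_pruning(stmt, keep);
    *locked |= survivors | stmt->first_result;
    if (invalid_after_step == 2)
    {
        *locked = 0;
        return false;
    }

    return true;
}
```

In this model, a statement with one unprunable relation and three
partitions, of which one survives pruning, ends up holding locks on the
unprunable relation, the survivor, and the first result relation, but
not on the remaining pruned partition.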

In case it's not clear, I'm not targeting v19 at this point.  I'd like
to get this into v20 CF1 and would welcome review from anyone
interested.

--
Thanks,
Amit Langote

[1] https://www.postgresql.org/message-id/4191508.1674157166%40sss.pgh.pa.us


Attachments:

  [application/octet-stream] v11-0004-Use-pruning-aware-locking-for-single-statement-c.patch (40.3K, 2-v11-0004-Use-pruning-aware-locking-for-single-statement-c.patch)
From f586635ab49f3027546a7bda4c4f6017b946f333 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Sat, 4 Apr 2026 20:43:14 +0900
Subject: [PATCH v11 4/4] Use pruning-aware locking for single-statement cached
 plans

For single-statement reused generic plans, perform initial partition
pruning before acquiring execution locks, then lock only the
surviving partitions.

Add ExecutorPrepAndLock() which encapsulates the pruning-aware lock
sequence: lock unprunable relations, call ExecutorPrep() to run
initial pruning, then lock survivors.  Plan validity is checked
after each step; ExecutorPrepCleanup() handles the case where the
plan is invalidated between prep and execution.

Extend PortalLockCachedPlan() to use the pruning-aware path for
eligible plans (single-statement reused generic, non-utility).
All other cases continue using the conservative lock-all path
from the previous commit.

Track firstResultRels in PlannerGlobal and PlannedStmt so they
are locked even if pruned, preserving ExecInitModifyTable()
assumptions about the first result relation being available.

Multi-statement CachedPlans (from rule rewriting) always use
conservative locking, since PortalRunMulti() executes statements
sequentially with CCI between them and later statements' pruning
expressions may depend on earlier ones' effects.  In principle,
this could be relaxed if the planner can prove that no pruning
expression reads state modified by an earlier statement, but that
is left for a future patch.

Regression tests are included to verify:

- Only surviving partitions are locked when pruning is enabled, and
  all partitions are locked when it is disabled (pg_locks inspection).
- Multiple ModifyTable nodes (via writable CTEs) handle the case where
  all target partitions are pruned, exercising firstResultRels.
- Plan invalidation during pruning-aware lock setup (DDL triggered by
  a pruning expression) discards the prep state and replans cleanly.
- Multi-statement CachedPlans (from rule rewriting) fall back to
  locking all partitions, avoiding stale pruning results.

Note for extension authors: code that accesses partition relations
through EState must check that the RT index is a member of
es_unpruned_relids before opening the relation.  Previously this
was an optimization; it is now a correctness requirement, because
pruned partitions may not be locked.
---
 src/backend/commands/explain.c                |  45 +++--
 src/backend/commands/prepare.c                |  30 ++-
 src/backend/executor/execMain.c               | 142 ++++++++++++++
 src/backend/executor/nodeModifyTable.c        |   5 +-
 src/backend/optimizer/plan/planner.c          |   1 +
 src/backend/optimizer/plan/setrefs.c          |  18 ++
 src/backend/tcop/pquery.c                     |  54 ++++-
 src/backend/utils/cache/plancache.c           |  16 ++
 src/include/commands/explain.h                |   3 +-
 src/include/executor/executor.h               |   4 +
 src/include/nodes/pathnodes.h                 |   3 +
 src/include/nodes/plannodes.h                 |  10 +
 src/include/utils/plancache.h                 |   2 +
 src/test/regress/expected/partition_prune.out | 184 ++++++++++++++++++
 src/test/regress/expected/plancache.out       |  63 ++++++
 src/test/regress/sql/partition_prune.sql      | 116 +++++++++++
 src/test/regress/sql/plancache.sql            |  52 +++++
 17 files changed, 720 insertions(+), 28 deletions(-)

diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index e4b70166b0e..60cd912ace1 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -374,7 +374,8 @@ standard_ExplainOneQuery(Query *query, int cursorOptions,
 	/* run it (if needed) and produce output */
 	ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
 				   &planduration, (es->buffers ? &bufusage : NULL),
-				   es->memory ? &mem_counters : NULL);
+				   es->memory ? &mem_counters : NULL,
+				   NULL);
 }
 
 /*
@@ -498,7 +499,8 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
 			   const char *queryString, ParamListInfo params,
 			   QueryEnvironment *queryEnv, const instr_time *planduration,
 			   const BufferUsage *bufusage,
-			   const MemoryContextCounters *mem_counters)
+			   const MemoryContextCounters *mem_counters,
+			   QueryDesc *prep_qd)
 {
 	DestReceiver *dest;
 	QueryDesc  *queryDesc;
@@ -527,13 +529,6 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
 	 */
 	INSTR_TIME_SET_CURRENT(starttime);
 
-	/*
-	 * Use a snapshot with an updated command ID to ensure this query sees
-	 * results of any previously executed queries.
-	 */
-	PushCopiedSnapshot(GetActiveSnapshot());
-	UpdateActiveSnapshotCommandId();
-
 	/*
 	 * We discard the output if we have no use for it.  If we're explaining
 	 * CREATE TABLE AS, we'd better use the appropriate tuple receiver, while
@@ -549,10 +544,34 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
 	else
 		dest = None_Receiver;
 
-	/* Create a QueryDesc for the query */
-	queryDesc = CreateQueryDesc(plannedstmt, queryString,
-								GetActiveSnapshot(), InvalidSnapshot,
-								dest, params, queryEnv, instrument_option);
+	/*
+	 * Create a QueryDesc for the query, or use the one provided by the
+	 * caller.  When reusing a prep QueryDesc, its snapshot was set at
+	 * creation time; we push it as active for ExecutorStart and override the
+	 * destination and instrument options, which were not known when the
+	 * caller created it.
+	 */
+	if (prep_qd)
+	{
+		PushActiveSnapshot(GetActiveSnapshot());
+		queryDesc = prep_qd;
+		Assert(queryDesc->dest == None_Receiver);
+		queryDesc->dest = dest;
+		queryDesc->instrument_options = instrument_option;
+	}
+	else
+	{
+		/*
+		 * Use a snapshot with an updated command ID to ensure this query sees
+		 * results of any previously executed queries.
+		 */
+		PushCopiedSnapshot(GetActiveSnapshot());
+		UpdateActiveSnapshotCommandId();
+		queryDesc = CreateQueryDesc(plannedstmt, queryString,
+									GetActiveSnapshot(), InvalidSnapshot,
+									dest, params, queryEnv,
+									instrument_option);
+	}
 
 	/* Select execution options */
 	if (es->analyze)
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 03d7a98fc58..3bbbc052149 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -588,6 +588,7 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	MemoryContextCounters mem_counters;
 	MemoryContext planner_ctx = NULL;
 	MemoryContext saved_ctx = NULL;
+	QueryDesc  *prep_qd = NULL;
 
 	if (es->memory)
 	{
@@ -640,8 +641,31 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 							  pstate->p_queryEnv);
 		plan_list = cplan->stmt_list;
 
-		if (AcquireExecutorLocks(cplan))
+		if (!CachedPlanCanPrep(cplan, entry->plansource))
+		{
+			if (AcquireExecutorLocks(cplan))
+				break;
+			ReleaseCachedPlan(cplan, CurrentResourceOwner);
+			continue;
+		}
+
+		prep_qd = CreateQueryDesc(linitial_node(PlannedStmt, plan_list),
+								  query_string,
+								  GetActiveSnapshot(),
+								  InvalidSnapshot,
+								  None_Receiver,	/* ExplainOnePlan will fix */
+								  paramLI,
+								  pstate->p_queryEnv,
+								  0 /* ExplainOnePlan will fix */ );
+		if (ExecutorPrepAndLock(prep_qd,
+								CurrentResourceOwner,
+								es->generic ? EXEC_FLAG_EXPLAIN_GENERIC : 0,
+								&cplan->is_valid))
 			break;
+
+		/* Try again. */
+		ExecutorPrepCleanup(prep_qd);
+		FreeQueryDesc(prep_qd);
 		ReleaseCachedPlan(cplan, CurrentResourceOwner);
 	}
 
@@ -664,6 +688,7 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	plan_list = cplan->stmt_list;
 
 	/* Explain each query */
+	Assert(prep_qd == NULL || list_length(plan_list) == 1);
 	foreach(p, plan_list)
 	{
 		PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
@@ -671,7 +696,8 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 		if (pstmt->commandType != CMD_UTILITY)
 			ExplainOnePlan(pstmt, into, es, query_string, paramLI, pstate->p_queryEnv,
 						   &planduration, (es->buffers ? &bufusage : NULL),
-						   es->memory ? &mem_counters : NULL);
+						   es->memory ? &mem_counters : NULL,
+						   prep_qd);
 		else
 			ExplainOneUtility(pstmt->utilityStmt, into, es, pstate, paramLI);
 
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 735c80e08a9..7333c0f66d5 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -324,6 +324,124 @@ standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
 	MemoryContextSwitchTo(oldcontext);
 }
 
+/*
+ * LockRangeTableRelids
+ * 		Acquire or release locks on the specified relids, which reference
+ * 		entries in the provided range table.
+ *
+ * Helper for AcquireExecutorLocksPrepared() and ExecutorPrepAndLock().
+ */
+static void
+LockRangeTableRelids(List *rtable, Bitmapset *relids, bool acquire)
+{
+	int			rtindex = -1;
+
+	while ((rtindex = bms_next_member(relids, rtindex)) >= 0)
+	{
+		RangeTblEntry *rte = list_nth_node(RangeTblEntry, rtable, rtindex - 1);
+
+		Assert(rte->rtekind == RTE_RELATION ||
+			   (rte->rtekind == RTE_SUBQUERY && OidIsValid(rte->relid)));
+
+		/*
+		 * Acquire the appropriate type of lock on each relation OID. Note
+		 * that we don't actually try to open the rel, and hence will not fail
+		 * if it's been dropped entirely --- we'll just transiently acquire a
+		 * non-conflicting lock.
+		 */
+		if (acquire)
+			LockRelationOid(rte->relid, rte->rellockmode);
+		else
+			UnlockRelationOid(rte->relid, rte->rellockmode);
+	}
+}
+
+/*
+ * AcquireExecutorLocksPrepared
+ *
+ * Acquire or release execution locks using pruning results already computed
+ * by ExecutorPrep() and stored in queryDesc->estate.
+ *
+ * This is intended for single-statement reused generic-plan paths that
+ * choose pruning-aware locking instead of the conservative
+ * AcquireExecutorLocks() path.
+ */
+static void
+AcquireExecutorLocksPrepared(QueryDesc *queryDesc, bool acquire)
+{
+	PlannedStmt *plannedstmt = queryDesc->plannedstmt;
+	EState	   *estate = queryDesc->estate;
+	Bitmapset  *lock_relids;
+	ListCell   *lc;
+
+	Assert(queryDesc != NULL);
+	Assert(estate != NULL);
+	Assert(plannedstmt != NULL);
+	Assert(plannedstmt->commandType != CMD_UTILITY);
+
+	lock_relids = bms_difference(estate->es_unpruned_relids,
+								 plannedstmt->unprunableRelids);
+
+	/*
+	 * Keep the first result relation of each ModifyTable locked even if
+	 * pruning removed all target partitions.  ExecInitModifyTable() relies on
+	 * one such relation remaining available.
+	 */
+	foreach(lc, plannedstmt->firstResultRels)
+	{
+		Index		rti = lfirst_int(lc);
+
+		lock_relids = bms_add_member(lock_relids, rti);
+	}
+
+	LockRangeTableRelids(plannedstmt->rtable, lock_relids, acquire);
+
+	bms_free(lock_relids);
+
+}
+
+/*
+ * ExecutorPrepAndLock
+ *		Perform pruning-aware locking for a single PlannedStmt.
+ *
+ * Locks unprunable relations first, then runs ExecutorPrep() to
+ * determine which partitions survive initial pruning, then locks
+ * only those survivors.  Checks *is_valid after each locking step
+ * to detect plan invalidation (e.g., from concurrent DDL or DDL
+ * triggered by a pruning expression).
+ *
+ * Returns true if the plan is still valid and all needed locks are
+ * held.  Returns false if the plan was invalidated at any point, in
+ * which case all acquired locks have been released and the caller
+ * should discard the QueryDesc and retry with a fresh plan.
+ */
+bool
+ExecutorPrepAndLock(QueryDesc *queryDesc, ResourceOwner owner,
+					int eflags, bool *is_valid)
+{
+	PlannedStmt *pstmt = queryDesc->plannedstmt;
+
+	/* Lock unprunable rels before pruning can access them. */
+	LockRangeTableRelids(pstmt->rtable, pstmt->unprunableRelids, true);
+	if (!*is_valid)
+	{
+		LockRangeTableRelids(pstmt->rtable, pstmt->unprunableRelids, false);
+		return false;
+	}
+
+	/* Run pruning and lock survivors. */
+	ExecutorPrep(queryDesc, owner, eflags);
+	AcquireExecutorLocksPrepared(queryDesc, true);
+	if (!*is_valid)
+	{
+		AcquireExecutorLocksPrepared(queryDesc, false);
+		LockRangeTableRelids(pstmt->rtable, pstmt->unprunableRelids, false);
+		return false;
+	}
+
+	return true;
+}
+
 /*
  * ExecutorPrep
  *
@@ -382,6 +500,30 @@ ExecutorPrep(QueryDesc *queryDesc, ResourceOwner owner, int eflags)
 	CurrentResourceOwner = oldowner;
 }
 
+/*
+ * ExecutorPrepCleanup
+ *		Clean up an EState that was created by ExecutorPrep() but never
+ *		passed to ExecutorStart().  This happens when the plan is
+ *		invalidated between prep and execution, and the caller must
+ *		discard the prepped state before retrying with a fresh plan.
+ *
+ * Unlike ExecutorEnd(), this does not expect a fully initialized
+ * plan state tree -- only the range table relations and the
+ * EState itself need to be freed.
+ */
+void
+ExecutorPrepCleanup(QueryDesc *queryDesc)
+{
+	EState	   *estate = queryDesc->estate;
+
+	if (estate == NULL)
+		return;
+
+	ExecCloseRangeTableRelations(estate);
+	FreeExecutorState(estate);
+	queryDesc->estate = NULL;
+}
+
 /* ----------------------------------------------------------------
  *		ExecutorRun
  *
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index dfd7b33aa9b..8bc5c36e09d 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -5112,8 +5112,8 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
 	 * as a reference for building the ResultRelInfo of the target partition.
 	 * In either case, it doesn't matter which result relation is kept, so we
 	 * just keep the first one, if all others have been pruned.  See also,
-	 * ExecDoInitialPruning(), which ensures that this first result relation
-	 * has been locked.
+	 * AcquireExecutorLocksPrepared(), which ensures that this first result
+	 * relation has been locked.
 	 */
 	i = 0;
 	foreach(l, node->resultRelations)
@@ -5127,6 +5127,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
 			/* all result relations pruned; keep the first one */
 			keep_rel = true;
 			rti = linitial_int(node->resultRelations);
+			Assert(list_member_int(estate->es_plannedstmt->firstResultRels, rti));
 			i = 0;
 		}
 
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 4ec76ce31a9..ace1cbacc91 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -657,6 +657,7 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 											  glob->prunableRelids);
 	result->permInfos = glob->finalrteperminfos;
 	result->subrtinfos = glob->subrtinfos;
+	result->firstResultRels = glob->firstResultRels;
 	result->appendRelations = glob->appendRelations;
 	result->subplans = glob->subplans;
 	result->rewindPlanIDs = glob->rewindPlanIDs;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index ff0e875f2a2..6ee51f06920 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -384,6 +384,24 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 		}
 	}
 
+	/*
+	 * Record the first result relation if it belongs to the set of initially
+	 * prunable relations.  We use bms_next_member() to get the
+	 * lowest-numbered leaf result rel, which matches
+	 * linitial_int(ModifyTable.resultRelations) because partition expansion
+	 * preserves RT index order.  ExecInitModifyTable() asserts that the
+	 * recorded index matches what it actually needs.
+	 */
+	if (root->leaf_result_relids)
+	{
+		Index		firstResultRel = bms_next_member(root->leaf_result_relids, -1);
+
+		firstResultRel += rtoffset;
+		if (bms_is_member(firstResultRel, root->glob->prunableRelids))
+			root->glob->firstResultRels =
+				lappend_int(root->glob->firstResultRels, firstResultRel);
+	}
+
 	return result;
 }
 
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 1b22515d56e..af732821139 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -59,7 +59,9 @@ static uint64 DoPortalRunFetch(Portal portal,
 							   long count,
 							   DestReceiver *dest);
 static void DoPortalRewind(Portal portal);
-static bool PortalLockCachedPlan(Portal portal);
+static bool PortalLockCachedPlan(Portal portal, bool do_prep,
+								 ParamListInfo params,
+								 QueryDesc **queryDesc_p);
 
 
 /*
@@ -492,9 +494,14 @@ restart:
 				 * the destination to DestNone.
 				 *
 				 * If the portal is backed by a cached plan, acquire execution
-				 * locks via PortalLockCachedPlan().  If the plan is
-				 * invalidated during locking, it replans and may change the
-				 * portal strategy, requiring us to restart PortalStart().
+				 * locks via PortalLockCachedPlan().  For eligible plans
+				 * (single-statement reused generic), this performs
+				 * pruning-aware locking: it runs ExecutorPrep() on the
+				 * QueryDesc to determine which partitions survive initial
+				 * pruning, then locks only those.  If the plan is invalidated
+				 * during this process, it replans and rebuilds the QueryDesc.
+				 * If replanning changes the portal strategy, we must restart
+				 * PortalStart() to redispatch.
 				 */
 				queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
 											portal->sourceText,
@@ -506,7 +513,7 @@ restart:
 											0);
 				if (portal->cplan)
 				{
-					if (PortalLockCachedPlan(portal))
+					if (PortalLockCachedPlan(portal, true, params, &queryDesc))
 					{
 						PopActiveSnapshot();
 						goto restart;
@@ -552,7 +559,7 @@ restart:
 			case PORTAL_ONE_MOD_WITH:
 				if (portal->cplan)
 				{
-					if (PortalLockCachedPlan(portal))
+					if (PortalLockCachedPlan(portal, false, NULL, NULL))
 						goto restart;
 				}
 
@@ -608,7 +615,7 @@ restart:
 				 */
 				if (portal->cplan)
 				{
-					if (PortalLockCachedPlan(portal))
+					if (PortalLockCachedPlan(portal, false, NULL, NULL))
 						goto restart;
 				}
 
@@ -1825,15 +1832,32 @@ EnsurePortalSnapshotExists(void)
  *		Acquire execution locks for a cached-plan-backed portal,
  *		retrying with a fresh plan if the current one is invalidated.
  *
+ * If do_prep is true and the plan is eligible (single-statement reused
+ * generic plan), performs pruning-aware locking via ExecutorPrepAndLock()
+ * on *prep_qd, rebuilding *prep_qd if the plan had to be replanned.
+ * Otherwise falls back to locking all relations in the plan.
+ *
  * Returns true if replanning changed portal->strategy, meaning the
- * caller must redispatch.  Returns false once locks are held.
+ * caller must redispatch.  Returns false once locks are held and the
+ * plan is valid for execution.
  */
 static bool
-PortalLockCachedPlan(Portal portal)
+PortalLockCachedPlan(Portal portal, bool do_prep,
+					 ParamListInfo params,
+					 QueryDesc **prep_qd)
 {
 	PortalStrategy start_strategy = portal->strategy;
 
-	if (AcquireExecutorLocks(portal->cplan))
+	if (do_prep && CachedPlanCanPrep(portal->cplan, portal->plansource))
+	{
+		Assert(prep_qd);
+		if (ExecutorPrepAndLock(*prep_qd, portal->resowner, 0,
+								&portal->cplan->is_valid))
+			return false;
+		ExecutorPrepCleanup(*prep_qd);
+		FreeQueryDesc(*prep_qd);
+	}
+	else if (AcquireExecutorLocks(portal->cplan))
 		return false;
 
 	/* Replan.  Locks will be taken freshly. */
@@ -1849,5 +1873,15 @@ PortalLockCachedPlan(Portal portal)
 	if (portal->strategy != start_strategy)
 		return true;
 
+	if (prep_qd)
+	{
+		Assert(list_length(portal->stmts) == 1);
+		*prep_qd = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+								   portal->sourceText,
+								   GetActiveSnapshot(), InvalidSnapshot,
+								   None_Receiver, params,
+								   portal->queryEnv, 0);
+	}
+
 	return false;
 }
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index f7fe366859c..fca2f84081e 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -1977,6 +1977,22 @@ AcquireExecutorLocks(CachedPlan *cplan)
 	return true;
 }
 
+/*
+ * CachedPlanCanPrep
+ *		Check whether a cached plan is eligible for pruning-aware locking
+ *		via ExecutorPrepAndLock().
+ *
+ * Only single-statement reused generic plans with a non-utility command
+ * qualify.
+ */
+bool
+CachedPlanCanPrep(CachedPlan *cplan, CachedPlanSource *plansource)
+{
+	return (cplan == plansource->gplan &&
+			list_length(cplan->stmt_list) == 1 &&
+			linitial_node(PlannedStmt, cplan->stmt_list)->commandType != CMD_UTILITY);
+}
+
 /*
  * AcquirePlannerLocks: acquire locks needed for planning of a querytree list;
  * or release them if acquire is false.
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 472e141bba3..3a03355e6b6 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -69,7 +69,8 @@ extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
 						   ParamListInfo params, QueryEnvironment *queryEnv,
 						   const instr_time *planduration,
 						   const BufferUsage *bufusage,
-						   const MemoryContextCounters *mem_counters);
+						   const MemoryContextCounters *mem_counters,
+						   QueryDesc *prep_qd);
 
 extern void ExplainPrintPlan(ExplainState *es, QueryDesc *queryDesc);
 extern void ExplainPrintTriggers(ExplainState *es,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index 491c4886506..fef5aadcdfa 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -21,6 +21,7 @@
 #include "nodes/lockoptions.h"
 #include "nodes/parsenodes.h"
 #include "utils/memutils.h"
+#include "utils/resowner.h"
 
 
 /*
@@ -235,6 +236,9 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
  */
 extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
 extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
+extern bool ExecutorPrepAndLock(QueryDesc *queryDesc, ResourceOwner owner,
+								int eflags, bool *is_valid);
+extern void ExecutorPrepCleanup(QueryDesc *queryDesc);
 extern void ExecutorRun(QueryDesc *queryDesc,
 						ScanDirection direction, uint64 count);
 extern void standard_ExecutorRun(QueryDesc *queryDesc,
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 693b879f76d..8753e05152b 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -217,6 +217,9 @@ typedef struct PlannerGlobal
 	/* "flat" list of integer RT indexes */
 	List	   *resultRelations;
 
+	/* "flat" list of integer RT indexes (one per ModifyTable node) */
+	List	   *firstResultRels;
+
 	/* "flat" list of AppendRelInfos */
 	List	   *appendRelations;
 
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 14a1dfed2b9..7f6f7cda781 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -120,6 +120,16 @@ typedef struct PlannedStmt
 	/* RT indexes of relations targeted by INSERT/UPDATE/DELETE/MERGE */
 	Bitmapset  *resultRelationRelids;
 
+	/*
+	 * rtable indexes of first target relation in each ModifyTable node in the
+	 * plan for INSERT/UPDATE/DELETE/MERGE.  NIL if resultRelations is NIL.
+	 *
+	 * These are used by AcquireExecutorLocksUnpruned() to ensure that the
+	 * first result rel for each ModifyTable remains locked even if pruned;
+	 * see ExecInitModifyTable() for the executor side assumptions.
+	 */
+	List	   *firstResultRels;
+
 	/* list of AppendRelInfo nodes */
 	List	   *appendRelations;
 
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index e0fc403e717..2941d3a301b 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -254,4 +254,6 @@ extern bool CachedPlanIsSimplyValid(CachedPlanSource *plansource,
 extern CachedExpression *GetCachedExpression(Node *expr);
 extern void FreeCachedExpression(CachedExpression *cexpr);
 
+extern bool CachedPlanCanPrep(CachedPlan *cplan, CachedPlanSource *plansource);
+
 #endif							/* PLANCACHE_H */
diff --git a/src/test/regress/expected/partition_prune.out b/src/test/regress/expected/partition_prune.out
index deacdd75807..61781389d2f 100644
--- a/src/test/regress/expected/partition_prune.out
+++ b/src/test/regress/expected/partition_prune.out
@@ -4824,3 +4824,187 @@ select min(a) over (partition by a order by a) from part_abc where a >= stable_o
 
 drop view part_abc_view;
 drop table part_abc;
+--
+-- Verify that pruning-aware locking skips pruned partitions
+-- when reusing a generic cached plan.
+--
+set plan_cache_mode to force_generic_plan;
+create table prunelock_p (a int) partition by list (a);
+create table prunelock_p1 partition of prunelock_p for values in (1);
+create table prunelock_p2 partition of prunelock_p for values in (2);
+create table prunelock_p3 partition of prunelock_p for values in (3);
+prepare prunelock_q (int) as select * from prunelock_p where a = $1;
+-- Force generic plan creation
+explain (costs off) execute prunelock_q(1);
+                  QUERY PLAN                  
+----------------------------------------------
+ Append
+   Subplans Removed: 2
+   ->  Seq Scan on prunelock_p1 prunelock_p_1
+         Filter: (a = $1)
+(4 rows)
+
+-- Execute and check which child partitions are locked
+begin;
+execute prunelock_q(1);
+ a 
+---
+(0 rows)
+
+select c.relname
+  from pg_locks l
+  join pg_class c on c.oid = l.relation
+ where l.pid = pg_backend_pid()
+   and c.relname like 'prunelock_p_'
+ order by c.relname;
+   relname    
+--------------
+ prunelock_p1
+(1 row)
+
+commit;
+deallocate prunelock_q;
+-- Turn pruning off
+set enable_partition_pruning to off;
+prepare prunelock_q (int) as select * from prunelock_p where a = $1;
+-- Force generic plan creation
+explain (costs off) execute prunelock_q(1);
+                  QUERY PLAN                  
+----------------------------------------------
+ Append
+   ->  Seq Scan on prunelock_p1 prunelock_p_1
+         Filter: (a = $1)
+   ->  Seq Scan on prunelock_p2 prunelock_p_2
+         Filter: (a = $1)
+   ->  Seq Scan on prunelock_p3 prunelock_p_3
+         Filter: (a = $1)
+(7 rows)
+
+-- Execute and check which child partitions are locked
+begin;
+execute prunelock_q(1);
+ a 
+---
+(0 rows)
+
+select c.relname
+  from pg_locks l
+  join pg_class c on c.oid = l.relation
+ where l.pid = pg_backend_pid()
+   and c.relname like 'prunelock_p_'
+ order by c.relname;
+   relname    
+--------------
+ prunelock_p1
+ prunelock_p2
+ prunelock_p3
+(3 rows)
+
+commit;
+deallocate prunelock_q;
+reset enable_partition_pruning;
+--
+-- Verify firstResultRels handling with multiple ModifyTable nodes
+-- (writable CTEs) targeting a partitioned table.  When a pruning
+-- parameter matches no partition, all result relations are pruned
+-- and the executor must still find a usable first result relation
+-- for each ModifyTable node.
+--
+prepare prunelock_mt_q (int, int) as
+  with upd1 as (update prunelock_p set a = a),
+       upd2 as (update prunelock_p set a = a where a = $2)
+  update prunelock_p set a = a where a = $1;
+-- Force generic plan creation
+explain (costs off) execute prunelock_mt_q(1, 2);
+                         QUERY PLAN                         
+------------------------------------------------------------
+ Update on prunelock_p
+   Update on prunelock_p1 prunelock_p_1
+   CTE upd1
+     ->  Update on prunelock_p prunelock_p_3
+           Update on prunelock_p1 prunelock_p_4
+           Update on prunelock_p2 prunelock_p_5
+           Update on prunelock_p3 prunelock_p_6
+           ->  Append
+                 ->  Seq Scan on prunelock_p1 prunelock_p_4
+                 ->  Seq Scan on prunelock_p2 prunelock_p_5
+                 ->  Seq Scan on prunelock_p3 prunelock_p_6
+   CTE upd2
+     ->  Update on prunelock_p prunelock_p_7
+           Update on prunelock_p2 prunelock_p_8
+           ->  Append
+                 Subplans Removed: 2
+                 ->  Seq Scan on prunelock_p2 prunelock_p_8
+                       Filter: (a = $2)
+   ->  Append
+         Subplans Removed: 2
+         ->  Seq Scan on prunelock_p1 prunelock_p_1
+               Filter: (a = $1)
+(22 rows)
+
+-- All partitions pruned: value 4 matches no partition, so each
+-- ModifyTable must still initialize correctly with no matching
+-- result relations.
+explain (costs off) execute prunelock_mt_q(4, 5);
+                         QUERY PLAN                         
+------------------------------------------------------------
+ Update on prunelock_p
+   CTE upd1
+     ->  Update on prunelock_p prunelock_p_2
+           Update on prunelock_p1 prunelock_p_3
+           Update on prunelock_p2 prunelock_p_4
+           Update on prunelock_p3 prunelock_p_5
+           ->  Append
+                 ->  Seq Scan on prunelock_p1 prunelock_p_3
+                 ->  Seq Scan on prunelock_p2 prunelock_p_4
+                 ->  Seq Scan on prunelock_p3 prunelock_p_5
+   CTE upd2
+     ->  Update on prunelock_p prunelock_p_6
+           ->  Append
+                 Subplans Removed: 3
+   ->  Append
+         Subplans Removed: 3
+(16 rows)
+
+deallocate prunelock_mt_q;
+drop table prunelock_p;
+--
+-- Verify that pruning-aware locking falls back to locking all
+-- partitions for multi-statement CachedPlans.  Rule rewriting can
+-- expand a single statement into multiple PlannedStmts, and later
+-- statements must not have their pruning evaluated before earlier
+-- ones have executed, since CCI between statements can change what
+-- pruning expressions see.
+--
+create table prune_config (val int);
+insert into prune_config values (1);
+create table multistmt_pt (a int, b int) partition by list (a);
+create table multistmt_pt_1 partition of multistmt_pt for values in (1);
+create table multistmt_pt_2 partition of multistmt_pt for values in (2);
+insert into multistmt_pt values (1, 0), (2, 0);
+create function get_prune_val() returns int as $$
+  select val from prune_config;
+$$ language sql stable;
+create rule config_upd_rule as on update to multistmt_pt
+  do also update prune_config set val = 2;
+set plan_cache_mode to force_generic_plan;
+prepare multi_q as update multistmt_pt set b = b + 1 where a = get_prune_val();
+-- first execute creates the generic plan
+execute multi_q;
+-- reset for the real test
+update prune_config set val = 1;
+update multistmt_pt set b = 0;
+-- second execute reuses the plan; pruning-aware locking kicks in
+execute multi_q;
+select * from multistmt_pt order by a;
+ a | b 
+---+---
+ 1 | 0
+ 2 | 1
+(2 rows)
+
+deallocate multi_q;
+drop rule config_upd_rule on multistmt_pt;
+drop function get_prune_val;
+drop table multistmt_pt, prune_config;
+reset plan_cache_mode;
diff --git a/src/test/regress/expected/plancache.out b/src/test/regress/expected/plancache.out
index 4e59188196c..3043dbfac2d 100644
--- a/src/test/regress/expected/plancache.out
+++ b/src/test/regress/expected/plancache.out
@@ -398,3 +398,66 @@ select name, generic_plans, custom_plans from pg_prepared_statements
 (1 row)
 
 drop table test_mode;
+-- This exercises the CachedPlanPrepCleanup() path, which must free
+-- the EState created by ExecutorPrep() when the plan is invalidated
+-- before execution begins.  The pruning expression uses a stable SQL
+-- function that calls a volatile plpgsql function.  That function
+-- performs DDL on a partition when a separate "signal" table says to
+-- do so.  The second EXECUTE should replan cleanly after the DDL.
+set plan_cache_mode to force_generic_plan;
+create table inval_during_pruning_p (a int) partition by list (a);
+create table inval_during_pruning_p1 partition of inval_during_pruning_p for values in (1);
+create table inval_during_pruning_p2 partition of inval_during_pruning_p for values in (2);
+insert into inval_during_pruning_p values (1), (2);
+create table inval_during_pruning_signal (create_idx bool not null);
+insert into inval_during_pruning_signal values (false);
+create or replace function invalidate_plancache_func() returns int
+as $$
+declare
+	create_index bool;
+begin
+	-- Perform DDL on a partition if asked to
+	select create_idx into create_index from inval_during_pruning_signal for update;
+	if create_index = true then
+		raise notice 'creating index on partition inval_during_pruning_p1';
+		create index on inval_during_pruning_p1 (a);
+		update inval_during_pruning_signal set create_idx = false;
+	end if;
+	-- value that pruning will match against partition bounds
+	return 1;
+end;
+$$ language plpgsql volatile;
+create or replace function stable_pruning_val() returns int as $$
+	select invalidate_plancache_func();
+$$ language sql stable;
+prepare inval_during_pruning_q as select * from inval_during_pruning_p where a = stable_pruning_val();
+-- Build a generic plan and run pruning once, but don't set the signal
+-- for invalidate_plancache_func() to perform the DDL.
+explain (verbose, costs off) execute inval_during_pruning_q;
+                                QUERY PLAN                                 
+---------------------------------------------------------------------------
+ Append
+   Subplans Removed: 1
+   ->  Seq Scan on public.inval_during_pruning_p1 inval_during_pruning_p_1
+         Output: inval_during_pruning_p_1.a
+         Filter: (inval_during_pruning_p_1.a = stable_pruning_val())
+(5 rows)
+
+-- Reuse the generic plan.  Make invalidate_plancache_func() perform DDL
+-- during this execution, which should force replanning without errors.
+update inval_during_pruning_signal set create_idx = true;
+explain (verbose, costs off) execute inval_during_pruning_q;
+NOTICE:  creating index on partition inval_during_pruning_p1
+                                QUERY PLAN                                 
+---------------------------------------------------------------------------
+ Append
+   Subplans Removed: 1
+   ->  Seq Scan on public.inval_during_pruning_p1 inval_during_pruning_p_1
+         Output: inval_during_pruning_p_1.a
+         Filter: (inval_during_pruning_p_1.a = stable_pruning_val())
+(5 rows)
+
+deallocate inval_during_pruning_q;
+drop table inval_during_pruning_p, inval_during_pruning_signal;
+drop function invalidate_plancache_func, stable_pruning_val;
+reset plan_cache_mode;
diff --git a/src/test/regress/sql/partition_prune.sql b/src/test/regress/sql/partition_prune.sql
index d93c0c03bab..692415a8d9f 100644
--- a/src/test/regress/sql/partition_prune.sql
+++ b/src/test/regress/sql/partition_prune.sql
@@ -1447,3 +1447,119 @@ select min(a) over (partition by a order by a) from part_abc where a >= stable_o
 
 drop view part_abc_view;
 drop table part_abc;
+
+--
+-- Verify that pruning-aware locking skips pruned partitions
+-- when reusing a generic cached plan.
+--
+set plan_cache_mode to force_generic_plan;
+
+create table prunelock_p (a int) partition by list (a);
+create table prunelock_p1 partition of prunelock_p for values in (1);
+create table prunelock_p2 partition of prunelock_p for values in (2);
+create table prunelock_p3 partition of prunelock_p for values in (3);
+
+prepare prunelock_q (int) as select * from prunelock_p where a = $1;
+
+-- Force generic plan creation
+explain (costs off) execute prunelock_q(1);
+
+-- Execute and check which child partitions are locked
+begin;
+execute prunelock_q(1);
+
+select c.relname
+  from pg_locks l
+  join pg_class c on c.oid = l.relation
+ where l.pid = pg_backend_pid()
+   and c.relname like 'prunelock_p_'
+ order by c.relname;
+commit;
+
+deallocate prunelock_q;
+
+-- Turn pruning off
+set enable_partition_pruning to off;
+
+prepare prunelock_q (int) as select * from prunelock_p where a = $1;
+
+-- Force generic plan creation
+explain (costs off) execute prunelock_q(1);
+
+-- Execute and check which child partitions are locked
+begin;
+execute prunelock_q(1);
+
+select c.relname
+  from pg_locks l
+  join pg_class c on c.oid = l.relation
+ where l.pid = pg_backend_pid()
+   and c.relname like 'prunelock_p_'
+ order by c.relname;
+commit;
+
+deallocate prunelock_q;
+reset enable_partition_pruning;
+
+--
+-- Verify firstResultRels handling with multiple ModifyTable nodes
+-- (writable CTEs) targeting a partitioned table.  When a pruning
+-- parameter matches no partition, all result relations are pruned
+-- and the executor must still find a usable first result relation
+-- for each ModifyTable node.
+--
+prepare prunelock_mt_q (int, int) as
+  with upd1 as (update prunelock_p set a = a),
+       upd2 as (update prunelock_p set a = a where a = $2)
+  update prunelock_p set a = a where a = $1;
+
+-- Force generic plan creation
+explain (costs off) execute prunelock_mt_q(1, 2);
+
+-- All partitions pruned: value 4 matches no partition, so each
+-- ModifyTable must still initialize correctly with no matching
+-- result relations.
+explain (costs off) execute prunelock_mt_q(4, 5);
+
+deallocate prunelock_mt_q;
+drop table prunelock_p;
+
+--
+-- Verify that pruning-aware locking falls back to locking all
+-- partitions for multi-statement CachedPlans.  Rule rewriting can
+-- expand a single statement into multiple PlannedStmts, and later
+-- statements must not have their pruning evaluated before earlier
+-- ones have executed, since CCI between statements can change what
+-- pruning expressions see.
+--
+create table prune_config (val int);
+insert into prune_config values (1);
+
+create table multistmt_pt (a int, b int) partition by list (a);
+create table multistmt_pt_1 partition of multistmt_pt for values in (1);
+create table multistmt_pt_2 partition of multistmt_pt for values in (2);
+insert into multistmt_pt values (1, 0), (2, 0);
+
+create function get_prune_val() returns int as $$
+  select val from prune_config;
+$$ language sql stable;
+
+create rule config_upd_rule as on update to multistmt_pt
+  do also update prune_config set val = 2;
+
+set plan_cache_mode to force_generic_plan;
+prepare multi_q as update multistmt_pt set b = b + 1 where a = get_prune_val();
+-- first execute creates the generic plan
+execute multi_q;
+-- reset for the real test
+update prune_config set val = 1;
+update multistmt_pt set b = 0;
+-- second execute reuses the plan; pruning-aware locking kicks in
+execute multi_q;
+select * from multistmt_pt order by a;
+
+deallocate multi_q;
+drop rule config_upd_rule on multistmt_pt;
+drop function get_prune_val;
+drop table multistmt_pt, prune_config;
+reset plan_cache_mode;
diff --git a/src/test/regress/sql/plancache.sql b/src/test/regress/sql/plancache.sql
index 4b2f11dcc64..6a8b8787de6 100644
--- a/src/test/regress/sql/plancache.sql
+++ b/src/test/regress/sql/plancache.sql
@@ -223,3 +223,55 @@ select name, generic_plans, custom_plans from pg_prepared_statements
   where  name = 'test_mode_pp';
 
 drop table test_mode;
+
+-- This exercises the CachedPlanPrepCleanup() path, which must free
+-- the EState created by ExecutorPrep() when the plan is invalidated
+-- before execution begins.  The pruning expression uses a stable SQL
+-- function that calls a volatile plpgsql function.  That function
+-- performs DDL on a partition when a separate "signal" table says to
+-- do so.  The second EXECUTE should replan cleanly after the DDL.
+set plan_cache_mode to force_generic_plan;
+create table inval_during_pruning_p (a int) partition by list (a);
+create table inval_during_pruning_p1 partition of inval_during_pruning_p for values in (1);
+create table inval_during_pruning_p2 partition of inval_during_pruning_p for values in (2);
+insert into inval_during_pruning_p values (1), (2);
+
+create table inval_during_pruning_signal (create_idx bool not null);
+insert into inval_during_pruning_signal values (false);
+create or replace function invalidate_plancache_func() returns int
+as $$
+declare
+	create_index bool;
+begin
+	-- Perform DDL on a partition if asked to
+	select create_idx into create_index from inval_during_pruning_signal for update;
+	if create_index = true then
+		raise notice 'creating index on partition inval_during_pruning_p1';
+		create index on inval_during_pruning_p1 (a);
+		update inval_during_pruning_signal set create_idx = false;
+	end if;
+	-- value that pruning will match against partition bounds
+	return 1;
+end;
+$$ language plpgsql volatile;
+
+create or replace function stable_pruning_val() returns int as $$
+	select invalidate_plancache_func();
+$$ language sql stable;
+
+prepare inval_during_pruning_q as select * from inval_during_pruning_p where a = stable_pruning_val();
+
+-- Build a generic plan and run pruning once, but don't set the signal
+-- for invalidate_plancache_func() to perform the DDL.
+explain (verbose, costs off) execute inval_during_pruning_q;
+
+-- Reuse the generic plan.  Make invalidate_plancache_func() perform DDL
+-- during this execution, which should force replanning without errors.
+update inval_during_pruning_signal set create_idx = true;
+explain (verbose, costs off) execute inval_during_pruning_q;
+
+deallocate inval_during_pruning_q;
+drop table inval_during_pruning_p, inval_during_pruning_signal;
+drop function invalidate_plancache_func, stable_pruning_val;
+
+reset plan_cache_mode;
-- 
2.47.3



  [application/octet-stream] v11-0003-Introduce-ExecutorPrep-and-refactor-executor-sta.patch (8.9K, 3-v11-0003-Introduce-ExecutorPrep-and-refactor-executor-sta.patch)
From 1b9f7861d7162f5b20f69ea9db5dda13f64c202e Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Thu, 26 Mar 2026 16:08:46 +0900
Subject: [PATCH v11 3/4] Introduce ExecutorPrep and refactor executor startup

Move permission checks, range table initialization, and initial
partition pruning out of InitPlan() into a new ExecutorPrep()
helper.

ExecutorStart() invokes ExecutorPrep() when QueryDesc->estate is
NULL, keeping current behavior unchanged.  If QueryDesc->estate is
already set, ExecutorStart() reuses it.

This is preparatory refactoring only.  No caller outside the
executor supplies a prebuilt EState in this commit.

In assert builds, verify that the expected relation locks are held
when entering ExecutorStart().
---
 src/backend/executor/README     |  10 ++-
 src/backend/executor/execMain.c | 152 ++++++++++++++++++++++++++------
 src/include/executor/execdesc.h |   2 +-
 3 files changed, 132 insertions(+), 32 deletions(-)
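
Note on the reuse rule described in the commit message above: the
following is only a minimal standalone sketch of "prepare once, reuse at
start", not the patch's code.  ToyQueryDesc, ToyEState,
toy_executor_prep() and toy_executor_start() are hypothetical stand-ins
for QueryDesc, EState, ExecutorPrep() and ExecutorStart(); the prep
counter exists only to make the reuse observable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for EState; not the PostgreSQL definition. */
typedef struct ToyEState
{
	int			prep_count;		/* how many times the prep work ran */
} ToyEState;

/* Hypothetical stand-in for QueryDesc. */
typedef struct ToyQueryDesc
{
	ToyEState  *estate;			/* NULL until prep has been done */
	bool		started;
} ToyQueryDesc;

/* Build the state once; the caller must not have prepared already. */
void
toy_executor_prep(ToyQueryDesc *qd)
{
	assert(qd->estate == NULL);
	qd->estate = calloc(1, sizeof(ToyEState));
	if (qd->estate == NULL)
		abort();
	qd->estate->prep_count++;
}

/* Run prep implicitly if nobody did it earlier, else reuse the state. */
void
toy_executor_start(ToyQueryDesc *qd)
{
	if (qd->estate == NULL)
		toy_executor_prep(qd);
	qd->started = true;
}
```

Whether a caller prepares explicitly or lets the start routine do it,
the prep work in this sketch runs exactly once per descriptor.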

diff --git a/src/backend/executor/README b/src/backend/executor/README
index 54f4782f31b..890bc3d9333 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -291,11 +291,17 @@ Query Processing Control Flow
 
 This is a sketch of control flow for full query processing:
 
 	CreateQueryDesc
 
+	ExecutorPrep
+		May be run before ExecutorStart, or implicitly from ExecutorStart
+		if not done earlier.  Creates the EState in QueryDesc, performs
+		range table initialization, permission checks, and initial
+		partition pruning.
+
 	ExecutorStart
-		CreateExecutorState
-			creates per-query context
+		ExecutorPrep (if QueryDesc.estate is NULL)
+			creates EState and per-query context
 		switch to per-query context to run ExecInitNode
 		AfterTriggerBeginQuery
 		ExecInitNode --- recursively scans plan tree
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 45e00c6af85..735c80e08a9 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -57,6 +57,7 @@
 #include "parser/parse_relation.h"
 #include "pgstat.h"
 #include "rewrite/rewriteHandler.h"
+#include "storage/lmgr.h"
 #include "tcop/utility.h"
 #include "utils/acl.h"
 #include "utils/backend_status.h"
@@ -76,6 +77,7 @@ ExecutorEnd_hook_type ExecutorEnd_hook = NULL;
 ExecutorCheckPerms_hook_type ExecutorCheckPerms_hook = NULL;
 
 /* decls for local routines only used within this module */
+static void ExecutorPrep(QueryDesc *queryDesc, ResourceOwner owner, int eflags);
 static void InitPlan(QueryDesc *queryDesc, int eflags);
 static void CheckValidRowMarkRel(Relation rel, RowMarkType markType);
 static void ExecPostprocessPlan(EState *estate);
@@ -147,7 +149,6 @@ standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
 
 	/* sanity checks: queryDesc must not be started already */
 	Assert(queryDesc != NULL);
-	Assert(queryDesc->estate == NULL);
 
 	/* caller must ensure the query's snapshot is active */
 	Assert(GetActiveSnapshot() == queryDesc->snapshot);
@@ -173,9 +174,67 @@ standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
 
 	/*
 	 * Build EState, switch into per-query memory context for startup.
-	 */
-	estate = CreateExecutorState();
-	queryDesc->estate = estate;
+	 *
+	 * If ExecutorPrep() ran earlier (e.g., to do initial pruning during plan
+	 * validity checking), reuse its EState to avoid redoing range table setup
+	 * and pruning.  Otherwise, create a fresh EState as usual.
+	 *
+	 * In assert builds, verify that the expected locks are held.  When no
+	 * prep EState was provided, AcquireExecutorLocks() should have locked
+	 * every relation in the plan.  When one was provided, pruning-aware
+	 * locking should have locked at least the unpruned relations.  Both
+	 * checks are skipped in parallel workers, which acquire relation locks
+	 * lazily in ExecGetRangeTableRelation().
+	 */
+	if (queryDesc->estate == NULL)
+	{
+#ifdef USE_ASSERT_CHECKING
+		if (!IsParallelWorker())
+		{
+			ListCell   *lc;
+
+			foreach(lc, queryDesc->plannedstmt->rtable)
+			{
+				RangeTblEntry *rte = lfirst_node(RangeTblEntry, lc);
+
+				if (rte->rtekind == RTE_RELATION ||
+					(rte->rtekind == RTE_SUBQUERY && rte->relid != InvalidOid))
+					Assert(CheckRelationOidLockedByMe(rte->relid,
+													  rte->rellockmode,
+													  true));
+			}
+		}
+#endif
+		ExecutorPrep(queryDesc, CurrentResourceOwner, eflags);
+	}
+#ifdef USE_ASSERT_CHECKING
+	else
+	{
+		/*
+		 * A prep EState was provided, meaning pruning-aware locking should
+		 * have locked at least the unpruned relations.
+		 */
+		if (!IsParallelWorker())
+		{
+			int			rtindex = -1;
+
+			while ((rtindex = bms_next_member(queryDesc->estate->es_unpruned_relids,
+											  rtindex)) >= 0)
+			{
+				RangeTblEntry *rte = exec_rt_fetch(rtindex, queryDesc->estate);
+
+				Assert(rte->rtekind == RTE_RELATION ||
+					   (rte->rtekind == RTE_SUBQUERY &&
+						rte->relid != InvalidOid));
+				Assert(CheckRelationOidLockedByMe(rte->relid,
+												  rte->rellockmode, true));
+			}
+		}
+	}
+#endif
+
+	estate = queryDesc->estate;
+	Assert(estate);
 
 	oldcontext = MemoryContextSwitchTo(estate->es_query_cxt);
 
@@ -265,6 +324,64 @@ standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
 	MemoryContextSwitchTo(oldcontext);
 }
 
+/*
+ * ExecutorPrep
+ *
+ * Build the initial executor state for queryDesc before ExecutorStart().
+ *
+ * This creates the EState and performs the subset of executor startup that
+ * does not require plan-tree initialization, allowing that work to be reused
+ * by callers that need executor state before ExecutorStart():
+ *
+ * - initialize the range table
+ * - perform permission checks
+ * - perform initial partition pruning
+ *
+ * On success, queryDesc->estate is set and can later be reused by
+ * ExecutorStart() instead of rebuilding the same state.
+ *
+ * Caller must ensure that queryDesc->snapshot is active.
+ */
+static void
+ExecutorPrep(QueryDesc *queryDesc, ResourceOwner owner, int eflags)
+{
+	ResourceOwner oldowner;
+	EState	   *estate;
+	PlannedStmt *pstmt;
+
+	Assert(queryDesc != NULL);
+
+	if (queryDesc->operation == CMD_UTILITY)
+		return;
+
+	Assert(ActiveSnapshotSet());
+	Assert(GetActiveSnapshot() == queryDesc->snapshot);
+	Assert(queryDesc->estate == NULL);
+
+	pstmt = queryDesc->plannedstmt;
+
+	estate = CreateExecutorState();
+	queryDesc->estate = estate;
+
+	estate->es_plannedstmt = pstmt;
+	estate->es_part_prune_infos = pstmt->partPruneInfos;
+	estate->es_param_list_info = queryDesc->params;
+	estate->es_queryEnv = queryDesc->queryEnv;
+	estate->es_top_eflags = eflags;
+
+	ExecCheckPermissions(pstmt->rtable, pstmt->permInfos, true);
+
+	ExecInitRangeTable(estate, pstmt->rtable, pstmt->permInfos,
+					   bms_copy(pstmt->unprunableRelids));
+
+	oldowner = CurrentResourceOwner;
+	CurrentResourceOwner = owner;
+
+	ExecDoInitialPruning(estate);
+
+	CurrentResourceOwner = oldowner;
+}
+
 /* ----------------------------------------------------------------
  *		ExecutorRun
  *
@@ -840,37 +957,14 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 	CmdType		operation = queryDesc->operation;
 	PlannedStmt *plannedstmt = queryDesc->plannedstmt;
 	Plan	   *plan = plannedstmt->planTree;
-	List	   *rangeTable = plannedstmt->rtable;
 	EState	   *estate = queryDesc->estate;
 	PlanState  *planstate;
 	TupleDesc	tupType;
 	ListCell   *l;
 	int			i;
 
-	/*
-	 * Do permissions checks
-	 */
-	ExecCheckPermissions(rangeTable, plannedstmt->permInfos, true);
-
-	/*
-	 * initialize the node's execution state
-	 */
-	ExecInitRangeTable(estate, rangeTable, plannedstmt->permInfos,
-					   bms_copy(plannedstmt->unprunableRelids));
-
-	estate->es_plannedstmt = plannedstmt;
-	estate->es_part_prune_infos = plannedstmt->partPruneInfos;
-
-	/*
-	 * Perform runtime "initial" pruning to identify which child subplans,
-	 * corresponding to the children of plan nodes that contain
-	 * PartitionPruneInfo such as Append, will not be executed. The results,
-	 * which are bitmapsets of indexes of the child subplans that will be
-	 * executed, are saved in es_part_prune_results.  These results correspond
-	 * to each PartitionPruneInfo entry, and the es_part_prune_results list is
-	 * parallel to es_part_prune_infos.
-	 */
-	ExecDoInitialPruning(estate);
+	/* ExecutorPrep() must have been done. */
+	Assert(queryDesc->estate);
 
 	/*
 	 * Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index d3a57242844..27697760bb9 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -43,7 +43,7 @@ typedef struct QueryDesc
 	QueryEnvironment *queryEnv; /* query environment passed in */
 	int			instrument_options; /* OR of InstrumentOption flags */
 
-	/* These fields are set by ExecutorStart */
+	/* These fields are set by ExecutorStart or ExecutorPrep */
 	TupleDesc	tupDesc;		/* descriptor for result tuples */
 	EState	   *estate;			/* executor's query-wide state */
 	PlanState  *planstate;		/* tree of per-plan-node state */
-- 
2.47.3



  [application/octet-stream] v11-0001-Move-execution-lock-acquisition-out-of-GetCached.patch (16.4K, 4-v11-0001-Move-execution-lock-acquisition-out-of-GetCached.patch)
  download | inline diff:
From 8dc44320c7d4b20f50200d7b21c98e4058b8d6d7 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Sat, 4 Apr 2026 18:38:34 +0900
Subject: [PATCH v11 1/4] Move execution lock acquisition out of
 GetCachedPlan()

GetCachedPlan() previously acquired execution locks on all plan
relations as part of cached plan validation.  Move this
responsibility to callers, making GetCachedPlan() return a valid
plan without holding execution locks.

Add AcquireExecutorLocks() as the caller-facing function: it locks
all relations in the plan, checks that the plan is still valid
afterward, and returns false if it was invalidated so the caller
can retry with a fresh plan.

For portal-backed callers, add PortalLockCachedPlan() in pquery.c
which wraps the lock-check-retry loop and handles the case where
replanning changes the portal strategy.  Store the CachedPlanSource
pointer in PortalData so retry can call GetCachedPlan() without
the caller threading it through.

Adjust all non-portal GetCachedPlan() callers (SPI, EXPLAIN
EXECUTE, SQL functions) to call AcquireExecutorLocks() explicitly
after fetching the plan.

No behavioral change.  This separates plan retrieval from execution
setup, allowing a later commit to substitute pruning-aware locking
for eligible plans.
---
 src/backend/commands/portalcmds.c   |  1 +
 src/backend/commands/prepare.c      | 14 +++++-
 src/backend/executor/functions.c    | 14 ++++--
 src/backend/executor/spi.c          | 22 ++++++++--
 src/backend/tcop/postgres.c         |  2 +
 src/backend/tcop/pquery.c           | 68 ++++++++++++++++++++++++++++-
 src/backend/utils/cache/plancache.c | 44 ++++++++++++++-----
 src/backend/utils/mmgr/portalmem.c  |  7 +++
 src/include/utils/plancache.h       |  1 +
 src/include/utils/portal.h          |  3 ++
 10 files changed, 155 insertions(+), 21 deletions(-)

diff --git a/src/backend/commands/portalcmds.c b/src/backend/commands/portalcmds.c
index 01efac3319e..cf5deec4943 100644
--- a/src/backend/commands/portalcmds.c
+++ b/src/backend/commands/portalcmds.c
@@ -118,6 +118,7 @@ PerformCursorOpen(ParseState *pstate, DeclareCursorStmt *cstmt, ParamListInfo pa
 					  queryString,
 					  CMDTAG_SELECT,	/* cursor's query is always a SELECT */
 					  list_make1(plan),
+					  NULL,
 					  NULL);
 
 	/*----------
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 876aad2100a..03d7a98fc58 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -207,6 +207,7 @@ ExecuteQuery(ParseState *pstate,
 					  query_string,
 					  entry->plansource->commandTag,
 					  plan_list,
+					  entry->plansource,
 					  cplan);
 
 	/*
@@ -632,8 +633,17 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	}
 
 	/* Replan if needed, and acquire a transient refcount */
-	cplan = GetCachedPlan(entry->plansource, paramLI,
-						  CurrentResourceOwner, pstate->p_queryEnv);
+	for (;;)
+	{
+		cplan = GetCachedPlan(entry->plansource, paramLI,
+							  CurrentResourceOwner,
+							  pstate->p_queryEnv);
+		plan_list = cplan->stmt_list;
+
+		if (AcquireExecutorLocks(cplan))
+			break;
+		ReleaseCachedPlan(cplan, CurrentResourceOwner);
+	}
 
 	INSTR_TIME_SET_CURRENT(planduration);
 	INSTR_TIME_SUBTRACT(planduration, planstart);
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index 88109348817..2afb814a435 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -654,6 +654,7 @@ static bool
 init_execution_state(SQLFunctionCachePtr fcache)
 {
 	CachedPlanSource *plansource;
+	CachedPlan *cplan;
 	execution_state *preves = NULL;
 	execution_state *lasttages = NULL;
 	int			nstmts;
@@ -696,10 +697,15 @@ init_execution_state(SQLFunctionCachePtr fcache)
 	 * CurrentResourceOwner will be the same when ShutdownSQLFunction runs.)
 	 */
 	fcache->cowner = CurrentResourceOwner;
-	fcache->cplan = GetCachedPlan(plansource,
-								  fcache->paramLI,
-								  fcache->cowner,
-								  NULL);
+	for (;;)
+	{
+		cplan = GetCachedPlan(plansource, fcache->paramLI,
+							  fcache->cowner, NULL);
+		if (AcquireExecutorLocks(cplan))
+			break;
+		ReleaseCachedPlan(cplan, fcache->cowner);
+	}
+	fcache->cplan = cplan;
 
 	/*
 	 * If necessary, make esarray[] bigger to hold the needed state.
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 52f3b11301c..268cd10bde8 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1686,6 +1686,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 					  query_string,
 					  plansource->commandTag,
 					  stmt_list,
+					  plansource,
 					  cplan);
 
 	/*
@@ -2106,6 +2107,16 @@ SPI_plan_get_cached_plan(SPIPlanPtr plan)
 						  _SPI_current->queryEnv);
 	Assert(cplan == plansource->gplan);
 
+	if (!AcquireExecutorLocks(cplan))
+	{
+		/* Plan invalidated during locking; get a fresh one. */
+		ReleaseCachedPlan(cplan,
+						  plan->saved ? CurrentResourceOwner : NULL);
+		cplan = GetCachedPlan(plansource, NULL,
+							  plan->saved ? CurrentResourceOwner : NULL,
+							  _SPI_current->queryEnv);
+	}
+
 	/* Pop the error context stack */
 	error_context_stack = spierrcontext.previous;
 
@@ -2574,9 +2585,14 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 		 * Replan if needed, and increment plan refcount.  If it's a saved
 		 * plan, the refcount must be backed by the plan_owner.
 		 */
-		cplan = GetCachedPlan(plansource, options->params,
-							  plan_owner, _SPI_current->queryEnv);
-
+		for (;;)
+		{
+			cplan = GetCachedPlan(plansource, options->params,
+								  plan_owner, _SPI_current->queryEnv);
+			if (AcquireExecutorLocks(cplan))
+				break;
+			ReleaseCachedPlan(cplan, plan_owner);
+		}
 		stmt_list = cplan->stmt_list;
 
 		/*
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 10be60011ad..aaebefcdf7a 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1231,6 +1231,7 @@ exec_simple_query(const char *query_string)
 						  query_string,
 						  commandTag,
 						  plantree_list,
+						  NULL,
 						  NULL);
 
 		/*
@@ -2030,6 +2031,7 @@ exec_bind_message(StringInfo input_message)
 					  query_string,
 					  psrc->commandTag,
 					  cplan->stmt_list,
+					  psrc,
 					  cplan);
 
 	/* Portal is defined, set the plan ID based on its contents. */
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index d8fc75d0bb9..1b22515d56e 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -59,6 +59,7 @@ static uint64 DoPortalRunFetch(Portal portal,
 							   long count,
 							   DestReceiver *dest);
 static void DoPortalRewind(Portal portal);
+static bool PortalLockCachedPlan(Portal portal);
 
 
 /*
@@ -462,6 +463,8 @@ PortalStart(Portal portal, ParamListInfo params,
 		 */
 		portal->strategy = ChoosePortalStrategy(portal->stmts);
 
+restart:
+
 		/*
 		 * Fire her up according to the strategy
 		 */
@@ -487,6 +490,11 @@ PortalStart(Portal portal, ParamListInfo params,
 				/*
 				 * Create QueryDesc in portal's context; for the moment, set
 				 * the destination to DestNone.
+				 *
+				 * If the portal is backed by a cached plan, acquire execution
+				 * locks via PortalLockCachedPlan().  If the plan is
+				 * invalidated during locking, it replans and may change the
+				 * portal strategy, requiring us to restart PortalStart().
 				 */
 				queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
 											portal->sourceText,
@@ -496,6 +504,14 @@ PortalStart(Portal portal, ParamListInfo params,
 											params,
 											portal->queryEnv,
 											0);
+				if (portal->cplan)
+				{
+					if (PortalLockCachedPlan(portal))
+					{
+						PopActiveSnapshot();
+						goto restart;
+					}
+				}
 
 				/*
 				 * If it's a scrollable cursor, executor needs to support
@@ -534,6 +550,11 @@ PortalStart(Portal portal, ParamListInfo params,
 
 			case PORTAL_ONE_RETURNING:
 			case PORTAL_ONE_MOD_WITH:
+				if (portal->cplan)
+				{
+					if (PortalLockCachedPlan(portal))
+						goto restart;
+				}
 
 				/*
 				 * We don't start the executor until we are told to run the
@@ -577,7 +598,20 @@ PortalStart(Portal portal, ParamListInfo params,
 				break;
 
 			case PORTAL_MULTI_QUERY:
-				/* Need do nothing now */
+
+				/*
+				 * GetCachedPlan() no longer acquires execution locks, so we
+				 * must do it here.  Multi-statement plans always use
+				 * conservative locking (all partitions locked); pruning-aware
+				 * locking is not feasible because PortalRunMulti() executes
+				 * statements sequentially with CCI between them.
+				 */
+				if (portal->cplan)
+				{
+					if (PortalLockCachedPlan(portal))
+						goto restart;
+				}
+
 				portal->tupDesc = NULL;
 				break;
 		}
@@ -1785,3 +1819,35 @@ EnsurePortalSnapshotExists(void)
 	/* PushActiveSnapshotWithLevel might have copied the snapshot */
 	portal->portalSnapshot = GetActiveSnapshot();
 }
+
+/*
+ * PortalLockCachedPlan
+ *		Acquire execution locks for a cached-plan-backed portal,
+ *		retrying with a fresh plan if the current one is invalidated.
+ *
+ * Returns true if replanning changed portal->strategy, meaning the
+ * caller must redispatch.  Returns false once locks are held.
+ */
+static bool
+PortalLockCachedPlan(Portal portal)
+{
+	PortalStrategy start_strategy = portal->strategy;
+
+	if (AcquireExecutorLocks(portal->cplan))
+		return false;
+
+	/* Replan.  Locks will be taken freshly. */
+	ReleaseCachedPlan(portal->cplan, portal->resowner);
+	portal->cplan = NULL;
+	portal->stmts = NIL;
+	portal->cplan = GetCachedPlan(portal->plansource,
+								  portal->portalParams,
+								  portal->resowner,
+								  portal->queryEnv);
+	portal->stmts = portal->cplan->stmt_list;
+	portal->strategy = ChoosePortalStrategy(portal->stmts);
+	if (portal->strategy != start_strategy)
+		return true;
+
+	return false;
+}
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 698e7c1aa22..f7fe366859c 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -100,7 +100,7 @@ static bool choose_custom_plan(CachedPlanSource *plansource,
 							   ParamListInfo boundParams);
 static double cached_plan_cost(CachedPlan *plan, bool include_planner);
 static Query *QueryListGetPrimaryStmt(List *stmts);
-static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static void AcquireExecutorLocksInt(List *stmt_list, bool acquire);
 static void AcquirePlannerLocks(List *stmt_list, bool acquire);
 static void ScanQueryForLocks(Query *parsetree, bool acquire);
 static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -945,8 +945,9 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
  * Caller must have already called RevalidateCachedQuery to verify that the
  * querytree is up to date.
  *
- * On a "true" return, we have acquired the locks needed to run the plan.
- * (We must do this for the "true" result to be race-condition-free.)
+ * On a "true" return, the generic plan may be reused as a valid cached
+ * plan.  Any execution-time setup, including lock acquisition, is the
+ * caller's responsibility.
  */
 static bool
 CheckCachedPlan(CachedPlanSource *plansource)
@@ -983,8 +984,6 @@ CheckCachedPlan(CachedPlanSource *plansource)
 		 */
 		Assert(plan->refcount > 0);
 
-		AcquireExecutorLocks(plan->stmt_list, true);
-
 		/*
 		 * If plan was transient, check to see if TransactionXmin has
 		 * advanced, and if so invalidate it.
@@ -1003,9 +1002,6 @@ CheckCachedPlan(CachedPlanSource *plansource)
 			/* Successfully revalidated and locked the query. */
 			return true;
 		}
-
-		/* Oops, the race case happened.  Release useless locks. */
-		AcquireExecutorLocks(plan->stmt_list, false);
 	}
 
 	/*
@@ -1282,8 +1278,11 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
  * plan or a custom plan for the given parameters: the caller does not know
  * which it will get.
  *
- * On return, the plan is valid and we have sufficient locks to begin
- * execution.
+ * On return, the plan is valid but no execution locks are held.
+ * The caller must call AcquireExecutorLocks() before executing.
+ * For freshly built plans (custom or new generic), the planner
+ * already holds the needed locks, so AcquireExecutorLocks() is
+ * redundant but harmless.
  *
  * On return, the refcount of the plan has been incremented; a later
  * ReleaseCachedPlan() call is expected.  If "owner" is not NULL then
@@ -1906,9 +1905,11 @@ QueryListGetPrimaryStmt(List *stmts)
 /*
  * AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
  * or release them if acquire is false.
+ *
+ * This locks all relations in a given PlannedStmt's range table.
  */
 static void
-AcquireExecutorLocks(List *stmt_list, bool acquire)
+AcquireExecutorLocksInt(List *stmt_list, bool acquire)
 {
 	ListCell   *lc1;
 
@@ -1955,6 +1956,27 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 	}
 }
 
+/*
+ * AcquireExecutorLocks
+ *		Acquire execution locks on all relations in a cached plan.
+ *
+ * Returns true if the plan is still valid after locking.  Returns
+ * false if the plan was invalidated while locks were being acquired,
+ * in which case the locks have been released and the caller should
+ * discard this plan and retry with a fresh one from GetCachedPlan().
+ */
+bool
+AcquireExecutorLocks(CachedPlan *cplan)
+{
+	AcquireExecutorLocksInt(cplan->stmt_list, true);
+	if (!cplan->is_valid)
+	{
+		AcquireExecutorLocksInt(cplan->stmt_list, false);
+		return false;
+	}
+	return true;
+}
+
 /*
  * AcquirePlannerLocks: acquire locks needed for planning of a querytree list;
  * or release them if acquire is false.
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index 493f9b0ee19..613f3be30b3 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -272,6 +272,10 @@ CreateNewPortal(void)
  * the passed plan trees have adequate lifetime.  Typically this is done by
  * copying them into the portal's context.
  *
+ * If plansource is provided, it is the CachedPlanSource that produced
+ * cplan.  PortalLockCachedPlan() uses it to fetch a fresh plan if the
+ * current one is invalidated during execution lock acquisition.
+ *
  * The caller is also responsible for ensuring that the passed prepStmtName
  * (if not NULL) and sourceText have adequate lifetime.
  *
@@ -286,6 +290,7 @@ PortalDefineQuery(Portal portal,
 				  const char *sourceText,
 				  CommandTag commandTag,
 				  List *stmts,
+				  CachedPlanSource *plansource,
 				  CachedPlan *cplan)
 {
 	Assert(PortalIsValid(portal));
@@ -299,6 +304,7 @@ PortalDefineQuery(Portal portal,
 	portal->commandTag = commandTag;
 	SetQueryCompletion(&portal->qc, commandTag, 0);
 	portal->stmts = stmts;
+	portal->plansource = plansource;
 	portal->cplan = cplan;
 	portal->status = PORTAL_DEFINED;
 }
@@ -517,6 +523,7 @@ PortalDrop(Portal portal, bool isTopCommit)
 
 	/* drop cached plan reference, if any */
 	PortalReleaseCachedPlan(portal);
+	portal->plansource = NULL;
 
 	/*
 	 * If portal has a snapshot protecting its data, release that.  This needs
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 7a4a85c8038..e0fc403e717 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -241,6 +241,7 @@ extern CachedPlan *GetCachedPlan(CachedPlanSource *plansource,
 								 ParamListInfo boundParams,
 								 ResourceOwner owner,
 								 QueryEnvironment *queryEnv);
+extern bool AcquireExecutorLocks(CachedPlan *cplan);
 extern void ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner);
 
 extern bool CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index a7bedb12c18..3af535362cd 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -137,6 +137,8 @@ typedef struct PortalData
 	CommandTag	commandTag;		/* command tag for original query */
 	QueryCompletion qc;			/* command completion data for executed query */
 	List	   *stmts;			/* list of PlannedStmts */
+	CachedPlanSource *plansource;	/* CachedPlanSource, for replanning on
+									 * invalidation */
 	CachedPlan *cplan;			/* CachedPlan, if stmts are from one */
 
 	ParamListInfo portalParams; /* params to pass to query */
@@ -240,6 +242,7 @@ extern void PortalDefineQuery(Portal portal,
 							  const char *sourceText,
 							  CommandTag commandTag,
 							  List *stmts,
+							  CachedPlanSource *plansource,
 							  CachedPlan *cplan);
 extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
 extern void PortalCreateHoldStore(Portal portal);
-- 
2.47.3



  [application/octet-stream] v11-0002-Refactor-executor-s-initial-partition-pruning-se.patch (7.3K, 5-v11-0002-Refactor-executor-s-initial-partition-pruning-se.patch)
  download | inline diff:
From ddc05ba324ab0347b2219ead1740a14617029f30 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Wed, 25 Mar 2026 16:06:38 +0900
Subject: [PATCH v11 2/4] Refactor executor's initial partition pruning setup

Simplify handling of unpruned relids by moving responsibility
for recording them in EState into CreatePartitionPruneState(),
avoiding the need to pass all_leafpart_rtis as an out parameter.

Also move the setting of ecxt_param_exec_vals from
ExecCreatePartitionPruneState() to InitExecPartitionPruneContexts(),
to allow the former to be called before PARAM_EXEC parameters are
set up.  A later commit needs this when running pruning state setup
outside of InitPlan().

No behavioral change.
---
 src/backend/executor/execPartition.c | 70 +++++++++++++++++++---------
 1 file changed, 48 insertions(+), 22 deletions(-)

diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index d96d4f9947b..2a3af006f77 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -185,8 +185,7 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
 static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
 static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
 static PartitionPruneState *CreatePartitionPruneState(EState *estate,
-													  PartitionPruneInfo *pruneinfo,
-													  Bitmapset **all_leafpart_rtis);
+													  PartitionPruneInfo *pruneinfo);
 static void InitPartitionPruneContext(PartitionPruneContext *context,
 									  List *pruning_steps,
 									  PartitionDesc partdesc,
@@ -1978,7 +1977,7 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  * estate->es_part_prune_infos. For each entry, it creates a PartitionPruneState
  * and adds it to es_part_prune_states.  ExecInitPartitionExecPruning() accesses
  * these states through their corresponding indexes in es_part_prune_states and
- * assign each state to the parent node's PlanState, from where it will be used
+ * assigns each state to the parent node's PlanState, from where it will be used
  * for "exec" pruning.
  *
  * If initial pruning steps exist for a PartitionPruneInfo entry, this function
@@ -1996,29 +1995,31 @@ ExecDoInitialPruning(EState *estate)
 {
 	ListCell   *lc;
 
+	Assert(estate->es_part_prune_results == NULL);
 	foreach(lc, estate->es_part_prune_infos)
 	{
 		PartitionPruneInfo *pruneinfo = lfirst_node(PartitionPruneInfo, lc);
 		PartitionPruneState *prunestate;
 		Bitmapset  *validsubplans = NULL;
-		Bitmapset  *all_leafpart_rtis = NULL;
 		Bitmapset  *validsubplan_rtis = NULL;
 
 		/* Create and save the PartitionPruneState. */
-		prunestate = CreatePartitionPruneState(estate, pruneinfo,
-											   &all_leafpart_rtis);
+		prunestate = CreatePartitionPruneState(estate, pruneinfo);
 		estate->es_part_prune_states = lappend(estate->es_part_prune_states,
 											   prunestate);
 
 		/*
 		 * Perform initial pruning steps, if any, and save the result
-		 * bitmapset or NULL as described in the header comment.
+		 * bitmapset or NULL as described in the header comment.  RT indexes
+		 * of surviving partitions would be added to validsubplan_rtis.
+		 *
+		 * Note that when do_initial_prune is false,
+		 * CreatePartitionPruneState() would have already added the RT indexes
+		 * of all leaf partitions to es_unpruned_relids directly.
 		 */
 		if (prunestate->do_initial_prune)
 			validsubplans = ExecFindMatchingSubPlans(prunestate, true,
 													 &validsubplan_rtis);
-		else
-			validsubplan_rtis = all_leafpart_rtis;
 
 		estate->es_unpruned_relids = bms_add_members(estate->es_unpruned_relids,
 													 validsubplan_rtis);
@@ -2136,14 +2137,12 @@ ExecInitPartitionExecPruning(PlanState *planstate,
  * parent plan node's PlanState.
  *
  * If initial pruning steps are to be skipped (e.g., during EXPLAIN
- * (GENERIC_PLAN)), *all_leafpart_rtis will be populated with the RT indexes of
- * all leaf partitions whose scanning subnode is included in the parent plan
- * node's list of child plans. The caller must add these RT indexes to
- * estate->es_unpruned_relids.
+ * (GENERIC_PLAN)), the RT indexes of all leaf partitions whose scanning
+ * subnode is included in the parent plan node's list of child plans are
+ * added to estate->es_unpruned_relids.
  */
 static PartitionPruneState *
-CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo,
-						  Bitmapset **all_leafpart_rtis)
+CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo)
 {
 	PartitionPruneState *prunestate;
 	int			n_part_hierarchies;
@@ -2377,8 +2376,8 @@ CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo,
 													   pinfo->execparamids);
 
 			/*
-			 * Return all leaf partition indexes if we're skipping pruning in
-			 * the EXPLAIN (GENERIC_PLAN) case.
+			 * Add all leaf partition indexes to es_unpruned_relids if we're
+			 * skipping pruning in the EXPLAIN (GENERIC_PLAN) case.
 			 */
 			if (pinfo->initial_pruning_steps && !prunestate->do_initial_prune)
 			{
@@ -2390,9 +2389,28 @@ CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo,
 					Index		rtindex = pprune->leafpart_rti_map[part_index];
 
 					if (rtindex)
-						*all_leafpart_rtis = bms_add_member(*all_leafpart_rtis,
-															rtindex);
+						estate->es_unpruned_relids =
+							bms_add_member(estate->es_unpruned_relids, rtindex);
+				}
+			}
+			else if (pinfo->initial_pruning_steps == NIL)
+			{
+				/*
+				 * All partitions better be present in es_unpruned_relids when
+				 * none are initially prunable.
+				 */
+#ifdef USE_ASSERT_CHECKING
+				int			part_index = -1;
+
+				while ((part_index = bms_next_member(pprune->present_parts,
+													 part_index)) >= 0)
+				{
+					Index		rtindex = pprune->leafpart_rti_map[part_index];
+
+					if (rtindex)
+						Assert(bms_is_member(rtindex, estate->es_unpruned_relids));
 				}
+#endif
 			}
 
 			j++;
@@ -2490,9 +2508,10 @@ InitPartitionPruneContext(PartitionPruneContext *context,
  *		Initialize exec pruning contexts deferred by CreatePartitionPruneState()
  *
  * This function finalizes exec pruning setup for a PartitionPruneState by
- * initializing contexts for pruning steps that require the parent plan's
- * PlanState. It iterates over PartitionPruningData entries and sets up the
- * necessary execution contexts for pruning during query execution.
+ * initializing contexts for pruning steps that require PARAM_EXEC parameters
+ * and the parent plan's PlanState. It iterates over PartitionPruningData
+ * entries and sets up the necessary execution contexts for pruning during
+ * query execution.
  *
  * Also fix the mapping of partition indexes to subplan indexes contained in
  * prunestate by considering the new list of subplans that survived initial
@@ -2520,9 +2539,16 @@ InitExecPartitionPruneContexts(PartitionPruneState *prunestate,
 	bool		fix_subplan_map = false;
 
 	Assert(prunestate->do_exec_prune);
+	Assert(prunestate->econtext);
 	Assert(parent_plan != NULL);
 	estate = parent_plan->state;
 
+	/*
+	 * These might not be available when ExecCreatePartitionPruneState() is
+	 * called.
+	 */
+	prunestate->econtext->ecxt_param_exec_vals = estate->es_param_exec_vals;
+
 	/*
 	 * No need to fix subplans maps if initial pruning didn't eliminate any
 	 * subplans.
-- 
2.47.3



^ permalink  raw  reply  [nested|flat] 108+ messages in thread

* Re: generic plans and "initial" pruning
@ 2026-05-27 12:03  Thom Brown <[email protected]>
  parent: Amit Langote <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: Thom Brown @ 2026-05-27 12:03 UTC (permalink / raw)
  To: Amit Langote <[email protected]>; +Cc: Chao Li <[email protected]>; Tom Lane <[email protected]>; Tender Wang <[email protected]>; Alexander Lakhin <[email protected]>; Tomas Vondra <[email protected]>; Robert Haas <[email protected]>; Alvaro Herrera <[email protected]>; Andres Freund <[email protected]>; Daniel Gustafsson <[email protected]>; David Rowley <[email protected]>; pgsql-hackers

On Sat, 4 Apr 2026 at 13:11, Amit Langote <[email protected]> wrote:
>
> Attached is a redesigned version. While working on the previous
> design, I grew increasingly uncomfortable with CachedPlanPrepData --
> it was smuggling executor state out of GetCachedPlan() through an
> out-parameter, which papered over the real problem: GetCachedPlan()
> was doing too much. The main change in this version is architectural:
> GetCachedPlan() no longer acquires execution locks. Callers now own
> that responsibility, which is natural because each call site iterates
> stmt_list differently and manages execution state in its own way --
> and it lets them choose between conservative lock-all and
> pruning-aware locking where appropriate.
>
> Non-portal call sites remain on the conservative path for now.
> _SPI_execute_plan requires care around snapshot setup, which happens
> after plan fetch rather than before. SQL functions have a different
> issue: init_execution_state() fetches the plan while postquel_start()
> handles execution, with execution_state containers in between, making
> it harder to thread a prepped QueryDesc through. The portal path and
> EXPLAIN EXECUTE cover the most common
> prepared-statement-with-partitions workloads; the remaining sites can
> be converted incrementally.
>
> This is now starting to feel closer to what Tom suggested back in
> January 2023 [1], where he proposed getting rid of
> AcquireExecutorLocks() inside GetCachedPlan() entirely and pushing
> lock acquisition out to callers. He noted that "we'd be pushing the
> responsibility for looping back and re-planning out to fairly
> high-level calling code" and that "we'd definitely be changing some
> fundamental APIs." That is the direction I came around to over the
> last couple of weeks while wrestling with CachedPlanPrepData.  The
> reverted approach also tried to follow Tom's direction but moved
> locking into ExecutorStart(), which forced it to handle plan
> invalidation from inside the executor by mutating the CachedPlan
> in-place. This version moves locking out to the callers instead, so
> the executor and plan cache never reach into each other.
>
> The series is now four patches:
>
> 0001: Move execution lock acquisition out of GetCachedPlan(). Adds
> AcquireExecutorLocks() as a caller-facing function with validity check
> and retry. Adds PortalLockCachedPlan() in pquery.c to centralize the
> portal retry logic. All callers are converted. No behavioral change.
>
> 0002: Refactor executor's initial partition pruning setup. Cleanup
> only, no behavioral change.
>
> 0003: Introduce ExecutorPrep() and refactor executor startup. Factors
> range table init, permission checks, and initial pruning out of
> InitPlan(). Scaffolding for 0004; all callers still go through the
> normal ExecutorStart() path.
>
> 0004: Use pruning-aware locking for single-statement cached plans.
> Adds ExecutorPrepAndLock() which locks unprunable relations, runs
> ExecutorPrep() to determine surviving partitions, then locks only
> those. Extends PortalLockCachedPlan() with a pruning-aware path for
> eligible plans. Multi-statement CachedPlans (from rule rewriting)
> always use conservative locking. In principle, this could be relaxed
> if the planner can prove that no pruning expression reads state
> modified by an earlier statement, but that is left for a future patch.
> Includes regression tests.
>
> In case it's not clear, I'm not targeting v19 at this point.  I'd like
> to get this into v20 CF1 and would welcome review from anyone
> interested.

After not having looked at this in close to 2 years, I thought I'd
give it another look. I haven't found any user-facing issues, and I
like seeing so few locks in pg_locks. With pruning disabled, the
fallback works; pruning-aware locking works via SPI through plpgsql;
and running ALTER between executions and invalidating indexes both
force replans. It's all looking good.

However, I think there might be a bug in patch 0001. I'd appreciate
someone checking my reasoning, because I'm not fully confident I've
been diligent enough.

When PortalStart() opens a SELECT cursor that's backed by a cached
plan, it does roughly the following. It builds a queryDesc (an
executor-side struct), one of whose fields is a pointer into the plan
tree inside the portal's cached plan. Then it calls
PortalLockCachedPlan() to acquire the necessary locks, and finally
hands the queryDesc over to the executor.

My worry is about what happens if the cached plan turns out to be
stale, for instance because someone ran DDL on a referenced table. In
that case PortalLockCachedPlan() throws the old plan away (via
ReleaseCachedPlan) and fetches a freshly-built replacement, updating
the portal's own pointers to match. But the queryDesc from earlier
isn't touched. Its plan pointer still references the old, now-released
plan. From what I can see, once that old plan's last reference is
dropped its memory can be freed, which would leave the executor
reading from freed memory in the next step.
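
To make the sequence concrete, here is a toy model of it. This is not
PostgreSQL code: ToyPlan, ToyPortal, ToyQueryDesc and the toy_* functions
are hypothetical stand-ins, and the "freed" flag only marks the point
where I am assuming the real plan's memory would become reclaimable.
Nothing is actually freed, so the sketch itself is safe to run.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are PostgreSQL types. */
typedef struct ToyPlan
{
	int			refcount;
	bool		is_valid;
	bool		freed;			/* models "memory may have been reclaimed" */
} ToyPlan;

typedef struct ToyPortal
{
	ToyPlan    *cplan;
} ToyPortal;

typedef struct ToyQueryDesc
{
	ToyPlan    *plan;			/* borrowed pointer, holds no reference */
} ToyQueryDesc;

/* Drop one reference; mark the plan freed when the last one goes. */
static void
toy_release_plan(ToyPlan *plan)
{
	assert(plan->refcount > 0);
	if (--plan->refcount == 0)
		plan->freed = true;
}

/*
 * Models only the replan branch of PortalLockCachedPlan(): if the plan
 * went stale, release it and point the portal at the replacement.
 */
static void
toy_lock_cached_plan(ToyPortal *portal, ToyPlan *fresh)
{
	if (portal->cplan->is_valid)
		return;
	toy_release_plan(portal->cplan);
	fresh->refcount++;
	portal->cplan = fresh;
}

/* Same order as the SELECT case in 0001: build the desc, then lock. */
bool
toy_querydesc_is_dangling(void)
{
	ToyPlan		stale = {1, false, false};	/* invalidated by concurrent DDL */
	ToyPlan		fresh = {0, true, false};
	ToyPortal	portal = {&stale};
	ToyQueryDesc qd = {portal.cplan};		/* built before locking */

	toy_lock_cached_plan(&portal, &fresh);

	/* The portal moved on, but the desc still points at the old plan. */
	return qd.plan != portal.cplan && qd.plan->freed;
}
```

In this model toy_querydesc_is_dangling() returns true: the portal ends up
on the fresh plan while the descriptor keeps a pointer to the released one.
Whether the real memory is reclaimed at that moment is exactly the part I
could not confirm.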

The bit I'm least sure about is whether the old plan's memory really
does get reclaimed straight away when its refcount hits zero. If
something keeps it alive longer then this isn't a bug, or at least not
as bad as I'm making out. I had a look but couldn't convince myself
either way from the code alone. To actually hit this you'd need a
cursor on a cached plan, plus an invalidation arriving in the small
window between the portal being set up and the cursor being opened.
The race condition is brief, and I've not been able to hit it in
testing.

The thing that got me thinking this is real: patch 0004 modifies
PortalLockCachedPlan() so that whenever it replans, it also rebuilds
the queryDesc. That's pretty much the fix I'd expect for this, which
makes me suspect somebody hit it at some point. But 0004 only applies
that fix on the new pruning-aware code path, and it was mentioned in
the thread that 0001 to 0003 might land before 0004. If so, master
would carry the bug in the gap between the two.

I suspect a way to deal with it would be to move the CreateQueryDesc
call in the SELECT case to after PortalLockCachedPlan() returns, which
is what the other portal strategies already seem to do. Alternatively,
you could bring 0004's changes in this area into 0001 and have
PortalLockCachedPlan() always rebuild the queryDesc when it replans.
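
A toy sketch of the first option, assuming the only change is the order
of the two steps. As before, every name here is a hypothetical stand-in
rather than PostgreSQL code, and it models pointer lifetimes only, not
the real PortalStart().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are PostgreSQL types. */
typedef struct ToyPlan
{
	int			refcount;
	bool		is_valid;
	bool		freed;			/* models "memory may have been reclaimed" */
} ToyPlan;

typedef struct ToyPortal
{
	ToyPlan    *cplan;
} ToyPortal;

typedef struct ToyQueryDesc
{
	ToyPlan    *plan;			/* borrowed pointer, holds no reference */
} ToyQueryDesc;

/* Replace a stale plan with the fresh one, dropping the old reference. */
static void
toy_lock_cached_plan(ToyPortal *portal, ToyPlan *fresh)
{
	if (portal->cplan->is_valid)
		return;
	assert(portal->cplan->refcount > 0);
	if (--portal->cplan->refcount == 0)
		portal->cplan->freed = true;
	fresh->refcount++;
	portal->cplan = fresh;
}

/* Reordered: lock (and possibly replan) first, then build the desc. */
bool
toy_lock_then_build_is_safe(void)
{
	ToyPlan		stale = {1, false, false};	/* invalidated by concurrent DDL */
	ToyPlan		fresh = {0, true, false};
	ToyPortal	portal = {&stale};
	ToyQueryDesc qd;

	toy_lock_cached_plan(&portal, &fresh);
	qd.plan = portal.cplan;		/* built from whatever plan survived */

	/* The old plan is gone, and the desc never pointed at it. */
	return stale.freed && qd.plan == &fresh && !qd.plan->freed;
}
```

With this ordering the descriptor is only ever built from the plan the
portal currently holds, so a replan inside the lock step cannot leave it
pointing at a released plan.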

If I've got this wrong and there's some lifetime mechanism I missed
that keeps the old plan's memory alive, then it's a non-issue and I'm
misreading the code. In that case, could you please add comments to
make what is going on clearer?

Regards

Thom







* Re: generic plans and "initial" pruning
@ 2026-05-28 08:13  Amit Langote <[email protected]>
  parent: Thom Brown <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: Amit Langote @ 2026-05-28 08:13 UTC (permalink / raw)
  To: Thom Brown <[email protected]>; +Cc: Chao Li <[email protected]>; Tom Lane <[email protected]>; Tender Wang <[email protected]>; Alexander Lakhin <[email protected]>; Tomas Vondra <[email protected]>; Robert Haas <[email protected]>; Alvaro Herrera <[email protected]>; Andres Freund <[email protected]>; Daniel Gustafsson <[email protected]>; David Rowley <[email protected]>; pgsql-hackers

Hi Thom,

On Wed, May 27, 2026 at 9:03 PM Thom Brown <[email protected]> wrote:
>
> On Sat, 4 Apr 2026 at 13:11, Amit Langote <[email protected]> wrote:
> >
> > Attached is a redesigned version. While working on the previous
> > design, I grew increasingly uncomfortable with CachedPlanPrepData --
> > it was smuggling executor state out of GetCachedPlan() through an
> > out-parameter, which papered over the real problem: GetCachedPlan()
> > was doing too much. The main change in this version is architectural:
> > GetCachedPlan() no longer acquires execution locks. Callers now own
> > that responsibility, which is natural because each call site iterates
> > stmt_list differently and manages execution state in its own way --
> > and it lets them choose between conservative lock-all and
> > pruning-aware locking where appropriate.
> >
> > Non-portal call sites remain on the conservative path for now.
> > _SPI_execute_plan requires care around snapshot setup, which happens
> > after plan fetch rather than before. SQL functions have a different
> > issue: init_execution_state() fetches the plan while postquel_start()
> > handles execution, with execution_state containers in between, making
> > it harder to thread a prepped QueryDesc through. The portal path and
> > EXPLAIN EXECUTE cover the most common
> > prepared-statement-with-partitions workloads; the remaining sites can
> > be converted incrementally.
> >
> > This is now starting to feel closer to what Tom suggested back in
> > January 2023 [1], where he proposed getting rid of
> > AcquireExecutorLocks() inside GetCachedPlan() entirely and pushing
> > lock acquisition out to callers. He noted that "we'd be pushing the
> > responsibility for looping back and re-planning out to fairly
> > high-level calling code" and that "we'd definitely be changing some
> > fundamental APIs." That is the direction I came around to over the
> > last couple of weeks while wrestling with CachedPlanPrepData.  The
> > reverted approach also tried to follow Tom's direction but moved
> > locking into ExecutorStart(), which forced it to handle plan
> > invalidation from inside the executor by mutating the CachedPlan
> > in-place. This version moves locking out to the callers instead, so
> > the executor and plan cache never reach into each other.
> >
> > The series is now four patches:
> >
> > 0001: Move execution lock acquisition out of GetCachedPlan(). Adds
> > AcquireExecutorLocks() as a caller-facing function with validity check
> > and retry. Adds PortalLockCachedPlan() in pquery.c to centralize the
> > portal retry logic. All callers are converted. No behavioral change.
> >
> > 0002: Refactor executor's initial partition pruning setup. Cleanup
> > only, no behavioral change.
> >
> > 0003: Introduce ExecutorPrep() and refactor executor startup. Factors
> > range table init, permission checks, and initial pruning out of
> > InitPlan(). Scaffolding for 0004; all callers still go through the
> > normal ExecutorStart() path.
> >
> > 0004: Use pruning-aware locking for single-statement cached plans.
> > Adds ExecutorPrepAndLock() which locks unprunable relations, runs
> > ExecutorPrep() to determine surviving partitions, then locks only
> > those. Extends PortalLockCachedPlan() with a pruning-aware path for
> > eligible plans. Multi-statement CachedPlans (from rule rewriting)
> > always use conservative locking. In principle, this could be relaxed
> > if the planner can prove that no pruning expression reads state
> > modified by an earlier statement, but that is left for a future patch.
> > Includes regression tests.
> >
> > In case it's not clear, I'm not targeting v19 at this point.  I'd like
> > to get this into v20 CF1 and would welcome review from anyone
> > interested.
>
> After not having looked at this in close to 2 years, I thought I'd
> give it another look.

Thanks for taking a look.

> Not found any user-facing issues, and I'm liking
> seeing so few locks in pg_locks. I can see that with pruning disabled,
> the fallback works, pruning-aware locking is working via SPI through
> plpgsql, running ALTER between executions and also invalidating
> indexes force replans, and it's looking good.
>
> But I also think there might be a bug in patch 0001, but I'd
> appreciate checking my reasoning because I'm not fully confident I've
> been diligent enough.
>
> When PortalStart() opens a SELECT cursor that's backed by a cached
> plan, it does roughly the following. It builds a queryDesc (an
> executor-side struct), one of whose fields is a pointer into the plan
> tree inside the portal's cached plan. Then it calls
> PortalLockCachedPlan() to acquire the necessary locks, and finally
> hands the queryDesc over to the executor.
>
> My worry is about what happens if the cached plan turns out to be
> stale, for instance because someone ran DDL on a referenced table. In
> that case PortalLockCachedPlan() throws the old plan away (via
> ReleaseCachedPlan) and fetches a freshly-built replacement, updating
> the portal's own pointers to match. But the queryDesc from earlier
> isn't touched. Its plan pointer still references the old, now-released
> plan. From what I can see, once that old plan's last reference is
> dropped its memory can be freed, which would leave the executor
> reading from freed memory in the next step.
>
> The bit I'm least sure about is whether the old plan's memory really
> does get reclaimed straight away when its refcount hits zero. If
> something keeps it alive longer then this isn't a bug, or at least not
> as bad as I'm making out. I had a look but couldn't convince myself
> either way from the code alone. To actually hit this you'd need a
> cursor on a cached plan, plus an invalidation arriving in the small
> window between the portal being set up and the cursor being opened.
> The race condition is brief, and I've not been able to hit it in
> testing.
>
> The thing that got me thinking this is real: patch 0004 modifies
> PortalLockCachedPlan() so that whenever it replans, it also rebuilds
> the queryDesc. That's pretty much the fix I'd expect for this, which
> makes me suspect somebody hit it at some point. But 0004 only applies
> that fix on the new pruning-aware code path, and it was mentioned in
> the thread that 0001 to 0003 might land before 0004. If so, master
> would carry the bug in the gap between the two.
>
> I suspect a way to deal with it would be to move the CreateQueryDesc
> call in the SELECT case to after PortalLockCachedPlan() returns, which
> is what the other portal strategies already seem to do. Alternatively,
> you could bring 0004's changes in this area into 0001 and have
> PortalLockCachedPlan() always rebuild the queryDesc when it replans.
>
> If I've got this wrong and there's some lifetime mechanism I missed
> that keeps the old plan's memory alive, then it's a non-issue and I'm
> misreading the code. If I have got it wrong, could you please add
> comments to make what is going on clearer?

It's a real bug.

You're right that if PortalLockCachedPlan() replans, the QueryDesc
created before the call still points at the old PlannedStmt from the
released plan.  And yes, 0004 happens to fix it by rebuilding the
QueryDesc inside PortalLockCachedPlan(), but 0001 through 0003 are
broken on their own.
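
As a rough illustration of the "rebuild the QueryDesc on replan" style
of fix, the toy C model below lets the lock step optionally refresh the
descriptor it finds on the portal. This is a sketch, not the code from
0004: ToyPlan, ToyPortal, ToyQueryDesc, the generation counter and the
rebuild_desc flag are all hypothetical names introduced for the
example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins only; none of these are PostgreSQL types or functions. */
typedef struct ToyPlan
{
	int			generation;	/* 1 = cached plan, 2 = replacement */
	bool		is_valid;
} ToyPlan;

typedef struct ToyQueryDesc
{
	const ToyPlan *plan;	/* plan the executor would run */
} ToyQueryDesc;

typedef struct ToyPortal
{
	ToyPlan		plans[2];
	ToyPlan    *cplan;
	ToyQueryDesc qdesc;		/* built before the lock step, as in the bug */
} ToyPortal;

static void
toy_portal_setup(ToyPortal *portal, bool cached_plan_valid)
{
	portal->plans[0] = (ToyPlan) {.generation = 1, .is_valid = cached_plan_valid};
	portal->plans[1] = (ToyPlan) {.generation = 2, .is_valid = true};
	portal->cplan = &portal->plans[0];
	portal->qdesc.plan = portal->cplan;
}

/*
 * Models the lock step.  Returns true if it had to replan.  With
 * rebuild_desc set, the descriptor is refreshed to the new plan.
 */
static bool
toy_lock_cached_plan(ToyPortal *portal, bool rebuild_desc)
{
	if (portal->cplan->is_valid)
		return false;
	portal->cplan = &portal->plans[1];
	if (rebuild_desc)
		portal->qdesc.plan = portal->cplan;
	return true;
}

/* Generation of the plan the descriptor points at after the lock step. */
static int
toy_generation_seen_by_executor(bool cached_plan_valid, bool rebuild_desc)
{
	ToyPortal	portal;

	toy_portal_setup(&portal, cached_plan_valid);
	(void) toy_lock_cached_plan(&portal, rebuild_desc);
	assert(portal.qdesc.plan != NULL);
	return portal.qdesc.plan->generation;
}
```

Without the rebuild, an invalidated plan leaves the descriptor on
generation 1 even though the portal has moved on to generation 2; with
the rebuild, both agree.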

Attached is an updated set with the fix: CreateQueryDesc now runs
after PortalLockCachedPlan() returns, as you suggested.  That said,
I'll probably focus first on settling the plancache refactoring that
spun off from this thread [1], and then start a new thread for the
pruning-aware locking work on top of it, incorporating parts of this
series.

-- 
Thanks, Amit Langote

[1] https://www.postgresql.org/message-id/CA%2BHiwqE1ntHy2h9zJ9v3MwAkoGAveSERcHWkDTTZnP0kxWqbKQ%40mail.g...


Attachments:

  [application/octet-stream] v12-0001-Move-execution-lock-acquisition-out-of-GetCached.patch (16.2K, 2-v12-0001-Move-execution-lock-acquisition-out-of-GetCached.patch)
From a3214580f2ce1983a111af07ccb092ba03c812c8 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Sat, 4 Apr 2026 18:38:34 +0900
Subject: [PATCH v12 1/4] Move execution lock acquisition out of
 GetCachedPlan()

GetCachedPlan() previously acquired execution locks on all plan
relations as part of cached plan validation.  Move this
responsibility to callers, making GetCachedPlan() return a valid
plan without holding execution locks.

Add AcquireExecutorLocks() as the caller-facing function: it locks
all relations in the plan, checks that the plan is still valid
afterward, and returns false if it was invalidated so the caller
can retry with a fresh plan.

For portal-backed callers, add PortalLockCachedPlan() in pquery.c
which wraps the lock-check-retry loop and handles the case where
replanning changes the portal strategy.  Store the CachedPlanSource
pointer in PortalData so retry can call GetCachedPlan() without
the caller threading it through.

Adjust all non-portal GetCachedPlan() callers (SPI, EXPLAIN
EXECUTE, SQL functions) to call AcquireExecutorLocks() explicitly
after fetching the plan.

No behavioral change.  This separates plan retrieval from execution
setup, allowing a later commit to substitute pruning-aware locking
for eligible plans.
---
 src/backend/commands/portalcmds.c   |  1 +
 src/backend/commands/prepare.c      | 14 +++++-
 src/backend/executor/functions.c    | 14 ++++--
 src/backend/executor/spi.c          | 22 +++++++--
 src/backend/tcop/postgres.c         |  2 +
 src/backend/tcop/pquery.c           | 70 ++++++++++++++++++++++++++++-
 src/backend/utils/cache/plancache.c | 44 +++++++++++++-----
 src/backend/utils/mmgr/portalmem.c  |  7 +++
 src/include/utils/plancache.h       |  1 +
 src/include/utils/portal.h          |  3 ++
 10 files changed, 157 insertions(+), 21 deletions(-)

diff --git a/src/backend/commands/portalcmds.c b/src/backend/commands/portalcmds.c
index 01efac3319e..cf5deec4943 100644
--- a/src/backend/commands/portalcmds.c
+++ b/src/backend/commands/portalcmds.c
@@ -118,6 +118,7 @@ PerformCursorOpen(ParseState *pstate, DeclareCursorStmt *cstmt, ParamListInfo pa
 					  queryString,
 					  CMDTAG_SELECT,	/* cursor's query is always a SELECT */
 					  list_make1(plan),
+					  NULL,
 					  NULL);
 
 	/*----------
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 876aad2100a..03d7a98fc58 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -207,6 +207,7 @@ ExecuteQuery(ParseState *pstate,
 					  query_string,
 					  entry->plansource->commandTag,
 					  plan_list,
+					  entry->plansource,
 					  cplan);
 
 	/*
@@ -632,8 +633,17 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	}
 
 	/* Replan if needed, and acquire a transient refcount */
-	cplan = GetCachedPlan(entry->plansource, paramLI,
-						  CurrentResourceOwner, pstate->p_queryEnv);
+	for (;;)
+	{
+		cplan = GetCachedPlan(entry->plansource, paramLI,
+							  CurrentResourceOwner,
+							  pstate->p_queryEnv);
+		plan_list = cplan->stmt_list;
+
+		if (AcquireExecutorLocks(cplan))
+			break;
+		ReleaseCachedPlan(cplan, CurrentResourceOwner);
+	}
 
 	INSTR_TIME_SET_CURRENT(planduration);
 	INSTR_TIME_SUBTRACT(planduration, planstart);
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index 88109348817..2afb814a435 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -654,6 +654,7 @@ static bool
 init_execution_state(SQLFunctionCachePtr fcache)
 {
 	CachedPlanSource *plansource;
+	CachedPlan *cplan;
 	execution_state *preves = NULL;
 	execution_state *lasttages = NULL;
 	int			nstmts;
@@ -696,10 +697,15 @@ init_execution_state(SQLFunctionCachePtr fcache)
 	 * CurrentResourceOwner will be the same when ShutdownSQLFunction runs.)
 	 */
 	fcache->cowner = CurrentResourceOwner;
-	fcache->cplan = GetCachedPlan(plansource,
-								  fcache->paramLI,
-								  fcache->cowner,
-								  NULL);
+	for (;;)
+	{
+		cplan = GetCachedPlan(plansource, fcache->paramLI,
+							  fcache->cowner, NULL);
+		if (AcquireExecutorLocks(cplan))
+			break;
+		ReleaseCachedPlan(cplan, fcache->cowner);
+	}
+	fcache->cplan = cplan;
 
 	/*
 	 * If necessary, make esarray[] bigger to hold the needed state.
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 52f3b11301c..268cd10bde8 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1686,6 +1686,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 					  query_string,
 					  plansource->commandTag,
 					  stmt_list,
+					  plansource,
 					  cplan);
 
 	/*
@@ -2106,6 +2107,16 @@ SPI_plan_get_cached_plan(SPIPlanPtr plan)
 						  _SPI_current->queryEnv);
 	Assert(cplan == plansource->gplan);
 
+	if (!AcquireExecutorLocks(cplan))
+	{
+		/* Plan invalidated during locking; get a fresh one. */
+		ReleaseCachedPlan(cplan,
+						  plan->saved ? CurrentResourceOwner : NULL);
+		cplan = GetCachedPlan(plansource, NULL,
+							  plan->saved ? CurrentResourceOwner : NULL,
+							  _SPI_current->queryEnv);
+	}
+
 	/* Pop the error context stack */
 	error_context_stack = spierrcontext.previous;
 
@@ -2574,9 +2585,14 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 		 * Replan if needed, and increment plan refcount.  If it's a saved
 		 * plan, the refcount must be backed by the plan_owner.
 		 */
-		cplan = GetCachedPlan(plansource, options->params,
-							  plan_owner, _SPI_current->queryEnv);
-
+		for (;;)
+		{
+			cplan = GetCachedPlan(plansource, options->params,
+								  plan_owner, _SPI_current->queryEnv);
+			if (AcquireExecutorLocks(cplan))
+				break;
+			ReleaseCachedPlan(cplan, plan_owner);
+		}
 		stmt_list = cplan->stmt_list;
 
 		/*
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index dbef734a93f..2929f158338 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1243,6 +1243,7 @@ exec_simple_query(const char *query_string)
 						  query_string,
 						  commandTag,
 						  plantree_list,
+						  NULL,
 						  NULL);
 
 		/*
@@ -2042,6 +2043,7 @@ exec_bind_message(StringInfo input_message)
 					  query_string,
 					  psrc->commandTag,
 					  cplan->stmt_list,
+					  psrc,
 					  cplan);
 
 	/* Portal is defined, set the plan ID based on its contents. */
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index ee731000820..4699b53cab7 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -59,6 +59,7 @@ static uint64 DoPortalRunFetch(Portal portal,
 							   long count,
 							   DestReceiver *dest);
 static void DoPortalRewind(Portal portal);
+static bool PortalLockCachedPlan(Portal portal);
 
 
 /*
@@ -463,6 +464,8 @@ PortalStart(Portal portal, ParamListInfo params,
 		 */
 		portal->strategy = ChoosePortalStrategy(portal->stmts);
 
+restart:
+
 		/*
 		 * Fire her up according to the strategy
 		 */
@@ -485,6 +488,21 @@ PortalStart(Portal portal, ParamListInfo params,
 				 * non-default nesting level for the snapshot.
 				 */
 
+				/*
+				 * If the portal is backed by a cached plan, acquire execution
+				 * locks via PortalLockCachedPlan().  If the plan is
+				 * invalidated during locking, it replans and may change the
+				 * portal strategy, requiring us to restart PortalStart().
+				 */
+				if (portal->cplan)
+				{
+					if (PortalLockCachedPlan(portal))
+					{
+						PopActiveSnapshot();
+						goto restart;
+					}
+				}
+
 				/*
 				 * Create QueryDesc in portal's context; for the moment, set
 				 * the destination to DestNone.
@@ -535,6 +553,11 @@ PortalStart(Portal portal, ParamListInfo params,
 
 			case PORTAL_ONE_RETURNING:
 			case PORTAL_ONE_MOD_WITH:
+				if (portal->cplan)
+				{
+					if (PortalLockCachedPlan(portal))
+						goto restart;
+				}
 
 				/*
 				 * We don't start the executor until we are told to run the
@@ -578,7 +601,20 @@ PortalStart(Portal portal, ParamListInfo params,
 				break;
 
 			case PORTAL_MULTI_QUERY:
-				/* Need do nothing now */
+
+				/*
+				 * GetCachedPlan() no longer acquires execution locks, so we
+				 * must do it here.  Multi-statement plans always use
+				 * conservative locking (all partitions locked); pruning-aware
+				 * locking is not feasible because PortalRunMulti() executes
+				 * statements sequentially with CCI between them.
+				 */
+				if (portal->cplan)
+				{
+					if (PortalLockCachedPlan(portal))
+						goto restart;
+				}
+
 				portal->tupDesc = NULL;
 				break;
 		}
@@ -1786,3 +1822,35 @@ EnsurePortalSnapshotExists(void)
 	/* PushActiveSnapshotWithLevel might have copied the snapshot */
 	portal->portalSnapshot = GetActiveSnapshot();
 }
+
+/*
+ * PortalLockCachedPlan
+ *		Acquire execution locks for a cached-plan-backed portal,
+ *		retrying with a fresh plan if the current one is invalidated.
+ *
+ * Returns true if replanning changed portal->strategy, meaning the
+ * caller must redispatch.  Returns false once locks are held.
+ */
+static bool
+PortalLockCachedPlan(Portal portal)
+{
+	PortalStrategy start_strategy = portal->strategy;
+
+	if (AcquireExecutorLocks(portal->cplan))
+		return false;
+
+	/* Replan.  Locks will be taken freshly. */
+	ReleaseCachedPlan(portal->cplan, portal->resowner);
+	portal->cplan = NULL;
+	portal->stmts = NIL;
+	portal->cplan = GetCachedPlan(portal->plansource,
+								  portal->portalParams,
+								  portal->resowner,
+								  portal->queryEnv);
+	portal->stmts = portal->cplan->stmt_list;
+	portal->strategy = ChoosePortalStrategy(portal->stmts);
+	if (portal->strategy != start_strategy)
+		return true;
+
+	return false;
+}
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 698e7c1aa22..f7fe366859c 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -100,7 +100,7 @@ static bool choose_custom_plan(CachedPlanSource *plansource,
 							   ParamListInfo boundParams);
 static double cached_plan_cost(CachedPlan *plan, bool include_planner);
 static Query *QueryListGetPrimaryStmt(List *stmts);
-static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static void AcquireExecutorLocksInt(List *stmt_list, bool acquire);
 static void AcquirePlannerLocks(List *stmt_list, bool acquire);
 static void ScanQueryForLocks(Query *parsetree, bool acquire);
 static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -945,8 +945,9 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
  * Caller must have already called RevalidateCachedQuery to verify that the
  * querytree is up to date.
  *
- * On a "true" return, we have acquired the locks needed to run the plan.
- * (We must do this for the "true" result to be race-condition-free.)
+ * On a "true" return, the generic plan may be reused as a valid cached
+ * plan.  Any execution-time setup, including lock acquisition, is the
+ * caller's responsibility.
  */
 static bool
 CheckCachedPlan(CachedPlanSource *plansource)
@@ -983,8 +984,6 @@ CheckCachedPlan(CachedPlanSource *plansource)
 		 */
 		Assert(plan->refcount > 0);
 
-		AcquireExecutorLocks(plan->stmt_list, true);
-
 		/*
 		 * If plan was transient, check to see if TransactionXmin has
 		 * advanced, and if so invalidate it.
@@ -1003,9 +1002,6 @@ CheckCachedPlan(CachedPlanSource *plansource)
 			/* Successfully revalidated and locked the query. */
 			return true;
 		}
-
-		/* Oops, the race case happened.  Release useless locks. */
-		AcquireExecutorLocks(plan->stmt_list, false);
 	}
 
 	/*
@@ -1282,8 +1278,11 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
  * plan or a custom plan for the given parameters: the caller does not know
  * which it will get.
  *
- * On return, the plan is valid and we have sufficient locks to begin
- * execution.
+ * On return, the plan is valid but no execution locks are held.
+ * The caller must call AcquireExecutorLocks() before executing.
+ * For freshly built plans (custom or new generic), the planner
+ * already holds the needed locks, so AcquireExecutorLocks() is
+ * redundant but harmless.
  *
  * On return, the refcount of the plan has been incremented; a later
  * ReleaseCachedPlan() call is expected.  If "owner" is not NULL then
@@ -1906,9 +1905,11 @@ QueryListGetPrimaryStmt(List *stmts)
 /*
  * AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
  * or release them if acquire is false.
+ *
+ * This locks all relations in a given PlannedStmt's range table.
  */
 static void
-AcquireExecutorLocks(List *stmt_list, bool acquire)
+AcquireExecutorLocksInt(List *stmt_list, bool acquire)
 {
 	ListCell   *lc1;
 
@@ -1955,6 +1956,27 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 	}
 }
 
+/*
+ * AcquireExecutorLocks
+ *		Acquire execution locks on all relations in a cached plan.
+ *
+ * Returns true if the plan is still valid after locking.  Returns
+ * false if the plan was invalidated while locks were being acquired,
+ * in which case the locks have been released and the caller should
+ * discard this plan and retry with a fresh one from GetCachedPlan().
+ */
+bool
+AcquireExecutorLocks(CachedPlan *cplan)
+{
+	AcquireExecutorLocksInt(cplan->stmt_list, true);
+	if (!cplan->is_valid)
+	{
+		AcquireExecutorLocksInt(cplan->stmt_list, false);
+		return false;
+	}
+	return true;
+}
+
 /*
  * AcquirePlannerLocks: acquire locks needed for planning of a querytree list;
  * or release them if acquire is false.
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index 493f9b0ee19..613f3be30b3 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -272,6 +272,10 @@ CreateNewPortal(void)
  * the passed plan trees have adequate lifetime.  Typically this is done by
  * copying them into the portal's context.
  *
+ * If plansource is provided, it is the CachedPlanSource that produced
+ * cplan.  PortalLockCachedPlan() uses it to fetch a fresh plan if the
+ * current one is invalidated during execution lock acquisition.
+ *
  * The caller is also responsible for ensuring that the passed prepStmtName
  * (if not NULL) and sourceText have adequate lifetime.
  *
@@ -286,6 +290,7 @@ PortalDefineQuery(Portal portal,
 				  const char *sourceText,
 				  CommandTag commandTag,
 				  List *stmts,
+				  CachedPlanSource *plansource,
 				  CachedPlan *cplan)
 {
 	Assert(PortalIsValid(portal));
@@ -299,6 +304,7 @@ PortalDefineQuery(Portal portal,
 	portal->commandTag = commandTag;
 	SetQueryCompletion(&portal->qc, commandTag, 0);
 	portal->stmts = stmts;
+	portal->plansource = plansource;
 	portal->cplan = cplan;
 	portal->status = PORTAL_DEFINED;
 }
@@ -517,6 +523,7 @@ PortalDrop(Portal portal, bool isTopCommit)
 
 	/* drop cached plan reference, if any */
 	PortalReleaseCachedPlan(portal);
+	portal->plansource = NULL;
 
 	/*
 	 * If portal has a snapshot protecting its data, release that.  This needs
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 7a4a85c8038..e0fc403e717 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -241,6 +241,7 @@ extern CachedPlan *GetCachedPlan(CachedPlanSource *plansource,
 								 ParamListInfo boundParams,
 								 ResourceOwner owner,
 								 QueryEnvironment *queryEnv);
+extern bool AcquireExecutorLocks(CachedPlan *cplan);
 extern void ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner);
 
 extern bool CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index a7bedb12c18..3af535362cd 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -137,6 +137,8 @@ typedef struct PortalData
 	CommandTag	commandTag;		/* command tag for original query */
 	QueryCompletion qc;			/* command completion data for executed query */
 	List	   *stmts;			/* list of PlannedStmts */
+	CachedPlanSource *plansource;	/* CachedPlanSource, for replanning on
+									 * invalidation */
 	CachedPlan *cplan;			/* CachedPlan, if stmts are from one */
 
 	ParamListInfo portalParams; /* params to pass to query */
@@ -240,6 +242,7 @@ extern void PortalDefineQuery(Portal portal,
 							  const char *sourceText,
 							  CommandTag commandTag,
 							  List *stmts,
+							  CachedPlanSource *plansource,
 							  CachedPlan *cplan);
 extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
 extern void PortalCreateHoldStore(Portal portal);
-- 
2.47.3



  [application/octet-stream] v12-0002-Refactor-executor-s-initial-partition-pruning-se.patch (7.3K, 3-v12-0002-Refactor-executor-s-initial-partition-pruning-se.patch)
From 29e5ad113f6974a94fbcf984b43fa3ed86f57632 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Wed, 25 Mar 2026 16:06:38 +0900
Subject: [PATCH v12 2/4] Refactor executor's initial partition pruning setup

Simplify handling of unpruned relids by moving responsibility
for recording them in EState into CreatePartitionPruneState(),
avoiding the need to pass all_leafpart_rtis as an out parameter.

Also move the setting of ecxt_param_exec_vals from
ExecCreatePartitionPruneState() to InitExecPartitionPruneContexts(),
to allow the former to be called before PARAM_EXEC parameters are
set up.  A later commit needs this when running pruning state setup
outside of InitPlan().

No behavioral change.
---
 src/backend/executor/execPartition.c | 70 +++++++++++++++++++---------
 1 file changed, 48 insertions(+), 22 deletions(-)

diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index d96d4f9947b..2a3af006f77 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -185,8 +185,7 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
 static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
 static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
 static PartitionPruneState *CreatePartitionPruneState(EState *estate,
-													  PartitionPruneInfo *pruneinfo,
-													  Bitmapset **all_leafpart_rtis);
+													  PartitionPruneInfo *pruneinfo);
 static void InitPartitionPruneContext(PartitionPruneContext *context,
 									  List *pruning_steps,
 									  PartitionDesc partdesc,
@@ -1978,7 +1977,7 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  * estate->es_part_prune_infos. For each entry, it creates a PartitionPruneState
  * and adds it to es_part_prune_states.  ExecInitPartitionExecPruning() accesses
  * these states through their corresponding indexes in es_part_prune_states and
- * assign each state to the parent node's PlanState, from where it will be used
+ * assigns each state to the parent node's PlanState, from where it will be used
  * for "exec" pruning.
  *
  * If initial pruning steps exist for a PartitionPruneInfo entry, this function
@@ -1996,29 +1995,31 @@ ExecDoInitialPruning(EState *estate)
 {
 	ListCell   *lc;
 
+	Assert(estate->es_part_prune_results == NULL);
 	foreach(lc, estate->es_part_prune_infos)
 	{
 		PartitionPruneInfo *pruneinfo = lfirst_node(PartitionPruneInfo, lc);
 		PartitionPruneState *prunestate;
 		Bitmapset  *validsubplans = NULL;
-		Bitmapset  *all_leafpart_rtis = NULL;
 		Bitmapset  *validsubplan_rtis = NULL;
 
 		/* Create and save the PartitionPruneState. */
-		prunestate = CreatePartitionPruneState(estate, pruneinfo,
-											   &all_leafpart_rtis);
+		prunestate = CreatePartitionPruneState(estate, pruneinfo);
 		estate->es_part_prune_states = lappend(estate->es_part_prune_states,
 											   prunestate);
 
 		/*
 		 * Perform initial pruning steps, if any, and save the result
-		 * bitmapset or NULL as described in the header comment.
+		 * bitmapset or NULL as described in the header comment.  RT indexes
+		 * of surviving partitions would be added to validsubplan_rtis.
+		 *
+		 * Note that when do_initial_prune is false,
+		 * CreatePartitionPruneState() would have already added the RT indexes
+		 * of all leaf partitions to es_unpruned_relids directly.
 		 */
 		if (prunestate->do_initial_prune)
 			validsubplans = ExecFindMatchingSubPlans(prunestate, true,
 													 &validsubplan_rtis);
-		else
-			validsubplan_rtis = all_leafpart_rtis;
 
 		estate->es_unpruned_relids = bms_add_members(estate->es_unpruned_relids,
 													 validsubplan_rtis);
@@ -2136,14 +2137,12 @@ ExecInitPartitionExecPruning(PlanState *planstate,
  * parent plan node's PlanState.
  *
  * If initial pruning steps are to be skipped (e.g., during EXPLAIN
- * (GENERIC_PLAN)), *all_leafpart_rtis will be populated with the RT indexes of
- * all leaf partitions whose scanning subnode is included in the parent plan
- * node's list of child plans. The caller must add these RT indexes to
- * estate->es_unpruned_relids.
+ * (GENERIC_PLAN)), the RT indexes of all leaf partitions whose scanning
+ * subnode is included in the parent plan node's list of child plans are
+ * added to estate->es_unpruned_relids.
  */
 static PartitionPruneState *
-CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo,
-						  Bitmapset **all_leafpart_rtis)
+CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo)
 {
 	PartitionPruneState *prunestate;
 	int			n_part_hierarchies;
@@ -2377,8 +2376,8 @@ CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo,
 													   pinfo->execparamids);
 
 			/*
-			 * Return all leaf partition indexes if we're skipping pruning in
-			 * the EXPLAIN (GENERIC_PLAN) case.
+			 * Add all leaf partition indexes to es_unpruned_relids if we're
+			 * skipping pruning in the EXPLAIN (GENERIC_PLAN) case.
 			 */
 			if (pinfo->initial_pruning_steps && !prunestate->do_initial_prune)
 			{
@@ -2390,9 +2389,28 @@ CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo,
 					Index		rtindex = pprune->leafpart_rti_map[part_index];
 
 					if (rtindex)
-						*all_leafpart_rtis = bms_add_member(*all_leafpart_rtis,
-															rtindex);
+						estate->es_unpruned_relids =
+							bms_add_member(estate->es_unpruned_relids, rtindex);
+				}
+			}
+			else if (pinfo->initial_pruning_steps == NIL)
+			{
+				/*
+				 * All partitions better be present in es_unpruned_relids when
+				 * none are initially prunable.
+				 */
+#ifdef USE_ASSERT_CHECKING
+				int			part_index = -1;
+
+				while ((part_index = bms_next_member(pprune->present_parts,
+													 part_index)) >= 0)
+				{
+					Index		rtindex = pprune->leafpart_rti_map[part_index];
+
+					if (rtindex)
+						Assert(bms_is_member(rtindex, estate->es_unpruned_relids));
 				}
+#endif
 			}
 
 			j++;
@@ -2490,9 +2508,10 @@ InitPartitionPruneContext(PartitionPruneContext *context,
  *		Initialize exec pruning contexts deferred by CreatePartitionPruneState()
  *
  * This function finalizes exec pruning setup for a PartitionPruneState by
- * initializing contexts for pruning steps that require the parent plan's
- * PlanState. It iterates over PartitionPruningData entries and sets up the
- * necessary execution contexts for pruning during query execution.
+ * initializing contexts for pruning steps that require PARAM_EXEC parameters
+ * and the parent plan's PlanState. It iterates over PartitionPruningData
+ * entries and sets up the necessary execution contexts for pruning during
+ * query execution.
  *
  * Also fix the mapping of partition indexes to subplan indexes contained in
  * prunestate by considering the new list of subplans that survived initial
@@ -2520,9 +2539,16 @@ InitExecPartitionPruneContexts(PartitionPruneState *prunestate,
 	bool		fix_subplan_map = false;
 
 	Assert(prunestate->do_exec_prune);
+	Assert(prunestate->econtext);
 	Assert(parent_plan != NULL);
 	estate = parent_plan->state;
 
+	/*
+	 * These might not be available when CreatePartitionPruneState() is
+	 * called.
+	 */
+	prunestate->econtext->ecxt_param_exec_vals = estate->es_param_exec_vals;
+
 	/*
 	 * No need to fix subplans maps if initial pruning didn't eliminate any
 	 * subplans.
-- 
2.47.3



  [application/octet-stream] v12-0003-Introduce-ExecutorPrep-and-refactor-executor-sta.patch (8.8K, 4-v12-0003-Introduce-ExecutorPrep-and-refactor-executor-sta.patch)
From 05c92346e2bec4c8ec9a7cf45ec572c15d64481f Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Thu, 26 Mar 2026 16:08:46 +0900
Subject: [PATCH v12 3/4] Introduce ExecutorPrep and refactor executor startup

Move permission checks, range table initialization, and initial
partition pruning out of InitPlan() into a new ExecutorPrep()
helper.

ExecutorStart() invokes ExecutorPrep() when QueryDesc->estate is
NULL, keeping current behavior unchanged.  If QueryDesc->estate is
already set, ExecutorStart() reuses it.

This is preparatory refactoring only.  No caller outside the
executor supplies a prebuilt EState in this commit.

In assert builds, verify that the expected relation locks are held
when entering ExecutorStart().
---
 src/backend/executor/README     |  10 ++-
 src/backend/executor/execMain.c | 152 ++++++++++++++++++++++++++------
 src/include/executor/execdesc.h |   2 +-
 3 files changed, 132 insertions(+), 32 deletions(-)
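
Reader's note (not part of the patch): the "prep may already have run, so
start reuses its state" contract described in the commit message can be
sketched in isolation as below.  This is only a toy model under stated
assumptions.  ToyEState, ToyQueryDesc, and the toy_* functions are
hypothetical stand-ins invented for illustration; they are not the
PostgreSQL structs or functions, and the real ExecutorPrep() also does
permission checks, range table setup, and initial pruning.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for EState and QueryDesc; illustration only. */
typedef struct ToyEState
{
	int			prep_runs;		/* how many times prep work was performed */
	int			started;		/* set once toy_executor_start() has run */
} ToyEState;

typedef struct ToyQueryDesc
{
	ToyEState  *estate;			/* NULL until prep has run */
} ToyQueryDesc;

/* Build the state exactly once; calling it twice is a caller bug. */
static void
toy_executor_prep(ToyQueryDesc *qd)
{
	assert(qd != NULL && qd->estate == NULL);
	qd->estate = calloc(1, sizeof(ToyEState));
	assert(qd->estate != NULL);
	qd->estate->prep_runs++;
}

/* Run prep implicitly only when no prebuilt state was supplied. */
static void
toy_executor_start(ToyQueryDesc *qd)
{
	assert(qd != NULL);
	if (qd->estate == NULL)
		toy_executor_prep(qd);
	qd->estate->started = 1;
}

/* Release the state, whether or not start ever ran. */
static void
toy_executor_end(ToyQueryDesc *qd)
{
	free(qd->estate);
	qd->estate = NULL;
}
```

Either call order (start alone, or prep followed by start) leaves prep_runs
at 1, which is the property the refactoring relies on to avoid redoing
pruning.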

diff --git a/src/backend/executor/README b/src/backend/executor/README
index 54f4782f31b..890bc3d9333 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -291,11 +291,17 @@ Query Processing Control Flow
 
 This is a sketch of control flow for full query processing:
 
+    ExecutorPrep
+		May be run before ExecutorStart, or implicitly from ExecutorStart
+		if not done earlier.  Creates the EState in QueryDesc, performs
+		range table initialization, permission checks, and initial
+		partition pruning.
+
 	CreateQueryDesc
 
 	ExecutorStart
-		CreateExecutorState
-			creates per-query context
+		ExecutorPrep (if QueryDesc.estate is NULL)
+			creates EState and per-query context
 		switch to per-query context to run ExecInitNode
 		AfterTriggerBeginQuery
 		ExecInitNode --- recursively scans plan tree
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 4b30f768680..2b9397b72f3 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -57,6 +57,7 @@
 #include "parser/parse_relation.h"
 #include "pgstat.h"
 #include "rewrite/rewriteHandler.h"
+#include "storage/lmgr.h"
 #include "tcop/utility.h"
 #include "utils/acl.h"
 #include "utils/backend_status.h"
@@ -76,6 +77,7 @@ ExecutorEnd_hook_type ExecutorEnd_hook = NULL;
 ExecutorCheckPerms_hook_type ExecutorCheckPerms_hook = NULL;
 
 /* decls for local routines only used within this module */
+static void ExecutorPrep(QueryDesc *queryDesc, ResourceOwner owner, int eflags);
 static void InitPlan(QueryDesc *queryDesc, int eflags);
 static void CheckValidRowMarkRel(Relation rel, RowMarkType markType);
 static void ExecPostprocessPlan(EState *estate);
@@ -147,7 +149,6 @@ standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
 
 	/* sanity checks: queryDesc must not be started already */
 	Assert(queryDesc != NULL);
-	Assert(queryDesc->estate == NULL);
 
 	/* caller must ensure the query's snapshot is active */
 	Assert(GetActiveSnapshot() == queryDesc->snapshot);
@@ -173,9 +174,67 @@ standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
 
 	/*
 	 * Build EState, switch into per-query memory context for startup.
-	 */
-	estate = CreateExecutorState();
-	queryDesc->estate = estate;
+	 *
+	 * If ExecutorPrep() ran earlier (e.g., to do initial pruning during plan
+	 * validity checking), reuse its EState to avoid redoing range table setup
+	 * and pruning. Otherwise, create a fresh EState as usual.
+	 *
+	 * In assert builds, verify that the expected locks are held.  When no
+	 * prep EState was provided, AcquireExecutorLocks() should have locked
+	 * every relation in the plan.  When one was provided, pruning-aware
+	 * locking should have locked at least the unpruned relations.  Both
+	 * checks are skipped in parallel workers, which acquire relation locks
+	 * lazily in ExecGetRangeTableRelation().
+	 */
+	if (queryDesc->estate == NULL)
+	{
+#ifdef USE_ASSERT_CHECKING
+		if (!IsParallelWorker())
+		{
+			ListCell   *lc;
+
+			foreach(lc, queryDesc->plannedstmt->rtable)
+			{
+				RangeTblEntry *rte = lfirst_node(RangeTblEntry, lc);
+
+				if (rte->rtekind == RTE_RELATION ||
+					(rte->rtekind == RTE_SUBQUERY && rte->relid != InvalidOid))
+					Assert(CheckRelationOidLockedByMe(rte->relid,
+													  rte->rellockmode,
+													  true));
+			}
+		}
+#endif
+		ExecutorPrep(queryDesc, CurrentResourceOwner, eflags);
+	}
+#ifdef USE_ASSERT_CHECKING
+	else
+	{
+		/*
+		 * A prep EState was provided, meaning pruning-aware locking should
+		 * have locked at least the unpruned relations.
+		 */
+		if (!IsParallelWorker())
+		{
+			int			rtindex = -1;
+
+			while ((rtindex = bms_next_member(queryDesc->estate->es_unpruned_relids,
+											  rtindex)) >= 0)
+			{
+				RangeTblEntry *rte = exec_rt_fetch(rtindex, queryDesc->estate);
+
+				Assert(rte->rtekind == RTE_RELATION ||
+					   (rte->rtekind == RTE_SUBQUERY &&
+						rte->relid != InvalidOid));
+				Assert(CheckRelationOidLockedByMe(rte->relid,
+												  rte->rellockmode, true));
+			}
+		}
+	}
+#endif
+
+	estate = queryDesc->estate;
+	Assert(estate);
 
 	oldcontext = MemoryContextSwitchTo(estate->es_query_cxt);
 
@@ -274,6 +333,64 @@ standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
 	MemoryContextSwitchTo(oldcontext);
 }
 
+/*
+ * ExecutorPrep
+ *
+ * Build the initial executor state for queryDesc before ExecutorStart().
+ *
+ * This creates the EState and performs the subset of executor startup that
+ * does not require plan-tree initialization, allowing that work to be reused
+ * by callers that need executor state before ExecutorStart():
+ *
+ * - initialize the range table
+ * - perform permission checks
+ * - perform initial partition pruning
+ *
+ * On success, queryDesc->estate is set and can later be reused by
+ * ExecutorStart() instead of rebuilding the same state.
+ *
+ * Caller must ensure that queryDesc->snapshot is active.
+ */
+static void
+ExecutorPrep(QueryDesc *queryDesc, ResourceOwner owner, int eflags)
+{
+	ResourceOwner oldowner;
+	EState	   *estate;
+	PlannedStmt *pstmt;
+
+	Assert(queryDesc != NULL);
+
+	if (queryDesc->operation == CMD_UTILITY)
+		return;
+
+	Assert(ActiveSnapshotSet());
+	Assert(GetActiveSnapshot() == queryDesc->snapshot);
+	Assert(queryDesc->estate == NULL);
+
+	pstmt = queryDesc->plannedstmt;
+
+	estate = CreateExecutorState();
+	queryDesc->estate = estate;
+
+	estate->es_plannedstmt = pstmt;
+	estate->es_part_prune_infos = pstmt->partPruneInfos;
+	estate->es_param_list_info = queryDesc->params;
+	estate->es_queryEnv = queryDesc->queryEnv;
+	estate->es_top_eflags = eflags;
+
+	ExecCheckPermissions(pstmt->rtable, pstmt->permInfos, true);
+
+	ExecInitRangeTable(estate, pstmt->rtable, pstmt->permInfos,
+					   bms_copy(pstmt->unprunableRelids));
+
+	oldowner = CurrentResourceOwner;
+	CurrentResourceOwner = owner;
+
+	ExecDoInitialPruning(estate);
+
+	CurrentResourceOwner = oldowner;
+}
+
 /* ----------------------------------------------------------------
  *		ExecutorRun
  *
@@ -849,37 +966,14 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 	CmdType		operation = queryDesc->operation;
 	PlannedStmt *plannedstmt = queryDesc->plannedstmt;
 	Plan	   *plan = plannedstmt->planTree;
-	List	   *rangeTable = plannedstmt->rtable;
 	EState	   *estate = queryDesc->estate;
 	PlanState  *planstate;
 	TupleDesc	tupType;
 	ListCell   *l;
 	int			i;
 
-	/*
-	 * Do permissions checks
-	 */
-	ExecCheckPermissions(rangeTable, plannedstmt->permInfos, true);
-
-	/*
-	 * initialize the node's execution state
-	 */
-	ExecInitRangeTable(estate, rangeTable, plannedstmt->permInfos,
-					   bms_copy(plannedstmt->unprunableRelids));
-
-	estate->es_plannedstmt = plannedstmt;
-	estate->es_part_prune_infos = plannedstmt->partPruneInfos;
-
-	/*
-	 * Perform runtime "initial" pruning to identify which child subplans,
-	 * corresponding to the children of plan nodes that contain
-	 * PartitionPruneInfo such as Append, will not be executed. The results,
-	 * which are bitmapsets of indexes of the child subplans that will be
-	 * executed, are saved in es_part_prune_results.  These results correspond
-	 * to each PartitionPruneInfo entry, and the es_part_prune_results list is
-	 * parallel to es_part_prune_infos.
-	 */
-	ExecDoInitialPruning(estate);
+	/* ExecutorPrep() must have been done. */
+	Assert(queryDesc->estate);
 
 	/*
 	 * Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index 37c2576e4bc..aea5ec8ea02 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -45,7 +45,7 @@ typedef struct QueryDesc
 	int			query_instr_options;	/* OR of InstrumentOption flags for
 										 * query_instr */
 
-	/* These fields are set by ExecutorStart */
+	/* These fields are set by ExecutorStart or ExecutorPrep */
 	TupleDesc	tupDesc;		/* descriptor for result tuples */
 	EState	   *estate;			/* executor's query-wide state */
 	PlanState  *planstate;		/* tree of per-plan-node state */
-- 
2.47.3



  [application/octet-stream] v12-0004-Use-pruning-aware-locking-for-single-statement-c.patch (40.7K, 5-v12-0004-Use-pruning-aware-locking-for-single-statement-c.patch)
From c68d5de848572defbb58625d915f3323245294d4 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Sat, 4 Apr 2026 20:43:14 +0900
Subject: [PATCH v12 4/4] Use pruning-aware locking for single-statement cached
 plans

For single-statement reused generic plans, perform initial partition
pruning before acquiring execution locks, then lock only the
surviving partitions.

Add ExecutorPrepAndLock() which encapsulates the pruning-aware lock
sequence: lock unprunable relations, call ExecutorPrep() to run
initial pruning, then lock survivors.  Plan validity is checked
after each step; ExecutorPrepCleanup() handles the case where the
plan is invalidated between prep and execution.

Extend PortalLockCachedPlan() to use the pruning-aware path for
eligible plans (single-statement reused generic, non-utility).
All other cases continue using the conservative lock-all path
from the previous commit.

Track firstResultRels in PlannerGlobal and PlannedStmt so they
are locked even if pruned, preserving ExecInitModifyTable()
assumptions about the first result relation being available.

Multi-statement CachedPlans (from rule rewriting) always use
conservative locking, since PortalRunMulti() executes statements
sequentially with CCI between them and later statements' pruning
expressions may depend on earlier ones' effects.  In principle,
this could be relaxed if the planner can prove that no pruning
expression reads state modified by an earlier statement, but that
is left for a future patch.

Regression tests are included to verify:

- Only surviving partitions are locked when pruning is enabled, and
  all partitions are locked when it is disabled (pg_locks inspection).
- Multiple ModifyTable nodes (via writable CTEs) handle the case where
  all target partitions are pruned, exercising firstResultRels.
- Plan invalidation during pruning-aware lock setup (DDL triggered by
  a pruning expression) discards the prep state and replans cleanly.
- Multi-statement CachedPlans (from rule rewriting) fall back to
  locking all partitions, avoiding stale pruning results.

Note for extension authors: code that accesses partition relations
through EState must check that the RT index is a member of
es_unpruned_relids before opening the relation.  Previously this
was an optimization; it is now a correctness requirement, because
pruned partitions may not be locked.
---
 src/backend/commands/explain.c                |  45 +++--
 src/backend/commands/prepare.c                |  30 ++-
 src/backend/executor/execMain.c               | 142 ++++++++++++++
 src/backend/executor/nodeModifyTable.c        |   5 +-
 src/backend/optimizer/plan/planner.c          |   1 +
 src/backend/optimizer/plan/setrefs.c          |  18 ++
 src/backend/tcop/pquery.c                     |  76 ++++++--
 src/backend/utils/cache/plancache.c           |  16 ++
 src/include/commands/explain.h                |   3 +-
 src/include/executor/executor.h               |   4 +
 src/include/nodes/pathnodes.h                 |   3 +
 src/include/nodes/plannodes.h                 |  10 +
 src/include/utils/plancache.h                 |   2 +
 src/test/regress/expected/partition_prune.out | 184 ++++++++++++++++++
 src/test/regress/expected/plancache.out       |  63 ++++++
 src/test/regress/sql/partition_prune.sql      | 116 +++++++++++
 src/test/regress/sql/plancache.sql            |  52 +++++
 17 files changed, 731 insertions(+), 39 deletions(-)
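
Reader's note (not part of the patch): the lock ordering described in the
commit message (lock unprunable relations, prune, lock survivors plus the
first result relations, checking validity after each step and releasing
everything on failure) can be sketched with plain bitmasks.  This is a toy
model under stated assumptions.  ToyPlan, toy_lock_set(), and
toy_prep_and_lock() are hypothetical names invented for illustration, the
pruning result is precomputed rather than evaluated, and a counter array
stands in for the lock manager.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_MAX_RELS 32

/* Hypothetical plan summary: each bit is one range table index. */
typedef struct ToyPlan
{
	uint32_t	unprunable;		/* rels that must always be locked */
	uint32_t	survivors;		/* prunable rels left after initial pruning */
	uint32_t	first_result;	/* first result rel of each ModifyTable */
} ToyPlan;

/* Stand-in for the lock manager: per-relation hold counts. */
static int	toy_lock_count[TOY_MAX_RELS];

/* Acquire (or release) one lock on every relation in the set. */
static void
toy_lock_set(uint32_t relids, bool acquire)
{
	for (int i = 0; i < TOY_MAX_RELS; i++)
	{
		if (relids & (UINT32_C(1) << i))
			toy_lock_count[i] += acquire ? 1 : -1;
	}
}

/*
 * Returns true with all needed locks held, or false with every lock taken
 * here released again.  *is_valid is re-read after each locking step.
 */
static bool
toy_prep_and_lock(const ToyPlan *plan, const bool *is_valid)
{
	uint32_t	late;

	/* Step 1: lock what pruning itself may need to look at. */
	toy_lock_set(plan->unprunable, true);
	if (!*is_valid)
	{
		toy_lock_set(plan->unprunable, false);
		return false;
	}

	/* Step 2: survivors not already locked, plus first result rels. */
	late = (plan->survivors & ~plan->unprunable) | plan->first_result;
	toy_lock_set(late, true);
	if (!*is_valid)
	{
		toy_lock_set(late, false);
		toy_lock_set(plan->unprunable, false);
		return false;
	}

	return true;
}
```

In this model a pruned relation whose bit appears in none of the three sets
is never touched, while a failed attempt leaves the hold counts exactly as
it found them, so the caller can retry with a fresh plan.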

diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 112c17b0d64..c5254f0f920 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -377,7 +377,8 @@ standard_ExplainOneQuery(Query *query, int cursorOptions,
 	/* run it (if needed) and produce output */
 	ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
 				   &planduration, (es->buffers ? &bufusage : NULL),
-				   es->memory ? &mem_counters : NULL);
+				   es->memory ? &mem_counters : NULL,
+				   NULL);
 }
 
 /*
@@ -501,7 +502,8 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
 			   const char *queryString, ParamListInfo params,
 			   QueryEnvironment *queryEnv, const instr_time *planduration,
 			   const BufferUsage *bufusage,
-			   const MemoryContextCounters *mem_counters)
+			   const MemoryContextCounters *mem_counters,
+			   QueryDesc *prep_qd)
 {
 	DestReceiver *dest;
 	QueryDesc  *queryDesc;
@@ -532,13 +534,6 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
 	 */
 	INSTR_TIME_SET_CURRENT(starttime);
 
-	/*
-	 * Use a snapshot with an updated command ID to ensure this query sees
-	 * results of any previously executed queries.
-	 */
-	PushCopiedSnapshot(GetActiveSnapshot());
-	UpdateActiveSnapshotCommandId();
-
 	/*
 	 * We discard the output if we have no use for it.  If we're explaining
 	 * CREATE TABLE AS, we'd better use the appropriate tuple receiver, while
@@ -554,10 +549,34 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
 	else
 		dest = None_Receiver;
 
-	/* Create a QueryDesc for the query */
-	queryDesc = CreateQueryDesc(plannedstmt, queryString,
-								GetActiveSnapshot(), InvalidSnapshot,
-								dest, params, queryEnv, instrument_option);
+	/*
+	 * Create a QueryDesc for the query, or use the one provided by the
+	 * caller.  When reusing a prep QueryDesc, its snapshot was set at
+	 * creation time; we push it as active for ExecutorStart and override the
+	 * destination and instrument options, which were not known when the
+	 * caller created it.
+	 */
+	if (prep_qd)
+	{
+		PushActiveSnapshot(GetActiveSnapshot());
+		queryDesc = prep_qd;
+		Assert(queryDesc->dest == None_Receiver);
+		queryDesc->dest = dest;
+		queryDesc->instrument_options = instrument_option;
+	}
+	else
+	{
+		/*
+		 * Use a snapshot with an updated command ID to ensure this query sees
+		 * results of any previously executed queries.
+		 */
+		PushCopiedSnapshot(GetActiveSnapshot());
+		UpdateActiveSnapshotCommandId();
+		queryDesc = CreateQueryDesc(plannedstmt, queryString,
+									GetActiveSnapshot(), InvalidSnapshot,
+									dest, params, queryEnv,
+									instrument_option);
+	}
 
 	/* Select execution options */
 	if (es->analyze)
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 03d7a98fc58..3bbbc052149 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -588,6 +588,7 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	MemoryContextCounters mem_counters;
 	MemoryContext planner_ctx = NULL;
 	MemoryContext saved_ctx = NULL;
+	QueryDesc  *prep_qd = NULL;
 
 	if (es->memory)
 	{
@@ -640,8 +641,31 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 							  pstate->p_queryEnv);
 		plan_list = cplan->stmt_list;
 
-		if (AcquireExecutorLocks(cplan))
+		if (!CachedPlanCanPrep(cplan, entry->plansource))
+		{
+			if (AcquireExecutorLocks(cplan))
+				break;
+			ReleaseCachedPlan(cplan, CurrentResourceOwner);
+			continue;
+		}
+
+		prep_qd = CreateQueryDesc(linitial_node(PlannedStmt, plan_list),
+								  query_string,
+								  GetActiveSnapshot(),
+								  InvalidSnapshot,
+								  None_Receiver,	/* ExplainOnePlan will fix */
+								  paramLI,
+								  pstate->p_queryEnv,
+								  0 /* ExplainOnePlan will fix */ );
+		if (ExecutorPrepAndLock(prep_qd,
+								CurrentResourceOwner,
+								es->generic ? EXEC_FLAG_EXPLAIN_GENERIC : 0,
+								&cplan->is_valid))
 			break;
+
+		/* Try again. */
+		ExecutorPrepCleanup(prep_qd);
+		FreeQueryDesc(prep_qd);
 		ReleaseCachedPlan(cplan, CurrentResourceOwner);
 	}
 
@@ -664,6 +688,7 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	plan_list = cplan->stmt_list;
 
 	/* Explain each query */
+	Assert(prep_qd == NULL || list_length(plan_list) == 1);
 	foreach(p, plan_list)
 	{
 		PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
@@ -671,7 +696,8 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 		if (pstmt->commandType != CMD_UTILITY)
 			ExplainOnePlan(pstmt, into, es, query_string, paramLI, pstate->p_queryEnv,
 						   &planduration, (es->buffers ? &bufusage : NULL),
-						   es->memory ? &mem_counters : NULL);
+						   es->memory ? &mem_counters : NULL,
+						   prep_qd);
 		else
 			ExplainOneUtility(pstmt->utilityStmt, into, es, pstate, paramLI);
 
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 2b9397b72f3..1e81377cfd8 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -333,6 +333,124 @@ standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
 	MemoryContextSwitchTo(oldcontext);
 }
 
+/*
+ * LockRangeTableRelids
+ * 		Acquire or release locks on the specified relids, which reference
+ * 		entries in the provided range table.
+ *
+ * Helper for AcquireExecutorLocksPrepared() and ExecutorPrepAndLock().
+ */
+static void
+LockRangeTableRelids(List *rtable, Bitmapset *relids, bool acquire)
+{
+	int			rtindex = -1;
+
+	while ((rtindex = bms_next_member(relids, rtindex)) >= 0)
+	{
+		RangeTblEntry *rte = list_nth_node(RangeTblEntry, rtable, rtindex - 1);
+
+		Assert(rte->rtekind == RTE_RELATION ||
+			   (rte->rtekind == RTE_SUBQUERY && OidIsValid(rte->relid)));
+
+		/*
+		 * Acquire the appropriate type of lock on each relation OID. Note
+		 * that we don't actually try to open the rel, and hence will not fail
+		 * if it's been dropped entirely --- we'll just transiently acquire a
+		 * non-conflicting lock.
+		 */
+		if (acquire)
+			LockRelationOid(rte->relid, rte->rellockmode);
+		else
+			UnlockRelationOid(rte->relid, rte->rellockmode);
+	}
+}
+
+/*
+ * AcquireExecutorLocksPrepared
+ *
+ * Acquire or release execution locks using pruning results already computed
+ * by ExecutorPrep() and stored in queryDesc->estate.
+ *
+ * This is intended for single-statement reused generic-plan paths that
+ * choose pruning-aware locking instead of the conservative
+ * AcquireExecutorLocks() path.
+ */
+static void
+AcquireExecutorLocksPrepared(QueryDesc *queryDesc, bool acquire)
+{
+	PlannedStmt *plannedstmt = queryDesc->plannedstmt;
+	EState	   *estate = queryDesc->estate;
+	Bitmapset  *lock_relids;
+	ListCell   *lc;
+
+	Assert(queryDesc != NULL);
+	Assert(estate != NULL);
+	Assert(plannedstmt != NULL);
+	Assert(plannedstmt->commandType != CMD_UTILITY);
+
+	lock_relids = bms_difference(estate->es_unpruned_relids,
+								 plannedstmt->unprunableRelids);
+
+	/*
+	 * Keep the first result relation of each ModifyTable locked even if
+	 * pruning removed all target partitions.  ExecInitModifyTable() relies on
+	 * one such relation remaining available.
+	 */
+	foreach(lc, plannedstmt->firstResultRels)
+	{
+		Index		rti = lfirst_int(lc);
+
+		lock_relids = bms_add_member(lock_relids, rti);
+	}
+
+	LockRangeTableRelids(plannedstmt->rtable, lock_relids, acquire);
+
+	bms_free(lock_relids);
+
+}
+
+/*
+ * ExecutorPrepAndLock
+ *		Perform pruning-aware locking for a single PlannedStmt.
+ *
+ * Locks unprunable relations first, then runs ExecutorPrep() to
+ * determine which partitions survive initial pruning, then locks
+ * only those survivors.  Checks *is_valid after each locking step
+ * to detect plan invalidation (e.g., from concurrent DDL or DDL
+ * triggered by a pruning expression).
+ *
+ * Returns true if the plan is still valid and all needed locks are
+ * held.  Returns false if the plan was invalidated at any point, in
+ * which case all acquired locks have been released and the caller
+ * should discard the QueryDesc and retry with a fresh plan.
+ */
+bool
+ExecutorPrepAndLock(QueryDesc *queryDesc, ResourceOwner owner,
+					int eflags, bool *is_valid)
+{
+	PlannedStmt *pstmt = queryDesc->plannedstmt;
+
+	/* Lock unprunable rels before pruning can access them. */
+	LockRangeTableRelids(pstmt->rtable, pstmt->unprunableRelids, true);
+	if (!*is_valid)
+	{
+		LockRangeTableRelids(pstmt->rtable, pstmt->unprunableRelids, false);
+		return false;
+	}
+
+	/* Run pruning and lock survivors. */
+	ExecutorPrep(queryDesc, owner, eflags);
+	AcquireExecutorLocksPrepared(queryDesc, true);
+	if (!*is_valid)
+	{
+		AcquireExecutorLocksPrepared(queryDesc, false);
+		LockRangeTableRelids(pstmt->rtable, pstmt->unprunableRelids, false);
+		return false;
+	}
+
+	return true;
+}
+
 /*
  * ExecutorPrep
  *
@@ -391,6 +509,30 @@ ExecutorPrep(QueryDesc *queryDesc, ResourceOwner owner, int eflags)
 	CurrentResourceOwner = oldowner;
 }
 
+/*
+ * ExecutorPrepCleanup
+ *		Clean up an EState that was created by ExecutorPrep() but never
+ *		passed to ExecutorStart().  This happens when the plan is
+ *		invalidated between prep and execution, and the caller must
+ *		discard the prepped state before retrying with a fresh plan.
+ *
+ * Unlike ExecutorEnd(), this does not expect a fully initialized
+ * plan state tree -- only the range table relations and the
+ * EState itself need to be freed.
+ */
+void
+ExecutorPrepCleanup(QueryDesc *queryDesc)
+{
+	EState	   *estate = queryDesc->estate;
+
+	if (estate == NULL)
+		return;
+
+	ExecCloseRangeTableRelations(estate);
+	FreeExecutorState(estate);
+	queryDesc->estate = NULL;
+}
+
 /* ----------------------------------------------------------------
  *		ExecutorRun
  *
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 478cb01783c..350096bfbe7 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -5133,8 +5133,8 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
 	 * as a reference for building the ResultRelInfo of the target partition.
 	 * In either case, it doesn't matter which result relation is kept, so we
 	 * just keep the first one, if all others have been pruned.  See also,
-	 * ExecDoInitialPruning(), which ensures that this first result relation
-	 * has been locked.
+	 * AcquireExecutorLocksPrepared(), which ensures that this first result
+	 * relation has been locked.
 	 */
 	i = 0;
 	foreach(l, node->resultRelations)
@@ -5148,6 +5148,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
 			/* all result relations pruned; keep the first one */
 			keep_rel = true;
 			rti = linitial_int(node->resultRelations);
+			Assert(list_member_int(estate->es_plannedstmt->firstResultRels, rti));
 			i = 0;
 		}
 
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index f4689e7c9f8..4cddac7f2fc 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -675,6 +675,7 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 											  glob->prunableRelids);
 	result->permInfos = glob->finalrteperminfos;
 	result->subrtinfos = glob->subrtinfos;
+	result->firstResultRels = glob->firstResultRels;
 	result->appendRelations = glob->appendRelations;
 	result->subplans = glob->subplans;
 	result->rewindPlanIDs = glob->rewindPlanIDs;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index ff0e875f2a2..6ee51f06920 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -384,6 +384,24 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 		}
 	}
 
+	/*
+	 * Record the first result relation if it belongs to the set of initially
+	 * prunable relations.  We use bms_next_member() to get the
+	 * lowest-numbered leaf result rel, which matches
+	 * linitial_int(ModifyTable.resultRelations) because partition expansion
+	 * preserves RT index order.  ExecInitModifyTable() asserts that the
+	 * recorded index matches what it actually needs.
+	 */
+	if (root->leaf_result_relids)
+	{
+		Index		firstResultRel = bms_next_member(root->leaf_result_relids, -1);
+
+		firstResultRel += rtoffset;
+		if (bms_is_member(firstResultRel, root->glob->prunableRelids))
+			root->glob->firstResultRels =
+				lappend_int(root->glob->firstResultRels, firstResultRel);
+	}
+
 	return result;
 }
 
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 4699b53cab7..53c50ab0fce 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -59,7 +59,9 @@ static uint64 DoPortalRunFetch(Portal portal,
 							   long count,
 							   DestReceiver *dest);
 static void DoPortalRewind(Portal portal);
-static bool PortalLockCachedPlan(Portal portal);
+static bool PortalLockCachedPlan(Portal portal, bool do_prep,
+								 ParamListInfo params,
+								 QueryDesc **queryDesc_p);
 
 
 /*
@@ -488,21 +490,6 @@ restart:
 				 * non-default nesting level for the snapshot.
 				 */
 
-				/*
-				 * If the portal is backed by a cached plan, acquire execution
-				 * locks via PortalLockCachedPlan().  If the plan is
-				 * invalidated during locking, it replans and may change the
-				 * portal strategy, requiring us to restart PortalStart().
-				 */
-				if (portal->cplan)
-				{
-					if (PortalLockCachedPlan(portal))
-					{
-						PopActiveSnapshot();
-						goto restart;
-					}
-				}
-
 				/*
 				 * Create QueryDesc in portal's context; for the moment, set
 				 * the destination to DestNone.
@@ -516,6 +503,26 @@ restart:
 											portal->queryEnv,
 											0);
 
+				/*
+				 * If the portal is backed by a cached plan, acquire execution
+				 * locks via PortalLockCachedPlan().  For eligible plans
+				 * (single-statement reused generic), this performs
+				 * pruning-aware locking: it runs ExecutorPrep() on the
+				 * QueryDesc to determine which partitions survive initial
+				 * pruning, then locks only those.  If the plan is invalidated
+				 * during this process, it replans and rebuilds the QueryDesc.
+				 * If replanning changes the portal strategy, we must restart
+				 * PortalStart() to redispatch.
+				 */
+				if (portal->cplan)
+				{
+					if (PortalLockCachedPlan(portal, true, params, &queryDesc))
+					{
+						PopActiveSnapshot();
+						goto restart;
+					}
+				}
+
 				/*
 				 * If it's a scrollable cursor, executor needs to support
 				 * REWIND and backwards scan, as well as whatever the caller
@@ -555,7 +562,7 @@ restart:
 			case PORTAL_ONE_MOD_WITH:
 				if (portal->cplan)
 				{
-					if (PortalLockCachedPlan(portal))
+					if (PortalLockCachedPlan(portal, false, NULL, NULL))
 						goto restart;
 				}
 
@@ -611,7 +618,7 @@ restart:
 				 */
 				if (portal->cplan)
 				{
-					if (PortalLockCachedPlan(portal))
+					if (PortalLockCachedPlan(portal, false, NULL, NULL))
 						goto restart;
 				}
 
@@ -1828,15 +1835,32 @@ EnsurePortalSnapshotExists(void)
  *		Acquire execution locks for a cached-plan-backed portal,
  *		retrying with a fresh plan if the current one is invalidated.
  *
+ * If do_prep is true and the plan is eligible (single-statement reused
+ * generic plan), performs pruning-aware locking via ExecutorPrepAndLock()
+ * on *prep_qd, replacing *prep_qd with a fresh QueryDesc if it had to
+ * replan.  Otherwise falls back to locking all relations in the plan.
+ *
  * Returns true if replanning changed portal->strategy, meaning the
- * caller must redispatch.  Returns false once locks are held.
+ * caller must redispatch.  Returns false once locks are held and the
+ * plan is valid for execution.
  */
 static bool
-PortalLockCachedPlan(Portal portal)
+PortalLockCachedPlan(Portal portal, bool do_prep,
+					 ParamListInfo params,
+					 QueryDesc **prep_qd)
 {
 	PortalStrategy start_strategy = portal->strategy;
 
-	if (AcquireExecutorLocks(portal->cplan))
+	if (do_prep && CachedPlanCanPrep(portal->cplan, portal->plansource))
+	{
+		Assert(prep_qd);
+		if (ExecutorPrepAndLock(*prep_qd, portal->resowner, 0,
+								&portal->cplan->is_valid))
+			return false;
+		ExecutorPrepCleanup(*prep_qd);
+		FreeQueryDesc(*prep_qd);
+	}
+	else if (AcquireExecutorLocks(portal->cplan))
 		return false;
 
 	/* Replan.  Locks will be taken freshly. */
@@ -1852,5 +1876,15 @@ PortalLockCachedPlan(Portal portal)
 	if (portal->strategy != start_strategy)
 		return true;
 
+	if (prep_qd)
+	{
+		Assert(list_length(portal->stmts) == 1);
+		*prep_qd = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+								   portal->sourceText,
+								   GetActiveSnapshot(), InvalidSnapshot,
+								   None_Receiver, params,
+								   portal->queryEnv, 0);
+	}
+
 	return false;
 }
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index f7fe366859c..fca2f84081e 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -1977,6 +1977,22 @@ AcquireExecutorLocks(CachedPlan *cplan)
 	return true;
 }
 
+/*
+ * CachedPlanCanPrep
+ *		Check whether a cached plan is eligible for pruning-aware locking
+ *		via ExecutorPrepAndLock().
+ *
+ * Only single-statement reused generic plans with a non-utility command
+ * qualify.
+ */
+bool
+CachedPlanCanPrep(CachedPlan *cplan, CachedPlanSource *plansource)
+{
+	return (cplan == plansource->gplan &&
+			list_length(cplan->stmt_list) == 1 &&
+			linitial_node(PlannedStmt, cplan->stmt_list)->commandType != CMD_UTILITY);
+}
+
 /*
  * AcquirePlannerLocks: acquire locks needed for planning of a querytree list;
  * or release them if acquire is false.
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 472e141bba3..3a03355e6b6 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -69,7 +69,8 @@ extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
 						   ParamListInfo params, QueryEnvironment *queryEnv,
 						   const instr_time *planduration,
 						   const BufferUsage *bufusage,
-						   const MemoryContextCounters *mem_counters);
+						   const MemoryContextCounters *mem_counters,
+						   QueryDesc *prep_qd);
 
 extern void ExplainPrintPlan(ExplainState *es, QueryDesc *queryDesc);
 extern void ExplainPrintTriggers(ExplainState *es,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index 33bbdbfeffb..093be9bd24b 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -21,6 +21,7 @@
 #include "nodes/lockoptions.h"
 #include "nodes/parsenodes.h"
 #include "utils/memutils.h"
+#include "utils/resowner.h"
 
 
 /*
@@ -235,6 +236,9 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
  */
 extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
 extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
+extern bool ExecutorPrepAndLock(QueryDesc *queryDesc, ResourceOwner owner,
+								int eflags, bool *is_valid);
+extern void ExecutorPrepCleanup(QueryDesc *queryDesc);
 extern void ExecutorRun(QueryDesc *queryDesc,
 						ScanDirection direction, uint64 count);
 extern void standard_ExecutorRun(QueryDesc *queryDesc,
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 27a2c6815b7..a5d00633b4b 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -217,6 +217,9 @@ typedef struct PlannerGlobal
 	/* "flat" list of integer RT indexes */
 	List	   *resultRelations;
 
+	/* "flat" list of integer RT indexes (one per ModifyTable node) */
+	List	   *firstResultRels;
+
 	/* "flat" list of AppendRelInfos */
 	List	   *appendRelations;
 
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 14a1dfed2b9..7f6f7cda781 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -120,6 +120,16 @@ typedef struct PlannedStmt
 	/* RT indexes of relations targeted by INSERT/UPDATE/DELETE/MERGE */
 	Bitmapset  *resultRelationRelids;
 
+	/*
+	 * rtable indexes of first target relation in each ModifyTable node in the
+	 * plan for INSERT/UPDATE/DELETE/MERGE.  NIL if resultRelations is NIL.
+	 *
+	 * These are used by AcquireExecutorLocksUnpruned() to ensure that the
+	 * first result rel for each ModifyTable remains locked even if pruned;
+	 * see ExecInitModifyTable() for the executor side assumptions.
+	 */
+	List	   *firstResultRels;
+
 	/* list of AppendRelInfo nodes */
 	List	   *appendRelations;
 
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index e0fc403e717..2941d3a301b 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -254,4 +254,6 @@ extern bool CachedPlanIsSimplyValid(CachedPlanSource *plansource,
 extern CachedExpression *GetCachedExpression(Node *expr);
 extern void FreeCachedExpression(CachedExpression *cexpr);
 
+extern bool CachedPlanCanPrep(CachedPlan *cplan, CachedPlanSource *plansource);
+
 #endif							/* PLANCACHE_H */
diff --git a/src/test/regress/expected/partition_prune.out b/src/test/regress/expected/partition_prune.out
index 849049f9c51..ec73866486e 100644
--- a/src/test/regress/expected/partition_prune.out
+++ b/src/test/regress/expected/partition_prune.out
@@ -4956,3 +4956,187 @@ select * from (select a, b from phv_boolpart) t
 (2 rows)
 
 drop table phv_boolpart;
+--
+-- Verify that pruning-aware locking skips pruned partitions
+-- when reusing a generic cached plan.
+--
+set plan_cache_mode to force_generic_plan;
+create table prunelock_p (a int) partition by list (a);
+create table prunelock_p1 partition of prunelock_p for values in (1);
+create table prunelock_p2 partition of prunelock_p for values in (2);
+create table prunelock_p3 partition of prunelock_p for values in (3);
+prepare prunelock_q (int) as select * from prunelock_p where a = $1;
+-- Force generic plan creation
+explain (costs off) execute prunelock_q(1);
+                  QUERY PLAN                  
+----------------------------------------------
+ Append
+   Subplans Removed: 2
+   ->  Seq Scan on prunelock_p1 prunelock_p_1
+         Filter: (a = $1)
+(4 rows)
+
+-- Execute and check which child partitions are locked
+begin;
+execute prunelock_q(1);
+ a 
+---
+(0 rows)
+
+select c.relname
+  from pg_locks l
+  join pg_class c on c.oid = l.relation
+ where l.pid = pg_backend_pid()
+   and c.relname like 'prunelock_p_'
+ order by c.relname;
+   relname    
+--------------
+ prunelock_p1
+(1 row)
+
+commit;
+deallocate prunelock_q;
+-- Turn pruning off
+set enable_partition_pruning to off;
+prepare prunelock_q (int) as select * from prunelock_p where a = $1;
+-- Force generic plan creation
+explain (costs off) execute prunelock_q(1);
+                  QUERY PLAN                  
+----------------------------------------------
+ Append
+   ->  Seq Scan on prunelock_p1 prunelock_p_1
+         Filter: (a = $1)
+   ->  Seq Scan on prunelock_p2 prunelock_p_2
+         Filter: (a = $1)
+   ->  Seq Scan on prunelock_p3 prunelock_p_3
+         Filter: (a = $1)
+(7 rows)
+
+-- Execute and check which child partitions are locked
+begin;
+execute prunelock_q(1);
+ a 
+---
+(0 rows)
+
+select c.relname
+  from pg_locks l
+  join pg_class c on c.oid = l.relation
+ where l.pid = pg_backend_pid()
+   and c.relname like 'prunelock_p_'
+ order by c.relname;
+   relname    
+--------------
+ prunelock_p1
+ prunelock_p2
+ prunelock_p3
+(3 rows)
+
+commit;
+deallocate prunelock_q;
+reset enable_partition_pruning;
+--
+-- Verify firstResultRels handling with multiple ModifyTable nodes
+-- (writable CTEs) targeting a partitioned table.  When a pruning
+-- parameter matches no partition, all result relations are pruned
+-- and the executor must still find a usable first result relation
+-- for each ModifyTable node.
+--
+prepare prunelock_mt_q (int, int) as
+  with upd1 as (update prunelock_p set a = a),
+       upd2 as (update prunelock_p set a = a where a = $2)
+  update prunelock_p set a = a where a = $1;
+-- Force generic plan creation
+explain (costs off) execute prunelock_mt_q(1, 2);
+                         QUERY PLAN                         
+------------------------------------------------------------
+ Update on prunelock_p
+   Update on prunelock_p1 prunelock_p_1
+   CTE upd1
+     ->  Update on prunelock_p prunelock_p_3
+           Update on prunelock_p1 prunelock_p_4
+           Update on prunelock_p2 prunelock_p_5
+           Update on prunelock_p3 prunelock_p_6
+           ->  Append
+                 ->  Seq Scan on prunelock_p1 prunelock_p_4
+                 ->  Seq Scan on prunelock_p2 prunelock_p_5
+                 ->  Seq Scan on prunelock_p3 prunelock_p_6
+   CTE upd2
+     ->  Update on prunelock_p prunelock_p_7
+           Update on prunelock_p2 prunelock_p_8
+           ->  Append
+                 Subplans Removed: 2
+                 ->  Seq Scan on prunelock_p2 prunelock_p_8
+                       Filter: (a = $2)
+   ->  Append
+         Subplans Removed: 2
+         ->  Seq Scan on prunelock_p1 prunelock_p_1
+               Filter: (a = $1)
+(22 rows)
+
+-- All partitions pruned: value 4 matches no partition, so each
+-- ModifyTable must still initialize correctly with no matching
+-- result relations.
+explain (costs off) execute prunelock_mt_q(4, 5);
+                         QUERY PLAN                         
+------------------------------------------------------------
+ Update on prunelock_p
+   CTE upd1
+     ->  Update on prunelock_p prunelock_p_2
+           Update on prunelock_p1 prunelock_p_3
+           Update on prunelock_p2 prunelock_p_4
+           Update on prunelock_p3 prunelock_p_5
+           ->  Append
+                 ->  Seq Scan on prunelock_p1 prunelock_p_3
+                 ->  Seq Scan on prunelock_p2 prunelock_p_4
+                 ->  Seq Scan on prunelock_p3 prunelock_p_5
+   CTE upd2
+     ->  Update on prunelock_p prunelock_p_6
+           ->  Append
+                 Subplans Removed: 3
+   ->  Append
+         Subplans Removed: 3
+(16 rows)
+
+deallocate prunelock_mt_q;
+drop table prunelock_p;
+--
+-- Verify that pruning-aware locking falls back to locking all
+-- partitions for multi-statement CachedPlans.  Rule rewriting can
+-- expand a single statement into multiple PlannedStmts, and later
+-- statements must not have their pruning evaluated before earlier
+-- ones have executed, since CCI between statements can change what
+-- pruning expressions see.
+--
+create table prune_config (val int);
+insert into prune_config values (1);
+create table multistmt_pt (a int, b int) partition by list (a);
+create table multistmt_pt_1 partition of multistmt_pt for values in (1);
+create table multistmt_pt_2 partition of multistmt_pt for values in (2);
+insert into multistmt_pt values (1, 0), (2, 0);
+create function get_prune_val() returns int as $$
+  select val from prune_config;
+$$ language sql stable;
+create rule config_upd_rule as on update to multistmt_pt
+  do also update prune_config set val = 2;
+set plan_cache_mode to force_generic_plan;
+prepare multi_q as update multistmt_pt set b = b + 1 where a = get_prune_val();
+-- first execute creates the generic plan
+execute multi_q;
+-- reset for the real test
+update prune_config set val = 1;
+update multistmt_pt set b = 0;
+-- second execute reuses the plan; pruning-aware locking kicks in
+execute multi_q;
+select * from multistmt_pt order by a;
+ a | b 
+---+---
+ 1 | 0
+ 2 | 1
+(2 rows)
+
+deallocate multi_q;
+drop rule config_upd_rule on multistmt_pt;
+drop function get_prune_val;
+drop table multistmt_pt, prune_config;
+reset plan_cache_mode;
diff --git a/src/test/regress/expected/plancache.out b/src/test/regress/expected/plancache.out
index d58534ca1cd..54077294dce 100644
--- a/src/test/regress/expected/plancache.out
+++ b/src/test/regress/expected/plancache.out
@@ -402,3 +402,66 @@ select name, generic_plans, custom_plans from pg_prepared_statements
 (1 row)
 
 drop table test_mode;
+-- This exercises the CachedPlanPrepCleanup() path, which must free
+-- the EState created by ExecutorPrep() when the plan is invalidated
+-- before execution begins.  The pruning expression uses a stable SQL
+-- function that calls a volatile plpgsql function.  That function
+-- performs DDL on a partition when a separate "signal" table says to
+-- do so.  The second EXECUTE should replan cleanly after the DDL.
+set plan_cache_mode to force_generic_plan;
+create table inval_during_pruning_p (a int) partition by list (a);
+create table inval_during_pruning_p1 partition of inval_during_pruning_p for values in (1);
+create table inval_during_pruning_p2 partition of inval_during_pruning_p for values in (2);
+insert into inval_during_pruning_p values (1), (2);
+create table inval_during_pruning_signal (create_idx bool not null);
+insert into inval_during_pruning_signal values (false);
+create or replace function invalidate_plancache_func() returns int
+as $$
+declare
+	create_index bool;
+begin
+	-- Perform DDL on a partition if asked to
+	select create_idx into create_index from inval_during_pruning_signal for update;
+	if create_index = true then
+		raise notice 'creating index on partition inval_during_pruning_p1';
+		create index on inval_during_pruning_p1 (a);
+		update inval_during_pruning_signal set create_idx = false;
+	end if;
+	-- value that pruning will match against partition bounds
+	return 1;
+end;
+$$ language plpgsql volatile;
+create or replace function stable_pruning_val() returns int as $$
+	select invalidate_plancache_func();
+$$ language sql stable;
+prepare inval_during_pruning_q as select * from inval_during_pruning_p where a = stable_pruning_val();
+-- Build a generic plan and run pruning once, but don't set the signal
+-- for invalidate_plancache_func() to perform the DDL.
+explain (verbose, costs off) execute inval_during_pruning_q;
+                                QUERY PLAN                                 
+---------------------------------------------------------------------------
+ Append
+   Subplans Removed: 1
+   ->  Seq Scan on public.inval_during_pruning_p1 inval_during_pruning_p_1
+         Output: inval_during_pruning_p_1.a
+         Filter: (inval_during_pruning_p_1.a = stable_pruning_val())
+(5 rows)
+
+-- Reuse the generic plan.  Make invalidate_plancache_func() perform DDL
+-- during this execution, which should force replanning without errors.
+update inval_during_pruning_signal set create_idx = true;
+explain (verbose, costs off) execute inval_during_pruning_q;
+NOTICE:  creating index on partition inval_during_pruning_p1
+                                QUERY PLAN                                 
+---------------------------------------------------------------------------
+ Append
+   Subplans Removed: 1
+   ->  Seq Scan on public.inval_during_pruning_p1 inval_during_pruning_p_1
+         Output: inval_during_pruning_p_1.a
+         Filter: (inval_during_pruning_p_1.a = stable_pruning_val())
+(5 rows)
+
+deallocate inval_during_pruning_q;
+drop table inval_during_pruning_p, inval_during_pruning_signal;
+drop function invalidate_plancache_func, stable_pruning_val;
+reset plan_cache_mode;
diff --git a/src/test/regress/sql/partition_prune.sql b/src/test/regress/sql/partition_prune.sql
index 359a9208056..a98844d14f8 100644
--- a/src/test/regress/sql/partition_prune.sql
+++ b/src/test/regress/sql/partition_prune.sql
@@ -1518,3 +1518,119 @@ select * from (select a, b from phv_boolpart) t
   group by grouping sets (a, b);
 
 drop table phv_boolpart;
+
+--
+-- Verify that pruning-aware locking skips pruned partitions
+-- when reusing a generic cached plan.
+--
+set plan_cache_mode to force_generic_plan;
+
+create table prunelock_p (a int) partition by list (a);
+create table prunelock_p1 partition of prunelock_p for values in (1);
+create table prunelock_p2 partition of prunelock_p for values in (2);
+create table prunelock_p3 partition of prunelock_p for values in (3);
+
+prepare prunelock_q (int) as select * from prunelock_p where a = $1;
+
+-- Force generic plan creation
+explain (costs off) execute prunelock_q(1);
+
+-- Execute and check which child partitions are locked
+begin;
+execute prunelock_q(1);
+
+select c.relname
+  from pg_locks l
+  join pg_class c on c.oid = l.relation
+ where l.pid = pg_backend_pid()
+   and c.relname like 'prunelock_p_'
+ order by c.relname;
+commit;
+
+deallocate prunelock_q;
+
+-- Turn pruning off
+set enable_partition_pruning to off;
+
+prepare prunelock_q (int) as select * from prunelock_p where a = $1;
+
+-- Force generic plan creation
+explain (costs off) execute prunelock_q(1);
+
+-- Execute and check which child partitions are locked
+begin;
+execute prunelock_q(1);
+
+select c.relname
+  from pg_locks l
+  join pg_class c on c.oid = l.relation
+ where l.pid = pg_backend_pid()
+   and c.relname like 'prunelock_p_'
+ order by c.relname;
+commit;
+
+deallocate prunelock_q;
+reset enable_partition_pruning;
+
+--
+-- Verify firstResultRels handling with multiple ModifyTable nodes
+-- (writable CTEs) targeting a partitioned table.  When a pruning
+-- parameter matches no partition, all result relations are pruned
+-- and the executor must still find a usable first result relation
+-- for each ModifyTable node.
+--
+prepare prunelock_mt_q (int, int) as
+  with upd1 as (update prunelock_p set a = a),
+       upd2 as (update prunelock_p set a = a where a = $2)
+  update prunelock_p set a = a where a = $1;
+
+-- Force generic plan creation
+explain (costs off) execute prunelock_mt_q(1, 2);
+
+-- All partitions pruned: value 4 matches no partition, so each
+-- ModifyTable must still initialize correctly with no matching
+-- result relations.
+explain (costs off) execute prunelock_mt_q(4, 5);
+
+deallocate prunelock_mt_q;
+drop table prunelock_p;
+
+--
+-- Verify that pruning-aware locking falls back to locking all
+-- partitions for multi-statement CachedPlans.  Rule rewriting can
+-- expand a single statement into multiple PlannedStmts, and later
+-- statements must not have their pruning evaluated before earlier
+-- ones have executed, since CCI between statements can change what
+-- pruning expressions see.
+--
+create table prune_config (val int);
+insert into prune_config values (1);
+
+create table multistmt_pt (a int, b int) partition by list (a);
+create table multistmt_pt_1 partition of multistmt_pt for values in (1);
+create table multistmt_pt_2 partition of multistmt_pt for values in (2);
+insert into multistmt_pt values (1, 0), (2, 0);
+
+create function get_prune_val() returns int as $$
+  select val from prune_config;
+$$ language sql stable;
+
+create rule config_upd_rule as on update to multistmt_pt
+  do also update prune_config set val = 2;
+
+set plan_cache_mode to force_generic_plan;
+prepare multi_q as update multistmt_pt set b = b + 1 where a = get_prune_val();
+-- first execute creates the generic plan
+execute multi_q;
+-- reset for the real test
+update prune_config set val = 1;
+update multistmt_pt set b = 0;
+-- second execute reuses the plan; pruning-aware locking kicks in
+execute multi_q;
+select * from multistmt_pt order by a;
+
+deallocate multi_q;
+drop rule config_upd_rule on multistmt_pt;
+drop function get_prune_val;
+drop table multistmt_pt, prune_config;
+reset plan_cache_mode;
diff --git a/src/test/regress/sql/plancache.sql b/src/test/regress/sql/plancache.sql
index aed388d03a1..90b6c5f82bf 100644
--- a/src/test/regress/sql/plancache.sql
+++ b/src/test/regress/sql/plancache.sql
@@ -228,3 +228,55 @@ select name, generic_plans, custom_plans from pg_prepared_statements
   where  name = 'test_mode_pp';
 
 drop table test_mode;
+
+-- This exercises the CachedPlanPrepCleanup() path, which must free
+-- the EState created by ExecutorPrep() when the plan is invalidated
+-- before execution begins.  The pruning expression uses a stable SQL
+-- function that calls a volatile plpgsql function.  That function
+-- performs DDL on a partition when a separate "signal" table says to
+-- do so.  The second EXECUTE should replan cleanly after the DDL.
+set plan_cache_mode to force_generic_plan;
+create table inval_during_pruning_p (a int) partition by list (a);
+create table inval_during_pruning_p1 partition of inval_during_pruning_p for values in (1);
+create table inval_during_pruning_p2 partition of inval_during_pruning_p for values in (2);
+insert into inval_during_pruning_p values (1), (2);
+
+create table inval_during_pruning_signal (create_idx bool not null);
+insert into inval_during_pruning_signal values (false);
+create or replace function invalidate_plancache_func() returns int
+as $$
+declare
+	create_index bool;
+begin
+	-- Perform DDL on a partition if asked to
+	select create_idx into create_index from inval_during_pruning_signal for update;
+	if create_index = true then
+		raise notice 'creating index on partition inval_during_pruning_p1';
+		create index on inval_during_pruning_p1 (a);
+		update inval_during_pruning_signal set create_idx = false;
+	end if;
+	-- value that pruning will match against partition bounds
+	return 1;
+end;
+$$ language plpgsql volatile;
+
+create or replace function stable_pruning_val() returns int as $$
+	select invalidate_plancache_func();
+$$ language sql stable;
+
+prepare inval_during_pruning_q as select * from inval_during_pruning_p where a = stable_pruning_val();
+
+-- Build a generic plan and run pruning once, but don't set the signal
+-- for invalidate_plancache_func() to perform the DDL.
+explain (verbose, costs off) execute inval_during_pruning_q;
+
+-- Reuse the generic plan.  Make invalidate_plancache_func() perform DDL
+-- during this execution, which should force replanning without errors.
+update inval_during_pruning_signal set create_idx = true;
+explain (verbose, costs off) execute inval_during_pruning_q;
+
+deallocate inval_during_pruning_q;
+drop table inval_during_pruning_p, inval_during_pruning_signal;
+drop function invalidate_plancache_func, stable_pruning_val;
+
+reset plan_cache_mode;
-- 
2.47.3



^ permalink  raw  reply  [nested|flat] 108+ messages in thread

* Re: generic plans and "initial" pruning
@ 2026-05-28 13:13  Thom Brown <[email protected]>
  parent: Amit Langote <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: Thom Brown @ 2026-05-28 13:13 UTC (permalink / raw)
  To: Amit Langote <[email protected]>; +Cc: Chao Li <[email protected]>; Tom Lane <[email protected]>; Tender Wang <[email protected]>; Alexander Lakhin <[email protected]>; Tomas Vondra <[email protected]>; Robert Haas <[email protected]>; Alvaro Herrera <[email protected]>; Andres Freund <[email protected]>; Daniel Gustafsson <[email protected]>; David Rowley <[email protected]>; pgsql-hackers

On Thu, 28 May 2026 at 09:14, Amit Langote <[email protected]> wrote:
>
> Hi Thom,
>
> On Wed, May 27, 2026 at 9:03 PM Thom Brown <[email protected]> wrote:
> >
> > On Sat, 4 Apr 2026 at 13:11, Amit Langote <[email protected]> wrote:
> > >
> > > Attached is a redesigned version. While working on the previous
> > > design, I grew increasingly uncomfortable with CachedPlanPrepData --
> > > it was smuggling executor state out of GetCachedPlan() through an
> > > out-parameter, which papered over the real problem: GetCachedPlan()
> > > was doing too much. The main change in this version is architectural:
> > > GetCachedPlan() no longer acquires execution locks. Callers now own
> > > that responsibility, which is natural because each call site iterates
> > > stmt_list differently and manages execution state in its own way --
> > > and it lets them choose between conservative lock-all and
> > > pruning-aware locking where appropriate.
> > >
> > > Non-portal call sites remain on the conservative path for now.
> > > _SPI_execute_plan requires care around snapshot setup, which happens
> > > after plan fetch rather than before. SQL functions have a different
> > > issue: init_execution_state() fetches the plan while postquel_start()
> > > handles execution, with execution_state containers in between, making
> > > it harder to thread a prepped QueryDesc through. The portal path and
> > > EXPLAIN EXECUTE cover the most common
> > > prepared-statement-with-partitions workloads; the remaining sites can
> > > be converted incrementally.
> > >
> > > This is now starting to feel closer to what Tom suggested back in
> > > January 2023 [1], where he proposed getting rid of
> > > AcquireExecutorLocks() inside GetCachedPlan() entirely and pushing
> > > lock acquisition out to callers. He noted that "we'd be pushing the
> > > responsibility for looping back and re-planning out to fairly
> > > high-level calling code" and that "we'd definitely be changing some
> > > fundamental APIs." That is the direction I came around to over the
> > > last couple of weeks while wrestling with CachedPlanPrepData.  The
> > > reverted approach also tried to follow Tom's direction but moved
> > > locking into ExecutorStart(), which forced it to handle plan
> > > invalidation from inside the executor by mutating the CachedPlan
> > > in-place. This version moves locking out to the callers instead, so
> > > the executor and plan cache never reach into each other.
> > >
> > > The series is now four patches:
> > >
> > > 0001: Move execution lock acquisition out of GetCachedPlan(). Adds
> > > AcquireExecutorLocks() as a caller-facing function with validity check
> > > and retry. Adds PortalLockCachedPlan() in pquery.c to centralize the
> > > portal retry logic. All callers are converted. No behavioral change.
> > >
> > > 0002: Refactor executor's initial partition pruning setup. Cleanup
> > > only, no behavioral change.
> > >
> > > 0003: Introduce ExecutorPrep() and refactor executor startup. Factors
> > > range table init, permission checks, and initial pruning out of
> > > InitPlan(). Scaffolding for 0004; all callers still go through the
> > > normal ExecutorStart() path.
> > >
> > > 0004: Use pruning-aware locking for single-statement cached plans.
> > > Adds ExecutorPrepAndLock() which locks unprunable relations, runs
> > > ExecutorPrep() to determine surviving partitions, then locks only
> > > those. Extends PortalLockCachedPlan() with a pruning-aware path for
> > > eligible plans. Multi-statement CachedPlans (from rule rewriting)
> > > always use conservative locking. In principle, this could be relaxed
> > > if the planner can prove that no pruning expression reads state
> > > modified by an earlier statement, but that is left for a future patch.
> > > Includes regression tests.
> > >
> > > In case it's not clear, I'm not targeting v19 at this point.  I'd like
> > > to get this into v20 CF1 and would welcome review from anyone
> > > interested.
> >
> > After not having looked at this in close to 2 years, I thought I'd
> > give it another look.
>
> Thanks for taking a look.
>
> > Not found any user-facing issues, and I'm liking
> > seeing so few locks in pg_locks. I can see that with pruning disabled,
> > the fallback works, pruning-aware locking is working via SPI through
> > plpgsql, running ALTER between executions and also invalidating
> > indexes force replans, and it's looking good.
> >
> > But I also think there might be a bug in patch 0001, but I'd
> > appreciate checking my reasoning because I'm not fully confident I've
> > been diligent enough.
> >
> > When PortalStart() opens a SELECT cursor that's backed by a cached
> > plan, it does roughly the following. It builds a queryDesc (an
> > executor-side struct), one of whose fields is a pointer into the plan
> > tree inside the portal's cached plan. Then it calls
> > PortalLockCachedPlan() to acquire the necessary locks, and finally
> > hands the queryDesc over to the executor.
> >
> > My worry is about what happens if the cached plan turns out to be
> > stale, for instance because someone ran DDL on a referenced table. In
> > that case PortalLockCachedPlan() throws the old plan away (via
> > ReleaseCachedPlan) and fetches a freshly-built replacement, updating
> > the portal's own pointers to match. But the queryDesc from earlier
> > isn't touched. Its plan pointer still references the old, now-released
> > plan. From what I can see, once that old plan's last reference is
> > dropped its memory can be freed, which would leave the executor
> > reading from freed memory in the next step.
> >
> > The bit I'm least sure about is whether the old plan's memory really
> > does get reclaimed straight away when its refcount hits zero. If
> > something keeps it alive longer then this isn't a bug, or at least not
> > as bad as I'm making out. I had a look but couldn't convince myself
> > either way from the code alone. To actually hit this you'd need a
> > cursor on a cached plan, plus an invalidation arriving in the small
> > window between the portal being set up and the cursor being opened.
> > The race condition is brief, and I've not been able to hit it in
> > testing.
> >
> > The thing that got me thinking this is real: patch 0004 modifies
> > PortalLockCachedPlan() so that whenever it replans, it also rebuilds
> > the queryDesc. That's pretty much the fix I'd expect for this, which
> > makes me suspect somebody hit it at some point. But 0004 only applies
> > that fix on the new pruning-aware code path, and it was mentioned in
> > the thread that 0001 to 0003 might land before 0004. If so, master
> > would carry the bug in the gap between the two.
> >
> > I suspect a way to deal with it would be to move the CreateQueryDesc
> > call in the SELECT case to after PortalLockCachedPlan() returns, which
> > is what the other portal strategies already seem to do. Alternatively,
> > you could bring 0004's changes in this area into 0001 and have
> > PortalLockCachedPlan() always rebuild the queryDesc when it replans.
> >
> > If I've got this wrong and there's some lifetime mechanism I missed
> > that keeps the old plan's memory alive, then it's a non-issue and I'm
> > misreading the code. If I have got it wrong, could you please add
> > comments to make what is going on clearer?
>
> It's a real bug.
>
> You're right that if PortalLockCachedPlan() replans, the QueryDesc
> created before the call still points at the old PlannedStmt from the
> released plan.  And yes, 0004 happens to fix it by rebuilding the
> QueryDesc inside PortalLockCachedPlan(), but 0001 through 0003 are
> broken on their own.
>
> Attached is an updated set with the fix: CreateQueryDesc now runs
> after PortalLockCachedPlan() returns, as you suggested.  That said,
> I'll probably focus first on settling the plancache refactoring that
> spun off from this thread [1], and then start a new thread for the
> pruning-aware locking work on top of it, incorporating parts of this
> series.

Thanks.

I've done another pass. I see a reference to
AcquireExecutorLocksUnpruned(), but I can't find this function. Is
this supposed to be AcquireExecutorLocksPrepared()?

And also I have a question about the new firstResultRels code.

If I've followed it right, the bit in setrefs.c records the
lowest-numbered RT index from leaf_result_relids as the
per-ModifyTable fallback that's used when all real targets get pruned
away, and the executor side looks it up via
linitial_int(node->resultRelations). For that to work those two have
to pick the same RT index, and the comment justifies it with
"partition expansion preserves RT index order". Where is that
preservation guaranteed?

And with the assertion in ExecInitModifyTable:

Assert(list_member_int(estate->es_plannedstmt->firstResultRels, rti));

With writable CTEs producing more than one ModifyTable node the list
has several entries, so all the assert really checks is that some
recorded entry matches, not that the one recorded for this particular
node matches. If that's correct, then in a case where the wrong entry
happened to line up the right relation wouldn't be locked and nothing
would complain. Is there something that keeps these in order
somewhere?
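
To make the concern concrete, here is a minimal standalone sketch of the
difference between a membership check and a per-node check. It uses a
plain C array in place of the List, and toy_member() and
toy_matches_node() are hypothetical helpers, not PostgreSQL functions.
The RT index values in the usage note are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Toy stand-in for a list_member_int()-style test on a plain array:
 * true if value appears anywhere in the list.
 */
static bool
toy_member(const int *list, size_t n, int value)
{
	for (size_t i = 0; i < n; i++)
	{
		if (list[i] == value)
			return true;
	}
	return false;
}

/*
 * Hypothetical stricter check: true only if the entry recorded at this
 * node's own position in the list equals value.
 */
static bool
toy_matches_node(const int *list, size_t n, size_t node_idx, int value)
{
	return node_idx < n && list[node_idx] == value;
}
```

With a list of {5, 8, 11} and a node at position 0, the value 8 passes
toy_member() but fails toy_matches_node(), which is the gap I'm asking
about. Whether that gap can be reached in practice depends on whether
two ModifyTable nodes can ever share an RT index.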

Thom



^ permalink  raw  reply  [nested|flat] 108+ messages in thread

* Re: generic plans and "initial" pruning
@ 2026-05-29 08:56  Amit Langote <[email protected]>
  parent: Thom Brown <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: Amit Langote @ 2026-05-29 08:56 UTC (permalink / raw)
  To: Thom Brown <[email protected]>; +Cc: Chao Li <[email protected]>; Tom Lane <[email protected]>; Tender Wang <[email protected]>; Alexander Lakhin <[email protected]>; Tomas Vondra <[email protected]>; Robert Haas <[email protected]>; Alvaro Herrera <[email protected]>; Andres Freund <[email protected]>; Daniel Gustafsson <[email protected]>; David Rowley <[email protected]>; pgsql-hackers

On Thu, May 28, 2026 at 10:14 PM Thom Brown <[email protected]> wrote:
> On Thu, 28 May 2026 at 09:14, Amit Langote <[email protected]> wrote:
> > It's a real bug.
> >
> > You're right that if PortalLockCachedPlan() replans, the QueryDesc
> > created before the call still points at the old PlannedStmt from the
> > released plan.  And yes, 0004 happens to fix it by rebuilding the
> > QueryDesc inside PortalLockCachedPlan(), but 0001 through 0003 are
> > broken on their own.
> >
> > Attached is an updated set with the fix: CreateQueryDesc now runs
> > after PortalLockCachedPlan() returns, as you suggested.  That said,
> > I'll probably focus first on settling the plancache refactoring that
> > spun off from this thread [1], and then start a new thread for the
> > pruning-aware locking work on top of it, incorporating parts of this
> > series.
>
> Thanks.
>
> I've done another pass. I see a reference to
> AcquireExecutorLocksUnpruned(), but I can't find this function. Is
> this supposed to be AcquireExecutorLocksPrepared()?

You're right, stale comment. It should say
AcquireExecutorLocksPrepared(). Fixed.

> And also I have a question about the new firstResultRels code.
>
> If I've followed it right, the bit in setrefs.c records the
> lowest-numbered RT index from leaf_result_relids as the
> per-ModifyTable fallback that's used when all real targets get pruned
> away, and the executor side looks it up via
> linitial_int(node->resultRelations). For that to work those two have
> to pick the same RT index, and the comment justifies it with
> "partition expansion preserves RT index order". Where is that
> preservation guaranteed?

The ordering comes from expand_inherited_rtentry(), which adds child
partitions to the range table sequentially in partition bound order.
Since ModifyTable.resultRelations is built from the same expansion,
its first element is the lowest-numbered RT index among the leaf
partitions for that node. That is the same value
bms_next_member(leaf_result_relids, -1) returns from the Bitmapset,
because Bitmapset iteration returns members in ascending order. I've
added a comment in setrefs.c pointing to expand_inherited_rtentry() as
the source of this guarantee.
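
To illustrate the argument (a toy sketch, not code from the patch): the
model below stands in a 64-bit mask for the Bitmapset and a plain int
array for resultRelations. ToySet, toy_add_member(), toy_next_member()
and first_matches_lowest() are made-up names; only the ascending
iteration order and the -2 "no more members" return value are borrowed
from bms_next_member().

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for a Bitmapset: bit i set means RT index i is a member.
 * Members must be in the range 0..63. */
typedef uint64_t ToySet;

static ToySet
toy_add_member(ToySet s, int member)
{
    return s | (UINT64_C(1) << member);
}

/* Smallest member greater than prev, or -2 if there is none. */
static int
toy_next_member(ToySet s, int prev)
{
    for (int i = prev + 1; i < 64; i++)
    {
        if (s & (UINT64_C(1) << i))
            return i;
    }
    return -2;
}

/*
 * Build a set from rels[] and report whether its lowest member equals
 * the first array element.  This holds whenever rels[] is ascending.
 */
static int
first_matches_lowest(const int *rels, int n)
{
    ToySet      s = 0;

    for (int i = 0; i < n; i++)
        s = toy_add_member(s, rels[i]);

    return n > 0 && toy_next_member(s, -1) == rels[0];
}
```

With rels = {5, 6, 7} the check holds; with {6, 5, 7} it does not,
which is why the ascending order produced by the expansion matters.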

> And with the assertion in ExecInitModifyTable:
>
> Assert(list_member_int(estate->es_plannedstmt->firstResultRels, rti));
>
> With writable CTEs producing more than one ModifyTable node the list
> has several entries, so all the assert really checks is that some
> recorded entry matches, not that the one recorded for this particular
> node matches. If that's correct, then in a case where the wrong entry
> happened to line up the right relation wouldn't be locked and nothing
> would complain. Is there something that keeps these in order
> somewhere?

This is a fair observation -- the Assert checks membership in the
global list rather than per-node correspondence. But node A's rti
can't accidentally pass the Assert by matching an entry recorded for
node B. Each ModifyTable node gets its own partition expansion with
distinct RT entries. In a writable CTE like:

  WITH upd1 AS (UPDATE t SET ...),
       upd2 AS (UPDATE t SET ...)
  UPDATE t SET ...

each UPDATE creates a separate set of leaf partition RT entries --
upd1 might get RT indexes 5,6,7, upd2 gets 8,9,10, and the main UPDATE
gets 11,12,13. The global firstResultRels list would be [5, 8, 11].
When ExecInitModifyTable falls back to linitial_int(resultRelations)
for a given node, it finds that node's own entry, because the RT index
sets are disjoint across nodes.
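
The disjointness argument can be sketched with the numbers from the
example above. This is a toy model under the assumption that each
node's RT indexes are disjoint and ascending; toy_position_in() is a
hypothetical helper, not something in the patch. It does the same
membership test as list_member_int() but also reports the position, to
show that a node's first RT index can only match its own entry.

```c
#include <assert.h>

/*
 * Position of rti in first_rels[], or -1 if absent.  With disjoint
 * per-node RT index sets, at most one entry can match.
 */
static int
toy_position_in(const int *first_rels, int n, int rti)
{
    for (int i = 0; i < n; i++)
    {
        if (first_rels[i] == rti)
            return i;
    }
    return -1;
}
```

For firstResultRels = [5, 8, 11], looking up upd2's first index (8)
gives position 1, its own slot, while a non-first index such as 9 is
not found at all.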

That said, it's worth being explicit about what protections exist at
each layer, since this is safety-critical code:

1. AcquireExecutorLocksPrepared(), added by 0004, locks every entry in
firstResultRels unconditionally. So regardless of which rti a
ModifyTable node falls back to, the relation will be locked.

2. ExecGetRangeTableRelation() has two checks when opening a relation.
For non-result relations (isResultRel=false), it checks
es_unpruned_relids and raises an ERROR, even in release builds, if the
relation was pruned. For result relations (isResultRel=true), that
check is intentionally skipped -- it has to be, because at least one
result relation per ModifyTable node must remain openable even when
all partitions are pruned, since executor code paths like ExecMerge()
and ExecInitPartitionInfo() rely on resultRelInfo[0] being initialized
(see commit 28317de723b). The remaining protection for result
relations is Assert(CheckRelationLockedByMe()) inside table_open,
which fires in debug builds.

3. I've tightened ExecInitModifyTable to close this gap: the
all-pruned fallback path now raises an elog(ERROR), in release builds too,
if linitial_int(resultRelations) is not found in firstResultRels,
rather than just an Assert. This gives result relations a
production-visible check comparable to what es_unpruned_relids
provides for scan relations.

So the net effect is that for scan relations, opening a
pruned-and-unlocked relation is caught by an ERROR in production via
es_unpruned_relids. For result relations on the all-pruned fallback
path, it's now also caught by an ERROR in production via the
firstResultRels check in ExecInitModifyTable. The locking in
AcquireExecutorLocksPrepared() ensures the relation is always locked
regardless.
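
As a rough illustration of the check in point 3 (a standalone sketch,
not the actual ExecInitModifyTable code): the function below returns
an error code where the real code would call elog(ERROR). ToyCheck,
toy_list_member_int() and toy_check_fallback() are hypothetical names,
and plain int arrays stand in for the integer lists.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the outcome; the real code would elog(ERROR) instead. */
typedef enum ToyCheck
{
    TOY_OK,
    TOY_ERROR
} ToyCheck;

static bool
toy_list_member_int(const int *list, int n, int value)
{
    for (int i = 0; i < n; i++)
    {
        if (list[i] == value)
            return true;
    }
    return false;
}

/*
 * All-pruned fallback: pick the node's first result relation and
 * verify that it was recorded in first_result_rels[].  *chosen is
 * written only on success.
 */
static ToyCheck
toy_check_fallback(const int *result_rels, int nrels,
                   const int *first_result_rels, int nfirst,
                   int *chosen)
{
    int         rti;

    if (nrels <= 0)
        return TOY_ERROR;

    rti = result_rels[0];
    if (!toy_list_member_int(first_result_rels, nfirst, rti))
        return TOY_ERROR;   /* unconditional, unlike an Assert */

    *chosen = rti;
    return TOY_OK;
}
```

The point of the sketch is only that the membership test runs
unconditionally, so a mismatch is reported in any build type.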

Thanks again for the review.  A close look at these aspects by someone
other than me is very useful.

-- 
Thanks, Amit Langote


Attachments:

  [application/octet-stream] v13-0003-Introduce-ExecutorPrep-and-refactor-executor-sta.patch (8.8K, 2-v13-0003-Introduce-ExecutorPrep-and-refactor-executor-sta.patch)
From 05c92346e2bec4c8ec9a7cf45ec572c15d64481f Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Thu, 26 Mar 2026 16:08:46 +0900
Subject: [PATCH v13 3/4] Introduce ExecutorPrep and refactor executor startup

Move permission checks, range table initialization, and initial
partition pruning out of InitPlan() into a new ExecutorPrep()
helper.

ExecutorStart() invokes ExecutorPrep() when QueryDesc->estate is
NULL, keeping current behavior unchanged.  If QueryDesc->estate is
already set, ExecutorStart() reuses it.

This is preparatory refactoring only.  No caller outside the
executor supplies a prebuilt EState in this commit.

In assert builds, verify that the expected relation locks are held
when entering ExecutorStart().
---
 src/backend/executor/README     |  10 ++-
 src/backend/executor/execMain.c | 152 ++++++++++++++++++++++++++------
 src/include/executor/execdesc.h |   2 +-
 3 files changed, 132 insertions(+), 32 deletions(-)

diff --git a/src/backend/executor/README b/src/backend/executor/README
index 54f4782f31b..890bc3d9333 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -291,11 +291,17 @@ Query Processing Control Flow
 
 This is a sketch of control flow for full query processing:
 
+    ExecutorPrep
+		May be run before ExecutorStart, or implicitly from ExecutorStart
+		if not done earlier.  Creates the EState in QueryDesc, performs
+		range table initialization, permission checks, and initial
+		partition pruning.
+
 	CreateQueryDesc
 
 	ExecutorStart
-		CreateExecutorState
-			creates per-query context
+		ExecutorPrep (if QueryDesc.estate is NULL)
+			creates EState and per-query context
 		switch to per-query context to run ExecInitNode
 		AfterTriggerBeginQuery
 		ExecInitNode --- recursively scans plan tree
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 4b30f768680..2b9397b72f3 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -57,6 +57,7 @@
 #include "parser/parse_relation.h"
 #include "pgstat.h"
 #include "rewrite/rewriteHandler.h"
+#include "storage/lmgr.h"
 #include "tcop/utility.h"
 #include "utils/acl.h"
 #include "utils/backend_status.h"
@@ -76,6 +77,7 @@ ExecutorEnd_hook_type ExecutorEnd_hook = NULL;
 ExecutorCheckPerms_hook_type ExecutorCheckPerms_hook = NULL;
 
 /* decls for local routines only used within this module */
+static void ExecutorPrep(QueryDesc *queryDesc, ResourceOwner owner, int eflags);
 static void InitPlan(QueryDesc *queryDesc, int eflags);
 static void CheckValidRowMarkRel(Relation rel, RowMarkType markType);
 static void ExecPostprocessPlan(EState *estate);
@@ -147,7 +149,6 @@ standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
 
 	/* sanity checks: queryDesc must not be started already */
 	Assert(queryDesc != NULL);
-	Assert(queryDesc->estate == NULL);
 
 	/* caller must ensure the query's snapshot is active */
 	Assert(GetActiveSnapshot() == queryDesc->snapshot);
@@ -173,9 +174,67 @@ standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
 
 	/*
 	 * Build EState, switch into per-query memory context for startup.
-	 */
-	estate = CreateExecutorState();
-	queryDesc->estate = estate;
+	 *
+	 * If ExecutorPrep() ran earlier (e.g., to do initial pruning during plan
+	 * validity checking), reuse its EState to avoid redoing range table setup
+	 * and pruning. Otherwise, create a fresh EState as usual.
+	 *
+	 * In assert builds, verify that the expected locks are held.  When no
+	 * prep EState was provided, AcquireExecutorLocks() should have locked
+	 * every relation in the plan.  When one was provided, pruning-aware
+	 * locking should have locked at least the unpruned relations.  Both
+	 * checks are skipped in parallel workers, which acquire relation locks
+	 * lazily in ExecGetRangeTableRelation().
+	 */
+	if (queryDesc->estate == NULL)
+	{
+#ifdef USE_ASSERT_CHECKING
+		if (!IsParallelWorker())
+		{
+			ListCell   *lc;
+
+			foreach(lc, queryDesc->plannedstmt->rtable)
+			{
+				RangeTblEntry *rte = lfirst_node(RangeTblEntry, lc);
+
+				if (rte->rtekind == RTE_RELATION ||
+					(rte->rtekind == RTE_SUBQUERY && rte->relid != InvalidOid))
+					Assert(CheckRelationOidLockedByMe(rte->relid,
+													  rte->rellockmode,
+													  true));
+			}
+		}
+#endif
+		ExecutorPrep(queryDesc, CurrentResourceOwner, eflags);
+	}
+#ifdef USE_ASSERT_CHECKING
+	else
+	{
+		/*
+		 * A prep EState was provided, meaning pruning-aware locking should
+		 * have locked at least the unpruned relations.
+		 */
+		if (!IsParallelWorker())
+		{
+			int			rtindex = -1;
+
+			while ((rtindex = bms_next_member(queryDesc->estate->es_unpruned_relids,
+											  rtindex)) >= 0)
+			{
+				RangeTblEntry *rte = exec_rt_fetch(rtindex, queryDesc->estate);
+
+				Assert(rte->rtekind == RTE_RELATION ||
+					   (rte->rtekind == RTE_SUBQUERY &&
+						rte->relid != InvalidOid));
+				Assert(CheckRelationOidLockedByMe(rte->relid,
+												  rte->rellockmode, true));
+			}
+		}
+	}
+#endif
+
+	estate = queryDesc->estate;
+	Assert(estate);
 
 	oldcontext = MemoryContextSwitchTo(estate->es_query_cxt);
 
@@ -274,6 +333,64 @@ standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
 	MemoryContextSwitchTo(oldcontext);
 }
 
+/*
+ * ExecutorPrep
+ *
+ * Build the initial executor state for queryDesc before ExecutorStart().
+ *
+ * This creates the EState and performs the subset of executor startup that
+ * does not require plan-tree initialization, allowing that work to be reused
+ * by callers that need executor state before ExecutorStart():
+ *
+ * - initialize the range table
+ * - perform permission checks
+ * - perform initial partition pruning
+ *
+ * On success, queryDesc->estate is set and can later be reused by
+ * ExecutorStart() instead of rebuilding the same state.
+ *
+ * Caller must ensure that queryDesc->snapshot is active.
+ */
+static void
+ExecutorPrep(QueryDesc *queryDesc, ResourceOwner owner, int eflags)
+{
+	ResourceOwner oldowner;
+	EState	   *estate;
+	PlannedStmt *pstmt;
+
+	Assert(queryDesc != NULL);
+
+	if (queryDesc->operation == CMD_UTILITY)
+		return;
+
+	Assert(ActiveSnapshotSet());
+	Assert(GetActiveSnapshot() == queryDesc->snapshot);
+	Assert(queryDesc->estate == NULL);
+
+	pstmt = queryDesc->plannedstmt;
+
+	estate = CreateExecutorState();
+	queryDesc->estate = estate;
+
+	estate->es_plannedstmt = pstmt;
+	estate->es_part_prune_infos = pstmt->partPruneInfos;
+	estate->es_param_list_info = queryDesc->params;
+	estate->es_queryEnv = queryDesc->queryEnv;
+	estate->es_top_eflags = eflags;
+
+	ExecCheckPermissions(pstmt->rtable, pstmt->permInfos, true);
+
+	ExecInitRangeTable(estate, pstmt->rtable, pstmt->permInfos,
+					   bms_copy(pstmt->unprunableRelids));
+
+	oldowner = CurrentResourceOwner;
+	CurrentResourceOwner = owner;
+
+	ExecDoInitialPruning(estate);
+
+	CurrentResourceOwner = oldowner;
+}
+
 /* ----------------------------------------------------------------
  *		ExecutorRun
  *
@@ -849,37 +966,14 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 	CmdType		operation = queryDesc->operation;
 	PlannedStmt *plannedstmt = queryDesc->plannedstmt;
 	Plan	   *plan = plannedstmt->planTree;
-	List	   *rangeTable = plannedstmt->rtable;
 	EState	   *estate = queryDesc->estate;
 	PlanState  *planstate;
 	TupleDesc	tupType;
 	ListCell   *l;
 	int			i;
 
-	/*
-	 * Do permissions checks
-	 */
-	ExecCheckPermissions(rangeTable, plannedstmt->permInfos, true);
-
-	/*
-	 * initialize the node's execution state
-	 */
-	ExecInitRangeTable(estate, rangeTable, plannedstmt->permInfos,
-					   bms_copy(plannedstmt->unprunableRelids));
-
-	estate->es_plannedstmt = plannedstmt;
-	estate->es_part_prune_infos = plannedstmt->partPruneInfos;
-
-	/*
-	 * Perform runtime "initial" pruning to identify which child subplans,
-	 * corresponding to the children of plan nodes that contain
-	 * PartitionPruneInfo such as Append, will not be executed. The results,
-	 * which are bitmapsets of indexes of the child subplans that will be
-	 * executed, are saved in es_part_prune_results.  These results correspond
-	 * to each PartitionPruneInfo entry, and the es_part_prune_results list is
-	 * parallel to es_part_prune_infos.
-	 */
-	ExecDoInitialPruning(estate);
+	/* ExecutorPrep() must have been done. */
+	Assert(queryDesc->estate);
 
 	/*
 	 * Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index 37c2576e4bc..aea5ec8ea02 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -45,7 +45,7 @@ typedef struct QueryDesc
 	int			query_instr_options;	/* OR of InstrumentOption flags for
 										 * query_instr */
 
-	/* These fields are set by ExecutorStart */
+	/* These fields are set by ExecutorStart or ExecutorPrep */
 	TupleDesc	tupDesc;		/* descriptor for result tuples */
 	EState	   *estate;			/* executor's query-wide state */
 	PlanState  *planstate;		/* tree of per-plan-node state */
-- 
2.47.3



  [application/octet-stream] v13-0002-Refactor-executor-s-initial-partition-pruning-se.patch (7.3K, 3-v13-0002-Refactor-executor-s-initial-partition-pruning-se.patch)
From 29e5ad113f6974a94fbcf984b43fa3ed86f57632 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Wed, 25 Mar 2026 16:06:38 +0900
Subject: [PATCH v13 2/4] Refactor executor's initial partition pruning setup

Simplify handling of unpruned relids by moving responsibility
for recording them in EState into CreatePartitionPruneState(),
avoiding the need to pass all_leafpart_rtis as an out parameter.

Also move the setting of ecxt_param_exec_vals from
CreatePartitionPruneState() to InitExecPartitionPruneContexts(),
to allow the former to be called before PARAM_EXEC parameters are
set up.  A later commit needs this when running pruning state setup
outside of InitPlan().

No behavioral change.
---
 src/backend/executor/execPartition.c | 70 +++++++++++++++++++---------
 1 file changed, 48 insertions(+), 22 deletions(-)

diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index d96d4f9947b..2a3af006f77 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -185,8 +185,7 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
 static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
 static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
 static PartitionPruneState *CreatePartitionPruneState(EState *estate,
-													  PartitionPruneInfo *pruneinfo,
-													  Bitmapset **all_leafpart_rtis);
+													  PartitionPruneInfo *pruneinfo);
 static void InitPartitionPruneContext(PartitionPruneContext *context,
 									  List *pruning_steps,
 									  PartitionDesc partdesc,
@@ -1978,7 +1977,7 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  * estate->es_part_prune_infos. For each entry, it creates a PartitionPruneState
  * and adds it to es_part_prune_states.  ExecInitPartitionExecPruning() accesses
  * these states through their corresponding indexes in es_part_prune_states and
- * assign each state to the parent node's PlanState, from where it will be used
+ * assigns each state to the parent node's PlanState, from where it will be used
  * for "exec" pruning.
  *
  * If initial pruning steps exist for a PartitionPruneInfo entry, this function
@@ -1996,29 +1995,31 @@ ExecDoInitialPruning(EState *estate)
 {
 	ListCell   *lc;
 
+	Assert(estate->es_part_prune_results == NULL);
 	foreach(lc, estate->es_part_prune_infos)
 	{
 		PartitionPruneInfo *pruneinfo = lfirst_node(PartitionPruneInfo, lc);
 		PartitionPruneState *prunestate;
 		Bitmapset  *validsubplans = NULL;
-		Bitmapset  *all_leafpart_rtis = NULL;
 		Bitmapset  *validsubplan_rtis = NULL;
 
 		/* Create and save the PartitionPruneState. */
-		prunestate = CreatePartitionPruneState(estate, pruneinfo,
-											   &all_leafpart_rtis);
+		prunestate = CreatePartitionPruneState(estate, pruneinfo);
 		estate->es_part_prune_states = lappend(estate->es_part_prune_states,
 											   prunestate);
 
 		/*
 		 * Perform initial pruning steps, if any, and save the result
-		 * bitmapset or NULL as described in the header comment.
+		 * bitmapset or NULL as described in the header comment.  RT indexes
+		 * of surviving partitions are added to validsubplan_rtis.
+		 *
+		 * Note that when do_initial_prune is false,
+		 * CreatePartitionPruneState() has already added the RT indexes
+		 * of all leaf partitions to es_unpruned_relids directly.
 		 */
 		if (prunestate->do_initial_prune)
 			validsubplans = ExecFindMatchingSubPlans(prunestate, true,
 													 &validsubplan_rtis);
-		else
-			validsubplan_rtis = all_leafpart_rtis;
 
 		estate->es_unpruned_relids = bms_add_members(estate->es_unpruned_relids,
 													 validsubplan_rtis);
@@ -2136,14 +2137,12 @@ ExecInitPartitionExecPruning(PlanState *planstate,
  * parent plan node's PlanState.
  *
  * If initial pruning steps are to be skipped (e.g., during EXPLAIN
- * (GENERIC_PLAN)), *all_leafpart_rtis will be populated with the RT indexes of
- * all leaf partitions whose scanning subnode is included in the parent plan
- * node's list of child plans. The caller must add these RT indexes to
- * estate->es_unpruned_relids.
+ * (GENERIC_PLAN)), the RT indexes of all leaf partitions whose scanning
+ * subnode is included in the parent plan node's list of child plans are
+ * added to estate->es_unpruned_relids.
  */
 static PartitionPruneState *
-CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo,
-						  Bitmapset **all_leafpart_rtis)
+CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo)
 {
 	PartitionPruneState *prunestate;
 	int			n_part_hierarchies;
@@ -2377,8 +2376,8 @@ CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo,
 													   pinfo->execparamids);
 
 			/*
-			 * Return all leaf partition indexes if we're skipping pruning in
-			 * the EXPLAIN (GENERIC_PLAN) case.
+			 * Add all leaf partition indexes to es_unpruned_relids if we're
+			 * skipping pruning in the EXPLAIN (GENERIC_PLAN) case.
 			 */
 			if (pinfo->initial_pruning_steps && !prunestate->do_initial_prune)
 			{
@@ -2390,9 +2389,28 @@ CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo,
 					Index		rtindex = pprune->leafpart_rti_map[part_index];
 
 					if (rtindex)
-						*all_leafpart_rtis = bms_add_member(*all_leafpart_rtis,
-															rtindex);
+						estate->es_unpruned_relids =
+							bms_add_member(estate->es_unpruned_relids, rtindex);
+				}
+			}
+			else if (pinfo->initial_pruning_steps == NIL)
+			{
+				/*
+				 * All partitions better be present in es_unpruned_relids when
+				 * none are initially prunable.
+				 */
+#ifdef USE_ASSERT_CHECKING
+				int			part_index = -1;
+
+				while ((part_index = bms_next_member(pprune->present_parts,
+													 part_index)) >= 0)
+				{
+					Index		rtindex = pprune->leafpart_rti_map[part_index];
+
+					if (rtindex)
+						Assert(bms_is_member(rtindex, estate->es_unpruned_relids));
 				}
+#endif
 			}
 
 			j++;
@@ -2490,9 +2508,10 @@ InitPartitionPruneContext(PartitionPruneContext *context,
  *		Initialize exec pruning contexts deferred by CreatePartitionPruneState()
  *
  * This function finalizes exec pruning setup for a PartitionPruneState by
- * initializing contexts for pruning steps that require the parent plan's
- * PlanState. It iterates over PartitionPruningData entries and sets up the
- * necessary execution contexts for pruning during query execution.
+ * initializing contexts for pruning steps that require PARAM_EXEC parameters
+ * and the parent plan's PlanState. It iterates over PartitionPruningData
+ * entries and sets up the necessary execution contexts for pruning during
+ * query execution.
  *
  * Also fix the mapping of partition indexes to subplan indexes contained in
  * prunestate by considering the new list of subplans that survived initial
@@ -2520,9 +2539,16 @@ InitExecPartitionPruneContexts(PartitionPruneState *prunestate,
 	bool		fix_subplan_map = false;
 
 	Assert(prunestate->do_exec_prune);
+	Assert(prunestate->econtext);
 	Assert(parent_plan != NULL);
 	estate = parent_plan->state;
 
+	/*
+	 * These might not be available when CreatePartitionPruneState() is
+	 * called.
+	 */
+	prunestate->econtext->ecxt_param_exec_vals = estate->es_param_exec_vals;
+
 	/*
 	 * No need to fix subplans maps if initial pruning didn't eliminate any
 	 * subplans.
-- 
2.47.3



  [application/octet-stream] v13-0001-Move-execution-lock-acquisition-out-of-GetCached.patch (16.2K, 4-v13-0001-Move-execution-lock-acquisition-out-of-GetCached.patch)
From a3214580f2ce1983a111af07ccb092ba03c812c8 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Sat, 4 Apr 2026 18:38:34 +0900
Subject: [PATCH v13 1/4] Move execution lock acquisition out of
 GetCachedPlan()

GetCachedPlan() previously acquired execution locks on all plan
relations as part of cached plan validation.  Move this
responsibility to callers, making GetCachedPlan() return a valid
plan without holding execution locks.

Add AcquireExecutorLocks() as the caller-facing function: it locks
all relations in the plan, checks that the plan is still valid
afterward, and returns false if it was invalidated so the caller
can retry with a fresh plan.

For portal-backed callers, add PortalLockCachedPlan() in pquery.c
which wraps the lock-check-retry loop and handles the case where
replanning changes the portal strategy.  Store the CachedPlanSource
pointer in PortalData so retry can call GetCachedPlan() without
the caller threading it through.

Adjust all non-portal GetCachedPlan() callers (SPI, EXPLAIN
EXECUTE, SQL functions) to call AcquireExecutorLocks() explicitly
after fetching the plan.

No behavioral change.  This separates plan retrieval from execution
setup, allowing a later commit to substitute pruning-aware locking
for eligible plans.
---
 src/backend/commands/portalcmds.c   |  1 +
 src/backend/commands/prepare.c      | 14 +++++-
 src/backend/executor/functions.c    | 14 ++++--
 src/backend/executor/spi.c          | 22 +++++++--
 src/backend/tcop/postgres.c         |  2 +
 src/backend/tcop/pquery.c           | 70 ++++++++++++++++++++++++++++-
 src/backend/utils/cache/plancache.c | 44 +++++++++++++-----
 src/backend/utils/mmgr/portalmem.c  |  7 +++
 src/include/utils/plancache.h       |  1 +
 src/include/utils/portal.h          |  3 ++
 10 files changed, 157 insertions(+), 21 deletions(-)

diff --git a/src/backend/commands/portalcmds.c b/src/backend/commands/portalcmds.c
index 01efac3319e..cf5deec4943 100644
--- a/src/backend/commands/portalcmds.c
+++ b/src/backend/commands/portalcmds.c
@@ -118,6 +118,7 @@ PerformCursorOpen(ParseState *pstate, DeclareCursorStmt *cstmt, ParamListInfo pa
 					  queryString,
 					  CMDTAG_SELECT,	/* cursor's query is always a SELECT */
 					  list_make1(plan),
+					  NULL,
 					  NULL);
 
 	/*----------
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 876aad2100a..03d7a98fc58 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -207,6 +207,7 @@ ExecuteQuery(ParseState *pstate,
 					  query_string,
 					  entry->plansource->commandTag,
 					  plan_list,
+					  entry->plansource,
 					  cplan);
 
 	/*
@@ -632,8 +633,17 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	}
 
 	/* Replan if needed, and acquire a transient refcount */
-	cplan = GetCachedPlan(entry->plansource, paramLI,
-						  CurrentResourceOwner, pstate->p_queryEnv);
+	for (;;)
+	{
+		cplan = GetCachedPlan(entry->plansource, paramLI,
+							  CurrentResourceOwner,
+							  pstate->p_queryEnv);
+		plan_list = cplan->stmt_list;
+
+		if (AcquireExecutorLocks(cplan))
+			break;
+		ReleaseCachedPlan(cplan, CurrentResourceOwner);
+	}
 
 	INSTR_TIME_SET_CURRENT(planduration);
 	INSTR_TIME_SUBTRACT(planduration, planstart);
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index 88109348817..2afb814a435 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -654,6 +654,7 @@ static bool
 init_execution_state(SQLFunctionCachePtr fcache)
 {
 	CachedPlanSource *plansource;
+	CachedPlan *cplan;
 	execution_state *preves = NULL;
 	execution_state *lasttages = NULL;
 	int			nstmts;
@@ -696,10 +697,15 @@ init_execution_state(SQLFunctionCachePtr fcache)
 	 * CurrentResourceOwner will be the same when ShutdownSQLFunction runs.)
 	 */
 	fcache->cowner = CurrentResourceOwner;
-	fcache->cplan = GetCachedPlan(plansource,
-								  fcache->paramLI,
-								  fcache->cowner,
-								  NULL);
+	for (;;)
+	{
+		cplan = GetCachedPlan(plansource, fcache->paramLI,
+							  fcache->cowner, NULL);
+		if (AcquireExecutorLocks(cplan))
+			break;
+		ReleaseCachedPlan(cplan, fcache->cowner);
+	}
+	fcache->cplan = cplan;
 
 	/*
 	 * If necessary, make esarray[] bigger to hold the needed state.
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 52f3b11301c..268cd10bde8 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1686,6 +1686,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 					  query_string,
 					  plansource->commandTag,
 					  stmt_list,
+					  plansource,
 					  cplan);
 
 	/*
@@ -2106,6 +2107,16 @@ SPI_plan_get_cached_plan(SPIPlanPtr plan)
 						  _SPI_current->queryEnv);
 	Assert(cplan == plansource->gplan);
 
+	if (!AcquireExecutorLocks(cplan))
+	{
+		/* Plan invalidated during locking; get a fresh one. */
+		ReleaseCachedPlan(cplan,
+						  plan->saved ? CurrentResourceOwner : NULL);
+		cplan = GetCachedPlan(plansource, NULL,
+							  plan->saved ? CurrentResourceOwner : NULL,
+							  _SPI_current->queryEnv);
+	}
+
 	/* Pop the error context stack */
 	error_context_stack = spierrcontext.previous;
 
@@ -2574,9 +2585,14 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 		 * Replan if needed, and increment plan refcount.  If it's a saved
 		 * plan, the refcount must be backed by the plan_owner.
 		 */
-		cplan = GetCachedPlan(plansource, options->params,
-							  plan_owner, _SPI_current->queryEnv);
-
+		for (;;)
+		{
+			cplan = GetCachedPlan(plansource, options->params,
+								  plan_owner, _SPI_current->queryEnv);
+			if (AcquireExecutorLocks(cplan))
+				break;
+			ReleaseCachedPlan(cplan, plan_owner);
+		}
 		stmt_list = cplan->stmt_list;
 
 		/*
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index dbef734a93f..2929f158338 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1243,6 +1243,7 @@ exec_simple_query(const char *query_string)
 						  query_string,
 						  commandTag,
 						  plantree_list,
+						  NULL,
 						  NULL);
 
 		/*
@@ -2042,6 +2043,7 @@ exec_bind_message(StringInfo input_message)
 					  query_string,
 					  psrc->commandTag,
 					  cplan->stmt_list,
+					  psrc,
 					  cplan);
 
 	/* Portal is defined, set the plan ID based on its contents. */
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index ee731000820..4699b53cab7 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -59,6 +59,7 @@ static uint64 DoPortalRunFetch(Portal portal,
 							   long count,
 							   DestReceiver *dest);
 static void DoPortalRewind(Portal portal);
+static bool PortalLockCachedPlan(Portal portal);
 
 
 /*
@@ -463,6 +464,8 @@ PortalStart(Portal portal, ParamListInfo params,
 		 */
 		portal->strategy = ChoosePortalStrategy(portal->stmts);
 
+restart:
+
 		/*
 		 * Fire her up according to the strategy
 		 */
@@ -485,6 +488,21 @@ PortalStart(Portal portal, ParamListInfo params,
 				 * non-default nesting level for the snapshot.
 				 */
 
+				/*
+				 * If the portal is backed by a cached plan, acquire execution
+				 * locks via PortalLockCachedPlan().  If the plan is
+				 * invalidated during locking, it replans and may change the
+				 * portal strategy, requiring us to restart PortalStart().
+				 */
+				if (portal->cplan)
+				{
+					if (PortalLockCachedPlan(portal))
+					{
+						PopActiveSnapshot();
+						goto restart;
+					}
+				}
+
 				/*
 				 * Create QueryDesc in portal's context; for the moment, set
 				 * the destination to DestNone.
@@ -535,6 +553,11 @@ PortalStart(Portal portal, ParamListInfo params,
 
 			case PORTAL_ONE_RETURNING:
 			case PORTAL_ONE_MOD_WITH:
+				if (portal->cplan)
+				{
+					if (PortalLockCachedPlan(portal))
+						goto restart;
+				}
 
 				/*
 				 * We don't start the executor until we are told to run the
@@ -578,7 +601,20 @@ PortalStart(Portal portal, ParamListInfo params,
 				break;
 
 			case PORTAL_MULTI_QUERY:
-				/* Need do nothing now */
+
+				/*
+				 * GetCachedPlan() no longer acquires execution locks, so we
+				 * must do it here.  Multi-statement plans always use
+				 * conservative locking (all partitions locked); pruning-aware
+				 * locking is not feasible because PortalRunMulti() executes
+				 * statements sequentially with CCI between them.
+				 */
+				if (portal->cplan)
+				{
+					if (PortalLockCachedPlan(portal))
+						goto restart;
+				}
+
 				portal->tupDesc = NULL;
 				break;
 		}
@@ -1786,3 +1822,35 @@ EnsurePortalSnapshotExists(void)
 	/* PushActiveSnapshotWithLevel might have copied the snapshot */
 	portal->portalSnapshot = GetActiveSnapshot();
 }
+
+/*
+ * PortalLockCachedPlan
+ *		Acquire execution locks for a cached-plan-backed portal,
+ *		retrying with a fresh plan if the current one is invalidated.
+ *
+ * Returns true if replanning changed portal->strategy, meaning the
+ * caller must redispatch.  Returns false once locks are held.
+ */
+static bool
+PortalLockCachedPlan(Portal portal)
+{
+	PortalStrategy start_strategy = portal->strategy;
+
+	if (AcquireExecutorLocks(portal->cplan))
+		return false;
+
+	/* Replan.  Locks will be taken freshly. */
+	ReleaseCachedPlan(portal->cplan, portal->resowner);
+	portal->cplan = NULL;
+	portal->stmts = NIL;
+	portal->cplan = GetCachedPlan(portal->plansource,
+								  portal->portalParams,
+								  portal->resowner,
+								  portal->queryEnv);
+	portal->stmts = portal->cplan->stmt_list;
+	portal->strategy = ChoosePortalStrategy(portal->stmts);
+	if (portal->strategy != start_strategy)
+		return true;
+
+	return false;
+}
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 698e7c1aa22..f7fe366859c 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -100,7 +100,7 @@ static bool choose_custom_plan(CachedPlanSource *plansource,
 							   ParamListInfo boundParams);
 static double cached_plan_cost(CachedPlan *plan, bool include_planner);
 static Query *QueryListGetPrimaryStmt(List *stmts);
-static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static void AcquireExecutorLocksInt(List *stmt_list, bool acquire);
 static void AcquirePlannerLocks(List *stmt_list, bool acquire);
 static void ScanQueryForLocks(Query *parsetree, bool acquire);
 static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -945,8 +945,9 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
  * Caller must have already called RevalidateCachedQuery to verify that the
  * querytree is up to date.
  *
- * On a "true" return, we have acquired the locks needed to run the plan.
- * (We must do this for the "true" result to be race-condition-free.)
+ * On a "true" return, the generic plan may be reused as a valid cached
+ * plan.  Any execution-time setup, including lock acquisition, is the
+ * caller's responsibility.
  */
 static bool
 CheckCachedPlan(CachedPlanSource *plansource)
@@ -983,8 +984,6 @@ CheckCachedPlan(CachedPlanSource *plansource)
 		 */
 		Assert(plan->refcount > 0);
 
-		AcquireExecutorLocks(plan->stmt_list, true);
-
 		/*
 		 * If plan was transient, check to see if TransactionXmin has
 		 * advanced, and if so invalidate it.
@@ -1003,9 +1002,6 @@ CheckCachedPlan(CachedPlanSource *plansource)
 			/* Successfully revalidated and locked the query. */
 			return true;
 		}
-
-		/* Oops, the race case happened.  Release useless locks. */
-		AcquireExecutorLocks(plan->stmt_list, false);
 	}
 
 	/*
@@ -1282,8 +1278,11 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
  * plan or a custom plan for the given parameters: the caller does not know
  * which it will get.
  *
- * On return, the plan is valid and we have sufficient locks to begin
- * execution.
+ * On return, the plan is valid but no execution locks are held.
+ * The caller must call AcquireExecutorLocks() before executing.
+ * For freshly built plans (custom or new generic), the planner
+ * already holds the needed locks, so AcquireExecutorLocks() is
+ * redundant but harmless.
  *
  * On return, the refcount of the plan has been incremented; a later
  * ReleaseCachedPlan() call is expected.  If "owner" is not NULL then
@@ -1906,9 +1905,11 @@ QueryListGetPrimaryStmt(List *stmts)
 /*
  * AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
  * or release them if acquire is false.
+ *
+ * This locks all relations in a given PlannedStmt's range table.
  */
 static void
-AcquireExecutorLocks(List *stmt_list, bool acquire)
+AcquireExecutorLocksInt(List *stmt_list, bool acquire)
 {
 	ListCell   *lc1;
 
@@ -1955,6 +1956,27 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 	}
 }
 
+/*
+ * AcquireExecutorLocks
+ *		Acquire execution locks on all relations in a cached plan.
+ *
+ * Returns true if the plan is still valid after locking.  Returns
+ * false if the plan was invalidated while locks were being acquired,
+ * in which case the locks have been released and the caller should
+ * discard this plan and retry with a fresh one from GetCachedPlan().
+ */
+bool
+AcquireExecutorLocks(CachedPlan *cplan)
+{
+	AcquireExecutorLocksInt(cplan->stmt_list, true);
+	if (!cplan->is_valid)
+	{
+		AcquireExecutorLocksInt(cplan->stmt_list, false);
+		return false;
+	}
+	return true;
+}
+
 /*
  * AcquirePlannerLocks: acquire locks needed for planning of a querytree list;
  * or release them if acquire is false.
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index 493f9b0ee19..613f3be30b3 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -272,6 +272,10 @@ CreateNewPortal(void)
  * the passed plan trees have adequate lifetime.  Typically this is done by
  * copying them into the portal's context.
  *
+ * If plansource is provided, it is the CachedPlanSource that produced
+ * cplan.  PortalLockCachedPlan() uses it to fetch a fresh plan if the
+ * current one is invalidated during execution lock acquisition.
+ *
  * The caller is also responsible for ensuring that the passed prepStmtName
  * (if not NULL) and sourceText have adequate lifetime.
  *
@@ -286,6 +290,7 @@ PortalDefineQuery(Portal portal,
 				  const char *sourceText,
 				  CommandTag commandTag,
 				  List *stmts,
+				  CachedPlanSource *plansource,
 				  CachedPlan *cplan)
 {
 	Assert(PortalIsValid(portal));
@@ -299,6 +304,7 @@ PortalDefineQuery(Portal portal,
 	portal->commandTag = commandTag;
 	SetQueryCompletion(&portal->qc, commandTag, 0);
 	portal->stmts = stmts;
+	portal->plansource = plansource;
 	portal->cplan = cplan;
 	portal->status = PORTAL_DEFINED;
 }
@@ -517,6 +523,7 @@ PortalDrop(Portal portal, bool isTopCommit)
 
 	/* drop cached plan reference, if any */
 	PortalReleaseCachedPlan(portal);
+	portal->plansource = NULL;
 
 	/*
 	 * If portal has a snapshot protecting its data, release that.  This needs
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 7a4a85c8038..e0fc403e717 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -241,6 +241,7 @@ extern CachedPlan *GetCachedPlan(CachedPlanSource *plansource,
 								 ParamListInfo boundParams,
 								 ResourceOwner owner,
 								 QueryEnvironment *queryEnv);
+extern bool AcquireExecutorLocks(CachedPlan *cplan);
 extern void ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner);
 
 extern bool CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index a7bedb12c18..3af535362cd 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -137,6 +137,8 @@ typedef struct PortalData
 	CommandTag	commandTag;		/* command tag for original query */
 	QueryCompletion qc;			/* command completion data for executed query */
 	List	   *stmts;			/* list of PlannedStmts */
+	CachedPlanSource *plansource;	/* CachedPlanSource, for replanning on
+									 * invalidation */
 	CachedPlan *cplan;			/* CachedPlan, if stmts are from one */
 
 	ParamListInfo portalParams; /* params to pass to query */
@@ -240,6 +242,7 @@ extern void PortalDefineQuery(Portal portal,
 							  const char *sourceText,
 							  CommandTag commandTag,
 							  List *stmts,
+							  CachedPlanSource *plansource,
 							  CachedPlan *cplan);
 extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
 extern void PortalCreateHoldStore(Portal portal);
-- 
2.47.3



  [application/octet-stream] v13-0004-Use-pruning-aware-locking-for-single-statement-c.patch (40.8K, 5-v13-0004-Use-pruning-aware-locking-for-single-statement-c.patch)
From 5785e0903b867f024e4b675783dfd76dc00ee733 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Sat, 4 Apr 2026 20:43:14 +0900
Subject: [PATCH v13 4/4] Use pruning-aware locking for single-statement cached
 plans

For single-statement reused generic plans, perform initial partition
pruning before acquiring execution locks, then lock only the
surviving partitions.

Add ExecutorPrepAndLock(), which encapsulates the pruning-aware lock
sequence: lock unprunable relations, call ExecutorPrep() to run
initial pruning, then lock survivors.  Plan validity is checked
after each step; ExecutorPrepCleanup() handles the case where the
plan is invalidated between prep and execution.
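
The sequence above can be sketched with a toy model.  This is not the
patch's code: ToyLocks, ToyPlan, toy_lock(), toy_unlock() and
toy_prep_and_lock() are hypothetical stand-ins, relation sets are plain
bitmasks, and the pruning result is assumed to be precomputed in
"survivors".  It only illustrates the lock, check, unwind pattern.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for the lock manager: one bit per range-table index. */
typedef struct ToyLocks
{
    uint64_t    held;
} ToyLocks;

/* Toy plan: relation sets as bitmasks, plus a validity flag. */
typedef struct ToyPlan
{
    uint64_t    unprunable;     /* rels that are always locked */
    uint64_t    survivors;      /* prunable rels left after initial pruning */
    bool        is_valid;       /* cleared when an invalidation arrives */
    int         invalidate_at;  /* 0 = never; 1 or 2 = during that lock step */
} ToyPlan;

/* Take locks; model an invalidation arriving while we do so. */
static void
toy_lock(ToyLocks *locks, ToyPlan *plan, uint64_t relids, int step)
{
    locks->held |= relids;
    if (plan->invalidate_at == step)
        plan->is_valid = false;
}

static void
toy_unlock(ToyLocks *locks, uint64_t relids)
{
    locks->held &= ~relids;
}

/* Returns true with all locks held, or false with none held. */
bool
toy_prep_and_lock(ToyLocks *locks, ToyPlan *plan)
{
    assert(plan->is_valid);

    /* Step 1: lock unprunable rels, then recheck validity. */
    toy_lock(locks, plan, plan->unprunable, 1);
    if (!plan->is_valid)
    {
        toy_unlock(locks, plan->unprunable);
        return false;
    }

    /* Step 2: "pruning" is assumed done; lock survivors and recheck. */
    toy_lock(locks, plan, plan->survivors, 2);
    if (!plan->is_valid)
    {
        toy_unlock(locks, plan->survivors);
        toy_unlock(locks, plan->unprunable);
        return false;
    }

    return true;
}
```

On a false return the caller would discard its prepped state and retry
with a fresh plan, which is the role ExecutorPrepCleanup() plays in the
real callers.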

Extend PortalLockCachedPlan() to use the pruning-aware path for
eligible plans (single-statement reused generic, non-utility).
All other cases continue using the conservative lock-all path
from the previous commit.

Track firstResultRels in PlannerGlobal and PlannedStmt so they
are locked even if pruned, preserving ExecInitModifyTable()
assumptions about the first result relation being available.
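
As a rough illustration of which relations the second locking step
covers, here is a self-contained sketch.  RelidMask and
toy_second_step_locks() are hypothetical names, and a 64-bit mask
stands in for a Bitmapset; the real computation lives in
AcquireExecutorLocksPrepared().

```c
#include <assert.h>
#include <stdint.h>

/* Bit n set means range-table index n is a member. */
typedef uint64_t RelidMask;

/*
 * Relations to lock after pruning: the unpruned rels that were not
 * already locked as unprunable, plus the first result rel of each
 * ModifyTable, whether or not it survived pruning.
 */
RelidMask
toy_second_step_locks(RelidMask unpruned, RelidMask unprunable,
                      const int *first_result_rels, int nfirst)
{
    RelidMask   lock_relids = unpruned & ~unprunable;

    for (int i = 0; i < nfirst; i++)
    {
        assert(first_result_rels[i] > 0 && first_result_rels[i] < 64);
        lock_relids |= (RelidMask) 1 << first_result_rels[i];
    }

    return lock_relids;
}
```

For example, with RT index 1 unprunable, index 2 surviving pruning,
and index 3 pruned but listed as a first result rel, the second step
locks indexes 2 and 3.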

Multi-statement CachedPlans (from rule rewriting) always use
conservative locking, since PortalRunMulti() executes statements
sequentially with CCI between them and later statements' pruning
expressions may depend on earlier ones' effects.  In principle,
this could be relaxed if the planner can prove that no pruning
expression reads state modified by an earlier statement, but that
is left for a future patch.

Regression tests are included to verify:

- Only surviving partitions are locked when pruning is enabled, and
  all partitions are locked when it is disabled (pg_locks inspection).
- Multiple ModifyTable nodes (via writable CTEs) handle the case where
  all target partitions are pruned, exercising firstResultRels.
- Plan invalidation during pruning-aware lock setup (DDL triggered by
  a pruning expression) discards the prep state and replans cleanly.
- Multi-statement CachedPlans (from rule rewriting) fall back to
  locking all partitions, avoiding stale pruning results.

Note for extension authors: code that accesses partition relations
through EState must check that the RT index is a member of
es_unpruned_relids before opening the relation.  Previously this
was an optimization; it is now a correctness requirement, because
pruned partitions may not be locked.
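
A minimal sketch of the kind of guard meant here, using a toy model
rather than the real executor types: ToyEState, toy_rel_is_unpruned()
and toy_open_unpruned() are hypothetical, and a 64-bit mask stands in
for the Bitmapset.  In actual extension code the check would
presumably be bms_is_member(rti, estate->es_unpruned_relids).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy EState: bit n set means RT index n survived initial pruning. */
typedef struct ToyEState
{
    uint64_t    es_unpruned_relids;
} ToyEState;

static bool
toy_rel_is_unpruned(const ToyEState *estate, int rti)
{
    assert(rti > 0 && rti < 64);
    return ((estate->es_unpruned_relids >> rti) & 1) != 0;
}

/*
 * Visit candidate RT indexes, skipping any that were pruned and so
 * may not be locked.  Records the indexes that are safe to open in
 * "opened" and returns how many there were.
 */
int
toy_open_unpruned(const ToyEState *estate, const int *rtis, int nrtis,
                  int *opened)
{
    int         nopened = 0;

    for (int i = 0; i < nrtis; i++)
    {
        if (!toy_rel_is_unpruned(estate, rtis[i]))
            continue;           /* pruned: do not open */
        opened[nopened++] = rtis[i];
    }

    return nopened;
}
```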
---
 src/backend/commands/explain.c                |  45 +++--
 src/backend/commands/prepare.c                |  30 ++-
 src/backend/executor/execMain.c               | 142 ++++++++++++++
 src/backend/executor/nodeModifyTable.c        |   7 +-
 src/backend/optimizer/plan/planner.c          |   1 +
 src/backend/optimizer/plan/setrefs.c          |  19 ++
 src/backend/tcop/pquery.c                     |  76 ++++++--
 src/backend/utils/cache/plancache.c           |  16 ++
 src/include/commands/explain.h                |   3 +-
 src/include/executor/executor.h               |   4 +
 src/include/nodes/pathnodes.h                 |   3 +
 src/include/nodes/plannodes.h                 |  10 +
 src/include/utils/plancache.h                 |   2 +
 src/test/regress/expected/partition_prune.out | 184 ++++++++++++++++++
 src/test/regress/expected/plancache.out       |  63 ++++++
 src/test/regress/sql/partition_prune.sql      | 116 +++++++++++
 src/test/regress/sql/plancache.sql            |  52 +++++
 17 files changed, 734 insertions(+), 39 deletions(-)

diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 112c17b0d64..c5254f0f920 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -377,7 +377,8 @@ standard_ExplainOneQuery(Query *query, int cursorOptions,
 	/* run it (if needed) and produce output */
 	ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
 				   &planduration, (es->buffers ? &bufusage : NULL),
-				   es->memory ? &mem_counters : NULL);
+				   es->memory ? &mem_counters : NULL,
+				   NULL);
 }
 
 /*
@@ -501,7 +502,8 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
 			   const char *queryString, ParamListInfo params,
 			   QueryEnvironment *queryEnv, const instr_time *planduration,
 			   const BufferUsage *bufusage,
-			   const MemoryContextCounters *mem_counters)
+			   const MemoryContextCounters *mem_counters,
+			   QueryDesc *prep_qd)
 {
 	DestReceiver *dest;
 	QueryDesc  *queryDesc;
@@ -532,13 +534,6 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
 	 */
 	INSTR_TIME_SET_CURRENT(starttime);
 
-	/*
-	 * Use a snapshot with an updated command ID to ensure this query sees
-	 * results of any previously executed queries.
-	 */
-	PushCopiedSnapshot(GetActiveSnapshot());
-	UpdateActiveSnapshotCommandId();
-
 	/*
 	 * We discard the output if we have no use for it.  If we're explaining
 	 * CREATE TABLE AS, we'd better use the appropriate tuple receiver, while
@@ -554,10 +549,34 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
 	else
 		dest = None_Receiver;
 
-	/* Create a QueryDesc for the query */
-	queryDesc = CreateQueryDesc(plannedstmt, queryString,
-								GetActiveSnapshot(), InvalidSnapshot,
-								dest, params, queryEnv, instrument_option);
+	/*
+	 * Create a QueryDesc for the query, or use the one provided by the
+	 * caller.  When reusing a prep QueryDesc, its snapshot was set at
+	 * creation time; we push it as active for ExecutorStart and override the
+	 * destination and instrument options, which were not known when the
+	 * caller created it.
+	 */
+	if (prep_qd)
+	{
+		PushActiveSnapshot(GetActiveSnapshot());
+		queryDesc = prep_qd;
+		Assert(queryDesc->dest == None_Receiver);
+		queryDesc->dest = dest;
+		queryDesc->instrument_options = instrument_option;
+	}
+	else
+	{
+		/*
+		 * Use a snapshot with an updated command ID to ensure this query sees
+		 * results of any previously executed queries.
+		 */
+		PushCopiedSnapshot(GetActiveSnapshot());
+		UpdateActiveSnapshotCommandId();
+		queryDesc = CreateQueryDesc(plannedstmt, queryString,
+									GetActiveSnapshot(), InvalidSnapshot,
+									dest, params, queryEnv,
+									instrument_option);
+	}
 
 	/* Select execution options */
 	if (es->analyze)
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 03d7a98fc58..3bbbc052149 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -588,6 +588,7 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	MemoryContextCounters mem_counters;
 	MemoryContext planner_ctx = NULL;
 	MemoryContext saved_ctx = NULL;
+	QueryDesc  *prep_qd = NULL;
 
 	if (es->memory)
 	{
@@ -640,8 +641,31 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 							  pstate->p_queryEnv);
 		plan_list = cplan->stmt_list;
 
-		if (AcquireExecutorLocks(cplan))
+		if (!CachedPlanCanPrep(cplan, entry->plansource))
+		{
+			if (AcquireExecutorLocks(cplan))
+				break;
+			ReleaseCachedPlan(cplan, CurrentResourceOwner);
+			continue;
+		}
+
+		prep_qd = CreateQueryDesc(linitial_node(PlannedStmt, plan_list),
+								  query_string,
+								  GetActiveSnapshot(),
+								  InvalidSnapshot,
+								  None_Receiver,	/* ExplainOnePlan will fix */
+								  paramLI,
+								  pstate->p_queryEnv,
+								  0 /* ExplainOnePlan will fix */ );
+		if (ExecutorPrepAndLock(prep_qd,
+								CurrentResourceOwner,
+								es->generic ? EXEC_FLAG_EXPLAIN_GENERIC : 0,
+								&cplan->is_valid))
 			break;
+
+		/* Try again. */
+		ExecutorPrepCleanup(prep_qd);
+		FreeQueryDesc(prep_qd);
 		ReleaseCachedPlan(cplan, CurrentResourceOwner);
 	}
 
@@ -664,6 +688,7 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	plan_list = cplan->stmt_list;
 
 	/* Explain each query */
+	Assert(prep_qd == NULL || list_length(plan_list) == 1);
 	foreach(p, plan_list)
 	{
 		PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
@@ -671,7 +696,8 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 		if (pstmt->commandType != CMD_UTILITY)
 			ExplainOnePlan(pstmt, into, es, query_string, paramLI, pstate->p_queryEnv,
 						   &planduration, (es->buffers ? &bufusage : NULL),
-						   es->memory ? &mem_counters : NULL);
+						   es->memory ? &mem_counters : NULL,
+						   prep_qd);
 		else
 			ExplainOneUtility(pstmt->utilityStmt, into, es, pstate, paramLI);
 
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 2b9397b72f3..bbfa0e2b92a 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -333,6 +333,124 @@ standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
 	MemoryContextSwitchTo(oldcontext);
 }
 
+/*
+ * LockRangeTableRelids
+ * 		Acquire or release locks on the specified relids, which reference
+ * 		entries in the provided range table.
+ *
+ * Helper for ExecutorPrepAndLock() and AcquireExecutorLocksPrepared().
+ */
+static void
+LockRangeTableRelids(List *rtable, Bitmapset *relids, bool acquire)
+{
+	int			rtindex = -1;
+
+	while ((rtindex = bms_next_member(relids, rtindex)) >= 0)
+	{
+		RangeTblEntry *rte = list_nth_node(RangeTblEntry, rtable, rtindex - 1);
+
+		Assert(rte->rtekind == RTE_RELATION ||
+			   (rte->rtekind == RTE_SUBQUERY && OidIsValid(rte->relid)));
+
+		/*
+		 * Acquire the appropriate type of lock on each relation OID. Note
+		 * that we don't actually try to open the rel, and hence will not fail
+		 * if it's been dropped entirely --- we'll just transiently acquire a
+		 * non-conflicting lock.
+		 */
+		if (acquire)
+			LockRelationOid(rte->relid, rte->rellockmode);
+		else
+			UnlockRelationOid(rte->relid, rte->rellockmode);
+	}
+}
+
+/*
+ * AcquireExecutorLocksPrepared
+ *
+ * Acquire or release execution locks using pruning results already computed
+ * by ExecutorPrep() and stored in queryDesc->estate.
+ *
+ * This is intended for single-statement reused generic-plan paths that
+ * choose pruning-aware locking instead of the conservative
+ * AcquireExecutorLocks() path.
+ */
+static void
+AcquireExecutorLocksPrepared(QueryDesc *queryDesc, bool acquire)
+{
+	PlannedStmt *plannedstmt = queryDesc->plannedstmt;
+	EState	   *estate = queryDesc->estate;
+	Bitmapset  *lock_relids;
+	ListCell   *lc;
+
+	Assert(queryDesc != NULL);
+	Assert(estate != NULL);
+	Assert(plannedstmt != NULL);
+	Assert(plannedstmt->commandType != CMD_UTILITY);
+
+	lock_relids = bms_difference(estate->es_unpruned_relids,
+								 plannedstmt->unprunableRelids);
+
+	/*
+	 * Keep the first result relation of each ModifyTable locked even if
+	 * pruning removed all target partitions.  ExecInitModifyTable() relies on
+	 * one such relation remaining available.
+	 */
+	foreach(lc, plannedstmt->firstResultRels)
+	{
+		Index		rti = lfirst_int(lc);
+
+		lock_relids = bms_add_member(lock_relids, rti);
+	}
+
+	LockRangeTableRelids(plannedstmt->rtable, lock_relids, acquire);
+
+	bms_free(lock_relids);
+
+}
+
+/*
+ * ExecutorPrepAndLock
+ *		Perform pruning-aware locking for a single PlannedStmt.
+ *
+ * Locks unprunable relations first, then runs ExecutorPrep() to
+ * determine which partitions survive initial pruning, then locks
+ * only those survivors.  Checks *is_valid after each locking step
+ * to detect plan invalidation (e.g., from concurrent DDL or DDL
+ * triggered by a pruning expression).
+ *
+ * Returns true if the plan is still valid and all needed locks are
+ * held.  Returns false if the plan was invalidated at any point, in
+ * which case all acquired locks have been released and the caller
+ * should discard the QueryDesc and retry with a fresh plan.
+ */
+bool
+ExecutorPrepAndLock(QueryDesc *queryDesc, ResourceOwner owner,
+					int eflags, bool *is_valid)
+{
+	PlannedStmt *pstmt = queryDesc->plannedstmt;
+
+	/* Lock unprunable rels before pruning can access them. */
+	LockRangeTableRelids(pstmt->rtable, pstmt->unprunableRelids, true);
+	if (!*is_valid)
+	{
+		LockRangeTableRelids(pstmt->rtable, pstmt->unprunableRelids, false);
+		return false;
+	}
+
+	/* Run pruning and lock survivors. */
+	ExecutorPrep(queryDesc, owner, eflags);
+	AcquireExecutorLocksPrepared(queryDesc, true);
+	if (!*is_valid)
+	{
+		AcquireExecutorLocksPrepared(queryDesc, false);
+		LockRangeTableRelids(pstmt->rtable, pstmt->unprunableRelids, false);
+		return false;
+	}
+
+	return true;
+}
+
 /*
  * ExecutorPrep
  *
@@ -391,6 +509,30 @@ ExecutorPrep(QueryDesc *queryDesc, ResourceOwner owner, int eflags)
 	CurrentResourceOwner = oldowner;
 }
 
+/*
+ * ExecutorPrepCleanup
+ *		Clean up an EState that was created by ExecutorPrep() but never
+ *		passed to ExecutorStart().  This happens when the plan is
+ *		invalidated between prep and execution, and the caller must
+ *		discard the prepped state before retrying with a fresh plan.
+ *
+ * Unlike ExecutorEnd(), this does not expect a fully initialized
+ * plan state tree -- only the range table relations and the
+ * EState itself need to be freed.
+ */
+void
+ExecutorPrepCleanup(QueryDesc *queryDesc)
+{
+	EState	   *estate = queryDesc->estate;
+
+	if (estate == NULL)
+		return;
+
+	ExecCloseRangeTableRelations(estate);
+	FreeExecutorState(estate);
+	queryDesc->estate = NULL;
+}
+
 /* ----------------------------------------------------------------
  *		ExecutorRun
  *
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 478cb01783c..6e78b61f700 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -5133,8 +5133,8 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
 	 * as a reference for building the ResultRelInfo of the target partition.
 	 * In either case, it doesn't matter which result relation is kept, so we
 	 * just keep the first one, if all others have been pruned.  See also,
-	 * ExecDoInitialPruning(), which ensures that this first result relation
-	 * has been locked.
+	 * AcquireExecutorLocksPrepared(), which ensures that this first result
+	 * relation has been locked.
 	 */
 	i = 0;
 	foreach(l, node->resultRelations)
@@ -5148,6 +5148,9 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
 			/* all result relations pruned; keep the first one */
 			keep_rel = true;
 			rti = linitial_int(node->resultRelations);
+			if (!list_member_int(estate->es_plannedstmt->firstResultRels, rti))
+				elog(ERROR, "first result relation %u not found in firstResultRels",
+					 rti);
 			i = 0;
 		}
 
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index f4689e7c9f8..4cddac7f2fc 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -675,6 +675,7 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 											  glob->prunableRelids);
 	result->permInfos = glob->finalrteperminfos;
 	result->subrtinfos = glob->subrtinfos;
+	result->firstResultRels = glob->firstResultRels;
 	result->appendRelations = glob->appendRelations;
 	result->subplans = glob->subplans;
 	result->rewindPlanIDs = glob->rewindPlanIDs;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index ff0e875f2a2..4495bc6e627 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -384,6 +384,25 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 		}
 	}
 
+	/*
+	 * Record the first result relation if it belongs to the set of initially
+	 * prunable relations.  We use bms_next_member() to get the
+	 * lowest-numbered leaf result rel, which matches
+	 * linitial_int(ModifyTable.resultRelations) because
+	 * expand_inherited_rtentry() adds child partitions to the range table
+	 * sequentially in partition bound order, and resultRelations is built
+	 * from that same expansion.
+	 */
+	if (root->leaf_result_relids)
+	{
+		Index		firstResultRel = bms_next_member(root->leaf_result_relids, -1);
+
+		firstResultRel += rtoffset;
+		if (bms_is_member(firstResultRel, root->glob->prunableRelids))
+			root->glob->firstResultRels =
+				lappend_int(root->glob->firstResultRels, firstResultRel);
+	}
+
 	return result;
 }
 
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 4699b53cab7..53c50ab0fce 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -59,7 +59,9 @@ static uint64 DoPortalRunFetch(Portal portal,
 							   long count,
 							   DestReceiver *dest);
 static void DoPortalRewind(Portal portal);
-static bool PortalLockCachedPlan(Portal portal);
+static bool PortalLockCachedPlan(Portal portal, bool do_prep,
+								 ParamListInfo params,
+								 QueryDesc **queryDesc_p);
 
 
 /*
@@ -488,21 +490,6 @@ restart:
 				 * non-default nesting level for the snapshot.
 				 */
 
-				/*
-				 * If the portal is backed by a cached plan, acquire execution
-				 * locks via PortalLockCachedPlan().  If the plan is
-				 * invalidated during locking, it replans and may change the
-				 * portal strategy, requiring us to restart PortalStart().
-				 */
-				if (portal->cplan)
-				{
-					if (PortalLockCachedPlan(portal))
-					{
-						PopActiveSnapshot();
-						goto restart;
-					}
-				}
-
 				/*
 				 * Create QueryDesc in portal's context; for the moment, set
 				 * the destination to DestNone.
@@ -516,6 +503,26 @@ restart:
 											portal->queryEnv,
 											0);
 
+				/*
+				 * If the portal is backed by a cached plan, acquire execution
+				 * locks via PortalLockCachedPlan().  For eligible plans
+				 * (single-statement reused generic), this performs
+				 * pruning-aware locking: it runs ExecutorPrep() on the
+				 * QueryDesc to determine which partitions survive initial
+				 * pruning, then locks only those.  If the plan is invalidated
+				 * during this process, it replans and rebuilds the QueryDesc.
+				 * If replanning changes the portal strategy, we must restart
+				 * PortalStart() to redispatch.
+				 */
+				if (portal->cplan)
+				{
+					if (PortalLockCachedPlan(portal, true, params, &queryDesc))
+					{
+						PopActiveSnapshot();
+						goto restart;
+					}
+				}
+
 				/*
 				 * If it's a scrollable cursor, executor needs to support
 				 * REWIND and backwards scan, as well as whatever the caller
@@ -555,7 +562,7 @@ restart:
 			case PORTAL_ONE_MOD_WITH:
 				if (portal->cplan)
 				{
-					if (PortalLockCachedPlan(portal))
+					if (PortalLockCachedPlan(portal, false, NULL, NULL))
 						goto restart;
 				}
 
@@ -611,7 +618,7 @@ restart:
 				 */
 				if (portal->cplan)
 				{
-					if (PortalLockCachedPlan(portal))
+					if (PortalLockCachedPlan(portal, false, NULL, NULL))
 						goto restart;
 				}
 
@@ -1828,15 +1835,32 @@ EnsurePortalSnapshotExists(void)
  *		Acquire execution locks for a cached-plan-backed portal,
  *		retrying with a fresh plan if the current one is invalidated.
  *
+ * If do_prep is true and the plan is eligible (single-statement reused
+ * generic plan), performs pruning-aware locking via ExecutorPrepAndLock()
+ * on *prep_qd, replacing *prep_qd with a fresh QueryDesc after a replan.
+ * Otherwise falls back to locking all relations in the plan.
+ *
  * Returns true if replanning changed portal->strategy, meaning the
- * caller must redispatch.  Returns false once locks are held.
+ * caller must redispatch.  Returns false once locks are held and the
+ * plan is valid for execution.
  */
 static bool
-PortalLockCachedPlan(Portal portal)
+PortalLockCachedPlan(Portal portal, bool do_prep,
+					 ParamListInfo params,
+					 QueryDesc **prep_qd)
 {
 	PortalStrategy start_strategy = portal->strategy;
 
-	if (AcquireExecutorLocks(portal->cplan))
+	if (do_prep && CachedPlanCanPrep(portal->cplan, portal->plansource))
+	{
+		Assert(prep_qd);
+		if (ExecutorPrepAndLock(*prep_qd, portal->resowner, 0,
+								&portal->cplan->is_valid))
+			return false;
+		ExecutorPrepCleanup(*prep_qd);
+		FreeQueryDesc(*prep_qd);
+	}
+	else if (AcquireExecutorLocks(portal->cplan))
 		return false;
 
 	/* Replan.  Locks will be taken freshly. */
@@ -1852,5 +1876,15 @@ PortalLockCachedPlan(Portal portal)
 	if (portal->strategy != start_strategy)
 		return true;
 
+	if (prep_qd)
+	{
+		Assert(list_length(portal->stmts) == 1);
+		*prep_qd = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+								   portal->sourceText,
+								   GetActiveSnapshot(), InvalidSnapshot,
+								   None_Receiver, params,
+								   portal->queryEnv, 0);
+	}
+
 	return false;
 }
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index f7fe366859c..fca2f84081e 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -1977,6 +1977,22 @@ AcquireExecutorLocks(CachedPlan *cplan)
 	return true;
 }
 
+/*
+ * CachedPlanCanPrep
+ *		Check whether a cached plan is eligible for pruning-aware locking
+ *		via ExecutorPrepAndLock().
+ *
+ * Only single-statement reused generic plans with a non-utility command
+ * qualify.
+ */
+bool
+CachedPlanCanPrep(CachedPlan *cplan, CachedPlanSource *plansource)
+{
+	return (cplan == plansource->gplan &&
+			list_length(cplan->stmt_list) == 1 &&
+			linitial_node(PlannedStmt, cplan->stmt_list)->commandType != CMD_UTILITY);
+}
+
 /*
  * AcquirePlannerLocks: acquire locks needed for planning of a querytree list;
  * or release them if acquire is false.
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 472e141bba3..3a03355e6b6 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -69,7 +69,8 @@ extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
 						   ParamListInfo params, QueryEnvironment *queryEnv,
 						   const instr_time *planduration,
 						   const BufferUsage *bufusage,
-						   const MemoryContextCounters *mem_counters);
+						   const MemoryContextCounters *mem_counters,
+						   QueryDesc *prep_qd);
 
 extern void ExplainPrintPlan(ExplainState *es, QueryDesc *queryDesc);
 extern void ExplainPrintTriggers(ExplainState *es,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index 33bbdbfeffb..093be9bd24b 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -21,6 +21,7 @@
 #include "nodes/lockoptions.h"
 #include "nodes/parsenodes.h"
 #include "utils/memutils.h"
+#include "utils/resowner.h"
 
 
 /*
@@ -235,6 +236,9 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
  */
 extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
 extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
+extern bool ExecutorPrepAndLock(QueryDesc *queryDesc, ResourceOwner owner,
+								int eflags, bool *is_valid);
+extern void ExecutorPrepCleanup(QueryDesc *queryDesc);
 extern void ExecutorRun(QueryDesc *queryDesc,
 						ScanDirection direction, uint64 count);
 extern void standard_ExecutorRun(QueryDesc *queryDesc,
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 27a2c6815b7..a5d00633b4b 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -217,6 +217,9 @@ typedef struct PlannerGlobal
 	/* "flat" list of integer RT indexes */
 	List	   *resultRelations;
 
+	/* "flat" list of integer RT indexes (one per ModifyTable node) */
+	List	   *firstResultRels;
+
 	/* "flat" list of AppendRelInfos */
 	List	   *appendRelations;
 
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 14a1dfed2b9..1a328ea138c 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -120,6 +120,16 @@ typedef struct PlannedStmt
 	/* RT indexes of relations targeted by INSERT/UPDATE/DELETE/MERGE */
 	Bitmapset  *resultRelationRelids;
 
+	/*
+	 * rtable indexes of first target relation in each ModifyTable node in the
+	 * plan for INSERT/UPDATE/DELETE/MERGE.  NIL if resultRelations is NIL.
+	 *
+	 * These are used by AcquireExecutorLocksPrepared() to ensure that the
+	 * first result rel for each ModifyTable remains locked even if pruned;
+	 * see ExecInitModifyTable() for the executor side assumptions.
+	 */
+	List	   *firstResultRels;
+
 	/* list of AppendRelInfo nodes */
 	List	   *appendRelations;
 
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index e0fc403e717..2941d3a301b 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -254,4 +254,6 @@ extern bool CachedPlanIsSimplyValid(CachedPlanSource *plansource,
 extern CachedExpression *GetCachedExpression(Node *expr);
 extern void FreeCachedExpression(CachedExpression *cexpr);
 
+extern bool CachedPlanCanPrep(CachedPlan *cplan, CachedPlanSource *plansource);
+
 #endif							/* PLANCACHE_H */
diff --git a/src/test/regress/expected/partition_prune.out b/src/test/regress/expected/partition_prune.out
index 849049f9c51..ec73866486e 100644
--- a/src/test/regress/expected/partition_prune.out
+++ b/src/test/regress/expected/partition_prune.out
@@ -4956,3 +4956,187 @@ select * from (select a, b from phv_boolpart) t
 (2 rows)
 
 drop table phv_boolpart;
+--
+-- Verify that pruning-aware locking skips pruned partitions
+-- when reusing a generic cached plan.
+--
+set plan_cache_mode to force_generic_plan;
+create table prunelock_p (a int) partition by list (a);
+create table prunelock_p1 partition of prunelock_p for values in (1);
+create table prunelock_p2 partition of prunelock_p for values in (2);
+create table prunelock_p3 partition of prunelock_p for values in (3);
+prepare prunelock_q (int) as select * from prunelock_p where a = $1;
+-- Force generic plan creation
+explain (costs off) execute prunelock_q(1);
+                  QUERY PLAN                  
+----------------------------------------------
+ Append
+   Subplans Removed: 2
+   ->  Seq Scan on prunelock_p1 prunelock_p_1
+         Filter: (a = $1)
+(4 rows)
+
+-- Execute and check which child partitions are locked
+begin;
+execute prunelock_q(1);
+ a 
+---
+(0 rows)
+
+select c.relname
+  from pg_locks l
+  join pg_class c on c.oid = l.relation
+ where l.pid = pg_backend_pid()
+   and c.relname like 'prunelock_p_'
+ order by c.relname;
+   relname    
+--------------
+ prunelock_p1
+(1 row)
+
+commit;
+deallocate prunelock_q;
+-- Turn pruning off
+set enable_partition_pruning to off;
+prepare prunelock_q (int) as select * from prunelock_p where a = $1;
+-- Force generic plan creation
+explain (costs off) execute prunelock_q(1);
+                  QUERY PLAN                  
+----------------------------------------------
+ Append
+   ->  Seq Scan on prunelock_p1 prunelock_p_1
+         Filter: (a = $1)
+   ->  Seq Scan on prunelock_p2 prunelock_p_2
+         Filter: (a = $1)
+   ->  Seq Scan on prunelock_p3 prunelock_p_3
+         Filter: (a = $1)
+(7 rows)
+
+-- Execute and check which child partitions are locked
+begin;
+execute prunelock_q(1);
+ a 
+---
+(0 rows)
+
+select c.relname
+  from pg_locks l
+  join pg_class c on c.oid = l.relation
+ where l.pid = pg_backend_pid()
+   and c.relname like 'prunelock_p_'
+ order by c.relname;
+   relname    
+--------------
+ prunelock_p1
+ prunelock_p2
+ prunelock_p3
+(3 rows)
+
+commit;
+deallocate prunelock_q;
+reset enable_partition_pruning;
+--
+-- Verify firstResultRels handling with multiple ModifyTable nodes
+-- (writable CTEs) targeting a partitioned table.  When a pruning
+-- parameter matches no partition, all result relations are pruned
+-- and the executor must still find a usable first result relation
+-- for each ModifyTable node.
+--
+prepare prunelock_mt_q (int, int) as
+  with upd1 as (update prunelock_p set a = a),
+       upd2 as (update prunelock_p set a = a where a = $2)
+  update prunelock_p set a = a where a = $1;
+-- Force generic plan creation
+explain (costs off) execute prunelock_mt_q(1, 2);
+                         QUERY PLAN                         
+------------------------------------------------------------
+ Update on prunelock_p
+   Update on prunelock_p1 prunelock_p_1
+   CTE upd1
+     ->  Update on prunelock_p prunelock_p_3
+           Update on prunelock_p1 prunelock_p_4
+           Update on prunelock_p2 prunelock_p_5
+           Update on prunelock_p3 prunelock_p_6
+           ->  Append
+                 ->  Seq Scan on prunelock_p1 prunelock_p_4
+                 ->  Seq Scan on prunelock_p2 prunelock_p_5
+                 ->  Seq Scan on prunelock_p3 prunelock_p_6
+   CTE upd2
+     ->  Update on prunelock_p prunelock_p_7
+           Update on prunelock_p2 prunelock_p_8
+           ->  Append
+                 Subplans Removed: 2
+                 ->  Seq Scan on prunelock_p2 prunelock_p_8
+                       Filter: (a = $2)
+   ->  Append
+         Subplans Removed: 2
+         ->  Seq Scan on prunelock_p1 prunelock_p_1
+               Filter: (a = $1)
+(22 rows)
+
+-- All partitions pruned: value 4 matches no partition, so each
+-- ModifyTable must still initialize correctly with no matching
+-- result relations.
+explain (costs off) execute prunelock_mt_q(4, 5);
+                         QUERY PLAN                         
+------------------------------------------------------------
+ Update on prunelock_p
+   CTE upd1
+     ->  Update on prunelock_p prunelock_p_2
+           Update on prunelock_p1 prunelock_p_3
+           Update on prunelock_p2 prunelock_p_4
+           Update on prunelock_p3 prunelock_p_5
+           ->  Append
+                 ->  Seq Scan on prunelock_p1 prunelock_p_3
+                 ->  Seq Scan on prunelock_p2 prunelock_p_4
+                 ->  Seq Scan on prunelock_p3 prunelock_p_5
+   CTE upd2
+     ->  Update on prunelock_p prunelock_p_6
+           ->  Append
+                 Subplans Removed: 3
+   ->  Append
+         Subplans Removed: 3
+(16 rows)
+
+deallocate prunelock_mt_q;
+drop table prunelock_p;
+--
+-- Verify that pruning-aware locking falls back to locking all
+-- partitions for multi-statement CachedPlans.  Rule rewriting can
+-- expand a single statement into multiple PlannedStmts, and later
+-- statements must not have their pruning evaluated before earlier
+-- ones have executed, since CCI between statements can change what
+-- pruning expressions see.
+--
+create table prune_config (val int);
+insert into prune_config values (1);
+create table multistmt_pt (a int, b int) partition by list (a);
+create table multistmt_pt_1 partition of multistmt_pt for values in (1);
+create table multistmt_pt_2 partition of multistmt_pt for values in (2);
+insert into multistmt_pt values (1, 0), (2, 0);
+create function get_prune_val() returns int as $$
+  select val from prune_config;
+$$ language sql stable;
+create rule config_upd_rule as on update to multistmt_pt
+  do also update prune_config set val = 2;
+set plan_cache_mode to force_generic_plan;
+prepare multi_q as update multistmt_pt set b = b + 1 where a = get_prune_val();
+-- first execute creates the generic plan
+execute multi_q;
+-- reset for the real test
+update prune_config set val = 1;
+update multistmt_pt set b = 0;
+-- second execute reuses the plan; pruning-aware locking kicks in
+execute multi_q;
+select * from multistmt_pt order by a;
+ a | b 
+---+---
+ 1 | 0
+ 2 | 1
+(2 rows)
+
+deallocate multi_q;
+drop rule config_upd_rule on multistmt_pt;
+drop function get_prune_val;
+drop table multistmt_pt, prune_config;
+reset plan_cache_mode;
diff --git a/src/test/regress/expected/plancache.out b/src/test/regress/expected/plancache.out
index d58534ca1cd..54077294dce 100644
--- a/src/test/regress/expected/plancache.out
+++ b/src/test/regress/expected/plancache.out
@@ -402,3 +402,66 @@ select name, generic_plans, custom_plans from pg_prepared_statements
 (1 row)
 
 drop table test_mode;
+-- This exercises the CachedPlanPrepCleanup() path, which must free
+-- the EState created by ExecutorPrep() when the plan is invalidated
+-- before execution begins.  The pruning expression uses a stable SQL
+-- function that calls a volatile plpgsql function.  That function
+-- performs DDL on a partition when a separate "signal" table says to
+-- do so.  The second EXECUTE should replan cleanly after the DDL.
+set plan_cache_mode to force_generic_plan;
+create table inval_during_pruning_p (a int) partition by list (a);
+create table inval_during_pruning_p1 partition of inval_during_pruning_p for values in (1);
+create table inval_during_pruning_p2 partition of inval_during_pruning_p for values in (2);
+insert into inval_during_pruning_p values (1), (2);
+create table inval_during_pruning_signal (create_idx bool not null);
+insert into inval_during_pruning_signal values (false);
+create or replace function invalidate_plancache_func() returns int
+as $$
+declare
+	create_index bool;
+begin
+	-- Perform DDL on a partition if asked to
+	select create_idx into create_index from inval_during_pruning_signal for update;
+	if create_index = true then
+		raise notice 'creating index on partition inval_during_pruning_p1';
+		create index on inval_during_pruning_p1 (a);
+		update inval_during_pruning_signal set create_idx = false;
+	end if;
+	-- value that pruning will match against partition bounds
+	return 1;
+end;
+$$ language plpgsql volatile;
+create or replace function stable_pruning_val() returns int as $$
+	select invalidate_plancache_func();
+$$ language sql stable;
+prepare inval_during_pruning_q as select * from inval_during_pruning_p where a = stable_pruning_val();
+-- Build a generic plan and run pruning once, but don't set the signal
+-- for invalidate_plancache_func() to perform the DDL.
+explain (verbose, costs off) execute inval_during_pruning_q;
+                                QUERY PLAN                                 
+---------------------------------------------------------------------------
+ Append
+   Subplans Removed: 1
+   ->  Seq Scan on public.inval_during_pruning_p1 inval_during_pruning_p_1
+         Output: inval_during_pruning_p_1.a
+         Filter: (inval_during_pruning_p_1.a = stable_pruning_val())
+(5 rows)
+
+-- Reuse the generic plan.  Make invalidate_plancache_func() perform DDL
+-- during this execution, which should force replanning without errors.
+update inval_during_pruning_signal set create_idx = true;
+explain (verbose, costs off) execute inval_during_pruning_q;
+NOTICE:  creating index on partition inval_during_pruning_p1
+                                QUERY PLAN                                 
+---------------------------------------------------------------------------
+ Append
+   Subplans Removed: 1
+   ->  Seq Scan on public.inval_during_pruning_p1 inval_during_pruning_p_1
+         Output: inval_during_pruning_p_1.a
+         Filter: (inval_during_pruning_p_1.a = stable_pruning_val())
+(5 rows)
+
+deallocate inval_during_pruning_q;
+drop table inval_during_pruning_p, inval_during_pruning_signal;
+drop function invalidate_plancache_func, stable_pruning_val;
+reset plan_cache_mode;
diff --git a/src/test/regress/sql/partition_prune.sql b/src/test/regress/sql/partition_prune.sql
index 359a9208056..a98844d14f8 100644
--- a/src/test/regress/sql/partition_prune.sql
+++ b/src/test/regress/sql/partition_prune.sql
@@ -1518,3 +1518,119 @@ select * from (select a, b from phv_boolpart) t
   group by grouping sets (a, b);
 
 drop table phv_boolpart;
+
+--
+-- Verify that pruning-aware locking skips pruned partitions
+-- when reusing a generic cached plan.
+--
+set plan_cache_mode to force_generic_plan;
+
+create table prunelock_p (a int) partition by list (a);
+create table prunelock_p1 partition of prunelock_p for values in (1);
+create table prunelock_p2 partition of prunelock_p for values in (2);
+create table prunelock_p3 partition of prunelock_p for values in (3);
+
+prepare prunelock_q (int) as select * from prunelock_p where a = $1;
+
+-- Force generic plan creation
+explain (costs off) execute prunelock_q(1);
+
+-- Execute and check which child partitions are locked
+begin;
+execute prunelock_q(1);
+
+select c.relname
+  from pg_locks l
+  join pg_class c on c.oid = l.relation
+ where l.pid = pg_backend_pid()
+   and c.relname like 'prunelock_p_'
+ order by c.relname;
+commit;
+
+deallocate prunelock_q;
+
+-- Turn pruning off
+set enable_partition_pruning to off;
+
+prepare prunelock_q (int) as select * from prunelock_p where a = $1;
+
+-- Force generic plan creation
+explain (costs off) execute prunelock_q(1);
+
+-- Execute and check which child partitions are locked
+begin;
+execute prunelock_q(1);
+
+select c.relname
+  from pg_locks l
+  join pg_class c on c.oid = l.relation
+ where l.pid = pg_backend_pid()
+   and c.relname like 'prunelock_p_'
+ order by c.relname;
+commit;
+
+deallocate prunelock_q;
+reset enable_partition_pruning;
+
+--
+-- Verify firstResultRels handling with multiple ModifyTable nodes
+-- (writable CTEs) targeting a partitioned table.  When a pruning
+-- parameter matches no partition, all result relations are pruned
+-- and the executor must still find a usable first result relation
+-- for each ModifyTable node.
+--
+prepare prunelock_mt_q (int, int) as
+  with upd1 as (update prunelock_p set a = a),
+       upd2 as (update prunelock_p set a = a where a = $2)
+  update prunelock_p set a = a where a = $1;
+
+-- Force generic plan creation
+explain (costs off) execute prunelock_mt_q(1, 2);
+
+-- All partitions pruned: value 4 matches no partition, so each
+-- ModifyTable must still initialize correctly with no matching
+-- result relations.
+explain (costs off) execute prunelock_mt_q(4, 5);
+
+deallocate prunelock_mt_q;
+drop table prunelock_p;
+
+--
+-- Verify that pruning-aware locking falls back to locking all
+-- partitions for multi-statement CachedPlans.  Rule rewriting can
+-- expand a single statement into multiple PlannedStmts, and later
+-- statements must not have their pruning evaluated before earlier
+-- ones have executed, since CCI between statements can change what
+-- pruning expressions see.
+--
+create table prune_config (val int);
+insert into prune_config values (1);
+
+create table multistmt_pt (a int, b int) partition by list (a);
+create table multistmt_pt_1 partition of multistmt_pt for values in (1);
+create table multistmt_pt_2 partition of multistmt_pt for values in (2);
+insert into multistmt_pt values (1, 0), (2, 0);
+
+create function get_prune_val() returns int as $$
+  select val from prune_config;
+$$ language sql stable;
+
+create rule config_upd_rule as on update to multistmt_pt
+  do also update prune_config set val = 2;
+
+set plan_cache_mode to force_generic_plan;
+prepare multi_q as update multistmt_pt set b = b + 1 where a = get_prune_val();
+-- first execute creates the generic plan
+execute multi_q;
+-- reset for the real test
+update prune_config set val = 1;
+update multistmt_pt set b = 0;
+-- second execute reuses the plan; pruning-aware locking kicks in
+execute multi_q;
+select * from multistmt_pt order by a;
+
+deallocate multi_q;
+drop rule config_upd_rule on multistmt_pt;
+drop function get_prune_val;
+drop table multistmt_pt, prune_config;
+reset plan_cache_mode;
diff --git a/src/test/regress/sql/plancache.sql b/src/test/regress/sql/plancache.sql
index aed388d03a1..90b6c5f82bf 100644
--- a/src/test/regress/sql/plancache.sql
+++ b/src/test/regress/sql/plancache.sql
@@ -228,3 +228,55 @@ select name, generic_plans, custom_plans from pg_prepared_statements
   where  name = 'test_mode_pp';
 
 drop table test_mode;
+
+-- This exercises the CachedPlanPrepCleanup() path, which must free
+-- the EState created by ExecutorPrep() when the plan is invalidated
+-- before execution begins.  The pruning expression uses a stable SQL
+-- function that calls a volatile plpgsql function.  That function
+-- performs DDL on a partition when a separate "signal" table says to
+-- do so.  The second EXECUTE should replan cleanly after the DDL.
+set plan_cache_mode to force_generic_plan;
+create table inval_during_pruning_p (a int) partition by list (a);
+create table inval_during_pruning_p1 partition of inval_during_pruning_p for values in (1);
+create table inval_during_pruning_p2 partition of inval_during_pruning_p for values in (2);
+insert into inval_during_pruning_p values (1), (2);
+
+create table inval_during_pruning_signal (create_idx bool not null);
+insert into inval_during_pruning_signal values (false);
+create or replace function invalidate_plancache_func() returns int
+as $$
+declare
+	create_index bool;
+begin
+	-- Perform DDL on a partition if asked to
+	select create_idx into create_index from inval_during_pruning_signal for update;
+	if create_index = true then
+		raise notice 'creating index on partition inval_during_pruning_p1';
+		create index on inval_during_pruning_p1 (a);
+		update inval_during_pruning_signal set create_idx = false;
+	end if;
+	-- value that pruning will match against partition bounds
+	return 1;
+end;
+$$ language plpgsql volatile;
+
+create or replace function stable_pruning_val() returns int as $$
+	select invalidate_plancache_func();
+$$ language sql stable;
+
+prepare inval_during_pruning_q as select * from inval_during_pruning_p where a = stable_pruning_val();
+
+-- Build a generic plan and run pruning once, but don't set the signal
+-- for invalidate_plancache_func() to perform the DDL.
+explain (verbose, costs off) execute inval_during_pruning_q;
+
+-- Reuse the generic plan.  Make invalidate_plancache_func() perform DDL
+-- during this execution, which should force replanning without errors.
+update inval_during_pruning_signal set create_idx = true;
+explain (verbose, costs off) execute inval_during_pruning_q;
+
+deallocate inval_during_pruning_q;
+drop table inval_during_pruning_p, inval_during_pruning_signal;
+drop function invalidate_plancache_func, stable_pruning_val;
+
+reset plan_cache_mode;
-- 
2.47.3

* Re: generic plans and "initial" pruning
@ 2026-05-29 10:30  Thom Brown <[email protected]>
  parent: Amit Langote <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: Thom Brown @ 2026-05-29 10:30 UTC (permalink / raw)
  To: Amit Langote <[email protected]>; +Cc: Chao Li <[email protected]>; Tom Lane <[email protected]>; Tender Wang <[email protected]>; Alexander Lakhin <[email protected]>; Tomas Vondra <[email protected]>; Robert Haas <[email protected]>; Alvaro Herrera <[email protected]>; Andres Freund <[email protected]>; Daniel Gustafsson <[email protected]>; David Rowley <[email protected]>; pgsql-hackers

On Fri, 29 May 2026 at 09:57, Amit Langote <[email protected]> wrote:
>
> On Thu, May 28, 2026 at 10:14 PM Thom Brown <[email protected]> wrote:
> > On Thu, 28 May 2026 at 09:14, Amit Langote <[email protected]> wrote:
> > > It's a real bug.
> > >
> > > You're right that if PortalLockCachedPlan() replans, the QueryDesc
> > > created before the call still points at the old PlannedStmt from the
> > > released plan.  And yes, 0004 happens to fix it by rebuilding the
> > > QueryDesc inside PortalLockCachedPlan(), but 0001 through 0003 are
> > > broken on their own.
> > >
> > > Attached is an updated set with the fix: CreateQueryDesc now runs
> > > after PortalLockCachedPlan() returns, as you suggested.  That said,
> > > I'll probably focus first on settling the plancache refactoring that
> > > spun off from this thread [1], and then start a new thread for the
> > > pruning-aware locking work on top of it, incorporating parts of this
> > > series.
> >
> > Thanks.
> >
> > I've done another pass. I see a reference to
> > AcquireExecutorLocksUnpruned(), but I can't find this function. Is
> > this supposed to be AcquireExecutorLocksPrepared()?
>
> You're right, stale comment. It should say
> AcquireExecutorLocksPrepared(). Fixed.
>
> > And also I have a question about the new firstResultRels code:
> >
> > If I've followed it right, the bit in setrefs.c records the
> > lowest-numbered RT index from leaf_result_relids as the
> > per-ModifyTable fallback that's used when all real targets get pruned
> > away, and the executor side looks it up via
> > linitial_int(node->resultRelations). For that to work those two have
> > to pick the same RT index, and the comment justifies it with
> > "partition expansion preserves RT index order". Where is that
> > preservation guaranteed?
>
> The ordering comes from expand_inherited_rtentry(), which adds child
> partitions to the range table sequentially in partition bound order.
> Since ModifyTable.resultRelations is built from the same expansion,
> its first element is the lowest-numbered RT index among the leaf
> partitions for that node. That is the same value
> bms_next_member(leaf_result_relids, -1) returns from the Bitmapset,
> because Bitmapset iteration returns members in ascending order. I've
> added a comment in setrefs.c pointing to expand_inherited_rtentry() as
> the source of this guarantee.
>
> > And with the assertion in ExecInitModifyTable:
> >
> > Assert(list_member_int(estate->es_plannedstmt->firstResultRels, rti));
> >
> > With writable CTEs producing more than one ModifyTable node the list
> > has several entries, so all the assert really checks is that some
> > recorded entry matches, not that the one recorded for this particular
> > node matches. If that's correct, then in a case where the wrong entry
> > happened to line up the right relation wouldn't be locked and nothing
> > would complain. Is there something that keeps these in order
> > somewhere?
>
> This is a fair observation -- the Assert checks membership in the
> global list rather than per-node correspondence. But node A's rti
> can't accidentally pass the Assert by matching an entry recorded for
> node B. Each ModifyTable node gets its own partition expansion with
> distinct RT entries. In a writable CTE like:
>
>   WITH upd1 AS (UPDATE t SET ...),
>        upd2 AS (UPDATE t SET ...)
>   UPDATE t SET ...
>
> each UPDATE creates a separate set of leaf partition RT entries --
> upd1 might get RT indexes 5,6,7, upd2 gets 8,9,10, and the main UPDATE
> gets 11,12,13. The global firstResultRels list would be [5, 8, 11].
> When ExecInitModifyTable falls back to linitial_int(resultRelations)
> for a given node, it finds that node's own entry, because the RT index
> sets are disjoint across nodes.
>
> That said, it's worth being explicit about what protections exist at
> each layer, since this is safety-critical code:
>
> 1. AcquireExecutorLocksPrepared(), added by 0004, locks every entry in
> firstResultRels unconditionally. So regardless of which rti a
> ModifyTable node falls back to, the relation will be locked.
>
> 2. ExecGetRangeTableRelation() has two checks when opening a relation.
> For non-result relations (isResultRel=false), it checks
> es_unpruned_relids and raises an ERROR in release builds if the
> relation was pruned. For result relations (isResultRel=true), that
> check is intentionally skipped -- it has to be, because at least one
> result relation per ModifyTable node must remain openable even when
> all partitions are pruned, since executor code paths like ExecMerge()
> and ExecInitPartitionInfo() rely on resultRelInfo[0] being initialized
> (see commit 28317de723b). The remaining protection for result
> relations is Assert(CheckRelationLockedByMe()) inside table_open,
> which fires in debug builds.
>
> 3. I've tightened ExecInitModifyTable to close this gap: the
> all-pruned fallback path now raises an elog(ERROR) in release builds
> if linitial_int(resultRelations) is not found in firstResultRels,
> rather than just an Assert. This gives result relations a
> production-visible check comparable to what es_unpruned_relids
> provides for scan relations.
>
> So the net effect is that for scan relations, opening a
> pruned-and-unlocked relation is caught by an ERROR in production via
> es_unpruned_relids. For result relations on the all-pruned fallback
> path, it's now also caught by an ERROR in production via the
> firstResultRels check in ExecInitModifyTable. The locking in
> AcquireExecutorLocksPrepared() ensures the relation is always locked
> regardless.
>
> Thanks again for the review.  A close look at these aspects by someone
> other than me is very useful.

Ah, the disjoint RT-entries point is what I was missing. I'd been
reading firstResultRels as a flat list where in theory any entry could
line up with any node's lookup, which is what made the assert feel
potentially insufficient. If each ModifyTable's expansion produces its
own non-overlapping set of leaf RT indexes then membership in the
global list really is equivalent to membership in this node's own
entry, and the assert is sufficient as it stands. Walking through the
writable-CTE case helped.
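
For anyone following along, here is a minimal standalone C sketch of that
argument. It is only a toy model, not code from the patch: ToyModifyTable and
every toy_* function are hypothetical names made up for illustration, and the
RT indexes are the ones from Amit's example (5,6,7 / 8,9,10 / 11,12,13).
toy_lowest_member() stands in for bms_next_member(leaf_result_relids, -1), and
reading resultRelations[0] stands in for linitial_int(node->resultRelations).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NLEAVES		3
#define TOY_MAX_NODES	16

/* Hypothetical stand-in for one ModifyTable node; not a PostgreSQL struct. */
typedef struct ToyModifyTable
{
	const char *label;
	int			resultRelations[TOY_NLEAVES];	/* leaf RT indexes */
} ToyModifyTable;

/* Lowest member of a node's leaf set, modelling bms_next_member(set, -1). */
static int
toy_lowest_member(const ToyModifyTable *node)
{
	int			min = node->resultRelations[0];

	for (int i = 1; i < TOY_NLEAVES; i++)
		if (node->resultRelations[i] < min)
			min = node->resultRelations[i];
	return min;
}

/* Build the global list: one lowest-numbered leaf RT index per node. */
static void
toy_build_first_result_rels(const ToyModifyTable *nodes, size_t nnodes,
							int *out)
{
	for (size_t i = 0; i < nnodes; i++)
		out[i] = toy_lowest_member(&nodes[i]);
}

/* Position of the first global entry equal to rti, or -1 if there is none. */
static int
toy_find_entry(const int *first_result_rels, size_t nnodes, int rti)
{
	for (size_t i = 0; i < nnodes; i++)
		if (first_result_rels[i] == rti)
			return (int) i;
	return -1;
}

/*
 * For every node, take the fallback RT index (the first list element) and
 * check that the first global entry it matches is the one recorded for
 * that same node.
 */
static bool
toy_fallbacks_hit_own_entry(const ToyModifyTable *nodes, size_t nnodes)
{
	int			first_result_rels[TOY_MAX_NODES];

	assert(nnodes <= TOY_MAX_NODES);
	toy_build_first_result_rels(nodes, nnodes, first_result_rels);

	for (size_t i = 0; i < nnodes; i++)
	{
		int			fallback = nodes[i].resultRelations[0];

		if (toy_find_entry(first_result_rels, nnodes, fallback) != (int) i)
			return false;
	}
	return true;
}
```

With the three disjoint sets from the example, toy_build_first_result_rels()
fills the list with 5, 8, 11 and toy_fallbacks_hit_own_entry() returns true.
If two nodes were given the same leaf indexes it would return false, which is
the situation that separate partition expansion per ModifyTable node is said
to rule out.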

Thanks

Thom

* Re: generic plans and "initial" pruning
@ 2026-06-02 17:54  Ilmar Yunusov <[email protected]>
  parent: Thom Brown <[email protected]>
  0 siblings, 1 reply; 108+ messages in thread

From: Ilmar Yunusov @ 2026-06-02 17:54 UTC (permalink / raw)
  To: [email protected]; +Cc: Amit Langote <[email protected]>

The following review has been posted through the commitfest application:
make installcheck-world:  not tested
Implements feature:       tested, failed
Spec compliant:           not tested
Documentation:            not tested

Hi,

I looked at v13, focusing on apply/build status and relation-lock behavior for
reused generic plans after initial partition pruning.

I used the v13 series from Amit's 2026-05-29 message, on origin/master at
4b0bf0788b066a4ca1d4f959566678e44ec93422.

The series applies cleanly with git am, and git diff --check reports no
issues.

I first built with:

./configure --prefix="$PWD/pg-install" --without-readline --without-zlib --without-icu
make -s -j8
make -s install

make -C src/test/regress check

passed; all 245 tests passed, including plancache and partition_prune.

I also built a cassert/debug tree with:

./configure --prefix="$PWD/pg-install" --without-readline --without-zlib --without-icu --enable-cassert --enable-debug 'CFLAGS=-O0 -g'
make -s -j8
make -s install

and ran:

make -C src/test/regress check

which also passed; all 245 tests passed.

For the lock behavior, I used a list-partitioned table with force_generic_plan.
After the generic plan had been built and then reused, EXECUTE held only the
matching child partition lock. For example, EXECUTE q(1) held only the
following child lock:

manual_prunelock_p1

EXPLAIN EXECUTE behaved the same way on a reused generic plan; EXPLAIN EXECUTE
q(2) removed the other subplans and held only the following child lock:

manual_prunelock_p2

With enable_partition_pruning = off and a newly prepared statement, executing
the same SELECT held all child partition locks:

manual_prunelock_p1, manual_prunelock_p2, manual_prunelock_p3

I also ran a bounded cassert/debug stress check around plan invalidation. It
did 20 cycles where a child index was created and dropped before EXECUTE, and
20 similar cycles before EXPLAIN EXECUTE. In each cycle, the first execution
after invalidation/replanning held all child partition locks, and the next
execution reusing the generic plan held only the matching child partition lock.
That matches my reading that the patch is reducing locks for reused generic
plans, not for the execution that has to rebuild the plan.

One behavior I wanted to confirm: prepared UPDATE execution still held all
child partition locks in my manual check, including on the second execution
where the generic plan was being reused.

The test was:

prepare upd(int, text) as
  update stress_prunelock_p set b = $2 where a = $1;

Then both:

execute upd(3, 'updated-row-3');

and an all-pruned value:

execute upd(99, 'no-row');

held:

stress_prunelock_p1, stress_prunelock_p2, stress_prunelock_p3,
stress_prunelock_p4

pg_prepared_statements showed generic_plans increasing for this prepared
statement, so this was not a custom-plan case.

Is this expected for ModifyTable/result relations in v13, or did I miss an
eligibility condition that prevents pruning-aware locking from being used for
this prepared UPDATE case? I saw the recent firstResultRels discussion, but I
was not sure whether those changes are intended only to make pruned
result-relation initialization safe, or whether actual prepared DML execution
is expected to see reduced child partition locking as well.

I did not review the broader plancache refactoring design, did not run
installcheck-world, and did not test concurrent DDL from a separate session.

Regards,
Ilmar Yunusov

The new status of this patch is: Waiting on Author


* Re: generic plans and "initial" pruning
@ 2026-06-04 00:25  Amit Langote <[email protected]>
  parent: Ilmar Yunusov <[email protected]>
  0 siblings, 0 replies; 108+ messages in thread

From: Amit Langote @ 2026-06-04 00:25 UTC (permalink / raw)
  To: Ilmar Yunusov <[email protected]>; +Cc: [email protected]

Hi Ilmar,

On Wed, Jun 3, 2026 at 2:55 AM Ilmar Yunusov <[email protected]> wrote:
>
> I looked at v13, focusing on apply/build status and relation-lock behavior for
> reused generic plans after initial partition pruning.
>
> I used the v13 series from Amit's 2026-05-29 message, on origin/master at
> 4b0bf0788b066a4ca1d4f959566678e44ec93422.
>
> The series applies cleanly with git am, and git diff --check reports no
> issues.
>
> I first built with:
>
> ./configure --prefix="$PWD/pg-install" --without-readline --without-zlib --without-icu
> make -s -j8
> make -s install
>
> make -C src/test/regress check
>
> passed; all 245 tests passed, including plancache and partition_prune.
>
> I also built a cassert/debug tree with:
>
> ./configure --prefix="$PWD/pg-install" --without-readline --without-zlib --without-icu --enable-cassert --enable-debug 'CFLAGS=-O0 -g'
> make -s -j8
> make -s install
>
> and ran:
>
> make -C src/test/regress check
>
> which also passed; all 245 tests passed.
>
> For the lock behavior, I used a list-partitioned table with force_generic_plan.
> After the generic plan had been built and then reused, EXECUTE held only the
> matching child partition lock. For example, EXECUTE q(1) held only the
> following child lock:
>
> manual_prunelock_p1
>
> EXPLAIN EXECUTE behaved the same way on a reused generic plan; EXPLAIN EXECUTE
> q(2) removed the other subplans and held only the following child lock:
>
> manual_prunelock_p2
>
> With enable_partition_pruning = off and a newly prepared statement, executing
> the same SELECT held all child partition locks:
>
> manual_prunelock_p1, manual_prunelock_p2, manual_prunelock_p3
>
> I also ran a bounded cassert/debug stress check around plan invalidation. It
> did 20 cycles where a child index was created and dropped before EXECUTE, and
> 20 similar cycles before EXPLAIN EXECUTE. In each cycle, the first execution
> after invalidation/replanning held all child partition locks, and the next
> execution reusing the generic plan held only the matching child partition lock.
> That matches my reading that the patch is reducing locks for reused generic
> plans, not for the execution that has to rebuild the plan.

Thanks for the thorough testing.

> One behavior I wanted to confirm: prepared UPDATE execution still held all
> child partition locks in my manual check, including on the second execution
> where the generic plan was being reused.
>
> The test was:
>
> prepare upd(int, text) as
>   update stress_prunelock_p set b = $2 where a = $1;
>
> Then both:
>
> execute upd(3, 'updated-row-3');
>
> and an all-pruned value:
>
> execute upd(99, 'no-row');
>
> held:
>
> stress_prunelock_p1, stress_prunelock_p2, stress_prunelock_p3,
> stress_prunelock_p4
>
> pg_prepared_statements showed generic_plans increasing for this prepared
> statement, so this was not a custom-plan case.
>
> Is this expected for ModifyTable/result relations in v13, or did I miss an
> eligibility condition that prevents pruning-aware locking from being used for
> this prepared UPDATE case? I saw the recent firstResultRels discussion, but I
> was not sure whether those changes are intended only to make pruned
> result-relation initialization safe, or whether actual prepared DML execution
> is expected to see reduced child partition locking as well.

Yes, this is expected; the pruning-aware path currently only kicks in
for the portal strategy used by SELECT. I hadn't noticed that
UPDATE/DELETE ends up on a different strategy that bypasses the new
pruning-aware locking path. I need to think about how best to handle
this; the DML portal strategies defer executor startup to a later
point, so it may require some restructuring.

-- 
Thanks, Amit Langote

end of thread

Thread overview: 108+ messages
2021-12-25 03:36 generic plans and "initial" pruning Amit Langote <[email protected]>
2021-12-28 13:12 ` Ashutosh Bapat <[email protected]>
2021-12-31 02:26   ` Amit Langote <[email protected]>
2022-02-10 08:13 ` Amit Langote <[email protected]>
2022-02-10 22:01   ` Robert Haas <[email protected]>
2022-03-07 14:18     ` Amit Langote <[email protected]>
2022-03-11 14:35       ` Amit Langote <[email protected]>
2022-03-11 15:06         ` Amit Langote <[email protected]>
2022-03-14 18:42         ` Robert Haas <[email protected]>
2022-03-14 19:38           ` Tom Lane <[email protected]>
2022-03-14 20:06             ` Robert Haas <[email protected]>
2022-03-15 06:19               ` Amit Langote <[email protected]>
2022-03-22 12:44                 ` Amit Langote <[email protected]>
2022-03-28 07:17                   ` Amit Langote <[email protected]>
2022-03-28 07:28                     ` Amit Langote <[email protected]>
2022-03-31 03:25                       ` Amit Langote <[email protected]>
2022-03-31 09:56                         ` Alvaro Herrera <[email protected]>
2022-03-31 11:11                           ` Amit Langote <[email protected]>
2022-04-01 01:31                         ` David Rowley <[email protected]>
2022-04-01 03:09                           ` Amit Langote <[email protected]>
2022-04-01 03:45                             ` Tom Lane <[email protected]>
2022-04-01 07:01                               ` Amit Langote <[email protected]>
2022-04-01 04:08                             ` David Rowley <[email protected]>
2022-04-01 06:58                               ` Amit Langote <[email protected]>
2022-04-01 08:19                                 ` David Rowley <[email protected]>
2022-04-01 08:36                                   ` Amit Langote <[email protected]>
2022-04-06 07:20                                     ` Amit Langote <[email protected]>
2022-04-07 08:27                                       ` Amit Langote <[email protected]>
2022-04-07 12:41                                         ` David Rowley <[email protected]>
2022-04-08 05:49                                           ` Amit Langote <[email protected]>
2022-04-08 11:15                                             ` David Rowley <[email protected]>
2022-04-08 11:45                                               ` Amit Langote <[email protected]>
2022-04-11 03:05                                                 ` Amit Langote <[email protected]>
2022-04-11 03:58                                                   ` Zhihong Yu <[email protected]>
2022-05-27 08:09                                                     ` Amit Langote <[email protected]>
2022-05-27 20:08                                                       ` Zhihong Yu <[email protected]>
2022-07-05 17:43                                                       ` Jacob Champion <[email protected]>
2022-07-06 02:37                                                         ` Amit Langote <[email protected]>
2022-07-13 06:40                                                           ` Amit Langote <[email protected]>
2022-07-13 07:03                                                             ` Amit Langote <[email protected]>
2022-07-27 03:00                                                               ` Amit Langote <[email protected]>
2022-07-27 16:27                                                                 ` Robert Haas <[email protected]>
2022-07-29 04:20                                                                   ` Amit Langote <[email protected]>
2022-10-12 07:36                                                                     ` Amit Langote <[email protected]>
2022-10-17 09:29                                                                       ` Amit Langote <[email protected]>
2022-10-27 02:41                                                                         ` Amit Langote <[email protected]>
2022-11-08 06:22                                                                           ` Amit Langote <[email protected]>
2022-11-30 18:12                                                                             ` Alvaro Herrera <[email protected]>
2022-12-01 07:59                                                                               ` Amit Langote <[email protected]>
2022-12-01 11:21                                                                                 ` Alvaro Herrera <[email protected]>
2022-12-01 12:43                                                                                   ` Amit Langote <[email protected]>
2022-12-02 10:40                                                                                     ` Amit Langote <[email protected]>
2022-12-05 03:00                                                                                       ` Amit Langote <[email protected]>
2022-12-05 06:08                                                                                         ` Amit Langote <[email protected]>
2022-12-06 19:00                                                                                           ` Alvaro Herrera <[email protected]>
2022-12-09 08:26                                                                                             ` Amit Langote <[email protected]>
2022-12-09 09:52                                                                                               ` Alvaro Herrera <[email protected]>
2022-12-09 10:34                                                                                                 ` Amit Langote <[email protected]>
2022-12-09 10:49                                                                                                   ` Alvaro Herrera <[email protected]>
2022-12-09 11:02                                                                                                     ` Amit Langote <[email protected]>
2022-12-09 11:37                                                                                                       ` Alvaro Herrera <[email protected]>
2022-12-12 11:19                                                                                                         ` Amit Langote <[email protected]>
2022-12-12 17:24                                                                                                           ` Alvaro Herrera <[email protected]>
2022-12-14 08:35                                                                                                             ` Amit Langote <[email protected]>
2022-12-16 02:33                                                                                                               ` Amit Langote <[email protected]>
2022-12-21 10:18                                                                                                                 ` Alvaro Herrera <[email protected]>
2022-12-21 10:47                                                                                                                   ` Amit Langote <[email protected]>
2022-12-21 15:18                                                                                                                   ` Tom Lane <[email protected]>
2022-07-29 04:55                                                           ` Tom Lane <[email protected]>
2022-07-29 12:22                                                             ` Robert Haas <[email protected]>
2022-07-29 16:47                                                               ` Tom Lane <[email protected]>
2022-07-29 16:55                                                                 ` Robert Haas <[email protected]>
2022-07-29 15:04                                                             ` Tom Lane <[email protected]>
2022-07-29 15:56                                                               ` Robert Haas <[email protected]>
2025-05-20 03:06 Re: generic plans and "initial" pruning Tom Lane <[email protected]>
2025-05-20 07:59 ` Tomas Vondra <[email protected]>
2025-05-21 10:22   ` Amit Langote <[email protected]>
2025-05-20 13:25 ` Amit Langote <[email protected]>
2025-05-20 15:38   ` Tom Lane <[email protected]>
2025-05-21 10:22     ` Amit Langote <[email protected]>
2025-05-22 08:12       ` Amit Langote <[email protected]>
2025-05-22 13:04         ` Tomas Vondra <[email protected]>
2025-05-23 02:17           ` Amit Langote <[email protected]>
2025-06-20 12:30         ` Amit Langote <[email protected]>
2025-07-17 12:11           ` Amit Langote <[email protected]>
2025-07-22 06:43             ` Amit Langote <[email protected]>
2025-11-12 14:17               ` Amit Langote <[email protected]>
2025-11-17 12:50                 ` Amit Langote <[email protected]>
2025-11-20 07:30                   ` Amit Langote <[email protected]>
2025-11-23 12:17                     ` Tender Wang <[email protected]>
2025-11-25 01:56                       ` Amit Langote <[email protected]>
2025-11-24 03:29                     ` Chao Li <[email protected]>
2025-11-25 08:31                       ` Amit Langote <[email protected]>
2026-03-07 09:54                         ` Amit Langote <[email protected]>
2026-03-09 04:41                           ` Amit Langote <[email protected]>
2026-03-19 17:20                             ` Amit Langote <[email protected]>
2026-03-25 07:39                               ` Amit Langote <[email protected]>
2026-03-26 09:24                                 ` Amit Langote <[email protected]>
2026-03-27 09:00                                   ` Amit Langote <[email protected]>
2026-04-04 12:10                                     ` Amit Langote <[email protected]>
2026-05-27 12:03                                       ` Thom Brown <[email protected]>
2026-05-28 08:13                                         ` Amit Langote <[email protected]>
2026-05-28 13:13                                           ` Thom Brown <[email protected]>
2026-05-29 08:56                                             ` Amit Langote <[email protected]>
2026-05-29 10:30                                               ` Thom Brown <[email protected]>
2026-06-02 17:54                                                 ` Ilmar Yunusov <[email protected]>
2026-06-04 00:25                                                   ` Amit Langote <[email protected]>
2025-05-22 13:50     ` Robert Haas <[email protected]>
