From: Amit Langote <[email protected]>
To: Thom Brown <[email protected]>
Cc: Chao Li <[email protected]>
Cc: Tom Lane <[email protected]>
Cc: Tender Wang <[email protected]>
Cc: Alexander Lakhin <[email protected]>
Cc: Tomas Vondra <[email protected]>
Cc: Robert Haas <[email protected]>
Cc: Alvaro Herrera <[email protected]>
Cc: Andres Freund <[email protected]>
Cc: Daniel Gustafsson <[email protected]>
Cc: David Rowley <[email protected]>
Cc: PostgreSQL Hackers <[email protected]>
Subject: Re: generic plans and "initial" pruning
Date: Fri, 29 May 2026 17:56:51 +0900
Message-ID: <CA+HiwqGq2S+NL3Q8sYh2u9XLXmNYTy-Z6HqmTW4VHUahB=yqjw@mail.gmail.com> (raw)
In-Reply-To: <CAA-aLv4EdaMXs+NkZYEHMUr7qcLbJweg=uNwqQq43KC+ycuk6A@mail.gmail.com>
References: <CA+HiwqFpZ80UJKr4tZus4Omgg7YESzFXKSwSHRW2Ap2=XSVyUA@mail.gmail.com>
	<[email protected]>
	<CA+HiwqH8N-SxEB6SysEBsYNgV_KJs66k9Z2SNmqVzbBP-60yWg@mail.gmail.com>
	<[email protected]>
	<CA+HiwqEmG9YCQvG6uux7sO=jKFSAW6hA4Ea-ymfD+JhJAe4PWQ@mail.gmail.com>
	<CA+HiwqE2FfJfH=siLiR3kJ13tmXZORAGTWsZc2r52o1_5BDv+g@mail.gmail.com>
	<[email protected]>
	<CA+HiwqFhkpXHAA=4NY5SqYXX08uq=nYtXcSByNZF=2MAy1UA7A@mail.gmail.com>
	<CA+HiwqHCcSoYfpMjFshaU1bj6NjreiDvMSDpVSeBmqk-kbWrPw@mail.gmail.com>
	<CA+HiwqHOejJk0_qMuM5g38h70hY_JvHMAKwnH3k=urfTXauPQA@mail.gmail.com>
	<CA+HiwqFsGKM82oaMby3VWYXf_XFpDAMeT+6SXgj-45HpTrS1dA@mail.gmail.com>
	<CA+HiwqFA5hUWYktt3VMh4zQOYMxqH-MpdX8eemfM+o-9dY-zbQ@mail.gmail.com>
	<CA+HiwqEn7bbUXaXO=SmUujBjJSHfS31cwQroHRBwT0sR=66bgg@mail.gmail.com>
	<CA+HiwqGGLDTd1ZTK1c0zv4La7XOVSVMqBuNtscJeh6FyUQvFvA@mail.gmail.com>
	<CA+HiwqE2JFiqqrXdyJVQWY-fMGwzDkLqjXQdUKbPaCpDpxd_2g@mail.gmail.com>
	<CA+HiwqFp3jZGSz==QjeuV62_62F6+V6b62=Uqvy99sW_gsgWBA@mail.gmail.com>
	<CAHewXNkUz9XGG8nnoxZaw35e+5bQVVP=eeJE4cW4V2e+P9ndFw@mail.gmail.com>
	<CA+HiwqFKSpfYruzcVz-5CcFxg7gMa+ycXjMa2aPz_J_P4LGXTg@mail.gmail.com>
	<[email protected]>
	<CA+HiwqEQ1oME-hcDXwC9rGQb=u7MdUFG3Sc=Qg27uH480v10FA@mail.gmail.com>
	<[email protected]>
	<CA+HiwqGXMLSQyJvynWF40yNwBAx-pXtxemReP8L+C+kaUa5v5A@mail.gmail.com>
	<CA+HiwqGBfMgcxokEH_mg6s=ttLFm54dj4hT6yXydU2t0g6oQ3g@mail.gmail.com>
	<CA+HiwqEEkGfMc_LSJhfz96o-czVS4B59Vhw6i1_t58ZGqhP8VA@mail.gmail.com>
	<CA+HiwqHAd+9nptjxP6=KrcKA1BMsS6pbB3B2oaojwdyH_wBWCA@mail.gmail.com>
	<CA+HiwqE7_YpU--EsrhvNqcZ+10+92EGFaX5609AUJb9ENLntnQ@mail.gmail.com>
	<CA+HiwqEF9SgKyQ1HrYOURpv8DGRGHDNwBT9Y6yEBVCW+=kh_=w@mail.gmail.com>
	<CA+HiwqFpEHBjosRackQhm6yKKnHgqm8Ewkn=qsctT1N0PqVSrg@mail.gmail.com>
	<CA+HiwqGJP91Qed0EjuB72Lv4_QAiVOMYjya7GA0aas8K6NZUZA@mail.gmail.com>
	<[email protected]>
	<CA+HiwqE7LDSoaF024Mt9v1Gt-uE-WoT9GawC5ds45SaPczV8Qw@mail.gmail.com>
	<CA+HiwqGn38DsKgMYKWZ6jyv3_oqCSB0j+XucTjNM0S+BFsQpVA@mail.gmail.com>
	<CA+HiwqGFNe7kBkKZm0KtG_CFfw-ciK659SJMGP0CWVaa2q8rmw@mail.gmail.com>
	<CA+HiwqELAcgVg_3Gb4VTOpC6wcNhHP0m-8OJFG0MeGRo0M=_4Q@mail.gmail.com>
	<CA+HiwqHBxDL=3qQa1f-sBOBZqB88EiVAiagXF3X8Kagpr6Yhpw@mail.gmail.com>
	<CA+HiwqFx0kmGqSDcLrE37KkHS2T9O1NoBitZT4mA4yJBBt_QjA@mail.gmail.com>
	<CA+HiwqGq=xQvE0oCeOX_oXWq2iyNs5q9UwopyQ2uXF2kJPXTDg@mail.gmail.com>
	<CA+HiwqHN9x7ufTz3EfAA3-Zq3NOTeZMKtBatmevMesybwBUaAw@mail.gmail.com>
	<CA+HiwqGAT8jKSgjsfPvW2Ft=5xWCCfq05j9=jJKxP34Qqe68Pg@mail.gmail.com>
	<CAA-aLv5+dSSQ7KKZPgnysnFOTEXkFKYbeqSWk5Qu61_3Vd8aJw@mail.gmail.com>
	<CA+HiwqGqAHhJmJn5=n9363R8UkcTpu3Uxj4Q2DmuG527ERDt8A@mail.gmail.com>
	<CAA-aLv4EdaMXs+NkZYEHMUr7qcLbJweg=uNwqQq43KC+ycuk6A@mail.gmail.com>

On Thu, May 28, 2026 at 10:14 PM Thom Brown <[email protected]> wrote:
> On Thu, 28 May 2026 at 09:14, Amit Langote <[email protected]> wrote:
> > It's a real bug.
> >
> > You're right that if PortalLockCachedPlan() replans, the QueryDesc
> > created before the call still points at the old PlannedStmt from the
> > released plan.  And yes, 0004 happens to fix it by rebuilding the
> > QueryDesc inside PortalLockCachedPlan(), but 0001 through 0003 are
> > broken on their own.
> >
> > Attached is an updated set with the fix: CreateQueryDesc now runs
> > after PortalLockCachedPlan() returns, as you suggested.  That said,
> > I'll probably focus first on settling the plancache refactoring that
> > spun off from this thread [1], and then start a new thread for the
> > pruning-aware locking work on top of it, incorporating parts of this
> > series.
>
> Thanks.
>
> I've done another pass. I see a reference to
> AcquireExecutorLocksUnpruned(), but I can't find this function. Is
> this supposed to be AcquireExecutorLocksPrepared()?

You're right, stale comment. It should say
AcquireExecutorLocksPrepared(). Fixed.

> And also I have a question about the new firstResultRels code
>
> If I've followed it right, the bit in setrefs.c records the
> lowest-numbered RT index from leaf_result_relids as the
> per-ModifyTable fallback that's used when all real targets get pruned
> away, and the executor side looks it up via
> linitial_int(node->resultRelations). For that to work those two have
> to pick the same RT index, and the comment justifies it with
> "partition expansion preserves RT index order". Where is that
> preservation guaranteed?

The ordering comes from expand_inherited_rtentry(), which adds child
partitions to the range table sequentially in partition bound order.
Since ModifyTable.resultRelations is built from the same expansion,
its first element is the lowest-numbered RT index among the leaf
partitions for that node. That is the same value
bms_next_member(leaf_result_relids, -1) returns from the Bitmapset,
because Bitmapset iteration returns members in ascending order. I've
added a comment in setrefs.c pointing to expand_inherited_rtentry() as
the source of this guarantee.
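
As a rough illustration of why the two lookups agree, here is a toy
model in plain C. It is not code from the patch: ToySet,
toy_add_member(), toy_next_member() and first_matches_lowest() are
hypothetical stand-ins, and the set is limited to members 0..63. It
only shows that when a list is built in ascending order, its first
element equals the lowest member reported by ascending set iteration.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for a Bitmapset, limited to members 0..63 (hypothetical). */
typedef uint64_t ToySet;

static ToySet
toy_add_member(ToySet s, int x)
{
	return s | ((uint64_t) 1 << x);
}

/*
 * Smallest member greater than prev, or -2 if there is none.  This mimics
 * the calling convention of bms_next_member(), where passing -1 asks for
 * the lowest member.
 */
static int
toy_next_member(ToySet s, int prev)
{
	for (int i = prev + 1; i < 64; i++)
	{
		if (s & ((uint64_t) 1 << i))
			return i;
	}
	return -2;
}

/*
 * Returns 1 if the first element of rels[] equals the lowest member of the
 * set built from the same elements, which holds whenever rels[] is in
 * ascending order.
 */
static int
first_matches_lowest(const int *rels, int n)
{
	ToySet		s = 0;

	for (int i = 0; i < n; i++)
		s = toy_add_member(s, rels[i]);

	return n > 0 && rels[0] == toy_next_member(s, -1);
}
```

With rels[] = {5, 6, 7} the check passes; with an out-of-order list
such as {6, 5, 7} it fails, which is why the ordering guarantee from
partition expansion matters.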

> And with the assertion in ExecInitModifyTable:
>
> Assert(list_member_int(estate->es_plannedstmt->firstResultRels, rti));
>
> With writable CTEs producing more than one ModifyTable node the list
> has several entries, so all the assert really checks is that some
> recorded entry matches, not that the one recorded for this particular
> node matches. If that's correct, then in a case where the wrong entry
> happened to line up the right relation wouldn't be locked and nothing
> would complain. Is there something that keeps these in order
> somewhere?

This is a fair observation -- the Assert checks membership in the
global list rather than per-node correspondence. But node A's rti
can't accidentally pass the Assert by matching an entry recorded for
node B. Each ModifyTable node gets its own partition expansion with
distinct RT entries. In a writable CTE like:

  WITH upd1 AS (UPDATE t SET ...),
       upd2 AS (UPDATE t SET ...)
  UPDATE t SET ...

each UPDATE creates a separate set of leaf partition RT entries --
upd1 might get RT indexes 5,6,7, upd2 gets 8,9,10, and the main UPDATE
gets 11,12,13. The global firstResultRels list would be [5, 8, 11].
When ExecInitModifyTable falls back to linitial_int(resultRelations)
for a given node, it finds that node's own entry, because the RT index
sets are disjoint across nodes.
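
To make the difference between membership and per-node correspondence
concrete, here is a small sketch in plain C using the example numbers
above. position_of() and matches_own_entry() are hypothetical helpers,
not functions in the patch, and the sketch assumes purely for
illustration that firstResultRels entries are recorded in node order.

```c
#include <assert.h>
#include <stddef.h>

/* Position of rti in list[], or -1 when it is absent. */
static int
position_of(const int *list, size_t n, int rti)
{
	for (size_t i = 0; i < n; i++)
	{
		if (list[i] == rti)
			return (int) i;
	}
	return -1;
}

/*
 * Returns 1 when rti is found at the position belonging to the given node,
 * a stricter test than bare membership.  Assumes entries are recorded in
 * node order (an assumption made for this sketch only).
 */
static int
matches_own_entry(const int *first_rels, size_t n, size_t node, int rti)
{
	return position_of(first_rels, n, rti) == (int) node;
}
```

For first_rels = {5, 8, 11}, node 0 matches with rti 5, node 1 with 8
and node 2 with 11. Because the per-node RT index sets are disjoint, a
non-first index such as 6 is absent from the list altogether, so it
cannot pass even the weaker membership test.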

That said, it's worth being explicit about what protections exist at
each layer, since this is safety-critical code:

1. AcquireExecutorLocksPrepared(), added by 0004, locks every entry in
firstResultRels unconditionally. So regardless of which rti a
ModifyTable node falls back to, the relation will be locked.

2. ExecGetRangeTableRelation() has two checks when opening a relation.
For non-result relations (isResultRel=false), it checks
es_unpruned_relids and raises an ERROR in release builds if the
relation was pruned. For result relations (isResultRel=true), that
check is intentionally skipped -- it has to be, because at least one
result relation per ModifyTable node must remain openable even when
all partitions are pruned, since executor code paths like ExecMerge()
and ExecInitPartitionInfo() rely on resultRelInfo[0] being initialized
(see commit 28317de723b). The remaining protection for result
relations is Assert(CheckRelationLockedByMe()) inside table_open,
which fires in debug builds.

3. I've tightened ExecInitModifyTable to close this gap: the
all-pruned fallback path now raises an elog(ERROR) in release builds
if linitial_int(resultRelations) is not found in firstResultRels,
rather than just an Assert. This gives result relations a
production-visible check comparable to what es_unpruned_relids
provides for scan relations.

So the net effect is that for scan relations, opening a
pruned-and-unlocked relation is caught by an ERROR in production via
es_unpruned_relids. For result relations on the all-pruned fallback
path, it's now also caught by an ERROR in production via the
firstResultRels check in ExecInitModifyTable. The locking in
AcquireExecutorLocksPrepared() ensures the relation is always locked
regardless.
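
The two production-visible checks can be modelled roughly as follows.
This is a toy sketch in plain C, not the executor code: OpenResult,
open_scan_rel() and open_fallback_result_rel() are hypothetical names,
the unpruned set is a 64-bit mask limited to RT indexes 0..63, and an
"error" is just a return value rather than elog(ERROR).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef enum
{
	OPEN_OK,
	OPEN_ERROR
} OpenResult;

/* True if rti is a member of the toy unpruned-relids mask. */
static bool
in_set(uint64_t set, int rti)
{
	return ((set >> rti) & 1) != 0;
}

/* True if rti appears in list[]. */
static bool
in_list(const int *list, size_t n, int rti)
{
	for (size_t i = 0; i < n; i++)
	{
		if (list[i] == rti)
			return true;
	}
	return false;
}

/* Scan relation: rejected when it is absent from the unpruned set. */
static OpenResult
open_scan_rel(uint64_t unpruned, int rti)
{
	return in_set(unpruned, rti) ? OPEN_OK : OPEN_ERROR;
}

/*
 * Result relation on the all-pruned fallback path: rejected when the
 * fallback RT index was not recorded in the first-result-rels list.
 */
static OpenResult
open_fallback_result_rel(const int *first_rels, size_t n, int rti)
{
	return in_list(first_rels, n, rti) ? OPEN_OK : OPEN_ERROR;
}
```

In this model a scan relation with RT index 6 is rejected when only 3
and 4 are unpruned, and a fallback result relation with RT index 9 is
rejected when the recorded list is {5, 8, 11}.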

Thanks again for the review.  A close look at these aspects by someone
other than me is very useful.

-- 
Thanks, Amit Langote


Attachments:

  [application/octet-stream] v13-0003-Introduce-ExecutorPrep-and-refactor-executor-sta.patch (8.8K, 2-v13-0003-Introduce-ExecutorPrep-and-refactor-executor-sta.patch)
From 05c92346e2bec4c8ec9a7cf45ec572c15d64481f Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Thu, 26 Mar 2026 16:08:46 +0900
Subject: [PATCH v13 3/4] Introduce ExecutorPrep and refactor executor startup

Move permission checks, range table initialization, and initial
partition pruning out of InitPlan() into a new ExecutorPrep()
helper.

ExecutorStart() invokes ExecutorPrep() when QueryDesc->estate is
NULL, keeping current behavior unchanged.  If QueryDesc->estate is
already set, ExecutorStart() reuses it.

This is preparatory refactoring only.  No caller outside the
executor supplies a prebuilt EState in this commit.

In assert builds, verify that the expected relation locks are held
when entering ExecutorStart().
---
 src/backend/executor/README     |  10 ++-
 src/backend/executor/execMain.c | 152 ++++++++++++++++++++++++++------
 src/include/executor/execdesc.h |   2 +-
 3 files changed, 132 insertions(+), 32 deletions(-)

diff --git a/src/backend/executor/README b/src/backend/executor/README
index 54f4782f31b..890bc3d9333 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -291,11 +291,17 @@ Query Processing Control Flow
 
 This is a sketch of control flow for full query processing:
 
+    ExecutorPrep
+		May be run before ExecutorStart, or implicitly from ExecutorStart
+		if not done earlier.  Creates the EState in QueryDesc, performs
+		range table initialization, permission checks, and initial
+		partition pruning.
+
 	CreateQueryDesc
 
 	ExecutorStart
-		CreateExecutorState
-			creates per-query context
+		ExecutorPrep (if QueryDesc.estate is NULL)
+			creates EState and per-query context
 		switch to per-query context to run ExecInitNode
 		AfterTriggerBeginQuery
 		ExecInitNode --- recursively scans plan tree
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 4b30f768680..2b9397b72f3 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -57,6 +57,7 @@
 #include "parser/parse_relation.h"
 #include "pgstat.h"
 #include "rewrite/rewriteHandler.h"
+#include "storage/lmgr.h"
 #include "tcop/utility.h"
 #include "utils/acl.h"
 #include "utils/backend_status.h"
@@ -76,6 +77,7 @@ ExecutorEnd_hook_type ExecutorEnd_hook = NULL;
 ExecutorCheckPerms_hook_type ExecutorCheckPerms_hook = NULL;
 
 /* decls for local routines only used within this module */
+static void ExecutorPrep(QueryDesc *queryDesc, ResourceOwner owner, int eflags);
 static void InitPlan(QueryDesc *queryDesc, int eflags);
 static void CheckValidRowMarkRel(Relation rel, RowMarkType markType);
 static void ExecPostprocessPlan(EState *estate);
@@ -147,7 +149,6 @@ standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
 
 	/* sanity checks: queryDesc must not be started already */
 	Assert(queryDesc != NULL);
-	Assert(queryDesc->estate == NULL);
 
 	/* caller must ensure the query's snapshot is active */
 	Assert(GetActiveSnapshot() == queryDesc->snapshot);
@@ -173,9 +174,67 @@ standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
 
 	/*
 	 * Build EState, switch into per-query memory context for startup.
-	 */
-	estate = CreateExecutorState();
-	queryDesc->estate = estate;
+	 *
+	 * If ExecutorPrep() ran earlier (e.g., to do initial pruning during plan
+	 * validity checking), reuse its EState to avoid redoing range table setup
+	 * and pruning. Otherwise, create a fresh EState as usual.
+	 *
+	 * In assert builds, verify that the expected locks are held.  When no
+	 * prep EState was provided, AcquireExecutorLocks() should have locked
+	 * every relation in the plan.  When one was provided, pruning-aware
+	 * locking should have locked at least the unpruned relations.  Both
+	 * checks are skipped in parallel workers, which acquire relation locks
+	 * lazily in ExecGetRangeTableRelation().
+	 */
+	if (queryDesc->estate == NULL)
+	{
+#ifdef USE_ASSERT_CHECKING
+		if (!IsParallelWorker())
+		{
+			ListCell   *lc;
+
+			foreach(lc, queryDesc->plannedstmt->rtable)
+			{
+				RangeTblEntry *rte = lfirst_node(RangeTblEntry, lc);
+
+				if (rte->rtekind == RTE_RELATION ||
+					(rte->rtekind == RTE_SUBQUERY && rte->relid != InvalidOid))
+					Assert(CheckRelationOidLockedByMe(rte->relid,
+													  rte->rellockmode,
+													  true));
+			}
+		}
+#endif
+		ExecutorPrep(queryDesc, CurrentResourceOwner, eflags);
+	}
+#ifdef USE_ASSERT_CHECKING
+	else
+	{
+		/*
+		 * A prep EState was provided, meaning pruning-aware locking should
+		 * have locked at least the unpruned relations.
+		 */
+		if (!IsParallelWorker())
+		{
+			int			rtindex = -1;
+
+			while ((rtindex = bms_next_member(queryDesc->estate->es_unpruned_relids,
+											  rtindex)) >= 0)
+			{
+				RangeTblEntry *rte = exec_rt_fetch(rtindex, queryDesc->estate);
+
+				Assert(rte->rtekind == RTE_RELATION ||
+					   (rte->rtekind == RTE_SUBQUERY &&
+						rte->relid != InvalidOid));
+				Assert(CheckRelationOidLockedByMe(rte->relid,
+												  rte->rellockmode, true));
+			}
+		}
+	}
+#endif
+
+	estate = queryDesc->estate;
+	Assert(estate);
 
 	oldcontext = MemoryContextSwitchTo(estate->es_query_cxt);
 
@@ -274,6 +333,64 @@ standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
 	MemoryContextSwitchTo(oldcontext);
 }
 
+/*
+ * ExecutorPrep
+ *
+ * Build the initial executor state for queryDesc before ExecutorStart().
+ *
+ * This creates the EState and performs the subset of executor startup that
+ * does not require plan-tree initialization, allowing that work to be reused
+ * by callers that need executor state before ExecutorStart():
+ *
+ * - initialize the range table
+ * - perform permission checks
+ * - perform initial partition pruning
+ *
+ * On success, queryDesc->estate is set and can later be reused by
+ * ExecutorStart() instead of rebuilding the same state.
+ *
+ * Caller must ensure that queryDesc->snapshot is active.
+ */
+static void
+ExecutorPrep(QueryDesc *queryDesc, ResourceOwner owner, int eflags)
+{
+	ResourceOwner oldowner;
+	EState	   *estate;
+	PlannedStmt *pstmt;
+
+	Assert(queryDesc != NULL);
+
+	if (queryDesc->operation == CMD_UTILITY)
+		return;
+
+	Assert(ActiveSnapshotSet());
+	Assert(GetActiveSnapshot() == queryDesc->snapshot);
+	Assert(queryDesc->estate == NULL);
+
+	pstmt = queryDesc->plannedstmt;
+
+	estate = CreateExecutorState();
+	queryDesc->estate = estate;
+
+	estate->es_plannedstmt = pstmt;
+	estate->es_part_prune_infos = pstmt->partPruneInfos;
+	estate->es_param_list_info = queryDesc->params;
+	estate->es_queryEnv = queryDesc->queryEnv;
+	estate->es_top_eflags = eflags;
+
+	ExecCheckPermissions(pstmt->rtable, pstmt->permInfos, true);
+
+	ExecInitRangeTable(estate, pstmt->rtable, pstmt->permInfos,
+					   bms_copy(pstmt->unprunableRelids));
+
+	oldowner = CurrentResourceOwner;
+	CurrentResourceOwner = owner;
+
+	ExecDoInitialPruning(estate);
+
+	CurrentResourceOwner = oldowner;
+}
+
 /* ----------------------------------------------------------------
  *		ExecutorRun
  *
@@ -849,37 +966,14 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 	CmdType		operation = queryDesc->operation;
 	PlannedStmt *plannedstmt = queryDesc->plannedstmt;
 	Plan	   *plan = plannedstmt->planTree;
-	List	   *rangeTable = plannedstmt->rtable;
 	EState	   *estate = queryDesc->estate;
 	PlanState  *planstate;
 	TupleDesc	tupType;
 	ListCell   *l;
 	int			i;
 
-	/*
-	 * Do permissions checks
-	 */
-	ExecCheckPermissions(rangeTable, plannedstmt->permInfos, true);
-
-	/*
-	 * initialize the node's execution state
-	 */
-	ExecInitRangeTable(estate, rangeTable, plannedstmt->permInfos,
-					   bms_copy(plannedstmt->unprunableRelids));
-
-	estate->es_plannedstmt = plannedstmt;
-	estate->es_part_prune_infos = plannedstmt->partPruneInfos;
-
-	/*
-	 * Perform runtime "initial" pruning to identify which child subplans,
-	 * corresponding to the children of plan nodes that contain
-	 * PartitionPruneInfo such as Append, will not be executed. The results,
-	 * which are bitmapsets of indexes of the child subplans that will be
-	 * executed, are saved in es_part_prune_results.  These results correspond
-	 * to each PartitionPruneInfo entry, and the es_part_prune_results list is
-	 * parallel to es_part_prune_infos.
-	 */
-	ExecDoInitialPruning(estate);
+	/* ExecutorPrep() must have been done. */
+	Assert(queryDesc->estate);
 
 	/*
 	 * Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index 37c2576e4bc..aea5ec8ea02 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -45,7 +45,7 @@ typedef struct QueryDesc
 	int			query_instr_options;	/* OR of InstrumentOption flags for
 										 * query_instr */
 
-	/* These fields are set by ExecutorStart */
+	/* These fields are set by ExecutorStart or ExecutorPrep */
 	TupleDesc	tupDesc;		/* descriptor for result tuples */
 	EState	   *estate;			/* executor's query-wide state */
 	PlanState  *planstate;		/* tree of per-plan-node state */
-- 
2.47.3



  [application/octet-stream] v13-0002-Refactor-executor-s-initial-partition-pruning-se.patch (7.3K, 3-v13-0002-Refactor-executor-s-initial-partition-pruning-se.patch)
From 29e5ad113f6974a94fbcf984b43fa3ed86f57632 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Wed, 25 Mar 2026 16:06:38 +0900
Subject: [PATCH v13 2/4] Refactor executor's initial partition pruning setup

Simplify handling of unpruned relids by moving responsibility
for recording them in EState into CreatePartitionPruneState(),
avoiding the need to pass all_leafpart_rtis as an out parameter.

Also move the setting of ecxt_param_exec_vals from
CreatePartitionPruneState() to InitExecPartitionPruneContexts(),
to allow the former to be called before PARAM_EXEC parameters are
set up.  A later commit needs this when running pruning state setup
outside of InitPlan().

No behavioral change.
---
 src/backend/executor/execPartition.c | 70 +++++++++++++++++++---------
 1 file changed, 48 insertions(+), 22 deletions(-)

diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index d96d4f9947b..2a3af006f77 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -185,8 +185,7 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
 static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
 static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
 static PartitionPruneState *CreatePartitionPruneState(EState *estate,
-													  PartitionPruneInfo *pruneinfo,
-													  Bitmapset **all_leafpart_rtis);
+													  PartitionPruneInfo *pruneinfo);
 static void InitPartitionPruneContext(PartitionPruneContext *context,
 									  List *pruning_steps,
 									  PartitionDesc partdesc,
@@ -1978,7 +1977,7 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
  * estate->es_part_prune_infos. For each entry, it creates a PartitionPruneState
  * and adds it to es_part_prune_states.  ExecInitPartitionExecPruning() accesses
  * these states through their corresponding indexes in es_part_prune_states and
- * assign each state to the parent node's PlanState, from where it will be used
+ * assigns each state to the parent node's PlanState, from where it will be used
  * for "exec" pruning.
  *
  * If initial pruning steps exist for a PartitionPruneInfo entry, this function
@@ -1996,29 +1995,31 @@ ExecDoInitialPruning(EState *estate)
 {
 	ListCell   *lc;
 
+	Assert(estate->es_part_prune_results == NULL);
 	foreach(lc, estate->es_part_prune_infos)
 	{
 		PartitionPruneInfo *pruneinfo = lfirst_node(PartitionPruneInfo, lc);
 		PartitionPruneState *prunestate;
 		Bitmapset  *validsubplans = NULL;
-		Bitmapset  *all_leafpart_rtis = NULL;
 		Bitmapset  *validsubplan_rtis = NULL;
 
 		/* Create and save the PartitionPruneState. */
-		prunestate = CreatePartitionPruneState(estate, pruneinfo,
-											   &all_leafpart_rtis);
+		prunestate = CreatePartitionPruneState(estate, pruneinfo);
 		estate->es_part_prune_states = lappend(estate->es_part_prune_states,
 											   prunestate);
 
 		/*
 		 * Perform initial pruning steps, if any, and save the result
-		 * bitmapset or NULL as described in the header comment.
+		 * bitmapset or NULL as described in the header comment.  RT indexes
+		 * of surviving partitions would be added to validsubplan_rtis.
+		 *
+		 * Note that when do_initial_prune is false,
+		 * CreatePartitionPruneState() would have already added the RT indexes
+		 * of all leaf partitions to es_unpruned_relids directly.
 		 */
 		if (prunestate->do_initial_prune)
 			validsubplans = ExecFindMatchingSubPlans(prunestate, true,
 													 &validsubplan_rtis);
-		else
-			validsubplan_rtis = all_leafpart_rtis;
 
 		estate->es_unpruned_relids = bms_add_members(estate->es_unpruned_relids,
 													 validsubplan_rtis);
@@ -2136,14 +2137,12 @@ ExecInitPartitionExecPruning(PlanState *planstate,
  * parent plan node's PlanState.
  *
  * If initial pruning steps are to be skipped (e.g., during EXPLAIN
- * (GENERIC_PLAN)), *all_leafpart_rtis will be populated with the RT indexes of
- * all leaf partitions whose scanning subnode is included in the parent plan
- * node's list of child plans. The caller must add these RT indexes to
- * estate->es_unpruned_relids.
+ * (GENERIC_PLAN)), the RT indexes of all leaf partitions whose scanning
+ * subnode is included in the parent plan node's list of child plans are
+ * added to estate->es_unpruned_relids.
  */
 static PartitionPruneState *
-CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo,
-						  Bitmapset **all_leafpart_rtis)
+CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo)
 {
 	PartitionPruneState *prunestate;
 	int			n_part_hierarchies;
@@ -2377,8 +2376,8 @@ CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo,
 													   pinfo->execparamids);
 
 			/*
-			 * Return all leaf partition indexes if we're skipping pruning in
-			 * the EXPLAIN (GENERIC_PLAN) case.
+			 * Add all leaf partition indexes to es_unpruned_relids if we're
+			 * skipping pruning in the EXPLAIN (GENERIC_PLAN) case.
 			 */
 			if (pinfo->initial_pruning_steps && !prunestate->do_initial_prune)
 			{
@@ -2390,9 +2389,28 @@ CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo,
 					Index		rtindex = pprune->leafpart_rti_map[part_index];
 
 					if (rtindex)
-						*all_leafpart_rtis = bms_add_member(*all_leafpart_rtis,
-															rtindex);
+						estate->es_unpruned_relids =
+							bms_add_member(estate->es_unpruned_relids, rtindex);
+				}
+			}
+			else if (pinfo->initial_pruning_steps == NIL)
+			{
+				/*
+				 * All partitions better be present in es_unpruned_relids when
+				 * none are initially prunable.
+				 */
+#ifdef USE_ASSERT_CHECKING
+				int			part_index = -1;
+
+				while ((part_index = bms_next_member(pprune->present_parts,
+													 part_index)) >= 0)
+				{
+					Index		rtindex = pprune->leafpart_rti_map[part_index];
+
+					if (rtindex)
+						Assert(bms_is_member(rtindex, estate->es_unpruned_relids));
 				}
+#endif
 			}
 
 			j++;
@@ -2490,9 +2508,10 @@ InitPartitionPruneContext(PartitionPruneContext *context,
  *		Initialize exec pruning contexts deferred by CreatePartitionPruneState()
  *
  * This function finalizes exec pruning setup for a PartitionPruneState by
- * initializing contexts for pruning steps that require the parent plan's
- * PlanState. It iterates over PartitionPruningData entries and sets up the
- * necessary execution contexts for pruning during query execution.
+ * initializing contexts for pruning steps that require PARAM_EXEC parameters
+ * and the parent plan's PlanState. It iterates over PartitionPruningData
+ * entries and sets up the necessary execution contexts for pruning during
+ * query execution.
  *
  * Also fix the mapping of partition indexes to subplan indexes contained in
  * prunestate by considering the new list of subplans that survived initial
@@ -2520,9 +2539,16 @@ InitExecPartitionPruneContexts(PartitionPruneState *prunestate,
 	bool		fix_subplan_map = false;
 
 	Assert(prunestate->do_exec_prune);
+	Assert(prunestate->econtext);
 	Assert(parent_plan != NULL);
 	estate = parent_plan->state;
 
+	/*
+	 * These might not be available when ExecCreatePartitionPruneState() is
+	 * called.
+	 */
+	prunestate->econtext->ecxt_param_exec_vals = estate->es_param_exec_vals;
+
 	/*
 	 * No need to fix subplans maps if initial pruning didn't eliminate any
 	 * subplans.
-- 
2.47.3



  [application/octet-stream] v13-0001-Move-execution-lock-acquisition-out-of-GetCached.patch (16.2K, 4-v13-0001-Move-execution-lock-acquisition-out-of-GetCached.patch)
From a3214580f2ce1983a111af07ccb092ba03c812c8 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Sat, 4 Apr 2026 18:38:34 +0900
Subject: [PATCH v13 1/4] Move execution lock acquisition out of
 GetCachedPlan()

GetCachedPlan() previously acquired execution locks on all plan
relations as part of cached plan validation.  Move this
responsibility to callers, making GetCachedPlan() return a valid
plan without holding execution locks.

Add AcquireExecutorLocks() as the caller-facing function: it locks
all relations in the plan, checks that the plan is still valid
afterward, and returns false if it was invalidated so the caller
can retry with a fresh plan.

For portal-backed callers, add PortalLockCachedPlan() in pquery.c
which wraps the lock-check-retry loop and handles the case where
replanning changes the portal strategy.  Store the CachedPlanSource
pointer in PortalData so retry can call GetCachedPlan() without
the caller threading it through.

Adjust all non-portal GetCachedPlan() callers (SPI, EXPLAIN
EXECUTE, SQL functions) to call AcquireExecutorLocks() explicitly
after fetching the plan.

No behavioral change.  This separates plan retrieval from execution
setup, allowing a later commit to substitute pruning-aware locking
for eligible plans.
---
 src/backend/commands/portalcmds.c   |  1 +
 src/backend/commands/prepare.c      | 14 +++++-
 src/backend/executor/functions.c    | 14 ++++--
 src/backend/executor/spi.c          | 22 +++++++--
 src/backend/tcop/postgres.c         |  2 +
 src/backend/tcop/pquery.c           | 70 ++++++++++++++++++++++++++++-
 src/backend/utils/cache/plancache.c | 44 +++++++++++++-----
 src/backend/utils/mmgr/portalmem.c  |  7 +++
 src/include/utils/plancache.h       |  1 +
 src/include/utils/portal.h          |  3 ++
 10 files changed, 157 insertions(+), 21 deletions(-)

diff --git a/src/backend/commands/portalcmds.c b/src/backend/commands/portalcmds.c
index 01efac3319e..cf5deec4943 100644
--- a/src/backend/commands/portalcmds.c
+++ b/src/backend/commands/portalcmds.c
@@ -118,6 +118,7 @@ PerformCursorOpen(ParseState *pstate, DeclareCursorStmt *cstmt, ParamListInfo pa
 					  queryString,
 					  CMDTAG_SELECT,	/* cursor's query is always a SELECT */
 					  list_make1(plan),
+					  NULL,
 					  NULL);
 
 	/*----------
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 876aad2100a..03d7a98fc58 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -207,6 +207,7 @@ ExecuteQuery(ParseState *pstate,
 					  query_string,
 					  entry->plansource->commandTag,
 					  plan_list,
+					  entry->plansource,
 					  cplan);
 
 	/*
@@ -632,8 +633,17 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	}
 
 	/* Replan if needed, and acquire a transient refcount */
-	cplan = GetCachedPlan(entry->plansource, paramLI,
-						  CurrentResourceOwner, pstate->p_queryEnv);
+	for (;;)
+	{
+		cplan = GetCachedPlan(entry->plansource, paramLI,
+							  CurrentResourceOwner,
+							  pstate->p_queryEnv);
+		plan_list = cplan->stmt_list;
+
+		if (AcquireExecutorLocks(cplan))
+			break;
+		ReleaseCachedPlan(cplan, CurrentResourceOwner);
+	}
 
 	INSTR_TIME_SET_CURRENT(planduration);
 	INSTR_TIME_SUBTRACT(planduration, planstart);
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index 88109348817..2afb814a435 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -654,6 +654,7 @@ static bool
 init_execution_state(SQLFunctionCachePtr fcache)
 {
 	CachedPlanSource *plansource;
+	CachedPlan *cplan;
 	execution_state *preves = NULL;
 	execution_state *lasttages = NULL;
 	int			nstmts;
@@ -696,10 +697,15 @@ init_execution_state(SQLFunctionCachePtr fcache)
 	 * CurrentResourceOwner will be the same when ShutdownSQLFunction runs.)
 	 */
 	fcache->cowner = CurrentResourceOwner;
-	fcache->cplan = GetCachedPlan(plansource,
-								  fcache->paramLI,
-								  fcache->cowner,
-								  NULL);
+	for (;;)
+	{
+		cplan = GetCachedPlan(plansource, fcache->paramLI,
+							  fcache->cowner, NULL);
+		if (AcquireExecutorLocks(cplan))
+			break;
+		ReleaseCachedPlan(cplan, fcache->cowner);
+	}
+	fcache->cplan = cplan;
 
 	/*
 	 * If necessary, make esarray[] bigger to hold the needed state.
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 52f3b11301c..268cd10bde8 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1686,6 +1686,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 					  query_string,
 					  plansource->commandTag,
 					  stmt_list,
+					  plansource,
 					  cplan);
 
 	/*
@@ -2106,6 +2107,16 @@ SPI_plan_get_cached_plan(SPIPlanPtr plan)
 						  _SPI_current->queryEnv);
 	Assert(cplan == plansource->gplan);
 
+	if (!AcquireExecutorLocks(cplan))
+	{
+		/* Plan invalidated during locking; get a fresh one. */
+		ReleaseCachedPlan(cplan,
+						  plan->saved ? CurrentResourceOwner : NULL);
+		cplan = GetCachedPlan(plansource, NULL,
+							  plan->saved ? CurrentResourceOwner : NULL,
+							  _SPI_current->queryEnv);
+	}
+
 	/* Pop the error context stack */
 	error_context_stack = spierrcontext.previous;
 
@@ -2574,9 +2585,14 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 		 * Replan if needed, and increment plan refcount.  If it's a saved
 		 * plan, the refcount must be backed by the plan_owner.
 		 */
-		cplan = GetCachedPlan(plansource, options->params,
-							  plan_owner, _SPI_current->queryEnv);
-
+		for (;;)
+		{
+			cplan = GetCachedPlan(plansource, options->params,
+								  plan_owner, _SPI_current->queryEnv);
+			if (AcquireExecutorLocks(cplan))
+				break;
+			ReleaseCachedPlan(cplan, plan_owner);
+		}
 		stmt_list = cplan->stmt_list;
 
 		/*
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index dbef734a93f..2929f158338 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1243,6 +1243,7 @@ exec_simple_query(const char *query_string)
 						  query_string,
 						  commandTag,
 						  plantree_list,
+						  NULL,
 						  NULL);
 
 		/*
@@ -2042,6 +2043,7 @@ exec_bind_message(StringInfo input_message)
 					  query_string,
 					  psrc->commandTag,
 					  cplan->stmt_list,
+					  psrc,
 					  cplan);
 
 	/* Portal is defined, set the plan ID based on its contents. */
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index ee731000820..4699b53cab7 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -59,6 +59,7 @@ static uint64 DoPortalRunFetch(Portal portal,
 							   long count,
 							   DestReceiver *dest);
 static void DoPortalRewind(Portal portal);
+static bool PortalLockCachedPlan(Portal portal);
 
 
 /*
@@ -463,6 +464,8 @@ PortalStart(Portal portal, ParamListInfo params,
 		 */
 		portal->strategy = ChoosePortalStrategy(portal->stmts);
 
+restart:
+
 		/*
 		 * Fire her up according to the strategy
 		 */
@@ -485,6 +488,21 @@ PortalStart(Portal portal, ParamListInfo params,
 				 * non-default nesting level for the snapshot.
 				 */
 
+				/*
+				 * If the portal is backed by a cached plan, acquire execution
+				 * locks via PortalLockCachedPlan().  If the plan is
+				 * invalidated during locking, it replans and may change the
+				 * portal strategy, requiring us to restart PortalStart().
+				 */
+				if (portal->cplan)
+				{
+					if (PortalLockCachedPlan(portal))
+					{
+						PopActiveSnapshot();
+						goto restart;
+					}
+				}
+
 				/*
 				 * Create QueryDesc in portal's context; for the moment, set
 				 * the destination to DestNone.
@@ -535,6 +553,11 @@ PortalStart(Portal portal, ParamListInfo params,
 
 			case PORTAL_ONE_RETURNING:
 			case PORTAL_ONE_MOD_WITH:
+				if (portal->cplan)
+				{
+					if (PortalLockCachedPlan(portal))
+						goto restart;
+				}
 
 				/*
 				 * We don't start the executor until we are told to run the
@@ -578,7 +601,20 @@ PortalStart(Portal portal, ParamListInfo params,
 				break;
 
 			case PORTAL_MULTI_QUERY:
-				/* Need do nothing now */
+
+				/*
+				 * GetCachedPlan() no longer acquires execution locks, so we
+				 * must do it here.  Multi-statement plans always use
+				 * conservative locking (all partitions locked); pruning-aware
+				 * locking is not feasible because PortalRunMulti() executes
+				 * statements sequentially with CCI between them.
+				 */
+				if (portal->cplan)
+				{
+					if (PortalLockCachedPlan(portal))
+						goto restart;
+				}
+
 				portal->tupDesc = NULL;
 				break;
 		}
@@ -1786,3 +1822,35 @@ EnsurePortalSnapshotExists(void)
 	/* PushActiveSnapshotWithLevel might have copied the snapshot */
 	portal->portalSnapshot = GetActiveSnapshot();
 }
+
+/*
+ * PortalLockCachedPlan
+ *		Acquire execution locks for a cached-plan-backed portal,
+ *		retrying with a fresh plan if the current one is invalidated.
+ *
+ * Returns true if replanning changed portal->strategy, meaning the
+ * caller must redispatch.  Returns false once locks are held.
+ */
+static bool
+PortalLockCachedPlan(Portal portal)
+{
+	PortalStrategy start_strategy = portal->strategy;
+
+	if (AcquireExecutorLocks(portal->cplan))
+		return false;
+
+	/* Replan.  Locks will be taken freshly. */
+	ReleaseCachedPlan(portal->cplan, portal->resowner);
+	portal->cplan = NULL;
+	portal->stmts = NIL;
+	portal->cplan = GetCachedPlan(portal->plansource,
+								  portal->portalParams,
+								  portal->resowner,
+								  portal->queryEnv);
+	portal->stmts = portal->cplan->stmt_list;
+	portal->strategy = ChoosePortalStrategy(portal->stmts);
+	if (portal->strategy != start_strategy)
+		return true;
+
+	return false;
+}
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 698e7c1aa22..f7fe366859c 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -100,7 +100,7 @@ static bool choose_custom_plan(CachedPlanSource *plansource,
 							   ParamListInfo boundParams);
 static double cached_plan_cost(CachedPlan *plan, bool include_planner);
 static Query *QueryListGetPrimaryStmt(List *stmts);
-static void AcquireExecutorLocks(List *stmt_list, bool acquire);
+static void AcquireExecutorLocksInt(List *stmt_list, bool acquire);
 static void AcquirePlannerLocks(List *stmt_list, bool acquire);
 static void ScanQueryForLocks(Query *parsetree, bool acquire);
 static bool ScanQueryWalker(Node *node, bool *acquire);
@@ -945,8 +945,9 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
  * Caller must have already called RevalidateCachedQuery to verify that the
  * querytree is up to date.
  *
- * On a "true" return, we have acquired the locks needed to run the plan.
- * (We must do this for the "true" result to be race-condition-free.)
+ * On a "true" return, the generic plan may be reused as a valid cached
+ * plan.  Any execution-time setup, including lock acquisition, is the
+ * caller's responsibility.
  */
 static bool
 CheckCachedPlan(CachedPlanSource *plansource)
@@ -983,8 +984,6 @@ CheckCachedPlan(CachedPlanSource *plansource)
 		 */
 		Assert(plan->refcount > 0);
 
-		AcquireExecutorLocks(plan->stmt_list, true);
-
 		/*
 		 * If plan was transient, check to see if TransactionXmin has
 		 * advanced, and if so invalidate it.
@@ -1003,9 +1002,6 @@ CheckCachedPlan(CachedPlanSource *plansource)
 			/* Successfully revalidated and locked the query. */
 			return true;
 		}
-
-		/* Oops, the race case happened.  Release useless locks. */
-		AcquireExecutorLocks(plan->stmt_list, false);
 	}
 
 	/*
@@ -1282,8 +1278,11 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
  * plan or a custom plan for the given parameters: the caller does not know
  * which it will get.
  *
- * On return, the plan is valid and we have sufficient locks to begin
- * execution.
+ * On return, the plan is valid but no execution locks are held.
+ * The caller must call AcquireExecutorLocks() before executing.
+ * For freshly built plans (custom or new generic), the planner
+ * already holds the needed locks, so AcquireExecutorLocks() is
+ * redundant but harmless.
  *
  * On return, the refcount of the plan has been incremented; a later
  * ReleaseCachedPlan() call is expected.  If "owner" is not NULL then
@@ -1906,9 +1905,11 @@ QueryListGetPrimaryStmt(List *stmts)
 /*
  * AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
  * or release them if acquire is false.
+ *
+ * This locks all relations in a given PlannedStmt's range table.
  */
 static void
-AcquireExecutorLocks(List *stmt_list, bool acquire)
+AcquireExecutorLocksInt(List *stmt_list, bool acquire)
 {
 	ListCell   *lc1;
 
@@ -1955,6 +1956,27 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
 	}
 }
 
+/*
+ * AcquireExecutorLocks
+ *		Acquire execution locks on all relations in a cached plan.
+ *
+ * Returns true if the plan is still valid after locking.  Returns
+ * false if the plan was invalidated while locks were being acquired,
+ * in which case the locks have been released and the caller should
+ * discard this plan and retry with a fresh one from GetCachedPlan().
+ */
+bool
+AcquireExecutorLocks(CachedPlan *cplan)
+{
+	AcquireExecutorLocksInt(cplan->stmt_list, true);
+	if (!cplan->is_valid)
+	{
+		AcquireExecutorLocksInt(cplan->stmt_list, false);
+		return false;
+	}
+	return true;
+}
+
 /*
  * AcquirePlannerLocks: acquire locks needed for planning of a querytree list;
  * or release them if acquire is false.
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index 493f9b0ee19..613f3be30b3 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -272,6 +272,10 @@ CreateNewPortal(void)
  * the passed plan trees have adequate lifetime.  Typically this is done by
  * copying them into the portal's context.
  *
+ * If plansource is provided, it is the CachedPlanSource that produced
+ * cplan.  PortalLockCachedPlan() uses it to fetch a fresh plan if the
+ * current one is invalidated during execution lock acquisition.
+ *
  * The caller is also responsible for ensuring that the passed prepStmtName
  * (if not NULL) and sourceText have adequate lifetime.
  *
@@ -286,6 +290,7 @@ PortalDefineQuery(Portal portal,
 				  const char *sourceText,
 				  CommandTag commandTag,
 				  List *stmts,
+				  CachedPlanSource *plansource,
 				  CachedPlan *cplan)
 {
 	Assert(PortalIsValid(portal));
@@ -299,6 +304,7 @@ PortalDefineQuery(Portal portal,
 	portal->commandTag = commandTag;
 	SetQueryCompletion(&portal->qc, commandTag, 0);
 	portal->stmts = stmts;
+	portal->plansource = plansource;
 	portal->cplan = cplan;
 	portal->status = PORTAL_DEFINED;
 }
@@ -517,6 +523,7 @@ PortalDrop(Portal portal, bool isTopCommit)
 
 	/* drop cached plan reference, if any */
 	PortalReleaseCachedPlan(portal);
+	portal->plansource = NULL;
 
 	/*
 	 * If portal has a snapshot protecting its data, release that.  This needs
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 7a4a85c8038..e0fc403e717 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -241,6 +241,7 @@ extern CachedPlan *GetCachedPlan(CachedPlanSource *plansource,
 								 ParamListInfo boundParams,
 								 ResourceOwner owner,
 								 QueryEnvironment *queryEnv);
+extern bool AcquireExecutorLocks(CachedPlan *cplan);
 extern void ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner);
 
 extern bool CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index a7bedb12c18..3af535362cd 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -137,6 +137,8 @@ typedef struct PortalData
 	CommandTag	commandTag;		/* command tag for original query */
 	QueryCompletion qc;			/* command completion data for executed query */
 	List	   *stmts;			/* list of PlannedStmts */
+	CachedPlanSource *plansource;	/* CachedPlanSource, for replanning on
+									 * invalidation */
 	CachedPlan *cplan;			/* CachedPlan, if stmts are from one */
 
 	ParamListInfo portalParams; /* params to pass to query */
@@ -240,6 +242,7 @@ extern void PortalDefineQuery(Portal portal,
 							  const char *sourceText,
 							  CommandTag commandTag,
 							  List *stmts,
+							  CachedPlanSource *plansource,
 							  CachedPlan *cplan);
 extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
 extern void PortalCreateHoldStore(Portal portal);
-- 
2.47.3
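
The callers changed in the patch above share one pattern: fetch a plan, take the execution locks, re-check validity, and retry with a fresh plan if it went stale. A minimal standalone sketch of that loop follows. It is a toy model, not PostgreSQL code: ToyPlan, ToySource, toy_get_plan(), toy_acquire_locks() and toy_lock_with_retry() are hypothetical stand-ins for CachedPlan, CachedPlanSource, GetCachedPlan(), AcquireExecutorLocks() and the for (;;) loops in the callers, and invalidation is simulated by a flag rather than by sinval processing.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a cached plan; NOT PostgreSQL's CachedPlan. */
typedef struct ToyPlan
{
    bool        is_valid;       /* cleared by a simulated invalidation */
    bool        stale_on_lock;  /* assumption: invalidation arrives while locking */
    int         nrels;          /* relations referenced by the plan */
    int         locks_held;     /* relation locks currently held */
} ToyPlan;

/* Hypothetical plan source that hands out up to four plans. */
typedef struct ToySource
{
    ToyPlan     plans[4];
    int         next;               /* index of the next plan to hand out */
    int         invalidate_first_n; /* first N plans go stale during locking */
} ToySource;

/* Models GetCachedPlan(): returns a valid plan with no execution locks. */
static ToyPlan *
toy_get_plan(ToySource *src)
{
    ToyPlan    *plan;

    assert(src->next < 4);
    plan = &src->plans[src->next];
    plan->is_valid = true;
    plan->stale_on_lock = (src->next < src->invalidate_first_n);
    plan->nrels = 3;
    plan->locks_held = 0;
    src->next++;
    return plan;
}

/* Models AcquireExecutorLocks(): lock everything, then re-check validity. */
static bool
toy_acquire_locks(ToyPlan *plan)
{
    plan->locks_held = plan->nrels;
    if (plan->stale_on_lock)
        plan->is_valid = false; /* simulated invalidation during locking */
    if (!plan->is_valid)
    {
        plan->locks_held = 0;   /* release the now-useless locks */
        return false;
    }
    return true;
}

/* Models the retry loop used by the callers. */
static ToyPlan *
toy_lock_with_retry(ToySource *src)
{
    for (;;)
    {
        ToyPlan    *plan = toy_get_plan(src);

        if (toy_acquire_locks(plan))
            return plan;
        /* The real callers call ReleaseCachedPlan() here before retrying. */
    }
}
```

With invalidate_first_n set to 2, the loop discards the first two plans and returns the third with all three locks held; the discarded plans end up holding none.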



  [application/octet-stream] v13-0004-Use-pruning-aware-locking-for-single-statement-c.patch (40.8K, 5-v13-0004-Use-pruning-aware-locking-for-single-statement-c.patch)
From 5785e0903b867f024e4b675783dfd76dc00ee733 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Sat, 4 Apr 2026 20:43:14 +0900
Subject: [PATCH v13 4/4] Use pruning-aware locking for single-statement cached
 plans

For single-statement reused generic plans, perform initial partition
pruning before acquiring execution locks, then lock only the
surviving partitions.

Add ExecutorPrepAndLock() which encapsulates the pruning-aware lock
sequence: lock unprunable relations, call ExecutorPrep() to run
initial pruning, then lock survivors.  Plan validity is checked
after each step; ExecutorPrepCleanup() handles the case where the
plan is invalidated between prep and execution.

Extend PortalLockCachedPlan() to use the pruning-aware path for
eligible plans (single-statement reused generic, non-utility).
All other cases continue using the conservative lock-all path
from the previous commit.

Track firstResultRels in PlannerGlobal and PlannedStmt so they
are locked even if pruned, preserving ExecInitModifyTable()
assumptions about the first result relation being available.

Multi-statement CachedPlans (from rule rewriting) always use
conservative locking, since PortalRunMulti() executes statements
sequentially with CCI between them and later statements' pruning
expressions may depend on earlier ones' effects.  In principle,
this could be relaxed if the planner can prove that no pruning
expression reads state modified by an earlier statement, but that
is left for a future patch.

Regression tests are included to verify:

- Only surviving partitions are locked when pruning is enabled, and
  all partitions are locked when it is disabled (pg_locks inspection).
- Multiple ModifyTable nodes (via writable CTEs) handle the case where
  all target partitions are pruned, exercising firstResultRels.
- Plan invalidation during pruning-aware lock setup (DDL triggered by
  a pruning expression) discards the prep state and replans cleanly.
- Multi-statement CachedPlans (from rule rewriting) fall back to
  locking all partitions, avoiding stale pruning results.

Note for extension authors: code that accesses partition relations
through EState must check that the RT index is a member of
es_unpruned_relids before opening the relation.  Previously this
was an optimization; it is now a correctness requirement, because
pruned partitions may not be locked.
---
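The lock sequence described in the commit message can be sketched as a small standalone model. This is an illustration under stated assumptions, not the patch's implementation: relation sets are plain bitmasks instead of Bitmapsets, "locking" only sets bits, and ToyStmt, ToyState, toy_initial_pruning(), toy_prep_and_lock() and toy_rel_may_be_opened() are hypothetical names standing in for PlannedStmt, EState, ExecutorPrep(), ExecutorPrepAndLock() and the es_unpruned_relids membership check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t RelSet;        /* bit i set => RT index i is in the set */

typedef struct ToyStmt
{
    RelSet      unprunable;         /* always locked */
    RelSet      prunable;           /* partitions subject to initial pruning */
    RelSet      first_result_rels;  /* kept locked even if pruned */
} ToyStmt;

typedef struct ToyState
{
    RelSet      locked;             /* relations currently locked */
    RelSet      unpruned;           /* unprunable rels plus pruning survivors */
    bool        is_valid;           /* models CachedPlan.is_valid */
} ToyState;

/* Hypothetical pruning step: only partitions in 'matching' survive. */
static RelSet
toy_initial_pruning(const ToyStmt *stmt, RelSet matching)
{
    return stmt->unprunable | (stmt->prunable & matching);
}

/* Two-phase locking: unprunable rels first, then pruning survivors. */
static bool
toy_prep_and_lock(const ToyStmt *stmt, ToyState *st, RelSet matching)
{
    RelSet      survivors;

    /* Step 1: lock unprunable relations before pruning can access them. */
    st->locked |= stmt->unprunable;
    if (!st->is_valid)
    {
        st->locked &= ~stmt->unprunable;
        return false;
    }

    /* Step 2: prune, then lock survivors plus the first result rels. */
    st->unpruned = toy_initial_pruning(stmt, matching);
    survivors = (st->unpruned & ~stmt->unprunable) | stmt->first_result_rels;
    st->locked |= survivors;
    if (!st->is_valid)
    {
        st->locked &= ~(survivors | stmt->unprunable);
        return false;
    }
    return true;
}

/* The membership check that the note for extension authors asks for. */
static bool
toy_rel_may_be_opened(const ToyState *st, int rti)
{
    return (st->unpruned & ((RelSet) 1 << rti)) != 0;
}
```

In this model, a statement with RT index 1 unprunable, indexes 2 to 4 prunable, and index 2 as the first result relation ends up with indexes 1, 2 and 3 locked when only partition 3 survives pruning; index 4 is never locked, which is why code must check membership before opening it.
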
 src/backend/commands/explain.c                |  45 +++--
 src/backend/commands/prepare.c                |  30 ++-
 src/backend/executor/execMain.c               | 142 ++++++++++++++
 src/backend/executor/nodeModifyTable.c        |   7 +-
 src/backend/optimizer/plan/planner.c          |   1 +
 src/backend/optimizer/plan/setrefs.c          |  19 ++
 src/backend/tcop/pquery.c                     |  76 ++++++--
 src/backend/utils/cache/plancache.c           |  16 ++
 src/include/commands/explain.h                |   3 +-
 src/include/executor/executor.h               |   4 +
 src/include/nodes/pathnodes.h                 |   3 +
 src/include/nodes/plannodes.h                 |  10 +
 src/include/utils/plancache.h                 |   2 +
 src/test/regress/expected/partition_prune.out | 184 ++++++++++++++++++
 src/test/regress/expected/plancache.out       |  63 ++++++
 src/test/regress/sql/partition_prune.sql      | 116 +++++++++++
 src/test/regress/sql/plancache.sql            |  52 +++++
 17 files changed, 734 insertions(+), 39 deletions(-)

diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 112c17b0d64..c5254f0f920 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -377,7 +377,8 @@ standard_ExplainOneQuery(Query *query, int cursorOptions,
 	/* run it (if needed) and produce output */
 	ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
 				   &planduration, (es->buffers ? &bufusage : NULL),
-				   es->memory ? &mem_counters : NULL);
+				   es->memory ? &mem_counters : NULL,
+				   NULL);
 }
 
 /*
@@ -501,7 +502,8 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
 			   const char *queryString, ParamListInfo params,
 			   QueryEnvironment *queryEnv, const instr_time *planduration,
 			   const BufferUsage *bufusage,
-			   const MemoryContextCounters *mem_counters)
+			   const MemoryContextCounters *mem_counters,
+			   QueryDesc *prep_qd)
 {
 	DestReceiver *dest;
 	QueryDesc  *queryDesc;
@@ -532,13 +534,6 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
 	 */
 	INSTR_TIME_SET_CURRENT(starttime);
 
-	/*
-	 * Use a snapshot with an updated command ID to ensure this query sees
-	 * results of any previously executed queries.
-	 */
-	PushCopiedSnapshot(GetActiveSnapshot());
-	UpdateActiveSnapshotCommandId();
-
 	/*
 	 * We discard the output if we have no use for it.  If we're explaining
 	 * CREATE TABLE AS, we'd better use the appropriate tuple receiver, while
@@ -554,10 +549,34 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
 	else
 		dest = None_Receiver;
 
-	/* Create a QueryDesc for the query */
-	queryDesc = CreateQueryDesc(plannedstmt, queryString,
-								GetActiveSnapshot(), InvalidSnapshot,
-								dest, params, queryEnv, instrument_option);
+	/*
+	 * Create a QueryDesc for the query, or use the one provided by the
+	 * caller.  When reusing a prep QueryDesc, its snapshot was set at
+	 * creation time; we push it as active for ExecutorStart and override the
+	 * destination and instrument options, which were not known when the
+	 * caller created it.
+	 */
+	if (prep_qd)
+	{
+		PushActiveSnapshot(GetActiveSnapshot());
+		queryDesc = prep_qd;
+		Assert(queryDesc->dest == None_Receiver);
+		queryDesc->dest = dest;
+		queryDesc->instrument_options = instrument_option;
+	}
+	else
+	{
+		/*
+		 * Use a snapshot with an updated command ID to ensure this query sees
+		 * results of any previously executed queries.
+		 */
+		PushCopiedSnapshot(GetActiveSnapshot());
+		UpdateActiveSnapshotCommandId();
+		queryDesc = CreateQueryDesc(plannedstmt, queryString,
+									GetActiveSnapshot(), InvalidSnapshot,
+									dest, params, queryEnv,
+									instrument_option);
+	}
 
 	/* Select execution options */
 	if (es->analyze)
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 03d7a98fc58..3bbbc052149 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -588,6 +588,7 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	MemoryContextCounters mem_counters;
 	MemoryContext planner_ctx = NULL;
 	MemoryContext saved_ctx = NULL;
+	QueryDesc  *prep_qd = NULL;
 
 	if (es->memory)
 	{
@@ -640,8 +641,31 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 							  pstate->p_queryEnv);
 		plan_list = cplan->stmt_list;
 
-		if (AcquireExecutorLocks(cplan))
+		if (!CachedPlanCanPrep(cplan, entry->plansource))
+		{
+			if (AcquireExecutorLocks(cplan))
+				break;
+			ReleaseCachedPlan(cplan, CurrentResourceOwner);
+			continue;
+		}
+
+		prep_qd = CreateQueryDesc(linitial_node(PlannedStmt, plan_list),
+								  query_string,
+								  GetActiveSnapshot(),
+								  InvalidSnapshot,
+								  None_Receiver,	/* ExplainOnePlan will fix */
+								  paramLI,
+								  pstate->p_queryEnv,
+								  0 /* ExplainOnePlan will fix */ );
+		if (ExecutorPrepAndLock(prep_qd,
+								CurrentResourceOwner,
+								es->generic ? EXEC_FLAG_EXPLAIN_GENERIC : 0,
+								&cplan->is_valid))
 			break;
+
+		/* Try again. */
+		ExecutorPrepCleanup(prep_qd);
+		FreeQueryDesc(prep_qd);
 		ReleaseCachedPlan(cplan, CurrentResourceOwner);
 	}
 
@@ -664,6 +688,7 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 	plan_list = cplan->stmt_list;
 
 	/* Explain each query */
+	Assert(prep_qd == NULL || list_length(plan_list) == 1);
 	foreach(p, plan_list)
 	{
 		PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
@@ -671,7 +696,8 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 		if (pstmt->commandType != CMD_UTILITY)
 			ExplainOnePlan(pstmt, into, es, query_string, paramLI, pstate->p_queryEnv,
 						   &planduration, (es->buffers ? &bufusage : NULL),
-						   es->memory ? &mem_counters : NULL);
+						   es->memory ? &mem_counters : NULL,
+						   prep_qd);
 		else
 			ExplainOneUtility(pstmt->utilityStmt, into, es, pstate, paramLI);
 
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 2b9397b72f3..bbfa0e2b92a 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -333,6 +333,124 @@ standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
 	MemoryContextSwitchTo(oldcontext);
 }
 
+/*
+ * LockRangeTableRelids
+ *		Acquire or release locks on the specified relids, which reference
+ *		entries in the provided range table.
+ *
+ * Helper for AcquireExecutorLocksPrepared().
+ */
+static void
+LockRangeTableRelids(List *rtable, Bitmapset *relids, bool acquire)
+{
+	int			rtindex = -1;
+
+	while ((rtindex = bms_next_member(relids, rtindex)) >= 0)
+	{
+		RangeTblEntry *rte = list_nth_node(RangeTblEntry, rtable, rtindex - 1);
+
+		Assert(rte->rtekind == RTE_RELATION ||
+			   (rte->rtekind == RTE_SUBQUERY && OidIsValid(rte->relid)));
+
+		/*
+		 * Acquire the appropriate type of lock on each relation OID. Note
+		 * that we don't actually try to open the rel, and hence will not fail
+		 * if it's been dropped entirely --- we'll just transiently acquire a
+		 * non-conflicting lock.
+		 */
+		if (acquire)
+			LockRelationOid(rte->relid, rte->rellockmode);
+		else
+			UnlockRelationOid(rte->relid, rte->rellockmode);
+	}
+}
+
+/*
+ * AcquireExecutorLocksPrepared
+ *
+ * Acquire or release execution locks using pruning results already computed
+ * by ExecutorPrep() and stored in queryDesc->estate.
+ *
+ * This is intended for single-statement reused generic-plan paths that
+ * choose pruning-aware locking instead of the conservative
+ * AcquireExecutorLocks() path.
+ */
+static void
+AcquireExecutorLocksPrepared(QueryDesc *queryDesc, bool acquire)
+{
+	PlannedStmt *plannedstmt = queryDesc->plannedstmt;
+	EState	   *estate = queryDesc->estate;
+	Bitmapset  *lock_relids;
+	ListCell   *lc;
+
+	Assert(queryDesc != NULL);
+	Assert(estate != NULL);
+	Assert(plannedstmt != NULL);
+	Assert(plannedstmt->commandType != CMD_UTILITY);
+
+	lock_relids = bms_difference(estate->es_unpruned_relids,
+								 plannedstmt->unprunableRelids);
+
+	/*
+	 * Keep the first result relation of each ModifyTable locked even if
+	 * pruning removed all target partitions.  ExecInitModifyTable() relies on
+	 * one such relation remaining available.
+	 */
+	foreach(lc, plannedstmt->firstResultRels)
+	{
+		Index		rti = lfirst_int(lc);
+
+		lock_relids = bms_add_member(lock_relids, rti);
+	}
+
+	LockRangeTableRelids(plannedstmt->rtable, lock_relids, acquire);
+
+	bms_free(lock_relids);
+
+}
+
+/*
+ * ExecutorPrepAndLock
+ *		Perform pruning-aware locking for a single PlannedStmt.
+ *
+ * Locks unprunable relations first, then runs ExecutorPrep() to
+ * determine which partitions survive initial pruning, then locks
+ * only those survivors.  Checks *is_valid after each locking step
+ * to detect plan invalidation (e.g., from concurrent DDL or DDL
+ * triggered by a pruning expression).
+ *
+ * Returns true if the plan is still valid and all needed locks are
+ * held.  Returns false if the plan was invalidated at any point, in
+ * which case all acquired locks have been released and the caller
+ * should discard the QueryDesc and retry with a fresh plan.
+ */
+bool
+ExecutorPrepAndLock(QueryDesc *queryDesc, ResourceOwner owner,
+					int eflags, bool *is_valid)
+{
+	PlannedStmt *pstmt = queryDesc->plannedstmt;
+
+	/* Lock unprunable rels before pruning can access them. */
+	LockRangeTableRelids(pstmt->rtable, pstmt->unprunableRelids, true);
+	if (!*is_valid)
+	{
+		LockRangeTableRelids(pstmt->rtable, pstmt->unprunableRelids, false);
+		return false;
+	}
+
+	/* Run pruning and lock survivors. */
+	ExecutorPrep(queryDesc, owner, eflags);
+	AcquireExecutorLocksPrepared(queryDesc, true);
+	if (!*is_valid)
+	{
+		AcquireExecutorLocksPrepared(queryDesc, false);
+		LockRangeTableRelids(pstmt->rtable, pstmt->unprunableRelids, false);
+		return false;
+	}
+
+	return true;
+}
+
 /*
  * ExecutorPrep
  *
@@ -391,6 +509,30 @@ ExecutorPrep(QueryDesc *queryDesc, ResourceOwner owner, int eflags)
 	CurrentResourceOwner = oldowner;
 }
 
+/*
+ * ExecutorPrepCleanup
+ *		Clean up an EState that was created by ExecutorPrep() but never
+ *		passed to ExecutorStart().  This happens when the plan is
+ *		invalidated between prep and execution, and the caller must
+ *		discard the prepped state before retrying with a fresh plan.
+ *
+ * Unlike ExecutorEnd(), this does not expect a fully initialized
+ * plan state tree -- only the range table relations and the
+ * EState itself need to be freed.
+ */
+void
+ExecutorPrepCleanup(QueryDesc *queryDesc)
+{
+	EState	   *estate = queryDesc->estate;
+
+	if (estate == NULL)
+		return;
+
+	ExecCloseRangeTableRelations(estate);
+	FreeExecutorState(estate);
+	queryDesc->estate = NULL;
+}
+
 /* ----------------------------------------------------------------
  *		ExecutorRun
  *
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 478cb01783c..6e78b61f700 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -5133,8 +5133,8 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
 	 * as a reference for building the ResultRelInfo of the target partition.
 	 * In either case, it doesn't matter which result relation is kept, so we
 	 * just keep the first one, if all others have been pruned.  See also,
-	 * ExecDoInitialPruning(), which ensures that this first result relation
-	 * has been locked.
+	 * AcquireExecutorLocksPrepared(), which ensures that this first result
+	 * relation has been locked.
 	 */
 	i = 0;
 	foreach(l, node->resultRelations)
@@ -5148,6 +5148,9 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
 			/* all result relations pruned; keep the first one */
 			keep_rel = true;
 			rti = linitial_int(node->resultRelations);
+			if (!list_member_int(estate->es_plannedstmt->firstResultRels, rti))
+				elog(ERROR, "first result relation %u not found in firstResultRels",
+					 rti);
 			i = 0;
 		}
 
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index f4689e7c9f8..4cddac7f2fc 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -675,6 +675,7 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 											  glob->prunableRelids);
 	result->permInfos = glob->finalrteperminfos;
 	result->subrtinfos = glob->subrtinfos;
+	result->firstResultRels = glob->firstResultRels;
 	result->appendRelations = glob->appendRelations;
 	result->subplans = glob->subplans;
 	result->rewindPlanIDs = glob->rewindPlanIDs;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index ff0e875f2a2..4495bc6e627 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -384,6 +384,25 @@ set_plan_references(PlannerInfo *root, Plan *plan)
 		}
 	}
 
+	/*
+	 * Record the first result relation if it belongs to the set of initially
+	 * prunable relations.  We use bms_next_member() to get the
+	 * lowest-numbered leaf result rel, which matches
+	 * linitial_int(ModifyTable.resultRelations) because
+	 * expand_inherited_rtentry() adds child partitions to the range table
+	 * sequentially in partition bound order, and resultRelations is built
+	 * from that same expansion.
+	 */
+	if (root->leaf_result_relids)
+	{
+		Index		firstResultRel = bms_next_member(root->leaf_result_relids, -1);
+
+		firstResultRel += rtoffset;
+		if (bms_is_member(firstResultRel, root->glob->prunableRelids))
+			root->glob->firstResultRels =
+				lappend_int(root->glob->firstResultRels, firstResultRel);
+	}
+
 	return result;
 }
 
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 4699b53cab7..53c50ab0fce 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -59,7 +59,9 @@ static uint64 DoPortalRunFetch(Portal portal,
 							   long count,
 							   DestReceiver *dest);
 static void DoPortalRewind(Portal portal);
-static bool PortalLockCachedPlan(Portal portal);
+static bool PortalLockCachedPlan(Portal portal, bool do_prep,
+								 ParamListInfo params,
+								 QueryDesc **prep_qd);
 
 
 /*
@@ -488,21 +490,6 @@ restart:
 				 * non-default nesting level for the snapshot.
 				 */
 
-				/*
-				 * If the portal is backed by a cached plan, acquire execution
-				 * locks via PortalLockCachedPlan().  If the plan is
-				 * invalidated during locking, it replans and may change the
-				 * portal strategy, requiring us to restart PortalStart().
-				 */
-				if (portal->cplan)
-				{
-					if (PortalLockCachedPlan(portal))
-					{
-						PopActiveSnapshot();
-						goto restart;
-					}
-				}
-
 				/*
 				 * Create QueryDesc in portal's context; for the moment, set
 				 * the destination to DestNone.
@@ -516,6 +503,26 @@ restart:
 											portal->queryEnv,
 											0);
 
+				/*
+				 * If the portal is backed by a cached plan, acquire execution
+				 * locks via PortalLockCachedPlan().  For eligible plans
+				 * (single-statement reused generic), this performs
+				 * pruning-aware locking: it runs ExecutorPrep() on the
+				 * QueryDesc to determine which partitions survive initial
+				 * pruning, then locks only those.  If the plan is invalidated
+				 * during this process, it replans and rebuilds the QueryDesc.
+				 * If replanning changes the portal strategy, we must restart
+				 * PortalStart() to redispatch.
+				 */
+				if (portal->cplan)
+				{
+					if (PortalLockCachedPlan(portal, true, params, &queryDesc))
+					{
+						PopActiveSnapshot();
+						goto restart;
+					}
+				}
+
 				/*
 				 * If it's a scrollable cursor, executor needs to support
 				 * REWIND and backwards scan, as well as whatever the caller
@@ -555,7 +562,7 @@ restart:
 			case PORTAL_ONE_MOD_WITH:
 				if (portal->cplan)
 				{
-					if (PortalLockCachedPlan(portal))
+					if (PortalLockCachedPlan(portal, false, NULL, NULL))
 						goto restart;
 				}
 
@@ -611,7 +618,7 @@ restart:
 				 */
 				if (portal->cplan)
 				{
-					if (PortalLockCachedPlan(portal))
+					if (PortalLockCachedPlan(portal, false, NULL, NULL))
 						goto restart;
 				}
 
@@ -1828,15 +1835,32 @@ EnsurePortalSnapshotExists(void)
  *		Acquire execution locks for a cached-plan-backed portal,
  *		retrying with a fresh plan if the current one is invalidated.
  *
+ * If do_prep is true and the plan is eligible (single-statement reused
+ * generic plan), performs pruning-aware locking by running ExecutorPrep()
+ * on *prep_qd; if replanning occurs, *prep_qd is replaced with a fresh
+ * QueryDesc.  Otherwise falls back to locking all relations in the plan.
+ *
  * Returns true if replanning changed portal->strategy, meaning the
- * caller must redispatch.  Returns false once locks are held.
+ * caller must redispatch.  Returns false once locks are held and the
+ * plan is valid for execution.
  */
 static bool
-PortalLockCachedPlan(Portal portal)
+PortalLockCachedPlan(Portal portal, bool do_prep,
+					 ParamListInfo params,
+					 QueryDesc **prep_qd)
 {
 	PortalStrategy start_strategy = portal->strategy;
 
-	if (AcquireExecutorLocks(portal->cplan))
+	if (do_prep && CachedPlanCanPrep(portal->cplan, portal->plansource))
+	{
+		Assert(prep_qd);
+		if (ExecutorPrepAndLock(*prep_qd, portal->resowner, 0,
+								&portal->cplan->is_valid))
+			return false;
+		ExecutorPrepCleanup(*prep_qd);
+		FreeQueryDesc(*prep_qd);
+	}
+	else if (AcquireExecutorLocks(portal->cplan))
 		return false;
 
 	/* Replan.  Locks will be taken freshly. */
@@ -1852,5 +1876,15 @@ PortalLockCachedPlan(Portal portal)
 	if (portal->strategy != start_strategy)
 		return true;
 
+	if (prep_qd)
+	{
+		Assert(list_length(portal->stmts) == 1);
+		*prep_qd = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+								   portal->sourceText,
+								   GetActiveSnapshot(), InvalidSnapshot,
+								   None_Receiver, params,
+								   portal->queryEnv, 0);
+	}
+
 	return false;
 }
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index f7fe366859c..fca2f84081e 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -1977,6 +1977,22 @@ AcquireExecutorLocks(CachedPlan *cplan)
 	return true;
 }
 
+/*
+ * CachedPlanCanPrep
+ *		Check whether a cached plan is eligible for pruning-aware locking
+ *		via ExecutorPrepAndLock().
+ *
+ * Only single-statement reused generic plans with a non-utility command
+ * qualify.
+ */
+bool
+CachedPlanCanPrep(CachedPlan *cplan, CachedPlanSource *plansource)
+{
+	return (cplan == plansource->gplan &&
+			list_length(cplan->stmt_list) == 1 &&
+			linitial_node(PlannedStmt, cplan->stmt_list)->commandType != CMD_UTILITY);
+}
+
 /*
  * AcquirePlannerLocks: acquire locks needed for planning of a querytree list;
  * or release them if acquire is false.
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 472e141bba3..3a03355e6b6 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -69,7 +69,8 @@ extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
 						   ParamListInfo params, QueryEnvironment *queryEnv,
 						   const instr_time *planduration,
 						   const BufferUsage *bufusage,
-						   const MemoryContextCounters *mem_counters);
+						   const MemoryContextCounters *mem_counters,
+						   QueryDesc *prep_qd);
 
 extern void ExplainPrintPlan(ExplainState *es, QueryDesc *queryDesc);
 extern void ExplainPrintTriggers(ExplainState *es,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index 33bbdbfeffb..093be9bd24b 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -21,6 +21,7 @@
 #include "nodes/lockoptions.h"
 #include "nodes/parsenodes.h"
 #include "utils/memutils.h"
+#include "utils/resowner.h"
 
 
 /*
@@ -235,6 +236,9 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
  */
 extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
 extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
+extern bool ExecutorPrepAndLock(QueryDesc *queryDesc, ResourceOwner owner,
+								int eflags, bool *is_valid);
+extern void ExecutorPrepCleanup(QueryDesc *queryDesc);
 extern void ExecutorRun(QueryDesc *queryDesc,
 						ScanDirection direction, uint64 count);
 extern void standard_ExecutorRun(QueryDesc *queryDesc,
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 27a2c6815b7..a5d00633b4b 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -217,6 +217,9 @@ typedef struct PlannerGlobal
 	/* "flat" list of integer RT indexes */
 	List	   *resultRelations;
 
+	/* "flat" list of integer RT indexes (one per ModifyTable node) */
+	List	   *firstResultRels;
+
 	/* "flat" list of AppendRelInfos */
 	List	   *appendRelations;
 
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 14a1dfed2b9..1a328ea138c 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -120,6 +120,16 @@ typedef struct PlannedStmt
 	/* RT indexes of relations targeted by INSERT/UPDATE/DELETE/MERGE */
 	Bitmapset  *resultRelationRelids;
 
+	/*
+	 * rtable indexes of first target relation in each ModifyTable node in the
+	 * plan for INSERT/UPDATE/DELETE/MERGE.  NIL if resultRelations is NIL.
+	 *
+	 * These are used by AcquireExecutorLocksPrepared() to ensure that the
+	 * first result rel for each ModifyTable remains locked even if pruned;
+	 * see ExecInitModifyTable() for the executor side assumptions.
+	 */
+	List	   *firstResultRels;
+
 	/* list of AppendRelInfo nodes */
 	List	   *appendRelations;
 
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index e0fc403e717..2941d3a301b 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -254,4 +254,6 @@ extern bool CachedPlanIsSimplyValid(CachedPlanSource *plansource,
 extern CachedExpression *GetCachedExpression(Node *expr);
 extern void FreeCachedExpression(CachedExpression *cexpr);
 
+extern bool CachedPlanCanPrep(CachedPlan *cplan, CachedPlanSource *plansource);
+
 #endif							/* PLANCACHE_H */
diff --git a/src/test/regress/expected/partition_prune.out b/src/test/regress/expected/partition_prune.out
index 849049f9c51..ec73866486e 100644
--- a/src/test/regress/expected/partition_prune.out
+++ b/src/test/regress/expected/partition_prune.out
@@ -4956,3 +4956,187 @@ select * from (select a, b from phv_boolpart) t
 (2 rows)
 
 drop table phv_boolpart;
+--
+-- Verify that pruning-aware locking skips pruned partitions
+-- when reusing a generic cached plan.
+--
+set plan_cache_mode to force_generic_plan;
+create table prunelock_p (a int) partition by list (a);
+create table prunelock_p1 partition of prunelock_p for values in (1);
+create table prunelock_p2 partition of prunelock_p for values in (2);
+create table prunelock_p3 partition of prunelock_p for values in (3);
+prepare prunelock_q (int) as select * from prunelock_p where a = $1;
+-- Force generic plan creation
+explain (costs off) execute prunelock_q(1);
+                  QUERY PLAN                  
+----------------------------------------------
+ Append
+   Subplans Removed: 2
+   ->  Seq Scan on prunelock_p1 prunelock_p_1
+         Filter: (a = $1)
+(4 rows)
+
+-- Execute and check which child partitions are locked
+begin;
+execute prunelock_q(1);
+ a 
+---
+(0 rows)
+
+select c.relname
+  from pg_locks l
+  join pg_class c on c.oid = l.relation
+ where l.pid = pg_backend_pid()
+   and c.relname like 'prunelock_p_'
+ order by c.relname;
+   relname    
+--------------
+ prunelock_p1
+(1 row)
+
+commit;
+deallocate prunelock_q;
+-- Turn pruning off
+set enable_partition_pruning to off;
+prepare prunelock_q (int) as select * from prunelock_p where a = $1;
+-- Force generic plan creation
+explain (costs off) execute prunelock_q(1);
+                  QUERY PLAN                  
+----------------------------------------------
+ Append
+   ->  Seq Scan on prunelock_p1 prunelock_p_1
+         Filter: (a = $1)
+   ->  Seq Scan on prunelock_p2 prunelock_p_2
+         Filter: (a = $1)
+   ->  Seq Scan on prunelock_p3 prunelock_p_3
+         Filter: (a = $1)
+(7 rows)
+
+-- Execute and check which child partitions are locked
+begin;
+execute prunelock_q(1);
+ a 
+---
+(0 rows)
+
+select c.relname
+  from pg_locks l
+  join pg_class c on c.oid = l.relation
+ where l.pid = pg_backend_pid()
+   and c.relname like 'prunelock_p_'
+ order by c.relname;
+   relname    
+--------------
+ prunelock_p1
+ prunelock_p2
+ prunelock_p3
+(3 rows)
+
+commit;
+deallocate prunelock_q;
+reset enable_partition_pruning;
+--
+-- Verify firstResultRels handling with multiple ModifyTable nodes
+-- (writable CTEs) targeting a partitioned table.  When a pruning
+-- parameter matches no partition, all result relations are pruned
+-- and the executor must still find a usable first result relation
+-- for each ModifyTable node.
+--
+prepare prunelock_mt_q (int, int) as
+  with upd1 as (update prunelock_p set a = a),
+       upd2 as (update prunelock_p set a = a where a = $2)
+  update prunelock_p set a = a where a = $1;
+-- Force generic plan creation
+explain (costs off) execute prunelock_mt_q(1, 2);
+                         QUERY PLAN                         
+------------------------------------------------------------
+ Update on prunelock_p
+   Update on prunelock_p1 prunelock_p_1
+   CTE upd1
+     ->  Update on prunelock_p prunelock_p_3
+           Update on prunelock_p1 prunelock_p_4
+           Update on prunelock_p2 prunelock_p_5
+           Update on prunelock_p3 prunelock_p_6
+           ->  Append
+                 ->  Seq Scan on prunelock_p1 prunelock_p_4
+                 ->  Seq Scan on prunelock_p2 prunelock_p_5
+                 ->  Seq Scan on prunelock_p3 prunelock_p_6
+   CTE upd2
+     ->  Update on prunelock_p prunelock_p_7
+           Update on prunelock_p2 prunelock_p_8
+           ->  Append
+                 Subplans Removed: 2
+                 ->  Seq Scan on prunelock_p2 prunelock_p_8
+                       Filter: (a = $2)
+   ->  Append
+         Subplans Removed: 2
+         ->  Seq Scan on prunelock_p1 prunelock_p_1
+               Filter: (a = $1)
+(22 rows)
+
+-- All partitions pruned: value 4 matches no partition, so each
+-- ModifyTable must still initialize correctly with no matching
+-- result relations.
+explain (costs off) execute prunelock_mt_q(4, 5);
+                         QUERY PLAN                         
+------------------------------------------------------------
+ Update on prunelock_p
+   CTE upd1
+     ->  Update on prunelock_p prunelock_p_2
+           Update on prunelock_p1 prunelock_p_3
+           Update on prunelock_p2 prunelock_p_4
+           Update on prunelock_p3 prunelock_p_5
+           ->  Append
+                 ->  Seq Scan on prunelock_p1 prunelock_p_3
+                 ->  Seq Scan on prunelock_p2 prunelock_p_4
+                 ->  Seq Scan on prunelock_p3 prunelock_p_5
+   CTE upd2
+     ->  Update on prunelock_p prunelock_p_6
+           ->  Append
+                 Subplans Removed: 3
+   ->  Append
+         Subplans Removed: 3
+(16 rows)
+
+deallocate prunelock_mt_q;
+drop table prunelock_p;
+--
+-- Verify that pruning-aware locking falls back to locking all
+-- partitions for multi-statement CachedPlans.  Rule rewriting can
+-- expand a single statement into multiple PlannedStmts, and later
+-- statements must not have their pruning evaluated before earlier
+-- ones have executed, since CCI between statements can change what
+-- pruning expressions see.
+--
+create table prune_config (val int);
+insert into prune_config values (1);
+create table multistmt_pt (a int, b int) partition by list (a);
+create table multistmt_pt_1 partition of multistmt_pt for values in (1);
+create table multistmt_pt_2 partition of multistmt_pt for values in (2);
+insert into multistmt_pt values (1, 0), (2, 0);
+create function get_prune_val() returns int as $$
+  select val from prune_config;
+$$ language sql stable;
+create rule config_upd_rule as on update to multistmt_pt
+  do also update prune_config set val = 2;
+set plan_cache_mode to force_generic_plan;
+prepare multi_q as update multistmt_pt set b = b + 1 where a = get_prune_val();
+-- first execute creates the generic plan
+execute multi_q;
+-- reset for the real test
+update prune_config set val = 1;
+update multistmt_pt set b = 0;
+-- second execute reuses the plan; pruning-aware locking kicks in
+execute multi_q;
+select * from multistmt_pt order by a;
+ a | b 
+---+---
+ 1 | 0
+ 2 | 1
+(2 rows)
+
+deallocate multi_q;
+drop rule config_upd_rule on multistmt_pt;
+drop function get_prune_val;
+drop table multistmt_pt, prune_config;
+reset plan_cache_mode;
diff --git a/src/test/regress/expected/plancache.out b/src/test/regress/expected/plancache.out
index d58534ca1cd..54077294dce 100644
--- a/src/test/regress/expected/plancache.out
+++ b/src/test/regress/expected/plancache.out
@@ -402,3 +402,66 @@ select name, generic_plans, custom_plans from pg_prepared_statements
 (1 row)
 
 drop table test_mode;
+-- This exercises the CachedPlanPrepCleanup() path, which must free
+-- the EState created by ExecutorPrep() when the plan is invalidated
+-- before execution begins.  The pruning expression uses a stable SQL
+-- function that calls a volatile plpgsql function.  That function
+-- performs DDL on a partition when a separate "signal" table says to
+-- do so.  The second EXECUTE should replan cleanly after the DDL.
+set plan_cache_mode to force_generic_plan;
+create table inval_during_pruning_p (a int) partition by list (a);
+create table inval_during_pruning_p1 partition of inval_during_pruning_p for values in (1);
+create table inval_during_pruning_p2 partition of inval_during_pruning_p for values in (2);
+insert into inval_during_pruning_p values (1), (2);
+create table inval_during_pruning_signal (create_idx bool not null);
+insert into inval_during_pruning_signal values (false);
+create or replace function invalidate_plancache_func() returns int
+as $$
+declare
+	create_index bool;
+begin
+	-- Perform DDL on a partition if asked to
+	select create_idx into create_index from inval_during_pruning_signal for update;
+	if create_index = true then
+		raise notice 'creating index on partition inval_during_pruning_p1';
+		create index on inval_during_pruning_p1 (a);
+		update inval_during_pruning_signal set create_idx = false;
+	end if;
+	-- value that pruning will match against partition bounds
+	return 1;
+end;
+$$ language plpgsql volatile;
+create or replace function stable_pruning_val() returns int as $$
+	select invalidate_plancache_func();
+$$ language sql stable;
+prepare inval_during_pruning_q as select * from inval_during_pruning_p where a = stable_pruning_val();
+-- Build a generic plan and run pruning once, but don't set the signal
+-- for invalidate_plancache_func() to perform the DDL.
+explain (verbose, costs off) execute inval_during_pruning_q;
+                                QUERY PLAN                                 
+---------------------------------------------------------------------------
+ Append
+   Subplans Removed: 1
+   ->  Seq Scan on public.inval_during_pruning_p1 inval_during_pruning_p_1
+         Output: inval_during_pruning_p_1.a
+         Filter: (inval_during_pruning_p_1.a = stable_pruning_val())
+(5 rows)
+
+-- Reuse the generic plan.  Make invalidate_plancache_func() perform DDL
+-- during this execution, which should force replanning without errors.
+update inval_during_pruning_signal set create_idx = true;
+explain (verbose, costs off) execute inval_during_pruning_q;
+NOTICE:  creating index on partition inval_during_pruning_p1
+                                QUERY PLAN                                 
+---------------------------------------------------------------------------
+ Append
+   Subplans Removed: 1
+   ->  Seq Scan on public.inval_during_pruning_p1 inval_during_pruning_p_1
+         Output: inval_during_pruning_p_1.a
+         Filter: (inval_during_pruning_p_1.a = stable_pruning_val())
+(5 rows)
+
+deallocate inval_during_pruning_q;
+drop table inval_during_pruning_p, inval_during_pruning_signal;
+drop function invalidate_plancache_func, stable_pruning_val;
+reset plan_cache_mode;
diff --git a/src/test/regress/sql/partition_prune.sql b/src/test/regress/sql/partition_prune.sql
index 359a9208056..a98844d14f8 100644
--- a/src/test/regress/sql/partition_prune.sql
+++ b/src/test/regress/sql/partition_prune.sql
@@ -1518,3 +1518,119 @@ select * from (select a, b from phv_boolpart) t
   group by grouping sets (a, b);
 
 drop table phv_boolpart;
+
+--
+-- Verify that pruning-aware locking skips pruned partitions
+-- when reusing a generic cached plan.
+--
+set plan_cache_mode to force_generic_plan;
+
+create table prunelock_p (a int) partition by list (a);
+create table prunelock_p1 partition of prunelock_p for values in (1);
+create table prunelock_p2 partition of prunelock_p for values in (2);
+create table prunelock_p3 partition of prunelock_p for values in (3);
+
+prepare prunelock_q (int) as select * from prunelock_p where a = $1;
+
+-- Force generic plan creation
+explain (costs off) execute prunelock_q(1);
+
+-- Execute and check which child partitions are locked
+begin;
+execute prunelock_q(1);
+
+select c.relname
+  from pg_locks l
+  join pg_class c on c.oid = l.relation
+ where l.pid = pg_backend_pid()
+   and c.relname like 'prunelock_p_'
+ order by c.relname;
+commit;
+
+deallocate prunelock_q;
+
+-- Turn pruning off
+set enable_partition_pruning to off;
+
+prepare prunelock_q (int) as select * from prunelock_p where a = $1;
+
+-- Force generic plan creation
+explain (costs off) execute prunelock_q(1);
+
+-- Execute and check which child partitions are locked
+begin;
+execute prunelock_q(1);
+
+select c.relname
+  from pg_locks l
+  join pg_class c on c.oid = l.relation
+ where l.pid = pg_backend_pid()
+   and c.relname like 'prunelock_p_'
+ order by c.relname;
+commit;
+
+deallocate prunelock_q;
+reset enable_partition_pruning;
+
+--
+-- Verify firstResultRels handling with multiple ModifyTable nodes
+-- (writable CTEs) targeting a partitioned table.  When a pruning
+-- parameter matches no partition, all result relations are pruned
+-- and the executor must still find a usable first result relation
+-- for each ModifyTable node.
+--
+prepare prunelock_mt_q (int, int) as
+  with upd1 as (update prunelock_p set a = a),
+       upd2 as (update prunelock_p set a = a where a = $2)
+  update prunelock_p set a = a where a = $1;
+
+-- Force generic plan creation
+explain (costs off) execute prunelock_mt_q(1, 2);
+
+-- All partitions pruned: value 4 matches no partition, so each
+-- ModifyTable must still initialize correctly with no matching
+-- result relations.
+explain (costs off) execute prunelock_mt_q(4, 5);
+
+deallocate prunelock_mt_q;
+drop table prunelock_p;
+
+--
+-- Verify that pruning-aware locking falls back to locking all
+-- partitions for multi-statement CachedPlans.  Rule rewriting can
+-- expand a single statement into multiple PlannedStmts, and later
+-- statements must not have their pruning evaluated before earlier
+-- ones have executed, since CCI between statements can change what
+-- pruning expressions see.
+--
+create table prune_config (val int);
+insert into prune_config values (1);
+
+create table multistmt_pt (a int, b int) partition by list (a);
+create table multistmt_pt_1 partition of multistmt_pt for values in (1);
+create table multistmt_pt_2 partition of multistmt_pt for values in (2);
+insert into multistmt_pt values (1, 0), (2, 0);
+
+create function get_prune_val() returns int as $$
+  select val from prune_config;
+$$ language sql stable;
+
+create rule config_upd_rule as on update to multistmt_pt
+  do also update prune_config set val = 2;
+
+set plan_cache_mode to force_generic_plan;
+prepare multi_q as update multistmt_pt set b = b + 1 where a = get_prune_val();
+-- first execute creates the generic plan
+execute multi_q;
+-- reset for the real test
+update prune_config set val = 1;
+update multistmt_pt set b = 0;
+-- second execute reuses the plan; pruning-aware locking kicks in
+execute multi_q;
+select * from multistmt_pt order by a;
+
+deallocate multi_q;
+drop rule config_upd_rule on multistmt_pt;
+drop function get_prune_val;
+drop table multistmt_pt, prune_config;
+reset plan_cache_mode;
diff --git a/src/test/regress/sql/plancache.sql b/src/test/regress/sql/plancache.sql
index aed388d03a1..90b6c5f82bf 100644
--- a/src/test/regress/sql/plancache.sql
+++ b/src/test/regress/sql/plancache.sql
@@ -228,3 +228,55 @@ select name, generic_plans, custom_plans from pg_prepared_statements
   where  name = 'test_mode_pp';
 
 drop table test_mode;
+
+-- This exercises the CachedPlanPrepCleanup() path, which must free
+-- the EState created by ExecutorPrep() when the plan is invalidated
+-- before execution begins.  The pruning expression uses a stable SQL
+-- function that calls a volatile plpgsql function.  That function
+-- performs DDL on a partition when a separate "signal" table says to
+-- do so.  The second EXECUTE should replan cleanly after the DDL.
+set plan_cache_mode to force_generic_plan;
+create table inval_during_pruning_p (a int) partition by list (a);
+create table inval_during_pruning_p1 partition of inval_during_pruning_p for values in (1);
+create table inval_during_pruning_p2 partition of inval_during_pruning_p for values in (2);
+insert into inval_during_pruning_p values (1), (2);
+
+create table inval_during_pruning_signal (create_idx bool not null);
+insert into inval_during_pruning_signal values (false);
+create or replace function invalidate_plancache_func() returns int
+as $$
+declare
+	create_index bool;
+begin
+	-- Perform DDL on a partition if asked to
+	select create_idx into create_index from inval_during_pruning_signal for update;
+	if create_index = true then
+		raise notice 'creating index on partition inval_during_pruning_p1';
+		create index on inval_during_pruning_p1 (a);
+		update inval_during_pruning_signal set create_idx = false;
+	end if;
+	-- value that pruning will match against partition bounds
+	return 1;
+end;
+$$ language plpgsql volatile;
+
+create or replace function stable_pruning_val() returns int as $$
+	select invalidate_plancache_func();
+$$ language sql stable;
+
+prepare inval_during_pruning_q as select * from inval_during_pruning_p where a = stable_pruning_val();
+
+-- Build a generic plan and run pruning once, but don't set the signal
+-- for invalidate_plancache_func() to perform the DDL.
+explain (verbose, costs off) execute inval_during_pruning_q;
+
+-- Reuse the generic plan.  Make invalidate_plancache_func() perform DDL
+-- during this execution, which should force replanning without errors.
+update inval_during_pruning_signal set create_idx = true;
+explain (verbose, costs off) execute inval_during_pruning_q;
+
+deallocate inval_during_pruning_q;
+drop table inval_during_pruning_p, inval_during_pruning_signal;
+drop function invalidate_plancache_func, stable_pruning_val;
+
+reset plan_cache_mode;
-- 
2.47.3
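
For readers skimming the patch, the eligibility rule documented above
CachedPlanCanPrep() (only a reused generic plan holding a single
non-utility statement qualifies for pruning-aware locking) can be
modelled outside the backend.  This is only a rough sketch: the Mock*
types and mock_can_prep() are hypothetical stand-ins invented for
illustration, not PostgreSQL's real CachedPlan, CachedPlanSource, or
PlannedStmt definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the backend's node types (assumptions). */
typedef enum
{
	MOCK_CMD_SELECT,
	MOCK_CMD_UPDATE,
	MOCK_CMD_UTILITY
} MockCmdType;

typedef struct MockCachedPlan
{
	size_t		nstmts;			/* number of statements in the plan */
	MockCmdType first_cmd;		/* command type of the first statement */
} MockCachedPlan;

typedef struct MockPlanSource
{
	MockCachedPlan *gplan;		/* saved generic plan, or NULL if none */
} MockPlanSource;

/*
 * Mirrors the three conditions in the patch: the plan must be the
 * plansource's generic plan, contain exactly one statement, and that
 * statement must not be a utility command.
 */
bool
mock_can_prep(const MockCachedPlan *cplan, const MockPlanSource *src)
{
	assert(cplan != NULL && src != NULL);

	return cplan == src->gplan &&
		cplan->nstmts == 1 &&
		cplan->first_cmd != MOCK_CMD_UTILITY;
}
```

In this model a custom plan fails the first condition (it is not the
plansource's gplan), and a rule-expanded statement such as multi_q in
the regression test fails the second one, so both would take the
existing AcquireExecutorLocks() path.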


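The comment on PlannedStmt.firstResultRels says that the first result
relation of each ModifyTable node must stay locked even when initial
pruning removes it.  One way to picture that rule is as a set union:
the relations to lock are the unpruned ones plus every entry in
firstResultRels.  The sketch below assumes a 32-bit mask in place of a
Bitmapset and a plain int array in place of a List; MockRelids and
mock_rels_to_lock() are hypothetical names, and the patch's actual
locking code may be organized quite differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: bit i set means range-table index i is in the set. */
typedef uint32_t MockRelids;

/*
 * Return the set of relations to lock: those that survived pruning,
 * plus the first result relation of each ModifyTable node, whether or
 * not pruning removed it.
 */
MockRelids
mock_rels_to_lock(MockRelids unpruned,
				  const int *first_result_rels, size_t n)
{
	MockRelids	result = unpruned;

	for (size_t i = 0; i < n; i++)
	{
		/* This toy mask only has room for indexes 0..31. */
		assert(first_result_rels[i] >= 0 && first_result_rels[i] < 32);
		result |= (MockRelids) 1 << first_result_rels[i];
	}

	return result;
}
```

This corresponds to the prunelock_mt_q(4, 5) case in the regression
test: pruning leaves no matching partitions for two of the three
ModifyTable nodes, yet each node still needs one usable result
relation.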
